Warning

Matteo just upgraded both Bootstrap and Font Awesome and made a major change to the Mako templating, so some bugginess is to be expected.

Mutanalyst

Mutational load and spectrum calculator

Description

An online tool for assessing the mutational spectrum of epPCR libraries with poor sampling.

Background

Error-prone PCR (epPCR) is a method to create a pool of amplicons with random errors. For the best results, the number of mutations and the spectrum of the mutations need to be controlled, hence the need for a test library. The calculations for a test library are somewhat laborious and are affected by the very small sample size. This calculator tries to overcome these two issues by computing the mutational biases from a starting sequence and a list of mutant genotypes, by calculating the mutations per sequence through a fit to a Poisson distribution, and by estimating the errors in the values. In particular, the errors are calculated on the assumption that a mutation and its complement are equally likely, given the double-helix nature of DNA (e.g. A to G on one strand results in T to C on the other). For the specific formulae used, see this note about propagating errors.
The program can calculate mutation frequencies from the list of mutations found and the template sequence, or it can accept the frequencies directly. The 'Demo' values are from an actual experiment.


Choose starting point

There are two possible starting points for Mutanalyst.
One is providing a sequence and the mutations sampled, from which the mutational load, the mutational spectrum and the mutational bias indicators will be calculated.
The other is further downstream: one provides a mutational spectrum and a mutational load, and the mutational bias indicators will be calculated.


Choose:


If you want to know what mutations you have in a series of ab1 files, check out Mutantcaller.
If you want to know the library composition (e.g. redundancy), check out Pedel-AA or go to the bottom of this page.

Starting from a sequence and a mutant genotype list

Sequence

In-frame sequence that was mutagenised. Note that all symbols other than uppercase A, T, U, G and C will be discarded, along with any Fasta header (e.g. '>T. maritima Cystathionine β-lyase'); therefore, use lowercase for masked sequences.
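
The filtering rule described above could look roughly like the following. This is only a sketch: the function name `clean_sequence` is hypothetical, and converting U to T is an assumption made here for convenience, not something the page states.

```python
def clean_sequence(raw: str) -> str:
    """Drop Fasta header lines and every symbol that is not uppercase A, T, U, G or C."""
    # Header lines start with '>' and are removed whole, so that uppercase
    # letters inside the header do not leak into the sequence.
    body = [line for line in raw.splitlines() if not line.lstrip().startswith(">")]
    kept = [ch for ch in "".join(body) if ch in "ATUGC"]
    # Assumption: RNA-style U is treated as T downstream.
    return "".join(kept).replace("U", "T")
```

Lowercase (masked) bases, digits, whitespace and ambiguity codes such as N are all discarded by the same membership test.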

Sequence


Library size

For Pedel-AA calculations, the library size is required.


Size

Mutations found

This is the list of the mutations found. Identifying the mutations can be done using the Mutantcaller tool.
The format is as follows:

  • Each line contains one or more mutations of a sampled variant.
  • The mutations can only be in the forms A123C or 123A>C, where the number is irrelevant (and can be omitted).
  • A wild-type sequence can be indicated with 'wt'. It is not needed for the main calculations and is used solely for the mutational frequency, though it is also useful for Pedel.
  • Rarer events such as insertions, deletions, duplications, frameshifts and inversions are not taken into account, but their frequency can easily be calculated using the 'values for further analysis' below.
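
As an illustration of the format above, a minimal parser might look like this. The names `MUTATION` and `parse_variants` are hypothetical, and two details are assumptions: mutations on a line are taken to be separated by whitespace, commas or semicolons, and tokens in any other form are silently skipped.

```python
import re

# Accepts A123C or 123A>C; the position is optional in both forms.
MUTATION = re.compile(r"^(?:([ATGC])\d*([ATGC])|\d*([ATGC])>([ATGC]))$")

def parse_variants(text):
    """Return one list of (from_base, to_base) pairs per non-empty line."""
    variants = []
    for line in text.splitlines():
        tokens = [t for t in re.split(r"[\s,;]+", line.strip()) if t]
        if not tokens:
            continue  # blank lines do not count as variants
        mutations = []
        for token in tokens:
            if token.lower() == "wt":
                continue  # wild type: a variant with no mutations
            match = MUTATION.match(token.upper())
            if match is None:
                continue  # assumption: unsupported events are skipped
            a, b, c, d = match.groups()
            mutations.append((a or c, b or d))
        variants.append(mutations)
    return variants
```

A 'wt' line yields an empty list, so the variant still counts towards the number of sequences sampled.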
Variants

Mutational frequency

The simplest estimate of the frequency of mutations per sequence is the average number of point mutations per sequence (m); however, due to the small sample size, this may be off. The distribution of the number of mutations per sequence follows a PCR distribution, which can be approximated with a Poisson distribution (Sun, 1995). In the latter, the mean and the variance are the same (λ, which is unrelated to PCR efficiency). The sample average and variance may differ, especially at low sampling. The number to trust the most is the λPoisson.
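
The three values reported below can be sketched as follows. How the λPoisson is actually fitted is not spelled out here, so this sketch assumes a least-squares fit of the Poisson probabilities to the observed distribution of mutations per sequence, found by a plain grid search. The names `poisson_pmf` and `summarise_load` are hypothetical.

```python
import math
from collections import Counter

def poisson_pmf(k, lam):
    """Probability of exactly k mutations under a Poisson distribution."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def summarise_load(counts, step=0.001):
    """Return (sample mean, sample variance, fitted lambda).

    counts holds the number of point mutations in each sequenced variant
    (at least two variants are needed for the variance).
    """
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    observed = Counter(counts)
    ks = range(max(counts) + 1)

    def squared_error(lam):
        return sum((observed[k] / n - poisson_pmf(k, lam)) ** 2 for k in ks)

    # Assumption: lambda lies between 0 and max(counts) + 1.
    upper = max(counts) + 1
    candidates = [i * step for i in range(1, int(upper / step) + 1)]
    lam = min(candidates, key=squared_error)
    return mean, variance, lam
```

With few variants the fitted value and the sample average can diverge noticeably, which is the situation the text above warns about.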

The average is N/A mutations per sequence (N/A kb).

The sample variance is N/A mutations per sequence.

The λPoisson is N/A mutations per sequence.


If the λPoisson and the average are very different and the plot shows a poor fit, sequencing more variants from the test library may be advisable.

Starting from a table of tallied nucleotide specific mutations  

Rows represent the wildtype base, while columns the base in the mutant.

From/To A T G C
A
T
G
C

Colour codes: Identity, Purine transition, Pyrimidine transition, Transversion

Proportion of adenine
%
Proportion of thymine
%
Proportion of guanine
%
Proportion of cytosine
%
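
When starting from a sequence and a genotype list, the table and the base proportions above can be derived along these lines. This is a sketch under stated assumptions: the input is a cleaned template sequence and a list of (from, to) pairs per variant, and the names `tally_mutations` and `base_proportions` are hypothetical.

```python
from collections import Counter

BASES = "ATGC"

def tally_mutations(variants):
    """Build the From/To table: rows are wild-type bases, columns mutant bases."""
    table = {src: {dst: 0 for dst in BASES} for src in BASES}
    for mutations in variants:
        for src, dst in mutations:
            table[src][dst] += 1
    return table

def base_proportions(sequence):
    """Percentage of each base in the template sequence."""
    counts = Counter(sequence)
    total = sum(counts[base] for base in BASES)
    return {base: 100 * counts[base] / total for base in BASES}
```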

Corrected mutation incidence

Data display options: Raw data, Frequency normalised, Strand-complementary normalised

Sequence-composition–corrected incidence of mutations (%):

From/To A T G C
A
T
G
C
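
One plausible reading of the two normalisations is sketched here; the exact formulae used by the tool are not given on this page, so treat every step as an assumption. Each tally is divided by the proportion of its wild-type base and rescaled so the table sums to 100%, and the strand-complementary version averages each mutation with its complement (e.g. A to G with T to C). The function names are hypothetical.

```python
BASES = "ATGC"
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def corrected_incidence(counts, proportions):
    """counts[(src, dst)] is a tally; proportions[src] is the template fraction.

    Assumes at least one mutation and no base with a zero proportion.
    """
    weighted = {
        (src, dst): counts.get((src, dst), 0) / proportions[src]
        for src in BASES for dst in BASES if src != dst
    }
    total = sum(weighted.values())
    return {pair: 100 * value / total for pair, value in weighted.items()}

def strand_symmetrise(incidence):
    """Average each mutation with its complementary-strand equivalent."""
    return {
        (src, dst): (value + incidence[(COMPLEMENT[src], COMPLEMENT[dst])]) / 2
        for (src, dst), value in incidence.items()
    }
```

Averaging complementary pairs leaves the table total unchanged, because every pair contributes the same sum before and after.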

Graphical Representation

A G C T

Download

Bias indicators

Indicator Calculated Estimated error
Ts/Tv
AT→GC/GC→AT
A→N, T→N (%)
G→N, C→N (%)
AT→GC (%)
GC→AT (%)
Transitions (%) total
A→G, T→C (%)
G→A, C→T (%)
Transversions (%) total
A→T, T→A (%)
A→C, T→G (%)
G→C, C→G (%)
G→T, C→A (%)
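
Several of the indicators above follow directly from a From/To incidence table, as in this sketch. It assumes the table is keyed by (from, to) pairs, and the function name and dictionary keys are hypothetical. The AT→GC and GC→AT rows and the estimated errors are left out because their exact definitions are not stated on this page.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("T", "C"), ("C", "T")}

def bias_indicators(incidence):
    """Summarise an incidence table keyed by (from_base, to_base)."""
    changes = {pair: v for pair, v in incidence.items() if pair[0] != pair[1]}
    ts = sum(v for pair, v in changes.items() if pair in TRANSITIONS)
    tv = sum(v for pair, v in changes.items() if pair not in TRANSITIONS)
    total = ts + tv  # assumes at least one mutation in the table
    from_at = sum(v for (src, _), v in changes.items() if src in "AT")
    from_gc = sum(v for (src, _), v in changes.items() if src in "GC")
    return {
        "ts_tv": ts / tv if tv else float("inf"),
        "at_to_n_pct": 100 * from_at / total,
        "gc_to_n_pct": 100 * from_gc / total,
        "transitions_pct": 100 * ts / total,
        "transversions_pct": 100 * tv / total,
    }
```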


Pedel-AA results

For details about Pedel-AA, see the Pedel-AA homepage.