An online tool for assessing the mutational spectrum of epPCR libraries with poor sampling.

Error-prone PCR is a method to create a pool of amplicons with some random
errors. For the best results the number of mutations and the spectrum of
the mutations need to be controlled, hence the need for a test library.
The calculations for a test library are slightly laborious and are affected
by the very small sample size. This calculator tries to overcome these
two issues by computing the mutational biases given a starting sequence
and a list of mutant genotypes, by calculating the mutations per sequence
by fitting them to a Poisson distribution and by estimating the errors in
the values. In particular, the errors are calculated using the assumption
that a mutation and its complement are equally likely in light of the
double-helix nature of DNA (*e.g.* A to G on one strand will result
in T to C on the other). For the specific formulae used see this note about propagating
errors.
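
The strand-complement assumption can be illustrated with a minimal sketch that pools each mutation with its complementary partner (A to G with T to C, and so on). The function name and the data layout (a dictionary keyed by base pairs) are hypothetical, and this shows only the pooling step, not the error formulae used by the tool.

```python
# Watson-Crick partners; the layout below is an illustrative assumption.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pool_complementary(counts):
    """Sum each mutation with its strand-complementary partner.

    `counts` maps (from_base, to_base) to an observed count.
    """
    pooled = {}
    for (src, dst), n in counts.items():
        partner = (COMPLEMENT[src], COMPLEMENT[dst])
        key = min((src, dst), partner)  # one canonical key per pair
        pooled[key] = pooled.get(key, 0) + n
    return pooled


counts = {("A", "G"): 5, ("T", "C"): 3, ("G", "A"): 2}
pooled = pool_complementary(counts)
```

Under this assumption the twelve possible substitutions collapse into six classes, which is the grouping used in the indicator table further down the page.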

The program can calculate mutation frequencies from the list of mutations
found and the template sequence, or it can accept the frequencies directly.
The 'Demo' values are from an actual experiment.

There are two possible starting points for mutanalysts.

One is providing a sequence and the mutations sampled, for which the mutational
load, mutational spectrum and the mutational bias indicators will be calculated.

The other is more downstream, wherein one provides a mutational spectrum
and mutational load, and the mutational bias indicators will be calculated.

Choose:

If you want to know what mutations you have in a series of ab1 files, check out Mutantcaller.

If you want to know the library composition (

In-frame sequence that was mutagenised. Note that all symbols that aren't
uppercase ATUGC will be discarded, along with a Fasta header (*e.g.* '>T.
maritima Cystathionine β-lyase'); therefore, for masked sequences
use lowercase.
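
As a rough sketch of the filtering described above, the following drops a Fasta header line and then keeps only uppercase A, T, U, G and C. The function name is hypothetical, and whether the tool converts U to T afterwards is not stated here, so U is left untouched.

```python
def clean_sequence(text):
    """Drop Fasta header lines, then keep only uppercase ATUGC."""
    # Headers go first: they may contain uppercase letters such as 'T'.
    lines = [ln for ln in text.splitlines() if not ln.startswith(">")]
    return "".join(ch for ch in "".join(lines) if ch in "ATUGC")


cleaned = clean_sequence(">T. maritima\nATGaaaCCN\nGGU\n")
```

In this example the lowercase `aaa` acts as a masked region and is discarded together with the ambiguous `N`.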

Sequence

For Pedel-AA calculations, the library size is required.

Size

This is the list of the mutations found. Identifying the mutations can
be done using the Mutantcaller tool.

The format is as follows:

- Each line contains one or more mutations of a variant sampled.
- The mutations can only be in the forms A123C or 123A>C, where the number is irrelevant (and can be omitted).
- A wild type sequence can be indicated with 'wt'. It is not needed for the main calculations and is used solely for the mutational frequency; it is also useful for Pedel.
- Rarer events such as insertions, deletions, duplications, frameshifts and inversions are not taken into account, but their frequency can be easily calculated using the 'values for further analysis' below.
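
The format rules above could be parsed along these lines. This is a minimal sketch: the regular expression, the function name and the assumption that mutations on a line are separated by whitespace are all illustrative and not taken from the tool itself.

```python
import re

# Accepts A123C or 123A>C; the position is optional in both forms.
MUTATION = re.compile(r"^(?:([ATGC])(\d*)([ATGC])|(\d*)([ATGC])>([ATGC]))$")


def parse_variant(line):
    """Return (from_base, to_base) pairs for one variant; [] for wild type."""
    pairs = []
    for token in line.split():  # assumption: whitespace-separated tokens
        if token.lower() == "wt":
            continue
        m = MUTATION.match(token)
        if m is None:
            continue  # indels and other rarer events are skipped
        if m.group(1):
            pairs.append((m.group(1), m.group(3)))
        else:
            pairs.append((m.group(5), m.group(6)))
    return pairs


parsed = parse_variant("A123C 45G>T")
```

A line that yields an empty list (for example `wt`) still counts as one sampled variant with zero point mutations when the mutational load is computed.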

Variants

The simplest estimate of the frequency of mutations per sequence is the
average of the point mutations per sequence (*m*); however, due to
the small sample size this may be off. The distribution of the number of mutations
per sequence follows a PCR distribution, which can be approximated
with a Poisson distribution (Sun, 1995).
In the latter, the mean and the variance are the same (λ, which is unrelated to PCR
efficiency). The *sample* average and variance may differ,
especially at low sampling. The number to trust the most is the λ_{Poisson}.
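
To make the three numbers concrete, here is a small sketch that computes the sample average, the sample variance and a fitted λ. The fit shown is a least-squares grid search of the Poisson probabilities against the observed frequencies; that choice, the grid range and the function names are assumptions for illustration and may differ from the fitting procedure the tool actually uses.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def summarise_load(mutations_per_variant):
    """Return (sample mean, sample variance, fitted lambda)."""
    n = len(mutations_per_variant)
    mean = sum(mutations_per_variant) / n
    variance = sum((x - mean) ** 2 for x in mutations_per_variant) / (n - 1)
    observed = Counter(mutations_per_variant)
    ks = range(max(mutations_per_variant) + 1)

    def sse(lam):
        # squared distance between observed frequencies and the Poisson pmf
        return sum((observed[k] / n - poisson_pmf(k, lam)) ** 2 for k in ks)

    # coarse grid search over 0.01 .. 20.00 (assumed range)
    lam_fit = min((i / 100 for i in range(1, 2001)), key=sse)
    return mean, variance, lam_fit


mean, variance, lam_fit = summarise_load([0, 1, 1, 2, 2, 2, 3, 3, 4, 6])
```

With only ten variants a single heavily mutated clone shifts the average and inflates the variance, whereas a fit to the whole distribution is less sensitive to it.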

The average is **N/A** mutations
per sequence (N/A kb).

The sample variance is **N/A** mutations
per sequence.

The λ_{Poisson} is **N/A** mutations
per sequence.

If the λ_{Poisson} and the average are very different and the
plot is very poor, sequencing more variants from the test library may be
advisable.

Rows represent the wildtype base, while columns the base in the mutant.

From\To | A | T | G | C |
---|---|---|---|---|
A | | | | |
T | | | | |
G | | | | |
C | | | | |

Colour codes | Identity | Purine transition | Pyrimidine transition | Transversion |
---|---|---|---|---|

Proportion of adenine

%

Proportion of thymine

%

Proportion of guanine

%

Proportion of cytosine

%

Data display options | Raw data | Frequency normalised | Strand complementary normalised |
---|---|---|---|

Sequence-composition–corrected incidence of mutations (%):

From/To | A | T | G | C |
---|---|---|---|---|
A | | | | |
T | | | | |
G | | | | |
C | | | | |
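
One plausible way to obtain a sequence-composition-corrected incidence is to divide each raw count by the proportion of its wild-type base in the template and then rescale so that all entries sum to 100%. The sketch below assumes exactly that; the formula and the function name are illustrative assumptions, not a statement of the tool's exact calculation.

```python
def composition_corrected(counts, sequence):
    """Weight counts by 1 / proportion of the wild-type base, as percentages."""
    proportion = {b: sequence.count(b) / len(sequence) for b in "ATGC"}
    weighted = {
        (src, dst): n / proportion[src]
        for (src, dst), n in counts.items()
        if proportion[src] > 0  # a base absent from the template cannot mutate
    }
    total = sum(weighted.values())
    return {pair: 100 * w / total for pair, w in weighted.items()}


corrected = composition_corrected({("A", "G"): 3, ("G", "A"): 1}, "AAAG")
```

Here the template is three quarters adenine, so three A to G events and one G to A event correspond to the same per-base incidence.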

Indicator | Calculated | Estimated error |
---|---|---|
Ts/Tv | | |
AT→GC/GC→AT | | |
A→N, T→N (%) | | |
G→N, C→N (%) | | |
AT→GC (%) | | |
GC→AT (%) | | |
Transitions (%) total | | |
A→G, T→C (%) | | |
G→A, C→T (%) | | |
Transversions (%) total | | |
A→T, T→A (%) | | |
A→C, T→G (%) | | |
G→C, C→G (%) | | |
G→T, C→A (%) | | |
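
A few of the indicators in the table can be sketched as follows. The sketch assumes that AT→GC means the sum of A→G and T→C and that GC→AT means the sum of G→A and C→T; the function names and the dictionary keys are hypothetical, and the estimated errors are not computed here.

```python
PURINES = {"A", "G"}


def is_transition(src, dst):
    """Purine to purine or pyrimidine to pyrimidine."""
    return (src in PURINES) == (dst in PURINES)


def ratio(a, b):
    return a / b if b else float("inf")


def bias_indicators(freq):
    """`freq` maps (from_base, to_base) to a count or frequency."""
    freq = {k: v for k, v in freq.items() if k[0] != k[1]}  # drop identities
    total = sum(freq.values())
    ts = sum(v for (s, d), v in freq.items() if is_transition(s, d))
    tv = total - ts
    at_gc = freq.get(("A", "G"), 0) + freq.get(("T", "C"), 0)  # assumed meaning
    gc_at = freq.get(("G", "A"), 0) + freq.get(("C", "T"), 0)  # assumed meaning
    from_at = sum(v for (s, _), v in freq.items() if s in "AT")
    return {
        "Ts/Tv": ratio(ts, tv),
        "AT->GC/GC->AT": ratio(at_gc, gc_at),
        "A->N, T->N (%)": 100 * from_at / total,
        "G->N, C->N (%)": 100 * (total - from_at) / total,
    }


indicators = bias_indicators(
    {("A", "G"): 4, ("T", "C"): 2, ("G", "A"): 3, ("A", "T"): 1, ("G", "C"): 2}
)
```

Ratios whose denominator is zero are returned as infinity in this sketch, which is a common outcome when only a handful of variants have been sequenced.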

For details about Pedel-AA see the Pedel-AA homepage.