Glossary

Commonly used technical words within this site

  • Mutational load Average number of mutations per variant in the library. This can be per gene or per kb.
  • Mutational spectrum The distribution of mutation types (e.g. A to C).
  • Mutational bias Various metrics that assess the divergence from the ideal situation where a nucleotide position has an equal probability of mutating into any of the other three.
  • Error rate per division The average number of de novo mutations per kb that occur during one PCR duplication cycle.
  • Transversions A mutation where a purine (adenine, guanine) becomes a pyrimidine (thymine/uracil, cytosine) or vice versa.
  • Transitions A mutation where the class of nucleobase (purine or pyrimidine) is unchanged.
  • Doubling number Number of duplications during PCR, which differs from the PCR cycle number.
  • Missense mutation A mutation that results in an amino acid change. These are good mutations for directed evolution.
  • Nonsense mutation A mutation that results in a premature stop. These are bad mutations for directed evolution.
  • Synonymous mutation A mutation that alters the codon into another that codes for the same amino acid. About a quarter of changes in an equiprobable scenario are synonymous.
  • Error-prone PCR A PCR reaction with a polymerase that lacks proofreading ability and has lowered fidelity due to amino acid changes (Mutazyme), a cofactor (manganese) or the presence of wobbly dNTPs (8-oxo-dGTP and dPTP). This differs from PCR where the primers carry the variation (QuikChange-like).
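
The missense, nonsense and synonymous definitions above can be made concrete by enumerating every single-nucleotide change of every sense codon. The following is a minimal sketch, assuming the standard genetic code and an equiprobable spectrum (each base equally likely to become any of the other three, all codons equally used); the function name is made up for this illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitutions():
    """Count single-nucleotide changes of the 61 sense codons by outcome."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # skip stop codons as starting points
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1
                else:
                    counts["missense"] += 1
    return counts


counts = classify_substitutions()
total = sum(counts.values())  # 61 codons x 9 possible changes = 549
for kind, n in counts.items():
    print(f"{kind}: {n} ({n / total:.1%})")
# synonymous: 134 (24.4%)
# missense: 392 (71.4%)
# nonsense: 23 (4.2%)
```

Real genes have uneven codon usage and real methods have biased spectra, so these fractions are only a baseline.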

FAQ

Answers to questions you might have.

If you are still stuck, why not leave a comment?

  • What makes a good library?

    A good library for directed evolution is diverse and targeted.
    The target of the library can range from a few residues to a whole region: the former is done with degenerate primers, while the latter is done by error-prone PCR, although the two are sometimes combined.

    In terms of degenerate primers (or, more correctly, primers with degenerate nucleotides, also called QuikChange primers after the original kit), the app GlueIT can help in determining which degenerate codons to use and how many positions to reasonably alter, while the app MutantPrimers can help by designing the primers. Unfortunately, the annealing temperature of the PCR reaction plays a large part, so the app QCCC was made to determine the extent of randomisation.

    In terms of error-prone PCR, the starting template and nucleotide balance can be altered to achieve different mutational loads and biases.
    First, the amount of starting template controls the average mutational load of the library: less template means more doublings to reach the same yield, and hence more mutations. If the load is too low, there will be a large redundancy in the library, including a significant fraction of wild-type sequences, but if it is too high, the library will be dominated by deleterious mutations that mask beneficial ones.
    Second, the PCR method employed has a profound effect on the resulting mutational spectrum: the more biased the spectrum, the less diversity is present.
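
    The link between starting template and mutational load can be sketched numerically. This is only an illustration under two assumptions that are not spelled out on this page: that the doubling number is log2(product/template) for the amplified target, and that the load is roughly the error rate per doubling multiplied by the doubling number. The function names are hypothetical, and the 0.9 figure is the approximate Mutazyme rate quoted in this FAQ.

```python
import math


def doublings(template_ng: float, product_ng: float) -> float:
    """Number of duplications needed to go from template to product mass.

    Assumes both masses refer to the amplified target only
    (not the whole plasmid it may sit in).
    """
    return math.log2(product_ng / template_ng)


def mutational_load(error_rate_per_kb_per_doubling: float,
                    template_ng: float,
                    product_ng: float) -> float:
    """Approximate mutations per kb, assuming load = rate x doublings."""
    return error_rate_per_kb_per_doubling * doublings(template_ng, product_ng)


# Same yield (1000 ng), different amounts of starting target DNA.
for template in (100, 10, 1):
    load = mutational_load(0.9, template, 1000)
    print(f"{template:>4} ng template -> {load:.1f} mutations/kb")
```

    Cutting the template tenfold adds about 3.3 doublings, so with these assumed numbers the load rises by roughly 3 mutations per kb each time.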

  • How many mutations should I have?

    Ideally, high enough that you have little redundancy, while low enough that you have most single amino acid variants accessible via a single mutation, which is somewhere around 5 mutations per kb (see PedelAA for your specific case).

    The thing to remember is that the number of mutations per sequence follows a Poisson distribution: with an average mutational load of 1.0, 37% of your library will have zero mutations and only 37% will have a single nucleotide mutation, as opposed to every sequence carrying exactly one. Also, about 70% of nucleotide mutations are missense mutations, so a load of 1 mutation per kb (for a 1 kb gene) leaves just under half of the library wild type at the amino acid level.
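
    The Poisson figures can be checked with a few lines of Python. This is a minimal sketch: the missense fraction of 0.7 is an assumption (it is the value used in the dead-fraction estimate elsewhere in this FAQ), and the function name is made up for the example.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k mutations when the average is `mean`."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


load = 1.0  # average nucleotide mutations per sequence
for k in range(4):
    print(f"P({k} mutations) = {poisson_pmf(k, load):.3f}")
# P(0 mutations) = 0.368
# P(1 mutations) = 0.368
# P(2 mutations) = 0.184
# P(3 mutations) = 0.061

# Wild type at the amino acid level: zero missense mutations.
MISSENSE_FRACTION = 0.7  # assumed fraction of nucleotide changes that are missense
print(f"{poisson_pmf(0, load * MISSENSE_FRACTION):.3f}")  # 0.497
```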

    There are several papers that explore the fraction of mutations that are beneficial, neutral or deleterious. It depends on the protein and on whether one is looking at it from a structural (ΔΔG) or a catalytic (ΔΔG‡) point of view. But roughly, 1-10% are beneficial, 10-20% are deleterious and the rest neutral (to see an example or calculate on your own protein see landscape app). Therefore, for every 1.4 nucleotide mutations added to the load (about one amino acid change), the fraction of non-dead protein decreases by 20% (or whatever the value is), so at an average of 5 nucleotide mutations about 54% of the library is dead (1 − 0.8^(5*0.7)). But the rest are neutral, or neutral with beneficial mutations!
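
    The expression 0.8^(5*0.7) can be evaluated directly for other loads or other deleterious fractions. The sketch below simply wraps that back-of-the-envelope formula; the function name and default values are assumptions for illustration, and it treats every sequence as carrying exactly the average number of mutations.

```python
def fraction_alive(nt_mutations: float,
                   missense_fraction: float = 0.7,
                   deleterious_fraction: float = 0.2) -> float:
    """Fraction of the library expected to remain functional.

    Each amino acid change is assumed to kill the protein independently
    with probability `deleterious_fraction`.
    """
    aa_changes = nt_mutations * missense_fraction
    return (1 - deleterious_fraction) ** aa_changes


for load in (1, 3, 5, 10):
    alive = fraction_alive(load)
    print(f"load {load:>2}: {alive:.0%} alive, {1 - alive:.0%} dead")
# load  5: 46% alive, 54% dead
```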

  • What error-prone PCR method should I use?

    It depends on the desired mutational load, on the importance of mutational biases... and how much time you are willing to troubleshoot.

    The simplest and cleanest method is using the error-prone enzyme Mutazyme (Agilent GeneMorph kit; technically Pfu PolB-Sso 7d D215A D473A), which has a low error rate (about 0.9/doubling/kb), but is less biased and is less likely to give a smear or no PCR product on an agarose gel (although it is nowhere near as robust as, say, a normal Q5 reaction).

    Another method is manganese mutagenesis, where up to 5 mM manganese is added to a Taq reaction. This results in a high error rate, but the product is highly biased towards adenine mutations, so it is often counterbalanced by using unequal dNTP concentrations.

    A third method is using nucleotide analogues that increase mutations, such as 8-oxo-dGTP or dPTP (Jena Bioscience kit), which result in a very high mutational load.

    Unfortunately, the latter two strongly affect PCR yields, which means that one cannot mix and match, say, manganese with Mutazyme. Similarly, mutation shuffling methods involve a sensitive PCR step, and neither DNA shuffling nor StEP works with epPCR to a satisfactory degree.

    A recent development is ordering a synthetic library that scans all mutations or similar. These libraries completely circumvent the above issues, but are expensive and require careful study of the details of the final product to make sure that its limitations don't interfere. For example, a scanning library, in which each variant carries only a single mutation, cannot reveal epistatic effects.

  • I got no mutations...

    If you believe it is something wrong with our code, please contact us. Here are some pointers:

    • If the sequences look weird, you may have a contaminant from your cloning steps.
    • Was the PCR yield low? A 50 µl reaction should give a yield of roughly 0.5–2 µg (e.g. 20 µl of 20–100 ng/µl).
    • Check the manual or protocol for the method.
    • When the manganese stock was made, were the crystals a cool pink? Silly as it sounds, getting manganese and magnesium mixed up is a common mistake.
    • etc.
  • What about Sun's PCR distribution?

    Because PCR efficiency is not a straightforward parameter to obtain and can be strongly misleading, it was not ported to this project, to avoid beguiling the user with false data. For more see: Matteo's blogpost on PCR distribution
  • Is mutational bias a big deal?

    Unfortunately, yes: it both has a strong effect on diversity and is a strong phenomenon.

    If the mutational spectrum differs greatly from an equiprobable scenario, the diversity is greatly reduced. Some methods of introducing random mutations favour certain mutations so much that certain desired amino acid changes, such as glutamine to glutamate, become very unlikely. For more see: Matteo's blogpost on mutational biases
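
    The glutamine to glutamate case can be traced back to the nucleotide level. The sketch below, assuming the standard genetic code, lists every single-nucleotide route between two amino acids and labels it as a transition or transversion; the function names are made up for this illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
PURINES = {"A", "G"}


def substitution_type(old: str, new: str) -> str:
    """Transition if the base class (purine/pyrimidine) is kept, else transversion."""
    return "transition" if (old in PURINES) == (new in PURINES) else "transversion"


def single_nt_routes(from_aa: str, to_aa: str):
    """All single-nucleotide changes that turn a from_aa codon into a to_aa codon."""
    routes = []
    for codon, aa in CODON_TABLE.items():
        if aa != from_aa:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == to_aa:
                    change = f"{codon[pos]}>{base}"
                    routes.append((codon, mutant, change,
                                   substitution_type(codon[pos], base)))
    return routes


for route in single_nt_routes("Q", "E"):
    print(route)
# ('CAA', 'GAA', 'C>G', 'transversion')
# ('CAG', 'GAG', 'C>G', 'transversion')
```

    Both routes need a C to G change on the coding strand (G to C on the other strand), so a method that rarely produces that transversion will rarely produce this substitution.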
  • I transformed into a competent strain then into my test strain, what's my library size?

    Library size is the number of variants counted or estimated after the library construction transformation. This is not the population size of the culture, which consists of copies of the original transformants (unless using the MP6 plasmid), so no unique variants are added. Often the test strain is sick and poorly competent, so two transformations are done: the library is assembled, transformed into a high-competency strain (e.g. DH5α, TOP10), plasmid prepped and re-transformed into the test strain. In this case the library size is based on the original transformation, as no new variants are added; worse still, the second transformation is a bottleneck that will skew the library.
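
    The effect of the second transformation can be estimated roughly. The sketch below assumes, optimistically, that every variant is equally abundant in the plasmid prep, in which case drawing N transformants from L variants is expected to recover about L(1 − e^(−N/L)) distinct ones. Real preps are skewed, so treat the result as an upper bound; the function name is made up for this illustration.

```python
import math


def expected_distinct(library_size: float, transformants: float) -> float:
    """Expected number of distinct variants recovered after re-transformation.

    Assumes all variants are equally abundant in the plasmid prep.
    """
    # -expm1(-x) equals 1 - exp(-x), computed accurately for small x.
    return library_size * -math.expm1(-transformants / library_size)


library = 1e5  # variants counted after the first transformation
for n in (1e4, 1e5, 1e6):
    kept = expected_distinct(library, n)
    print(f"{n:.0e} test-strain transformants -> {kept / library:.0%} of variants kept")
```

    Under this assumption, a second transformation of the same size as the library retains only about 63% of the variants, and roughly tenfold oversampling is needed to keep nearly all of them.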