DNA replication errors that escape polymerase proofreading and mismatch repair (MMR) can lead to base substitution and frameshift mutations. Such mutations can disrupt gene function, reduce fitness, and promote diseases such as cancer, and are also the raw material of molecular evolution. To analyze with limited bias genomic features associated with DNA polymerase errors, we performed a genome-wide analysis of mutations that accumulate in MMR-deficient diploid lines of Saccharomyces cerevisiae. These lines were derived from a common ancestor and were grown for 160 generations, with bottlenecks reducing the population to one cell every twenty generations. We sequenced to between eight and twenty-fold coverage one wild-type and three mutator lines using Illumina Solexa 36-bp reads. Using an experimentally aware Bayesian genotype caller developed to pool experimental data across sequencing runs for all strains, we detected 28 heterozygous single-nucleotide polymorphisms (SNPs) and 48 single nt insertion/deletions (indels) from the data set. This method was evaluated on simulated data sets and found to have a very low false positive rate (~6 x 10(-5)) and a false negative rate of 0.08 within the unique mapping regions of the genome that contained at least seven-fold coverage. The heterozygous mutations identified by the Bayesian genotype caller were confirmed by Sanger sequencing. All of the mutations were unique to a given line, except for a single nt deletion mutation which occurred independently in two lines. All 48 indels, comprised of 46 deletions and two insertions, occurred in homopolymer tracts (i.e., 47 poly A or T tracts, 1 poly G or C tract) between five and thirteen base pairs long. Our findings are of interest because HP tracts are present at high levels in the yeast genome (> 77,400 for five to twenty nt HP tracts), and frameshift mutations in these regions are likely to disrupt gene function. In addition, they demonstrate that the mutation pattern seen previously in mismatch repair defective strains using a limited number of reporters holds true for the entire genome.
|Evidence ID||Analyze ID||Interactor||Interactor Systematic Name||Interactor||Interactor Systematic Name||Type||Assay||Annotation||Action||Modification||Phenotype||Source||Reference||Note|
|Evidence ID||Analyze ID||Gene||Gene Systematic Name||Gene Ontology Term||Gene Ontology Term ID||Qualifier||Aspect||Method||Evidence||Source||Assigned On||Reference||Annotation Extension|
|Evidence ID||Analyze ID||Gene||Gene Systematic Name||Phenotype||Experiment Type||Experiment Type Category||Mutant Information||Strain Background||Chemical||Details||Reference|
|Evidence ID||Analyze ID||Regulator||Regulator Systematic Name||Target||Target Systematic Name||Experiment||Conditions||Strain||Source||Reference|