The abundance and identity of functional variation segregating in natural populations are paramount to dissecting the molecular basis of quantitative traits as well as human genetic diseases. Genome sequencing of multiple organisms of the same species provides an efficient means of cataloging rearrangements, insertion or deletion polymorphisms (InDels), and single-nucleotide polymorphisms (SNPs). While inbreeding depression and heterosis imply that a substantial amount of polymorphism is deleterious, distinguishing deleterious from neutral polymorphism remains a significant challenge. To identify deleterious and neutral DNA sequence variation within Saccharomyces cerevisiae, we sequenced the genomes of a vineyard strain and an oak tree strain and compared them to a reference genome. Among these three strains, 6% of the genome is variable, mostly attributable to variation in genome content that results from large InDels. Out of the 88,000 polymorphisms identified, 93% are SNPs, and a small but significant fraction can be attributed to recent interspecific introgression and ectopic gene conversion. In comparison to the reference genome, there is substantial evidence for functional variation in gene content and structure that results from large InDels, frame-shifts, and polymorphic start and stop codons. Comparison of polymorphism to divergence reveals scant evidence for positive selection but abundant evidence for deleterious SNPs. We estimate that 12% of coding and 7% of noncoding SNPs are deleterious. Based on divergence among 11 yeast species, we identified 1,666 nonsynonymous SNPs that disrupt conserved amino acids and 1,863 noncoding SNPs that disrupt conserved noncoding motifs. The deleterious coding SNPs include those known to affect quantitative traits, and a subset of the deleterious noncoding SNPs occurs in the promoters of genes that show allele-specific expression, implying that some cis-regulatory SNPs are deleterious. Our results show that the genome sequences of both closely and distantly related species provide a means of identifying deleterious polymorphisms that disrupt functionally conserved coding and noncoding sequences.