We introduce a computational method for identifying subcellular locations of proteins from the phylogenetic distribution of the homologs of organellar proteins. This method is based on the observation that proteins localized to a given organelle by experiments tend to share a characteristic phylogenetic distribution of their homologs (a phylogenetic profile). Therefore, any other protein can be localized by its phylogenetic profile. Application of this method to mitochondrial proteins reveals that nucleus-encoded proteins previously known to be destined for mitochondria fall into three groups: prokaryote-derived, eukaryote-derived, and organism-specific (i.e., found only in the organism under study). Prokaryote-derived mitochondrial proteins can be identified effectively by their phylogenetic profiles. In the yeast Saccharomyces cerevisiae, 361 nucleus-encoded mitochondrial proteins can be identified at 50% accuracy with 58% coverage. From these values and the proportion of conserved mitochondrial genes, it can be inferred that approximately 630 genes, or 10% of the nuclear genome, are devoted to mitochondrial function. In the worm Caenorhabditis elegans, we estimate that there are approximately 660 nucleus-encoded mitochondrial genes, or 4% of its genome, with approximately 400 of these genes contributed from the prokaryotic mitochondrial ancestor. The large fraction of organism-specific and eukaryote-derived genes suggests that mitochondria perform specialized roles absent from prokaryotic mitochondrial ancestors. We observe measurably distinct phylogenetic profiles among proteins from different subcellular compartments, allowing the general use of prokaryotic genomes in learning features of eukaryotic proteins.
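To make the idea of localizing a protein by its phylogenetic profile concrete, the following is a minimal sketch, not the classifier used in this work. It assumes a profile is a binary vector recording the presence (1) or absence (0) of a homolog in each reference genome, and it substitutes a simple nearest-neighbor rule for whatever scoring the method actually uses. The names `similarity`, `predict_location`, and `LABELED`, along with all protein names and profile values, are hypothetical toy data.

```python
from typing import Dict, Sequence, Tuple

Profile = Tuple[int, ...]  # 1 = homolog present in that genome, 0 = absent


def similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of reference genomes at which two profiles agree."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same set of genomes")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict_location(query: Sequence[int],
                     labeled: Dict[str, Tuple[str, Profile]]) -> str:
    """Return the location of the labeled protein whose profile is most
    similar to the query (nearest-neighbor rule, an assumption here)."""
    location, _ = max(labeled.values(),
                      key=lambda item: similarity(query, item[1]))
    return location


# Toy data (hypothetical): four reference genomes per profile.
LABELED: Dict[str, Tuple[str, Profile]] = {
    "mito_1": ("mitochondrion", (1, 1, 0, 1)),
    "mito_2": ("mitochondrion", (1, 1, 0, 0)),
    "cyto_1": ("cytoplasm", (0, 0, 1, 1)),
    "nuc_1": ("nucleus", (0, 0, 0, 1)),
}

if __name__ == "__main__":
    # An uncharacterized protein with homologs in all four genomes.
    print(predict_location((1, 1, 1, 1), LABELED))
```

In this toy set the query agrees with `mito_1` at three of four genomes, more than with any other labeled protein, so it is assigned to the mitochondrion. A real application would use many more genomes and would have to calibrate accuracy and coverage against experimentally localized proteins.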