We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization.
Sinorhizobium meliloti is an alpha-proteobacterium that forms agronomically important N2-fixing root nodules in legumes. We report here the complete sequence of the largest constituent of its genome, a 62.7% GC-rich 3,654,135-bp circular chromosome. Annotation allowed assignment of a function to 59% of the 3,341 predicted protein-coding ORFs, the rest exhibiting partial, weak, or no similarity with any known sequence. Unexpectedly, the level of reiteration within this replicon is low, with only two genes duplicated with more than 90% nucleotide sequence identity, transposon elements accounting for 2.2% of the sequence, and a few hundred short repeated palindromic motifs (RIME1, RIME2, and C) widespread over the chromosome. Three regions with a significantly lower GC content are most likely of external origin. Detailed annotation revealed that this replicon contains all housekeeping genes except two essential genes that are located on pSymB. Amino acid/peptide transport and degradation and sugar metabolism appear as two major features of the S. meliloti chromosome. The presence in this replicon of a large number of nucleotide cyclases with a peculiar structure, as well as of genes homologous to virulence determinants of animal and plant pathogens, opens perspectives in the study of this bacterium both as a free-living soil microorganism and as a plant symbiont.
The scarcity of usable nitrogen frequently limits plant growth. A tight metabolic association with rhizobial bacteria allows legumes to obtain nitrogen compounds by bacterial reduction of dinitrogen (N2) to ammonium (NH4+). We present here the annotated DNA sequence of the alpha-proteobacterium Sinorhizobium meliloti, the symbiont of alfalfa. The tripartite 6.7-megabase (Mb) genome comprises a 3.65-Mb chromosome, and 1.35-Mb pSymA and 1.68-Mb pSymB megaplasmids. Genome sequence analysis indicates that all three elements contribute, in varying degrees, to symbiosis and reveals how this genome may have emerged during evolution. The genome sequence will be useful in understanding the dynamics of interkingdom associations and of life in soil environments.
The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.