The genome of the unicellular cyanobacterium Thermosynechococcus sp. strain NK55a, isolated from the Nakabusa hot spring, Nagano Prefecture, Japan, comprises a single, circular, 2.5-Mb chromosome. The genome is predicted to contain 2,358 protein-encoding genes, including genes for all typical cyanobacterial photosynthetic and metabolic functions. No genes encoding hydrogenases or nitrogenase were identified.
The complete genomes of three strains from the phylum Acidobacteria were compared. Phylogenetic analysis placed them as a unique phylum. They share genomic traits with members of the Proteobacteria, the Cyanobacteria, and the Fungi. The three strains appear to be versatile heterotrophs. Genomic and culture traits indicate the use of carbon sources that span simple sugars to more complex substrates such as hemicellulose, cellulose, and chitin. The genomes encode low-specificity major facilitator superfamily transporters and high-affinity ABC transporters for sugars, suggesting that they are best suited to low-nutrient conditions. They appear capable of nitrate and nitrite reduction but not N2 fixation or denitrification. The genomes contained numerous genes that encode siderophore receptors, but no evidence of siderophore production was found, suggesting that they may obtain iron via interaction with other microorganisms. The presence of cellulose synthesis genes and a large class of novel high-molecular-weight excreted proteins suggests potential traits for desiccation resistance, biofilm formation, and/or contribution to soil structure. Polyketide synthase and macrolide glycosylation genes suggest the production of novel antimicrobial compounds. Genes that encode a variety of novel proteins were also identified. The abundance of acidobacteria in soils worldwide and the breadth of potential carbon use by the sequenced strains suggest significant and previously unrecognized contributions to the terrestrial carbon cycle. Combining our genomic evidence with available culture traits, we postulate that cells of these isolates are long-lived, divide slowly, exhibit slow metabolic rates under low-nutrient conditions, and are well equipped to tolerate fluctuations in soil hydration.
Only five bacterial phyla with members capable of chlorophyll (Chl)-based phototrophy are presently known. Metagenomic data from the phototrophic microbial mats of alkaline siliceous hot springs in Yellowstone National Park revealed the existence of a distinctive bacteriochlorophyll (BChl)-synthesizing, phototrophic bacterium. A highly enriched culture of this bacterium grew photoheterotrophically, synthesized BChls a and c under oxic conditions, and had chlorosomes and type 1 reaction centers. "Candidatus Chloracidobacterium thermophilum" is a BChl-producing member of the poorly characterized phylum Acidobacteria.
Dichelobacter nodosus causes ovine footrot, a disease that leads to severe economic losses in the wool and meat industries. We sequenced its 1.4-Mb genome, the smallest known genome of an anaerobe. It differs markedly from small genomes of intracellular bacteria, retaining greater biosynthetic capabilities and lacking any evidence of extensive ongoing genome reduction. Comparative genomic microarray studies and bioinformatic analysis suggested that, despite its small size, almost 20% of the genome is derived from lateral gene transfer. Most of these regions seem to be associated with virulence. Metabolic reconstruction indicated unsuspected capabilities, including carbohydrate utilization, electron transfer and several aerobic pathways. Global transcriptional profiling and bioinformatic analysis enabled the prediction of virulence factors and cell surface proteins. Screening of these proteins against ovine antisera identified eight immunogenic proteins that are candidate antigens for a cross-protective vaccine.
Clostridium perfringens is a Gram-positive, anaerobic spore-forming bacterium commonly found in soil, sediments, and the human gastrointestinal tract. C. perfringens is responsible for a wide spectrum of disease, including food poisoning, gas gangrene (clostridial myonecrosis), enteritis necroticans, and non-foodborne gastrointestinal infections. The complete genome sequences of Clostridium perfringens strain ATCC 13124, a gas gangrene isolate and the species type strain, and the enterotoxin-producing food poisoning strain SM101, were determined and compared with the published C. perfringens strain 13 genome. Comparison of the three genomes revealed considerable genomic diversity with >300 unique "genomic islands" identified, with the majority of these islands unusually clustered on one replichore. PCR-based analysis indicated that the large genomic islands are widely variable across a large collection of C. perfringens strains. These islands encode genes that correlate to differences in virulence and phenotypic characteristics of these strains. Significant differences between the strains include numerous novel mobile elements and genes encoding metabolic capabilities, strain-specific extracellular polysaccharide capsule, sporulation factors, toxins, and other secreted enzymes, providing substantial insight into this medically important bacterial pathogen.
Coastal aquatic environments are typically more highly productive and dynamic than open ocean ones. Despite these differences, cyanobacteria from the genus Synechococcus are important primary producers in both types of ecosystems. We have found that the genome of a coastal cyanobacterium, Synechococcus sp. strain CC9311, has significant differences from an open ocean strain, Synechococcus sp. strain WH8102, and these are consistent with the differences between their respective environments. CC9311 has a greater capacity to sense and respond to changes in its (coastal) environment. It has a much larger capacity to transport, store, use, or export metals, especially iron and copper. In contrast, phosphate acquisition seems less important, consistent with the higher concentration of phosphate in coastal environments. CC9311 is predicted to have differences in its outer membrane lipopolysaccharide, and this may be characteristic of the speciation of some cyanobacterial groups. In addition, the types of potentially horizontally transferred genes are markedly different between the coastal and open ocean genomes and suggest a more prominent role for phages in horizontal gene transfer in oligotrophic environments.
Pseudomonas syringae pv. phaseolicola, a gram-negative bacterial plant pathogen, is the causal agent of halo blight of bean. In this study, we report on the genome sequence of P. syringae pv. phaseolicola isolate 1448A, which encodes 5,353 open reading frames (ORFs) on one circular chromosome (5,928,787 bp) and two plasmids (131,950 bp and 51,711 bp). Comparative analyses with a phylogenetically divergent pathovar, P. syringae pv. tomato DC3000, revealed a strong degree of conservation at the gene and genome levels. In total, 4,133 ORFs were identified as putative orthologs in these two pathovars using a reciprocal best-hit method, with 3,941 ORFs present in conserved, syntenic blocks. Although these two pathovars are highly similar at the physiological level, they have distinct host ranges; 1448A causes disease in beans, and DC3000 is pathogenic on tomato and Arabidopsis. Examination of the complement of ORFs encoding virulence, fitness, and survival factors revealed a substantial, but not complete, overlap between these two pathovars. Another distinguishing feature between the two pathovars is their distinctive sets of transposable elements. With access to a fifth complete pseudomonad genome sequence, we were able to identify 3,567 ORFs that likely comprise the core Pseudomonas genome and 365 ORFs that are P. syringae specific.
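The reciprocal best-hit criterion mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration and not the pipeline used in the study: the input format, gene identifiers, and scores are hypothetical, and a real analysis would take the scores from an all-against-all sequence similarity search between the two genomes.

```python
def best_hits(hits):
    """Map each query gene to its highest-scoring subject gene.

    hits: iterable of (query, subject, score) tuples (hypothetical format).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy data: made-up gene identifiers and scores, for illustration only.
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0),
          ("a2", "b2", 180.0), ("a3", "b2", 120.0)]
b_vs_a = [("b1", "a1", 245.0), ("b2", "a2", 175.0), ("b2", "a3", 110.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)  # a3 is dropped: its best hit b2 points back to a2, not a3
```

Requiring the match in both directions is what makes the criterion conservative: a gene whose best hit prefers a different partner, as a3 does here, is not reported as a putative ortholog.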
The completion of the 5,373,180-bp genome sequence of the marine psychrophilic bacterium Colwellia psychrerythraea 34H, a model for the study of life in permanently cold environments, reveals capabilities important to carbon and nutrient cycling, bioremediation, production of secondary metabolites, and cold-adapted enzymes. From a genomic perspective, cold adaptation is suggested in several broad categories involving changes to the cell membrane fluidity, uptake and synthesis of compounds conferring cryotolerance, and strategies to overcome temperature-dependent barriers to carbon uptake. Modeling of three-dimensional protein homology from bacteria representing a range of optimal growth temperatures suggests changes to proteome composition that may enhance enzyme effectiveness at low temperatures. Comparative genome analyses suggest that the psychrophilic lifestyle is most likely conferred not by a unique set of genes but by a collection of synergistic changes in overall genome content and amino acid composition.
Pseudomonas fluorescens Pf-5 is a plant commensal bacterium that inhabits the rhizosphere and produces secondary metabolites that suppress soilborne plant pathogens. The complete sequence of the 7.1-Mb Pf-5 genome was determined. We analyzed repeat sequences to identify genomic islands that, together with other approaches, suggested P. fluorescens Pf-5's recent lateral acquisitions include six secondary metabolite gene clusters, seven phage regions and a mobile genomic island. We identified various features that contribute to its commensal lifestyle on plants, including broad catabolic and transport capabilities for utilizing plant-derived compounds, the apparent ability to use a diversity of iron siderophores, detoxification systems to protect from oxidative stress, and the lack of a type III secretion system and toxins found in related pathogens. In addition to six known secondary metabolites produced by P. fluorescens Pf-5, three novel secondary metabolite biosynthesis gene clusters were also identified that may contribute to the biocontrol properties of P. fluorescens Pf-5.
BACKGROUND: The Trace Archive is a repository for the raw, unanalyzed data generated by large-scale genome sequencing projects. The existence of this data offers scientists the possibility of discovering additional genomic sequences beyond those originally sequenced. In particular, if the source DNA for a sequencing project came from a species that was colonized by another organism, then the project may yield substantial amounts of genomic DNA, including near-complete genomes, from the symbiotic or parasitic organism. RESULTS: By searching the publicly available repository of DNA sequencing trace data, we discovered three new species of the bacterial endosymbiont Wolbachia pipientis in three different species of fruit fly: Drosophila ananassae, D. simulans, and D. mojavensis. We extracted all sequences with partial matches to a previously sequenced Wolbachia strain and assembled those sequences using customized software. For one of the three new species, the data recovered were sufficient to produce an assembly that covers more than 95% of the genome; for a second species the data produce the equivalent of a 'light shotgun' sampling of the genome, covering an estimated 75-80% of the genome; and for the third species the data cover approximately 6-7% of the genome. CONCLUSIONS: The results of this study reveal an unexpected benefit of depositing raw data in a central genome sequence repository: new species can be discovered within this data. The differences between these three new Wolbachia genomes and the previously sequenced strain revealed numerous rearrangements and insertions within each lineage and hundreds of novel genes. The three new genomes, with annotation, have been deposited in GenBank.
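Coverage figures such as those quoted in the abstract above describe the share of a genome spanned by assembled sequence. The sketch below shows one simple way to compute such a fraction from alignment intervals on a reference. It is an assumption-laden illustration rather than the authors' method: the function name, the half-open interval convention, and the toy coordinates are all hypothetical.

```python
def covered_fraction(intervals, genome_length):
    """Fraction of positions in [0, genome_length) covered by at least one interval.

    intervals: iterable of (start, end) pairs, half-open, in reference coordinates.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        # Clamp to the reference and skip empty intervals.
        start, end = max(start, 0), min(end, genome_length)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


# Toy example: three alignments on a 1,000-bp reference, two of them overlapping.
fraction = covered_fraction([(0, 400), (300, 700), (900, 1000)], 1000)
print(fraction)
```

Merging overlapping intervals before summing keeps positions hit by several sequences from being counted more than once.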
Dehalococcoides ethenogenes is the only bacterium known to reductively dechlorinate the groundwater pollutants, tetrachloroethene (PCE) and trichloroethene, to ethene. Its 1,469,720-base pair chromosome contains large dynamic duplicated regions and integrated elements. Genes encoding 17 putative reductive dehalogenases, nearly all of which were adjacent to genes for transcription regulators, and five hydrogenase complexes were identified. These findings, plus a limited repertoire of other metabolic modes, indicate that D. ethenogenes is highly evolved to utilize halogenated organic compounds and H2. Diversification of reductive dehalogenase functions appears to have been mediated by recent genetic exchange and amplification. Genome analysis provides insights into the organism's complex nutrient requirements and suggests that an ancestor was a nitrogen-fixing autotroph.
The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and also limits genome-wide screens for vaccine candidates or for antimicrobial targets. We have generated the genomic sequence of six strains representing the five major disease-causing serotypes of Streptococcus agalactiae, the main cause of neonatal infection in humans. Analysis of these genomes and those available in databases showed that the S. agalactiae species can be described by a pan-genome consisting of a core genome shared by all isolates, accounting for approximately 80% of any single genome, plus a dispensable genome consisting of partially shared and strain-specific genes. Mathematical extrapolation of the data suggests that the gene reservoir available for inclusion in the S. agalactiae pan-genome is vast and that unique genes will continue to be identified even after sequencing hundreds of genomes.
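The pan-genome partition described in the abstract above (a core genome shared by all isolates plus a dispensable genome of partially shared and strain-specific genes) maps directly onto set operations. The following is a minimal sketch under stated assumptions: genes are taken to be already grouped into families with shared identifiers, and the strain names and gene labels are invented for illustration. It does not reproduce the mathematical extrapolation used in the study.

```python
def pan_genome(strains):
    """Split gene families into pan, core, and dispensable sets.

    strains: dict mapping strain name -> set of gene-family identifiers
             (must contain at least one strain).
    """
    gene_sets = list(strains.values())
    pan = set().union(*gene_sets)            # found in at least one strain
    core = set.intersection(*gene_sets)      # found in every strain
    dispensable = pan - core                 # partially shared or strain-specific
    return pan, core, dispensable


def strain_specific(strains, name):
    """Gene families present in the named strain and in no other strain."""
    others = set().union(*(genes for s, genes in strains.items() if s != name))
    return strains[name] - others


# Toy data: hypothetical strains and gene families.
strains = {
    "s1": {"g1", "g2", "g3", "g4"},
    "s2": {"g1", "g2", "g3", "g5"},
    "s3": {"g1", "g2", "g4", "g6"},
}
pan, core, dispensable = pan_genome(strains)
core_share_s1 = len(core) / len(strains["s1"])
print(sorted(core), sorted(dispensable), strain_specific(strains, "s2"), core_share_s1)
```

In this toy set the core makes up half of strain s1, whereas the abstract reports roughly 80% for S. agalactiae; the numbers here carry no biological meaning.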
We report here the sequencing and analysis of the genome of the thermophilic bacterium Carboxydothermus hydrogenoformans Z-2901. This species is a model for studies of hydrogenogens, which are diverse bacteria and archaea that grow anaerobically utilizing carbon monoxide (CO) as their sole carbon source and water as an electron acceptor, producing carbon dioxide and hydrogen as waste products. Organisms that make use of CO do so through carbon monoxide dehydrogenase complexes. Remarkably, analysis of the genome of C. hydrogenoformans reveals the presence of at least five highly differentiated anaerobic carbon monoxide dehydrogenase complexes, which may in part explain how this species is able to grow so much more rapidly on CO than many other species. Analysis of the genome also has provided many general insights into the metabolism of this organism which should make it easier to use it as a source of biologically produced hydrogen gas. One surprising finding is the presence of many genes previously found only in sporulating species in the phylum Firmicutes. Although this species is also a member of the Firmicutes, it was not previously known to sporulate. Here we show that it does sporulate and, because it is missing many of the genes involved in sporulation in other species, this organism may serve as a "minimal" model for sporulation studies. In addition, using phylogenetic profile analysis, we have identified many uncharacterized gene families found in all known sporulating Firmicutes, but not in any non-sporulating bacteria, including a sigma factor not previously known to be involved in sporulation.
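The phylogenetic profile filter described in the abstract above (gene families present in all sporulating genomes and absent from all non-sporulating ones) can be illustrated as follows. This is a sketch only: the profile format, family names, and genome labels are hypothetical, and the study's actual procedure is not specified beyond the abstract.

```python
def families_matching_trait(profiles, with_trait, without_trait):
    """Return families found in every genome with the trait and in none without it.

    profiles: dict mapping gene family -> set of genomes in which it occurs.
    with_trait / without_trait: sets of genome names (hypothetical labels).
    """
    matches = []
    for family, genomes in profiles.items():
        in_all_with = with_trait <= genomes          # present in every trait genome
        in_none_without = not (genomes & without_trait)
        if in_all_with and in_none_without:
            matches.append(family)
    return sorted(matches)


# Toy presence/absence profiles; "sp*" stand for sporulating genomes,
# "ns*" for non-sporulating ones. All names are invented.
profiles = {
    "famA": {"sp1", "sp2", "sp3"},           # matches the trait exactly
    "famB": {"sp1", "sp2", "sp3", "ns1"},    # also in a non-sporulator
    "famC": {"sp1", "sp2"},                  # missing from one sporulator
    "famD": {"ns1", "ns2"},                  # only in non-sporulators
}
candidates = families_matching_trait(
    profiles, with_trait={"sp1", "sp2", "sp3"}, without_trait={"ns1", "ns2"}
)
print(candidates)
```

Only famA survives both conditions, which is the kind of family the abstract flags as a candidate for involvement in sporulation.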
Desulfovibrio vulgaris Hildenborough is a model organism for studying the energy metabolism of sulfate-reducing bacteria (SRB) and for understanding the economic impacts of SRB, including biocorrosion of metal infrastructure and bioremediation of toxic metal ions. The 3,570,858 base pair (bp) genome sequence reveals a network of novel c-type cytochromes, connecting multiple periplasmic hydrogenases and formate dehydrogenases, as a key feature of its energy metabolism. The relative arrangement of genes encoding enzymes for energy transduction, together with inferred cellular location of the enzymes, provides a basis for proposing an expansion to the 'hydrogen-cycling' model for increasing energy efficiency in this bacterium. Plasmid-encoded functions include modification of cell surface components, nitrogen fixation and a type-III protein secretion system. This genome sequence represents a substantial step toward the elucidation of pathways for reduction (and bioremediation) of pollutants such as uranium and chromium and offers a new starting point for defining this organism's complex anaerobic respiration.
Since the recognition of prokaryotes as essential components of the oceanic food web, bacterioplankton have been acknowledged as catalysts of most major biogeochemical processes in the sea. Studying heterotrophic bacterioplankton has been challenging, however, as most major clades have never been cultured or have only been grown to low densities in sea water. Here we describe the genome sequence of Silicibacter pomeroyi, a member of the marine Roseobacter clade (Fig. 1), the relatives of which comprise approximately 10-20% of coastal and oceanic mixed-layer bacterioplankton. This first genome sequence from any major heterotrophic clade consists of a chromosome (4,109,442 base pairs) and megaplasmid (491,611 base pairs). Genome analysis indicates that this organism relies upon a lithoheterotrophic strategy that uses inorganic compounds (carbon monoxide and sulphide) to supplement heterotrophy. Silicibacter pomeroyi also has genes advantageous for associations with plankton and suspended particles, including genes for uptake of algal-derived compounds, use of metabolites from reducing microzones, rapid growth and cell-density-dependent regulation. This bacterium has a physiology distinct from that of marine oligotrophs, adding a new strategy to the recognized repertoire for coping with a nutrient-poor ocean.
The genomes of three strains of Listeria monocytogenes that have been associated with food-borne illness in the USA were subjected to whole genome comparative analysis. A total of 51, 97 and 69 strain-specific genes were identified in L. monocytogenes strains F2365 (serotype 4b, cheese isolate), F6854 (serotype 1/2a, frankfurter isolate) and H7858 (serotype 4b, meat isolate), respectively. Eighty-three genes were restricted to serotype 1/2a and 51 to serotype 4b strains. These strain- and serotype-specific genes probably contribute to observed differences in pathogenicity, and the ability of the organisms to survive and grow in their respective environmental niches. The serotype 1/2a-specific genes include an operon that encodes the rhamnose biosynthetic pathway that is associated with teichoic acid biosynthesis, as well as operons for five glycosyl transferases and an adenine-specific DNA methyltransferase. A total of 8,603 and 105,050 high quality single nucleotide polymorphisms (SNPs) were found on the draft genome sequences of strain H7858 and strain F6854, respectively, when compared with strain F2365. Whole genome comparative analyses revealed that the L. monocytogenes genomes are essentially syntenic, with the majority of genomic differences consisting of phage insertions, transposable elements and SNPs.
The complete genome sequence of Burkholderia mallei ATCC 23344 provides insight into this highly infectious bacterium's pathogenicity and evolutionary history. B. mallei, the etiologic agent of glanders, has come under renewed scientific investigation as a result of recent concerns about its past and potential future use as a biological weapon. Genome analysis identified a number of putative virulence factors whose function was supported by comparative genome hybridization and expression profiling of the bacterium in hamster liver in vivo. The genome contains numerous insertion sequence elements that have mediated extensive deletions and rearrangements of the genome relative to Burkholderia pseudomallei. The genome also contains a vast number (>12,000) of simple sequence repeats. Variation in simple sequence repeats in key genes can provide a mechanism for generating antigenic variation that may account for the mammalian host's inability to mount a durable adaptive immune response to a B. mallei infection.
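Simple sequence repeats of the kind counted in the abstract above are short motifs repeated in tandem. A minimal way to locate them in a DNA string is a backreference regular expression, sketched below. The thresholds (motifs of up to 6 bp, at least 3 tandem copies) and the function name are assumptions chosen for illustration; the abstract does not state the criteria used in the study.

```python
import re


def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Find tandem repeats of short motifs in a DNA sequence.

    Returns a list of (start, unit, copies) with 0-based start positions.
    max_unit and min_repeats are illustrative defaults, not published criteria.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # Lazy unit match so the shortest repeating motif is reported,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    results = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        results.append((match.start(), unit, copies))
    return results


# Toy sequence containing (AC)x4 and (A)x5; the lone "TT" is below threshold.
ssrs = find_ssrs("GGTACACACACTTAAAAAGC")
print(ssrs)
```

Because `finditer` returns non-overlapping matches, each stretch of sequence is assigned to at most one repeat; comparing the copy number of the same repeat across strains is then a matter of comparing the third field.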
We sequenced the complete genome of Bacillus cereus ATCC 10987, a non-lethal dairy isolate in the same genetic subgroup as Bacillus anthracis. Comparison of the chromosomes demonstrated that B. cereus ATCC 10987 was more similar to B. anthracis Ames than to B. cereus ATCC 14579, while containing a number of unique metabolic capabilities such as urease and xylose utilization and lacking the ability to utilize nitrate and nitrite. Additionally, genetic mechanisms for variation of capsule carbohydrate and flagella surface structures were identified. Bacillus cereus ATCC 10987 contains a single large plasmid (pBc10987), of approximately 208 kb, that is similar in gene content and organization to B. anthracis pXO1 but is lacking the pathogenicity-associated island containing the anthrax lethal and edema toxin complex genes. The chromosomal similarity of B. cereus ATCC 10987 to B. anthracis Ames, as well as the fact that it contains a large pXO1-like plasmid, may make it a possible model for studying B. anthracis plasmid biology and regulatory cross-talk.
Methanotrophs are ubiquitous bacteria that can use the greenhouse gas methane as a sole carbon and energy source for growth, thus playing major roles in global carbon cycles, and in particular, substantially reducing emissions of biologically generated methane to the atmosphere. Despite their importance, and in contrast to organisms that play roles in other major parts of the carbon cycle such as photosynthesis, no genome-level studies have been published on the biology of methanotrophs. We report the first complete genome sequence to our knowledge from an obligate methanotroph, Methylococcus capsulatus (Bath), obtained by the shotgun sequencing approach. Analysis revealed a 3.3-Mb genome highly specialized for a methanotrophic lifestyle, including redundant pathways predicted to be involved in methanotrophy and duplicated genes for essential enzymes such as the methane monooxygenases. We used phylogenomic analysis, gene order information, and comparative analysis with the partially sequenced methylotroph Methylobacterium extorquens to detect genes of unknown function likely to be involved in methanotrophy and methylotrophy. Genome analysis suggests the ability of M. capsulatus to scavenge copper (including a previously unreported nonribosomal peptide synthetase) and to use copper in regulation of methanotrophy, but the exact regulatory mechanisms remain unclear. One of the most surprising outcomes of the project is evidence suggesting the existence of previously unsuspected metabolic flexibility in M. capsulatus, including an ability to grow on sugars, oxidize chemolithotrophic hydrogen and sulfur, and live under reduced oxygen tension, all of which have implications for methanotroph ecology. The availability of the complete genome of M. capsulatus (Bath) deepens our understanding of methanotroph biology and its relationship to global carbon cycles. We have gained evidence for greater metabolic flexibility than was previously known, and for genetic components that may have biotechnological potential.
The complete sequence of the 1,267,782 bp genome of Wolbachia pipientis wMel, an obligate intracellular bacterium of Drosophila melanogaster, has been determined. Wolbachia, which are found in a variety of invertebrate species, are of great interest due to their diverse interactions with different hosts, which range from many forms of reproductive parasitism to mutualistic symbioses. Analysis of the wMel genome, in particular phylogenomic comparisons with other intracellular bacteria, has revealed many insights into the biology and evolution of wMel and Wolbachia in general. For example, the wMel genome is unique among sequenced obligate intracellular species in both being highly streamlined and containing very high levels of repetitive DNA and mobile DNA elements. This observation, coupled with multiple evolutionary reconstructions, suggests that natural selection is somewhat inefficient in wMel, most likely owing to the occurrence of repeated population bottlenecks. Genome analysis predicts many metabolic differences with the closely related Rickettsia species, including the presence of intact glycolysis and purine synthesis, which may compensate for an inability to obtain ATP directly from its host, as Rickettsia can. Other discoveries include the apparent inability of wMel to synthesize lipopolysaccharide and the presence of the most genes encoding proteins with ankyrin repeat domains of any prokaryotic genome yet sequenced. Despite the ability of wMel to infect the germline of its host, we find no evidence for either recent lateral gene transfer between wMel and D. melanogaster or older transfers between Wolbachia and any host. Evolutionary analysis further supports the hypothesis that mitochondria share a common ancestor with the alpha-Proteobacteria, but shows little support for the grouping of mitochondria with species in the order Rickettsiales. With the availability of the complete genomes of both species and excellent genetic tools for the host, the wMel-D. melanogaster symbiosis is now an ideal system for studying the biology and evolution of Wolbachia infections.
We report the complete genome sequence of the model bacterial pathogen Pseudomonas syringae pathovar tomato DC3000 (DC3000), which is pathogenic on tomato and Arabidopsis thaliana. The DC3000 genome (6.5 megabases) contains a circular chromosome and two plasmids, which collectively encode 5,763 ORFs. We identified 298 established and putative virulence genes, including several clusters of genes encoding 31 confirmed and 19 predicted type III secretion system effector proteins. Many of the virulence genes were members of paralogous families and also were proximal to mobile elements, which collectively comprise 7% of the DC3000 genome. The bacterium possesses a large repertoire of transporters for the acquisition of nutrients, particularly sugars, as well as genes implicated in attachment to plant surfaces. Over 12% of the genes are dedicated to regulation, which may reflect the need for rapid adaptation to the diverse environments encountered during epiphytic growth and pathogenesis. Comparative analyses confirmed a high degree of similarity with two sequenced pseudomonads, Pseudomonas putida and Pseudomonas aeruginosa, yet revealed 1,159 genes unique to DC3000, of which 811 lack a known function.
The complete 2,343,479-bp genome sequence of the gram-negative, pathogenic oral bacterium Porphyromonas gingivalis strain W83, a major contributor to periodontal disease, was determined. Whole-genome comparative analysis with other available complete genome sequences confirms the close relationship between the Cytophaga-Flavobacteria-Bacteroides (CFB) phylum and the green-sulfur bacteria. Within the CFB phylum, the genomes most similar to that of P. gingivalis are those of Bacteroides thetaiotaomicron and B. fragilis. Outside the CFB phylum, the genome most similar to that of P. gingivalis is that of Chlorobium tepidum, supporting previous phylogenetic studies that indicated that the Chlorobia and CFB phyla are related, albeit distantly. Genome analysis of strain W83 reveals a range of pathways and virulence determinants that relate to the novel biology of this oral pathogen. Among these determinants are at least six putative hemagglutinin-like genes and 36 previously unidentified peptidases. Genome analysis also reveals that P. gingivalis can metabolize a range of amino acids and generate a number of metabolic end products that are toxic to the human host or human gingival tissue and contribute to the development of periodontal disease.
Bacillus anthracis is an endospore-forming bacterium that causes inhalational anthrax. Key virulence genes are found on plasmids (extra-chromosomal, circular, double-stranded DNA molecules) pXO1 (ref. 2) and pXO2 (ref. 3). To identify additional genes that might contribute to virulence, we analysed the complete sequence of the chromosome of B. anthracis Ames (about 5.23 megabases). We found several chromosomally encoded proteins that may contribute to pathogenicity--including haemolysins, phospholipases and iron acquisition functions--and identified numerous surface proteins that might be important targets for vaccines and drugs. Almost all these putative chromosomal virulence and surface proteins have homologues in Bacillus cereus, highlighting the similarity of B. anthracis to near-neighbours that are not associated with anthrax. By performing a comparative genome hybridization of 19 B. cereus and Bacillus thuringiensis strains against a B. anthracis DNA microarray, we confirmed the general similarity of chromosomal genes among this group of close relatives. However, we found that the gene sequences of pXO1 and pXO2 were more variable between strains, suggesting plasmid mobility in the group. The complete sequence of B. anthracis is a step towards a better understanding of anthrax pathogenesis.
The genome of Chlamydophila caviae (formerly Chlamydia psittaci, GPIC isolate) (1,173,390 nt with a plasmid of 7,966 nt) was determined, representing the fourth species with a complete genome sequence from the Chlamydiaceae family of obligate intracellular bacterial pathogens. Of 1,009 annotated genes, 798 were conserved in all three other completed Chlamydiaceae genomes. The C. caviae genome contains 68 genes that lack orthologs in any other completed chlamydial genomes, including tryptophan and thiamine biosynthesis determinants and a ribose-phosphate pyrophosphokinase, the product of the prsA gene. Notable amongst these was a novel member of the virulence-associated invasin/intimin family (IIF) of Gram-negative bacteria. Intriguingly, two authentic frameshift mutations in the ORF indicate that this gene is not functional. Many of the unique genes are found in the replication termination region (RTR or plasticity zone), an area of frequent symmetrical inversion events around the replication terminus shown to be a hotspot for genome variation in previous genome sequencing studies. In C. caviae, the RTR includes several loci of particular interest including a large toxin gene and evidence of ancestral insertion(s) of a bacteriophage. This toxin gene, not present in Chlamydia pneumoniae, is a member of the YopT effector family of type III-secreted cysteine proteases. One gene cluster (guaBA-add) in the RTR is much more similar to orthologs in Chlamydia muridarum than those in the phylogenetically closest species C. pneumoniae, suggesting the possibility of horizontal transfer of genes between the rodent-associated Chlamydiae. With most genes observed in the other chlamydial genomes represented, C. caviae provides a good model for the Chlamydiaceae and a point of comparison against the human atherosclerosis-associated C. pneumoniae. This crucial addition to the set of completed Chlamydiaceae genome sequences is enabling dissection of the roles played by niche-specific genes in these important bacterial pathogens.
The 1,995,275-bp genome of Coxiella burnetii, Nine Mile phase I RSA493, a highly virulent zoonotic pathogen and category B bioterrorism agent, was sequenced by the random shotgun method. This bacterium is an obligate intracellular acidophile that is highly adapted for life within the eukaryotic phagolysosome. Genome analysis revealed many genes with potential roles in adhesion, invasion, intracellular trafficking, host-cell modulation, and detoxification. A previously uncharacterized 13-member family of ankyrin repeat-containing proteins is implicated in the pathogenesis of this organism. Although the lifestyle and parasitic strategies of C. burnetii resemble those of Rickettsiae and Chlamydiae, their genome architectures differ considerably in terms of presence of mobile elements, extent of genome reduction, metabolic capabilities, and transporter profiles. The presence of 83 pseudogenes indicates an ongoing process of gene degradation. Unlike other obligate intracellular bacteria, 32 insertion sequences are found dispersed in the chromosome, indicating some plasticity in the C. burnetii genome. These analyses suggest that the obligate intracellular lifestyle of C. burnetii may be a relatively recent innovation.
The complete genome of the green-sulfur eubacterium Chlorobium tepidum TLS was determined to be a single circular chromosome of 2,154,946 bp. This represents the first genome sequence from the phylum Chlorobia, whose members perform anoxygenic photosynthesis by the reductive tricarboxylic acid cycle. Genome comparisons have identified genes in C. tepidum that are highly conserved among photosynthetic species. Many of these have no assigned function and may play novel roles in photosynthesis or photobiology. Phylogenomic analysis reveals likely duplications of genes involved in biosynthetic pathways for photosynthesis and the metabolism of sulfur and nitrogen as well as strong similarities between metabolic processes in C. tepidum and many Archaeal species.
Virulence and immunity are poorly understood in Mycobacterium tuberculosis. We sequenced the complete genome of the M. tuberculosis clinical strain CDC1551 and performed a whole-genome comparison with the laboratory strain H37Rv in order to identify polymorphic sequences with potential relevance to disease pathogenesis, immunity, and evolution. We found large-sequence and single-nucleotide polymorphisms in numerous genes. Polymorphic loci included a phospholipase C, a membrane lipoprotein, members of an adenylate cyclase gene family, and members of the PE/PPE gene family, some of which have been implicated in virulence or the host immune response. Several gene families, including the PE/PPE gene family, also had significantly higher synonymous and nonsynonymous substitution frequencies compared to the genome as a whole. We tested a large sample of M. tuberculosis clinical isolates for a subset of the large-sequence and single-nucleotide polymorphisms and found widespread genetic variability at many of these loci. We performed phylogenetic and epidemiological analysis to investigate the evolutionary relationships among isolates and the origins of specific polymorphic loci. A number of these polymorphisms appear to have occurred multiple times as independent events, suggesting that these changes may be under selective pressure. Together, these results demonstrate that polymorphisms among M. tuberculosis strains are more extensive than initially anticipated, and genetic variation may have an important role in disease pathogenesis and immunity.
Shewanella oneidensis is an important model organism for bioremediation studies because of its diverse respiratory capabilities, conferred in part by multicomponent, branched electron transport systems. Here we report the sequencing of the S. oneidensis genome, which consists of a 4,969,803-base pair circular chromosome with 4,758 predicted protein-encoding open reading frames (CDS) and a 161,613-base pair plasmid with 173 CDSs. We identified the first Shewanella lambda-like phage, providing a potential tool for further genome engineering. Genome analysis revealed 39 c-type cytochromes, including 32 previously unidentified in S. oneidensis, and a novel periplasmic [Fe] hydrogenase, which are integral members of the electron transport system. This genome sequence represents a critical step in the elucidation of the pathways for reduction (and bioremediation) of pollutants such as uranium (U) and chromium (Cr), and offers a starting point for defining this organism's complex electron transport systems and metal ion-reducing capabilities.
The 3.31-Mb genome sequence of the intracellular pathogen and potential bioterrorism agent, Brucella suis, was determined. Comparison of B. suis with Brucella melitensis has defined a finite set of differences that could be responsible for the differences in virulence and host preference between these organisms, and indicates that phage have played a significant role in their divergence. Analysis of the B. suis genome reveals transport and metabolic capabilities akin to soil/plant-associated bacteria. Extensive gene synteny between B. suis chromosome 1 and the genome of the plant symbiont Mesorhizobium loti emphasizes the similarity between this animal pathogen and plant pathogens and symbionts. A limited repertoire of genes homologous to known bacterial virulence factors was identified.
The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient-poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living alpha-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus.
The 2,160,837-base pair genome sequence of an isolate of Streptococcus pneumoniae, a Gram-positive pathogen that causes pneumonia, bacteremia, meningitis, and otitis media, contains 2236 predicted coding regions; of these, 1440 (64%) were assigned a biological role. Approximately 5% of the genome is composed of insertion sequences that may contribute to genome rearrangements through uptake of foreign DNA. Extracellular enzyme systems for the metabolism of polysaccharides and hexosamines provide a substantial source of carbon and nitrogen for S. pneumoniae and also damage host tissues and facilitate colonization. A motif identified within the signal peptide of proteins is potentially involved in targeting these proteins to the cell surface of low-guanine/cytosine (GC) Gram-positive species. Several surface-exposed proteins that may serve as potential vaccine candidates were identified. Comparative genome hybridization with DNA arrays revealed strain differences in S. pneumoniae that could contribute to differences in virulence and antigenicity.
Here we determine the complete genomic sequence of the Gram-negative gamma-Proteobacterium Vibrio cholerae El Tor N16961 to be 4,033,460 base pairs (bp). The genome consists of two circular chromosomes of 2,961,146 bp and 1,072,314 bp that together encode 3,885 open reading frames. The vast majority of recognizable genes for essential cell functions (such as DNA replication, transcription, translation and cell-wall biosynthesis) and pathogenicity (for example, toxins, surface antigens and adhesins) are located on the large chromosome. In contrast, the small chromosome contains a larger fraction (59%) of hypothetical genes compared with the large chromosome (42%), and also contains many more genes that appear to have origins other than the gamma-Proteobacteria. The small chromosome also carries a gene capture system (the integron island) and host 'addiction' genes that are typically found on plasmids; thus, the small chromosome may have originally been a megaplasmid that was captured by an ancestral Vibrio species. The V. cholerae genomic sequence provides a starting point for understanding how a free-living, environmental organism emerged to become a significant human bacterial pathogen.
The 2,272,351-base pair genome of Neisseria meningitidis strain MC58 (serogroup B), a causative agent of meningitis and septicemia, contains 2158 predicted coding regions, 1158 (53.7%) of which were assigned a biological role. Three major islands of horizontal DNA transfer were identified; two of these contain genes encoding proteins involved in pathogenicity, and the third island contains coding sequences only for hypothetical proteins. Insights into the commensal and virulence behavior of N. meningitidis can be gleaned from the genome, in which sequences for structural proteins of the pilus are clustered and several coding regions unique to serogroup B capsular polysaccharide synthesis can be identified. Finally, N. meningitidis contains more genes that undergo phase variation than any pathogen studied to date, a mechanism that controls their expression and contributes to the evasion of the host immune system.
The 1,860,725-base-pair genome of Thermotoga maritima MSB8 contains 1,877 predicted coding regions, 1,014 (54%) of which have functional assignments and 863 (46%) of which are of unknown function. Genome analysis reveals numerous pathways involved in degradation of sugars and plant polysaccharides, and 108 genes that have orthologues only in the genomes of other thermophilic Eubacteria and Archaea. Of the Eubacteria sequenced to date, T. maritima has the highest percentage (24%) of genes that are most similar to archaeal genes. Eighty-one archaeal-like genes are clustered in 15 regions of the T. maritima genome that range in size from 4 to 20 kilobases. Conservation of gene order between T. maritima and Archaea in many of the clustered regions suggests that lateral gene transfer may have occurred between thermophilic Eubacteria and Archaea.
The complete genome sequence of the radiation-resistant bacterium Deinococcus radiodurans R1 is composed of two chromosomes (2,648,638 and 412,348 base pairs), a megaplasmid (177,466 base pairs), and a small plasmid (45,704 base pairs), yielding a total genome of 3,284,156 base pairs. Multiple components distributed on the chromosomes and megaplasmid that contribute to the ability of D. radiodurans to survive under conditions of starvation, oxidative stress, and high amounts of DNA damage were identified. Deinococcus radiodurans represents an organism in which all systems for DNA repair, DNA damage export, desiccation and starvation recovery, and genetic redundancy are present in one cell.
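The total genome size reported for D. radiodurans R1 can be checked against the four replicon sizes given in the abstract. The short Python sketch below does only that arithmetic; the replicon labels ("chromosome I", "chromosome II") and variable names are illustrative assumptions and do not come from the source, which refers simply to two chromosomes, a megaplasmid, and a small plasmid.

```python
# Replicon sizes in base pairs, as stated in the abstract.
# The dictionary keys are illustrative labels, not names used by the source.
replicons_bp = {
    "chromosome I": 2_648_638,
    "chromosome II": 412_348,
    "megaplasmid": 177_466,
    "small plasmid": 45_704,
}

# Total genome size is the sum of all replicons.
total_bp = sum(replicons_bp.values())

# Fraction of the genome carried by each replicon.
shares = {name: size / total_bp for name, size in replicons_bp.items()}

print(f"total genome: {total_bp:,} bp")
for name, share in shares.items():
    print(f"{name}: {replicons_bp[name]:,} bp ({share:.1%})")
```

The computed sum is 3,284,156 bp, which agrees with the total stated in the abstract.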