Experimental validation of enzyme function is crucial for genome interpretation, but it remains challenging because it cannot be scaled up to accommodate the constant accumulation of genome sequences. We tackled this issue for the MetA and MetX enzyme families, phylogenetically unrelated families of acyl-L-homoserine transferases involved in L-methionine biosynthesis. Members of these families are prone to incorrect annotation because MetX and MetA enzymes are assumed to always use acetyl-CoA and succinyl-CoA, respectively. We determined the enzymatic activities of 100 enzymes from diverse species, and interpreted the results by structural classification of active sites based on protein structure modeling. We predict that >60% of the 10,000 sequences from these families currently present in databases are incorrectly annotated, and suggest that acetyl-CoA was originally the sole substrate of these isofunctional enzymes, which evolved to use exclusively succinyl-CoA in the most recent bacteria. We also uncovered a divergent subgroup of MetX enzymes in fungi that participate only in L-cysteine biosynthesis as O-succinyl-L-serine transferases.
Arsenic is widespread in the environment and its presence is a result of natural or anthropogenic activities. Microbes have developed different mechanisms to deal with toxic compounds such as arsenic, either resisting or metabolizing them. Here, we present the first reference set of genomic, transcriptomic and proteomic data of an Alphaproteobacterium isolated from an arsenic-containing goldmine: Rhizobium sp. NT-26. Although phylogenetically related to the plant-associated bacteria, this organism has lost the major colonizing capabilities needed for symbiosis with legumes. In contrast, the genome of Rhizobium sp. NT-26 comprises a megaplasmid containing the various genes that enable it to metabolize arsenite. Remarkably, although the genes required for arsenite oxidation and flagellar motility/biofilm formation are carried by the megaplasmid and the chromosome, respectively, a coordinate regulation of these two mechanisms was observed. Taken together, these processes illustrate the impact environmental pressure can have on the evolution of bacterial genomes, improving the fitness of bacterial strains by the acquisition of novel functions.
Micromonospora strains have been isolated from diverse niches, including soil, water, marine sediments, and root nodules of diverse symbiotic plants. In this work, we report the genome sequence of Micromonospora lupini Lupac 08 isolated from root nodules of the wild legume Lupinus angustifolius.
The pathogenic strain Nocardia cyriacigeorgica GUH-2 was isolated from a fatal human nocardiosis case, and its genome was sequenced. The complete genomic sequence of this strain comprises 6,194,645 bp, with an average G+C content of 68.37% and no plasmids. We also identified several protein-coding genes to which N. cyriacigeorgica's virulence can potentially be attributed.
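For readers unfamiliar with the statistic, the average G+C content quoted above is the percentage of G and C bases among all unambiguous bases in the sequence. The following is a minimal sketch of that calculation, not the pipeline used by the authors; the function name `gc_content` and the choice to ignore ambiguous symbols such as N are assumptions made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Assumption: symbols other than A, C, G and T (for example N)
    are ignored rather than counted in the denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy sequence, not taken from any real genome.
    print(round(gc_content("GGCCAT"), 2))
```

Applied to a complete genome, the same function would be called on the concatenated replicon sequences.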
Streptomyces cattleya, a producer of the antibiotics thienamycin and cephamycin C, is one of the rare bacteria known to synthesize fluorinated metabolites. The genome consists of two linear replicons. The genes involved in fluorine metabolism and in the biosynthesis of the antibiotic thienamycin were mapped on both replicons.
Tropical aquatic species of the legume genus Aeschynomene are stem- and root-nodulated by bradyrhizobia strains that exhibit atypical features such as photosynthetic capacities or the use of a nod gene-dependent (ND) or a nod gene-independent (NI) pathway to enter into symbiosis with legumes. In this study we used a comparative genomics approach on nine Aeschynomene symbionts representative of their phylogenetic diversity. We produced draft genomes of bradyrhizobial strains representing different phenotypes: five NI photosynthetic strains (STM3809, ORS375, STM3847, STM4509 and STM4523) in addition to the previously sequenced ORS278 and BTAi1 genomes, one photosynthetic strain ORS285 hosting both ND and NI symbiotic systems, and one NI non-photosynthetic strain (STM3843). Comparative genomics allowed us to infer the core, pan and dispensable genomes of Aeschynomene bradyrhizobia, and to detect specific genes and their location in Genomic Islands (GI). Specific gene sets linked to photosynthetic and NI/ND abilities were identified, and are currently being studied in functional analyses.
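The core, pan and dispensable genomes mentioned above can be pictured as set operations over gene families. The sketch below assumes that each strain has already been reduced to a set of gene-family identifiers (the clustering step is not shown), and the function name `partition_gene_families` and the toy family labels are hypothetical; it is an illustration of the concept rather than the method used in the study.

```python
from collections import Counter
from typing import Dict, Set


def partition_gene_families(genomes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene families into pan, core, dispensable and strain-specific sets.

    `genomes` maps a strain name to the set of gene-family identifiers
    found in that strain (hypothetical input format).
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    family_sets = list(genomes.values())
    pan = set().union(*family_sets)          # families seen in any strain
    core = set.intersection(*family_sets)    # families seen in every strain
    dispensable = pan - core                 # families missing from at least one strain
    # Count in how many strains each family occurs.
    occurrences = Counter(fam for fams in family_sets for fam in fams)
    specific = {fam for fam, n in occurrences.items() if n == 1}
    return {"pan": pan, "core": core, "dispensable": dispensable, "specific": specific}


if __name__ == "__main__":
    # Toy data: strain names are from the text, family labels are invented.
    toy = {
        "ORS278": {"f1", "f2", "f3"},
        "BTAi1": {"f1", "f2", "f4"},
        "ORS285": {"f1", "f3", "f5"},
    }
    for label, families in partition_gene_families(toy).items():
        print(label, sorted(families))
```

With a single genome the specific set equals the whole genome, so the strain-specific set is only informative when several strains are compared.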
The Ralstonia solanacearum species complex includes R. solanacearum, R. syzygii, and the Blood Disease Bacterium (BDB). All colonize plant xylem vessels and cause wilt diseases, but with significant biological differences. R. solanacearum is a soilborne bacterium that infects the roots of a broad range of plants. R. syzygii causes Sumatra disease of clove trees and is actively transmitted by cercopoid insects. BDB is also pathogenic to a single host, banana, and is transmitted by pollinating insects. Sequencing and DNA-DNA hybridization studies indicated that despite their phenotypic differences, these three plant pathogens are actually very closely related, falling into the Phylotype IV subgroup of the R. solanacearum species complex. To better understand the relationships among these bacteria, we sequenced and annotated the genomes of R. syzygii strain R24 and BDB strain R229. These genomes were compared to strain PSI07, a closely related Phylotype IV tomato isolate of R. solanacearum, and to five additional R. solanacearum genomes. Whole-genome comparisons confirmed previous phylogenetic results: the three phylotype IV strains share more and larger syntenic regions with each other than with other R. solanacearum strains. Furthermore, the genetic distances between strains, assessed by an in-silico equivalent of DNA-DNA hybridization, unambiguously showed that phylotype IV strains of BDB, R. syzygii and R. solanacearum form one genomic species. Based on these comprehensive data we propose a revision of the taxonomy of the R. solanacearum species complex. The BDB and R. syzygii genomes encoded no obvious unique metabolic capacities and contained no evidence of horizontal gene transfer from bacteria occupying similar niches. Genes specific to R. syzygii and BDB were almost all of unknown function or extrachromosomal origin. Thus, the pathogenic life-styles of these organisms are more probably due to ecological adaptation and genomic convergence during vertical evolution than to the acquisition of DNA by horizontal transfer.
Comparative genomics is the cornerstone of identification of gene functions. The immense number of living organisms precludes experimental identification of functions except in a handful of model organisms. The bacterial domain is split into large branches, among which the Firmicutes occupy a considerable space. Bacillus subtilis has been the model of Firmicutes for decades and its genome has been a reference for more than 10 years. Sequencing the genome involved more than 30 laboratories, with different areas of expertise, in an attempt to make the most of the experimental information that could be associated with the sequence. This had the expected drawback that the sequencing expertise was quite varied among the groups involved, especially at a time when sequencing genomes was extremely hard work. The recent development of very efficient, fast and accurate sequencing techniques, in parallel with the development of high-level annotation platforms, motivated the present resequencing work. The updated sequence has been reannotated in agreement with the UniProt protein knowledge base, keeping in perspective the split between the paleome (genes necessary for sustaining and perpetuating life) and the cenome (genes required for occupation of a niche, suggesting here that B. subtilis is an epiphyte). This should permit investigators to make reliable inferences to prepare validation experiments in a variety of domains of bacterial growth and development as well as build up accurate phylogenies.
Escherichia coli K-12 and B have been the subjects of classical experiments from which much of our understanding of molecular genetics has emerged. We present here complete genome sequences of two E. coli B strains, REL606, used in a long-term evolution experiment, and BL21(DE3), widely used to express recombinant proteins. The two genomes differ in length by 72,304 bp and have 426 single base pair differences, a seemingly large difference for laboratory strains having a common ancestor within the last 67 years. Transpositions by IS1 and IS150 have occurred in both lineages. Integration of the DE3 prophage in BL21(DE3) apparently displaced a defective prophage in the lambda attachment site of B. As might have been anticipated from the many genetic and biochemical experiments comparing B and K-12 over the years, the B genomes are similar in size and organization to the genome of E. coli K-12 MG1655 and have >99% sequence identity over approximately 92% of their genomes. E. coli B and K-12 differ considerably in distribution of IS elements and in location and composition of larger mobile elements. An unexpected difference is the absence of a large cluster of flagella genes in B, due to a 41 kbp IS1-mediated deletion. Gene clusters that specify the LPS core, O antigen, and restriction enzymes differ substantially, presumably because of horizontal transfer. Comparative analysis of 32 independently isolated E. coli and Shigella genomes, both commensals and pathogenic strains, identifies a minimal set of genes in common plus many strain-specific genes that constitute a large E. coli pan-genome.
BACKGROUND: Genome sequences, now available for most pathogens, hold promise for the rational design of new therapies. However, biological resources for genome-scale identification of gene function (notably genes involved in pathogenesis) and/or genes essential for cell viability, which are necessary to achieve this goal, are often sorely lacking. This holds true for Neisseria meningitidis, one of the most feared human bacterial pathogens that causes meningitis and septicemia. RESULTS: By determining and manually annotating the complete genome sequence of a serogroup C clinical isolate of N. meningitidis (strain 8013) and assembling a library of defined mutants in up to 60% of its non-essential genes, we have created NeMeSys, a biological resource for Neisseria meningitidis systematic functional analysis. To further enhance the versatility of this toolbox, we have manually (re)annotated eight publicly available Neisseria genome sequences and stored all these data in a publicly accessible online database. The potential of NeMeSys for narrowing the gap between sequence and function is illustrated in several ways, notably by performing a functional genomics analysis of the biogenesis of type IV pili, one of the most widespread virulence factors in bacteria, and by identifying through comparative genomics a complete biochemical pathway (for sulfur metabolism) that may potentially be important for nasopharyngeal colonization. CONCLUSIONS: By improving our capacity to understand gene function in an important human pathogen, NeMeSys is expected to contribute to the ongoing efforts aimed at understanding a prokaryotic cell comprehensively and eventually to the design of new therapies.
The Escherichia coli species represents one of the best-studied model organisms, but also encompasses a variety of commensal and pathogenic strains that diversify by high rates of genetic change. We uniformly (re-)annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the closest E. coli related species), including seven that we sequenced to completion. Within the approximately 18,000 families of orthologous genes, we found approximately 2,000 common to all strains. Although recombination rates are much higher than mutation rates, we show, both theoretically and using phylogenetic inference, that this does not obscure the phylogenetic signal, which places the B2 phylogenetic group and one group D strain at the basal position. Based on this phylogeny, we inferred past evolutionary events of gain and loss of genes, identifying functional classes under opposite selection pressures. We found an important adaptive role for metabolism diversification within group B2 and Shigella strains, but identified few or no extraintestinal virulence-specific genes, which could make the development of a vaccine against extraintestinal infections difficult. Genome flux in E. coli is confined to a small number of conserved positions in the chromosome, which most often are not associated with integrases or tRNA genes. Core genes flanking some of these regions show higher rates of recombination, suggesting that a gene, once acquired by a strain, spreads within the species by homologous recombination at the flanking genes. Finally, the genome's long-scale structure of recombination indicates lower recombination rates, but not higher mutation rates, at the terminus of replication. The ensuing effect of background selection and biased gene conversion may thus explain why this region is A+T-rich and shows high sequence divergence but low sequence polymorphism. Overall, despite a very high gene flow, genes co-exist in an organised genome.
BACKGROUND: Methylotrophy describes the ability of organisms to grow on reduced organic compounds without carbon-carbon bonds. The genomes of two pink-pigmented facultative methylotrophic bacteria of the Alpha-proteobacterial genus Methylobacterium, the reference species Methylobacterium extorquens strain AM1 and the dichloromethane-degrading strain DM4, were compared. METHODOLOGY/PRINCIPAL FINDINGS: The 6.88 Mb genome of strain AM1 comprises a 5.51 Mb chromosome, a 1.26 Mb megaplasmid and three plasmids, while the 6.12 Mb genome of strain DM4 features a 5.94 Mb chromosome and two plasmids. The chromosomes are highly syntenic and share a large majority of genes, while plasmids are mostly strain-specific, with the exception of a 130 kb region of the strain AM1 megaplasmid which is syntenic to a chromosomal region of strain DM4. Both genomes contain large sets of insertion elements, many of them strain-specific, suggesting an important potential for genomic plasticity. Most of the genomic determinants associated with methylotrophy are nearly identical, with two exceptions that illustrate the metabolic and genomic versatility of Methylobacterium. A 126 kb dichloromethane utilization (dcm) gene cluster is essential for the ability of strain DM4 to use DCM as the sole carbon and energy source for growth and is unique to strain DM4. The methylamine utilization (mau) gene cluster is only found in strain AM1, indicating that strain DM4 employs an alternative system for growth with methylamine. The dcm and mau clusters represent two of the chromosomal genomic islands (AM1: 28; DM4: 17) that were defined. The mau cluster is flanked by mobile elements, but the dcm cluster disrupts a gene annotated as chelatase, for which we propose the name "island integration determinant" (iid). CONCLUSION/SIGNIFICANCE: These two genome sequences provide a platform for intra- and interspecies genomic comparisons in the genus Methylobacterium, and for investigations of the adaptive mechanisms which allow bacterial lineages to acquire methylotrophic lifestyles.
Acinetobacter baumannii is the source of numerous nosocomial infections in humans and therefore deserves close attention as multidrug- or even pandrug-resistant strains are increasingly being identified worldwide. Here we report the comparison of two newly sequenced genomes of A. baumannii. The human isolate A. baumannii AYE is multidrug-resistant whereas strain SDF, which was isolated from body lice, is antibiotic-susceptible. As reference for comparison in this analysis, the genome of the soil-living bacterium A. baylyi strain ADP1 was used. The most interesting dissimilarities we observed were that i) whereas strain AYE and A. baylyi genomes harbored very few Insertion Sequence elements which could promote expression of downstream genes, strain SDF sequence contains several hundred of them that have played a crucial role in its genome reduction (gene disruptions and simple DNA loss); ii) strain SDF has low catabolic capacities compared to strain AYE. Interestingly, the latter has even higher catabolic capacities than A. baylyi which has already been reported as a very nutritionally versatile organism. This metabolic performance could explain the persistence of A. baumannii nosocomial strains in environments where nutrients are scarce; iii) several processes known to play a key role during host infection (biofilm formation, iron uptake, quorum sensing, virulence factors) were either different or absent, the best example of which is iron uptake. Indeed, strain AYE and A. baylyi use siderophore-based systems to scavenge iron from the environment whereas strain SDF uses an alternate system similar to the Haem Acquisition System (HAS). Taken together, all these observations suggest that the genome contents of the three Acinetobacters compared are partly shaped by life in distinct ecological niches: human (and more largely hospital environment), louse, soil.
Leguminous plants (such as peas and soybeans) and rhizobial soil bacteria are symbiotic partners that communicate through molecular signaling pathways, resulting in the formation of nodules on legume roots and occasionally stems that house nitrogen-fixing bacteria. Nodule formation has been assumed to be exclusively initiated by the binding of bacterial, host-specific lipochito-oligosaccharidic Nod factors, encoded by the nodABC genes, to kinase-like receptors of the plant. Here we show by complete genome sequencing of two symbiotic, photosynthetic, Bradyrhizobium strains, BTAi1 and ORS278, that canonical nodABC genes and typical lipochito-oligosaccharidic Nod factors are not required for symbiosis in some legumes. Mutational analyses indicated that these unique rhizobia use an alternative pathway to initiate symbioses, where a purine derivative may play a key role in triggering nodule formation.
Microbial biotransformations have a major impact on contamination by toxic elements, which threatens public health in developing and industrial countries. Finding a means of preserving natural environments, including ground and surface waters, from arsenic constitutes a major challenge facing modern society. Although this metalloid is ubiquitous on Earth, thus far no bacterium thriving in arsenic-contaminated environments has been fully characterized. In-depth exploration of the genome of the beta-proteobacterium Herminiimonas arsenicoxydans with regard to physiology, genetics, and proteomics, revealed that it possesses heretofore unsuspected mechanisms for coping with arsenic. Aside from multiple biochemical processes such as arsenic oxidation, reduction, and efflux, H. arsenicoxydans also exhibits positive chemotaxis and motility towards arsenic and metalloid scavenging by exopolysaccharides. These observations demonstrate the existence of a novel strategy to efficiently colonize arsenic-rich environments, which extends beyond oxidoreduction reactions. Such a microbial mechanism of detoxification, which is possibly exploitable for bioremediation applications of contaminated sites, may have played a crucial role in the occupation of ancient ecological niches on Earth.
Soil bacteria that also form mutualistic symbioses in plants encounter two major levels of selection. One occurs during adaptation to and survival in soil, and the other occurs in concert with host plant speciation and adaptation. Actinobacteria from the genus Frankia are facultative symbionts that form N2-fixing root nodules on diverse and globally distributed angiosperms in the "actinorhizal" symbioses. Three closely related clades of Frankia sp. strains are recognized; members of each clade infect a subset of plants from among eight angiosperm families. We sequenced the genomes from three strains; their sizes varied from 5.43 Mbp for a narrow host range strain (Frankia sp. strain HFPCcI3) to 7.50 Mbp for a medium host range strain (Frankia alni strain ACN14a) to 9.04 Mbp for a broad host range strain (Frankia sp. strain EAN1pec). This size divergence is the largest yet reported for such closely related soil bacteria (97.8%-98.9% identity of 16S rRNA genes). The extent of gene deletion, duplication, and acquisition is in concert with the biogeographic history of the symbioses and host plant speciation. Host plant isolation favored genome contraction, whereas host plant diversification favored genome expansion. The results support the idea that major genome expansions as well as reductions can occur in facultative symbiotic soil bacteria as they respond to new environments in the context of their symbioses.
Anaerobic ammonium oxidation (anammox) has become a main focus in oceanography and wastewater treatment. It is also the nitrogen cycle's major remaining biochemical enigma. Among its features, the occurrence of hydrazine as a free intermediate of catabolism, the biosynthesis of ladderane lipids and the role of cytoplasm differentiation are unique in biology. Here we use environmental genomics, the reconstruction of genomic data directly from the environment, to assemble the genome of the uncultured anammox bacterium Kuenenia stuttgartiensis from a complex bioreactor community. The genome data illuminate the evolutionary history of the Planctomycetes and allow us to expose the genetic blueprint of the organism's special properties. Most significantly, we identified candidate genes responsible for ladderane biosynthesis and biological hydrazine metabolism, and discovered unexpected metabolic versatility.
Pseudomonas entomophila is an entomopathogenic bacterium that, upon ingestion, kills Drosophila melanogaster as well as insects from different orders. The complete sequence of the 5.9-Mb genome was determined and compared to the sequenced genomes of four Pseudomonas species. P. entomophila possesses most of the catabolic genes of the closely related strain P. putida KT2440, revealing its metabolically versatile properties and its soil lifestyle. Several features that probably contribute to its entomopathogenic properties were disclosed. Unexpectedly for an animal pathogen, P. entomophila is devoid of a type III secretion system and associated toxins but rather relies on a number of potential virulence factors such as insecticidal toxins, proteases, putative hemolysins, hydrogen cyanide and novel secondary metabolites to infect and kill insects. Genome-wide random mutagenesis revealed the major role of the two-component system GacS/GacA that regulates most of the potential virulence factors identified.
A considerable fraction of life develops in the sea at temperatures lower than 15 °C. Little is known about the adaptive features selected under those conditions. We present the analysis of the genome sequence of the fast-growing Antarctic bacterium Pseudoalteromonas haloplanktis TAC125. We find that it copes with the increased solubility of oxygen at low temperature by multiplying dioxygen scavenging while deleting whole pathways producing reactive oxygen species. Dioxygen-consuming lipid desaturases achieve both protection against oxygen and synthesis of lipids making the membrane fluid. A remarkable strategy for avoidance of reactive oxygen species generation is developed by P. haloplanktis, with elimination of the ubiquitous molybdopterin-dependent metabolism. The P. haloplanktis proteome reveals a concerted amino acid usage bias specific to psychrophiles, consistently appearing apt to accommodate asparagine, a residue prone to make proteins age. Adding to its originality, P. haloplanktis further differs from its marine counterparts with recruitment of a plasmid origin of replication for its second chromosome.
Acinetobacter sp. strain ADP1 is a nutritionally versatile soil bacterium closely related to representatives of the well-characterized Pseudomonas aeruginosa and Pseudomonas putida. Unlike these bacteria, the Acinetobacter ADP1 is highly competent for natural transformation, which affords extraordinary convenience for genetic manipulation. The circular chromosome of the Acinetobacter ADP1, presented here, encodes 3325 predicted coding sequences, of which 60% have been classified based on sequence similarity to other documented proteins. The close evolutionary proximity of Acinetobacter and Pseudomonas species, as judged by the sequences of their 16S rRNA genes and by the highest level of bidirectional best hits, contrasts with the extensive divergence in the GC content of their DNA (40 versus 62%). The chromosomes also differ significantly in size, with the Acinetobacter ADP1 chromosome <60% of the length of the Pseudomonas counterparts. Genome analysis of the Acinetobacter ADP1 revealed genes for metabolic pathways involved in utilization of a large variety of compounds. Almost all of these genes, with orthologs that are scattered in other species, are located in five major 'islands of catabolic diversity', now an apparent 'archipelago of catabolic diversity', within one-quarter of the overall genome. Acinetobacter ADP1 displays many features of other aerobic soil bacteria with metabolism oriented toward the degradation of organic compounds found in their natural habitat. A distinguishing feature of this genome is the absence of a gene corresponding to pyruvate kinase, the enzyme that generally catalyzes the terminal step in conversion of carbohydrates to pyruvate for respiration by the citric acid cycle. This finding supports the view that the cycle itself is centrally geared to the catabolic capabilities of this exceptionally versatile organism.
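The bidirectional best hit criterion used above as a measure of evolutionary proximity can be illustrated with a short sketch: two genes form a pair when each is the other's highest-scoring match in the opposite genome. The input format (nested dictionaries of alignment scores), the function name `bidirectional_best_hits` and the toy gene names are assumptions for illustration; the score thresholds and alignment tool used in the study are not specified here.

```python
from typing import Dict, List, Tuple

Hits = Dict[str, Dict[str, float]]  # query gene -> {subject gene: alignment score}


def best_hit(hits: Hits) -> Dict[str, str]:
    """Map each query gene to its highest-scoring subject gene.

    Ties are broken by dictionary order, which is an arbitrary choice.
    """
    return {q: max(subjects, key=subjects.get) for q, subjects in hits.items() if subjects}


def bidirectional_best_hits(a_vs_b: Hits, b_vs_a: Hits) -> List[Tuple[str, str]]:
    """Return gene pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hit(a_vs_b)
    best_ba = best_hit(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Toy scores with invented gene names.
    a_vs_b = {"a1": {"b1": 200.0, "b2": 50.0}, "a2": {"b1": 180.0}}
    b_vs_a = {"b1": {"a1": 210.0, "a2": 170.0}, "b2": {"a2": 40.0}}
    print(bidirectional_best_hits(a_vs_b, b_vs_a))
```

Counting such pairs between a query genome and each candidate relative gives one simple way to rank relatives by shared gene content.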