We have sequenced the genome of the emerging human pathogen Babesia microti and compared it with that of other protozoa. B. microti has the smallest nuclear genome among all apicomplexan parasites sequenced to date, with three chromosomes encoding approximately 3500 polypeptides, several of which are species specific. Genome-wide phylogenetic analyses indicate that B. microti is significantly distant from all species of Babesiidae and Theileriidae and defines a new clade in the phylum Apicomplexa. Furthermore, unlike all other Apicomplexa, its mitochondrial genome is circular. Genome-scale reconstruction of functional networks revealed that B. microti has the minimal metabolic requirement for intraerythrocytic protozoan parasitism. B. microti multigene families differ from those of other protozoa in both copy number and organization. Two lateral transfer events with significant metabolic implications occurred during the evolution of this parasite. The genomic sequencing of B. microti identified several targets suitable for the development of diagnostic assays and novel therapies for human babesiosis.
Polyploidization is an important process in the evolution of eukaryotic genomes, but ensuing molecular mechanisms remain to be clarified. Autopolyploidization or whole-genome duplication events frequently are resolved in resulting lineages by the loss of single genes from most duplicated pairs, causing transient gene dosage imbalance and accelerating speciation through meiotic infertility. Allopolyploidization or formation of interspecies hybrids raises the problem of genetic incompatibility (Bateson-Dobzhansky-Muller effect) and may be resolved by the accumulation of mutational changes in resulting lineages. In this article, we show that an osmotolerant yeast species, Pichia sorbitophila, recently isolated in a concentrated sorbitol solution in industry, illustrates this last situation. Its genome is a mosaic of homologous and homeologous chromosomes, or parts thereof, that corresponds to a recently formed hybrid in the process of evolution. The respective parental contributions to this genome were characterized using existing variations in GC content. The genomic changes that occurred during the short period since hybrid formation were identified (e.g., loss of heterozygosity, unilateral loss of rDNA, reciprocal exchange) and distinguished from those undergone by the two parental genomes after separation from their common ancestor (i.e., NUMT (NUclear sequences of MiTochondrial origin) insertions, gene acquisitions, gene location movements, reciprocal translocation). We found that the physiological characteristics of this new yeast species are determined by specific but unequal contributions of its two parents, one of which could be identified as very closely related to an extant Pichia farinosa strain.
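As a rough illustration of how variation in GC content can be used to tell two parental contributions apart, the following is a minimal sketch only. The non-overlapping windows, the function names, and the single threshold are all assumptions made for illustration; they are not taken from the study, which does not describe its procedure in this abstract.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in seq (case-insensitive); 0.0 if seq is empty."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window):
    """GC fraction of consecutive, non-overlapping windows; a shorter final window is kept."""
    return [gc_content(seq[i:i + window]) for i in range(0, len(seq), window)]


def label_windows(gc_values, threshold):
    """Label each window 'high' or 'low' relative to a hypothetical GC threshold."""
    return ["high" if gc >= threshold else "low" for gc in gc_values]


# Toy sequence (not real data): a GC-rich stretch followed by an AT-rich stretch.
toy = "GCGCGGCC" + "ATATTAAT"
values = windowed_gc(toy, 8)
labels = label_windows(values, 0.5)
```

On real chromosomes the windows would span kilobases rather than eight bases, and the threshold would have to be chosen from the observed GC distribution of the two subgenomes.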
Here we report the full genome sequence of Modestobacter marinus strain BC501, an actinobacterial isolate that thrives on stone surfaces. The generated chromosome is circular, with a length of 5.57 Mb and a G+C content of 74.13%, containing 5,445 protein-coding genes, 48 tRNAs, and 3 ribosomal operons.
To shed light on the genetic equipment of the beneficial plant-associated bacterium Pseudomonas brassicacearum, we sequenced the whole genome of the strain NFM421. Its genome consists of one chromosome equipped with a repertoire of factors beneficial for plant growth. In addition, a complete type III secretion system and two complete type VI secretion systems were identified. We report here the first genome sequence of this species.
Members of the genus Flavobacterium occur in a variety of ecological niches and represent an interesting diversity of lifestyles. Flavobacterium branchiophilum is the main causative agent of bacterial gill disease, a severe condition affecting various cultured freshwater fish species worldwide, in particular salmonids in Canada and Japan. We report here the complete genome sequence of strain FL-15 isolated from a diseased sheatfish (Silurus glanis) in Hungary. The analysis of the F. branchiophilum genome revealed putative mechanisms of pathogenicity strikingly different from those of the other, closely related fish pathogen Flavobacterium psychrophilum, including the first cholera-like toxin in a non-Proteobacteria and a wealth of adhesins. The comparison with available genomes of other Flavobacterium species revealed a small genome size, large differences in chromosome organization, and fewer rRNA and tRNA genes, in line with its more fastidious growth. In addition, horizontal gene transfer shaped the evolution of F. branchiophilum, as evidenced by its virulence factors, genomic islands, and CRISPR (clustered regularly interspaced short palindromic repeats) systems. Further functional analysis should help in the understanding of host-pathogen interactions and in the development of rational diagnostic tools and control strategies in fish farms.
BACKGROUND: Propionibacterium freudenreichii is essential as a ripening culture in Swiss-type cheeses and is also considered for its probiotic use. This species exhibits slow growth, low nutritional requirements, and hardiness in many habitats. It belongs to the taxonomic group of dairy propionibacteria, in contrast to the cutaneous species P. acnes. The genome of the type strain, P. freudenreichii subsp. shermanii CIRM-BIA1 (CIP 103027(T)), was sequenced with an 11-fold coverage. METHODOLOGY/PRINCIPAL FINDINGS: The circular chromosome of 2.7 Mb of the CIRM-BIA1 strain has a GC-content of 67% and contains 22 different insertion sequences (3.5% of the genome in base pairs). Using a proteomic approach, 490 of the 2439 predicted proteins were confirmed. The annotation revealed the genetic basis for the hardiness of P. freudenreichii, as the bacterium possesses a complete enzymatic arsenal for de novo biosynthesis of amino acids and vitamins (except pantothenate and biotin) as well as sequences involved in metabolism of various carbon sources, immunity against phages, duplicated chaperone genes and, interestingly, genes involved in the management of polyphosphate, glycogen and trehalose storage. The complete biosynthesis pathway for a bifidogenic compound is described, as well as a high number of surface proteins involved in interactions with the host and present in other probiotic bacteria. By comparative genomics, no pathogenicity factors found in P. acnes or in other pathogenic microbial species were identified in P. freudenreichii, which is consistent with the Generally Recognized As Safe and Qualified Presumption of Safety status of P. freudenreichii. Various pathways for formation of cheese flavor compounds were identified: the Wood-Werkman cycle for propionic acid formation, amino acid degradation pathways resulting in the formation of volatile branched chain fatty acids, and esterases involved in the formation of free fatty acids and esters.
CONCLUSIONS/SIGNIFICANCE: With the exception of its ability to degrade lactose, P. freudenreichii seems poorly adapted to dairy niches. This genome annotation opens up new prospects for the understanding of the P. freudenreichii probiotic activity.
Nitrospira are barely studied and mostly uncultured nitrite-oxidizing bacteria, which are, according to molecular data, among the most diverse and widespread nitrifiers in natural ecosystems and biological wastewater treatment. Here, environmental genomics was used to reconstruct the complete genome of "Candidatus Nitrospira defluvii" from an activated sludge enrichment culture. On the basis of this first-deciphered Nitrospira genome and of experimental data, we show that Ca. N. defluvii differs dramatically from other known nitrite oxidizers in the key enzyme nitrite oxidoreductase (NXR), in the composition of the respiratory chain, and in the pathway used for autotrophic carbon fixation, suggesting multiple independent evolution of chemolithoautotrophic nitrite oxidation. Adaptations of Ca. N. defluvii to substrate-limited conditions include an unusual periplasmic NXR, which is constitutively expressed, and pathways for the transport, oxidation, and assimilation of simple organic compounds that allow a mixotrophic lifestyle. The reverse tricarboxylic acid cycle as the pathway for CO2 fixation and the lack of most classical defense mechanisms against oxidative stress suggest that Nitrospira evolved from microaerophilic or even anaerobic ancestors. Unexpectedly, comparative genomic analyses indicate functionally significant lateral gene-transfer events between the genus Nitrospira and anaerobic ammonium-oxidizing planctomycetes, which share highly similar forms of NXR and other proteins reflecting that two key processes of the nitrogen cycle are evolutionarily connected.
BACKGROUND: Ileal lesions of Crohn's disease (CD) patients are abnormally colonized by pathogenic adherent-invasive Escherichia coli (AIEC) able to invade and to replicate within intestinal epithelial cells and macrophages. PRINCIPAL FINDINGS: We report here the complete genome sequence of E. coli LF82, the reference strain of adherent-invasive E. coli associated with ileal Crohn's disease. The LF82 genome of 4,881,487 bp total size contains a circular chromosome with a size of 4,773,108 bp and a plasmid of 108,379 bp. The analysis of predicted coding sequences (CDSs) within the LF82 flexible genome indicated that this genome is close to the avian pathogenic strain APEC_01, meningitis-associated strain S88 and urinary-isolated strain UTI89 with regard to flexible genome and single nucleotide polymorphisms in various virulence factors. Interestingly, we observed that strains LF82 and UTI89 adhered at a similar level to Intestine-407 cells and that, like LF82, APEC_01 and UTI89 were highly invasive. However, AIEC strain LF82 had an intermediate killer phenotype compared to APEC_01 and UTI89, and the LF82 genome does not harbour most of the specific virulence genes from ExPEC. The LF82 genome has evolved from those of ExPEC B2 strains by the acquisition of Salmonella and Yersinia isolated or clustered genes or CDSs located on the pLF82 plasmid and at various loci on the chromosome. CONCLUSION: LF82 genome analysis indicated that a number of genes, gene clusters and pathoadaptive mutations that have been acquired may play a role in the virulence of AIEC strain LF82.
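The replicon sizes quoted in this abstract can be checked against the reported total with a few lines of arithmetic. The sketch below uses only the three figures given in the text; the variable names are illustrative.

```python
# Replicon sizes in base pairs, as reported in the abstract.
replicons = {"chromosome": 4_773_108, "pLF82 plasmid": 108_379}
reported_total = 4_881_487

# Sum of the two replicons; this should match the reported genome size.
computed_total = sum(replicons.values())

# Share of the genome carried by the plasmid (about 2.2%).
plasmid_share = replicons["pLF82 plasmid"] / computed_total
```

The sum of the chromosome and the plasmid equals the reported total of 4,881,487 bp, so the three figures are internally consistent.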
Arthrobacter arilaitensis is one of the major bacterial species found at the surface of cheeses, especially in smear-ripened cheeses, where it contributes to the typical colour, flavour and texture properties of the final product. The A. arilaitensis Re117 genome is composed of a 3,859,257 bp chromosome and two plasmids of 50,407 and 8,528 bp. The chromosome shares large regions of synteny with the chromosomes of three environmental Arthrobacter strains for which genome sequences are available: A. aurescens TC1, A. chlorophenolicus A6 and Arthrobacter sp. FB24. In contrast, 4.92% of the A. arilaitensis chromosome is composed of IS elements, a proportion that is at least 15-fold higher than for the other Arthrobacter strains. Comparative genomic analyses reveal an extensive loss of genes associated with catabolic activities, presumably as a result of adaptation to the properties of the cheese surface habitat. Like the environmental Arthrobacter strains, A. arilaitensis Re117 is well-equipped with enzymes required for the catabolism of major carbon substrates present at cheese surfaces such as fatty acids, amino acids and lactic acid. However, A. arilaitensis has several specific features that seem to be linked to its adaptation to its particular niche. These include the ability to catabolize D-galactonate, a high number of glycine betaine and related osmolyte transporters, two siderophore biosynthesis gene clusters and a high number of Fe(3+)/siderophore transport systems. In model cheese experiments, addition of small amounts of iron strongly stimulated the growth of A. arilaitensis, indicating that cheese is a highly iron-restricted medium. We suggest that there is a strong selective pressure at the surface of cheese for strains with efficient iron acquisition and salt-tolerance systems together with abilities to catabolize substrates such as lactic acid, lipids and amino acids.
Our knowledge of yeast genomes remains largely dominated by the extensive studies on Saccharomyces cerevisiae and the consequences of its ancestral duplication, leaving the evolution of the entire class of hemiascomycetes only partly explored. We concentrate here on five species of Saccharomycetaceae, a large subdivision of hemiascomycetes, that we call "protoploid" because they diverged from the S. cerevisiae lineage prior to its genome duplication. We determined the complete genome sequences of three of these species: Kluyveromyces (Lachancea) thermotolerans and Saccharomyces (Lachancea) kluyveri (two members of the newly described Lachancea clade), and Zygosaccharomyces rouxii. We included in our comparisons the previously available sequences of Kluyveromyces lactis and Ashbya (Eremothecium) gossypii. Despite their broad evolutionary range and significant individual variations in each lineage, the five protoploid Saccharomycetaceae share a core repertoire of approximately 3300 protein families and a high degree of conserved synteny. Synteny blocks were used to define gene orthology and to infer ancestors. Far from representing minimal genomes without redundancy, the five protoploid yeasts contain numerous copies of paralogous genes, either dispersed or in tandem arrays, that, altogether, constitute a third of each genome. Ancient, conserved paralogs as well as novel, lineage-specific paralogs were identified.
The Escherichia coli species represents one of the best-studied model organisms, but also encompasses a variety of commensal and pathogenic strains that diversify by high rates of genetic change. We uniformly (re-)annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the species most closely related to E. coli), including seven that we sequenced to completion. Within the approximately 18,000 families of orthologous genes, we found approximately 2,000 common to all strains. Although recombination rates are much higher than mutation rates, we show, both theoretically and using phylogenetic inference, that this does not obscure the phylogenetic signal, which places the B2 phylogenetic group and one group D strain at the basal position. Based on this phylogeny, we inferred past evolutionary events of gain and loss of genes, identifying functional classes under opposite selection pressures. We found an important adaptive role for metabolism diversification within group B2 and Shigella strains, but identified few or no extraintestinal virulence-specific genes, which could make it difficult to develop a vaccine against extraintestinal infections. Genome flux in E. coli is confined to a small number of conserved positions in the chromosome, which most often are not associated with integrases or tRNA genes. Core genes flanking some of these regions show higher rates of recombination, suggesting that a gene, once acquired by a strain, spreads within the species by homologous recombination at the flanking genes. Finally, the genome's long-scale structure of recombination indicates lower recombination rates, but not higher mutation rates, at the terminus of replication. The ensuing effect of background selection and biased gene conversion may thus explain why this region is A+T-rich and shows high sequence divergence but low sequence polymorphism. Overall, despite a very high gene flow, genes co-exist in an organised genome.
To better understand adaptation to harsh conditions encountered in hot arid deserts, we report the first complete genome sequence and proteome analysis of a bacterium, Deinococcus deserti VCD115, isolated from Sahara surface sand. Its genome consists of a 2.8-Mb chromosome and three large plasmids of 324 kb, 314 kb, and 396 kb. Accurate primary genome annotation of its 3,455 genes was guided by extensive proteome shotgun analysis. From the large corpus of MS/MS spectra recorded, 1,348 proteins were uncovered and semiquantified by spectral counting. Among the highly detected proteins are several orphans and Deinococcus-specific proteins of unknown function. The alliance of proteomics and genomics high-throughput techniques allowed identification of 15 unpredicted genes and, surprisingly, reversal of incorrectly predicted orientation of 11 genes. Reversal of orientation of two Deinococcus-specific radiation-induced genes, ddrC and ddrH, and identification in D. deserti of supplementary genes involved in manganese import extend our knowledge of the radiotolerance toolbox of Deinococcaceae. Additional genes involved in nutrient import and in DNA repair (i.e., two extra recA, three translesion DNA polymerases, a photolyase) were also identified and found to be expressed under standard growth conditions, and, for these DNA repair genes, after exposure of the cells to UV. The supplementary nutrient import and DNA repair genes are likely important for survival and adaptation of D. deserti to its nutrient-poor, dry, and UV-exposed extreme environment.
Pseudomonas entomophila is an entomopathogenic bacterium that, upon ingestion, kills Drosophila melanogaster as well as insects from different orders. The complete sequence of the 5.9-Mb genome was determined and compared to the sequenced genomes of four Pseudomonas species. P. entomophila possesses most of the catabolic genes of the closely related strain P. putida KT2440, revealing its metabolically versatile properties and its soil lifestyle. Several features that probably contribute to its entomopathogenic properties were disclosed. Unexpectedly for an animal pathogen, P. entomophila is devoid of a type III secretion system and associated toxins but rather relies on a number of potential virulence factors such as insecticidal toxins, proteases, putative hemolysins, hydrogen cyanide and novel secondary metabolites to infect and kill insects. Genome-wide random mutagenesis revealed the major role of the two-component system GacS/GacA that regulates most of the potential virulence factors identified.
Tetraodon nigroviridis is a freshwater puffer fish with the smallest known vertebrate genome. Here, we report a draft genome sequence with long-range linkage and substantial anchoring to the 21 Tetraodon chromosomes. Genome analysis provides a greatly improved fish gene catalogue, including identifying key genes previously thought to be absent in fish. Comparison with other vertebrates and a urochordate indicates that fish proteins have diverged markedly faster than their mammalian homologues. Comparison with the human genome suggests approximately 900 previously unannotated human genes. Analysis of the Tetraodon and human genomes shows that whole-genome duplication occurred in the teleost fish lineage, subsequent to its divergence from mammals. The analysis also makes it possible to infer the basic structure of the ancestral bony vertebrate genome, which was composed of 12 chromosomes, and to reconstruct much of the evolutionary history of ancient and recent chromosome rearrangements leading to the modern human karyotype.