A distinct class of infectious agents, the virophages that infect giant viruses of the Mimiviridae family, has been recently described. Here we report the simultaneous discovery of a giant virus of Acanthamoeba polyphaga (Lentille virus) that contains an integrated genome of a virophage (Sputnik 2), and a member of a previously unknown class of mobile genetic elements, the transpovirons. The transpovirons are linear DNA elements of ~7 kb that encompass six to eight protein-coding genes, two of which are homologous to virophage genes. Fluorescence in situ hybridization showed that the free form of the transpoviron replicates within the giant virus factory and accumulates in high copy numbers inside giant virus particles, Sputnik 2 particles, and amoeba cytoplasm. Analysis of deep-sequencing data showed that the virophage and the transpoviron can integrate in nearly any place in the chromosome of the giant virus host and that, although less frequently, the transpoviron can also be linked to the virophage chromosome. In addition, integrated fragments of transpoviron DNA were detected in several giant virus and Sputnik genomes. Analysis of 19 Mimivirus strains revealed three distinct transpovirons associated with three subgroups of Mimiviruses. The virophage, the transpoviron, and the previously identified self-splicing introns and inteins constitute the complex, interconnected mobilome of the giant viruses and are likely to substantially contribute to interviral gene transfer.
We describe the draft genome of the microcrustacean Daphnia pulex, which is only 200 megabases and contains at least 30,907 genes. The high gene count is a consequence of an elevated rate of gene duplication resulting in tandem gene clusters. More than a third of Daphnia's genes have no detectable homologs in any other available proteome, and the most amplified gene families are specific to the Daphnia lineage. The coexpansion of gene families interacting within metabolic pathways suggests that the maintenance of duplicated genes is not random, and the analysis of gene expression under different environmental conditions reveals that numerous paralogs acquire divergent expression patterns soon after duplication. Daphnia-specific genes, including many additional loci within sequenced regions that are otherwise devoid of annotations, are the most responsive genes to ecological challenges.
The genome sequence of the Mamavirus, a new Acanthamoeba polyphaga mimivirus strain, is reported. With 1,191,693 nt in length and 1,023 predicted protein-coding genes, the Mamavirus has the largest genome among the known viruses. The genomes of the Mamavirus and the previously described Mimivirus are highly similar in both the protein-coding genes and the intergenic regions. However, the Mamavirus contains an extra 5'-terminal segment that encompasses primarily disrupted duplicates of genes present elsewhere in the genome. The Mamavirus also has several unique genes including a small regulatory polyA polymerase subunit that is shared with poxviruses. Detailed analysis of the protein sequences of the two Mimiviruses led to a substantial amendment of the functional annotation of the viral genomes.
The candidate division Korarchaeota comprises a group of uncultivated microorganisms that, by their small subunit rRNA phylogeny, may have diverged early from the major archaeal phyla Crenarchaeota and Euryarchaeota. Here, we report the initial characterization of a member of the Korarchaeota with the proposed name, "Candidatus Korarchaeum cryptofilum," which exhibits an ultrathin filamentous morphology. To investigate possible ancestral relationships between deep-branching Korarchaeota and other phyla, we used whole-genome shotgun sequencing to construct a complete composite korarchaeal genome from enriched cells. The genome was assembled into a single contig 1.59 Mb in length with a G + C content of 49%. Of the 1,617 predicted protein-coding genes, 1,382 (85%) could be assigned to a revised set of archaeal Clusters of Orthologous Groups (COGs). The predicted gene functions suggest that the organism relies on a simple mode of peptide fermentation for carbon and energy and lacks the ability to synthesize de novo purines, CoA, and several other cofactors. Phylogenetic analyses based on conserved single genes and concatenated protein sequences positioned the korarchaeote as a deep archaeal lineage with an apparent affinity to the Crenarchaeota. However, the predicted gene content revealed that several conserved cellular systems, such as cell division, DNA replication, and tRNA maturation, resemble the counterparts in the Euryarchaeota. In light of the known composition of archaeal genomes, the Korarchaeota might have retained a set of cellular features that represents the ancestral archaeal form.
BACKGROUND: The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and human gastrointestinal tract, only few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. RESULTS: We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum) is an autotrophic bacterium with a streamlined genome of ~2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely acidic conditions including a major upward shift in the isoelectric points of proteins. CONCLUSION: The results of genome analysis of M. infernorum support the monophyly of the PVC superphylum. M. infernorum possesses a streamlined genome but seems to have acquired numerous genes including those for enzymes of methylotrophic pathways via horizontal gene transfer, in particular, from Proteobacteria. REVIEWERS: This article was reviewed by John A. Fuerst, Ludmila Chistoserdova, and Radhey S. Gupta.
BACKGROUND: Gram-positive bacteria of the genus Anoxybacillus have been found in diverse thermophilic habitats, such as geothermal hot springs and manure, and in processed foods such as gelatin and milk powder. Anoxybacillus flavithermus is a facultatively anaerobic bacterium found in super-saturated silica solutions and in opaline silica sinter. The ability of A. flavithermus to grow in super-saturated silica solutions makes it an ideal subject to study the processes of sinter formation, which might be similar to the biomineralization processes that occurred at the dawn of life. RESULTS: We report here the complete genome sequence of A. flavithermus strain WK1, isolated from the waste water drain at the Wairakei geothermal power station in New Zealand. It consists of a single chromosome of 2,846,746 base pairs and is predicted to encode 2,863 proteins. In silico genome analysis identified several enzymes that could be involved in silica adaptation and biofilm formation, and their predicted functions were experimentally validated in vitro. Proteomic analysis confirmed the regulation of biofilm-related proteins and crucial enzymes for the synthesis of long-chain polyamines as constituents of silica nanospheres. CONCLUSIONS: Microbial fossils preserved in silica and silica sinters are excellent objects for studying ancient life, a new paleobiological frontier. An integrated analysis of the A. flavithermus genome and proteome provides the first glimpse of metabolic adaptation during silicification and sinter formation. Comparative genome analysis suggests an extensive gene loss in the Anoxybacillus/Geobacillus branch after its divergence from other bacilli.
We report the complete genome sequence of the deep-sea gamma-proteobacterium, Idiomarina loihiensis, isolated recently from a hydrothermal vent at 1,300-m depth on the Loihi submarine volcano, Hawaii. The I. loihiensis genome comprises a single chromosome of 2,839,318 base pairs, encoding 2,640 proteins, four rRNA operons, and 56 tRNA genes. A comparison of I. loihiensis to the genomes of other gamma-proteobacteria reveals abundance of amino acid transport and degradation enzymes, but a loss of sugar transport systems and certain enzymes of sugar metabolism. This finding suggests that I. loihiensis relies primarily on amino acid catabolism, rather than on sugar fermentation, for carbon and energy. Enzymes for biosynthesis of purines, pyrimidines, the majority of amino acids, and coenzymes are encoded in the genome, but biosynthetic pathways for Leu, Ile, Val, Thr, and Met are incomplete. Auxotrophy for Val and Thr was confirmed by in vivo experiments. The I. loihiensis genome contains a cluster of 32 genes encoding enzymes for exopolysaccharide and capsular polysaccharide synthesis. It also encodes diverse peptidases, a variety of peptide and amino acid uptake systems, and versatile signal transduction machinery. We propose that the source of amino acids for I. loihiensis growth are the proteinaceous particles present in the deep sea hydrothermal vent waters. I. loihiensis would colonize these particles by using the secreted exopolysaccharide, digest these proteins, and metabolize the resulting peptides and amino acids. In summary, the I. loihiensis genome reveals an integrated mechanism of metabolic adaptation to the constantly changing deep-sea hydrothermal ecosystem.
Prochlorococcus marinus, the dominant photosynthetic organism in the ocean, is found in two main ecological forms: high-light-adapted genotypes in the upper part of the water column and low-light-adapted genotypes at the bottom of the illuminated layer. P. marinus SS120, the complete genome sequence reported here, is an extremely low-light-adapted form. The genome of P. marinus SS120 is composed of a single circular chromosome of 1,751,080 bp with an average G+C content of 36.4%. It contains 1,884 predicted protein-coding genes with an average size of 825 bp, a single rRNA operon, and 40 tRNA genes. Together with the 1.66-Mbp genome of P. marinus MED4, the genome of P. marinus SS120 is one of the two smallest genomes of a photosynthetic organism known to date. It lacks many genes that are involved in photosynthesis, DNA repair, solute uptake, intermediary metabolism, motility, phototaxis, and other functions that are conserved among other cyanobacteria. Systems of signal transduction and environmental stress response show a particularly drastic reduction in the number of components, even taking into account the small size of the SS120 genome. In contrast, housekeeping genes, which encode enzymes of amino acid, nucleotide, cofactor, and cell wall biosynthesis, are all present. Because of its remarkable compactness, the genome of P. marinus SS120 might approximate the minimal gene complement of a photosynthetic organism.
We have determined the complete 1,694,969-nt sequence of the GC-rich genome of Methanopyrus kandleri by using a whole direct genome sequencing approach. This approach is based on unlinking of genomic DNA with the ThermoFidelase version of M. kandleri topoisomerase V and cycle sequencing directed by 2'-modified oligonucleotides (Fimers). Sequencing redundancy (3.3x) was sufficient to assemble the genome with less than one error per 40 kb. Using a combination of sequence database searches and coding potential prediction, 1,692 protein-coding genes and 39 genes for structural RNAs were identified. M. kandleri proteins show an unusually high content of negatively charged amino acids, which might be an adaptation to the high intracellular salinity. Previous phylogenetic analysis of 16S RNA suggested that M. kandleri belonged to a very deep branch, close to the root of the archaeal tree. However, genome comparisons indicate that, in both trees constructed using concatenated alignments of ribosomal proteins and trees based on gene content, M. kandleri consistently groups with other archaeal methanogens. M. kandleri shares the set of genes implicated in methanogenesis and, in part, its operon organization with Methanococcus jannaschii and Methanothermobacter thermoautotrophicum. These findings indicate that archaeal methanogens are monophyletic. A distinctive feature of M. kandleri is the paucity of proteins involved in signaling and regulation of gene expression. Also, M. kandleri appears to have fewer genes acquired via lateral transfer than other archaea. These features might reflect the extreme habitat of this organism.
The genome sequence of the solvent-producing bacterium Clostridium acetobutylicum ATCC 824 has been determined by the shotgun approach. The genome consists of a 3.94-Mb chromosome and a 192-kb megaplasmid that contains the majority of genes responsible for solvent production. Comparison of C. acetobutylicum to Bacillus subtilis reveals significant local conservation of gene order, which has not been seen in comparisons of other genomes with similar, or, in some cases closer, phylogenetic proximity. This conservation allows the prediction of many previously undetected operons in both bacteria. However, the C. acetobutylicum genome also contains a significant number of predicted operons that are shared with distantly related bacteria and archaea but not with B. subtilis. Phylogenetic analysis is compatible with the dissemination of such operons by horizontal transfer. The enzymes of the solventogenesis pathway and of the cellulosome of C. acetobutylicum comprise a new set of metabolic capacities not previously represented in the collection of complete genomes. These enzymes show a complex pattern of evolutionary affinities, emphasizing the role of lateral gene exchange in the evolution of the unique metabolic profile of the bacterium. Many of the sporulation genes identified in B. subtilis are missing in C. acetobutylicum, which suggests major differences in the sporulation process. Thus, comparative analysis reveals both significant conservation of the genome organization and pronounced differences in many systems that reflect unique adaptive strategies of the two gram-positive bacteria.
Extremely halophilic archaea contain retinal-binding integral membrane proteins called bacteriorhodopsins that function as light-driven proton pumps. So far, bacteriorhodopsins capable of generating a chemiosmotic membrane potential in response to light have been demonstrated only in halophilic archaea. We describe here a type of rhodopsin derived from bacteria that was discovered through genomic analyses of naturally occuring marine bacterioplankton. The bacterial rhodopsin was encoded in the genome of an uncultivated gamma-proteobacterium and shared highest amino acid sequence similarity with archaeal rhodopsins. The protein was functionally expressed in Escherichia coli and bound retinal to form an active, light-driven proton pump. The new rhodopsin exhibited a photochemical reaction cycle with intermediates and kinetics characteristic of archaeal proton-pumping rhodopsins. Our results demonstrate that archaeal-like rhodopsins are broadly distributed among different taxa, including members of the domain Bacteria. Our data also indicate that a previously unsuspected mode of bacterially mediated light-driven energy generation may commonly occur in oceanic surface waters worldwide.
Chromosome 2 of Plasmodium falciparum was sequenced; this sequence contains 947,103 base pairs and encodes 210 predicted genes. In comparison with the Saccharomyces cerevisiae genome, chromosome 2 has a lower gene density, introns are more frequent, and proteins are markedly enriched in nonglobular domains. A family of surface proteins, rifins, that may play a role in antigenic variation was identified. The complete sequencing of chromosome 2 has shown that sequencing of the A+T-rich P. falciparum genome is technically feasible.
Analysis of the 1,042,519-base pair Chlamydia trachomatis genome revealed unexpected features related to the complex biology of chlamydiae. Although chlamydiae lack many biosynthetic capabilities, they retain functions for performing key steps and interconversions of metabolites obtained from their mammalian host cells. Numerous potential virulence-associated proteins also were characterized. Several eukaryotic chromatin-associated domain proteins were identified, suggesting a eukaryotic-like mechanism for chlamydial nucleoid condensation and decondensation. The phylogenetic mosaic of chlamydial genes, including a large number of genes with phylogenetic origins from eukaryotes, implies a complex evolution for adaptation to obligate intracellular parasitism.