Legionella pneumophila and L. longbeachae are two species of a large genus of bacteria that are ubiquitous in nature. L. pneumophila is mainly found in natural and artificial water circuits while L. longbeachae is mainly present in soil. Under the appropriate conditions both species are human pathogens, capable of causing a severe form of pneumonia termed Legionnaires' disease. Here we report the sequencing and analysis of four L. longbeachae genomes: one complete genome sequence of L. longbeachae strain NSW150 serogroup (Sg) 1, and three draft genome sequences, one belonging to Sg1 and two to Sg2. The genome organization and gene content of the four L. longbeachae genomes are highly conserved, indicating strong pressure for niche adaptation. Analysis and comparison of L. longbeachae strain NSW150 with L. pneumophila revealed common but also unexpected features specific to this pathogen. The interaction with host cells shows features distinct from those of L. pneumophila, as L. longbeachae possesses a unique repertoire of putative Dot/Icm type IV secretion system substrates, eukaryotic-like and eukaryotic domain proteins, and encodes additional secretion systems. However, analysis of the ability of a dotA mutant of L. longbeachae NSW150 to replicate in Acanthamoeba castellanii and in a mouse lung infection model showed that the Dot/Icm type IV secretion system is also essential for the virulence of L. longbeachae. In contrast to L. pneumophila, L. longbeachae does not encode flagella, thereby providing a possible explanation for differences in mouse susceptibility to infection between the two pathogens. Furthermore, transcriptome analysis revealed that L. longbeachae has a less pronounced biphasic life cycle than L. pneumophila, and genome analysis and electron microscopy suggested that L. longbeachae is encapsulated. These species-specific differences may account for the different environmental niches and disease epidemiology of these two Legionella species.
Streptococcus gallolyticus (formerly known as Streptococcus bovis biotype I) is an increasingly common cause of endocarditis among streptococci and is frequently associated with colon cancer. S. gallolyticus is part of the rumen flora but also a cause of disease in ruminants as well as in birds. Here we report the complete nucleotide sequence of strain UCN34, responsible for endocarditis in a patient also suffering from colon cancer. Analysis of the 2,239 proteins encoded by its 2,350-kb-long genome revealed unique features among streptococci, probably related to its adaptation to the rumen environment and its capacity to cause endocarditis. S. gallolyticus has the capacity to use a broad range of carbohydrates of plant origin, in particular to degrade polysaccharides derived from the plant cell wall. Its genome encodes a large repertoire of transporters and catalytic activities, such as tannase, phenolic compound decarboxylase, and bile salt hydrolase, that should contribute to the detoxification of the gut environment. Furthermore, S. gallolyticus synthesizes all 20 amino acids and more vitamins than any other sequenced Streptococcus species. Many of the genes encoding these specific functions were likely acquired by lateral gene transfer from other bacterial species present in the rumen. The surface properties of strain UCN34 may also contribute to its virulence. A polysaccharide capsule might be implicated in resistance to innate immunity defenses, and glucan mucopolysaccharides, three types of pili, and collagen binding proteins may play a role in adhesion to tissues in the course of endocarditis.
Reductive evolution and massive pseudogene formation have shaped the 3.31-Mb genome of Mycobacterium leprae, an unculturable obligate pathogen that causes leprosy in humans. The complete genome sequence of M. leprae strain Br4923 from Brazil was obtained by conventional methods (6x coverage), and Illumina resequencing technology was used to obtain the sequences of strains Thai53 (38x coverage) and NHDP63 (46x coverage) from Thailand and the United States, respectively. Whole-genome comparisons with the previously sequenced TN strain from India revealed that the four strains share 99.995% sequence identity and differ only in 215 polymorphic sites, mainly SNPs, and by 5 pseudogenes. Sixteen interrelated SNP subtypes were defined by genotyping both extant and extinct strains of M. leprae from around the world. The 16 SNP subtypes showed a strong geographical association that reflects the migration patterns of early humans and trade routes, with the Silk Road linking Europe to China having contributed to the spread of leprosy.
Leptospira biflexa is a free-living saprophytic spirochete present in aquatic environments. We determined the genome sequence of L. biflexa, making it the first saprophytic Leptospira to be sequenced. The L. biflexa genome has 3,590 protein-coding genes distributed across three circular replicons: the major 3,604-kb chromosome, a smaller 278-kb replicon that also carries essential genes, and a third 74-kb replicon. Comparative sequence analysis provides evidence that L. biflexa is an excellent model for the study of Leptospira evolution; we conclude that 2,052 genes (61%) represent a progenitor genome that existed before divergence of pathogenic and saprophytic Leptospira species. Comparisons of the L. biflexa genome with two pathogenic Leptospira species reveal several major findings. Nearly one-third of the L. biflexa genes are absent in pathogenic Leptospira. We suggest that once incorporated into the L. biflexa genome, laterally transferred DNA undergoes minimal rearrangement due to physical restrictions imposed by high gene density and limited presence of transposable elements. In contrast, the genomes of pathogenic Leptospira species undergo frequent rearrangements, often involving recombination between insertion sequences. Identification of genes common to the two pathogenic species, L. borgpetersenii and L. interrogans, but absent in L. biflexa, is consistent with a role for these genes in pathogenesis. Differences in environmental sensing capacities of L. biflexa, L. borgpetersenii, and L. interrogans suggest a model which postulates that loss of signal transduction functions in L. borgpetersenii has impaired its survival outside a mammalian host, whereas L. interrogans has retained environmental sensory functions that facilitate disease transmission through water.
Legionella pneumophila, the causative agent of Legionnaires' disease, replicates as an intracellular parasite of amoebae and persists in the environment as a free-living microbe. Here we have analyzed the complete genome sequences of L. pneumophila Paris (3,503,610 bp, 3,077 genes), an endemic strain that is predominant in France, and Lens (3,345,687 bp, 2,932 genes), an epidemic strain responsible for a major outbreak of disease in France. The L. pneumophila genomes show marked plasticity, with three different plasmids and with about 13% of the sequence differing between the two strains. Only strain Paris contains a type V secretion system, and its Lvh type IV secretion system is encoded by a 36-kb region that is either carried on a multicopy plasmid or integrated into the chromosome. Genetic mobility may enhance the versatility of L. pneumophila. Numerous genes encode eukaryotic-like proteins or motifs that are predicted to modulate host cell functions to the pathogen's advantage. The genome thus reflects the history and lifestyle of L. pneumophila, a human pathogen of macrophages that coevolved with fresh-water amoebae.