(Below, N is a link to the NCBI taxonomy web page and E is a link to ESTHER at the designated phylum.) > cellular organisms: NE > Bacteria: NE > Terrabacteria group: NE > Actinobacteria [phylum]: NE > Actinobacteria [class]: NE > Corynebacteriales: NE > Mycobacteriaceae: NE > Mycobacterium: NE > Mycobacterium tuberculosis complex: NE > Mycobacterium tuberculosis: NE
Warning: This entry is a compilation of different species, lines, or strains sharing more than 90% amino acid identity. You can retrieve all strain data
Mycobacterium tuberculosis 02_1987: N, E.
Mycobacterium tuberculosis KZN 1435: N, E.
Mycobacterium tuberculosis F11: N, E.
Mycobacterium tuberculosis H37Ra: N, E.
Mycobacterium tuberculosis GM 1503: N, E.
Mycobacterium tuberculosis str. Haarlem: N, E.
Mycobacterium tuberculosis str. Haarlem/NITR202: N, E.
Mycobacterium tuberculosis 94_M4241A: N, E.
Mycobacterium tuberculosis C: N, E.
Mycobacterium tuberculosis T85: N, E.
Mycobacterium tuberculosis EAS054: N, E.
Mycobacterium tuberculosis SUMu004: N, E.
Mycobacterium tuberculosis SUMu010: N, E.
Mycobacterium tuberculosis SUMu009: N, E.
Mycobacterium tuberculosis SUMu002: N, E.
Mycobacterium tuberculosis SUMu008: N, E.
Mycobacterium tuberculosis SUMu001: N, E.
Mycobacterium tuberculosis CDC1551A: N, E.
Mycobacterium tuberculosis TKK-01-0051: N, E.
Mycobacterium tuberculosis T17: N, E.
Mycobacterium tuberculosis T46: N, E.
Mycobacterium tuberculosis CPHL_A: N, E.
Mycobacterium tuberculosis K85: N, E.
Mycobacterium tuberculosis CDC1551: N, E.
Mycobacterium tuberculosis SUMu011: N, E.
Mycobacterium tuberculosis SUMu007: N, E.
Mycobacterium tuberculosis SUMu006: N, E.
Mycobacterium tuberculosis SUMu003: N, E.
Mycobacterium tuberculosis SUMu012: N, E.
Mycobacterium tuberculosis SUMu005: N, E.
Mycobacterium tuberculosis T92: N, E.
Mycobacterium tuberculosis str. Erdman = ATCC 35801: N, E.
Mycobacterium tuberculosis FJ05194: N, E.
Mycobacterium tuberculosis EAI5/NITR206: N, E.
Mycobacterium tuberculosis UT205: N, E.
Mycobacterium tuberculosis CCDC5180: N, E.
Mycobacterium tuberculosis H37Rv: N, E.
Mycobacterium tuberculosis CCDC5079: N, E.
Mycobacterium tuberculosis BT2: N, E.
Mycobacterium tuberculosis EAI5: N, E.
Mycobacterium tuberculosis W-148: N, E.
Mycobacterium tuberculosis CTRI-2: N, E.
Mycobacterium tuberculosis RGTB327: N, E.
Mycobacterium tuberculosis '98-R604 INH-RIF-EM': N, E.
Mycobacterium tuberculosis str. Beijing/NITR203: N, E.
Mycobacterium tuberculosis HKBS1: N, E.
Mycobacterium tuberculosis CAS/NITR204: N, E.
Mycobacterium tuberculosis 7199-99: N, E.
Mycobacterium tuberculosis KZN 605: N, E.
Mycobacterium tuberculosis NCGM2209: N, E.
Mycobacterium tuberculosis BT1: N, E.
Mycobacterium tuberculosis RGTB423: N, E.
Mycobacterium tuberculosis KZN 4207: N, E.
Mycobacterium tuberculosis GuangZ0019: N, E.
Mycobacterium tuberculosis 2092HD: N, E.
Mycobacterium tuberculosis variant caprae: N, E.
Mycobacterium tuberculosis variant africanum: N, E.
Mycobacterium tuberculosis variant microti OV254: N, E.
Legend: This sequence has been compared to the family alignment (MSA).
red => minority amino acid
blue => majority amino acid
color intensity => conservation rate
title => sequence position (MSA position), amino acid, rate
Catalytic site
Catalytic site in the MSA
MGAPTERLVDTNGVRLRVVEAGEPGAPVVILAHGFPELAYSWRHQIPALA
DAGYHVLAPDQRGYGGSSRPEAIEAYDIHRLTADLVGLLDDVGAERAVWV
GHDWGAVVVWNAPLLHADRVAAVAALSVPALPRAQVPPTQAFRSRFGENF
FYILYFQEPGIADAELNGDPARTMRRMIGGLRPPGDQSAAMRMLAPGPDG
FIDRLPEPAGLPAWISQEELDHYIGEFTRTGFTGGLNWYRNFDRNWETTA
DLAGKTISVPSLFIAGTADPVLTFTRTDRAAEVISGPYREVLIDGAGHWL
QQERPGEVTAALLEFLTGLELR
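The sequence above is printed in whitespace-separated display blocks rather than as one continuous string. The following is a minimal sketch, in Python, of how the blocks might be rejoined and sanity-checked before further analysis. The names `BLOCKS`, `join_blocks`, and `residue_counts` are illustrative assumptions and are not part of ESTHER or any library.

```python
from collections import Counter

# Display blocks copied verbatim from the entry above (assumption: the
# whitespace between blocks is layout only, not part of the sequence).
BLOCKS = """
MGAPTERLVDTNGVRLRVVEAGEPGAPVVILAHGFPELAYSWRHQIPALA
DAGYHVLAPDQRGYGGSSRPEAIEAYDIHRLTADLVGLLDDVGAERAVWV
GHDWGAVVVWNAPLLHADRVAAVAALSVPALPRAQVPPTQAFRSRFGENF
FYILYFQEPGIADAELNGDPARTMRRMIGGLRPPGDQSAAMRMLAPGPDG
FIDRLPEPAGLPAWISQEELDHYIGEFTRTGFTGGLNWYRNFDRNWETTA
DLAGKTISVPSLFIAGTADPVLTFTRTDRAAEVISGPYREVLIDGAGHWL
QQERPGEVTAALLEFLTGLELR
""".split()

# The 20 standard one-letter amino acid codes.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def join_blocks(blocks):
    """Concatenate display blocks into one continuous sequence."""
    return "".join(blocks)


def residue_counts(seq):
    """Return a Counter mapping each residue letter to its frequency."""
    return Counter(seq)


sequence = join_blocks(BLOCKS)

# Report anything that is not a standard residue code (e.g. copy debris).
unexpected = set(sequence) - STANDARD_RESIDUES

print("length:", len(sequence))
print("unexpected characters:", sorted(unexpected))
print("most common residues:", residue_counts(sequence).most_common(3))
```

Checking for unexpected characters first is a cheap guard against copy-and-paste debris from the web page ending up in downstream alignments.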
Title: The alpha/beta Hydrolase Fold Proteins of Mycobacterium tuberculosis, With Reference to their Contribution to Virulence Johnson G Ref: Curr Protein Pept Sci, 18:190, 2016 : PubMed
The alpha/beta hydrolase fold superfamily is an ancient and widely diversified group of primarily hydrolytic enzymes. In this review, the adaptations of these proteins to the pathogenic lifestyle of Mycobacterium tuberculosis (Mtb), the causative agent of tuberculosis, are examined. Of the 105 alpha/beta hydrolases identified in Mtb, many are associated with lipid metabolism, particularly in the biosynthesis and maintenance of Mtb's unique cell envelope, as well as in the large number of extracellular lipases that are likely responsible for degradation of host lipid material. Alpha/beta hydrolase fold proteins are also involved in the evasion and modulation of the immune response, detoxification and metabolic adaptations, including growth, response to acidification of the intracellular environment and dormancy. A striking feature of Mtb's alpha/beta hydrolases is their diversification into virulence-associated niches. It is clear that the alpha/beta hydrolase fold family has made a significant contribution to Mtb's remarkable success as a pathogen.
The genome sequencing of H37Rv strain of Mycobacterium tuberculosis was completed in 1998 followed by the whole genome sequencing of a clinical isolate, CDC1551 in 2002. Since then, the genomic sequences of a number of other strains have become available making it one of the better studied pathogenic bacterial species at the genomic level. However, annotation of its genome remains challenging because of high GC content and dissimilarity to other model prokaryotes. To this end, we carried out an in-depth proteogenomic analysis of the M. tuberculosis H37Rv strain using Fourier transform mass spectrometry with high resolution at both MS and tandem MS levels. In all, we identified 3176 proteins from Mycobacterium tuberculosis representing ~80% of its total predicted gene count. In addition to protein database search, we carried out a genome database search, which led to identification of ~250 novel peptides. Based on these novel genome search-specific peptides, we discovered 41 novel protein coding genes in the H37Rv genome. Using peptide evidence and alternative gene prediction tools, we also corrected 79 gene models. Finally, mass spectrometric data from N terminus-derived peptides confirmed 727 existing annotations for translational start sites while correcting those for 33 proteins. We report creation of a high confidence set of protein coding regions in Mycobacterium tuberculosis genome obtained by high resolution tandem mass-spectrometry at both precursor and fragment detection steps for the first time. This proteogenomic approach should be generally applicable to other organisms whose genomes have already been sequenced for obtaining a more accurate catalogue of protein-coding genes.
TubercuList (http://tuberculist.epfl.ch/), the relational database that presents genome-derived information about H37Rv, the paradigm strain of Mycobacterium tuberculosis, has been active for ten years and now presents its twentieth release. Here, we describe some of the recent changes that have resulted from manual annotation with information from the scientific literature. Through manual curation, TubercuList strives to provide current gene-based information and is thus distinguished from other online sources of genome sequence data for M. tuberculosis. New, mostly small, genes have been discovered and the coordinates of some existing coding sequences have been changed when bioinformatics or experimental data suggest that this is required. Nucleotides that are polymorphic between different sources of H37Rv are annotated and gene essentiality data have been updated. A host of functional information has been gleaned from the literature and many new activities of proteins and RNAs have been included. To facilitate basic and translational research, TubercuList also provides links to other specialized databases that present diverse datasets such as 3D-structures, expression profiles, drug development criteria and drug resistance information, in addition to direct access to PubMed articles pertinent to particular genes. TubercuList has been and remains a highly valuable tool for the tuberculosis research community with >75,000 visitors per month.
        
Title: Re-annotation of the genome sequence of Mycobacterium tuberculosis H37Rv Camus JC, Pryor MJ, Medigue C, Cole ST Ref: Microbiology, 148:2967, 2002 : PubMed
Original genome annotations need to be regularly updated if the information they contain is to remain accurate and relevant. Here the complete re-annotation of the genome sequence of Mycobacterium tuberculosis strain H37Rv is presented almost 4 years after the first submission. Eighty-two new protein-coding sequences (CDS) have been included and 22 of these have a predicted function. The majority were identified by manual or automated re-analysis of the genome and most of them were shorter than the 100 codon cut-off used in the initial genome analysis. The functional classification of 643 CDS has been changed based principally on recent sequence comparisons and new experimental data from the literature. More than 300 gene names and over 1000 targeted citations have been added and the lengths of 60 genes have been modified. Presently, it is possible to assign a function to 2058 proteins (52% of the 3995 proteins predicted) and only 376 putative proteins share no homology with known proteins and thus could be unique to M. tuberculosis.
Virulence and immunity are poorly understood in Mycobacterium tuberculosis. We sequenced the complete genome of the M. tuberculosis clinical strain CDC1551 and performed a whole-genome comparison with the laboratory strain H37Rv in order to identify polymorphic sequences with potential relevance to disease pathogenesis, immunity, and evolution. We found large-sequence and single-nucleotide polymorphisms in numerous genes. Polymorphic loci included a phospholipase C, a membrane lipoprotein, members of an adenylate cyclase gene family, and members of the PE/PPE gene family, some of which have been implicated in virulence or the host immune response. Several gene families, including the PE/PPE gene family, also had significantly higher synonymous and nonsynonymous substitution frequencies compared to the genome as a whole. We tested a large sample of M. tuberculosis clinical isolates for a subset of the large-sequence and single-nucleotide polymorphisms and found widespread genetic variability at many of these loci. We performed phylogenetic and epidemiological analysis to investigate the evolutionary relationships among isolates and the origins of specific polymorphic loci. A number of these polymorphisms appear to have occurred multiple times as independent events, suggesting that these changes may be under selective pressure. Together, these results demonstrate that polymorphisms among M. tuberculosis strains are more extensive than initially anticipated, and genetic variation may have an important role in disease pathogenesis and immunity.
Countless millions of people have died from tuberculosis, a chronic infectious disease caused by the tubercle bacillus. The complete genome sequence of the best-characterized strain of Mycobacterium tuberculosis, H37Rv, has been determined and analysed in order to improve our understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions. The genome comprises 4,411,529 base pairs, contains around 4,000 genes, and has a very high guanine + cytosine content that is reflected in the biased amino-acid content of the proteins. M. tuberculosis differs radically from other bacteria in that a very large portion of its coding capacity is devoted to the production of enzymes involved in lipogenesis and lipolysis, and to two new families of glycine-rich proteins with a repetitive structure that may represent a source of antigenic variation.