Awamori is a traditional distilled beverage made from steamed Thai-Indica rice in Okinawa, Japan. Two microbes are involved in brewing the liquor: the local kuro (black) koji mold Aspergillus luchuensis and the awamori yeast Saccharomyces cerevisiae. Whereas yeasts are used for ethanol fermentation throughout the world, a characteristic of Japanese fermentation industries is the use of Aspergillus molds as a source of enzymes for the maceration and saccharification of raw materials. Here we report the draft genome of a kuro (black) koji mold, A. luchuensis NBRC 4314 (RIB 2604). The total length of nonredundant sequences was nearly 34.7 Mb, comprising approximately 2,300 contigs with 16 telomere-like sequences. In total, 11,691 genes were predicted to encode proteins. Most of the housekeeping genes, such as those for transcription factors and the N- and O-glycosylation systems, were conserved with respect to Aspergillus niger and Aspergillus oryzae. Genes for an alternative oxidase and an acid-stable alpha-amylase, which are related to citric acid production and fermentation at low pH, as well as a unique glutamic peptidase, were also found in the genome. Furthermore, key biosynthetic gene clusters for ochratoxin A and fumonisin B were absent when compared with the A. niger genome, supporting the safety of A. luchuensis for food and beverage production. This genome information will facilitate not only comparative genomics with industrial kuro-koji molds, but also molecular breeding of the molds to improve awamori fermentation.
The term 'sake yeast' is generally used to indicate the Saccharomyces cerevisiae strains that possess characteristics distinct from those of other strains, including the laboratory strain S288C, and that are well suited for sake brewing. Here, we report the draft whole-genome shotgun sequence of a commonly used diploid sake yeast strain, Kyokai no. 7 (K7). The assembled sequence of K7 was nearly identical to that of S288C, except for several subtelomeric polymorphisms and two large inversions in K7. A survey of heterozygous bases between the homologous chromosomes revealed a mosaic-like, uneven distribution of heterozygosity in K7. The distribution patterns appeared to have resulted from repeated losses of heterozygosity in the ancestral lineage of K7. Analysis of genes revealed the presence of both K7-acquired and K7-lost genes, in addition to numerous others with segmentations and terminal discrepancies in comparison with those of S288C. The distribution of Ty elements also differed largely between the two strains. Interestingly, two regions in chromosomes I and VII of S288C have apparently been replaced by Ty elements in K7. Sequence comparisons suggest that these gene conversions were caused by cDNA-mediated recombination of Ty elements. The present study advances our understanding of the functional and evolutionary genomics of the sake yeast.
A filamentous non-N(2)-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. The nearly complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of the 6.7 Mb of determined sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently of unknown function. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were identified. Unique features in the gene composition were noted, particularly a large number of genes for adenylate cyclases, haemolysin-like Ca(2+)-binding proteins and chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis.
The complete genome sequence of the thermophilic sulphur-reducing bacterium Deferribacter desulfuricans SSM1, isolated from a hydrothermal vent chimney, has been determined. The genome comprises a single circular chromosome of 2,234,389 bp and a megaplasmid of 308,544 bp. Many genes encoded in the genome are most similar to the genes of sulphur- or sulphate-reducing bacterial species within Deltaproteobacteria. The reconstructed central metabolism showed a heterotrophic lifestyle primarily driven by C1 to C3 organics, e.g. formate, acetate, and pyruvate, and also suggested that the inability to grow autotrophically via a reductive tricarboxylic acid cycle may be due to the lack of ATP-dependent citrate lyase. In addition, the genome encodes numerous genes for chemoreceptors, chemotaxis-like systems, and signal transduction machineries. These signalling networks may be linked to this bacterium's versatile energy metabolism and may provide ecophysiological advantages for D. desulfuricans SSM1 thriving in the physically and chemically fluctuating environments near hydrothermal vents. This is the first genome sequence from the phylum Deferribacteres.
Acetobacter species have been used for brewing traditional vinegar and are known to be genetically unstable. To clarify this mutability, the genome of Acetobacter pasteurianus NBRC 3283, which forms a multi-phenotype cell complex, was sequenced. The genome analysis revealed more than 280 transposons and five genes with hyper-mutable tandem repeats as common features of the genome, which consists of a 2.9-Mb chromosome and six plasmids. There were three single nucleotide mutations and five transposon insertions in 32 isolates from the cell complex. The hyper-mutability of A. pasteurianus was applied to breed a temperature-resistant strain able to grow at an otherwise unviable high temperature (42 °C). The genomic DNA sequence of a heritable mutant showing temperature resistance was analyzed by mutation mapping, showing that a 92-kb deletion and three single nucleotide mutations occurred in the genome during the adaptation. Alpha-proteobacteria, to which A. pasteurianus belongs, include many intracellular symbionts and parasites, and their genomes show increased evolution rates and intensive genome reduction. Although A. pasteurianus is assumed to be a free-living bacterium, it may have the potential to evolve to fit the natural niches of seasonal fruits and flowers alongside other organisms, such as yeasts and lactic acid bacteria.
Lactobacillus reuteri is a heterofermentative lactic acid bacterium that naturally inhabits the gut of humans and other animals. The probiotic effects of L. reuteri have been proposed to be largely associated with the production of the broad-spectrum antimicrobial compound reuterin during anaerobic metabolism of glycerol. We determined the complete genome sequences of the reuterin-producing L. reuteri JCM 1112(T) and its closely related species Lactobacillus fermentum IFO 3956. Both are in the same phylogenetic group within the genus Lactobacillus. Comparative genome analysis revealed that L. reuteri JCM 1112(T) has a unique cluster of 58 genes for the biosynthesis of reuterin and cobalamin (vitamin B(12)). The 58-gene cluster has a lower GC content and is apparently inserted into the conserved region, suggesting that the cluster represents a genomic island acquired from an anomalous source. Two-dimensional nuclear magnetic resonance (2D-NMR) with (13)C(3)-glycerol demonstrated that L. reuteri JCM 1112(T) could convert glycerol to reuterin in vivo, substantiating the potential of L. reuteri JCM 1112(T) to produce reuterin in the intestine. Given that glycerol is shown to be naturally present in feces, the acquired ability to produce reuterin and cobalamin is an adaptive evolutionary response that likely contributes to the probiotic properties of L. reuteri.
The genome of Aspergillus oryzae, a fungus important for the production of traditional fermented foods and beverages in Japan, has been sequenced. The ability to secrete large amounts of proteins and the development of a transformation system have facilitated the use of A. oryzae in modern biotechnology. Although both A. oryzae and Aspergillus flavus belong to the section Flavi of the subgenus Circumdati of Aspergillus, A. oryzae, unlike A. flavus, does not produce aflatoxin, and its long history of use in the food industry has proved its safety. Here we show that the 37-megabase (Mb) genome of A. oryzae contains 12,074 genes and is expanded by 7-9 Mb in comparison with the genomes of Aspergillus nidulans and Aspergillus fumigatus. Comparison of the three aspergilli species revealed the presence of syntenic blocks and A. oryzae-specific blocks (lacking synteny with A. nidulans and A. fumigatus) in a mosaic manner throughout the genome of A. oryzae. The blocks of A. oryzae-specific sequence are enriched for genes involved in metabolism, particularly those for the synthesis of secondary metabolites. Specific expansion of genes for secretory hydrolytic enzymes, amino acid metabolism and amino acid/sugar uptake transporters supports the idea that A. oryzae is an ideal microorganism for fermentation.
The complete genomic sequence of an aerobic thermoacidophilic crenarchaeon, Sulfolobus tokodaii strain 7, which grows optimally at 80 °C, at low pH, and under aerobic conditions, has been determined by the whole genome shotgun method with slight modifications. The genome was 2,694,756 bp long and the G + C content was 32.8%. The following RNA-coding genes were identified: a single 16S-23S rRNA cluster, one 5S rRNA gene and 46 tRNA genes (including 24 intron-containing tRNA genes). The repetitive sequences identified were SR-type repetitive sequences, long dispersed-type repetitive sequences and Tn-like repetitive elements. The genome contained 2826 potential protein-coding regions (open reading frames, ORFs). By similarity search against public databases, 911 (32.2%) ORFs were related to functionally assigned genes, 921 (32.6%) were related to conserved ORFs of unknown function, 145 (5.1%) contained some motifs, and the remaining 849 (30.0%) did not show any significant similarity to the registered sequences. The ORFs with functional assignments included candidate genes involved in sulfide metabolism, the TCA cycle and the respiratory chain. Sequence comparison provided evidence suggesting plasmid integration, rearrangement of genomic structure, and duplication of genomic regions, which may be responsible for the larger size of the S. tokodaii strain 7 genome. The genome contained eukaryote-type genes that were not identified in other archaea, and lacked the CCA sequence in the tRNA genes. These results suggest that this strain is closer to eukaryotes than the other archaeal strains sequenced so far. The data presented in this paper are also available on the internet homepage (http://www.bio.nite.go.jp/E-home/genome_list-e.html/).
Streptomyces avermitilis is a soil bacterium that carries out not only a complex morphological differentiation but also the production of secondary metabolites, one of which, avermectin, is commercially important in human and veterinary medicine. The major interest in the genus Streptomyces as a group of industrial microorganisms is the diversity of the secondary metabolites it produces. A major factor in its prominence as a producer of a variety of secondary metabolites is its possession of several metabolic pathways for biosynthesis. Here we report sequence analysis of S. avermitilis, covering 99% of its genome. At least 8.7 million base pairs exist in the linear chromosome; this is the largest bacterial genome sequence, and it provides insights into the intrinsic diversity of secondary metabolite production in Streptomyces. Twenty-five kinds of secondary metabolite gene clusters were found in the genome of S. avermitilis. Four of them are concerned with the biosynthesis of melanin pigments: two clusters encode tyrosinase and its cofactor, and the other two are responsible for an ochronotic pigment derived from homogentisic acid and for a polyketide-derived melanin, respectively. The gene clusters for carotenoid and siderophore biosynthesis are composed of seven and five genes, respectively. There are eight kinds of gene clusters for the biosynthesis of type-I polyketide compounds, and two clusters are involved in the biosynthesis of type-II polyketide-derived compounds. Furthermore, a polyketide synthase that resembles phloroglucinol synthase was detected. Eight clusters are involved in the biosynthesis of peptide compounds that are synthesized by nonribosomal peptide synthetases. These secondary metabolite clusters are widely distributed in the genome, but half of them are near the two ends of the genome. The total length of these clusters occupies about 6.4% of the genome.
The complete sequence of the genome of an aerobic hyper-thermophilic crenarchaeon, Aeropyrum pernix K1, which grows optimally at 95 °C, has been determined by the whole genome shotgun method with some modifications. The entire length of the genome was 1,669,695 bp. The authenticity of the entire sequence was supported by restriction analysis of long PCR products, which were directly amplified from the genomic DNA. A total of 2,694 open reading frames (ORFs) were assigned as potential protein-coding regions. By similarity search against public databases, 633 (23.5%) of the ORFs were related to genes with putative function and 523 (19.4%) to sequences registered but with unknown function. All the genes in the TCA cycle except for that of alpha-ketoglutarate dehydrogenase were included, and instead of the alpha-ketoglutarate dehydrogenase gene, the genes coding for the two subunits of 2-oxoacid:ferredoxin oxidoreductase were identified. The remaining 1,538 ORFs (57.1%) did not show any significant similarity to the sequences in the databases. Sequence comparison among the assigned ORFs suggested that a considerable number of ORFs were generated by sequence duplication. The RNA genes identified were a single 16S-23S rRNA operon, two 5S rRNA genes and 47 tRNA genes including 14 genes with intron structures. All the assigned ORFs and RNA coding regions occupied 89.12% of the whole genome. The data presented in this paper are available on the internet homepage (http://www.mild.nite.go.jp).
The complete sequence of the genome of a hyper-thermophilic archaebacterium, Pyrococcus horikoshii OT3, has been determined by assembling the sequences of the physical map-based contigs of fosmid clones and of long polymerase chain reaction (PCR) products which were used for gap-filling. The entire length of the genome was 1,738,505 bp. The authenticity of the entire genome sequence was supported by restriction analysis of long PCR products, which were directly amplified from the genomic DNA. A total of 2061 open reading frames (ORFs) were assigned as potential protein-coding regions, and by similarity search against public databases, 406 (19.7%) were related to genes with putative function and 453 (22.0%) to sequences registered but with unknown function. The remaining 1202 ORFs (58.3%) did not show any significant similarity to the sequences in the databases. Sequence comparison among the assigned ORFs in the genome provided evidence that a considerable number of ORFs were generated by sequence duplication. By similarity search, 11 ORFs were assumed to contain intein elements. The RNA genes identified were a single 16S-23S rRNA operon, two 5S rRNA genes and 46 tRNA genes including two with an intron structure. All the assigned ORFs and RNA coding regions occupied 91.25% of the whole genome. The data presented in this paper are available on the internet at http://www.nite.go.jp.
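The ORF breakdown reported for P. horikoshii OT3 above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original study; the function name orf_percentages and the category labels are illustrative assumptions, and only the three counts are taken from the abstract.

```python
def orf_percentages(counts):
    """Return the total ORF count and each category's share, rounded to 0.1%."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return total, shares


# Counts as reported in the abstract; the labels are illustrative, not official.
p_horikoshii = {
    "putative function": 406,
    "registered, unknown function": 453,
    "no significant similarity": 1202,
}

total, shares = orf_percentages(p_horikoshii)
print(total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The three categories sum to the 2061 assigned ORFs, and the rounded shares agree with the percentages quoted in the abstract. The same function can be applied to the counts given for the other genomes in this collection.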