Lipoxygenase (LOX)-catalyzed oxidation of the essential fatty acid, linoleate, represents a vital step in construction of the mammalian epidermal permeability barrier. Analysis of epidermal lipids indicates that linoleate is converted to a trihydroxy derivative by hydrolysis of an epoxy-hydroxy precursor. We evaluated different epoxide hydrolase (EH) enzymes in the hydrolysis of skin-relevant fatty acid epoxides and compared the products to those of acid-catalyzed hydrolysis. In the absence of enzyme, exposure to pH 5 or pH 6 at 37 °C for 30 min hydrolyzed fatty acid allylic epoxyalcohols to four trihydroxy products. By contrast, human soluble EH [sEH (EPHX2)] and human or murine epoxide hydrolase-3 [EH3 (EPHX3)] hydrolyzed cis or trans allylic epoxides to single diastereomers, identical to the major isomers detected in epidermis. Microsomal EH [mEH (EPHX1)] was inactive with these substrates. At low substrate concentrations (<10 µM), EPHX2 hydrolyzed 14,15-epoxyeicosatrienoic acid (EET) at twice the rate of the epidermal epoxyalcohol, 9R,10R-trans-epoxy-11E-13R-hydroxy-octadecenoic acid, whereas human or murine EPHX3 hydrolyzed the allylic epoxyalcohol at 31-fold and 39-fold higher rates, respectively. These data implicate the activities of EPHX2 and EPHX3 in production of the linoleate triols detected as end products of the 12R-LOX pathway in the epidermis and implicate their functioning in formation of the mammalian water permeability barrier.
Recently, several endophytic fungi have been demonstrated to produce volatile organic compounds (VOCs) with properties similar to fossil fuels, called "mycodiesel," while growing on lignocellulosic plant and agricultural residues. The fact that endophytes are plant symbionts suggests that some may be able to produce lignocellulolytic enzymes, making them capable of both deconstructing lignocellulose and converting it into mycodiesel, two properties that indicate that these strains may be useful consolidated bioprocessing (CBP) hosts for biofuel production. In this study, four endophytes, Hypoxylon sp. CI4A, Hypoxylon sp. EC38, Hypoxylon sp. CO27, and Daldinia eschscholzii EC12, were selected and evaluated for their CBP potential. Analysis of their genomes indicates that these endophytes have a rich reservoir of biomass-deconstructing carbohydrate-active enzymes (CAZymes), which includes enzymes active on both polysaccharides and lignin, as well as terpene synthases (TPSs), enzymes that may produce fuel-like molecules, suggesting that they do indeed have CBP potential. GC-MS analyses of their VOCs when grown on four representative lignocellulosic feedstocks revealed that these endophytes produce a wide spectrum of hydrocarbons, the majority of which are monoterpenes and sesquiterpenes, including some known biofuel candidates. Analysis of their cellulase activity when grown under the same conditions revealed that these endophytes actively produce endoglucanases, exoglucanases, and beta-glucosidases. The richness of CAZymes as well as terpene synthases identified in these four endophytic fungi suggests that they are strong candidates to pursue for development into platform CBP organisms.
Several quantitative trait loci (QTL) mapping strategies can successfully identify major-effect loci, but often have poor success detecting loci with minor effects, potentially due to the confounding effects of major loci, epistasis, and limited sample sizes. To overcome such difficulties, we used a targeted backcross mapping strategy that genetically eliminated the effect of a previously identified major QTL underlying high-temperature growth (Htg) in yeast. This strategy facilitated the mapping of three novel QTL contributing to Htg of a clinically derived yeast strain. One QTL, which is linked to the previously identified major-effect QTL, was dissected, and NCS2 was identified as the causative gene. The interaction of the NCS2 QTL with the first major-effect QTL was background dependent, revealing a complex QTL architecture spanning these two linked loci. Such complex architecture suggests that more genes than can be predicted are likely to contribute to quantitative traits. The targeted backcrossing approach overcomes the difficulties posed by sample size, genetic linkage, and epistatic effects and facilitates identification of additional alleles with smaller contributions to complex traits.
We sequenced the genome of Saccharomyces cerevisiae strain YJM789, which was derived from a yeast isolated from the lung of an AIDS patient with pneumonia. The strain is used for studies of fungal infections and quantitative genetics because of its extensive phenotypic differences to the laboratory reference strain, including growth at high temperature and deadly virulence in mouse models. Here we show that the approximately 12-Mb genome of YJM789 contains approximately 60,000 SNPs and approximately 6,000 indels with respect to the reference S288c genome, leading to protein polymorphisms with a few known cases of phenotypic changes. Several ORFs are found to be unique to YJM789, some of which might have been acquired through horizontal transfer. Localized regions of high polymorphism density are scattered over the genome, in some cases spanning multiple ORFs and in others concentrated within single genes. The sequence of YJM789 contains clues to pathogenicity and spurs the development of more powerful approaches to dissecting the genetic basis of complex hereditary traits.
Cryptococcus neoformans is a basidiomycetous yeast ubiquitous in the environment, a model for fungal pathogenesis, and an opportunistic human pathogen of global importance. We have sequenced its approximately 20-megabase genome, which contains approximately 6500 intron-rich gene structures and encodes a transcriptome abundant in alternatively spliced and antisense messages. The genome is rich in transposons, many of which cluster at candidate centromeric regions. The presence of these transposons may drive karyotype instability and phenotypic variation. C. neoformans encodes unique genes that may contribute to its unusual virulence properties, and comparison of two phenotypically distinct strains reveals variation in gene content in addition to sequence polymorphisms between the genomes.
We present the diploid genome sequence of the fungal pathogen Candida albicans. Because C. albicans has no known haploid or homozygous form, sequencing was performed as a whole-genome shotgun of the heterozygous diploid genome in strain SC5314, a clinical isolate that is the parent of strains widely used for molecular analysis. We developed computational methods to assemble a diploid genome sequence in good agreement with available physical mapping data. We provide a whole-genome description of heterozygosity in the organism. Comparative genomic analyses provide important clues about the evolution of the species and its mechanisms of pathogenesis.
Functional analysis of a genome requires accurate gene structure information and a complete gene inventory. A dual experimental strategy was used to verify and correct the initial genome sequence annotation of the reference plant Arabidopsis. Sequencing full-length cDNAs and hybridizations using RNA populations from various tissues to a set of high-density oligonucleotide arrays spanning the entire genome allowed the accurate annotation of thousands of gene structures. We identified 5817 novel transcription units, including a substantial amount of antisense gene transcription, and 40 genes within the genetically defined centromeres. This approach resulted in completion of approximately 30% of the Arabidopsis ORFeome as a resource for global functional experimentation of the plant proteome.
The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host-parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria.
Determining the effect of gene deletion is a fundamental approach to understanding gene function. Conventional genetic screens exhibit biases, and genes contributing to a phenotype are often missed. We systematically constructed a nearly complete collection of gene-deletion mutants (96% of annotated open reading frames, or ORFs) of the yeast Saccharomyces cerevisiae. DNA sequences dubbed 'molecular bar codes' uniquely identify each strain, enabling their growth to be analysed in parallel and the fitness contribution of each gene to be quantitatively assessed by hybridization to high-density oligonucleotide arrays. We show that previously known and new genes are necessary for optimal growth under six well-studied conditions: high salt, sorbitol, galactose, pH 8, minimal medium and nystatin treatment. Less than 7% of genes that exhibit a significant increase in messenger RNA expression are also required for optimal growth in four of the tested conditions. Our results validate the yeast gene-deletion collection as a valuable resource for functional genomics.
The human malaria parasite Plasmodium falciparum is responsible for the death of more than a million people every year. To stimulate basic research on the disease, and to promote the development of effective drugs and vaccines against the parasite, the complete genome of P. falciparum clone 3D7 has been sequenced, using a chromosome-by-chromosome shotgun strategy. Here we report the nucleotide sequence of the third largest of the parasite's 14 chromosomes, chromosome 12, which comprises about 10% of the 23-megabase genome. As the most (A + T)-rich (80.6%) genome sequenced to date, the P. falciparum genome presented severe problems during the assembly of primary sequence reads. We discuss the methodology that yielded a finished and fully contiguous sequence for chromosome 12. The biological implications of the sequence data are more thoroughly discussed in an accompanying Article (ref. 3).
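The (A + T) richness quoted above (80.6%) is a simple base-composition ratio. A minimal sketch of how such a figure can be computed from a sequence string is given below; the function name `at_fraction` and the toy sequence are illustrative assumptions, not part of the original study, and ambiguous bases such as N are simply ignored.

```python
def at_fraction(sequence: str) -> float:
    """Return the fraction of A and T among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return at / total


# Hypothetical toy sequence, not taken from the P. falciparum genome.
toy = "ATTATAAGCTATTTAATTAA"
print(f"(A + T) content: {at_fraction(toy):.1%}")
```

Applied to a full chromosome read from a FASTA file, the same ratio would give the genome-scale figure; parsing the file is left out of this sketch.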
The symbiotic nitrogen-fixing soil bacterium Sinorhizobium meliloti contains three replicons: pSymA, pSymB, and the chromosome. We report here the complete 1,354,226-nt sequence of pSymA. In addition to a large fraction of the genes known to be specifically involved in symbiosis, pSymA contains genes likely to be involved in nitrogen and carbon metabolism, transport, stress, and resistance responses, and other functions that give S. meliloti an advantage in its specialized niche.
The scarcity of usable nitrogen frequently limits plant growth. A tight metabolic association with rhizobial bacteria allows legumes to obtain nitrogen compounds by bacterial reduction of dinitrogen (N2) to ammonium (NH4+). We present here the annotated DNA sequence of the alpha-proteobacterium Sinorhizobium meliloti, the symbiont of alfalfa. The tripartite 6.7-megabase (Mb) genome comprises a 3.65-Mb chromosome, and 1.35-Mb pSymA and 1.68-Mb pSymB megaplasmids. Genome sequence analysis indicates that all three elements contribute, in varying degrees, to symbiosis and reveals how this genome may have emerged during evolution. The genome sequence will be useful in understanding the dynamics of interkingdom associations and of life in soil environments.
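As a quick consistency check on the replicon sizes quoted above, the three elements can be summed and compared with the stated 6.7-Mb total. The sketch below uses only the sizes given in the abstract; the variable names are illustrative.

```python
# Replicon sizes in megabases, as stated in the abstract.
replicons_mb = {"chromosome": 3.65, "pSymA": 1.35, "pSymB": 1.68}

total_mb = sum(replicons_mb.values())
shares = {name: size / total_mb for name, size in replicons_mb.items()}

print(f"total genome size: {total_mb:.2f} Mb")
for name, share in shares.items():
    print(f"{name}: {share:.1%} of the genome")
```

The sum is 6.68 Mb, which rounds to the 6.7 Mb reported for the tripartite genome.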
The genome of the flowering plant Arabidopsis thaliana has five chromosomes. Here we report the sequence of the largest, chromosome 1, in two contigs of around 14.2 and 14.6 megabases. The contigs extend from the telomeres to the centromeric borders, regions rich in transposons, retrotransposons and repetitive elements such as the 180-base-pair repeat. The chromosome represents 25% of the genome and contains about 6,850 open reading frames, 236 transfer RNAs (tRNAs) and 12 small nuclear RNAs. There are two clusters of tRNA genes at different places on the chromosome. One consists of 27 tRNA(Pro) genes and the other contains 27 tandem repeats of tRNA(Tyr)-tRNA(Tyr)-tRNA(Ser) genes. Chromosome 1 contains about 300 gene families with clustered duplications. There are also many repeat elements, representing 8% of the sequence.
Chlamydia are obligate intracellular eubacteria that are phylogenetically separated from other bacterial divisions. C. trachomatis and C. pneumoniae are both pathogens of humans but differ in their tissue tropism and spectrum of diseases. C. pneumoniae is a newly recognized species of Chlamydia that is a natural pathogen of humans, and causes pneumonia and bronchitis. In the United States, approximately 10% of pneumonia cases and 5% of bronchitis cases are attributed to C. pneumoniae infection. Chronic disease, such as reactive airway disease, adult-onset asthma and potentially lung cancer, may result following respiratory-acquired infection. In addition, C. pneumoniae infection has been associated with atherosclerosis. C. trachomatis infection causes trachoma, an ocular infection that leads to blindness, and sexually transmitted diseases such as pelvic inflammatory disease, chronic pelvic pain, ectopic pregnancy and epididymitis. Although relatively little is known about C. trachomatis biology, even less is known concerning C. pneumoniae. Comparison of the C. pneumoniae genome with the C. trachomatis genome will provide an understanding of the common biological processes required for infection and survival in mammalian cells. Genomic differences are implicated in the unique properties that differentiate the two species in disease spectrum. Analysis of the 1,230,230-nt C. pneumoniae genome revealed 214 protein-coding sequences not found in C. trachomatis, most without homologues to other known sequences. Prominent comparative findings include expansion of a novel family of 21 sequence-variant outer-membrane proteins, conservation of a type-III secretion virulence system, three serine/threonine protein kinases and a pair of paralogous phospholipase-D-like proteins, additional purine and biotin biosynthetic capability, a homologue for aromatic amino acid (tryptophan) hydroxylase and the loss of tryptophan biosynthesis genes.
Analysis of the 1,042,519-base pair Chlamydia trachomatis genome revealed unexpected features related to the complex biology of chlamydiae. Although chlamydiae lack many biosynthetic capabilities, they retain functions for performing key steps and interconversions of metabolites obtained from their mammalian host cells. Numerous potential virulence-associated proteins also were characterized. Several eukaryotic chromatin-associated domain proteins were identified, suggesting a eukaryotic-like mechanism for chlamydial nucleoid condensation and decondensation. The phylogenetic mosaic of chlamydial genes, including a large number of genes with phylogenetic origins from eukaryotes, implies a complex evolution for adaptation to obligate intracellular parasitism.
The nucleotide sequence of the 948,061 base pairs of chromosome XVI has been determined, completing the sequence of the yeast genome. Chromosome XVI was the last yeast chromosome identified, and some of the genes mapped early to it, such as GAL4, PEP4 and RAD1 (ref. 2), have played important roles in the development of yeast biology. The architecture of this final chromosome seems to be typical of the large yeast chromosomes, and shows large duplications with other yeast chromosomes. Chromosome XVI contains 487 potential protein-encoding genes, 17 tRNA genes and two small nuclear RNA genes; 27% of the genes have significant similarities to human gene products, and 48% are new and of unknown biological function. Systematic efforts to explore gene function have begun.
The complete DNA sequence of the yeast Saccharomyces cerevisiae chromosome IV has been determined. Apart from chromosome XII, which contains the 1-2 Mb rDNA cluster, chromosome IV is the longest S. cerevisiae chromosome. It was split into three parts, which were sequenced by a consortium from the European Community, the Sanger Centre, and groups from St Louis and Stanford in the United States. The sequence of 1,531,974 base pairs contains 796 predicted or known genes, 318 (39.9%) of which have been previously identified. Of the 478 new genes, 225 (28.3%) are homologous to previously identified genes and 253 (32%) have unknown functions or correspond to spurious open reading frames (ORFs). On average there is one gene approximately every two kilobases. Superimposed on alternating regional variations in G+C composition, there is a large central domain with a lower G+C content that contains all the yeast transposon (Ty) elements and most of the tRNA genes. Chromosome IV shares with chromosomes II, V, XII, XIII and XV some long clustered duplications which partly explain its origin.
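The gene counts and density quoted for chromosome IV can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names are illustrative.

```python
# Figures for chromosome IV as stated in the abstract.
sequence_bp = 1_531_974
total_genes = 796
previously_identified = 318
homologous_new = 225
unknown_or_spurious = 253

new_genes = total_genes - previously_identified
bp_per_gene = sequence_bp / total_genes


def percent_of_total(count: int) -> float:
    """Percentage of all predicted or known genes, rounded to one decimal."""
    return round(100 * count / total_genes, 1)


print(f"new genes: {new_genes}")
print(f"average spacing: {bp_per_gene:.0f} bp per gene")
for label, count in [
    ("previously identified", previously_identified),
    ("homologous to known genes", homologous_new),
    ("unknown function or spurious ORF", unknown_or_spurious),
]:
    print(f"{label}: {percent_of_total(count)}%")
```

The 478 new genes split exactly into the 225 and 253 reported, and the average spacing of roughly 1.9 kb matches the "one gene approximately every two kilobases" stated in the text.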
The genome of the yeast Saccharomyces cerevisiae has been completely sequenced through a worldwide collaboration. The sequence of 12,068 kilobases defines 5885 potential protein-encoding genes, approximately 140 genes specifying ribosomal RNA, 40 genes for small nuclear RNA molecules, and 275 transfer RNA genes. In addition, the complete sequence provides information about the higher order organization of yeast's 16 chromosomes and allows some insight into their evolutionary history. The genome shows a considerable amount of apparent genetic redundancy, and one of the major problems to be tackled during the next stage of the yeast genome project is to elucidate the biological functions of all of these genes.