The draft genome of the moss model, Physcomitrella patens, comprised approximately 2000 unordered scaffolds. To enable analyses of genome structure and evolution, we generated a chromosome-scale genome assembly using genetic linkage as well as (end) sequencing of long DNA fragments. We find that 57% of the genome comprises transposable elements (TEs), some of which may be actively transposing during the life cycle. Unlike in flowering plant genomes, gene- and TE-rich regions show an overall even distribution along the chromosomes. However, the chromosomes are mono-centric with peaks of a class of Copia elements potentially coinciding with centromeres. Gene body methylation is evident in 5.7% of the protein-coding genes, typically coinciding with low GC and low expression. Some giant virus insertions are transcriptionally active and might protect gametes from viral infection via siRNA-mediated silencing. Structure-based detection methods show that the genome evolved via two rounds of whole genome duplications (WGDs), apparently common in mosses but not in liverworts and hornworts. Several hundred genes are present in colinear regions conserved since the last common ancestor of plants. These syntenic regions are enriched for functions related to plant-specific cell growth and tissue organization. The P. patens genome lacks the TE-rich pericentromeric and gene-rich distal regions typical for most flowering plant genomes. More non-seed plant genomes are needed to unravel how plant genomes evolve, and to understand whether the P. patens genome structure is typical for mosses or bryophytes.
BACKGROUND: Strigolactones (SLs) are a class of plant hormones that control many aspects of plant growth. The SL signalling mechanism is homologous to that of karrikins (KARs), smoke-derived compounds that stimulate seed germination. In angiosperms, the SL receptor is an alpha/beta-hydrolase known as DWARF14 (D14); its close homologue, KARRIKIN INSENSITIVE2 (KAI2), functions as a KAR receptor and likely recognizes an uncharacterized, endogenous signal ('KL'). Previous phylogenetic analyses have suggested that the KAI2 lineage is ancestral in land plants, and that canonical D14-type SL receptors only arose in seed plants; this is paradoxical, however, as non-vascular plants synthesize and respond to SLs. RESULTS: We have used a combination of phylogenetic and structural approaches to re-assess the evolution of the D14/KAI2 family in land plants. We analysed 339 members of the D14/KAI2 family from land plants and charophyte algae. Our phylogenetic analyses show that the divergence between the eu-KAI2 lineage and the DDK (D14/DLK2/KAI2) lineage that includes D14 occurred very early in land plant evolution. We show that eu-KAI2 proteins are highly conserved, and have unique features not found in DDK proteins. Conversely, we show that DDK proteins vary considerably from one another in sequence and structure, and lack clearly definable characteristics. We use homology modelling to show that the earliest members of the DDK lineage structurally resemble KAI2 and that SL receptors in non-seed plants likely do not have D14-like structure. We also show that certain groups of DDK proteins lack the otherwise conserved MORE AXILLARY GROWTH2 (MAX2) interface, and may thus function independently of MAX2, which we show is highly conserved throughout land plant evolution. CONCLUSIONS: Our results suggest that D14-like structure is not required for SL perception, and that SL perception has relatively relaxed structural requirements compared to KAI2-mediated signalling. We suggest that SL perception gradually evolved by neo-functionalization within the DDK lineage, and that the transition from KAI2-like to D14-like protein may have been driven by interactions with protein partners, rather than being required for SL perception per se.
Colonization of land by plants was a major transition on Earth, but the developmental and genetic innovations required for this transition remain unknown. Physiological studies and the fossil record strongly suggest that the ability of the first land plants to form symbiotic associations with beneficial fungi was one of these critical innovations. In angiosperms, genes required for the perception and transduction of diffusible fungal signals for root colonization and for nutrient exchange have been characterized. However, the origin of these genes and their potential correlation with land colonization remain elusive. A comprehensive phylogenetic analysis of 259 transcriptomes and 10 green algal and basal land plant genomes, coupled with the characterization of the evolutionary path leading to the appearance of a key regulator, a calcium- and calmodulin-dependent protein kinase, showed that the symbiotic signaling pathway predated the first land plants. In contrast, downstream genes required for root colonization and their specific expression pattern probably appeared subsequent to the colonization of land. We conclude that the most recent common ancestor of extant land plants and green algae was preadapted for symbiotic associations. Subsequent improvement of this precursor stage in early land plants through rounds of gene duplication led to the acquisition of additional pathways and the ability to form a fully functional arbuscular mycorrhizal symbiosis.
Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina Genome Analyzer. A de novo assembly, composed exclusively of deep-coverage (approximately 94x raw, approximately 69x filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43,384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species.
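The N50 values quoted for the flax scaffolds and contigs summarize assembly contiguity. As a minimal sketch, assuming the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together hold at least half of the assembled bases), the statistic can be computed as follows. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the flax study.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    Assumed definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at
    least half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy contig lengths (hypothetical, in kb).
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it says nothing about correctness of the joins, which is why the abstract also reports checks against independently sequenced BACs and fosmids.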
Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human genomes showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility of using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes.
Laribacter hongkongensis is a newly discovered Gram-negative bacillus of the Neisseriaceae family associated with freshwater fish-borne gastroenteritis and traveler's diarrhea. The complete genome sequence of L. hongkongensis HLHK9, recovered from an immunocompetent patient with severe gastroenteritis, consists of a 3,169-kb chromosome with G+C content of 62.35%. Genome analysis reveals different mechanisms potentially important for its adaptation to diverse habitats of human and freshwater fish intestines and freshwater environments. The gene contents support its phenotypic properties and suggest that amino acids and fatty acids can be used as carbon sources. The extensive variety of transporters, including multidrug efflux and heavy metal transporters as well as genes involved in chemotaxis, may enable L. hongkongensis to survive in different environmental niches. Genes encoding urease, bile salts efflux pump, adhesin, catalase, superoxide dismutase, and other putative virulence factors (such as hemolysins, RTX toxins, patatin-like proteins, phospholipase A1, and collagenases) are present. Proteomes of L. hongkongensis HLHK9 cultured at 37 °C (human body temperature) and 20 °C (freshwater habitat temperature) showed differential gene expression, including two homologous copies of argB, argB-20 and argB-37, which encode two isoenzymes of N-acetyl-L-glutamate kinase (NAGK), NAGK-20 and NAGK-37, in the arginine biosynthesis pathway. NAGK-20 showed higher expression at 20 °C, whereas NAGK-37 showed higher expression at 37 °C. NAGK-20 also had a lower optimal temperature for enzymatic activities and was inhibited by arginine, probably as negative-feedback control. Similar duplicated copies of argB are also observed in bacteria from hot springs such as Thermus thermophilus, Deinococcus geothermalis, Deinococcus radiodurans, and Roseiflexus castenholzii, suggesting that similar mechanisms for temperature adaptation may be employed by other bacteria. Genome and proteome analysis of L. hongkongensis revealed novel mechanisms for adaptations to survival at different temperatures and habitats.
Large-insert genome analysis (LIGAN) is a broadly applicable, high-throughput technology designed to characterize genome-scale structural variation. Fosmid paired-end sequences and DNA fingerprints from a query genome are compared to a reference sequence using the Genomic Variation Analysis (GenVal) suite of software tools to pinpoint locations of insertions, deletions, and rearrangements. Fosmids spanning regions that contain new structural variants can then be sequenced. Clonal pairs of Pseudomonas aeruginosa isolates from four cystic fibrosis patients were used to validate the LIGAN technology. Approximately 1.5 Mb of inserted sequences were identified, including 743 kb containing 615 ORFs that are absent from published P. aeruginosa genomes. Six rearrangement breakpoints and 220 kb of deleted sequences were also identified. Our study expands the "genome universe" of P. aeruginosa and validates a technology that complements emerging, short-read sequencing methods that are better suited to characterizing single-nucleotide polymorphisms than structural variation.
Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution. Despite remarkable similarities among these Drosophila species, we identified many putatively non-neutral changes in protein-coding genes, non-coding RNA genes, and cis-regulatory regions. These may prove to underlie differences in the ecology and behaviour of these diverse species.
After the completion of a draft human genome sequence, the International Human Genome Sequencing Consortium has proceeded to finish and annotate each of the 24 chromosomes comprising the human genome. Here we describe the sequencing and analysis of human chromosome 3, one of the largest human chromosomes. Chromosome 3 comprises just four contigs, one of which currently represents the longest unbroken stretch of finished DNA sequence known so far. The chromosome is remarkable in having the lowest rate of segmental duplication in the genome. It also includes a chemokine receptor gene cluster as well as numerous loci involved in multiple human cancers such as the gene encoding FHIT, which contains the most common constitutive fragile site in the genome, FRA3B. Using genomic sequence from chimpanzee and rhesus macaque, we were able to characterize the breakpoints defining a large pericentric inversion that occurred some time after the split of Homininae from Ponginae, and propose an evolutionary history of the inversion.
We report improved whole-genome shotgun sequences for the genomes of indica and japonica rice, both with multimegabase contiguity, or almost 1,000-fold improvement over the drafts of 2002. Tested against a nonredundant collection of 19,079 full-length cDNAs, 97.7% of the genes are aligned, without fragmentation, to the mapped super-scaffolds of one or the other genome. We introduce a gene identification procedure for plants that does not rely on similarity to known genes to remove erroneous predictions resulting from transposable elements. Using the available EST data to adjust for residual errors in the predictions, the estimated gene count is at least 38,000-40,000. Only 2%-3% of the genes are unique to any one subspecies, comparable to the amount of sequence that might still be missing. Despite this lack of variation in gene content, there is enormous variation in the intergenic regions. At least a quarter of the two sequences could not be aligned, and where they could be aligned, single nucleotide polymorphism (SNP) rates varied from as little as 3.0 SNP/kb in the coding regions to 27.6 SNP/kb in the transposable elements. A more inclusive new approach for analyzing duplication history is introduced here. It reveals an ancient whole-genome duplication, a recent segmental duplication on Chromosomes 11 and 12, and massive ongoing individual gene duplications. We find 18 distinct pairs of duplicated segments that cover 65.7% of the genome; 17 of these pairs date back to a common time before the divergence of the grasses. More important, ongoing individual gene duplications provide a never-ending source of raw material for gene genesis and are major contributors to the differences between members of the grass family.
        
Title: Apolipoprotein E4 inhibits, and apolipoprotein E3 promotes neurite outgrowth in cultured adult mouse cortical neurons through the low-density lipoprotein receptor-related protein
Authors: Nathan BP, Jiang Y, Wong GK, Shen F, Brewer GJ, Struble RG
Ref: Brain Research, 928:96, 2002
The apolipoprotein E4 (apoE4) genotype is a major risk factor for Alzheimer's disease (AD); however, the mechanism is unknown. We previously demonstrated that apoE isoforms differentially modulated neurite outgrowth in embryonic neurons and in neuronal cell lines. ApoE3 increased neurite outgrowth whereas apoE4 decreased outgrowth, suggesting that apoE4 may directly affect neurons in the brain. In the present study we examined the effects of apoE on neurite outgrowth from cultured adult mouse cortical neurons to determine whether adult neurons respond the same way that embryonic cells do. The results from this study demonstrated that (1) cortical neurons derived from adult apoE-gene knockout (apoE KO) mice have significantly shorter neurites than neurons from adult wild-type (WT) mice; (2) incubation of cortical neurons from adult apoE KO mice with human apoE3 increased neurite outgrowth, whereas human apoE4 decreased outgrowth in a dose-dependent fashion; (3) the isoform-specific effects were abolished by incubation of the neurons with either receptor-associated protein (RAP) or lactoferrin, both of which block the interaction of apoE-containing lipoproteins with the low-density lipoprotein receptor-related protein (LRP). These data suggest a potential mechanism whereby apoE4 may play a role in regenerative failure and accelerate the development of AD.
Apolipoprotein E (apoE), a lipid-transporting protein, has been postulated to participate in nerve regeneration. To clarify apoE function in the olfactory system, we evaluated the amount and distribution of apoE in the olfactory bulb following olfactory nerve lesion in mice. The olfactory nerve was lesioned in 2- to 4-month-old mice by intranasal irrigation with Triton X-100. Olfactory bulbs were collected at 0, 3, 7, 21, 42, and 56 days postlesion, and both apoE concentrations and apoE distribution were determined. ApoE levels, as determined by immunoblot analysis, were twofold greater than normal during nerve degeneration at 3 days. ApoE levels remained elevated at approximately 1.5 times normal levels at 7 through 21 days after injury and returned to baseline by 56 days. Immunocytochemical studies supported these observations. ApoE immunoreactivity was prominent on the olfactory nerve at 3 days after lesion and decreased to baseline levels at later time periods. Double-labeling immunocytochemical studies confirmed that both reactive astroglia and microglia produced detectable amounts of apoE following the lesion. Return of apoE expression to baseline paralleled measures of olfactory nerve maturation as measured by olfactory marker protein. These data suggest that apoE increases concurrent with nerve degeneration. ApoE may facilitate efficient regeneration, perhaps by recycling lipids from degenerating fibers for use by growing axons. The association of apoE genotype with dementing illnesses may represent a diminished ability to support a lifetime of nerve regeneration.
Pseudomonas aeruginosa is a ubiquitous environmental bacterium that is one of the top three causes of opportunistic human infections. A major factor in its prominence as a pathogen is its intrinsic resistance to antibiotics and disinfectants. Here we report the complete sequence of P. aeruginosa strain PAO1. At 6.3 million base pairs, this is the largest bacterial genome sequenced, and the sequence provides insights into the basis of the versatility and intrinsic drug resistance of P. aeruginosa. Consistent with its larger genome size and environmental adaptability, P. aeruginosa contains the highest proportion of regulatory genes observed for a bacterial genome and a large number of genes involved in the catabolism, transport and efflux of organic compounds as well as four potential chemotaxis systems. We propose that the size and complexity of the P. aeruginosa genome reflect an evolutionary adaptation permitting it to thrive in diverse environments and resist the effects of a variety of antimicrobial substances.