The order Chaetothyriales (Pezizomycotina, Ascomycetes) harbours obligatorily melanised fungi and includes numerous etiologic agents of chromoblastomycosis, phaeohyphomycosis and other diseases of vertebrate hosts. Diseases range from mild cutaneous to fatal cerebral or disseminated infections and affect humans and cold-blooded animals globally. In addition, Chaetothyriales comprise species with aquatic, rock-inhabiting, ant-associated, and mycoparasitic life-styles, as well as species that tolerate toxic compounds, suggesting a high degree of versatile extremotolerance. To understand their biology and divergent niche occupation, we sequenced and annotated a set of 23 genomes of the main human opportunists within the Chaetothyriales as well as related environmental species. Our analyses included fungi with diverse life-styles, namely opportunistic pathogens and closely related saprobes, to identify genomic adaptations related to pathogenesis. Furthermore, ecological preferences of Chaetothyriales were analysed in conjunction with the order-level phylogeny based on conserved ribosomal genes. General characteristics, phylogenomic relationships, transposable elements, sex-related genes, protein family evolution, genes related to protein degradation (MEROPS), carbohydrate-active enzymes (CAZymes), melanin synthesis and secondary metabolism were investigated and compared between species. Genome assemblies varied from 25.81 Mb (Capronia coronata) to 43.03 Mb (Cladophialophora immunda). The bantiana-clade contained the highest number of predicted genes (12,817 on average) as well as larger genomes. We found a low content of mobile elements, with DNA transposons from the Tc1/Mariner superfamily being the most abundant across the analysed species.
Additionally, we identified a reduction of carbohydrate-degrading enzymes, specifically many of the Glycosyl Hydrolase (GH) class, while most of the Pectin Lyase (PL) genes were lost in etiological agents of chromoblastomycosis and phaeohyphomycosis. An expansion was found in the protein-degrading peptidase families S12 (serine-type D-Ala-D-Ala carboxypeptidases) and M38 (isoaspartyl dipeptidases). Based on genomic information, a wide range of abilities of melanin biosynthesis was revealed; genes related to the metabolically distinct DHN, DOPA and pyomelanin pathways were identified. The MAT (MAting Type) locus and other sex-related genes were recognized in all 23 black fungi. Members of the asexual genera Fonsecaea and Cladophialophora appear to be heterothallic, with a single copy of either MAT1-1 or MAT1-2 in each individual. All Capronia species are homothallic, as both MAT1-1 and MAT1-2 genes were found in each single genome. The genomic synteny of the genes flanking the MAT locus (SLA2-APN2-COX13) is not conserved in black fungi, in contrast to the conservation commonly observed in Eurotiomycetes, indicating a unique genomic context for MAT in those species. The expansion of heterokaryon (het) genes, associated with the low selective pressure at the MAT locus, suggests that a parasexual cycle may play an important role in generating diversity among those fungi.
Pneumocystis jirovecii is a major cause of life-threatening pneumonia in immunosuppressed patients, including transplant recipients and those with HIV/AIDS, yet surprisingly little is known about the biology of this fungal pathogen. Here we report near-complete genome assemblies for three Pneumocystis species that infect humans, rats and mice. Pneumocystis genomes are highly compact relative to those of other fungi, with substantial reductions of ribosomal RNA genes, transporters, transcription factors and many metabolic pathways, but contain expansions of surface proteins, especially a unique and complex surface glycoprotein superfamily, as well as proteases and RNA processing proteins. Unexpectedly, the key fungal cell wall components chitin and outer chain N-mannans are absent, based on genome content and experimental validation. Our findings suggest that Pneumocystis has developed unique mechanisms of adaptation to life exclusively in mammalian hosts, including dependence on the lungs for gas and nutrients and highly efficient strategies to escape both host innate and acquired immune defenses.
Sporothrix schenckii is a pathogenic dimorphic fungus that grows as a yeast and as mycelia. This species is the causative agent of sporotrichosis, typically a skin infection. We report the genome sequence of S. schenckii, which will facilitate the study of this fungus and of the Sporothrix schenckii group.
The degree to which molecular epidemiology reveals information about the sources and transmission patterns of an outbreak depends on the resolution of the technology used and the samples studied. Isolates of Escherichia coli O104:H4 from the outbreak centered in Germany in May-July 2011, and the much smaller outbreak in southwest France in June 2011, were indistinguishable by standard tests. We report a molecular epidemiological analysis using multiplatform whole-genome sequencing and analysis of multiple isolates from the German and French outbreaks. Isolates from the German outbreak showed remarkably little diversity, with only two single nucleotide polymorphisms (SNPs) found in isolates from four individuals. Surprisingly, we found much greater diversity (19 SNPs) in isolates from seven individuals infected in the French outbreak. The German isolates form a clade within the more diverse French outbreak strains. Moreover, five isolates derived from a single infected individual from the French outbreak had extremely limited diversity. The striking difference in diversity between the German and French outbreak samples is consistent with several hypotheses, including a bottleneck that purged diversity in the German isolates, variation in mutation rates in the two E. coli outbreak populations, or uneven distribution of diversity in the seed populations that led to each outbreak.
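As a rough illustration of the kind of diversity measure quoted above (a count of SNPs among isolates), the following minimal sketch counts variable columns in a set of pre-aligned sequences. It is not the authors' pipeline: the function name `count_segregating_sites`, the toy sequences, and the choice to ignore gaps and `N` calls are all assumptions made for illustration.

```python
from typing import Sequence


def count_segregating_sites(aligned: Sequence[str]) -> int:
    """Count alignment columns in which the isolates carry more than one base.

    Assumes the sequences are already aligned to equal length.
    Gap ('-') and ambiguous ('N') characters are ignored (an assumption).
    """
    if not aligned:
        return 0
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    count = 0
    for column in zip(*aligned):
        # Distinct called bases observed at this position.
        bases = {base.upper() for base in column} - {"-", "N"}
        if len(bases) > 1:
            count += 1
    return count


# Hypothetical toy data, not taken from the outbreak isolates.
toy_isolates = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"]
n_snps = count_segregating_sites(toy_isolates)
```

Real analyses of this kind typically work from read mappings and variant calls rather than a plain alignment, so this sketch only conveys the counting step.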
We have sequenced the genomes of 18 isolates of the closely related human pathogenic fungi Coccidioides immitis and Coccidioides posadasii to more clearly elucidate population genomic structure, bringing the total number of sequenced genomes for each species to 10. Our data confirm earlier microsatellite-based findings that these species are genetically differentiated, but our population genomics approach reveals that hybridization and genetic introgression have recently occurred between the two species. The directionality of introgression is primarily from C. posadasii to C. immitis, and we find more than 800 genes exhibiting strong evidence of introgression in one or more sequenced isolates. We performed PCR-based sequencing of one region exhibiting introgression in 40 C. immitis isolates to confirm and better define the extent of gene flow between the species. We find more coding sequence than expected by chance in the introgressed regions, suggesting that natural selection may play a role in the observed genetic exchange. We find notable heterogeneity in repetitive sequence composition among the sequenced genomes and present the first detailed genome-wide profile of a repeat-induced point mutation (RIP) process distinctly different from what has been observed in Neurospora. We identify promiscuous HLA-I and HLA-II epitopes in both proteomes and discuss the possible implications of introgression and population genomic data for public health and vaccine candidate prioritization. This study highlights the importance of population genomic data for detecting subtle but potentially important phenomena such as introgression.
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome, mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome. It is also enriched in segmental duplications, ranking third in density among the autosomes. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution, the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.
Here we present a finished sequence of human chromosome 15, together with a high-quality gene catalogue. As chromosome 15 is one of seven human chromosomes with a high rate of segmental duplication, we have carried out a detailed analysis of the duplication structure of the chromosome. Segmental duplications in chromosome 15 are largely clustered in two regions, on proximal and distal 15q; the proximal region is notable because recombination among the segmental duplications can result in deletions causing Prader-Willi and Angelman syndromes. Sequence analysis shows that the proximal and distal regions of 15q share extensive ancient similarity. Using a simple approach, we have been able to reconstruct many of the events by which the current duplication structure arose. We find that most of the intrachromosomal duplications seem to share a common ancestry. Finally, we demonstrate that some remaining gaps in the genome sequence are probably due to structural polymorphisms between haplotypes; this may explain a significant fraction of the gaps remaining in the human genome.
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
Chromosome 18 appears to have the lowest gene density of any human chromosome and is one of only three chromosomes for which trisomic individuals survive to term. There are also a number of genetic disorders stemming from chromosome 18 trisomy and aneuploidy. Here we report the finished sequence and gene annotation of human chromosome 18, which will allow a better understanding of the normal and disease biology of this chromosome. Despite the low density of protein-coding genes on chromosome 18, we find that the proportion of non-protein-coding sequences evolutionarily conserved among mammals is close to the genome-wide average. Extending this analysis to the entire human genome, we find that the density of conserved non-protein-coding sequences is largely uncorrelated with gene density. This has important implications for the nature and roles of non-protein-coding sequence elements.