cellular organisms > Eukaryota > Opisthokonta > Fungi > Dikarya > Ascomycota > saccharomyceta > Saccharomycotina > Saccharomycetes > Saccharomycetales > Phaffomycetaceae > Komagataella > Komagataella pastoris
Warning: This entry is a compilation of different species, lines, or strains with more than 90% amino acid identity.
Pichia pastoris
Pichia pastoris GS115
Komagataella pastoris GS115
Komagataella phaffii CBS 7435
Komagataella pastoris CBS 7435
Legend: This sequence has been compared to the family alignment (MSA). Red: minority amino acid. Blue: majority amino acid. Color intensity: conservation rate. Title: sequence position (MSA position), amino acid, rate. Catalytic site; catalytic site in the MSA.
RLCDAVRNSQAQYLSSKFGFINLPVVSGKGLTSPSLDNQVLSTKDIFQAN
TIIVVIHDPSDVWARRDPRSGLLDLSGSVIVDTSSKFVEWAVNQKYGVID
INVPLTLTGKDDETYNNVTTSQELLLYIWDNYIRYFPVTKIAFVGFGDAY
NGVIHLCGHREVRNIVKASINFLDRTPLRAIVSSIDESVTDWFYKNSLVF
TSTRHPCWGEQGTSEMKRPRKKYGRVLKADIDGLPNIVDERFEETTDFIL
DSIEEYEDESSN
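The coloring described in the legend (majority versus minority amino acid, with intensity given by the conservation rate) can be illustrated with a minimal sketch. It assumes that the conservation rate of an alignment column is the fraction of non-gap residues equal to the most common residue in that column; this entry does not state the exact definition ESTHER uses. The function name `column_conservation` and the toy alignment are hypothetical and are not taken from the family MSA.

```python
from collections import Counter


def column_conservation(msa):
    """Return one (majority_residue, rate) pair per alignment column.

    Assumption: rate = share of non-gap residues in the column that
    match the most common residue. Gaps are written as "-".
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for column in zip(*msa):  # iterate column by column
        residues = [r for r in column if r != "-"]
        if not residues:  # column made only of gaps
            profile.append((None, 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        profile.append((residue, count / len(residues)))
    return profile


# Hypothetical toy alignment, for illustration only.
toy_msa = ["RLCDA", "RLCEA", "KLC-A", "RMCDA"]
profile = column_conservation(toy_msa)

for position, (residue, rate) in enumerate(profile, start=1):
    print(position, residue, round(rate, 2))
```

Under this assumption, a residue matching the column majority would be shown in blue and any other residue in red, with the rate setting the color intensity.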
As manually curated and non-automated BLAST analysis of the published Pichia pastoris genome sequences revealed many differences between the gene annotations of the strains GS115 and CBS7435, RNA-Seq analysis, supported by proteomics, was performed to improve the genome annotation. Detailed analyses of sequence alignments and protein domain predictions were made to extend the functional genome annotation to all P. pastoris sequences. This allowed the identification of 492 new ORFs, 4916 hypothetical UTRs and the correction of 341 incorrect ORF predictions, which were mainly due to the presence of upstream ATG or erroneous intron predictions. Moreover, 175 previously erroneously annotated ORFs need to be removed from the annotation. In total, we have annotated 5325 ORFs. Regarding the functionality of those genes, we improved all gene and protein descriptions. Thereby, the percentage of ORFs with functional annotation was increased from 48% to 73%. Furthermore, we defined functional groups, covering 25 biological cellular processes of interest, by grouping all genes that are part of the defined process. All data are presented in the newly launched genome browser and database available at www.pichiagenome.org. In summary, we present a wide spectrum of curation of the P. pastoris genome annotation from gene level to protein function.
The methylotrophic yeast Pichia pastoris (Komagataella phaffii) CBS7435 is the parental strain of commonly used P. pastoris recombinant protein production hosts making it well suited for improving the understanding of associated genomic features. Here, we present a 9.35 Mbp high-quality genome sequence of P. pastoris CBS7435 established by a combination of 454 and Illumina sequencing. An automatic annotation of the genome sequence yielded 5007 protein-coding genes, 124 tRNAs and 29 rRNAs. Moreover, we report the complete DNA sequence of the first mitochondrial genome of a methylotrophic yeast. Fifteen genes encoding proteins, 2 rRNA and 25 tRNA loci were identified on the 35.7 kbp circular, mitochondrial DNA. Furthermore, the architecture of the putative alpha mating factor protein of P. pastoris CBS7435 turned out to be more complex than the corresponding protein of Saccharomyces cerevisiae.
The methylotrophic yeast Pichia pastoris is widely used for the production of proteins and as a model organism for studying peroxisomal biogenesis and methanol assimilation. P. pastoris strains capable of human-type N-glycosylation are now available, which increases the utility of this organism for biopharmaceutical production. Despite its biotechnological importance, relatively few genetic tools or engineered strains have been generated for P. pastoris. To facilitate progress in these areas, we present the 9.43 Mbp genomic sequence of the GS115 strain of P. pastoris. We also provide manually curated annotation for its 5,313 protein-coding genes.