Homo sapiens (Human) Transmembrane protein 53, FLJ22353, NET4
Comment
Protein of unknown function DUF829. The family contains exclusively eukaryotic proteins, including transmembrane protein 53. Members are said to be integral membrane proteins, although this is questionable. Dictyostelium discoideum Net4 (dicdi-q54yr8), which has strong homology to mammalian DUF829/Tmem53/NET4, is found on lipid droplets (Du et al.). TMEM53_HUMAN appears to have a conserved catalytic triad: Ser-113, Asp-220, and His-252.
Taxonomic lineage: cellular organisms > Eukaryota > Opisthokonta > Metazoa > Eumetazoa > Bilateria > Deuterostomia > Chordata > Craniata > Vertebrata > Gnathostomata > Teleostomi > Euteleostomi > Sarcopterygii > Dipnotetrapodomorpha > Tetrapoda > Amniota > Mammalia > Theria > Eutheria > Boreoeutheria > Euarchontoglires > Primates > Haplorrhini > Simiiformes > Catarrhini > Hominoidea > Hominidae > Homininae > Homo > Homo sapiens
Sequence
MASAELDYTIEIPDQPCWSQKNSPSPGGKEAETRQPVVILLGWGGCKDKN
LAKYSAIYHKRGCIVIRYTAPWHMVFFSESLGIPSLRVLAQKLLELLFDY
EIEKEPLLFHVFSNGGVMLYRYVLELLQTRRFCRLRVVGTIFDSAPGDSN
LVGALRALAAILERRAAMLRLLLLVAFALVVVLFHVLLAPITALFHTHFY
DRLQDAGSRWPELYLYSRADEVVLARDIERMVEARLARRVLARSVDFVSS
AHVSHLRDYPTYYTSLCVDFMRNCVRC
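The putative catalytic triad named in the Comment (Ser-113, Asp-220, His-252) can be checked against the sequence listed here. The following is a minimal sketch, not part of the ESTHER record: the names TMEM53_HUMAN, PUTATIVE_TRIAD, residue_at, and check_triad are illustrative assumptions, and positions are taken to be 1-based and counted from the initiator methionine. It only confirms that the expected residue types occupy those positions; it says nothing about whether they are catalytically active.

```python
# Sequence of TMEM53_HUMAN as listed in this entry (277 residues).
TMEM53_HUMAN = (
    "MASAELDYTIEIPDQPCWSQKNSPSPGGKEAETRQPVVILLGWGGCKDKN"
    "LAKYSAIYHKRGCIVIRYTAPWHMVFFSESLGIPSLRVLAQKLLELLFDY"
    "EIEKEPLLFHVFSNGGVMLYRYVLELLQTRRFCRLRVVGTIFDSAPGDSN"
    "LVGALRALAAILERRAAMLRLLLLVAFALVVVLFHVLLAPITALFHTHFY"
    "DRLQDAGSRWPELYLYSRADEVVLARDIERMVEARLARRVLARSVDFVSS"
    "AHVSHLRDYPTYYTSLCVDFMRNCVRC"
)

# Putative triad from the Comment: 1-based position -> one-letter residue code.
# (Hypothetical name; the positions themselves come from the entry.)
PUTATIVE_TRIAD = {113: "S", 220: "D", 252: "H"}


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position, as used in protein numbering."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside 1..{len(sequence)}")
    return sequence[position - 1]


def check_triad(sequence: str, triad: dict) -> dict:
    """Map each position to True if the sequence carries the expected residue."""
    return {pos: residue_at(sequence, pos) == aa for pos, aa in triad.items()}


if __name__ == "__main__":
    for pos, matches in check_triad(TMEM53_HUMAN, PUTATIVE_TRIAD).items():
        found = residue_at(TMEM53_HUMAN, pos)
        print(f"position {pos}: found {found}, matches expected: {matches}")
```

With the sequence above, all three positions carry the residue types stated in the Comment.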
The reference sequence for each human chromosome provides the framework for understanding genome function, variation and evolution. Here we report the finished sequence and biological annotation of human chromosome 1. Chromosome 1 is gene-dense, with 3,141 genes and 991 pseudogenes, and many coding sequences overlap. Rearrangements and mutations of chromosome 1 are prevalent in cancer and many other diseases. Patterns of sequence variation reveal signals of recent selection in specific genes that may contribute to human fitness, and also in regions where no function is evident. Fine-scale recombination occurs in hotspots of varying intensity along the sequence, and is enriched near genes. These and other studies of human biology and disease encoded within chromosome 1 are made possible with the highly accurate annotated sequence, as part of the completed set of chromosome sequences that comprise the reference human genome.
The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.
As a base for human transcriptome and functional genomics, we created the "full-length long Japan" (FLJ) collection of sequenced human cDNAs. We determined the entire sequence of 21,243 selected clones and found that 14,490 cDNAs (10,897 clusters) were unique to the FLJ collection. About half of them (5,416) seemed to be protein-coding. Of those, 1,999 clusters had not been predicted by computational methods. The distribution of GC content of nonpredicted cDNAs had a peak at approximately 58% compared with a peak at approximately 42% for predicted cDNAs. Thus, there seems to be a slight bias against GC-rich transcripts in current gene prediction procedures. The rest of the cDNAs unique to the FLJ collection (5,481) contained no obvious open reading frames (ORFs) and thus are candidate noncoding RNAs. About one-fourth of them (1,378) showed a clear pattern of splicing. The distribution of GC content of noncoding cDNAs was narrow and had a peak at approximately 42%, relatively low compared with that of protein-coding cDNAs.
Across all kingdoms of life, cells store energy in a specialized organelle, the lipid droplet. In general, it consists of a hydrophobic core of triglycerides and steryl esters surrounded by only one leaflet derived from the endoplasmic reticulum membrane to which a specific set of proteins is bound. We have chosen the unicellular organism Dictyostelium discoideum to establish kinetics of lipid droplet formation and degradation and to further identify the lipid constituents and proteins of lipid droplets. Here, we show that the lipid composition is similar to what is found in mammalian lipid droplets. In addition, phospholipids preferentially consist of mainly saturated fatty acids, whereas neutral lipids are enriched in unsaturated fatty acids. Among the novel protein components are LdpA, a protein specific to Dictyostelium, and Net4, which has strong homologies to mammalian DUF829/Tmem53/NET4 that was previously only known as a constituent of the mammalian nuclear envelope. The proteins analyzed so far appear to move from the endoplasmic reticulum to the lipid droplets, supporting the concept that lipid droplets are formed on this membrane.
Disruption of cell cycle regulation is one mechanism proposed for how nuclear envelope protein mutation can cause disease. Thus far only a few nuclear envelope proteins have been tested/found to affect cell cycle progression: to identify others, 39 novel nuclear envelope transmembrane proteins were screened for their ability to alter flow cytometry cell cycle/DNA content profiles when exogenously expressed. Eight had notable effects with seven increasing and one decreasing the 4N:2N ratio. We subsequently focused on NET4/Tmem53 that lost its effects in p53(-/-) cells and retinoblastoma protein-deficient cells. NET4/TMEM53 knockdown by siRNA altered flow cytometry cell cycle/DNA content profiles in a similar way as overexpression. NET4/TMEM53 knockdown did not affect total retinoblastoma protein levels, unlike nuclear envelope-associated proteins Lamin A and LAP2alpha. However, a decrease in phosphorylated retinoblastoma protein was observed along with a doubling of p53 levels and a 7-fold increase in p21. Consequently cells withdrew from the cell cycle, which was confirmed in MRC5 cells by a drop in the percentage of cells expressing Ki-67 antigen and an increase in the number of cells stained for β-galactosidase. The β-galactosidase upregulation suggests that cells become prematurely senescent. Finally, the changes in retinoblastoma protein, p53, and p21 resulting from loss of NET4/Tmem53 were dependent upon active p38 MAP kinase. The finding that roughly a fifth of nuclear envelope transmembrane proteins screened yielded alterations in flow cytometry cell cycle/DNA content profiles suggests a much greater influence of the nuclear envelope on the cell cycle than is widely held.