(Below, N is a link to the NCBI taxonomy web page and E is a link to ESTHER at the designated phylum.) > cellular organisms: N, E > Bacteria: N, E > Terrabacteria group: N, E > Cyanobacteria/Melainabacteria group: N, E > Cyanobacteria: N, E > Synechococcales: N, E > Merismopediaceae: N, E > Synechocystis: N, E > Synechocystis sp.: N, E
Warning: This entry is a compilation of different species, lines, or strains with more than 90% amino acid identity. You can retrieve all strain data
Synechocystis sp. (strain PCC 6803): N, E.
Synechocystis sp. PCC 6803: N, E.
Synechocystis sp. PCC 6803 substr. GT-S: N, E.
Synechocystis sp. PCC 6803 substr. Kazusa: N, E.
Synechocystis sp. PCC 6803 substr. PCC-P: N, E.
Synechocystis sp. PCC 6803 substr. GT-I: N, E.
Synechocystis sp. PCC 6803 substr. PCC-N: N, E.
Legend: This sequence has been compared to the family alignment (MSA).
red => minority amino acid
blue => majority amino acid
color intensity => conservation rate
title => sequence position (MSA position), amino acid rate
Catalytic site
Catalytic site in the MSA

MGAKVIATGTKEKRSNFWPQIPPRAKKLNHTEEDQSLWSMPQPLGDSMIE
PLGFTRNSLVTSLGTIVYYEATEAPWVEAVDSLGDRQTLVFLHGFGGGSS
AYEWSKVYPAFAADYRVLAPDLLGWGRSDHPVKNYTPQDYIQVIQEFLQQ
TCDQPVMVIASSLVAAIAVRTAIEHPNLFTGLILSTPTGLSDFGEDYRSN
FFAQLVSVPVLDRFIYTTGIANTAGIMNFLEQRQFAQARRIYPEIIDAYL
ESATQPRAEYAALSFVRGDLCFDLAQFMPDLTVTTAILWGEYAQFTPPAI
GRRLAALNPNAIKAFVEIKDVGLTPHLELPGVTIGVIRRFLALLQ
The sequence determination of the entire genome of the Synechocystis sp. strain PCC6803 was completed. The total length of the genome finally confirmed was 3,573,470 bp, including the previously reported sequence of 1,003,450 bp from map position 64% to 92% of the genome. The entire sequence was assembled from the sequences of the physical map-based contigs of cosmid clones and of lambda clones and long PCR products which were used for gap-filling. The accuracy of the sequence was guaranteed by analysis of both strands of DNA through the entire genome. The authenticity of the assembled sequence was supported by restriction analysis of long PCR products, which were directly amplified from the genomic DNA using the assembled sequence data. To predict the potential protein-coding regions, analysis of open reading frames (ORFs), analysis by the GeneMark program and similarity search to databases were performed. As a result, a total of 3,168 potential protein genes were assigned on the genome, in which 145 (4.6%) were identical to reported genes and 1,257 (39.6%) and 340 (10.8%) showed similarity to reported and hypothetical genes, respectively. The remaining 1,426 (45.0%) had no apparent similarity to any genes in databases. Among the potential protein genes assigned, 128 were related to the genes participating in photosynthetic reactions. The sum of the sequences coding for potential protein genes occupies 87% of the genome length. By adding rRNA and tRNA genes, therefore, the genome has a very compact arrangement of protein- and RNA-coding regions. A notable feature on the gene organization of the genome was that 99 ORFs, which showed similarity to transposase genes and could be classified into 6 groups, were found spread all over the genome, and at least 26 of them appeared to remain intact. The result implies that rearrangement of the genome occurred frequently during and after establishment of this species.
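The figures quoted in this abstract can be cross-checked with a short script. This is only a sketch: the variable and function names are illustrative, and the numbers are copied from the abstract. It confirms that the four gene categories add up to the stated total and that the previously reported region is close to the 28% span implied by map positions 64% to 92%.

```python
# Figures copied from the abstract above; all names here are illustrative.
GENOME_BP = 3_573_470           # total genome length
PREVIOUS_REGION_BP = 1_003_450  # previously reported region (map positions 64% to 92%)
TOTAL_GENES = 3_168             # potential protein genes assigned

gene_categories = {
    "identical to reported genes": 145,
    "similar to reported genes": 1_257,
    "similar to hypothetical genes": 340,
    "no apparent similarity": 1_426,
}


def category_total(categories):
    """Sum the gene counts over all categories."""
    return sum(categories.values())


def region_fraction(region_bp, genome_bp):
    """Fraction of the genome covered by a region of the given length."""
    return region_bp / genome_bp


# The four categories should account for every assigned gene.
assert category_total(gene_categories) == TOTAL_GENES

# Share of the genome covered by the previously reported region.
print(f"{region_fraction(PREVIOUS_REGION_BP, GENOME_BP):.1%}")
```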
        
Title: Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. I. Sequence features in the 1 Mb region from map positions 64% to 92% of the genome
Kaneko T, Tanaka A, Sato S, Kotani H, Sazuka T, Miyajima N, Sugiura M, Tabata S
Ref: DNA Research, 2:153, 1995
The contiguous sequence of 1,003,450 bp spanning map positions 64% to 92% of the genome of Synechocystis sp. strain PCC6803 has been deduced. Computer analysis of the sequence predicts that this region contains at least 818 potential ORFs, in which 255 (31%) were either genes that had already been identified or their homologues, 84 (10%) were homologues to registered hypothetical genes, and 149 (18%) showed weak similarities to reported genes. The remaining 330 ORFs showed no apparent similarity to any reported genes or carried no significant protein motifs. The potential ORFs as a whole occupied 86% of the sequenced region, implying compact arrangement of genes in the genome. As to the structural RNA genes, one rRNA operon consisting of 5,028 bp and at least 11 species of tRNA genes were identified. It is noteworthy that 10 out of the 11 tRNA species showed significant sequence similarities to tRNAs reported in plant chloroplasts. As other notable unique sequences, three classes of IS-like elements each with characteristics typical of IS elements were identified, and a typical unit of WD(Trp-Asp)-repeats which have only been detected in the regulatory proteins of eukaryotes was identified within the large 5,079-bp ORF located at map position 69%.
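The ORF breakdown in this abstract can be checked the same way. The sketch below uses illustrative names and the counts quoted above, and it assumes the percentages are rounded to whole numbers. It verifies that the four classes add up to 818 and derives the share of the remaining 330 ORFs, which the abstract does not state.

```python
# Counts copied from the abstract above; all names here are illustrative.
TOTAL_ORFS = 818

orf_categories = {
    "identified genes or homologues": 255,
    "homologues to hypothetical genes": 84,
    "weak similarity to reported genes": 149,
    "no apparent similarity": 330,
}


def percent_shares(categories, total):
    """Return each category's share of the total, rounded to a whole percent."""
    return {name: round(100 * count / total) for name, count in categories.items()}


# The four classes should account for every predicted ORF.
assert sum(orf_categories.values()) == TOTAL_ORFS

for name, share in percent_shares(orf_categories, TOTAL_ORFS).items():
    print(f"{name}: {share}%")
```

With these counts the first three shares match the 31%, 10%, and 18% quoted in the abstract, and the remaining 330 ORFs come to about 40% of the total.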