Taxonomic lineage: cellular organisms > Eukaryota > Viridiplantae > Streptophyta > Streptophytina > Embryophyta > Tracheophyta > Euphyllophyta > Spermatophyta > Magnoliophyta > Mesangiospermae > eudicotyledons > Gunneridae > Pentapetalae > asterids > lamiids > Solanales > Solanaceae > Solanoideae > Solaneae > Solanum > Lycopersicon > Solanum lycopersicum
Legend: This sequence has been compared to the family alignment (MSA). Red => minority amino acid; blue => majority amino acid; color intensity => conservation rate; title => sequence position (MSA position), amino acid, rate.
MEPIKKQGRHFVLVHGACHGGWCWYKLKPLLEVAGHKVTTLDLAASGIDL
RKIEQLHTFHDYTLPLMELMESLPQEEKVILVGHSLGGMNLGLVMEKYPQ
KIYVAVFLAAFMPDSIHSSSYVLDQYFERMQTMNWLDTQFVSYGSHEEPL
PSIFFGPKFLAYNLYQLCPPEDVALVSSLGRASSLFLEDLSKSKYLTDEG
YGSVKKVYIVCTDDKLLPKEFQKWQIDNINSIIETKEIEGADHMAMLSMP
KKLCDTLLEIADKYN
A collection of 9,990 single-pass nuclear genomic sequences, corresponding to 5 Mb of tomato DNA, was obtained using a methylation filtration (MF) strategy and reduced to 7,053 unique undermethylated genomic islands (UGIs) distributed as follows: (1) 59% non-coding sequences, (2) 28% coding sequences, (3) 12% transposons, 96% of which are class I retroelements, and (4) 1% organellar sequences integrated into the nuclear genome over the past approximately 100 million years. A more detailed analysis of coding UGIs indicates that the unmethylated portion of tomato genes extends as far as 676 bp upstream and 766 bp downstream of coding regions, with an average of 174 and 171 bp, respectively. Based on the analysis of the UGI copy distribution, the undermethylated portion of the tomato genome is determined to account for the majority of the unmethylated genes in the genome and is estimated to constitute 61 ± 15 Mb of DNA (approximately 5% of the entire genome), which is significantly less than the 220 Mb estimated for the gene-rich euchromatic arms of the tomato genome. This result indicates that, while most genes reside in the euchromatin, a significant portion of euchromatin is methylated in the intergenic spacer regions. Implications of the results for sequencing the genome of tomato and other solanaceous species are discussed.
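As a rough cross-check of the figures quoted in this abstract, the arithmetic can be sketched as follows. The numbers are taken from the abstract; the variable names are illustrative only, and the derived values (mean read length, implied genome size, share of euchromatin) are back-of-the-envelope estimates that the abstract does not state, not results reported by the authors.

```python
# Figures quoted in the abstract above.
reads = 9_990                 # single-pass nuclear genomic sequences
sampled_mb = 5.0              # total DNA those reads correspond to (Mb)
undermethylated_mb = 61.0     # estimated undermethylated portion (Mb, +/- 15)
genome_fraction = 0.05        # "approximately 5% of the entire genome"
euchromatin_mb = 220.0        # estimate for the gene-rich euchromatic arms (Mb)

# UGI categories as percentages of the 7,053 unique islands.
category_pct = {
    "non-coding": 59,
    "coding": 28,
    "transposons": 12,
    "organellar": 1,
}

# Derived values (assumptions of this sketch, not stated in the abstract).
mean_read_bp = sampled_mb * 1e6 / reads                   # about 500 bp per read
implied_genome_mb = undermethylated_mb / genome_fraction  # about 1,220 Mb
euchromatin_share = undermethylated_mb / euchromatin_mb   # about 0.28

print(f"category percentages sum to {sum(category_pct.values())}%")
print(f"mean read length: {mean_read_bp:.0f} bp")
print(f"genome size implied by 61 Mb at 5%: {implied_genome_mb:.0f} Mb")
print(f"undermethylated share of euchromatin: {euchromatin_share:.0%}")
```

Because the abstract gives the 5% figure only as approximate and the 61 Mb estimate carries a 15 Mb uncertainty, the implied genome size should be read as an order-of-magnitude check rather than a measurement.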
        
Title: A deep-coverage tomato BAC library and prospects toward development of an STC framework for genome sequencing
Authors: Budiman MA, Mao L, Wood TC, Wing RA
Ref: Genome Res, 10:129, 2000
Recently a new strategy using BAC end sequences as sequence-tagged connectors (STCs) was proposed for whole-genome sequencing projects. In this study, we present the construction and detailed characterization of a 15.0 haploid genome equivalent BAC library for the cultivated tomato, Lycopersicon esculentum cv. Heinz 1706. The library contains 129,024 clones with an average insert size of 117.5 kb and a chloroplast content of 1.11%. BAC end sequences from 1490 ends were generated and analyzed as a preliminary evaluation for using this library to develop an STC framework to sequence the tomato genome. A total of 1205 BAC end sequences (80.9%) were obtained, with an average length of 360 high-quality bases, and were searched against the GenBank database. Using a cutoff expectation value of <10^-6, and combining the results from BLASTN, BLASTX, and TBLASTX searches, 24.3% of the BAC end sequences were similar to known sequences, of which almost half (48.7%) share sequence similarities to retrotransposons and 7% to known genes. Some of the transposable element sequences were the first reported in tomato, such as sequences similar to maize transposon Activator (Ac) ORF and tobacco pararetrovirus-like sequences. Interestingly, there were no BAC end sequences similar to the highly repeated TGRI and TGRII elements. However, the majority (70.3%) of STCs did not share significant sequence similarities to any sequences in GenBank at either the DNA or predicted protein levels, indicating that a large portion of the tomato genome is still unknown. Our data demonstrate that this BAC library is suitable for developing an STC database to sequence the tomato genome. The advantages of developing an STC framework for whole-genome sequencing of tomato are discussed.
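The library statistics in this abstract can be related to one another with a short calculation. The sketch below uses only the numbers quoted above; the haploid genome size it derives is implied by those numbers and is not stated in the abstract. Subtracting the chloroplast fraction before computing coverage is an assumption of this sketch, as is the use of the standard Clarke-Carbon estimate, P = 1 - e^(-c), for the chance that a given locus is represented in a library of coverage c.

```python
import math

# Figures quoted in the abstract above.
clones = 129_024
mean_insert_kb = 117.5
chloroplast_fraction = 0.0111   # 1.11% chloroplast content
coverage = 15.0                 # haploid genome equivalents
ends_attempted = 1490
ends_obtained = 1205

# Total cloned DNA in megabases.
total_mb = clones * mean_insert_kb / 1000

# Assumption: only the non-chloroplast clones count toward nuclear coverage.
nuclear_mb = total_mb * (1 - chloroplast_fraction)

# Haploid genome size implied by the stated 15.0x coverage.
implied_genome_mb = nuclear_mb / coverage

# Clarke-Carbon estimate of the probability that any given sequence
# is present in at least one clone.
p_present = 1 - math.exp(-coverage)

# Success rate of BAC end sequencing.
success_rate = ends_obtained / ends_attempted

print(f"total cloned DNA: {total_mb:.0f} Mb")
print(f"implied haploid genome size: {implied_genome_mb:.0f} Mb")
print(f"probability a locus is represented: {p_present:.7f}")
print(f"BAC end sequencing success: {success_rate:.1%}")
```

Under these assumptions the implied genome size comes out near 1,000 Mb, and the end-sequencing success rate reproduces the 80.9% given in the abstract.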