Stapleton M

References (7)

Title : Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence - Celniker_2002_Genome.Biol_3_RESEARCH0079
Author(s) : Celniker SE , Wheeler DA , Kronmiller B , Carlson JW , Halpern A , Patel S , Adams M , Champe M , Dugan SP , Frise E , Hodgson A , George RA , Hoskins RA , Laverty T , Muzny DM , Nelson CR , Pacleb JM , Park S , Pfeiffer BD , Richards S , Sodergren EJ , Svirskas R , Tabor PE , Wan K , Stapleton M , Sutton GG , Venter C , Weinstock G , Scherer SE , Myers EW , Gibbs RA , Rubin GM
Ref : Genome Biol , 3 :RESEARCH0079 , 2002
Abstract : BACKGROUND: The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to the accepted standard for finished sequence? We are now in a position to answer these questions.
RESULTS: Our finishing process was designed to close gaps, improve sequence quality and validate the assembly. Sequence traces derived from the WGS and draft sequencing of individual bacterial artificial chromosomes (BACs) were assembled into BAC-sized segments. These segments were brought to high quality, and then joined to constitute the sequence of each chromosome arm. Overall assembly was verified by comparison to a physical map of fingerprinted BAC clones. In the current version of the 116.9 Mb euchromatic genome, called Release 3, the six euchromatic chromosome arms are represented by 13 scaffolds with a total of 37 sequence gaps. We compared Release 3 to Release 2; in autosomal regions of unique sequence, the error rate of Release 2 was one in 20,000 bp.
CONCLUSIONS: The WGS strategy can efficiently produce a high-quality sequence of a metazoan genome while generating the reagents required for sequence finishing. However, the initial method of repeat assembly was flawed. The sequence we report here, Release 3, is a reliable resource for molecular genetic experimentation and computational analysis.
ESTHER : Celniker_2002_Genome.Biol_3_RESEARCH0079
PubMedSearch : Celniker_2002_Genome.Biol_3_RESEARCH0079
PubMedID: 12537568
Gene_locus related to this paper: drome-CG8058 , drome-CG9542 , drome-CG11309 , drome-CG11406 , drome-CG17097 , drome-CG17374 , drome-glita , drome-KRAKEN

Title : Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences - Strausberg_2002_Proc.Natl.Acad.Sci.U.S.A_99_16899
Author(s) : Strausberg RL , Feingold EA , Grouse LH , Derge JG , Klausner RD , Collins FS , Wagner L , Shenmen CM , Schuler GD , Altschul SF , Zeeberg B , Buetow KH , Schaefer CF , Bhat NK , Hopkins RF , Jordan H , Moore T , Max SI , Wang J , Hsieh F , Diatchenko L , Marusina K , Farmer AA , Rubin GM , Hong L , Stapleton M , Soares MB , Bonaldo MF , Casavant TL , Scheetz TE , Brownstein MJ , Usdin TB , Toshiyuki S , Carninci P , Prange C , Raha SS , Loquellano NA , Peters GJ , Abramson RD , Mullahy SJ , Bosak SA , McEwan PJ , McKernan KJ , Malek JA , Gunaratne PH , Richards S , Worley KC , Hale S , Garcia AM , Gay LJ , Hulyk SW , Villalon DK , Muzny DM , Sodergren EJ , Lu X , Gibbs RA , Fahey J , Helton E , Ketteman M , Madan A , Rodrigues S , Sanchez A , Whiting M , Young AC , Shevchenko Y , Bouffard GG , Blakesley RW , Touchman JW , Green ED , Dickson MC , Rodriguez AC , Grimwood J , Schmutz J , Myers RM , Butterfield YS , Krzywinski MI , Skalska U , Smailus DE , Schnerch A , Schein JE , Jones SJ , Marra MA
Ref : Proc Natl Acad Sci U S A , 99 :16899 , 2002
Abstract : The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).
ESTHER : Strausberg_2002_Proc.Natl.Acad.Sci.U.S.A_99_16899
PubMedSearch : Strausberg_2002_Proc.Natl.Acad.Sci.U.S.A_99_16899
PubMedID: 12477932
Gene_locus related to this paper: bovin-q3zcj6 , danre-OVCA2 , danre-q4qrh4 , danre-q4v960 , danre-q32ls6 , danre-q503e2 , ratno-CPVL , ratno-q3mhs0 , ratno-q4qr68 , ratno-q5fvr5 , ratno-q32q55 , xenla-a2bd54 , xenla-q2tap9 , xenla-q3kq37 , xenla-q3kq76 , xenla-q4klb6 , xenla-q32n48 , xenla-q32ns5 , xenla-q52l41 , xentr-q4va73 , danre-a7mbu9

Title : A Drosophila full-length cDNA resource - Stapleton_2002_Genome.Biol_3_RESEARCH0080
Author(s) : Stapleton M , Carlson J , Brokstein P , Yu C , Champe M , George R , Guarin H , Kronmiller B , Pacleb J , Park S , Wan K , Rubin GM , Celniker SE
Ref : Genome Biol , 3 :RESEARCH0080 , 2002
Abstract : BACKGROUND: A collection of sequenced full-length cDNAs is an important resource both for functional genomics studies and for the determination of the intron-exon structure of genes. Providing this resource to the Drosophila melanogaster research community has been a long-term goal of the Berkeley Drosophila Genome Project. We have previously described the Drosophila Gene Collection (DGC), a set of putative full-length cDNAs that was produced by generating and analyzing over 250,000 expressed sequence tags (ESTs) derived from a variety of tissues and developmental stages.
RESULTS: We have generated high-quality full-insert sequence for 8,921 clones in the DGC. We compared the sequence of these clones to the annotated Release 3 genomic sequence, and identified more than 5,300 cDNAs that contain a complete and accurate protein-coding sequence. This corresponds to at least one splice form for 40% of the predicted D. melanogaster genes. We also identified potential new cases of RNA editing.
CONCLUSIONS: We show that comparison of cDNA sequences to a high-quality annotated genomic sequence is an effective approach to identifying and eliminating defective clones from a cDNA collection and ensure its utility for experimentation. Clones were eliminated either because they carry single nucleotide discrepancies, which most probably result from reverse transcriptase errors, or because they are truncated and contain only part of the protein-coding sequence.
ESTHER : Stapleton_2002_Genome.Biol_3_RESEARCH0080
PubMedSearch : Stapleton_2002_Genome.Biol_3_RESEARCH0080
PubMedID: 12537569
Gene_locus related to this paper: drome-KRAKEN

Title : Annotation of the Drosophila melanogaster euchromatic genome: a systematic review - Misra_2002_Genome.Biol_3_RESEARCH0083
Author(s) : Misra S , Crosby MA , Mungall CJ , Matthews BB , Campbell KS , Hradecky P , Huang Y , Kaminker JS , Millburn GH , Prochnik SE , Smith CD , Tupy JL , Whitfield EJ , Bayraktaroglu L , Berman BP , Bettencourt BR , Celniker SE , de Grey AD , Drysdale RA , Harris NL , Richter J , Russo S , Schroeder AJ , Shu SQ , Stapleton M , Yamada C , Ashburner M , Gelbart WM , Rubin GM , Lewis SE
Ref : Genome Biol , 3 :RESEARCH0083 , 2002
Abstract : BACKGROUND: The recent completion of the Drosophila melanogaster genomic sequence to high quality and the availability of a greatly expanded set of Drosophila cDNA sequences, aligning to 78% of the predicted euchromatic genes, afforded FlyBase the opportunity to significantly improve genomic annotations. We made the annotation process more rigorous by inspecting each gene visually, utilizing a comprehensive set of curation rules, requiring traceable evidence for each gene model, and comparing each predicted peptide to SWISS-PROT and TrEMBL sequences.
RESULTS: Although the number of predicted protein-coding genes in Drosophila remains essentially unchanged, the revised annotation significantly improves gene models, resulting in structural changes to 85% of the transcripts and 45% of the predicted proteins. We annotated transposable elements and non-protein-coding RNAs as new features, and extended the annotation of untranslated (UTR) sequences and alternative transcripts to include more than 70% and 20% of genes, respectively. Finally, cDNA sequence provided evidence for dicistronic transcripts, neighboring genes with overlapping UTRs on the same DNA sequence strand, alternatively spliced genes that encode distinct, non-overlapping peptides, and numerous nested genes.
CONCLUSIONS: Identification of so many unusual gene models not only suggests that some mechanisms for gene regulation are more prevalent than previously believed, but also underscores the complex challenges of eukaryotic gene prediction. At present, experimental data and human curation remain essential to generate high-quality genome annotations.
ESTHER : Misra_2002_Genome.Biol_3_RESEARCH0083
PubMedSearch : Misra_2002_Genome.Biol_3_RESEARCH0083
PubMedID: 12537572
Gene_locus related to this paper: drome-a1z6g9 , drome-abhd2 , drome-ACHE , drome-CG8058 , drome-CG8093 , drome-CG8233 , drome-CG8425 , drome-CG9059 , drome-CG9186 , drome-CG9542 , drome-CG10982 , drome-CG11309 , drome-CG11406 , drome-CG11598 , drome-CG17097 , drome-glita , drome-KRAKEN , drome-nrtac , drome-OME , drome-q7k274 , drome-q9vux3

Title : Assessing the impact of comparative genomic sequence data on the functional annotation of the Drosophila genome - Bergman_2002_Genome.Biol_3_RESEARCH0086
Author(s) : Bergman CM , Pfeiffer BD , Rincon-Limas DE , Hoskins RA , Gnirke A , Mungall CJ , Wang AM , Kronmiller B , Pacleb J , Park S , Stapleton M , Wan K , George RA , de Jong PJ , Botas J , Rubin GM , Celniker SE
Ref : Genome Biol , 3 :RESEARCH0086 , 2002
Abstract : BACKGROUND: It is widely accepted that comparative sequence data can aid the functional annotation of genome sequences; however, the most informative species and features of genome evolution for comparison remain to be determined.
RESULTS: We analyzed conservation in eight genomic regions (apterous, even-skipped, fushi tarazu, twist, and Rhodopsins 1, 2, 3 and 4) from four Drosophila species (D. erecta, D. pseudoobscura, D. willistoni, and D. littoralis) covering more than 500 kb of the D. melanogaster genome. All D. melanogaster genes (and 78-82% of coding exons) identified in divergent species such as D. pseudoobscura show evidence of functional constraint. Addition of a third species can reveal functional constraint in otherwise non-significant pairwise exon comparisons. Microsynteny is largely conserved, with rearrangement breakpoints, novel transposable element insertions, and gene transpositions occurring in similar numbers. Rates of amino-acid substitution are higher in uncharacterized genes relative to genes that have previously been studied. Conserved non-coding sequences (CNCSs) tend to be spatially clustered with conserved spacing between CNCSs, and clusters of CNCSs can be used to predict enhancer sequences.
CONCLUSIONS: Our results provide the basis for choosing species whose genome sequences would be most useful in aiding the functional annotation of coding and cis-regulatory sequences in Drosophila. Furthermore, this work shows how decoding the spatial organization of conserved sequences, such as the clustering of CNCSs, can complement efforts to annotate eukaryotic genomes on the basis of sequence conservation alone.
ESTHER : Bergman_2002_Genome.Biol_3_RESEARCH0086
PubMedSearch : Bergman_2002_Genome.Biol_3_RESEARCH0086
PubMedID: 12537575
Gene_locus related to this paper: drops-CG4390 , drowi-b4ngb5 , droya-q71d76

Title : The genome sequence of Drosophila melanogaster - Adams_2000_Science_287_2185
Author(s) : Adams MD , Celniker SE , Holt RA , Evans CA , Gocayne JD , Amanatides PG , Scherer SE , Li PW , Hoskins RA , Galle RF , George RA , Lewis SE , Richards S , Ashburner M , Henderson SN , Sutton GG , Wortman JR , Yandell MD , Zhang Q , Chen LX , Brandon RC , Rogers YH , Blazej RG , Champe M , Pfeiffer BD , Wan KH , Doyle C , Baxter EG , Helt G , Nelson CR , Gabor GL , Abril JF , Agbayani A , An HJ , Andrews-Pfannkoch C , Baldwin D , Ballew RM , Basu A , Baxendale J , Bayraktaroglu L , Beasley EM , Beeson KY , Benos PV , Berman BP , Bhandari D , Bolshakov S , Borkova D , Botchan MR , Bouck J , Brokstein P , Brottier P , Burtis KC , Busam DA , Butler H , Cadieu E , Center A , Chandra I , Cherry JM , Cawley S , Dahlke C , Davenport LB , Davies P , de Pablos B , Delcher A , Deng Z , Mays AD , Dew I , Dietz SM , Dodson K , Doup LE , Downes M , Dugan-Rocha S , Dunkov BC , Dunn P , Durbin KJ , Evangelista CC , Ferraz C , Ferriera S , Fleischmann W , Fosler C , Gabrielian AE , Garg NS , Gelbart WM , Glasser K , Glodek A , Gong F , Gorrell JH , Gu Z , Guan P , Harris M , Harris NL , Harvey D , Heiman TJ , Hernandez JR , Houck J , Hostin D , Houston KA , Howland TJ , Wei MH , Ibegwam C , Jalali M , Kalush F , Karpen GH , Ke Z , Kennison JA , Ketchum KA , Kimmel BE , Kodira CD , Kraft C , Kravitz S , Kulp D , Lai Z , Lasko P , Lei Y , Levitsky AA , Li J , Li Z , Liang Y , Lin X , Liu X , Mattei B , McIntosh TC , McLeod MP , McPherson D , Merkulov G , Milshina NV , Mobarry C , Morris J , Moshrefi A , Mount SM , Moy M , Murphy B , Murphy L , Muzny DM , Nelson DL , Nelson DR , Nelson KA , Nixon K , Nusskern DR , Pacleb JM , Palazzolo M , Pittman GS , Pan S , Pollard J , Puri V , Reese MG , Reinert K , Remington K , Saunders RD , Scheeler F , Shen H , Shue BC , Siden-Kiamos I , Simpson M , Skupski MP , Smith T , Spier E , Spradling AC , Stapleton M , Strong R , Sun E , Svirskas R , Tector C , Turner R , Venter E , Wang AH , Wang X , Wang ZY , Wassarman DA , Weinstock GM , Weissenbach J , Williams SM , Woodage T , Worley KC , Wu D , Yang S , Yao QA , Ye J , Yeh RF , Zaveri JS , Zhan M , Zhang G , Zhao Q , Zheng L , Zheng XH , Zhong FN , Zhong W , Zhou X , Zhu S , Zhu X , Smith HO , Gibbs RA , Myers EW , Rubin GM , Venter JC
Ref : Science , 287 :2185 , 2000
Abstract : The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the approximately 120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes approximately 13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.
ESTHER : Adams_2000_Science_287_2185
PubMedSearch : Adams_2000_Science_287_2185
PubMedID: 10731132
Gene_locus related to this paper: drome-1vite , drome-2vite , drome-3vite , drome-a1z6g9 , drome-abhd2 , drome-ACHE , drome-b6idz4 , drome-BEM46 , drome-CG5707 , drome-CG5704 , drome-CG1309 , drome-CG1882 , drome-CG1986 , drome-CG2059 , drome-CG2493 , drome-CG2528 , drome-CG2772 , drome-CG3160 , drome-CG3344 , drome-CG3523 , drome-CG3524 , drome-CG3734 , drome-CG3739 , drome-CG3744 , drome-CG3841 , drome-CG4267 , drome-CG4382 , drome-CG4390 , drome-CG4572 , drome-CG4582 , drome-CG4851 , drome-CG4979 , drome-CG5068 , drome-CG5162 , drome-CG5355 , drome-CG5377 , drome-CG5397 , drome-CG5412 , drome-CG5665 , drome-CG5932 , drome-CG5966 , drome-CG6018 , drome-CG6113 , drome-CG6271 , drome-CG6283 , drome-CG6295 , drome-CG6296 , drome-CG6414 , drome-CG6431 , drome-CG6472 , drome-CG6567 , drome-CG6675 , drome-CG6753 , drome-CG6847 , drome-CG7329 , drome-CG7367 , drome-CG7529 , drome-CG7632 , drome-CG8058 , drome-CG8093 , drome-CG8233 , drome-CG8424 , drome-CG8425 , drome-CG9059 , drome-CG9186 , drome-CG9287 , drome-CG9289 , drome-CG9542 , drome-CG9858 , drome-CG9953 , drome-CG9966 , drome-CG10116 , drome-CG10163 , drome-CG10175 , drome-CG10339 , drome-CG10357 , drome-CG10982 , drome-CG11034 , drome-CG11055 , drome-CG11309 , drome-CG11319 , drome-CG11406 , drome-CG11598 , drome-CG11600 , drome-CG11608 , drome-CG11626 , drome-CG11935 , drome-CG12108 , drome-CG12869 , drome-CG13282 , drome-CG13562 , drome-CG13772 , drome-CG14034 , drome-nlg3 , drome-CG14717 , drome-CG15101 , drome-CG15102 , drome-CG15106 , drome-CG15111 , drome-CG15820 , drome-CG15821 , drome-CG15879 , drome-CG17097 , drome-CG17099 , drome-CG17101 , drome-CG17191 , drome-CG17192 , drome-CG17292 , drome-CG18258 , drome-CG18284 , drome-CG18301 , drome-CG18302 , drome-CG18493 , drome-CG18530 , drome-CG18641 , drome-CG18815 , drome-CG31089 , drome-CG31091 , drome-CG32333 , drome-CG32483 , drome-CG33174 , drome-dnlg1 , drome-este4 , drome-este6 , drome-GH02384 , drome-GH02439 , drome-glita , drome-KRAKEN , drome-lip1 , drome-LIP2 , drome-lip3 , drome-MESK2 , drome-nrtac , drome-OME , drome-q7k274 , drome-Q9VJN0 , drome-Q8IP31 , drome-q9vux3

Title : A Drosophila complementary DNA resource - Rubin_2000_Science_287_2222
Author(s) : Rubin GM , Hong L , Brokstein P , Evans-Holm M , Frise E , Stapleton M , Harvey DA
Ref : Science , 287 :2222 , 2000
Abstract : Collections of nonredundant, full-length complementary DNA (cDNA) clones for each of the model organisms and humans will be important resources for studies of gene structure and function. We describe a general strategy for producing such collections and its implementation, which so far has generated a set of cDNAs corresponding to over 40% of the genes in the fruit fly Drosophila melanogaster.
ESTHER : Rubin_2000_Science_287_2222
PubMedSearch : Rubin_2000_Science_287_2222
PubMedID: 10731138
Gene_locus related to this paper: drome-nrtac