Genome Informatics 11: 348–349 (2000)

Large-Scale Sequence Analyses in the Arabidopsis thaliana Genome Sequencing Project: Status 2000

Yasukazu Nakamura, Satoshi Tabata
ynakamu@kazusa.or.jp, tabata@kazusa.or.jp

The First Laboratory for Plant Gene Research, Kazusa DNA Research Institute
1532-3 Yana, Kisarazu, Chiba 292-0812, Japan

Keywords: gene-finding, annotation, plant genome sequencing project

1 Introduction

Arabidopsis thaliana is a small flowering plant that is widely used by plant science researchers as a model organism to study many aspects of plant biology. The genome is ca. 130 megabases in size and is organized into five chromosomes. Estimates of the number of protein genes are around 25,000. To understand the entire genetic system of this plant, a multinational project to sequence the Arabidopsis thaliana genome has been under way since 1996 [1]. Kazusa DNA Research Institute is taking part by sequencing the entire bottom arm and portions of the top arm of chromosome 5, and also the top arm of chromosome 3, in line with the international agreement of the AGI (Arabidopsis Genome Initiative). The unique sequence regions of chromosomes 2 and 4 have already been sequenced and published [2, 3], and the entire sequence and the first-pass annotation will be published in Nature at the end of 2000. We have sequenced and annotated twenty-seven megabases of Arabidopsis sequence in the Kazusa-allocated regions.

While annotating the genomic sequences of clones on chromosomes 3 and 5, we constructed a computer-aided gene-modeling system that combines and displays the outputs of similarity searches and prediction algorithms, in order to make gene annotation reliable [4]. High-throughput annotation of the allocated regions of the Arabidopsis thaliana genome was carried out with the assistance of this system. In this poster, we present the current status of the gene-finding and data presentation systems, as well as structural and functional features of the genes deduced in these regions.

2 Methods and Description

The nucleotide sequence of each clone was determined by the shotgun-based strategy described in a previous paper [5]. The high-throughput gene-finding and annotation system was reported at this meeting last year [4]. Using a combination of gene prediction programs and database searches, we annotated the regions of Arabidopsis thaliana chromosomes 3 and 5 that we sequenced. Currently, 6,251 genes have been deduced in the 27,061,818 bp of chromosomes 3 and 5 covered by 461 P1, TAC, and BAC clones, together with direct PCR products and lambda clones used as gap-closing units. The average density of protein genes is estimated to be 1 gene per 4.4 kilobase pairs. Of the more than 6,000 protein and RNA genes deduced, about half were assigned either a definite or a putative function by similarity to known genes; the remainder were of unknown function. We made a trial classification of the annotated genes by biological role. The deduced genes whose function was anticipated by similarity were grouped into twenty-two categories with the
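
The gene-density figure in Section 2 can be sanity-checked with simple arithmetic. The sketch below is illustrative only and is not part of the annotation system described in the abstract; the function names are hypothetical. It uses the two totals quoted in the text (27,061,818 bp and 6,251 deduced genes). Because the 6,251 figure appears to include RNA genes as well as protein genes, and the protein-gene count is not given in this text, the result (roughly 4.3 kb per gene) is expected to sit close to, but not exactly at, the reported 1 protein gene per 4.4 kb.

```python
# Totals quoted in the abstract (Section 2).
TOTAL_BP = 27_061_818   # annotated sequence on chromosomes 3 and 5
TOTAL_GENES = 6_251     # deduced genes (assumed: protein and RNA genes together)


def kb_per_gene(total_bp: int, gene_count: int) -> float:
    """Average number of kilobase pairs per gene."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return total_bp / gene_count / 1000


def genes_per_mb(total_bp: int, gene_count: int) -> float:
    """Average number of genes per megabase."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return gene_count / (total_bp / 1_000_000)


if __name__ == "__main__":
    print(f"{kb_per_gene(TOTAL_BP, TOTAL_GENES):.2f} kb per gene")
    print(f"{genes_per_mb(TOTAL_BP, TOTAL_GENES):.0f} genes per Mb")
```

Dividing by a smaller, protein-only gene count would push the spacing upward, which is consistent with the slightly larger 4.4 kb value reported for protein genes.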
