From m.han at gsf.de Tue Jun 15 12:11:48 2004
From: m.han at gsf.de (Michael Han)
Date: Tue Jun 15 12:15:32 2004
Subject: [BioRuby] selenoproteins
Message-ID:

Hi everyone,

I am working on fine-tuning an annotation pipeline with Ruby.
As selenoproteins do not seem to be supported by BioRuby, they break my
assertions (and most gene predictions).
Therefore I would like to add a codon table for selenoproteins (TGA=>U).

Also, I would like to ask whether anyone is working on support for
BLAT / BED / PSL, or on moving tests into separate Test::Unit classes.

Greetings,
Michael Han
---------------------
Institute for Bioinformatics
National Research Center for Environment and Health
Munich - Germany
----------------------

I am thinking of adding something like this to bio/data/codontable.rb:

------------------------ 8< ---------------
    # codon table 99 - selenoproteins (Eukaryote)
    99 => {
      'ttt' => 'F', 'tct' => 'S', 'tat' => 'Y', 'tgt' => 'C',
      'ttc' => 'F', 'tcc' => 'S', 'tac' => 'Y', 'tgc' => 'C',
      'tta' => 'L', 'tca' => 'S', 'taa' => '*', 'tga' => 'U',
      'ttg' => 'L', 'tcg' => 'S', 'tag' => '*', 'tgg' => 'W',

      'ctt' => 'L', 'cct' => 'P', 'cat' => 'H', 'cgt' => 'R',
      'ctc' => 'L', 'ccc' => 'P', 'cac' => 'H', 'cgc' => 'R',
      'cta' => 'L', 'cca' => 'P', 'caa' => 'Q', 'cga' => 'R',
      'ctg' => 'L', 'ccg' => 'P', 'cag' => 'Q', 'cgg' => 'R',

      'att' => 'I', 'act' => 'T', 'aat' => 'N', 'agt' => 'S',
      'atc' => 'I', 'acc' => 'T', 'aac' => 'N', 'agc' => 'S',
      'ata' => 'I', 'aca' => 'T', 'aaa' => 'K', 'aga' => 'R',
      'atg' => 'M', 'acg' => 'T', 'aag' => 'K', 'agg' => 'R',

      'gtt' => 'V', 'gct' => 'A', 'gat' => 'D', 'ggt' => 'G',
      'gtc' => 'V', 'gcc' => 'A', 'gac' => 'D', 'ggc' => 'G',
      'gta' => 'V', 'gca' => 'A', 'gaa' => 'E', 'gga' => 'G',
      'gtg' => 'V', 'gcg' => 'A', 'gag' => 'E', 'ggg' => 'G',
    },
----------------------- 8< -------------------------

From ngoto at gen-info.osaka-u.ac.jp Fri Jun 18 12:07:45 2004
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Fri Jun 18 12:10:34 2004
Subject: [BioRuby] selenoproteins
In-Reply-To:
References:
Message-ID:

Hi,

On Tue, 15 Jun 2004 18:11:48 +0200
Michael Han wrote:

> Hi everyone,
>
> I am working on fine-tuning an annotation pipeline with Ruby.
> As selenoproteins do not seem to be supported by BioRuby, they break my
> assertions (and most gene predictions).
> Therefore I would like to add a codon table for selenoproteins (TGA=>U).

In the next version, Bio::Sequence::NA#translate and Bio::CodonTable will
be changed, and users will be able to modify codon tables freely, either
starting from NCBI taxonomy's definitions or from scratch. We will commit
the new code into CVS in a few days.

> Also, I would like to ask whether anyone is working on support for
> BLAT / BED / PSL, or on moving tests into separate Test::Unit classes.

I have some code to support BLAT (psl and pslx formats), but it is not
refined and has not been added to bioruby yet. (In addition, I have some
primitive code to support sim4, est2genome and spidey.)

I'm sorry, I don't know much about BED ('Brain Est Database' ?).

Unit tests are important but, in bioruby, only a little effort has been
made so far.

Regards,

--
Naohisa GOTO
ngoto@gen-info.osaka-u.ac.jp
Genome Information Research Center, Osaka University, Japan

From ktym at hgc.jp Sat Jun 26 11:26:28 2004
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sat Jun 26 11:29:00 2004
Subject: [BioRuby] selenoproteins
In-Reply-To:
References:
Message-ID: <31979E9A-C785-11D8-A1A4-000A95CA1390@hgc.jp>

Hello,

On 2004/06/19, at 1:07, GOTO Naohisa wrote:

> On Tue, 15 Jun 2004 18:11:48 +0200
> Michael Han wrote:
>
>> Hi everyone,
>>
>> I am working on fine-tuning an annotation pipeline with Ruby.
>> As selenoproteins do not seem to be supported by BioRuby, they break my
>> assertions (and most gene predictions).
>> Therefore I would like to add a codon table for selenoproteins
>> (TGA=>U).
>
> In the next version, Bio::Sequence::NA#translate and Bio::CodonTable will
> be changed, and users will be able to modify codon tables freely, either
> starting from NCBI taxonomy's definitions or from scratch. We will commit
> the new code into CVS in a few days.

I have committed these changes into CVS.

  * Bio::CodonTable is now a class.
  * The Bio::Sequence::NA#translate method accepts a Bio::CodonTable object.

You can use your own table and its definition as follows:

  hash = { 'atg' => 'M', ... }
  definition = "my table"
  ct = Bio::CodonTable.new(hash, definition)

Selecting from the hard-coded tables is the same as before:

  ct = Bio::CodonTable[1]

We have also implemented the 'revtrans', 'start_codon?' and 'stop_codon?'
methods. Please take a look at the test code and documentation added at
the bottom of the codontable.rb file.

  http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioruby/lib/bio/data/codontable.rb?rev=0.10&cvsroot=bioruby&content-type=text/vnd.viewcvs-markup

Additionally, the Bio::Sequence::NA#translate method is now 30-50% faster
than before, thanks to code contributed by N. Goto.

Regards,
Toshiaki Katayama
--
Human Genome Center, Institute of Medical Science, University of Tokyo
4-6-1 Shirokanedai, Minato-ku, Tokyo 108-0071, Japan
tel://+81-3-5449-5614, fax://+81-3-5449-5434

BioRuby project        http://bioruby.org/~k/
GenomeNet/KEGG         http://www.genome.jp/
Human Genome Center    http://www.hgc.jp/
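
Editorial sketch (not part of the original thread): the table proposed in
the first message differs from the standard genetic code only in that
'tga' maps to 'U' instead of '*'. The plain-Ruby example below illustrates
that idea without loading BioRuby, so it does not show how Bio::CodonTable
or Bio::Sequence::NA#translate are implemented. The helper names
(standard_codon_table, selenoprotein_codon_table, translate) are
hypothetical and exist only for this illustration.

```ruby
# Plain-Ruby sketch; hypothetical helpers, not BioRuby's implementation.

BASES = %w[t c a g].freeze

# Amino acids of the standard genetic code (NCBI table 1), listed in the
# usual t, c, a, g order with the third codon position varying fastest.
STANDARD_AAS =
  'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'.freeze

# Build the standard codon => amino acid hash (64 entries).
def standard_codon_table
  codons = BASES.product(BASES, BASES).map(&:join)
  codons.zip(STANDARD_AAS.chars).to_h
end

# The table proposed in the thread: the standard code with tga => U.
def selenoprotein_codon_table
  standard_codon_table.merge('tga' => 'U')
end

# Translate a nucleotide string codon by codon in frame 1.
# Codons missing from the table become 'X'; trailing bases that do not
# fill a complete codon are ignored.
def translate(seq, table)
  seq.downcase.tr('u', 't').scan(/.{3}/).map { |codon| table.fetch(codon, 'X') }.join
end

orf = 'atgtgatgctaa'
puts translate(orf, standard_codon_table)      # M*C*
puts translate(orf, selenoprotein_codon_table) # MUC*
```

Note that a blanket 'tga' => 'U' mapping reads every in-frame TGA as
selenocysteine, including genuine stop codons, so a table like this is
best applied only to sequences already known to encode selenoproteins.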