From shameer at ncbs.res.in Tue Aug 7 03:39:49 2007
From: shameer at ncbs.res.in (Shameer Khadar)
Date: Tue, 7 Aug 2007 11:39:49 +0400 (RET)
Subject: [BioRuby] Ruby Newbie
Message-ID: <39717.192.168.1.186.1186472389.squirrel@mail.ncbs.res.in>

Dear All,

I am a Ruby/Bio-Ruby newbie. Please direct me to some good resource to
start learning Ruby by using Bio-Ruby classes.

cheers,
--
Shameer Khadar
Prof. R. Sowdhamini's Lab (# 25)
The Computational Biology Group
National Centre for Biological Sciences (TIFR)
GKVK Campus, Bellary Road, Bangalore - 65, Karnataka - India
T - 91-080-23666001 EXT - 6251
W - http://www.ncbs.res.in

From ngoto at gen-info.osaka-u.ac.jp Thu Aug 9 10:40:18 2007
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 9 Aug 2007 23:40:18 +0900
Subject: [BioRuby] Ruby Newbie
In-Reply-To: <39717.192.168.1.186.1186472389.squirrel@mail.ncbs.res.in>
References: <39717.192.168.1.186.1186472389.squirrel@mail.ncbs.res.in>
Message-ID: <20070809144018.57D9B1CBC558@idnmail.gen-info.osaka-u.ac.jp>

Hi,

On Tue, 7 Aug 2007 11:39:49 +0400 (RET)
"Shameer Khadar" wrote:

> Dear All,
>
> I am a Ruby/Bio-Ruby newbie. Please direct me to some good resource to
> start learning Ruby by using Bio-Ruby classes.

BioRuby in Anger
http://dev.bioruby.org/wiki/en/?BioRuby+in+Anger

Tutorial.rd (bundled with bioruby-X.X.X.tar.gz)
http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/bioruby/doc/Tutorial.rd?rev=HEAD&cvsroot=bioruby&content-type=text/plain

Tutorial.rd on Bioruby Wiki (older than above, and parse error occurred?)
http://dev.bioruby.org/wiki/en/?Tutorial.rd

However, above documents may be too difficult for newbies who don't know Ruby.
If so, you may also want to read introductions to and tutorials on Ruby itself,
for example,

http://tryruby.hobix.com/
(an interactive Ruby tutorial is available)

Thanks,

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan

From ngoto at gen-info.osaka-u.ac.jp Thu Aug 9 12:15:45 2007
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Fri, 10 Aug 2007 01:15:45 +0900
Subject: [BioRuby] Wu-blast report parsing issue
In-Reply-To: <4702CF45-61F8-43CF-B758-083EA59AD10B@unil.ch>
References: <4702CF45-61F8-43CF-B758-083EA59AD10B@unil.ch>
Message-ID: <20070809161545.D4A971CBC3F7@idnmail.gen-info.osaka-u.ac.jp>

Hello,

I'm sorry for the late reply.

It seems this error occurred at line 29 of your XML file:
  <Hit_def></Hit_def>
The content of the Hit_def is empty.

For sequences with no definition, NCBI BLAST outputs
  <Hit_def>No definition line found</Hit_def>
and the content of the Hit_def is not empty.

This means the XML output of WU-BLAST is sometimes incompatible with
that of NCBI BLAST. However, because this is a very small difference,
I think it can be handled within BioRuby.
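
A minimal sketch of how a parser could tolerate the empty Hit_def described
above. It assumes REXML from Ruby's standard distribution; hit_definitions is
a hypothetical helper written for illustration, not BioRuby's xmlparser code,
and the XML fragment is a hand-trimmed single hit rather than a full report.

```ruby
require 'rexml/document'

# Placeholder text that NCBI BLAST writes for sequences without a definition.
NO_DEFINITION = 'No definition line found'

# Hypothetical helper (not part of BioRuby): returns the Hit_def text of
# every <Hit>, substituting the NCBI placeholder when the element is empty.
def hit_definitions(xml)
  doc = REXML::Document.new(xml)
  defs = []
  doc.elements.each('//Hit') do |hit|
    node = hit.elements['Hit_def']
    text = node && node.text   # nil for <Hit_def></Hit_def>
    defs << (text.nil? || text.strip.empty? ? NO_DEFINITION : text)
  end
  defs
end

# Trimmed WU-BLAST-style fragment with an empty Hit_def (assumed layout).
wu_xml = <<XML
<Iteration_hits>
  <Hit>
    <Hit_num>1</Hit_num>
    <Hit_id>lcl|EXAMPLE</Hit_id>
    <Hit_def></Hit_def>
    <Hit_accession>EXAMPLE</Hit_accession>
    <Hit_len>507</Hit_len>
  </Hit>
</Iteration_hits>
XML

p hit_definitions(wu_xml)   # prints ["No definition line found"]
```

The point of the guard is that an empty element yields nil rather than an
empty string, so any code that calls String methods on the definition needs a
fallback value first.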

I can repeat the same error with the following data:

(saved as database.fst)
--------------------------------------------------------------
>lcl|EXAMPLE
AGACATAACCCAAACAGAATAACCTGAAAGAGACCCACGACCATGCAGGGGACCTGGATG
GTGCTGTTGGCACTGATATTGGGCACCTTCGGGGAGCTTGCTATGGCCTTACAGTGCTAC
ACCTGTGCGAATCCTGTGAGTGCATCCAACTGTGTCACCACCACCCACTGCCACATCAAT
GAAACCATGTGCAAGACTACGCTCTACTCCCTGGAGATTGTTTTCCCTTTCCTGGGGGAC
TCCACGGTGACCAAGTCCTGCGCCAGCAAGTGTGAGCCTTCGGATGTGGATGGCATTGGG
CAAACCCGGCCAGTGTCCTGCTGCAATTCTGACCTATGCAACGTGGATGGGGCACCCAGC
CTGGGCAGTCCTGGTGGCCTGCTCCTTGCCCTGGCACTTTTCTTGCTCTTGGGTGTCCTG
CTGTAAAGCCATGGCCATCTAGCTCCACTCCCTTGTCCCTGACATCCCAGTTCCCTAATG
CCTAGAAGAAATACAATGGCCATCTGC
--------------------------------------------------------------

(saved as query.fst)
--------------------------------------------------------------
>Contig1
AGACATAACCCAAACAGAATAACCTGAAAGAGACCCACGACCATGCAGGGGACCTGGATG
GTGCTGTTGGCACTGATATTGGGCACCTTCGGGGAGCTTGCTATGGCCTTACAGTGCTAC
ACCTGTGCGAATCCTGTGAGTGCATCCAACTGTGTCACCACCACCCACTGCCACATCAAT
GAAACCATGTGCAAGACTACGCTCTACTCCCTGGAGATTGTTTTCCCTTTCCTGGGGGAC
TCCACGGTGACCAAGTCCTGCGCCAGCAAGTGTGAGCCTTCGGATGTGGATGGCATTGGG
CAAACCCGGCCAGTGTCCTGCTGCAATTCTGACCTATGCAACGTGGATGGGGCACCCAGC
CTGGGCAGTCCTGGTGGCCTGCTCCTTGCCCTGGCACTTTTCTTGCTCTTGGGTGTCCTG
CTGTAAAGCCATGGCCATCTAGCTCCACTCCCTTGTCCCTGACATCCCAGTTCCCTAATG
CCTAGAAGAAATACAATGGCCATCTGC
--------------------------------------------------------------

The sequence of query.fst is completely the same as database.fst.
Only definition line is different.

commands for WU BLAST:
% xdformat -n database.fst
% wu-blastall -p blastn -i query.fst -d database.fst \
    -o wu-blastn.xml -e 1e-10 -m 7 -F F

commands for NCBI BLAST:
% formatdb -i database.fst -p F -o
% blastall -p blastn -i query.fst -d database.fst \
    -o ncbi-blastn.xml -e 1e-10 -m 7 -F F

Report of WU BLAST:
  <Hit_num>1</Hit_num>
  <Hit_id>lcl|EXAMPLE</Hit_id>
  <Hit_def></Hit_def>
  <Hit_accession>EXAMPLE</Hit_accession>
  <Hit_len>507</Hit_len>

Report of NCBI BLAST:
  <Hit_num>1</Hit_num>
  <Hit_id>lcl|EXAMPLE</Hit_id>
  <Hit_def>No definition line found</Hit_def>
  <Hit_accession>EXAMPLE</Hit_accession>
  <Hit_len>507</Hit_len>

The Hit_def line of WU-BLAST is incompatible with NCBI BLAST
for sequences with no definitions.

The versions of WU BLAST and NCBI BLAST were:
2.0MP-WashU [04-May-2006] [linux26-i786-ILP32F64 2006-05-09T12:19:58]
blastn 2.2.15 [Oct-15-2006]

> I've tried feeding my script normal (-m0) wublast output too. It
> doesn't crash - but @reportsArray.length == 0).

Bio::Blast.reports can only be used for XML output.
For normal format,
  49: @reportsArray = Bio::FlatFile.new(nil, @file).to_a
would work.

Thank you,

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Tue, 24 Apr 2007 21:03:56 +0200
Yannick Wurm wrote:

> Hello,
>
> I've generated a blast report using wu-blastall with -m7 to get xml
> output.
> It should be easy to get this into ruby, but I'm having a hard time.
>
> Here's the error I get:
> #~/ruby/dotGraphOfStrongHits.rb simple.xml simple.xml.dot 1.0e-5
> /sw/lib/ruby/site_ruby/1.8/bio/appl/blast/xmlparser.rb:158:in
> `clone': can't clone NilClass (TypeError)
>     from /sw/lib/ruby/site_ruby/1.8/bio/appl/blast/xmlparser.rb:
> 158:in `xmlparser_parse_hit'
>     from /sw/lib/ruby/site_ruby/1.8/bio/appl/blast/xmlparser.rb:
> 72:in `xmlparser_parse'
>     from /sw/lib/ruby/site_ruby/1.8/bio/appl/blast/xmlparser.rb:
> 41:in `xmlparser_parse'
>     from /sw/lib/ruby/site_ruby/1.8/bio/appl/blast/report.rb:
> 66:in `auto_parse'
>     from /sw/lib/ruby/site_ruby/1.8/bio/appl/blast/report.rb:
> 89:in `initialize'
>     from /sw/lib/ruby/site_ruby/1.8/bio/appl/blast.rb:115:in
> `reports'
>     from /sw/lib/ruby/site_ruby/1.8/bio/appl/blast.rb:109:in
> `reports'
>     from /Users/yannickwurm/ruby/wublastReportParser.rb:49:in
> `loadBlastReport'
>     from /Users/yannickwurm/ruby/dotGraphOfStrongHits.rb:30:in
> `parseFile'
>     from /Users/yannickwurm/ruby/dotGraphOfStrongHits.rb:61
>
> The corresponding lines of wublastReportParser.rb are:
> 48: @file = File.open(@blast_report, IO::RDONLY)
> 49: @reportsArray = Bio::Blast.reports(@file)
>
>
> Does wublast not respect the standard blast xml output?
> I've tried feeding my script normal (-m0) wublast output too. It
> doesn't crash - but @reportsArray.length == 0).
>
> My xml-ed blast report is here:
> http://wwwpeople.unil.ch/yannick.wurm/simple.xml
>
>
> What am I doing wrong? Do you have ideas how to solve this issue?
>
>
> My version info:
> wu-blastall 2.2.6
> ruby 1.8.4 (2005-12-24) [powerpc-darwin]
> bio.rb,v 1.84 2007/04/05
>
> Thanks in advance for any pointers!
> yannick
>
> --------------------------------------------
> yannick . wurm @ unil . ch
> Ant Genomics, Ecology & Evolution @ Lausanne
> http://www.unil.ch/dee/page28685_fr.html
>
>
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From jan.aerts at bbsrc.ac.uk Tue Aug 14 10:36:32 2007
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Tue, 14 Aug 2007 15:36:32 +0100
Subject: [BioRuby] ruby API to ensembl core database
Message-ID: <1F16910BB8546C4DA5526FABB0C98D093C09F3@ebre2ksrv1.ebrc.bbsrc.ac.uk>

All,

I've released a ruby API to the Ensembl core database, similar to the
one available for perl. Mitsuteru already wrote a very useful interface
to the ExportView functionality of the Ensembl website, but this API is
linked to the underlying mysql database and attempts to copy the
functionality of the full-blown perl API. This includes slices,
projections and transformations.

A tutorial for the API is available at the homepage where you can also
download the gem (for the Ensembl core people: 'gem' refers to a type of
file, not to a developer who lost touch with reality). I've tested quite
a bit, and hope there are no bugs left in the code... If you find any:
let me know.

For more information, see http://bioruby-annex.rubyforge.org and
http://saaientist.blogspot.com

If you have any suggestions, please don't hesitate to contact me. And if
anyone would be interested to help me in the future development of this
API, I'd be very happy.

Big big thanks to the Ensembl core team for helping me (and being my
hosts as a Geek for a Week).

Dr Jan Aerts
Bioinformatics Group
Roslin Institute
Roslin EH25 9PS
Scotland, UK
tel: +44 131 527 4198
skype: aerts_ri

----...and the obligatory disclaimer----
Roslin Institute is a company limited by guarantee, registered in
Scotland (registered number SC157100) and a Scottish Charity (registered
number SC023592). Our registered office is at Roslin, Midlothian, EH25
9PS. VAT registration number 847380013.

The information contained in this e-mail (including any attachments) is
confidential and is intended for the use of the addressee only. The
opinions expressed within this e-mail (including any attachments) are
the opinions of the sender and do not necessarily constitute those of
Roslin Institute (Edinburgh) ("the Institute") unless specifically
stated by a sender who is duly authorised to do so on behalf of the
Institute.

From ktemme at gmail.com Thu Aug 16 23:54:26 2007
From: ktemme at gmail.com (Karsten Temme)
Date: Thu, 16 Aug 2007 20:54:26 -0700
Subject: [BioRuby] Parse string containing GenBenk file?
Message-ID: <67c2cfda0708162054v6295dc25x881db2a50a51a6ec@mail.gmail.com>

Can anyone point me to a method for creating a Bio object from a
string that contains a Genbank file?

For example, I have a Genbank file and its contents have been read
into a single string. I am unable to use Bio::FlatFile.auto(string)
to create a Bio object.

Thanks for the help,
Karsten

From ngoto at gen-info.osaka-u.ac.jp Fri Aug 17 00:30:27 2007
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa Goto)
Date: Fri, 17 Aug 2007 13:30:27 +0900
Subject: [BioRuby] Parse string containing GenBenk file?
In-Reply-To: <67c2cfda0708162054v6295dc25x881db2a50a51a6ec@mail.gmail.com>
References: <67c2cfda0708162054v6295dc25x881db2a50a51a6ec@mail.gmail.com>
Message-ID: <20070817131318.6647.NGOTO@gen-info.osaka-u.ac.jp>

Hi,

If you have a string with single GenBank entry,
you can use Bio::GenBank.new(string).

For example,

require "bio"

str = <<__EOF__
LOCUS       A00001                   335 bp    DNA     linear   PAT 11-MAY-2001
DEFINITION  Cauliflower mosaic virus satellite cDNA.
ACCESSION   A00001
VERSION     A00001.1  GI:58418
KEYWORDS    .
SOURCE      Cauliflower mosaic virus
  ORGANISM  Cauliflower mosaic virus
            Viruses; Retro-transcribing viruses; Caulimoviridae; Caulimovirus.
REFERENCE   1  (bases 1 to 335)
  AUTHORS   Baulcombe,D.C., Mayo,M.A., Harrison,B.D. and Bevan,M.W.
  TITLE     Modification of plant viruses or their effects
  JOURNAL   Patent: EP 0242016-A 1 21-OCT-1987;
            AGRICULTURAL GENETICS COMPANY LIMITED
FEATURES             Location/Qualifiers
     source          1..335
                     /organism="Cauliflower mosaic virus"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:10641"
     misc_feature    1..335
                     /note="satellite DNA"
ORIGIN
        1 gttttgtttg atggagaatt gcgcagaggg gttatatctg cgtgaggatc tgtcactcgg
       61 cggtgtggga tacctccctg ctaaggcggg ttgagtgatg ttccctcgga ctggggaccg
      121 ctggcttgcg agctatgtcc gctactctca gtactacact ctcatttgag cccccgctca
      181 gtttgctagc agaacccggc acatggttcg ccgataccat ggaatttcga aagaaacact
      241 ctgttaggtg gtatgagtca tgacgcacgc agggagaggc taaggcttat gctatgctga
      301 tctccgtgaa tgtctatcat tcctacacag gaccc
//
__EOF__

gb = Bio::GenBank.new(str)
p gb

If you have a string with multiple entries, or the string
might be other than GenBank (e.g. EMBL.), using StringIO
is better.

require 'stringio'
Bio::FlatFile.auto(StringIO.new(string))

Regards,

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

> Can anyone point me to a method for creating a Bio object from a
> string that contains a Genbank file?
>
> For example, I have a Genbank file and its contents have been read
> into a single string. I am unable to use Bio::FlatFile.auto(string)
> to create a Bio object.
>
> Thanks for the help,
> Karsten
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From raoul.bonnal at itb.cnr.it Wed Aug 29 05:21:09 2007
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Wed, 29 Aug 2007 11:21:09 +0200
Subject: [BioRuby] Blast Format8 hsps' patch
Message-ID: <1188379270.9964.6.camel@localhost>

Hi Guys,

I found a bug in the parser.
Take this example:

contig00002 gi|15902044|ref|NC_003098.1| 99.71 1399 4 0 5 1403 414678 416076 0.0 2742
contig00002 gi|118090026|ref|NC_003028.2| 98.25 858 5 1 556 1403 448891 449748 0.0 1592
contig00003 gi|116515308|ref|NC_008533.1| 99.67 2997 7 2 1 2994 423818 426814 0.0 5848
contig00003 gi|15902044|ref|NC_003098.1| 99.67 2997 7 2 1 2994 416288 419284 0.0 5848
contig00003 gi|118090026|ref|NC_003028.2| 99.60 2997 9 2 1 2994 449959 452955 0.0 5832
contig00004 gi|118090026|ref|NC_003028.2| 98.08 2238 40 3 5 2242 453000 455234 0.0 4072
contig00004 gi|116515308|ref|NC_008533.1| 97.94 2238 43 3 5 2242 426859 429093 0.0 4048
contig00004 gi|15902044|ref|NC_003098.1| 97.94 2238 43 3 5 2242 419329 421563 0.0 4048

For the last contig00003 row, the parser creates a hit with 2 HSPs,
putting together the results from the last contig00003 row and the first
contig00004 row, which is wrong.

The code only checks whether the target differs from the previous one;
in this case the target is the same but the query is different.

The attached patch solves the problem: I added a check on the query too.
I don't know whether the problem is also present in the other parsers.

Best regards.

--
Ra