From F.Schwach at uea.ac.uk Mon Dec 10 12:21:31 2007
From: F.Schwach at uea.ac.uk (Schwach Frank Dr (CMP))
Date: Mon, 10 Dec 2007 17:21:31 -0000
Subject: [BioRuby] using Bio::FlatFileIndex
Message-ID:

Hi,

I need to retrieve sequences from fasta files. In Perl I used to do this with Bio::DB::Fasta, but at first I couldn't find an equivalent in BioRuby and was about to give up and use Perl for this purpose when I found Bio::FlatFileIndex.
Unfortunately, this class is not very well documented (unless I missed something). I think I can more or less figure out most of it from the code and the comments in the rdoc (http://bioruby.org/rdoc/classes/Bio/FlatFileIndex.html), but it would really be great to have some examples from people who are more familiar with this class, especially since I am still relatively new to Ruby.

What I want to do is simply:

1) Build an index for a directory containing a few fasta files
2) In a Rails App (or any other Ruby script): retrieve sequences by their accessions and update the index if the fasta db is updated by the user.

Some of the questions I have are:
What are the options that I can pass to the makeindex method?
In Bioperl it is possible to retrieve a subsequence straight away like this:

my $seq_db_obj = Bio::DB::Fasta->new($path_to_db);
my $seq = $seq_db_obj->seq($accession, $start, $end) ; # retrieve (sub)sequence from the database

Can I do this in Ruby too or would I retrieve the entire sequence and then get the subsequence from that?

Any help and examples welcome!
Thanks a lot!

From jan.aerts at bbsrc.ac.uk Mon Dec 10 15:43:24 2007
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 10 Dec 2007 20:43:24 -0000
Subject: [BioRuby] rcov
Message-ID: <1F16910BB8546C4DA5526FABB0C98D09AA9A49@ebre2ksrv1.ebrc.bbsrc.ac.uk>

Just had a look at the test coverage for bioruby at http://swdev.cbri.umn.edu/rcov-bioruby20070405/
In case we've got time to spare: it would be good to get the coverage up... Just to remind everyone :-)

jan.

From ngoto at gen-info.osaka-u.ac.jp Tue Dec 11 09:59:52 2007
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Tue, 11 Dec 2007 23:59:52 +0900
Subject: [BioRuby] using Bio::FlatFileIndex
In-Reply-To:
References:
Message-ID: <20071211145953.30AF51CBC411@idnmail.gen-info.osaka-u.ac.jp>

Hi,

Indexes can be generated with the command-line application br_bioflat.rb or from within a Ruby script.

Example: create an index from the command line:

% br_bioflat.rb --create --type flat --location /home/xx/dbidx \
    --dbname test --files /home/xx/test01.fst /home/xx/test02.fst

equivalent ruby script:

require 'bio'
is_bdb = nil
# is_bdb = Bio::FlatFileIndex::MAGIC_BDB for BDB index
dbname = '/home/xx/dbidx/test'
format = nil # file format is automatically determined
options = {}
files = ['/home/xx/test01.fst', '/home/xx/test02.fst' ]
Bio::FlatFileIndex.makeindex(is_bdb, dbname, format, options, *files)

As Bio::FlatFileIndex was first written in 2002 and is very old, the API is ugly. In addition, its internal structure is too complicated. It may be rewritten and the API might be changed in the future.

Add files to the index:

% br_bioflat.rb --update --location /home/xx/dbidx \
    --dbname test --files /home/xx/test03.fst /home/xx/test04.fst

equivalent ruby script:

require 'bio'
dbname = '/home/xx/dbidx/test'
options = {}
files = ['/home/xx/test03.fst', '/home/xx/test04.fst' ]
Bio::FlatFileIndex::update_index(dbname, nil, options, *files)

Re-read all files and re-generate the index:

% br_bioflat.rb --update --location /home/xx/dbidx \
    --dbname test --renew

equivalent ruby script:

require 'bio'
dbname = '/home/xx/dbidx/test'
options = {}
options['renew'] = true
Bio::FlatFileIndex::update_index(dbname, nil, options, [])

Note that adding files to or updating the flat database (without BDB) is very slow because it actually rebuilds the indexes again.

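To make the idea of an index over fasta files more concrete, here is a minimal sketch in plain Ruby. It is not how Bio::FlatFileIndex stores its data and it does not use the BioRuby API; the helper names (build_offset_index, fetch_record, subsequence) are hypothetical and exist only for this illustration. The 1-based, inclusive coordinates are an assumption chosen to mirror the Perl call quoted in this thread.

```ruby
require 'tempfile'

# Hypothetical helper (not part of BioRuby): scan a FASTA file once and
# record, for every ID, the byte offset and byte length of its record.
def build_offset_index(path)
  index = {}
  current_id = nil
  start = 0
  File.open(path, 'rb') do |f|
    until f.eof?
      pos = f.pos
      line = f.readline
      next unless line.start_with?('>')
      # A new header closes the previous record.
      index[current_id] = [start, pos - start] if current_id
      current_id = line[1..-1].split.first
      start = pos
    end
    # Close the last record at end of file.
    index[current_id] = [start, f.pos - start] if current_id
  end
  index
end

# Hypothetical helper: read one raw FASTA record by seeking to its offset.
def fetch_record(path, index, id)
  offset, length = index.fetch(id)
  File.open(path, 'rb') do |f|
    f.seek(offset)
    f.read(length)
  end
end

# Hypothetical helper: subsequence with 1-based, inclusive coordinates.
def subsequence(record, first, last)
  sequence = record.lines.drop(1).map(&:strip).join
  sequence[(first - 1)..(last - 1)]
end

# Small demonstration on a temporary two-entry FASTA file.
Tempfile.create(['demo', '.fst']) do |tmp|
  tmp.write(">M12963 demo entry\nACGTACGT\nTTGA\n>X00001 another\nGGCC\n")
  tmp.flush
  index = build_offset_index(tmp.path)
  record = fetch_record(tmp.path, index, 'M12963')
  puts subsequence(record, 3, 6) # prints GTAC
end
```

The sketch still reads the whole record before slicing, which matches the "retrieve the entry, then take the subsequence" approach rather than a direct subsequence lookup.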
Retrieving sequences in the index:

% br_bioflat.rb --location /home/xx/dbidx --dbname test M12963

equivalent ruby script:

require 'bio'
dbname = '/home/xx/dbidx/test'
key = 'M12963'
idx = Bio::FlatFileIndex.open(dbname)
results = idx.search(key)
results.each do |str|
  print str
end
idx.close

'results' is a Bio::FlatFileIndex::Results object. Each search result is a string.
(For more information, please see RDoc
http://bioruby.org/rdoc/classes/Bio/FlatFileIndex/Results.html )

If you want a subsequence of fasta formatted data, for example,

require 'bio'
dbname = '/home/xx/dbidx/test'
key = 'M12963'
idx = Bio::FlatFileIndex.open(dbname)
result = idx.search(key)
result.each do |str|
  ent = Bio::FastaFormat.new(str)
  # for nucleic acid sequence
  puts ent.naseq[0..100]
  # for amino acid sequence
  puts ent.aaseq[0..100]
  # nucleic or amino acid sequence
  puts ent.seq[0..100]
end
idx.close

Please see the OBDA flat file indexing specifications for the philosophy and internal structure of the index.
http://code.open-bio.org/cgi/viewcvs.cgi/obda-specs/flatfile/?cvsroot=obf-common

Thanks,

Naohisa Goto
ng at bioruby.org / ngoto at gen-info.osaka-u.ac.jp

On Mon, 10 Dec 2007 17:21:31 -0000
"Schwach Frank Dr \(CMP\)" wrote:

>
> Hi,
>
> I need to retrieve sequences from fasta files. In Perl I used to do this with Bio::DB::Fasta but at first I couldn't find an equivalent in Bioruby and was almost about to give up and use Perl for this purpose when I found Bio::FlatFileIndex.
> Unfortunately, this class is not very well documented (unless I missed something). I think I can more or less figure out most of it from the code and the comments in the rdoc (http://bioruby.org/rdoc/classes/Bio/FlatFileIndex.html) but it would really be great to have some examples from people who are more familiar with this class, especially since I am relatively new to Ruby still.
>
> What I want to do is simply:
>
> 1) Build an index for a directory containing a few fasta files
> 2) In a Rails App (or any other Ruby script): retrieve sequences by their accessions and update the index if the fasta db is updated by the user.
>
> Some of the questions I have are:
> What are the options that I can pass to the makeindex method?
> In Bioperl it is possible to retrieve a subsequence straight away like this:
>
> my $seq_db_obj = Bio::DB::Fasta->new($path_to_db);
> my $seq = $seq_db_obj->seq($accession, $start, $end) ; # retrieve (sub)sequence from the database
>
> Can I do this in Ruby too or would I retrieve the entire sequence and then get the subsequence from that?
>
> Any help and examples welcome!
> Thanks a lot!
>
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From yjchenx at gmail.com Wed Dec 12 20:54:04 2007
From: yjchenx at gmail.com (Yen-Ju Chen)
Date: Wed, 12 Dec 2007 17:54:04 -0800
Subject: [BioRuby] Parse big PDB use up all memory
Message-ID:

This is what I did:

require 'bio'
serv = Bio::Fetch.new()
entry = serv.fetch('pdb', '1w6k')
pdb = Bio::PDB.new(entry)

The last step use up all memory and quit.
The pdb file is quite big and I only need the information from header.
Is it possible to do something like this ?

pdb = Bio::PDB.new(entry[0-40000])

Thanx for the help

From yjchenx at gmail.com Wed Dec 12 23:50:29 2007
From: yjchenx at gmail.com (Yen-Ju Chen)
Date: Wed, 12 Dec 2007 20:50:29 -0800
Subject: [BioRuby] Parse big PDB use up all memory
In-Reply-To: <16683AAA-7D69-4D8A-9B3D-A878DA98E727@kuicr.kyoto-u.ac.jp>
References: <16683AAA-7D69-4D8A-9B3D-A878DA98E727@kuicr.kyoto-u.ac.jp>
Message-ID:

Thank you for the hint for retrieve only header.

I am using the default Ruby on Mac OS X 10.5.
Here is the output of 'ruby -v'

ruby 1.8.6 (2007-06-07 patchlevel 36) [universal-darwin9.0]

And bioruby is 1.1.0 from gems.

I will test it on Linux and see.

Yen-Ju

From yjchenx at gmail.com Thu Dec 13 00:22:36 2007
From: yjchenx at gmail.com (Yen-Ju Chen)
Date: Wed, 12 Dec 2007 21:22:36 -0800
Subject: [BioRuby] Detect error from Bio::Fetch
Message-ID:

This is the script I run:

require 'bio'
serv = Bio::Fetch.new()
entry = serv.fetch('swissprot', 'not_existing_id')
swissprot = Bio::SwissProt.new(entry)
p swissprot.entry_name # <== Error raises here

The problem is that Bio::Fetch does not raise an exception or otherwise signal that it cannot find the entry in the database.
An error shows up only at 'swissprot.entry_name'.
It would be nice to detect the error early on, either in Bio::Fetch.fetch() or Bio::SwissProt.new().

Yen-Ju

From alexg at kuicr.kyoto-u.ac.jp Thu Dec 13 00:22:59 2007
From: alexg at kuicr.kyoto-u.ac.jp (Alex Gutteridge)
Date: Thu, 13 Dec 2007 14:22:59 +0900
Subject: [BioRuby] Parse big PDB use up all memory
In-Reply-To:
References: <16683AAA-7D69-4D8A-9B3D-A878DA98E727@kuicr.kyoto-u.ac.jp>
Message-ID: <20495B39-57E6-46C4-87AF-24B041CBA54D@kuicr.kyoto-u.ac.jp>

Yup, I see the same behavior on linux and osx. Bio::PDB.new kills irb but runs fine in a script. Thanks for the bug report. I'll see if I can identify what's going on.

AlexG

On 13 Dec 2007, at 14:11, Yen-Ju Chen wrote:

> I did a quick test and found the problem is that I ran it in irb.
> If I run it in script, like 'ruby test.rb', then it works fine.
>
> Yen-Ju

Alex Gutteridge

Bioinformatics Center
Kyoto University

From yjchenx at gmail.com Thu Dec 13 00:11:33 2007
From: yjchenx at gmail.com (Yen-Ju Chen)
Date: Wed, 12 Dec 2007 21:11:33 -0800
Subject: [BioRuby] Parse big PDB use up all memory
In-Reply-To:
References: <16683AAA-7D69-4D8A-9B3D-A878DA98E727@kuicr.kyoto-u.ac.jp>
Message-ID:

I did a quick test and found the problem is that I ran it in irb.
If I run it in script, like 'ruby test.rb', then it works fine.

Yen-Ju

From alexg at kuicr.kyoto-u.ac.jp Wed Dec 12 22:49:04 2007
From: alexg at kuicr.kyoto-u.ac.jp (Alex Gutteridge)
Date: Thu, 13 Dec 2007 12:49:04 +0900
Subject: [BioRuby] Parse big PDB use up all memory
In-Reply-To:
References:
Message-ID: <16683AAA-7D69-4D8A-9B3D-A878DA98E727@kuicr.kyoto-u.ac.jp>

Hi,

Could you give some more details on what system and ruby/bioruby version you are running? The same script uses less than 20MB on my machine (ruby 1.8.6 / bioruby 1.1.0 / ubuntu linux), which doesn't seem so bad. Also 1w6k is biggish, but there are certainly bigger PDB files out there so if you're having trouble with this one then others will certainly be a problem.

In answer to your second question, yes you should be able to just extract the header (everything up to the ATOM records). But if you're really running out of memory just parsing that file then I suspect you have deeper issues. Anyway, the sample below works for me for parsing the header from 1w6k:

require 'bio'

serv = Bio::Fetch.new
entry = serv.fetch('pdb','1w6k')

# collect every line before the first ATOM record
header = ''
entry.each_line do |l|
  break if l.match(/^ATOM/)
  header << l
end

pdb = Bio::PDB.new(header)
p pdb.accession

Alex Gutteridge

Bioinformatics Center
Kyoto University

From odo at mac.com Thu Dec 13 03:23:48 2007
From: odo at mac.com (Florian Odronitz)
Date: Thu, 13 Dec 2007 09:23:48 +0100
Subject: [BioRuby] Proton Nomenclature in PDB
In-Reply-To:
References:
Message-ID: <3A227C17-9C34-42BF-80C6-B96467573291@mac.com>

Hi,

I am using Bio::PDB in my NMR-related software project. I ran into a problem with the naming of protons generated by PyMol and MolMol, and wrote a method to rename the protons according to BMRB nomenclature (http://www.bmrb.wisc.edu/ref_info/statsel.htm).
If anyone thinks this could be useful to others, I would like to contribute it to BioRuby. Or is it too specific? Maybe I could do it in a more general way, since it also involves things like bonding which are, to my understanding, not implemented yet.
Who would be the right person to talk to?

Thanks,
Florian

From ktym at hgc.jp Fri Dec 14 12:20:34 2007
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sat, 15 Dec 2007 02:20:34 +0900
Subject: [BioRuby] BioRuby 1.2.0 is released
Message-ID:

Hi all,

I just released the BioRuby 1.2.0 at http://bioruby.org/archive/bioruby-1.2.0.tar.gz

http://bioruby.org/
http://bioruby.org/rdoc/
http://rubyforge.org/projects/bioruby/
http://raa.ruby-lang.org/project/bioruby/

I also put the RubyGems package at RubyForge as always.

% sudo gem update bio

Here is a brief summary of updates snipped from the ChangeLog file.

* BioRuby 1.2.0 released

  * BioRuby shell is improved
    * file save functionality is fixed
    * deprecated require_gem is changed to gem to suppress warnings
    * deprecated end_form_tag is rewritten to suppress warnings
    * images for Rails shell are separated to the bioruby directory
    * spinner is shown during the evaluation
    * background image in the textarea is removed for the visibility
  * Bio::Blast is fixed to parse -m 8 formatted result correctly
  * Bio::PubMed is rewritten to enhance its functionality
    * e.g. 'rettype' => 'count' and 'retmode' => 'xml' are available
  * Bio::FlatFile is improved to accept recent MEDLINE format
  * Bio::KEGG::COMPOUND is enhanced to utilize REMARK field
  * Bio::KEGG::API is fixed to skip filter when the value is Fixnum
  * A number of minor bug fixes

Hope you enjoy.

Regards,
Toshiaki Katayama
--
Human Genome Center, Institute of Medical Science, University of Tokyo
4-6-1 Shirokanedai, Minato-ku, Tokyo 108-0071, Japan
tel://+81-3-5449-5614
fax://+81-3-5449-5434
http://www.hgc.jp/ (Human Genome Center)
http://bioruby.org/ (BioRuby project)
http://das.hgc.jp/ (KEGG DAS)
http://www.genome.jp/kegg/soap/ (KEGG API)

From raoul.bonnal at itb.cnr.it Fri Dec 14 08:50:30 2007
From: raoul.bonnal at itb.cnr.it (Raoul Jean Pierre Bonnal)
Date: Fri, 14 Dec 2007 14:50:30 +0100
Subject: [BioRuby] FlatFile loading genbank, the last entry is a fake
In-Reply-To:
References:
Message-ID: <1197640230.10347.15.camel@Graco>

Downloading the genbank file for AJ561198 from ncbi and loading it with

data=Bio::FlatFile.auto("AJ561198.gb")
data.each_entry do |entry|
  puts entry.entry_id
end

You get

AJ561198
nil

I think the parser sees the "\n" at the end of the genbank file (after "//\n") and thinks there is another entry, but that is wrong.
Deleting the last line works.

--
Ra

From ktym at hgc.jp Fri Dec 14 17:31:11 2007
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sat, 15 Dec 2007 07:31:11 +0900
Subject: [BioRuby] Fwd: BioRuby 1.2.0 is released
References: <1F16910BB8546C4DA5526FABB0C98D09AA9A53@ebre2ksrv1.ebrc.bbsrc.ac.uk>
Message-ID: <2D2BADE4-A31A-4356-9820-FC700AEE903C@hgc.jp>

Hi all,

Does anybody have the same problem on Linux/Windows?

Toshiaki

Begin forwarded message:

> From: "jan aerts (RI)"
> Date: 2007-12-15 5:50:42 JST
> To: "Toshiaki Katayama"
> Cc:
> Subject: RE: [BioRuby] BioRuby 1.2.0 is released
>
> Ubuntu 7.10 (Gutsy Gibbon).
> ruby 1.8.6
> soap4r 1.5.5-1 (apt-get package)
>
> j.
>
>
> -----Original Message-----
> From: Toshiaki Katayama [mailto:ktym at hgc.jp]
> Sent: Fri 14/12/2007 18:42
> To: jan aerts (RI)
> Cc: n at bioruby.org
> Subject: Re: [BioRuby] BioRuby 1.2.0 is released
>
> Jan,
>
> In my environment (OS X Leopard), I have no errors on all tests in BioRuby 1.2.0 with Ruby 1.8.6
> What kind of environment do you use?
>
> Regards,
> Toshiaki
>
> On 2007/12/15, at 3:28, jan aerts (RI) wrote:
>
>> Thanks T.
>>
>> Good to see a new release is out.
>>
>> I noticed that the test/functional/bio/io/test_soapwsdl.rb test returned errors. All 4 tests in that testfile give the following error:
>>
>> NoMethodError: undefined method `location=' for nil:NilClass
>>     /usr/lib/ruby/1.8/wsdl/xmlSchema/importer.rb:31:in `import'
>>     /usr/lib/ruby/1.8/wsdl/importer.rb:18:in `import'
>>     /usr/lib/ruby/1.8/soap/wsdlDriver.rb:124:in `import'
>>     /usr/lib/ruby/1.8/soap/wsdlDriver.rb:28:in `initialize'
>>     ../../../../lib/bio/io/soapwsdl.rb:63:in `new'
>>     ../../../../lib/bio/io/soapwsdl.rb:63:in `create_driver'
>>     ../../../../lib/bio/io/soapwsdl.rb:57:in `initialize'
>>     ./test_soapwsdl.rb:25:in `new'
>>     ./test_soapwsdl.rb:25:in `setup'
>>
>> jan.

From ngoto at gen-info.osaka-u.ac.jp Tue Dec 18 08:55:57 2007
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Tue, 18 Dec 2007 22:55:57 +0900
Subject: [BioRuby] Parse big PDB use up all memory
In-Reply-To: <20495B39-57E6-46C4-87AF-24B041CBA54D@kuicr.kyoto-u.ac.jp>
References: <16683AAA-7D69-4D8A-9B3D-A878DA98E727@kuicr.kyoto-u.ac.jp> <20495B39-57E6-46C4-87AF-24B041CBA54D@kuicr.kyoto-u.ac.jp>
Message-ID: <20071218135558.4880D1CBC43F@idnmail.gen-info.osaka-u.ac.jp>

Hi,

Objects inside Bio::PDB often refer to other objects in the same Bio::PDB object, and this might cause infinite recursion in Bio::PDB#inspect.

Defining a customized Bio::PDB#inspect seems to prevent the memory exhaustion problem.

class Bio::PDB
  # returns a string containing human-readable representation
  # of this object.
  def inspect
    "#<#{self.class.to_s} entry_id=#{entry_id.inspect}>"
  end
end

I also defined Bio::PDB::(Model|Chain|Residue)#inspect like above, and committed them into CVS.

Naohisa Goto
ng at bioruby.org / ngoto at gen-info.osaka-u.ac.jp

On Thu, 13 Dec 2007 14:22:59 +0900
Alex Gutteridge wrote:

> Yup, I see the same behavior on linux and osx. Bio::PDB.new kills irb
> but runs fine in a script. Thanks for the bug report. I'll see if I
> can identify what's going on.
>
> AlexG

From yjchen at reciprocallattice.com Tue Dec 18 16:54:34 2007
From: yjchen at reciprocallattice.com (Yen-Ju Chen)
Date: Tue, 18 Dec 2007 13:54:34 -0800
Subject: [BioRuby] A Rails application with BioRuby
Message-ID:

Hi,

I am working on a rails application using BioRuby to collect references and database entries.
You can find the application (not source code yet) at journalclub.reciprocallattice.com

It is still at early stage.
I use it personally and figure it would be interesting to have more users. If you want to join, please write to me in private so that it will not pollute BioRuby maillist. I don't know how many users the application can take. Please see the website for more details. These are things related to BioRuby, * The output from Reference to BibTex format lacks abstract. * It would be nice to be able to output to RIS format for EndNote and ReferenceManager. * Is it possible to get DOI from PubMed ? * BioRuby can get information from many databases through biofetch, but not processing them, like Pfam, Prosite, etc. * it is not clear what's the database from biofetch, for example: rn, rp, str, pr. I am in structural biology. Many of these abbreviation is not obvious. If I have chance to write codes for these missing features, I will submit them back to BioRuby. Have fun. Yen-Ju From sgujja at broad.mit.edu Wed Dec 19 11:03:24 2007 From: sgujja at broad.mit.edu (Sharvari Gujja) Date: Wed, 19 Dec 2007 11:03:24 -0500 Subject: [BioRuby] how to retrieve a genbank record by GI Message-ID: <476940CC.6000803@broad.mit.edu> Hi all, I am new to Ruby and Bioruby and am amazed at how simple and yet powerful is it. I am trying to access a genbank record (NCBI) by GI number. I have tried Bio::Fetch, Bio::Registry but none seems to work. Any help is appreciated. Thanks -S From robert.citek at gmail.com Wed Dec 19 14:39:01 2007 From: robert.citek at gmail.com (Robert Citek) Date: Wed, 19 Dec 2007 13:39:01 -0600 Subject: [BioRuby] how to retrieve a genbank record by GI In-Reply-To: <476940CC.6000803@broad.mit.edu> References: <476940CC.6000803@broad.mit.edu> Message-ID: <4145b6790712191139o2fa6c37er6331fa38def372d9@mail.gmail.com> On Dec 19, 2007 10:03 AM, Sharvari Gujja wrote: > I am new to Ruby and Bioruby and am amazed at how simple and yet > powerful is it. > > I am trying to access a genbank record (NCBI) by GI number. I have tried > Bio::Fetch, Bio::Registry but none seems to work. 
Can you give an example of what you've tried? Also, on what system are you running bioruby on, e.g. Windows XP, Cygwin in Windows, Ubuntu Linux, Mac OS X, Solaris? What version of bioruby? Regards, - Robert From robert.citek at gmail.com Wed Dec 19 15:46:07 2007 From: robert.citek at gmail.com (Robert Citek) Date: Wed, 19 Dec 2007 14:46:07 -0600 Subject: [BioRuby] how to retrieve a genbank record by GI In-Reply-To: <4769756B.3080406@broad.mit.edu> References: <476940CC.6000803@broad.mit.edu> <4145b6790712191139o2fa6c37er6331fa38def372d9@mail.gmail.com> <4769756B.3080406@broad.mit.edu> Message-ID: <4145b6790712191246i2abd5252q11f702f116a76115@mail.gmail.com> On Dec 19, 2007 1:47 PM, Sharvari Gujja wrote: > Robert Citek wrote: > > Can you give an example of what you've tried? Also, on what system > > are you running bioruby on, e.g. Windows XP, Cygwin in Windows, Ubuntu > > Linux, Mac OS X, Solaris? What version of bioruby? > > I have tried: > > reg = Bio::Registry.new > serv = reg.get_database('genbank') > puts serv.get_by_id('J00231') > > > puts Bio::Fetch.query('genbank','185041') > > server = Bio::Fetch.new() > #server = Bio::Fetch.new('http://www.ebi.ac.uk/cgi-bin/dbfetch') > puts server.fetch('genbank','J00231','html') > > entry = Bio::DBGET.bget("AF139016") > > gb = Bio::GenBank.new(Bio::Fetch.query('gb', 'J00231')) > puts gb.read > > And running on Windows XP. 
> Ruby 1.8.6

I also get errors:

$ ruby -rbio -e 'reg = Bio::Registry.new'
/usr/lib/ruby/1.8/net/http.rb:560:in `initialize': No route to host - connect(2) (Errno::EHOSTUNREACH)
        from /usr/lib/ruby/1.8/net/http.rb:560:in `open'
        from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
        from /usr/lib/ruby/1.8/timeout.rb:48:in `timeout'
        from /usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
        from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
        from /usr/lib/ruby/1.8/net/http.rb:553:in `do_start'
        from /usr/lib/ruby/1.8/net/http.rb:542:in `start'
        from /usr/lib/ruby/1.8/net/http.rb:440:in `start'
        from /usr/lib/ruby/1.8/bio/io/registry.rb:190:in `read_remote'
        from /usr/lib/ruby/1.8/bio/io/registry.rb:126:in `initialize'
        from -e:1:in `new'
        from -e:1

$ ruby -v
ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux]

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 7.10
Release:        7.10
Codename:       gutsy

Unfortunately, I don't know how to display what version of bioruby I'm
using. I guess I'm too new to ruby, let alone bioruby, to be of any
help. Anyone have a working example? Unfortunately, my connection to
bioruby.org doesn't work (I suspect our 'Net connection is snafu'ed).

Regards,
- Robert

From ktym at hgc.jp Thu Dec 20 02:41:12 2007
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu, 20 Dec 2007 16:41:12 +0900
Subject: [BioRuby] A Rails application with BioRuby
In-Reply-To:
References:
Message-ID:

Hi Yen-Ju,

On 2007/12/19, at 6:54, Yen-Ju Chen wrote:

> Hi,
> I am working on a rails application using BioRuby to collect references
> and database entries.
> You can find the application (not source code yet) at
> journalclub.reciprocallattice.com

Cool.

> It is still at early stage. I use it personally and figure it would be
> interesting to have more users.
> If you want to join, please write to me in private so that it will not
> pollute the BioRuby mailing list.
> I don't know how many users the application can take. Please see the
> website for more details.
>
> These are things related to BioRuby:
> * The output from Reference to BibTeX format lacks the abstract.
> * It would be nice to be able to output to RIS format for EndNote and
> ReferenceManager.

If you could provide a patch for them, I'll include it in BioRuby.

> * Is it possible to get the DOI from PubMed?

  entry = Bio::PubMed.query(16946072)
  doi = entry[/AID - (\S+) \[doi\]/, 1]

or you can extend the Bio::MEDLINE class to add the doi method

  class Bio::MEDLINE
    attr_reader :pubmed
    def doi
      @pubmed['AID'][/(\S+) \[doi\]/, 1]
    end
  end

  entry = Bio::PubMed.query(16946072)
  medline = Bio::MEDLINE.new(entry)
  doi = medline.doi

or utilize the XML format of the PubMed output

  entry_xml = Bio::PubMed.efetch(16946072, {"retmode" => "xml"})

        :
    <ArticleIdList>
      <ArticleId IdType="pii">313/5791/1295</ArticleId>
      <ArticleId IdType="doi">10.1126/science.1131542</ArticleId>
      <ArticleId IdType="pubmed">16946072</ArticleId>
    </ArticleIdList>
        :

then extract the DOI

  require 'rexml/document'
  pubmed = REXML::Document.new(entry_xml)
  doi = pubmed.elements['//ArticleId[@IdType="doi"]'].get_text

> * BioRuby can get information from many databases through biofetch,
> but does not process them, like Pfam, Prosite, etc.

You can process them with the corresponding classes. For example,

  cyclins = Bio::Fetch.query('prosite', 'PS00292')
  prosite = Bio::PROSITE.new(cyclins)
  prosite.entry_id    # ==> "PS00292"
  prosite.definition  # ==> "Cyclins signature."
  prosite.pattern     # ==> "R-x(2)-[LIVMSA]-x(2)-[FYWS]-[LIVM]-x(8)-[LIVMFC]-x(4)-[LIVMFYA]-x(2)-[STAGC]-[LIVMFYQ]-x-[LIVMFYC]-[LIVMFY]-D-[RKH]-[LIVMFYW]."
  prosite.re          # ==> /R.{2}[LIVMSA].{2}[FYWS][LIVM].{8}[LIVMFC].{4}[LIVMFYA].{2}[STAGC][LIVMFYQ].[LIVMFYC][LIVMFY]D[RKH][LIVMFYW]/i

> * It is not clear what the databases from biofetch are, for example: rn, rp,
> str, pr.
> I am in structural biology. Many of these abbreviations are not obvious.

In BioRuby, the default BioFetch server is implemented as a proxy for the
DBGET system through KEGG API.
So, please refer to the abbreviation field in the DBGET manual at

  http://www.genome.jp/dbget/

and also note that the DBGET service for the GenBank (gb) database is no
longer available.

Regards,
Toshiaki Katayama

From ktym at hgc.jp Thu Dec 20 03:29:48 2007
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu, 20 Dec 2007 17:29:48 +0900
Subject: [BioRuby] how to retrieve a genbank record by GI
In-Reply-To: <4145b6790712191246i2abd5252q11f702f116a76115@mail.gmail.com>
References: <476940CC.6000803@broad.mit.edu> <4145b6790712191139o2fa6c37er6331fa38def372d9@mail.gmail.com> <4769756B.3080406@broad.mit.edu> <4145b6790712191246i2abd5252q11f702f116a76115@mail.gmail.com>
Message-ID:

Hi Gujja,

On 2007/12/20, at 5:46, Robert Citek wrote:
> On Dec 19, 2007 1:47 PM, Sharvari Gujja wrote:
>> Robert Citek wrote:
>>> Can you give an example of what you've tried? Also, on what system
>>> are you running bioruby, e.g. Windows XP, Cygwin in Windows, Ubuntu
>>> Linux, Mac OS X, Solaris? What version of bioruby?
>>
>> I have tried:
>>
>> reg = Bio::Registry.new
>> serv = reg.get_database('genbank')
>> puts serv.get_by_id('J00231')

Did you set up your "seqdatabase.ini" file as described in the README file?

  http://code.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioruby/README?rev=1.17&cvsroot=bioruby

Otherwise, the 'genbank' database is not supported by OBDA (Bio::Registry)
by default.

However, there is another problem. In BioRuby's default configuration
file, 'genbank' refers to the BioFetch server at bioruby.org and, as I wrote
in the separate mail, the current BioFetch server will no longer support
the GenBank database.

  [genbank]
  protocol=biofetch
  location=http://bioruby.org/cgi-bin/biofetch.rb
  dbname=genbank

Thus, the above configuration is already no longer valid...
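
For readers who want to check which server a database name points to, here is a minimal sketch that reads a stanza like the [genbank] entry above into a Hash using only plain Ruby. It is not how Bio::Registry itself parses seqdatabase.ini; parse_stanzas and the CONFIG string are hypothetical names introduced for illustration, and the stanza is simply the one quoted in this message.

```ruby
# Hypothetical helper, not part of BioRuby: reads INI-style stanzas
# such as the [genbank] entry quoted above into a nested Hash.
def parse_stanzas(text)
  stanzas = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    # Skip blank lines and comment lines.
    next if line.empty? || line.start_with?('#')
    if line =~ /\A\[(.+)\]\z/
      # A "[name]" line opens a new stanza.
      current = stanzas[$1] = {}
    elsif current && line.include?('=')
      # "key=value" lines belong to the stanza opened most recently;
      # lines before the first stanza are ignored.
      key, value = line.split('=', 2)
      current[key.strip] = value.strip
    end
  end
  stanzas
end

# The stanza from the message, copied here as an assumption about the
# file contents.
CONFIG = <<~INI
  [genbank]
  protocol=biofetch
  location=http://bioruby.org/cgi-bin/biofetch.rb
  dbname=genbank
INI

db = parse_stanzas(CONFIG)
puts db['genbank']['protocol']
puts db['genbank']['location']
```

Printing the location this way makes it easy to see that the 'genbank' name resolves to the bioruby.org BioFetch server, which is the one that no longer serves GenBank.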
>> puts Bio::Fetch.query('genbank','185041')
>>
>> server = Bio::Fetch.new()
>> #server = Bio::Fetch.new('http://www.ebi.ac.uk/cgi-bin/dbfetch')
>> puts server.fetch('genbank','J00231','html')

Besides, as you can see, the other BioFetch server, provided by EBI (Dbfetch),

  http://www.ebi.ac.uk/cgi-bin/dbfetch

doesn't provide the GenBank database either (because they have EMBL instead).

In conclusion, if you need to fetch a GenBank entry from a remote server,
using NCBI with E-Utils is the best way for now.

Unfortunately, we don't have the Bio::NCBI::Eutils class yet, but it seems
that you can temporarily use the Bio::PubMed class to do that.

  Bio::PubMed.efetch("185041", {"db"=>"nuccore", "rettype"=>"gb"})
  Bio::PubMed.efetch("J00231", {"db"=>"nuccore", "rettype"=>"gb"})

ESOAP can be an alternative, but it takes quite a long time to read the
current version of the WSDL file and the returned value is not easy to handle.

Regards,
Toshiaki Katayama

>> entry = Bio::DBGET.bget("AF139016")
>>
>> gb = Bio::GenBank.new(Bio::Fetch.query('gb', 'J00231'))
>> puts gb.read
>>
>> And running on Windows XP. Ruby 1.8.6
>
> I also get errors:
>
> $ ruby -rbio -e 'reg = Bio::Registry.new'
> /usr/lib/ruby/1.8/net/http.rb:560:in `initialize': No route to host -
> connect(2) (Errno::EHOSTUNREACH)
> from /usr/lib/ruby/1.8/net/http.rb:560:in `open'
> from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
> from /usr/lib/ruby/1.8/timeout.rb:48:in `timeout'
> from /usr/lib/ruby/1.8/timeout.rb:76:in `timeout'
> from /usr/lib/ruby/1.8/net/http.rb:560:in `connect'
> from /usr/lib/ruby/1.8/net/http.rb:553:in `do_start'
> from /usr/lib/ruby/1.8/net/http.rb:542:in `start'
> from /usr/lib/ruby/1.8/net/http.rb:440:in `start'
> from /usr/lib/ruby/1.8/bio/io/registry.rb:190:in `read_remote'
> from /usr/lib/ruby/1.8/bio/io/registry.rb:126:in `initialize'
> from -e:1:in `new'
> from -e:1
>
> $ ruby -v
> ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux]
>
> $ lsb_release -a
> No LSB modules are available.
> Distributor ID: Ubuntu
> Description: Ubuntu 7.10
> Release: 7.10
> Codename: gutsy
>
> Unfortunately, I don't know how to display what version of bioruby I'm
> using. I guess I'm too new to ruby, let alone bioruby, to be of any
> help. Anyone have a working example? Unfortunately, my connection to
> bioruby.org doesn't work (I suspect our 'Net connection is snafu'ed).
>
> Regards,
> - Robert
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From ktym at hgc.jp Thu Dec 20 11:54:18 2007
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Fri, 21 Dec 2007 01:54:18 +0900
Subject: [BioRuby] how to retrieve a genbank record by GI
In-Reply-To: <476A8817.9030108@broad.mit.edu>
References: <476940CC.6000803@broad.mit.edu> <4145b6790712191139o2fa6c37er6331fa38def372d9@mail.gmail.com> <4769756B.3080406@broad.mit.edu> <4145b6790712191246i2abd5252q11f702f116a76115@mail.gmail.com> <476A8817.9030108@broad.mit.edu>
Message-ID: <95F14218-A50E-4A18-9ECB-3FC68B4D8DAE@hgc.jp>

Hi Gujja,

On 2007/12/21, at 0:19, Sharvari Gujja wrote:
> On 2007/12/20, at 5:46, Robert Citek wrote:
>>> Unfortunately, I don't know how to display what version of bioruby I'm
>>> using.

You can check the version of BioRuby by

  % ruby -rubygems -rbio -e 'p Bio::BIORUBY_VERSION'
  [1, 2, 0]

or by running the bioruby command like

  % bioruby
  Loading config (/Users/ktym/.bioruby/shell/session/config) ... done
  Loading object (/Users/ktym/.bioruby/shell/session/object) ... done
  Loading history (/Users/ktym/.bioruby/shell/session/history) ... done

  . . . B i o R u b y i n t h e s h e l l . . .

  Version : BioRuby 1.2.0 / Ruby 1.8.6

  bioruby> exit

> Hi all
>
> Thanks for all your input.
>
> However, can s'one explain how to set up seqdatabase.ini file. I did go thru the read me file but does not make much sense to me.

Ah, if you are using Windows, I have no idea as I have never tried.
Instead, you can also put the file on the net as described in:

  http://bioruby.org/rdoc/files/lib/bio/io/registry_rb.html

Anyway, the OBDA is still available in BioRuby but I feel it is not
actively used in other Bio* projects these days.

This situation reminds me of one more way to retrieve a GenBank entry.
If you have installed the EMBOSS suite, you can set up a ~/.embossrc file
to access NCBI like:

  DB genbank [
    type: N
    format: genbank
    method: url
    url: "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&rettype=gb&retmode=text&id=%s"
  ]

and call the entret command with

  Bio::EMBOSS.entret('genbank:185041')

> Also , I have tried
>
> Bio::PubMed.efetch("185041", {"db"=>"nuccore", "rettype"=>"gb"})
>
> but this gives me the pubmed entry. I need the genbank format.

If your BioRuby is older than 1.2.0, try updating it first.
In my environment, I got a GenBank entry correctly.

I expect that this way is the most feasible on Windows for now.
I'll prepare the Bio::NCBI::Eutils class in the next release.

> Appreciate your help.
>
> Thanks
> S

Regards,
Toshiaki Katayama
--
Human Genome Center, Institute of Medical Science, University of Tokyo
4-6-1 Shirokanedai, Minato-ku, Tokyo 108-0071, Japan
tel://+81-3-5449-5614 fax://+81-3-5449-5434
http://www.hgc.jp/ (Human Genome Center)
http://bioruby.org/ (BioRuby project)
http://das.hgc.jp/ (KEGG DAS)
http://www.genome.jp/kegg/soap/ (KEGG API)

From yjchen at reciprocallattice.com Thu Dec 20 14:11:39 2007
From: yjchen at reciprocallattice.com (Yen-Ju Chen)
Date: Thu, 20 Dec 2007 11:11:39 -0800
Subject: [BioRuby] A Rails application with BioRuby
In-Reply-To:
References:
Message-ID:

On 12/19/07, Toshiaki Katayama wrote:
>
> Hi Yen-Ju,
>
> On 2007/12/19, at 6:54, Yen-Ju Chen wrote:
>
> > Hi,
> > I am working on a rails application using BioRuby to collect references
> > and database entries.
> > You can find the application (not source code yet) at
> > journalclub.reciprocallattice.com
>
> Cool.
>
>
> > It is still at early stage.
> > I use it personally and figure it would be
> > interesting to have more users.
> > If you want to join, please write to me in private so that it will not
> > pollute the BioRuby mailing list.
> > I don't know how many users the application can take. Please see the
> > website for more details.
> >
> > These are things related to BioRuby:
> > * The output from Reference to BibTeX format lacks the abstract.
> > * It would be nice to be able to output to RIS format for EndNote and
> > ReferenceManager.
>
>
> If you could provide a patch for them, I'll include it in BioRuby.

I will look at the RIS format and supply a patch later.

> > * Is it possible to get the DOI from PubMed?
>
> entry = Bio::PubMed.query(16946072)
> doi = entry[/AID - (\S+) \[doi\]/, 1]
>
>
> or you can extend the Bio::MEDLINE class to add the doi method

Is it possible to have this feature in BioRuby?
I find that DOIs have become more common recently; even PDB entries have
DOI numbers. And it seems to be the only way to have a unique id for an
article. For example, PubMed and Google Scholar may return the same article
with their own ids (PMID and Google Scholar ID).
I found that comparing DOIs is the only way to ensure two entries refer
to the same article.

[snip]

> > * BioRuby can get information from many databases through biofetch,
> > but does not process them, like Pfam, Prosite, etc.
>
> You can process them with the corresponding classes. For example,
>
> cyclins = Bio::Fetch.query('prosite', 'PS00292')
> prosite = Bio::PROSITE.new(cyclins)

Thanx. I didn't notice PROSITE in the BioRuby API before.
Pfam is still missing. I will see what I can do about it.

>
> > * It is not clear what the databases from biofetch are, for example: rn,
> > rp, str, pr.
> > I am in structural biology. Many of these abbreviations are not
> > obvious.
>
> In BioRuby, the default BioFetch server is implemented as a proxy for the
> DBGET system through KEGG API.
> So, please refer to the abbreviation field in the DBGET manual at
>
> http://www.genome.jp/dbget/

That's a good tip. It would also be user-friendly to show them from BioRuby.

Thanx for this information.

Yen-Ju

> and also note that the DBGET service for the GenBank (gb) database is no
> longer available.
>
>
> Regards,
> Toshiaki Katayama

From ktym at hgc.jp Fri Dec 21 00:16:06 2007
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Fri, 21 Dec 2007 14:16:06 +0900
Subject: [BioRuby] A Rails application with BioRuby
In-Reply-To:
References:
Message-ID: <8FADCE93-34C9-468F-99B5-96CCE49D6ECF@hgc.jp>

Hi Yen-Ju,

On 2007/12/21, at 4:11, Yen-Ju Chen wrote:
> > * Is it possible to get the DOI from PubMed?
>
> entry = Bio::PubMed.query(16946072)
> doi = entry[/AID - (\S+) \[doi\]/, 1]
>
>
> or you can extend the Bio::MEDLINE class to add the doi method
>
>
> Is it possible to have this feature in BioRuby?
>

I just committed the following changes to the CVS.

  def doi
    @pubmed['AID'][/(\S+) \[doi\]/, 1]
  end

  def pii
    @pubmed['AID'][/(\S+) \[pii\]/, 1]
  end

so that you can use them as

  entry = Bio::PubMed.query(16946072)
  medline = Bio::MEDLINE.new(entry)
  doi = medline.doi
  pii = medline.pii

Regards,
Toshiaki

From ktym at hgc.jp Sat Dec 29 15:12:11 2007
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sun, 30 Dec 2007 05:12:11 +0900
Subject: [BioRuby] BioRuby 1.2.1 is released
Message-ID:

Hi all,

I just released BioRuby 1.2.1, including a fix for BLAST 2.2.17 output.
Note that this version is not yet Ruby 1.9 compliant.

  http://bioruby.org/archive/bioruby-1.2.1.tar.gz
  http://rubyforge.org/projects/bioruby/

You can see the changes at

  http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/bioruby/ChangeLog?rev=1.79&cvsroot=bioruby

P.S.
Unfortunately, I removed the RAA entry for BioRuby by mistake (I need to sleep now :).
I immediately re-added it as a new project, but our history was lost.

  http://raa.ruby-lang.org/project/bioruby/

I took a screenshot of the old admin screen for the record.

  http://bioruby.org/tmp/bioruby-deleted-raa.png

Happy holidays!

Regards,
Toshiaki Katayama
--
Human Genome Center, Institute of Medical Science, University of Tokyo
4-6-1 Shirokanedai, Minato-ku, Tokyo 108-0071, Japan
tel://+81-3-5449-5614 fax://+81-3-5449-5434
http://www.hgc.jp/ (Human Genome Center)
http://bioruby.org/ (BioRuby project)
http://das.hgc.jp/ (KEGG DAS)
http://www.genome.jp/kegg/soap/ (KEGG API)