From missy at be.to Sat Jan 1 19:54:08 2011
From: missy at be.to (MISHIMA, Hiroyuki)
Date: Sun, 02 Jan 2011 09:54:08 +0900
Subject: [BioRuby] Workflows: NGS + miRNA (Re: Workflows and Parallelization)
In-Reply-To: <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it>
 <4D10B9AE.2010206@be.to>
 <5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it>
 <4D12A1CE.4040702@be.to>
 <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
Message-ID: <4D1FCCB0.2000303@be.to>

Dear Raoul and the BioRuby list,

My workflow for miRNA analysis using Illumina GAii is as follows:

1) Read alignment using Novoalign ( http://www.novocraft.com/ ). It is
proprietary software, but its binary is free for academic use with
several restrictions. The advantage of Novoalign is that it can remove
adapter sequences from each read. Adapter clipping is indispensable for
miRNA analyses because the target molecules are always shorter than the
read length.

1b) You may be able to use BWA/MAQ instead. Adapter clipping tools such
as Cutadapt ( http://cutadapt.googlecode.com/ ) are available.

2) To find miRBase-registered miRNAs, I used miRExpress
( http://mirexpress.mbc.nctu.edu.tw/ , Wang et al., BMC Bioinform 10,
p328, 2009. http://www.biomedcentral.com/1471-2105/10/328 ).

2b) Data analysis. I plotted heatmaps using R. See Ruby et al.
( Genome Res, 17, p1850, 2007.
http://genome.cshlp.org/content/17/12/1850.long ).

3) To find potentially novel miRNAs, I used miRTRAP
( http://flybuzz.berkeley.edu/miRTRAP.html , Hendrix et al., Genome Biol,
11, pR39, 2010. http://genomebiology.com/2010/11/4/R39 ).

The workflow may have to be updated. Hopefully it will help you.

Thanks,
Hiro.

Raoul Bonnal wrote (2010/12/23 18:47):
> Actually the focus of my institute is mainly on mirna, so I'm also
> interested on techniques for analyzing NGS(illumina) and microRNA.

--
MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences

From missy at be.to Sat Jan 1 20:38:57 2011
From: missy at be.to (MISHIMA, Hiroyuki)
Date: Sun, 02 Jan 2011 10:38:57 +0900
Subject: [BioRuby] Workflows: NGS + miRNA
In-Reply-To: <4D1FCCB0.2000303@be.to>
References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it>
 <4D10B9AE.2010206@be.to>
 <5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it>
 <4D12A1CE.4040702@be.to>
 <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
 <4D1FCCB0.2000303@be.to>
Message-ID: <4D1FD731.4070309@be.to>

Hi all,

An addition to my workflow.

Only miRTRAP requires a read alignment generated by Novoalign. The input
for miRExpress is fastq files, and miRExpress itself clips the adapters
from them.

miRExpress is easy and fast. It is a good one for a first try.

While using miRExpress, you may find 5'-end variations in mature miRNA
reads. These prevent accurate alignment, and they may not be artifacts.
See Wu et al. ( PLoS One, 4, p.e7566,
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0007566 ).
Clipping 5'-end variations increases the number of alignment hits.

MISHIMA, Hiroyuki wrote (2011/01/02 9:54):
> Dear Raoul and the BioRuby list,
>
> My workflow for miRNA analysis using Illumina GAii is as follows:

--
MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences

From pjotr.public14 at thebird.nl Sun Jan 2 07:04:48 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 2 Jan 2011 13:04:48 +0100
Subject: [BioRuby] GFF3
Message-ID: <20110102120448.GA23804@thebird.nl>

The GFF3 plugin works rather well. Anyone who has ruby 1.9.x on their
system can just type, as a user:

  gem install bio-gff3

and even bioruby itself gets installed, if needed. Next you can type, for
example

  gff3-fetch mRNA test/data/gff/MhA1_Contig1133.fa test/data/gff/MhA1_Contig1133.gff3

to assemble all mRNA.

Unfortunately I am finding some problems with data.
For example the reading frame is *wrong* in this wormbase data file
(predicted gene). The contig starts as:

>MhA1_Contig3426
TTAATAAATTTAATTCATTAAAATTTTAAAAAGAAAGGGACATTCGAGGGGAAATGAGAGAGAACGAGAGAAAATGGACG
GGAAATTAAATTAAAAAATAAAAAATTAATTTTTATTTTTTTTTATTTAATTTAAAATTAATTTTCTACATTTATTAAAT
CTTAAATTATTAATTTTAAATTAATTTAAAG GCATCCAACAACAACAATTAGAAGTCTTTCCCAGCTCCTCCTCTGCCCC
TCAGCAACAACAATACCCAGCGCAGCAGCTTCAATTAGTTACTCCTTTTATTGCATGCATAGCAGATGAATTGAGGGAGT
TGATAGATGAAATGCGTATGTTTTAG AATATTTTTTAAAAAAAAATTAAAAAAAATTTTTTTTTGCCAAACAGGCTCTCG

and the full record is:

##gff-version 3
##sequence-region MhA1_Contig3426 1 2029
# Gene gene:MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase gene 192 346 . + . ID=gene:MhA1_Contig3426.frz3.gene1;Name=MhA1_Contig3426.frz3.gene1;Note=PREDICTED protein_coding;public_name=MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase mRNA 192 346 . + . ID=transcript:MhA1_Contig3426.frz3.gene1;Parent=gene:MhA1_Contig3426.frz3.gene1;Name=MhA1_Contig3426.frz3.gene1;public_name=MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase exon 192 346 . + . ID=exon:MhA1_Contig3426.frz3.gene1.1;Parent=transcript:MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase CDS 192 346 . + 0 ID=cds:MhA1_Contig3426.frz3.gene1;Parent=transcript:MhA1_Contig3426.frz3.gene1

So the forward reading frame starts at 192 with CDS phase 0. The actual
sequence is

  GCATCCAACA ACAACAATTA GAAGTCTTTC CCAGCTCCTC CTCTGCCCCT CAGCAACAAC
  AATACCCAGC GCAGCAGCTT CAATTAGTTA CTCCTTTTAT TGCATGCATA GCAGATGAAT
  TGAGGGAGTT GATAGATGAA ATGCGTATGT TTTAG

which translates to a valid protein only in frame 2(!). This is not
compliant with GFF3 in any interpretation.

It turns out that for this particular GFF3 file this is the case only
with the *first* ORF on every contig, and it is probably a bug in the
gene predictor used. None of the other genes is in the wrong frame.

I informed Wormbase some time ago, but I don't have the impression that
anyone is interested.
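
The frame problem described above can be checked with a few lines of plain
Ruby. This is only a minimal sketch, assuming the three standard stop
codons and ignoring start codons; the constant and method names are
hypothetical, and it is not the bio-gff3 implementation. It lists which of
the three forward offsets of the CDS read through without an internal
stop:

```ruby
# Stop codons of the standard genetic code (assumption: no alternative code).
STOP_CODONS = %w[TAA TAG TGA].freeze

# CDS 192..346 of MhA1_Contig3426, copied from the message above.
CDS = %w[
  GCATCCAACA ACAACAATTA GAAGTCTTTC CCAGCTCCTC CTCTGCCCCT CAGCAACAAC
  AATACCCAGC GCAGCAGCTT CAATTAGTTA CTCCTTTTAT TGCATGCATA GCAGATGAAT
  TGAGGGAGTT GATAGATGAA ATGCGTATGT TTTAG
].join.freeze

# Split the sequence into complete codons, starting at the given offset.
def codons(seq, offset)
  seq[offset..-1].upcase.scan(/.{3}/)
end

# True if any codon before the last complete one is a stop codon.
def internal_stop?(seq, offset)
  codons(seq, offset)[0...-1].any? { |codon| STOP_CODONS.include?(codon) }
end

# Offsets (0, 1, 2) that read through without an internal stop.
def open_offsets(seq)
  (0..2).reject { |offset| internal_stop?(seq, offset) }
end

p open_offsets(CDS)
```

For this CDS only offset 2 survives, which agrees with the observation
that the sequence translates cleanly only in frame 2 although the record
declares phase 0.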
You can validate its contents at

  http://www.wormbase.org/db/gb2/gbrowse/m_hapla/?name=id:2258995;dbid=m_hapla:database

I am going to add an option to the GFF3 plugin to test for valid reading
frames, so these files give the expected results. It will be good for
validation anyway.

Pj.

From pjotr.public14 at thebird.nl Sun Jan 2 13:49:58 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 2 Jan 2011 19:49:58 +0100
Subject: [BioRuby] BioRuby and log4r
Message-ID: <20110102184958.GA25699@thebird.nl>

I propose we start using

  http://log4r.rubyforge.org/manual.html

which has the standard logging features one would expect. I particularly
like the lazy evaluation (deferred block).

What it does fall short on, as do most other loggers, is use cases. A
logger has to behave differently when a tool is used by:

- developer: fail early and often (on warnings!)
- user: fail on normal error
- library: fail on serious error
- web server: fail on serious error
- fault tolerant system: never fail, try to resume

Essentially, I see three or four error handlers. We can create a default
logger for BioRuby = user

But I would like to have more options. It would be nice to have several
levels within 'info', 'warn' or 'error', to be displayed/logged according
to user needs. Also, with the plugins we should have standardized
switches for CLI utilities.

Are we interested in making this core BioRuby, or should I incorporate it
as a bio-plugin? I am thinking of writing a front-end to log4r.

Pj.
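
The use cases listed above could be sketched as a small policy table. This
is only an illustration: it uses Ruby's standard Logger as a stand-in for
log4r, the class name PolicyLogger and the FAIL_AT table are hypothetical,
and mapping "serious error" to Logger::FATAL is an assumption rather than
anything decided on the list.

```ruby
require 'logger'
require 'stringio'

# Hypothetical policy table: the lowest severity at which each kind of
# caller aborts. nil means "never abort" (fault tolerant system).
FAIL_AT = {
  developer:      Logger::WARN,   # fail early and often, even on warnings
  user:           Logger::ERROR,  # fail on normal errors
  library:        Logger::FATAL,  # assumption: "serious error" == FATAL
  web_server:     Logger::FATAL,
  fault_tolerant: nil
}.freeze

class PolicyLogger
  def initialize(role, io = $stdout)
    @threshold = FAIL_AT.fetch(role)
    @logger = Logger.new(io)
  end

  # Log the message, then raise if the severity reaches the threshold of
  # the role. The message is passed as a block, so it is only built when
  # it is actually logged or raised (deferred evaluation).
  def report(severity, &message)
    @logger.add(severity, &message)
    raise message.call if @threshold && severity >= @threshold
  end
end

# A user keeps going on a warning; a developer gets an exception.
PolicyLogger.new(:user).report(Logger::WARN) { 'odd record at line 42' }
begin
  PolicyLogger.new(:developer).report(Logger::WARN) { 'odd record at line 42' }
rescue RuntimeError => e
  puts "developer run aborted: #{e.message}"
end
```

Keeping the policy in one table means a tool only has to choose a role at
start-up; the logging calls in the parser code stay the same.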
From bonnalraoul at ingm.it Mon Jan 3 07:14:44 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 3 Jan 2011 13:14:44 +0100
Subject: [BioRuby] Workflows: NGS + miRNA
In-Reply-To: <4D1FD731.4070309@be.to>
References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it>
 <4D10B9AE.2010206@be.to>
 <5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it>
 <4D12A1CE.4040702@be.to>
 <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
 <4D1FCCB0.2000303@be.to>
 <4D1FD731.4070309@be.to>
Message-ID: <3D720507-34D5-4A3C-9F2C-A54CB9556E3D@ingm.it>

Thank you very much, I'll read all the refs.

--
R.J.P.B.
From pjotr.public14 at thebird.nl Fri Jan 7 03:52:21 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 7 Jan 2011 09:52:21 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110102184958.GA25699@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
Message-ID: <20110107085221.GA14735@thebird.nl>

I am creating a plugin 'bio-logger' for sane handling of errors and
exceptions in different situations (log-act):

* Normal user
* Developer
* Web server
* Fault-tolerant systems

One example is a program that logs a warning to stdout when run by a
user, but raises an exception when run by a developer.

bio-logger builds on log4r functionality, using a more fine-grained
approach for logging errors. I.e. within 'debug', 'info', 'warn' and
'error' an additional value 1..10 can be set to limit output and logging.

When a program, e.g. gff3-fetch, supports bio-logger switches, the
following is possible:

  --logger stderr             Add stderr logger (default is stdout)
  --logger filen              Add filename logger
  --trace debug               Show all messages
  --trace warn                Show messages more serious than 'warn'
  --trace warn:3              Show messages more serious than 'warn' level 3

module overrides:

  --trace gff3:info:5         Override level for 'gff3' to info level 5
  --trace blast:debug         Override level for 'blast'
  --trace blast,gff3:debug    Override level for 'blast' and 'gff3'
  --trace stderr:blast:debug  Override level for 'blast' on stderr

Behaviour can also be changed. This normally happens through library
calls. There is one command line switch which changes log-act:

  --log-act Developer         Modify the logger for development

log4r supports rotating logs and remote logging, which will be available.

Any comments?

Pj.
On Sun, Jan 02, 2011 at 07:49:58PM +0100, Pjotr Prins wrote:
> http://log4r.rubyforge.org/manual.html

From pjotr.public14 at thebird.nl Fri Jan 7 10:01:47 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 7 Jan 2011 16:01:47 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110107085221.GA14735@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
 <20110107085221.GA14735@thebird.nl>
Message-ID: <20110107150147.GA16116@thebird.nl>

bio-logger created. YABP (yet another BioRuby plugin).

  https://github.com/pjotrp/bioruby-logger-plugin

Finally the logger I always wanted to have...

Pj.

From pjotr.public14 at thebird.nl Sat Jan 8 08:06:12 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sat, 8 Jan 2011 14:06:12 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110107150147.GA16116@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
 <20110107085221.GA14735@thebird.nl>
 <20110107150147.GA16116@thebird.nl>
Message-ID: <20110108130612.GA19929@thebird.nl>

If anyone is interested, the bio-logger plugin is fully functional (I am
using it in the GFF3 plugin):

  This is a plugin for nailing down problems with big data parsers,
  common in bioinformatics, and sane handling of errors and exceptions
  in different situations.

  In Bioinformatics the following is a common scenario when dealing
  with parsers: Large data files sometimes contain errors. As a user
  you want to continue and hope for the best (logging the error). As a
  developer you want to see how you can fix the problem. Waiting for a
  full run and checking the logs is tedious. The logger can be helpful
  here, and avoids sticking temporary solutions in code. Read on...

  https://github.com/pjotrp/bioruby-logger-plugin

I think we should use this throughout BioRuby to get consistent error
handling and logging. No more $stderr.print statements.

Pj.
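
The --trace switches described earlier in this thread follow a small
colon-separated grammar. As a rough sketch of how such a spec could be
read (one reading of the examples given, not the actual bio-logger
parser; TraceSpec and parse_trace are hypothetical names, and the "first
of two leading fields is the output" rule is an assumption):

```ruby
# Severity names used in the --trace examples.
LEVELS = %w[debug info warn error].freeze

# Result of parsing one --trace argument.
TraceSpec = Struct.new(:output, :modules, :level, :sub_level)

# Parse specs such as "warn:3", "blast,gff3:debug" or "stderr:blast:debug".
def parse_trace(spec)
  parts = spec.split(':')
  # An optional trailing integer is the sub level (1..10 in the proposal).
  sub_level = parts.last.to_s.match?(/\A\d+\z/) ? parts.pop.to_i : nil
  level = parts.pop
  raise ArgumentError, "unknown level in #{spec.inspect}" unless LEVELS.include?(level)
  raise ArgumentError, "too many fields in #{spec.inspect}" if parts.size > 2
  # Whatever precedes the level is [output:]module[,module...].
  modules = parts.empty? ? [] : parts.pop.split(',')
  TraceSpec.new(parts.pop, modules, level.to_sym, sub_level)
end

%w[debug warn:3 gff3:info:5 blast,gff3:debug stderr:blast:debug].each do |spec|
  p parse_trace(spec).to_a
end
```

An empty module list means the level applies globally, and a nil output
means the default logger.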
From bonnalraoul at ingm.it Mon Jan 10 17:06:48 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 10 Jan 2011 23:06:48 +0100
Subject: [BioRuby] biogem and options
Message-ID:

Hi all,
I have updated the github repo with some requests from Pjotr.
It is now possible to create the bin, db and test/data directories, if
needed, from the command line:

  biogem --with-bin --with-db --with-test-data yourprojectname

NOTE 1: the older 'data' directory is now 'db', which is more compliant
with rails. Why? 'data' comes from R packages, but we are used to storing
databases, or looking for them, in a db directory.
NOTE 2: README updated.

About rspec and cucumber: jeweler already has those options.

Type 'biogem -h' and you'll get the help.

This gem is not yet available on rubygems; I still need to implement the
templates requested by Toshiaki and Pjotr.

I'm refactoring the code, so there are some variations in the original
tree.

I "hope", by the end of the week, to provide template files too.

--
R.J.P.B.

From pjotr.public14 at thebird.nl Tue Jan 11 01:38:34 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 11 Jan 2011 07:38:34 +0100
Subject: [BioRuby] biogem and options
In-Reply-To:
References:
Message-ID: <20110111063834.GA2409@thebird.nl>

Super!

From ktym at hgc.jp Tue Jan 11 05:47:55 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 11 Jan 2011 19:47:55 +0900
Subject: [BioRuby] biogem and options
In-Reply-To: <20110111063834.GA2409@thebird.nl>
References: <20110111063834.GA2409@thebird.nl>
Message-ID:

Raoul,

http://twitter.com/#!/ilpuccio/status/24766316493672448
> @tktym could you point me to some example of what you mean, please?
> "provide a recommended template for rdoc, require lines, and class def"

In my example plugin,

  https://github.com/ktym/bioruby-hello/blob/master/lib/bio-hello.rb

I used a style similar to that of the BioRuby core library, which is
described in

  https://github.com/bioruby/bioruby/blob/master/README_DEV.rdoc

but I'm not sure what the best practice for plugins is. It might be
better to include the documentation in the README file instead.

In either case, what I have in mind is to auto-generate a plugin
description from those embedded descriptions for the "plugin showcase"
which will be available somewhere on the bioruby.org site in the future.

For that purpose, we may also want to have some flags indicating:

* status of the plugin (stable, usable, buggy, just started etc.)
* whether the plugin will override the BioRuby core or just provide new
  features harmlessly
* pre-requirements (especially, other than gems)

etc. etc.

Here's some material for further discussion (example template):

  #
  # = Bio::XXX - BioRuby plugin for XXX
  #
  # Copyright:: Copyright (C) 2001, 2003-2005 Bio R. Hacker ,
  # Copyright:: Copyright (C) 2006 Chem R. Hacker
  # License:: The Ruby License
  # Site: http://github.com/user/bioruby-xxx
  #
  # == Description
  #
  # This plugin provides an interface for the XXX database.
  #
  # == Usage
  #
  # Lorem ipsum dolor sit amet, consectetur adipisicing elit, ....
  #
  # == Effects (Overrides?)
  #
  # * Modify the behavior of Bio::Sequence::NA#translate destructively
  # * Add methods to the Bio::DB class
  #
  # == Depends (Requirements?)
  #
  # * External MySQL database system
  # * RubyGem package 'foobar'
  #
  # == References
  #
  # * Hoge F. et al., The XXX database, Nucleic. Acid. Res. 123:100--123 (2030)
  # * http://hoge.db/
  #

  # Do we need these two lines in every BioRuby plugin?
  require 'rubygems'
  require 'bio'

  # Do we allow classes defined outside of the 'Bio' namespace?
  module Bio

    class XXX
      # :
    end # XXX

  end # Bio

Thanks,
Toshiaki

From chmille4 at gmail.com Wed Jan 12 13:37:19 2011
From: chmille4 at gmail.com (Chase Miller)
Date: Wed, 12 Jan 2011 13:37:19 -0500
Subject: [BioRuby] bio-assembly
Message-ID:

Hi All,

Quick update on the bio-assembly plugin.

Francesco has added support for CAF files. According to his preliminary
tests it can handle a 27k contig 454 file in about a minute. He also
improved the performance overall, so now the ace parser can process a
70 MB file in about 10 seconds. Nice work!

If there are any requests for parsers or functionality, let us know.

source code: https://github.com/chmille4/bioruby-assembly

usage: https://github.com/chmille4/bioruby-assembly#readme

gem: https://rubygems.org/gems/bio-assembly

Cheers
Chase

From bonnalraoul at ingm.it Thu Jan 13 04:30:03 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 13 Jan 2011 10:30:03 +0100
Subject: [BioRuby] bio-assembly
In-Reply-To:
References:
Message-ID: <8EB2ADDB-7137-4C8E-AB70-C5574A797886@ingm.it>

Hi,
great work guys.

I have updated the Plugins page

  http://bioruby.open-bio.org/wiki/Plugins#On_Development_Plugins

It is a list/summary of the current state of the plugins. Please let me
know if there is something wrong.
In my mind, Planned plugins are just ideas that are not yet coded. The
others are "ongoing development".
I tried to list the plugins in order of creation.

@Jan: Do you plan to release the Ensembl API as a plugin? I think it's
just a matter of renaming the gem.
@George: To avoid problems, please yank isoelectric_point from rubygems.

I didn't receive any reply from Ricardo H.
Ramírez-Gonzalez about samtools-ruby-ffi

Do you think that a separate page would be better? I think so, you?

Ciao.

--
R.J.P.B.

From yannick.wurm at unil.ch Sun Jan 16 06:57:04 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Sun, 16 Jan 2011 18:57:04 +0700
Subject: [BioRuby] trees
Message-ID:

Is a specific person "responsible" for coordinating the wiki?

The following page is largely misleading (contains tons of ruby code):
http://bioruby.open-bio.org/wiki/HOWTO:Trees

cheers,
yannick

-------------------------
Ant Genomes & Evolution
http://yannick.poulet.org
skype://yannickwurm

From ngoto at gen-info.osaka-u.ac.jp Mon Jan 17 00:34:48 2011
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Mon, 17 Jan 2011 14:34:48 +0900
Subject: [BioRuby] trees
In-Reply-To:
References:
Message-ID: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>

On Sun, 16 Jan 2011 18:57:04 +0700
Yannick Wurm wrote:

> is a specific person "responsible" for coordinating the wiki?
>
> the following page is largely misleading (contains tons of ruby code):
> http://bioruby.open-bio.org/wiki/HOWTO:Trees

The page is an attempt to translate the BioPerl HOWTOs from Perl to
Ruby, but it is still unfinished. See the discussion:
http://bioruby.open-bio.org/wiki/Talk:HOWTOs

One of the reasons why the attempt stalled is that the differences
between BioPerl and BioRuby are larger than we expected.

On the Talk:HOWTOs page, writing original BioRuby documentation was also
discussed, but that stalled too.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp

From pjotr.public14 at thebird.nl Mon Jan 17 03:47:05 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 17 Jan 2011 09:47:05 +0100
Subject: [BioRuby] bio-logger release 0.9.0
Message-ID: <20110117084705.GA5136@thebird.nl>

Just released bio-logger 0.9.0. The most important feature I added is
that you can inject a filter on log messages (by module). E.g. for the
blast logger you could show only messages relating to a contig:

  log = LoggerPlus['blast']
  log.filter { | level, sub_level, msg | msg =~ /contig1133/ }

On the command line you can do the same with:

  --trace "blast:= msg =~ /contig1133/"

Another option is to filter on level and sub_level values:

  log.filter { | level, sub_level, msg | sub_level == 3 or level <= ERROR }

providing lots of possibilities. Obviously much of this can be handled
by (multi)grep'ing log files, but the power of using Ruby and filter
combinations makes it a great feature for debugging big data problems.
And you can limit the size of log files without limiting expressive
power.

Pj.

From pjotr.public14 at thebird.nl Mon Jan 17 05:08:12 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 17 Jan 2011 11:08:12 +0100
Subject: [BioRuby] Bio-gff3 plugin 0.8.6
Message-ID: <20110117100812.GA6947@thebird.nl>

Released the bio-gff3 parser plugin 0.8.6 on rubygems; it can be used
from the command line. E.g.

  gem install bio-gff3
  gff3-fetch --help

Introduced an LRU cache, replaced the BioRuby GFF line parser and added
lazy parsing. All with significant speedups compared to the original
(no cache, BioRuby parser, non-lazy). The LRU version has limited RAM
use for data of any size (730MB), and currently runs 6 times slower than
the full memory version.

Digesting parser:

  Cache            real    user    sys     version       RAM
  ------------------------------------------------------------
  full,bioruby     12m41   12m28   0m09    (0.8.0)
  full,line        12m13   12m06   0m07    (0.8.5)
  full,line,lazy   11m51   11m43   0m07    (0.8.6)       6,600M
  none,bioruby     504m    477m    26m50   (0.8.0)
  none,line        297m    267m    28m36   (0.8.5)
  none,line,lazy   132m    106m    26m01   (0.8.6)       650M
  lru,bioruby      533m    510m    22m47   (0.8.5)
  lru,line         353m    326m    26m44   (0.8.5) 1K
  lru,line         305m    281m    22m30   (0.8.5) 10K
  lru,line,lazy    182m    161m    21m10   (0.8.6) 10K
  lru,line,lazy    75m     75m     0m17    (0.8.6) 50K   730M
  ------------------------------------------------------------

where

  52M  m_hapla.WS217.dna.fa
  456M m_hapla.WS217.gff3

  ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-linux]

on a 64-bit 2.6 GHz CPU (6MB cache), 16 GB RAM machine.

Note that bio-gff3 0.8.6 is a fully digesting parser, with scope for
full validation of the GFF3 relations. The next step, a limited
'optimistic' digestion, will speed things up.

Note also that bio-gff3 exploits the bio-logger plugin - it is a good
example.

Pj.

From yannick.wurm at unil.ch Tue Jan 18 02:55:15 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Tue, 18 Jan 2011 14:55:15 +0700
Subject: [BioRuby] trees
In-Reply-To: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>
References: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>
Message-ID:

Thanks for the details Naohisa-san

May I suggest we try to "hide from google" things that are not
finalized, and links to non-existent documents?

(I have the feeling it may be better to have nothing than to create
confusion?)

From yannick.wurm at unil.ch Tue Jan 18 03:56:26 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Tue, 18 Jan 2011 15:56:26 +0700
Subject: [BioRuby] Rake
In-Reply-To:
References:
Message-ID: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>

Dear List,

I had a few issues setting up a rake file 2/3 weeks ago. Thanks to
Hiro-san and some of the google-able tutorials, things are now working.

It is supposed to be a mini-pipeline that takes a file called
"cdsSeq.fasta" and goes through the following steps for tree building:
 - translation
 - multiple alignment (mafft)
 - gblocks to remove crap
 - tree building (phyml)
AND
 - codon-level alignment: reverse translated from protein multiple
   alignment (pal2nal)
 - gblocks to remove crap
 - tree building (phyml)

https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile

It depends on a bunch of stuff in my $PATH, so it probably won't run
elsewhere out of the box. But FWIW maybe it can be useful to a random
googler.

However, it feels quite clunky, so I think I should do things
differently in the future. If you have any comments or suggestions, I'd
be most happy to hear them.

Cheers,

yannick

-------------------------
Ant Genomes & Evolution
http://yannick.poulet.org
skype://yannickwurm

From mail at michaelbarton.me.uk Tue Jan 18 10:17:08 2011
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Tue, 18 Jan 2011 10:17:08 -0500
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID: <20110118151708.GB3430@nku069218.hh.nku.edu>

Hi Yannick,

I think it's a great idea to generate predefined pipelines for common
bioinformatics tasks. I experimented with a tool called Boson six months
ago. It could be worth a look if you feel like investing more time in
your pipeline.

Boson commands, similar to rake tasks, are more modular and can be
installed from the web into a ~/.boson directory. This has obvious
advantages over a single rake file. Boson tasks can be chained together,
with the data passed around in YAML format.

The github link is - https://github.com/cldwalker/boson

Cheers

Michael Barton

From diapriid at gmail.com Tue Jan 18 10:21:36 2011
From: diapriid at gmail.com (Matt)
Date: Tue, 18 Jan 2011 10:21:36 -0500
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID:

Yannick-

I like it. It might be nice to extend your pipeline in a generic manner
(SimpleAnalysisPipeline): just a couple of steps that would be
extensible/swappable to different software. The generic pipeline would
"just work" given a minimal local configuration (I like your starting
point).

Swappable/configurable steps might be

  Pre-process (trim / quality filters?)
  Alignment (align)
  Post alignment (gblocks)
  Translation (to Nexus)
  Analysis (Phyml)

The idea is that we could swap in components (TNT or RAxML for Phyml,
Muscle for MAFFT etc.), but also that the pipeline remains "simple".

If I find some time I'd like to work on my first attempt at a BioRuby
plugin, a wrapper for TNT (hopefully tied in to the analysis bit above).

cheers,
Matt

From francesco.strozzi at gmail.com Tue Jan 18 15:55:20 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Tue, 18 Jan 2011 21:55:20 +0100
Subject: [BioRuby] BioRuby HTSeq-like
Message-ID:

Hi BioRuby people,

just wondering if something similar to this exists for BioRuby (it is a
Python package for working with and manipulating next-gen sequencing
data):

http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html

Many features could be implemented or are already available for
BioRuby. These are the basics:

- Getting statistical summaries about the base-call quality scores to
  study the data quality.
- Calculating a coverage vector and exporting it for visualization in a
  genome browser.
- Reading in annotation data from a GFF file.
- Assigning aligned reads from RNA-Seq experiments to exons and genes.
Particularly, the plotting functions to explore and assess quality data seems very interesting. If nothing similar exists for BioRuby, I think we should discuss about coding a BioRuby "NextGenSequencing" plugin, to provide the same functionalities and also to add something new as well.... What do you think? Cheers -- Francesco From yannick.wurm at unil.ch Wed Jan 19 23:08:55 2011 From: yannick.wurm at unil.ch (Yannick Wurm) Date: Thu, 20 Jan 2011 11:08:55 +0700 Subject: [BioRuby] Rake In-Reply-To: References: Message-ID: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch> Hello & thanks for the comments. Matt wrote: > Swappable/configurable steps might be > > Pre-process (trim / quality filters?) > Alignment (align) > Post alignment (gblocks) > Translation (to Nexus) > Analysis (Phyml) > > The idea is that we could swap in components (TNT or RaXML for Phyml, > Muscle for MAFFT etc.)- but also that the pipeline remains "simple". Yes, thats what I would ideally want. (as well as being able to easily modify the run options of the programs). How would you go about generalizing this? Right now I'm basing "what do to" on the file extensions I provide... which limits me based on the file extensions... Michael wrote: > I think it's a great idea to generate predefined pipelines for common > bioinformatics tasks. I experimented with a tool called Boson six months ago. > It could be worth looking if you feel like investing more time into your > pipeline. > > Boson commands, similar to rake tasks, are more modular and can be installed > from the web into a ~/.boson directory. This has obvious advantages over > a single rake file. Boson tasks can be chained together where the data is > passed around in YAML format. > > The github link is - https://github.com/cldwalker/boson I haven't looked thoroughly now, but at least superficially, Boson looks real cool. However, I'm a bit scared of investing energy into technologies that are too new. 
Boson has only one developer who may or may not keep his project alive over the next few years. Time I invest in learning something today ... must continue to help improve my productivity over the next 5 or 10 years by still being reusable in 5 or 10 years (with as few modifications as possible). There is uncertainty to everything, but something like Boson does seem a bit too risky right now... Cheers, yannick ------------------------- Ant Genomes & Evolution http://yannick.poulet.org skype://yannickwurm From bonnalraoul at ingm.it Thu Jan 20 04:13:20 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Thu, 20 Jan 2011 10:13:20 +0100 Subject: [BioRuby] Rake In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> References: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> Message-ID: <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it> Dear Yannick, rake is usually used inside a project directory to provide common operations to the project. What is the idea behind "rake for bioinformatics"? I mean, you have to copy your rake file to where your data are. Looking at the code, there are a lot of dependencies on command-line software, and that could be a maintainability problem. What about spending some energy on wrapping those commands into BioRuby classes? That way those applications could be available to other scripts. If you want to keep the rake approach, we should find a way to not replicate rakefiles. One idea could be to create a rakefile in your working directory, similar to Rails: # Add your own tasks in files placed in lib/tasks ending in .rake, # for example lib/tasks/capistrano.rake, and they will automatically be available to Rake. require File.expand_path('../config/application', __FILE__) require 'rake' #The user needs just to add the tasks he wants: Bio::SomeName.load_tasks Bio::SomeOtherName.load_tasks Bio::AnotherName.load_tasks On 18/gen/2011, at 09.56, Yannick Wurm wrote: > Dear List, > > I'd had a few issues setting up a rake file 2/3 weeks ago.
Thanks to Hiro-san and some of the google-able tutorials things are now working. > > It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" and goes through the following steps for tree building: > - tranlsation > - multiple alignment (mafft) > - gblocks to remove crap > - tree building (phyml) > AND > - codon-level alignment: reverse translated from protein multiple alignment (pal2nal) > - gblocks to remove crap > - tree building (phyml) > > https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile > > > It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere out of the box. But FWIW maybe it can be usefull to a random googler. > > However, it feels quite clunky, so I think I should do things differently in the future. If you have any comments or suggestions, I'd be most happy to hear them. > > Cheers, > > yannick > > ------------------------- > Ant Genomes & Evolution > http://yannick.poulet.org > skype://yannickwurm > > > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby -- R.J.P.B. From bonnalraoul at ingm.it Thu Jan 20 05:35:58 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Thu, 20 Jan 2011 11:35:58 +0100 Subject: [BioRuby] BioRuby HTSeq-like In-Reply-To: References: Message-ID: Hi folks, Yesterday I met Francesco in my lab and was a wonderful opportunity to exchange ideas and thoughts. About Fancesco's mail I think that we could grab inspiration from Galaxy/BioPython (http://main.g2.bx.psu.edu/) , they did a very good work on wrapping the common software for crunching NGS data. 
So my input is: let's start wrapping them and possibly open a bioruby-ngs project on github: https://github.com/helios/bioruby-ngs (just the repo :-)). Reading around http://seqanswers.com/forums/showthread.php?t=2461 , sometimes there is a need to split and distribute the computation. There are different possibilities, but splitting the fastq file while also enabling multithreading seems to be the best option; if you have suggestions, please comment. Thanks to Goto san, fastq support is on. Thanks to Pjotr, GFF3 support is on. Thanks to Chase and Francesco, CAF and Ace support is on. For plotting, as we said, one possibility is http://rubyvis.rubyforge.org/ from Claudio Bustos, but if you have better alternatives, please discuss. About statistics, please join http://groups.google.com/group/sciruby-dev . Having these tools in our arsenal is useful and strategic for funding. I would say +1. PS: Please clone and add your name to the list of authors if you want to join this project. PPS: If someone is using SGE, what do you think about http://gridengine.info/2010/12/24/goodbye-grid-engine ? On 18/gen/2011, at 21.55, Francesco Strozzi wrote: > Hi BioRuby people, > just wondering if something similar exists for BioRuby (is a package > to work and manipulate next-gen sequencing data, in Python): > > http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html > > Many features could be implemented or are already available for > BioRuby....these are the basics: > - Getting statistical summaries about the base-call quality scores to > study the data quality. > - Calculating a coverage vector and exporting it for visualization in > a genome browser. > - Reading in annotation data from a GFF file. > - Assigning aligned reads from an RNA-Seq experiments to exons and genes. > > Particularly, the plotting functions to explore and assess quality > data seems very interesting.
> If nothing similar exists for BioRuby, I think we should discuss about > coding a BioRuby "NextGenSequencing" plugin, to provide the same > functionalities and also to add something new as well.... > > What do you think? > > Cheers > -- > > Francesco > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby -- R.J.P.B. From francesco.strozzi at gmail.com Thu Jan 20 06:27:43 2011 From: francesco.strozzi at gmail.com (Francesco Strozzi) Date: Thu, 20 Jan 2011 12:27:43 +0100 Subject: [BioRuby] BioRuby HTSeq-like In-Reply-To: References: Message-ID: Today is Thursday (BioRuby IRC day), so I will try to join the #bioruby channel this afternoon (CET). If someone else is there, we could discuss this plugin and new ideas. > PS: Please clone and add your name to the list of the authors if you want to > join into this project. Done! I'm in! -- Francesco From mail at michaelbarton.me.uk Thu Jan 20 15:57:26 2011 From: mail at michaelbarton.me.uk (Michael Barton) Date: Thu, 20 Jan 2011 15:57:26 -0500 Subject: [BioRuby] Rake In-Reply-To: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch> References: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch> Message-ID: <20110120205726.GD245@Michael-Bartons-MacBook.local> Yannick, you make an excellent point about the long-term stability of Boson. The ruby community, myself often guilty of this, is quick to jump on a new gem, which may or may not last into the future. An example of this is the Less gem for compiling CSS, which has seen some recent popularity. I believe the developer has said he will no longer maintain it. Another option could be Thor. I believe this is also aimed at being a more modular rake-like tool. This is developed by Yehuda Katz and I think it is used as the basis of a few mainstream ruby command line tools (possibly the rails3 CLI? I'm not 100% sure about this.).
I think you could expect Thor to be more mature and likely to be continually developed. If can find the episode of the ChangeLog with Yehuda you can hear him discuss it. On Thu, Jan 20, 2011 at 11:08:55AM +0700, Yannick Wurm wrote: > Hello & thanks for the comments. > > Matt wrote: > > Swappable/configurable steps might be > > > > Pre-process (trim / quality filters?) Alignment (align) Post alignment > > (gblocks) Translation (to Nexus) Analysis (Phyml) > > > > The idea is that we could swap in components (TNT or RaXML for Phyml, > > Muscle for MAFFT etc.)- but also that the pipeline remains "simple". > > Yes, thats what I would ideally want. (as well as being able to easily modify > the run options of the programs). How would you go about generalizing this? > Right now I'm basing "what do to" on the file extensions I provide... which > limits me based on the file extensions... > > > > Michael wrote: > > I think it's a great idea to generate predefined pipelines for common > > bioinformatics tasks. I experimented with a tool called Boson six months > > ago. It could be worth looking if you feel like investing more time into > > your pipeline. > > > > Boson commands, similar to rake tasks, are more modular and can be > > installed from the web into a ~/.boson directory. This has obvious > > advantages over a single rake file. Boson tasks can be chained together > > where the data is passed around in YAML format. > > > > The github link is - https://github.com/cldwalker/boson > > I haven't looked thoroughly now, but at least superficially, Boson looks real > cool. > > However, I'm a bit scared of investing energy into technologies that are too > new. Boson has only one developer who may or may not keep his project alive > over the next years. Time I invest in learning something today ... must > continue to help improve my productivity over the next 5 or 10 years by still > being reusable in 5 or 10 years (with as few modifications as possible). 
> There is uncertainty to everything, but something like Boson does seems a bit > too risky right now... > > Cheers, > > yannick > > > ------------------------- Ant Genomes & Evolution http://yannick.poulet.org > skype://yannickwurm > > > > > _______________________________________________ BioRuby Project > - http://www.bioruby.org/ BioRuby mailing list BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From yannick.wurm at unil.ch Fri Jan 21 00:24:22 2011 From: yannick.wurm at unil.ch (Yannick Wurm) Date: Fri, 21 Jan 2011 12:24:22 +0700 Subject: [BioRuby] Rake In-Reply-To: <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it> References: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it> Message-ID: <2E930098-FD2E-4D3B-AC61-1B54D3653DE7@unil.ch> Ciao Raoul, mi dispiace, I was away from the computer during most of the irc thing. On 20 Jan 2011, at 16:13, Raoul Bonnal wrote: > Dear Yanninck, > rake usually is used inside a project directory to provide common operations to the project. > Which is the idea behind "rake for bioinformatics" ? I mean, you have to copy you rake file where your data are. > > Looking at the code there are a lot of dependencies from command line softwares and that could be a problem in maintainability That is true. My main worry here (and for most things) is rapidly getting a biological result. I'm still working on finding the optimal balance between quick hack and maintainability/reusability. Migrating from shell scripts to ruby hacks does probably save me some time because in ruby it's really simple to put in a few verifications by raising Errors if a tool I need isn't in the $PATH or if an input/output file is empty. Those mean that debugging and fixing is much faster if I decide to run things on the linux server instead of the macbook, or in 2 years time after a reinstall. > What about spend some energy on wrapping that commands into BioRuby classes? 
> In that way those application could be available to other scripts. I have two answers. - right now I cannot dedicate the time required to learn how to do that well. I need understand how ants work first :) (If I were developping a big uniprot-type web application that needs to be robust for users, making wrappers may be defendable.... for one-off hacks its not) - call me conservative, but I'm also generally scared of wrappers. First, I want to have the raw input & output files that the programs use, because I may need to read or edit or rerun them in the future... I know I'll be able to read a raw text file. Thus I've never used bioruby's wrappers for blast or codeml or multiple sequence alignment (However, I have recently discovered the amazingly timesaving Bio::Tree however -wow). Second, programs are constantly changing... and thus wrappers must too - they're a ton of work to maintain and -like the Boson thing- there is no guarantee that that will be done. > If you want to keep the rake approach we should find a way to not replicate rakefiles. > One idea could be to create a rakefile in your working directory, similar to Rails: > > # Add your own tasks in files placed in lib/tasks ending in .rake, > # for example lib/tasks/capistrano.rake, and they will automatically be available to Rake. > > require File.expand_path('../config/application', __FILE__) > require 'rake' > > #The user needs just to add the tasks he wants: > Bio::SomeName.load_tasks > Bio::SomeOtherName.load_tasks > Bio::AnotherName.load_tasks That sounds like a really cool approach. 
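For readers following the thread, the quoted load_tasks idea could be sketched roughly as below. This is only a guess at what was meant: the BioTasks::Alignment module, the tool_available? helper and the file names proteins.fasta / proteins.aln are all hypothetical and do not exist in BioRuby. It also folds in the "raise an error if a tool is not in the $PATH" check mentioned earlier in this message.

```ruby
require 'rake'

# Hypothetical namespace; nothing here is an existing BioRuby API.
module BioTasks
  module Alignment
    extend Rake::DSL # makes task/namespace/desc/sh usable inside this module

    # True if an executable called +name+ is found in one of the $PATH dirs.
    def self.tool_available?(name)
      ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).any? do |dir|
        File.executable?(File.join(dir, name))
      end
    end

    # Registers the tasks; meant to be called once from a Rakefile.
    def self.load_tasks
      namespace :bio do
        desc "Align proteins.fasta with MAFFT (assumes mafft is in $PATH)"
        task :align do
          unless BioTasks::Alignment.tool_available?("mafft")
            raise "mafft not found in $PATH"
          end
          sh "mafft proteins.fasta > proteins.aln"
        end
      end
    end
  end
end

# In the Rakefile of a working directory, the user would then only write:
BioTasks::Alignment.load_tasks
```

With something like this packaged as a gem, "rake bio:align" would work in any data directory without copying the task definitions around, which seems to be the point of the suggestion.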
I want to hear more :) ------------------------- Ant Genomes & Evolution http://yannick.poulet.org skype://yannickwurm From francesco.strozzi at gmail.com Fri Jan 21 04:41:47 2011 From: francesco.strozzi at gmail.com (Francesco Strozzi) Date: Fri, 21 Jan 2011 10:41:47 +0100 Subject: [BioRuby] BIO-NGS (and Rake/Thor for bioinformatics) Message-ID: Hi all, in the yesterday IRC chat (http://bioruby.org/irc/?date=2011-01) we discussed about the bio-ngs plugin that Raoul wrote in a previous email. Here is the Wiki page on BioRuby describing the general idea for this plugin: http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing We want to use wrappers and/or bindings to existing tools like MAQ,BWA,SAMtools and we want to use Rake or Thor to provide custom tasks and let the user run NGS analysis. We would like to include also the possibility to create reports using statsample and rubyvis. Maybe some aspects are still a bit unclear at the moment (I think we need to define some sort of guidelines), but I hope we could come up with a useful (let me use this term) "framework" to run bioinformatics NGS analyses with Ruby. Any comment/help/feedback/suggestion is more than welcome! Cheers -- Francesco From mictadlo at gmail.com Tue Jan 25 19:41:11 2011 From: mictadlo at gmail.com (Michal) Date: Wed, 26 Jan 2011 10:41:11 +1000 Subject: [BioRuby] marshal data too short Message-ID: <4D3F6DA7.8050101@gmail.com> Hi, I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way $ tar xvfz ruby-1.9.2-p136.tar.gz $ cd ruby-1.9.2-p136/ $ ./configure --prefix=/home/mictadlo/apps/ruby $ make $ make install $ vim ~/.bashrc export APPS=/home/mictadlo/apps export RUBY_HOME=$APPS/ruby export LD_LIBRARY_PATH=/RUBY_HOME/lib PATH=$RUBY_HOME/bin:$PATH $ . ~/.bashrc $ ruby -v ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux] $ tar xvfz bioruby-1.4.1.tar.gz $ cd bioruby-1.4.1/ $ ruby setup.rb $ bioruby Loading config (/home/mitlox/.bioruby/shell/session/config) ... 
done Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short done . . . B i o R u b y i n t h e s h e l l . . . Version : BioRuby 1.4.1 / Ruby 1.9.2 bioruby> exit How can I fix the error in BioRuby? Thank you in advance. Michal From bonnalraoul at ingm.it Wed Jan 26 10:23:03 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Wed, 26 Jan 2011 16:23:03 +0100 Subject: [BioRuby] IRC meeting Message-ID: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it> As usual, tomorrow the IRC meeting. -- R.J.P.B. From ktym at hgc.jp Wed Jan 26 11:09:02 2011 From: ktym at hgc.jp (Toshiaki Katayama) Date: Thu, 27 Jan 2011 01:09:02 +0900 Subject: [BioRuby] marshal data too short In-Reply-To: <4D3F6DA7.8050101@gmail.com> References: <4D3F6DA7.8050101@gmail.com> Message-ID: Hi Michal, Could you give me some additional information? % ls -l ~/.bioruby/shell/session/object -rw-r--r-- 1 ktym staff 17401 1 19 13:09 /Users/ktym/.bioruby/shell/session/object % ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object [4, 8] Have you ever used the bioruby shell with the old version of Ruby before? If your file is not corrupted, this might be due to the backward incompatibility of the Marshal file format (if so, does anyone know whether there are any workaround to convert old marshal data into 1.9's?). 
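The two checks above (file size and the first two bytes) can be combined in a few lines of Ruby. This is a minimal sketch, not part of the BioRuby shell: the method name marshal_file_status is hypothetical. It relies only on the Marshal::MAJOR_VERSION and Marshal::MINOR_VERSION constants of the running interpreter and on the documented rule that Marshal can load data whose major version is equal and whose minor version is equal or lower.

```ruby
# Hypothetical diagnostic helper (not part of BioRuby).
# Returns :missing, :empty, :truncated, :incompatible or :ok.
def marshal_file_status(path)
  return :missing unless File.exist?(path)
  return :empty if File.zero?(path)

  # A Marshal stream starts with two bytes: major and minor format version.
  header = File.binread(path, 2)
  return :truncated if header.nil? || header.bytesize < 2

  major, minor = header.unpack("CC")
  if major == Marshal::MAJOR_VERSION && minor <= Marshal::MINOR_VERSION
    :ok
  else
    :incompatible
  end
end

if ARGV[0]
  puts "#{ARGV[0]}: #{marshal_file_status(ARGV[0])}"
end
```

Saved as a script and run with the path to ~/.bioruby/shell/session/object as its argument, it separates an empty or truncated file from a genuine format-version mismatch.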
Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully restore the saved objects: % ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object Toshiaki On 2011/01/26, at 9:41, Michal wrote: > Hi, > I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way > > $ tar xvfz ruby-1.9.2-p136.tar.gz > $ cd ruby-1.9.2-p136/ > $ ./configure --prefix=/home/mictadlo/apps/ruby > $ make > $ make install > $ vim ~/.bashrc > export APPS=/home/mictadlo/apps > export RUBY_HOME=$APPS/ruby > export LD_LIBRARY_PATH=/RUBY_HOME/lib > PATH=$RUBY_HOME/bin:$PATH > $ . ~/.bashrc > $ ruby -v > ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux] > > $ tar xvfz bioruby-1.4.1.tar.gz > $ cd bioruby-1.4.1/ > $ ruby setup.rb > $ bioruby > Loading config (/home/mitlox/.bioruby/shell/session/config) ... done > Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short > done > > . . . B i o R u b y i n t h e s h e l l . . . > > Version : BioRuby 1.4.1 / Ruby 1.9.2 > > bioruby> exit > > How can I fix the error in BioRuby? > > Thank you in advance. > > Michal > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From ktym at hgc.jp Wed Jan 26 11:46:23 2011 From: ktym at hgc.jp (Toshiaki Katayama) Date: Thu, 27 Jan 2011 01:46:23 +0900 Subject: [BioRuby] IRC meeting In-Reply-To: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it> References: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it> Message-ID: Raoul, On 2011/01/27, at 0:23, Raoul Bonnal wrote: > As usual, tomorrow the IRC meeting. > > -- > R.J.P.B. Thank you for the reminder! The next will be our 6th IRC meeting. In the 3rd (Jan 6) and 4th (Jan 13) meeting, we discussed about the logging system. 
As a result, Pjotr released the bio-logger plugin and his bio-gff3 plugin became the first use case of the logger (he posted announcements to this list on Jan 17th). We talked about a plan to develop NGS-related plugins at the 5th (Jan 20) meeting. As posted by Francesco on Jan 21th, he contributed a Wiki page summarizing the idea: http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing As for the weekly BioRuby IRC meeting, please see http://bioruby.open-bio.org/wiki/BioRuby_IRC_conference Thanks, Toshiaki From bonnalraoul at ingm.it Wed Jan 26 14:10:26 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Wed, 26 Jan 2011 20:10:26 +0100 Subject: [BioRuby] IRC meeting In-Reply-To: Message-ID: <20110126191026.e71c169e@mail.ingm.it> Hi all, I have updated the page http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing I'll try to keep you up to date about samtools from ml and that page. I can't remember who is involved in the workflows, tomorrow we'll fix the page with the rigth names. _____ From: Toshiaki Katayama [mailto:ktym at hgc.jp] To: Raoul Bonnal [mailto:bonnalraoul at ingm.it] Cc: BioRuby ML [mailto:bioruby at lists.open-bio.org] Sent: Wed, 26 Jan 2011 17:46:23 +0100 Subject: Re: [BioRuby] IRC meeting Raoul, On 2011/01/27, at 0:23, Raoul Bonnal wrote: > As usual, tomorrow the IRC meeting. > > -- > R.J.P.B. Thank you for the reminder! The next will be our 6th IRC meeting. In the 3rd (Jan 6) and 4th (Jan 13) meeting, we discussed about the logging system. As a result, Pjotr released the bio-logger plugin and his bio-gff3 plugin became the first use case of the logger (he posted announcements to this list on Jan 17th). We talked about a plan to develop NGS-related plugins at the 5th (Jan 20) meeting. 
As posted by Francesco on Jan 21th, he contributed a Wiki page summarizing the idea: http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing As for the weekly BioRuby IRC meeting, please see http://bioruby.open-bio.org/wiki/BioRuby_IRC_conference Thanks, Toshiaki From mictadlo at gmail.com Fri Jan 28 07:18:30 2011 From: mictadlo at gmail.com (Michal) Date: Fri, 28 Jan 2011 22:18:30 +1000 Subject: [BioRuby] marshal data too short In-Reply-To: References: <4D3F6DA7.8050101@gmail.com> Message-ID: <4D42B416.8010503@gmail.com> Hi Toshiaki, Ruby was not installed on my system before; I just installed the latest version in my home directory: $ ls -l ~/.bioruby/shell/session/object -rw-r--r-- 1 mictadlo mictadlo 0 2011-01-25 19:40 /home/mictadlo/.bioruby/shell/session/object $ ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object [nil, nil] $ ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object -e:1:in `load': marshal data too short (ArgumentError) from -e:1:in `<main>' Do you need any other information? Thank you in advance. Michal On 01/27/2011 02:09 AM, Toshiaki Katayama wrote: > Hi Michal, > > Could you give me some additional information? > > % ls -l ~/.bioruby/shell/session/object > -rw-r--r-- 1 ktym staff 17401 1 19 13:09 /Users/ktym/.bioruby/shell/session/object > > % ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object > [4, 8] > > Have you ever used the bioruby shell with the old version of Ruby before? > > If your file is not corrupted, this might be due to the backward > incompatibility of the Marshal file format (if so, does anyone know > whether there are any workaround to convert old marshal data into 1.9's?). > > Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully > restore the saved objects: > > % ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object > > Toshiaki > > > On 2011/01/26, at 9:41, Michal wrote: > >> Hi, >> I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way >> >> $ tar xvfz ruby-1.9.2-p136.tar.gz >> $ cd ruby-1.9.2-p136/ >> $ ./configure --prefix=/home/mictadlo/apps/ruby >> $ make >> $ make install >> $ vim ~/.bashrc >> export APPS=/home/mictadlo/apps >> export RUBY_HOME=$APPS/ruby >> export LD_LIBRARY_PATH=/RUBY_HOME/lib >> PATH=$RUBY_HOME/bin:$PATH >> $ . ~/.bashrc >> $ ruby -v >> ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux] >> >> $ tar xvfz bioruby-1.4.1.tar.gz >> $ cd bioruby-1.4.1/ >> $ ruby setup.rb >> $ bioruby >> Loading config (/home/mitlox/.bioruby/shell/session/config) ... done >> Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short >> done >> >> . . . B i o R u b y i n t h e s h e l l . . . >> >> Version : BioRuby 1.4.1 / Ruby 1.9.2 >> >> bioruby> exit >> >> How can I fix the error in BioRuby? >> >> Thank you in advance.
>> >> Michal >> >> _______________________________________________ >> BioRuby Project - http://www.bioruby.org/ >> BioRuby mailing list >> BioRuby at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioruby > From ktym at hgc.jp Sat Jan 29 07:18:04 2011 From: ktym at hgc.jp (Toshiaki Katayama) Date: Sat, 29 Jan 2011 21:18:04 +0900 Subject: [BioRuby] marshal data too short In-Reply-To: <4D42B416.8010503@gmail.com> References: <4D3F6DA7.8050101@gmail.com> <4D42B416.8010503@gmail.com> Message-ID: <8DFDDEA3-9B1D-44DC-BCB6-DCBA2C06BAF9@hgc.jp> Hi Michal, When I remove the ~/.bioruby directory, I could reproduce the same error with Ruby 1.9.2. The ~/.bioruby/shell/session/object file was empty because BioRuby shell failed to save the file. Saving object (/Users/ktym/.bioruby/shell/session/object) ... Error: Failed to save (/Users/ktym/.bioruby/shell/session/object) : can't convert Symbol into String I'll try to fix this. Toshiaki On 2011/01/28, at 21:18, Michal wrote: > Hi Toshiaki, > On my system was not Ruby installed before and I just installed the latest version in my home directory: > $ ls -l ~/.bioruby/shell/session/object > -rw-r--r-- 1 mictadlo mictadlo 0 2011-01-25 19:40 /home/mictadlo/.bioruby/shell/session/object > $ ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object > [nil, nil] > $ ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object > -e:1:in `load': marshal data too short (ArgumentError) > from -e:1:in `<main>' > > Do you need another information? > > Thank you in advance. > > Michal > > > On 01/27/2011 02:09 AM, Toshiaki Katayama wrote: >> Hi Michal, >> >> Could you give me some additional information? >> >> % ls -l ~/.bioruby/shell/session/object >> -rw-r--r-- 1 ktym staff 17401 1 19 13:09 /Users/ktym/.bioruby/shell/session/object >> >> % ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object >> [4, 8] >> >> Have you ever used the bioruby shell with the old version of Ruby before? >> >> If your file is not corrupted, this might be due to the backward >> incompatibility of the Marshal file format (if so, does anyone know >> whether there are any workaround to convert old marshal data into 1.9's?). >> >> Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully >> restore the saved objects: >> >> % ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object >> >> Toshiaki >> >> >> On 2011/01/26, at 9:41, Michal wrote: >> >>> Hi, >>> I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way >>> >>> $ tar xvfz ruby-1.9.2-p136.tar.gz >>> $ cd ruby-1.9.2-p136/ >>> $ ./configure --prefix=/home/mictadlo/apps/ruby >>> $ make >>> $ make install >>> $ vim ~/.bashrc >>> export APPS=/home/mictadlo/apps >>> export RUBY_HOME=$APPS/ruby >>> export LD_LIBRARY_PATH=/RUBY_HOME/lib >>> PATH=$RUBY_HOME/bin:$PATH >>> $ . ~/.bashrc >>> $ ruby -v >>> ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux] >>> >>> $ tar xvfz bioruby-1.4.1.tar.gz >>> $ cd bioruby-1.4.1/ >>> $ ruby setup.rb >>> $ bioruby >>> Loading config (/home/mitlox/.bioruby/shell/session/config) ... done >>> Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short >>> done >>> >>> . . . B i o R u b y i n t h e s h e l l . . . >>> >>> Version : BioRuby 1.4.1 / Ruby 1.9.2 >>> >>> bioruby> exit >>> >>> How can I fix the error in BioRuby?
>>> >>> Thank you in advance. >>> >>> Michal >>> >>> _______________________________________________ >>> BioRuby Project - http://www.bioruby.org/ >>> BioRuby mailing list >>> BioRuby at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioruby >> > From mictadlo at gmail.com Sun Jan 30 06:42:09 2011 From: mictadlo at gmail.com (Michal) Date: Sun, 30 Jan 2011 21:42:09 +1000 Subject: [BioRuby] samtools-ruby Message-ID: <4D454E91.1080604@gmail.com> Hi, I have tried to install samtools-ruby on ruby 1.9.2, but I have failed. I have already posted this problem on https://github.com/homonecloco/samtools-ruby/issues#issue/3 , but I have not got any response. What did I wrong? Michal From bonnalraoul at ingm.it Mon Jan 31 05:11:46 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Mon, 31 Jan 2011 11:11:46 +0100 Subject: [BioRuby] samtools-ruby In-Reply-To: <4D454E91.1080604@gmail.com> References: <4D454E91.1080604@gmail.com> Message-ID: Dear Michal, please check this out: https://github.com/helios/bioruby-samtools This is the inital port of samtools-ruby as plugin. It comes with library for osx and linux, no windows. I need to test the linux library because I'm developing under osx. If the libbam.a is wrong please give me the right one and I'll add it to the repo. Also note that the library has been compiled for 64bit. Ciao! On 30/gen/2011, at 12.42, Michal wrote: > Hi, > I have tried to install samtools-ruby on ruby 1.9.2, but I have failed. I have already posted this problem on https://github.com/homonecloco/samtools-ruby/issues#issue/3 , but I have not got any response. > > What did I wrong? > > Michal > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby -- R.J.P.B. 
From bonnalraoul at ingm.it Mon Jan 31 05:27:51 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Mon, 31 Jan 2011 11:27:51 +0100 Subject: [BioRuby] BioGem and Rails Message-ID: <1CCE39E6-F232-44C8-B95D-3C620443EF5C@ingm.it> Dear All, I've created a new branch in biogem. https://github.com/helios/bioruby-gem/tree/rails_engine It adds an option to the biogem script for creating a rails engine with your gem, ONLY Rails3 !!! The idea is: develop a gem that can be used in a script and extend it to be integrated in a rails project. Which libraries can benefit from this approach? I think databases, parsers, or any data that you want to expose to a rails application. It's at a very early stage, so don't use it yet; this message is just to let you know that we are adding new features. From the help: --with-engine create a Rails engine with the namespace give in input. Set default database creation Note: It is not yet possible to add the engine to an old gem; I need to fix that and implement the generator to accomplish this task. Any input is welcome. Ciao. -- R.J.P.B. From jan.aerts at gmail.com Mon Jan 31 10:07:39 2011 From: jan.aerts at gmail.com (Jan Aerts) Date: Mon, 31 Jan 2011 16:07:39 +0100 Subject: [BioRuby] ruby-ensembl-api paper accepted by Bioinformatics Message-ID: All, FYI: There is now a Bioinformatics paper that describes the Ruby API to the Ensembl databases. Thanks to Francesco Strozzi for working on this with me. You can find it here: http://bit.ly/fzQamR At this moment this API covers the core and variation databases. If anyone is interested in working on the API for compara or functional, please let me know. Kind regards, jan. From bonnalraoul at ingm.it Mon Jan 31 10:22:12 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Mon, 31 Jan 2011 16:22:12 +0100 Subject: [BioRuby] ruby-ensembl-api paper accepted by Bioinformatics In-Reply-To: References: Message-ID: <193864A0-D798-4737-83CE-7A7932E4552C@ingm.it> well done!
On 31/gen/2011, at 16.07, Jan Aerts wrote: > All, > > FYI: There is now a Bioinformatics paper that describes the Ruby API to the > Ensembl databases. Thanks to Francesco Strozzi for working on this with me. > You can find it here: http://bit.ly/fzQamR > > At this moment this API covers the core and variation databases. If anyone > is interested in working on the API for compara or functional, please let me > know. > > Kind regards, > jan. > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby -- R.J.P.B. From missy at be.to Sun Jan 2 00:54:08 2011 From: missy at be.to (MISHIMA, Hiroyuki) Date: Sun, 02 Jan 2011 09:54:08 +0900 Subject: [BioRuby] Workflows: NGS + miRNA (Re: Workflows and Parallelization) In-Reply-To: <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it> References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it> <4D10B9AE.2010206@be.to> <5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it> <4D12A1CE.4040702@be.to> <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it> Message-ID: <4D1FCCB0.2000303@be.to> Dear Raoul and the BioRuby list, My workflow for miRNA analysis using Illumina GAii is like the followings: 1) Read alignment using Novoalign. ( http://www.novocraft.com/ ). It is a proprietary software, but its binary is free for academic use with several restrictions. The advantage of Novoalign is the function to remove adapter sequences from each read. Adapter clipping is indispensable for miRNA analyses because target molecules are always shorter than read length. 1b) You may be able to use BWA/MAQ instead. Adopter clipping tool such as Cutadapt ( http://cutadapt.googlecode.com/ ) is available. 2) To find miRBASE-registered miRNAs, I used miRExpress ( http://mirexpress.mbc.nctu.edu.tw/ , Wang et al, BMC Bioinform 10, p328, 2009. http://www.biomedcentral.com/1471-2105/10/328 ) 2b) Data analysis. I plotted heatmaps using R. 
See Ruby et al. (Genome Res, 17, p1850, 2007. http://genome.cshlp.org/content/17/12/1850.long ). 3) To find potentially novel miRNA, I used miRTRAP (http://flybuzz.berkeley.edu/miRTRAP.html (Hendrix et al., Genome Biolo, 11, pR39, 2010. http://genomebiology.com/2010/11/4/R39 ). The workflow may have to be updated. Hopefully, it will help you. Thanks, Hiro. Raoul Bonnal wrote (2010/12/23 18:47): > Actually the focus of my institute is mainly on mirna, so I'm also > interested on techniques for analyzing NGS(illumina) and microRNA. -- MISHIMA, Hiroyuki, DDS, Ph.D. COE Research Fellow Department of Human Genetics Nagasaki University Graduate School of Biomedical Sciences From missy at be.to Sun Jan 2 01:38:57 2011 From: missy at be.to (MISHIMA, Hiroyuki) Date: Sun, 02 Jan 2011 10:38:57 +0900 Subject: [BioRuby] Workflows: NGS + miRNA In-Reply-To: <4D1FCCB0.2000303@be.to> References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it> <4D10B9AE.2010206@be.to> <5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it> <4D12A1CE.4040702@be.to> <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it> <4D1FCCB0.2000303@be.to> Message-ID: <4D1FD731.4070309@be.to> Hi all, Addition to my workflow. Only miRTRAP requires read alignment generated by Novoalign. Inputs for miRExpress are fastq files and miRExpress clips adapters from fastq files. miRExpress is easy and fast. This one is good for first try. During using miRExpress, you may find 5'-end variations in mature miRNA reads. These prevent accurate alignment. These may be not artifacts. See Wu et al. PLoS One, 4, p.e7566, http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0007566 ). Clipping 5'-end variations increase alignment-hits. MISHIMA, Hiroyuki wrote (2011/01/02 9:54): > Dear Raoul and the BioRuby list, > > My workflow for miRNA analysis using Illumina GAii is like the followings: -- MISHIMA, Hiroyuki, DDS, Ph.D. 
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences

From pjotr.public14 at thebird.nl Sun Jan 2 12:04:48 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 2 Jan 2011 13:04:48 +0100
Subject: [BioRuby] GFF3
Message-ID: <20110102120448.GA23804@thebird.nl>

The GFF3 plugin works rather well. Anyone who has ruby 1.9.x on his
system can just type as a user:

  gem install bio-gff3

and even bioruby itself gets installed, if needed. Next you can type,
for example

  gff3-fetch mRNA test/data/gff/MhA1_Contig1133.fa test/data/gff/MhA1_Contig1133.gff3

to assemble all mRNA.

Unfortunately I am finding some problems with data. For example the
reading frame is *wrong* in this wormbase data file (predicted gene).
The contig starts as:

>MhA1_Contig3426
TTAATAAATTTAATTCATTAAAATTTTAAAAAGAAAGGGACATTCGAGGGGAAATGAGAGAGAACGAGAGAAAATGGACG
GGAAATTAAATTAAAAAATAAAAAATTAATTTTTATTTTTTTTTATTTAATTTAAAATTAATTTTCTACATTTATTAAAT
CTTAAATTATTAATTTTAAATTAATTTAAAG GCATCCAACAACAACAATTAGAAGTCTTTCCCAGCTCCTCCTCTGCCCC
TCAGCAACAACAATACCCAGCGCAGCAGCTTCAATTAGTTACTCCTTTTATTGCATGCATAGCAGATGAATTGAGGGAGT
TGATAGATGAAATGCGTATGTTTTAG AATATTTTTTAAAAAAAAATTAAAAAAAATTTTTTTTTGCCAAACAGGCTCTCG

and the full record is:

##gff-version 3
##sequence-region MhA1_Contig3426 1 2029
# Gene gene:MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase gene 192 346 . + . ID=gene:MhA1_Contig3426.frz3.gene1;Name=MhA1_Contig3426.frz3.gene1;Note=PREDICTED protein_coding;public_name=MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase mRNA 192 346 . + . ID=transcript:MhA1_Contig3426.frz3.gene1;Parent=gene:MhA1_Contig3426.frz3.gene1;Name=MhA1_Contig3426.frz3.gene1;public_name=MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase exon 192 346 . + . ID=exon:MhA1_Contig3426.frz3.gene1.1;Parent=transcript:MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase CDS 192 346 . + 0 ID=cds:MhA1_Contig3426.frz3.gene1;Parent=transcript:MhA1_Contig3426.frz3.gene1

So, forward reading frame start at 192 and CDS phase 0. The actual
sequence is

GCATCCAACA ACAACAATTA GAAGTCTTTC CCAGCTCCTC CTCTGCCCCT
CAGCAACAAC AATACCCAGC GCAGCAGCTT CAATTAGTTA CTCCTTTTAT
TGCATGCATA GCAGATGAAT TGAGGGAGTT GATAGATGAA ATGCGTATGT
TTTAG

which translates to a valid protein only in frame 2(!). This is not
compliant with GFF3 in any interpretation.

Turns out for this particular GFF3 file this is the case only with the
*first* ORF on every contig, and probably a bug of the gene predictor
used. None of the other genes is in the wrong frame. I have informed
Wormbase some time ago, but I don't have the impression that anyone is
interested. You can validate its contents at

  http://www.wormbase.org/db/gb2/gbrowse/m_hapla/?name=id:2258995;dbid=m_hapla:database

I am going to add an option to the GFF3 plugin to test for valid
reading frames, so these files give the expected results. Be good for
validation anyway.

Pj.

From pjotr.public14 at thebird.nl Sun Jan 2 18:49:58 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 2 Jan 2011 19:49:58 +0100
Subject: [BioRuby] BioRuby and log4r
Message-ID: <20110102184958.GA25699@thebird.nl>

I propose we start using

  http://log4r.rubyforge.org/manual.html

which has the standard logging features one would expect. I
particularly like the lazy evaluation (deferred block).

What it does fall short on, as well as most other loggers, is usage
use cases. A logger has to behave differently when a tool is used by:

  - developer: fail early and often (on warnings!)
  - user: fail on normal error
  - library: fail on serious error
  - web server: fail on serious error
  - fault tolerant system: never fail, try to resume

Essentially, I see three or four error handlers.

We can create a default logger for BioRuby = user

But I like to have more options.
It would be nice to have several levels within 'info', 'warn' or 'error', to be displayed/logged on user needs. Also, with the plugins we should have standardized switches for CLI utilities. Are we interested in making this core BioRuby, or should I incorporate it as a bio-plugin? I am thinking of writing a front-end of log4r. Pj. From bonnalraoul at ingm.it Mon Jan 3 12:14:44 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Mon, 3 Jan 2011 13:14:44 +0100 Subject: [BioRuby] Workflows: NGS + miRNA In-Reply-To: <4D1FD731.4070309@be.to> References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it> <4D10B9AE.2010206@be.to> <5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it> <4D12A1CE.4040702@be.to> <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it> <4D1FCCB0.2000303@be.to> <4D1FD731.4070309@be.to> Message-ID: <3D720507-34D5-4A3C-9F2C-A54CB9556E3D@ingm.it> Thank you very much, I'll read all the refs. On 02/gen/2011, at 02.38, MISHIMA, Hiroyuki wrote: > Hi all, > > Addition to my workflow. > > Only miRTRAP requires read alignment generated by Novoalign. Inputs for miRExpress are fastq files and miRExpress clips adapters from fastq files. > > miRExpress is easy and fast. This one is good for first try. > > During using miRExpress, you may find 5'-end variations in mature miRNA reads. These prevent accurate alignment. These may be not artifacts. See Wu et al. PLoS One, 4, p.e7566, http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0007566 ). Clipping 5'-end variations increase alignment-hits. > > MISHIMA, Hiroyuki wrote (2011/01/02 9:54): >> Dear Raoul and the BioRuby list, >> >> My workflow for miRNA analysis using Illumina GAii is like the followings: > > -- > MISHIMA, Hiroyuki, DDS, Ph.D. 
> COE Research Fellow
> Department of Human Genetics
> Nagasaki University Graduate School of Biomedical Sciences
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.

From pjotr.public14 at thebird.nl Fri Jan 7 08:52:21 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 7 Jan 2011 09:52:21 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110102184958.GA25699@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
Message-ID: <20110107085221.GA14735@thebird.nl>

I am creating a plugin 'bio-logger' for sane handling of errors and
exceptions in different situations (log-act):

  * Normal user
  * Developer
  * Web server
  * Fault-tolerant systems

One example: a program logs a warning to stdout when run as a user,
but raises an exception when run as a developer.

bio-logger builds on log4r functionality, using a more fine-grained
approach for logging errors. I.e. within 'debug', 'info', 'warn',
'error' an additional value 1..10 can be set to limit output and
logging.

When a program, e.g. gff3-fetch, supports bio-logger switches, the
following is possible:

  --logger stderr             Add stderr logger (default is stdout)
  --logger filen              Add filename logger
  --trace debug               Show all messages
  --trace warn                Show messages more serious than 'warn'
  --trace warn:3              Show messages more serious than 'warn' level 3

module overrides:

  --trace gff3:info:5         Override level for 'gff3' to info level 5
  --trace blast:debug         Override level for 'blast'
  --trace blast,gff3:debug    Override level for 'blast' and 'gff3'
  --trace stderr:blast:debug  Override level for 'blast' on stderr

Also behaviour can be changed. This normally happens through library
calls. There is one command line switch, which changes log-act:

  --log-act Developer         Modify the logger for development

log4r supports rotating logs and remote logging, both of which will be
available.

Any comments?
Pj. On Sun, Jan 02, 2011 at 07:49:58PM +0100, Pjotr Prins wrote: > http://log4r.rubyforge.org/manual.html From pjotr.public14 at thebird.nl Fri Jan 7 15:01:47 2011 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Fri, 7 Jan 2011 16:01:47 +0100 Subject: [BioRuby] BioRuby and log4r In-Reply-To: <20110107085221.GA14735@thebird.nl> References: <20110102184958.GA25699@thebird.nl> <20110107085221.GA14735@thebird.nl> Message-ID: <20110107150147.GA16116@thebird.nl> bio-logger created. YABP (yet another BioRuby plugin). https://github.com/pjotrp/bioruby-logger-plugin Finally the logger I always wanted to have... Pj. From pjotr.public14 at thebird.nl Sat Jan 8 13:06:12 2011 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Sat, 8 Jan 2011 14:06:12 +0100 Subject: [BioRuby] BioRuby and log4r In-Reply-To: <20110107150147.GA16116@thebird.nl> References: <20110102184958.GA25699@thebird.nl> <20110107085221.GA14735@thebird.nl> <20110107150147.GA16116@thebird.nl> Message-ID: <20110108130612.GA19929@thebird.nl> If anyone is interested, the bio-logger plugin is fully functional (I am using it in the GFF3 plugin): This is a plugin for nailing down problems with big data parsers, common in bioinformatics, and sane handling of errors and exceptions in different situations. In Bioinformatics the following is a common scenario when dealing with parsers: Large data files sometimes contain errors. As a user you want to continue and hope for the best (logging the error). As a developer you want to see how you can fix the problem. Waiting for a full run and checking the logs is tedious. The logger can be helpful here, and avoids sticking temporary solutions in code. Read on... https://github.com/pjotrp/bioruby-logger-plugin I think we should use this throughout BioRuby to get consistent error handling and logging. No more $stderr.print statements. Pj. 
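
The log-act idea described in this thread (the same parser problem is
logged for a user but raised for a developer) can be sketched roughly
as follows. This is only an illustration built on Ruby's standard
Logger class; it is not the bio-logger API. The names LogActSketch,
ParserWarning and problem, and the :user / :developer symbols, are
hypothetical and were chosen for this example.

```ruby
require 'logger'
require 'stringio'

# Hypothetical error type for recoverable parser problems.
class ParserWarning < StandardError; end

# Hypothetical wrapper (not bio-logger): the same call either logs or
# raises, depending on who is running the tool.
class LogActSketch
  def initialize(act, io = $stdout)
    @act    = act              # :user or :developer
    @logger = Logger.new(io)   # standard library logger
  end

  # Report a recoverable problem found while parsing.
  def problem(msg)
    raise ParserWarning, msg if @act == :developer  # fail early
    @logger.warn(msg)                               # log and carry on
  end
end

# As a user: the problem is logged and the run can continue.
user_out = StringIO.new
LogActSketch.new(:user, user_out).problem('CDS phase looks wrong')

# As a developer: the same problem stops the run immediately.
developer_raised =
  begin
    LogActSketch.new(:developer, StringIO.new).problem('CDS phase looks wrong')
    false
  rescue ParserWarning
    true
  end
```

The point of keeping the decision in one place is that parser code only
reports the problem; whether that means "log" or "raise" is chosen once,
when the tool starts.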
From bonnalraoul at ingm.it Mon Jan 10 22:06:48 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Mon, 10 Jan 2011 23:06:48 +0100 Subject: [BioRuby] biogem and options Message-ID: Hi all, I have updated the github repo with some requests from Pjotr. Now is possible to create bin, db and test/data directory if needed from the command line biogem --with-bin --with-bd --with-test-data youprojectname NOTE 1: older 'data' directory is now 'db' more compliant with rails. Why ? data cames from R packages but we are used to store database or look for databases in db directory. NOTE 2: README updated. about rspec and cucumber jeweler already has those options. type 'biogem -h' and you'll get the help. This gem is not yet available on rubygems I need to implement the templates requested from Toshiaki and Pjotr. I'm refactoring the code so there are some variations in the original tree. I "hope", by the end of the week, to provide templates files too. -- R.J.P.B. From pjotr.public14 at thebird.nl Tue Jan 11 06:38:34 2011 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Tue, 11 Jan 2011 07:38:34 +0100 Subject: [BioRuby] biogem and options In-Reply-To: References: Message-ID: <20110111063834.GA2409@thebird.nl> Super! On Mon, Jan 10, 2011 at 11:06:48PM +0100, Raoul Bonnal wrote: > Hi all, > I have updated the github repo with some requests from Pjotr. > Now is possible to create bin, db and test/data directory if needed from the command line > > biogem --with-bin --with-bd --with-test-data youprojectname > > NOTE 1: older 'data' directory is now 'db' more compliant with rails. Why ? data cames from R packages but we are used to store database or look for databases in db directory. > NOTE 2: README updated. > > about rspec and cucumber jeweler already has those options. > > type 'biogem -h' and you'll get the help. > > This gem is not yet available on rubygems I need to implement the templates requested from Toshiaki and Pjotr. 
>
> I'm refactoring the code so there are some variations in the original tree.
>
> I "hope", by the end of the week, to provide templates files too.
>
> --
> R.J.P.B.
>
>
>
>
>
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From ktym at hgc.jp Tue Jan 11 10:47:55 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 11 Jan 2011 19:47:55 +0900
Subject: [BioRuby] biogem and options
In-Reply-To: <20110111063834.GA2409@thebird.nl>
References: <20110111063834.GA2409@thebird.nl>
Message-ID:

Raoul,

http://twitter.com/#!/ilpuccio/status/24766316493672448
> @tktym could you point me to some example of what you mena, please?
> "provide a recommended template for rdoc, require lines, and class def"

In my example plugin,

  https://github.com/ktym/bioruby-hello/blob/master/lib/bio-hello.rb

I used a style similar to that of the BioRuby core library, which is
described in

  https://github.com/bioruby/bioruby/blob/master/README_DEV.rdoc

but I'm not sure what the best practice for plugins is. It might be
better to include the documentation in the README file instead.

In either case, what I have in mind is to auto-generate a plugin
description from those embedded descriptions for the "plugin showcase"
which will be available somewhere on the bioruby.org site in the
future.

For that purpose, we may also want to have some flags indicating:

  * status of the plugin (stable, usable, buggy, just started etc.)
  * whether the plugin overrides the BioRuby core or just provides new
    features harmlessly
  * prerequisites (especially those other than gems)

etc. etc.

Here's some material for further discussion (example template):

#
# = Bio::XXX - BioRuby plugin for XXX
#
# Copyright:: Copyright (C) 2001, 2003-2005 Bio R. Hacker ,
# Copyright:: Copyright (C) 2006 Chem R. Hacker
# License:: The Ruby License
# Site: http://github.com/user/bioruby-xxx
#
# == Description
#
# This plugin provides an interface for the XXX database.
#
# == Usage
#
# Lorem ipsum dolor sit amet, consectetur adipisicing elit, ....
#
# == Effects (Overrides?)
#
# * Modify the behavior of Bio::Sequence::NA#translate destructively
# * Add methods to the Bio::DB class
#
# == Depends (Requirements?)
#
# * External MySQL database system
# * RubyGem package 'foobar'
#
# == References
#
# * Hoge F. et al., The XXX database, Nucleic. Acid. Res. 123:100--123 (2030)
# * http://hoge.db/
#

# Do we need these two lines in every BioRuby plugin?
require 'rubygems'
require 'bio'

# Do we allow classes defined outside of the 'Bio' namespace?
module Bio
  class XXX
    # :
  end # XXX
end # Bio

Thanks,
Toshiaki

On 2011/01/11, at 15:38, Pjotr Prins wrote:

> Super!
>
> On Mon, Jan 10, 2011 at 11:06:48PM +0100, Raoul Bonnal wrote:
>> Hi all,
>> I have updated the github repo with some requests from Pjotr.
>> Now is possible to create bin, db and test/data directory if needed from the command line
>>
>> biogem --with-bin --with-bd --with-test-data youprojectname
>>
>> NOTE 1: older 'data' directory is now 'db' more compliant with rails. Why ? data cames from R packages but we are used to store database or look for databases in db directory.
>> NOTE 2: README updated.
>>
>> about rspec and cucumber jeweler already has those options.
>>
>> type 'biogem -h' and you'll get the help.
>>
>> This gem is not yet available on rubygems I need to implement the templates requested from Toshiaki and Pjotr.
>>
>> I'm refactoring the code so there are some variations in the original tree.
>>
>> I "hope", by the end of the week, to provide templates files too.
>>
>> --
>> R.J.P.B.
>> >> >> >> >> >> _______________________________________________ >> BioRuby Project - http://www.bioruby.org/ >> BioRuby mailing list >> BioRuby at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioruby > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From chmille4 at gmail.com Wed Jan 12 18:37:19 2011 From: chmille4 at gmail.com (Chase Miller) Date: Wed, 12 Jan 2011 13:37:19 -0500 Subject: [BioRuby] bio-assembly Message-ID: Hi All, Quick update on the bio-assembly plugin. Francesco has added support for CAF files. According to his preliminary tests it can handle a 27k contig 454 file in about a minute. He also improved the performance overall so now the ace parser can process a 70 mb file in about 10 seconds. Nice work! If there are any requests for parsers or functionality, let us know. source code: https://github.com/chmille4/bioruby-assembly usage: https://github.com/chmille4/bioruby-assembly#readme gem: https://rubygems.org/gems/bio-assembly Cheers Chase From bonnalraoul at ingm.it Thu Jan 13 09:30:03 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Thu, 13 Jan 2011 10:30:03 +0100 Subject: [BioRuby] bio-assembly In-Reply-To: References: Message-ID: <8EB2ADDB-7137-4C8E-AB70-C5574A797886@ingm.it> Hi, great work guys. I have updated the Plugins' page http://bioruby.open-bio.org/wiki/Plugins#On_Development_Plugins, it's a list/resume with the state of the art of the plugins. Please let me know if there is something wrong. In my mind Planned plugins are just ideas not yet coded. The other are "on going development". I tried to list the plugins in order of creations. @Jan: Do you plan to release Ensembl API as a plugin ? I think is't just a matter of rename the gem @Geroge: To avoid problems, please, yank isoelectric_point from rubygems I didn't receive any reply from Ricardo H. 
Ramírez-Gonzalez about samtools-ruby-ffi

Do you think that a separate page would be better? I think so, u?

Ciao.

On 12/gen/2011, at 19.37, Chase Miller wrote:

> Hi All,
>
> Quick update on the bio-assembly plugin.
>
> Francesco has added support for CAF files. According to his preliminary
> tests it can handle a 27k contig 454 file in about a minute. He also
> improved the performance overall so now the ace parser can process a 70 mb
> file in about 10 seconds. Nice work!
>
> If there are any requests for parsers or functionality, let us know.
>
> source code: https://github.com/chmille4/bioruby-assembly
>
> usage:
> https://github.com/chmille4/bioruby-assembly#readme
>
> gem: https://rubygems.org/gems/bio-assembly
>
>
> Cheers
> Chase
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.

From yannick.wurm at unil.ch Sun Jan 16 11:57:04 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Sun, 16 Jan 2011 18:57:04 +0700
Subject: [BioRuby] trees
Message-ID:

is a specific person "responsible" for coordinating the wiki?

the following page is largely misleading (contains tons of ruby code):
http://bioruby.open-bio.org/wiki/HOWTO:Trees

cheers,
yannick

-------------------------
Ant Genomes & Evolution
http://yannick.poulet.org
skype://yannickwurm

From ngoto at gen-info.osaka-u.ac.jp Mon Jan 17 05:34:48 2011
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Mon, 17 Jan 2011 14:34:48 +0900
Subject: [BioRuby] trees
In-Reply-To:
References:
Message-ID: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>

On Sun, 16 Jan 2011 18:57:04 +0700
Yannick Wurm wrote:

> is a specific person "responsible" for coordinating the wiki?
>
> the following page is largely misleading (contains tons of ruby code):
> http://bioruby.open-bio.org/wiki/HOWTO:Trees

The page is a trial to translate BioPerl HowTOs from Perl to Ruby,
but is still left unfinished. See the discussion:
http://bioruby.open-bio.org/wiki/Talk:HOWTOs

One of the reasons why the trial stalled is that the differences
between BioPerl and BioRuby are larger than we expected.

In the Talk:HOWTOs page, writing original BioRuby documentation
was also discussed, but it stalled too.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp

From pjotr.public14 at thebird.nl Mon Jan 17 08:47:05 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 17 Jan 2011 09:47:05 +0100
Subject: [BioRuby] bio-logger release 0.9.0
Message-ID: <20110117084705.GA5136@thebird.nl>

Just released bio-logger 0.9.0. The most important feature I added is
that you can inject a filter on log messages (by module). I.e. for the
blast logger you could only show messages relating to a contig:

  log = LoggerPlus['blast']
  log.filter { | level, sub_level, msg | msg =~ /contig1133/ }

on the command line you can do the same with:

  --trace "blast:= msg =~ /contig1133/"

another option is to filter on level and sub_level values:

  log.filter { | level, sub_level, msg | sub_level == 3 or level <= ERROR }

providing lots of possibilities. Obviously much of this can be handled
by (multi)grep'ing log files, but the power of using Ruby and filter
combinations makes it a great feature for debugging big data problems.
And you can limit the size of log files, without limiting expressive
power.

Pj.

From pjotr.public14 at thebird.nl Mon Jan 17 10:08:12 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 17 Jan 2011 11:08:12 +0100
Subject: [BioRuby] Bio-gff3 plugin 0.8.6
Message-ID: <20110117100812.GA6947@thebird.nl>

Released bio-gff3 parser plugin 0.8.6 on rubygems. It can be used from
the command-line. E.g.

  gem install bio-gff3
  gff3-fetch --help

Introduced LRU cache, replaced the BioRuby GFF line parser and added
lazy parsing. All with significant speedups compared to the original
(No-cache, BioRuby parser, non-lazy). The LRU version has limited RAM
use for any sized data (730MB), and currently runs 6 times slower than
the full memory version.

Digesting parser:

  Cache               real    user    sys     version          RAM
  ------------------------------------------------------------
  full,bioruby        12m41   12m28   0m09    (0.8.0)
  full,line           12m13   12m06   0m07    (0.8.5)
  full,line,lazy      11m51   11m43   0m07    (0.8.6)          6,600M
  none,bioruby        504m    477m    26m50   (0.8.0)
  none,line           297m    267m    28m36   (0.8.5)
  none,line,lazy      132m    106m    26m01   (0.8.6)          650M
  lru,bioruby         533m    510m    22m47   (0.8.5)
  lru,line            353m    326m    26m44   (0.8.5)  1K
  lru,line            305m    281m    22m30   (0.8.5)  10K
  lru,line,lazy       182m    161m    21m10   (0.8.6)  10K
  lru,line,lazy       75m     75m     0m17    (0.8.6)  50K     730M
  ------------------------------------------------------------

where

  52M   m_hapla.WS217.dna.fa
  456M  m_hapla.WS217.gff3

  ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-linux]

on 64-bits CPU 2.6 GHz (6MB cache), 16 GB RAM machine.

Note bio-gff3 0.8.6 is a fully digesting parser, with scope for full
validation of the GFF3 relations. The next step, a limited
'optimistic' digestion, will speed things up.

Note also that bio-gff3 exploits the bio-logger plugin - it is a good
example.

Pj.

From yannick.wurm at unil.ch Tue Jan 18 07:55:15 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Tue, 18 Jan 2011 14:55:15 +0700
Subject: [BioRuby] trees
In-Reply-To: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>
References: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>
Message-ID:

Thanks for the details Naohisa-san

May I suggest we try to "hide from google" things that are not
finalized, and links to non-existent documents?

(I have the feeling it may be better to have nothing than to create
confusion?)
On 17 Jan 2011, at 12:34, Naohisa GOTO wrote: > On Sun, 16 Jan 2011 18:57:04 +0700 > Yannick Wurm wrote: > >> is a specific person "responsible" for coordinating the wiki? >> >> the following page is largely misleading (contains tons of ruby code): >> http://bioruby.open-bio.org/wiki/HOWTO:Trees > > The page is a trial to translate BioPerl HowTOs from Perl to Ruby, > but is still left unfinished. See the discussion: > http://bioruby.open-bio.org/wiki/Talk:HOWTOs > > One of the reasons why the trial stalled is the differences between > BioPerl and BioRuby is larger than we expected. > > In the Talk:HOWTOs page, to write BioRuby original documentation > were also discussed, but it stalled too. > > Naohisa Goto > ngoto at gen-info.osaka-u.ac.jp From yannick.wurm at unil.ch Tue Jan 18 08:56:26 2011 From: yannick.wurm at unil.ch (Yannick Wurm) Date: Tue, 18 Jan 2011 15:56:26 +0700 Subject: [BioRuby] Rake In-Reply-To: References: Message-ID: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> Dear List, I'd had a few issues setting up a rake file 2/3 weeks ago. Thanks to Hiro-san and some of the google-able tutorials things are now working. It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" and goes through the following steps for tree building: - tranlsation - multiple alignment (mafft) - gblocks to remove crap - tree building (phyml) AND - codon-level alignment: reverse translated from protein multiple alignment (pal2nal) - gblocks to remove crap - tree building (phyml) https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere out of the box. But FWIW maybe it can be usefull to a random googler. However, it feels quite clunky, so I think I should do things differently in the future. If you have any comments or suggestions, I'd be most happy to hear them. 
Cheers, yannick ------------------------- Ant Genomes & Evolution http://yannick.poulet.org skype://yannickwurm From mail at michaelbarton.me.uk Tue Jan 18 15:17:08 2011 From: mail at michaelbarton.me.uk (Michael Barton) Date: Tue, 18 Jan 2011 10:17:08 -0500 Subject: [BioRuby] Rake In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> References: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> Message-ID: <20110118151708.GB3430@nku069218.hh.nku.edu> Hi Yannick, I think it's a great idea to generate predefined pipelines for common bioinformatics tasks. I experimented with a tool called Boson six months ago. It could be worth looking if you feel like investing more time into your pipeline. Boson commands, similar to rake tasks, are more modular and can be installed from the web into a ~/.boson directory. This has obvious advantages over a single rake file. Boson tasks can be chained together where the data is passed around in YAML format. The github link is - https://github.com/cldwalker/boson Cheers Michael Barton On Tue, Jan 18, 2011 at 03:56:26PM +0700, Yannick Wurm wrote: > Dear List, > > I'd had a few issues setting up a rake file 2/3 weeks ago. Thanks to Hiro-san > and some of the google-able tutorials things are now working. > > It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" > and goes through the following steps for tree building: - tranlsation > - multiple alignment (mafft) - gblocks to remove crap - tree building (phyml) > AND - codon-level alignment: reverse translated from protein multiple > alignment (pal2nal) - gblocks to remove crap - tree building (phyml) > > https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile > > > It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere > out of the box. But FWIW maybe it can be usefull to a random googler. > > However, it feels quite clunky, so I think I should do things differently in > the future. 
If you have any comments or suggestions, I'd be most happy to > hear them. > > Cheers, > > yannick > > ------------------------- Ant Genomes & Evolution http://yannick.poulet.org > skype://yannickwurm > > > > > _______________________________________________ BioRuby Project > - http://www.bioruby.org/ BioRuby mailing list BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From diapriid at gmail.com Tue Jan 18 15:21:36 2011 From: diapriid at gmail.com (Matt) Date: Tue, 18 Jan 2011 10:21:36 -0500 Subject: [BioRuby] Rake In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> References: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> Message-ID: Yannick- I like it. It might be nice to extend your pipeline in a generic manner (SimpleAnalysisPipeline). Just a couple of steps that would be extensible/swappable to different software. The generic pipeline would "just work" given a minimal local configuration (I like your starting point). Swappable/configurable steps might be Pre-process (trim / quality filters?) Alignment (align) Post alignment (gblocks) Translation (to Nexus) Analysis (Phyml) The idea is that we could swap in components (TNT or RaXML for Phyml, Muscle for MAFFT etc.)- but also that the pipeline remains "simple". If I find some time I'd like to work on my first attempt at a BioRuby Plugin, a wrapper for TNT (hopefully tied in to the analysis bit above). cheers, Matt On Tue, Jan 18, 2011 at 3:56 AM, Yannick Wurm wrote: > Dear List, > > I'd had a few issues setting up a rake file 2/3 weeks ago. Thanks to Hiro-san and some of the google-able tutorials things are now working. 
>
> It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" and goes through the following steps for tree building:
>  - translation
>  - multiple alignment (mafft)
>  - gblocks to remove crap
>  - tree building (phyml)
> AND
>  - codon-level alignment: reverse translated from protein multiple alignment (pal2nal)
>  - gblocks to remove crap
>  - tree building (phyml)
>
> https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile
>
>
> It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere out of the box. But FWIW maybe it can be useful to a random googler.
>
> However, it feels quite clunky, so I think I should do things differently in the future. If you have any comments or suggestions, I'd be most happy to hear them.
>
> Cheers,
>
> yannick
>
> -------------------------
>  Ant Genomes & Evolution
> http://yannick.poulet.org
>   skype://yannickwurm
>
>
>
>
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
>

From francesco.strozzi at gmail.com Tue Jan 18 20:55:20 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Tue, 18 Jan 2011 21:55:20 +0100
Subject: [BioRuby] BioRuby HTSeq-like
Message-ID:

Hi BioRuby people,
just wondering if something similar to HTSeq exists for BioRuby (HTSeq is a Python package for working with and manipulating next-gen sequencing data):

http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html

Many of its features could be implemented or are already available in BioRuby. These are the basics:
- Getting statistical summaries of the base-call quality scores to study the data quality.
- Calculating a coverage vector and exporting it for visualization in a genome browser.
- Reading in annotation data from a GFF file.
- Assigning aligned reads from an RNA-Seq experiment to exons and genes.
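As a rough illustration of the first item in the list above (statistical summaries of base-call quality scores), here is a minimal pure-Ruby sketch. It is not HTSeq's or BioRuby's API: the `mean_quality_by_position` helper is hypothetical, and it assumes four-line FASTQ records with Phred+33 quality encoding (older Illumina pipelines used an offset of 64, in which case the `offset` argument would change).

```ruby
require "stringio"

# Hypothetical helper (not part of BioRuby or HTSeq): mean Phred score per
# read position. Assumes four-line FASTQ records and Phred+33 encoding.
def mean_quality_by_position(io, offset = 33)
  sums   = []
  counts = []
  io.each_line.each_slice(4) do |_header, _sequence, _plus, quality|
    next if quality.nil? # skip a truncated trailing record
    quality.chomp.each_byte.with_index do |byte, pos|
      sums[pos]   = (sums[pos] || 0) + (byte - offset)
      counts[pos] = (counts[pos] || 0) + 1
    end
  end
  sums.each_with_index.map { |sum, pos| sum.to_f / counts[pos] }
end

# Two toy reads of different lengths ('I' is Phred 40, '5' is Phred 20).
fastq = StringIO.new(<<FASTQ)
@read1
ACGT
+
IIII
@read2
ACG
+
5I5
FASTQ

p mean_quality_by_position(fastq) # prints [30.0, 40.0, 30.0, 40.0]
```

Keeping running sums per position means the whole file never has to be held in memory, which matters for NGS-sized inputs; the resulting array is the kind of summary one would hand to a plotting library.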
Particularly, the plotting functions to explore and assess quality data seem very interesting.
If nothing similar exists for BioRuby, I think we should discuss coding a BioRuby "NextGenSequencing" plugin, to provide the same functionality and also to add something new.

What do you think?

Cheers
--

Francesco

From yannick.wurm at unil.ch Thu Jan 20 04:08:55 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Thu, 20 Jan 2011 11:08:55 +0700
Subject: [BioRuby] Rake
In-Reply-To:
References:
Message-ID: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>

Hello & thanks for the comments.

Matt wrote:
> Swappable/configurable steps might be
>
> Pre-process (trim / quality filters?)
> Alignment (align)
> Post alignment (gblocks)
> Translation (to Nexus)
> Analysis (Phyml)
>
> The idea is that we could swap in components (TNT or RaXML for Phyml,
> Muscle for MAFFT etc.)- but also that the pipeline remains "simple".

Yes, that's what I would ideally want (as well as being able to easily modify the run options of the programs). How would you go about generalizing this? Right now I'm basing "what to do" on the file extensions I provide... which limits me based on the file extensions...

Michael wrote:
> I think it's a great idea to generate predefined pipelines for common
> bioinformatics tasks. I experimented with a tool called Boson six months ago.
> It could be worth looking if you feel like investing more time into your
> pipeline.
>
> Boson commands, similar to rake tasks, are more modular and can be installed
> from the web into a ~/.boson directory. This has obvious advantages over
> a single rake file. Boson tasks can be chained together where the data is
> passed around in YAML format.
>
> The github link is - https://github.com/cldwalker/boson

I haven't looked thoroughly yet, but at least superficially, Boson looks real cool.

However, I'm a bit scared of investing energy into technologies that are too new.
Boson has only one developer who may or may not keep his project alive over the next years. Time I invest in learning something today ... must continue to help improve my productivity over the next 5 or 10 years by still being reusable in 5 or 10 years (with as few modifications as possible). There is uncertainty to everything, but something like Boson does seem a bit too risky right now...

Cheers,

yannick

-------------------------
Ant Genomes & Evolution
http://yannick.poulet.org
skype://yannickwurm

From bonnalraoul at ingm.it Thu Jan 20 09:13:20 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 20 Jan 2011 10:13:20 +0100
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID: <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>

Dear Yannick,
rake is usually used inside a project directory to provide common operations for that project.
What is the idea behind "rake for bioinformatics"? I mean, you have to copy your rake file to wherever your data are.

Looking at the code, there are a lot of dependencies on command-line software, and that could be a maintainability problem.
What about spending some energy on wrapping those commands into BioRuby classes?
That way those applications would be available to other scripts.

If you want to keep the rake approach, we should find a way to avoid replicating rakefiles.
One idea could be to create a rakefile in your working directory, similar to Rails:

# Add your own tasks in files placed in lib/tasks ending in .rake,
# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.

require File.expand_path('../config/application', __FILE__)
require 'rake'

# The user just needs to add the tasks he wants:
Bio::SomeName.load_tasks
Bio::SomeOtherName.load_tasks
Bio::AnotherName.load_tasks


On 18/Jan/2011, at 09.56, Yannick Wurm wrote:

> Dear List,
>
> I'd had a few issues setting up a rake file 2/3 weeks ago.
Thanks to Hiro-san and some of the google-able tutorials things are now working.
>
> It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" and goes through the following steps for tree building:
> - translation
> - multiple alignment (mafft)
> - gblocks to remove crap
> - tree building (phyml)
> AND
> - codon-level alignment: reverse translated from protein multiple alignment (pal2nal)
> - gblocks to remove crap
> - tree building (phyml)
>
> https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile
>
>
> It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere out of the box. But FWIW maybe it can be useful to a random googler.
>
> However, it feels quite clunky, so I think I should do things differently in the future. If you have any comments or suggestions, I'd be most happy to hear them.
>
> Cheers,
>
> yannick
>
> -------------------------
> Ant Genomes & Evolution
> http://yannick.poulet.org
> skype://yannickwurm
>
>
>
>
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.

From bonnalraoul at ingm.it Thu Jan 20 10:35:58 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 20 Jan 2011 11:35:58 +0100
Subject: [BioRuby] BioRuby HTSeq-like
In-Reply-To:
References:
Message-ID:

Hi folks,
Yesterday I met Francesco in my lab and it was a wonderful opportunity to exchange ideas and thoughts.
About Francesco's mail: I think we could take inspiration from Galaxy/BioPython (http://main.g2.bx.psu.edu/); they did very good work on wrapping the common software for crunching NGS data.
So my input is: let's start wrapping them, possibly opening a bioruby-ngs project on github:
https://github.com/helios/bioruby-ngs (just the repo :-))

Reading around http://seqanswers.com/forums/showthread.php?t=2461 sometimes there is the need to split and distribute the computation: there are different possibilities, but splitting the fastq file and at the same time enabling multithreading seems to be the best option; if you have suggestions please comment.

Thanks to Goto san, fastq support is on
Thanks to Pjotr, GFF3 support is on
Thanks to Chase and Francesco, CAF and Ace support is on

For plotting, as we said, one possibility is http://rubyvis.rubyforge.org/ from Claudio Bustos, but if you have better alternatives... please discuss.
About statistics, please join http://groups.google.com/group/sciruby-dev

Having these tools in our arsenal is useful and strategic for funding.

I would say +1

PS: Please clone and add your name to the list of authors if you want to join this project.

PS: if someone is using SGE, what do you think about http://gridengine.info/2010/12/24/goodbye-grid-engine ?

On 18/Jan/2011, at 21.55, Francesco Strozzi wrote:

> Hi BioRuby people,
> just wondering if something similar exists for BioRuby (is a package
> to work and manipulate next-gen sequencing data, in Python):
>
> http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
>
> Many features could be implemented or are already available for
> BioRuby....these are the basics:
> - Getting statistical summaries about the base-call quality scores to
> study the data quality.
> - Calculating a coverage vector and exporting it for visualization in
> a genome browser.
> - Reading in annotation data from a GFF file.
> - Assigning aligned reads from an RNA-Seq experiments to exons and genes.
>
> Particularly, the plotting functions to explore and assess quality
> data seems very interesting.
> If nothing similar exists for BioRuby, I think we should discuss about
> coding a BioRuby "NextGenSequencing" plugin, to provide the same
> functionalities and also to add something new as well....
>
> What do you think?
>
> Cheers
> --
>
> Francesco
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.

From francesco.strozzi at gmail.com Thu Jan 20 11:27:43 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Thu, 20 Jan 2011 12:27:43 +0100
Subject: [BioRuby] BioRuby HTSeq-like
In-Reply-To:
References:
Message-ID:

Today is Thursday (BioRuby IRC day); I will try to join the #bioruby channel this afternoon (CET).
If someone else is there, we could discuss this plugin and new ideas.

> PS: Please clone and add your name to the list of the authors if you want to
> join into this project.

Done! I'm in!

--
Francesco

From mail at michaelbarton.me.uk Thu Jan 20 20:57:26 2011
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Thu, 20 Jan 2011 15:57:26 -0500
Subject: [BioRuby] Rake
In-Reply-To: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>
References: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>
Message-ID: <20110120205726.GD245@Michael-Bartons-MacBook.local>

Yannick, you make an excellent point about the long-term stability of boson. The ruby community, myself often guilty of this, is quick to jump on a new gem, which may or may not last into the future. An example of this is the Less gem for compiling CSS, which has seen some recent popularity. I believe the developer has said he will no longer maintain it.

Another option could be Thor. I believe this is also aimed at being a more modular rake-like tool. It is developed by Yehuda Katz and I think it is used as the basis of a few mainstream ruby command line tools (possibly the rails3 CLI? I'm not 100% sure about this.).
I think you could expect Thor to be more mature and likely to be continually developed. If can find the episode of the ChangeLog with Yehuda you can hear him discuss it. On Thu, Jan 20, 2011 at 11:08:55AM +0700, Yannick Wurm wrote: > Hello & thanks for the comments. > > Matt wrote: > > Swappable/configurable steps might be > > > > Pre-process (trim / quality filters?) Alignment (align) Post alignment > > (gblocks) Translation (to Nexus) Analysis (Phyml) > > > > The idea is that we could swap in components (TNT or RaXML for Phyml, > > Muscle for MAFFT etc.)- but also that the pipeline remains "simple". > > Yes, thats what I would ideally want. (as well as being able to easily modify > the run options of the programs). How would you go about generalizing this? > Right now I'm basing "what do to" on the file extensions I provide... which > limits me based on the file extensions... > > > > Michael wrote: > > I think it's a great idea to generate predefined pipelines for common > > bioinformatics tasks. I experimented with a tool called Boson six months > > ago. It could be worth looking if you feel like investing more time into > > your pipeline. > > > > Boson commands, similar to rake tasks, are more modular and can be > > installed from the web into a ~/.boson directory. This has obvious > > advantages over a single rake file. Boson tasks can be chained together > > where the data is passed around in YAML format. > > > > The github link is - https://github.com/cldwalker/boson > > I haven't looked thoroughly now, but at least superficially, Boson looks real > cool. > > However, I'm a bit scared of investing energy into technologies that are too > new. Boson has only one developer who may or may not keep his project alive > over the next years. Time I invest in learning something today ... must > continue to help improve my productivity over the next 5 or 10 years by still > being reusable in 5 or 10 years (with as few modifications as possible). 
> There is uncertainty to everything, but something like Boson does seems a bit
> too risky right now...
>
> Cheers,
>
> yannick
>
>
> ------------------------- Ant Genomes & Evolution http://yannick.poulet.org
> skype://yannickwurm
>
>
>
>
> _______________________________________________ BioRuby Project
> - http://www.bioruby.org/ BioRuby mailing list BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From yannick.wurm at unil.ch Fri Jan 21 05:24:22 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Fri, 21 Jan 2011 12:24:22 +0700
Subject: [BioRuby] Rake
In-Reply-To: <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>
References: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch> <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>
Message-ID: <2E930098-FD2E-4D3B-AC61-1B54D3653DE7@unil.ch>

Ciao Raoul,

sorry, I was away from the computer during most of the irc thing.

On 20 Jan 2011, at 16:13, Raoul Bonnal wrote:

> Dear Yannick,
> rake usually is used inside a project directory to provide common operations to the project.
> Which is the idea behind "rake for bioinformatics" ? I mean, you have to copy you rake file where your data are.
>
> Looking at the code there are a lot of dependencies from command line softwares and that could be a problem in maintainability

That is true. My main worry here (and for most things) is rapidly getting a biological result. I'm still working on finding the optimal balance between quick hack and maintainability/reusability. Migrating from shell scripts to ruby hacks probably does save me some time, because in ruby it's really simple to put in a few verifications by raising Errors if a tool I need isn't in the $PATH or if an input/output file is empty. Those mean that debugging and fixing is much faster if I decide to run things on the linux server instead of the macbook, or in 2 years' time after a reinstall.

> What about spend some energy on wrapping that commands into BioRuby classes?
> In that way those application could be available to other scripts.

I have two answers.
- Right now I cannot dedicate the time required to learn how to do that well. I need to understand how ants work first :) (If I were developing a big uniprot-type web application that needs to be robust for users, making wrappers may be defensible... for one-off hacks it's not.)
- Call me conservative, but I'm also generally scared of wrappers. First, I want to have the raw input & output files that the programs use, because I may need to read or edit or rerun them in the future... I know I'll be able to read a raw text file. Thus I've never used bioruby's wrappers for blast or codeml or multiple sequence alignment (however, I have recently discovered the amazingly timesaving Bio::Tree - wow). Second, programs are constantly changing... and thus wrappers must too - they're a ton of work to maintain and - like the Boson thing - there is no guarantee that that will be done.

> If you want to keep the rake approach we should find a way to not replicate rakefiles.
> One idea could be to create a rakefile in your working directory, similar to Rails:
>
> # Add your own tasks in files placed in lib/tasks ending in .rake,
> # for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.
>
> require File.expand_path('../config/application', __FILE__)
> require 'rake'
>
> #The user needs just to add the tasks he wants:
> Bio::SomeName.load_tasks
> Bio::SomeOtherName.load_tasks
> Bio::AnotherName.load_tasks

That sounds like a really cool approach.
I want to hear more :)

-------------------------
Ant Genomes & Evolution
http://yannick.poulet.org
skype://yannickwurm

From francesco.strozzi at gmail.com Fri Jan 21 09:41:47 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Fri, 21 Jan 2011 10:41:47 +0100
Subject: [BioRuby] BIO-NGS (and Rake/Thor for bioinformatics)
Message-ID:

Hi all,
in yesterday's IRC chat (http://bioruby.org/irc/?date=2011-01) we discussed the bio-ngs plugin that Raoul wrote about in a previous email.
Here is the Wiki page on BioRuby describing the general idea for this plugin:

http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing

We want to use wrappers and/or bindings to existing tools like MAQ, BWA and SAMtools, and we want to use Rake or Thor to provide custom tasks and let the user run NGS analyses. We would also like to include the possibility of creating reports using statsample and rubyvis.
Maybe some aspects are still a bit unclear at the moment (I think we need to define some sort of guidelines), but I hope we can come up with a useful (let me use this term) "framework" to run bioinformatics NGS analyses with Ruby.

Any comment/help/feedback/suggestion is more than welcome!

Cheers

--
Francesco

From mictadlo at gmail.com Wed Jan 26 00:41:11 2011
From: mictadlo at gmail.com (Michal)
Date: Wed, 26 Jan 2011 10:41:11 +1000
Subject: [BioRuby] marshal data too short
Message-ID: <4D3F6DA7.8050101@gmail.com>

Hi,
I installed Ruby 1.9.2 on Ubuntu 10.10 (Linux Mint) in the following way:

$ tar xvfz ruby-1.9.2-p136.tar.gz
$ cd ruby-1.9.2-p136/
$ ./configure --prefix=/home/mictadlo/apps/ruby
$ make
$ make install
$ vim ~/.bashrc
export APPS=/home/mictadlo/apps
export RUBY_HOME=$APPS/ruby
export LD_LIBRARY_PATH=/RUBY_HOME/lib
PATH=$RUBY_HOME/bin:$PATH
$ . ~/.bashrc
$ ruby -v
ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux]

$ tar xvfz bioruby-1.4.1.tar.gz
$ cd bioruby-1.4.1/
$ ruby setup.rb
$ bioruby
Loading config (/home/mitlox/.bioruby/shell/session/config) ...
done
Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short
done

. . . B i o R u b y i n t h e s h e l l . . .

Version : BioRuby 1.4.1 / Ruby 1.9.2

bioruby> exit

How can I fix the error in BioRuby?

Thank you in advance.

Michal

From bonnalraoul at ingm.it Wed Jan 26 15:23:03 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Wed, 26 Jan 2011 16:23:03 +0100
Subject: [BioRuby] IRC meeting
Message-ID: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it>

As usual, tomorrow the IRC meeting.

--
R.J.P.B.

From ktym at hgc.jp Wed Jan 26 16:09:02 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu, 27 Jan 2011 01:09:02 +0900
Subject: [BioRuby] marshal data too short
In-Reply-To: <4D3F6DA7.8050101@gmail.com>
References: <4D3F6DA7.8050101@gmail.com>
Message-ID:

Hi Michal,

Could you give me some additional information?

% ls -l ~/.bioruby/shell/session/object
-rw-r--r-- 1 ktym staff 17401 1 19 13:09 /Users/ktym/.bioruby/shell/session/object

% ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object
[4, 8]

Have you ever used the bioruby shell with the old version of Ruby before?

If your file is not corrupted, this might be due to the backward incompatibility of the Marshal file format (if so, does anyone know whether there are any workaround to convert old marshal data into 1.9's?).
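The two-byte check above reads the Marshal format version that Ruby writes at the start of every dump (4 and 8 here). A minimal sketch of the same diagnosis in plain Ruby might look like the following; the `marshal_file_status` helper and its return symbols are made up for illustration and are not part of the BioRuby shell.

```ruby
require "tempfile"

# Hypothetical helper (not BioRuby API): classify a saved Marshal file.
def marshal_file_status(path)
  return :missing unless File.exist?(path)
  data = File.binread(path)
  return :empty if data.empty?       # Marshal.load("") raises "marshal data too short"
  return :corrupt if data.bytesize < 2
  major, minor = data.unpack("CC")   # format version header, e.g. 4 and 8
  unless major == Marshal::MAJOR_VERSION && minor <= Marshal::MINOR_VERSION
    return :incompatible_version
  end
  Marshal.load(data)                 # only do this with files you wrote yourself
  :ok
rescue ArgumentError, TypeError
  :corrupt
end

Tempfile.create("session") do |file|
  puts marshal_file_status(file.path)  # zero-byte file
  file.binmode
  file.write(Marshal.dump({ "seq" => "atgc" }))
  file.flush
  puts marshal_file_status(file.path)  # valid dump written by this Ruby
end
```

Checking for an empty file before looking at the version bytes separates "nothing was ever saved" from "saved by an incompatible Ruby", which are the two explanations being weighed in this thread.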
Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully restore the saved objects: % ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object Toshiaki On 2011/01/26, at 9:41, Michal wrote: > Hi, > I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way > > $ tar xvfz ruby-1.9.2-p136.tar.gz > $ cd ruby-1.9.2-p136/ > $ ./configure --prefix=/home/mictadlo/apps/ruby > $ make > $ make install > $ vim ~/.bashrc > export APPS=/home/mictadlo/apps > export RUBY_HOME=$APPS/ruby > export LD_LIBRARY_PATH=/RUBY_HOME/lib > PATH=$RUBY_HOME/bin:$PATH > $ . ~/.bashrc > $ ruby -v > ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux] > > $ tar xvfz bioruby-1.4.1.tar.gz > $ cd bioruby-1.4.1/ > $ ruby setup.rb > $ bioruby > Loading config (/home/mitlox/.bioruby/shell/session/config) ... done > Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short > done > > . . . B i o R u b y i n t h e s h e l l . . . > > Version : BioRuby 1.4.1 / Ruby 1.9.2 > > bioruby> exit > > How can I fix the error in BioRuby? > > Thank you in advance. > > Michal > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From ktym at hgc.jp Wed Jan 26 16:46:23 2011 From: ktym at hgc.jp (Toshiaki Katayama) Date: Thu, 27 Jan 2011 01:46:23 +0900 Subject: [BioRuby] IRC meeting In-Reply-To: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it> References: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it> Message-ID: Raoul, On 2011/01/27, at 0:23, Raoul Bonnal wrote: > As usual, tomorrow the IRC meeting. > > -- > R.J.P.B. Thank you for the reminder! The next will be our 6th IRC meeting. In the 3rd (Jan 6) and 4th (Jan 13) meeting, we discussed about the logging system. 
As a result, Pjotr released the bio-logger plugin and his bio-gff3 plugin became the first use case of the logger (he posted announcements to this list on Jan 17th). We talked about a plan to develop NGS-related plugins at the 5th (Jan 20) meeting. As posted by Francesco on Jan 21th, he contributed a Wiki page summarizing the idea: http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing As for the weekly BioRuby IRC meeting, please see http://bioruby.open-bio.org/wiki/BioRuby_IRC_conference Thanks, Toshiaki From bonnalraoul at ingm.it Wed Jan 26 19:10:26 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Wed, 26 Jan 2011 20:10:26 +0100 Subject: [BioRuby] IRC meeting In-Reply-To: Message-ID: <20110126191026.e71c169e@mail.ingm.it> Hi all, I have updated the page http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing I'll try to keep you up to date about samtools from ml and that page. I can't remember who is involved in the workflows, tomorrow we'll fix the page with the rigth names. _____ From: Toshiaki Katayama [mailto:ktym at hgc.jp] To: Raoul Bonnal [mailto:bonnalraoul at ingm.it] Cc: BioRuby ML [mailto:bioruby at lists.open-bio.org] Sent: Wed, 26 Jan 2011 17:46:23 +0100 Subject: Re: [BioRuby] IRC meeting Raoul, On 2011/01/27, at 0:23, Raoul Bonnal wrote: > As usual, tomorrow the IRC meeting. > > -- > R.J.P.B. Thank you for the reminder! The next will be our 6th IRC meeting. In the 3rd (Jan 6) and 4th (Jan 13) meeting, we discussed about the logging system. As a result, Pjotr released the bio-logger plugin and his bio-gff3 plugin became the first use case of the logger (he posted announcements to this list on Jan 17th). We talked about a plan to develop NGS-related plugins at the 5th (Jan 20) meeting. 
As posted by Francesco on Jan 21th, he contributed a Wiki page summarizing the idea: http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing As for the weekly BioRuby IRC meeting, please see http://bioruby.open-bio.org/wiki/BioRuby_IRC_conference Thanks, Toshiaki From mictadlo at gmail.com Fri Jan 28 12:18:30 2011 From: mictadlo at gmail.com (Michal) Date: Fri, 28 Jan 2011 22:18:30 +1000 Subject: [BioRuby] marshal data too short In-Reply-To: References: <4D3F6DA7.8050101@gmail.com> Message-ID: <4D42B416.8010503@gmail.com> Hi Toshiaki, On my system was not Ruby installed before and I just installed the latest version in my home directory: $ ls -l ~/.bioruby/shell/session/object -rw-r--r-- 1 mictadlo mictadlo 0 2011-01-25 19:40 /home/mictadlo/.bioruby/shell/session/object $ ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object [nil, nil] $ ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object -e:1:in `load': marshal data too short (ArgumentError) from -e:1:in `
' Do you need another information? Thank you in advance. Michal On 01/27/2011 02:09 AM, Toshiaki Katayama wrote: > Hi Michal, > > Could you give me some additional information? > > % ls -l ~/.bioruby/shell/session/object > -rw-r--r-- 1 ktym staff 17401 1 19 13:09 /Users/ktym/.bioruby/shell/session/object > > % ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object > [4, 8] > > Have you ever used the bioruby shell with the old version of Ruby before? > > If your file is not corrupted, this might be due to the backward > incompatibility of the Marshal file format (if so, does anyone know > whether there are any workaround to convert old marshal data into 1.9's?). > > Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully > restore the saved objects: > > % ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object > > Toshiaki > > > On 2011/01/26, at 9:41, Michal wrote: > >> Hi, >> I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way >> >> $ tar xvfz ruby-1.9.2-p136.tar.gz >> $ cd ruby-1.9.2-p136/ >> $ ./configure --prefix=/home/mictadlo/apps/ruby >> $ make >> $ make install >> $ vim ~/.bashrc >> export APPS=/home/mictadlo/apps >> export RUBY_HOME=$APPS/ruby >> export LD_LIBRARY_PATH=/RUBY_HOME/lib >> PATH=$RUBY_HOME/bin:$PATH >> $ . ~/.bashrc >> $ ruby -v >> ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux] >> >> $ tar xvfz bioruby-1.4.1.tar.gz >> $ cd bioruby-1.4.1/ >> $ ruby setup.rb >> $ bioruby >> Loading config (/home/mitlox/.bioruby/shell/session/config) ... done >> Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short >> done >> >> . . . B i o R u b y i n t h e s h e l l . . . >> >> Version : BioRuby 1.4.1 / Ruby 1.9.2 >> >> bioruby> exit >> >> How can I fix the error in BioRuby? >> >> Thank you in advance. 
>> >> Michal >> >> _______________________________________________ >> BioRuby Project - http://www.bioruby.org/ >> BioRuby mailing list >> BioRuby at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioruby > From ktym at hgc.jp Sat Jan 29 12:18:04 2011 From: ktym at hgc.jp (Toshiaki Katayama) Date: Sat, 29 Jan 2011 21:18:04 +0900 Subject: [BioRuby] marshal data too short In-Reply-To: <4D42B416.8010503@gmail.com> References: <4D3F6DA7.8050101@gmail.com> <4D42B416.8010503@gmail.com> Message-ID: <8DFDDEA3-9B1D-44DC-BCB6-DCBA2C06BAF9@hgc.jp> Hi Michal, When I remove the ~/.bioruby directory, I could reproduce the same error with Ruby 1.9.2. The ~/.bioruby/shell/session/object file was empty because BioRuby shell failed to save the file. Saving object (/Users/ktym/.bioruby/shell/session/object) ... Error: Failed to save (/Users/ktym/.bioruby/shell/session/object) : can't convert Symbol into String I'll try to fix this. Toshiaki On 2011/01/28, at 21:18, Michal wrote: > Hi Toshiaki, > On my system was not Ruby installed before and I just installed the latest version in my home directory: > $ ls -l ~/.bioruby/shell/session/object > -rw-r--r-- 1 mictadlo mictadlo 0 2011-01-25 19:40 /home/mictadlo/.bioruby/shell/session/object > $ ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object > [nil, nil] > $ ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object > -e:1:in `load': marshal data too short (ArgumentError) > from -e:1:in `
' > > Do you need another information? > > Thank you in advance. > > Michal > > > On 01/27/2011 02:09 AM, Toshiaki Katayama wrote: >> Hi Michal, >> >> Could you give me some additional information? >> >> % ls -l ~/.bioruby/shell/session/object >> -rw-r--r-- 1 ktym staff 17401 1 19 13:09 /Users/ktym/.bioruby/shell/session/object >> >> % ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object >> [4, 8] >> >> Have you ever used the bioruby shell with the old version of Ruby before? >> >> If your file is not corrupted, this might be due to the backward >> incompatibility of the Marshal file format (if so, does anyone know >> whether there are any workaround to convert old marshal data into 1.9's?). >> >> Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully >> restore the saved objects: >> >> % ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object >> >> Toshiaki >> >> >> On 2011/01/26, at 9:41, Michal wrote: >> >>> Hi, >>> I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way >>> >>> $ tar xvfz ruby-1.9.2-p136.tar.gz >>> $ cd ruby-1.9.2-p136/ >>> $ ./configure --prefix=/home/mictadlo/apps/ruby >>> $ make >>> $ make install >>> $ vim ~/.bashrc >>> export APPS=/home/mictadlo/apps >>> export RUBY_HOME=$APPS/ruby >>> export LD_LIBRARY_PATH=/RUBY_HOME/lib >>> PATH=$RUBY_HOME/bin:$PATH >>> $ . ~/.bashrc >>> $ ruby -v >>> ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux] >>> >>> $ tar xvfz bioruby-1.4.1.tar.gz >>> $ cd bioruby-1.4.1/ >>> $ ruby setup.rb >>> $ bioruby >>> Loading config (/home/mitlox/.bioruby/shell/session/config) ... done >>> Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short >>> done >>> >>> . . . B i o R u b y i n t h e s h e l l . . . >>> >>> Version : BioRuby 1.4.1 / Ruby 1.9.2 >>> >>> bioruby> exit >>> >>> How can I fix the error in BioRuby? 
>>> >>> Thank you in advance. >>> >>> Michal >>> >>> _______________________________________________ >>> BioRuby Project - http://www.bioruby.org/ >>> BioRuby mailing list >>> BioRuby at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioruby >> > From mictadlo at gmail.com Sun Jan 30 11:42:09 2011 From: mictadlo at gmail.com (Michal) Date: Sun, 30 Jan 2011 21:42:09 +1000 Subject: [BioRuby] samtools-ruby Message-ID: <4D454E91.1080604@gmail.com> Hi, I have tried to install samtools-ruby on ruby 1.9.2, but I have failed. I have already posted this problem on https://github.com/homonecloco/samtools-ruby/issues#issue/3 , but I have not got any response. What did I wrong? Michal From bonnalraoul at ingm.it Mon Jan 31 10:11:46 2011 From: bonnalraoul at ingm.it (Raoul Bonnal) Date: Mon, 31 Jan 2011 11:11:46 +0100 Subject: [BioRuby] samtools-ruby In-Reply-To: <4D454E91.1080604@gmail.com> References: <4D454E91.1080604@gmail.com> Message-ID: Dear Michal, please check this out: https://github.com/helios/bioruby-samtools This is the inital port of samtools-ruby as plugin. It comes with library for osx and linux, no windows. I need to test the linux library because I'm developing under osx. If the libbam.a is wrong please give me the right one and I'll add it to the repo. Also note that the library has been compiled for 64bit. Ciao! On 30/gen/2011, at 12.42, Michal wrote: > Hi, > I have tried to install samtools-ruby on ruby 1.9.2, but I have failed. I have already posted this problem on https://github.com/homonecloco/samtools-ruby/issues#issue/3 , but I have not got any response. > > What did I wrong? > > Michal > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby -- R.J.P.B. 

From bonnalraoul at ingm.it Mon Jan 31 10:27:51 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 31 Jan 2011 11:27:51 +0100
Subject: [BioRuby] BioGem and Rails
Message-ID: <1CCE39E6-F232-44C8-B95D-3C620443EF5C@ingm.it>

Dear All,
I've created a new branch in biogem.
https://github.com/helios/bioruby-gem/tree/rails_engine
It adds an option to the biogem script for creating a Rails engine with your gem (Rails 3 ONLY!).
The idea is: develop a gem that can be used in a script, and extend it so that it can be integrated into a Rails project.
Which libraries can benefit from this approach? I think databases, parsers, or any data that you want to expose to a Rails application.
It's at a very early stage, so don't use it yet; this message is just to let you know that we are adding new features.

From the help:
    --with-engine create a Rails engine with the namespace give in input. Set default database creation

Note: It is not yet possible to add the engine to an existing gem; I need to fix that and implement the generator for this task.

Any input is welcome.

Ciao.

--
R.J.P.B.

From jan.aerts at gmail.com Mon Jan 31 15:07:39 2011
From: jan.aerts at gmail.com (Jan Aerts)
Date: Mon, 31 Jan 2011 16:07:39 +0100
Subject: [BioRuby] ruby-ensembl-api paper accepted by Bioinformatics
Message-ID:

All,

FYI: There is now a Bioinformatics paper that describes the Ruby API to the
Ensembl databases. Thanks to Francesco Strozzi for working on this with me.
You can find it here: http://bit.ly/fzQamR

At this moment this API covers the core and variation databases. If anyone
is interested in working on the API for compara or functional, please let me
know.

Kind regards,
jan.

From bonnalraoul at ingm.it Mon Jan 31 15:22:12 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 31 Jan 2011 16:22:12 +0100
Subject: [BioRuby] ruby-ensembl-api paper accepted by Bioinformatics
In-Reply-To:
References:
Message-ID: <193864A0-D798-4737-83CE-7A7932E4552C@ingm.it>

well done!

On 31/Jan/2011, at 16.07, Jan Aerts wrote:

> All,
>
> FYI: There is now a Bioinformatics paper that describes the Ruby API to the
> Ensembl databases. Thanks to Francesco Strozzi for working on this with me.
> You can find it here: http://bit.ly/fzQamR
>
> At this moment this API covers the core and variation databases. If anyone
> is interested in working on the API for compara or functional, please let me
> know.
>
> Kind regards,
> jan.

--
R.J.P.B.