From marian.povolny at gmail.com Sat May 5 09:07:30 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sat, 5 May 2012 15:07:30 +0200
Subject: [BioRuby] GSoC weekly status report No.1
Message-ID:

Hello all,

It might be a little early, but there has been so much going on in the last 10 days since the results of GSoC were published...

http://blog.mpthecoder.com/post/22380853664/gsoc-weekly-status-report-no-1

A short summary:

It has been 10 days since the GSoC results were published, and a lot has happened since then. I got to know the other students and mentors in a longish meeting on Google Hangout, I got into a discussion with my mentor on IRC in which we didn't agree about the parallelization strategy for the parser (experiments will show who's right), and my inbox is full of mails from my mentor and other students, in which we exchanged loads of interesting ideas. Also, I fixed a bug in the biogems.info website, which was stopping Pjotr from updating the website with new information about biogems.

There is now a GitHub repository for my project:

https://github.com/mamarjan/bioruby-hpc-gff3

The work for the first week of coding is halfway done too.

There seems to be huge interest in a GFF3 parser with more features, like indexing, random access and writing output, and also support for linking into trees of features that are not located close to each other in the file. A fast sequential parser could be used to generate indexes, and the lower-level parts can be used to reorder the file for faster future usage. Based on that, I think this project is a good start.

*If you're using the GFF3/GTF file formats in your research, I would like to ask you to send me example files and descriptions of how your applications use the data. This way I'll be able to test the parser against your files and optimize it for your applications.
Currently I have GFF files from Ensembl and Wormbase, and Pjotr pointed me to the genome browser web application at wormbase.org.* -- Marjan From lomereiter at googlemail.com Sun May 6 15:56:50 2012 From: lomereiter at googlemail.com (Artem Tarasov) Date: Sun, 6 May 2012 23:56:50 +0400 Subject: [BioRuby] [GSoC][BAM] Weekly report No. 0 Message-ID: Hi all, I wrote a few words about what I've done last week: http://lomereiter.wordpress.com/2012/05/06/gsoc-weekly-report-0/ Summary: The code is available at github: https://github.com/lomereiter/BAMread/ I already started to write code planned for the first week so as to have more time in June for exam preparation. Opening BAM and parsing SAM header works, and is available from Ruby, and now I need to write some tests and documentation. Also, I described some compile-time metaprogramming tricks in D which I use to reduce duplication in the code. I'd be grateful for some small BAM files, 1-50 kilobytes in size, with non-empty headers, for testing purposes. -- Artem From bonnal at ingm.org Mon May 7 03:08:53 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Mon, 07 May 2012 09:08:53 +0200 Subject: [BioRuby] [GSoC] BioRuby wiki In-Reply-To: Message-ID: Dear All, BioRuby wiki is up to date with the accepted projects. I created new pages for each accepted project ( just created ). Are we going to keep it up to date with results and summarizing blog posts or what ? -- Ra From p.j.a.cock at googlemail.com Mon May 7 03:31:09 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 7 May 2012 08:31:09 +0100 Subject: [BioRuby] [GSoC] BioRuby wiki In-Reply-To: References: Message-ID: On Monday, May 7, 2012, Raoul Bonnal wrote: > Dear All, > BioRuby wiki is up to date with the accepted projects. I created new pages > for each accepted project ( just created ). Are we going to keep it up to > date with results and summarizing blog posts or what ? 
> > Blog posts (sent to the mailing list too) for weekly updates, but more static wiki page for summary? You can link to the blog posts from the wiki too. Peter From pjotr.public14 at thebird.nl Mon May 7 03:49:09 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Mon, 7 May 2012 09:49:09 +0200 Subject: [BioRuby] [GSoC] BioRuby wiki In-Reply-To: References: Message-ID: <20120507074909.GB30679@thebird.nl> I was thinking to add news items to biogems.info, and its RSS feed. That gets updated a few times a day. Anyone interested in helping out? Should be straightforward: - Add YAML ./etc/blogs.yaml with links to BLOG RSS feeds - Write script to fetch these and merge it with the RSS for biogems That would give us a new RSS feed. Useful. Next step: - Add news column on main http://biogems.info/ page - Fill it with same RSS items Later I would also like to add a list of active pushes to projects (github style). But that is later. Pj. On Mon, May 07, 2012 at 09:41:48AM +0200, Raoul Bonnal wrote: > Fine. > On 07/05/12 09.31, "Peter Cock" <[1]p.j.a.cock at googlemail.com> wrote: > > On Monday, May 7, 2012, Raoul Bonnal wrote: > > Dear All, > BioRuby wiki is up to date with the accepted projects. I created new > pages > for each accepted project ( just created ). Are we going to keep it > up to > date with results and summarizing blog posts or what ? > > Blog posts (sent to the mailing list too) for weekly updates, > but more static wiki page for summary? You can link to the > blog posts from the wiki too. > Peter > > References > > 1. file://localhost/tmp/p.j.a.cock at googlemail.com From bonnal at ingm.org Mon May 7 03:41:48 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Mon, 07 May 2012 09:41:48 +0200 Subject: [BioRuby] [GSoC] BioRuby wiki In-Reply-To: Message-ID: Fine. On 07/05/12 09.31, "Peter Cock" wrote: > > > On Monday, May 7, 2012, Raoul Bonnal wrote: >> Dear All, >> BioRuby wiki is up to date with the accepted projects. 
I created new pages
>> for each accepted project ( just created ). Are we going to keep it up to
>> date with results and summarizing blog posts or what ?
>>
>
> Blog posts (sent to the mailing list too) for weekly updates,
> but more static wiki page for summary? You can link to the
> blog posts from the wiki too.
>
> Peter

From john.woods at marcottelab.org Tue May 8 18:08:47 2012
From: john.woods at marcottelab.org (John Woods)
Date: Tue, 8 May 2012 17:08:47 -0500
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
Message-ID:

Hi BioRuby folks,

I'm pleased to announce that we've opened applications for our first ever Summer of Code, generously sponsored by Brighter Planet.

http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/

Please note that we recommend you have your application in by *Monday*, which is really soon.

Help us out by sharing this around on various social media. Here are links to existing tweets/posts/etc that you can retweet/share/etc.

Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528
Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH
Reddit: http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/

Cheers,
John Woods
Director, SciRuby Project

From pjotr.public14 at thebird.nl Wed May 9 02:43:08 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 08:43:08 +0200
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To:
References:
Message-ID: <20120509064308.GA24946@thebird.nl>

Hi John,

That is awesome news! Google has set the right trend with these summer of code initiatives. The OBF has quite some experience with mentoring students, see

http://www.open-bio.org/wiki/Gsoc#Student_Progress_Reports

and one thing we think is very important is weekly meetings between students (and mentors), and weekly blogs by the students. These will be captured on http://biogems.info/.
It would be great your students participate in some of our meetings, so we can exchange ideas on Ruby and performance (we use extensions and parallel computing). Also I would like to invite your programme to blog, and that we track those blogs. Pj. On Tue, May 08, 2012 at 05:08:47PM -0500, John Woods wrote: > Hi BioRuby folks, > > I'm pleased to announce that we've opened applications for our first ever > Summer of Code, generously sponsored by Brighter Planet. > > http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/ > > Please note that we recommend you have your application in by *Monday*, > which is really soon. > > Help us out by sharing this around on various social media. Here are links > to existing tweets/posts/etc that you can retweet/share/etc. > > Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528 > Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH > Reddit: > http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/ > > Cheers, > John Woods > Director, SciRuby Project > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From pjotr.public14 at thebird.nl Wed May 9 13:14:49 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 19:14:49 +0200 Subject: [BioRuby] BioRuby on Travis-ci! Message-ID: <20120509171449.GA29529@thebird.nl> Hi, Some have maybe noticed Goto-san put BioRuby on travis-ci now! See http://travis-ci.org/#!/bioruby/bioruby You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test failure. JRuby fails on a handful of tests and the crash on Rubinius looks spectacular. Note the clever .travis.yml file. We invite you to submit fixes to these tests. 
Especially our GSoC students, and other students on this ML, can get honors by providing a few fixes, and/or sending in issues to the JRuby/Rubinius projects :). Note both JRuby and Rubinius come with very interesting debugger support. Worth a shot. Your chance to show your Ruby muscles! Pj. From p.j.a.cock at googlemail.com Wed May 9 13:26:31 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 9 May 2012 18:26:31 +0100 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <20120509171449.GA29529@thebird.nl> References: <20120509171449.GA29529@thebird.nl> Message-ID: On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins wrote: > Hi, > > Some have maybe noticed Goto-san put BioRuby on travis-ci now! See > > ?http://travis-ci.org/#!/bioruby/bioruby > > You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test > failure. ?JRuby fails on a handful of tests and the crash on Rubinius > looks spectacular. > > Note the clever .travis.yml file. > > We invite you to submit fixes to these tests. Especially our GSoC > students, and other students on this ML, can get honors by providing > a few fixes, and/or sending in issues to the JRuby/Rubinius projects > :). Note both JRuby and Rubinius come with very interesting debugger > support. Worth a shot. Your chance to show your Ruby muscles! > > Pj. And if you can fix the different bug identified via the BuildBot too, even better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html Starting from a clean nightly test result makes spotting regressions much easier ;) Peter From pjotr.public14 at thebird.nl Wed May 9 13:32:39 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 19:32:39 +0200 Subject: [BioRuby] BioRuby on Travis-ci! 
In-Reply-To: References: <20120509171449.GA29529@thebird.nl> Message-ID: <20120509173239.GA30220@thebird.nl> Right, the link is here http://testing.open-bio.org/bioruby/one_line_per_build (I need to incorporate this also in http://biogems.info/) On Wed, May 09, 2012 at 06:26:31PM +0100, Peter Cock wrote: > And if you can fix the different bug identified via the BuildBot too, even > better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html > > Starting from a clean nightly test result makes spotting regressions > much easier ;) > > Peter > From cjfields at illinois.edu Wed May 9 13:29:49 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 9 May 2012 17:29:49 +0000 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> Message-ID: <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu> On May 9, 2012, at 12:26 PM, Peter Cock wrote: > On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins wrote: >> Hi, >> >> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See >> >> http://travis-ci.org/#!/bioruby/bioruby >> >> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test >> failure. JRuby fails on a handful of tests and the crash on Rubinius >> looks spectacular. >> >> Note the clever .travis.yml file. >> >> We invite you to submit fixes to these tests. Especially our GSoC >> students, and other students on this ML, can get honors by providing >> a few fixes, and/or sending in issues to the JRuby/Rubinius projects >> :). Note both JRuby and Rubinius come with very interesting debugger >> support. Worth a shot. Your chance to show your Ruby muscles! >> >> Pj. 
> > And if you can fix the different bug identified via the BuildBot too, even > better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html > > Starting from a clean nightly test result makes spotting regressions > much easier ;) > > Peter *sigh* Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P chris From pjotr.public14 at thebird.nl Wed May 9 13:35:17 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 19:35:17 +0200 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu> References: <20120509171449.GA29529@thebird.nl> <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu> Message-ID: <20120509173517.GB30220@thebird.nl> On Wed, May 09, 2012 at 05:29:49PM +0000, Fields, Christopher J wrote: > *sigh* > > Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P Peter knows someone in Scotland who can help! Now I got to see a man about a sheep... Pj. From p.j.a.cock at googlemail.com Wed May 9 13:49:59 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 9 May 2012 18:49:59 +0100 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <20120509171449.GA29529@thebird.nl> References: <20120509171449.GA29529@thebird.nl> Message-ID: On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins wrote: > Hi, > > Some have maybe noticed Goto-san put BioRuby on travis-ci now! See > > ?http://travis-ci.org/#!/bioruby/bioruby > > You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test > failure. ?JRuby fails on a handful of tests and the crash on Rubinius > looks spectacular. > > Note the clever .travis.yml file. > > We invite you to submit fixes to these tests. Especially our GSoC > students, and other students on this ML, can get honors by providing > a few fixes, and/or sending in issues to the JRuby/Rubinius projects > :). 
Note both JRuby and Rubinius come with very interesting debugger > support. Worth a shot. Your chance to show your Ruby muscles! > > Pj. I see Travis supports Perl, Python and Java too (amongst others) so could be used by the other Bio* projects too for nightly testing (on a 32bit Debian Linux platform). How did you do this in Travis regarding the GitHub authorization? I don't see any way when logged in as me (peterjc) to allow Travis access to the repositories of GitHub organizations I have access to (like Biopython). Thanks, Peter From p.j.a.cock at googlemail.com Wed May 9 13:56:17 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 9 May 2012 18:56:17 +0100 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> Message-ID: On Wed, May 9, 2012 at 6:49 PM, Peter Cock wrote: > On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins wrote: >> Hi, >> >> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See >> >> ?http://travis-ci.org/#!/bioruby/bioruby >> >> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test >> failure. ?JRuby fails on a handful of tests and the crash on Rubinius >> looks spectacular. >> >> Note the clever .travis.yml file. >> >> We invite you to submit fixes to these tests. Especially our GSoC >> students, and other students on this ML, can get honors by providing >> a few fixes, and/or sending in issues to the JRuby/Rubinius projects >> :). Note both JRuby and Rubinius come with very interesting debugger >> support. Worth a shot. Your chance to show your Ruby muscles! >> >> Pj. > > I see Travis supports Perl, Python and Java too (amongst others) > so could be used by the other Bio* projects too for nightly testing > (on a 32bit Debian Linux platform). > > How did you do this in Travis regarding the GitHub authorization? 
> I don't see any way when logged in as me (peterjc) to allow Travis > access to the repositories of GitHub organizations I have access > to (like Biopython). I found there is an open issue on this missing feature: https://github.com/travis-ci/travis-ci/issues/242 There a comment links to a manual workaround: http://about.travis-ci.org/docs/user/how-to-setup-and-trigger-the-hook-manually/ I'm guessing that's how you did it for BioRuby? Thanks, Peter From mail at michaelbarton.me.uk Wed May 9 14:24:54 2012 From: mail at michaelbarton.me.uk (Michael Barton) Date: Wed, 9 May 2012 14:24:54 -0400 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <20120509171449.GA29529@thebird.nl> References: <20120509171449.GA29529@thebird.nl> Message-ID: <20120509182454.GA4429@bartonh-mbp-01.uanet.edu> Travis CI is also rolling out a new feature when pull requests on github are automatically tested using the specs in the upstream merge. This can make it much easier to spot broken builds (and vice versa) before they are merged into the blessed branch. http://about.travis-ci.org/blog/announcing-pull-request-support/ On Wed, May 09, 2012 at 07:14:49PM +0200, Pjotr Prins wrote: > Hi, > > Some have maybe noticed Goto-san put BioRuby on travis-ci > now! See > > http://travis-ci.org/#!/bioruby/bioruby > > You can see MRI 1.9.x passes, and 1.8.7 has only a small > unit test failure. JRuby fails on a handful of tests and > the crash on Rubinius looks spectacular. > > Note the clever .travis.yml file. > > We invite you to submit fixes to these tests. Especially > our GSoC students, and other students on this ML, can get > honors by providing a few fixes, and/or sending in issues > to the JRuby/Rubinius projects :). Note both JRuby and > Rubinius come with very interesting debugger support. > Worth a shot. Your chance to show your Ruby muscles! > > Pj. 
_______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From john.woods at marcottelab.org Wed May 9 15:25:38 2012 From: john.woods at marcottelab.org (John Woods) Date: Wed, 9 May 2012 14:25:38 -0500 Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship In-Reply-To: <20120509064308.GA24946@thebird.nl> References: <20120509064308.GA24946@thebird.nl> Message-ID: Hi Pjotr, I'll discuss having our fellow participate in some of your meetings with the SciRuby team. I think the weekly meetings suggestion is a very good one, and we definitely do pay attention to how BioRuby handles its GSoC fellows. We do blog periodically. You can find it here: http://sciruby.com/blog/ I'll make sure that blogging is also a requirement for our fellow. Cheers, John On Wed, May 9, 2012 at 1:43 AM, Pjotr Prins wrote: > Hi John, > > That is awesome news! Google has set a right trend with these summer > of code initiatives. The OBF has quite some experience with mentoring > students, see > > http://www.open-bio.org/wiki/Gsoc#Student_Progress_Reports > > and one thing we thing very important is weekly meetings > between students (and mentors), and weekly blogs by the students. > These will be captured on http://biogems.info/. > > It would be great your students participate in some of our meetings, > so we can exchange ideas on Ruby and performance (we use extensions > and parallel computing). Also I would like to invite your programme > to blog, and that we track those blogs. > > Pj. > > On Tue, May 08, 2012 at 05:08:47PM -0500, John Woods wrote: > > Hi BioRuby folks, > > > > I'm pleased to announce that we've opened applications for our first ever > > Summer of Code, generously sponsored by Brighter Planet. 
> > > > http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/ > > > > Please note that we recommend you have your application in by *Monday*, > > which is really soon. > > > > Help us out by sharing this around on various social media. Here are > links > > to existing tweets/posts/etc that you can retweet/share/etc. > > > > Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528 > > Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH > > Reddit: > > > http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/ > > > > Cheers, > > John Woods > > Director, SciRuby Project > > _______________________________________________ > > BioRuby Project - http://www.bioruby.org/ > > BioRuby mailing list > > BioRuby at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioruby > > > From p.j.a.cock at googlemail.com Wed May 9 13:44:37 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 9 May 2012 18:44:37 +0100 Subject: [BioRuby] BioPerl BuildBot Message-ID: Hi all, I've retitled this and sent it to the BioPerl list, continuing from this thread on the BioRuby list: http://lists.open-bio.org/pipermail/bioruby/2012-May/002247.html On Wed, May 9, 2012 at 6:35 PM, Pjotr Prins wrote: > On Wed, May 09, 2012 at 05:29:49PM +0000, Fields, Christopher J wrote: >> *sigh* >> >> Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P > > Peter knows someone in Scotland who can help! Now I got to see a man > about a sheep... > > Pj. You mean Dolly The Sheep? ;) Tiago or I can assist on the BuilBot server side for BioPerl - in fact Tiago had already made a start (CC'd). 
We'll need help from a BioPerl developer with a spare machine or two to use as a buildslave (and I can probably borrow some of my employer's which are already nightly tests) to help with how we setup the BuildSlaves - essentially how to get BioPerl and relevant dependencies installed, and then what needs to be done from a fresh git checkout to build and run the tests. Tiago has got this currently: perl Build.PL --accepts ./Build test Once that is working on a single buildslave we can talk about different targets which is where BuildBot is really helpful (e.g. versions of Perl, different OS, etc) Peter From pjotr.public14 at thebird.nl Wed May 9 17:31:58 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 23:31:58 +0200 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> Message-ID: <20120509213158.GB31329@thebird.nl> On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote: > I'm guessing that's how you did it for BioRuby? I think I added it before we were a github organization. Or we were just lucky :) Pj. From pjotr.public14 at thebird.nl Thu May 10 03:27:47 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Thu, 10 May 2012 09:27:47 +0200 Subject: [BioRuby] BioRuby rss news feed Message-ID: <20120510072747.GA4587@thebird.nl> Marjan and I have revamped the BioRuby/biogems news feed. See http://www.biogems.info/rss.xml Health warning: Includes opiniated and caffeenated Google Summer of Code blog entries :) Pj. From p.j.a.cock at googlemail.com Thu May 10 06:31:07 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 10 May 2012 11:31:07 +0100 Subject: [BioRuby] BioRuby on Travis-ci! 
In-Reply-To: <20120509213158.GB31329@thebird.nl> References: <20120509171449.GA29529@thebird.nl> <20120509213158.GB31329@thebird.nl> Message-ID: On Wed, May 9, 2012 at 10:31 PM, Pjotr Prins wrote: > On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote: >> I'm guessing that's how you did it for BioRuby? > > I think I added it before we were a github organization. Or we were > just lucky :) > > Pj. I'd guess the former - I've now got a personal Travis account via my personal GitHub account), but for now I can't seem to create a Biopython Travis account via the Biopython organization account on GitHub. Nevertheless, I could get the basic Biopython unit tests running on Travis last night (including Python 3), although this needs more work installing dependencies to get the full test suite coverage: http://travis-ci.org/#!/peterjc/biopython Peter From pjotr.public14 at thebird.nl Thu May 10 12:40:02 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Thu, 10 May 2012 18:40:02 +0200 Subject: [BioRuby] BioRuby rss news feed In-Reply-To: <20120510072747.GA4587@thebird.nl> References: <20120510072747.GA4587@thebird.nl> Message-ID: <20120510164002.GA9030@thebird.nl> http://www.biogems.info/ also shows news items and blog entries on the right now. If you want your blog on Bio/Ruby added, just tell us :) Pj. On Thu, May 10, 2012 at 09:27:47AM +0200, Pjotr Prins wrote: > Marjan and I have revamped the BioRuby/biogems news feed. See > > http://www.biogems.info/rss.xml > > Health warning: Includes opiniated and caffeenated Google Summer of Code > blog entries :) > > Pj. 
> > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From georgkam at gmail.com Thu May 10 13:34:14 2012 From: georgkam at gmail.com (George Githinji) Date: Thu, 10 May 2012 20:34:14 +0300 Subject: [BioRuby] BioRuby rss news feed In-Reply-To: <20120510072747.GA4587@thebird.nl> References: <20120510072747.GA4587@thebird.nl> Message-ID: Thanks for all the hardwork! On Thu, May 10, 2012 at 10:27 AM, Pjotr Prins wrote: > Marjan and I have revamped the BioRuby/biogems news feed. See > > ?http://www.biogems.info/rss.xml > > Health warning: Includes opiniated and caffeenated Google Summer of Code > blog entries :) > > Pj. > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby -- --------------- Sincerely George Skype: george_g2 Blog: http://biorelated.wordpress.com/ Twitter: http://twitter.com/#!/george_l From pjotr.public14 at thebird.nl Fri May 11 05:06:48 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Fri, 11 May 2012 11:06:48 +0200 Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship In-Reply-To: References: <20120509064308.GA24946@thebird.nl> Message-ID: <20120511090648.GA15897@thebird.nl> We can now list non-biogem rubygems. SciRuby is listed on http://www.biogems.info/rubygems.html Pj. From bonnal at ingm.org Fri May 11 06:58:44 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Fri, 11 May 2012 12:58:44 +0200 Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship In-Reply-To: <20120511090648.GA15897@thebird.nl> Message-ID: +1 :) On 11/05/12 11.06, "Pjotr Prins" wrote: > We can now list non-biogem rubygems. > > SciRuby is listed on http://www.biogems.info/rubygems.html > > Pj. 
> > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From throwern at msu.edu Fri May 11 10:20:19 2012 From: throwern at msu.edu (Nick Thrower) Date: Fri, 11 May 2012 10:20:19 -0400 Subject: [BioRuby] BioTabix gem Message-ID: Hello all, I recently released a bio-tabix gem. It is available on rubygems: https://rubygems.org/gems/bio-tabix and Github: https://github.com/throwern/bio-tabix The gem binds ruby to the samtools tabix utility for indexing and parsing regions of tab delimited files. http://samtools.sourceforge.net/tabix.shtml Feel free to contact me with any comments or suggestions. Best, Nick -- Nick Thrower Information Technology Professional Michigan State University Great Lakes Bioenergy Research Center From pjotr.public14 at thebird.nl Fri May 11 11:43:49 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Fri, 11 May 2012 17:43:49 +0200 Subject: [BioRuby] BioTabix gem In-Reply-To: References: Message-ID: <20120511154349.GB17747@thebird.nl> Super :) On Fri, May 11, 2012 at 10:20:19AM -0400, Nick Thrower wrote: > Hello all, > > I recently released a bio-tabix gem. > > It is available on rubygems: > https://rubygems.org/gems/bio-tabix > > and Github: > https://github.com/throwern/bio-tabix > > The gem binds ruby to the samtools tabix utility for indexing and parsing regions of tab delimited files. http://samtools.sourceforge.net/tabix.shtml > > Feel free to contact me with any comments or suggestions. 
> > Best, > Nick > > -- > Nick Thrower > Information Technology Professional > Michigan State University > Great Lakes Bioenergy Research Center > > > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From cswh at umich.edu Fri May 11 21:21:02 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Fri, 11 May 2012 21:21:02 -0400 Subject: [BioRuby] Submitted JRuby bug and RubySpec addition for unit test failures under JRuby Message-ID: <57CFFD67-58BC-41AD-87E8-9C70A0A7AC97@umich.edu> Hi all, I've noticed that many of the BioRuby unit tests are failing under JRuby, locally and on travis-ci, with NameErrors for 'uninitialized constant' conditions. Many of these tests work when running just a single test script in isolation, but fail when the full suite is run with 'rake test'. I've identified the root cause of this problem, which appears to be a JRuby bug triggered when an autoload entry is defined, the file which would have been autoloaded is explicitly required, and the autoload entry is defined again. Subsequent attempts to access the target of the autoload entry fail with a NameError. This is an unusual sequence of events, but BioRuby and its test suites contain many 'horizontal' autoload entries between various parts of the source tree. For instance, bio/sequence/common.rb sets up an autoload for Bio::Locations, which I observed causing a problem with subsequent use of Bio::Locations. I created a minimized RubySpec illustrating the problem, which succeeds under MRI but fails under JRuby, and submitted it: https://github.com/rubyspec/rubyspec/pull/136 I also filed JRUBY-6658 (http://jira.codehaus.org/browse/JRUBY-6658) for this. If this bug is accepted and fixed, JRuby versions containing the fix should do much better on the test suite. 
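The sequence described above can be sketched in a few lines of Ruby. This is only an illustration under assumptions, not the RubySpec that was submitted: the module `Demo`, the constant `Locations`, the method name and the temporary file are hypothetical stand-ins for BioRuby's real modules and files. Going by the report, MRI would be expected to resolve the constant, while an affected JRuby version would raise a NameError at the last step.

```ruby
require "tmpdir"

# Hypothetical namespace standing in for Bio; this is not BioRuby code.
module Demo; end

# Replays the reported sequence and returns true if the constant still resolves.
def autoload_survives_require?
  Dir.mktmpdir do |dir|
    # Hypothetical target file, standing in for the file that defines Bio::Locations.
    path = File.join(File.realpath(dir), "demo_locations.rb")
    File.write(path, "module Demo; class Locations; end; end\n")

    Demo.autoload(:Locations, path)  # 1. an autoload entry is defined
    require path                     # 2. the target file is required explicitly
    Demo.autoload(:Locations, path)  # 3. the same autoload entry is defined again

    begin
      Demo::Locations                # 4. access the autoload target
      true
    rescue NameError
      false                          # the failure mode described in the report
    end
  end
end

puts autoload_survives_require?
```

Running the same script under each interpreter listed in .travis.yml would be one way to see which ones are affected.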
Without a JRuby fix, it might be possible to work around this by restructuring autoloading in the BioRuby code base to avoid horizontal autoload invocations (that is, autoload declarations not in the parent of the module to be autoloaded), but that could be too invasive to justify.

Clayton Wheeler
cswh at umich.edu

From marian.povolny at gmail.com Sat May 12 15:46:46 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sat, 12 May 2012 21:46:46 +0200
Subject: [BioRuby] GSoC weekly status report No.1.1
Message-ID:

Hi all,

Here is my status report for this week:

This year we, the GSoC students, sure are a very creative group: just look at our numbering schemes for our status reports for the pre-coding period - everyone has his own thing going :)

And now back to the GFF3 project. I found a few more sites with big GFF3 files; those will be great for performance testing. Robert Buels suggested that I should reuse the test suite from Perl's Bio::GFF3::LowLevel::Parser, and I think that's a great idea. I should definitely use that for completeness testing, and I will check the test suites of other GFF3 parsers.

I have also finished the work for the first week. That basically means I'm already more than two weeks ahead of schedule. The parser is now reading data on the D side and forwarding it to Ruby line by line. That won't be faster than reading the file from Ruby, but it's a nice basic case to get data flowing from D to Ruby.

The rake tasks have been improved too. There are now two tasks for building the D library, "compile" and "compiledebug", and there is the "spec" task for running rspec tests and the "features" task for running cucumber tests. The "clean" task now deletes object and library files.

There is also a problem with the D library and garbage collector. It seems this is the problem Iain Buclaw (one of the GDC developers) has warned us about.
When a D shared library is used and the GC kicks in for the first time, it looks as if it collects all the static data, for example the per-module variables. Pretty much everything gets collected, even a chunk of memory allocated with malloc that we register with the GC. Or at least that's what it looks like. However, Iain also assured us that this will be solved by the end of this month/beginning of the next. My cucumber and rspec tests still work because they don't require enough memory for the GC to run, but to be sure that this issue doesn't interfere with development at this point, I manually disabled the GC on library initialization. I haven't tried it yet, but from what has been discussed in the forums, both 32 and 64-bit DLLs on Windows built using DMD work fine. I also helped Pjotr with getting our blog posts included in the RSS feed on biogems.info. That's all for now, you can find this report on my blog too: http://blog.mpthecoder.com/post/22919943701/gsoc-weekly-status-report-no-1-1 -- Best regards, Marjan From lomereiter at googlemail.com Sun May 13 16:10:45 2012 From: lomereiter at googlemail.com (Artem Tarasov) Date: Mon, 14 May 2012 00:10:45 +0400 Subject: [BioRuby] [GSoC] Weekly report No 0.5 Message-ID: Hi all, this is yet another GSoC report. During the last week, I mainly concentrated on the D part of the project, adding functionality to it. I implemented parsing of the whole BAM file :) Today I wrote a simple utility in D, which uses my library to convert BAM to SAM. It doesn't work with array tags yet, and is not as fast as samtools, but nevertheless... On a couple of BAM files from the test/data directory (namely, bins.bam and ex1_header.bam) the output is identical to that of samtools view (I checked with diff), and that kinda proves that everything works fine. Speed issues are mainly due to using the std.variant module for storing tags. It uses runtime reflection, which is quite slow. Maybe there are some other reasons as well.
Anyway, I'm going to write my own tagged union type next week; it should improve the performance quite a bit and also fix design flaws. For testing tag parsing, I used the file tags.bam provided to me by Peter Cock. It contains tests for all types of tags, and my library successfully passes them. Later I'll experiment with possible speed improvements, and having unit tests covering the full range of possible tag types is a must. Also, I downloaded and compiled gdc from trunk. It provides decent performance, not worse than dmd, at least. We expect gdc to gain shared library support in the next two months. Before that happens, we have to use dmd, although there are some issues with its garbage collector, causing segfaults. We discussed that with Marjan and Pjotr and decided that the best option in such circumstances would be to disable the GC during development; testing the library on small files won't consume much memory anyway. Another thing I downloaded and compiled is Rubinius. I'm going to investigate why it hangs on the BioRuby unit tests in 1.9 mode. The other mode, 1.8, seems to work fine except for maybe some very minor bugs. During the next week, I'm going to learn how to use Cucumber and RSpec, improve D library performance a little, and start to write Ruby bindings. So it will be mostly a "Ruby week" ;) -- Artem From cswh at umich.edu Mon May 14 23:36:17 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Mon, 14 May 2012 23:36:17 -0400 Subject: [BioRuby] GSoC week 1 status report Message-ID: <2D9F6030-8A11-4443-B610-58464F506EE5@umich.edu> Hi all, I've put my first GSoC status report on my project blog: http://csw.github.com/bioruby-maf/blog/2012/05/13/progress/ (The web version of this has 100% more hyperlinks, but here's a plain text version, too.) This has been my first half-week of work on my Google Summer of Code project, and it's off to an exciting start.
The first order of business has been to get my development environment together; since I've been a microbiology student instead of a programmer for the last year, it's taken some work. In that process, I've ended up making a few open source contributions just to get my tools working the way I want. I'm running GNU Emacs 24 and trying to take more advantage of it than I have in the past. I'll have much more to say about this in a future post. I've also started working on the BioRuby unit test failures under JRuby, as a way of familiarizing myself with the BioRuby code base as well as the community and its development processes. Right now, JRuby in 1.8 mode is showing 6 failures and 126 errors, which is hardly confidence-inspiring for people considering using JRuby with BioRuby. This is too bad, since JRuby has some definite advantages as a Ruby implementation. After looking into these failures, I've broken them down into a few categories:
- temporary file permissions problems, likely due to some sort of Travis-CI environment issue
- a bug in JRuby's implementation of Open3.popen3 which I'm working up a bug report for
- an odd autoload problem I've filed JRUBY-6658 for and sent an accompanying RubySpec patch for
- a problem with libxml-jruby, which appears unmaintained, for which I've submitted a BioRuby patch plus JRUBY-6662
- and a small test case bug relating to floating point handling, which I've submitted a patch for.
Once these are resolved, JRuby should be passing the BioRuby unit tests in 1.8 mode, and closer to passing in 1.9 mode. (There are a few extra failures under 1.9 that I haven't sorted through yet.) I've also gotten a start on my project itself, creating the bioruby-maf Github repository with a project skeleton and writing my first Cucumber feature for it. This is, in fact, my first Cucumber feature ever.
However, I did spend a few cross-country flights reading the RSpec and Cucumber books last week; between that and cribbing from Pjotr's code I feel like I have some idea what I'm doing. Just assembling that feature has been useful, too, since I've had to get several of the existing MAF tools running on my machine. In fact, my test MAF data and the FASTA version of it are courtesy of bx-python, which will be my reference implementation in many respects. Clayton Wheeler cswh at umich.edu From cswh at umich.edu Tue May 15 13:08:20 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Tue, 15 May 2012 13:08:20 -0400 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it Message-ID: Hi all, The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K) There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well. However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML.
It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach? Clayton Wheeler cswh at umich.edu From pjotr.public14 at thebird.nl Tue May 15 14:54:32 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Tue, 15 May 2012 20:54:32 +0200 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it In-Reply-To: References: Message-ID: <20120515185432.GC20185@thebird.nl> Marvellous work Clayton! My suggestion to BioRuby is to split out phyloxml and to deprecate the current library module. In the next release, or after, we should take out that code. I suspect few people really depend on it, and they can adapt. I am partly responsible for that dependency, and I think the Travis-ci tests also point out that the purer Ruby BioRuby is, the better ;). Naohisa, what do you say? We should also ask the original author, even though she has left our little group and now works for google (and I am claiming Google does not recruit from GSoC :). Diana, maybe you are reading the ML? Pj. On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote: > Hi all, > > The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. 
(see http://bit.ly/JGWC4K) > > There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well. > > However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach? > > Clayton Wheeler > cswh at umich.edu > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From cjfields at illinois.edu Tue May 15 15:14:02 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 15 May 2012 19:14:02 +0000 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it In-Reply-To: <20120515185432.GC20185@thebird.nl> References: <20120515185432.GC20185@thebird.nl> Message-ID: I intend to follow the same tack with BioPerl's phyloxml (splitting it out), primarily so it can be maintained separately from the rest of BioPerl. chris On May 15, 2012, at 1:54 PM, Pjotr Prins wrote: > Marvellous work Clayton! My suggestion to BioRuby is to split out > phyloxml and to deprecate the current library module. In the next > release, or after, we should take out that code.
I suspect few people > really depend on it, and they can adapt. I am partly responsible for > that dependency, and I think the Travis-ci tests also point out that > the purer Ruby BioRuby is, the better ;). > > Naohisa, what do you say? We should also ask the original author, even > though she has left our little group and now works for google (and I > am claiming Google does not recruit from GSoC :). Diana, maybe you are > reading the ML? > > Pj. > > On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote: >> Hi all, >> >> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K) >> >> There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well. >> >> However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. 
Thoughts? Should I proceed with this approach? >> >> Clayton Wheeler >> cswh at umich.edu >> >> >> _______________________________________________ >> BioRuby Project - http://www.bioruby.org/ >> BioRuby mailing list >> BioRuby at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioruby >> > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From cswh at umich.edu Tue May 15 17:51:51 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Tue, 15 May 2012 17:51:51 -0400 Subject: [BioRuby] JRuby bug filed for Bio::Command-related unit test failures Message-ID: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> Hi all, I've submitted a bug report and patch for JRUBY-6666 (http://jira.codehaus.org/browse/JRUBY-6666), which should fix another set of JRuby unit test failures occurring when Bio::Command methods call Open3.popen3 (and perhaps even other similar exec-family methods). Would it be helpful for me to file a BioRuby bug to track this issue, perhaps on Github? Or perhaps create a wiki page to track unit test problems instead? Clayton Wheeler cswh at umich.edu From ngoto at gen-info.osaka-u.ac.jp Wed May 16 03:30:35 2012 From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO) Date: Wed, 16 May 2012 16:30:35 +0900 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> <20120509213158.GB31329@thebird.nl> Message-ID: <201205160739.q4G7dS4G004980@portal.open-bio.org> Hi, For BioRuby, I manually set the hook with my (ngoto's) personal Travis account. As far as I can see, organization accounts are currently not available in Travis.
Naohisa Goto ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org On Thu, 10 May 2012 11:31:07 +0100 Peter Cock wrote: > On Wed, May 9, 2012 at 10:31 PM, Pjotr Prins wrote: > > On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote: > >> I'm guessing that's how you did it for BioRuby? > > > > I think I added it before we were a github organization. Or we were > > just lucky :) > > > > Pj. > > I'd guess the former - I've now got a personal Travis account via my > personal GitHub account), but for now I can't seem to create a Biopython > Travis account via the Biopython organization account on GitHub. > > Nevertheless, I could get the basic Biopython unit tests running on > Travis last night (including Python 3), although this needs more > work installing dependencies to get the full test suite coverage: > http://travis-ci.org/#!/peterjc/biopython > > Peter > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From ngoto at gen-info.osaka-u.ac.jp Wed May 16 03:54:53 2012 From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO) Date: Wed, 16 May 2012 16:54:53 +0900 Subject: [BioRuby] JRuby bug filed for Bio::Command-related unit test failures In-Reply-To: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> Message-ID: <201205160754.q4G7srSc005733@portal.open-bio.org> Hi Clayton, In addition, we have a Redmine page hosted on OBF. https://redmine.open-bio.org/projects/bioruby Currently, bugs and feature requests moved from old RubyForge BTS are submitted. I think the Redmine page will be used for bugs and feature requests without pull requests. 
Naohisa Goto ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org On Tue, 15 May 2012 17:51:51 -0400 Clayton Wheeler wrote: > Hi all, > > I've submitted a bug report and patch for JRUBY-6666 (http://jira.codehaus.org/browse/JRUBY-6666), which should fix another set of JRuby unit test failures occurring when Bio::Command methods call Open3.popen3 (and perhaps even other similar exec-family methods). > > Would it be helpful for me to file a BioRuby bug to track this issue, perhaps on Github? Or perhaps create a wiki page to track unit test problems instead? > > Clayton Wheeler > cswh at umich.edu > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From anurag08priyam at gmail.com Wed May 16 04:15:40 2012 From: anurag08priyam at gmail.com (Anurag Priyam) Date: Wed, 16 May 2012 13:45:40 +0530 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <201205160739.q4G7dS4G004980@portal.open-bio.org> References: <20120509171449.GA29529@thebird.nl> <20120509213158.GB31329@thebird.nl> <201205160739.q4G7dS4G004980@portal.open-bio.org> Message-ID: On Wed, May 16, 2012 at 1:00 PM, Naohisa GOTO wrote: > For Bioruby, I manually set the hook with my (ngoto's) personal > Travis account. As far as I can see, organization accout in Travis > is currently not available. You are talking about the toggle button on your Travis profile page, right? For repos that belong to an organization, you need to enable Travis hook from Github (admin/service-hooks), iirc, using the token on your Travis profile page. 
-- Anurag Priyam From ngoto at gen-info.osaka-u.ac.jp Wed May 16 04:17:57 2012 From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO) Date: Wed, 16 May 2012 17:17:57 +0900 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it In-Reply-To: <20120515185432.GC20185@thebird.nl> References: <20120515185432.GC20185@thebird.nl> Message-ID: <201205160817.q4G8HwBO007774@portal.open-bio.org> Hi, Great work, Clayton! I think separate gem (Biogem) is good, too. Naohisa Goto ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org On Tue, 15 May 2012 20:54:32 +0200 Pjotr Prins wrote: > Marvellous work Clayton! My suggestion to BioRuby is to split out > phyloxml and to deprecate the current library module. In the next > release, or after, we should take out that code. I suspect few people > really depend on it, and they can adapt. I am partly responsible for > that dependency, and I think the Travis-ci tests also point out that > the purer Ruby BioRuby is, the better ;). > > Naohisa, what do you say? We should also ask the original author, even > though she has left our little group and now works for google (and I > am claiming Google does not recruit from GSoC :). Diana, maybe you are > reading the ML? > > Pj. > > On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote: > > Hi all, > > > > The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. 
(see http://bit.ly/JGWC4K) > > > > There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well. > > > > However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach? > > > > Clayton Wheeler > > cswh at umich.edu > > > > > > _______________________________________________ > > BioRuby Project - http://www.bioruby.org/ > > BioRuby mailing list > > BioRuby at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioruby > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From donttrustben at gmail.com Wed May 16 07:09:24 2012 From: donttrustben at gmail.com (Ben Woodcroft) Date: Wed, 16 May 2012 21:09:24 +1000 Subject: [BioRuby] hmmer3 Message-ID: Hi guys, I noticed today that there isn't HMMER3 support in bioruby - particularly I'm interested in a parser for hmmsearch outputs as I want to iterate over aligned positions. I noticed that there is mention of this in the 1.4.1 release notes, that hmmer3 will be supported in 1.5, although I'm not sure what exactly this means. 
http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ Can I ask what the state of this merge is please? Is there code somewhere just waiting to be merged? Can it be quickly spun out into a biogem in the meantime? Thanks, ben -- Ben Woodcroft From bonnal at ingm.org Wed May 16 07:27:18 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Wed, 16 May 2012 13:27:18 +0200 Subject: [BioRuby] hmmer3 In-Reply-To: Message-ID: If you need to wrap the binary, please have a look at our wrapper. I am wondering whether this wrapper could be useful to other gems; if so, I could create a separate gem just for it. Let me know. Docs for the wrapper are in the README. https://github.com/helios/bioruby-ngs/blob/master/lib/wrapper.rb https://github.com/helios/bioruby-ngs/blob/master/README.rdoc#wrapper On 16/05/12 13.09, "Ben Woodcroft" wrote: > Hi guys, > > I noticed today that there isn't HMMER3 support in bioruby - particularly > I'm interested in a parser for hmmsearch outputs as I want to iterate over > aligned positions. > > I noticed that there is mention of this in the 1.4.1 release notes, that > hmmer3 will be supported in 1.5, although I'm not sure what exactly this > means. > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ > > Can I ask what the state of this merge is please? Is there code somewhere > just waiting to be merged? Can it be quickly spun out into a biogem in the > meantime? > > Thanks, > ben From donttrustben at gmail.com Wed May 16 07:43:44 2012 From: donttrustben at gmail.com (Ben Woodcroft) Date: Wed, 16 May 2012 21:43:44 +1000 Subject: [BioRuby] hmmer3 In-Reply-To: References: Message-ID: Thanks for the feedback, dudes. I'm happy to spin it out myself, only I don't know where the code is. I don't personally need a wrapper, but I've got 40G of hmmsearch result files to parse. Relatedly, I've written a gem that parses HMM model files - I'll release that after a little more testing, hopefully tomorrow.
On 16 May 2012 21:27, Raoul Bonnal wrote: > If you need to wrap the binary please have a look at our wrapper. I > wondering is this wrapper could be useful to other gems, I could create a > separated gem just for it. Let me know. Docs about the wrapper is in the > readme. > > https://github.com/helios/bioruby-ngs/blob/master/lib/wrapper.rb > https://github.com/helios/bioruby-ngs/blob/master/README.rdoc#wrapper > > > On 16/05/12 13.09, "Ben Woodcroft" wrote: > > > Hi guys, > > > > I noticed today that there isn't HMMER3 support in bioruby - particularly > > I'm interested in a parser for hmmsearch outputs as I want to iterate over > > aligned positions. > > > > I noticed that there is mention of this in the 1.4.1 release notes, that > > hmmer3 will be supported in 1.5, although I'm not sure what exactly this > > means. > > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ > > > > Can I ask what the state of this merge is please? Is there code somewhere > > just waiting to be merged? Can it be quickly spun out into a biogem in the > > meantime? > > > > Thanks, > > ben > > > From ngoto at gen-info.osaka-u.ac.jp Wed May 16 07:48:14 2012 From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO) Date: Wed, 16 May 2012 20:48:14 +0900 Subject: [BioRuby] hmmer3 In-Reply-To: References: Message-ID: <201205161148.q4GBmFSj016839@portal.open-bio.org> Hi Ben, A HMMER3 result parser has been written by Christian: https://github.com/cmzmasek/bioruby I guess its quality may be good enough, except for the RDF/XML support, which is experimental. I'd like to discuss whether the class name Bio::Hmmer3Report is suitable. For HMMER2, the class is Bio::HMMER::Report. Naohisa Goto ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org On Wed, 16 May 2012 21:09:24 +1000 Ben Woodcroft wrote: > Hi guys, > > I noticed today that there isn't HMMER3 support in bioruby - particularly > I'm interested in a parser for hmmsearch outputs as I want to iterate over > aligned positions.
> > I noticed that there is mention of this in the 1.4.1 release notes, that > hmmer3 will be supported in 1.5, although I'm not sure what exactly this > means. > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ > > Can I ask what the state of this merge is please? Is there code somewhere > just waiting to be merged? Can it be quickly spun out into a biogem in the > meantime? > > Thanks, > ben > > -- > Ben Woodcroft > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From bonnal at ingm.org Wed May 16 08:46:34 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Wed, 16 May 2012 14:46:34 +0200 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it In-Reply-To: <201205160817.q4G8HwBO007774@portal.open-bio.org> Message-ID: Impressive. This is the right approach for cleaning BioRuby from dependencies which could create problems. Thanks Clayton. On 16/05/12 10.17, "Naohisa GOTO" wrote: > Hi, > > Great work, Clayton! > > I think separate gem (Biogem) is good, too. > > Naohisa Goto > ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org > > On Tue, 15 May 2012 20:54:32 +0200 > Pjotr Prins wrote: > >> Marvellous work Clayton! My suggestion to BioRuby is to split out >> phyloxml and to deprecate the current library module. In the next >> release, or after, we should take out that code. I suspect few people >> really depend on it, and they can adapt. I am partly responsible for >> that dependency, and I think the Travis-ci tests also point out that >> the purer Ruby BioRuby is, the better ;). >> >> Naohisa, what do you say? We should also ask the original author, even >> though she has left our little group and now works for google (and I >> am claiming Google does not recruit from GSoC :). Diana, maybe you are >> reading the ML? >> >> Pj. 
>> >> On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote: >>> Hi all, >>> >>> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby >>> gem (an implementation of the libxml API using native Java XML libraries) >>> does not support the full API of libxml-ruby. My first approach to this was >>> to simply use the native libxml-ruby gem and its C extension, which works >>> with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a >>> Unicode issue, and the JRuby developers indicate that the C extension API >>> (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 >>> mode. (see http://bit.ly/JGWC4K) >>> >>> There was a discussion of the PhyloXML parser on the mailing list a couple >>> of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be >>> rewritten to use Nokogiri at some point soon, since Nokogiri is now the de >>> facto standard XML parser. Following that lead, I've gone ahead and ported >>> the PhyloXML parser to use Nokogiri; it only took an hour or two, and the >>> unit tests are passing. My branch for this is at >>> https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a >>> good approach, I can port the writer as well. >>> >>> However, Pjotr suggested that it might make sense to split PhyloXML out into >>> a separate gem. This should be straightforward enough, since no other >>> BioRuby components appear to call PhyloXML. It would mean that any PhyloXML >>> users would need to install a separate gem. On the other hand, it would >>> remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I >>> proceed with this approach? 
>>> >>> Clayton Wheeler >>> cswh at umich.edu >>> >>> >>> _______________________________________________ >>> BioRuby Project - http://www.bioruby.org/ >>> BioRuby mailing list >>> BioRuby at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioruby >>> >> _______________________________________________ >> BioRuby Project - http://www.bioruby.org/ >> BioRuby mailing list >> BioRuby at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioruby > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From donttrustben at gmail.com Wed May 16 09:28:01 2012 From: donttrustben at gmail.com (Ben Woodcroft) Date: Wed, 16 May 2012 23:28:01 +1000 Subject: [BioRuby] hmmer3 In-Reply-To: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> References: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> Message-ID: Ah cool, thanks ngoto. Thanks for writing this, Christian. I believe I've extracted the hmmer3 stuff into a new biogem. I've added you as an author on this, Christian - hope that's ok with you? https://github.com/wwood/bioruby-hmmer3_report I've not released it to rubygems yet - I wanted to clear up namespace issues first. What do you suggest, Naohisa? Bio::HMMER::HMMER3::Report? On looking at the code, it seems it only handles tabular format data, which is rather unfortunate for me, as I need the actual alignment. Looks like I'll have to roll my sleeves up after all, unless there is yet more code out there that parses the regular textual format? I'm not sure about your feelings on this, Christian, but how do you feel about putting the rdf stuff in another biogem? If the aim is to get this gem merged into the bioruby core code (and I hope it is, since when people say hmmer nowadays they likely mean v3, not v2), maybe the rdf stuff is a bit tangential?
I also noticed that in the tests Christian referred to BioRubyTestDataPath which isn't recognised in the biogem. Is there a recommended way to do this in a biogem? Perhaps we should mirror what bioruby itself does to make the code more portable. Thanks everyone for the openness and responsiveness. ben On 16 May 2012 21:48, Naohisa GOTO wrote: > Hi Ben, > > HMMER3 result parser is written by Christian. > https://github.com/cmzmasek/bioruby > > I guess it may be enough quality, except RDF/XML support > which is experimental. > > I'd like to discuss that the class name Bio::Hmmer3Report > is suitable. For HMMER2, Bio::HMMER::Report. > > Naohisa Goto > ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org > > On Wed, 16 May 2012 21:09:24 +1000 > Ben Woodcroft wrote: > > > Hi guys, > > > > I noticed today that there isn't HMMER3 support in bioruby - particularly > > I'm interested in a parser for hmmsearch outputs as I want to iterate > over > > aligned positions. > > > > I noticed that there is mention of this in the 1.4.1 release notes, that > > hmmer3 will be supported in 1.5, although I'm not sure what exactly this > > means. > > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ > > > > Can I ask what the state of this merge is please? Is there code somewhere > > just waiting to be merged? Can it be quickly spun out into a biogem in > the > > meantime? 
> >
> > Thanks,
> > ben
> >
> > --
> > Ben Woodcroft

--
Ben Woodcroft
http://ecogenomic.org/users/ben-woodcroft

From pjotr.public14 at thebird.nl Wed May 16 09:46:12 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 16 May 2012 15:46:12 +0200
Subject: [BioRuby] hmmer3
In-Reply-To:
References: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
Message-ID: <20120516134612.GA26059@thebird.nl>

On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote:
> I'm not sure about your feelings on this Christian, but how do you feel
> about putting the rdf stuff in another biogem? If the aim is to get this
> gem merged into the bioruby core code (and I hope it is since when people
> say hmmer nowadays they likely mean v3, not v2), maybe the rdf stuff is a
> bit tangential?

I think it should be decoupled. RDF, in general, is a (searchable) result-based (post-parser) format. Maybe we should coin that definition somewhere :). I created the bio-rdf biogem as a 'sink' for RDF into triple stores. Sounds like bio-rdf is the right place for that translation code to me :). Feel free to push it in.

Pj.

From pjotr.public14 at thebird.nl Thu May 17 12:51:01 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 17 May 2012 18:51:01 +0200
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <201205160754.q4G7srSc005733@portal.open-bio.org>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> <201205160754.q4G7srSc005733@portal.open-bio.org>
Message-ID: <20120517165101.GA32610@thebird.nl>

I don't know if you all track github, but thanks to two GSoC coders (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius.

Travis-CI should show the green light for all Rubies once Rubinius itself gets updated on Travis :)

Kudos.

Pj.

From cjfields at illinois.edu Thu May 17 12:59:33 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 17 May 2012 16:59:33 +0000
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <20120517165101.GA32610@thebird.nl>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> <201205160754.q4G7srSc005733@portal.open-bio.org> <20120517165101.GA32610@thebird.nl>
Message-ID:

Sounds like GSoC this year is paying lots of dividends :)

chris

From cswh at umich.edu Thu May 17 13:42:06 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 13:42:06 -0400
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <20120517165101.GA32610@thebird.nl>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> <201205160754.q4G7srSc005733@portal.open-bio.org> <20120517165101.GA32610@thebird.nl>
Message-ID: <7D2E3046-44E9-4275-B294-8DB39D36294B@umich.edu>

On May 17, 2012, at 12:51 PM, Pjotr Prins wrote:

> I don't know if you all track github, but thanks to two GSoC coders
> (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius.
>
> Travis-CI should show the green light for all Rubies once Rubinius
> itself gets updated on Travis :)

Thanks Pjotr.
Unfortunately I think we're not going to be quite there for JRuby just yet; we've hit a couple of JRuby bugs which will probably need to be fixed to solve some of the failures. Also, I think we may be stuck with PhyloXML test failures under JRuby in 1.9 mode until we split that out into a separate gem. It's definitely progress, though.

Clayton Wheeler
cswh at umich.edu

From cswh at umich.edu Thu May 17 15:39:27 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 15:39:27 -0400
Subject: [BioRuby] PhyloXML and libxml-ruby
Message-ID:

Hi all,

It appears that the native extension for libxml-ruby is not building reliably under JRuby, causing Travis-CI runs to fail as seen at:

http://travis-ci.org/#!/ngoto/bioruby/jobs/1356992

I'm not having much luck identifying exactly why it builds in some JRuby environments and not others, but I've been able to reproduce the Travis-CI problem on a test Linux machine and don't see an obvious fix.

If we're going to repackage PhyloXML into a separate gem, I think the safest course of action would be to revert to calling for libxml-jruby in the Travis-CI Gemfiles (i.e. back out http://bit.ly/JmNjDY). Using libxml-ruby instead of libxml-jruby doesn't solve the PhyloXML problems on JRuby in 1.9 mode anyway, and 1.9 mode will soon be the default in JRuby. The PhyloXML gem can be explicitly declared to depend on libxml-ruby, and moving it out of the core BioRuby gem will remove this whole issue, as far as the unit tests go. Then PhyloXML's library requirements can be addressed separately.

Thoughts?

Clayton Wheeler
cswh at umich.edu

From cswh at umich.edu Thu May 17 23:10:52 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 23:10:52 -0400
Subject: [BioRuby] bio-phyloxml gem
Message-ID: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>

Hi all,

I have repackaged BioRuby's PhyloXML support as a separate gem:

https://github.com/csw/bioruby-phyloxml

I was able to preserve its revision history.
All the unit tests pass, too. I did take this opportunity to rename some of the files, so their names correspond to the namespace of the classes. I think I've set up the packaging appropriately, though I'd appreciate it if someone more experienced with the Biogems infrastructure could take a quick look at this. (Hint hint, Pjotr.)

Who should we designate as the maintainer? I suppose I have my hands on it, but if there are any volunteers? And if it would make more sense to host this under someone else's Github account, that should be easy enough.

Also, feel free to contribute changes to the README.

If everything looks good, I'll go ahead and set this up on Travis-CI, biogems.info, and Rubygems as version 1.0.0.

Clayton Wheeler
cswh at umich.edu

From donttrustben at gmail.com Fri May 18 00:59:44 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Fri, 18 May 2012 14:59:44 +1000
Subject: [BioRuby] hmmer3
In-Reply-To: <20120516134612.GA26059@thebird.nl>
References: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> <20120516134612.GA26059@thebird.nl>
Message-ID:

On 16 May 2012 23:46, Pjotr Prins wrote:

> On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote:
> > maybe the rdf stuff is a
> > bit tangential?
>
> I think it should be decoupled. RDF, in general, is a (searchable)
> result-based (post-parser) format. Maybe we should coin that
> definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF
> into triple stores. Sounds that bio-rdf is the right place for that
> translation code to me :). Feel free to push it in.
>

Thanks. I've removed the rdf related code all in one commit:
https://github.com/wwood/bioruby-hmmer3_report/commit/3795ce3a124011cb600e78e6ef10603187c99d20

However, I don't feel like I should be adding this to a different repository because I don't feel like I understand the technology enough, and therefore am not really inclined to maintain it.
All of the relevant code should be in that commit, so it should be quite simple to add in yourself if you are inclined (though I couldn't find any unit tests). Only, I've changed the namespace of it to Bio::HMMER::HMMER3::Report from Bio::Hmmer3report as Naohisa suggested.

I've also now pushed the new biogem to rubygems/biogems.info.

Thanks,
ben

From pjotr.public14 at thebird.nl Fri May 18 01:21:23 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 18 May 2012 07:21:23 +0200
Subject: [BioRuby] hmmer3
In-Reply-To:
References: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> <20120516134612.GA26059@thebird.nl>
Message-ID: <20120518052123.GA3360@thebird.nl>

OK, I'll take the orphaned RDF code.

From pjotr.public14 at thebird.nl Fri May 18 01:24:40 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 18 May 2012 07:24:40 +0200
Subject: [BioRuby] bio-phyloxml gem
In-Reply-To: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>
References: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>
Message-ID: <20120518052440.GB3360@thebird.nl>

I am with Raoul and Francesco today. We will take a look and discuss. Good job, also saving the revision history :).

From donttrustben at gmail.com Fri May 18 01:40:28 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Fri, 18 May 2012 15:40:28 +1000
Subject: [BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer
Message-ID:

Hi guys,

Here's some blatant advertising for some code I've recently written in biogem form.

bio-gag: "gag error" is the term I've coined to describe an error that various people have observed on certain sequencing kits with IonTorrent, though it has not previously been characterised very well that I know of (we noticed that the errors seemed to occur at GAG positions in the reads that were supposed to be GAAG). This biogem tries to find and fix these errors. It isn't benchmarked for accuracy but worked well enough for my lab's own purposes. Actually, to be honest, we've only used an older version of the software on real data and the logic has changed a little since, given some recent evidence we have, but I thought I'd push it out with the latest and greatest error model.
https://github.com/wwood/bioruby-gag

bio-pileup_iterator: To find gag errors bio-gag iterates through pileup files looking for particular patterns e.g. strand bias of insertions. This gem can be used to iterate through pileup files one position (one line) at a time, building up the sequence of each read as it goes, recording their direction etc. Probably not the fastest piece of code in the world, sorry. I'm not sure whether this should/can be incorporated into bio-samtools? It adds functionality - there's no duplication (I don't think).
https://github.com/wwood/bioruby-pileup_iterator

bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to the hmmer v3 manual.
https://github.com/wwood/bioruby-hmmer_model

bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles tabular format files - the guts of this were written by Christian - see yesterday's thread for details. I'm hoping to add regular (non-tabular) format parsing in the near future, but no promises.
https://github.com/wwood/bioruby-hmmer3_report

I'm sure there are bugs and deficiencies - apologies in advance.

Enjoy,
ben

From francesco.strozzi at gmail.com Fri May 18 04:01:01 2012
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Fri, 18 May 2012 10:01:01 +0200
Subject: [BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer
In-Reply-To:
References:
Message-ID:

Hi Ben,
thanks for the amazing work! I'm not using Ion Torrent atm but I eventually will, and it's good to see there is something already set up.

--
Francesco

From bonnal at ingm.org Fri May 18 04:54:44 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Fri, 18 May 2012 10:54:44 +0200
Subject: [BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer
In-Reply-To:
Message-ID:

My lab (Alberto) will try your HMM parsers because we are going to annotate a lot of stuff coming from NGS ^_^

From pjotr.public14 at thebird.nl Sun May 20 08:31:31 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 20 May 2012 14:31:31 +0200
Subject: [BioRuby] biogems.info updated
Message-ID: <20120520123131.GA17983@thebird.nl>

Marjan and I have updated the RSS feed for biogems.info - now we can support more blogs. If you are blogging on Ruby for Bioinformatics, give us the feed :)

Pj.

From marian.povolny at gmail.com Mon May 21 05:36:01 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Mon, 21 May 2012 11:36:01 +0200
Subject: [BioRuby] GSoC weekly status report No.1.2
Message-ID:

http://blog.mpthecoder.com/post/23473020471/gsoc-weekly-status-report-no-1-2

It's been three months since my first introduction on the BioRuby ML and it's been great. As it is the end of the GSoC community bonding period, I would like to thank Pjotr most and then all the other community members for their help and support. It's a great feeling to become a member of a small but growing community of enthusiasts who work together for the benefit of all of us and for fun.

As Pjotr already did, I would like to encourage you to write blog posts about using Ruby in Bioinformatics and let us include them in our RSS and news feeds on the biogems.info website. The site supports both RSS and Atom feeds now, and similar functionality will be part of the new website for BioRuby once it's finished. The code also supports adding only posts for one category/tag, so you can tag your posts with BioRuby or similar, and only those posts will be included in the RSS feed on biogems.info.

The GSoC coding period starts today. It's time for me to roll my sleeves up and start working on the GFF3 parser full-time.
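
As a rough illustration of the per-tag filtering described above, here is a minimal Ruby sketch. It is not the actual biogems.info code: `Post`, `posts_with_tag`, and the sample posts are hypothetical, and matching tags case-insensitively is an assumption.

```ruby
# Hypothetical post record; a real aggregator would fill these in
# from a parsed RSS or Atom feed.
Post = Struct.new(:title, :tags)

# Keep only the posts carrying the wanted tag.
# Case-insensitive matching is an assumption, not documented behaviour.
def posts_with_tag(posts, wanted)
  wanted = wanted.downcase
  posts.select { |post| post.tags.any? { |tag| tag.downcase == wanted } }
end

posts = [
  Post.new("Parsing GFF3 with Ruby", ["BioRuby", "gff3"]),
  Post.new("Holiday photos", ["personal"]),
  Post.new("D and Ruby via FFI", ["bioruby", "dlang"])
]

# Print the titles of the posts that would be included in the feed.
posts_with_tag(posts, "BioRuby").each { |post| puts post.title }
```

Under these assumptions, a blog that mixes personal and BioRuby posts would contribute only the tagged ones.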

--
Marjan

From lomereiter at googlemail.com Mon May 21 07:58:46 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Mon, 21 May 2012 15:58:46 +0400
Subject: [BioRuby] [GSoC] Weekly report #1
Message-ID:

Hi all,

here's my report about the past week:
http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/

Brief summary:

1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in the Rubinius bugtracker, and one of them is already solved. Rubinius in 1.8 mode should now pass all tests. The situation with 1.9 mode is not that great, but I'm working on it.

2) I started to collect D optimization tricks on a github wiki page. Currently, it contains just 6 tips, but this number is going to grow. Probably, another page will be created soon to keep best practices of connecting Ruby and D. Since my project and Marjan's have a lot in common, I think it's important for us not to waste time on something that has already been investigated.

3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and wrote my first two features.

4) Measurements of object instantiation time in Ruby suggest that exposing low-level D functions via FFI makes little sense. I'm going to discuss with mentors which high-level functions should be available, and turn those into Cucumber features.

--
Artem

From cswh at umich.edu Mon May 21 11:50:18 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Mon, 21 May 2012 11:50:18 -0400
Subject: [BioRuby] GSoC week 2 status report
Message-ID: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>

Hi all,

Here's my report on last week's work:

http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/

This was my second week of work on my GSoC project, and the last week of the "community bonding" period before the official start of coding. A major focus of mine was BioRuby's phyloXML support; it uses libxml, which has been causing unit test failures under JRuby.
In the end, the best course of action seemed to be to split the phyloXML support out into a separate plugin, which I have done as the bio-phyloxml gem. This will remove BioRuby's dependency on XML libraries entirely and that JRuby issue along with it. At the same time, users of the phyloXML code should be able to continue using it with no substantive changes.

Separately, I began porting this phyloXML code to use Nokogiri instead of libxml-ruby, but ran into difficulties with this effort. While it is possible, and the library APIs are very similar, the code uses relatively low-level XML processing APIs in ways that seem to be sensitive to subtle differences in text node and namespace semantics between the two libraries. Substantial restructuring of the code and the addition of quite a few unit tests might be necessary to carry out such a port with confidence that the resulting code would work well.

Also, someone else submitted a JRuby patch for JRUBY-6658, one of the major causes of BioRuby's unit test failures with JRuby; once a fix is integrated, we'll be close to having all the tests passing under JRuby.

I identified another JRuby bug, JRUBY-6666, causing several unit test failures. This one affects BioRuby's code for running external commands, so it would be likely to be encountered in production use. For this one, I also worked up a patch.

I also spent some time preparing a performance testing environment, for evaluating existing MAF implementations as well as my own. This will be important, since I will be considering the use of an existing C parser. I will also want to ensure that the performance of my code is competitive with the alternatives. Lacking any hardware more powerful than a MacBook Air, I am setting this up with Amazon EC2. To simplify environment setup, I'll be using Chef. I've already set up a Chef repository with configuration logic, and some rudimentary code to streamline launching Ubuntu machines on EC2 and bootstrapping a Chef environment.
To save money, I plan to make use of EC2 Spot Instances, which are perfect for instances that only need to run for a few hours for batch tasks.

Clayton Wheeler
cswh at umich.edu

From bonnal at ingm.org Tue May 22 05:21:42 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Tue, 22 May 2012 11:21:42 +0200
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>
Message-ID:

Hi Clayton,
Well done, and thanks for your contributions to the BioRuby and JRuby communities.
For your computing issue I have two solutions:
1) I can create a VM and give you access; I need to contact my IT dept.
2) Could Amazon provide some VMs for our students?

From p.j.a.cock at googlemail.com Tue May 22 07:07:15 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 22 May 2012 12:07:15 +0100
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <4F9AFA1F.6030103@med.nyu.edu>
References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu>
Message-ID:

Hi all,

I've CC'd the BioRuby mailing list just to ensure you're aware of the potentially useful combination of MAF indexing and BGZF compression. We can continue this on the BioRuby list if more appropriate.

The start of this Biopython-dev thread is here:
http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html

This might be a nice opportunity to combine the work of this year's OBF Google Summer of Code students - Clayton is doing MAF for BioRuby, and part of Artem's project could provide BGZF support for BioRuby.

On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak wrote:
> Peter,
>
>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py
>> and I'm willing to do this myself for MAF (while going over your index
>> work - something I want to do anyway). The only potential catch is
>> avoiding offset arithmetic.
>
> I have no problem with you doing this if you're willing. It would be great
> to have some code review of MafIndex as well.

I'm not sure if Clayton will be able to comment on the Python code, but he should have some thoughts on the MAF indexing itself.

Regards,

Peter

From pjotr.public14 at thebird.nl Tue May 22 11:23:17 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 22 May 2012 17:23:17 +0200
Subject: [BioRuby] BioRuby hitting 20K
Message-ID: <20120522152317.GA30752@thebird.nl>

Looks like we'll have 20K downloads of the bioruby gem by tomorrow :). Maybe time for a new release?

We are getting a lot more activity anyway - Go BioRuby Go!

Pj.

From mh6 at sanger.ac.uk Tue May 22 11:32:03 2012
From: mh6 at sanger.ac.uk (Michael Paulini)
Date: Tue, 22 May 2012 16:32:03 +0100
Subject: [BioRuby] BioRuby hitting 20K
In-Reply-To: <20120522152317.GA30752@thebird.nl>
References: <20120522152317.GA30752@thebird.nl>
Message-ID: <4FBBB173.2030001@sanger.ac.uk>

congrats biorubystas :-)

M

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

From bonnal at ingm.org Wed May 23 09:24:56 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 23 May 2012 15:24:56 +0200
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To:
Message-ID:

Thanks Peter,
These are valuable hints.

From cswh at umich.edu Wed May 23 21:35:46 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Wed, 23 May 2012 21:35:46 -0400
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To:
References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu>
Message-ID:

On May 22, 2012, at 7:07 AM, Peter Cock wrote:

> Hi all,
>
> I've CC'd the BioRuby mailing list just to ensure you're aware of the
> potentially useful combination of MAF indexing and BGZF compression.
> We can continue this on the BioRuby list if more appropriate.
> > The start of this Biopython-dev thread is here: > http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html > > This might be a nice opportunity to combine the work of this year's OBF > Google Summer of Code students - Clayton is doing MAF for BioRuby, > and part of Artem's project could provide BGZF support for BioRuby. Indeed, thanks Peter. BGZF sounds like a great approach for MAF compression; I'm just about to start looking into indexing support, and it makes sense to tackle compression in that context. So far, I think Artem's BGZF implementation is entirely in D; I may just add Ruby support for BGZF separately. > On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak > wrote: >> Peter, >> >>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py >>> and I'm willing to do this myself for MAF (while going over your index >>> work - something I want to do anyway). The only potential catch is >>> avoiding offset arithmetic. >> >> I have no problem with you doing this if you're willing. It would be great >> to have some code review of MafIndex as well. > > I'm not sure if Clayton will be able to comment on the Python code, > but he should have some thoughts on the MAF indexing itself. I'll definitely be spending more time with that code; it and the bx-python MAF indexing code will be my main reference points for indexed access. It's been a little while, but I have done some Python work in the past, so I should be able to follow along okay. I'll send some comments out in a few days. 
Clayton Wheeler cswh at umich.edu From mictadlo at gmail.com Thu May 24 00:30:22 2012 From: mictadlo at gmail.com (Mic) Date: Thu, 24 May 2012 14:30:22 +1000 Subject: [BioRuby] [GSoC] Weekly report #1 In-Reply-To: References: Message-ID: D to Ruby: http://www.swig.org/compare.html On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov wrote: > Hi all, > > here's my report about the past week: > http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/ > > Brief summary: > > 1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in Rubinius > bugtracker, and one of them is already solved. Rubinius in 1.8 mode should > now pass all tests. The situation with 1.9 mode is not that great, but I'm > working on it. > > 2) I started to collect D optimization tricks on a GitHub wiki page. > Currently, it contains just 6 tips, but this number is going to grow. > Probably, another page will be created soon to keep best practices of > connecting Ruby and D. Since my project and Marjan's one have a lot in > common, I think it's important for us to not waste time on something that > has already been investigated. > > 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and > wrote my first two features. > > 4) Measurements of object instantiation time in Ruby suggest that exposing > low-level D functions via FFI makes little sense. I'm going to discuss with > mentors which high-level functions should be available, and make that into > Cucumber features.
> > > > -- > Artem > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From cjfields at illinois.edu Thu May 24 01:14:20 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 24 May 2012 05:14:20 +0000 Subject: [BioRuby] [GSoC] Weekly report #1 In-Reply-To: References: Message-ID: I think the mentioned D wrappers on the SWIG page are ANSI C/C++ libraries wrapped for D, not D code/libs/etc wrapped for Ruby, unless I'm mistaken... chris On May 23, 2012, at 11:30 PM, Mic wrote: > D to Ruby: http://www.swig.org/compare.html > > On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov wrote: > >> Hi all, >> >> here's my report about the past week: >> http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/ >> >> Brief summary: >> >> 1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in Rubinius >> bugtracker, and one of them is already solved. Rubinius in 1.8 mode should >> now pass all tests. The situation with 1.9 mode is not that great, but I'm >> working on it. >> >> 2) I started to collect D optimization tricks on a GitHub wiki page. >> Currently, it contains just 6 tips, but this number is going to grow. >> Probably, another page will be created soon to keep best practices of >> connecting Ruby and D. Since my project and Marjan's one have a lot in >> common, I think it's important for us to not waste time on something that >> has already been investigated. >> >> 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and >> wrote my first two features. >> >> 4) Measurements of object instantiation time in Ruby suggest that exposing >> low-level D functions via FFI makes little sense. I'm going to discuss with >> mentors which high-level functions should be available, and make that into >> Cucumber features.
>> >> >> >> >> -- >> Artem >> >> _______________________________________________ >> BioRuby Project - http://www.bioruby.org/ >> BioRuby mailing list >> BioRuby at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioruby >> > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From cswh at umich.edu Thu May 24 01:33:40 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Thu, 24 May 2012 01:33:40 -0400 Subject: [BioRuby] GSoC week 2 status report In-Reply-To: References: Message-ID: <9DBCD042-7086-4F4B-ABB9-1A7F63C089B8@umich.edu> Thanks for the offers of help, everybody. Raoul, if it's convenient for you to set up a test VM in house, that would probably make the most sense. I don't think it's a pressing need at this point, but let's look into that. If we run into issues, we can revisit the EC2 options. (I've had an AWS account too long to qualify for the free usage tier, unfortunately.) An Amazon grant might be worth looking at, especially if we can use it to publicly host, say, BGZF-compressed pre-indexed MAF data sets also. On the other hand, that might be overkill just for my needs; using spot-priced instances, I expect I could do all the testing I need for under $50. Clayton Wheeler cswh at umich.edu From lomereiter at googlemail.com Thu May 24 01:40:54 2012 From: lomereiter at googlemail.com (Artem Tarasov) Date: Thu, 24 May 2012 09:40:54 +0400 Subject: [BioRuby] [GSoC] Weekly report #1 In-Reply-To: References: Message-ID: Chris is right. Currently, it's easier to write everything manually. Once I develop some 'best practices' I may put them into compile-time algorithms and generate bindings from D. (The language has compile-time introspection but doesn't have a run-time one, probably because that would hurt performance.)
On Thu, May 24, 2012 at 9:14 AM, Fields, Christopher J < cjfields at illinois.edu> wrote: > I think the mentioned D wrappers on the SWIG page are ANSI C/C++ libraries > wrapped for D, not D code/libs/etc wrapped for Ruby, unless I'm mistaken... > > chris > > On May 23, 2012, at 11:30 PM, Mic wrote: > > > D to Ruby: http://www.swig.org/compare.html > > > > On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov < > lomereiter at googlemail.com>wrote: > > > >> Hi all, > >> > >> here's my report about the past week: > >> http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/ > >> > >> Brief summary: > >> > >> 1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in Rubinius > >> bugtracker, and one of them is already solved. Rubinius in 1.8 mode > should > >> now pass all tests. The situation with 1.9 mode is not that great, but > I'm > >> working on it. > >> > >> 2) I started to collect D optimization tricks on a GitHub wiki page. > >> Currently, it contains just 6 tips, but this number is going to grow. > >> Probably, another page will be created soon to keep best practices of > >> connecting Ruby and D. Since my project and Marjan's one have a lot in > >> common, I think it's important for us to not waste time on something > that > >> has already been investigated. > >> > >> 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, > and > >> wrote my first two features. > >> > >> 4) Measurements of object instantiation time in Ruby suggest that > exposing > >> low-level D functions via FFI makes little sense. I'm going to discuss > with > >> mentors which high-level functions should be available, and make that > into > >> Cucumber features.
> >> > >> > >> > >> > >> -- > >> Artem > >> > >> _______________________________________________ > >> BioRuby Project - http://www.bioruby.org/ > >> BioRuby mailing list > >> BioRuby at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioruby > >> > > > > _______________________________________________ > > BioRuby Project - http://www.bioruby.org/ > > BioRuby mailing list > > BioRuby at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioruby > > From lomereiter at googlemail.com Thu May 24 01:52:42 2012 From: lomereiter at googlemail.com (Artem Tarasov) Date: Thu, 24 May 2012 09:52:42 +0400 Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond In-Reply-To: References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu> Message-ID: Hi all, it's a good point that many line-based formats need some sort of compression with indexing, and BGZF is good enough in that sense. So far, I think Artem's BGZF implementation is entirely in D; I may just > add Ruby support for BGZF separately. > The only problem I see with that approach is that it's hardly possible to get parallel compression with MRI. But overall I tend to agree with Clayton. Firstly, it's hard to abstract away some common interface right now, not writing any code and looking at it. Secondly, there're still problems with D shared library support. We were assured by GDC developer that they'll get solved soon, but at the moment the situation is far from perfect. 
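For readers following the BGZF discussion, here is a minimal pure-Ruby sketch of what one BGZF block looks like on disk, assuming the layout given in the SAM/BAM specification: a gzip member whose extra field carries a "BC" subfield holding the total block size minus one. The method names `bgzf_block` and `read_bgzf_block` are hypothetical and are not taken from any of the projects mentioned in this thread; the code is single-threaded and only illustrates the format.

```ruby
require 'zlib'
require 'stringio'

# Build one BGZF block from a payload (hypothetical helper name).
# The payload must be small enough for the block to stay under 64 KiB.
def bgzf_block(payload)
  # Negative window bits select a raw deflate stream (no zlib header).
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  cdata = deflater.deflate(payload, Zlib::FINISH)
  deflater.close
  xlen = 6                                    # one 6-byte "BC" subfield
  bsize = 12 + xlen + cdata.bytesize + 8 - 1  # total block size minus one
  header  = [31, 139, 8, 4, 0, 0, 255, xlen].pack('CCCCVCCv')
  extra   = [66, 67, 2, bsize].pack('CCvv')   # 'B', 'C', SLEN=2, BSIZE
  trailer = [Zlib.crc32(payload), payload.bytesize].pack('VV')
  header + extra + cdata + trailer
end

# Read one BGZF block from an IO positioned at a block boundary.
# Returns the uncompressed payload, or nil at end of input.
def read_bgzf_block(io)
  header = io.read(12)
  return nil if header.nil? || header.empty?
  id1, id2, cm, flg, _mtime, _xfl, _os, xlen = header.unpack('CCCCVCCv')
  raise 'not a gzip member' unless id1 == 31 && id2 == 139 && cm == 8
  raise 'gzip extra field missing' if (flg & 4).zero?

  # Scan the extra field for the "BC" subfield that stores BSIZE.
  extra = io.read(xlen)
  bsize = nil
  pos = 0
  while pos + 4 <= extra.bytesize
    si1, si2, slen = extra[pos, 4].unpack('CCv')
    bsize = extra[pos + 4, 2].unpack('v').first if si1 == 66 && si2 == 67 && slen == 2
    pos += 4 + slen
  end
  raise 'BC subfield missing, not a BGZF block' if bsize.nil?

  # Compressed data fills the block between the extra field and the trailer.
  cdata = io.read(bsize + 1 - 12 - xlen - 8)
  crc, isize = io.read(8).unpack('VV')
  inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  data = inflater.inflate(cdata)
  inflater.close
  raise 'size mismatch' unless data.bytesize == isize
  raise 'CRC mismatch' unless Zlib.crc32(data) == crc
  data
end

# Round trip: two blocks written back to back, then read one at a time.
io = StringIO.new(bgzf_block("s line one\n") + bgzf_block("s line two\n"))
while (chunk = read_bgzf_block(io))
  puts chunk
end
```

Because every block records its own size, a reader can seek to the start of any block and decompress from there, which is what makes the format attractive for indexed access.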
From p.j.a.cock at googlemail.com Thu May 24 05:18:33 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 24 May 2012 10:18:33 +0100 Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond In-Reply-To: References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu> Message-ID: On Thu, May 24, 2012 at 6:52 AM, Artem Tarasov wrote: > Hi all, > > it's a good point that many line-based formats need some sort of compression > with indexing, and BGZF is good enough in that sense. BGZF doesn't have to be used with line-based formats, anything with sequential records would work (like BAM files of course). I've not tried it to see how well it compresses, but SFF files in BGZF should work too as another example. >> So far, I think Artem's BGZF implementation is entirely in D; I may just >> add Ruby support for BGZF separately. > > The only problem I see with that approach is that it's hardly possible to > get parallel compression with MRI. But overall I tend to agree with Clayton. > Firstly, it's hard to abstract away some common interface right now, not > writing any code and looking at it. Secondly, there're still problems with D > shared library support. We were assured by GDC developer that they'll get > solved soon, but at the moment the situation is far from perfect. My BGZF code is pure Python (using C zlib via Python's zlib library), and does not currently tackle parallel compression or decompression. There has been recent work in samtools for this. We don't need parallel compression/decompression of BGZF for it to be useful.
Peter From john.woods at marcottelab.org Thu May 24 10:01:08 2012 From: john.woods at marcottelab.org (John Woods) Date: Thu, 24 May 2012 09:01:08 -0500 Subject: [BioRuby] GSoC week 2 status report In-Reply-To: References: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu> Message-ID: If I can just suggest, there's a startup pitch out there which was formerly known as Happy Science Coding, now Appsoma, which lets you run Ruby code on Rackspace instances. It may or may not be appropriate for what you want to do. It's not EC2, but it is a VM (right?). http://appsoma.com/ It's still a bit buggy with Ruby. If you have trouble, email Zack (see the "About us" page). He's fairly responsive. John SciRuby On Tue, May 22, 2012 at 4:21 AM, Raoul Bonnal wrote: > Hi Clayton, > Well done and thanks for your contributions to the bioruby and jruby community. > > For your computing issue I have two solutions: > 1) I can create a VM and give you the access, I need to contact my IT dep. > 2) Could Amazon provide some VM for our students? > > > > On 21/05/12 17.50, "Clayton Wheeler" wrote: > > > Hi all, > > > > Here's my report on last week's work: > > > > http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/ > > > > This was my second week of work on my GSoC project, and the last week of > the > > "community bonding" period before the official start of coding. A major > focus > > of mine was BioRuby's phyloXML support; it uses libxml, which has been > causing > > unit test failures under JRuby. In the end, the best course of action > seemed > > to be to separate the phyloXML support as a separate plugin, which I have done > as > > the bio-phyloxml gem. This will remove BioRuby's dependency on XML > libraries > > entirely and that JRuby issue along with it. At the same time, users of > the > > phyloXML code should be able to continue using it with no substantive > changes.
> > > > Separately, I began porting this phyloXML code to use Nokogiri instead of > > libxml-ruby, but ran into difficulties with this effort. While it is > possible, > > and the library APIs are very similar, the code uses relatively > low-level XML > > processing APIs in ways that seem to be sensitive to subtle differences > in > > text node and namespace semantics between the two libraries. Substantial > > restructuring of the code and the addition of quite a few unit tests > might be > > necessary to carry out such a port with confidence that the resulting > code > > would work well. > > > > Also, someone else submitted a JRuby patch for JRUBY-6658, one of the > major > > causes of BioRuby's unit test failures with JRuby; once a fix is > integrated, > > we'll be close to having all the tests passing under JRuby. > > > > I identified another JRuby bug, JRUBY-6666, causing several unit test > > failures. This one affects BioRuby's code for running external commands, > so it > > would be likely to be encountered in production use. For this one, I also > > worked up a patch. > > > > I also spent some time preparing a performance testing environment, for > > evaluating existing MAF implementations as well as my own. This will be > > important, since I will be considering the use of an existing C parser. > I will > > also want to ensure that the performance of my code is competitive with > the > > alternatives. Lacking any hardware more powerful than a MacBook Air, I am > > setting this up with Amazon EC2. To simplify environment setup, I'll be > using > > Chef. I've already set up a Chef repository with configuration logic, > and some > > rudimentary code to streamline launching Ubuntu machines on EC2 and > > bootstrapping a Chef environment. To save money, I plan to make use of > EC2 > > Spot Instances, which are perfect for instances that only need to run > for a > > few hours for batch tasks.
> > > > Clayton Wheeler > > cswh at umich.edu > > > > > > > > > > _______________________________________________ > > BioRuby Project - http://www.bioruby.org/ > > BioRuby mailing list > > BioRuby at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioruby > > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From mictadlo at gmail.com Fri May 25 02:49:13 2012 From: mictadlo at gmail.com (Mic) Date: Fri, 25 May 2012 16:49:13 +1000 Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond In-Reply-To: References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu> Message-ID: I think Picard-tools does parallel compression/decompression of BGZF. Cheers, Mic On Thu, May 24, 2012 at 7:18 PM, Peter Cock wrote: > On Thu, May 24, 2012 at 6:52 AM, Artem Tarasov > wrote: > > Hi all, > > > > it's a good point that many line-based formats need some sort of > compression > > with indexing, and BGZF is good enough in that sense. > > BGZF doesn't have to be used with line-based formats, anything > with sequential records would work (like BAM files of course). I've not > tried it to see how well it compressed, but SFF files in BGZF should > work too as another example. > > >> So far, I think Artem's BGZF implementation is entirely in D; I may just > >> add Ruby support for BGZF separately. > > > > The only problem I see with that approach is that it's hardly possible to > > get parallel compression with MRI. But overall I tend to agree with > Clayton. > > Firstly, it's hard to abstract away some common interface right now, not > > writing any code and looking at it. Secondly, there're still problems > with D > > shared library support. We were assured by GDC developer that they'll get > > solved soon, but at the moment the situation is far from perfect.
> > My BGZF code is pure Python (using C zlib via Python's zlib library), > and does not currently tackle parallel compression or decompression. > There as been recent work in samtools for this. > > We don't need parallel compression/decompression of BGZF for it to > be useful. > > Peter > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From cswh at umich.edu Fri May 25 16:42:13 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Fri, 25 May 2012 16:42:13 -0400 Subject: [BioRuby] New blog post on this week's work Message-ID: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu> Hi all, I've written a new blog post on the work I did on my MAF parser this week: http://csw.github.com/bioruby-maf/blog/2012/05/25/first_milestone/ It covers parser implementation and performance issues, BDD, and tools. Clayton Wheeler cswh at umich.edu From lomereiter at googlemail.com Sun May 27 14:27:43 2012 From: lomereiter at googlemail.com (Artem Tarasov) Date: Sun, 27 May 2012 22:27:43 +0400 Subject: [BioRuby] [GSoC] weekly report #2 Message-ID: Hi all, I wrote a blog post about the past week: http://lomereiter.wordpress.com/2012/05/27/gsoc-weekly-report-2/ Topics are: 1) I have quite good validation module for BAM now. More kinds of checks can be added, just request them :) 2) Also I started to implement random access via BAI file, just because I mostly finished what I planned for the first two weeks, and random access seems to be one of the most important things. Also it's not mentioned in the blog, but I started to work on BGZF gem, as Pjotr suggested to me. I'll try to document it and publish the first version next week. Currently I write it in pure Ruby. 
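As background for the random-access work mentioned above: the SAM/BAM specification addresses positions inside a BGZF file with 64-bit "virtual offsets", where the upper 48 bits hold the file offset of the start of a BGZF block and the lower 16 bits hold the offset within that block's uncompressed data. The sketch below shows only that arithmetic; the method names are hypothetical and are not taken from Artem's code or from any BGZF gem.

```ruby
# Pack a compressed block offset and an in-block offset into one
# BAI-style virtual offset (hypothetical helper names).
def make_virtual_offset(block_offset, within_block)
  raise ArgumentError, 'within_block needs 16 bits' unless (0...(1 << 16)).cover?(within_block)
  raise ArgumentError, 'block_offset needs 48 bits' unless (0...(1 << 48)).cover?(block_offset)
  (block_offset << 16) | within_block
end

# Split a virtual offset back into [block_offset, within_block].
def split_virtual_offset(virtual_offset)
  [virtual_offset >> 16, virtual_offset & 0xFFFF]
end

voffset = make_virtual_offset(100, 5)
block_offset, within_block = split_virtual_offset(voffset)
puts "seek to byte #{block_offset}, inflate the block, skip #{within_block} bytes"
```

A reader would seek to `block_offset` in the compressed file, decompress that one block, and then skip `within_block` bytes of the result, so no bytes before the target block need to be decompressed.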
From marian.povolny at gmail.com Sun May 27 15:21:48 2012 From: marian.povolny at gmail.com (Marjan Povolni) Date: Sun, 27 May 2012 21:21:48 +0200 Subject: [BioRuby] GSoC weekly status report No.1.9 Message-ID: http://blog.mpthecoder.com/post/23877896288/gsoc-weekly-status-report-no-1-9 This is the final post in the 1.x series, I promise. The last week was spent adding support for parsing lines into records. It was a lot of work, and when I read the comments from my mentor, I wasn't happy. But I agree with him, I did make it more complicated than it had to be (the C API, for example), I should spend some time polishing and refactoring the D side, and my cucumber features should be split into more features. So that's the rough plan for the next week. -- Marjan From bonnal at ingm.org Mon May 28 04:50:19 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Mon, 28 May 2012 10:50:19 +0200 Subject: [BioRuby] DevTools In-Reply-To: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu> Message-ID: In case you want to use RedMine I can give you the license for free, any bioruby developer can request it. From p.j.a.cock at googlemail.com Mon May 28 05:00:30 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 28 May 2012 10:00:30 +0100 Subject: [BioRuby] DevTools In-Reply-To: References: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu> Message-ID: On Mon, May 28, 2012 at 9:50 AM, Raoul Bonnal wrote: > In case you want to use RedMine I can give you the license for free, any > bioruby developer can request it. > ??? Redmine is licensed under the GPL. Did you mean admin rights on the OBF RedMine instance, for example to close bug reports?
https://redmine.open-bio.org/projects/bioruby Peter From bonnal at ingm.org Mon May 28 05:03:01 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Mon, 28 May 2012 11:03:01 +0200 Subject: [BioRuby] DevTools In-Reply-To: Message-ID: Ahhhhhhhhhhh I mean RubyMine http://www.jetbrains.com/ruby/ sorry On 28/05/12 11.00, "Peter Cock" wrote: > > > On Mon, May 28, 2012 at 9:50 AM, Raoul Bonnal wrote: >> In case you want to use RedMine I can give you the license for free, any >> bioruby developer can request it. > > ??? Redmine is licensed under the GPL. > > Did you mean admin rights on the OBF RedMine instance, for > example to close bug reports? > https://redmine.open-bio.org/projects/bioruby > > Peter > > From francesco.strozzi at gmail.com Thu May 31 05:11:25 2012 From: francesco.strozzi at gmail.com (Francesco Strozzi) Date: Thu, 31 May 2012 11:11:25 +0200 Subject: [BioRuby] EU Codefest 2012 Announcement Message-ID: The Open Bioinformatics Foundation (OBF) EU-CodeFest will be held in Parco Tecnologico Padano (PTP) Lodi, Italy on the 19th - 20th of July. The CodeFest is a small focused event under the auspices of the Open Bioinformatics Foundation, and is a sister event of BOSC2012 being held in California USA this year. Three main topics will be worked on during the CodeFest: - NGS and high performance parsers for OpenBio projects. - RDF and semantic web for bioinformatics. - Bioinformatics pipelines definition, execution and distribution. The number of places is limited to a maximum of 30 participants, on a first come, first served basis. Undergraduate and PhD students are welcome to participate. The cost of the event is EUR 100 per person, which also includes lunches, coffee breaks and the social dinner on the 19th of July. For students only, we can sponsor a limited number of attendees who will not pay the registration fee.
Students wishing to attend the event for free will be asked to submit their qualifications and experience in software development. The organizing committee will review students' applications before final acceptance. Talks and abstracts may be presented during the CodeFest in sessions of 10 minutes plus questions. Coding activities will continue during the talks. The City of Lodi is very close to Milano and has good hotel facilities. The connections by air are excellent, via Milano Malpensa, Milano Linate and Bergamo Orio Al Serio airports. Please register soon using the form at this page http://tecnoparco.org/codefest, places may run out quickly. -- Francesco From marian.povolny at gmail.com Sat May 5 13:07:30 2012 From: marian.povolny at gmail.com (Marjan Povolni) Date: Sat, 5 May 2012 15:07:30 +0200 Subject: [BioRuby] GSoC weekly status report No.1 Message-ID: Hello all, It might be a little early, but there has been so much going on in the last 10 days since the results of GSoC were published... http://blog.mpthecoder.com/post/22380853664/gsoc-weekly-status-report-no-1 A short summary: It has been 10 days since the GSoC results were published, and a lot has happened since then. I got to know the other students and mentors in a longish meeting on Google hangout, I got into a discussion with my mentor on IRC in which we didn't agree about the parallelization strategy for the parser (experiments will show who's right) and my inbox is full with mails from my mentor and other students, in which we exchanged loads of interesting ideas. Also, I solved a bug in biogems.info website, which was stopping Pjotr from updating the website with new information about biogems. There is now a GitHub repository for my project: https://github.com/mamarjan/bioruby-hpc-gff3 The work for the first week of coding is halfway done too.
There seems to be huge interest for a GFF3 parser with more features, like indexing, random access and writing output, and also support for linking into trees of features that are not located close to each other in the file. A fast sequential parser could be used to generate indexes, and the lower-level parts can be used to reorder the file for faster future usage. Based on that, I think this project is a good start. *I would like to ask you, if you're using the GFF3/GTF file formats in your research, to send me example files and descriptions of how your applications are using the data. This way I'll be able to test the parser against your files and optimize it for your applications. Currently I have GFF files from Ensembl and Wormbase, and Pjotr pointed me to the genome browser web application at wormbase.org.* -- Marjan From lomereiter at googlemail.com Sun May 6 19:56:50 2012 From: lomereiter at googlemail.com (Artem Tarasov) Date: Sun, 6 May 2012 23:56:50 +0400 Subject: [BioRuby] [GSoC][BAM] Weekly report No. 0 Message-ID: Hi all, I wrote a few words about what I've done last week: http://lomereiter.wordpress.com/2012/05/06/gsoc-weekly-report-0/ Summary: The code is available at github: https://github.com/lomereiter/BAMread/ I already started to write code planned for the first week so as to have more time in June for exam preparation. Opening BAM and parsing SAM header works, and is available from Ruby, and now I need to write some tests and documentation. Also, I described some compile-time metaprogramming tricks in D which I use to reduce duplication in the code. I'd be grateful for some small BAM files, 1-50 kilobytes in size, with non-empty headers, for testing purposes. -- Artem From bonnal at ingm.org Mon May 7 07:08:53 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Mon, 07 May 2012 09:08:53 +0200 Subject: [BioRuby] [GSoC] BioRuby wiki In-Reply-To: Message-ID: Dear All, BioRuby wiki is up to date with the accepted projects.
I created new pages for each accepted project ( just created ). Are we going to keep it up to date with results and summarizing blog posts or what ? -- Ra From p.j.a.cock at googlemail.com Mon May 7 07:31:09 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Mon, 7 May 2012 08:31:09 +0100 Subject: [BioRuby] [GSoC] BioRuby wiki In-Reply-To: References: Message-ID: On Monday, May 7, 2012, Raoul Bonnal wrote: > Dear All, > BioRuby wiki is up to date with the accepted projects. I created new pages > for each accepted project ( just created ). Are we going to keep it up to > date with results and summarizing blog posts or what ? > > Blog posts (sent to the mailing list too) for weekly updates, but a more static wiki page for the summary? You can link to the blog posts from the wiki too. Peter From pjotr.public14 at thebird.nl Mon May 7 07:49:09 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Mon, 7 May 2012 09:49:09 +0200 Subject: [BioRuby] [GSoC] BioRuby wiki In-Reply-To: References: Message-ID: <20120507074909.GB30679@thebird.nl> I was thinking of adding news items to biogems.info, and its RSS feed. That gets updated a few times a day. Anyone interested in helping out? Should be straightforward: - Add YAML ./etc/blogs.yaml with links to BLOG RSS feeds - Write script to fetch these and merge it with the RSS for biogems That would give us a new RSS feed. Useful. Next step: - Add news column on main http://biogems.info/ page - Fill it with same RSS items Later I would also like to add a list of active pushes to projects (github style). But that is later. Pj. On Mon, May 07, 2012 at 09:41:48AM +0200, Raoul Bonnal wrote: > Fine. > On 07/05/12 09.31, "Peter Cock" <p.j.a.cock at googlemail.com> wrote: > > On Monday, May 7, 2012, Raoul Bonnal wrote: > > Dear All, > BioRuby wiki is up to date with the accepted projects. I created new > pages > for each accepted project ( just created ).
Are we going to keep it > up to > date with results and summarizing blog posts or what ? > > Blog posts (sent to the mailing list too) for weekly updates, > but more static wiki page for summary? You can link to the > blog posts from the wiki too. > Peter From bonnal at ingm.org Mon May 7 07:41:48 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Mon, 07 May 2012 09:41:48 +0200 Subject: [BioRuby] [GSoC] BioRuby wiki In-Reply-To: Message-ID: Fine. On 07/05/12 09.31, "Peter Cock" wrote: > > > On Monday, May 7, 2012, Raoul Bonnal wrote: >> Dear All, >> BioRuby wiki is up to date with the accepted projects. I created new pages >> for each accepted project ( just created ). Are we going to keep it up to >> date with results and summarizing blog posts or what ? >> > > Blog posts (sent to the mailing list too) for weekly updates, > but more static wiki page for summary? You can link to the > blog posts from the wiki too. > > > > Peter > From john.woods at marcottelab.org Tue May 8 22:08:47 2012 From: john.woods at marcottelab.org (John Woods) Date: Tue, 8 May 2012 17:08:47 -0500 Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship Message-ID: Hi BioRuby folks, I'm pleased to announce that we've opened applications for our first ever Summer of Code, generously sponsored by Brighter Planet. http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/ Please note that we recommend you have your application in by *Monday*, which is really soon. Help us out by sharing this around on various social media. Here are links to existing tweets/posts/etc that you can retweet/share/etc.
Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528 Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH Reddit: http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/ Cheers, John Woods Director, SciRuby Project From pjotr.public14 at thebird.nl Wed May 9 06:43:08 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 08:43:08 +0200 Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship In-Reply-To: References: Message-ID: <20120509064308.GA24946@thebird.nl> Hi John, That is awesome news! Google has set the right trend with these summer of code initiatives. The OBF has quite some experience with mentoring students, see http://www.open-bio.org/wiki/Gsoc#Student_Progress_Reports and one thing we think is very important is weekly meetings between students (and mentors), and weekly blogs by the students. These will be captured on http://biogems.info/. It would be great if your students participated in some of our meetings, so we can exchange ideas on Ruby and performance (we use extensions and parallel computing). Also I would like to invite your programme to blog, so that we can track those blogs. Pj. On Tue, May 08, 2012 at 05:08:47PM -0500, John Woods wrote: > Hi BioRuby folks, > > I'm pleased to announce that we've opened applications for our first ever > Summer of Code, generously sponsored by Brighter Planet. > > http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/ > > Please note that we recommend you have your application in by *Monday*, > which is really soon. > > Help us out by sharing this around on various social media. Here are links > to existing tweets/posts/etc that you can retweet/share/etc.
> > Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528 > Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH > Reddit: > http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/ > > Cheers, > John Woods > Director, SciRuby Project > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From pjotr.public14 at thebird.nl Wed May 9 17:14:49 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 19:14:49 +0200 Subject: [BioRuby] BioRuby on Travis-ci! Message-ID: <20120509171449.GA29529@thebird.nl> Hi, Some have maybe noticed Goto-san put BioRuby on travis-ci now! See http://travis-ci.org/#!/bioruby/bioruby You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test failure. JRuby fails on a handful of tests and the crash on Rubinius looks spectacular. Note the clever .travis.yml file. We invite you to submit fixes to these tests. Especially our GSoC students, and other students on this ML, can get honors by providing a few fixes, and/or sending in issues to the JRuby/Rubinius projects :). Note both JRuby and Rubinius come with very interesting debugger support. Worth a shot. Your chance to show your Ruby muscles! Pj. From p.j.a.cock at googlemail.com Wed May 9 17:26:31 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 9 May 2012 18:26:31 +0100 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <20120509171449.GA29529@thebird.nl> References: <20120509171449.GA29529@thebird.nl> Message-ID: On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins wrote: > Hi, > > Some have maybe noticed Goto-san put BioRuby on travis-ci now! See > > http://travis-ci.org/#!/bioruby/bioruby > > You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test > failure.
JRuby fails on a handful of tests and the crash on Rubinius > looks spectacular. > > Note the clever .travis.yml file. > > We invite you to submit fixes to these tests. Especially our GSoC > students, and other students on this ML, can get honors by providing > a few fixes, and/or sending in issues to the JRuby/Rubinius projects > :). Note both JRuby and Rubinius come with very interesting debugger > support. Worth a shot. Your chance to show your Ruby muscles! > > Pj. And if you can fix the different bug identified via the BuildBot too, even better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html Starting from a clean nightly test result makes spotting regressions much easier ;) Peter From pjotr.public14 at thebird.nl Wed May 9 17:32:39 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 19:32:39 +0200 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> Message-ID: <20120509173239.GA30220@thebird.nl> Right, the link is here http://testing.open-bio.org/bioruby/one_line_per_build (I need to incorporate this also in http://biogems.info/) On Wed, May 09, 2012 at 06:26:31PM +0100, Peter Cock wrote: > And if you can fix the different bug identified via the BuildBot too, even > better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html > > Starting from a clean nightly test result makes spotting regressions > much easier ;) > > Peter > From cjfields at illinois.edu Wed May 9 17:29:49 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Wed, 9 May 2012 17:29:49 +0000 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> Message-ID: <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu> On May 9, 2012, at 12:26 PM, Peter Cock wrote: > On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins wrote: >> Hi, >> >> Some have maybe noticed Goto-san put BioRuby on travis-ci now!
See >> >> http://travis-ci.org/#!/bioruby/bioruby >> >> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test >> failure. JRuby fails on a handful of tests and the crash on Rubinius >> looks spectacular. >> >> Note the clever .travis.yml file. >> >> We invite you to submit fixes to these tests. Especially our GSoC >> students, and other students on this ML, can get honors by providing >> a few fixes, and/or sending in issues to the JRuby/Rubinius projects >> :). Note both JRuby and Rubinius come with very interesting debugger >> support. Worth a shot. Your chance to show your Ruby muscles! >> >> Pj. > > And if you can fix the different bug identified via the BuildBot too, even > better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html > > Starting from a clean nightly test result makes spotting regressions > much easier ;) > > Peter *sigh* Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P chris From pjotr.public14 at thebird.nl Wed May 9 17:35:17 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 19:35:17 +0200 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu> References: <20120509171449.GA29529@thebird.nl> <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu> Message-ID: <20120509173517.GB30220@thebird.nl> On Wed, May 09, 2012 at 05:29:49PM +0000, Fields, Christopher J wrote: > *sigh* > > Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P Peter knows someone in Scotland who can help! Now I got to see a man about a sheep... Pj. From p.j.a.cock at googlemail.com Wed May 9 17:49:59 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 9 May 2012 18:49:59 +0100 Subject: [BioRuby] BioRuby on Travis-ci! 
In-Reply-To: <20120509171449.GA29529@thebird.nl> References: <20120509171449.GA29529@thebird.nl> Message-ID: On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins wrote: > Hi, > > Some have maybe noticed Goto-san put BioRuby on travis-ci now! See > > http://travis-ci.org/#!/bioruby/bioruby > > You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test > failure. JRuby fails on a handful of tests and the crash on Rubinius > looks spectacular. > > Note the clever .travis.yml file. > > We invite you to submit fixes to these tests. Especially our GSoC > students, and other students on this ML, can get honors by providing > a few fixes, and/or sending in issues to the JRuby/Rubinius projects > :). Note both JRuby and Rubinius come with very interesting debugger > support. Worth a shot. Your chance to show your Ruby muscles! > > Pj. I see Travis supports Perl, Python and Java too (amongst others) so could be used by the other Bio* projects too for nightly testing (on a 32bit Debian Linux platform). How did you do this in Travis regarding the GitHub authorization? I don't see any way when logged in as me (peterjc) to allow Travis access to the repositories of GitHub organizations I have access to (like Biopython). Thanks, Peter From p.j.a.cock at googlemail.com Wed May 9 17:56:17 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 9 May 2012 18:56:17 +0100 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> Message-ID: On Wed, May 9, 2012 at 6:49 PM, Peter Cock wrote: > On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins wrote: >> Hi, >> >> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See >> >> http://travis-ci.org/#!/bioruby/bioruby >> >> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test >> failure. JRuby fails on a handful of tests and the crash on Rubinius >> looks spectacular. >> >> Note the clever .travis.yml file. >> >> We invite you to submit fixes to these tests.
Especially our GSoC >> students, and other students on this ML, can get honors by providing >> a few fixes, and/or sending in issues to the JRuby/Rubinius projects >> :). Note both JRuby and Rubinius come with very interesting debugger >> support. Worth a shot. Your chance to show your Ruby muscles! >> >> Pj. > > I see Travis supports Perl, Python and Java too (amongst others) > so could be used by the other Bio* projects too for nightly testing > (on a 32bit Debian Linux platform). > > How did you do this in Travis regarding the GitHub authorization? > I don't see any way when logged in as me (peterjc) to allow Travis > access to the repositories of GitHub organizations I have access > to (like Biopython). I found there is an open issue on this missing feature: https://github.com/travis-ci/travis-ci/issues/242 There, a comment links to a manual workaround: http://about.travis-ci.org/docs/user/how-to-setup-and-trigger-the-hook-manually/ I'm guessing that's how you did it for BioRuby? Thanks, Peter From mail at michaelbarton.me.uk Wed May 9 18:24:54 2012 From: mail at michaelbarton.me.uk (Michael Barton) Date: Wed, 9 May 2012 14:24:54 -0400 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <20120509171449.GA29529@thebird.nl> References: <20120509171449.GA29529@thebird.nl> Message-ID: <20120509182454.GA4429@bartonh-mbp-01.uanet.edu> Travis CI is also rolling out a new feature where pull requests on GitHub are automatically tested using the specs in the upstream merge. This can make it much easier to spot broken builds (and vice versa) before they are merged into the blessed branch. http://about.travis-ci.org/blog/announcing-pull-request-support/ On Wed, May 09, 2012 at 07:14:49PM +0200, Pjotr Prins wrote: > Hi, > > Some have maybe noticed Goto-san put BioRuby on travis-ci > now! See > > http://travis-ci.org/#!/bioruby/bioruby > > You can see MRI 1.9.x passes, and 1.8.7 has only a small > unit test failure.
JRuby fails on a handful of tests and > the crash on Rubinius looks spectacular. > > Note the clever .travis.yml file. > > We invite you to submit fixes to these tests. Especially > our GSoC students, and other students on this ML, can get > honors by providing a few fixes, and/or sending in issues > to the JRuby/Rubinius projects :). Note both JRuby and > Rubinius come with very interesting debugger support. > Worth a shot. Your chance to show your Ruby muscles! > > Pj. _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From john.woods at marcottelab.org Wed May 9 19:25:38 2012 From: john.woods at marcottelab.org (John Woods) Date: Wed, 9 May 2012 14:25:38 -0500 Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship In-Reply-To: <20120509064308.GA24946@thebird.nl> References: <20120509064308.GA24946@thebird.nl> Message-ID: Hi Pjotr, I'll discuss having our fellow participate in some of your meetings with the SciRuby team. I think the weekly meetings suggestion is a very good one, and we definitely do pay attention to how BioRuby handles its GSoC fellows. We do blog periodically. You can find it here: http://sciruby.com/blog/ I'll make sure that blogging is also a requirement for our fellow. Cheers, John On Wed, May 9, 2012 at 1:43 AM, Pjotr Prins wrote: > Hi John, > > That is awesome news! Google has set a right trend with these summer > of code initiatives. The OBF has quite some experience with mentoring > students, see > > http://www.open-bio.org/wiki/Gsoc#Student_Progress_Reports > > and one thing we think very important is weekly meetings > between students (and mentors), and weekly blogs by the students. > These will be captured on http://biogems.info/.
> > It would be great your students participate in some of our meetings, > so we can exchange ideas on Ruby and performance (we use extensions > and parallel computing). Also I would like to invite your programme > to blog, and that we track those blogs. > > Pj. > > On Tue, May 08, 2012 at 05:08:47PM -0500, John Woods wrote: > > Hi BioRuby folks, > > > > I'm pleased to announce that we've opened applications for our first ever > > Summer of Code, generously sponsored by Brighter Planet. > > > > http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/ > > > > Please note that we recommend you have your application in by *Monday*, > > which is really soon. > > > > Help us out by sharing this around on various social media. Here are > links > > to existing tweets/posts/etc that you can retweet/share/etc. > > > > Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528 > > Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH > > Reddit: > > > http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/ > > > > Cheers, > > John Woods > > Director, SciRuby Project > > _______________________________________________ > > BioRuby Project - http://www.bioruby.org/ > > BioRuby mailing list > > BioRuby at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioruby > > > From p.j.a.cock at googlemail.com Wed May 9 17:44:37 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 9 May 2012 18:44:37 +0100 Subject: [BioRuby] BioPerl BuildBot Message-ID: Hi all, I've retitled this and sent it to the BioPerl list, continuing from this thread on the BioRuby list: http://lists.open-bio.org/pipermail/bioruby/2012-May/002247.html On Wed, May 9, 2012 at 6:35 PM, Pjotr Prins wrote: > On Wed, May 09, 2012 at 05:29:49PM +0000, Fields, Christopher J wrote: >> *sigh* >> >> Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? 
:P > > Peter knows someone in Scotland who can help! Now I got to see a man > about a sheep... > > Pj. You mean Dolly The Sheep? ;) Tiago or I can assist on the BuildBot server side for BioPerl - in fact Tiago had already made a start (CC'd). We'll need help from a BioPerl developer with a spare machine or two to use as a buildslave (and I can probably borrow some of my employer's which are already running nightly tests) to help with how we set up the BuildSlaves - essentially how to get BioPerl and relevant dependencies installed, and then what needs to be done from a fresh git checkout to build and run the tests. Tiago has got this currently:
perl Build.PL --accepts
./Build test
Once that is working on a single buildslave we can talk about different targets which is where BuildBot is really helpful (e.g. versions of Perl, different OS, etc) Peter From pjotr.public14 at thebird.nl Wed May 9 21:31:58 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 9 May 2012 23:31:58 +0200 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> Message-ID: <20120509213158.GB31329@thebird.nl> On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote: > I'm guessing that's how you did it for BioRuby? I think I added it before we were a github organization. Or we were just lucky :) Pj. From pjotr.public14 at thebird.nl Thu May 10 07:27:47 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Thu, 10 May 2012 09:27:47 +0200 Subject: [BioRuby] BioRuby rss news feed Message-ID: <20120510072747.GA4587@thebird.nl> Marjan and I have revamped the BioRuby/biogems news feed. See http://www.biogems.info/rss.xml Health warning: Includes opinionated and caffeinated Google Summer of Code blog entries :) Pj. From p.j.a.cock at googlemail.com Thu May 10 10:31:07 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 10 May 2012 11:31:07 +0100 Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509213158.GB31329@thebird.nl> References: <20120509171449.GA29529@thebird.nl> <20120509213158.GB31329@thebird.nl> Message-ID: On Wed, May 9, 2012 at 10:31 PM, Pjotr Prins wrote: > On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote: >> I'm guessing that's how you did it for BioRuby? > > I think I added it before we were a github organization. Or we were > just lucky :) > > Pj. I'd guess the former - I've now got a personal Travis account (via my personal GitHub account), but for now I can't seem to create a Biopython Travis account via the Biopython organization account on GitHub. Nevertheless, I could get the basic Biopython unit tests running on Travis last night (including Python 3), although this needs more work installing dependencies to get the full test suite coverage: http://travis-ci.org/#!/peterjc/biopython Peter From pjotr.public14 at thebird.nl Thu May 10 16:40:02 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Thu, 10 May 2012 18:40:02 +0200 Subject: [BioRuby] BioRuby rss news feed In-Reply-To: <20120510072747.GA4587@thebird.nl> References: <20120510072747.GA4587@thebird.nl> Message-ID: <20120510164002.GA9030@thebird.nl> http://www.biogems.info/ also shows news items and blog entries on the right now. If you want your blog on Bio/Ruby added, just tell us :) Pj. On Thu, May 10, 2012 at 09:27:47AM +0200, Pjotr Prins wrote: > Marjan and I have revamped the BioRuby/biogems news feed. See > > http://www.biogems.info/rss.xml > > Health warning: Includes opinionated and caffeinated Google Summer of Code > blog entries :) > > Pj.
> > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From georgkam at gmail.com Thu May 10 17:34:14 2012 From: georgkam at gmail.com (George Githinji) Date: Thu, 10 May 2012 20:34:14 +0300 Subject: [BioRuby] BioRuby rss news feed In-Reply-To: <20120510072747.GA4587@thebird.nl> References: <20120510072747.GA4587@thebird.nl> Message-ID: Thanks for all the hard work! On Thu, May 10, 2012 at 10:27 AM, Pjotr Prins wrote: > Marjan and I have revamped the BioRuby/biogems news feed. See > > http://www.biogems.info/rss.xml > > Health warning: Includes opinionated and caffeinated Google Summer of Code > blog entries :) > > Pj. > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby -- --------------- Sincerely George Skype: george_g2 Blog: http://biorelated.wordpress.com/ Twitter: http://twitter.com/#!/george_l From pjotr.public14 at thebird.nl Fri May 11 09:06:48 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Fri, 11 May 2012 11:06:48 +0200 Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship In-Reply-To: References: <20120509064308.GA24946@thebird.nl> Message-ID: <20120511090648.GA15897@thebird.nl> We can now list non-biogem rubygems. SciRuby is listed on http://www.biogems.info/rubygems.html Pj. From bonnal at ingm.org Fri May 11 10:58:44 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Fri, 11 May 2012 12:58:44 +0200 Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship In-Reply-To: <20120511090648.GA15897@thebird.nl> Message-ID: +1 :) On 11/05/12 11.06, "Pjotr Prins" wrote: > We can now list non-biogem rubygems. > > SciRuby is listed on http://www.biogems.info/rubygems.html > > Pj.
> > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From throwern at msu.edu Fri May 11 14:20:19 2012 From: throwern at msu.edu (Nick Thrower) Date: Fri, 11 May 2012 10:20:19 -0400 Subject: [BioRuby] BioTabix gem Message-ID: Hello all, I recently released a bio-tabix gem. It is available on rubygems: https://rubygems.org/gems/bio-tabix and Github: https://github.com/throwern/bio-tabix The gem binds ruby to the samtools tabix utility for indexing and parsing regions of tab delimited files. http://samtools.sourceforge.net/tabix.shtml Feel free to contact me with any comments or suggestions. Best, Nick -- Nick Thrower Information Technology Professional Michigan State University Great Lakes Bioenergy Research Center From pjotr.public14 at thebird.nl Fri May 11 15:43:49 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Fri, 11 May 2012 17:43:49 +0200 Subject: [BioRuby] BioTabix gem In-Reply-To: References: Message-ID: <20120511154349.GB17747@thebird.nl> Super :) On Fri, May 11, 2012 at 10:20:19AM -0400, Nick Thrower wrote: > Hello all, > > I recently released a bio-tabix gem. > > It is available on rubygems: > https://rubygems.org/gems/bio-tabix > > and Github: > https://github.com/throwern/bio-tabix > > The gem binds ruby to the samtools tabix utility for indexing and parsing regions of tab delimited files. http://samtools.sourceforge.net/tabix.shtml > > Feel free to contact me with any comments or suggestions. 
> > Best, > Nick > > -- > Nick Thrower > Information Technology Professional > Michigan State University > Great Lakes Bioenergy Research Center > > > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From cswh at umich.edu Sat May 12 01:21:02 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Fri, 11 May 2012 21:21:02 -0400 Subject: [BioRuby] Submitted JRuby bug and RubySpec addition for unit test failures under JRuby Message-ID: <57CFFD67-58BC-41AD-87E8-9C70A0A7AC97@umich.edu> Hi all, I've noticed that many of the BioRuby unit tests are failing under JRuby, locally and on travis-ci, with NameErrors for 'uninitialized constant' conditions. Many of these tests work when running just a single test script in isolation, but fail when the full suite is run with 'rake test'. I've identified the root cause of this problem, which appears to be a JRuby bug triggered when an autoload entry is defined, the file which would have been autoloaded is explicitly required, and the autoload entry is defined again. Subsequent attempts to access the target of the autoload entry fail with a NameError. This is an unusual sequence of events, but BioRuby and its test suites contain many 'horizontal' autoload entries between various parts of the source tree. For instance, bio/sequence/common.rb sets up an autoload for Bio::Locations, which I observed causing a problem with subsequent use of Bio::Locations. I created a minimized RubySpec illustrating the problem, which succeeds under MRI but fails under JRuby, and submitted it: https://github.com/rubyspec/rubyspec/pull/136 I also filed JRUBY-6658 (http://jira.codehaus.org/browse/JRUBY-6658) for this. If this bug is accepted and fixed, JRuby versions containing the fix should do much better on the test suite. 
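The sequence of events described above can be sketched in a few lines of Ruby. This is only a minimal illustration under stated assumptions, not the RubySpec patch itself: the module `AutoloadDemo`, the constant `Locations`, the helper `autoload_sequence` and the temporary file name are made-up stand-ins for the real Bio:: modules. Under MRI the final constant lookup is expected to succeed; going by the report above, affected JRuby versions would raise a NameError at that step.

```ruby
require 'tmpdir'

# Hypothetical namespace standing in for Bio; not part of BioRuby.
module AutoloadDemo; end

# Runs the autoload / require / autoload sequence from the bug report
# and returns the name of the constant that was resolved at the end.
def autoload_sequence
  Dir.mktmpdir do |dir|
    # Hypothetical file that defines the autoloaded constant.
    path = File.join(dir, 'autoload_demo_locations.rb')
    File.write(path, "module AutoloadDemo\n  class Locations; end\nend\n")

    AutoloadDemo.autoload(:Locations, path)  # 1. autoload entry is defined
    require path                             # 2. the file is required explicitly
    AutoloadDemo.autoload(:Locations, path)  # 3. autoload entry is defined again

    AutoloadDemo::Locations.name             # 4. the constant is accessed
  end
end

puts autoload_sequence
```

Running the same script under both interpreters is one way to check whether a given JRuby release still shows the problem.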
Without a JRuby fix, it might be possible to work around this by restructuring autoloading in the BioRuby code base to avoid horizontal autoload invocations (that is, autoload declarations not in the parent of the module to be autoloaded), but that could be too invasive to justify. Clayton Wheeler cswh at umich.edu From marian.povolny at gmail.com Sat May 12 19:46:46 2012 From: marian.povolny at gmail.com (Marjan Povolni) Date: Sat, 12 May 2012 21:46:46 +0200 Subject: [BioRuby] GSoC weekly status report No.1.1 Message-ID: Hi all, Here is my status report for this week: This year we the GSoC students sure are a very creative group, just look at our numbering schemes for our status reports for the pre-coding period - everyone has his own thing going :) And now back to the GFF3 project. I found a few more sites with big GFF3 files, those will be great for performance testing. And Robert Buels suggested that I should reuse the test suite from Perl's Bio::GFF3::LowLevel::Parser, and I think that's a great idea. I should definitely use that for completeness testing and I will check the test suites of other GFF3 parsers. I have also finished the work for the first week. That means basically I'm already more than two weeks ahead of schedule. The parser is now reading data on the D side and forwarding that to Ruby line by line. That won't be faster than reading the file from Ruby, but that's a nice basic case to get data flowing from D to Ruby. The rake tasks have been improved too. There are now two tasks for building the D library, "compile" and "compiledebug", and there is the "spec" task for running rspec tests and "features" task for running cucumber tests. The "clean" task now deletes object and library files. There is also a problem with the D library and garbage collector. It seems this is the problem Iain Buclaw (one of the GDC developers) has warned us about.
When using a D shared library, when the GC kicks in for the first time, it looks as if it collects all the static data, for example the per-module variables. And pretty much everything else: even when we register with the GC a chunk of memory allocated with malloc, it still gets collected. Or at least that's what it looks like. However, Iain also assured us that this will be solved by the end of this month/beginning of the next. My cucumber and rspec tests still work because they don't require enough memory for the GC to run, but to be sure that this issue doesn't interfere with development at this point, I manually disabled the GC on library initialization. I haven't tried it yet, but from what has been discussed in the forums, both 32 and 64-bit DLLs on windows built using DMD work fine. I also helped Pjotr with getting our blog posts included in the RSS feed on biogems.info. That's all for now, you can find this report on my blog too: http://blog.mpthecoder.com/post/22919943701/gsoc-weekly-status-report-no-1-1 -- Best regards, Marjan From lomereiter at googlemail.com Sun May 13 20:10:45 2012 From: lomereiter at googlemail.com (Artem Tarasov) Date: Mon, 14 May 2012 00:10:45 +0400 Subject: [BioRuby] [GSoC] Weekly report No 0.5 Message-ID: Hi all, this is yet another GSoC report. Last week I concentrated mainly on the D part of the project, adding functionality to it. I implemented parsing of the whole BAM file :) Today I wrote a simple utility in D, which uses my library to convert BAM to SAM. It doesn't work with array tags yet, and is not as fast as samtools, but nevertheless... On a couple of BAM files from test/data directory (namely, bins.bam and ex1_header.bam) the output is identical to that of samtools view - I checked with diff - and that kinda proves that everything works fine. Speed issues are mainly due to using std.variant module for storing tags. It uses runtime reflection which is quite slow. Maybe there are some other reasons.
Anyway, I'm going to write my own tagged union type next week, it should improve the performance quite a bit, and also fix design flaws. For testing tag parsing, I used file tags.bam provided to me by Peter Cock. It contains tests for all types of tags, and my library successfully passes them. Later I'll experiment with possible speed improvements, and having unit tests covering full range of possible tag types is a must. Also, I downloaded and compiled gdc from trunk. It provides decent performance, not worse than dmd, at least. We expect gdc to gain shared library support in the next two months. Before that happens, we have to use dmd, although there're some issues with its garbage collector, causing segfaults. We discussed that with Marjan and Pjotr and decided that the best option in such circumstances would be to disable GC during development - testing library on small files won't consume much memory anyway. Another thing I downloaded and compiled, is Rubinius. I'm going to investigate why it hangs on BioRuby unittests in 1.9 mode. Another mode, 1.8, seems to work fine except maybe some very minor bugs. During next week, I'm going to learn how to use Cucumber and Rspec, improve D library performance a little, and start to write Ruby bindings. So it will be mostly "Ruby week" ;) -- Artem From cswh at umich.edu Tue May 15 03:36:17 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Mon, 14 May 2012 23:36:17 -0400 Subject: [BioRuby] GSoC week 1 status report Message-ID: <2D9F6030-8A11-4443-B610-58464F506EE5@umich.edu> Hi all, I've put my first GSoC status report on my project blog: http://csw.github.com/bioruby-maf/blog/2012/05/13/progress/ (The web version of this has 100% more hyperlinks, but here's a plain text version, too.) This has been my first half-week of work on my Google Summer of Code project, and it's off to an exciting start.
The first order of business has been to get my development environment together; since I've been a microbiology student instead of a programmer for the last year, it's taken some work. In that process, I've ended up making a few open source contributions just to get my tools working the way I want. I'm running GNU Emacs 24 and trying to take more advantage of it than I have in the past. I'll have much more to say about this in a future post. I've also started working on the BioRuby unit test failures under JRuby, as a way of familiarizing myself with the BioRuby code base as well as the community and its development processes. Right now, JRuby in 1.8 mode is showing 6 failures and 126 errors, which is hardly confidence-inspiring for people considering using JRuby with BioRuby. This is too bad, since JRuby has some definite advantages as a Ruby implementation. After looking into these failures, I've broken them down into a few categories:
- temporary file permissions problems, likely due to some sort of Travis-CI environment issue
- a bug in JRuby's implementation of Open3.popen3 which I'm working up a bug report for
- an odd autoload problem I've filed JRUBY-6658 for and sent an accompanying RubySpec patch for
- a problem with libxml-jruby, which appears unmaintained, for which I've submitted a BioRuby patch plus JRUBY-6662
- and a small test case bug relating to floating point handling, which I've submitted a patch for.
Once these are resolved, JRuby should be passing the BioRuby unit tests in 1.8 mode, and closer to passing in 1.9 mode. (There are a few extra failures under 1.9 that I haven't sorted through yet.) I've also gotten a start on my project itself, creating the bioruby-maf Github repository with a project skeleton and writing my first Cucumber feature for it. This is, in fact, my first Cucumber feature ever.
However, I did spend a few cross-country flights reading the RSpec and Cucumber books last week; between that and cribbing from Pjotr's code I feel like I have some idea what I'm doing. Just assembling that feature has been useful, too, since I've had to get several of the existing MAF tools running on my machine. In fact, my test MAF data and the FASTA version of it are courtesy of bx-python, which will be my reference implementation in many respects. Clayton Wheeler cswh at umich.edu From cswh at umich.edu Tue May 15 17:08:20 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Tue, 15 May 2012 13:08:20 -0400 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it Message-ID: Hi all, The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K) There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well. However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML.
It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach? Clayton Wheeler cswh at umich.edu From pjotr.public14 at thebird.nl Tue May 15 18:54:32 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Tue, 15 May 2012 20:54:32 +0200 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it In-Reply-To: References: Message-ID: <20120515185432.GC20185@thebird.nl> Marvellous work Clayton! My suggestion to BioRuby is to split out phyloxml and to deprecate the current library module. In the next release, or after, we should take out that code. I suspect few people really depend on it, and they can adapt. I am partly responsible for that dependency, and I think the Travis-ci tests also point out that the purer Ruby BioRuby is, the better ;). Naohisa, what do you say? We should also ask the original author, even though she has left our little group and now works for google (and I am claiming Google does not recruit from GSoC :). Diana, maybe you are reading the ML? Pj. On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote: > Hi all, > > The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. 
(see http://bit.ly/JGWC4K) > > There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well. > > However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach? > > Clayton Wheeler > cswh at umich.edu > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From cjfields at illinois.edu Tue May 15 19:14:02 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Tue, 15 May 2012 19:14:02 +0000 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it In-Reply-To: <20120515185432.GC20185@thebird.nl> References: <20120515185432.GC20185@thebird.nl> Message-ID: I am intending on following the same tack with BioPerl's phyloxml (splitting it out), primarily so it can be maintained separately from the rest of bioperl. chris On May 15, 2012, at 1:54 PM, Pjotr Prins wrote: > Marvellous work Clayton! My suggestion to BioRuby is to split out > phyloxml and to deprecate the current library module. In the next > release, or after, we should take out that code.
I suspect few people > really depend on it, and they can adapt. I am partly responsible for > that dependency, and I think the Travis-ci tests also point out that > the purer Ruby BioRuby is, the better ;). > > Naohisa, what do you say? We should also ask the original author, even > though she has left our little group and now works for google (and I > am claiming Google does not recruit from GSoC :). Diana, maybe you are > reading the ML? > > Pj. > > On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote: >> Hi all, >> >> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K) >> >> There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well. >> >> However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. 
Thoughts? Should I proceed with this approach? >> >> Clayton Wheeler >> cswh at umich.edu >> >> >> _______________________________________________ >> BioRuby Project - http://www.bioruby.org/ >> BioRuby mailing list >> BioRuby at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioruby >> > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From cswh at umich.edu Tue May 15 21:51:51 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Tue, 15 May 2012 17:51:51 -0400 Subject: [BioRuby] JRuby bug filed for Bio::Command-related unit test failures Message-ID: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> Hi all, I've submitted a bug report and patch for JRUBY-6666 (http://jira.codehaus.org/browse/JRUBY-6666), which should fix another set of JRuby unit test failures occurring when Bio::Command methods call Open3.popen3 (and perhaps other similar exec-family methods). Would it be helpful for me to file a BioRuby bug to track this issue, perhaps on GitHub? Or perhaps create a wiki page to track unit test problems instead? Clayton Wheeler cswh at umich.edu From ngoto at gen-info.osaka-u.ac.jp Wed May 16 07:30:35 2012 From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO) Date: Wed, 16 May 2012 16:30:35 +0900 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: References: <20120509171449.GA29529@thebird.nl> <20120509213158.GB31329@thebird.nl> Message-ID: <201205160739.q4G7dS4G004980@portal.open-bio.org> Hi, For BioRuby, I manually set the hook with my (ngoto's) personal Travis account. As far as I can see, organization accounts are currently not available in Travis.
Naohisa Goto ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org On Thu, 10 May 2012 11:31:07 +0100 Peter Cock wrote: > On Wed, May 9, 2012 at 10:31 PM, Pjotr Prins wrote: > > On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote: > >> I'm guessing that's how you did it for BioRuby? > > > > I think I added it before we were a github organization. Or we were > > just lucky :) > > > > Pj. > > I'd guess the former - I've now got a personal Travis account via my > personal GitHub account), but for now I can't seem to create a Biopython > Travis account via the Biopython organization account on GitHub. > > Nevertheless, I could get the basic Biopython unit tests running on > Travis last night (including Python 3), although this needs more > work installing dependencies to get the full test suite coverage: > http://travis-ci.org/#!/peterjc/biopython > > Peter > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From ngoto at gen-info.osaka-u.ac.jp Wed May 16 07:54:53 2012 From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO) Date: Wed, 16 May 2012 16:54:53 +0900 Subject: [BioRuby] JRuby bug filed for Bio::Command-related unit test failures In-Reply-To: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> Message-ID: <201205160754.q4G7srSc005733@portal.open-bio.org> Hi Clayton, In addition, we have a Redmine page hosted on OBF. https://redmine.open-bio.org/projects/bioruby Currently, bugs and feature requests moved from old RubyForge BTS are submitted. I think the Redmine page will be used for bugs and feature requests without pull requests. 
Naohisa Goto ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org On Tue, 15 May 2012 17:51:51 -0400 Clayton Wheeler wrote: > Hi all, > > I've submitted a bug report and patch for JRUBY-6666 (http://jira.codehaus.org/browse/JRUBY-6666), which should fix another set of JRuby unit test failures occurring when Bio::Command methods call Open3.popen3 (and perhaps even other similar exec-family methods). > > Would it be helpful for me to file a BioRuby bug to track this issue, perhaps on Github? Or perhaps create a wiki page to track unit test problems instead? > > Clayton Wheeler > cswh at umich.edu > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From anurag08priyam at gmail.com Wed May 16 08:15:40 2012 From: anurag08priyam at gmail.com (Anurag Priyam) Date: Wed, 16 May 2012 13:45:40 +0530 Subject: [BioRuby] BioRuby on Travis-ci! In-Reply-To: <201205160739.q4G7dS4G004980@portal.open-bio.org> References: <20120509171449.GA29529@thebird.nl> <20120509213158.GB31329@thebird.nl> <201205160739.q4G7dS4G004980@portal.open-bio.org> Message-ID: On Wed, May 16, 2012 at 1:00 PM, Naohisa GOTO wrote: > For Bioruby, I manually set the hook with my (ngoto's) personal > Travis account. As far as I can see, organization accout in Travis > is currently not available. You are talking about the toggle button on your Travis profile page, right? For repos that belong to an organization, you need to enable Travis hook from Github (admin/service-hooks), iirc, using the token on your Travis profile page. 
-- Anurag Priyam From ngoto at gen-info.osaka-u.ac.jp Wed May 16 08:17:57 2012 From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO) Date: Wed, 16 May 2012 17:17:57 +0900 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it In-Reply-To: <20120515185432.GC20185@thebird.nl> References: <20120515185432.GC20185@thebird.nl> Message-ID: <201205160817.q4G8HwBO007774@portal.open-bio.org> Hi, Great work, Clayton! I think separate gem (Biogem) is good, too. Naohisa Goto ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org On Tue, 15 May 2012 20:54:32 +0200 Pjotr Prins wrote: > Marvellous work Clayton! My suggestion to BioRuby is to split out > phyloxml and to deprecate the current library module. In the next > release, or after, we should take out that code. I suspect few people > really depend on it, and they can adapt. I am partly responsible for > that dependency, and I think the Travis-ci tests also point out that > the purer Ruby BioRuby is, the better ;). > > Naohisa, what do you say? We should also ask the original author, even > though she has left our little group and now works for google (and I > am claiming Google does not recruit from GSoC :). Diana, maybe you are > reading the ML? > > Pj. > > On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote: > > Hi all, > > > > The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. 
(see http://bit.ly/JGWC4K) > > > > There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well. > > > > However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach? > > > > Clayton Wheeler > > cswh at umich.edu > > > > > > _______________________________________________ > > BioRuby Project - http://www.bioruby.org/ > > BioRuby mailing list > > BioRuby at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioruby > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From donttrustben at gmail.com Wed May 16 11:09:24 2012 From: donttrustben at gmail.com (Ben Woodcroft) Date: Wed, 16 May 2012 21:09:24 +1000 Subject: [BioRuby] hmmer3 Message-ID: Hi guys, I noticed today that there isn't HMMER3 support in bioruby - particularly I'm interested in a parser for hmmsearch outputs as I want to iterate over aligned positions. I noticed that there is mention of this in the 1.4.1 release notes, that hmmer3 will be supported in 1.5, although I'm not sure what exactly this means. 
http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ Can I ask what the state of this merge is please? Is there code somewhere just waiting to be merged? Can it be quickly spun out into a biogem in the meantime? Thanks, ben -- Ben Woodcroft From bonnal at ingm.org Wed May 16 11:27:18 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Wed, 16 May 2012 13:27:18 +0200 Subject: [BioRuby] hmmer3 In-Reply-To: Message-ID: If you need to wrap the binary, please have a look at our wrapper. I am wondering whether this wrapper could be useful to other gems; if so, I could create a separate gem just for it. Let me know. Docs for the wrapper are in the README. https://github.com/helios/bioruby-ngs/blob/master/lib/wrapper.rb https://github.com/helios/bioruby-ngs/blob/master/README.rdoc#wrapper On 16/05/12 13.09, "Ben Woodcroft" wrote: > Hi guys, > > I noticed today that there isn't HMMER3 support in bioruby - particularly > I'm interested in a parser for hmmsearch outputs as I want to iterate over > aligned positions. > > I noticed that there is mention of this in the 1.4.1 release notes, that > hmmer3 will be supported in 1.5, although I'm not sure what exactly this > means. > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ > > Can I ask what the state of this merge is please? Is there code somewhere > just waiting to be merged? Can it be quickly spun out into a biogem in the > meantime? > > Thanks, > ben From donttrustben at gmail.com Wed May 16 11:43:44 2012 From: donttrustben at gmail.com (Ben Woodcroft) Date: Wed, 16 May 2012 21:43:44 +1000 Subject: [BioRuby] hmmer3 In-Reply-To: References: Message-ID: Thanks for the feedback dudes. I'm happy to spin it out myself, only I don't know where the code is. I don't personally need a wrapper, but I've got 40G of hmmsearch result files to parse. Relatedly, I've written a gem that parses HMM model files - I'll release that after a little more testing, hopefully tomorrow.
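With tens of gigabytes of hmmsearch results, reading one line at a time keeps memory use flat regardless of file size. The following is only a minimal sketch in plain Ruby, assuming the results were saved in HMMER3's per-target tabular format (`hmmsearch --tblout`), i.e. whitespace-separated columns with `#` comment lines. The names `TblHit` and `each_tblout_hit` are hypothetical and are not part of BioRuby or of any biogem mentioned in this thread. The column positions used (target name, query name, full-sequence E-value and score, trailing description) follow the HMMER3 user guide as I read it and should be checked against the HMMER version in use. Note that this format does not contain the alignment itself.

```ruby
require 'stringio'

# Hypothetical record type for one data line of `hmmsearch --tblout` output.
TblHit = Struct.new(:target, :query, :evalue, :score, :description)

# Yields one TblHit per data line of an IO-like object. Lines are read one
# at a time, so memory use does not grow with the size of the input.
def each_tblout_hit(io)
  io.each_line do |line|
    stripped = line.strip
    next if stripped.empty? || stripped.start_with?('#')

    # Assumed layout: 18 whitespace-separated columns, then a free-text
    # description that may itself contain spaces.
    fields = stripped.split(/\s+/, 19)
    next if fields.size < 18

    yield TblHit.new(fields[0], fields[2],
                     Float(fields[4]), Float(fields[5]), fields[18])
  end
end

# Usage with a small in-memory example; a real run would pass File.open(path).
sample = <<~TBL
  # target name  accession  query name  accession  E-value  score  bias ...
  seq1 - Pkinase PF00069.20 1.2e-30 105.3 0.1 1.5e-30 105.0 0.1 1.0 1 0 0 1 1 1 1 example protein kinase
TBL

hits = []
each_tblout_hit(StringIO.new(sample)) { |hit| hits << hit }
hits.each { |h| puts "#{h.target}\t#{h.query}\t#{h.evalue}\t#{h.score}" }
```

Passing an IO object rather than a path means the same method works on an open file, a pipe from a decompressor, or a test string.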
On 16 May 2012 21:27, Raoul Bonnal wrote: > If you need to wrap the binary, please have a look at our wrapper. I > am wondering whether this wrapper could be useful to other gems; if so, I could create a > separate gem just for it. Let me know. Docs for the wrapper are in the > README. > > https://github.com/helios/bioruby-ngs/blob/master/lib/wrapper.rb > https://github.com/helios/bioruby-ngs/blob/master/README.rdoc#wrapper > > > On 16/05/12 13.09, "Ben Woodcroft" wrote: > > > Hi guys, > > > > I noticed today that there isn't HMMER3 support in bioruby - particularly > > I'm interested in a parser for hmmsearch outputs as I want to iterate > over > > aligned positions. > > > > I noticed that there is mention of this in the 1.4.1 release notes, that > > hmmer3 will be supported in 1.5, although I'm not sure what exactly this > > means. > > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ > > > > Can I ask what the state of this merge is please? Is there code somewhere > > just waiting to be merged? Can it be quickly spun out into a biogem in > the > > meantime? > > > > Thanks, > > ben > > > From ngoto at gen-info.osaka-u.ac.jp Wed May 16 11:48:14 2012 From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO) Date: Wed, 16 May 2012 20:48:14 +0900 Subject: [BioRuby] hmmer3 In-Reply-To: References: Message-ID: <201205161148.q4GBmFSj016839@portal.open-bio.org> Hi Ben, The HMMER3 result parser was written by Christian. https://github.com/cmzmasek/bioruby I think its quality is probably good enough, except for the RDF/XML support, which is experimental. I'd like to discuss whether the class name Bio::Hmmer3Report is suitable. For HMMER2, the class is Bio::HMMER::Report. Naohisa Goto ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org On Wed, 16 May 2012 21:09:24 +1000 Ben Woodcroft wrote: > Hi guys, > > I noticed today that there isn't HMMER3 support in bioruby - particularly > I'm interested in a parser for hmmsearch outputs as I want to iterate over > aligned positions.
> > I noticed that there is mention of this in the 1.4.1 release notes, that > hmmer3 will be supported in 1.5, although I'm not sure what exactly this > means. > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ > > Can I ask what the state of this merge is please? Is there code somewhere > just waiting to be merged? Can it be quickly spun out into a biogem in the > meantime? > > Thanks, > ben > > -- > Ben Woodcroft > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From bonnal at ingm.org Wed May 16 12:46:34 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Wed, 16 May 2012 14:46:34 +0200 Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it In-Reply-To: <201205160817.q4G8HwBO007774@portal.open-bio.org> Message-ID: Impressive. This is the right approach for cleaning BioRuby from dependencies which could create problems. Thanks Clayton. On 16/05/12 10.17, "Naohisa GOTO" wrote: > Hi, > > Great work, Clayton! > > I think separate gem (Biogem) is good, too. > > Naohisa Goto > ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org > > On Tue, 15 May 2012 20:54:32 +0200 > Pjotr Prins wrote: > >> Marvellous work Clayton! My suggestion to BioRuby is to split out >> phyloxml and to deprecate the current library module. In the next >> release, or after, we should take out that code. I suspect few people >> really depend on it, and they can adapt. I am partly responsible for >> that dependency, and I think the Travis-ci tests also point out that >> the purer Ruby BioRuby is, the better ;). >> >> Naohisa, what do you say? We should also ask the original author, even >> though she has left our little group and now works for google (and I >> am claiming Google does not recruit from GSoC :). Diana, maybe you are >> reading the ML? >> >> Pj. 
>> >> On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote: >>> Hi all, >>> >>> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby >>> gem (an implementation of the libxml API using native Java XML libraries) >>> does not support the full API of libxml-ruby. My first approach to this was >>> to simply use the native libxml-ruby gem and its C extension, which works >>> with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a >>> Unicode issue, and the JRuby developers indicate that the C extension API >>> (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 >>> mode. (see http://bit.ly/JGWC4K) >>> >>> There was a discussion of the PhyloXML parser on the mailing list a couple >>> of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be >>> rewritten to use Nokogiri at some point soon, since Nokogiri is now the de >>> facto standard XML parser. Following that lead, I've gone ahead and ported >>> the PhyloXML parser to use Nokogiri; it only took an hour or two, and the >>> unit tests are passing. My branch for this is at >>> https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a >>> good approach, I can port the writer as well. >>> >>> However, Pjotr suggested that it might make sense to split PhyloXML out into >>> a separate gem. This should be straightforward enough, since no other >>> BioRuby components appear to call PhyloXML. It would mean that any PhyloXML >>> users would need to install a separate gem. On the other hand, it would >>> remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I >>> proceed with this approach? 
>>> >>> Clayton Wheeler >>> cswh at umich.edu >>> >>> >>> _______________________________________________ >>> BioRuby Project - http://www.bioruby.org/ >>> BioRuby mailing list >>> BioRuby at lists.open-bio.org >>> http://lists.open-bio.org/mailman/listinfo/bioruby >>> >> _______________________________________________ >> BioRuby Project - http://www.bioruby.org/ >> BioRuby mailing list >> BioRuby at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioruby > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From donttrustben at gmail.com Wed May 16 13:28:01 2012 From: donttrustben at gmail.com (Ben Woodcroft) Date: Wed, 16 May 2012 23:28:01 +1000 Subject: [BioRuby] hmmer3 In-Reply-To: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> References: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> Message-ID: Ah cool, thanks ngoto. Thanks for writing this, Christian. I believe I've extracted the hmmer3 stuff into a new biogem. I've added you as an author on this, Christian - hope that's ok with you? https://github.com/wwood/bioruby-hmmer3_report I've not released it to rubygems yet - I wanted to clear up namespace issues first. What do you suggest, Naohisa? Bio::HMMER::HMMER3::Report? Looking at the code, it seems it only handles tabular format data, which is rather unfortunate for me, as I need the actual alignment. Looks like I'll have to roll my sleeves up after all, unless there is yet more code out there that parses the regular textual format? I'm not sure about your feelings on this, Christian, but how do you feel about putting the RDF stuff in another biogem? If the aim is to get this gem merged into the bioruby core code (and I hope it is, since when people say hmmer nowadays they likely mean v3, not v2), maybe the RDF stuff is a bit tangential?
I also noticed that in the tests Christian referred to BioRubyTestDataPath which isn't recognised in the biogem. Is there a recommended way to do this in a biogem? Perhaps we should mirror what bioruby itself does to make the code more portable. Thanks everyone for the openness and responsiveness. ben On 16 May 2012 21:48, Naohisa GOTO wrote: > Hi Ben, > > HMMER3 result parser is written by Christian. > https://github.com/cmzmasek/bioruby > > I guess it may be enough quality, except RDF/XML support > which is experimental. > > I'd like to discuss that the class name Bio::Hmmer3Report > is suitable. For HMMER2, Bio::HMMER::Report. > > Naohisa Goto > ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org > > On Wed, 16 May 2012 21:09:24 +1000 > Ben Woodcroft wrote: > > > Hi guys, > > > > I noticed today that there isn't HMMER3 support in bioruby - particularly > > I'm interested in a parser for hmmsearch outputs as I want to iterate > over > > aligned positions. > > > > I noticed that there is mention of this in the 1.4.1 release notes, that > > hmmer3 will be supported in 1.5, although I'm not sure what exactly this > > means. > > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/ > > > > Can I ask what the state of this merge is please? Is there code somewhere > > just waiting to be merged? Can it be quickly spun out into a biogem in > the > > meantime? 
> > > > Thanks, > > ben > > > > -- > > Ben Woodcroft > > _______________________________________________ > > BioRuby Project - http://www.bioruby.org/ > > BioRuby mailing list > > BioRuby at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioruby > > -- -- Ben Woodcroft http://ecogenomic.org/users/ben-woodcroft From pjotr.public14 at thebird.nl Wed May 16 13:46:12 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Wed, 16 May 2012 15:46:12 +0200 Subject: [BioRuby] hmmer3 In-Reply-To: References: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> Message-ID: <20120516134612.GA26059@thebird.nl> On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote: > I'm not sure about your feelings on this Christian, but how do you feel > about putting the rdf stuff in another biogem? If the aim is to get this > gem merged into the bioruby core code (and I hope it is since when people > say hmmer nowadays they likely mean v3, not v2), maybe the rdf stuff is a > bit tangential? I think it should be decoupled. RDF, in general, is a (searchable) result-based (post-parser) format. Maybe we should coin that definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF into triple stores. Sounds that bio-rdf is the right place for that translation code to me :). Feel free to push it in. Pj. From pjotr.public14 at thebird.nl Thu May 17 16:51:01 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Thu, 17 May 2012 18:51:01 +0200 Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures In-Reply-To: <201205160754.q4G7srSc005733@portal.open-bio.org> References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> <201205160754.q4G7srSc005733@portal.open-bio.org> Message-ID: <20120517165101.GA32610@thebird.nl> I don't know if you all track github, but thanks to two GSoC coders (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius. 
Travis-CI should show the green light for all Rubies once Rubinius itself gets updated on Travis :) Kudos. Pj. From cjfields at illinois.edu Thu May 17 16:59:33 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 17 May 2012 16:59:33 +0000 Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures In-Reply-To: <20120517165101.GA32610@thebird.nl> References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> <201205160754.q4G7srSc005733@portal.open-bio.org> <20120517165101.GA32610@thebird.nl> Message-ID: Sounds like GSoC this year is paying lots of dividends :) chris On May 17, 2012, at 11:51 AM, Pjotr Prins wrote: > I don't know if you all track github, but thanks to two GSoC coders > (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius. > > Travis-CI should show the green light for all Rubies once Rubinius > itself gets updated on Travis :) > > Kudos. > > Pj. > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From cswh at umich.edu Thu May 17 17:42:06 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Thu, 17 May 2012 13:42:06 -0400 Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures In-Reply-To: <20120517165101.GA32610@thebird.nl> References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu> <201205160754.q4G7srSc005733@portal.open-bio.org> <20120517165101.GA32610@thebird.nl> Message-ID: <7D2E3046-44E9-4275-B294-8DB39D36294B@umich.edu> On May 17, 2012, at 12:51 PM, Pjotr Prins wrote: > I don't know if you all track github, but thanks to two GSoC coders > (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius. > > Travis-CI should show the green light for all Rubies once Rubinius > itself gets updated on Travis :) Thanks Pjotr. 
Unfortunately I think we're not going to be quite there for JRuby just yet; we've hit a couple of JRuby bugs which will probably need to be fixed to solve some of the failures. Also, I think we may be stuck with PhyloXML test failures under JRuby in 1.9 mode until we split that out into a separate gem. It's definitely progress, though. Clayton Wheeler cswh at umich.edu From cswh at umich.edu Thu May 17 19:39:27 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Thu, 17 May 2012 15:39:27 -0400 Subject: [BioRuby] PhyloXML and libxml-ruby Message-ID: Hi all, It appears that the native extension for libxml-ruby is not building reliably under JRuby, causing Travis-CI runs to fail as seen at: http://travis-ci.org/#!/ngoto/bioruby/jobs/1356992 I'm not having much luck identifying exactly why it builds in some JRuby environments and not others, but I've been able to reproduce the Travis-CI problem on a test Linux machine and don't see an obvious fix. If we're going to repackage PhyloXML into a separate gem, I think the safest course of action would be to revert to calling for libxml-jruby in the Travis-CI Gemfiles (i.e. back out http://bit.ly/JmNjDY). Using libxml-ruby instead of libxml-jruby doesn't solve the PhyloXML problems on JRuby in 1.9 mode anyway, and 1.9 mode will soon be the default in JRuby. The PhyloXML gem can be explicitly declared to depend on libxml-ruby, and moving it out of the core BioRuby gem will remove this whole issue, as far as the unit tests go. Then PhyloXML's library requirements can be addressed separately. Thoughts? Clayton Wheeler cswh at umich.edu From cswh at umich.edu Fri May 18 03:10:52 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Thu, 17 May 2012 23:10:52 -0400 Subject: [BioRuby] bio-phyloxml gem Message-ID: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu> Hi all, I have repackaged BioRuby's PhyloXML support as a separate gem: https://github.com/csw/bioruby-phyloxml I was able to preserve its revision history. 
All the unit tests pass, too. I did take this opportunity to rename some of the files, so their names correspond to the namespace of the classes. I think I've set up the packaging appropriately, though I'd appreciate it if someone more experienced with the Biogems infrastructure could take a quick look at this. (Hint hint, Pjotr.) Who should we designate as the maintainer? I suppose I have my hands on it, but if there are any volunteers? And if it would make more sense to host this under someone else's Github account, that should be easy enough. Also, feel free to contribute changes to the README. If everything looks good, I'll go ahead and set this up on Travis-CI, biogems.info, and Rubygems as version 1.0.0. Clayton Wheeler cswh at umich.edu From donttrustben at gmail.com Fri May 18 04:59:44 2012 From: donttrustben at gmail.com (Ben Woodcroft) Date: Fri, 18 May 2012 14:59:44 +1000 Subject: [BioRuby] hmmer3 In-Reply-To: <20120516134612.GA26059@thebird.nl> References: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> <20120516134612.GA26059@thebird.nl> Message-ID: On 16 May 2012 23:46, Pjotr Prins wrote: > On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote: > > maybe the rdf stuff is a > > bit tangential? > > I think it should be decoupled. RDF, in general, is a (searchable) > result-based (post-parser) format. Maybe we should coin that > definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF > into triple stores. Sounds that bio-rdf is the right place for that > translation code to me :). Feel free to push it in. > Thanks. I've removed the rdf related code all in one commit: https://github.com/wwood/bioruby-hmmer3_report/commit/3795ce3a124011cb600e78e6ef10603187c99d20 However, I don't feel like I should be adding this to a different repository because I don't feel like I understand the technology enough, and therefore am not really inclined to maintain it. 
All of the relevant code should be in that commit, so should be quite simple to add in yourself if you are inclined (though I couldn't find any unit tests). Only, I've changed the namespace of it to Bio::HMMER::HMMER3::Report from Bio::Hmmer3report as Naohisa suggested. I've also now pushed the new biogem to rubygems/biogems.info. Thanks, ben From pjotr.public14 at thebird.nl Fri May 18 05:21:23 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Fri, 18 May 2012 07:21:23 +0200 Subject: [BioRuby] hmmer3 In-Reply-To: References: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com> <20120516134612.GA26059@thebird.nl> Message-ID: <20120518052123.GA3360@thebird.nl> OK, I'll take the orphaned RDF code. On Fri, May 18, 2012 at 02:59:44PM +1000, Ben Woodcroft wrote: > On 16 May 2012 23:46, Pjotr Prins <[1]pjotr.public14 at thebird.nl> wrote: > > On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote: > > maybe the rdf stuff is a > > bit tangential? > > I think it should be decoupled. RDF, in general, is a (searchable) > result-based (post-parser) format. Maybe we should coin that > definition somewhere :). I created bio-rdf biogem as a 'sink' for > RDF > into triple stores. Sounds that bio-rdf is the right place for that > translation code to me :). Feel free to push it in. > > Thanks. I've removed the rdf related code all in one commit: > [2]https://github.com/wwood/bioruby-hmmer3_report/commit/3795ce3a124011 > cb600e78e6ef10603187c99d20 > However, I don't feel like I should be adding this to a different > repository because I don't feel like I understand the technology > enough, and therefore am not really inclined to maintain it. All of the > relevant code should be in that commit, so should be quite simple to > add in yourself if you are inclined (though I couldn't find any unit > tests). Only, I've changed the namespace of it to > Bio::HMMER::HMMER3::Report from Bio::Hmmer3report as Naohisa suggested. 
> I've also now pushed the new biogem to rubygems/[3]biogems.info. > Thanks, > ben > > References > > 1. mailto:pjotr.public14 at thebird.nl > 2. https://github.com/wwood/bioruby-hmmer3_report/commit/3795ce3a124011cb600e78e6ef10603187c99d20 > 3. http://biogems.info/ From pjotr.public14 at thebird.nl Fri May 18 05:24:40 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Fri, 18 May 2012 07:24:40 +0200 Subject: [BioRuby] bio-phyloxml gem In-Reply-To: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu> References: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu> Message-ID: <20120518052440.GB3360@thebird.nl> I am with Raoul and Francesco today. We will take a look and discuss. Good job, also saving the revision history :). On Thu, May 17, 2012 at 11:10:52PM -0400, Clayton Wheeler wrote: > Hi all, > > I have repackaged BioRuby's PhyloXML support as a separate gem: > > https://github.com/csw/bioruby-phyloxml > > I was able to preserve its revision history. All the unit tests pass, too. I did take this opportunity to rename some of the files, so their names correspond to the namespace of the classes. I think I've set up the packaging appropriately, though I'd appreciate it if someone more experienced with the Biogems infrastructure could take a quick look at this. (Hint hint, Pjotr.) > > Who should we designate as the maintainer? I suppose I have my hands on it, but if there are any volunteers? And if it would make more sense to host this under someone else's Github account, that should be easy enough. > > Also, feel free to contribute changes to the README. > > If everything looks good, I'll go ahead and set this up on Travis-CI, biogems.info, and Rubygems as version 1.0.0. 
>
> Clayton Wheeler
> cswh at umich.edu
>
>
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
>

From donttrustben at gmail.com Fri May 18 05:40:28 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Fri, 18 May 2012 15:40:28 +1000
Subject: [BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer
Message-ID:

Hi guys,

Here's some blatant advertising for some code I've recently written in biogem form.

bio-gag: "gag error" is the term I've coined to describe an error that various people have observed on certain sequencing kits with IonTorrent, though it has not previously been characterised very well that I know of (we noticed that the errors seemed to occur at GAG positions in the reads that were supposed to be GAAG). This biogem tries to find and fix these errors. It isn't benchmarked for accuracy but worked well enough for my lab's own purposes. Actually, to be honest, we've only used an older version of the software on real data, and the logic has changed a little since, given some recent evidence we have, but I thought I'd push it out with the latest and greatest error model.
https://github.com/wwood/bioruby-gag

bio-pileup_iterator: To find gag errors, bio-gag iterates through pileup files looking for particular patterns, e.g. strand bias of insertions. This gem can be used to iterate through pileup files one position (one line) at a time, building up the sequence of each read as it goes, recording their direction etc. Probably not the fastest piece of code in the world, sorry. I'm not sure whether this should/can be incorporated into bio-samtools? It adds functionality - there's no duplication (I don't think).
https://github.com/wwood/bioruby-pileup_iterator

bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to the hmmer v3 manual.
https://github.com/wwood/bioruby-hmmer_model

bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles tabular format files - the guts of this were written by Christian - see yesterday's thread for details. I'm hoping to add regular (non-tabular) format parsing in the near future, but no promises.
https://github.com/wwood/bioruby-hmmer3_report

I'm sure there are bugs and deficiencies - apologies in advance.

Enjoy,
ben

From francesco.strozzi at gmail.com Fri May 18 08:01:01 2012
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Fri, 18 May 2012 10:01:01 +0200
Subject: [BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer
In-Reply-To:
References:
Message-ID:

Hi Ben,
thanks for the amazing work! I'm not using Ion Torrent atm but I eventually will, and it's good to see there is something already set up.

Francesco

On Fri, May 18, 2012 at 7:40 AM, Ben Woodcroft wrote:
> Hi guys,
>
> Here's some blatant advertising for some code I've recently written in
> biogem form.
>
> bio-gag: "gag error" is the term I've coined to describe an error that
> various people have observed on certain sequencing kits with IonTorrent,
> though it has not previously been characterised very well that I know of
> (we noticed that the errors seemed to occur at GAG positions in the reads
> that were supposed to be GAAG). This biogem tries to find and fix these
> errors. It isn't benchmarked for accuracy but worked well enough for my
> lab's own purposes. Actually to be honest we've only used an older version
> of the software on real data and the logic has a little since given some
> recent evidence we have, but I thought I'd push it out with the latest and
> greatest error model.
> https://github.com/wwood/bioruby-gag
>
> bio-pileup_iterator: To find gag errors bio-gag iterates through pileup
> files looking for particular patterns e.g. strand bias of insertions.
> This gem can be used to iterate through pileup files one position (one line) at
> a time, building up the sequence of each read as it goes, recording their
> direction etc. Probably not the fastest piece of code in the world, sorry.
> I'm not sure whether this should/can be incorporated into bio-samtools? It
> adds functionality - there's no duplication (I don't think).
> https://github.com/wwood/bioruby-pileup_iterator
>
> bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to
> the hmmer v3 manual.
> https://github.com/wwood/bioruby-hmmer_model
>
> bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles
> tabular format files - the guts of this were written by Christian - see
> yesterday's thread for details. I'm hoping to add regular (non-tabular)
> format parsing in the near future, but no promises.
> https://github.com/wwood/bioruby-hmmer3_report
>
> I'm sure there is bugs and deficiencies - apologies in advance.
>
> Enjoy,
> ben

--
Francesco

From bonnal at ingm.org Fri May 18 08:54:44 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Fri, 18 May 2012 10:54:44 +0200
Subject: [BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer
In-Reply-To:
Message-ID:

My lab (Alberto) will try your HMM parsers because we are going to annotate a lot of stuff coming from NGS ^_^

On 18/05/12 10.01, "Francesco Strozzi" wrote:
> Hi Ben,
> thanks for the amazing work! I'm not using Ion Torrent atm but I
> eventually will and it's good to see there is something already setup.
>
> Francesco
>
> On Fri, May 18, 2012 at 7:40 AM, Ben Woodcroft wrote:
>> Hi guys,
>>
>> Here's some blatant advertising for some code I've recently written in
>> biogem form.
>>
>> Enjoy,
>> ben

From pjotr.public14 at thebird.nl Sun May 20 12:31:31 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 20 May 2012 14:31:31 +0200
Subject: [BioRuby] biogems.info updated
Message-ID: <20120520123131.GA17983@thebird.nl>

Marjan and I have updated the RSS feed for biogems.info - now we can support more blogs. If you are blogging on Ruby for Bioinformatics, give us the feed :)

Pj.

From marian.povolny at gmail.com Mon May 21 09:36:01 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Mon, 21 May 2012 11:36:01 +0200
Subject: [BioRuby] GSoC weekly status report No.1.2
Message-ID:

http://blog.mpthecoder.com/post/23473020471/gsoc-weekly-status-report-no-1-2

It's been three months since my first introduction on the BioRuby ML and it's been great. As it is the end of the GSoC community bonding period, I would like to thank Pjotr most of all, and then all the other community members, for their help and support. It's a great feeling to become a member of a small but growing community of enthusiasts who work together for the benefit of all of us and for fun.

As Pjotr already did, I would like to encourage you to write blog posts about using Ruby in Bioinformatics and let us include them in our RSS and news feeds on the biogems.info website. The site supports both RSS and Atom feeds now, and similar functionality will be part of the new website for BioRuby once it's finished. The code also supports adding only posts for one category/tag, so you can tag your posts with BioRuby or similar, and only those posts will be included in the RSS feed on biogems.info.

The GSoC coding period starts today. It's time for me to roll my sleeves up and start working on the GFF3 parser full-time.
--
Marjan

From lomereiter at googlemail.com Mon May 21 11:58:46 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Mon, 21 May 2012 15:58:46 +0400
Subject: [BioRuby] [GSoC] Weekly report #1
Message-ID:

Hi all,

here's my report about the past week:
http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/

Brief summary:

1) BioRuby unit tests and Rubinius bugs: I posted 2 issues in the Rubinius bug tracker, and one of them is already solved. Rubinius in 1.8 mode should now pass all tests. The situation with 1.9 mode is not that great, but I'm working on it.

2) I started to collect D optimization tricks on a GitHub wiki page. Currently, it contains just 6 tips, but this number is going to grow. Probably, another page will be created soon to keep best practices for connecting Ruby and D. Since my project and Marjan's have a lot in common, I think it's important for us not to waste time on something that has already been investigated.

3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and wrote my first two features.

4) Measurements of object instantiation time in Ruby suggest that exposing low-level D functions via FFI makes little sense. I'm going to discuss with mentors which high-level functions should be available, and turn that into Cucumber features.

--
Artem

From cswh at umich.edu Mon May 21 15:50:18 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Mon, 21 May 2012 11:50:18 -0400
Subject: [BioRuby] GSoC week 2 status report
Message-ID: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>

Hi all,

Here's my report on last week's work:

http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/

This was my second week of work on my GSoC project, and the last week of the "community bonding" period before the official start of coding. A major focus of mine was BioRuby's phyloXML support; it uses libxml, which has been causing unit test failures under JRuby.
In the end, the best course of action seemed to be to split the phyloXML support out into a separate plugin, which I have done as the bio-phyloxml gem. This will remove BioRuby's dependency on XML libraries entirely and that JRuby issue along with it. At the same time, users of the phyloXML code should be able to continue using it with no substantive changes.

Separately, I began porting this phyloXML code to use Nokogiri instead of libxml-ruby, but ran into difficulties with this effort. While it is possible, and the library APIs are very similar, the code uses relatively low-level XML processing APIs in ways that seem to be sensitive to subtle differences in text node and namespace semantics between the two libraries. Substantial restructuring of the code and the addition of quite a few unit tests might be necessary to carry out such a port with confidence that the resulting code would work well.

Also, someone else submitted a JRuby patch for JRUBY-6658, one of the major causes of BioRuby's unit test failures with JRuby; once a fix is integrated, we'll be close to having all the tests passing under JRuby.

I identified another JRuby bug, JRUBY-6666, causing several unit test failures. This one affects BioRuby's code for running external commands, so it would be likely to be encountered in production use. For this one, I also worked up a patch.

I also spent some time preparing a performance testing environment, for evaluating existing MAF implementations as well as my own. This will be important, since I will be considering the use of an existing C parser. I will also want to ensure that the performance of my code is competitive with the alternatives. Lacking any hardware more powerful than a MacBook Air, I am setting this up with Amazon EC2. To simplify environment setup, I'll be using Chef. I've already set up a Chef repository with configuration logic, and some rudimentary code to streamline launching Ubuntu machines on EC2 and bootstrapping a Chef environment.
To save money, I plan to make use of EC2 Spot Instances, which are perfect for instances that only need to run for a few hours for batch tasks.

Clayton Wheeler
cswh at umich.edu

From bonnal at ingm.org Tue May 22 09:21:42 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Tue, 22 May 2012 11:21:42 +0200
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>
Message-ID:

Hi Clayton,
Well done, and thanks for your contributions to the BioRuby and JRuby communities.

For your computing issue I have two suggestions:
1) I can create a VM and give you access; I need to contact my IT department.
2) Could Amazon provide some VMs for our students?

On 21/05/12 17.50, "Clayton Wheeler" wrote:
> Hi all,
>
> Here's my report on last week's work:
>
> http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/
>
> This was my second week of work on my GSoC project, and the last week of the
> "community bonding" period before the official start of coding. A major focus
> of mine was BioRuby's phyloXML support; it uses libxml, which has been causing
> unit test failures under JRuby. In the end, the best course of action seemed
> to separate the phyloXML support as a separate plugin, which I have done as
> the bio-phyloxml gem. This will remove BioRuby's dependency on XML libraries
> entirely and that JRuby issue along with it. At the same time, users of the
> phyloXML code should be able to continue using it with no substantive changes.
>
> Separately, I began porting this phyloXML code to use Nokogiri instead of
> libxml-ruby, but ran into difficulties with this effort. While it is possible,
> and the library APIs are very similar, the code uses relatively low-level XML
> processing APIs in ways that seem to be sensitive to subtle differences in
> text node and namespace semantics between the two libraries.
> > Clayton Wheeler > cswh at umich.edu > > > > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From p.j.a.cock at googlemail.com Tue May 22 11:07:15 2012 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 22 May 2012 12:07:15 +0100 Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond In-Reply-To: <4F9AFA1F.6030103@med.nyu.edu> References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu> Message-ID: Hi all, I've CC'd the BioRuby mailing list just to ensure you're aware of the potentially useful combination of MAF indexing and BGZF compression. We can continue this on the BioRuby list if more appropriate. The start of this Biopython-dev thread is here: http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html This might be a nice opportunity to combine the work of this year's OBF Google Summer of Code students - Clayton is doing MAF for BioRuby, and part of Artem's project could provide BGZF support for BioRuby. On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak wrote: > Peter, > >> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py >> and I'm willing to do this myself for MAF (while going over your index >> work - something I want to do anyway). The only potential catch is >> avoiding offset arithmetic. > > I have no problem with you doing this if you're willing. It would be great > to have some code review of MafIndex as well. I'm not sure if Clayton will be able to comment on the Python code, but he should have some thoughts on the MAF indexing itself. 
Regards, Peter From pjotr.public14 at thebird.nl Tue May 22 15:23:17 2012 From: pjotr.public14 at thebird.nl (Pjotr Prins) Date: Tue, 22 May 2012 17:23:17 +0200 Subject: [BioRuby] BioRuby hitting 20K Message-ID: <20120522152317.GA30752@thebird.nl> Looks like we'll have 20K downloads of the bioruby gem by tomorrow :). Maybe time for a new release? We are getting a lot more activity anyway - Go BioRuby Go! Pj. From mh6 at sanger.ac.uk Tue May 22 15:32:03 2012 From: mh6 at sanger.ac.uk (Michael Paulini) Date: Tue, 22 May 2012 16:32:03 +0100 Subject: [BioRuby] BioRuby hitting 20K In-Reply-To: <20120522152317.GA30752@thebird.nl> References: <20120522152317.GA30752@thebird.nl> Message-ID: <4FBBB173.2030001@sanger.ac.uk> congrats biorubystas :-) M On 22/05/12 16:23, Pjotr Prins wrote: > Looks like we'll have 20K downloads of the bioruby gem by tomorrow > :). Maybe time for a new release? > > We are getting a lot more activity anyway - Go BioRuby Go! > > Pj. > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From bonnal at ingm.org Wed May 23 13:24:56 2012 From: bonnal at ingm.org (Raoul Bonnal) Date: Wed, 23 May 2012 15:24:56 +0200 Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond In-Reply-To: Message-ID: Thanks Peter, These are valuable hints. On 22/05/12 13.07, "Peter Cock" wrote: > Hi all, > > I've CC'd the BioRuby mailing list just to ensure you're aware of the > potentially useful combination of MAF indexing and BGZF compression. > We can continue this on the BioRuby list if more appropriate. 
> > The start of this Biopython-dev thread is here: > http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html > > This might be a nice opportunity to combine the work of this year's OBF > Google Summer of Code students - Clayton is doing MAF for BioRuby, > and part of Artem's project could provide BGZF support for BioRuby. > > On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak > wrote: >> Peter, >> >>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py >>> and I'm willing to do this myself for MAF (while going over your index >>> work - something I want to do anyway). The only potential catch is >>> avoiding offset arithmetic. >> >> I have no problem with you doing this if you're willing. It would be great >> to have some code review of MafIndex as well. > > I'm not sure if Clayton will be able to comment on the Python code, > but he should have some thoughts on the MAF indexing itself. > > Regards, > > Peter > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby From cswh at umich.edu Thu May 24 01:35:46 2012 From: cswh at umich.edu (Clayton Wheeler) Date: Wed, 23 May 2012 21:35:46 -0400 Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond In-Reply-To: References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu> Message-ID: On May 22, 2012, at 7:07 AM, Peter Cock wrote: > Hi all, > > I've CC'd the BioRuby mailing list just to ensure you're aware of the > potentially useful combination of MAF indexing and BGZF compression. > We can continue this on the BioRuby list if more appropriate. 
> > The start of this Biopython-dev thread is here: > http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html > > This might be a nice opportunity to combine the work of this year's OBF > Google Summer of Code students - Clayton is doing MAF for BioRuby, > and part of Artem's project could provide BGZF support for BioRuby. Indeed, thanks Peter. BGZF sounds like a great approach for MAF compression; I'm just about to start looking into indexing support, and it makes sense to tackle compression in that context. So far, I think Artem's BGZF implementation is entirely in D; I may just add Ruby support for BGZF separately. > On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak > wrote: >> Peter, >> >>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py >>> and I'm willing to do this myself for MAF (while going over your index >>> work - something I want to do anyway). The only potential catch is >>> avoiding offset arithmetic. >> >> I have no problem with you doing this if you're willing. It would be great >> to have some code review of MafIndex as well. > > I'm not sure if Clayton will be able to comment on the Python code, > but he should have some thoughts on the MAF indexing itself. I'll definitely be spending more time with that code; it and the bx-python MAF indexing code will be my main reference points for indexed access. It's been a little while, but I have done some Python work in the past, so I should be able to follow along okay. I'll send some comments out in a few days. 
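
For indexed access on top of BGZF, the usual building block is the 64-bit "virtual offset" used by BAM-style indexes: the compressed file offset of the start of a block goes in the upper 48 bits, and the position inside that block's uncompressed data goes in the lower 16 bits. This is also why plain arithmetic on such offsets is not meaningful. A minimal Ruby sketch of that packing follows; the module and method names are hypothetical and are not taken from Biopython, bx-python or any BioRuby gem.

```ruby
# Hypothetical helper for BGZF virtual offsets (illustration only).
module BGZFOffset
  WITHIN_BITS = 16                       # low 16 bits: offset inside the uncompressed block
  WITHIN_MASK = (1 << WITHIN_BITS) - 1
  MAX_BLOCK_START = (1 << 48) - 1        # high 48 bits: file offset of the block start

  # Pack (block_start, within_block) into one integer.
  def self.make(block_start, within_block)
    unless (0..MAX_BLOCK_START).cover?(block_start)
      raise ArgumentError, "block start out of range: #{block_start}"
    end
    unless (0..WITHIN_MASK).cover?(within_block)
      raise ArgumentError, "within-block offset out of range: #{within_block}"
    end
    (block_start << WITHIN_BITS) | within_block
  end

  # Unpack a virtual offset into [block_start, within_block].
  def self.split(virtual_offset)
    [virtual_offset >> WITHIN_BITS, virtual_offset & WITHIN_MASK]
  end
end

voffset = BGZFOffset.make(100_000, 42)
block_start, within = BGZFOffset.split(voffset)
puts "block_start=#{block_start} within=#{within}"
```

Virtual offsets built this way still sort in file order, so an index can store and compare them as ordinary integers, but seeking means splitting them first: seek to the block start, decompress the block, then skip to the within-block position.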
Clayton Wheeler cswh at umich.edu From mictadlo at gmail.com Thu May 24 04:30:22 2012 From: mictadlo at gmail.com (Mic) Date: Thu, 24 May 2012 14:30:22 +1000 Subject: [BioRuby] [GSoC] Weekly report #1 In-Reply-To: References: Message-ID: D to Ruby: http://www.swig.org/compare.html On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov wrote: > Hi all, > > here's my report about the past week: > http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/ > > Brief summary: > > 1) BioRuby unit tests and Rubinius bugs ? I posted 2 issues in Rubinius > bugtracker, and one of them is already solved. Rubinius in 1.8 mode should > now pass all tests. The situation with 1.9 mode is not that great, but I'm > working on it. > > 2) I started to collect D optimization tricks on github wiki page. > Currently, it contains just 6 tips, but this number is going to grow. > Probably, another page will be created soon to keep best practices of > connecting Ruby and D. Since my project and Marjan's one have a lot in > common, I think it's important for us to not waste time on something that > already have been investigated. > > 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and > wrote my first two features. > > 4) Measurements of object instantiation time in Ruby suggest that exposing > low-level D functions via FFI makes little sense. I'm going to discuss with > mentors which high-level functions should be available, and make that into > Cucumber features. 
> > > > > -- > Artem > > _______________________________________________ > BioRuby Project - http://www.bioruby.org/ > BioRuby mailing list > BioRuby at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioruby > From cjfields at illinois.edu Thu May 24 05:14:20 2012 From: cjfields at illinois.edu (Fields, Christopher J) Date: Thu, 24 May 2012 05:14:20 +0000 Subject: [BioRuby] [GSoC] Weekly report #1 In-Reply-To: References: Message-ID: I think the mentioned D wrappers on the SWIG page are ANSI C/C++ libraries wrapped for D, not D code/libs/etc wrapped for Ruby, unless I'm mistaken... chris On May 23, 2012, at 11:30 PM, Mic wrote: > D to Ruby: http://www.swig.org/compare.html > > On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov wrote: > >> Hi all, >> >> here's my report about the past week: >> http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/ >> >> Brief summary: >> >> 1) BioRuby unit tests and Rubinius bugs ? I posted 2 issues in Rubinius >> bugtracker, and one of them is already solved. Rubinius in 1.8 mode should >> now pass all tests. The situation with 1.9 mode is not that great, but I'm >> working on it. >> >> 2) I started to collect D optimization tricks on github wiki page. >> Currently, it contains just 6 tips, but this number is going to grow. >> Probably, another page will be created soon to keep best practices of >> connecting Ruby and D. Since my project and Marjan's one have a lot in >> common, I think it's important for us to not waste time on something that >> already have been investigated. >> >> 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and >> wrote my first two features. >> >> 4) Measurements of object instantiation time in Ruby suggest that exposing >> low-level D functions via FFI makes little sense. I'm going to discuss with >> mentors which high-level functions should be available, and make that into >> Cucumber features. 
>>
>>
>>
>>
>> --
>> Artem

From cswh at umich.edu Thu May 24 05:33:40 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 24 May 2012 01:33:40 -0400
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To:
References:
Message-ID: <9DBCD042-7086-4F4B-ABB9-1A7F63C089B8@umich.edu>

Thanks for the offers of help, everybody.

Raoul, if it's convenient for you to set up a test VM in house, that would probably make the most sense. I don't think it's a pressing need at this point, but let's look into that. If we run into issues, we can revisit the EC2 options. (I've had an AWS account too long to qualify for the free usage tier, unfortunately.)

An Amazon grant might be worth looking at, especially if we can use it to publicly host, say, BGZF-compressed pre-indexed MAF data sets also. On the other hand, that might be overkill just for my needs; using spot-priced instances, I expect I could do all the testing I need for under $50.

Clayton Wheeler
cswh at umich.edu

From lomereiter at googlemail.com Thu May 24 05:40:54 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Thu, 24 May 2012 09:40:54 +0400
Subject: [BioRuby] [GSoC] Weekly report #1
In-Reply-To:
References:
Message-ID:

Chris is right. Currently, it's easier to write everything manually. Once I develop some 'best practices' I may put them into compile-time algorithms and generate bindings from D. (The language has compile-time introspection but no run-time introspection, probably because that would hurt performance.)
On Thu, May 24, 2012 at 9:14 AM, Fields, Christopher J < cjfields at illinois.edu> wrote: > I think the mentioned D wrappers on the SWIG page are ANSI C/C++ libraries > wrapped for D, not D code/libs/etc wrapped for Ruby, unless I'm mistaken... > > chris > > On May 23, 2012, at 11:30 PM, Mic wrote: > > > D to Ruby: http://www.swig.org/compare.html > > > > On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov < > lomereiter at googlemail.com>wrote: > > > >> Hi all, > >> > >> here's my report about the past week: > >> http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/ > >> > >> Brief summary: > >> > >> 1) BioRuby unit tests and Rubinius bugs ? I posted 2 issues in Rubinius > >> bugtracker, and one of them is already solved. Rubinius in 1.8 mode > should > >> now pass all tests. The situation with 1.9 mode is not that great, but > I'm > >> working on it. > >> > >> 2) I started to collect D optimization tricks on github wiki page. > >> Currently, it contains just 6 tips, but this number is going to grow. > >> Probably, another page will be created soon to keep best practices of > >> connecting Ruby and D. Since my project and Marjan's one have a lot in > >> common, I think it's important for us to not waste time on something > that > >> already have been investigated. > >> > >> 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, > and > >> wrote my first two features. > >> > >> 4) Measurements of object instantiation time in Ruby suggest that > exposing > >> low-level D functions via FFI makes little sense. I'm going to discuss > with > >> mentors which high-level functions should be available, and make that > into > >> Cucumber features. 
> >> > >> > >> > >> > >> -- > >> Artem > >> > >> _______________________________________________ > >> BioRuby Project - http://www.bioruby.org/ > >> BioRuby mailing list > >> BioRuby at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/bioruby > >> > > > > _______________________________________________ > > BioRuby Project - http://www.bioruby.org/ > > BioRuby mailing list > > BioRuby at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/bioruby > > From lomereiter at googlemail.com Thu May 24 05:52:42 2012 From: lomereiter at googlemail.com (Artem Tarasov) Date: Thu, 24 May 2012 09:52:42 +0400 Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond In-Reply-To: References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu> Message-ID: Hi all, it's a good point that many line-based formats need some sort of compression with indexing, and BGZF is good enough in that sense. So far, I think Artem's BGZF implementation is entirely in D; I may just > add Ruby support for BGZF separately. > The only problem I see with that approach is that it's hardly possible to get parallel compression with MRI. But overall I tend to agree with Clayton. Firstly, it's hard to abstract away some common interface right now, not writing any code and looking at it. Secondly, there're still problems with D shared library support. We were assured by GDC developer that they'll get solved soon, but at the moment the situation is far from perfect. 
From p.j.a.cock at googlemail.com Thu May 24 09:18:33 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 24 May 2012 10:18:33 +0100
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To:
References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu>
Message-ID:

On Thu, May 24, 2012 at 6:52 AM, Artem Tarasov wrote:
> Hi all,
>
> it's a good point that many line-based formats need some sort of compression
> with indexing, and BGZF is good enough in that sense.

BGZF doesn't have to be used with line-based formats; anything with sequential records would work (like BAM files, of course). I've not tried it to see how well it compresses, but SFF files in BGZF should work too, as another example.

>> So far, I think Artem's BGZF implementation is entirely in D; I may just
>> add Ruby support for BGZF separately.
>
> The only problem I see with that approach is that it's hardly possible to
> get parallel compression with MRI. But overall I tend to agree with Clayton.
> Firstly, it's hard to abstract away some common interface right now, not
> writing any code and looking at it. Secondly, there're still problems with D
> shared library support. We were assured by GDC developer that they'll get
> solved soon, but at the moment the situation is far from perfect.

My BGZF code is pure Python (using C zlib via Python's zlib library), and does not currently tackle parallel compression or decompression. There has been recent work in samtools on this.

We don't need parallel compression/decompression of BGZF for it to be useful.
Peter

From john.woods at marcottelab.org Thu May 24 14:01:08 2012
From: john.woods at marcottelab.org (John Woods)
Date: Thu, 24 May 2012 09:01:08 -0500
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To:
References: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>
Message-ID:

If I can just suggest, there's a startup pitch out there which was formerly known as Happy Science Coding, now Appsoma, which lets you run Ruby code on Rackspace instances. It may or may not be appropriate for what you want to do. It's not EC2, but it is a VM (right?).

http://appsoma.com/

It's still a bit buggy with Ruby. If you have trouble, email Zack (see the "About us" page). He's fairly responsive.

John
SciRuby

On Tue, May 22, 2012 at 4:21 AM, Raoul Bonnal wrote:

> Hi Clayton,
> Well done, and thanks for your contributions to the bioruby and jruby community.
>
> For your computing issue I have two solutions:
> 1) I can create a VM and give you access; I need to contact my IT dept.
> 2) Could Amazon provide some VMs for our students?
>
> On 21/05/12 17.50, "Clayton Wheeler" wrote:
>
> > Hi all,
> >
> > Here's my report on last week's work:
> >
> > http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/
> >
> > This was my second week of work on my GSoC project, and the last week of the
> > "community bonding" period before the official start of coding. A major focus
> > of mine was BioRuby's phyloXML support; it uses libxml, which has been causing
> > unit test failures under JRuby. In the end, the best course of action seemed
> > to be to separate the phyloXML support into a separate plugin, which I have
> > done as the bio-phyloxml gem. This will remove BioRuby's dependency on XML
> > libraries entirely and that JRuby issue along with it. At the same time, users
> > of the phyloXML code should be able to continue using it with no substantive
> > changes.
> >
> > Separately, I began porting this phyloXML code to use Nokogiri instead of
> > libxml-ruby, but ran into difficulties with this effort. While it is possible,
> > and the library APIs are very similar, the code uses relatively low-level XML
> > processing APIs in ways that seem to be sensitive to subtle differences in
> > text node and namespace semantics between the two libraries. Substantial
> > restructuring of the code and the addition of quite a few unit tests might be
> > necessary to carry out such a port with confidence that the resulting code
> > would work well.
> >
> > Also, someone else submitted a JRuby patch for JRUBY-6658, one of the major
> > causes of BioRuby's unit test failures with JRuby; once a fix is integrated,
> > we'll be close to having all the tests passing under JRuby.
> >
> > I identified another JRuby bug, JRUBY-6666, causing several unit test
> > failures. This one affects BioRuby's code for running external commands, so it
> > would be likely to be encountered in production use. For this one, I also
> > worked up a patch.
> >
> > I also spent some time preparing a performance testing environment, for
> > evaluating existing MAF implementations as well as my own. This will be
> > important, since I will be considering the use of an existing C parser. I will
> > also want to ensure that the performance of my code is competitive with the
> > alternatives. Lacking any hardware more powerful than a MacBook Air, I am
> > setting this up with Amazon EC2. To simplify environment setup, I'll be using
> > Chef. I've already set up a Chef repository with configuration logic, and some
> > rudimentary code to streamline launching Ubuntu machines on EC2 and
> > bootstrapping a Chef environment. To save money, I plan to make use of EC2
> > Spot Instances, which are perfect for instances that only need to run for a
> > few hours for batch tasks.
> >
> > Clayton Wheeler
> > cswh at umich.edu

From mictadlo at gmail.com Fri May 25 06:49:13 2012
From: mictadlo at gmail.com (Mic)
Date: Fri, 25 May 2012 16:49:13 +1000
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To:
References: <4F91E4CF.8040602@med.nyu.edu> <4F9AFA1F.6030103@med.nyu.edu>
Message-ID:

I think Picard tools does parallel compression/decompression of BGZF.

Cheers,
Mic

From cswh at umich.edu Fri May 25 20:42:13 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Fri, 25 May 2012 16:42:13 -0400
Subject: [BioRuby] New blog post on this week's work
Message-ID: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>

Hi all,

I've written a new blog post on the work I did on my MAF parser this week:

http://csw.github.com/bioruby-maf/blog/2012/05/25/first_milestone/

It covers parser implementation and performance issues, BDD, and tools.

Clayton Wheeler
cswh at umich.edu

From lomereiter at googlemail.com Sun May 27 18:27:43 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Sun, 27 May 2012 22:27:43 +0400
Subject: [BioRuby] [GSoC] weekly report #2
Message-ID:

Hi all,

I wrote a blog post about the past week:
http://lomereiter.wordpress.com/2012/05/27/gsoc-weekly-report-2/

Topics are:

1) I have a quite good validation module for BAM now. More kinds of checks can be added, just request them :)

2) Also, I started to implement random access via the BAI file, just because I have mostly finished what I planned for the first two weeks, and random access seems to be one of the most important things.

Also, it's not mentioned in the blog, but I started to work on a BGZF gem, as Pjotr suggested to me. I'll try to document it and publish the first version next week. Currently I am writing it in pure Ruby.
From marian.povolny at gmail.com Sun May 27 19:21:48 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sun, 27 May 2012 21:21:48 +0200
Subject: [BioRuby] GSoC weekly status report No.1.9
Message-ID:

http://blog.mpthecoder.com/post/23877896288/gsoc-weekly-status-report-no-1-9

This is the final post in the 1.x series, I promise. The last week was spent adding support for parsing lines into records. It was a lot of work, and when I read the comments from my mentor, I wasn't happy. But I agree with him: I did make it more complicated than it had to be (the C API, for example), I should spend some time polishing and refactoring the D side, and my cucumber features should be split into more features. So that's the rough plan for the next week.

--
Marjan

From bonnal at ingm.org Mon May 28 08:50:19 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 28 May 2012 10:50:19 +0200
Subject: [BioRuby] DevTools
In-Reply-To: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>
Message-ID:

In case you want to use RedMine I can give you the license for free; any bioruby developer can request it.

From p.j.a.cock at googlemail.com Mon May 28 09:00:30 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 28 May 2012 10:00:30 +0100
Subject: [BioRuby] DevTools
In-Reply-To:
References: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>
Message-ID:

On Mon, May 28, 2012 at 9:50 AM, Raoul Bonnal wrote:
> In case you want to use RedMine I can give you the license for free, any
> bioruby developer can request it.

??? Redmine is licensed under the GPL.

Did you mean admin rights on the OBF RedMine instance, for example to close bug reports?
https://redmine.open-bio.org/projects/bioruby

Peter

From bonnal at ingm.org Mon May 28 09:03:01 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 28 May 2012 11:03:01 +0200
Subject: [BioRuby] DevTools
In-Reply-To:
Message-ID:

Ahhhhhhhhhhh I mean RubyMine
http://www.jetbrains.com/ruby/
sorry

On 28/05/12 11.00, "Peter Cock" wrote:

> On Mon, May 28, 2012 at 9:50 AM, Raoul Bonnal wrote:
>> In case you want to use RedMine I can give you the license for free, any
>> bioruby developer can request it.
>
> ??? Redmine is licensed under the GPL.
>
> Did you mean admin rights on the OBF RedMine instance, for
> example to close bug reports?
> https://redmine.open-bio.org/projects/bioruby
>
> Peter

From francesco.strozzi at gmail.com Thu May 31 09:11:25 2012
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Thu, 31 May 2012 11:11:25 +0200
Subject: [BioRuby] EU Codefest 2012 Announcement
Message-ID:

The Open Bioinformatics Foundation (OBF) EU-CodeFest will be held at Parco Tecnologico Padano (PTP), Lodi, Italy, on the 19th-20th of July. The CodeFest is a small, focused event under the auspices of the Open Bioinformatics Foundation, and is a sister event of BOSC2012, being held in California, USA, this year.

Three main topics will be worked on during the CodeFest:

- NGS and high performance parsers for OpenBio projects.
- RDF and semantic web for bioinformatics.
- Bioinformatics pipelines definition, execution and distribution.

The number of places is limited to a maximum of 30 participants, on a first come, first served basis. Undergraduate and PhD students are welcome to participate.

The cost of the event is EUR 100 per person, which also includes lunches, coffee breaks and the social dinner on the 19th of July. For students only, we can sponsor a limited number of attendees, who will not pay the registration fee.
Those students wishing to participate in the event for free will be asked to submit their qualifications and experience in software development. The organizing committee will review students' applications before final acceptance.

Talks and abstracts may be presented during the CodeFest in sessions of 10 minutes plus questions. Coding activities will continue during the talks.

The city of Lodi is very close to Milano and has good hotel facilities. The connections by air are excellent, via Milano Malpensa, Milano Linate and Bergamo Orio al Serio airports.

Please register soon, as places may run out quickly, using the form at this page: http://tecnoparco.org/codefest

--
Francesco