From marian.povolny at gmail.com  Sat May  5 09:07:30 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sat, 5 May 2012 15:07:30 +0200
Subject: [BioRuby] GSoC weekly status report No.1
Message-ID: <CADKP5C=v+GH4XsWZmhDz3feDtvkyy6KTkA1-bOsAtGBdbnNNzw@mail.gmail.com>

Hello all,

It might be a little early, but there has been so much going on in the last
10 days since the results of GSoC were published...

http://blog.mpthecoder.com/post/22380853664/gsoc-weekly-status-report-no-1

A short summary:

It has been 10 days since the GSoC results were published, and a lot has
happened since then. I got to know the other students and mentors in a
longish meeting on Google hangout, I got into a discussion with my mentor
on IRC in which we didn't agree about the parallelization strategy for the
parser (experiments will show who's right) and my inbox is full of mails
from my mentor and other students, in which we exchanged loads of
interesting ideas. Also, I fixed a bug in the biogems.info website, which was
stopping Pjotr from updating the website with new information about biogems.

There is now a GitHub repository for my project:

https://github.com/mamarjan/bioruby-hpc-gff3

The work for the first week of coding is halfway done too.

There seems to be huge interest in a GFF3 parser with more features, like
indexing, random access and writing output, and also support for linking
into trees of features that are not located close to each other in the
file. A fast sequential parser could be used to generate indexes, and the
lower-level parts can be used to reorder the file for faster future usage.
Based on that, I think this project is a good start.
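
To make the indexing idea concrete, here is a minimal Ruby sketch. It is not
taken from the bioruby-hpc-gff3 code: a sequential pass records the byte
offset of every feature line, keyed by its ID attribute, and a lookup then
seeks straight to that line. The method names build_gff3_index and
fetch_feature and the sample records are made up for illustration.

```ruby
require 'stringio'

# Hypothetical helper: one sequential pass over GFF3 data that records the
# byte offset of each feature line, keyed by the value of its ID attribute.
def build_gff3_index(io)
  index = {}
  io.rewind
  until io.eof?
    offset = io.pos
    line = io.readline.chomp
    next if line.empty? || line.start_with?('#') # skip blanks, directives, comments
    attributes = line.split("\t")[8].to_s        # column 9 holds key=value pairs
    id = attributes.split(';')
                   .map { |pair| pair.split('=', 2) }
                   .find { |key, _value| key == 'ID' }
    index[id[1]] = offset if id
  end
  index
end

# Hypothetical helper: random access to one feature through the index.
def fetch_feature(io, index, id)
  offset = index[id]
  return nil if offset.nil?
  io.seek(offset)
  io.readline.chomp.split("\t")
end

# Made-up sample records, not from any real annotation file.
sample = StringIO.new(
  "##gff-version 3\n" \
  "chr1\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene00001;Name=EDEN\n" \
  "chr1\t.\tmRNA\t1050\t9000\t.\t+\t.\tID=mRNA00001;Parent=gene00001\n"
)

index = build_gff3_index(sample)
puts fetch_feature(sample, index, 'mRNA00001').inspect
```

The same offsets could be written to a side file so that later runs skip the
sequential pass; following Parent attributes through the index would be one
way to link features that sit far apart in the file.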

*If you're using the GFF3/GTF file formats in your research, I would like
to ask you to send me example files and descriptions of how your
applications use the data. This way I'll be able to test the parser
against your files and optimize it for your applications. Currently I have
GFF files from Ensembl and Wormbase, and Pjotr pointed me to the genome
browser web application at wormbase.org.*

--
Marjan


From lomereiter at googlemail.com  Sun May  6 15:56:50 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Sun, 6 May 2012 23:56:50 +0400
Subject: [BioRuby] [GSoC][BAM] Weekly report No. 0
Message-ID: <CAE8u=e6vV3Ost-gsWHxLjLxbzg4OoktJuRD2U7Y_2bdMuFFhrw@mail.gmail.com>

Hi all,

I wrote a few words about what I did last week:
http://lomereiter.wordpress.com/2012/05/06/gsoc-weekly-report-0/

Summary:

The code is available on GitHub: https://github.com/lomereiter/BAMread/
I have already started writing the code planned for the first week, so as to
have more time in June for exam preparation.
Opening a BAM file and parsing the SAM header both work and are available
from Ruby; now I need to write some tests and documentation. Also, I described
some compile-time metaprogramming tricks in D which I use to reduce
duplication in the code.
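
For readers unfamiliar with the header format, the following is a small Ruby
sketch of parsing a SAM header in its text form. It is not the D code from
BAMread, and the method name parse_sam_header and the sample header are made
up. It relies only on the layout of header lines: an '@' followed by a
two-letter record type, then tab-separated TAG:VALUE fields.

```ruby
# Hypothetical helper: turn SAM header text into { record type => [tag hashes] }.
def parse_sam_header(text)
  header = Hash.new { |hash, key| hash[key] = [] }
  text.each_line do |line|
    line = line.chomp
    break unless line.start_with?('@') # header lines come first in SAM text
    type, *fields = line.split("\t")
    record = type[1..-1]
    if record == 'CO'
      header[record] << fields.join("\t") # comment lines are free text
    else
      pairs = fields.map { |field| field.split(':', 2) }
                    .select { |pair| pair.size == 2 }
      header[record] << pairs.to_h
    end
  end
  header
end

# Made-up header for illustration.
sample = "@HD\tVN:1.0\tSO:coordinate\n" \
         "@SQ\tSN:chr1\tLN:1000\n" \
         "@SQ\tSN:chr2\tLN:2000\n"

header = parse_sam_header(sample)
puts header['SQ'].map { |sq| sq['SN'] }.inspect
```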


I'd be grateful for some small BAM files, 1-50 kilobytes in size, with
non-empty headers, for testing purposes.


--
Artem

From bonnal at ingm.org  Mon May  7 03:08:53 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 07 May 2012 09:08:53 +0200
Subject: [BioRuby] [GSoC] BioRuby wiki
In-Reply-To: <CAE8u=e6vV3Ost-gsWHxLjLxbzg4OoktJuRD2U7Y_2bdMuFFhrw@mail.gmail.com>
Message-ID: <CBCD41A5.963E%bonnal@ingm.org>

Dear All,
The BioRuby wiki is up to date with the accepted projects. I have just created
a new page for each accepted project. Are we going to keep the wiki up to
date with results and summaries of blog posts, or what?

--
Ra


From p.j.a.cock at googlemail.com  Mon May  7 03:31:09 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 7 May 2012 08:31:09 +0100
Subject: [BioRuby] [GSoC] BioRuby wiki
In-Reply-To: <CBCD41A5.963E%bonnal@ingm.org>
References: <CAE8u=e6vV3Ost-gsWHxLjLxbzg4OoktJuRD2U7Y_2bdMuFFhrw@mail.gmail.com>
	<CBCD41A5.963E%bonnal@ingm.org>
Message-ID: <CAKVJ-_4QNw3r3w+=OVNrhyUSkjXSjSuX9hzM4iM41ShX2+EBtg@mail.gmail.com>

On Monday, May 7, 2012, Raoul Bonnal wrote:

> Dear All,
> BioRuby wiki is up to date with the accepted projects. I created new pages
> for each accepted project ( just created ). Are we going to keep it up to
> date with results and summarizing blog posts or what ?
>
>
Blog posts (sent to the mailing list too) for weekly updates,
but more static wiki page for summary? You can link to the
blog posts from the wiki too.

Peter

From pjotr.public14 at thebird.nl  Mon May  7 03:49:09 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 7 May 2012 09:49:09 +0200
Subject: [BioRuby] [GSoC] BioRuby wiki
In-Reply-To: <CBCD495C.9641%bonnal@ingm.org>
References: <CAKVJ-_4QNw3r3w+=OVNrhyUSkjXSjSuX9hzM4iM41ShX2+EBtg@mail.gmail.com>
	<CBCD495C.9641%bonnal@ingm.org>
Message-ID: <20120507074909.GB30679@thebird.nl>

I was thinking of adding news items to biogems.info, and its RSS feed.
That gets updated a few times a day. Anyone interested in helping
out? Should be straightforward:

- Add YAML ./etc/blogs.yaml with links to blog RSS feeds
- Write a script to fetch these and merge them with the RSS for biogems
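
A rough Ruby sketch of these two steps, under stated assumptions: the YAML
keys ('blogs', 'name', 'rss') are invented here and are not an existing
biogems.info format, and the download/parse step is replaced by a stub so the
example runs without network access. Only the merge-and-sort logic is shown.

```ruby
require 'yaml'
require 'time'

# Hypothetical contents for ./etc/blogs.yaml; the keys 'blogs', 'name' and
# 'rss' are assumptions, not an existing biogems.info format.
def sample_blogs_yaml
  <<~YAML
    blogs:
      - name: Example blog A
        rss: http://example.org/a/rss.xml
      - name: Example blog B
        rss: http://example.org/b/rss.xml
  YAML
end

# Merge items from all configured feeds, newest first. `fetcher` is any
# callable that takes a feed URL and returns an array of item hashes with
# 'title' and 'date' keys (dates in the RFC 822 style used by RSS pubDate);
# a real one would download and parse the feed.
def merge_feeds(config_text, fetcher)
  config = YAML.safe_load(config_text)
  items = config.fetch('blogs').flat_map do |blog|
    fetcher.call(blog['rss']).map { |item| item.merge('source' => blog['name']) }
  end
  items.sort_by { |item| Time.rfc2822(item['date']) }.reverse
end

# Stub fetcher with made-up items, so the sketch runs offline.
stub_fetcher = lambda do |url|
  [{ 'title' => "Post from #{url}", 'date' => 'Sat, 05 May 2012 15:07:30 +0200' }]
end

merge_feeds(sample_blogs_yaml, stub_fetcher).each do |item|
  puts "#{item['date']}  #{item['source']}: #{item['title']}"
end
```

Passing the fetcher in as an argument keeps the merge step testable on its
own; the biogems items could be fed through the same path as one more feed.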

That would give us a new RSS feed. Useful. Next step:

- Add news column on main http://biogems.info/ page
- Fill it with same RSS items

Later I would also like to add a list of active pushes to projects
(github style). But that is later.

Pj.

On Mon, May 07, 2012 at 09:41:48AM +0200, Raoul Bonnal wrote:
> Fine.
>
> On 07/05/12 09.31, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:
>
> > On Monday, May 7, 2012, Raoul Bonnal wrote:
> >
> > > Dear All,
> > > BioRuby wiki is up to date with the accepted projects. I created new pages
> > > for each accepted project ( just created ). Are we going to keep it up to
> > > date with results and summarizing blog posts or what ?
> >
> > Blog posts (sent to the mailing list too) for weekly updates,
> > but more static wiki page for summary? You can link to the
> > blog posts from the wiki too.
> >
> > Peter

From bonnal at ingm.org  Mon May  7 03:41:48 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 07 May 2012 09:41:48 +0200
Subject: [BioRuby] [GSoC] BioRuby wiki
In-Reply-To: <CAKVJ-_4QNw3r3w+=OVNrhyUSkjXSjSuX9hzM4iM41ShX2+EBtg@mail.gmail.com>
Message-ID: <CBCD495C.9641%bonnal@ingm.org>

Fine.


On 07/05/12 09.31, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:

> 
> 
> On Monday, May 7, 2012, Raoul Bonnal  wrote:
>> Dear All,
>> BioRuby wiki is up to date with the accepted projects. I created new pages
>> for each accepted project ( just created ). Are we going to keep it up to
>> date with results and summarizing blog posts or what ?
>> 
> 
> Blog posts (sent to the mailing list too) for weekly updates,
> but more static wiki page for summary? You can link to the
> blog posts from the wiki too.
> 
> 
> 
> Peter
> 


From john.woods at marcottelab.org  Tue May  8 18:08:47 2012
From: john.woods at marcottelab.org (John Woods)
Date: Tue, 8 May 2012 17:08:47 -0500
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
Message-ID: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>

Hi BioRuby folks,

I'm pleased to announce that we've opened applications for our first ever
Summer of Code, generously sponsored by Brighter Planet.

http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/

Please note that we recommend you have your application in by *Monday*,
which is really soon.

Help us out by sharing this around on various social media. Here are links
to existing tweets/posts/etc that you can retweet/share/etc.

Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528
Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH
Reddit:
http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/

Cheers,
John Woods
Director, SciRuby Project

From pjotr.public14 at thebird.nl  Wed May  9 02:43:08 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 08:43:08 +0200
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>
References: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>
Message-ID: <20120509064308.GA24946@thebird.nl>

Hi John,

That is awesome news! Google has set the right trend with these summer
of code initiatives. The OBF has quite some experience with mentoring
students, see

  http://www.open-bio.org/wiki/Gsoc#Student_Progress_Reports

and one thing we think is very important is weekly meetings
between students (and mentors), and weekly blogs by the students.
These will be captured on http://biogems.info/.

It would be great if your students participated in some of our meetings,
so we can exchange ideas on Ruby and performance (we use extensions
and parallel computing). I would also like to invite your programme
to blog, so that we can track those blogs.

Pj.

On Tue, May 08, 2012 at 05:08:47PM -0500, John Woods wrote:
> Hi BioRuby folks,
> 
> I'm pleased to announce that we've opened applications for our first ever
> Summer of Code, generously sponsored by Brighter Planet.
> 
> http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/
> 
> Please note that we recommend you have your application in by *Monday*,
> which is really soon.
> 
> Help us out by sharing this around on various social media. Here are links
> to existing tweets/posts/etc that you can retweet/share/etc.
> 
> Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528
> Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH
> Reddit:
> http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/
> 
> Cheers,
> John Woods
> Director, SciRuby Project
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
> 

From pjotr.public14 at thebird.nl  Wed May  9 13:14:49 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 19:14:49 +0200
Subject: [BioRuby] BioRuby on Travis-ci!
Message-ID: <20120509171449.GA29529@thebird.nl>

Hi,

Some have maybe noticed Goto-san put BioRuby on travis-ci now! See

  http://travis-ci.org/#!/bioruby/bioruby

You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
failure.  JRuby fails on a handful of tests and the crash on Rubinius
looks spectacular. 

Note the clever .travis.yml file.

We invite you to submit fixes to these tests. Especially our GSoC
students, and other students on this ML, can get honors by providing
a few fixes, and/or sending in issues to the JRuby/Rubinius projects
:). Note both JRuby and Rubinius come with very interesting debugger
support. Worth a shot. Your chance to show your Ruby muscles!

Pj.

From p.j.a.cock at googlemail.com  Wed May  9 13:26:31 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 May 2012 18:26:31 +0100
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509171449.GA29529@thebird.nl>
References: <20120509171449.GA29529@thebird.nl>
Message-ID: <CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>

On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> Hi,
>
> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See
>
>  http://travis-ci.org/#!/bioruby/bioruby
>
> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
> failure.  JRuby fails on a handful of tests and the crash on Rubinius
> looks spectacular.
>
> Note the clever .travis.yml file.
>
> We invite you to submit fixes to these tests. Especially our GSoC
> students, and other students on this ML, can get honors by providing
> a few fixes, and/or sending in issues to the JRuby/Rubinius projects
> :). Note both JRuby and Rubinius come with very interesting debugger
> support. Worth a shot. Your chance to show your Ruby muscles!
>
> Pj.

And if you can fix the different bug identified via the BuildBot too, even
better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html

Starting from a clean nightly test result makes spotting regressions
much easier ;)

Peter


From pjotr.public14 at thebird.nl  Wed May  9 13:32:39 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 19:32:39 +0200
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
Message-ID: <20120509173239.GA30220@thebird.nl>

Right, the link is here

  http://testing.open-bio.org/bioruby/one_line_per_build

(I need to incorporate this also in http://biogems.info/)

On Wed, May 09, 2012 at 06:26:31PM +0100, Peter Cock wrote:
> And if you can fix the different bug identified via the BuildBot too, even
> better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html
> 
> Starting from a clean nightly test result makes spotting regressions
> much easier ;)
> 
> Peter
> 

From cjfields at illinois.edu  Wed May  9 13:29:49 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 9 May 2012 17:29:49 +0000
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
Message-ID: <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu>

On May 9, 2012, at 12:26 PM, Peter Cock wrote:

> On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
>> Hi,
>> 
>> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See
>> 
>>  http://travis-ci.org/#!/bioruby/bioruby
>> 
>> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
>> failure.  JRuby fails on a handful of tests and the crash on Rubinius
>> looks spectacular.
>> 
>> Note the clever .travis.yml file.
>> 
>> We invite you to submit fixes to these tests. Especially our GSoC
>> students, and other students on this ML, can get honors by providing
>> a few fixes, and/or sending in issues to the JRuby/Rubinius projects
>> :). Note both JRuby and Rubinius come with very interesting debugger
>> support. Worth a shot. Your chance to show your Ruby muscles!
>> 
>> Pj.
> 
> And if you can fix the different bug identified via the BuildBot too, even
> better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html
> 
> Starting from a clean nightly test result makes spotting regressions
> much easier ;)
> 
> Peter

*sigh*

Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P

chris


From pjotr.public14 at thebird.nl  Wed May  9 13:35:17 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 19:35:17 +0200
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
	<31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu>
Message-ID: <20120509173517.GB30220@thebird.nl>

On Wed, May 09, 2012 at 05:29:49PM +0000, Fields, Christopher J wrote:
> *sigh*
> 
> Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P

Peter knows someone in Scotland who can help! Now I got to see a man
about a sheep...

Pj.

From p.j.a.cock at googlemail.com  Wed May  9 13:49:59 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 May 2012 18:49:59 +0100
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509171449.GA29529@thebird.nl>
References: <20120509171449.GA29529@thebird.nl>
Message-ID: <CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>

On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> Hi,
>
> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See
>
>  http://travis-ci.org/#!/bioruby/bioruby
>
> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
> failure.  JRuby fails on a handful of tests and the crash on Rubinius
> looks spectacular.
>
> Note the clever .travis.yml file.
>
> We invite you to submit fixes to these tests. Especially our GSoC
> students, and other students on this ML, can get honors by providing
> a few fixes, and/or sending in issues to the JRuby/Rubinius projects
> :). Note both JRuby and Rubinius come with very interesting debugger
> support. Worth a shot. Your chance to show your Ruby muscles!
>
> Pj.

I see Travis supports Perl, Python and Java too (amongst others)
so could be used by the other Bio* projects too for nightly testing
(on a 32bit Debian Linux platform).

How did you do this in Travis regarding the GitHub authorization?
I don't see any way when logged in as me (peterjc) to allow Travis
access to the repositories of GitHub organizations I have access
to (like Biopython).

Thanks,

Peter


From p.j.a.cock at googlemail.com  Wed May  9 13:56:17 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 May 2012 18:56:17 +0100
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
Message-ID: <CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>

On Wed, May 9, 2012 at 6:49 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
>> Hi,
>>
>> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See
>>
>>  http://travis-ci.org/#!/bioruby/bioruby
>>
>> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
>> failure.  JRuby fails on a handful of tests and the crash on Rubinius
>> looks spectacular.
>>
>> Note the clever .travis.yml file.
>>
>> We invite you to submit fixes to these tests. Especially our GSoC
>> students, and other students on this ML, can get honors by providing
>> a few fixes, and/or sending in issues to the JRuby/Rubinius projects
>> :). Note both JRuby and Rubinius come with very interesting debugger
>> support. Worth a shot. Your chance to show your Ruby muscles!
>>
>> Pj.
>
> I see Travis supports Perl, Python and Java too (amongst others)
> so could be used by the other Bio* projects too for nightly testing
> (on a 32bit Debian Linux platform).
>
> How did you do this in Travis regarding the GitHub authorization?
> I don't see any way when logged in as me (peterjc) to allow Travis
> access to the repositories of GitHub organizations I have access
> to (like Biopython).

I found there is an open issue on this missing feature:
https://github.com/travis-ci/travis-ci/issues/242

A comment there links to a manual workaround:
http://about.travis-ci.org/docs/user/how-to-setup-and-trigger-the-hook-manually/

I'm guessing that's how you did it for BioRuby?

Thanks,

Peter


From mail at michaelbarton.me.uk  Wed May  9 14:24:54 2012
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Wed, 9 May 2012 14:24:54 -0400
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509171449.GA29529@thebird.nl>
References: <20120509171449.GA29529@thebird.nl>
Message-ID: <20120509182454.GA4429@bartonh-mbp-01.uanet.edu>

Travis CI is also rolling out a new feature in which pull
requests on GitHub are automatically tested using the specs
in the upstream merge. This can make it much easier to spot
broken builds (and vice versa) before they are merged into
the blessed branch.

http://about.travis-ci.org/blog/announcing-pull-request-support/

On Wed, May 09, 2012 at 07:14:49PM +0200, Pjotr Prins wrote:

> Hi,
>
> Some have maybe noticed Goto-san put BioRuby on travis-ci
> now! See
>
>   http://travis-ci.org/#!/bioruby/bioruby
>
> You can see MRI 1.9.x passes, and 1.8.7 has only a small
> unit test failure. JRuby fails on a handful of tests and
> the crash on Rubinius looks spectacular.
>
> Note the clever .travis.yml file.
>
> We invite you to submit fixes to these tests. Especially
> our GSoC students, and other students on this ML, can get
> honors by providing a few fixes, and/or sending in issues
> to the JRuby/Rubinius projects :). Note both JRuby and
> Rubinius come with very interesting debugger support.
> Worth a shot. Your chance to show your Ruby muscles!
>
> Pj.

From john.woods at marcottelab.org  Wed May  9 15:25:38 2012
From: john.woods at marcottelab.org (John Woods)
Date: Wed, 9 May 2012 14:25:38 -0500
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To: <20120509064308.GA24946@thebird.nl>
References: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>
	<20120509064308.GA24946@thebird.nl>
Message-ID: <CAPkCRRs8tqdFXu8ky+hRjwqhy=9u41rmjg9vuOKKQtLFgx20bQ@mail.gmail.com>

Hi Pjotr,

I'll discuss having our fellow participate in some of your meetings with
the SciRuby team. I think the weekly meetings suggestion is a very good
one, and we definitely do pay attention to how BioRuby handles its GSoC
fellows.

We do blog periodically. You can find it here: http://sciruby.com/blog/
I'll make sure that blogging is also a requirement for our fellow.

Cheers,
John


On Wed, May 9, 2012 at 1:43 AM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:

> Hi John,
>
> That is awesome news! Google has set a right trend with these summer
> of code initiatives. The OBF has quite some experience with mentoring
> students, see
>
>  http://www.open-bio.org/wiki/Gsoc#Student_Progress_Reports
>
> and one thing we think is very important is weekly meetings
> between students (and mentors), and weekly blogs by the students.
> These will be captured on http://biogems.info/.
>
> It would be great if your students participated in some of our meetings,
> so we can exchange ideas on Ruby and performance (we use extensions
> and parallel computing). I would also like to invite your programme
> to blog, so that we can track those blogs.
>
> Pj.
>
> On Tue, May 08, 2012 at 05:08:47PM -0500, John Woods wrote:
> > Hi BioRuby folks,
> >
> > I'm pleased to announce that we've opened applications for our first ever
> > Summer of Code, generously sponsored by Brighter Planet.
> >
> > http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/
> >
> > Please note that we recommend you have your application in by *Monday*,
> > which is really soon.
> >
> > Help us out by sharing this around on various social media. Here are links
> > to existing tweets/posts/etc that you can retweet/share/etc.
> >
> > Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528
> > Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH
> > Reddit:
> > http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/
> >
> > Cheers,
> > John Woods
> > Director, SciRuby Project
>

From p.j.a.cock at googlemail.com  Wed May  9 13:44:37 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 May 2012 18:44:37 +0100
Subject: [BioRuby] BioPerl BuildBot
Message-ID: <CAKVJ-_50-xSd2768Q9snegfs-hZ6Wb3iooYpSymxMAB6k5LN9A@mail.gmail.com>

Hi all,

I've retitled this and sent it to the BioPerl list, continuing from
this thread on the BioRuby list:

http://lists.open-bio.org/pipermail/bioruby/2012-May/002247.html

On Wed, May 9, 2012 at 6:35 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> On Wed, May 09, 2012 at 05:29:49PM +0000, Fields, Christopher J wrote:
>> *sigh*
>>
>> Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P
>
> Peter knows someone in Scotland who can help! Now I got to see a man
> about a sheep...
>
> Pj.

You mean Dolly The Sheep? ;)

Tiago or I can assist on the BuildBot server side for BioPerl - in fact Tiago
had already made a start (CC'd).

We'll need help from a BioPerl developer with a spare machine or two
to use as a buildslave (and I can probably borrow some of my employer's
machines, which already run nightly tests) to help with how we set up the
BuildSlaves - essentially how to get BioPerl and relevant dependencies
installed, and then what needs to be done from a fresh git checkout to build
and run the tests. Tiago has got this currently:

perl Build.PL --accepts
./Build test

Once that is working on a single buildslave we can talk about different
targets which is where BuildBot is really helpful (e.g. versions of Perl,
different OS, etc)

Peter

From pjotr.public14 at thebird.nl  Wed May  9 17:31:58 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 23:31:58 +0200
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
	<CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
Message-ID: <20120509213158.GB31329@thebird.nl>

On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote:
> I'm guessing that's how you did it for BioRuby?

I think I added it before we were a github organization. Or we were
just lucky :)

Pj.

From pjotr.public14 at thebird.nl  Thu May 10 03:27:47 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 10 May 2012 09:27:47 +0200
Subject: [BioRuby] BioRuby rss news feed
Message-ID: <20120510072747.GA4587@thebird.nl>

Marjan and I have revamped the BioRuby/biogems news feed. See

  http://www.biogems.info/rss.xml

Health warning: Includes opinionated and caffeinated Google Summer of Code
blog entries :)

Pj.


From p.j.a.cock at googlemail.com  Thu May 10 06:31:07 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 10 May 2012 11:31:07 +0100
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509213158.GB31329@thebird.nl>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
	<CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
	<20120509213158.GB31329@thebird.nl>
Message-ID: <CAKVJ-_7FBtKJee57==o5S5RYjr16CQStUgK2w0qVmrrvpLOAgg@mail.gmail.com>

On Wed, May 9, 2012 at 10:31 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote:
>> I'm guessing that's how you did it for BioRuby?
>
> I think I added it before we were a github organization. Or we were
> just lucky :)
>
> Pj.

I'd guess the former - I've now got a personal Travis account (via my
personal GitHub account), but for now I can't seem to create a Biopython
Travis account via the Biopython organization account on GitHub.

Nevertheless, I could get the basic Biopython unit tests running on
Travis last night (including Python 3), although this needs more
work installing dependencies to get the full test suite coverage:
http://travis-ci.org/#!/peterjc/biopython

Peter

From pjotr.public14 at thebird.nl  Thu May 10 12:40:02 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 10 May 2012 18:40:02 +0200
Subject: [BioRuby] BioRuby rss news feed
In-Reply-To: <20120510072747.GA4587@thebird.nl>
References: <20120510072747.GA4587@thebird.nl>
Message-ID: <20120510164002.GA9030@thebird.nl>

http://www.biogems.info/ now also shows news items and blog entries on the
right. If you want your blog on Bio/Ruby added, just tell us :)

Pj.

On Thu, May 10, 2012 at 09:27:47AM +0200, Pjotr Prins wrote:
> Marjan and I have revamped the BioRuby/biogems news feed. See
> 
>   http://www.biogems.info/rss.xml
> 
> Health warning: Includes opinionated and caffeinated Google Summer of Code
> blog entries :)
> 
> Pj.
> 

From georgkam at gmail.com  Thu May 10 13:34:14 2012
From: georgkam at gmail.com (George Githinji)
Date: Thu, 10 May 2012 20:34:14 +0300
Subject: [BioRuby] BioRuby rss news feed
In-Reply-To: <20120510072747.GA4587@thebird.nl>
References: <20120510072747.GA4587@thebird.nl>
Message-ID: <CALf85+WGz=xA7ROXrA4JO_VFgJ2LM_9dipH36e=ia5qA3LfCcg@mail.gmail.com>

Thanks for all the hard work!

On Thu, May 10, 2012 at 10:27 AM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> Marjan and I have revamped the BioRuby/biogems news feed. See
>
>  http://www.biogems.info/rss.xml
>
> Health warning: Includes opinionated and caffeinated Google Summer of Code
> blog entries :)
>
> Pj.
>


-- 
---------------
Sincerely
George
Skype: george_g2
Blog: http://biorelated.wordpress.com/
Twitter: http://twitter.com/#!/george_l


From pjotr.public14 at thebird.nl  Fri May 11 05:06:48 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 11 May 2012 11:06:48 +0200
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To: <CAPkCRRs8tqdFXu8ky+hRjwqhy=9u41rmjg9vuOKKQtLFgx20bQ@mail.gmail.com>
References: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>
	<20120509064308.GA24946@thebird.nl>
	<CAPkCRRs8tqdFXu8ky+hRjwqhy=9u41rmjg9vuOKKQtLFgx20bQ@mail.gmail.com>
Message-ID: <20120511090648.GA15897@thebird.nl>

We can now list non-biogem rubygems.

SciRuby is listed on http://www.biogems.info/rubygems.html

Pj.


From bonnal at ingm.org  Fri May 11 06:58:44 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Fri, 11 May 2012 12:58:44 +0200
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To: <20120511090648.GA15897@thebird.nl>
Message-ID: <CBD2BD84.97F0%bonnal@ingm.org>

+1 :)


On 11/05/12 11.06, "Pjotr Prins" <pjotr.public14 at thebird.nl> wrote:

> We can now list non-biogem rubygems.
> 
> SciRuby is listed on http://www.biogems.info/rubygems.html
> 
> Pj.
> 


From throwern at msu.edu  Fri May 11 10:20:19 2012
From: throwern at msu.edu (Nick Thrower)
Date: Fri, 11 May 2012 10:20:19 -0400
Subject: [BioRuby] BioTabix gem
Message-ID: <A824F904-5C6D-4BA5-88CF-A5F2CDFA6643@msu.edu>

Hello all,

I recently released a bio-tabix gem.

It is available on rubygems: 
https://rubygems.org/gems/bio-tabix

and Github: 
https://github.com/throwern/bio-tabix

The gem binds ruby to the samtools tabix utility for indexing and parsing regions of tab delimited files. http://samtools.sourceforge.net/tabix.shtml

Feel free to contact me with any comments or suggestions. 

Best,
Nick

-- 
Nick Thrower
Information Technology Professional
Michigan State University
Great Lakes Bioenergy Research Center


From pjotr.public14 at thebird.nl  Fri May 11 11:43:49 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 11 May 2012 17:43:49 +0200
Subject: [BioRuby] BioTabix gem
In-Reply-To: <A824F904-5C6D-4BA5-88CF-A5F2CDFA6643@msu.edu>
References: <A824F904-5C6D-4BA5-88CF-A5F2CDFA6643@msu.edu>
Message-ID: <20120511154349.GB17747@thebird.nl>

Super :)

On Fri, May 11, 2012 at 10:20:19AM -0400, Nick Thrower wrote:
> Hello all,
> 
> I recently released a bio-tabix gem.
> 
> It is available on rubygems: 
> https://rubygems.org/gems/bio-tabix
> 
> and Github: 
> https://github.com/throwern/bio-tabix
> 
> The gem binds ruby to the samtools tabix utility for indexing and parsing regions of tab delimited files. http://samtools.sourceforge.net/tabix.shtml
> 
> Feel free to contact me with any comments or suggestions. 
> 
> Best,
> Nick
> 
> -- 
> Nick Thrower
> Information Technology Professional
> Michigan State University
> Great Lakes Bioenergy Research Center
> 
> 
> 
> 

From cswh at umich.edu  Fri May 11 21:21:02 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Fri, 11 May 2012 21:21:02 -0400
Subject: [BioRuby] Submitted JRuby bug and RubySpec addition for unit test
	failures under JRuby
Message-ID: <57CFFD67-58BC-41AD-87E8-9C70A0A7AC97@umich.edu>

Hi all,

I've noticed that many of the BioRuby unit tests are failing under JRuby, locally and on travis-ci, with NameErrors for 'uninitialized constant' conditions. Many of these tests work when running just a single test script in isolation, but fail when the full suite is run with 'rake test'. 

I've identified the root cause of this problem, which appears to be a JRuby bug triggered when an autoload entry is defined, the file which would have been autoloaded is explicitly required, and the autoload entry is defined again. Subsequent attempts to access the target of the autoload entry fail with a NameError.

This is an unusual sequence of events, but BioRuby and its test suites contain many 'horizontal' autoload entries between various parts of the source tree. For instance, bio/sequence/common.rb sets up an autoload for Bio::Locations, which I observed causing a problem with subsequent use of Bio::Locations.
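
The sequence described above can be sketched in a few lines of Ruby. This is only an illustration, not the RubySpec that was submitted: the Demo namespace, the Locations class, and the temporary file name are all hypothetical stand-ins for Bio::Locations and its source file.

```ruby
require "tmpdir"

module Demo; end  # hypothetical namespace standing in for Bio

# Returns true if the constant is still reachable after the
# autoload / require / autoload sequence, false on a NameError.
def autoload_sequence_ok?
  Dir.mktmpdir do |dir|
    path = File.join(dir, "demo_locations.rb")
    File.write(path, "module Demo; class Locations; end; end\n")

    Demo.autoload(:Locations, path)  # 1. the autoload entry is defined
    require path                     # 2. the target file is required explicitly
    Demo.autoload(:Locations, path)  # 3. the autoload entry is defined again
    begin
      Demo::Locations.is_a?(Class)   # 4. the constant is accessed
    rescue NameError
      false
    end
  end
end
```

Calling autoload_sequence_ok? under different Ruby implementations is one way to check whether a given runtime is affected.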

I created a minimized RubySpec illustrating the problem, which succeeds under MRI but fails under JRuby, and submitted it:

https://github.com/rubyspec/rubyspec/pull/136

I also filed JRUBY-6658 (http://jira.codehaus.org/browse/JRUBY-6658) for this. If this bug is accepted and fixed, JRuby versions containing the fix should do much better on the test suite.

Without a JRuby fix, it might be possible to work around this by restructuring autoloading in the BioRuby code base to avoid horizontal autoload invocations (that is, autoload declarations not in the parent of the module to be autoloaded), but that could be too invasive to justify.

Clayton Wheeler
cswh at umich.edu


From marian.povolny at gmail.com  Sat May 12 15:46:46 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sat, 12 May 2012 21:46:46 +0200
Subject: [BioRuby] GSoC weekly status report No.1.1
Message-ID: <CADKP5Ck-V_vo9kJ=pQiNzf184S1ocx7Z94hj7-QLPKAXRa8s2A@mail.gmail.com>

Hi all,

Here is my status report for this week:

This year we the GSoC students sure are a very creative group, just look at
our numbering schemes for our status reports for the pre-coding period -
everyone has his own thing going :)

And now back to the GFF3 project. I found a few more sites with big GFF3
files; those will be great for performance testing. Robert Buels also
suggested that I reuse the test suite from Perl's
Bio::GFF3::LowLevel::Parser, and I think that's a great idea. I should
definitely use that for completeness testing, and I will check the test
suites of other GFF3 parsers.

I have also finished the work for the first week. That means basically I'm
already more than two weeks ahead of schedule. The parser is now reading
data on the D side and forwarding that to Ruby line by line. That won't be
faster than reading the file from Ruby, but it's a nice basic case to get
data flowing from D to Ruby.
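
For comparison, the pure-Ruby baseline mentioned above could look roughly like the sketch below. It is not code from the project: the method name and the GFF3_COLUMNS constant are made up for illustration, and the only thing taken as given is that GFF3 feature lines have nine tab-separated columns.

```ruby
require "stringio"

# The nine tab-separated columns of a GFF3 feature line.
GFF3_COLUMNS = %i[seqid source type start end score strand phase attributes].freeze

# Reads an IO line by line and yields one Hash per feature line.
# Blank lines, comments/directives ("#...") and lines that do not
# have exactly nine columns are skipped.
def each_gff3_record(io)
  io.each_line do |line|
    line = line.chomp
    next if line.empty? || line.start_with?("#")
    fields = line.split("\t", -1)
    next unless fields.size == GFF3_COLUMNS.size
    yield GFF3_COLUMNS.zip(fields).to_h
  end
end

# Usage with an in-memory example; a File opened for reading works the same way.
sample = StringIO.new("##gff-version 3\nchr1\t.\tgene\t100\t200\t.\t+\t.\tID=gene1\n")
each_gff3_record(sample) { |rec| puts "#{rec[:type]} at #{rec[:start]}..#{rec[:end]}" }
```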

The rake tasks have been improved too. There are now two tasks for building
the D library, 'compile' and 'compiledebug', and there is the 'spec' task
for running rspec tests and the 'features' task for running cucumber tests. The
'clean' task now deletes object and library files.
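
A Rakefile with tasks of this shape might look like the sketch below. Only the task names come from the report; the source layout, the library path, and the exact dmd command lines are assumptions made for illustration and are not taken from the repository.

```ruby
require "rake"
require "rake/clean"

# Hypothetical layout: D sources under dlib/, shared library under lib/.
D_SOURCES = FileList["dlib/**/*.d"]
D_LIBRARY = "lib/libgff3.so"

# rake/clean provides the 'clean' task; tell it what to delete.
CLEAN.include("**/*.o", D_LIBRARY)

desc "Build the D shared library (optimized)"
task :compile do
  # Assumed flags: optimized release build of a shared library.
  sh "dmd -shared -fPIC -O -release -of#{D_LIBRARY} #{D_SOURCES.join(' ')}"
end

desc "Build the D shared library with debug info"
task :compiledebug do
  # Assumed flags: same build with debug symbols instead of optimization.
  sh "dmd -shared -fPIC -g -of#{D_LIBRARY} #{D_SOURCES.join(' ')}"
end

desc "Run rspec tests"
task :spec do
  sh "rspec"
end

desc "Run cucumber tests"
task :features do
  sh "cucumber"
end
```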

There is also a problem with the D library and the garbage collector. It seems
this is the problem Iain Buclaw (one of the GDC developers) has warned us
about. When using a D shared library, when the GC kicks in for the first
time, it looks as if it collects all the static data, for example the
per-module variables, and pretty much everything else. Even when we register
a chunk of memory allocated with malloc with the GC, it still gets collected.
Or at least that's what it looks like. However, Iain also assured us that
this will be solved by the end of this month or the beginning of the next. My
cucumber and rspec tests still work because they don't require enough
memory for the GC to run, but to be sure that this issue doesn't interfere
with development at this point, I manually disabled the GC on
library initialization. I haven't tried it yet, but from what has been discussed
in the forums, both 32 and 64-bit DLLs on Windows built using DMD work fine.

I also helped Pjotr with getting our blog posts included in the RSS feed on
biogems.info.


That's all for now, you can find this report on my blog too:

http://blog.mpthecoder.com/post/22919943701/gsoc-weekly-status-report-no-1-1

--
Best regards,
Marjan


From lomereiter at googlemail.com  Sun May 13 16:10:45 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Mon, 14 May 2012 00:10:45 +0400
Subject: [BioRuby] [GSoC] Weekly report No 0.5
Message-ID: <CAE8u=e4YqsC9nXdb2w6W-YjqTwVwV+PW=DkQrDDqm=NVDku==w@mail.gmail.com>

Hi all,

this is yet another GSoC report.

During the last week, I mainly concentrated on the D part of the project,
adding functionality to it. I implemented parsing of the whole BAM file :)
Today I wrote a simple utility in
D <https://github.com/lomereiter/BAMread/blob/master/examples/bam2sam.d>,
which uses my library to convert BAM to SAM. It doesn't work with array
tags yet, and is not as fast as samtools, but nevertheless... On a couple of BAM
files from the test/data directory (namely, bins.bam and ex1_header.bam) the
output is identical to that of samtools view (I checked with diff), and
that kinda proves that everything works fine. Speed issues are mainly due
to using the std.variant module for storing tags. It uses runtime reflection,
which is quite slow. Maybe there are some other reasons. Anyway, I'm going
to write my own tagged union type next week; it should improve the
performance quite a bit, and also fix design flaws.

For testing tag parsing, I used the file tags.bam provided to me by Peter Cock.
It contains tests for all types of tags, and my library successfully passes
them. Later I'll experiment with possible speed improvements, and having
unit tests covering the full range of possible tag types is a must.
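
To give an idea of what "all types of tags" covers, here is a rough Ruby sketch of decoding a single BAM tag by dispatching on its one-character type code. It is not the D library's code and says nothing about its design: the constant and method names are made up, and array ("B") and hex ("H") tags are left out to keep it short.

```ruby
# String#unpack directives for the fixed-size BAM tag value types
# (all little-endian), keyed by the one-character type code.
FIXED_TAG_TYPES = {
  "A" => "a",   # printable character
  "c" => "c",   # int8
  "C" => "C",   # uint8
  "s" => "s<",  # int16
  "S" => "S<",  # uint16
  "i" => "l<",  # int32
  "I" => "L<",  # uint32
  "f" => "e"    # 32-bit float
}.freeze

# Decodes one tag from a binary string laid out as
# <2-byte name><1-byte type><value>.
# Raises KeyError for type codes this sketch does not handle.
def parse_tag(bytes)
  name  = bytes[0, 2]
  type  = bytes[2]
  value = bytes[3..-1]
  decoded =
    if type == "Z"
      value.unpack("Z*").first  # NUL-terminated string
    else
      value.unpack(FIXED_TAG_TYPES.fetch(type)).first
    end
  [name, type, decoded]
end
```

A unit test per type code, fed with bytes like those in tags.bam, is then a matter of comparing the decoded value with the expected one.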

Also, I downloaded and compiled gdc from trunk. It provides decent
performance, not worse than dmd, at least. We expect gdc to gain shared
library support in the next two months. Before that happens, we have to use
dmd, although there are some issues with its garbage collector, causing
segfaults. We discussed that with Marjan and Pjotr and decided that the
best option in such circumstances would be to disable the GC during
development; testing the library on small files won't consume much memory anyway.

Another thing I downloaded and compiled is Rubinius. I'm going to
investigate why it hangs on the BioRuby unit tests in 1.9 mode. The other mode,
1.8, seems to work fine except maybe for some very minor bugs.

During the next week, I'm going to learn how to use Cucumber and RSpec, improve
D library performance a little, and start to write Ruby bindings. So it
will be mostly a "Ruby week" ;)


--
Artem


From cswh at umich.edu  Mon May 14 23:36:17 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Mon, 14 May 2012 23:36:17 -0400
Subject: [BioRuby] GSoC week 1 status report
Message-ID: <2D9F6030-8A11-4443-B610-58464F506EE5@umich.edu>

Hi all,

I've put my first GSoC status report on my project blog:

http://csw.github.com/bioruby-maf/blog/2012/05/13/progress/

(The web version of this has 100% more hyperlinks, but here's a plain text version, too.)

This has been my first half-week of work on my Google Summer of Code project, and it's off to an exciting start. The first order of business has been to get my development environment together; since I've been a microbiology student instead of a programmer for the last year, it's taken some work. In that process, I've ended up making a few open source contributions just to get my tools working the way I want. I'm running GNU Emacs 24 and trying to take more advantage of it than I have in the past. I'll have much more to say about this in a future post.

I've also started working on the BioRuby unit test failures under JRuby, as a way of familiarizing myself with the BioRuby code base as well as the community and its development processes. Right now, JRuby in 1.8 mode is showing 6 failures and 126 errors, which is hardly confidence-inspiring for people considering using JRuby with BioRuby. This is too bad, since JRuby has some definite advantages as a Ruby implementation. After looking into these failures, I've broken them down into a few categories:

	* temporary file permissions problems, likely due to some sort of Travis-CI environment issue
	* a bug in JRuby's implementation of Open3.popen3, which I'm working up a bug report for
	* an odd autoload problem, for which I've filed JRUBY-6658 and sent an accompanying RubySpec patch
	* a problem with libxml-jruby, which appears unmaintained, for which I've submitted a BioRuby patch plus JRUBY-6662
	* and a small test case bug relating to floating point handling, which I've submitted a patch for.

Once these are resolved, JRuby should be passing the BioRuby unit tests in 1.8 mode, and closer to passing in 1.9 mode. (There are a few extra failures under 1.9 that I haven't sorted through yet.)

I've also gotten a start on my project itself, creating the bioruby-maf Github repository with a project skeleton and writing my first Cucumber feature for it. This is, in fact, my first Cucumber feature ever. However, I did spend a few cross-country flights reading the RSpec and Cucumber books last week; between that and cribbing from Pjotr's code I feel like I have some idea what I'm doing. Just assembling that feature has been useful, too, since I've had to get several of the existing MAF tools running on my machine. In fact, my test MAF data and the FASTA version of it are courtesy of bx-python, which will be my reference implementation in many respects.

Clayton Wheeler
cswh at umich.edu


From cswh at umich.edu  Tue May 15 13:08:20 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Tue, 15 May 2012 13:08:20 -0400
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
Message-ID: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>

Hi all,

The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K)

There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.

However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?

Clayton Wheeler
cswh at umich.edu


From pjotr.public14 at thebird.nl  Tue May 15 14:54:32 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 15 May 2012 20:54:32 +0200
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
In-Reply-To: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>
References: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>
Message-ID: <20120515185432.GC20185@thebird.nl>

Marvellous work Clayton! My suggestion to BioRuby is to split out
phyloxml and to deprecate the current library module. In the next
release, or after, we should take out that code. I suspect few people
really depend on it, and they can adapt. I am partly responsible for
that dependency, and I think the Travis-ci tests also point out that
the purer Ruby BioRuby is, the better ;). 

Naohisa, what do you say? We should also ask the original author, even
though she has left our little group and now works for google (and I
am claiming Google does not recruit from GSoC :). Diana, maybe you are
reading the ML?

Pj.

On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote:
> Hi all,
> 
> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K)
> 
> There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.
> 
> However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?
> 
> Clayton Wheeler
> cswh at umich.edu
> 
> 

From cjfields at illinois.edu  Tue May 15 15:14:02 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 15 May 2012 19:14:02 +0000
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
In-Reply-To: <20120515185432.GC20185@thebird.nl>
References: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>
	<20120515185432.GC20185@thebird.nl>
Message-ID: <C5E563DF-9678-47AC-BFA5-2778AC3DB514@illinois.edu>

I intend to follow the same tack with BioPerl's phyloxml (splitting it out), primarily so it can be maintained separately from the rest of BioPerl.

chris

On May 15, 2012, at 1:54 PM, Pjotr Prins wrote:

> Marvellous work Clayton! My suggestion to BioRuby is to split out
> phyloxml and to deprecate the current library module. In the next
> release, or after, we should take out that code. I suspect few people
> really depend on it, and they can adapt. I am partly responsible for
> that dependency, and I think the Travis-ci tests also point out that
> the purer Ruby BioRuby is, the better ;). 
> 
> Naohisa, what do you say? We should also ask the original author, even
> though she has left our little group and now works for google (and I
> am claiming Google does not recruit from GSoC :). Diana, maybe you are
> reading the ML?
> 
> Pj.
> 
> On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote:
>> Hi all,
>> 
>> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K)
>> 
>> There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.
>> 
>> However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?
>> 
>> Clayton Wheeler
>> cswh at umich.edu
>> 
>> 


From cswh at umich.edu  Tue May 15 17:51:51 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Tue, 15 May 2012 17:51:51 -0400
Subject: [BioRuby] JRuby bug filed for Bio::Command-related unit test
	failures
Message-ID: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>

Hi all,

I've submitted a bug report and patch for JRUBY-6666 (http://jira.codehaus.org/browse/JRUBY-6666), which should fix another set of JRuby unit test failures occurring when Bio::Command methods call Open3.popen3 (and perhaps even other similar exec-family methods).
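
For readers unfamiliar with the call in question, this is roughly the pattern involved: a command is started with Open3.popen3, input is written to its stdin, and its output is read back. The sketch below is not Bio::Command's code; run_with_stdin is a hypothetical helper, and the child process is simply the current Ruby interpreter so that the example does not depend on any external tool. It uses the four-argument block form (with the wait thread), which is the Ruby 1.9 and later behavior.

```ruby
require "open3"
require "rbconfig"

# Runs a command (given as an argument array, so no shell is involved),
# feeds it `input` on stdin, and returns [stdout, stderr, success?].
# Reading stdout fully before stderr is fine for small outputs like this
# one, but could block if the child writes a lot to stderr.
def run_with_stdin(cmd, input)
  Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thr|
    stdin.write(input)
    stdin.close
    out = stdout.read
    err = stderr.read
    [out, err, wait_thr.value.success?]
  end
end

# Usage: let a child Ruby process upcase whatever it receives on stdin.
out, _err, ok = run_with_stdin([RbConfig.ruby, "-e", "print STDIN.read.upcase"], "acgt")
puts "#{out} (success: #{ok})"
```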

Would it be helpful for me to file a BioRuby bug to track this issue, perhaps on Github? Or perhaps create a wiki page to track unit test problems instead?

Clayton Wheeler
cswh at umich.edu


From ngoto at gen-info.osaka-u.ac.jp  Wed May 16 03:30:35 2012
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 16 May 2012 16:30:35 +0900
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_7FBtKJee57==o5S5RYjr16CQStUgK2w0qVmrrvpLOAgg@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
	<CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
	<20120509213158.GB31329@thebird.nl>
	<CAKVJ-_7FBtKJee57==o5S5RYjr16CQStUgK2w0qVmrrvpLOAgg@mail.gmail.com>
Message-ID: <201205160739.q4G7dS4G004980@portal.open-bio.org>

Hi,

For BioRuby, I manually set the hook with my (ngoto's) personal
Travis account. As far as I can see, organization accounts in Travis
are currently not available.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Thu, 10 May 2012 11:31:07 +0100
Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Wed, May 9, 2012 at 10:31 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> > On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote:
> >> I'm guessing that's how you did it for BioRuby?
> >
> > I think I added it before we were a github organization. Or we were
> > just lucky :)
> >
> > Pj.
> 
> I'd guess the former - I've now got a personal Travis account (via my
> personal GitHub account), but for now I can't seem to create a Biopython
> Travis account via the Biopython organization account on GitHub.
> 
> Nevertheless, I could get the basic Biopython unit tests running on
> Travis last night (including Python 3), although this needs more
> work installing dependencies to get the full test suite coverage:
> http://travis-ci.org/#!/peterjc/biopython
> 
> Peter


From ngoto at gen-info.osaka-u.ac.jp  Wed May 16 03:54:53 2012
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 16 May 2012 16:54:53 +0900
Subject: [BioRuby] JRuby bug filed for Bio::Command-related unit test
 failures
In-Reply-To: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
Message-ID: <201205160754.q4G7srSc005733@portal.open-bio.org>

Hi Clayton,

In addition, we have a Redmine page hosted on OBF.

https://redmine.open-bio.org/projects/bioruby

Currently, it holds the bugs and feature requests that were moved from
the old RubyForge BTS.

I think the Redmine page will be used for bugs and feature requests
that do not come with pull requests.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Tue, 15 May 2012 17:51:51 -0400
Clayton Wheeler <cswh at umich.edu> wrote:

> Hi all,
> 
> I've submitted a bug report and patch for JRUBY-6666 (http://jira.codehaus.org/browse/JRUBY-6666), which should fix another set of JRuby unit test failures occurring when Bio::Command methods call Open3.popen3 (and perhaps even other similar exec-family methods).
> 
> Would it be helpful for me to file a BioRuby bug to track this issue, perhaps on Github? Or perhaps create a wiki page to track unit test problems instead?
> 
> Clayton Wheeler
> cswh at umich.edu
> 
> 

From anurag08priyam at gmail.com  Wed May 16 04:15:40 2012
From: anurag08priyam at gmail.com (Anurag Priyam)
Date: Wed, 16 May 2012 13:45:40 +0530
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <201205160739.q4G7dS4G004980@portal.open-bio.org>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
	<CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
	<20120509213158.GB31329@thebird.nl>
	<CAKVJ-_7FBtKJee57==o5S5RYjr16CQStUgK2w0qVmrrvpLOAgg@mail.gmail.com>
	<201205160739.q4G7dS4G004980@portal.open-bio.org>
Message-ID: <CAD1m08ULTUkgrXX6Rm+9rX=MUvAJPRgTAmqW_WRSFqaNo9Nm4w@mail.gmail.com>

On Wed, May 16, 2012 at 1:00 PM, Naohisa GOTO
<ngoto at gen-info.osaka-u.ac.jp> wrote:
> For BioRuby, I manually set the hook with my (ngoto's) personal
> Travis account. As far as I can see, organization accounts in Travis
> are currently not available.

You are talking about the toggle button on your Travis profile page,
right?  For repos that belong to an organization, you need to enable
the Travis hook from GitHub (admin/service-hooks), iirc, using the token
on your Travis profile page.

-- 
Anurag Priyam

From ngoto at gen-info.osaka-u.ac.jp  Wed May 16 04:17:57 2012
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 16 May 2012 17:17:57 +0900
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
In-Reply-To: <20120515185432.GC20185@thebird.nl>
References: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>
	<20120515185432.GC20185@thebird.nl>
Message-ID: <201205160817.q4G8HwBO007774@portal.open-bio.org>

Hi,

Great work, Clayton!

I think a separate gem (Biogem) is good, too.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Tue, 15 May 2012 20:54:32 +0200
Pjotr Prins <pjotr.public14 at thebird.nl> wrote:

> Marvellous work Clayton! My suggestion to BioRuby is to split out
> phyloxml and to deprecate the current library module. In the next
> release, or after, we should take out that code. I suspect few people
> really depend on it, and they can adapt. I am partly responsible for
> that dependency, and I think the Travis-ci tests also point out that
> the purer Ruby BioRuby is, the better ;). 
> 
> Naohisa, what do you say? We should also ask the original author, even
> though she has left our little group and now works for google (and I
> am claiming Google does not recruit from GSoC :). Diana, maybe you are
> reading the ML?
> 
> Pj.
> 
> On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote:
> > Hi all,
> > 
> > The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K)
> > 
> > There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.
> > 
> > However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?
> > 
> > Clayton Wheeler
> > cswh at umich.edu
> > 
> > 


From donttrustben at gmail.com  Wed May 16 07:09:24 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Wed, 16 May 2012 21:09:24 +1000
Subject: [BioRuby] hmmer3
Message-ID: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>

Hi guys,

I noticed today that there isn't HMMER3 support in bioruby - particularly
I'm interested in a parser for hmmsearch outputs as I want to iterate over
aligned positions.

I noticed that there is mention of this in the 1.4.1 release notes, that
hmmer3 will be supported in 1.5, although I'm not sure what exactly this
means.
http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/

Can I ask what the state of this merge is please? Is there code somewhere
just waiting to be merged? Can it be quickly spun out into a biogem in the
meantime?

Thanks,
ben

-- 
Ben Woodcroft

From bonnal at ingm.org  Wed May 16 07:27:18 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 16 May 2012 13:27:18 +0200
Subject: [BioRuby] hmmer3
In-Reply-To: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
Message-ID: <CBD95BB6.991E%bonnal@ingm.org>

If you need to wrap the binary, please have a look at our wrapper. I am
wondering if this wrapper could be useful to other gems; I could create a
separate gem just for it. Let me know. Docs about the wrapper are in the
readme.

https://github.com/helios/bioruby-ngs/blob/master/lib/wrapper.rb
https://github.com/helios/bioruby-ngs/blob/master/README.rdoc#wrapper
 

On 16/05/12 13.09, "Ben Woodcroft" <donttrustben at gmail.com> wrote:

> Hi guys,
> 
> I noticed today that there isn't HMMER3 support in bioruby - particularly
> I'm interested in a parser for hmmsearch outputs as I want to iterate over
> aligned positions.
> 
> I noticed that there is mention of this in the 1.4.1 release notes, that
> hmmer3 will be supported in 1.5, although I'm not sure what exactly this
> means.
> http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/
> 
> Can I ask what the state of this merge is please? Is there code somewhere
> just waiting to be merged? Can it be quickly spun out into a biogem in the
> meantime?
> 
> Thanks,
> ben


From donttrustben at gmail.com  Wed May 16 07:43:44 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Wed, 16 May 2012 21:43:44 +1000
Subject: [BioRuby] hmmer3
In-Reply-To: <CBD95BB6.991E%bonnal@ingm.org>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<CBD95BB6.991E%bonnal@ingm.org>
Message-ID: <CA+adgSAsb-M73U0mZ8oR2_TBACs3e2EVBr9hmn6UVEJB2KK=xA@mail.gmail.com>

Thanks for the feedback dudes. I'm happy to spin it out myself, only I
don't know where the code is.

I don't personally need a wrapper, but I've got 40G of hmmsearch result
files to parse.

Relatedly I've written a gem that parses HMM model files - I'll release
that after a little more testing, hopefully tomorrow.

On 16 May 2012 21:27, Raoul Bonnal <bonnal at ingm.org> wrote:

> If you need to wrap the binary, please have a look at our wrapper. I am
> wondering if this wrapper could be useful to other gems; I could create a
> separate gem just for it. Let me know. Docs about the wrapper are in the
> readme.
>
> https://github.com/helios/bioruby-ngs/blob/master/lib/wrapper.rb
> https://github.com/helios/bioruby-ngs/blob/master/README.rdoc#wrapper
>
>
> On 16/05/12 13.09, "Ben Woodcroft" <donttrustben at gmail.com> wrote:
>
> > Hi guys,
> >
> > I noticed today that there isn't HMMER3 support in bioruby - particularly
> > I'm interested in a parser for hmmsearch outputs as I want to iterate
> over
> > aligned positions.
> >
> > I noticed that there is mention of this in the 1.4.1 release notes, that
> > hmmer3 will be supported in 1.5, although I'm not sure what exactly this
> > means.
> > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/
> >
> > Can I ask what the state of this merge is please? Is there code somewhere
> > just waiting to be merged? Can it be quickly spun out into a biogem in
> the
> > meantime?
> >
> > Thanks,
> > ben
>
>
>

From ngoto at gen-info.osaka-u.ac.jp  Wed May 16 07:48:14 2012
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 16 May 2012 20:48:14 +0900
Subject: [BioRuby] hmmer3
In-Reply-To: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
Message-ID: <201205161148.q4GBmFSj016839@portal.open-bio.org>

Hi Ben,

The HMMER3 result parser was written by Christian.
https://github.com/cmzmasek/bioruby

I guess its quality may be good enough, except for the RDF/XML support,
which is experimental.

I'd like to discuss whether the class name Bio::Hmmer3Report
is suitable. For HMMER2, the class is Bio::HMMER::Report.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Wed, 16 May 2012 21:09:24 +1000
Ben Woodcroft <donttrustben at gmail.com> wrote:

> Hi guys,
> 
> I noticed today that there isn't HMMER3 support in bioruby - particularly
> I'm interested in a parser for hmmsearch outputs as I want to iterate over
> aligned positions.
> 
> I noticed that there is mention of this in the 1.4.1 release notes, that
> hmmer3 will be supported in 1.5, although I'm not sure what exactly this
> means.
> http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/
> 
> Can I ask what the state of this merge is please? Is there code somewhere
> just waiting to be merged? Can it be quickly spun out into a biogem in the
> meantime?
> 
> Thanks,
> ben
> 
> -- 
> Ben Woodcroft
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From bonnal at ingm.org  Wed May 16 08:46:34 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 16 May 2012 14:46:34 +0200
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
In-Reply-To: <201205160817.q4G8HwBO007774@portal.open-bio.org>
Message-ID: <CBD96E4A.9927%bonnal@ingm.org>

Impressive.
This is the right approach for cleaning BioRuby of dependencies that
could create problems.


Thanks Clayton.


On 16/05/12 10.17, "Naohisa GOTO" <ngoto at gen-info.osaka-u.ac.jp> wrote:

> Hi,
> 
> Great work, Clayton!
> 
> I think a separate gem (Biogem) is good, too.
> 
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
> 
> On Tue, 15 May 2012 20:54:32 +0200
> Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> 
>> Marvellous work Clayton! My suggestion to BioRuby is to split out
>> phyloxml and to deprecate the current library module. In the next
>> release, or after, we should take out that code. I suspect few people
>> really depend on it, and they can adapt. I am partly responsible for
>> that dependency, and I think the Travis-ci tests also point out that
>> the purer Ruby BioRuby is, the better ;).
>> 
>> Naohisa, what do you say? We should also ask the original author, even
>> though she has left our little group and now works for google (and I
>> am claiming Google does not recruit from GSoC :). Diana, maybe you are
>> reading the ML?
>> 
>> Pj.
>> 
>> On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote:
>>> Hi all,
>>> 
>>> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby
>>> gem (an implementation of the libxml API using native Java XML libraries)
>>> does not support the full API of libxml-ruby. My first approach to this was
>>> to simply use the native libxml-ruby gem and its C extension, which works
>>> with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a
>>> Unicode issue, and the JRuby developers indicate that the C extension API
>>> (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9
>>> mode. (see http://bit.ly/JGWC4K)
>>> 
>>> There was a discussion of the PhyloXML parser on the mailing list a couple
>>> of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be
>>> rewritten to use Nokogiri at some point soon, since Nokogiri is now the de
>>> facto standard XML parser. Following that lead, I've gone ahead and ported
>>> the PhyloXML parser to use Nokogiri; it only took an hour or two, and the
>>> unit tests are passing. My branch for this is at
>>> https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a
>>> good approach, I can port the writer as well.
>>> 
>>> However, Pjotr suggested that it might make sense to split PhyloXML out into
>>> a separate gem. This should be straightforward enough, since no other
>>> BioRuby components appear to call PhyloXML. It would mean that any PhyloXML
>>> users would need to install a separate gem. On the other hand, it would
>>> remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I
>>> proceed with this approach?
>>> 
>>> Clayton Wheeler
>>> cswh at umich.edu


From donttrustben at gmail.com  Wed May 16 09:28:01 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Wed, 16 May 2012 23:28:01 +1000
Subject: [BioRuby] hmmer3
In-Reply-To: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
Message-ID: <CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>

Ah cool, thanks ngoto.

Thanks for writing this Christian. I believe I've extracted the hmmer3
stuff into a new biogem. I've added you as an author on this Christian -
hope that's ok with you?
https://github.com/wwood/bioruby-hmmer3_report

I've not released it to rubygems yet - I wanted to clear up namespace
issues first. What do you suggest Naohisa? Bio::HMMER::HMMER3::Report ?

On looking at the code it seems it only handles tabular format data, which
is rather unfortunate for me, as I need the actual alignment. Looks like
I'll have to roll my sleeves up after all, unless there is yet more code
out there that parses the regular textual format?
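
For anyone who only needs the per-sequence hits, here is a minimal sketch of
reading the tabular (--tblout) format in plain Ruby. It is not the API of
bio-hmmer3_report; the TbloutHit struct and parse_tblout method are
hypothetical names, and the column layout (18 whitespace-delimited columns
followed by a free-text description, with comment lines starting with "#")
is assumed from the HMMER3 user guide. It does not give access to the
alignments themselves.

```ruby
# Hypothetical record type; the field names are illustrative only and are
# not taken from bio-hmmer3_report.
TbloutHit = Struct.new(:target, :query, :evalue, :score, :bias, :description)

# Parse the text of an hmmsearch --tblout file into an array of hits.
def parse_tblout(text)
  lines = text.each_line.map(&:chomp)
  lines.reject { |l| l.empty? || l.start_with?('#') }.map do |line|
    # Assumed layout: 18 whitespace-delimited columns, then the description.
    f = line.split(/\s+/, 19)
    TbloutHit.new(f[0], f[2], Float(f[4]), Float(f[5]), Float(f[6]), f[18])
  end
end

# Made-up sample data in the assumed layout.
SAMPLE = <<~EOS
  # target name  accession  query name  accession  E-value  score  bias ...
  seq1           -          PF00001     -          1.2e-30  105.3  0.1  2.4e-30 104.4 0.1  1.0 1 0 0 1 1 1 1 first example protein
  seq2           -          PF00001     -          3.5e-05   22.0  0.0  4.1e-05  21.8 0.0  1.1 1 0 0 1 1 1 1 -
EOS

parse_tblout(SAMPLE).each do |hit|
  puts "#{hit.target}\t#{hit.query}\t#{hit.evalue}\t#{hit.score}"
end
```

Splitting with a limit of 19 keeps a multi-word description in one field
instead of breaking it on spaces.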

I'm not sure about your feelings on this Christian, but how do you feel
about putting the rdf stuff in another biogem? If the aim is to get this
gem merged into the bioruby core code (and I hope it is since when people
say hmmer nowadays they likely mean v3, not v2), maybe the rdf stuff is a
bit tangential?

I also noticed that in the tests Christian referred to BioRubyTestDataPath
which isn't recognised in the biogem. Is there a recommended way to do this
in a biogem? Perhaps we should mirror what bioruby itself does to make the
code more portable.
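
One possible approach, as a sketch only: define the data path in the gem's
own test helper, relative to that file. The module name, the constant name
and the test/data layout below are all assumptions of mine, not an
established biogem or BioRuby convention.

```ruby
# Hypothetical helper for a biogem's test suite (e.g. test/helper.rb).
# The names and the test/data directory layout are assumptions.
module BiogemTestHelper
  # Directory holding fixture files, resolved relative to this file so the
  # tests work from any working directory.
  TEST_DATA_PATH = File.expand_path(File.join(File.dirname(__FILE__), 'data'))

  # Build the absolute path of one fixture file.
  def self.data_file(*parts)
    File.join(TEST_DATA_PATH, *parts)
  end
end

puts BiogemTestHelper.data_file('HMMER', 'hmmsearch.tblout')
```

A test would then open BiogemTestHelper.data_file(...) instead of relying on
a constant that only exists inside the BioRuby source tree.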

Thanks everyone for the openness and responsiveness.
ben

On 16 May 2012 21:48, Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp> wrote:

> Hi Ben,
>
> The HMMER3 result parser was written by Christian.
> https://github.com/cmzmasek/bioruby
>
> I guess its quality may be good enough, except for the RDF/XML support,
> which is experimental.
>
> I'd like to discuss whether the class name Bio::Hmmer3Report
> is suitable. The HMMER2 parser is Bio::HMMER::Report.
>
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>
> On Wed, 16 May 2012 21:09:24 +1000
> Ben Woodcroft <donttrustben at gmail.com> wrote:
>
> > Hi guys,
> >
> > I noticed today that there isn't HMMER3 support in bioruby - particularly
> > I'm interested in a parser for hmmsearch outputs as I want to iterate over
> > aligned positions.
> >
> > I noticed that there is mention of this in the 1.4.1 release notes, that
> > hmmer3 will be supported in 1.5, although I'm not sure what exactly this
> > means.
> > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/
> >
> > Can I ask what the state of this merge is please? Is there code somewhere
> > just waiting to be merged? Can it be quickly spun out into a biogem in the
> > meantime?
> >
> > Thanks,
> > ben
> >
> > --
> > Ben Woodcroft


-- 
--
Ben Woodcroft
http://ecogenomic.org/users/ben-woodcroft <http://www.ecogenomic.org/>

From pjotr.public14 at thebird.nl  Wed May 16 09:46:12 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 16 May 2012 15:46:12 +0200
Subject: [BioRuby] hmmer3
In-Reply-To: <CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
	<CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>
Message-ID: <20120516134612.GA26059@thebird.nl>

On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote:
> I'm not sure about your feelings on this Christian, but how do you feel
> about putting the rdf stuff in another biogem? If the aim is to get this
> gem merged into the bioruby core code (and I hope it is since when people
> say hmmer nowadays they likely mean v3, not v2), maybe the rdf stuff is a
> bit tangential?

I think it should be decoupled. RDF, in general, is a (searchable)
result-based (post-parser) format. Maybe we should coin that
definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF
into triple stores. It sounds to me like bio-rdf is the right place for
that translation code :).  Feel free to push it in.

Pj.

From pjotr.public14 at thebird.nl  Thu May 17 12:51:01 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 17 May 2012 18:51:01 +0200
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <201205160754.q4G7srSc005733@portal.open-bio.org>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
	<201205160754.q4G7srSc005733@portal.open-bio.org>
Message-ID: <20120517165101.GA32610@thebird.nl>

I don't know if you all track github, but thanks to two GSoC coders
(Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius.

Travis-CI should show the green light for all Rubies once Rubinius
itself gets updated on Travis :)

Kudos.

Pj.

From cjfields at illinois.edu  Thu May 17 12:59:33 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 17 May 2012 16:59:33 +0000
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <20120517165101.GA32610@thebird.nl>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
	<201205160754.q4G7srSc005733@portal.open-bio.org>
	<20120517165101.GA32610@thebird.nl>
Message-ID: <D0730676-B5EE-4C98-AAB0-2F67F7F516D5@illinois.edu>

Sounds like GSoC this year is paying lots of dividends :)

chris

On May 17, 2012, at 11:51 AM, Pjotr Prins wrote:

> I don't know if you all track github, but thanks to two GSoC coders
> (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius.
> 
> Travis-CI should show the green light for all Rubies once Rubinius
> itself gets updated on Travis :)
> 
> Kudos.
> 
> Pj.


From cswh at umich.edu  Thu May 17 13:42:06 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 13:42:06 -0400
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <20120517165101.GA32610@thebird.nl>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
	<201205160754.q4G7srSc005733@portal.open-bio.org>
	<20120517165101.GA32610@thebird.nl>
Message-ID: <7D2E3046-44E9-4275-B294-8DB39D36294B@umich.edu>

On May 17, 2012, at 12:51 PM, Pjotr Prins wrote:

> I don't know if you all track github, but thanks to two GSoC coders
> (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius.
> 
> Travis-CI should show the green light for all Rubies once Rubinius
> itself gets updated on Travis :)

Thanks Pjotr. Unfortunately I think we're not going to be quite there for JRuby just yet; we've hit a couple of JRuby bugs which will probably need to be fixed to solve some of the failures. Also, I think we may be stuck with PhyloXML test failures under JRuby in 1.9 mode until we split that out into a separate gem. It's definitely progress, though.

Clayton Wheeler
cswh at umich.edu


From cswh at umich.edu  Thu May 17 15:39:27 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 15:39:27 -0400
Subject: [BioRuby] PhyloXML and libxml-ruby
Message-ID: <C919224A-16F7-4940-A307-00A87A61D96A@umich.edu>

Hi all,

It appears that the native extension for libxml-ruby is not building reliably under JRuby, causing Travis-CI runs to fail as seen at:

http://travis-ci.org/#!/ngoto/bioruby/jobs/1356992

I'm not having much luck identifying exactly why it builds in some JRuby environments and not others, but I've been able to reproduce the Travis-CI problem on a test Linux machine and don't see an obvious fix.

If we're going to repackage PhyloXML into a separate gem, I think the safest course of action would be to revert to calling for libxml-jruby in the Travis-CI Gemfiles (i.e. back out http://bit.ly/JmNjDY). Using libxml-ruby instead of libxml-jruby doesn't solve the PhyloXML problems on JRuby in 1.9 mode anyway, and 1.9 mode will soon be the default in JRuby. The PhyloXML gem can be explicitly declared to depend on libxml-ruby, and moving it out of the core BioRuby gem will remove this whole issue, as far as the unit tests go. Then PhyloXML's library requirements can be addressed separately.

Thoughts?

Clayton Wheeler
cswh at umich.edu


From cswh at umich.edu  Thu May 17 23:10:52 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 23:10:52 -0400
Subject: [BioRuby] bio-phyloxml gem
Message-ID: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>

Hi all,

I have repackaged BioRuby's PhyloXML support as a separate gem:

https://github.com/csw/bioruby-phyloxml

I was able to preserve its revision history. All the unit tests pass, too. I did take this opportunity to rename some of the files, so their names correspond to the namespace of the classes. I think I've set up the packaging appropriately, though I'd appreciate it if someone more experienced with the Biogems infrastructure could take a quick look at this. (Hint hint, Pjotr.)

Who should we designate as the maintainer? I suppose I have my hands on it, but are there any volunteers? And if it would make more sense to host this under someone else's GitHub account, that should be easy enough.

Also, feel free to contribute changes to the README.

If everything looks good, I'll go ahead and set this up on Travis-CI, biogems.info, and Rubygems as version 1.0.0.

Clayton Wheeler
cswh at umich.edu


From donttrustben at gmail.com  Fri May 18 00:59:44 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Fri, 18 May 2012 14:59:44 +1000
Subject: [BioRuby] hmmer3
In-Reply-To: <20120516134612.GA26059@thebird.nl>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
	<CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>
	<20120516134612.GA26059@thebird.nl>
Message-ID: <CA+adgSBtB9eiSkBBzQLF9-1xoxy5qM-OjwgkZJHd0j8kO9R2Hg@mail.gmail.com>

On 16 May 2012 23:46, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:

> On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote:
> > maybe the rdf stuff is a
> > bit tangential?
>
> I think it should be decoupled. RDF, in general, is a (searchable)
> result-based (post-parser) format. Maybe we should coin that
> definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF
> into triple stores. It sounds to me like bio-rdf is the right place for
> that translation code :).  Feel free to push it in.
>

Thanks. I've removed the rdf related code all in one commit:
https://github.com/wwood/bioruby-hmmer3_report/commit/3795ce3a124011cb600e78e6ef10603187c99d20

However, I don't feel like I should be adding this to a different
repository because I don't feel like I understand the technology enough,
and therefore am not really inclined to maintain it. All of the relevant
code should be in that commit, so should be quite simple to add in yourself
if you are inclined (though I couldn't find any unit tests). Only, I've
changed the namespace of it to Bio::HMMER::HMMER3::Report from
Bio::Hmmer3report as Naohisa suggested. I've also now pushed the new biogem
to rubygems/biogems.info.

Thanks,
ben

From pjotr.public14 at thebird.nl  Fri May 18 01:21:23 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 18 May 2012 07:21:23 +0200
Subject: [BioRuby] hmmer3
In-Reply-To: <CA+adgSBtB9eiSkBBzQLF9-1xoxy5qM-OjwgkZJHd0j8kO9R2Hg@mail.gmail.com>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
	<CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>
	<20120516134612.GA26059@thebird.nl>
	<CA+adgSBtB9eiSkBBzQLF9-1xoxy5qM-OjwgkZJHd0j8kO9R2Hg@mail.gmail.com>
Message-ID: <20120518052123.GA3360@thebird.nl>

OK, I'll take the orphaned RDF code.

On Fri, May 18, 2012 at 02:59:44PM +1000, Ben Woodcroft wrote:
> On 16 May 2012 23:46, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
>
> > On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote:
> > > maybe the rdf stuff is a
> > > bit tangential?
> >
> > I think it should be decoupled. RDF, in general, is a (searchable)
> > result-based (post-parser) format. Maybe we should coin that
> > definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF
> > into triple stores. It sounds to me like bio-rdf is the right place for
> > that translation code :).  Feel free to push it in.
>
> Thanks. I've removed the rdf related code all in one commit:
> https://github.com/wwood/bioruby-hmmer3_report/commit/3795ce3a124011cb600e78e6ef10603187c99d20
>
> However, I don't feel like I should be adding this to a different
> repository because I don't feel like I understand the technology enough,
> and therefore am not really inclined to maintain it. All of the relevant
> code should be in that commit, so should be quite simple to add in yourself
> if you are inclined (though I couldn't find any unit tests). Only, I've
> changed the namespace of it to Bio::HMMER::HMMER3::Report from
> Bio::Hmmer3report as Naohisa suggested. I've also now pushed the new biogem
> to rubygems/biogems.info.
>
> Thanks,
> ben

From pjotr.public14 at thebird.nl  Fri May 18 01:24:40 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 18 May 2012 07:24:40 +0200
Subject: [BioRuby] bio-phyloxml gem
In-Reply-To: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>
References: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>
Message-ID: <20120518052440.GB3360@thebird.nl>

I am with Raoul and Francesco today. We will take a look and discuss.
Good job, also saving the revision history :).

On Thu, May 17, 2012 at 11:10:52PM -0400, Clayton Wheeler wrote:
> Hi all,
> 
> I have repackaged BioRuby's PhyloXML support as a separate gem:
> 
> https://github.com/csw/bioruby-phyloxml
> 
> I was able to preserve its revision history. All the unit tests pass, too. I did take this opportunity to rename some of the files, so their names correspond to the namespace of the classes. I think I've set up the packaging appropriately, though I'd appreciate it if someone more experienced with the Biogems infrastructure could take a quick look at this. (Hint hint, Pjotr.)
> 
> Who should we designate as the maintainer? I suppose I have my hands on it, but are there any volunteers? And if it would make more sense to host this under someone else's GitHub account, that should be easy enough.
> 
> Also, feel free to contribute changes to the README.
> 
> If everything looks good, I'll go ahead and set this up on Travis-CI, biogems.info, and Rubygems as version 1.0.0.
> 
> Clayton Wheeler
> cswh at umich.edu

From donttrustben at gmail.com  Fri May 18 01:40:28 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Fri, 18 May 2012 15:40:28 +1000
Subject: [BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer
Message-ID: <CA+adgSD35RG8=7+xwkF-s4JCmhdMUat9_FRACXS_H50crhsZGA@mail.gmail.com>

Hi guys,

Here's some blatant advertising for some code I've recently written in
biogem form.

bio-gag: "gag error" is the term I've coined to describe an error that
various people have observed on certain sequencing kits with IonTorrent,
though it has not previously been characterised very well that I know of
(we noticed that the errors seemed to occur at GAG positions in the reads
that were supposed to be GAAG). This biogem tries to find and fix these
errors. It isn't benchmarked for accuracy but worked well enough for my
lab's own purposes. Actually to be honest we've only used an older version
of the software on real data and the logic has changed a little since, given some
recent evidence we have, but I thought I'd push it out with the latest and
greatest error model.
https://github.com/wwood/bioruby-gag

bio-pileup_iterator: To find gag errors bio-gag iterates through pileup
files looking for particular patterns e.g. strand bias of insertions. This
gem can be used to iterate through pileup files one position (one line) at
a time, building up the sequence of each read as it goes, recording their
direction etc. Probably not the fastest piece of code in the world, sorry.
I'm not sure whether this should/can be incorporated into bio-samtools? It
adds functionality - there's no duplication (I don't think).
https://github.com/wwood/bioruby-pileup_iterator
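
To illustrate the kind of pattern meant here, a minimal sketch in plain Ruby
that counts insertions per strand at one pileup position. This is not the
bio-pileup_iterator API; insertion_strand_counts is a hypothetical name, and
the six tab-separated columns (chromosome, position, reference base, depth,
read bases, qualities) and the "+<length><bases>" insertion notation, with
upper case for the forward strand and lower case for the reverse strand, are
assumed from the samtools pileup format.

```ruby
# Count insertions on each strand at one pileup line.
# Returns [chromosome, position, forward_count, reverse_count].
def insertion_strand_counts(line)
  chrom, pos, _ref, _depth, bases, _quals = line.chomp.split("\t")
  forward = 0
  reverse = 0
  # Drop read-start markers ("^" plus a mapping-quality character), since
  # that character could otherwise be mistaken for the "+" of an insertion.
  bases = bases.gsub(/\^./, '')
  # An insertion is written as +<length><inserted bases>.
  bases.scan(/\+(\d+)([ACGTNacgtn]+)/) do |len, seq|
    inserted = seq[0, len.to_i]
    if inserted == inserted.upcase
      forward += 1
    else
      reverse += 1
    end
  end
  [chrom, pos.to_i, forward, reverse]
end

# Made-up example line: two forward-strand and one reverse-strand insertion.
line = "chr1\t100\tG\t4\t.+1A,+1a.+1A,\tIIII"
p insertion_strand_counts(line)
```

A strong imbalance between the two counts at one position is the sort of
strand bias that could then be flagged for closer inspection.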

bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to
the hmmer v3 manual.
https://github.com/wwood/bioruby-hmmer_model

bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles
tabular format files - the guts of this were written by Christian - see
yesterday's thread for details. I'm hoping to add regular (non-tabular)
format parsing in the near future, but no promises.
https://github.com/wwood/bioruby-hmmer3_report

I'm sure there are bugs and deficiencies - apologies in advance.

Enjoy,
ben

From francesco.strozzi at gmail.com  Fri May 18 04:01:01 2012
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Fri, 18 May 2012 10:01:01 +0200
Subject: [BioRuby] New biogems for IonTorrent, pileup files,
	pfam and hmmer
In-Reply-To: <CA+adgSD35RG8=7+xwkF-s4JCmhdMUat9_FRACXS_H50crhsZGA@mail.gmail.com>
References: <CA+adgSD35RG8=7+xwkF-s4JCmhdMUat9_FRACXS_H50crhsZGA@mail.gmail.com>
Message-ID: <CACtet2Sn5BcDA-1-asr9T26Kbvg6iEEwEymC2KDY9jjNa43p=w@mail.gmail.com>

Hi Ben,
thanks for the amazing work! I'm not using Ion Torrent atm but I
eventually will and it's good to see there is something already set up.

Francesco

On Fri, May 18, 2012 at 7:40 AM, Ben Woodcroft <donttrustben at gmail.com> wrote:
> Hi guys,
>
> Here's some blatant advertising for some code I've recently written in
> biogem form.
>
> bio-gag: "gag error" is the term I've coined to describe an error that
> various people have observed on certain sequencing kits with IonTorrent,
> though it has not previously been characterised very well that I know of
> (we noticed that the errors seemed to occur at GAG positions in the reads
> that were supposed to be GAAG). This biogem tries to find and fix these
> errors. It isn't benchmarked for accuracy but worked well enough for my
> lab's own purposes. Actually to be honest we've only used an older version
> of the software on real data and the logic has changed a little since, given some
> recent evidence we have, but I thought I'd push it out with the latest and
> greatest error model.
> https://github.com/wwood/bioruby-gag
>
> bio-pileup_iterator: To find gag errors bio-gag iterates through pileup
> files looking for particular patterns e.g. strand bias of insertions. This
> gem can be used to iterate through pileup files one position (one line) at
> a time, building up the sequence of each read as it goes, recording their
> direction etc. Probably not the fastest piece of code in the world, sorry.
> I'm not sure whether this should/can be incorporated into bio-samtools? It
> adds functionality - there's no duplication (I don't think).
> https://github.com/wwood/bioruby-pileup_iterator
>
> bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to
> the hmmer v3 manual.
> https://github.com/wwood/bioruby-hmmer_model
>
> bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles
> tabular format files - the guts of this were written by Christian - see
> yesterday's thread for details. I'm hoping to add regular (non-tabular)
> format parsing in the near future, but no promises.
> https://github.com/wwood/bioruby-hmmer3_report
>
> I'm sure there are bugs and deficiencies - apologies in advance.
>
> Enjoy,
> ben


-- 

Francesco

From bonnal at ingm.org  Fri May 18 04:54:44 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Fri, 18 May 2012 10:54:44 +0200
Subject: [BioRuby] New biogems for IonTorrent, pileup files,
 pfam and hmmer
In-Reply-To: <CACtet2Sn5BcDA-1-asr9T26Kbvg6iEEwEymC2KDY9jjNa43p=w@mail.gmail.com>
Message-ID: <CBDBDAF4.997C%bonnal@ingm.org>

My lab (Alberto) will try your HMM parsers because we are going to annotate
a lot of stuff coming from NGS ^_^


On 18/05/12 10.01, "Francesco Strozzi" <francesco.strozzi at gmail.com> wrote:

> Hi Ben,
> thanks for the amazing work! I'm not using Ion Torrent atm but I
> eventually will and it's good to see there is something already setup.
> 
> Francesco
> 
> On Fri, May 18, 2012 at 7:40 AM, Ben Woodcroft <donttrustben at gmail.com> wrote:
>> Hi guys,
>> 
>> Here's some blatant advertising for some code I've recently written in
>> biogem form.
>> 
>> bio-gag: "gag error" is the term I've coined to describe an error that
>> various people have observed on certain sequencing kits with IonTorrent,
>> though it has not previously been characterised very well that I know of
>> (we noticed that the errors seemed to occur at GAG positions in the reads
>> that were supposed to be GAAG). This biogem tries to find and fix these
>> errors. It isn't benchmarked for accuracy but worked well enough for my
>> lab's own purposes. Actually to be honest we've only used an older version
>> of the software on real data and the logic has changed a little since, given some
>> recent evidence we have, but I thought I'd push it out with the latest and
>> greatest error model.
>> https://github.com/wwood/bioruby-gag
>> 
>> bio-pileup_iterator: To find gag errors bio-gag iterates through pileup
>> files looking for particular patterns e.g. strand bias of insertions. This
>> gem can be used to iterate through pileup files one position (one line) at
>> a time, building up the sequence of each read as it goes, recording their
>> direction etc. Probably not the fastest piece of code in the world, sorry.
>> I'm not sure whether this should/can be incorporated into bio-samtools? It
>> adds functionality - there's no duplication (I don't think).
>> https://github.com/wwood/bioruby-pileup_iterator
>> 
>> bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to
>> the hmmer v3 manual.
>> https://github.com/wwood/bioruby-hmmer_model
>> 
>> bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles
>> tabular format files - the guts of this were written by Christian - see
>> yesterday's thread for details. I'm hoping to add regular (non-tabular)
>> format parsing in the near future, but no promises.
>> https://github.com/wwood/bioruby-hmmer3_report
>> 
>> I'm sure there are bugs and deficiencies - apologies in advance.
>> 
>> Enjoy,
>> ben


From pjotr.public14 at thebird.nl  Sun May 20 08:31:31 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 20 May 2012 14:31:31 +0200
Subject: [BioRuby] biogems.info updated
Message-ID: <20120520123131.GA17983@thebird.nl>

Marjan and I have updated the RSS feed for biogems.info - now we can
support more blogs. If you are blogging on Ruby for Bioinformatics,
give us the feed :)

Pj.

From marian.povolny at gmail.com  Mon May 21 05:36:01 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Mon, 21 May 2012 11:36:01 +0200
Subject: [BioRuby] GSoC weekly status report No.1.2
Message-ID: <CADKP5CkumC36Tkq3xDCif4m8m29Ms1Dj=suBnt6Wr4XFVo+yEw@mail.gmail.com>

http://blog.mpthecoder.com/post/23473020471/gsoc-weekly-status-report-no-1-2

It's been three months since my first introduction on the BioRuby ML and
it's been great. As it is the end of the GSoC community bonding period, I
would like to thank Pjotr most of all, and then all the other community
members for their help and support. It's a great feeling to become a member
of a small but growing community of enthusiasts that work together for the
benefit of all of us and for fun.

As Pjotr already did, I would like to encourage you to write blog posts
about using Ruby in Bioinformatics and let us include them in our RSS and
news feeds on the biogems.info website. The site supports both RSS and Atom
feeds now, and a similar functionality will be part of the new website for
BioRuby once it's finished. The code also supports adding only posts for
one category/tag, so you can tag your posts with BioRuby or similar, and
only those posts will be included in the RSS feed on biogems.info.

The GSoC coding period starts today. It's time for me to roll my sleeves
up, and start working on the GFF3 parser full-time.

--
Marjan


From lomereiter at googlemail.com  Mon May 21 07:58:46 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Mon, 21 May 2012 15:58:46 +0400
Subject: [BioRuby] [GSoC] Weekly report #1
Message-ID: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>

Hi all,

here's my report about the past week:
http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/

Brief summary:

1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in the Rubinius
bugtracker, and one of them is already solved. Rubinius in 1.8 mode should
now pass all tests. The situation with 1.9 mode is not that great, but I'm
working on it.

2) I started to collect D optimization tricks on a GitHub wiki page.
Currently, it contains just 6 tips, but this number is going to grow.
Probably, another page will be created soon to keep best practices of
connecting Ruby and D. Since my project and Marjan's one have a lot in
common, I think it's important for us to not waste time on something that
has already been investigated.

3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and
wrote my first two features.

4) Measurements of object instantiation time in Ruby suggest that exposing
low-level D functions via FFI makes little sense. I'm going to discuss with
mentors which high-level functions should be available, and turn those into
Cucumber features.
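
A rough way to reproduce this kind of measurement, as a sketch only: it uses
the standard Benchmark module, and the Record struct and both method names
are hypothetical stand-ins, not code from the project. It times creating
many small Ruby objects against a bare loop, which gives a feel for the
per-object cost that any fine-grained FFI call would have to pay on the Ruby
side.

```ruby
require 'benchmark'

# Hypothetical value object standing in for one parsed record.
Record = Struct.new(:name, :start, :stop)

# Seconds spent creating n Record objects.
def time_instantiations(n)
  Benchmark.realtime do
    n.times { |i| Record.new('feature', i, i + 100) }
  end
end

# Seconds spent in the same loop doing only the arithmetic, as a baseline.
def time_baseline(n)
  Benchmark.realtime do
    n.times { |i| i + 100 }
  end
end

n = 200_000
alloc = time_instantiations(n)
base = time_baseline(n)
printf("%d objects: %.4f s (baseline loop: %.4f s)\n", n, alloc, base)
```

The absolute numbers depend heavily on the Ruby implementation and version,
so they are only meaningful when compared on the same machine.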


--
Artem


From cswh at umich.edu  Mon May 21 11:50:18 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Mon, 21 May 2012 11:50:18 -0400
Subject: [BioRuby] GSoC week 2 status report
Message-ID: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>

Hi all,

Here's my report on last week's work:

http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/

This was my second week of work on my GSoC project, and the last week of the "community bonding" period before the official start of coding. A major focus of mine was BioRuby's phyloXML support; it uses libxml, which has been causing unit test failures under JRuby. In the end, the best course of action seemed to be to split the phyloXML support out into a separate plugin, which I have done as the bio-phyloxml gem. This will remove BioRuby's dependency on XML libraries entirely, and that JRuby issue along with it. At the same time, users of the phyloXML code should be able to continue using it with no substantive changes.

Separately, I began porting this phyloXML code to use Nokogiri instead of libxml-ruby, but ran into difficulties with this effort. While it is possible, and the library APIs are very similar, the code uses relatively low-level XML processing APIs in ways that seem to be sensitive to subtle differences in text node and namespace semantics between the two libraries. Substantial restructuring of the code and the addition of quite a few unit tests might be necessary to carry out such a port with confidence that the resulting code would work well.

Also, someone else submitted a JRuby patch for JRUBY-6658, one of the major causes of BioRuby's unit test failures with JRuby; once a fix is integrated, we'll be close to having all the tests passing under JRuby.

I identified another JRuby bug, JRUBY-6666, causing several unit test failures. This one affects BioRuby's code for running external commands, so it would be likely to be encountered in production use. For this one, I also worked up a patch.

I also spent some time preparing a performance testing environment, for evaluating existing MAF implementations as well as my own. This will be important, since I will be considering the use of an existing C parser. I will also want to ensure that the performance of my code is competitive with the alternatives. Lacking any hardware more powerful than a MacBook Air, I am setting this up with Amazon EC2. To simplify environment setup, I'll be using Chef. I've already set up a Chef repository with configuration logic, and some rudimentary code to streamline launching Ubuntu machines on EC2 and bootstrapping a Chef environment. To save money, I plan to make use of EC2 Spot Instances, which are perfect for instances that only need to run for a few hours for batch tasks.

Clayton Wheeler
cswh at umich.edu


From bonnal at ingm.org  Tue May 22 05:21:42 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Tue, 22 May 2012 11:21:42 +0200
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>
Message-ID: <CBE12746.9A11%bonnal@ingm.org>

Hi Clayton,
Well done, and thanks for your contributions to the BioRuby and JRuby communities.

For your computing issue I have two possible solutions:
1) I can create a VM and give you access; I need to contact my IT department first.
2) Could Amazon provide some VMs for our students?




From p.j.a.cock at googlemail.com  Tue May 22 07:07:15 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 22 May 2012 12:07:15 +0100
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <4F9AFA1F.6030103@med.nyu.edu>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
Message-ID: <CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>

Hi all,

I've CC'd the BioRuby mailing list just to ensure you're aware of the
potentially useful combination of MAF indexing and BGZF compression.
We can continue this on the BioRuby list if more appropriate.

The start of this Biopython-dev thread is here:
http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html

This might be a nice opportunity to combine the work of this year's OBF
Google Summer of Code students - Clayton is doing MAF for BioRuby,
and part of Artem's project could provide BGZF support for BioRuby.

On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak
<andrew.sczesnak at med.nyu.edu> wrote:
> Peter,
>
>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py
>> and I'm willing to do this myself for MAF (while going over your index
>> work - something I want to do anyway). The only potential catch is
>> avoiding offset arithmetic.
>
> I have no problem with you doing this if you're willing. It would be great
> to have some code review of MafIndex as well.

I'm not sure if Clayton will be able to comment on the Python code,
but he should have some thoughts on the MAF indexing itself.
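
For readers unfamiliar with the "offset arithmetic" catch quoted above: BGZF
addresses data with 64-bit virtual offsets, where the upper 48 bits give the
byte position of a block in the compressed file and the lower 16 bits give a
position inside that block once it is decompressed. A minimal Ruby sketch
(the helper names are hypothetical, chosen only for illustration):

```ruby
# Combine a compressed-file block position and an in-block position
# into a single BGZF virtual offset.
def bgzf_virtual_offset(block_start, within_block)
  raise ArgumentError, 'block_start out of range' unless (0...(1 << 48)).cover?(block_start)
  raise ArgumentError, 'within_block out of range' unless (0...(1 << 16)).cover?(within_block)
  (block_start << 16) | within_block
end

# Split a virtual offset back into [block_start, within_block].
def bgzf_split_virtual_offset(voffset)
  [voffset >> 16, voffset & 0xffff]
end

voffset = bgzf_virtual_offset(1_000_000, 42)
p bgzf_split_virtual_offset(voffset)
```

Two virtual offsets can be compared to see which comes first, but
subtracting them does not give a byte count, which is why index code written
for plain files needs care when it is adapted to BGZF.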

Regards,

Peter

From pjotr.public14 at thebird.nl  Tue May 22 11:23:17 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 22 May 2012 17:23:17 +0200
Subject: [BioRuby] BioRuby hitting 20K
Message-ID: <20120522152317.GA30752@thebird.nl>

Looks like we'll have 20K downloads of the bioruby gem by tomorrow
:). Maybe time for a new release?

We are getting a lot more activity anyway - Go BioRuby Go!

Pj.

From mh6 at sanger.ac.uk  Tue May 22 11:32:03 2012
From: mh6 at sanger.ac.uk (Michael Paulini)
Date: Tue, 22 May 2012 16:32:03 +0100
Subject: [BioRuby] BioRuby hitting 20K
In-Reply-To: <20120522152317.GA30752@thebird.nl>
References: <20120522152317.GA30752@thebird.nl>
Message-ID: <4FBBB173.2030001@sanger.ac.uk>

congrats biorubystas :-)

M




From bonnal at ingm.org  Wed May 23 09:24:56 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 23 May 2012 15:24:56 +0200
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
Message-ID: <CBE2B1C8.9A6B%bonnal@ingm.org>

Thanks Peter,
These are valuable hints.




From cswh at umich.edu  Wed May 23 21:35:46 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Wed, 23 May 2012 21:35:46 -0400
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
	<CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
Message-ID: <DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>

On May 22, 2012, at 7:07 AM, Peter Cock wrote:

> Hi all,
> 
> I've CC'd the BioRuby mailing list just to ensure you're aware of the
> potentially useful combination of MAF indexing and BGZF compression.
> We can continue this on the BioRuby list if more appropriate.
> 
> The start of this Biopython-dev thread is here:
> http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html
> 
> This might be a nice opportunity to combine the work of this year's OBF
> Google Summer of Code students - Clayton is doing MAF for BioRuby,
> and part of Artem's project could provide BGZF support for BioRuby.

Indeed, thanks Peter. BGZF sounds like a great approach for MAF compression; I'm just about to start looking into indexing support, and it makes sense to tackle compression in that context.

So far, I think Artem's BGZF implementation is entirely in D; I may just add Ruby support for BGZF separately.
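
As a rough idea of what pure-Ruby BGZF support could look like, here is a
sketch that writes and reads single BGZF blocks using only the standard zlib
bindings. It follows the block layout in the SAM/BAM specification (a gzip
member with a `BC` extra field holding the block size). The method names are
hypothetical, and the reader only accepts blocks whose header is exactly the
canonical 18 bytes.

```ruby
require 'zlib'
require 'stringio'

BGZF_HEADER_FORMAT = 'C4VC2vC2v2' # 18 bytes: gzip header plus the BC subfield

# Compress one chunk of data (at most 0xff00 bytes) into a BGZF block.
def bgzf_write_block(data)
  raise ArgumentError, 'chunk too large for one block' if data.bytesize > 0xff00
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS) # raw deflate
  cdata = deflater.deflate(data, Zlib::FINISH)
  deflater.close
  bsize = cdata.bytesize + 25 # total block size minus one
  header = [31, 139, 8, 4, 0, 0, 255, 6, 66, 67, 2, bsize].pack(BGZF_HEADER_FORMAT)
  header + cdata + [Zlib.crc32(data), data.bytesize].pack('V2')
end

# Read and decompress the next BGZF block from io; returns nil at end of file.
def bgzf_read_block(io)
  header = io.read(18)
  return nil if header.nil? || header.empty?
  raise 'truncated BGZF header' if header.bytesize < 18
  id1, id2, _cm, flg, _mtime, _xfl, _os, xlen, si1, si2, slen, bsize =
    header.unpack(BGZF_HEADER_FORMAT)
  unless id1 == 31 && id2 == 139 && (flg & 4) != 0 &&
         xlen == 6 && si1 == 66 && si2 == 67 && slen == 2
    raise 'not a BGZF block with the canonical header'
  end
  cdata = io.read(bsize - 25)
  crc, isize = io.read(8).unpack('V2')
  inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  data = inflater.inflate(cdata)
  inflater.close
  raise 'BGZF block failed its CRC or size check' unless Zlib.crc32(data) == crc && data.bytesize == isize
  data
end

io = StringIO.new(bgzf_write_block("s hg18.chr7 27578828 38 + 158545518 AAA\n"))
p bgzf_read_block(io)
```

Because every block is an independent gzip member, a reader can seek to the
start of any block and decompress just that block, which is what makes the
format attractive for indexed access.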

> On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak
> <andrew.sczesnak at med.nyu.edu> wrote:
>> Peter,
>> 
>>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py
>>> and I'm willing to do this myself for MAF (while going over your index
>>> work - something I want to do anyway). The only potential catch is
>>> avoiding offset arithmetic.
>> 
>> I have no problem with you doing this if you're willing. It would be great
>> to have some code review of MafIndex as well.
> 
> I'm not sure if Clayton will be able to comment on the Python code,
> but he should have some thoughts on the MAF indexing itself.

I'll definitely be spending more time with that code; it and the bx-python MAF indexing code will be my main reference points for indexed access. It's been a little while, but I have done some Python work in the past, so I should be able to follow along okay. I'll send some comments out in a few days.

Clayton Wheeler
cswh at umich.edu


From mictadlo at gmail.com  Thu May 24 00:30:22 2012
From: mictadlo at gmail.com (Mic)
Date: Thu, 24 May 2012 14:30:22 +1000
Subject: [BioRuby] [GSoC] Weekly report #1
In-Reply-To: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>
References: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>
Message-ID: <CAOP6n=gXanYq7YJ+73tXxkotHR+w17AARCv3bO96ziMSLrRtgQ@mail.gmail.com>

D to Ruby: http://www.swig.org/compare.html



From cjfields at illinois.edu  Thu May 24 01:14:20 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 24 May 2012 05:14:20 +0000
Subject: [BioRuby] [GSoC] Weekly report #1
In-Reply-To: <CAOP6n=gXanYq7YJ+73tXxkotHR+w17AARCv3bO96ziMSLrRtgQ@mail.gmail.com>
References: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>
	<CAOP6n=gXanYq7YJ+73tXxkotHR+w17AARCv3bO96ziMSLrRtgQ@mail.gmail.com>
Message-ID: <BD83611A-BDC6-4844-9C25-B060EFA54A81@illinois.edu>

I think the mentioned D wrappers on the SWIG page are ANSI C/C++ libraries wrapped for D, not D code/libs/etc wrapped for Ruby, unless I'm mistaken...

chris


From cswh at umich.edu  Thu May 24 01:33:40 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 24 May 2012 01:33:40 -0400
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To: <CBE12746.9A11%bonnal@ingm.org>
References: <CBE12746.9A11%bonnal@ingm.org>
Message-ID: <9DBCD042-7086-4F4B-ABB9-1A7F63C089B8@umich.edu>

Thanks for the offers of help, everybody. Raoul, if it's convenient for you to set up a test VM in house, that would probably make the most sense. I don't think it's a pressing need at this point, but let's look into that. 

If we run into issues, we can revisit the EC2 options. (I've had an AWS account too long to qualify for the free usage tier, unfortunately.) An Amazon grant might be worth looking at, especially if we can use it to publicly host, say, BGZF-compressed pre-indexed MAF data sets also. On the other hand, that might be overkill just for my needs; using spot-priced instances, I expect I could do all the testing I need for under $50.

Clayton Wheeler
cswh at umich.edu


From lomereiter at googlemail.com  Thu May 24 01:40:54 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Thu, 24 May 2012 09:40:54 +0400
Subject: [BioRuby] [GSoC] Weekly report #1
In-Reply-To: <BD83611A-BDC6-4844-9C25-B060EFA54A81@illinois.edu>
References: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>
	<CAOP6n=gXanYq7YJ+73tXxkotHR+w17AARCv3bO96ziMSLrRtgQ@mail.gmail.com>
	<BD83611A-BDC6-4844-9C25-B060EFA54A81@illinois.edu>
Message-ID: <CAE8u=e503XzVVyDoSARz6OVMT_MM0WYuFrXpuJy9SKYb5BsNRw@mail.gmail.com>

Chris is right. Currently, it's easier to write everything manually. Once
I've developed some 'best practices', I may put them into compile-time
algorithms and generate bindings from D. (The language has compile-time
introspection but no run-time introspection, probably because that would
hurt performance.)



From lomereiter at googlemail.com  Thu May 24 01:52:42 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Thu, 24 May 2012 09:52:42 +0400
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
	<CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
	<DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>
Message-ID: <CAE8u=e48RD5uKY3t8JJTwUUBXV=vnHwqgd=3viiFpWPAV+O9og@mail.gmail.com>

Hi all,

it's a good point that many line-based formats need some sort of
compression with indexing, and BGZF is good enough in that sense.

> So far, I think Artem's BGZF implementation is entirely in D; I may just
> add Ruby support for BGZF separately.
>

The only problem I see with that approach is that it's hardly possible to
get parallel compression with MRI. But overall I tend to agree with
Clayton. Firstly, it's hard to abstract away a common interface right
now, without first writing some code and looking at it. Secondly, there
are still problems with D shared library support. A GDC developer assured
us that they'll be solved soon, but at the moment the situation is far from
perfect.

From p.j.a.cock at googlemail.com  Thu May 24 05:18:33 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 24 May 2012 10:18:33 +0100
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <CAE8u=e48RD5uKY3t8JJTwUUBXV=vnHwqgd=3viiFpWPAV+O9og@mail.gmail.com>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
	<CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
	<DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>
	<CAE8u=e48RD5uKY3t8JJTwUUBXV=vnHwqgd=3viiFpWPAV+O9og@mail.gmail.com>
Message-ID: <CAKVJ-_5Tvc99ORek8fZttiuv-3L82-nKPbU1v6CbeeWCH1TBhw@mail.gmail.com>

On Thu, May 24, 2012 at 6:52 AM, Artem Tarasov
<lomereiter at googlemail.com> wrote:
> Hi all,
>
> it's a good point that many line-based formats need some sort of compression
> with indexing, and BGZF is good enough in that sense.

BGZF doesn't have to be used with line-based formats, anything
with sequential records would work (like BAM files of course). I've not
tried it to see how well it compressed, but SFF files in BGZF should
work too as another example.

>> So far, I think Artem's BGZF implementation is entirely in D; I may just
>> add Ruby support for BGZF separately.
>
> The only problem I see with that approach is that it's hardly possible to
> get parallel compression with MRI. But overall I tend to agree with Clayton.
> Firstly, it's hard to abstract away some common interface right now, not
> writing any code and looking at it. Secondly, there're still problems with D
> shared library support. We were assured by GDC developer that they'll get
> solved soon, but at the moment the situation is far from perfect.

My BGZF code is pure Python (using C zlib via Python's zlib library),
and does not currently tackle parallel compression or decompression.
There has been recent work in samtools for this.

We don't need parallel compression/decompression of BGZF for it to
be useful.

Peter

From john.woods at marcottelab.org  Thu May 24 10:01:08 2012
From: john.woods at marcottelab.org (John Woods)
Date: Thu, 24 May 2012 09:01:08 -0500
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To: <CBE12746.9A11%bonnal@ingm.org>
References: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>
	<CBE12746.9A11%bonnal@ingm.org>
Message-ID: <CAPkCRRuEZkF7b4K_1zDoyjW-1D_ckhK6K3v+wLex0dD+Arj1tg@mail.gmail.com>

If I can just suggest, there's a startup pitch out there which was formerly
known as Happy Science Coding, now Appsoma, which lets you run Ruby code on
Rackspace instances.

It may or may not be appropriate for what you want to do. It's not EC2, but
it is a VM (right?).

http://appsoma.com/

It's still a bit buggy with Ruby. If you have trouble, email Zack (see the
"About us" page). He's fairly responsive.

John
SciRuby



From mictadlo at gmail.com  Fri May 25 02:49:13 2012
From: mictadlo at gmail.com (Mic)
Date: Fri, 25 May 2012 16:49:13 +1000
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <CAKVJ-_5Tvc99ORek8fZttiuv-3L82-nKPbU1v6CbeeWCH1TBhw@mail.gmail.com>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
	<CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
	<DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>
	<CAE8u=e48RD5uKY3t8JJTwUUBXV=vnHwqgd=3viiFpWPAV+O9og@mail.gmail.com>
	<CAKVJ-_5Tvc99ORek8fZttiuv-3L82-nKPbU1v6CbeeWCH1TBhw@mail.gmail.com>
Message-ID: <CAOP6n=j2z1Ldtfywa8o4FmuLLw+J=uZSsEyG5pbo3-1zfma5iA@mail.gmail.com>

I think Picard tools does parallel compression/decompression of BGZF.

Cheers,
Mic

On Thu, May 24, 2012 at 7:18 PM, Peter Cock <p.j.a.cock at googlemail.com>wrote:

> On Thu, May 24, 2012 at 6:52 AM, Artem Tarasov
> <lomereiter at googlemail.com> wrote:
> > Hi all,
> >
> > it's a good point that many line-based formats need some sort of
> compression
> > with indexing, and BGZF is good enough in that sense.
>
> BGZF doesn't have to be used with line-based formats, anything
> with sequential records would work (like BAM files of course). I've not
> tried it to see how well it compressed, but SFF files in BGZF should
> work too as another example.
>
> >> So far, I think Artem's BGZF implementation is entirely in D; I may just
> >> add Ruby support for BGZF separately.
> >
> > The only problem I see with that approach is that it's hardly possible to
> > get parallel compression with MRI. But overall I tend to agree with
> Clayton.
> > Firstly, it's hard to abstract away some common interface right now, not
> > writing any code and looking at it. Secondly, there're still problems
> with D
> > shared library support. We were assured by GDC developer that they'll get
> > solved soon, but at the moment the situation is far from perfect.
>
> My BGZF code is pure Python (using C zlib via Python's zlib library),
> and does not currently tackle parallel compression or decompression.
> There has been recent work in samtools for this.
>
> We don't need parallel compression/decompression of BGZF for it to
> be useful.
>
> Peter
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
>
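
Editorial sketch (not part of the thread): the discussion above relies on
BGZF being a series of independently compressed blocks, each one a valid
gzip member, which is what makes indexing and random access possible. The
Ruby below is a minimal, single-threaded illustration of that block layout
using only the standard zlib binding. The method names bgzf_block and
each_bgzf_block are hypothetical and do not come from any existing gem. The
reader assumes the "BC" subfield is the only gzip extra subfield, which
holds for blocks written by this sketch but is not guaranteed for files
produced by other tools, and the empty end-of-file marker block is omitted.

```ruby
require "zlib"

# Build one BGZF block: a gzip member whose "BC" extra subfield stores
# the total block size minus one, so a reader can hop from block to block.
def bgzf_block(data)
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS) # raw deflate
  cdata = deflater.deflate(data, Zlib::FINISH)
  deflater.close

  total = 18 + cdata.bytesize + 8 # 18-byte header, payload, 8-byte footer
  raise ArgumentError, "block would exceed 64 KiB" if total > 65_536

  # ID1 ID2 CM FLG | MTIME | XFL OS | XLEN | 'B' 'C' | SLEN | BSIZE
  header = [31, 139, 8, 4, 0, 0, 255, 6, 66, 67, 2, total - 1].pack("C4VC2vC2vv")
  footer = [Zlib.crc32(data), data.bytesize].pack("VV") # CRC32 and ISIZE
  header + cdata + footer
end

# Walk a concatenation of blocks, yielding each block's file offset and
# its decompressed payload. Assumes "BC" is the only extra subfield.
def each_bgzf_block(bytes)
  offset = 0
  while offset < bytes.bytesize
    bsize = bytes.byteslice(offset + 16, 2).unpack1("v") + 1
    cdata = bytes.byteslice(offset + 18, bsize - 26)
    inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
    payload = inflater.inflate(cdata)
    inflater.close
    yield offset, payload
    offset += bsize
  end
end

# Two records, each compressed into its own block.
file = ["record one\n", "record two\n"].map { |r| bgzf_block(r) }.join
each_bgzf_block(file) { |off, payload| puts "block at byte #{off}: #{payload.inspect}" }
```

Because each block records its own size, an index only needs to store block
start offsets (plus an offset inside the decompressed block) to support
seeking, and separate blocks could in principle be compressed in parallel,
which is the point raised about MRI above.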

From cswh at umich.edu  Fri May 25 16:42:13 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Fri, 25 May 2012 16:42:13 -0400
Subject: [BioRuby] New blog post on this week's work
Message-ID: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>

Hi all,

I've written a new blog post on the work I did on my MAF parser this week:

http://csw.github.com/bioruby-maf/blog/2012/05/25/first_milestone/

It covers parser implementation and performance issues, BDD, and tools.

Clayton Wheeler
cswh at umich.edu


From lomereiter at googlemail.com  Sun May 27 14:27:43 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Sun, 27 May 2012 22:27:43 +0400
Subject: [BioRuby] [GSoC] weekly report #2
Message-ID: <CAE8u=e4Yer4BwgG8GGEV5MzcmRjGmgwPuKkyAMALXq9xHYu+Gg@mail.gmail.com>

Hi all,

I wrote a blog post about the past week:
http://lomereiter.wordpress.com/2012/05/27/gsoc-weekly-report-2/

Topics are:
1) I have quite a good validation module for BAM now. More kinds of checks
can be added, just request them :)
2) Also I started to implement random access via BAI file, just because I
mostly finished what I planned for the first two weeks, and random access
seems to be one of the most important things.

Also it's not mentioned in the blog, but I started to work on BGZF gem, as
Pjotr suggested to me. I'll try to document it and publish the first
version next week. Currently I write it in pure Ruby.

From marian.povolny at gmail.com  Sun May 27 15:21:48 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sun, 27 May 2012 21:21:48 +0200
Subject: [BioRuby] GSoC weekly status report No.1.9
Message-ID: <CADKP5CkmqhUwiCnj1ERMP1dcguWKpEnv4XfvYt6kmqjHrv7UDQ@mail.gmail.com>

http://blog.mpthecoder.com/post/23877896288/gsoc-weekly-status-report-no-1-9

This is the final post in 1.x series, I promise.

The last week was spent adding support for parsing lines into records. It
was a lot of work, and when I read the comments from my mentor, I wasn't
happy. But I agree with him: I did make it more complicated than it had to
be (the C API, for example), I should spend some time polishing and
refactoring the D side, and my cucumber features should be split into more
features. So that's the rough plan for the next week.

--
Marjan


From bonnal at ingm.org  Mon May 28 04:50:19 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 28 May 2012 10:50:19 +0200
Subject: [BioRuby] DevTools
In-Reply-To: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>
Message-ID: <CBE908EB.9BBB%bonnal@ingm.org>

In case you want to use RedMine I can give you the license for free, any
bioruby developer can request it.


From p.j.a.cock at googlemail.com  Mon May 28 05:00:30 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 28 May 2012 10:00:30 +0100
Subject: [BioRuby] DevTools
In-Reply-To: <CBE908EB.9BBB%bonnal@ingm.org>
References: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>
	<CBE908EB.9BBB%bonnal@ingm.org>
Message-ID: <CAKVJ-_5hZofR0h=hXMACSEg+wgY4Gj__u8iY=_AkiB5QuY-Tgw@mail.gmail.com>

On Mon, May 28, 2012 at 9:50 AM, Raoul Bonnal <bonnal at ingm.org> wrote:

> In case you want to use RedMine I can give you the license for free, any
> bioruby developer can request it.
>

??? Redmine is licensed under the GPL.

Did you mean admin rights on the OBF RedMine instance, for
example to close bug reports?
https://redmine.open-bio.org/projects/bioruby

Peter

From bonnal at ingm.org  Mon May 28 05:03:01 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 28 May 2012 11:03:01 +0200
Subject: [BioRuby] DevTools
In-Reply-To: <CAKVJ-_5hZofR0h=hXMACSEg+wgY4Gj__u8iY=_AkiB5QuY-Tgw@mail.gmail.com>
Message-ID: <CBE90BE5.9BC2%bonnal@ingm.org>

Ahhhhhhhhhhh


I mean RubyMine

 http://www.jetbrains.com/ruby/

sorry

On 28/05/12 11.00, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:

> 
> 
> On Mon, May 28, 2012 at 9:50 AM, Raoul Bonnal <bonnal at ingm.org> wrote:
>> In case you want to use RedMine I can give you the license for free, any
>> bioruby developer can request it.
> 
> ??? Redmine is licensed under the GPL.
> 
> Did you mean admin rights on the OBF RedMine instance, for
> example to close bug reports?
> https://redmine.open-bio.org/projects/bioruby
> 
> Peter
> 
> 


From francesco.strozzi at gmail.com  Thu May 31 05:11:25 2012
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Thu, 31 May 2012 11:11:25 +0200
Subject: [BioRuby] EU Codefest 2012 Announcement
Message-ID: <CACtet2RtKqXavJ4C7An==MJP0MTW-yeHYwi9xKqWh6RP02Qp6w@mail.gmail.com>

The Open Bioinformatics Foundation (OBF) EU-CodeFest will be held in
Parco Tecnologico Padano (PTP) Lodi, Italy on the 19th - 20th of July.
The CodeFest is a small focused event under the auspices of the Open
Bioinformatics Foundation, and is a sister event of BOSC2012 being
held in California USA this year.
Three main topics will be worked on during the CodeFest:

- NGS and high performance parsers for OpenBio projects.
- RDF and semantic web for bioinformatics.
- Bioinformatics pipelines definition, execution and distribution.

The number of places is limited to 30 participants, on a first come,
first served basis. Undergraduate and PhD students are welcome to
participate.
The cost of the event is EUR 100 per person, which also includes
lunches, coffee breaks and the social dinner on the 19th of July.
For students only, we can sponsor a limited number of attendees, who
will not pay the registration fee. Students who wish to attend the
event for free will be asked to submit their qualifications and
experience in software development. The organizing committee will
review students' applications before final acceptance.
Talks and abstracts may be presented during the CodeFest in sessions
of 10 minutes plus questions. Coding activities will continue during
the talks.

The City of Lodi is very close to Milano and has good hotel
facilities. The connections by air are excellent, via Milano Malpensa,
Milano Linate and Bergamo Orio Al Serio airports.

Please register soon using the form at this page
http://tecnoparco.org/codefest, places may run out quickly.


-- 

Francesco


From marian.povolny at gmail.com  Sat May  5 13:07:30 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sat, 5 May 2012 15:07:30 +0200
Subject: [BioRuby] GSoC weekly status report No.1
Message-ID: <CADKP5C=v+GH4XsWZmhDz3feDtvkyy6KTkA1-bOsAtGBdbnNNzw@mail.gmail.com>

Hello all,

It might be a little early, but there has been so much going on in the last
10 days since the results of GSoC were published...

http://blog.mpthecoder.com/post/22380853664/gsoc-weekly-status-report-no-1

A short summary:

It has been 10 days since the GSoC results were published, and a lot has
happened since then. I got to know the other students and mentors in a
longish meeting on Google hangout, I got into a discussion with my mentor
on IRC in which we didn't agree about the parallelization strategy for the
parser (experiments will show who's right) and my inbox is full of mails
from my mentor and other students, in which we exchanged loads of
interesting ideas. Also, I solved a bug in biogems.info website, which was
stopping Pjotr from updating the website with new information about biogems.

There is now a GitHub repository for my project:

https://github.com/mamarjan/bioruby-hpc-gff3

The work for the first week of coding is halfway done too.

There seems to be huge interest for a GFF3 parser with more features, like
indexing, random access and writing output, and also support for linking
into trees of features that are not located close to each other in the
file. A fast sequential parser could be used to generate indexes, and the
lower-level parts can be used to reorder the file for faster future usage.
Based on that, I think this project is a good start.
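
Editorial sketch (not part of the original post): to make the feature list
above concrete, here is a minimal Ruby illustration of sequential GFF3
parsing and of linking features to their parents through the ID and Parent
attributes, even when they are far apart in the file. It is not the
project's D implementation. The names Feature, parse_gff3_line and
children_by_parent are hypothetical, and percent-decoding of escaped
characters is left out for brevity.

```ruby
# One GFF3 record: nine tab-separated columns. The fifth column is called
# "end" in the format; it is named :stop here to avoid the Ruby keyword.
Feature = Struct.new(:seqid, :source, :type, :start, :stop,
                     :score, :strand, :phase, :attributes)

# Parse one line; returns nil for blank lines, comments and directives.
def parse_gff3_line(line)
  line = line.chomp
  return nil if line.empty? || line.start_with?("#")

  cols = line.split("\t", -1)
  raise ArgumentError, "expected 9 columns, got #{cols.size}" unless cols.size == 9

  attributes = {}
  unless cols[8] == "."
    cols[8].split(";").each do |pair|
      next if pair.empty?
      key, value = pair.split("=", 2)
      attributes[key] = value.to_s
    end
  end

  Feature.new(cols[0], cols[1], cols[2], Integer(cols[3]), Integer(cols[4]),
              cols[5], cols[6], cols[7], attributes)
end

# Index features by the ID(s) listed in their Parent attribute, so that a
# parent's children can be found wherever they appear in the file.
def children_by_parent(features)
  index = Hash.new { |hash, key| hash[key] = [] }
  features.each do |feature|
    feature.attributes.fetch("Parent", "").split(",").each do |parent_id|
      index[parent_id] << feature
    end
  end
  index
end

sample = [
  "##gff-version 3",
  "chr1\t.\tgene\t1000\t9000\t.\t+\t.\tID=gene1;Name=EDEN",
  "chr1\t.\tmRNA\t1050\t9000\t.\t+\t.\tID=mrna1;Parent=gene1",
  "chr1\t.\texon\t1300\t1500\t.\t+\t.\tID=exon1;Parent=mrna1"
]
features = sample.map { |line| parse_gff3_line(line) }.compact
children_by_parent(features).each do |parent_id, kids|
  puts "#{parent_id}: #{kids.map(&:type).join(', ')}"
end
```

A single sequential pass like this is also the natural place to record byte
offsets per feature ID if an on-disk index for random access is wanted.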

*I would like to ask you, if you're using the GFF3/GTF file formats in your
research, to send me example files and descriptions of how your
applications are using the data. This way I'll be able to test the parser
against your files and optimize it for your applications. Currently I have
GFF files from Ensembl and Wormbase, and Pjotr pointed me to the genome
browser web application at wormbase.org.*

--
Marjan


From lomereiter at googlemail.com  Sun May  6 19:56:50 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Sun, 6 May 2012 23:56:50 +0400
Subject: [BioRuby] [GSoC][BAM] Weekly report No. 0
Message-ID: <CAE8u=e6vV3Ost-gsWHxLjLxbzg4OoktJuRD2U7Y_2bdMuFFhrw@mail.gmail.com>

Hi all,

I wrote a few words about what I've done last week:
http://lomereiter.wordpress.com/2012/05/06/gsoc-weekly-report-0/

Summary:

The code is available at github: https://github.com/lomereiter/BAMread/
I already started to write code planned for the first week so as to have
more time in June for exam preparation.
Opening BAM and parsing SAM header works, and is available from Ruby, and
now I need to write some tests and documentation. Also, I described
some compile-time metaprogramming tricks in D which I use to reduce
duplication in the code.


I'd be grateful for some small BAM files, 1-50 kilobytes in size, with
non-empty headers, for testing purposes.


--
Artem


From bonnal at ingm.org  Mon May  7 07:08:53 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 07 May 2012 09:08:53 +0200
Subject: [BioRuby] [GSoC] BioRuby wiki
In-Reply-To: <CAE8u=e6vV3Ost-gsWHxLjLxbzg4OoktJuRD2U7Y_2bdMuFFhrw@mail.gmail.com>
Message-ID: <CBCD41A5.963E%bonnal@ingm.org>

Dear All,
BioRuby wiki is up to date with the accepted projects. I created new pages
for each accepted project ( just created ). Are we going to keep it up to
date with results and summarizing blog posts or what ?

--
Ra


From p.j.a.cock at googlemail.com  Mon May  7 07:31:09 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 7 May 2012 08:31:09 +0100
Subject: [BioRuby] [GSoC] BioRuby wiki
In-Reply-To: <CBCD41A5.963E%bonnal@ingm.org>
References: <CAE8u=e6vV3Ost-gsWHxLjLxbzg4OoktJuRD2U7Y_2bdMuFFhrw@mail.gmail.com>
	<CBCD41A5.963E%bonnal@ingm.org>
Message-ID: <CAKVJ-_4QNw3r3w+=OVNrhyUSkjXSjSuX9hzM4iM41ShX2+EBtg@mail.gmail.com>

On Monday, May 7, 2012, Raoul Bonnal wrote:

> Dear All,
> BioRuby wiki is up to date with the accepted projects. I created new pages
> for each accepted project ( just created ). Are we going to keep it up to
> date with results and summarizing blog posts or what ?
>
>
Blog posts (sent to the mailing list too) for weekly updates,
but more static wiki page for summary? You can link to the
blog posts from the wiki too.

Peter


From pjotr.public14 at thebird.nl  Mon May  7 07:49:09 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 7 May 2012 09:49:09 +0200
Subject: [BioRuby] [GSoC] BioRuby wiki
In-Reply-To: <CBCD495C.9641%bonnal@ingm.org>
References: <CAKVJ-_4QNw3r3w+=OVNrhyUSkjXSjSuX9hzM4iM41ShX2+EBtg@mail.gmail.com>
	<CBCD495C.9641%bonnal@ingm.org>
Message-ID: <20120507074909.GB30679@thebird.nl>

I was thinking to add news items to biogems.info, and its RSS feed.
That gets updated a few times a day. Anyone interested in helping
out? Should be straightforward:

- Add YAML ./etc/blogs.yaml with links to BLOG RSS feeds
- Write script to fetch these and merge it with the RSS for biogems

That would give us a new RSS feed. Useful. Next step:

- Add news column on main http://biogems.info/ page
- Fill it with same RSS items

Later I would also like to add a list of active pushes to projects
(github style). But that is later.

Pj.
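
Editorial sketch (not part of the original message): the merge step of the
plan above could look roughly like the Ruby below. The layout of
etc/blogs.yaml, the example feed URLs and the method name merge_items are
all assumptions made for illustration, and fetching and parsing the feeds
is deliberately left out: the sketch starts from items that have already
been reduced to a title, a link and an RFC 822 date string.

```ruby
require "yaml"
require "time"

# Hypothetical contents of etc/blogs.yaml: one entry per blog feed.
BLOGS_YAML = <<~YAML
  - name: Example blog A
    feed: http://example.org/a/rss.xml
  - name: Example blog B
    feed: http://example.org/b/rss.xml
YAML

# Merge already-parsed items from several feeds into one list:
# drop repeated links, sort newest first, keep at most `limit` items.
def merge_items(*feeds, limit: 20)
  feeds.flatten
       .uniq { |item| item[:link] }
       .sort_by { |item| Time.rfc2822(item[:date]) }
       .reverse
       .first(limit)
end

blogs = YAML.safe_load(BLOGS_YAML)
puts "configured feeds: #{blogs.map { |blog| blog['feed'] }.join(', ')}"

feed_a = [{ title: "Post A", link: "http://example.org/a/1",
            date: "Sat, 05 May 2012 15:07:30 +0200" }]
feed_b = [{ title: "Post B", link: "http://example.org/b/1",
            date: "Sun, 06 May 2012 23:56:50 +0400" }]
merge_items(feed_a, feed_b).each { |item| puts "#{item[:date]}  #{item[:title]}" }
```

Keeping the merge independent of the fetching code means the same list can
feed both the combined RSS output and a news column on the main page.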

On Mon, May 07, 2012 at 09:41:48AM +0200, Raoul Bonnal wrote:
>    Fine.
>    On 07/05/12 09.31, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:
> 
>      On Monday, May 7, 2012, Raoul Bonnal wrote:
> 
>      Dear All,
>      BioRuby wiki is up to date with the accepted projects. I created new
>      pages for each accepted project ( just created ). Are we going to keep
>      it up to date with results and summarizing blog posts or what ?
> 
>      Blog posts (sent to the mailing list too) for weekly updates,
>      but more static wiki page for summary? You can link to the
>      blog posts from the wiki too.
>      Peter


From bonnal at ingm.org  Mon May  7 07:41:48 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 07 May 2012 09:41:48 +0200
Subject: [BioRuby] [GSoC] BioRuby wiki
In-Reply-To: <CAKVJ-_4QNw3r3w+=OVNrhyUSkjXSjSuX9hzM4iM41ShX2+EBtg@mail.gmail.com>
Message-ID: <CBCD495C.9641%bonnal@ingm.org>

Fine.


On 07/05/12 09.31, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:

> 
> 
> On Monday, May 7, 2012, Raoul Bonnal  wrote:
>> Dear All,
>> BioRuby wiki is up to date with the accepted projects. I created new pages
>> for each accepted project ( just created ). Are we going to keep it up to
>> date with results and summarizing blog posts or what ?
>> 
> 
> Blog posts (sent to the mailing list too) for weekly updates,
> but more static wiki page for summary? You can link to the
> blog posts from the wiki too.
> 
> 
> 
> Peter
> 


From john.woods at marcottelab.org  Tue May  8 22:08:47 2012
From: john.woods at marcottelab.org (John Woods)
Date: Tue, 8 May 2012 17:08:47 -0500
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
Message-ID: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>

Hi BioRuby folks,

I'm pleased to announce that we've opened applications for our first ever
Summer of Code, generously sponsored by Brighter Planet.

http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/

Please note that we recommend you have your application in by *Monday*,
which is really soon.

Help us out by sharing this around on various social media. Here are links
to existing tweets/posts/etc that you can retweet/share/etc.

Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528
Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH
Reddit:
http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/

Cheers,
John Woods
Director, SciRuby Project


From pjotr.public14 at thebird.nl  Wed May  9 06:43:08 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 08:43:08 +0200
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>
References: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>
Message-ID: <20120509064308.GA24946@thebird.nl>

Hi John,

That is awesome news! Google has set a right trend with these summer
of code initiatives. The OBF has quite some experience with mentoring
students, see

  http://www.open-bio.org/wiki/Gsoc#Student_Progress_Reports

and one thing we think is very important is weekly meetings
between students (and mentors), and weekly blogs by the students.
These will be captured on http://biogems.info/.

It would be great if your students participated in some of our meetings,
so we can exchange ideas on Ruby and performance (we use extensions
and parallel computing). Also I would like to invite your programme
to blog, so that we can track those blogs.

Pj.

On Tue, May 08, 2012 at 05:08:47PM -0500, John Woods wrote:
> Hi BioRuby folks,
> 
> I'm pleased to announce that we've opened applications for our first ever
> Summer of Code, generously sponsored by Brighter Planet.
> 
> http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/
> 
> Please note that we recommend you have your application in by *Monday*,
> which is really soon.
> 
> Help us out by sharing this around on various social media. Here are links
> to existing tweets/posts/etc that you can retweet/share/etc.
> 
> Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528
> Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH
> Reddit:
> http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/
> 
> Cheers,
> John Woods
> Director, SciRuby Project
> 


From pjotr.public14 at thebird.nl  Wed May  9 17:14:49 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 19:14:49 +0200
Subject: [BioRuby] BioRuby on Travis-ci!
Message-ID: <20120509171449.GA29529@thebird.nl>

Hi,

Some have maybe noticed Goto-san put BioRuby on travis-ci now! See

  http://travis-ci.org/#!/bioruby/bioruby

You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
failure.  JRuby fails on a handful of tests and the crash on Rubinius
looks spectacular. 

Note the clever .travis.yml file.

We invite you to submit fixes to these tests. Especially our GSoC
students, and other students on this ML, can get honors by providing
a few fixes, and/or sending in issues to the JRuby/Rubinius projects
:). Note both JRuby and Rubinius come with very interesting debugger
support. Worth a shot. Your chance to show your Ruby muscles!

Pj.


From p.j.a.cock at googlemail.com  Wed May  9 17:26:31 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 May 2012 18:26:31 +0100
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509171449.GA29529@thebird.nl>
References: <20120509171449.GA29529@thebird.nl>
Message-ID: <CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>

On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> Hi,
>
> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See
>
>   http://travis-ci.org/#!/bioruby/bioruby
>
> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
> failure.  JRuby fails on a handful of tests and the crash on Rubinius
> looks spectacular.
>
> Note the clever .travis.yml file.
>
> We invite you to submit fixes to these tests. Especially our GSoC
> students, and other students on this ML, can get honors by providing
> a few fixes, and/or sending in issues to the JRuby/Rubinius projects
> :). Note both JRuby and Rubinius come with very interesting debugger
> support. Worth a shot. Your chance to show your Ruby muscles!
>
> Pj.

And if you can fix the different bug identified via the BuildBot too, even
better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html

Starting from a clean nightly test result makes spotting regressions
much easier ;)

Peter


From pjotr.public14 at thebird.nl  Wed May  9 17:32:39 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 19:32:39 +0200
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
Message-ID: <20120509173239.GA30220@thebird.nl>

Right, the link is here

  http://testing.open-bio.org/bioruby/one_line_per_build

(I need to incorporate this also in http://biogems.info/)

On Wed, May 09, 2012 at 06:26:31PM +0100, Peter Cock wrote:
> And if you can fix the different bug identified via the BuildBot too, even
> better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html
> 
> Starting from a clean nightly test result makes spotting regressions
> much easier ;)
> 
> Peter
> 


From cjfields at illinois.edu  Wed May  9 17:29:49 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Wed, 9 May 2012 17:29:49 +0000
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
Message-ID: <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu>

On May 9, 2012, at 12:26 PM, Peter Cock wrote:

> On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
>> Hi,
>> 
>> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See
>> 
>>  http://travis-ci.org/#!/bioruby/bioruby
>> 
>> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
>> failure.  JRuby fails on a handful of tests and the crash on Rubinius
>> looks spectacular.
>> 
>> Note the clever .travis.yml file.
>> 
>> We invite you to submit fixes to these tests. Especially our GSoC
>> students, and other students on this ML, can get honors by providing
>> a few fixes, and/or sending in issues to the JRuby/Rubinius projects
>> :). Note both JRuby and Rubinius come with very interesting debugger
>> support. Worth a shot. Your chance to show your Ruby muscles!
>> 
>> Pj.
> 
> And if you can fix the different bug identified via the BuildBot too, even
> better: http://lists.open-bio.org/pipermail/bioruby/2012-April/002231.html
> 
> Starting from a clean nightly test result makes spotting regressions
> much easier ;)
> 
> Peter

*sigh*

Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P

chris


From pjotr.public14 at thebird.nl  Wed May  9 17:35:17 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 19:35:17 +0200
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_63W-BPuzL8dBxBo8vPRgenHeae2QtgrXWFG30K=vDg4w@mail.gmail.com>
	<31802420-F1B1-4473-8391-6830672AA7AB@illinois.edu>
Message-ID: <20120509173517.GB30220@thebird.nl>

On Wed, May 09, 2012 at 05:29:49PM +0000, Fields, Christopher J wrote:
> *sigh*
> 
> Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P

Peter knows someone in Scotland who can help! Now I got to see a man
about a sheep...

Pj.


From p.j.a.cock at googlemail.com  Wed May  9 17:49:59 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 May 2012 18:49:59 +0100
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509171449.GA29529@thebird.nl>
References: <20120509171449.GA29529@thebird.nl>
Message-ID: <CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>

On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> Hi,
>
> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See
>
>   http://travis-ci.org/#!/bioruby/bioruby
>
> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
> failure.  JRuby fails on a handful of tests and the crash on Rubinius
> looks spectacular.
>
> Note the clever .travis.yml file.
>
> We invite you to submit fixes to these tests. Especially our GSoC
> students, and other students on this ML, can get honors by providing
> a few fixes, and/or sending in issues to the JRuby/Rubinius projects
> :). Note both JRuby and Rubinius come with very interesting debugger
> support. Worth a shot. Your chance to show your Ruby muscles!
>
> Pj.

I see Travis supports Perl, Python and Java too (amongst others)
so could be used by the other Bio* projects too for nightly testing
(on a 32bit Debian Linux platform).

How did you do this in Travis regarding the GitHub authorization?
I don't see any way when logged in as me (peterjc) to allow Travis
access to the repositories of GitHub organizations I have access
to (like Biopython).

Thanks,

Peter


From p.j.a.cock at googlemail.com  Wed May  9 17:56:17 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 May 2012 18:56:17 +0100
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
Message-ID: <CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>

On Wed, May 9, 2012 at 6:49 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Wed, May 9, 2012 at 6:14 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
>> Hi,
>>
>> Some have maybe noticed Goto-san put BioRuby on travis-ci now! See
>>
>>   http://travis-ci.org/#!/bioruby/bioruby
>>
>> You can see MRI 1.9.x passes, and 1.8.7 has only a small unit test
>> failure.  JRuby fails on a handful of tests and the crash on Rubinius
>> looks spectacular.
>>
>> Note the clever .travis.yml file.
>>
>> We invite you to submit fixes to these tests. Especially our GSoC
>> students, and other students on this ML, can get honors by providing
>> a few fixes, and/or sending in issues to the JRuby/Rubinius projects
>> :). Note both JRuby and Rubinius come with very interesting debugger
>> support. Worth a shot. Your chance to show your Ruby muscles!
>>
>> Pj.
>
> I see Travis supports Perl, Python and Java too (amongst others)
> so could be used by the other Bio* projects too for nightly testing
> (on a 32bit Debian Linux platform).
>
> How did you do this in Travis regarding the GitHub authorization?
> I don't see any way when logged in as me (peterjc) to allow Travis
> access to the repositories of GitHub organizations I have access
> to (like Biopython).

I found there is an open issue on this missing feature:
https://github.com/travis-ci/travis-ci/issues/242

There a comment links to a manual workaround:
http://about.travis-ci.org/docs/user/how-to-setup-and-trigger-the-hook-manually/

I'm guessing that's how you did it for BioRuby?

Thanks,

Peter


From mail at michaelbarton.me.uk  Wed May  9 18:24:54 2012
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Wed, 9 May 2012 14:24:54 -0400
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509171449.GA29529@thebird.nl>
References: <20120509171449.GA29529@thebird.nl>
Message-ID: <20120509182454.GA4429@bartonh-mbp-01.uanet.edu>

Travis CI is also rolling out a new feature where pull
requests on github are automatically tested using the specs
in the upstream merge. This can make it much easier to spot
broken builds (and vice versa) before they are merged into
the blessed branch.

http://about.travis-ci.org/blog/announcing-pull-request-support/

On Wed, May 09, 2012 at 07:14:49PM +0200, Pjotr Prins wrote:

> Hi,
>
> Some have maybe noticed Goto-san put BioRuby on travis-ci
> now! See
>
>   http://travis-ci.org/#!/bioruby/bioruby
>
> You can see MRI 1.9.x passes, and 1.8.7 has only a small
> unit test failure. JRuby fails on a handful of tests and
> the crash on Rubinius looks spectacular.
>
> Note the clever .travis.yml file.
>
> We invite you to submit fixes to these tests. Especially
> our GSoC students, and other students on this ML, can get
> honors by providing a few fixes, and/or sending in issues
> to the JRuby/Rubinius projects :). Note both JRuby and
> Rubinius come with very interesting debugger support.
> Worth a shot. Your chance to show your Ruby muscles!
>
> Pj.

From john.woods at marcottelab.org  Wed May  9 19:25:38 2012
From: john.woods at marcottelab.org (John Woods)
Date: Wed, 9 May 2012 14:25:38 -0500
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To: <20120509064308.GA24946@thebird.nl>
References: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>
	<20120509064308.GA24946@thebird.nl>
Message-ID: <CAPkCRRs8tqdFXu8ky+hRjwqhy=9u41rmjg9vuOKKQtLFgx20bQ@mail.gmail.com>

Hi Pjotr,

I'll discuss having our fellow participate in some of your meetings with
the SciRuby team. I think the weekly meetings suggestion is a very good
one, and we definitely do pay attention to how BioRuby handles its GSoC
fellows.

We do blog periodically. You can find it here: http://sciruby.com/blog/
I'll make sure that blogging is also a requirement for our fellow.

Cheers,
John


On Wed, May 9, 2012 at 1:43 AM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:

> Hi John,
>
> That is awesome news! Google has set a right trend with these summer
> of code initiatives. The OBF has quite some experience with mentoring
> students, see
>
>  http://www.open-bio.org/wiki/Gsoc#Student_Progress_Reports
>
> and one thing we think is very important is weekly meetings
> between students (and mentors), and weekly blogs by the students.
> These will be captured on http://biogems.info/.
>
> It would be great your students participate in some of our meetings,
> so we can exchange ideas on Ruby and performance (we use extensions
> and parallel computing). Also I would like to invite your programme
> to blog, and that we track those blogs.
>
> Pj.
>
> On Tue, May 08, 2012 at 05:08:47PM -0500, John Woods wrote:
> > Hi BioRuby folks,
> >
> > I'm pleased to announce that we've opened applications for our first ever
> > Summer of Code, generously sponsored by Brighter Planet.
> >
> > http://sciruby.com/blog/2012/05/08/sciruby-summer-of-code/
> >
> > Please note that we recommend you have your application in by *Monday*,
> > which is really soon.
> >
> > Help us out by sharing this around on various social media. Here are
> > links to existing tweets/posts/etc that you can retweet/share/etc.
> >
> > Twitter: https://twitter.com/#!/SciRuby/status/199982870129942528
> > Google+: https://plus.google.com/109304769076178160953/posts/c4gT5y24LLH
> > Reddit:
> > http://www.reddit.com/r/ruby/comments/tdm7e/sciruby_announcing_sciruby_summer_coding/
> >
> > Cheers,
> > John Woods
> > Director, SciRuby Project
> >
>


From p.j.a.cock at googlemail.com  Wed May  9 17:44:37 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 May 2012 18:44:37 +0100
Subject: [BioRuby] BioPerl BuildBot
Message-ID: <CAKVJ-_50-xSd2768Q9snegfs-hZ6Wb3iooYpSymxMAB6k5LN9A@mail.gmail.com>

Hi all,

I've retitled this and sent it to the BioPerl list, continuing from
this thread on
the BioRuby list:

http://lists.open-bio.org/pipermail/bioruby/2012-May/002247.html

On Wed, May 9, 2012 at 6:35 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> On Wed, May 09, 2012 at 05:29:49PM +0000, Fields, Christopher J wrote:
>> *sigh*
>>
>> Anyone know of a way I can clone myself a few times, so one of my clones can get bioperl set up on buildbot? :P
>
> Peter knows someone in Scotland who can help! Now I got to see a man
> about a sheep...
>
> Pj.

You mean Dolly The Sheep? ;)

Tiago or I can assist on the BuildBot server side for BioPerl - in fact Tiago
had already made a start (CC'd).

We'll need help from a BioPerl developer with a spare machine or two
to use as a buildslave (and I can probably borrow some of my employer's
which are already running nightly tests) to help with how we set up the
BuildSlaves - essentially how to get BioPerl and relevant dependencies
installed, and then what needs to be done from a fresh git checkout to
build and run the tests. Tiago has got this currently:

perl Build.PL --accepts
./Build test

Once that is working on a single buildslave we can talk about different
targets which is where BuildBot is really helpful (e.g. versions of Perl,
different OS, etc)
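
Editorial sketch (not part of the original message): before wiring the two
commands above into BuildBot, a buildslave owner could check them locally
with a small Ruby wrapper that runs each step in order inside the checkout
and reports the first one that fails. The names BIOPERL_STEPS and
first_failing_step are hypothetical; only the two commands themselves come
from the message above.

```ruby
# The two steps quoted above, as argument vectors (no shell involved).
BIOPERL_STEPS = [
  ["perl", "Build.PL", "--accepts"],
  ["./Build", "test"]
].freeze

# Run the steps in order inside `dir`. Returns the first step that fails
# (non-zero exit status or command not found), or nil if all succeed.
# Later steps are not run once one has failed.
def first_failing_step(steps, dir: ".")
  steps.find { |command| !system(*command, chdir: dir) }
end
```

A call such as first_failing_step(BIOPERL_STEPS, dir: "path/to/checkout")
would then return nil on success or the failing command on error, where
the checkout path is a placeholder for wherever the fresh git clone lives.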

Peter


From pjotr.public14 at thebird.nl  Wed May  9 21:31:58 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 9 May 2012 23:31:58 +0200
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
	<CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
Message-ID: <20120509213158.GB31329@thebird.nl>

On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote:
> I'm guessing that's how you did it for BioRuby?

I think I added it before we were a github organization. Or we were
just lucky :)

Pj.


From pjotr.public14 at thebird.nl  Thu May 10 07:27:47 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 10 May 2012 09:27:47 +0200
Subject: [BioRuby] BioRuby rss news feed
Message-ID: <20120510072747.GA4587@thebird.nl>

Marjan and I have revamped the BioRuby/biogems news feed. See

  http://www.biogems.info/rss.xml

Health warning: Includes opinionated and caffeinated Google Summer of Code
blog entries :)

Pj.


From p.j.a.cock at googlemail.com  Thu May 10 10:31:07 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 10 May 2012 11:31:07 +0100
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <20120509213158.GB31329@thebird.nl>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
	<CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
	<20120509213158.GB31329@thebird.nl>
Message-ID: <CAKVJ-_7FBtKJee57==o5S5RYjr16CQStUgK2w0qVmrrvpLOAgg@mail.gmail.com>

On Wed, May 9, 2012 at 10:31 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote:
>> I'm guessing that's how you did it for BioRuby?
>
> I think I added it before we were a github organization. Or we were
> just lucky :)
>
> Pj.

I'd guess the former - I've now got a personal Travis account (via my
personal GitHub account), but for now I can't seem to create a Biopython
Travis account via the Biopython organization account on GitHub.

Nevertheless, I could get the basic Biopython unit tests running on
Travis last night (including Python 3), although this needs more
work installing dependencies to get the full test suite coverage:
http://travis-ci.org/#!/peterjc/biopython

Peter


From pjotr.public14 at thebird.nl  Thu May 10 16:40:02 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 10 May 2012 18:40:02 +0200
Subject: [BioRuby] BioRuby rss news feed
In-Reply-To: <20120510072747.GA4587@thebird.nl>
References: <20120510072747.GA4587@thebird.nl>
Message-ID: <20120510164002.GA9030@thebird.nl>

http://www.biogems.info/ also shows news items and blog entries on the
right now. If you want your blog on Bio/Ruby added, just tell us :)

Pj.

On Thu, May 10, 2012 at 09:27:47AM +0200, Pjotr Prins wrote:
> Marjan and I have revamped the BioRuby/biogems news feed. See
> 
>   http://www.biogems.info/rss.xml
> 
> Health warning: Includes opiniated and caffeenated Google Summer of Code
> blog entries :)
> 
> Pj.


From georgkam at gmail.com  Thu May 10 17:34:14 2012
From: georgkam at gmail.com (George Githinji)
Date: Thu, 10 May 2012 20:34:14 +0300
Subject: [BioRuby] BioRuby rss news feed
In-Reply-To: <20120510072747.GA4587@thebird.nl>
References: <20120510072747.GA4587@thebird.nl>
Message-ID: <CALf85+WGz=xA7ROXrA4JO_VFgJ2LM_9dipH36e=ia5qA3LfCcg@mail.gmail.com>

Thanks for all the hard work!

On Thu, May 10, 2012 at 10:27 AM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> Marjan and I have revamped the BioRuby/biogems news feed. See
>
>   http://www.biogems.info/rss.xml
>
> Health warning: Includes opiniated and caffeenated Google Summer of Code
> blog entries :)
>
> Pj.


-- 
---------------
Sincerely
George
Skype: george_g2
Blog: http://biorelated.wordpress.com/
Twitter: http://twitter.com/#!/george_l


From pjotr.public14 at thebird.nl  Fri May 11 09:06:48 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 11 May 2012 11:06:48 +0200
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To: <CAPkCRRs8tqdFXu8ky+hRjwqhy=9u41rmjg9vuOKKQtLFgx20bQ@mail.gmail.com>
References: <CAPkCRRv0yVhvBs1JJvkJZBWvoLCD4ss-TGzCFKZKG2KaJ6VMWw@mail.gmail.com>
	<20120509064308.GA24946@thebird.nl>
	<CAPkCRRs8tqdFXu8ky+hRjwqhy=9u41rmjg9vuOKKQtLFgx20bQ@mail.gmail.com>
Message-ID: <20120511090648.GA15897@thebird.nl>

We can now list non-biogem rubygems.

SciRuby is listed on http://www.biogems.info/rubygems.html

Pj.


From bonnal at ingm.org  Fri May 11 10:58:44 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Fri, 11 May 2012 12:58:44 +0200
Subject: [BioRuby] Announcing the SciRuby Summer Coding Fellowship
In-Reply-To: <20120511090648.GA15897@thebird.nl>
Message-ID: <CBD2BD84.97F0%bonnal@ingm.org>

+1 :)


On 11/05/12 11.06, "Pjotr Prins" <pjotr.public14 at thebird.nl> wrote:

> We can now list non-biogem rubygems.
> 
> SciRuby is listed on http://www.biogems.info/rubygems.html
> 
> Pj.


From throwern at msu.edu  Fri May 11 14:20:19 2012
From: throwern at msu.edu (Nick Thrower)
Date: Fri, 11 May 2012 10:20:19 -0400
Subject: [BioRuby] BioTabix gem
Message-ID: <A824F904-5C6D-4BA5-88CF-A5F2CDFA6643@msu.edu>

Hello all,

I recently released a bio-tabix gem.

It is available on rubygems: 
https://rubygems.org/gems/bio-tabix

and Github: 
https://github.com/throwern/bio-tabix

The gem binds ruby to the samtools tabix utility for indexing and parsing regions of tab delimited files. http://samtools.sourceforge.net/tabix.shtml

Feel free to contact me with any comments or suggestions. 

Best,
Nick

-- 
Nick Thrower
Information Technology Professional
Michigan State University
Great Lakes Bioenergy Research Center


From pjotr.public14 at thebird.nl  Fri May 11 15:43:49 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 11 May 2012 17:43:49 +0200
Subject: [BioRuby] BioTabix gem
In-Reply-To: <A824F904-5C6D-4BA5-88CF-A5F2CDFA6643@msu.edu>
References: <A824F904-5C6D-4BA5-88CF-A5F2CDFA6643@msu.edu>
Message-ID: <20120511154349.GB17747@thebird.nl>

Super :)

On Fri, May 11, 2012 at 10:20:19AM -0400, Nick Thrower wrote:
> Hello all,
> 
> I recently released a bio-tabix gem.
> 
> It is available on rubygems: 
> https://rubygems.org/gems/bio-tabix
> 
> and Github: 
> https://github.com/throwern/bio-tabix
> 
> The gem binds ruby to the samtools tabix utility for indexing and parsing regions of tab delimited files. http://samtools.sourceforge.net/tabix.shtml
> 
> Feel free to contact me with any comments or suggestions. 
> 
> Best,
> Nick
> 
> -- 
> Nick Thrower
> Information Technology Professional
> Michigan State University
> Great Lakes Bioenergy Research Center


From cswh at umich.edu  Sat May 12 01:21:02 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Fri, 11 May 2012 21:21:02 -0400
Subject: [BioRuby] Submitted JRuby bug and RubySpec addition for unit test
	failures under JRuby
Message-ID: <57CFFD67-58BC-41AD-87E8-9C70A0A7AC97@umich.edu>

Hi all,

I've noticed that many of the BioRuby unit tests are failing under JRuby, locally and on travis-ci, with NameErrors for 'uninitialized constant' conditions. Many of these tests work when running just a single test script in isolation, but fail when the full suite is run with 'rake test'. 

I've identified the root cause of this problem, which appears to be a JRuby bug triggered when an autoload entry is defined, the file which would have been autoloaded is explicitly required, and the autoload entry is defined again. Subsequent attempts to access the target of the autoload entry fail with a NameError.

This is an unusual sequence of events, but BioRuby and its test suites contain many 'horizontal' autoload entries between various parts of the source tree. For instance, bio/sequence/common.rb sets up an autoload for Bio::Locations, which I observed causing a problem with subsequent use of Bio::Locations.
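
The sequence described above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not the RubySpec from the pull request: the AutoloadDemo namespace and the temporary file name are hypothetical stand-ins for Bio and bio/location.rb. The helper returns true when the constant is still reachable after the define / require / redefine sequence, which is the behaviour reported for MRI, and false if the access raises NameError, which is the failure reported for the affected JRuby versions.

```ruby
require 'tmpdir'

# Hypothetical namespace standing in for Bio; not part of BioRuby.
module AutoloadDemo; end

# Runs the define / require / redefine sequence and reports whether the
# autoload target is still reachable afterwards.
def autoload_sequence_ok?
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'autoload_demo_locations.rb')
    File.write(path, "module AutoloadDemo\n  class Locations; end\nend\n")

    AutoloadDemo.autoload(:Locations, path) # 1. autoload entry is defined
    require path                            # 2. target file is required explicitly
    AutoloadDemo.autoload(:Locations, path) # 3. autoload entry is defined again

    begin
      AutoloadDemo::Locations.is_a?(Class)  # 4. access the autoload target
    rescue NameError
      false                                 # the failure mode described above
    end
  end
end

puts autoload_sequence_ok?
```

Running the same script under different interpreters is a quick way to check whether a given build shows the problem.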

I created a minimized RubySpec illustrating the problem, which succeeds under MRI but fails under JRuby, and submitted it:

https://github.com/rubyspec/rubyspec/pull/136

I also filed JRUBY-6658 (http://jira.codehaus.org/browse/JRUBY-6658) for this. If this bug is accepted and fixed, JRuby versions containing the fix should do much better on the test suite.

Without a JRuby fix, it might be possible to work around this by restructuring autoloading in the BioRuby code base to avoid horizontal autoload invocations (that is, autoload declarations not in the parent of the module to be autoloaded), but that could be too invasive to justify.

Clayton Wheeler
cswh at umich.edu


From marian.povolny at gmail.com  Sat May 12 19:46:46 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sat, 12 May 2012 21:46:46 +0200
Subject: [BioRuby] GSoC weekly status report No.1.1
Message-ID: <CADKP5Ck-V_vo9kJ=pQiNzf184S1ocx7Z94hj7-QLPKAXRa8s2A@mail.gmail.com>

Hi all,

Here is my status report for this week:

This year we the GSoC students sure are a very creative group, just look at
our numbering schemes for our status reports for the pre-coding period -
everyone has his own thing going :)

And now back to the GFF3 project. I found a few more sites with big GFF3
files, which will be great for performance testing. Robert Buels suggested
that I reuse the test suite from Perl's Bio::GFF3::LowLevel::Parser, and I
think that's a great idea. I should definitely use that for completeness
testing, and I will check the test suites of other GFF3 parsers.

I have also finished the work for the first week. That means basically I'm
already more than two weeks ahead of schedule. The parser is now reading
data on the D side and forwarding it to Ruby line by line. That won't be
faster than reading the file from Ruby, but it's a nice basic case to get
data flowing from D to Ruby.

The rake tasks have been improved too. There are now two tasks for building
the D library, "compile" and "compiledebug", and there is the "spec" task
for running rspec tests and the "features" task for running cucumber tests.
The "clean" task now deletes object and library files.

There is also a problem with the D library and the garbage collector. It
seems to be the problem Iain Buclaw (one of the GDC developers) warned us
about. With a D shared library, when the GC kicks in for the first time, it
looks as if it collects all the static data, for example the per-module
variables. It collects pretty much everything: even when we register a chunk
of memory allocated with malloc with the GC, it still gets collected. Or at
least that's what it looks like. However, Iain also assured us that this
will be solved by the end of this month or the beginning of the next. My
cucumber and rspec tests still work because they don't require enough
memory for the GC to run, but to be sure that this issue doesn't interfere
with development at this point, I manually disabled the GC on library
initialization. I haven't tried it yet, but from what has been discussed
in the forums, both 32- and 64-bit DLLs on Windows built using DMD work fine.

I also helped Pjotr with getting our blog posts included in the RSS feed on
biogems.info.


That's all for now, you can find this report on my blog too:

http://blog.mpthecoder.com/post/22919943701/gsoc-weekly-status-report-no-1-1

--
Best regards,
Marjan


From lomereiter at googlemail.com  Sun May 13 20:10:45 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Mon, 14 May 2012 00:10:45 +0400
Subject: [BioRuby] [GSoC] Weekly report No 0.5
Message-ID: <CAE8u=e4YqsC9nXdb2w6W-YjqTwVwV+PW=DkQrDDqm=NVDku==w@mail.gmail.com>

Hi all,

this is yet another GSoC report.

During the last week, I mainly concentrated on the D part of the project,
adding functionality to it. I implemented parsing of the whole BAM file :)
Today I wrote a simple utility in D
(https://github.com/lomereiter/BAMread/blob/master/examples/bam2sam.d),
which uses my library to convert BAM to SAM. It doesn't work with array
tags yet, and is not as fast as samtools, but nevertheless... On a couple of
BAM files from the test/data directory (namely, bins.bam and ex1_header.bam)
the output is identical to that of samtools view (I checked with diff), and
that kinda proves that everything works fine. The speed issues are mainly due
to using the std.variant module for storing tags. It uses runtime reflection,
which is quite slow. Maybe there are some other reasons. Anyway, I'm going
to write my own tagged union type next week; it should improve the
performance quite a bit, and also fix design flaws.

For testing tag parsing, I used the file tags.bam provided to me by Peter
Cock. It contains tests for all types of tags, and my library successfully
passes them. Later I'll experiment with possible speed improvements, and
having unit tests covering the full range of possible tag types is a must.

Also, I downloaded and compiled gdc from trunk. It provides decent
performance, not worse than dmd, at least. We expect gdc to gain shared
library support in the next two months. Before that happens, we have to use
dmd, although there are some issues with its garbage collector, causing
segfaults. We discussed that with Marjan and Pjotr and decided that the
best option in such circumstances would be to disable GC during development:
testing the library on small files won't consume much memory anyway.

Another thing I downloaded and compiled is Rubinius. I'm going to
investigate why it hangs on the BioRuby unit tests in 1.9 mode. The other
mode, 1.8, seems to work fine except maybe for some very minor bugs.

During the next week, I'm going to learn how to use Cucumber and RSpec,
improve D library performance a little, and start to write Ruby bindings.
So it will be mostly a "Ruby week" ;)


--
Artem


From cswh at umich.edu  Tue May 15 03:36:17 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Mon, 14 May 2012 23:36:17 -0400
Subject: [BioRuby] GSoC week 1 status report
Message-ID: <2D9F6030-8A11-4443-B610-58464F506EE5@umich.edu>

Hi all,

I've put my first GSoC status report on my project blog:

http://csw.github.com/bioruby-maf/blog/2012/05/13/progress/

(The web version of this has 100% more hyperlinks, but here's a plain text version, too.)

This has been my first half-week of work on my Google Summer of Code project, and it's off to an exciting start. The first order of business has been to get my development environment together; since I've been a microbiology student instead of a programmer for the last year, it's taken some work. In that process, I've ended up making a few open source contributions just to get my tools working the way I want. I'm running GNU Emacs 24 and trying to take more advantage of it than I have in the past. I'll have much more to say about this in a future post.

I've also started working on the BioRuby unit test failures under JRuby, as a way of familiarizing myself with the BioRuby code base as well as the community and its development processes. Right now, JRuby in 1.8 mode is showing 6 failures and 126 errors, which is hardly confidence-inspiring for people considering using JRuby with BioRuby. This is too bad, since JRuby has some definite advantages as a Ruby implementation. After looking into these failures, I've broken them down into a few categories:

	* temporary file permissions problems, likely due to some sort of Travis-CI environment issue
	* a bug in JRuby's implementation of Open3.popen3 which I'm working up a bug report for
	* an odd autoload problem I've filed JRUBY-6658 for and sent an accompanying RubySpec patch for
	* a problem with libxml-jruby, which appears unmaintained, for which I've submitted a BioRuby patch plus JRUBY-6662
	* and a small test case bug relating to floating point handling, which I've submitted a patch for.

Once these are resolved, JRuby should be passing the BioRuby unit tests in 1.8 mode, and closer to passing in 1.9 mode. (There are a few extra failures under 1.9 that I haven't sorted through yet.)

I've also gotten a start on my project itself, creating the bioruby-maf Github repository with a project skeleton and writing my first Cucumber feature for it. This is, in fact, my first Cucumber feature ever. However, I did spend a few cross-country flights reading the RSpec and Cucumber books last week; between that and cribbing from Pjotr's code I feel like I have some idea what I'm doing. Just assembling that feature has been useful, too, since I've had to get several of the existing MAF tools running on my machine. In fact, my test MAF data and the FASTA version of it are courtesy of bx-python, which will be my reference implementation in many respects.

Clayton Wheeler
cswh at umich.edu


From cswh at umich.edu  Tue May 15 17:08:20 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Tue, 15 May 2012 13:08:20 -0400
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
Message-ID: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>

Hi all,

The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K)

There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.

However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?
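
For discussion, a split-out gem could be described by a gemspec along these lines. This is a rough sketch only: the gem name, version, author placeholder, and file layout are all assumptions, and the only points taken from the proposal above are that the plugin would depend on BioRuby core (the bio gem) and on Nokogiri instead of libxml-ruby.

```ruby
require 'rubygems'

# Hypothetical gemspec for a split-out PhyloXML plugin. The gem name,
# version, and file layout are placeholders, not a real package.
def phyloxml_gemspec
  Gem::Specification.new do |s|
    s.name    = 'bio-phyloxml'          # assumed name
    s.version = '0.0.1'                 # placeholder version
    s.summary = 'PhyloXML parser split out of BioRuby core (sketch)'
    s.authors = ['(to be filled in)']
    s.files   = ['lib/bio-phyloxml.rb'] # assumed layout
    s.add_dependency 'bio'              # BioRuby core
    s.add_dependency 'nokogiri'         # replaces the libxml-ruby dependency
  end
end

puts phyloxml_gemspec.dependencies.map(&:name).sort.join(', ')
```

With a layout like this, core BioRuby would no longer need to declare any XML library, and only PhyloXML users would pull in Nokogiri.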

Clayton Wheeler
cswh at umich.edu


From pjotr.public14 at thebird.nl  Tue May 15 18:54:32 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 15 May 2012 20:54:32 +0200
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
In-Reply-To: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>
References: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>
Message-ID: <20120515185432.GC20185@thebird.nl>

Marvellous work Clayton! My suggestion to BioRuby is to split out
phyloxml and to deprecate the current library module. In the next
release, or after, we should take out that code. I suspect few people
really depend on it, and they can adapt. I am partly responsible for
that dependency, and I think the Travis-ci tests also point out that
the purer Ruby BioRuby is, the better ;). 

Naohisa, what do you say? We should also ask the original author, even
though she has left our little group and now works for google (and I
am claiming Google does not recruit from GSoC :). Diana, maybe you are
reading the ML?

Pj.

On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote:
> Hi all,
> 
> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K)
> 
> There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.
> 
> However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?
> 
> Clayton Wheeler
> cswh at umich.edu


From cjfields at illinois.edu  Tue May 15 19:14:02 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Tue, 15 May 2012 19:14:02 +0000
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
In-Reply-To: <20120515185432.GC20185@thebird.nl>
References: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>
	<20120515185432.GC20185@thebird.nl>
Message-ID: <C5E563DF-9678-47AC-BFA5-2778AC3DB514@illinois.edu>

I intend to take the same tack with BioPerl's phyloxml (splitting it out), primarily so it can be maintained separately from the rest of BioPerl.

chris

On May 15, 2012, at 1:54 PM, Pjotr Prins wrote:

> Marvellous work Clayton! My suggestion to BioRuby is to split out
> phyloxml and to deprecate the current library module. In the next
> release, or after, we should take out that code. I suspect few people
> really depend on it, and they can adapt. I am partly responsible for
> that dependency, and I think the Travis-ci tests also point out that
> the purer Ruby BioRuby is, the better ;). 
> 
> Naohisa, what do you say? We should also ask the original author, even
> though she has left our little group and now works for google (and I
> am claiming Google does not recruit from GSoC :). Diana, maybe you are
> reading the ML?
> 
> Pj.
> 
> On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote:
>> Hi all,
>> 
>> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K)
>> 
>> There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.
>> 
>> However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?
>> 
>> Clayton Wheeler
>> cswh at umich.edu


From cswh at umich.edu  Tue May 15 21:51:51 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Tue, 15 May 2012 17:51:51 -0400
Subject: [BioRuby] JRuby bug filed for Bio::Command-related unit test
	failures
Message-ID: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>

Hi all,

I've submitted a bug report and patch for JRUBY-6666 (http://jira.codehaus.org/browse/JRUBY-6666), which should fix another set of JRuby unit test failures occurring when Bio::Command methods call Open3.popen3 (and perhaps even other similar exec-family methods).
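
For readers unfamiliar with the call pattern involved, here is a minimal, generic sketch of Open3.popen3 usage. It is not the Bio::Command code and not the reproduction attached to the JRuby ticket; the upcase_via_child helper is hypothetical, and the child process is simply the current Ruby interpreter running a one-line script so that the example has no external dependencies.

```ruby
require 'open3'
require 'rbconfig'

# Sends +input+ to a child process on stdin and returns what the child
# writes to stdout. The child command is an assumption made for this
# sketch: the current Ruby interpreter upcasing whatever it reads.
def upcase_via_child(input)
  child = [RbConfig.ruby, '-e', 'print STDIN.read.upcase']
  Open3.popen3(*child) do |stdin, stdout, _stderr, wait_thr|
    stdin.write(input)
    stdin.close          # signal EOF so the child can finish reading
    output = stdout.read
    wait_thr.value       # wait for the child to exit
    output
  end
end

puts upcase_via_child('acgt')
```

A small script like this, run under each interpreter on the CI matrix, can help separate popen3 problems from problems in the code that calls it.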

Would it be helpful for me to file a BioRuby bug to track this issue, perhaps on Github? Or perhaps create a wiki page to track unit test problems instead?

Clayton Wheeler
cswh at umich.edu


From ngoto at gen-info.osaka-u.ac.jp  Wed May 16 07:30:35 2012
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 16 May 2012 16:30:35 +0900
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <CAKVJ-_7FBtKJee57==o5S5RYjr16CQStUgK2w0qVmrrvpLOAgg@mail.gmail.com>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
	<CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
	<20120509213158.GB31329@thebird.nl>
	<CAKVJ-_7FBtKJee57==o5S5RYjr16CQStUgK2w0qVmrrvpLOAgg@mail.gmail.com>
Message-ID: <201205160739.q4G7dS4G004980@portal.open-bio.org>

Hi,

For BioRuby, I manually set the hook with my (ngoto's) personal
Travis account. As far as I can see, organization accounts are
currently not available in Travis.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Thu, 10 May 2012 11:31:07 +0100
Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Wed, May 9, 2012 at 10:31 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> > On Wed, May 09, 2012 at 06:56:17PM +0100, Peter Cock wrote:
> >> I'm guessing that's how you did it for BioRuby?
> >
> > I think I added it before we were a github organization. Or we were
> > just lucky :)
> >
> > Pj.
> 
> I'd guess the former - I've now got a personal Travis account via my
> personal GitHub account), but for now I can't seem to create a Biopython
> Travis account via the Biopython organization account on GitHub.
> 
> Nevertheless, I could get the basic Biopython unit tests running on
> Travis last night (including Python 3), although this needs more
> work installing dependencies to get the full test suite coverage:
> http://travis-ci.org/#!/peterjc/biopython
> 
> Peter


From ngoto at gen-info.osaka-u.ac.jp  Wed May 16 07:54:53 2012
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 16 May 2012 16:54:53 +0900
Subject: [BioRuby] JRuby bug filed for Bio::Command-related unit test
 failures
In-Reply-To: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
Message-ID: <201205160754.q4G7srSc005733@portal.open-bio.org>

Hi Clayton,

In addition, we have a Redmine page hosted on OBF.

https://redmine.open-bio.org/projects/bioruby

Currently, it holds the bugs and feature requests that were moved
from the old RubyForge BTS.

I think the Redmine page will be used for bugs and feature requests
without pull requests.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Tue, 15 May 2012 17:51:51 -0400
Clayton Wheeler <cswh at umich.edu> wrote:

> Hi all,
> 
> I've submitted a bug report and patch for JRUBY-6666 (http://jira.codehaus.org/browse/JRUBY-6666), which should fix another set of JRuby unit test failures occurring when Bio::Command methods call Open3.popen3 (and perhaps even other similar exec-family methods).
> 
> Would it be helpful for me to file a BioRuby bug to track this issue, perhaps on Github? Or perhaps create a wiki page to track unit test problems instead?
> 
> Clayton Wheeler
> cswh at umich.edu


From anurag08priyam at gmail.com  Wed May 16 08:15:40 2012
From: anurag08priyam at gmail.com (Anurag Priyam)
Date: Wed, 16 May 2012 13:45:40 +0530
Subject: [BioRuby] BioRuby on Travis-ci!
In-Reply-To: <201205160739.q4G7dS4G004980@portal.open-bio.org>
References: <20120509171449.GA29529@thebird.nl>
	<CAKVJ-_6h50wJYvN2Q1fjG7BPuQP1-DsN2qWhnKBagmvVQB5Cqg@mail.gmail.com>
	<CAKVJ-_69PbdD5psxgmnC8w2aNMv2c8BO7_qGnOSDCgLSLbAk_w@mail.gmail.com>
	<20120509213158.GB31329@thebird.nl>
	<CAKVJ-_7FBtKJee57==o5S5RYjr16CQStUgK2w0qVmrrvpLOAgg@mail.gmail.com>
	<201205160739.q4G7dS4G004980@portal.open-bio.org>
Message-ID: <CAD1m08ULTUkgrXX6Rm+9rX=MUvAJPRgTAmqW_WRSFqaNo9Nm4w@mail.gmail.com>

On Wed, May 16, 2012 at 1:00 PM, Naohisa GOTO
<ngoto at gen-info.osaka-u.ac.jp> wrote:
> For Bioruby, I manually set the hook with my (ngoto's) personal
> Travis account. As far as I can see, organization accout in Travis
> is currently not available.

You are talking about the toggle button on your Travis profile page,
right?  For repos that belong to an organization, you need to enable
Travis hook from Github (admin/service-hooks), iirc, using the token
on your Travis profile page.

-- 
Anurag Priyam


From ngoto at gen-info.osaka-u.ac.jp  Wed May 16 08:17:57 2012
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 16 May 2012 17:17:57 +0900
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
In-Reply-To: <20120515185432.GC20185@thebird.nl>
References: <D9A472A6-379B-458D-BFBE-3F3D5D976E0E@umich.edu>
	<20120515185432.GC20185@thebird.nl>
Message-ID: <201205160817.q4G8HwBO007774@portal.open-bio.org>

Hi,

Great work, Clayton!

I think a separate gem (Biogem) is a good idea, too.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Tue, 15 May 2012 20:54:32 +0200
Pjotr Prins <pjotr.public14 at thebird.nl> wrote:

> Marvellous work Clayton! My suggestion to BioRuby is to split out
> phyloxml and to deprecate the current library module. In the next
> release, or after, we should take out that code. I suspect few people
> really depend on it, and they can adapt. I am partly responsible for
> that dependency, and I think the Travis-ci tests also point out that
> the purer Ruby BioRuby is, the better ;). 
> 
> Naohisa, what do you say? We should also ask the original author, even
> though she has left our little group and now works for google (and I
> am claiming Google does not recruit from GSoC :). Diana, maybe you are
> reading the ML?
> 
> Pj.
> 
> On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote:
> > Hi all,
> > 
> > The PhyloXML unit tests are failing under JRuby, because the libxml-jruby gem (an implementation of the libxml API using native Java XML libraries) does not support the full API of libxml-ruby. My first approach to this was to simply use the native libxml-ruby gem and its C extension, which works with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a Unicode issue, and the JRuby developers indicate that the C extension API (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9 mode. (see http://bit.ly/JGWC4K)
> > 
> > There was a discussion of the PhyloXML parser on the mailing list a couple of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be rewritten to use Nokogiri at some point soon, since Nokogiri is now the de facto standard XML parser. Following that lead, I've gone ahead and ported the PhyloXML parser to use Nokogiri; it only took an hour or two, and the unit tests are passing. My branch for this is at https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a good approach, I can port the writer as well.
> > 
> > However, Pjotr suggested that it might make sense to split PhyloXML out into a separate gem. This should be straightforward enough, since no other BioRuby components appear to call PhyloXML. It would mean that any PhyloXML users would need to install a separate gem. On the other hand, it would remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I proceed with this approach?
> > 
> > Clayton Wheeler
> > cswh at umich.edu
> > 
> > 
> > _______________________________________________
> > BioRuby Project - http://www.bioruby.org/
> > BioRuby mailing list
> > BioRuby at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioruby
> > 


From donttrustben at gmail.com  Wed May 16 11:09:24 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Wed, 16 May 2012 21:09:24 +1000
Subject: [BioRuby] hmmer3
Message-ID: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>

Hi guys,

I noticed today that there isn't HMMER3 support in bioruby - particularly
I'm interested in a parser for hmmsearch outputs as I want to iterate over
aligned positions.

I noticed that there is mention of this in the 1.4.1 release notes, that
hmmer3 will be supported in 1.5, although I'm not sure what exactly this
means.
http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/

Can I ask what the state of this merge is please? Is there code somewhere
just waiting to be merged? Can it be quickly spun out into a biogem in the
meantime?

Thanks,
ben

-- 
Ben Woodcroft


From bonnal at ingm.org  Wed May 16 11:27:18 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 16 May 2012 13:27:18 +0200
Subject: [BioRuby] hmmer3
In-Reply-To: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
Message-ID: <CBD95BB6.991E%bonnal@ingm.org>

If you need to wrap the binary, please have a look at our wrapper. I am
wondering whether this wrapper could be useful to other gems; if so, I could
create a separate gem just for it. Let me know. Docs for the wrapper are in
the readme.

https://github.com/helios/bioruby-ngs/blob/master/lib/wrapper.rb
https://github.com/helios/bioruby-ngs/blob/master/README.rdoc#wrapper
 

On 16/05/12 13.09, "Ben Woodcroft" <donttrustben at gmail.com> wrote:

> Hi guys,
> 
> I noticed today that there isn't HMMER3 support in bioruby - particularly
> I'm interested in a parser for hmmsearch outputs as I want to iterate over
> aligned positions.
> 
> I noticed that there is mention of this in the 1.4.1 release notes, that
> hmmer3 will be supported in 1.5, although I'm not sure what exactly this
> means.
> http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/
> 
> Can I ask what the state of this merge is please? Is there code somewhere
> just waiting to be merged? Can it be quickly spun out into a biogem in the
> meantime?
> 
> Thanks,
> ben


From donttrustben at gmail.com  Wed May 16 11:43:44 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Wed, 16 May 2012 21:43:44 +1000
Subject: [BioRuby] hmmer3
In-Reply-To: <CBD95BB6.991E%bonnal@ingm.org>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<CBD95BB6.991E%bonnal@ingm.org>
Message-ID: <CA+adgSAsb-M73U0mZ8oR2_TBACs3e2EVBr9hmn6UVEJB2KK=xA@mail.gmail.com>

Thanks for the feedback dudes. I'm happy to spin it out myself, only I
don't know where the code is.

I don't personally need a wrapper, but I've got 40G of hmmsearch result
files to parse.

Relatedly I've written a gem that parses HMM model files - I'll release
that after a little more testing, hopefully tomorrow.

On 16 May 2012 21:27, Raoul Bonnal <bonnal at ingm.org> wrote:

> If you need to wrap the binary please have a look at our wrapper. I
> wondering is this wrapper could be useful to other gems, I could create a
> separated gem just for it. Let me know. Docs about the wrapper is in the
> readme.
>
> https://github.com/helios/bioruby-ngs/blob/master/lib/wrapper.rb
> https://github.com/helios/bioruby-ngs/blob/master/README.rdoc#wrapper
>
>
> On 16/05/12 13.09, "Ben Woodcroft" <donttrustben at gmail.com> wrote:
>
> > Hi guys,
> >
> > I noticed today that there isn't HMMER3 support in bioruby - particularly
> > I'm interested in a parser for hmmsearch outputs as I want to iterate over
> > aligned positions.
> >
> > I noticed that there is mention of this in the 1.4.1 release notes, that
> > hmmer3 will be supported in 1.5, although I'm not sure what exactly this
> > means.
> > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/
> >
> > Can I ask what the state of this merge is please? Is there code somewhere
> > just waiting to be merged? Can it be quickly spun out into a biogem in the
> > meantime?
> >
> > Thanks,
> > ben
>
>
>


From ngoto at gen-info.osaka-u.ac.jp  Wed May 16 11:48:14 2012
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 16 May 2012 20:48:14 +0900
Subject: [BioRuby] hmmer3
In-Reply-To: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
Message-ID: <201205161148.q4GBmFSj016839@portal.open-bio.org>

Hi Ben,

The HMMER3 result parser was written by Christian.
https://github.com/cmzmasek/bioruby

I guess its quality is good enough, except for the RDF/XML support,
which is experimental.

I'd like to discuss whether the class name Bio::Hmmer3Report
is suitable. For HMMER2, the class is Bio::HMMER::Report.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Wed, 16 May 2012 21:09:24 +1000
Ben Woodcroft <donttrustben at gmail.com> wrote:

> Hi guys,
> 
> I noticed today that there isn't HMMER3 support in bioruby - particularly
> I'm interested in a parser for hmmsearch outputs as I want to iterate over
> aligned positions.
> 
> I noticed that there is mention of this in the 1.4.1 release notes, that
> hmmer3 will be supported in 1.5, although I'm not sure what exactly this
> means.
> http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/
> 
> Can I ask what the state of this merge is please? Is there code somewhere
> just waiting to be merged? Can it be quickly spun out into a biogem in the
> meantime?
> 
> Thanks,
> ben
> 
> -- 
> Ben Woodcroft


From bonnal at ingm.org  Wed May 16 12:46:34 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 16 May 2012 14:46:34 +0200
Subject: [BioRuby] Porting PhyloXML to Nokogiri, maybe repackaging it
In-Reply-To: <201205160817.q4G8HwBO007774@portal.open-bio.org>
Message-ID: <CBD96E4A.9927%bonnal@ingm.org>

Impressive.
This is the right approach for cleaning BioRuby of dependencies that
could create problems.


Thanks Clayton.


On 16/05/12 10.17, "Naohisa GOTO" <ngoto at gen-info.osaka-u.ac.jp> wrote:

> Hi,
> 
> Great work, Clayton!
> 
> I think separate gem (Biogem) is good, too.
> 
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
> 
> On Tue, 15 May 2012 20:54:32 +0200
> Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> 
>> Marvellous work Clayton! My suggestion to BioRuby is to split out
>> phyloxml and to deprecate the current library module. In the next
>> release, or after, we should take out that code. I suspect few people
>> really depend on it, and they can adapt. I am partly responsible for
>> that dependency, and I think the Travis-ci tests also point out that
>> the purer Ruby BioRuby is, the better ;).
>> 
>> Naohisa, what do you say? We should also ask the original author, even
>> though she has left our little group and now works for google (and I
>> am claiming Google does not recruit from GSoC :). Diana, maybe you are
>> reading the ML?
>> 
>> Pj.
>> 
>> On Tue, May 15, 2012 at 01:08:20PM -0400, Clayton Wheeler wrote:
>>> Hi all,
>>> 
>>> The PhyloXML unit tests are failing under JRuby, because the libxml-jruby
>>> gem (an implementation of the libxml API using native Java XML libraries)
>>> does not support the full API of libxml-ruby. My first approach to this was
>>> to simply use the native libxml-ruby gem and its C extension, which works
>>> with JRuby in 1.8 mode. However, it doesn't work in 1.9 mode due to a
>>> Unicode issue, and the JRuby developers indicate that the C extension API
>>> (as opposed to FFI, I suppose) isn't likely to be supported further in 1.9
>>> mode. (see http://bit.ly/JGWC4K)
>>> 
>>> There was a discussion of the PhyloXML parser on the mailing list a couple
>>> of months ago (http://bit.ly/JFX8Qf), and Naohisa indicated that it might be
>>> rewritten to use Nokogiri at some point soon, since Nokogiri is now the de
>>> facto standard XML parser. Following that lead, I've gone ahead and ported
>>> the PhyloXML parser to use Nokogiri; it only took an hour or two, and the
>>> unit tests are passing. My branch for this is at
>>> https://github.com/csw/bioruby/tree/phyloxml-nokogiri. If this seems like a
>>> good approach, I can port the writer as well.
>>> 
>>> However, Pjotr suggested that it might make sense to split PhyloXML out into
>>> a separate gem. This should be straightforward enough, since no other
>>> BioRuby components appear to call PhyloXML. It would mean that any PhyloXML
>>> users would need to install a separate gem. On the other hand, it would
>>> remove a dependency on libxml2 for core BioRuby on MRI. Thoughts? Should I
>>> proceed with this approach?
>>> 
>>> Clayton Wheeler
>>> cswh at umich.edu
>>> 
>>> 


From donttrustben at gmail.com  Wed May 16 13:28:01 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Wed, 16 May 2012 23:28:01 +1000
Subject: [BioRuby] hmmer3
In-Reply-To: <4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
Message-ID: <CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>

Ah cool, thanks ngoto.

Thanks for writing this Christian. I believe I've extracted the hmmer3
stuff into a new biogem. I've added you as an author on this Christian -
hope that's ok with you?
https://github.com/wwood/bioruby-hmmer3_report

I've not released it to rubygems yet - I wanted to clear up namespace
issues first. What do you suggest Naohisa? Bio::HMMER::HMMER3::Report?

On looking at the code it seems it only handles tabular format data, which
is rather unfortunate for me, as I need the actual alignment. Looks like
I'll have to roll my sleeves up after all, unless there is yet more code
out there that parses the regular textual format?

I'm not sure about your feelings on this Christian, but how do you feel
about putting the rdf stuff in another biogem? If the aim is to get this
gem merged into the bioruby core code (and I hope it is since when people
say hmmer nowadays they likely mean v3, not v2), maybe the rdf stuff is a
bit tangential?

I also noticed that in the tests Christian referred to BioRubyTestDataPath
which isn't recognised in the biogem. Is there a recommended way to do this
in a biogem? Perhaps we should mirror what bioruby itself does to make the
code more portable.

Thanks everyone for the openness and responsiveness.
ben

On 16 May 2012 21:48, Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp> wrote:

> Hi Ben,
>
> HMMER3 result parser is written by Christian.
> https://github.com/cmzmasek/bioruby
>
> I guess it may be enough quality, except RDF/XML support
> which is experimental.
>
> I'd like to discuss that the class name Bio::Hmmer3Report
> is suitable. For HMMER2, Bio::HMMER::Report.
>
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>
> On Wed, 16 May 2012 21:09:24 +1000
> Ben Woodcroft <donttrustben at gmail.com> wrote:
>
> > Hi guys,
> >
> > I noticed today that there isn't HMMER3 support in bioruby - particularly
> > I'm interested in a parser for hmmsearch outputs as I want to iterate over
> > aligned positions.
> >
> > I noticed that there is mention of this in the 1.4.1 release notes, that
> > hmmer3 will be supported in 1.5, although I'm not sure what exactly this
> > means.
> > http://news.open-bio.org/news/2010/10/bioruby-1-4-1-released/
> >
> > Can I ask what the state of this merge is please? Is there code somewhere
> > just waiting to be merged? Can it be quickly spun out into a biogem in the
> > meantime?
> >
> > Thanks,
> > ben
> >
> > --
> > Ben Woodcroft
>
>


-- 
--
Ben Woodcroft
http://ecogenomic.org/users/ben-woodcroft <http://www.ecogenomic.org/>


From pjotr.public14 at thebird.nl  Wed May 16 13:46:12 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 16 May 2012 15:46:12 +0200
Subject: [BioRuby] hmmer3
In-Reply-To: <CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
	<CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>
Message-ID: <20120516134612.GA26059@thebird.nl>

On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote:
> I'm not sure about your feelings on this Christian, but how do you feel
> about putting the rdf stuff in another biogem? If the aim is to get this
> gem merged into the bioruby core code (and I hope it is since when people
> say hmmer nowadays they likely mean v3, not v2), maybe the rdf stuff is a
> bit tangential?

I think it should be decoupled. RDF, in general, is a (searchable)
result-based (post-parser) format. Maybe we should coin that
definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF
into triple stores. It sounds to me like bio-rdf is the right place for
that translation code :).  Feel free to push it in.

Pj.


From pjotr.public14 at thebird.nl  Thu May 17 16:51:01 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 17 May 2012 18:51:01 +0200
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <201205160754.q4G7srSc005733@portal.open-bio.org>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
	<201205160754.q4G7srSc005733@portal.open-bio.org>
Message-ID: <20120517165101.GA32610@thebird.nl>

I don't know if you all track github, but thanks to two GSoC coders
(Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius.

Travis-CI should show the green light for all Rubies once Rubinius
itself gets updated on Travis :)

Kudos.

Pj.


From cjfields at illinois.edu  Thu May 17 16:59:33 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 17 May 2012 16:59:33 +0000
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <20120517165101.GA32610@thebird.nl>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
	<201205160754.q4G7srSc005733@portal.open-bio.org>
	<20120517165101.GA32610@thebird.nl>
Message-ID: <D0730676-B5EE-4C98-AAB0-2F67F7F516D5@illinois.edu>

Sounds like GSoC this year is paying lots of dividends :)

chris

On May 17, 2012, at 11:51 AM, Pjotr Prins wrote:

> I don't know if you all track github, but thanks to two GSoC coders
> (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius.
> 
> Travis-CI should show the green light for all Rubies once Rubinius
> itself gets updated on Travis :)
> 
> Kudos.
> 
> Pj.


From cswh at umich.edu  Thu May 17 17:42:06 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 13:42:06 -0400
Subject: [BioRuby] BioRuby fixed for JRuby and Rubinius failures
In-Reply-To: <20120517165101.GA32610@thebird.nl>
References: <4B1D0D9B-7EAC-4AC6-819C-90C7A4A899B0@umich.edu>
	<201205160754.q4G7srSc005733@portal.open-bio.org>
	<20120517165101.GA32610@thebird.nl>
Message-ID: <7D2E3046-44E9-4275-B294-8DB39D36294B@umich.edu>

On May 17, 2012, at 12:51 PM, Pjotr Prins wrote:

> I don't know if you all track github, but thanks to two GSoC coders
> (Artem and Clayton) BioRuby got fixed to run on JRuby and Rubinius.
> 
> Travis-CI should show the green light for all Rubies once Rubinius
> itself gets updated on Travis :)

Thanks Pjotr. Unfortunately I think we're not going to be quite there for JRuby just yet; we've hit a couple of JRuby bugs which will probably need to be fixed to solve some of the failures. Also, I think we may be stuck with PhyloXML test failures under JRuby in 1.9 mode until we split that out into a separate gem. It's definitely progress, though.

Clayton Wheeler
cswh at umich.edu


From cswh at umich.edu  Thu May 17 19:39:27 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 15:39:27 -0400
Subject: [BioRuby] PhyloXML and libxml-ruby
Message-ID: <C919224A-16F7-4940-A307-00A87A61D96A@umich.edu>

Hi all,

It appears that the native extension for libxml-ruby is not building reliably under JRuby, causing Travis-CI runs to fail as seen at:

http://travis-ci.org/#!/ngoto/bioruby/jobs/1356992

I'm not having much luck identifying exactly why it builds in some JRuby environments and not others, but I've been able to reproduce the Travis-CI problem on a test Linux machine and don't see an obvious fix.

If we're going to repackage PhyloXML into a separate gem, I think the safest course of action would be to revert to calling for libxml-jruby in the Travis-CI Gemfiles (i.e. back out http://bit.ly/JmNjDY). Using libxml-ruby instead of libxml-jruby doesn't solve the PhyloXML problems on JRuby in 1.9 mode anyway, and 1.9 mode will soon be the default in JRuby. The PhyloXML gem can be explicitly declared to depend on libxml-ruby, and moving it out of the core BioRuby gem will remove this whole issue, as far as the unit tests go. Then PhyloXML's library requirements can be addressed separately.

Thoughts?

Clayton Wheeler
cswh at umich.edu


From cswh at umich.edu  Fri May 18 03:10:52 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 17 May 2012 23:10:52 -0400
Subject: [BioRuby] bio-phyloxml gem
Message-ID: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>

Hi all,

I have repackaged BioRuby's PhyloXML support as a separate gem:

https://github.com/csw/bioruby-phyloxml

I was able to preserve its revision history. All the unit tests pass, too. I did take this opportunity to rename some of the files, so their names correspond to the namespace of the classes. I think I've set up the packaging appropriately, though I'd appreciate it if someone more experienced with the Biogems infrastructure could take a quick look at this. (Hint hint, Pjotr.)

Who should we designate as the maintainer? I suppose I have my hands on it, but are there any volunteers? And if it would make more sense to host this under someone else's GitHub account, that should be easy enough.

Also, feel free to contribute changes to the README.

If everything looks good, I'll go ahead and set this up on Travis-CI, biogems.info, and Rubygems as version 1.0.0.

Clayton Wheeler
cswh at umich.edu


From donttrustben at gmail.com  Fri May 18 04:59:44 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Fri, 18 May 2012 14:59:44 +1000
Subject: [BioRuby] hmmer3
In-Reply-To: <20120516134612.GA26059@thebird.nl>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
	<CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>
	<20120516134612.GA26059@thebird.nl>
Message-ID: <CA+adgSBtB9eiSkBBzQLF9-1xoxy5qM-OjwgkZJHd0j8kO9R2Hg@mail.gmail.com>

On 16 May 2012 23:46, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:

> On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote:
> > maybe the rdf stuff is a
> > bit tangential?
>
> I think it should be decoupled. RDF, in general, is a (searchable)
> result-based (post-parser) format. Maybe we should coin that
> definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF
> into triple stores. Sounds that bio-rdf is the right place for that
> translation code to me :).  Feel free to push it in.
>

Thanks. I've removed the rdf related code all in one commit:
https://github.com/wwood/bioruby-hmmer3_report/commit/3795ce3a124011cb600e78e6ef10603187c99d20

However, I don't feel like I should be adding this to a different
repository because I don't feel like I understand the technology enough,
and therefore am not really inclined to maintain it. All of the relevant
code should be in that commit, so should be quite simple to add in yourself
if you are inclined (though I couldn't find any unit tests). Only, I've
changed the namespace of it to Bio::HMMER::HMMER3::Report from
Bio::Hmmer3report as Naohisa suggested. I've also now pushed the new biogem
to rubygems/biogems.info.

Thanks,
ben


From pjotr.public14 at thebird.nl  Fri May 18 05:21:23 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 18 May 2012 07:21:23 +0200
Subject: [BioRuby] hmmer3
In-Reply-To: <CA+adgSBtB9eiSkBBzQLF9-1xoxy5qM-OjwgkZJHd0j8kO9R2Hg@mail.gmail.com>
References: <CA+adgSCGkQqMFPNbPei5yzBvG+CQcUDfXB=y7oBbgxQwEBHj7Q@mail.gmail.com>
	<4fb39400.2421440a.5445.70ddSMTPIN_ADDED@mx.google.com>
	<CA+adgSDh0TYM1u9L-eq6nc_3EDu--RYY4BTOOCTeBgB_DmsNTA@mail.gmail.com>
	<20120516134612.GA26059@thebird.nl>
	<CA+adgSBtB9eiSkBBzQLF9-1xoxy5qM-OjwgkZJHd0j8kO9R2Hg@mail.gmail.com>
Message-ID: <20120518052123.GA3360@thebird.nl>

OK, I'll take the orphaned RDF code.

On Fri, May 18, 2012 at 02:59:44PM +1000, Ben Woodcroft wrote:
> On 16 May 2012 23:46, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
>
> > On Wed, May 16, 2012 at 11:28:01PM +1000, Ben Woodcroft wrote:
> > > maybe the rdf stuff is a
> > > bit tangential?
> >
> > I think it should be decoupled. RDF, in general, is a (searchable)
> > result-based (post-parser) format. Maybe we should coin that
> > definition somewhere :). I created bio-rdf biogem as a 'sink' for RDF
> > into triple stores. Sounds that bio-rdf is the right place for that
> > translation code to me :).  Feel free to push it in.
>
> Thanks. I've removed the rdf related code all in one commit:
> https://github.com/wwood/bioruby-hmmer3_report/commit/3795ce3a124011cb600e78e6ef10603187c99d20
>
> However, I don't feel like I should be adding this to a different
> repository because I don't feel like I understand the technology
> enough, and therefore am not really inclined to maintain it. All of the
> relevant code should be in that commit, so should be quite simple to
> add in yourself if you are inclined (though I couldn't find any unit
> tests). Only, I've changed the namespace of it to
> Bio::HMMER::HMMER3::Report from Bio::Hmmer3report as Naohisa suggested.
> I've also now pushed the new biogem to rubygems/biogems.info.
>
> Thanks,
> ben


From pjotr.public14 at thebird.nl  Fri May 18 05:24:40 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 18 May 2012 07:24:40 +0200
Subject: [BioRuby] bio-phyloxml gem
In-Reply-To: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>
References: <8C0AB87F-CC00-4A34-8FED-22300D88D0EE@umich.edu>
Message-ID: <20120518052440.GB3360@thebird.nl>

I am with Raoul and Francesco today. We will take a look and discuss.
Good job, also saving the revision history :).

On Thu, May 17, 2012 at 11:10:52PM -0400, Clayton Wheeler wrote:
> Hi all,
> 
> I have repackaged BioRuby's PhyloXML support as a separate gem:
> 
> https://github.com/csw/bioruby-phyloxml
> 
> I was able to preserve its revision history. All the unit tests pass, too. I did take this opportunity to rename some of the files, so their names correspond to the namespace of the classes. I think I've set up the packaging appropriately, though I'd appreciate it if someone more experienced with the Biogems infrastructure could take a quick look at this. (Hint hint, Pjotr.)
> 
> Who should we designate as the maintainer? I suppose I have my hands on it, but if there are any volunteers? And if it would make more sense to host this under someone else's Github account, that should be easy enough.
> 
> Also, feel free to contribute changes to the README.
> 
> If everything looks good, I'll go ahead and set this up on Travis-CI, biogems.info, and Rubygems as version 1.0.0.
> 
> Clayton Wheeler
> cswh at umich.edu
> 
> 


From donttrustben at gmail.com  Fri May 18 05:40:28 2012
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Fri, 18 May 2012 15:40:28 +1000
Subject: [BioRuby] New biogems for IonTorrent, pileup files, pfam and hmmer
Message-ID: <CA+adgSD35RG8=7+xwkF-s4JCmhdMUat9_FRACXS_H50crhsZGA@mail.gmail.com>

Hi guys,

Here's some blatant advertising for some code I've recently written in
biogem form.

bio-gag: "gag error" is the term I've coined to describe an error that
various people have observed on certain sequencing kits with IonTorrent,
though it has not previously been characterised very well that I know of
(we noticed that the errors seemed to occur at GAG positions in the reads
that were supposed to be GAAG). This biogem tries to find and fix these
errors. It isn't benchmarked for accuracy but worked well enough for my
lab's own purposes. Actually, to be honest, we've only used an older version
of the software on real data, and the logic has changed a little since, given
some recent evidence we have, but I thought I'd push it out with the latest
and greatest error model.
https://github.com/wwood/bioruby-gag

bio-pileup_iterator: To find gag errors bio-gag iterates through pileup
files looking for particular patterns e.g. strand bias of insertions. This
gem can be used to iterate through pileup files one position (one line) at
a time, building up the sequence of each read as it goes, recording their
direction etc. Probably not the fastest piece of code in the world, sorry.
I'm not sure whether this should/can be incorporated into bio-samtools? It
adds functionality - there's no duplication (I don't think).
https://github.com/wwood/bioruby-pileup_iterator
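
To make the "strand bias of insertions" idea concrete, here is a minimal
sketch of reading one pileup line and counting insertions per strand. It is
not the bio-pileup_iterator API: PileupColumn, parse_pileup_line and
insertion_strand_counts are hypothetical names, and the sketch assumes the
usual samtools pileup conventions (tab-separated columns; "+<n><bases>" for
an insertion, upper case on the forward strand and lower case on the
reverse; "^" followed by a mapping-quality character at a read start).

```ruby
# Hypothetical record for one pileup position (not from bio-pileup_iterator).
PileupColumn = Struct.new(:ref_name, :pos, :ref_base, :depth, :bases)

# Split one tab-separated pileup line into its leading columns.
def parse_pileup_line(line)
  ref_name, pos, ref_base, depth, bases = line.chomp.split("\t")
  PileupColumn.new(ref_name, pos.to_i, ref_base, depth.to_i, bases.to_s)
end

# Count insertions per strand in the read-bases column.
# Assumes upper-case inserted bases mean forward strand, lower case reverse.
def insertion_strand_counts(bases)
  counts = { forward: 0, reverse: 0 }
  i = 0
  while i < bases.length
    c = bases[i]
    case c
    when '^'
      i += 2 # skip the read-start marker and its mapping-quality character
    when '+', '-'
      len_str = bases[(i + 1)..-1][/\A\d+/].to_s
      if len_str.empty?
        i += 1 # malformed indel marker: step over it
        next
      end
      seq = bases[i + 1 + len_str.length, len_str.to_i].to_s
      if c == '+'
        strand = seq == seq.upcase ? :forward : :reverse
        counts[strand] += 1
      end
      i += 1 + len_str.length + len_str.to_i # skip marker, length and bases
    else
      i += 1
    end
  end
  counts
end
```

A column whose counts are heavily skewed towards one strand would then be a
candidate for closer inspection; the threshold is left to the caller.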

bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to
the hmmer v3 manual.
https://github.com/wwood/bioruby-hmmer_model

bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles
tabular format files - the guts of this were written by Christian - see
yesterday's thread for details. I'm hoping to add regular (non-tabular)
format parsing in the near future, but no promises.
https://github.com/wwood/bioruby-hmmer3_report
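
As a rough illustration of what parsing the tabular format involves, the
sketch below reads per-target lines as written by hmmsearch --tblout. It is
not the bio-hmmer3_report API: TblHit, parse_tblout_line and each_tblout_hit
are hypothetical names, and the column layout (18 whitespace-separated
fields followed by a free-text description, with comment lines starting
with "#") is assumed from the HMMER3 user guide.

```ruby
# Hypothetical record type; field names are illustrative only.
TblHit = Struct.new(:target_name, :target_accession, :query_name,
                    :query_accession, :full_evalue, :full_score,
                    :full_bias, :description)

# Parse one tblout line; returns nil for comments and blank lines.
def parse_tblout_line(line)
  return nil if line.start_with?('#') || line.strip.empty?
  # Limit the split to 19 pieces so the trailing description keeps its spaces.
  fields = line.strip.split(/\s+/, 19)
  TblHit.new(fields[0], fields[1], fields[2], fields[3],
             fields[4].to_f, fields[5].to_f, fields[6].to_f,
             fields[18].to_s)
end

# Yield one TblHit per data line; io is anything that responds to each_line
# (a File or a String), so large result files are read line by line.
def each_tblout_hit(io)
  return enum_for(:each_tblout_hit, io) unless block_given?
  io.each_line do |line|
    hit = parse_tblout_line(line)
    yield hit if hit
  end
end
```

Usage would look like File.open(path) { |f| each_tblout_hit(f) { |hit| ... } }.
Alignment blocks from the regular text output are not covered here.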

I'm sure there are bugs and deficiencies - apologies in advance.

Enjoy,
ben


From francesco.strozzi at gmail.com  Fri May 18 08:01:01 2012
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Fri, 18 May 2012 10:01:01 +0200
Subject: [BioRuby] New biogems for IonTorrent, pileup files,
	pfam and hmmer
In-Reply-To: <CA+adgSD35RG8=7+xwkF-s4JCmhdMUat9_FRACXS_H50crhsZGA@mail.gmail.com>
References: <CA+adgSD35RG8=7+xwkF-s4JCmhdMUat9_FRACXS_H50crhsZGA@mail.gmail.com>
Message-ID: <CACtet2Sn5BcDA-1-asr9T26Kbvg6iEEwEymC2KDY9jjNa43p=w@mail.gmail.com>

Hi Ben,
thanks for the amazing work! I'm not using Ion Torrent atm but I
eventually will and it's good to see there is something already setup.

Francesco

On Fri, May 18, 2012 at 7:40 AM, Ben Woodcroft <donttrustben at gmail.com> wrote:
> Hi guys,
>
> Here's some blatant advertising for some code I've recently written in
> biogem form.
>
> bio-gag: "gag error" is the term I've coined to describe an error that
> various people have observed on certain sequencing kits with IonTorrent,
> though it has not previously been characterised very well that I know of
> (we noticed that the errors seemed to occur at GAG positions in the reads
> that were supposed to be GAAG). This biogem tries to find and fix these
> errors. It isn't benchmarked for accuracy but worked well enough for my
> lab's own purposes. Actually to be honest we've only used an older version
> of the software on real data and the logic has a little since given some
> recent evidence we have, but I thought I'd push it out with the latest and
> greatest error model.
> https://github.com/wwood/bioruby-gag
>
> bio-pileup_iterator: To find gag errors bio-gag iterates through pileup
> files looking for particular patterns e.g. strand bias of insertions. This
> gem can be used to iterate through pileup files one position (one line) at
> a time, building up the sequence of each read as it goes, recording their
> direction etc. Probably not the fastest piece of code in the world, sorry.
> I'm not sure whether this should/can be incorporated into bio-samtools? It
> adds functionality - there's no duplication (I don't think).
> https://github.com/wwood/bioruby-pileup_iterator
>
> bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to
> the hmmer v3 manual.
> https://github.com/wwood/bioruby-hmmer_model
>
> bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles
> tabular format files - the guts of this were written by Christian - see
> yesterday's thread for details. I'm hoping to add regular (non-tabular)
> format parsing in the near future, but no promises.
> https://github.com/wwood/bioruby-hmmer3_report
>
> I'm sure there is bugs and deficiencies - apologies in advance.
>
> Enjoy,
> ben
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
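
As an illustration of the "strand bias of insertions" pattern mentioned in
the quoted announcement, here is a minimal Ruby sketch. It does not use the
bio-pileup_iterator API, which is not shown in this thread; the method name
insertion_strand_counts is made up for the example. It assumes the usual
six-column samtools pileup layout, where an insertion appears in the
read-bases column as +<length><sequence>, in upper case for forward-strand
reads and lower case for reverse-strand reads.

```ruby
# Count insertions per strand in the read-bases column of one pileup line.
# Hypothetical helper for illustration; not part of any biogem.
def insertion_strand_counts(pileup_line)
  # Assumed columns: chrom, position, ref base, depth, read bases, qualities.
  bases = pileup_line.chomp.split("\t")[4].to_s
  counts = { forward: 0, reverse: 0 }
  # The regexp is greedy, so keep only the first <length> letters: anything
  # after that belongs to the next read, not to the insertion.
  bases.scan(/\+(\d+)([ACGTNacgtn]+)/) do |length, sequence|
    inserted = sequence[0, length.to_i]
    strand = inserted == inserted.upcase ? :forward : :reverse
    counts[strand] += 1
  end
  counts
end

line = "chr1\t100\tG\t4\t.+1A,+1a.,\tIIII"
p insertion_strand_counts(line)
```

A position where nearly all insertions fall on one strand would be a
candidate for closer inspection; what threshold counts as "biased" is left
open here.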


-- 

Francesco


From bonnal at ingm.org  Fri May 18 08:54:44 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Fri, 18 May 2012 10:54:44 +0200
Subject: [BioRuby] New biogems for IonTorrent, pileup files,
 pfam and hmmer
In-Reply-To: <CACtet2Sn5BcDA-1-asr9T26Kbvg6iEEwEymC2KDY9jjNa43p=w@mail.gmail.com>
Message-ID: <CBDBDAF4.997C%bonnal@ingm.org>

My lab (Alberto) will try your HMM parsers because we are going to annotate
a lot of stuff coming from NGS ^_^


On 18/05/12 10.01, "Francesco Strozzi" <francesco.strozzi at gmail.com> wrote:

> Hi Ben,
> thanks for the amazing work! I'm not using Ion Torrent atm but I
> eventually will and it's good to see there is something already setup.
> 
> Francesco
> 
> On Fri, May 18, 2012 at 7:40 AM, Ben Woodcroft <donttrustben at gmail.com> wrote:
>> Hi guys,
>> 
>> Here's some blatant advertising for some code I've recently written in
>> biogem form.
>> 
>> bio-gag: "gag error" is the term I've coined to describe an error that
>> various people have observed on certain sequencing kits with IonTorrent,
>> though it has not previously been characterised very well that I know of
>> (we noticed that the errors seemed to occur at GAG positions in the reads
>> that were supposed to be GAAG). This biogem tries to find and fix these
>> errors. It isn't benchmarked for accuracy but worked well enough for my
>> lab's own purposes. Actually, to be honest, we've only used an older version
>> of the software on real data, and the logic has changed a little since, given
>> some recent evidence we have, but I thought I'd push it out with the latest
>> and greatest error model.
>> https://github.com/wwood/bioruby-gag
>> 
>> bio-pileup_iterator: To find gag errors bio-gag iterates through pileup
>> files looking for particular patterns e.g. strand bias of insertions. This
>> gem can be used to iterate through pileup files one position (one line) at
>> a time, building up the sequence of each read as it goes, recording their
>> direction etc. Probably not the fastest piece of code in the world, sorry.
>> I'm not sure whether this should/can be incorporated into bio-samtools? It
>> adds functionality - there's no duplication (I don't think).
>> https://github.com/wwood/bioruby-pileup_iterator
>> 
>> bio-hmmer_model: This is a parser of HMM files e.g. from PFAM according to
>> the hmmer v3 manual.
>> https://github.com/wwood/bioruby-hmmer_model
>> 
>> bio-hmmer3_report: Parsing of HMMER3 result files. Currently only handles
>> tabular format files - the guts of this were written by Christian - see
>> yesterday's thread for details. I'm hoping to add regular (non-tabular)
>> format parsing in the near future, but no promises.
>> https://github.com/wwood/bioruby-hmmer3_report
>> 
>> I'm sure there are bugs and deficiencies - apologies in advance.
>> 
>> Enjoy,
>> ben
> 
> 


From pjotr.public14 at thebird.nl  Sun May 20 12:31:31 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 20 May 2012 14:31:31 +0200
Subject: [BioRuby] biogems.info updated
Message-ID: <20120520123131.GA17983@thebird.nl>

Marjan and I have updated the RSS feed for biogems.info - now we can
support more blogs. If you are blogging on Ruby for Bioinformatics,
give us the feed :)

Pj.


From marian.povolny at gmail.com  Mon May 21 09:36:01 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Mon, 21 May 2012 11:36:01 +0200
Subject: [BioRuby] GSoC weekly status report No.1.2
Message-ID: <CADKP5CkumC36Tkq3xDCif4m8m29Ms1Dj=suBnt6Wr4XFVo+yEw@mail.gmail.com>

http://blog.mpthecoder.com/post/23473020471/gsoc-weekly-status-report-no-1-2

It's been three months since my first introduction on the BioRuby ML and
it's been great. As it is the end of the GSoC community bonding period, I
would like to thank Pjotr most and then all the other community members for
their help and support. It's a great feeling to become a member of a small
but growing community of enthusiasts that work together for the benefit of
all of us and for fun.

As Pjotr already did, I would like to encourage you to write blog posts
about using Ruby in Bioinformatics and let us include them in our RSS and
news feeds on the biogems.info website. The site supports both RSS and Atom
feeds now, and a similar functionality will be part of the new website for
BioRuby once it's finished. The code also supports adding only posts for
one category/tag, so you can tag your posts with BioRuby or similar, and
only those posts will be included in the RSS feed on biogems.info.

The GSoC coding period starts today. It's time for me to roll up my
sleeves and start working on the GFF3 parser full-time.

--
Marjan


From lomereiter at googlemail.com  Mon May 21 11:58:46 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Mon, 21 May 2012 15:58:46 +0400
Subject: [BioRuby] [GSoC] Weekly report #1
Message-ID: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>

Hi all,

here's my report about the past week:
http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/

Brief summary:

1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in the Rubinius
bugtracker, and one of them is already solved. Rubinius in 1.8 mode should
now pass all tests. The situation with 1.9 mode is not that great, but I'm
working on it.

2) I started to collect D optimization tricks on a GitHub wiki page.
Currently, it contains just 6 tips, but this number is going to grow.
Probably, another page will be created soon to keep best practices of
connecting Ruby and D. Since my project and Marjan's have a lot in
common, I think it's important for us not to waste time on something that
has already been investigated.

3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and
wrote my first two features.

4) Measurements of object instantiation time in Ruby suggest that exposing
low-level D functions via FFI makes little sense. I'm going to discuss with
mentors which high-level functions should be available, and make that into
Cucumber features.
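
For readers who want to get a feel for point 4, a measurement of this kind
can be sketched with Ruby's standard Benchmark module. This is not the
benchmark behind the report (that code is not shown here); the Feature
struct and the seconds_per_object helper are hypothetical names chosen for
the example. If wrapping every low-level result in a Ruby object costs more
than the D call saves, exposing coarser, higher-level functions becomes
more attractive.

```ruby
require "benchmark"

# Hypothetical record type standing in for a parsed feature.
Feature = Struct.new(:seqid, :start, :stop)

# Average wall-clock seconds needed to instantiate one Feature.
def seconds_per_object(count)
  elapsed = Benchmark.realtime do
    count.times { |i| Feature.new("chr1", i, i + 100) }
  end
  elapsed / count
end

per_object = seconds_per_object(100_000)
puts format("%.1f ns per object", per_object * 1e9)
```

The absolute number depends heavily on the Ruby implementation and the
machine, so it is only meaningful when compared with the cost of the
foreign call measured in the same environment.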


--
Artem


From cswh at umich.edu  Mon May 21 15:50:18 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Mon, 21 May 2012 11:50:18 -0400
Subject: [BioRuby] GSoC week 2 status report
Message-ID: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>

Hi all,

Here's my report on last week's work:

http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/

This was my second week of work on my GSoC project, and the last week of the "community bonding" period before the official start of coding. A major focus of mine was BioRuby's phyloXML support; it uses libxml, which has been causing unit test failures under JRuby. In the end, the best course of action seemed to be to separate the phyloXML support into a separate plugin, which I have done as the bio-phyloxml gem. This will remove BioRuby's dependency on XML libraries entirely and that JRuby issue along with it. At the same time, users of the phyloXML code should be able to continue using it with no substantive changes.

Separately, I began porting this phyloXML code to use Nokogiri instead of libxml-ruby, but ran into difficulties with this effort. While it is possible, and the library APIs are very similar, the code uses relatively low-level XML processing APIs in ways that seem to be sensitive to subtle differences in text node and namespace semantics between the two libraries. Substantial restructuring of the code and the addition of quite a few unit tests might be necessary to carry out such a port with confidence that the resulting code would work well.

Also, someone else submitted a JRuby patch for JRUBY-6658, one of the major causes of BioRuby's unit test failures with JRuby; once a fix is integrated, we'll be close to having all the tests passing under JRuby.

I identified another JRuby bug, JRUBY-6666, causing several unit test failures. This one affects BioRuby's code for running external commands, so it would be likely to be encountered in production use. For this one, I also worked up a patch.

I also spent some time preparing a performance testing environment, for evaluating existing MAF implementations as well as my own. This will be important, since I will be considering the use of an existing C parser. I will also want to ensure that the performance of my code is competitive with the alternatives. Lacking any hardware more powerful than a MacBook Air, I am setting this up with Amazon EC2. To simplify environment setup, I'll be using Chef. I've already set up a Chef repository with configuration logic, and some rudimentary code to streamline launching Ubuntu machines on EC2 and bootstrapping a Chef environment. To save money, I plan to make use of EC2 Spot Instances, which are perfect for instances that only need to run for a few hours for batch tasks.

Clayton Wheeler
cswh at umich.edu


From bonnal at ingm.org  Tue May 22 09:21:42 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Tue, 22 May 2012 11:21:42 +0200
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>
Message-ID: <CBE12746.9A11%bonnal@ingm.org>

Hi Clayton,
Well done, and thanks for your contributions to the BioRuby and JRuby communities.

For your computing issue I have two solutions:
1) I can create a VM and give you the access, I need to contact my IT dep.
2) Could Amazon provide some VM for our students?


On 21/05/12 17.50, "Clayton Wheeler" <cswh at umich.edu> wrote:

> Hi all,
> 
> Here's my report on last week's work:
> 
> http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/
> 
> This was my second week of work on my GSoC project, and the last week of the
> "community bonding" period before the official start of coding. A major focus
> of mine was BioRuby's phyloXML support; it uses libxml, which has been causing
> unit test failures under JRuby. In the end, the best course of action seemed
> to be to separate the phyloXML support into a separate plugin, which I have
> done as the bio-phyloxml gem. This will remove BioRuby's dependency on XML libraries
> entirely and that JRuby issue along with it. At the same time, users of the
> phyloXML code should be able to continue using it with no substantive changes.
> 
> Separately, I began porting this phyloXML code to use Nokogiri instead of
> libxml-ruby, but ran into difficulties with this effort. While it is possible,
> and the library APIs are very similar, the code uses relatively low-level XML
> processing APIs in ways that seem to be sensitive to subtle differences in
> text node and namespace semantics between the two libraries. Substantial
> restructuring of the code and the addition of quite a few unit tests might be
> necessary to carry out such a port with confidence that the resulting code
> would work well.
> 
> Also, someone else submitted a JRuby patch for JRUBY-6658, one of the major
> causes of BioRuby's unit test failures with JRuby; once a fix is integrated,
> we'll be close to having all the tests passing under JRuby.
> 
> I identified another JRuby bug, JRUBY-6666, causing several unit test
> failures. This one affects BioRuby's code for running external commands, so it
> would be likely to be encountered in production use. For this one, I also
> worked up a patch.
> 
> I also spent some time preparing a performance testing environment, for
> evaluating existing MAF implementations as well as my own. This will be
> important, since I will be considering the use of an existing C parser. I will
> also want to ensure that the performance of my code is competitive with the
> alternatives. Lacking any hardware more powerful than a MacBook Air, I am
> setting this up with Amazon EC2. To simplify environment setup, I'll be using
> Chef. I've already set up a Chef repository with configuration logic, and some
> rudimentary code to streamline launching Ubuntu machines on EC2 and
> bootstrapping a Chef environment. To save money, I plan to make use of EC2
> Spot Instances, which are perfect for instances that only need to run for a
> few hours for batch tasks.
> 
> Clayton Wheeler
> cswh at umich.edu
> 
> 
> 
> 


From p.j.a.cock at googlemail.com  Tue May 22 11:07:15 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 22 May 2012 12:07:15 +0100
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <4F9AFA1F.6030103@med.nyu.edu>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
Message-ID: <CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>

Hi all,

I've CC'd the BioRuby mailing list just to ensure you're aware of the
potentially useful combination of MAF indexing and BGZF compression.
We can continue this on the BioRuby list if more appropriate.

The start of this Biopython-dev thread is here:
http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html

This might be a nice opportunity to combine the work of this year's OBF
Google Summer of Code students - Clayton is doing MAF for BioRuby,
and part of Artem's project could provide BGZF support for BioRuby.

On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak
<andrew.sczesnak at med.nyu.edu> wrote:
> Peter,
>
>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py
>> and I'm willing to do this myself for MAF (while going over your index
>> work - something I want to do anyway). The only potential catch is
>> avoiding offset arithmetic.
>
> I have no problem with you doing this if you're willing. It would be great
> to have some code review of MafIndex as well.

I'm not sure if Clayton will be able to comment on the Python code,
but he should have some thoughts on the MAF indexing itself.
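
A note on the "offset arithmetic" caveat in the quoted text: positions in a
BGZF file are usually handled as 64-bit virtual offsets, where the upper 48
bits give the file offset of the start of a compressed block and the lower
16 bits give the position inside that block once decompressed. Presumably
this is why ordinary arithmetic on offsets has to be avoided: adding a byte
count to a virtual offset does not, in general, give another valid
position. A minimal Ruby sketch of the encoding follows; the module and
method names are hypothetical and not taken from Biopython or BioRuby.

```ruby
# BGZF virtual offset helpers (hypothetical names, for illustration only).
module VirtualOffset
  WITHIN_BITS = 16 # low 16 bits: position inside the uncompressed block

  # Combine a compressed block start and a within-block position.
  def self.pack(block_start, within_block)
    unless (0...(1 << WITHIN_BITS)).cover?(within_block)
      raise ArgumentError, "within-block offset must fit in 16 bits"
    end
    (block_start << WITHIN_BITS) | within_block
  end

  # Split a virtual offset back into [block_start, within_block].
  def self.unpack(virtual_offset)
    [virtual_offset >> WITHIN_BITS, virtual_offset & ((1 << WITHIN_BITS) - 1)]
  end
end

voffset = VirtualOffset.pack(100_000, 42)
p VirtualOffset.unpack(voffset)
```

An index built on such values can store them as plain integers and compare
them for ordering, as long as it never computes record lengths by
subtracting one from another.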

Regards,

Peter


From pjotr.public14 at thebird.nl  Tue May 22 15:23:17 2012
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 22 May 2012 17:23:17 +0200
Subject: [BioRuby] BioRuby hitting 20K
Message-ID: <20120522152317.GA30752@thebird.nl>

Looks like we'll have 20K downloads of the bioruby gem by tomorrow
:). Maybe time for a new release?

We are getting a lot more activity anyway - Go BioRuby Go!

Pj.


From mh6 at sanger.ac.uk  Tue May 22 15:32:03 2012
From: mh6 at sanger.ac.uk (Michael Paulini)
Date: Tue, 22 May 2012 16:32:03 +0100
Subject: [BioRuby] BioRuby hitting 20K
In-Reply-To: <20120522152317.GA30752@thebird.nl>
References: <20120522152317.GA30752@thebird.nl>
Message-ID: <4FBBB173.2030001@sanger.ac.uk>

congrats biorubystas :-)

M

On 22/05/12 16:23, Pjotr Prins wrote:
> Looks like we'll have 20K downloads of the bioruby gem by tomorrow
> :). Maybe time for a new release?
>
> We are getting a lot more activity anyway - Go BioRuby Go!
>
> Pj.


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 


From bonnal at ingm.org  Wed May 23 13:24:56 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 23 May 2012 15:24:56 +0200
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
Message-ID: <CBE2B1C8.9A6B%bonnal@ingm.org>

Thanks Peter,
These are valuable hints.


On 22/05/12 13.07, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:

> Hi all,
> 
> I've CC'd the BioRuby mailing list just to ensure you're aware of the
> potentially useful combination of MAF indexing and BGZF compression.
> We can continue this on the BioRuby list if more appropriate.
> 
> The start of this Biopython-dev thread is here:
> http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html
> 
> This might be a nice opportunity to combine the work of this year's OBF
> Google Summer of Code students - Clayton is doing MAF for BioRuby,
> and part of Artem's project could provide BGZF support for BioRuby.
> 
> On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak
> <andrew.sczesnak at med.nyu.edu> wrote:
>> Peter,
>> 
>>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py
>>> and I'm willing to do this myself for MAF (while going over your index
>>> work - something I want to do anyway). The only potential catch is
>>> avoiding offset arithmetic.
>> 
>> I have no problem with you doing this if you're willing. It would be great
>> to have some code review of MafIndex as well.
> 
> I'm not sure if Clayton will be able to comment on the Python code,
> but he should have some thoughts on the MAF indexing itself.
> 
> Regards,
> 
> Peter


From cswh at umich.edu  Thu May 24 01:35:46 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Wed, 23 May 2012 21:35:46 -0400
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
	<CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
Message-ID: <DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>

On May 22, 2012, at 7:07 AM, Peter Cock wrote:

> Hi all,
> 
> I've CC'd the BioRuby mailing list just to ensure you're aware of the
> potentially useful combination of MAF indexing and BGZF compression.
> We can continue this on the BioRuby list if more appropriate.
> 
> The start of this Biopython-dev thread is here:
> http://lists.open-bio.org/pipermail/biopython-dev/2012-April/009561.html
> 
> This might be a nice opportunity to combine the work of this year's OBF
> Google Summer of Code students - Clayton is doing MAF for BioRuby,
> and part of Artem's project could provide BGZF support for BioRuby.

Indeed, thanks Peter. BGZF sounds like a great approach for MAF compression; I'm just about to start looking into indexing support, and it makes sense to tackle compression in that context.

So far, I think Artem's BGZF implementation is entirely in D; I may just add Ruby support for BGZF separately.
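
As a rough idea of what separate Ruby support might involve, here is a
minimal sketch of writing and reading single BGZF blocks with Ruby's
standard zlib bindings. It is only a sketch under assumptions: it follows
the block layout described in the SAM/BAM specification (a gzip member
whose extra field carries a "BC" subfield holding the block size minus
one), it assumes "BC" is the only extra subfield, and it does no indexing.
The method names write_bgzf_block and read_bgzf_block are hypothetical.

```ruby
require "zlib"
require "stringio"

BGZF_HEADER_FORMAT = "CCCCVCCvCCvv" # 18 bytes in total
BGZF_HEADER_SIZE = 18
BGZF_TRAILER_SIZE = 8               # CRC32 + uncompressed size

# Compress one chunk of data (well under 64 KiB) into a single BGZF block.
def write_bgzf_block(data)
  deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  compressed = deflater.deflate(data, Zlib::FINISH)
  deflater.close
  bsize = BGZF_HEADER_SIZE + compressed.bytesize + BGZF_TRAILER_SIZE - 1
  raise ArgumentError, "block too large for BGZF" if bsize > 0xFFFF
  # gzip magic, deflate, FEXTRA flag, mtime, xfl, os, xlen, "BC", slen, bsize
  header = [31, 139, 8, 4, 0, 0, 255, 6, 66, 67, 2, bsize].pack(BGZF_HEADER_FORMAT)
  header + compressed + [Zlib.crc32(data), data.bytesize].pack("VV")
end

# Read the next BGZF block from io; returns its data, or nil at end of file.
def read_bgzf_block(io)
  header = io.read(BGZF_HEADER_SIZE)
  return nil if header.nil?
  fields = header.unpack(BGZF_HEADER_FORMAT)
  unless fields[0] == 31 && fields[1] == 139 && fields[8] == 66 && fields[9] == 67
    raise IOError, "not a BGZF block"
  end
  bsize = fields[11]
  compressed = io.read(bsize + 1 - BGZF_HEADER_SIZE - BGZF_TRAILER_SIZE)
  crc, isize = io.read(BGZF_TRAILER_SIZE).unpack("VV")
  inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  data = inflater.inflate(compressed)
  inflater.close
  raise IOError, "corrupt BGZF block" unless Zlib.crc32(data) == crc && data.bytesize == isize
  data
end

io = StringIO.new(write_bgzf_block("first record\n") + write_bgzf_block("second record\n"))
while (chunk = read_bgzf_block(io))
  print chunk
end
```

Because every block is independent, a reader can seek to the start of any
block and decompress from there, which is what makes the format attractive
for indexed access.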

> On Fri, Apr 27, 2012 at 8:57 PM, Andrew Sczesnak
> <andrew.sczesnak at med.nyu.edu> wrote:
>> Peter,
>> 
>>> It should be easy enough to follow the BGZF changes to Bio/SeqIO/_index.py
>>> and I'm willing to do this myself for MAF (while going over your index
>>> work - something I want to do anyway). The only potential catch is
>>> avoiding offset arithmetic.
>> 
>> I have no problem with you doing this if you're willing. It would be great
>> to have some code review of MafIndex as well.
> 
> I'm not sure if Clayton will be able to comment on the Python code,
> but he should have some thoughts on the MAF indexing itself.

I'll definitely be spending more time with that code; it and the bx-python MAF indexing code will be my main reference points for indexed access. It's been a little while, but I have done some Python work in the past, so I should be able to follow along okay. I'll send some comments out in a few days.

Clayton Wheeler
cswh at umich.edu


From mictadlo at gmail.com  Thu May 24 04:30:22 2012
From: mictadlo at gmail.com (Mic)
Date: Thu, 24 May 2012 14:30:22 +1000
Subject: [BioRuby] [GSoC] Weekly report #1
In-Reply-To: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>
References: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>
Message-ID: <CAOP6n=gXanYq7YJ+73tXxkotHR+w17AARCv3bO96ziMSLrRtgQ@mail.gmail.com>

D to Ruby: http://www.swig.org/compare.html

On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov <lomereiter at googlemail.com>wrote:

> Hi all,
>
> here's my report about the past week:
> http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/
>
> Brief summary:
>
> 1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in Rubinius
> bugtracker, and one of them is already solved. Rubinius in 1.8 mode should
> now pass all tests. The situation with 1.9 mode is not that great, but I'm
> working on it.
>
> 2) I started to collect D optimization tricks on github wiki page.
> Currently, it contains just 6 tips, but this number is going to grow.
> Probably, another page will be created soon to keep best practices of
> connecting Ruby and D. Since my project and Marjan's one have a lot in
> common, I think it's important for us to not waste time on something that
> already have been investigated.
>
> 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and
> wrote my first two features.
>
> 4) Measurements of object instantiation time in Ruby suggest that exposing
> low-level D functions via FFI makes little sense. I'm going to discuss with
> mentors which high-level functions should be available, and make that into
> Cucumber features.
>
>
>
>
> --
> Artem
>
>


From cjfields at illinois.edu  Thu May 24 05:14:20 2012
From: cjfields at illinois.edu (Fields, Christopher J)
Date: Thu, 24 May 2012 05:14:20 +0000
Subject: [BioRuby] [GSoC] Weekly report #1
In-Reply-To: <CAOP6n=gXanYq7YJ+73tXxkotHR+w17AARCv3bO96ziMSLrRtgQ@mail.gmail.com>
References: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>
	<CAOP6n=gXanYq7YJ+73tXxkotHR+w17AARCv3bO96ziMSLrRtgQ@mail.gmail.com>
Message-ID: <BD83611A-BDC6-4844-9C25-B060EFA54A81@illinois.edu>

I think the mentioned D wrappers on the SWIG page are ANSI C/C++ libraries wrapped for D, not D code/libs/etc wrapped for Ruby, unless I'm mistaken...

chris

On May 23, 2012, at 11:30 PM, Mic wrote:

> D to Ruby: http://www.swig.org/compare.html
> 
> On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov <lomereiter at googlemail.com>wrote:
> 
>> Hi all,
>> 
>> here's my report about the past week:
>> http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/
>> 
>> Brief summary:
>> 
>> 1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in Rubinius
>> bugtracker, and one of them is already solved. Rubinius in 1.8 mode should
>> now pass all tests. The situation with 1.9 mode is not that great, but I'm
>> working on it.
>> 
>> 2) I started to collect D optimization tricks on github wiki page.
>> Currently, it contains just 6 tips, but this number is going to grow.
>> Probably, another page will be created soon to keep best practices of
>> connecting Ruby and D. Since my project and Marjan's one have a lot in
>> common, I think it's important for us to not waste time on something that
>> already have been investigated.
>> 
>> 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and
>> wrote my first two features.
>> 
>> 4) Measurements of object instantiation time in Ruby suggest that exposing
>> low-level D functions via FFI makes little sense. I'm going to discuss with
>> mentors which high-level functions should be available, and make that into
>> Cucumber features.
>> 
>> 
>> 
>> 
>> --
>> Artem
>> 
>> 
> 


From cswh at umich.edu  Thu May 24 05:33:40 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Thu, 24 May 2012 01:33:40 -0400
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To: <CBE12746.9A11%bonnal@ingm.org>
References: <CBE12746.9A11%bonnal@ingm.org>
Message-ID: <9DBCD042-7086-4F4B-ABB9-1A7F63C089B8@umich.edu>

Thanks for the offers of help, everybody. Raoul, if it's convenient for you to set up a test VM in house, that would probably make the most sense. I don't think it's a pressing need at this point, but let's look into that. 

If we run into issues, we can revisit the EC2 options. (I've had an AWS account too long to qualify for the free usage tier, unfortunately.) An Amazon grant might be worth looking at, especially if we can use it to publicly host, say, BGZF-compressed pre-indexed MAF data sets also. On the other hand, that might be overkill just for my needs; using spot-priced instances, I expect I could do all the testing I need for under $50.

Clayton Wheeler
cswh at umich.edu


From lomereiter at googlemail.com  Thu May 24 05:40:54 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Thu, 24 May 2012 09:40:54 +0400
Subject: [BioRuby] [GSoC] Weekly report #1
In-Reply-To: <BD83611A-BDC6-4844-9C25-B060EFA54A81@illinois.edu>
References: <CAE8u=e4b73oPfb62tySnwTV27X5b_bTKrtq2rc3POnpMSXkXXQ@mail.gmail.com>
	<CAOP6n=gXanYq7YJ+73tXxkotHR+w17AARCv3bO96ziMSLrRtgQ@mail.gmail.com>
	<BD83611A-BDC6-4844-9C25-B060EFA54A81@illinois.edu>
Message-ID: <CAE8u=e503XzVVyDoSARz6OVMT_MM0WYuFrXpuJy9SKYb5BsNRw@mail.gmail.com>

Chris is right. Currently, it's easier to write everything manually. Once
I develop some 'best practices', I may put them into compile-time
algorithms and generate bindings from D. (The language has compile-time
introspection but no run-time introspection, probably because that would
hurt performance.)

On Thu, May 24, 2012 at 9:14 AM, Fields, Christopher J <
cjfields at illinois.edu> wrote:

> I think the mentioned D wrappers on the SWIG page are ANSI C/C++ libraries
> wrapped for D, not D code/libs/etc wrapped for Ruby, unless I'm mistaken...
>
> chris
>
> On May 23, 2012, at 11:30 PM, Mic wrote:
>
> > D to Ruby: http://www.swig.org/compare.html
> >
> > On Mon, May 21, 2012 at 9:58 PM, Artem Tarasov <
> lomereiter at googlemail.com>wrote:
> >
> >> Hi all,
> >>
> >> here's my report about the past week:
> >> http://lomereiter.wordpress.com/2012/05/21/gsoc-weekly-report-1/
> >>
> >> Brief summary:
> >>
> >> 1) BioRuby unit tests and Rubinius bugs - I posted 2 issues in Rubinius
> >> bugtracker, and one of them is already solved. Rubinius in 1.8 mode should
> >> now pass all tests. The situation with 1.9 mode is not that great, but I'm
> >> working on it.
> >>
> >> 2) I started to collect D optimization tricks on github wiki page.
> >> Currently, it contains just 6 tips, but this number is going to grow.
> >> Probably, another page will be created soon to keep best practices of
> >> connecting Ruby and D. Since my project and Marjan's one have a lot in
> >> common, I think it's important for us to not waste time on something that
> >> already have been investigated.
> >>
> >> 3) During the week, I learned a bit about BDD and Cucumber, enjoyed it, and
> >> wrote my first two features.
> >>
> >> 4) Measurements of object instantiation time in Ruby suggest that exposing
> >> low-level D functions via FFI makes little sense. I'm going to discuss with
> >> mentors which high-level functions should be available, and make that into
> >> Cucumber features.
> >>
> >>
> >>
> >>
> >> --
> >> Artem
> >>
> >>
> >
>
>


From lomereiter at googlemail.com  Thu May 24 05:52:42 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Thu, 24 May 2012 09:52:42 +0400
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
	<CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
	<DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>
Message-ID: <CAE8u=e48RD5uKY3t8JJTwUUBXV=vnHwqgd=3viiFpWPAV+O9og@mail.gmail.com>

Hi all,

it's a good point that many line-based formats need some sort of
compression with indexing, and BGZF is good enough in that sense.

> So far, I think Artem's BGZF implementation is entirely in D; I may just
> add Ruby support for BGZF separately.

The only problem I see with that approach is that it's hardly possible to
get parallel compression with MRI. But overall I tend to agree with
Clayton. Firstly, it's hard to abstract away a common interface right
now, without first writing some code and looking at it. Secondly, there
are still problems with D shared library support. We were assured by a GDC
developer that they'll get solved soon, but at the moment the situation is
far from perfect.


From p.j.a.cock at googlemail.com  Thu May 24 09:18:33 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 24 May 2012 10:18:33 +0100
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <CAE8u=e48RD5uKY3t8JJTwUUBXV=vnHwqgd=3viiFpWPAV+O9og@mail.gmail.com>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
	<CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
	<DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>
	<CAE8u=e48RD5uKY3t8JJTwUUBXV=vnHwqgd=3viiFpWPAV+O9og@mail.gmail.com>
Message-ID: <CAKVJ-_5Tvc99ORek8fZttiuv-3L82-nKPbU1v6CbeeWCH1TBhw@mail.gmail.com>

On Thu, May 24, 2012 at 6:52 AM, Artem Tarasov
<lomereiter at googlemail.com> wrote:
> Hi all,
>
> it's a good point that many line-based formats need some sort of compression
> with indexing, and BGZF is good enough in that sense.

BGZF doesn't have to be used with line-based formats, anything
with sequential records would work (like BAM files of course). I've not
tried it to see how well it compressed, but SFF files in BGZF should
work too as another example.

>> So far, I think Artem's BGZF implementation is entirely in D; I may just
>> add Ruby support for BGZF separately.
>
> The only problem I see with that approach is that it's hardly possible to
> get parallel compression with MRI. But overall I tend to agree with Clayton.
> Firstly, it's hard to abstract away some common interface right now, not
> writing any code and looking at it. Secondly, there're still problems with D
> shared library support. We were assured by GDC developer that they'll get
> solved soon, but at the moment the situation is far from perfect.

My BGZF code is pure Python (using C zlib via Python's zlib library),
and does not currently tackle parallel compression or decompression.
There has been recent work in samtools for this.

We don't need parallel compression/decompression of BGZF for it to
be useful.
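
To illustrate why independently compressed blocks are already enough
for indexing, here is a minimal sketch of the BGZF block layout in
Python. It is not the Biopython implementation: the names
make_bgzf_block and read_bgzf_blocks are made up for this example, the
65280-byte input cap is an assumption, and the reader assumes the "BC"
subfield is the only gzip extra subfield (true for blocks written by
this sketch, not guaranteed for arbitrary files).

```python
import struct
import zlib

MAX_BLOCK_INPUT = 65280  # assumption: cap input so a block stays under 64 KiB


def make_bgzf_block(data: bytes) -> bytes:
    """Wrap data in one BGZF block (a gzip member with a 'BC' extra subfield)."""
    if len(data) > MAX_BLOCK_INPUT:
        raise ValueError("too much data for a single BGZF block")
    # Raw deflate stream: negative wbits means no zlib header or trailer.
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
    cdata = compressor.compress(data) + compressor.flush()
    bsize = len(cdata) + 25  # total block length minus one
    if bsize > 0xFFFF:
        raise ValueError("compressed block does not fit in 64 KiB")
    header = struct.pack(
        "<BBBBIBBHBBHH",
        31, 139,  # gzip magic bytes
        8,        # CM: deflate
        4,        # FLG: FEXTRA set
        0,        # MTIME
        0,        # XFL
        255,      # OS: unknown
        6,        # XLEN: length of the extra field
        66, 67,   # subfield identifier 'B', 'C'
        2,        # SLEN: length of the subfield payload
        bsize,    # BSIZE
    )
    trailer = struct.pack("<II", zlib.crc32(data) & 0xFFFFFFFF, len(data))
    return header + cdata + trailer


def read_bgzf_blocks(buf: bytes, start: int = 0):
    """Yield (block_offset, data) for each block in buf, beginning at start."""
    offset = start
    while offset < len(buf):
        if buf[offset:offset + 4] != b"\x1f\x8b\x08\x04":
            raise ValueError("not a BGZF block at offset %d" % offset)
        (bsize,) = struct.unpack_from("<H", buf, offset + 16)
        block = buf[offset:offset + bsize + 1]
        # 18 header bytes in front, 8 trailer bytes (CRC32, ISIZE) at the end.
        data = zlib.decompress(block[18:-8], -15)
        crc, isize = struct.unpack("<II", block[-8:])
        if zlib.crc32(data) & 0xFFFFFFFF != crc or len(data) != isize:
            raise ValueError("corrupt block at offset %d" % offset)
        yield offset, data
        offset += bsize + 1
```

Because each block carries its own length, a reader can note the offset
where a block starts and later resume decoding from exactly that
offset, without touching the earlier blocks. That works single-threaded.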

Peter


From john.woods at marcottelab.org  Thu May 24 14:01:08 2012
From: john.woods at marcottelab.org (John Woods)
Date: Thu, 24 May 2012 09:01:08 -0500
Subject: [BioRuby] GSoC week 2 status report
In-Reply-To: <CBE12746.9A11%bonnal@ingm.org>
References: <0D2AC678-1DD1-40B9-B100-EDA3429B3D87@umich.edu>
	<CBE12746.9A11%bonnal@ingm.org>
Message-ID: <CAPkCRRuEZkF7b4K_1zDoyjW-1D_ckhK6K3v+wLex0dD+Arj1tg@mail.gmail.com>

If I can just suggest, there's a startup pitch out there which was formerly
known as Happy Science Coding, now Appsoma, which lets you run Ruby code on
Rackspace instances.

It may or may not be appropriate for what you want to do. It's not EC2, but
it is a VM (right?).

http://appsoma.com/

It's still a bit buggy with Ruby. If you have trouble, email Zack (see the
"About us" page). He's fairly responsive.

John
SciRuby

On Tue, May 22, 2012 at 4:21 AM, Raoul Bonnal <bonnal at ingm.org> wrote:

> Hi Clayton,
> Well done and thanks for your contributions to the BioRuby and JRuby
> communities.
>
> For your computing issue I have two possible solutions:
> 1) I can create a VM and give you access; I need to contact my IT dept.
> 2) Could Amazon provide some VMs for our students?
>
>
>
> On 21/05/12 17.50, "Clayton Wheeler" <cswh at umich.edu> wrote:
>
> > Hi all,
> >
> > Here's my report on last week's work:
> >
> > http://csw.github.com/bioruby-maf/blog/2012/05/21/week_2_progress/
> >
> > This was my second week of work on my GSoC project, and the last week of
> > the "community bonding" period before the official start of coding. A
> > major focus of mine was BioRuby's phyloXML support; it uses libxml, which
> > has been causing unit test failures under JRuby. In the end, the best
> > course of action seemed to be to split out the phyloXML support as a
> > separate plugin, which I have done as the bio-phyloxml gem. This will
> > remove BioRuby's dependency on XML libraries entirely and that JRuby
> > issue along with it. At the same time, users of the phyloXML code should
> > be able to continue using it with no substantive changes.
> >
> > Separately, I began porting this phyloXML code to use Nokogiri instead of
> > libxml-ruby, but ran into difficulties with this effort. While it is
> > possible, and the library APIs are very similar, the code uses relatively
> > low-level XML processing APIs in ways that seem to be sensitive to subtle
> > differences in text node and namespace semantics between the two
> > libraries. Substantial restructuring of the code and the addition of
> > quite a few unit tests might be necessary to carry out such a port with
> > confidence that the resulting code would work well.
> >
> > Also, someone else submitted a JRuby patch for JRUBY-6658, one of the
> > major causes of BioRuby's unit test failures with JRuby; once a fix is
> > integrated, we'll be close to having all the tests passing under JRuby.
> >
> > I identified another JRuby bug, JRUBY-6666, causing several unit test
> > failures. This one affects BioRuby's code for running external commands,
> > so it would be likely to be encountered in production use. For this one,
> > I also worked up a patch.
> >
> > I also spent some time preparing a performance testing environment, for
> > evaluating existing MAF implementations as well as my own. This will be
> > important, since I will be considering the use of an existing C parser. I
> > will also want to ensure that the performance of my code is competitive
> > with the alternatives. Lacking any hardware more powerful than a MacBook
> > Air, I am setting this up with Amazon EC2. To simplify environment setup,
> > I'll be using Chef. I've already set up a Chef repository with
> > configuration logic, and some rudimentary code to streamline launching
> > Ubuntu machines on EC2 and bootstrapping a Chef environment. To save
> > money, I plan to make use of EC2 Spot Instances, which are perfect for
> > instances that only need to run for a few hours for batch tasks.
> >
> > Clayton Wheeler
> > cswh at umich.edu
> >
> >
> >
> >
> > _______________________________________________
> > BioRuby Project - http://www.bioruby.org/
> > BioRuby mailing list
> > BioRuby at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioruby


From mictadlo at gmail.com  Fri May 25 06:49:13 2012
From: mictadlo at gmail.com (Mic)
Date: Fri, 25 May 2012 16:49:13 +1000
Subject: [BioRuby] BGZF support, was Re: Biopython 1.60 plans and beyond
In-Reply-To: <CAKVJ-_5Tvc99ORek8fZttiuv-3L82-nKPbU1v6CbeeWCH1TBhw@mail.gmail.com>
References: <CAKVJ-_6xDOnV4YiGuYKo8xFi=1WeL0oX+RqRD5QKFw14VKKYbQ@mail.gmail.com>
	<4F91E4CF.8040602@med.nyu.edu>
	<CAKVJ-_4k==uN0UYa17-xPV6OMjE-Wm5Yuohf=bzGKB5vwXmKVQ@mail.gmail.com>
	<4F9AFA1F.6030103@med.nyu.edu>
	<CAKVJ-_4cv_kO4GCLhdLNpGr4xKQQEtAgas+HX0LakbUMp0NgbA@mail.gmail.com>
	<DB22FC5D-3BE1-4BF4-ABEC-3D55A17056C2@umich.edu>
	<CAE8u=e48RD5uKY3t8JJTwUUBXV=vnHwqgd=3viiFpWPAV+O9og@mail.gmail.com>
	<CAKVJ-_5Tvc99ORek8fZttiuv-3L82-nKPbU1v6CbeeWCH1TBhw@mail.gmail.com>
Message-ID: <CAOP6n=j2z1Ldtfywa8o4FmuLLw+J=uZSsEyG5pbo3-1zfma5iA@mail.gmail.com>

I think Picard tools does parallel compression/decompression of BGZF.

Cheers,
Mic

On Thu, May 24, 2012 at 7:18 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> On Thu, May 24, 2012 at 6:52 AM, Artem Tarasov
> <lomereiter at googlemail.com> wrote:
> > Hi all,
> >
> > it's a good point that many line-based formats need some sort of
> > compression with indexing, and BGZF is good enough in that sense.
>
> BGZF doesn't have to be used with line-based formats; anything
> with sequential records would work (like BAM files of course). I've not
> tried it to see how well it compresses, but SFF files in BGZF should
> work too as another example.
>
> >> So far, I think Artem's BGZF implementation is entirely in D; I may just
> >> add Ruby support for BGZF separately.
> >
> > The only problem I see with that approach is that it's hardly possible to
> > get parallel compression with MRI. But overall I tend to agree with
> > Clayton. Firstly, it's hard to abstract away a common interface right
> > now, without writing any code and looking at it. Secondly, there are
> > still problems with D shared library support. We were assured by a GDC
> > developer that they'll get solved soon, but at the moment the situation
> > is far from perfect.
>
> My BGZF code is pure Python (using C zlib via Python's zlib library),
> and does not currently tackle parallel compression or decompression.
> There has been recent work in samtools for this.
>
> We don't need parallel compression/decompression of BGZF for it to
> be useful.
>
> Peter


From cswh at umich.edu  Fri May 25 20:42:13 2012
From: cswh at umich.edu (Clayton Wheeler)
Date: Fri, 25 May 2012 16:42:13 -0400
Subject: [BioRuby] New blog post on this week's work
Message-ID: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>

Hi all,

I've written a new blog post on the work I did on my MAF parser this week:

http://csw.github.com/bioruby-maf/blog/2012/05/25/first_milestone/

It covers parser implementation and performance issues, BDD, and tools.

Clayton Wheeler
cswh at umich.edu


From lomereiter at googlemail.com  Sun May 27 18:27:43 2012
From: lomereiter at googlemail.com (Artem Tarasov)
Date: Sun, 27 May 2012 22:27:43 +0400
Subject: [BioRuby] [GSoC] weekly report #2
Message-ID: <CAE8u=e4Yer4BwgG8GGEV5MzcmRjGmgwPuKkyAMALXq9xHYu+Gg@mail.gmail.com>

Hi all,

I wrote a blog post about the past week:
http://lomereiter.wordpress.com/2012/05/27/gsoc-weekly-report-2/

Topics are:
1) I now have a fairly good validation module for BAM. More kinds of checks
can be added, just request them :)
2) I also started to implement random access via the BAI file, simply
because I have mostly finished what I planned for the first two weeks, and
random access seems to be one of the most important things.

Also, it's not mentioned in the blog post, but I started to work on a BGZF
gem, as Pjotr suggested to me. I'll try to document it and publish the first
version next week. Currently I'm writing it in pure Ruby.
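
For readers wondering how a BAI index and BGZF fit together: as
described in the SAM/BAM specification, the index stores 64-bit
"virtual offsets", where the upper 48 bits give the byte offset at
which a BGZF block starts in the compressed file and the lower 16 bits
give a position inside that block once decompressed. The snippet below
is only a rough illustration in Python, not code from my project, and
the function names are hypothetical.

```python
def make_virtual_offset(block_start: int, within_block: int) -> int:
    """Pack a BGZF block start and an in-block position into one integer."""
    if not 0 <= block_start < (1 << 48):
        raise ValueError("block start must fit in 48 bits")
    if not 0 <= within_block < (1 << 16):
        raise ValueError("in-block position must fit in 16 bits")
    return (block_start << 16) | within_block


def split_virtual_offset(virtual_offset: int) -> tuple:
    """Return (block_start, within_block) for a virtual offset."""
    return virtual_offset >> 16, virtual_offset & 0xFFFF
```

To seek, a reader jumps to block_start in the compressed file,
decompresses that one block, and skips within_block bytes of the
result.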


From marian.povolny at gmail.com  Sun May 27 19:21:48 2012
From: marian.povolny at gmail.com (Marjan Povolni)
Date: Sun, 27 May 2012 21:21:48 +0200
Subject: [BioRuby] GSoC weekly status report No.1.9
Message-ID: <CADKP5CkmqhUwiCnj1ERMP1dcguWKpEnv4XfvYt6kmqjHrv7UDQ@mail.gmail.com>

http://blog.mpthecoder.com/post/23877896288/gsoc-weekly-status-report-no-1-9

This is the final post in the 1.x series, I promise.

The last week was spent adding support for parsing lines into records. It
was a lot of work, and when I read the comments from my mentor, I wasn't
happy. But I agree with him: I did make it more complicated than it had to
be (the C API, for example), I should spend some time polishing and
refactoring the D side, and my Cucumber features should be split into more
features. So that's the rough plan for the next week.

--
Marjan


From bonnal at ingm.org  Mon May 28 08:50:19 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 28 May 2012 10:50:19 +0200
Subject: [BioRuby] DevTools
In-Reply-To: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>
Message-ID: <CBE908EB.9BBB%bonnal@ingm.org>

In case you want to use RedMine I can give you the license for free, any
bioruby developer can request it.


From p.j.a.cock at googlemail.com  Mon May 28 09:00:30 2012
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 28 May 2012 10:00:30 +0100
Subject: [BioRuby] DevTools
In-Reply-To: <CBE908EB.9BBB%bonnal@ingm.org>
References: <329E20F7-BF3F-4201-ADD0-ABCDFC5ECDE4@umich.edu>
	<CBE908EB.9BBB%bonnal@ingm.org>
Message-ID: <CAKVJ-_5hZofR0h=hXMACSEg+wgY4Gj__u8iY=_AkiB5QuY-Tgw@mail.gmail.com>

On Mon, May 28, 2012 at 9:50 AM, Raoul Bonnal <bonnal at ingm.org> wrote:

> In case you want to use RedMine I can give you the license for free, any
> bioruby developer can request it.
>

??? Redmine is licensed under the GPL.

Did you mean admin rights on the OBF RedMine instance, for
example to close bug reports?
https://redmine.open-bio.org/projects/bioruby

Peter


From bonnal at ingm.org  Mon May 28 09:03:01 2012
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 28 May 2012 11:03:01 +0200
Subject: [BioRuby] DevTools
In-Reply-To: <CAKVJ-_5hZofR0h=hXMACSEg+wgY4Gj__u8iY=_AkiB5QuY-Tgw@mail.gmail.com>
Message-ID: <CBE90BE5.9BC2%bonnal@ingm.org>

Ahhhhhhhhhhh


I mean RubyMine

 http://www.jetbrains.com/ruby/

sorry

On 28/05/12 11.00, "Peter Cock" <p.j.a.cock at googlemail.com> wrote:

> 
> 
> On Mon, May 28, 2012 at 9:50 AM, Raoul Bonnal <bonnal at ingm.org> wrote:
>> In case you want to use RedMine I can give you the license for free, any
>> bioruby developer can request it.
> 
> ??? Redmine is licensed under the GPL.
> 
> Did you mean admin rights on the OBF RedMine instance, for
> example to close bug reports?
> https://redmine.open-bio.org/projects/bioruby
> 
> Peter
> 
> 


From francesco.strozzi at gmail.com  Thu May 31 09:11:25 2012
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Thu, 31 May 2012 11:11:25 +0200
Subject: [BioRuby] EU Codefest 2012 Announcement
Message-ID: <CACtet2RtKqXavJ4C7An==MJP0MTW-yeHYwi9xKqWh6RP02Qp6w@mail.gmail.com>

The Open Bioinformatics Foundation (OBF) EU-CodeFest will be held at
Parco Tecnologico Padano (PTP) in Lodi, Italy, on the 19th-20th of July.
The CodeFest is a small, focused event under the auspices of the Open
Bioinformatics Foundation, and is a sister event of BOSC 2012, which is
being held in California, USA, this year.
Three main topics will be worked on during the CodeFest:

- NGS and high performance parsers for OpenBio projects.
- RDF and semantic web for bioinformatics.
- Bioinformatics pipelines definition, execution and distribution.

The number of places is limited to a maximum of 30 participants, on a
first-come, first-served basis. Undergraduate and PhD students are
welcome to participate.
The cost of the event is EUR 100 per person, which also includes
lunches, coffee breaks and the social dinner on the 19th of July.
For students only, we can sponsor a limited number of attendees, who
will not pay the registration fee. Students who wish to attend for
free will be asked to submit their qualifications and experience in
software development. The organizing committee will review students'
applications before final acceptance.
Talks and abstracts may be presented during the CodeFest in sessions
of 10 minutes plus questions. Coding activities will continue during
the talks.

The City of Lodi is very close to Milano and has good hotel
facilities. The connections by air are excellent, via Milano Malpensa,
Milano Linate and Bergamo Orio Al Serio airports.

Please register soon using the form at
http://tecnoparco.org/codefest, as places may run out quickly.


-- 

Francesco