From rob.syme at gmail.com  Wed Jun  1 03:17:30 2011
From: rob.syme at gmail.com (Rob Syme)
Date: Wed, 1 Jun 2011 15:17:30 +0800
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
Message-ID: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>

I've written a quick bioruby plugin to help parse blast results that
are too large to fit into memory.

Install: gem install bio-lazyblastxml
Code: github.com/robsyme/bioruby-lazyblastxml
Blog post: biolateral.wordpress.com/2011/05/31/parsing-huge-blast-files-with-bioruby/

The plugin uses LibXML::Reader to iterate through nodes, yielding ruby
objects when required.
The interface is as close to Bio::Blast::Report as I could keep it,
but there are a few changes:
* Iteration.hits, hit.hsps etc. do not return arrays. Instead, Report
is an enumerable that yields iterations, Iteration is an enumerable
that yields hits, Hits are enumerables that yield hsps, etc.
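
To illustrate that shape, here is a minimal sketch of nested enumerables.
It is not the plugin's code: the Report, Iteration and Hit classes and the
in-memory toy data below are hypothetical stand-ins, used so the example
runs without a BLAST file or LibXML.

```ruby
# Hypothetical stand-ins for the plugin's classes; a real implementation
# would stream from XML instead of walking an in-memory structure.
class Hit
  include Enumerable

  attr_reader :id

  def initialize(id, hsps)
    @id = id
    @hsps = hsps
  end

  # Yields one HSP at a time; the interface never hands out an array.
  def each(&block)
    return enum_for(:each) unless block_given?
    @hsps.each(&block)
  end
end

class Iteration
  include Enumerable

  def initialize(hits)
    @hits = hits
  end

  # Yields Hit objects one by one.
  def each
    return enum_for(:each) unless block_given?
    @hits.each { |id, hsps| yield Hit.new(id, hsps) }
  end
end

class Report
  include Enumerable

  def initialize(iterations)
    @iterations = iterations
  end

  # Yields Iteration objects one by one.
  def each
    return enum_for(:each) unless block_given?
    @iterations.each { |hits| yield Iteration.new(hits) }
  end
end

# Toy data: two iterations, each a hash of hit id => list of HSP bit scores.
report = Report.new([
  { "hit_a" => [50.1, 42.0], "hit_b" => [33.3] },
  { "hit_c" => [71.5] }
])

# Because every level is Enumerable, count/map/max work directly on it.
hits_per_iteration = report.map { |iteration| iteration.count }
best_scores = report.flat_map { |it| it.map { |hit| [hit.id, hit.max] } }
```

With a genuinely streaming reader, each level could presumably be walked
only once per pass over the file; that is the practical cost of never
building the arrays.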

This is my first attempt at real shared code, and all comments and
criticism are very welcome.

-r

Rob Syme
PhD Candidate
Curtin University
Western Australia


From pjotr.public14 at thebird.nl  Wed Jun  1 03:30:16 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 1 Jun 2011 09:30:16 +0200
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
Message-ID: <20110601073016.GB22723@thebird.nl>

Hi Rob,

Why did you not start from my lazy, fast, big-data XML parser for
BLAST?

  https://github.com/pjotrp/blastxmlparser

I hear it is being used in the NGS plugin. It would be good to run some
performance tests when you introduce something new.

I have a feeling you were simply not aware of it. 

Pj.

On Wed, Jun 01, 2011 at 03:17:30PM +0800, Rob Syme wrote:
> I've written a quick bioruby plugin to help parse blast results that
> are too large to fit into memory.
> 
> Install: gem install bio-lazyblastxml
> Code: github.com/robsyme/bioruby-lazyblastxml
> Blog post: biolateral.wordpress.com/2011/05/31/parsing-huge-blast-files-with-bioruby/
> 
> The plugin uses LibXML::Reader to iterate through nodes, yielding ruby
> objects when required.
> The interface is as close to Bio::Blast::Report as I could keep it,
> but there are a few changes:
> * Iteration.hits, hit.hsps etc. do not return arrays. Instead, Report
> is an enumerable that yields iterations, Iteration is an enumerable
> that yields hits, Hits are enumerables that yield hsps, etc.
> 
> This is my first attempt at real shared code, and all comments and
> criticism are very welcome.
> 
> -r
> 
> Rob Syme
> PhD Candidate
> Curtin University
> Western Australia
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
> 

From rob.syme at gmail.com  Wed Jun  1 04:07:13 2011
From: rob.syme at gmail.com (Rob Syme)
Date: Wed, 1 Jun 2011 16:07:13 +0800
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <20110601073016.GB22723@thebird.nl>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
Message-ID: <BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>

You're right, I hadn't seen your project. My mistake.
-r



From philipp.comans at googlemail.com  Wed Jun  1 04:25:37 2011
From: philipp.comans at googlemail.com (Philipp Comans)
Date: Wed, 1 Jun 2011 10:25:37 +0200
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
Message-ID: <2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>

Hi,

I had a similar problem recently. I needed an efficient parser for Blast XML results and discovered that the default parser in BioRuby was not suitable, so I wrote my own using Nokogiri.
In my opinion it is currently far too hard to discover BioRuby plugins. When people use the default XML or GFF parser that comes with BioRuby, they do not expect that there is another, more efficient version. There should be a section on the front page, or even in the corresponding parts of the API documentation, that makes people aware of these efficient parsers.

BTW thank you all for BioRuby; I used it in a project recently and it made my life tremendously easier.

Cheers,

Philipp



From rob.syme at gmail.com  Wed Jun  1 04:33:36 2011
From: rob.syme at gmail.com (Rob Syme)
Date: Wed, 1 Jun 2011 16:33:36 +0800
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
	<2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
Message-ID: <BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>

I think that the list at
http://bioruby.open-bio.org/wiki/BioRuby_Plugins is pretty
comprehensive; my mistake was simply not looking.
-r




From pjotr.public14 at thebird.nl  Wed Jun  1 04:49:48 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 1 Jun 2011 10:49:48 +0200
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
	<2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
	<BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>
Message-ID: <20110601084948.GA23592@thebird.nl>

The general idea is to have a number of 'blessed' plugins tied to
BioRuby releases. A blessed plugin is supposed to be rather solid,
and have a level of documentation and testing.

In addition there are 'development' plugins. Both should be listed on
the plugin page. We are introducing that plumbing shortly. The
duplication of work merely points out we need to get this done ;)

It is interesting to note both XML parsers use lazy iterators. I also
do lazy conversions. Same for my GFF3 plugin. Rob, it would be good to
compare performance on some real-life data.

Pj.


From bonnal at ingm.org  Wed Jun  1 06:26:19 2011
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 1 Jun 2011 12:26:19 +0200
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <20110601084948.GA23592@thebird.nl>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
	<2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
	<BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>
	<20110601084948.GA23592@thebird.nl>
Message-ID: <23D33897-ACAC-47B0-85D1-A3A808D46B48@ingm.org>

What about automating this process on our wiki? :-)

$# gem search -r bio-

bio-assembly (0.1.0)
bio-blastxmlparser (0.6.1)
bio-bwa (0.2.2)
bio-cnls_screenscraper (0.1.0)
bio-emboss_six_frame_nucleotide_sequences (0.1.0)
bio-gem (0.2.2)
bio-genomic-interval (0.1.2)
bio-gex (0.0.0)
bio-gff3 (0.8.6)
bio-graphics (1.4)
bio-hello (0.0.0)
bio-isoelectric_point (0.1.1)
bio-kb-illumina (0.1.0)
bio-lazyblastxml (0.4.0)
bio-logger (0.9.0)
bio-nexml (0.0.1)
bio-octopus (0.1.1)
bio-samtools (0.2.1)
bio-sge (0.0.0)
bio-tm_hmm (0.2.0)
bio-ucsc-api (0.0.4)

Wow, quite a long list of plugins :-) I'm happy to see this soup boiling.
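
One way the wiki automation could look, as a rough sketch: parse the
output of `gem search -r bio-` and emit one wiki bullet per plugin. The
plugin_wiki_list helper and the bullet format are assumptions made for
illustration; the sample input is a few lines from the listing above.

```ruby
# Hypothetical helper: turn `gem search` output lines such as
# "bio-gff3 (0.8.6)" into wiki bullet lines. Lines that do not look like
# "bio-name (version)" are skipped.
def plugin_wiki_list(gem_search_output)
  gem_search_output.each_line.map do |line|
    match = line.strip.match(/\A(bio-[\w-]+) \(([^)]+)\)\z/)
    next unless match
    "* #{match[1]} (version #{match[2]})"
  end.compact
end

# In real use the input might come from running: gem search -r bio-
sample = <<~OUTPUT

  bio-blastxmlparser (0.6.1)
  bio-gff3 (0.8.6)
  bio-lazyblastxml (0.4.0)
OUTPUT

wiki_lines = plugin_wiki_list(sample)
puts wiki_lines
```

Pushing the resulting lines to the wiki is left out here, since that
depends on how the wiki is edited.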


--
The only change to succeed is starting from a simple thing.


From rob.syme at gmail.com  Wed Jun  1 08:26:25 2011
From: rob.syme at gmail.com (Rob Syme)
Date: Wed, 1 Jun 2011 20:26:25 +0800
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <20110601084948.GA23592@thebird.nl>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
	<2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
	<BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>
	<20110601084948.GA23592@thebird.nl>
Message-ID: <BANLkTi=wFYGACB5OK3-rCbYm8Q72To8sjg@mail.gmail.com>

I pushed a 1.4GB file through each of the parsers, simply counting the
number of hits per iteration:

             user     system      total        real
Rob:    91.510000   0.620000  92.130000 ( 92.527617)
Pjotr:  46.730000   0.430000  47.160000 ( 47.263949)
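
The table above has the layout that Ruby's standard Benchmark module
prints. A harness along the following lines could produce such a
comparison. The two lambdas are hypothetical placeholders for "count the
hits per iteration" with each parser; they are not the calls that were
actually timed.

```ruby
require 'benchmark'

# Placeholder workloads standing in for the two parsers. In a real
# comparison each would read the same BLAST XML file and count the hits
# in every iteration.
count_with_parser_a = -> { (1..1_000).map { |i| i % 7 }.count(0) }
count_with_parser_b = -> { (1..1_000).count { |i| (i % 7).zero? } }

# Benchmark.bm prints user, system, total and real (wall-clock) seconds
# for each labelled report and returns the measurements.
results = Benchmark.bm(7) do |bm|
  bm.report("A:") { count_with_parser_a.call }
  bm.report("B:") { count_with_parser_b.call }
end
```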

One of the important differences in the parsers is that mine is lazy 'all
the way down', in that the iterations are lazy, the hits are lazy and the
hsps are lazy. No large chunks of XML are ever buffered into a string and
then parsed together. While lazy-loading is a good idea, and should probably
be adopted in more of the BioRuby core, taking it to this extreme is a bit
silly.
Pjotr's (more sensible) approach is to chunk up the file by iterations, and
then use XPath to pull out the relevant information from there. One
iteration will never be more than a few kb - certainly no strain on memory
consumption. The IO strain of reading a file in tiny pieces looks to be the
cause of the 2x slowdown in the example above.
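
The chunk-per-iteration idea can be sketched as follows. This is only an
illustration of the general technique, not code from either plugin: the
each_iteration_chunk helper is hypothetical, it assumes the Iteration
open and close tags sit on their own lines, and a plain string scan
stands in for the XPath queries to keep the example dependency-free.

```ruby
require 'stringio'

# Yields each <Iteration>...</Iteration> block of a BLAST XML stream as
# one string, so only a single iteration is buffered at a time.
def each_iteration_chunk(io)
  buffer = nil
  io.each_line do |line|
    buffer = String.new if line.include?("<Iteration>")
    buffer << line if buffer
    if buffer && line.include?("</Iteration>")
      yield buffer
      buffer = nil
    end
  end
end

# Tiny hand-written sample in the shape of BLAST XML output.
xml = <<~XML
  <BlastOutput_iterations>
  <Iteration>
    <Iteration_iter-num>1</Iteration_iter-num>
    <Iteration_hits>
      <Hit><Hit_num>1</Hit_num></Hit>
      <Hit><Hit_num>2</Hit_num></Hit>
    </Iteration_hits>
  </Iteration>
  <Iteration>
    <Iteration_iter-num>2</Iteration_iter-num>
    <Iteration_hits>
      <Hit><Hit_num>1</Hit_num></Hit>
    </Iteration_hits>
  </Iteration>
  </BlastOutput_iterations>
XML

# Count hits per iteration, one buffered chunk at a time.
hit_counts = []
each_iteration_chunk(StringIO.new(xml)) do |chunk|
  hit_counts << chunk.scan("<Hit>").size
end
```

In a real parser each chunk would be handed to an XML library and
queried there, rather than scanned as text.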

Lesson 1: Pragmatism is a good thing.
Lesson 2: Always check that the work you're doing hasn't been done
before.
Lesson 3: Use Pjotr's parser to make light work of your large Blast results.

-r


From yannick.wurm at unil.ch  Mon Jun 13 01:49:39 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Mon, 13 Jun 2011 12:49:39 +0700
Subject: [BioRuby] ruby BLAST server (web frontend)
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
Message-ID: <4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>

Dear list & CC-ed,

let me quote a discussion from a while back ( http://answerpot.com/showthread.php?1292835-rails+blast+server ):

> I'd like to set up a small server for people to run BLAST against some of my sequences & see the results. 
> GMOD obviously comes to mind, but it seems like overkill. 
> And perhaps there is an almost automagic way to do this with ruby on rails. Has anyone done this yet?


There was no good solution at the time. Anurag Priyam & I have since been working on something that fills this need. Ben Woodcroft has recently been contributing as well. Check:
https://github.com/yannickwurm/sequenceserver or http://www.sequenceserver.com

Some things remain to be improved, but overall the software works well. We therefore thought we would share our progress on the list where the idea started. An excerpt of the README highlights some features:

Ease of use for biologists:
 * intuitive and helpful web interface: automatic sequence type detection that helps choose appropriate BLAST method and database types
 * links to easily download sequences of BLAST hits
 * support for advanced options.

Rapid deployment for bioinformatics administrators:
 * assisted formatting of BLAST databases (with sequence type detection)
 * automatic discovery of formatted BLAST databases during startup
 * uses ruby's internal web server (on any open port) or Apache
 * add custom hyperlinks from hits (to your genome browser or custom database).


We have been using this as the web frontend for our ant genome BLAST at http://www.antgenomes.org for a few months.

Comments, suggestions... and contributions are most welcome!

Cheers,

Anurag & Ben & Yannick


-----------------------------
  Ant Genomes & Evolution 
 http://yannick.poulet.org
    skype://yannickwurm
-----------------------------


From bonnal at ingm.org  Mon Jun 13 03:17:21 2011
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 13 Jun 2011 09:17:21 +0200
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
Message-ID: <F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>

Dear Yannick and others,
cute work.
Just a few suggestions.
You could build a gem and distribute it with a single executable script, "sequenceserver", from which you can call all other tasks (configuration, database setup, or starting the service), like we did with biongs. It's a more consistent approach, and the end user has a clear reference to your application.
If you install it as a gem, you then need to build a web environment somewhere else, but it is quite simple to create a scaffold directory ready to be used by a web server (where you put your configuration/database references, public, js, css, etc.).
Something like:

sequenceserver database_formatter directory_with_fasta_files
sequenceserver config production --bin="~/ncbi-blast-2.2.24+/bin/" --database="/Users/me/blast_databases/"
sequenceserver start
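
A single entry point of that kind might be organised roughly as below.
This is only a sketch: the subcommand names follow the commands above,
but the run function, the option handling and the returned hashes are
hypothetical and are not SequenceServer's actual interface.

```ruby
require 'optparse'

# Hypothetical dispatcher: parses "sequenceserver <command> [args]" style
# arguments and returns a description of what would be done, instead of
# actually formatting databases or starting a server.
def run(argv)
  args = argv.dup
  command = args.shift
  case command
  when "database_formatter"
    { action: :format_databases, directory: args.first }
  when "config"
    options = { environment: args.shift }
    OptionParser.new do |opts|
      opts.on("--bin=PATH")      { |path| options[:bin] = path }
      opts.on("--database=PATH") { |path| options[:database] = path }
    end.parse!(args)
    { action: :configure }.merge(options)
  when "start"
    { action: :start_server }
  else
    { action: :usage, message: "unknown command: #{command.inspect}" }
  end
end

p run(%w[config production --bin=/opt/blast/bin --database=/data/blast_databases])
p run(%w[start])
```

Returning plain hashes keeps the dispatch logic easy to test; a real
script would call the corresponding task instead.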

Then, if your application runs on Ruby 1.8.7, try REE (Ruby Enterprise Edition) with Passenger and nginx; in my opinion nginx is the easiest web server to set up, with a high level of performance: http://www.modrails.com/

If you need help configuring nginx, I can give you some hints or an example of my config; it works well with rvm too.

Could this become a BioRuby plugin?




From yannick.wurm at unil.ch  Mon Jun 13 04:06:43 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Mon, 13 Jun 2011 15:06:43 +0700
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
Message-ID: <AF9A450A-3741-44CC-845E-3E76E313D821@unil.ch>

Thanks for the suggestions Raoul,
they could substantially streamline setting things up!

cheers
yannick




-----------------------------
   Ant Genomes & Evolution 
  http://yannick.poulet.org
     skype://yannickwurm
-----------------------------
BLAST @ http://antgenomes.org


From anurag08priyam at gmail.com  Mon Jun 13 12:10:53 2011
From: anurag08priyam at gmail.com (Anurag Priyam)
Date: Mon, 13 Jun 2011 21:40:53 +0530
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
Message-ID: <BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>

> cute work.

Thanks a lot Raoul :).

> Just a few suggestions.
> You could build a gem and distribute it with a single executable script, "sequenceserver", from which you can call all other tasks
> (configuration, database setup, or starting the service), like we did with biongs. It's a more consistent approach, and the end user has a clear reference to your application.

Agreed. And that is our target for the next release.

> If you install it as a gem, you then need to build a web environment somewhere else, but it is quite simple to create a scaffold directory ready to be used by a web server (where you put your configuration/database references, public, js, css, etc.).
> Something like:
>
> sequenceserver database_formatter directory_with_fasta_files
> sequenceserver config production --bin="~/ncbi-blast-2.2.24+/bin/" --database="/Users/me/blast_databases/"
> sequenceserver start

This looks quite good. I will keep this in mind when pushing forward a
gem release.

> Then, if your application runs on Ruby 1.8.7, try REE (Ruby Enterprise Edition) with Passenger and nginx; in my opinion nginx is the easiest web server to set up, with a high level of performance: http://www.modrails.com/
>
> If you need help configuring nginx, I can give you some hints or an example of my config; it works well with rvm too.

That would be great. We are putting together a wiki page with
instructions on deploying SequenceServer on Apache and Nginx. I am
almost done adding instructions for Apache, but I am not sure how to
do it for Nginx.

> Could this become a BioRuby plugin?

So, would it then become bio-sequenceserver? IMO, it doesn't logically
fit in as a BioRuby plugin, since it doesn't depend on BioRuby. Also,
BioRuby is more like a library, whereas SequenceServer is more like an
end product. Not sure though :-|.

-- 
Anurag Priyam
http://about.me/yeban/

From anurag08priyam at gmail.com  Mon Jun 13 12:12:25 2011
From: anurag08priyam at gmail.com (Anurag Priyam)
Date: Mon, 13 Jun 2011 21:42:25 +0530
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
	<BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
Message-ID: <BANLkTim=ZJHn-fCUp6hVXXVEivC7Yzvu4A@mail.gmail.com>

>> If you need help configuring nginx, I can give you some hints or an example of my config; it works well with rvm too.
>
> That would be great. We are putting together a wiki page with
> instructions on deploying SequenceServer on Apache and Nginx. I am
> almost done adding instructions for Apache, but I am not sure how to
> do it for Nginx.

Oops, forgot to add the link:
https://github.com/yannickwurm/sequenceserver/wiki/Deploying-Sequence-Server

-- 
Anurag Priyam
http://about.me/yeban/

From donttrustben at gmail.com  Tue Jun 14 09:19:37 2011
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Tue, 14 Jun 2011 23:19:37 +1000
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
	<BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
Message-ID: <BANLkTi=rdhqBzd_hzDZvP442pSspDpH3dw@mail.gmail.com>

Hi,


> > could this became a bioruby plugin ?
>
> So, then would it become bio-sequenceserver? IMO, it doesn't logically
> fit in as a BioRuby plugin, as in it doesn't depend on BioRuby. And
> BioRuby is more like library but SequenceServer is more like an end
> product. Not sure though :-|.
>

To be technical, the branch trying to implement the blast overview graphic
does rely on BioRuby, since that is a dependency of bio-graphics. But that
branch hasn't been merged into the main tree yet, and might remain an
optional thing anyway.

-- 
Ben J Woodcroft, BE (Hons)

PhD Candidate
Ralph Laboratory
The University of Melbourne
Melbourne, Australia

tel: (+613) 8344 2319
b.woodcroft at pgrad.unimelb.edu.au

From pjotr.public14 at thebird.nl  Tue Jun 14 09:26:54 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 14 Jun 2011 15:26:54 +0200
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <BANLkTi=rdhqBzd_hzDZvP442pSspDpH3dw@mail.gmail.com>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
	<BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
	<BANLkTi=rdhqBzd_hzDZvP442pSspDpH3dw@mail.gmail.com>
Message-ID: <20110614132654.GA20916@thebird.nl>

The advantages of making it a plugin:

1. easy install for users
2. visibility from the BioRuby project
3. potentially a member of the stable plugin family
4. developers may use your libraries - even if the focus is an
   application

Pj.

On Tue, Jun 14, 2011 at 11:19:37PM +1000, Ben Woodcroft wrote:
> Hi,
> 
> 
> > > could this became a bioruby plugin ?
> >
> > So, then would it become bio-sequenceserver? IMO, it doesn't logically
> > fit in as a BioRuby plugin, as in it doesn't depend on BioRuby. And
> > BioRuby is more like library but SequenceServer is more like an end
> > product. Not sure though :-|.
> >
> 
> To be technical, the branch trying to implement the blast overview graphic
> does rely on BioRuby, since that is a dependency of bio-graphics. But that
> branch hasn't been merged into the main tree yet, and might remain an
> optional thing anyway.
> 
> -- 
> Ben J Woodcroft, BE (Hons)
> 
> PhD Candidate
> Ralph Laboratory
> The University of Melbourne
> Melbourne, Australia
> 
> tel: (+613) 8344 2319
> b.woodcroft at pgrad.unimelb.edu.au
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
> 

From mail at michaelbarton.me.uk  Thu Jun 23 11:05:48 2011
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Thu, 23 Jun 2011 11:05:48 -0400
Subject: [BioRuby] GFF3 Record Equality Method
Message-ID: <20110623150548.GA1030@Michael-Bartons-MacBook.local>

As far as I can tell the GFF3 record in bioruby uses Object#== for comparison.
I'm implementing a Bio::GFF::GFF3::Record#== method based on comparison of the
GFF3 fields. Would this be a useful addition to the BioRuby library?

Cheers

Michael Barton

From ngoto at gen-info.osaka-u.ac.jp  Fri Jun 24 08:41:29 2011
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Fri, 24 Jun 2011 21:41:29 +0900
Subject: [BioRuby] GFF3 Record Equality Method
In-Reply-To: <20110623150548.GA1030@Michael-Bartons-MacBook.local>
References: <20110623150548.GA1030@Michael-Bartons-MacBook.local>
Message-ID: <20110624124129.C00871CBC47D@idnmail.gen-info.osaka-u.ac.jp>

On Thu, 23 Jun 2011 11:05:48 -0400
Michael Barton <mail at michaelbarton.me.uk> wrote:

> As far as I can tell the GFF3 record in bioruby uses Object#== for comparison.
> I'm implementing a Bio::GFF::GFF3::Record#== method based on comparison of the
> GFF3 fields. Would this be a useful addition to the BioRuby library?
> 
> Cheers
> 
> Michael Barton

Bio::GFF::GFF3::Record inherits from Bio::GFF::GFF2::Record, and
GFF2::Record already has its own == method. GFF2::Record#==
provides enough functionality to compare GFF3 records as well as
GFF2 records.

#sample code
#-----------------------------------------------------------
 require 'bio'
 str1 = "chrI\tSGD\tcentromere\t151467\t151584\t.\t+\t.\t" +
        "ID=CEN1;Name=CEN1;gene=CEN1;Alias=CEN1,test%3B0001;" +
        "Note=Chromosome I centromere;dbxref=SGD:S000006463;" +
        "Target=test%2002 123 456 -,test%2C03 159 314;" +
        "memo%3Dtest%3Battr=99.9%25%09match"
 str2 = str1.dup
 str3 = str1.gsub(/CEN1/, 'CEN2')
 obj0 = Bio::GFF::GFF3::Record.new(str1)
 obj1 = Bio::GFF::GFF3::Record.new(str1)
 obj2 = Bio::GFF::GFF3::Record.new(str2)
 obj3 = Bio::GFF::GFF3::Record.new(str3)

 p obj0==obj1
 p obj1==obj2
 p obj1==obj3
#-----------------------------------------------------------
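
For readers without BioRuby at hand, the idea of a field-based ==
can be sketched in plain Ruby. This is only an illustration: the
SimpleRecord class and its field names are hypothetical, it is not
BioRuby's implementation, and it compares the attributes column as a
raw string (so attribute order matters here).

```ruby
# Hypothetical stand-in for a GFF record; NOT Bio::GFF::GFF2::Record.
class SimpleRecord
  FIELDS = [:seqname, :source, :feature, :start, :end,
            :score, :strand, :frame, :attributes]
  attr_reader(*FIELDS)

  # Split one tab-separated GFF line into its nine columns.
  def initialize(line)
    values = line.chomp.split("\t", 9)
    FIELDS.zip(values) { |name, value| instance_variable_set("@#{name}", value) }
  end

  # Two records are equal when they share a class and every field matches.
  def ==(other)
    other.class == self.class &&
      FIELDS.all? { |f| public_send(f) == other.public_send(f) }
  end
end

line = "chrI\tSGD\tcentromere\t151467\t151584\t.\t+\t.\tID=CEN1;Name=CEN1"
a = SimpleRecord.new(line)
b = SimpleRecord.new(line.dup)
c = SimpleRecord.new(line.gsub(/CEN1/, 'CEN2'))

p a == b   # true
p a == c   # false
```

Without such a method, Object#== would only return true when both
sides are the very same object.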

--
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

From andrew.j.grimm at gmail.com  Sun Jun 26 06:16:21 2011
From: andrew.j.grimm at gmail.com (Andrew Grimm)
Date: Sun, 26 Jun 2011 20:16:21 +1000
Subject: [BioRuby] Anyone else attending RubyKaigi 2011?
Message-ID: <BANLkTinc9BURgNiNbirdM=VguSCceq4gzQ@mail.gmail.com>

I noticed that Goto-san's talk got accepted as a lightning talk.

Are any other BioRuby contributors or users attending?

I'll be giving a talk, but I'll only briefly mention bioinformatics.
I'll be talking about the Small Eigen Collider. In describing why I
created the Small Eigen Collider, I'll mention that I'm a
bioinformatician, and that I deal with enough information that I am
tempted to run Ruby code under implementations other than YARV.
http://rubykaigi.org/2011/en/schedule/details/18S03

Andrew

From rob.syme at gmail.com  Wed Jun  1 07:17:30 2011
From: rob.syme at gmail.com (Rob Syme)
Date: Wed, 1 Jun 2011 15:17:30 +0800
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
Message-ID: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>

I've written a quick bioruby plugin to help parse blast results that
are too large to fit into memory.

Install: gem install bio-lazyblastxml
Code: github.com/robsyme/bioruby-lazyblastxml
Blog post: biolateral.wordpress.com/2011/05/31/parsing-huge-blast-files-with-bioruby/

The plugin uses LibXML::Reader to iterate through nodes, yielding ruby
objects when required.
The interface is as close to Bio::Blast::Report as I could keep it,
but there are a few changes:
* Iteration.hits, hit.hsps etc. do not return arrays. Instead, Report
is an enumerable that yields iterations, Iteration is an enumerable
that yields hits, Hits are enumerables that yield hsps, etc.
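
To make the nesting concrete, here is a toy sketch in plain Ruby. It
is not the plugin's code: ToyReport, ToyIteration, ToyHit and the
in-memory data are made up for illustration, and the real plugin
pulls nodes from LibXML::Reader rather than from an array.

```ruby
# Toy model of "enumerables all the way down"; all classes are hypothetical.
class ToyHit
  include Enumerable
  attr_reader :id

  def initialize(id, hsps)
    @id, @hsps = id, hsps
  end

  # Yields each hsp (here just an e-value) in turn.
  def each(&block)
    return enum_for(:each) unless block
    @hsps.each(&block)
  end
end

class ToyIteration
  include Enumerable

  def initialize(hits)   # hits: { "hit id" => [e-values] }
    @hits = hits
  end

  # Builds each ToyHit only when it is asked for.
  def each
    return enum_for(:each) unless block_given?
    @hits.each { |id, hsps| yield ToyHit.new(id, hsps) }
  end
end

class ToyReport
  include Enumerable

  def initialize(data)
    @data = data
  end

  def each
    return enum_for(:each) unless block_given?
    @data.each { |hits| yield ToyIteration.new(hits) }
  end
end

report = ToyReport.new([{ "hitA" => [1e-30, 1e-5], "hitB" => [0.01] }, {}])
counts = report.map { |iteration| iteration.count }
p counts  # [2, 0]
```

Because each level only includes Enumerable and defines each, methods
such as map, count and select work at every level without building
intermediate arrays up front.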

This is my first attempt at real shared code, and all comments and
criticism are very welcome.

-r

Rob Syme
PhD Candidate
Curtin University
Western Australia


From pjotr.public14 at thebird.nl  Wed Jun  1 07:30:16 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 1 Jun 2011 09:30:16 +0200
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
Message-ID: <20110601073016.GB22723@thebird.nl>

Hi Rob,

Why did you not start from my lazy, fast, big-data XML parser for
BLAST?

  https://github.com/pjotrp/blastxmlparser

I hear it is being used in the NGS plugin. Be good to do some
performance tests, when you introduce something new.

I have a feeling you were simply not aware of it. 

Pj.

On Wed, Jun 01, 2011 at 03:17:30PM +0800, Rob Syme wrote:
> I've written a quick bioruby plugin to help parse blast results that
> are too large to fit into memory.
> 
> Install: gem install bio-lazyblastxml
> Code: github.com/robsyme/bioruby-lazyblastxml
> Blog post: biolateral.wordpress.com/2011/05/31/parsing-huge-blast-files-with-bioruby/
> 
> The plugin uses LibXML::Reader to iterate through nodes, yielding ruby
> objects when required.
> The interface is as close to Bio::Blast::Report as I could keep it,
> but there are a few changes:
> * Iteration.hits, hit.hsps etc do not return arrays. Instead, Report
> is a enumerable that yields iterations, Iteration is an enumerable
> that yields hits, Hits are enumerables that yield hsps, etc.
> 
> This is my first attempt real shared code, and all comments and
> criticism are very welcome.
> 
> -r
> 
> Rob Syme
> PhD Candidate
> Curtin University
> Western Australia
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby
> 


From rob.syme at gmail.com  Wed Jun  1 08:07:13 2011
From: rob.syme at gmail.com (Rob Syme)
Date: Wed, 1 Jun 2011 16:07:13 +0800
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <20110601073016.GB22723@thebird.nl>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
Message-ID: <BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>

You're right, I hadn't seen your project. My mistake.
-r

On Wed, Jun 1, 2011 at 3:30 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:
> Hi Rob,
>
> Why did you not start from my lazy fast and big-data XML parser for
> BLAST?
>
>   https://github.com/pjotrp/blastxmlparser
>
> I hear it is being used in the NGS plugin. Be good to do some
> performance tests, when you introduce something new.
>
> I have a feeling you were simply not aware of it.
>
> Pj.
>
> On Wed, Jun 01, 2011 at 03:17:30PM +0800, Rob Syme wrote:
>> I've written a quick bioruby plugin to help parse blast results that
>> are too large to fit into memory.
>>
>> Install: gem install bio-lazyblastxml
>> Code: github.com/robsyme/bioruby-lazyblastxml
>> Blog post: biolateral.wordpress.com/2011/05/31/parsing-huge-blast-files-with-bioruby/
>>
>> The plugin uses LibXML::Reader to iterate through nodes, yielding ruby
>> objects when required.
>> The interface is as close to Bio::Blast::Report as I could keep it,
>> but there are a few changes:
>> * Iteration.hits, hit.hsps etc do not return arrays. Instead, Report
>> is a enumerable that yields iterations, Iteration is an enumerable
>> that yields hits, Hits are enumerables that yield hsps, etc.
>>
>> This is my first attempt real shared code, and all comments and
>> criticism are very welcome.
>>
>> -r
>>
>> Rob Syme
>> PhD Candidate
>> Curtin University
>> Western Australia
>>
>> _______________________________________________
>> BioRuby Project - http://www.bioruby.org/
>> BioRuby mailing list
>> BioRuby at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioruby
>>
>


From philipp.comans at googlemail.com  Wed Jun  1 08:25:37 2011
From: philipp.comans at googlemail.com (Philipp Comans)
Date: Wed, 1 Jun 2011 10:25:37 +0200
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
Message-ID: <2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>

Hi,

I had a similar problem recently. I needed an efficient parser for Blast XML results and I discovered that the default parser in BioRuby was not suitable. So I wrote my own using Nokogiri.
In my opinion it is way too hard at the moment to discover BioRuby plugins. When people use the default XML or GFF parser that comes with BioRuby, they do not expect that there is another, more efficient version. There should be a section on the front page or even in the corresponding parts of the API documentation that makes people aware of the existence of these efficient parsers.

BTW thank you all for BioRuby, I used it in a project recently and it made my life tremendously easier.

Cheers,

Philipp

Am Mittwoch, 1. Juni 2011 um 10:07 schrieb Rob Syme:

> You're right, I hadn't seen your project. My mistake.
> -r
> 
> On Wed, Jun 1, 2011 at 3:30 PM, Pjotr Prins <pjotr.public14 at thebird.nl (mailto:pjotr.public14 at thebird.nl)> wrote:
> > Hi Rob,
> > 
> > Why did you not start from my lazy fast and big-data XML parser for
> > BLAST?
> > 
> > https://github.com/pjotrp/blastxmlparser
> > 
> > I hear it is being used in the NGS plugin. Be good to do some
> > performance tests, when you introduce something new.
> > 
> > I have a feeling you were simply not aware of it.
> > 
> > Pj.
> > 
> > On Wed, Jun 01, 2011 at 03:17:30PM +0800, Rob Syme wrote:
> > > I've written a quick bioruby plugin to help parse blast results that
> > > are too large to fit into memory.
> > > 
> > > Install: gem install bio-lazyblastxml
> > > Code: github.com/robsyme/bioruby-lazyblastxml (http://github.com/robsyme/bioruby-lazyblastxml)
> > > Blog post: biolateral.wordpress.com/2011/05/31/parsing-huge-blast-files-with-bioruby/ (http://biolateral.wordpress.com/2011/05/31/parsing-huge-blast-files-with-bioruby/)
> > > 
> > > The plugin uses LibXML::Reader to iterate through nodes, yielding ruby
> > > objects when required.
> > > The interface is as close to Bio::Blast::Report as I could keep it,
> > > but there are a few changes:
> > >  Iteration.hits, hit.hsps etc do not return arrays. Instead, Report
> > > is a enumerable that yields iterations, Iteration is an enumerable
> > > that yields hits, Hits are enumerables that yield hsps, etc.
> > > 
> > > This is my first attempt real shared code, and all comments and
> > > criticism are very welcome.
> > > 
> > > -r
> > > 
> > > Rob Syme
> > > PhD Candidate
> > > Curtin University
> > > Western Australia
> > > 
> > > _______________________________________________
> > > BioRuby Project - http://www.bioruby.org/
> > > BioRuby mailing list
> > > BioRuby at lists.open-bio.org (mailto:BioRuby at lists.open-bio.org)
> > > http://lists.open-bio.org/mailman/listinfo/bioruby
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org (mailto:BioRuby at lists.open-bio.org)
> http://lists.open-bio.org/mailman/listinfo/bioruby


From rob.syme at gmail.com  Wed Jun  1 08:33:36 2011
From: rob.syme at gmail.com (Rob Syme)
Date: Wed, 1 Jun 2011 16:33:36 +0800
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
	<2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
Message-ID: <BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>

I think that the list at
http://bioruby.open-bio.org/wiki/BioRuby_Plugins is pretty
comprehensive; my mistake was simply not looking.
-r


On Wed, Jun 1, 2011 at 4:25 PM, Philipp Comans
<philipp.comans at googlemail.com> wrote:
> Hi,
>
> I had a similar problem recently. I needed an efficient parser for Blast XML results and I discovered that the default parser in BioRuby was not suitable. So I wrote my own using Nokogiri.
> In my opinion it is way too hard at the moment to discover BioPlugins. When people use the default XML or GFF parser that comes with BioRUby, they do not expect that there is another, more efficient version. There should be a section on the front page or even in the corresponding parts of the API documentation that makes people aware of the existence of these efficient parsers.
>
> BTW thank you all for BioRuby, I used in a project recently and it made my life tremendously easier.
>
> Cheers,
>
> Philipp
>


From pjotr.public14 at thebird.nl  Wed Jun  1 08:49:48 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Wed, 1 Jun 2011 10:49:48 +0200
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
	<2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
	<BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>
Message-ID: <20110601084948.GA23592@thebird.nl>

The general idea is to have a number of 'blessed' plugins tied to
BioRuby releases. A blessed plugin is supposed to be rather solid,
and have a level of documentation and testing.

In addition there are 'development' plugins. Both should be listed on
the plugin page. We are introducing that plumbing shortly. The
duplication of work merely points out we need to get this done ;)

It is interesting to note that both XML parsers use lazy iterators. I also
do lazy conversions. Same for my GFF3 plugin. Rob, it would be good to
compare performance on some real-life data.

Pj.

On Wed, Jun 01, 2011 at 04:33:36PM +0800, Rob Syme wrote:
> I think that the list at
> http://bioruby.open-bio.org/wiki/BioRuby_Plugins is pretty
> comprehensive, my mistake was simply not looking.
> -r
> 
> 
> On Wed, Jun 1, 2011 at 4:25 PM, Philipp Comans
> <philipp.comans at googlemail.com> wrote:
> > Hi,
> >
> > I had a similar problem recently. I needed an efficient parser for Blast XML results and I discovered that the default parser in BioRuby was not suitable. So I wrote my own using Nokogiri.
> > In my opinion it is way too hard at the moment to discover BioPlugins. When people use the default XML or GFF parser that comes with BioRUby, they do not expect that there is another, more efficient version. There should be a section on the front page or even in the corresponding parts of the API documentation that makes people aware of the existence of these efficient parsers.
> >
> > BTW thank you all for BioRuby, I used in a project recently and it made my life tremendously easier.
> >
> > Cheers,
> >
> > Philipp
> >
> 


From bonnal at ingm.org  Wed Jun  1 10:26:19 2011
From: bonnal at ingm.org (Raoul Bonnal)
Date: Wed, 1 Jun 2011 12:26:19 +0200
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <20110601084948.GA23592@thebird.nl>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
	<2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
	<BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>
	<20110601084948.GA23592@thebird.nl>
Message-ID: <23D33897-ACAC-47B0-85D1-A3A808D46B48@ingm.org>

What about automating this process on our wiki? :-)

$# gem search -r bio-

bio-assembly (0.1.0)
bio-blastxmlparser (0.6.1)
bio-bwa (0.2.2)
bio-cnls_screenscraper (0.1.0)
bio-emboss_six_frame_nucleotide_sequences (0.1.0)
bio-gem (0.2.2)
bio-genomic-interval (0.1.2)
bio-gex (0.0.0)
bio-gff3 (0.8.6)
bio-graphics (1.4)
bio-hello (0.0.0)
bio-isoelectric_point (0.1.1)
bio-kb-illumina (0.1.0)
bio-lazyblastxml (0.4.0)
bio-logger (0.9.0)
bio-nexml (0.0.1)
bio-octopus (0.1.1)
bio-samtools (0.2.1)
bio-sge (0.0.0)
bio-tm_hmm (0.2.0)
bio-ucsc-api (0.0.4)

Wow, quite a long list of plugins :-) I'm happy to see this soup boiling.
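
A rough sketch of how such a list might be turned into wiki markup.
It only parses text already captured from `gem search -r bio-`; the
parse_gem_list helper and the "* name (version)" output format are
assumptions, not an existing tool.

```ruby
# Turn lines like "bio-gff3 (0.8.6)" into [name, version] pairs.
# Lines that do not look like a bio-* gem entry are skipped.
def parse_gem_list(text)
  text.each_line.map { |line|
    m = line.strip.match(/\A(bio-[\w-]+) \(([^)]+)\)\z/)
    m && [m[1], m[2]]
  }.compact
end

sample = <<~OUT
  bio-blastxmlparser (0.6.1)
  bio-gff3 (0.8.6)
  bio-lazyblastxml (0.4.0)
OUT

parse_gem_list(sample).each do |name, version|
  puts "* #{name} (#{version})"
end
```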

On 01/giu/2011, at 10.49, Pjotr Prins wrote:

> The general idea is to have a number of 'blessed' plugins tied to
> BioRuby releases. A blessed plugin is supposed to be rather solid,
> and have a level of documentation and testing.
> 
> In addition there are 'development' plugins. Both should be listed on
> the plugin page. We are introducing that plumbing shortly. The
> duplication of work merely points out we need to get this done ;)
> 
> It is interesting to note both XML parsers use lazy iterators. I also
> do lazy conversions. Same for my GFF3 plugin. Rob, be good to compare
> performance on some real-life data.
> 
> Pj.
> 
> On Wed, Jun 01, 2011 at 04:33:36PM +0800, Rob Syme wrote:
>> I think that the list at
>> http://bioruby.open-bio.org/wiki/BioRuby_Plugins is pretty
>> comprehensive, my mistake was simply not looking.
>> -r
>> 
>> 
>> On Wed, Jun 1, 2011 at 4:25 PM, Philipp Comans
>> <philipp.comans at googlemail.com> wrote:
>>> Hi,
>>> 
>>> I had a similar problem recently. I needed an efficient parser for Blast XML results and I discovered that the default parser in BioRuby was not suitable. So I wrote my own using Nokogiri.
>>> In my opinion it is way too hard at the moment to discover BioPlugins. When people use the default XML or GFF parser that comes with BioRUby, they do not expect that there is another, more efficient version. There should be a section on the front page or even in the corresponding parts of the API documentation that makes people aware of the existence of these efficient parsers.
>>> 
>>> BTW thank you all for BioRuby, I used in a project recently and it made my life tremendously easier.
>>> 
>>> Cheers,
>>> 
>>> Philipp
>>> 
>> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
The only change to succeed is starting from a simple thing.


From rob.syme at gmail.com  Wed Jun  1 12:26:25 2011
From: rob.syme at gmail.com (Rob Syme)
Date: Wed, 1 Jun 2011 20:26:25 +0800
Subject: [BioRuby] Parsing large Blast xml files - a new bioruby plugin
In-Reply-To: <20110601084948.GA23592@thebird.nl>
References: <BANLkTim79AgRW2LVAGQuHCcFw2WfNZtAag@mail.gmail.com>
	<20110601073016.GB22723@thebird.nl>
	<BANLkTi=SC9a3U9ev8=0EFNTzF0wuqrTwLw@mail.gmail.com>
	<2738C8712A2F46BAB655CE885CDF4F89@googlemail.com>
	<BANLkTik3-hXTVSQKRK6kOhW8JY3Cf7NkeQ@mail.gmail.com>
	<20110601084948.GA23592@thebird.nl>
Message-ID: <BANLkTi=wFYGACB5OK3-rCbYm8Q72To8sjg@mail.gmail.com>

I pushed a 1.4GB file through each of the parsers, simply counting the
number of hits per iteration:

             user     system      total        real
Rob:    91.510000   0.620000  92.130000 ( 92.527617)
Pjotr:  46.730000   0.430000  47.160000 ( 47.263949)
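
The columns above look like output from Ruby's standard Benchmark
module. A minimal sketch of such a timing harness follows; the two
lambdas are placeholder workloads, not the parsers themselves, and
the labels are only there to mirror the table.

```ruby
require 'benchmark'

# Placeholder workloads; in a real comparison each lambda would run one
# parser over the same BLAST XML file and count the hits per iteration.
parsers = {
  "Rob:"   => lambda { 200_000.times.count },
  "Pjotr:" => lambda { 100_000.times.count },
}

# Benchmark.bm prints a user/system/total/real row for every report
# and returns the collected Benchmark::Tms objects.
results = Benchmark.bm(7) do |bm|
  parsers.each { |label, work| bm.report(label) { work.call } }
end
```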

One of the important differences in the parsers is that mine is lazy 'all
the way down', in that the iterations are lazy, the hits are lazy and the
hsps are lazy. No large chunks of XML are ever buffered into a string and
then parsed together. While lazy-loading is a good idea, and should probably
be adopted in more of the BioRuby core, taking it to this extreme is a bit
silly.
Pjotr's (more sensible) approach is to chunk up the file by iterations, and
then use XPath to pull out the relevant information from there. One
iteration will never be more than a few kb - certainly no strain on memory
consumption. The IO strain of reading a file in tiny pieces looks to be the
cause of the 2x slowdown in the example above.
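
The chunk-by-iteration idea can be sketched roughly as below. This is
not code from either parser: each_iteration_chunk is a hypothetical
helper, it assumes the <Iteration> and </Iteration> tags sit on their
own lines, and plain string matching stands in for the XPath step.

```ruby
require 'stringio'

# Yield one <Iteration>...</Iteration> block at a time from an IO,
# so that only a single iteration is held in memory.
def each_iteration_chunk(io)
  return enum_for(:each_iteration_chunk, io) unless block_given?
  buffer = nil
  io.each_line do |line|
    buffer = +"" if line.include?("<Iteration>")
    buffer << line if buffer
    if buffer && line.include?("</Iteration>")
      yield buffer
      buffer = nil
    end
  end
end

xml = <<~XML
  <BlastOutput_iterations>
  <Iteration>
    <Iteration_hits>
      <Hit><Hit_id>a</Hit_id></Hit>
      <Hit><Hit_id>b</Hit_id></Hit>
    </Iteration_hits>
  </Iteration>
  <Iteration>
    <Iteration_hits>
    </Iteration_hits>
  </Iteration>
  </BlastOutput_iterations>
XML

# Count the hits in each chunk; a real parser would hand the chunk
# to an XML library here instead of scanning for a literal tag.
hit_counts = each_iteration_chunk(StringIO.new(xml)).map { |chunk| chunk.scan("<Hit>").size }
p hit_counts  # [2, 0]
```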

Lesson 1: Pragmatism is a good thing.
Lesson 2: Always check that the work you're doing hasn't been done
before.
Lesson 3: Use Pjotr's parser to make light work of your large Blast results.

-r

On Wed, Jun 1, 2011 at 4:49 PM, Pjotr Prins <pjotr.public14 at thebird.nl> wrote:

> The general idea is to have a number of 'blessed' plugins tied to
> BioRuby releases. A blessed plugin is supposed to be rather solid,
> and have a level of documentation and testing.
>
> In addition there are 'development' plugins. Both should be listed on
> the plugin page. We are introducing that plumbing shortly. The
> duplication of work merely points out we need to get this done ;)
>
> It is interesting to note both XML parsers use lazy iterators. I also
> do lazy conversions. Same for my GFF3 plugin. Rob, be good to compare
> performance on some real-life data.
>
> Pj.
>
> On Wed, Jun 01, 2011 at 04:33:36PM +0800, Rob Syme wrote:
> > I think that the list at
> > http://bioruby.open-bio.org/wiki/BioRuby_Plugins is pretty
> > comprehensive, my mistake was simply not looking.
> > -r
> >
> >
> > On Wed, Jun 1, 2011 at 4:25 PM, Philipp Comans
> > <philipp.comans at googlemail.com> wrote:
> > > Hi,
> > >
> > > I had a similar problem recently. I needed an efficient parser for Blast XML results and I discovered that the default parser in BioRuby was not suitable. So I wrote my own using Nokogiri.
> > > In my opinion it is way too hard at the moment to discover BioPlugins. When people use the default XML or GFF parser that comes with BioRUby, they do not expect that there is another, more efficient version. There should be a section on the front page or even in the corresponding parts of the API documentation that makes people aware of the existence of these efficient parsers.
> > >
> > > BTW thank you all for BioRuby, I used in a project recently and it made my life tremendously easier.
> > >
> > > Cheers,
> > >
> > > Philipp
> > >
> >
>


From yannick.wurm at unil.ch  Mon Jun 13 05:49:39 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Mon, 13 Jun 2011 12:49:39 +0700
Subject: [BioRuby] ruby BLAST server (web frontend)
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
Message-ID: <4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>

Dear list & CC-ed,

let me quote a discussion from a while back ( http://answerpot.com/showthread.php?1292835-rails+blast+server ):

> I'd like to set up a small server for people to run BLAST against some of my sequences & see the results. 
> GMOD obviously comes to mind, but it seems like overkill. 
> And perhaps there is an almost automagic way to do this with ruby on rails. Has anyone done this yet?


There was no good solution at the time. Anurag Priyam & I have since been working on something that fills this need. Ben Woodcroft has recently been contributing as well. Check:
https://github.com/yannickwurm/sequenceserver or http://www.sequenceserver.com

Some things remain to be improved, but overall the software works well. We therefore wanted to share our progress with the list that initiated it. An excerpt of the README highlights some features:

Ease of use for biologists:
 * intuitive and helpful web interface: automatic sequence type detection that helps choose appropriate BLAST method and database types
 * links to easily download sequences of BLAST hits
 * support for advanced options.

Rapid deployment for bioinformatics administrators:
 * assisted formatting of BLAST databases (with sequence type detection)
 * automatic discovery of formatted BLAST databases during startup
 * uses ruby's internal web server (on any open port) or Apache
 * add custom hyperlinks from hits (to your genome browser or custom database).


We have been using this as the web frontend for our ant genome blast at http://www.antgenomes.org for a few months.

Comments, suggestions... and contributions are most welcome!

Cheers,

Anurag & Ben & Yannick


-----------------------------
  Ant Genomes & Evolution 
 http://yannick.poulet.org
    skype://yannickwurm
-----------------------------


From bonnal at ingm.org  Mon Jun 13 07:17:21 2011
From: bonnal at ingm.org (Raoul Bonnal)
Date: Mon, 13 Jun 2011 09:17:21 +0200
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
Message-ID: <F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>

Dear Yannick and others,
cute work.
Just a few suggestions.
You could build a gem and distribute it; then, with a single executable script "sequenceserver", you can call all other tasks
(configuration, database formatting, or starting the service), like we did with biongs. It's a more consistent approach and the end user has a clear reference to your application.
If you install it as a gem, you then need to build a web environment somewhere else, but it is quite simple to create a scaffold directory ready to be used by a web server (where you put your configuration/database references, public files, js, css, etc.),
something like:

sequenceserver database_formatter directory_with_fasta_files
sequenceserver config production --bin="~/ncbi-blast-2.2.24+/bin/" --database="/Users/me/blast_databases/"
sequenceserver start
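
As a rough idea of how one executable could dispatch these
subcommands, here is a sketch using Ruby's standard OptionParser.
Nothing here exists in SequenceServer: parse_command, the option
names and the returned hash are only assumptions for illustration,
and no real work is done.

```ruby
require 'optparse'

# Hypothetical dispatcher for a single "sequenceserver" executable.
# Returns a hash describing what would be run.
def parse_command(argv)
  argv = argv.dup
  command = argv.shift
  case command
  when "database_formatter"
    { :command => :database_formatter, :directory => argv.first }
  when "config"
    options = { :command => :config, :environment => argv.shift }
    OptionParser.new do |opts|
      opts.on("--bin=PATH")      { |v| options[:bin] = v }
      opts.on("--database=PATH") { |v| options[:database] = v }
    end.parse!(argv)
    options
  when "start"
    { :command => :start }
  else
    raise ArgumentError, "unknown command: #{command.inspect}"
  end
end

p parse_command(%w[config production --bin=/opt/blast/bin --database=/data/blast_db])
```

In a gem, a script like this would live in bin/ and be listed under
the gemspec's executables, so that it is on the PATH after install.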

Then, if your application runs on Ruby 1.8.7, try REE with Passenger and nginx; in my opinion nginx is the easiest web server to set up and offers a high level of performance: http://www.modrails.com/

If you need help configuring nginx I can give you some hints or an example of my config; it works well with rvm too.

Could this become a BioRuby plugin?


On 13/giu/2011, at 07.49, Yannick Wurm wrote:

> Dear list & CC-ed,
> 
> let me quote a discussion from a while back ( http://answerpot.com/showthread.php?1292835-rails+blast+server ):
> 
>> I'd like to set up a small server for people to run BLAST against some of my sequences & see the results. 
>> GMOD obviously comes to mind, but it seems like overkill. 
>> And perhaps there is an almost automagic way to do this with ruby on rails. Has anyone done this yet?
> 
> 
> There was no good solution at the time. Anurag Priyam & I have since been working on something that fills this need. Ben Woodcroft has recently been contributing as well. Check:
> https://github.com/yannickwurm/sequenceserver or http://www.sequenceserver.com
> 
> Some things remain to be improved. But globally the software works great. Thus we thought to share our progress on the list that initiated it. An excerpt of the README highlights some features:
> 
> Ease of use for biologists:
> * intuitive and helpful web interface: automatic sequence type detection that helps choose appropriate BLAST method and database types
> * links to easily download sequences of BLAST hits
> * support for advanced options.
> 
> Rapid deployment for bioinformatics administrators:
> * assisted formatting of BLAST databases (with sequence type detection)
> * automatic discovery of formatted BLAST databases during startup
> * uses ruby's internal web server (on any open port) or Apache
> * add custom hyperlinks from hits (to your genome browser or custom database).
> 
> 
> We have been using this as the web frontend for our ant genome blast at http://www.antgenomes.org since a few months. 
> 
> Comments, suggestions... and contributions are most welcome!
> 
> Cheers,
> 
> Anurag & Ben & Yannick
> 
> 
> 
> -----------------------------
>  Ant Genomes & Evolution 
> http://yannick.poulet.org
>    skype://yannickwurm
> -----------------------------
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From yannick.wurm at unil.ch  Mon Jun 13 08:06:43 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Mon, 13 Jun 2011 15:06:43 +0700
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
Message-ID: <AF9A450A-3741-44CC-845E-3E76E313D821@unil.ch>

Thanks for the suggestions Raoul,
they could substantially streamline setting things up!

cheers
yannick


On 13 Jun 2011, at 14:17, Raoul Bonnal wrote:

> Dear Yannick and others,
> cute work.
> Just a few suggestions.
> You could build a gem and distribute it with a single executable script, "sequenceserver", from which you can call all other tasks
> (configuration, database formatting, or starting the service), as we did with biongs. It is a more consistent approach, and the end user gets a clear reference to your application.
> If you install it as a gem, you need to build a web environment somewhere else, but it is quite simple to create a scaffold directory ready to be used by a web server (where you put your configuration/database references, public, js, css, etc.).
> Something like:
> 
> sequenceserver database_formatter directory_with_fasta_files
> sequenceserver config production --bin="~/ncbi-blast-2.2.24+/bin/" --database="/Users/me/blast_databases/"
> sequenceserver start
> 
> Then, if your application runs on Ruby 1.8.7, try REE with Passenger and nginx; in my opinion nginx is the easiest web server to set up and it performs very well: http://www.modrails.com/
> 
> If you need help configuring nginx, I can give you some hints or an example of my config; it works well with rvm too.
> 
> Could this become a BioRuby plugin?
> 
> 
> 
> 
> 
> On 13/giu/2011, at 07.49, Yannick Wurm wrote:
> 
>> Dear list & CC-ed,
>> 
>> let me quote a discussion from a while back ( http://answerpot.com/showthread.php?1292835-rails+blast+server ):
>> 
>>> I'd like to set up a small server for people to run BLAST against some of my sequences & see the results. 
>>> GMOD obviously comes to mind, but it seems like overkill. 
>>> And perhaps there is an almost automagic way to do this with ruby on rails. Has anyone done this yet?
>> 
>> 
>> There was no good solution at the time. Anurag Priyam & I have since been working on something that fills this need. Ben Woodcroft has recently been contributing as well. Check:
>> https://github.com/yannickwurm/sequenceserver or http://www.sequenceserver.com
>> 
>> Some things remain to be improved. But globally the software works great. Thus we thought to share our progress on the list that initiated it. An excerpt of the README highlights some features:
>> 
>> Ease of use for biologists:
>> * intuitive and helpful web interface: automatic sequence type detection that helps choose appropriate BLAST method and database types
>> * links to easily download sequences of BLAST hits
>> * support for advanced options.
>> 
>> Rapid deployment for bioinformatics administrators:
>> * assisted formatting of BLAST databases (with sequence type detection)
>> * automatic discovery of formatted BLAST databases during startup
>> * uses ruby's internal web server (on any open port) or Apache
>> * add custom hyperlinks from hits (to your genome browser or custom database).
>> 
>> 
>> We have been using this as the web frontend for our ant genome BLAST at http://www.antgenomes.org for a few months.
>> 
>> Comments, suggestions... and contributions are most welcome!
>> 
>> Cheers,
>> 
>> Anurag & Ben & Yannick
>> 
>> 
>> 
>> -----------------------------
>>  Ant Genomes & Evolution 
>> http://yannick.poulet.org
>>    skype://yannickwurm
>> -----------------------------
>> 
>> _______________________________________________
>> BioRuby Project - http://www.bioruby.org/
>> BioRuby mailing list
>> BioRuby at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioruby
> 


-----------------------------
   Ant Genomes & Evolution 
  http://yannick.poulet.org
     skype://yannickwurm
-----------------------------
BLAST @ http://antgenomes.org


From anurag08priyam at gmail.com  Mon Jun 13 16:10:53 2011
From: anurag08priyam at gmail.com (Anurag Priyam)
Date: Mon, 13 Jun 2011 21:40:53 +0530
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
Message-ID: <BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>

> cute work.

Thanks a lot Raoul :).

> Just a few suggestions.
> You could build a gem and distribute it with a single executable script, "sequenceserver", from which you can call all other tasks
> (configuration, database formatting, or starting the service), as we did with biongs. It is a more consistent approach, and the end user gets a clear reference to your application.

Agreed. And that is our target for the next release.

> If you install it as a gem, you need to build a web environment somewhere else, but it is quite simple to create a scaffold directory ready to be used by a web server (where you put your configuration/database references, public, js, css, etc.).
> Something like:
>
> sequenceserver database_formatter directory_with_fasta_files
> sequenceserver config production --bin="~/ncbi-blast-2.2.24+/bin/" --database="/Users/me/blast_databases/"
> sequenceserver start

This looks quite good. I will keep this in mind when pushing forward a
gem release.
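
As a rough illustration of the single-entry-point layout suggested above,
here is a minimal Ruby sketch of a subcommand dispatcher. The subcommand
names are taken from Raoul's example; the module name, the handlers, and
the strings they return are hypothetical and are not SequenceServer code.

```ruby
# Hypothetical sketch: one executable that dispatches to subcommands.
# Handlers only describe the action they would take, so the dispatch
# logic can be tried without BLAST or a web server installed.
module SequenceServerCLI
  COMMANDS = {
    'database_formatter' => ->(args) { "format FASTA files in #{args.first}" },
    'config'             => ->(args) { "write #{args.first} config: #{args.drop(1).join(' ')}" },
    'start'              => ->(_args) { 'start the web service' }
  }.freeze

  # Looks up the first argument as a subcommand and passes it the rest.
  def self.run(argv)
    name, *rest = argv
    handler = COMMANDS[name]
    return "unknown command: #{name.inspect} (try: #{COMMANDS.keys.join(', ')})" unless handler

    handler.call(rest)
  end
end

# Only act when arguments were actually given on the command line.
puts SequenceServerCLI.run(ARGV) unless ARGV.empty?
```

Keeping the commands in one table means a new task is added in a single
place, and an unknown subcommand produces the list of valid ones.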

> Then, if your application runs on Ruby 1.8.7, try REE with Passenger and nginx; in my opinion nginx is the easiest web server to set up and it performs very well: http://www.modrails.com/
>
> If you need help configuring nginx, I can give you some hints or an example of my config; it works well with rvm too.

That would be great. We are putting together a wiki page with
instructions on deploying SequenceServer on Apache and nginx. I am
almost done adding instructions for Apache, but I am not sure how to
do it for nginx.

> Could this become a BioRuby plugin?

So would it then become bio-sequenceserver? IMO, it doesn't logically
fit in as a BioRuby plugin, since it doesn't depend on BioRuby. Also,
BioRuby is more like a library, whereas SequenceServer is more like an
end product. Not sure though :-|.

-- 
Anurag Priyam
http://about.me/yeban/


From anurag08priyam at gmail.com  Mon Jun 13 16:12:25 2011
From: anurag08priyam at gmail.com (Anurag Priyam)
Date: Mon, 13 Jun 2011 21:42:25 +0530
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
	<BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
Message-ID: <BANLkTim=ZJHn-fCUp6hVXXVEivC7Yzvu4A@mail.gmail.com>

>> If you need help configuring nginx, I can give you some hints or an example of my config; it works well with rvm too.
>
> That would be great. We are putting together a wiki page with
> instructions on deploying SequenceServer on Apache and nginx. I am
> almost done adding instructions for Apache, but I am not sure how to
> do it for nginx.

Oops, forgot to add the link:
https://github.com/yannickwurm/sequenceserver/wiki/Deploying-Sequence-Server

-- 
Anurag Priyam
http://about.me/yeban/


From donttrustben at gmail.com  Tue Jun 14 13:19:37 2011
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Tue, 14 Jun 2011 23:19:37 +1000
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
	<BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
Message-ID: <BANLkTi=rdhqBzd_hzDZvP442pSspDpH3dw@mail.gmail.com>

Hi,


> > Could this become a BioRuby plugin?
>
> So would it then become bio-sequenceserver? IMO, it doesn't logically
> fit in as a BioRuby plugin, since it doesn't depend on BioRuby. Also,
> BioRuby is more like a library, whereas SequenceServer is more like an
> end product. Not sure though :-|.
>

To be technical, the branch trying to implement the blast overview graphic
does rely on BioRuby, since that is a dependency of bio-graphics. But that
branch hasn't been merged into the main tree yet, and might remain an
optional thing anyway.

-- 
Ben J Woodcroft, BE (Hons)

PhD Candidate
Ralph Laboratory
The University of Melbourne
Melbourne, Australia

tel: (+613) 8344 2319
b.woodcroft at pgrad.unimelb.edu.au


From pjotr.public14 at thebird.nl  Tue Jun 14 13:26:54 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 14 Jun 2011 15:26:54 +0200
Subject: [BioRuby] ruby BLAST server (web frontend)
In-Reply-To: <BANLkTi=rdhqBzd_hzDZvP442pSspDpH3dw@mail.gmail.com>
References: <DA89A56D-864B-4250-B6D6-2E0DE29BEC9B@unil.ch>
	<4E447EC6-D36D-42DF-85B7-E199E7E78042@unil.ch>
	<F6D734DA-96B8-41DD-B076-BA0FACC09491@ingm.org>
	<BANLkTiky0RFNkEWWtf+7ws0aZLmt3YMfDw@mail.gmail.com>
	<BANLkTi=rdhqBzd_hzDZvP442pSspDpH3dw@mail.gmail.com>
Message-ID: <20110614132654.GA20916@thebird.nl>

The advantages of making it a plugin:

1. easy install for users
2. visibility from the BioRuby project
3. potentially a member of the stable plugin family
4. developers may use your libraries - even if the focus is an
   application

Pj.

On Tue, Jun 14, 2011 at 11:19:37PM +1000, Ben Woodcroft wrote:
> Hi,
> 
> 
> > > Could this become a BioRuby plugin?
> >
> > So would it then become bio-sequenceserver? IMO, it doesn't logically
> > fit in as a BioRuby plugin, since it doesn't depend on BioRuby. Also,
> > BioRuby is more like a library, whereas SequenceServer is more like an
> > end product. Not sure though :-|.
> >
> 
> To be technical, the branch trying to implement the blast overview graphic
> does rely on BioRuby, since that is a dependency of bio-graphics. But that
> branch hasn't been merged into the main tree yet, and might remain an
> optional thing anyway.
> 
> -- 
> Ben J Woodcroft, BE (Hons)
> 
> PhD Candidate
> Ralph Laboratory
> The University of Melbourne
> Melbourne, Australia
> 
> tel: (+613) 8344 2319
> b.woodcroft at pgrad.unimelb.edu.au
> 


From mail at michaelbarton.me.uk  Thu Jun 23 15:05:48 2011
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Thu, 23 Jun 2011 11:05:48 -0400
Subject: [BioRuby] GFF3 Record Equality Method
Message-ID: <20110623150548.GA1030@Michael-Bartons-MacBook.local>

As far as I can tell, the GFF3 record in BioRuby uses Object#== for comparison.
I'm implementing a Bio::GFF::GFF3::Record#== method based on comparison of the
GFF3 fields. Would this be a useful addition to the BioRuby library?

Cheers

Michael Barton

From ngoto at gen-info.osaka-u.ac.jp  Fri Jun 24 12:41:29 2011
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Fri, 24 Jun 2011 21:41:29 +0900
Subject: [BioRuby] GFF3 Record Equality Method
In-Reply-To: <20110623150548.GA1030@Michael-Bartons-MacBook.local>
References: <20110623150548.GA1030@Michael-Bartons-MacBook.local>
Message-ID: <20110624124129.C00871CBC47D@idnmail.gen-info.osaka-u.ac.jp>

On Thu, 23 Jun 2011 11:05:48 -0400
Michael Barton <mail at michaelbarton.me.uk> wrote:

> As far as I can tell, the GFF3 record in BioRuby uses Object#== for comparison.
> I'm implementing a Bio::GFF::GFF3::Record#== method based on comparison of the
> GFF3 fields. Would this be a useful addition to the BioRuby library?
> 
> Cheers
> 
> Michael Barton

Bio::GFF::GFF3::Record inherits from Bio::GFF::GFF2::Record, and
GFF2::Record already has its own == method. GFF2::Record#== is
sufficient for comparing GFF3 records as well as GFF2 records.

#sample code
#-----------------------------------------------------------
 require 'bio'
 str1 = "chrI\tSGD\tcentromere\t151467\t151584\t.\t+\t.\t" +
        "ID=CEN1;Name=CEN1;gene=CEN1;Alias=CEN1,test%3B0001;" +
        "Note=Chromosome I centromere;dbxref=SGD:S000006463;" +
        "Target=test%2002 123 456 -,test%2C03 159 314;" +
        "memo%3Dtest%3Battr=99.9%25%09match"
 str2 = str1.dup
 str3 = str1.gsub(/CEN1/, 'CEN2')
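 # obj0 and obj1 are parsed from the same string, obj2 from a copy of it,
 # and obj3 from a string in which CEN1 has been replaced by CEN2.
 obj0 = Bio::GFF::GFF3::Record.new(str1)
 obj1 = Bio::GFF::GFF3::Record.new(str1)
 obj2 = Bio::GFF::GFF3::Record.new(str2)
 obj3 = Bio::GFF::GFF3::Record.new(str3)

 # Each comparison goes through the inherited GFF2::Record#==.
 p obj0 == obj1
 p obj1 == obj2
 p obj1 == obj3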
#-----------------------------------------------------------
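
For readers without BioRuby installed, the general idea of field-based
equality can be sketched in plain Ruby. The SimpleRecord class below is a
hypothetical stand-in written for illustration only; it is not BioRuby's
implementation and it does no GFF3 unescaping or validation.

```ruby
# Hypothetical stand-in for a GFF record (NOT Bio::GFF::GFF3::Record).
class SimpleRecord
  attr_reader :fields

  def initialize(line)
    cols = line.chomp.split("\t", 9)
    # Column 9 holds "key=value" pairs separated by ";".
    attrs = cols[8].to_s.split(';').map { |pair| pair.split('=', 2) }
    @fields = cols[0, 8] + [attrs]
  end

  # Two records are equal when they have the same class and the same
  # parsed fields, regardless of object identity.
  def ==(other)
    other.class == self.class && fields == other.fields
  end
end

line = "chrI\tSGD\tcentromere\t151467\t151584\t.\t+\t.\tID=CEN1;Name=CEN1"
a = SimpleRecord.new(line)
b = SimpleRecord.new(line.dup)
c = SimpleRecord.new(line.gsub('CEN1', 'CEN2'))

p a == b        # => true  (same field values, different objects)
p a == c        # => false (attribute values differ)
p a.equal?(b)   # => false (identity, which Object#== would compare)
```

Note that this sketch only defines ==. Using such records as Hash keys
would also need eql? and hash to be defined consistently.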

--
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org


From andrew.j.grimm at gmail.com  Sun Jun 26 10:16:21 2011
From: andrew.j.grimm at gmail.com (Andrew Grimm)
Date: Sun, 26 Jun 2011 20:16:21 +1000
Subject: [BioRuby] Anyone else attending RubyKaigi 2011?
Message-ID: <BANLkTinc9BURgNiNbirdM=VguSCceq4gzQ@mail.gmail.com>

I noticed that Goto-san's talk got accepted as a lightning talk.

Are any other BioRuby contributors or users attending?

I'll be giving a talk, but I'll only briefly mention bioinformatics.
I'll be talking about the Small Eigen Collider. In describing why I
created the Small Eigen Collider, I'll mention that I'm a
bioinformatician, and that I deal with enough information that I am
tempted to run Ruby code under implementations other than YARV.
http://rubykaigi.org/2011/en/schedule/details/18S03

Andrew