From missy at be.to  Sat Jan  1 19:54:08 2011
From: missy at be.to (MISHIMA, Hiroyuki)
Date: Sun, 02 Jan 2011 09:54:08 +0900
Subject: [BioRuby] Workflows: NGS + miRNA (Re: Workflows and Parallelization)
In-Reply-To: <0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it>
	<4D10B9AE.2010206@be.to>
	<5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it>
	<4D12A1CE.4040702@be.to>
	<0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
Message-ID: <4D1FCCB0.2000303@be.to>

Dear Raoul and the BioRuby list,

My workflow for miRNA analysis using the Illumina GAII is as follows:

1) Read alignment using Novoalign ( http://www.novocraft.com/ ).
It is proprietary software, but its binary is free for academic use
with several restrictions. The advantage of Novoalign is that it can
remove adapter sequences from each read. Adapter clipping is
indispensable for miRNA analyses because the target molecules are always
shorter than the read length.

1b) You may be able to use BWA/MAQ instead. Adapter clipping tools such
as Cutadapt ( http://cutadapt.googlecode.com/ ) are available.
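
To make the clipping step concrete, here is a minimal Ruby sketch of 3' adapter removal. It is only an illustration and not how Novoalign or Cutadapt work internally: it matches exactly, with no mismatch tolerance. The ADAPTER constant, the clip_adapter name and the min_overlap default are assumptions made for this example; substitute the adapter from your own library prep.

```ruby
# Example 3' adapter sequence (an assumption; use the one from your own kit).
ADAPTER = "TCGTATGCCGTCTTCTGCTTG"

# Remove the 3' adapter from a read. A complete adapter is clipped wherever
# it occurs; otherwise an adapter prefix of at least min_overlap bases that
# runs off the 3' end of the read is clipped. Matching is exact only.
def clip_adapter(read, adapter = ADAPTER, min_overlap = 5)
  full = read.index(adapter)
  return read[0, full] if full

  longest = [adapter.length - 1, read.length].min
  longest.downto(min_overlap) do |len|
    return read[0, read.length - len] if read.end_with?(adapter[0, len])
  end
  read
end
```

A 36 nt read holding a 22 nt miRNA followed by the first 14 bases of the adapter comes back as the 22 nt insert, which is why clipping has to happen before (or as part of) alignment.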

2) To find miRBase-registered miRNAs, I used miRExpress (
http://mirexpress.mbc.nctu.edu.tw/ , Wang et al, BMC Bioinform 10, p328,
2009. http://www.biomedcentral.com/1471-2105/10/328 )

2b) Data analysis. I plotted heatmaps using R. See Ruby et al. (Genome
Res, 17, p1850, 2007. http://genome.cshlp.org/content/17/12/1850.long ).

3) To find potentially novel miRNAs, I used miRTRAP
( http://flybuzz.berkeley.edu/miRTRAP.html , Hendrix et al., Genome Biol,
11, pR39, 2010. http://genomebiology.com/2010/11/4/R39 ).

The workflow may need updating, but I hope it helps.

Thanks,
Hiro.

Raoul Bonnal wrote (2010/12/23 18:47):
> Actually the focus of my institute is mainly on mirna, so I'm also
> interested on techniques for analyzing NGS(illumina) and microRNA.

-- 
MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences

From missy at be.to  Sat Jan  1 20:38:57 2011
From: missy at be.to (MISHIMA, Hiroyuki)
Date: Sun, 02 Jan 2011 10:38:57 +0900
Subject: [BioRuby] Workflows: NGS + miRNA
In-Reply-To: <4D1FCCB0.2000303@be.to>
References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it>	<4D10B9AE.2010206@be.to>	<5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it>	<4D12A1CE.4040702@be.to>	<0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
	<4D1FCCB0.2000303@be.to>
Message-ID: <4D1FD731.4070309@be.to>

Hi all,

An addition to my workflow.

Only miRTRAP requires the read alignment generated by Novoalign. miRExpress
takes fastq files as input and clips the adapters itself.

miRExpress is easy and fast, so it is a good first try.

While using miRExpress, you may find 5'-end variations in mature miRNA
reads. These prevent accurate alignment, but they may not be artifacts. See
Wu et al., PLoS One, 4, p.e7566 (
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0007566
). Clipping the 5'-end variations increases the number of alignment hits.

MISHIMA, Hiroyuki wrote (2011/01/02 9:54):
> Dear Raoul and the BioRuby list,
>
> My workflow for miRNA analysis using Illumina GAii is like the followings:

-- 
MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences

From pjotr.public14 at thebird.nl  Sun Jan  2 07:04:48 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 2 Jan 2011 13:04:48 +0100
Subject: [BioRuby] GFF3
Message-ID: <20110102120448.GA23804@thebird.nl>

The GFF3 plugin works rather well. Anyone who has Ruby 1.9.x on their
system can just type, as a regular user:

  gem install bio-gff3

and even bioruby itself gets installed, if needed. Next you can type,
for example

  gff3-fetch mRNA test/data/gff/MhA1_Contig1133.fa test/data/gff/MhA1_Contig1133.gff3

to assemble all mRNA. 

Unfortunately I am finding some problems with data. For example
the reading frame is *wrong* in this wormbase data file (predicted
gene). The contig starts as:

>MhA1_Contig3426
TTAATAAATTTAATTCATTAAAATTTTAAAAAGAAAGGGACATTCGAGGGGAAATGAGAGAGAACGAGAGAAAATGGACG
GGAAATTAAATTAAAAAATAAAAAATTAATTTTTATTTTTTTTTATTTAATTTAAAATTAATTTTCTACATTTATTAAAT
CTTAAATTATTAATTTTAAATTAATTTAAAG GCATCCAACAACAACAATTAGAAGTCTTTCCCAGCTCCTCCTCTGCCCC
TCAGCAACAACAATACCCAGCGCAGCAGCTTCAATTAGTTACTCCTTTTATTGCATGCATAGCAGATGAATTGAGGGAGT
TGATAGATGAAATGCGTATGTTTTAG AATATTTTTTAAAAAAAAATTAAAAAAAATTTTTTTTTGCCAAACAGGCTCTCG

and the full record is:

##gff-version 3
##sequence-region MhA1_Contig3426 1 2029
# Gene gene:MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase        gene    192     346     .       +       .       ID=gene:MhA1_Contig3426.frz3.gene1;Name=MhA1_Contig3426.frz3.gene1;Note=PREDICTED protein_coding;public_name=MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase        mRNA    192     346     .       +       .       ID=transcript:MhA1_Contig3426.frz3.gene1;Parent=gene:MhA1_Contig3426.frz3.gene1;Name=MhA1_Contig3426.frz3.gene1;public_name=MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase        exon    192     346     .       +       .       ID=exon:MhA1_Contig3426.frz3.gene1.1;Parent=transcript:MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase        CDS     192     346     .       +       0       ID=cds:MhA1_Contig3426.frz3.gene1;Parent=transcript:MhA1_Contig3426.frz3.gene1

So the forward reading frame starts at 192 and the CDS phase is 0. The actual sequence is

GCATCCAACA ACAACAATTA GAAGTCTTTC CCAGCTCCTC CTCTGCCCCT CAGCAACAAC AATACCCAGC GCAGCAGCTT
CAATTAGTTA CTCCTTTTAT TGCATGCATA GCAGATGAAT TGAGGGAGTT GATAGATGAA ATGCGTATGT TTTAG

which translates to a valid protein only in frame 2(!). This is not
compliant with GFF3 in any interpretation. It turns out that, in this
particular GFF3 file, this happens only with the *first* ORF on every
contig, and it is probably a bug in the gene predictor used. None of the
other genes are in the wrong frame.
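
The frame claim can be checked with a few lines of plain Ruby. This is a sketch, not the bio-gff3 code: it only looks for stop codons, treating a phase as clean when the last complete codon is a stop and no earlier codon is. The names codons and open_frame? are made up for this example; the sequence is the CDS quoted above.

```ruby
STOP_CODONS = %w[TAA TAG TGA].freeze

# CDS 192..346 of MhA1_Contig3426 as quoted above, with the spaces removed.
CDS = <<~SEQ.delete(" \n")
  GCATCCAACA ACAACAATTA GAAGTCTTTC CCAGCTCCTC CTCTGCCCCT CAGCAACAAC AATACCCAGC GCAGCAGCTT
  CAATTAGTTA CTCCTTTTAT TGCATGCATA GCAGATGAAT TGAGGGAGTT GATAGATGAA ATGCGTATGT TTTAG
SEQ

# Split the sequence into complete codons after skipping `phase` leading bases.
def codons(seq, phase)
  seq[phase..-1].scan(/.{3}/)
end

# True when the only stop codon in this phase is the last complete codon.
def open_frame?(seq, phase)
  triplets = codons(seq, phase)
  return false if triplets.empty?

  STOP_CODONS.include?(triplets.last) &&
    triplets[0...-1].none? { |codon| STOP_CODONS.include?(codon) }
end

valid_phases = (0..2).select { |phase| open_frame?(CDS, phase) }
puts "phases with a clean ORF: #{valid_phases.inspect}"
```

For this sequence only phase 2 passes: phase 0 hits TAG at the seventh codon and phase 1 runs into an internal TGA, so the phase 0 recorded in the CDS line cannot be right.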

I have informed Wormbase some time ago, but I don't have the
impression that anyone is interested. You can validate its contents at

  http://www.wormbase.org/db/gb2/gbrowse/m_hapla/?name=id:2258995;dbid=m_hapla:database

I am going to add an option to the GFF3 plugin to test for valid
reading frames, so these files give the expected results. That would be
good for validation anyway.

Pj.


From pjotr.public14 at thebird.nl  Sun Jan  2 13:49:58 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 2 Jan 2011 19:49:58 +0100
Subject: [BioRuby] BioRuby and log4r
Message-ID: <20110102184958.GA25699@thebird.nl>

I propose we start using 

  http://log4r.rubyforge.org/manual.html

which has the standard logging features one would expect. I
particularly like the lazy evaluation (deferred block).

Where it falls short, like most other loggers, is in handling different
use cases. A logger has to behave differently when a tool is
used by:

- developer: fail early and often (on warnings!)
- user: fail on normal error
- library: fail on serious error
- web server: fail on serious error
- fault tolerant system: never fail, try to resume
 
Essentially, I see three or four error handlers.

We can create a default logger for BioRuby = user

But I would like to have more options. It would be nice to have several
levels within 'info', 'warn' or 'error', to be displayed/logged according to
user needs.
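
The per-use-case behaviour listed above could be sketched like this. To keep the example self-contained it uses Ruby's standard Logger rather than log4r. The FAIL_AT table and the ActingLogger class are hypothetical names, and mapping "normal error" to ERROR and "serious error" to FATAL is an assumption.

```ruby
require "logger"
require "stringio"

# Hypothetical policy table: the lowest severity at which each kind of
# caller raises instead of only logging (nil means never raise).
FAIL_AT = {
  developer:      Logger::WARN,
  user:           Logger::ERROR,
  library:        Logger::FATAL,
  web_server:     Logger::FATAL,
  fault_tolerant: nil
}.freeze

class ActingLogger
  def initialize(role, io = $stderr)
    @fail_at = FAIL_AT.fetch(role)
    @logger  = Logger.new(io)
  end

  # Always log the message, then raise if the role's policy says that
  # this severity should stop the program.
  def report(severity, message)
    @logger.add(severity, message)
    raise message if @fail_at && severity >= @fail_at
  end
end
```

With this table, ActingLogger.new(:developer).report(Logger::WARN, "odd record") raises, while the same call with :user only writes the warning.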

Also, with the plugins we should have standardized switches for CLI
utilities. 

Are we interested in making this core BioRuby, or should I
incorporate it as a bio-plugin? I am thinking of writing a front-end
of log4r.

Pj.


From bonnalraoul at ingm.it  Mon Jan  3 07:14:44 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 3 Jan 2011 13:14:44 +0100
Subject: [BioRuby] Workflows: NGS + miRNA
In-Reply-To: <4D1FD731.4070309@be.to>
References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it>	<4D10B9AE.2010206@be.to>	<5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it>	<4D12A1CE.4040702@be.to>	<0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
	<4D1FCCB0.2000303@be.to> <4D1FD731.4070309@be.to>
Message-ID: <3D720507-34D5-4A3C-9F2C-A54CB9556E3D@ingm.it>

Thank you very much, I'll read all the refs.

> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.


From pjotr.public14 at thebird.nl  Fri Jan  7 03:52:21 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 7 Jan 2011 09:52:21 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110102184958.GA25699@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
Message-ID: <20110107085221.GA14735@thebird.nl>

I am creating a plugin 'bio-logger' for sane handling of errors and
exceptions in different situations (log-act):

* Normal user
* Developer
* Web server
* Fault-tolerant systems

One example: a program logs a warning to stdout when run by a user, but
raises an exception when run by a developer.

bio-logger builds on log4r functionality, using a more fine-grained
approach for logging errors. I.e., within 'debug', 'info', 'warn' and
'error', an additional value 1..10 can be set to limit output and
logging.

When a program, e.g. gff3-fetch, supports bio-logger switches, the following 
is possible:

  --logger stderr              Add stderr logger (default is stdout)
  --logger filen               Add filename logger
  --trace  debug               Show all messages 
  --trace  warn                Show messages more serious than 'warn'
  --trace  warn:3              Show messages more serious than 'warn' level 3

module overrides:

  --trace  gff3:info:5         Override level for 'gff3' to info level 5
  --trace  blast:debug         Override level for 'blast'
  --trace  blast,gff3:debug    Override level for 'blast' and 'gff3' 
  --trace  stderr:blast:debug  Override level for 'blast' on stderr 
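
To show how the --trace syntax decomposes, here is a hypothetical Ruby parser for the forms listed above. It is not bio-logger's own option handling; parse_trace, LEVELS and OUTPUTS are names invented for this sketch, and the set of level names is an assumption.

```ruby
LEVELS  = %w[debug info warn error fatal].freeze
OUTPUTS = %w[stdout stderr].freeze

# Parse a --trace argument of the form [output:][module[,module...]:]level[:sub_level]
# into a hash. Raises ArgumentError when no known level is present.
def parse_trace(spec)
  parts   = spec.split(":")
  output  = OUTPUTS.include?(parts.first) ? parts.shift : nil
  modules = LEVELS.include?(parts.first) ? [] : parts.shift.to_s.split(",")
  level   = parts.shift
  raise ArgumentError, "no known level in #{spec.inspect}" unless LEVELS.include?(level)

  sub_level = parts.empty? ? nil : Integer(parts.shift)
  { output: output, modules: modules, level: level, sub_level: sub_level }
end
```

For example, parse_trace("stderr:blast:debug") yields the output "stderr", the module list ["blast"] and the level "debug", while parse_trace("warn:3") yields an empty module list, the level "warn" and sub-level 3.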

Behaviour can also be changed. This normally happens through library
calls. There is one command line switch, which changes log-act:

  --log-act Developer          Modify the logger for development

log4r supports rotating logs and remote logging, which will be
available.

Any comments?

Pj.

On Sun, Jan 02, 2011 at 07:49:58PM +0100, Pjotr Prins wrote:
>   http://log4r.rubyforge.org/manual.html

From pjotr.public14 at thebird.nl  Fri Jan  7 10:01:47 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 7 Jan 2011 16:01:47 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110107085221.GA14735@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
	<20110107085221.GA14735@thebird.nl>
Message-ID: <20110107150147.GA16116@thebird.nl>

bio-logger created. YABP (yet another BioRuby plugin).

  https://github.com/pjotrp/bioruby-logger-plugin

Finally the logger I always wanted to have...

Pj.

From pjotr.public14 at thebird.nl  Sat Jan  8 08:06:12 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sat, 8 Jan 2011 14:06:12 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110107150147.GA16116@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
	<20110107085221.GA14735@thebird.nl>
	<20110107150147.GA16116@thebird.nl>
Message-ID: <20110108130612.GA19929@thebird.nl>

If anyone is interested, the bio-logger plugin is fully functional (I
am using it in the GFF3 plugin):

This is a plugin for nailing down problems with big data parsers,
common in bioinformatics, and sane handling of errors and exceptions
in different situations.

In Bioinformatics the following is a common scenario when dealing with
parsers: Large data files sometimes contain errors. As a user you want
to continue and hope for the best (logging the error). As a developer
you want to see how you can fix the problem. Waiting for a full run
and checking the logs is tedious. The logger can be helpful here, and
avoids sticking temporary solutions in code. Read on...

  https://github.com/pjotrp/bioruby-logger-plugin

I think we should use this throughout BioRuby to get consistent error
handling and logging. No more $stderr.print statements.

Pj.

From bonnalraoul at ingm.it  Mon Jan 10 17:06:48 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 10 Jan 2011 23:06:48 +0100
Subject: [BioRuby] biogem and options
Message-ID: <E7D2D648-688A-4EB5-AE34-1ED2D3C04B1F@ingm.it>

Hi all,
I have updated the github repo with some requests from Pjotr.
It is now possible to create the bin, db and test/data directories from the command line if needed:

biogem --with-bin --with-db --with-test-data yourprojectname

NOTE 1: the older 'data' directory is now 'db', which is more compliant with Rails. Why? 'data' comes from R packages, but we are used to storing databases, or looking for them, in a db directory.
NOTE 2: README updated.

As for rspec and cucumber, jeweler already has those options.

Type 'biogem -h' and you'll get the help.

This gem is not yet available on rubygems; I still need to implement the templates requested by Toshiaki and Pjotr.

I'm refactoring the code, so there are some changes to the original tree.

I "hope" to provide template files too by the end of the week.

--
R.J.P.B.


From pjotr.public14 at thebird.nl  Tue Jan 11 01:38:34 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 11 Jan 2011 07:38:34 +0100
Subject: [BioRuby] biogem and options
In-Reply-To: <E7D2D648-688A-4EB5-AE34-1ED2D3C04B1F@ingm.it>
References: <E7D2D648-688A-4EB5-AE34-1ED2D3C04B1F@ingm.it>
Message-ID: <20110111063834.GA2409@thebird.nl>

Super!


From ktym at hgc.jp  Tue Jan 11 05:47:55 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 11 Jan 2011 19:47:55 +0900
Subject: [BioRuby] biogem and options
In-Reply-To: <20110111063834.GA2409@thebird.nl>
References: <E7D2D648-688A-4EB5-AE34-1ED2D3C04B1F@ingm.it>
	<20110111063834.GA2409@thebird.nl>
Message-ID: <AC955B15-8A62-4DCF-AE0E-58834312F26A@hgc.jp>

Raoul,

http://twitter.com/#!/ilpuccio/status/24766316493672448
> @tktym could you point me to some example of what you mena, please?
> "provide a recommended template for rdoc, require lines, and class def"

In my example plugin,

  https://github.com/ktym/bioruby-hello/blob/master/lib/bio-hello.rb

I used a style similar to that of the BioRuby core library,
which is described in

  https://github.com/bioruby/bioruby/blob/master/README_DEV.rdoc

but I'm not sure what the best practice is for plugins.
It might be better to include the documentation in the README file instead.

In either case, what I have in mind is to auto-generate a plugin description
from those embedded descriptions for the "plugin showcase", which will be
available somewhere on the bioruby.org site in the future.

For that purpose, we may also want to have some flags indicating:

* status of the plugin (stable, usable, buggy, just started etc.)
* the plugin will override the BioRuby core or just provide new features harmlessly
* pre-requirements (especially, other than gems)

etc. etc.

Here's a material for further discussion (example template):

#
# = Bio::XXX - BioRuby plugin for XXX
#
# Copyright::  Copyright (C) 2001, 2003-2005 Bio R. Hacker <brh at example.org>,
# Copyright::  Copyright (C) 2006 Chem R. Hacker <crh at example.org>
# License::    The Ruby License
# Site:        http://github.com/user/bioruby-xxx
#
# == Description
#
# This plugin provides an interface for the XXX database.
#
# == Usage
#
# Lorem ipsum dolor sit amet, consectetur adipisicing elit, ....
#
# == Effects (Overrides?)
#
# * Modify the behavior of Bio::Sequence::NA#translate destructively
# * Add methods to the Bio::DB class
#
# == Depends (Requirements?)
#
# * External MySQL database system
# * RubyGem package 'foobar'
#
# == References
#
# * Hoge F. et al., The XXX database, Nucleic. Acid. Res. 123:100--123 (2030)
# * http://hoge.db/
#

# Do we need these two lines in every BioRuby plugin?
require 'rubygems'
require 'bio'

# Do we allow classes defined outside of the 'Bio' namespace?
module Bio
  class XXX
    # :
  end # XXX
end # Bio


Thanks,
Toshiaki



From chmille4 at gmail.com  Wed Jan 12 13:37:19 2011
From: chmille4 at gmail.com (Chase Miller)
Date: Wed, 12 Jan 2011 13:37:19 -0500
Subject: [BioRuby] bio-assembly
Message-ID: <AANLkTikAm4z=28Cm=dfP_crB+g6xwVvuN-qnjm1ALtPK@mail.gmail.com>

Hi All,

Quick update on the bio-assembly plugin.

Francesco has added support for CAF files.  According to his preliminary
tests it can handle a 454 file with 27k contigs in about a minute.  He also
improved the overall performance, so the ACE parser can now process a 70 MB
file in about 10 seconds. Nice work!

If there are any requests for parsers or functionality, let us know.

source code: https://github.com/chmille4/bioruby-assembly

usage: https://github.com/chmille4/bioruby-assembly#readme

gem: https://rubygems.org/gems/bio-assembly


Cheers
Chase

From bonnalraoul at ingm.it  Thu Jan 13 04:30:03 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 13 Jan 2011 10:30:03 +0100
Subject: [BioRuby] bio-assembly
In-Reply-To: <AANLkTikAm4z=28Cm=dfP_crB+g6xwVvuN-qnjm1ALtPK@mail.gmail.com>
References: <AANLkTikAm4z=28Cm=dfP_crB+g6xwVvuN-qnjm1ALtPK@mail.gmail.com>
Message-ID: <8EB2ADDB-7137-4C8E-AB70-C5574A797886@ingm.it>

Hi,
great work guys.

I have updated the Plugins page, http://bioruby.open-bio.org/wiki/Plugins#On_Development_Plugins ; it is a list summarizing the current state of the plugins.

Please let me know if there is something wrong.

In my mind, Planned plugins are just ideas that have not yet been coded.

The others are under ongoing development. I tried to list the plugins in order of creation.

@Jan: Do you plan to release the Ensembl API as a plugin? I think it's just a matter of renaming the gem.
@George: To avoid problems, please yank isoelectric_point from rubygems.

I didn't receive any reply from Ricardo H. Ramírez-Gonzalez about samtools-ruby-ffi.

Do you think that a separate page would be better? I think so; what about you?

Ciao.


--
R.J.P.B.


From yannick.wurm at unil.ch  Sun Jan 16 06:57:04 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Sun, 16 Jan 2011 18:57:04 +0700
Subject: [BioRuby] trees
Message-ID: <A545268E-7939-49C9-9784-7D5C224D9129@unil.ch>

is a specific person "responsible" for coordinating the wiki?

the following page is largely misleading (contains tons of ruby code):
http://bioruby.open-bio.org/wiki/HOWTO:Trees

cheers,
yannick


-------------------------
 Ant Genomes & Evolution 
http://yannick.poulet.org
   skype://yannickwurm


From ngoto at gen-info.osaka-u.ac.jp  Mon Jan 17 00:34:48 2011
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Mon, 17 Jan 2011 14:34:48 +0900
Subject: [BioRuby] trees
In-Reply-To: <A545268E-7939-49C9-9784-7D5C224D9129@unil.ch>
References: <A545268E-7939-49C9-9784-7D5C224D9129@unil.ch>
Message-ID: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>

On Sun, 16 Jan 2011 18:57:04 +0700
Yannick Wurm <yannick.wurm at unil.ch> wrote:

> is a specific person "responsible" for coordinating the wiki?
> 
> the following page is largely misleading (contains tons of ruby code):
> http://bioruby.open-bio.org/wiki/HOWTO:Trees

The page is an attempt to translate the BioPerl HOWTOs from Perl to Ruby,
but it is still unfinished. See the discussion:
http://bioruby.open-bio.org/wiki/Talk:HOWTOs

One of the reasons why the attempt stalled is that the differences between
BioPerl and BioRuby are larger than we expected.

On the Talk:HOWTOs page, writing original BioRuby documentation
was also discussed, but that stalled too.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp

From pjotr.public14 at thebird.nl  Mon Jan 17 03:47:05 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 17 Jan 2011 09:47:05 +0100
Subject: [BioRuby] bio-logger release 0.9.0
Message-ID: <20110117084705.GA5136@thebird.nl>

Just released bio-logger 0.9.0. The most important feature I added is that
you can inject a filter on log messages (by module). E.g., for the
blast logger you could show only the messages relating to a contig:

   log = LoggerPlus['blast']
   log.filter { | level, sub_level, msg | msg =~ /contig1133/ }

on the command line you can do the same with:

   --trace "blast:= msg =~ /contig1133/"

another option is to filter on level and sub_level values:

   log.filter { | level, sub_level, msg | sub_level == 3 or level <= ERROR }

This provides lots of possibilities. Obviously much of this can be
handled by (multi)grep'ing log files, but the power of using Ruby and
filter combinations makes it a great feature for debugging big data
problems. And you can limit the size of log files without limiting
expressive power.
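
The idea of an injected filter can be illustrated with a toy class. This is not bio-logger's implementation and ToyLog is an invented name; it only shows a stored block deciding which messages are kept, using the same three block arguments as the examples above.

```ruby
# Toy stand-in for a logger with an injectable filter (illustration only).
class ToyLog
  attr_reader :lines

  def initialize
    @lines  = []
    @filter = nil
  end

  # Store a block that decides, per message, whether the message is kept.
  def filter(&block)
    @filter = block
  end

  # Record the message unless a filter is set and rejects it.
  def log(level, sub_level, msg)
    return if @filter && !@filter.call(level, sub_level, msg)

    @lines << msg
  end
end

log = ToyLog.new
log.filter { |_level, _sub_level, msg| msg =~ /contig1133/ }
log.log(:info, 1, "parsing contig1133")
log.log(:info, 1, "parsing contig0007")
```

After these calls only the contig1133 message is stored, so the log stays small without any change to the code that emits the messages.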

Pj.

From pjotr.public14 at thebird.nl  Mon Jan 17 05:08:12 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 17 Jan 2011 11:08:12 +0100
Subject: [BioRuby] Bio-gff3 plugin 0.8.6
Message-ID: <20110117100812.GA6947@thebird.nl>

Released bio-gff3 parser plugin 0.8.6 on rubygems. It can also be used
from the command line, e.g.

  gem install bio-gff3
  gff3-fetch --help

Introduced LRU cache, replaced the BioRuby GFF line parser and
added lazy parsing. All with significant speedups compared to the
original (No-cache, BioRuby parser, non-lazy).

The LRU version has limited RAM use for any sized data (730MB), and
currently runs 6 times slower than the full memory version.
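
For readers unfamiliar with the technique, a least-recently-used cache can be sketched in a few lines of Ruby by relying on Hash insertion order (guaranteed since Ruby 1.9). This is a generic illustration and not the cache inside bio-gff3; the LruCache name and its interface are assumptions.

```ruby
# Minimal LRU cache built on Hash insertion order: the first key in the
# hash is always the least recently used one.
class LruCache
  def initialize(max_size)
    @max_size = max_size
    @store    = {}
  end

  # Return the cached value and mark it most recently used, or nil on a miss.
  def [](key)
    return nil unless @store.key?(key)

    value = @store.delete(key)
    @store[key] = value
  end

  # Insert or refresh a value, evicting the least recently used entry
  # when the cache grows beyond max_size.
  def []=(key, value)
    @store.delete(key)
    @store[key] = value
    @store.delete(@store.first[0]) if @store.size > @max_size
    value
  end

  def size
    @store.size
  end
end
```

Memory stays bounded by max_size regardless of input size; the price is re-reading entries that were evicted, which is the kind of trade-off the timings below reflect.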

  Digesting parser:

  Cache              real     user     sys  version     RAM
  ------------------------------------------------------------
  full,bioruby       12m41    12m28    0m09 (0.8.0)
  full,line          12m13    12m06    0m07 (0.8.5)
  full,line,lazy     11m51    11m43    0m07 (0.8.6)     6,600M

  none,bioruby      504m     477m     26m50 (0.8.0)
  none,line         297m     267m     28m36 (0.8.5)       
  none,line,lazy    132m     106m     26m01 (0.8.6)       650M

  lru,bioruby       533m     510m     22m47 (0.8.5)
  lru,line          353m     326m     26m44 (0.8.5)  1K
  lru,line          305m     281m     22m30 (0.8.5) 10K
  lru,line,lazy     182m     161m     21m10 (0.8.6) 10K
  lru,line,lazy      75m      75m      0m17 (0.8.6) 50K   730M
  ------------------------------------------------------------

where

   52M  m_hapla.WS217.dna.fa
  456M  m_hapla.WS217.gff3

ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-linux]
on a 64-bit 2.6 GHz CPU (6MB cache), 16 GB RAM machine. 

Note bio-gff3 0.8.6 is a fully digesting parser, with scope for full
validation of the GFF3 relations. The next step, a limited
'optimistic' digestion, will speed things up.

Note also that bio-gff3 exploits the bio-logger plugin - it is a good 
example.

Pj.

From yannick.wurm at unil.ch  Tue Jan 18 02:55:15 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Tue, 18 Jan 2011 14:55:15 +0700
Subject: [BioRuby] trees
In-Reply-To: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>
References: <A545268E-7939-49C9-9784-7D5C224D9129@unil.ch>
	<20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <C31A2387-A24B-4A20-80F8-237DEEA83A09@unil.ch>

Thanks for the details Naohisa-san

May I suggest we try to "hide from google" things that are not finalized, as well as links to non-existent documents?
(I have the feeling it may be better to have nothing than to create confusion.)



From yannick.wurm at unil.ch  Tue Jan 18 03:56:26 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Tue, 18 Jan 2011 15:56:26 +0700
Subject: [BioRuby] Rake
In-Reply-To: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
Message-ID: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>

Dear List,

I had a few issues setting up a Rakefile two or three weeks ago. Thanks to Hiro-san and some of the google-able tutorials, things are now working.

It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" and goes through the following steps for tree building:
 - translation
 - multiple alignment (mafft)
 - gblocks to remove crap 
 - tree building (phyml)
AND
 - codon-level alignment: reverse translated from protein multiple alignment (pal2nal)
 - gblocks to remove crap
 - tree building (phyml)

https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile


It depends on a bunch of stuff in my $PATH, so it probably won't run elsewhere out of the box. But FWIW, maybe it can be useful to a random googler. 

However, it feels quite clunky, so I think I should do things differently in the future. If you have any comments or suggestions, I'd be most happy to hear them. 

Cheers,

yannick 

-------------------------
 Ant Genomes & Evolution 
http://yannick.poulet.org
   skype://yannickwurm


From mail at michaelbarton.me.uk  Tue Jan 18 10:17:08 2011
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Tue, 18 Jan 2011 10:17:08 -0500
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
	<3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID: <20110118151708.GB3430@nku069218.hh.nku.edu>

Hi Yannick,

I think it's a great idea to generate predefined pipelines for common
bioinformatics tasks. I experimented with a tool called Boson six months ago.
It could be worth looking if you feel like investing more time into your
pipeline.

Boson commands, similar to rake tasks, are more modular and can be installed
from the web into a ~/.boson directory. This has obvious advantages over
a single rake file. Boson tasks can be chained together where the data is
passed around in YAML format.

The github link is - https://github.com/cldwalker/boson

Cheers

Michael Barton



From diapriid at gmail.com  Tue Jan 18 10:21:36 2011
From: diapriid at gmail.com (Matt)
Date: Tue, 18 Jan 2011 10:21:36 -0500
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
	<3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID: <AANLkTinGavos9pLmz6cCHW8FQTZZoS+tJ7PkwxT4QRrw@mail.gmail.com>

Yannick-

I like it.  It might be nice to extend your pipeline in a generic
manner (SimpleAnalysisPipeline). Just a couple of steps that would be
extensible/swappable to different software.  The generic pipeline
would "just work" given a minimal local configuration (I like your
starting point).

Swappable/configurable steps might be

Pre-process (trim / quality filters?)
Alignment   (align)
Post alignment (gblocks)
Translation   (to Nexus)
Analysis      (Phyml)

The idea is that we could swap in components (TNT or RAxML for PhyML,
Muscle for MAFFT, etc.) but also that the pipeline remains "simple".

If I find some time I'd like to work on my first attempt at a BioRuby
Plugin, a wrapper for TNT (hopefully tied in to the analysis bit
above).
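
A minimal Ruby sketch of the swappable-step idea described above, for illustration only. The class name SimpleAnalysisPipeline is taken from the message; the step names, the lambdas, and the string placeholders are hypothetical, and no real aligner or tree builder is called.

```ruby
# Hypothetical sketch: each step is a named callable that receives the output
# of the previous step, so one implementation can be swapped for another.
class SimpleAnalysisPipeline
  def initialize
    @steps = []   # ordered list of [name, callable] pairs
  end

  # Append a step, or replace the step already registered under +name+.
  def step(name, callable)
    index = @steps.index { |existing, _| existing == name }
    if index
      @steps[index] = [name, callable]
    else
      @steps << [name, callable]
    end
    self
  end

  # Run every step in order, feeding each result into the next step.
  def run(input)
    @steps.inject(input) { |data, (_name, callable)| callable.call(data) }
  end

  def step_names
    @steps.map { |name, _| name }
  end
end

# Placeholder steps: real ones would shell out to MAFFT, Gblocks, PhyML, ...
pipeline = SimpleAnalysisPipeline.new
pipeline.step(:align,    lambda { |seqs| "mafft(#{seqs})" })
pipeline.step(:trim,     lambda { |aln|  "gblocks(#{aln})" })
pipeline.step(:analysis, lambda { |aln|  "phyml(#{aln})" })

default_result = pipeline.run("cdsSeq.fasta")

# Swap the aligner without touching the other steps.
pipeline.step(:align, lambda { |seqs| "muscle(#{seqs})" })
swapped_result = pipeline.run("cdsSeq.fasta")
```

Registering a step under an existing name replaces it in place, so the order of the pipeline stays fixed while individual tools change.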

cheers,
Matt




From francesco.strozzi at gmail.com  Tue Jan 18 15:55:20 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Tue, 18 Jan 2011 21:55:20 +0100
Subject: [BioRuby] BioRuby HTSeq-like
Message-ID: <AANLkTinWN5ZDRkeJWmVoCbMQzPZa72_yMwBN+eg0eoMb@mail.gmail.com>

Hi BioRuby people,
just wondering whether something similar to HTSeq exists for BioRuby (HTSeq
is a Python package for working with and manipulating next-gen sequencing data):

http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html

Many of its features could be implemented in, or are already available for,
BioRuby. These are the basics:
- Getting statistical summaries about the base-call quality scores to
study the data quality.
- Calculating a coverage vector and exporting it for visualization in
a genome browser.
- Reading in annotation data from a GFF file.
- Assigning aligned reads from an RNA-Seq experiment to exons and genes.
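
As a rough illustration of two of the basics listed above, here is a plain-Ruby sketch of a coverage vector and a per-position quality summary. It does not reproduce the HTSeq API; the function names are hypothetical, read intervals are assumed to be 0-based and half-open, and quality strings are assumed to use Sanger (Phred+33) encoding.

```ruby
# Build a per-base coverage vector from aligned read intervals.
# Intervals are assumed to be 0-based and half-open: [start, stop).
def coverage_vector(length, intervals)
  diff = Array.new(length + 1, 0)          # difference array: +1 at start, -1 at stop
  intervals.each do |start, stop|
    start = [[start, 0].max, length].min   # clamp to the reference
    stop  = [[stop, 0].max, length].min
    next if stop <= start
    diff[start] += 1
    diff[stop]  -= 1
  end
  depth = 0
  diff.first(length).map { |delta| depth += delta }   # running sum gives the depth
end

# Mean Phred quality per read position, assuming Phred+33 encoded strings.
def mean_quality_by_position(quality_strings)
  sums = []
  counts = []
  quality_strings.each do |qual|
    qual.each_byte.with_index do |byte, i|
      sums[i]   = (sums[i] || 0) + (byte - 33)
      counts[i] = (counts[i] || 0) + 1
    end
  end
  sums.each_index.map { |i| sums[i].to_f / counts[i] }
end

coverage = coverage_vector(10, [[0, 4], [2, 6], [5, 10]])
means    = mean_quality_by_position(["II5", "I5"])
```

The coverage array could then be written out in a browser-friendly format such as wiggle; that export step is not shown here.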

In particular, the plotting functions for exploring and assessing data
quality seem very interesting.
If nothing similar exists for BioRuby, I think we should discuss
coding a BioRuby "NextGenSequencing" plugin that provides the same
functionality and also adds something new.

What do you think?

Cheers
--

Francesco

From yannick.wurm at unil.ch  Wed Jan 19 23:08:55 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Thu, 20 Jan 2011 11:08:55 +0700
Subject: [BioRuby] Rake
In-Reply-To: <mailman.7.1295370009.1693.bioruby@lists.open-bio.org>
References: <mailman.7.1295370009.1693.bioruby@lists.open-bio.org>
Message-ID: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>

Hello & thanks for the comments.

Matt wrote: 
> Swappable/configurable steps might be
> 
> Pre-process (trim / quality filters?)
> Alignment   (align)
> Post alignment (gblocks)
> Translation   (to Nexus)
> Analysis      (Phyml)
> 
> The idea is that we could swap in components (TNT or RaXML for Phyml,
> Muscle for MAFFT etc.)- but also that the pipeline remains "simple".

Yes, that's what I would ideally want (as well as being able to easily modify the run options of the programs). How would you go about generalizing this?
Right now I'm basing "what to do" on the file extensions I provide... which limits me to what the file extensions can express...


Michael wrote:
> I think it's a great idea to generate predefined pipelines for common
> bioinformatics tasks. I experimented with a tool called Boson six months ago.
> It could be worth looking if you feel like investing more time into your
> pipeline.
> 
> Boson commands, similar to rake tasks, are more modular and can be installed
> from the web into a ~/.boson directory. This has obvious advantages over
> a single rake file. Boson tasks can be chained together where the data is
> passed around in YAML format.
> 
> The github link is - https://github.com/cldwalker/boson

I haven't looked at it thoroughly yet, but at least superficially, Boson looks really cool.

However, I'm a bit scared of investing energy in technologies that are too new. Boson has only one developer, who may or may not keep his project alive over the next few years. Time I invest in learning something today ... must continue to improve my productivity over the next 5 or 10 years by still being reusable then (with as few modifications as possible). There is uncertainty in everything, but something like Boson does seem a bit too risky right now...

Cheers,

yannick


-------------------------
 Ant Genomes & Evolution 
http://yannick.poulet.org
   skype://yannickwurm


From bonnalraoul at ingm.it  Thu Jan 20 04:13:20 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 20 Jan 2011 10:13:20 +0100
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
	<3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID: <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>

Dear Yannick,
Rake is usually used inside a project directory to provide common operations for that project.
What is the idea behind "rake for bioinformatics"? I mean, you have to copy your rakefile to wherever your data are.

Looking at the code, there are a lot of dependencies on command-line software, and that could be a maintainability problem.
What about spending some energy on wrapping those commands in BioRuby classes?
That way those applications would be available to other scripts.

If you want to keep the rake approach, we should find a way to avoid replicating rakefiles.
One idea could be to create a rakefile in your working directory, similar to Rails:


# Add your own tasks in files placed in lib/tasks ending in .rake,
# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.

require File.expand_path('../config/application', __FILE__)
require 'rake'

# The user just needs to add the tasks they want:
Bio::SomeName.load_tasks
Bio::SomeOtherName.load_tasks
Bio::AnotherName.load_tasks
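
One way such a `load_tasks` method could be written is sketched below. This is an assumption about how a plugin might expose tasks, not existing BioRuby code: `Bio::SomeName` is the placeholder name from the rakefile above, and the task names and bodies are made up. It relies only on Rake's own DSL module.

```ruby
require 'rake'

module Bio
  # Hypothetical plugin module; neither this module nor load_tasks exists in BioRuby.
  module SomeName
    extend Rake::DSL   # makes task/namespace/desc callable inside this module

    # Define this plugin's tasks in the current Rake application.
    def self.load_tasks
      namespace :some_name do
        desc "Placeholder for a multiple alignment step"
        task :align do
          puts "would run the aligner here"
        end

        desc "Placeholder for tree building, run after the alignment"
        task :tree => :align do
          puts "would run the tree builder here"
        end
      end
    end
  end
end

# In the user's Rakefile, one line per plugin is then enough:
Bio::SomeName.load_tasks
```

With this in place, `rake some_name:tree` would run the alignment placeholder first and then the tree placeholder, and the task definitions live in the plugin rather than being copied into every working directory.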



--
R.J.P.B.


From bonnalraoul at ingm.it  Thu Jan 20 05:35:58 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 20 Jan 2011 11:35:58 +0100
Subject: [BioRuby] BioRuby HTSeq-like
In-Reply-To: <AANLkTinWN5ZDRkeJWmVoCbMQzPZa72_yMwBN+eg0eoMb@mail.gmail.com>
References: <AANLkTinWN5ZDRkeJWmVoCbMQzPZa72_yMwBN+eg0eoMb@mail.gmail.com>
Message-ID: <EA3E3FFC-7BD8-43DE-806F-CA873089E23D@ingm.it>

Hi folks,
Yesterday I met Francesco in my lab, and it was a wonderful opportunity to exchange ideas and thoughts.


About Francesco's mail: I think we could take inspiration from Galaxy/BioPython (http://main.g2.bx.psu.edu/); they did very good work wrapping the common software for crunching NGS data.

So my input is: let's start wrapping them, possibly opening a bioruby-ngs project on GitHub:
https://github.com/helios/bioruby-ngs (just the repo :-))


Reading around (http://seqanswers.com/forums/showthread.php?t=2461), there is sometimes a need to split and distribute the computation.
There are different possibilities, but splitting the fastq file while also enabling multithreading seems to be the best option; if you have suggestions, please comment.
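
The splitting half of that idea can be sketched in a few lines of plain Ruby. This is only an illustration: the function name, chunk file names, and demo reads are made up, and it assumes the common four-lines-per-record FASTQ layout (no wrapped sequence lines).

```ruby
require 'tmpdir'

# Split a FASTQ file into chunk files of at most +records_per_chunk+ records.
# Assumes every record occupies exactly four lines.
def split_fastq(path, records_per_chunk, out_dir)
  chunk_paths = []
  File.open(path) do |input|
    input.each_slice(4 * records_per_chunk).with_index do |lines, i|
      chunk_path = File.join(out_dir, format("chunk_%04d.fastq", i))
      File.open(chunk_path, "w") { |out| out.write(lines.join) }
      chunk_paths << chunk_path
    end
  end
  chunk_paths
end

# Demo with five made-up reads, split into chunks of two records each.
demo_dir   = Dir.mktmpdir
demo_fastq = File.join(demo_dir, "reads.fastq")
File.open(demo_fastq, "w") do |f|
  5.times { |n| f.puts "@read#{n}", "ACGT", "+", "IIII" }
end
chunks = split_fastq(demo_fastq, 2, demo_dir)
```

Each chunk file can then be handed to a separate aligner process or cluster job, and the per-chunk results merged afterwards.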

Thanks to Goto san, fastq support is on 
Thanks to Pjotr, GFF3 support is on
Thanks to Chase and Francesco, CAF and Ace support is on
For plotting, as we said, one possibility is http://rubyvis.rubyforge.org/ from Claudio Bustos, but if you have better alternatives... please discuss.
About statistics, please join http://groups.google.com/group/sciruby-dev

Having these tools in our arsenal is useful and strategic for funding.

I would say 
+1

PS: Please clone and add your name to the list of authors if you want to join this project.

PPS: If anyone is using SGE, what do you think about http://gridengine.info/2010/12/24/goodbye-grid-engine ?



--
R.J.P.B.


From francesco.strozzi at gmail.com  Thu Jan 20 06:27:43 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Thu, 20 Jan 2011 12:27:43 +0100
Subject: [BioRuby] BioRuby HTSeq-like
In-Reply-To: <EA3E3FFC-7BD8-43DE-806F-CA873089E23D@ingm.it>
References: <AANLkTinWN5ZDRkeJWmVoCbMQzPZa72_yMwBN+eg0eoMb@mail.gmail.com>
	<EA3E3FFC-7BD8-43DE-806F-CA873089E23D@ingm.it>
Message-ID: <AANLkTi=4d7CJCe8eQ2AeEq8DfFpqFJNWWh=v+3eAkfn4@mail.gmail.com>

Today is Thursday (BioRuby IRC day), so I will try to join the #bioruby
channel this afternoon (CET). If anyone else is around, we could
discuss this plugin and new ideas.


> PS: Please clone and add your name to the list of the authors if you want to
> join into this project.

Done! I'm in!


-- 

Francesco

From mail at michaelbarton.me.uk  Thu Jan 20 15:57:26 2011
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Thu, 20 Jan 2011 15:57:26 -0500
Subject: [BioRuby] Rake
In-Reply-To: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>
References: <mailman.7.1295370009.1693.bioruby@lists.open-bio.org>
	<7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>
Message-ID: <20110120205726.GD245@Michael-Bartons-MacBook.local>

Yannick, you make an excellent point about the long-term stability of Boson.
The Ruby community (and I am often guilty of this myself) is quick to jump on a
new gem, which may or may not last into the future. An example of this is the
Less gem for compiling CSS, which has seen some recent popularity. I believe the
developer has said he will no longer maintain it.

Another option could be Thor. I believe this is also aimed at being a more
modular rake-like tool. It is developed by Yehuda Katz and I think it is used
as the basis of a few mainstream Ruby command-line tools (possibly the Rails 3
CLI? I'm not 100% sure about this). I think you could expect Thor to be more
mature and more likely to be continually developed. If you can find the episode
of the ChangeLog with Yehuda, you can hear him discuss it.


From yannick.wurm at unil.ch  Fri Jan 21 00:24:22 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Fri, 21 Jan 2011 12:24:22 +0700
Subject: [BioRuby] Rake
In-Reply-To: <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
	<3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
	<63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>
Message-ID: <2E930098-FD2E-4D3B-AC61-1B54D3653DE7@unil.ch>

Ciao Raoul,

Sorry, I was away from the computer during most of the IRC meeting.


On 20 Jan 2011, at 16:13, Raoul Bonnal wrote:
> Dear Yannick,
> Rake is usually used inside a project directory to provide common operations for that project.
> What is the idea behind "rake for bioinformatics"? I mean, you have to copy your rakefile to wherever your data are.
> 
> Looking at the code, there are a lot of dependencies on command-line software, and that could be a maintainability problem.

That is true. My main worry here (and for most things) is getting a biological result quickly. I'm still working on finding the right balance between a quick hack and maintainability/reusability. Migrating from shell scripts to Ruby hacks probably does save me some time, because in Ruby it's really simple to add a few checks that raise an error if a tool I need isn't in the $PATH or if an input/output file is empty. That makes debugging and fixing much faster if I decide to run things on the Linux server instead of the MacBook, or in two years' time after a reinstall.
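
The kind of checks described above can be sketched as follows. The helper names are hypothetical and this is not the code from the Rakefile linked earlier in the thread; it only shows one plain-Ruby way to fail early when a tool or a file is missing.

```ruby
# Return the full path of +tool+ if it is an executable file on $PATH, else nil.
def find_in_path(tool)
  ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, tool)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Raise a descriptive error when a required external program is unavailable.
def require_tool!(tool)
  find_in_path(tool) || raise("Required tool '#{tool}' was not found in $PATH")
end

# Raise when an input or output file is missing or has zero size.
# File.size? returns nil in both cases.
def require_nonempty!(path)
  raise "Expected file '#{path}' is missing or empty" unless File.size?(path)
  path
end
```

A pipeline step might call `require_tool!("mafft")` before running and `require_nonempty!` on its output file afterwards, so a broken environment is reported at the step that failed rather than several steps later.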


> What about spending some energy on wrapping those commands in BioRuby classes?
> That way those applications would be available to other scripts.
I have two answers.
  - Right now I cannot dedicate the time required to learn how to do that well. I need to understand how ants work first :)     (If I were developing a big UniProt-type web application that needed to be robust for users, writing wrappers might be defensible... for one-off hacks it's not.)
  - Call me conservative, but I'm also generally scared of wrappers. First, I want to have the raw input and output files that the programs use, because I may need to read, edit, or rerun them in the future... I know I'll be able to read a raw text file. That is why I've never used BioRuby's wrappers for blast, codeml, or multiple sequence alignment (although I have recently discovered the amazingly timesaving Bio::Tree, wow). Second, programs are constantly changing... and so wrappers must too. They're a ton of work to maintain and, like the Boson thing, there is no guarantee that the work will be done.
 

> If you want to keep the rake approach we should find a way to not replicate rakefiles.
> One idea could be to create a rakefile in your working directory, similar to Rails:
> 
> # Add your own tasks in files placed in lib/tasks ending in .rake,
> # for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.
> 
> require File.expand_path('../config/application', __FILE__)
> require 'rake'
> 
> #The user needs just to add the tasks he wants:
> Bio::SomeName.load_tasks
> Bio::SomeOtherName.load_tasks
> Bio::AnotherName.load_tasks


That sounds like a really cool approach. I want to hear more :)


-------------------------
 Ant Genomes & Evolution 
http://yannick.poulet.org
   skype://yannickwurm


From francesco.strozzi at gmail.com  Fri Jan 21 04:41:47 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Fri, 21 Jan 2011 10:41:47 +0100
Subject: [BioRuby] BIO-NGS (and Rake/Thor for bioinformatics)
Message-ID: <AANLkTi=kyv-ynXXmsW2wD=uoaNjpWc6atQN=zgVPK339@mail.gmail.com>

Hi all,
in yesterday's IRC chat (http://bioruby.org/irc/?date=2011-01) we
discussed the bio-ngs plugin that Raoul wrote about in a previous
email.
Here is the Wiki page on BioRuby describing the general idea for this
plugin: http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing

We want to use wrappers and/or bindings to existing tools like
MAQ, BWA, and SAMtools, and we want to use Rake or Thor to provide custom
tasks and let the user run NGS analyses. We would also like to include
the possibility of creating reports using statsample and rubyvis. Some
aspects are still a bit unclear at the moment (I think we need to
define some sort of guidelines), but I hope we can come up with a
useful (let me use this term) "framework" for running bioinformatics NGS
analyses with Ruby.

Any comment/help/feedback/suggestion is more than welcome!

Cheers

-- 

Francesco

From mictadlo at gmail.com  Tue Jan 25 19:41:11 2011
From: mictadlo at gmail.com (Michal)
Date: Wed, 26 Jan 2011 10:41:11 +1000
Subject: [BioRuby] marshal data too short
Message-ID: <4D3F6DA7.8050101@gmail.com>

Hi,
I installed Ruby 1.9.2 on Ubuntu 10.10 (Linux Mint) in the following way:

$ tar xvfz ruby-1.9.2-p136.tar.gz
$ cd ruby-1.9.2-p136/
$ ./configure --prefix=/home/mictadlo/apps/ruby
$ make
$ make install
$ vim ~/.bashrc
  export APPS=/home/mictadlo/apps
  export RUBY_HOME=$APPS/ruby
  export LD_LIBRARY_PATH=/RUBY_HOME/lib
  PATH=$RUBY_HOME/bin:$PATH
$ . ~/.bashrc
$ ruby -v
  ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux]

$ tar xvfz bioruby-1.4.1.tar.gz
$ cd bioruby-1.4.1/
$ ruby setup.rb
$ bioruby
     Loading config (/home/mitlox/.bioruby/shell/session/config) ... done
     Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short
     done

     . . . B i o R u b y   i n   t h e   s h e l l . . .

       Version : BioRuby 1.4.1 / Ruby 1.9.2

     bioruby> exit

How can I fix the error in BioRuby?

Thank you in advance.

Michal


From bonnalraoul at ingm.it  Wed Jan 26 10:23:03 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Wed, 26 Jan 2011 16:23:03 +0100
Subject: [BioRuby] IRC meeting
Message-ID: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it>

As usual, the IRC meeting is tomorrow.

--
R.J.P.B.


From ktym at hgc.jp  Wed Jan 26 11:09:02 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu, 27 Jan 2011 01:09:02 +0900
Subject: [BioRuby] marshal data too short
In-Reply-To: <4D3F6DA7.8050101@gmail.com>
References: <4D3F6DA7.8050101@gmail.com>
Message-ID: <ED694538-86D8-4D9A-B55A-A763C1853EAE@hgc.jp>

Hi Michal,

Could you give me some additional information?

% ls -l ~/.bioruby/shell/session/object
-rw-r--r--  1 ktym  staff  17401  1 19 13:09 /Users/ktym/.bioruby/shell/session/object

% ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object
[4, 8]

Have you ever used the BioRuby shell with an older version of Ruby before?

If your file is not corrupted, this might be due to a backward
incompatibility in the Marshal file format (if so, does anyone know
whether there is any workaround to convert old marshal data into 1.9's?).

Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully
restore the saved objects:

% ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object
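
The two bytes printed by the `unpack` command above are the major and minor version of the Marshal format. The sketch below turns that manual check into a small helper; the function name and the status symbols are made up for this example, and the demo files are throwaway stand-ins for the real session file.

```ruby
require 'tmpdir'

# Classify a saved session file by looking at its Marshal header.
# Marshal data starts with two bytes: the major and minor format version.
def marshal_file_status(path)
  return :missing unless File.exist?(path)
  header = File.open(path, "rb") { |f| f.read(2) }   # nil when the file is empty
  return :empty if header.nil? || header.empty?
  return :truncated if header.bytesize < 2
  major, minor = header.unpack("CC")
  # Marshal.load accepts the same major version and an equal or lower minor version.
  if major == Marshal::MAJOR_VERSION && minor <= Marshal::MINOR_VERSION
    :loadable_format
  else
    :incompatible_format
  end
end

# Demo with throwaway files (paths are made up for this sketch).
demo_dir   = Dir.mktmpdir
empty_file = File.join(demo_dir, "empty_object")
good_file  = File.join(demo_dir, "good_object")
File.open(empty_file, "wb") {}                                    # zero-byte file
File.open(good_file, "wb") { |f| f.write(Marshal.dump({ :seq => "atgc" })) }

empty_status = marshal_file_status(empty_file)
good_status  = marshal_file_status(good_file)
```

A zero-byte file is reported as `:empty`, which separates "nothing was ever saved" from a real format incompatibility between Ruby versions.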

Toshiaki




From ktym at hgc.jp  Wed Jan 26 11:46:23 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu, 27 Jan 2011 01:46:23 +0900
Subject: [BioRuby] IRC meeting
In-Reply-To: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it>
References: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it>
Message-ID: <E7627EE8-7AC6-45EF-B88E-F8600DE341F8@hgc.jp>

Raoul,

On 2011/01/27, at 0:23, Raoul Bonnal wrote:

> As usual, tomorrow the IRC meeting.
> 
> --
> R.J.P.B.


Thank you for the reminder! The next will be our 6th IRC meeting.

In the 3rd (Jan 6) and 4th (Jan 13) meetings, we discussed the logging system.
As a result, Pjotr released the bio-logger plugin, and his bio-gff3 plugin became
the first use case of the logger (he posted announcements to this list on Jan 17th).

We talked about a plan to develop NGS-related plugins at the 5th (Jan 20) meeting.
As posted by Francesco on Jan 21st, he contributed a Wiki page summarizing the idea:
http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing

As for the weekly BioRuby IRC meeting, please see
http://bioruby.open-bio.org/wiki/BioRuby_IRC_conference

Thanks,

Toshiaki

From bonnalraoul at ingm.it  Wed Jan 26 14:10:26 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Wed, 26 Jan 2011 20:10:26 +0100
Subject: [BioRuby] IRC meeting
In-Reply-To: <E7627EE8-7AC6-45EF-B88E-F8600DE341F8@hgc.jp>
Message-ID: <20110126191026.e71c169e@mail.ingm.it>

Hi all,
I have updated the page http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing
I'll try to keep you up to date about samtools via the mailing list and that page.
I can't remember who is involved in the workflows; tomorrow we'll fix the page with the right names.

From mictadlo at gmail.com  Fri Jan 28 07:18:30 2011
From: mictadlo at gmail.com (Michal)
Date: Fri, 28 Jan 2011 22:18:30 +1000
Subject: [BioRuby] marshal data too short
In-Reply-To: <ED694538-86D8-4D9A-B55A-A763C1853EAE@hgc.jp>
References: <4D3F6DA7.8050101@gmail.com>
	<ED694538-86D8-4D9A-B55A-A763C1853EAE@hgc.jp>
Message-ID: <4D42B416.8010503@gmail.com>

Hi Toshiaki,
Ruby was not installed on my system before; I just installed the
latest version in my home directory:
$ ls -l ~/.bioruby/shell/session/object
-rw-r--r-- 1 mictadlo mictadlo 0 2011-01-25 19:40 /home/mictadlo/.bioruby/shell/session/object
$ ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object
[nil, nil]
$ ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object
-e:1:in `load': marshal data too short (ArgumentError)
     from -e:1:in `<main>'

Do you need any other information?

Thank you in advance.

Michal




From ktym at hgc.jp  Sat Jan 29 07:18:04 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sat, 29 Jan 2011 21:18:04 +0900
Subject: [BioRuby] marshal data too short
In-Reply-To: <4D42B416.8010503@gmail.com>
References: <4D3F6DA7.8050101@gmail.com>
	<ED694538-86D8-4D9A-B55A-A763C1853EAE@hgc.jp>
	<4D42B416.8010503@gmail.com>
Message-ID: <8DFDDEA3-9B1D-44DC-BCB6-DCBA2C06BAF9@hgc.jp>

Hi Michal,

When I removed the ~/.bioruby directory, I could reproduce the same error with Ruby 1.9.2.

The ~/.bioruby/shell/session/object file was empty because the BioRuby shell failed to save it:

Saving object (/Users/ktym/.bioruby/shell/session/object) ... Error: Failed to save (/Users/ktym/.bioruby/shell/session/object) : can't convert Symbol into String

I'll try to fix this.
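
Independently of the root cause, one defensive pattern is to serialize first and only then replace the file, so a failed save never leaves a zero-byte session behind. The sketch below is a generic illustration with hypothetical function names; it is not the BioRuby shell's code and it does not address the "can't convert Symbol into String" error itself.

```ruby
require 'tmpdir'

# Serialize first, then write to a temporary file and rename it into place.
# If Marshal.dump raises, the existing session file is left untouched
# instead of being truncated to zero bytes.
def save_objects_atomically(path, objects)
  data = Marshal.dump(objects)             # may raise TypeError
  tmp_path = "#{path}.tmp"
  File.open(tmp_path, "wb") { |f| f.write(data) }
  File.rename(tmp_path, path)
  true
end

# Treat a missing or empty file as "no saved session" rather than an error.
def load_objects(path)
  return {} unless File.size?(path)
  File.open(path, "rb") { |f| Marshal.load(f.read) }
end

# Demo: a good save, followed by a save that fails (Procs cannot be dumped).
demo_dir = Dir.mktmpdir
session  = File.join(demo_dir, "object")

save_objects_atomically(session, { :seq => "atgc" })
begin
  save_objects_atomically(session, { :callback => lambda { } })
rescue TypeError => e
  failed_message = e.message
end
restored = load_objects(session)
```

After the failed second save, the file still holds the data from the first one, and loading from a path that was never written simply returns an empty hash.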

Toshiaki


On 2011/01/28, at 21:18, Michal wrote:

> Hi Toshiaki,
> On my system was not Ruby installed before and I just installed the latest version in my home directory:
> $ ls -l ~/.bioruby/shell/session/object
> -rw-r--r-- 1 mictadlo mictadlo 0 2011-01-25 19:40 /home/mictadlo/.bioruby/shell/session/object
> $ ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object
> [nil, nil]
> $ ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object
> -e:1:in `load': marshal data too short (ArgumentError)
>    from -e:1:in `<main>'
> 
> Do you need another information?
> 
> Thank you in advance.
> 
> Michal
> 
> 
> On 01/27/2011 02:09 AM, Toshiaki Katayama wrote:
>> Hi Michal,
>> 
>> Could you give me some additional information?
>> 
>> % ls -l ~/.bioruby/shell/session/object
>> -rw-r--r--  1 ktym  staff  17401  1 19 13:09 /Users/ktym/.bioruby/shell/session/object
>> 
>> % ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object
>> [4, 8]
>> 
>> Have you ever used the bioruby shell with the old version of Ruby before?
>> 
>> If your file is not corrupted, this might be due to the backward
>> incompatibility of the Marshal file format (if so, does anyone know
>> whether there are any workaround to convert old marshal data into 1.9's?).
>> 
>> Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully
>> restore the saved objects:
>> 
>> % ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object
>> 
>> Toshiaki
>> 
>> 
>> On 2011/01/26, at 9:41, Michal wrote:
>> 
>>> Hi,
>>> I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way
>>> 
>>> $ tar xvfz ruby-1.9.2-p136.tar.gz
>>> $ cd ruby-1.9.2-p136/
>>> $ ./configure --prefix=/home/mictadlo/apps/ruby
>>> $ make
>>> $ make install
>>> $ vim ~/.bashrc
>>> export APPS=/home/mictadlo/apps
>>> export RUBY_HOME=$APPS/ruby
>>> export LD_LIBRARY_PATH=$RUBY_HOME/lib
>>> PATH=$RUBY_HOME/bin:$PATH
>>> $ . ~/.bashrc
>>> $ ruby -v
>>> ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux]
>>> 
>>> $ tar xvfz bioruby-1.4.1.tar.gz
>>> $ cd bioruby-1.4.1/
>>> $ ruby setup.rb
>>> $ bioruby
>>>    Loading config (/home/mitlox/.bioruby/shell/session/config) ... done
>>>    Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short
>>>    done
>>> 
>>>    . . . B i o R u b y   i n   t h e   s h e l l . . .
>>> 
>>>      Version : BioRuby 1.4.1 / Ruby 1.9.2
>>> 
>>>    bioruby>  exit
>>> 
>>> How can I fix the error in BioRuby?
>>> 
>>> Thank you in advance.
>>> 
>>> Michal
>>> 
>>> _______________________________________________
>>> BioRuby Project - http://www.bioruby.org/
>>> BioRuby mailing list
>>> BioRuby at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioruby
>> 
> 


From mictadlo at gmail.com  Sun Jan 30 06:42:09 2011
From: mictadlo at gmail.com (Michal)
Date: Sun, 30 Jan 2011 21:42:09 +1000
Subject: [BioRuby] samtools-ruby
Message-ID: <4D454E91.1080604@gmail.com>

Hi,
I have tried to install samtools-ruby on ruby 1.9.2, but I have failed. 
I have already posted this problem on 
https://github.com/homonecloco/samtools-ruby/issues#issue/3 , but I have 
not got any response.

What did I do wrong?

Michal

From bonnalraoul at ingm.it  Mon Jan 31 05:11:46 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 31 Jan 2011 11:11:46 +0100
Subject: [BioRuby] samtools-ruby
In-Reply-To: <4D454E91.1080604@gmail.com>
References: <4D454E91.1080604@gmail.com>
Message-ID: <A6CFF775-1E68-4FA1-A84F-8E489E2FCE96@ingm.it>

Dear Michal,
please check this out:

https://github.com/helios/bioruby-samtools

This is the initial port of samtools-ruby as a plugin. It comes with libraries for OS X and Linux, but not Windows.
I need to test the Linux library because I'm developing under OS X.
If the libbam.a is wrong, please give me the right one and I'll add it to the repo.
Also note that the library has been compiled for 64-bit.

Ciao!

On 30/gen/2011, at 12.42, Michal wrote:

> Hi,
> I have tried to install samtools-ruby on ruby 1.9.2, but I have failed. I have already posted this problem on https://github.com/homonecloco/samtools-ruby/issues#issue/3 , but I have not got any response.
> 
> What did I wrong?
> 
> Michal
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.


From bonnalraoul at ingm.it  Mon Jan 31 05:27:51 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 31 Jan 2011 11:27:51 +0100
Subject: [BioRuby] BioGem and Rails
Message-ID: <1CCE39E6-F232-44C8-B95D-3C620443EF5C@ingm.it>

Dear All,
I've created a new branch in biogem.

https://github.com/helios/bioruby-gem/tree/rails_engine

It adds an option to the biogem script for creating a Rails engine with your gem. Rails 3 ONLY!

The idea is: develop a gem that can be used in a script, and extend it so it can be integrated in a Rails project.
Which libraries can benefit from this approach? I think databases, parsers, or any data that you want to expose to a Rails application.

It's in a very early stage so don't use it now, this message is just to let you know that we are adding new features.

from the help:
--with-engine                create a Rails engine with the namespace give in input. Set default database creation

Note: It is not yet possible to add the engine to an existing gem; I need to fix that and implement a generator to accomplish this task.

Any input is welcome.


Ciao.

--
R.J.P.B.


From jan.aerts at gmail.com  Mon Jan 31 10:07:39 2011
From: jan.aerts at gmail.com (Jan Aerts)
Date: Mon, 31 Jan 2011 16:07:39 +0100
Subject: [BioRuby] ruby-ensembl-api paper accepted by Bioinformatics
Message-ID: <AANLkTimEcTFHOZBLRiYDBKo-iDOGnqYU_7ypM+tp41dM@mail.gmail.com>

All,

FYI: There is now a Bioinformatics paper that describes the Ruby API to the
Ensembl databases. Thanks to Francesco Strozzi for working on this with me.
You can find it here: http://bit.ly/fzQamR

At this moment this API covers the core and variation databases. If anyone
is interested in working on the API for compara or functional, please let me
know.

Kind regards,
jan.

From bonnalraoul at ingm.it  Mon Jan 31 10:22:12 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 31 Jan 2011 16:22:12 +0100
Subject: [BioRuby] ruby-ensembl-api paper accepted by Bioinformatics
In-Reply-To: <AANLkTimEcTFHOZBLRiYDBKo-iDOGnqYU_7ypM+tp41dM@mail.gmail.com>
References: <AANLkTimEcTFHOZBLRiYDBKo-iDOGnqYU_7ypM+tp41dM@mail.gmail.com>
Message-ID: <193864A0-D798-4737-83CE-7A7932E4552C@ingm.it>

well done!
On 31/gen/2011, at 16.07, Jan Aerts wrote:

> All,
> 
> FYI: There is now a Bioinformatics paper that describes the Ruby API to the
> Ensembl databases. Thanks to Francesco Strozzi for working on this with me.
> You can find it here: http://bit.ly/fzQamR
> 
> At this moment this API covers the core and variation databases. If anyone
> is interested in working on the API for compara or functional, please let me
> know.
> 
> Kind regards,
> jan.
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.


From missy at be.to  Sun Jan  2 01:38:57 2011
From: missy at be.to (MISHIMA, Hiroyuki)
Date: Sun, 02 Jan 2011 10:38:57 +0900
Subject: [BioRuby] Workflows: NGS + miRNA
In-Reply-To: <4D1FCCB0.2000303@be.to>
References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it>	<4D10B9AE.2010206@be.to>	<5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it>	<4D12A1CE.4040702@be.to>	<0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
	<4D1FCCB0.2000303@be.to>
Message-ID: <4D1FD731.4070309@be.to>

Hi all,

An addition to my workflow.

Only miRTRAP requires read alignments generated by Novoalign. The inputs for
miRExpress are fastq files, and miRExpress clips adapters from the fastq files.

miRExpress is easy and fast. It is a good one for a first try.

While using miRExpress, you may find 5'-end variations in mature miRNA
reads. These prevent accurate alignment, and they may not be artifacts (see
Wu et al., PLoS One, 4, p.e7566,
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0007566
). Clipping 5'-end variations increases alignment hits.

MISHIMA, Hiroyuki wrote (2011/01/02 9:54):
> Dear Raoul and the BioRuby list,
>
> My workflow for miRNA analysis using Illumina GAii is like the followings:

-- 
MISHIMA, Hiroyuki, DDS, Ph.D.
COE Research Fellow
Department of Human Genetics
Nagasaki University Graduate School of Biomedical Sciences


From pjotr.public14 at thebird.nl  Sun Jan  2 12:04:48 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 2 Jan 2011 13:04:48 +0100
Subject: [BioRuby] GFF3
Message-ID: <20110102120448.GA23804@thebird.nl>

The GFF3 plugin works rather well. Anyone who has ruby 1.9.x on his
system can just type as a user:

  gem install bio-gff3

and even bioruby itself gets installed, if needed. Next you can type,
for example

  gff3-fetch mRNA test/data/gff/MhA1_Contig1133.fa test/data/gff/MhA1_Contig1133.gff3

to assemble all mRNA. 

Unfortunately I am finding some problems with data. For example
the reading frame is *wrong* in this wormbase data file (predicted
gene). The contig starts as:

>MhA1_Contig3426
TTAATAAATTTAATTCATTAAAATTTTAAAAAGAAAGGGACATTCGAGGGGAAATGAGAGAGAACGAGAGAAAATGGACG
GGAAATTAAATTAAAAAATAAAAAATTAATTTTTATTTTTTTTTATTTAATTTAAAATTAATTTTCTACATTTATTAAAT
CTTAAATTATTAATTTTAAATTAATTTAAAG GCATCCAACAACAACAATTAGAAGTCTTTCCCAGCTCCTCCTCTGCCCC
TCAGCAACAACAATACCCAGCGCAGCAGCTTCAATTAGTTACTCCTTTTATTGCATGCATAGCAGATGAATTGAGGGAGT
TGATAGATGAAATGCGTATGTTTTAG AATATTTTTTAAAAAAAAATTAAAAAAAATTTTTTTTTGCCAAACAGGCTCTCG

and the full record is:

##gff-version 3
##sequence-region MhA1_Contig3426 1 2029
# Gene gene:MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase        gene    192     346     .       +       .       ID=gene:MhA1_Contig3426.frz3.gene1;Name=MhA1_Contig3426.frz3.gene1;Note=PREDICTED protein_coding;public_name=MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase        mRNA    192     346     .       +       .       ID=transcript:MhA1_Contig3426.frz3.gene1;Parent=gene:MhA1_Contig3426.frz3.gene1;Name=MhA1_Contig3426.frz3.gene1;public_name=MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase        exon    192     346     .       +       .       ID=exon:MhA1_Contig3426.frz3.gene1.1;Parent=transcript:MhA1_Contig3426.frz3.gene1
MhA1_Contig3426 WormBase        CDS     192     346     .       +       0       ID=cds:MhA1_Contig3426.frz3.gene1;Parent=transcript:MhA1_Contig3426.frz3.gene1

So the forward reading frame starts at 192 and the CDS phase is 0. The actual sequence is

GCATCCAACA ACAACAATTA GAAGTCTTTC CCAGCTCCTC CTCTGCCCCT CAGCAACAAC AATACCCAGC GCAGCAGCTT
CAATTAGTTA CTCCTTTTAT TGCATGCATA GCAGATGAAT TGAGGGAGTT GATAGATGAA ATGCGTATGT TTTAG

which translates to a valid protein only in frame 2(!). This is not
compliant with GFF3 in any interpretation. Turns out for this
particular GFF3 file this is the case only with the *first* ORF on every
contig, and probably a bug of the gene predictor used. None of the
other genes is in the wrong frame.
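
The frame check described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: it is not the bio-gff3 implementation, the names translate and clean_orf? are made up here, and "valid" is taken to mean that the only stop codon is the final codon. The phase is used in the GFF3 sense, i.e. the number of bases to skip before the first complete codon.

```ruby
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = %w[T C A G]
AMINO = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
CODON_TABLE = {}
BASES.product(BASES, BASES).each_with_index do |codon, i|
  CODON_TABLE[codon.join] = AMINO[i]
end

# Translate a DNA string, skipping `phase` bases (0, 1 or 2) first.
# Incomplete trailing codons are ignored; unknown codons become 'X'.
def translate(seq, phase)
  seq[phase..-1].scan(/.../).map { |codon| CODON_TABLE.fetch(codon, 'X') }.join
end

# Assumption for this sketch: an ORF is "clean" when its single stop
# codon is the last codon.
def clean_orf?(protein)
  protein.end_with?('*') && protein.count('*') == 1
end

# The CDS region 192..346 quoted in this message.
CDS = %w[
  GCATCCAACA ACAACAATTA GAAGTCTTTC CCAGCTCCTC CTCTGCCCCT CAGCAACAAC AATACCCAGC GCAGCAGCTT
  CAATTAGTTA CTCCTTTTAT TGCATGCATA GCAGATGAAT TGAGGGAGTT GATAGATGAA ATGCGTATGT TTTAG
].join

(0..2).each do |phase|
  protein = translate(CDS, phase)
  status = clean_orf?(protein) ? 'clean ORF' : 'internal or missing stop'
  puts "phase #{phase}: #{status}"
end
```

With this sequence, only phase 2 passes the check, in line with the observation that the annotated phase 0 cannot be right.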

I have informed Wormbase some time ago, but I don't have the
impression that anyone is interested. You can validate its contents at

  http://www.wormbase.org/db/gb2/gbrowse/m_hapla/?name=id:2258995;dbid=m_hapla:database

I am going to add an option to the GFF3 plugin to test for valid
reading frames, so these files give the expected results. Be good for
validation anyway.

Pj.


From pjotr.public14 at thebird.nl  Sun Jan  2 18:49:58 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sun, 2 Jan 2011 19:49:58 +0100
Subject: [BioRuby] BioRuby and log4r
Message-ID: <20110102184958.GA25699@thebird.nl>

I propose we start using 

  http://log4r.rubyforge.org/manual.html

which has the standard logging features one would expect. I
particularly like the lazy evaluation (deferred block).
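
To illustrate what a deferred block buys, here is a small sketch that uses Ruby's standard-library Logger rather than log4r itself (a deliberate substitution, since the stdlib class has a comparable block form). The message-building lambda is a stand-in for any expensive string construction:

```ruby
require 'logger'
require 'stringio'

out = StringIO.new
log = Logger.new(out)
log.level = Logger::WARN

# Stand-in for an expensive message; `calls` counts how often it is built.
calls = 0
expensive_summary = lambda do
  calls += 1
  'summary of a very large parse'
end

log.debug { expensive_summary.call }  # below the WARN threshold: block is skipped
log.warn  { expensive_summary.call }  # at the threshold: block is evaluated

puts "message built #{calls} time(s)"
```

The point is that disabled log levels cost almost nothing, which matters when a parser emits debug messages for every record.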

Where it falls short, like most other loggers, is in supporting
different usage scenarios. A logger has to behave differently when a tool is
used by:

- developer: fail early and often (on warnings!)
- user: fail on normal error
- library: fail on serious error
- web server: fail on serious error
- fault tolerant system: never fail, try to resume
 
Essentially, I see three or four error handlers.

We can create a default logger for BioRuby = user

But I like to have more options. It would be nice to have several
levels within 'info', 'warn' or 'error', to be displayed/logged on
user needs.

Also, with the plugins we should have standardized switches for CLI
utilities. 

Are we interested in making this core BioRuby, or should I
incorporate it as a bio-plugin? I am thinking of writing a front-end
of log4r.

Pj.


From bonnalraoul at ingm.it  Mon Jan  3 12:14:44 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 3 Jan 2011 13:14:44 +0100
Subject: [BioRuby] Workflows: NGS + miRNA
In-Reply-To: <4D1FD731.4070309@be.to>
References: <2D62D256-4CBD-4793-97F1-37A908C734F6@ingm.it>	<4D10B9AE.2010206@be.to>	<5EB99E14-47EF-4267-99F5-216C16A93426@ingm.it>	<4D12A1CE.4040702@be.to>	<0B6FAF54-370F-4CB3-8752-1E5F80DAC569@ingm.it>
	<4D1FCCB0.2000303@be.to> <4D1FD731.4070309@be.to>
Message-ID: <3D720507-34D5-4A3C-9F2C-A54CB9556E3D@ingm.it>

Thank you very much, I'll read all the refs.

On 02/gen/2011, at 02.38, MISHIMA, Hiroyuki wrote:

> Hi all,
> 
> Addition to my workflow.
> 
> Only miRTRAP requires read alignment generated by Novoalign. Inputs for miRExpress are fastq files and miRExpress clips adapters from fastq files.
> 
> miRExpress is easy and fast. This one is good for first try.
> 
> During using miRExpress, you may find 5'-end variations in mature miRNA reads. These prevent accurate alignment. These may be not artifacts. See Wu et al. PLoS One, 4, p.e7566, http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0007566 ). Clipping 5'-end variations increase alignment-hits.
> 
> MISHIMA, Hiroyuki wrote (2011/01/02 9:54):
>> Dear Raoul and the BioRuby list,
>> 
>> My workflow for miRNA analysis using Illumina GAii is like the followings:
> 
> -- 
> MISHIMA, Hiroyuki, DDS, Ph.D.
> COE Research Fellow
> Department of Human Genetics
> Nagasaki University Graduate School of Biomedical Sciences
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.


From pjotr.public14 at thebird.nl  Fri Jan  7 08:52:21 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 7 Jan 2011 09:52:21 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110102184958.GA25699@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
Message-ID: <20110107085221.GA14735@thebird.nl>

I am creating a plugin 'bio-logger' for sane handling of errors and
exceptions in different situations (log-act):

* Normal user
* Developer
* Web server
* Fault-tolerant systems

For example, a program logs a warning to stdout when run by a user, but
raises an exception when run by a developer.
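
A rough sketch of that idea, built on the standard-library Logger: the class name ActingLogger and the :user / :developer symbols are hypothetical and are not bio-logger's API; they only show how one call site can behave differently per role.

```ruby
require 'logger'
require 'stringio'

# Hypothetical illustration of "log-act": the same warn call either logs
# (user) or raises (developer), depending on how the logger was set up.
class ActingLogger
  def initialize(io, act = :user)
    @log = Logger.new(io)
    @act = act
  end

  def warn(msg)
    raise msg if @act == :developer  # fail early while developing
    @log.warn(msg)                   # otherwise record it and carry on
  end
end

user_log = ActingLogger.new($stdout, :user)
user_log.warn('line 42: malformed record, skipping')

dev_log = ActingLogger.new($stdout, :developer)
begin
  dev_log.warn('line 42: malformed record, skipping')
rescue RuntimeError => e
  puts "developer mode raised: #{e.message}"
end
```

Keeping the decision inside the logger means parser code does not need temporary raise statements while debugging.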

bio-logger builds on log4r functionality, using a more fine-grained
approach for logging errors. I.e. within 'debug', 'info', 'warn' and
'error', an additional value 1..10 can be set to limit output and
logging.

When a program, e.g. gff3-fetch, supports bio-logger switches, the following 
is possible:

  --logger stderr              Add stderr logger (default is stdout)
  --logger filen               Add filename logger
  --trace  debug               Show all messages 
  --trace  warn                Show messages more serious than 'warn'
  --trace  warn:3              Show messages more serious than 'warn' level 3

module overrides:

  --trace  gff3:info:5         Override level for 'gff3' to info level 5
  --trace  blast:debug         Override level for 'blast'
  --trace  blast,gff3:debug    Override level for 'blast' and 'gff3' 
  --trace  stderr:blast:debug  Override level for 'blast' on stderr 

Also behaviour can be changed. This normally happens through library 
calls. There is one command line switch, which changes log-act:

  --log-act Developer          Modify the logger for development

log4r supports rotating logs and remote logging, which will be
available.

Any comments?

Pj.

On Sun, Jan 02, 2011 at 07:49:58PM +0100, Pjotr Prins wrote:
>   http://log4r.rubyforge.org/manual.html


From pjotr.public14 at thebird.nl  Fri Jan  7 15:01:47 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Fri, 7 Jan 2011 16:01:47 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110107085221.GA14735@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
	<20110107085221.GA14735@thebird.nl>
Message-ID: <20110107150147.GA16116@thebird.nl>

bio-logger created. YABP (yet another BioRuby plugin).

  https://github.com/pjotrp/bioruby-logger-plugin

Finally the logger I always wanted to have...

Pj.


From pjotr.public14 at thebird.nl  Sat Jan  8 13:06:12 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Sat, 8 Jan 2011 14:06:12 +0100
Subject: [BioRuby] BioRuby and log4r
In-Reply-To: <20110107150147.GA16116@thebird.nl>
References: <20110102184958.GA25699@thebird.nl>
	<20110107085221.GA14735@thebird.nl>
	<20110107150147.GA16116@thebird.nl>
Message-ID: <20110108130612.GA19929@thebird.nl>

If anyone is interested, the bio-logger plugin is fully functional (I
am using it in the GFF3 plugin):

This is a plugin for nailing down problems with big data parsers,
common in bioinformatics, and sane handling of errors and exceptions
in different situations.

In Bioinformatics the following is a common scenario when dealing with
parsers: Large data files sometimes contain errors. As a user you want
to continue and hope for the best (logging the error). As a developer
you want to see how you can fix the problem. Waiting for a full run
and checking the logs is tedious. The logger can be helpful here, and
avoids sticking temporary solutions in code. Read on...

  https://github.com/pjotrp/bioruby-logger-plugin

I think we should use this throughout BioRuby to get consistent error
handling and logging. No more $stderr.print statements.

Pj.


From bonnalraoul at ingm.it  Mon Jan 10 22:06:48 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 10 Jan 2011 23:06:48 +0100
Subject: [BioRuby] biogem and options
Message-ID: <E7D2D648-688A-4EB5-AE34-1ED2D3C04B1F@ingm.it>

Hi all,
I have updated the github repo with some requests from Pjotr.
It is now possible to create the bin, db and test/data directories, if needed, from the command line:

biogem --with-bin --with-db --with-test-data yourprojectname

NOTE 1: the older 'data' directory is now 'db', which is more compliant with Rails. Why? 'data' comes from R packages, but we are used to storing databases, or looking for them, in a db directory.
NOTE 2: README updated.

About rspec and cucumber: jeweler already has those options.

Type 'biogem -h' and you'll get the help.

This gem is not yet available on rubygems; I need to implement the templates requested by Toshiaki and Pjotr.

I'm refactoring the code so there are some variations in the original tree.

I "hope", by the end of the week, to provide templates files too.

--
R.J.P.B.


From pjotr.public14 at thebird.nl  Tue Jan 11 06:38:34 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Tue, 11 Jan 2011 07:38:34 +0100
Subject: [BioRuby] biogem and options
In-Reply-To: <E7D2D648-688A-4EB5-AE34-1ED2D3C04B1F@ingm.it>
References: <E7D2D648-688A-4EB5-AE34-1ED2D3C04B1F@ingm.it>
Message-ID: <20110111063834.GA2409@thebird.nl>

Super!

On Mon, Jan 10, 2011 at 11:06:48PM +0100, Raoul Bonnal wrote:
> Hi all,
> I have updated the github repo with some requests from Pjotr.
> Now is possible to create bin, db and test/data directory if needed from the command line 
> 
> biogem --with-bin --with-bd --with-test-data youprojectname
> 
> NOTE 1: older 'data' directory is now 'db' more compliant with rails. Why ? data cames from R packages but we are used to store database or look for databases in db directory.
> NOTE 2: README updated.
> 
> about rspec and cucumber jeweler already has those options.
> 
> type 'biogem -h' and you'll get the help.
> 
> This gem is not yet available on rubygems I need to implement the templates requested from Toshiaki and Pjotr.
> 
> I'm refactoring the code so there are some variations in the original tree.
> 
> I "hope", by the end of the week, to provide templates files too.
> 
> --
> R.J.P.B.
> 
> 
> 
> 
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From ktym at hgc.jp  Tue Jan 11 10:47:55 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Tue, 11 Jan 2011 19:47:55 +0900
Subject: [BioRuby] biogem and options
In-Reply-To: <20110111063834.GA2409@thebird.nl>
References: <E7D2D648-688A-4EB5-AE34-1ED2D3C04B1F@ingm.it>
	<20110111063834.GA2409@thebird.nl>
Message-ID: <AC955B15-8A62-4DCF-AE0E-58834312F26A@hgc.jp>

Raoul,

http://twitter.com/#!/ilpuccio/status/24766316493672448
> @tktym could you point me to some example of what you mena, please?
> "provide a recommended template for rdoc, require lines, and class def"

In my example plugin,

  https://github.com/ktym/bioruby-hello/blob/master/lib/bio-hello.rb

I used a style similar to that of the BioRuby core library,
which is described in

  https://github.com/bioruby/bioruby/blob/master/README_DEV.rdoc

but I'm not sure what the best practice for plugins is.
It might be better to include the documentation in the README file instead.

In either case, what I have in mind is to auto-generate a plugin description
from those embedded descriptions for the "plugin showcase" which will be
available somewhere on the bioruby.org site in the future.

For that purpose, we may also want to have some flags indicating:

* status of the plugin (stable, usable, buggy, just started etc.)
* the plugin will override the BioRuby core or just provide new features harmlessly
* pre-requirements (especially, other than gems)

etc. etc.

Here's a material for further discussion (example template):

#
# = Bio::XXX - BioRuby plugin for XXX
#
# Copyright::  Copyright (C) 2001, 2003-2005 Bio R. Hacker <brh at example.org>,
# Copyright::  Copyright (C) 2006 Chem R. Hacker <crh at example.org>
# License::    The Ruby License
# Site:        http://github.com/user/bioruby-xxx
#
# == Description
#
# This plugin provides an interface for the XXX database.
#
# == Usage
#
# Lorem ipsum dolor sit amet, consectetur adipisicing elit, ....
#
# == Effects (Overrides?)
#
# * Modify the behavior of Bio::Sequence::NA#translate destructively
# * Add methods to the Bio::DB class
#
# == Depends (Requirements?)
#
# * External MySQL database system
# * RubyGem package 'foobar'
#
# == References
#
# * Hoge F. et al., The XXX database, Nucleic. Acid. Res. 123:100--123 (2030)
# * http://hoge.db/
#

# Do we need these two lines in every BioRuby plugin?
require 'rubygems'
require 'bio'

# Do we allow classes defined outside of the 'Bio' namespace?
module Bio
  class XXX
    # :
  end # XXX
end # Bio


Thanks,
Toshiaki

On 2011/01/11, at 15:38, Pjotr Prins wrote:

> Super!
> 
> On Mon, Jan 10, 2011 at 11:06:48PM +0100, Raoul Bonnal wrote:
>> Hi all,
>> I have updated the github repo with some requests from Pjotr.
>> Now is possible to create bin, db and test/data directory if needed from the command line 
>> 
>> biogem --with-bin --with-bd --with-test-data youprojectname
>> 
>> NOTE 1: older 'data' directory is now 'db' more compliant with rails. Why ? data cames from R packages but we are used to store database or look for databases in db directory.
>> NOTE 2: README updated.
>> 
>> about rspec and cucumber jeweler already has those options.
>> 
>> type 'biogem -h' and you'll get the help.
>> 
>> This gem is not yet available on rubygems I need to implement the templates requested from Toshiaki and Pjotr.
>> 
>> I'm refactoring the code so there are some variations in the original tree.
>> 
>> I "hope", by the end of the week, to provide templates files too.
>> 
>> --
>> R.J.P.B.
>> 
>> 
>> 
>> 
>> 
>> _______________________________________________
>> BioRuby Project - http://www.bioruby.org/
>> BioRuby mailing list
>> BioRuby at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioruby
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From chmille4 at gmail.com  Wed Jan 12 18:37:19 2011
From: chmille4 at gmail.com (Chase Miller)
Date: Wed, 12 Jan 2011 13:37:19 -0500
Subject: [BioRuby] bio-assembly
Message-ID: <AANLkTikAm4z=28Cm=dfP_crB+g6xwVvuN-qnjm1ALtPK@mail.gmail.com>

Hi All,

Quick update on the bio-assembly plugin.

Francesco has added support for CAF files.  According to his preliminary
tests it can handle a 27k contig 454 file in about a minute.  He also
improved the performance overall so now the ace parser can process a 70 mb
file in about 10 seconds. Nice work!

If there are any requests for parsers or functionality, let us know.

source code: https://github.com/chmille4/bioruby-assembly

usage: https://github.com/chmille4/bioruby-assembly#readme

gem: https://rubygems.org/gems/bio-assembly


Cheers
Chase


From bonnalraoul at ingm.it  Thu Jan 13 09:30:03 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 13 Jan 2011 10:30:03 +0100
Subject: [BioRuby] bio-assembly
In-Reply-To: <AANLkTikAm4z=28Cm=dfP_crB+g6xwVvuN-qnjm1ALtPK@mail.gmail.com>
References: <AANLkTikAm4z=28Cm=dfP_crB+g6xwVvuN-qnjm1ALtPK@mail.gmail.com>
Message-ID: <8EB2ADDB-7137-4C8E-AB70-C5574A797886@ingm.it>

Hi,
great work guys.

I have updated the Plugins page, http://bioruby.open-bio.org/wiki/Plugins#On_Development_Plugins , which lists and summarizes the current state of the plugins.

Please let me know if there is something wrong.

In my mind, Planned plugins are just ideas not yet coded.

The others are under ongoing development. I tried to list the plugins in order of creation.

@Jan: Do you plan to release the Ensembl API as a plugin? I think it's just a matter of renaming the gem.
@George: To avoid problems, please yank isoelectric_point from rubygems.

I didn't receive any reply from Ricardo H. Ramírez-Gonzalez about samtools-ruby-ffi.

Do you think that a separate page would be better? I think so; you?

Ciao.

On 12/gen/2011, at 19.37, Chase Miller wrote:

> Hi All,
> 
> Quick update on the bio-assembly plugin.
> 
> Francesco has added support for CAF files.  According to his preliminary
> tests it can handle a 27k contig 454 file in about a minute.  He also
> improved the performance overall so now the ace parser can process a 70 mb
> file in about 10 seconds. Nice work!
> 
> If there are any requests for parsers or functionality, let us know.
> 
> source code: https://github.com/chmille4/bioruby-assembly
> 
> <https://github.com/chmille4/bioruby-assembly>usage:
> https://github.com/chmille4/bioruby-assembly#readme
> 
> gem: https://rubygems.org/gems/bio-assembly
> 
> 
> Cheers
> Chase
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

--
R.J.P.B.


From yannick.wurm at unil.ch  Sun Jan 16 11:57:04 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Sun, 16 Jan 2011 18:57:04 +0700
Subject: [BioRuby] trees
Message-ID: <A545268E-7939-49C9-9784-7D5C224D9129@unil.ch>

is a specific person "responsible" for coordinating the wiki?

the following page is largely misleading (contains tons of ruby code):
http://bioruby.open-bio.org/wiki/HOWTO:Trees

cheers,
yannick


-------------------------
 Ant Genomes & Evolution 
http://yannick.poulet.org
   skype://yannickwurm


From ngoto at gen-info.osaka-u.ac.jp  Mon Jan 17 05:34:48 2011
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Mon, 17 Jan 2011 14:34:48 +0900
Subject: [BioRuby] trees
In-Reply-To: <A545268E-7939-49C9-9784-7D5C224D9129@unil.ch>
References: <A545268E-7939-49C9-9784-7D5C224D9129@unil.ch>
Message-ID: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>

On Sun, 16 Jan 2011 18:57:04 +0700
Yannick Wurm <yannick.wurm at unil.ch> wrote:

> is a specific person "responsible" for coordinating the wiki?
> 
> the following page is largely misleading (contains tons of ruby code):
> http://bioruby.open-bio.org/wiki/HOWTO:Trees

The page is a trial to translate BioPerl HowTOs from Perl to Ruby,
but is still left unfinished. See the discussion:
http://bioruby.open-bio.org/wiki/Talk:HOWTOs

One of the reasons why the trial stalled is that the differences between
BioPerl and BioRuby are larger than we expected.

Writing original BioRuby documentation was also discussed in the
Talk:HOWTOs page, but that stalled too.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp


From pjotr.public14 at thebird.nl  Mon Jan 17 08:47:05 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 17 Jan 2011 09:47:05 +0100
Subject: [BioRuby] bio-logger release 0.9.0
Message-ID: <20110117084705.GA5136@thebird.nl>

Just released bio-logger 0.9.0. The most important feature I added is that
you can inject a filter on log messages (by module). E.g. for the
blast logger you could show only messages relating to one contig:

   log = LoggerPlus['blast']
   log.filter { | level, sub_level, msg | msg =~ /contig1133/ }

on the command line you can do the same with:

   --trace "blast:= msg =~ /contig1133/"

another option is to filter on level and sub_level values:

   log.filter { | level, sub_level, msg | sub_level == 3 or level <= ERROR }

providing lots of possibilities. Obviously much of this can be
handled by (multi)grep'ing log files, but the power of using Ruby and
filter combinations makes it a great feature for debugging big data
problems. And you can limit the size of log files, without limiting
expressive power.
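
For readers who want to see the mechanism, a block-based filter of this kind can be sketched in a few lines of plain Ruby. The class FilteredLog and its methods are hypothetical and are not how bio-logger is implemented; the sketch only shows a predicate block deciding which messages are kept.

```ruby
# Hypothetical sketch: a log that keeps a message only when the installed
# filter block returns a truthy value for (level, sub_level, msg).
class FilteredLog
  attr_reader :lines

  def initialize
    @filter = nil
    @lines = []
  end

  # Install the predicate block.
  def filter(&block)
    @filter = block
  end

  def log(level, sub_level, msg)
    return if @filter && !@filter.call(level, sub_level, msg)
    @lines << "#{level}:#{sub_level} #{msg}"
  end
end

log = FilteredLog.new
log.filter { |level, sub_level, msg| msg =~ /contig1133/ }
log.log(:info, 1, 'reading contig1133')
log.log(:info, 1, 'reading contig2000')
puts log.lines
```

Because the filter is ordinary Ruby, conditions on level, sub-level and message text can be combined freely.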

Pj.


From pjotr.public14 at thebird.nl  Mon Jan 17 10:08:12 2011
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Mon, 17 Jan 2011 11:08:12 +0100
Subject: [BioRuby] Bio-gff3 plugin 0.8.6
Message-ID: <20110117100812.GA6947@thebird.nl>

Released bio-gff3 parser plugin 0.8.6 on rubygems. It can be used
from the command line, e.g.

  gem install bio-gff3
  gff3-fetch --help

Introduced LRU cache, replaced the BioRuby GFF line parser and
added lazy parsing. All with significant speedups compared to the
original (No-cache, BioRuby parser, non-lazy).

The LRU version has limited RAM use for any sized data (730MB), and
currently runs 6 times slower than the full memory version.
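
As background, a least-recently-used cache can be sketched very compactly in Ruby 1.9+, because Hash preserves insertion order. This is a generic illustration with a made-up class name (LRUCache), not the cache used inside bio-gff3:

```ruby
# Minimal LRU cache: the Hash keeps entries in insertion order, so the
# first entry is always the least recently used one.
class LRUCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}
  end

  # Reading a key re-inserts it, marking it as most recently used.
  def [](key)
    return nil unless @store.key?(key)
    value = @store.delete(key)
    @store[key] = value
  end

  # Writing a key may evict the least recently used entry.
  def []=(key, value)
    @store.delete(key)
    @store[key] = value
    @store.shift if @store.size > @max_size
    value
  end

  def key?(key)
    @store.key?(key)
  end

  def size
    @store.size
  end
end

cache = LRUCache.new(2)
cache[:a] = 'record a'
cache[:b] = 'record b'
cache[:a]                 # touch :a, so :b becomes the eviction candidate
cache[:c] = 'record c'    # exceeds max_size, evicts :b
p cache.key?(:b)
```

The trade-off is the one visible in the timings: memory stays bounded by the cache size, at the price of re-reading records that have been evicted.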

  Digesting parser:

  Cache              real     user     sys  version     RAM
  ------------------------------------------------------------
  full,bioruby       12m41    12m28    0m09 (0.8.0)
  full,line          12m13    12m06    0m07 (0.8.5)
  full,line,lazy     11m51    11m43    0m07 (0.8.6)     6,600M

  none,bioruby      504m     477m     26m50 (0.8.0)
  none,line         297m     267m     28m36 (0.8.5)       
  none,line,lazy    132m     106m     26m01 (0.8.6)       650M

  lru,bioruby       533m     510m     22m47 (0.8.5)
  lru,line          353m     326m     26m44 (0.8.5)  1K
  lru,line          305m     281m     22m30 (0.8.5) 10K
  lru,line,lazy     182m     161m     21m10 (0.8.6) 10K
  lru,line,lazy      75m      75m      0m17 (0.8.6) 50K   730M
  ------------------------------------------------------------

where

   52M  m_hapla.WS217.dna.fa
  456M  m_hapla.WS217.gff3

ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-linux]
on a machine with a 64-bit 2.6 GHz CPU (6MB cache) and 16 GB RAM.

Note bio-gff3 0.8.6 is a fully digesting parser, with scope for full
validation of the GFF3 relations. The next step, a limited
'optimistic' digestion, will speed things up.

Note also that bio-gff3 exploits the bio-logger plugin - it is a good 
example.

Pj.


From yannick.wurm at unil.ch  Tue Jan 18 07:55:15 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Tue, 18 Jan 2011 14:55:15 +0700
Subject: [BioRuby] trees
In-Reply-To: <20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>
References: <A545268E-7939-49C9-9784-7D5C224D9129@unil.ch>
	<20110117053449.4C7051CBC414@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <C31A2387-A24B-4A20-80F8-237DEEA83A09@unil.ch>

Thanks for the details Naohisa-san

May I suggest we try to "hide from Google" things that are not finalized, and links to non-existent documents?
(I have the feeling it may be better to have nothing than to create confusion?)


On 17 Jan 2011, at 12:34, Naohisa GOTO wrote:

> On Sun, 16 Jan 2011 18:57:04 +0700
> Yannick Wurm <yannick.wurm at unil.ch> wrote:
> 
>> is a specific person "responsible" for coordinating the wiki?
>> 
>> the following page is largely misleading (contains tons of ruby code):
>> http://bioruby.open-bio.org/wiki/HOWTO:Trees
> 
> The page is an attempt to translate the BioPerl HOWTOs from Perl to Ruby,
> but it is still unfinished. See the discussion:
> http://bioruby.open-bio.org/wiki/Talk:HOWTOs
> 
> One of the reasons the trial stalled is that the differences between
> BioPerl and BioRuby are larger than we expected.
> 
> On the Talk:HOWTOs page, writing original BioRuby documentation
> was also discussed, but that stalled too.
> 
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp


From yannick.wurm at unil.ch  Tue Jan 18 08:56:26 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Tue, 18 Jan 2011 15:56:26 +0700
Subject: [BioRuby] Rake
In-Reply-To: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
Message-ID: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>

Dear List,

I had a few issues setting up a Rakefile 2-3 weeks ago. Thanks to Hiro-san and some of the googleable tutorials, things are now working.

It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" and goes through the following steps for tree building:
 - translation
 - multiple alignment (mafft)
 - gblocks to remove crap 
 - tree building (phyml)
AND
 - codon-level alignment: reverse translated from protein multiple alignment (pal2nal)
 - gblocks to remove crap
 - tree building (phyml)

https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile


It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere out of the box. But FWIW maybe it can be useful to a random googler.

However, it feels quite clunky, so I think I should do things differently in the future. If you have any comments or suggestions, I'd be most happy to hear them. 

Cheers,

yannick 

-------------------------
 Ant Genomes & Evolution 
http://yannick.poulet.org
   skype://yannickwurm


From mail at michaelbarton.me.uk  Tue Jan 18 15:17:08 2011
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Tue, 18 Jan 2011 10:17:08 -0500
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
	<3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID: <20110118151708.GB3430@nku069218.hh.nku.edu>

Hi Yannick,

I think it's a great idea to generate predefined pipelines for common
bioinformatics tasks. I experimented with a tool called Boson six months ago.
It could be worth looking if you feel like investing more time into your
pipeline.

Boson commands, similar to rake tasks, are more modular and can be installed
from the web into a ~/.boson directory. This has obvious advantages over
a single rake file. Boson tasks can be chained together where the data is
passed around in YAML format.

The github link is - https://github.com/cldwalker/boson

Cheers

Michael Barton


On Tue, Jan 18, 2011 at 03:56:26PM +0700, Yannick Wurm wrote:
> Dear List,
> 
> I had a few issues setting up a Rakefile 2-3 weeks ago. Thanks to Hiro-san
> and some of the googleable tutorials, things are now working.
> 
> It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta"
> and goes through the following steps for tree building:
>  - translation
>  - multiple alignment (mafft)
>  - gblocks to remove crap
>  - tree building (phyml)
> AND
>  - codon-level alignment: reverse translated from protein multiple
>    alignment (pal2nal)
>  - gblocks to remove crap
>  - tree building (phyml)
> 
> https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile
> 
> 
> It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere
> out of the box. But FWIW maybe it can be useful to a random googler.
> 
> However, it feels quite clunky, so I think I should do things differently in
> the future. If you have any comments or suggestions, I'd be most happy to
> hear them. 
> 
> Cheers,
> 
> yannick 
> 
> -------------------------
>  Ant Genomes & Evolution
> http://yannick.poulet.org
>    skype://yannickwurm
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From diapriid at gmail.com  Tue Jan 18 15:21:36 2011
From: diapriid at gmail.com (Matt)
Date: Tue, 18 Jan 2011 10:21:36 -0500
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
	<3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID: <AANLkTinGavos9pLmz6cCHW8FQTZZoS+tJ7PkwxT4QRrw@mail.gmail.com>

Yannick-

I like it.  It might be nice to extend your pipeline in a generic
manner (SimpleAnalysisPipeline). Just a couple of steps that would be
extensible/swappable to different software.  The generic pipeline
would "just work" given a minimal local configuration (I like your
starting point).

Swappable/configurable steps might be

Pre-process (trim / quality filters?)
Alignment   (align)
Post alignment (gblocks)
Translation   (to Nexus)
Analysis      (Phyml)

The idea is that we could swap in components (TNT or RAxML for Phyml,
Muscle for MAFFT etc.)- but also that the pipeline remains "simple".
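
A rough sketch of what such swappable steps could look like in plain Ruby is
below. The class name follows the "SimpleAnalysisPipeline" suggestion above;
the step names, the with/run methods and the placeholder lambdas are all
assumptions for illustration. A real step would shell out to MAFFT, Gblocks,
PhyML and so on instead of appending a label.

```ruby
# Hypothetical sketch: each step is any callable; unconfigured steps pass data through.
class SimpleAnalysisPipeline
  STEP_ORDER = [:preprocess, :align, :post_align, :translate, :analyse]

  def initialize(steps = {})
    @steps = steps
  end

  # Return a new pipeline with one step swapped for another callable.
  def with(name, callable)
    raise ArgumentError, "unknown step #{name}" unless STEP_ORDER.include?(name)
    SimpleAnalysisPipeline.new(@steps.merge(name => callable))
  end

  # Feed the input through the configured steps in a fixed order.
  def run(input)
    STEP_ORDER.inject(input) do |data, name|
      step = @steps[name]
      step ? step.call(data) : data
    end
  end
end

# Placeholder steps that only record which tool would have been run.
default = SimpleAnalysisPipeline.new(
  :align   => lambda { |done| done + ["mafft"] },
  :analyse => lambda { |done| done + ["phyml"] }
)
swapped = default.with(:align, lambda { |done| done + ["muscle"] })

p default.run([])
p swapped.run([])
```

Keeping the step order fixed and only the implementations configurable is one
way to keep the pipeline "simple" while still allowing Muscle for MAFFT, etc.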

If I find some time I'd like to work on my first attempt at a BioRuby
Plugin, a wrapper for TNT (hopefully tied in to the analysis bit
above).

cheers,
Matt


On Tue, Jan 18, 2011 at 3:56 AM, Yannick Wurm <yannick.wurm at unil.ch> wrote:
> Dear List,
>
> I had a few issues setting up a Rakefile 2-3 weeks ago. Thanks to Hiro-san and some of the googleable tutorials, things are now working.
>
> It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" and goes through the following steps for tree building:
>  - translation
>  - multiple alignment (mafft)
>  - gblocks to remove crap
>  - tree building (phyml)
> AND
>  - codon-level alignment: reverse translated from protein multiple alignment (pal2nal)
>  - gblocks to remove crap
>  - tree building (phyml)
>
> https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile
>
>
> It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere out of the box. But FWIW maybe it can be useful to a random googler.
>
> However, it feels quite clunky, so I think I should do things differently in the future. If you have any comments or suggestions, I'd be most happy to hear them.
>
> Cheers,
>
> yannick
>
> -------------------------
>  Ant Genomes & Evolution
> http://yannick.poulet.org
>    skype://yannickwurm


From francesco.strozzi at gmail.com  Tue Jan 18 20:55:20 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Tue, 18 Jan 2011 21:55:20 +0100
Subject: [BioRuby] BioRuby HTSeq-like
Message-ID: <AANLkTinWN5ZDRkeJWmVoCbMQzPZa72_yMwBN+eg0eoMb@mail.gmail.com>

Hi BioRuby people,
just wondering if something similar to HTSeq exists for BioRuby (it is a
Python package for working with and manipulating next-gen sequencing data):

http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html

Many features could be implemented or are already available for
BioRuby....these are the basics:
- Getting statistical summaries about the base-call quality scores to
study the data quality.
- Calculating a coverage vector and exporting it for visualization in
a genome browser.
- Reading in annotation data from a GFF file.
- Assigning aligned reads from RNA-Seq experiments to exons and genes.
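
As a small illustration of the second item, a coverage vector can be computed
from aligned intervals with a few lines of plain Ruby. This is only a sketch:
the function name and the interval convention (0-based start, exclusive end)
are assumptions made here, not HTSeq's or BioRuby's API.

```ruby
# Per-base coverage for a reference of the given length.
# alignments: [start, stop] pairs, 0-based, stop exclusive (assumed convention).
def coverage_vector(length, alignments)
  cov = Array.new(length, 0)
  alignments.each do |start, stop|
    first = [start, 0].max        # clip intervals to the reference
    last  = [stop, length].min
    (first...last).each { |i| cov[i] += 1 }
  end
  cov
end

reads = [[0, 4], [2, 6], [5, 8]]
cov = coverage_vector(8, reads)
p cov
```

A vector like this could then be written out in a browser-friendly format
such as wiggle or bedGraph.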

In particular, the plotting functions for exploring and assessing data
quality seem very interesting.
If nothing similar exists for BioRuby, I think we should discuss
coding a BioRuby "NextGenSequencing" plugin, to provide the same
functionality and also to add something new as well....

What do you think?

Cheers
--

Francesco


From yannick.wurm at unil.ch  Thu Jan 20 04:08:55 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Thu, 20 Jan 2011 11:08:55 +0700
Subject: [BioRuby] Rake
In-Reply-To: <mailman.7.1295370009.1693.bioruby@lists.open-bio.org>
References: <mailman.7.1295370009.1693.bioruby@lists.open-bio.org>
Message-ID: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>

Hello & thanks for the comments.

Matt wrote: 
> Swappable/configurable steps might be
> 
> Pre-process (trim / quality filters?)
> Alignment   (align)
> Post alignment (gblocks)
> Translation   (to Nexus)
> Analysis      (Phyml)
> 
> The idea is that we could swap in components (TNT or RAxML for Phyml,
> Muscle for MAFFT etc.)- but also that the pipeline remains "simple".

Yes, that's what I would ideally want (as well as being able to easily modify the run options of the programs). How would you go about generalizing this?
Right now I'm basing "what to do" on the file extensions I provide... which limits me based on the file extensions...


Michael wrote:
> I think it's a great idea to generate predefined pipelines for common
> bioinformatics tasks. I experimented with a tool called Boson six months ago.
> It could be worth looking if you feel like investing more time into your
> pipeline.
> 
> Boson commands, similar to rake tasks, are more modular and can be installed
> from the web into a ~/.boson directory. This has obvious advantages over
> a single rake file. Boson tasks can be chained together where the data is
> passed around in YAML format.
> 
> The github link is - https://github.com/cldwalker/boson

I haven't looked thoroughly yet, but at least superficially, Boson looks real cool.

However, I'm a bit scared of investing energy into technologies that are too new. Boson has only one developer who may or may not keep his project alive over the next years. Time I invest in learning something today ... must continue to help improve my productivity over the next 5 or 10 years by still being reusable in 5 or 10 years (with as few modifications as possible). There is uncertainty to everything, but something like Boson does seem a bit too risky right now...

Cheers,

yannick


-------------------------
 Ant Genomes & Evolution 
http://yannick.poulet.org
   skype://yannickwurm


From bonnalraoul at ingm.it  Thu Jan 20 09:13:20 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 20 Jan 2011 10:13:20 +0100
Subject: [BioRuby] Rake
In-Reply-To: <3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
	<3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
Message-ID: <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>

Dear Yannick,
Rake is usually used inside a project directory to provide common operations for the project.
What is the idea behind "rake for bioinformatics"? I mean, you have to copy your Rakefile to wherever your data are.

Looking at the code, there are a lot of dependencies on command-line software, and that could be a maintainability problem.
What about spending some energy on wrapping those commands in BioRuby classes?
That way those applications could be available to other scripts.

If you want to keep the rake approach we should find a way to not replicate rakefiles.
One idea could be to create a rakefile in your working directory, similar to Rails:


# Add your own tasks in files placed in lib/tasks ending in .rake,
# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.

require File.expand_path('../config/application', __FILE__)
require 'rake'

# The user just needs to add the tasks he wants:
Bio::SomeName.load_tasks
Bio::SomeOtherName.load_tasks
Bio::AnotherName.load_tasks
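
To give an idea of what a load_tasks method might look like on the plugin
side, here is a rough sketch. Bio::SomeName is the placeholder name from the
Rakefile above, and the task name and its body are made up; the sketch
assumes a Rake version that provides Rake::DSL (0.9 or later).

```ruby
require 'rake'

module Bio
  module SomeName              # placeholder plugin module, as in the Rakefile above
    extend Rake::DSL           # makes task/namespace/desc available here

    # Define this plugin's tasks in the Rake application of the caller.
    def self.load_tasks
      namespace :some_name do
        desc "Placeholder task; a real plugin would wrap a tool here"
        task :align do
          puts "would run the aligner here"
        end
      end
    end
  end
end

Bio::SomeName.load_tasks
puts Rake::Task.task_defined?("some_name:align")
```

With something like this, the Rakefile in the working directory stays a
couple of lines long and the task definitions live in the installed plugins.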


On 18/gen/2011, at 09.56, Yannick Wurm wrote:

> Dear List,
> 
> I had a few issues setting up a Rakefile 2-3 weeks ago. Thanks to Hiro-san and some of the googleable tutorials, things are now working.
> 
> It is supposed to be a mini-pipeline that takes a file called "cdsSeq.fasta" and goes through the following steps for tree building:
> - translation
> - multiple alignment (mafft)
> - gblocks to remove crap 
> - tree building (phyml)
> AND
> - codon-level alignment: reverse translated from protein multiple alignment (pal2nal)
> - gblocks to remove crap
> - tree building (phyml)
> 
> https://github.com/yannickwurm/tidbits/blob/master/cdsToAlignmentToTree/Rakefile
> 
> 
> It depends on a bunch of stuff in my $PATH, so probably won't run elsewhere out of the box. But FWIW maybe it can be useful to a random googler.
> 
> However, it feels quite clunky, so I think I should do things differently in the future. If you have any comments or suggestions, I'd be most happy to hear them. 
> 
> Cheers,
> 
> yannick 
> 
> -------------------------
> Ant Genomes & Evolution 
> http://yannick.poulet.org
>   skype://yannickwurm

--
R.J.P.B.


From bonnalraoul at ingm.it  Thu Jan 20 10:35:58 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 20 Jan 2011 11:35:58 +0100
Subject: [BioRuby] BioRuby HTSeq-like
In-Reply-To: <AANLkTinWN5ZDRkeJWmVoCbMQzPZa72_yMwBN+eg0eoMb@mail.gmail.com>
References: <AANLkTinWN5ZDRkeJWmVoCbMQzPZa72_yMwBN+eg0eoMb@mail.gmail.com>
Message-ID: <EA3E3FFC-7BD8-43DE-806F-CA873089E23D@ingm.it>

Hi folks,
Yesterday I met Francesco in my lab, and it was a wonderful opportunity to exchange ideas and thoughts.


About Francesco's mail: I think we could take inspiration from Galaxy/BioPython (http://main.g2.bx.psu.edu/); they did very good work wrapping the common software for crunching NGS data.

So my input is, let's start wrapping them and possibly opening a bioruby-ngs project on github:
https://github.com/helios/bioruby-ngs (just the repo :-))


Reading around http://seqanswers.com/forums/showthread.php?t=2461, there is sometimes a need to split and distribute the computation.
There are different possibilities, but splitting the FASTQ file while also enabling multithreading seems to be the best option; if you have suggestions, please comment.
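
As a starting point for discussion, splitting a FASTQ stream into chunks of
records can be sketched like this in plain Ruby. The function name is made up
and the sketch assumes strict 4-line FASTQ records (no wrapped sequence
lines); each chunk could then be handed to a separate process or cluster job.

```ruby
require 'stringio'

# Yield arrays of up to chunk_size FASTQ records.
# Assumes strict 4-line records (header, sequence, '+', qualities).
def each_fastq_chunk(io, chunk_size)
  chunk = []
  io.each_slice(4) do |record_lines|
    chunk << record_lines.join
    if chunk.size == chunk_size
      yield chunk
      chunk = []
    end
  end
  yield chunk unless chunk.empty?   # last, possibly smaller, chunk
end

fastq = StringIO.new(<<EOS)
@r1
ACGT
+
IIII
@r2
TTGA
+
IIII
@r3
GGCC
+
IIII
EOS

chunks = []
each_fastq_chunk(fastq, 2) { |records| chunks << records }
p chunks.map { |records| records.size }
```

For real data the StringIO would be replaced by File.open on the FASTQ file,
and each chunk written to its own file before being aligned.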

Thanks to Goto-san, FASTQ support is available.
Thanks to Pjotr, GFF3 support is available.
Thanks to Chase and Francesco, CAF and ACE support is available.
For plotting, as we said, one possibility is http://rubyvis.rubyforge.org/ from Claudio Bustos, but if you have better alternatives... please discuss.
About statistics, please join http://groups.google.com/group/sciruby-dev

Having these tools in our arsenal is useful and strategic for funding.

I would say 
+1

PS: Please clone and add your name to the list of authors if you want to join this project.

PS: if someone is using SGE what do you think about http://gridengine.info/2010/12/24/goodbye-grid-engine ?


On 18/gen/2011, at 21.55, Francesco Strozzi wrote:

> Hi BioRuby people,
> just wondering if something similar to HTSeq exists for BioRuby (it is a
> Python package for working with and manipulating next-gen sequencing data):
> 
> http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
> 
> Many features could be implemented or are already available for
> BioRuby....these are the basics:
> - Getting statistical summaries about the base-call quality scores to
> study the data quality.
> - Calculating a coverage vector and exporting it for visualization in
> a genome browser.
> - Reading in annotation data from a GFF file.
> - Assigning aligned reads from RNA-Seq experiments to exons and genes.
> 
> In particular, the plotting functions for exploring and assessing data
> quality seem very interesting.
> If nothing similar exists for BioRuby, I think we should discuss
> coding a BioRuby "NextGenSequencing" plugin, to provide the same
> functionality and also to add something new as well....
> 
> What do you think?
> 
> Cheers
> --
> 
> Francesco

--
R.J.P.B.


From francesco.strozzi at gmail.com  Thu Jan 20 11:27:43 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Thu, 20 Jan 2011 12:27:43 +0100
Subject: [BioRuby] BioRuby HTSeq-like
In-Reply-To: <EA3E3FFC-7BD8-43DE-806F-CA873089E23D@ingm.it>
References: <AANLkTinWN5ZDRkeJWmVoCbMQzPZa72_yMwBN+eg0eoMb@mail.gmail.com>
	<EA3E3FFC-7BD8-43DE-806F-CA873089E23D@ingm.it>
Message-ID: <AANLkTi=4d7CJCe8eQ2AeEq8DfFpqFJNWWh=v+3eAkfn4@mail.gmail.com>

Today is Thursday (BioRuby IRC day), so I will try to join the #bioruby
channel this afternoon (CET). If someone else is there, we could
discuss this plugin and new ideas.


> PS: Please clone and add your name to the list of authors if you want to
> join this project.

Done! I'm in!


-- 

Francesco


From mail at michaelbarton.me.uk  Thu Jan 20 20:57:26 2011
From: mail at michaelbarton.me.uk (Michael Barton)
Date: Thu, 20 Jan 2011 15:57:26 -0500
Subject: [BioRuby] Rake
In-Reply-To: <7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>
References: <mailman.7.1295370009.1693.bioruby@lists.open-bio.org>
	<7B61BC01-98ED-414F-90C0-A942144D9C08@unil.ch>
Message-ID: <20110120205726.GD245@Michael-Bartons-MacBook.local>

Yannick, you make an excellent point about the long-term stability of Boson.
The Ruby community, myself often guilty of this, is quick to jump on a new gem,
which may or may not last into the future. An example of this is the Less gem
for compiling CSS, which has seen some recent popularity. I believe the developer
has said he will no longer maintain it.

Another option could be Thor. I believe this is also aimed at being a more
modular rake-like tool. It is developed by Yehuda Katz and I think it is used
as the basis of a few mainstream Ruby command-line tools (possibly the Rails 3
CLI? I'm not 100% sure about this). I think you could expect Thor to be more mature
and more likely to be continually developed. If you can find the episode of the
ChangeLog with Yehuda, you can hear him discuss it.

On Thu, Jan 20, 2011 at 11:08:55AM +0700, Yannick Wurm wrote:
> Hello & thanks for the comments.
> 
> Matt wrote: 
> > Swappable/configurable steps might be
> > 
> > Pre-process (trim / quality filters?)
> > Alignment   (align)
> > Post alignment (gblocks)
> > Translation   (to Nexus)
> > Analysis      (Phyml)
> > 
> > The idea is that we could swap in components (TNT or RAxML for Phyml,
> > Muscle for MAFFT etc.)- but also that the pipeline remains "simple".
> 
> Yes, that's what I would ideally want (as well as being able to easily modify
> the run options of the programs). How would you go about generalizing this?
> Right now I'm basing "what to do" on the file extensions I provide... which
> limits me based on the file extensions...
> 
> 
> 
> Michael wrote:
> > I think it's a great idea to generate predefined pipelines for common
> > bioinformatics tasks. I experimented with a tool called Boson six months
> > ago.  It could be worth looking if you feel like investing more time into
> > your pipeline.
> > 
> > Boson commands, similar to rake tasks, are more modular and can be
> > installed from the web into a ~/.boson directory. This has obvious
> > advantages over a single rake file. Boson tasks can be chained together
> > where the data is passed around in YAML format.
> > 
> > The github link is - https://github.com/cldwalker/boson
> 
> I haven't looked thoroughly yet, but at least superficially, Boson looks real
> cool.
> 
> However, I'm a bit scared of investing energy into technologies that are too
> new. Boson has only one developer who may or may not keep his project alive
> over the next years. Time I invest in learning something today ... must
> continue to help improve my productivity over the next 5 or 10 years by still
> being reusable in 5 or 10 years (with as few modifications as possible).
> There is uncertainty to everything, but something like Boson does seem a bit
> too risky right now...
> 
> Cheers,
> 
> yannick
> 
> 
> -------------------------
>  Ant Genomes & Evolution
> http://yannick.poulet.org
>    skype://yannickwurm

From yannick.wurm at unil.ch  Fri Jan 21 05:24:22 2011
From: yannick.wurm at unil.ch (Yannick Wurm)
Date: Fri, 21 Jan 2011 12:24:22 +0700
Subject: [BioRuby] Rake
In-Reply-To: <63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>
References: <mailman.3.1293123612.28870.bioruby@lists.open-bio.org>
	<3E646A64-028B-4528-A384-A84C9C426D2B@unil.ch>
	<63616086-9A73-440E-9AB6-2EE8160FC9D0@ingm.it>
Message-ID: <2E930098-FD2E-4D3B-AC61-1B54D3653DE7@unil.ch>

Ciao Raoul,

Sorry, I was away from the computer during most of the IRC thing.


On 20 Jan 2011, at 16:13, Raoul Bonnal wrote:
> Dear Yannick,
> Rake is usually used inside a project directory to provide common operations for the project.
> What is the idea behind "rake for bioinformatics"? I mean, you have to copy your Rakefile to wherever your data are.
> 
> Looking at the code, there are a lot of dependencies on command-line software, and that could be a maintainability problem.

That is true. My main worry here (and for most things) is rapidly getting a biological result. I'm still working on finding the optimal balance between quick hack and maintainability/reusability. Migrating from shell scripts to ruby hacks does probably save me some time because in ruby it's really simple to put in a few verifications by raising Errors if a tool I need isn't in the $PATH or if an input/output file is empty. Those mean that debugging and fixing is much faster if I decide to run things on the linux server instead of the macbook, or in 2 years time after a reinstall.
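
The kind of verification I mean looks roughly like this. It is a sketch
rather than the code in my Rakefile; the helper names which, require_tool!
and require_nonempty! are made up for this mail.

```ruby
# Return the full path of an executable found in $PATH, or nil.
def which(tool)
  ENV.fetch('PATH', '').split(File::PATH_SEPARATOR).each do |dir|
    path = File.join(dir, tool)
    return path if File.file?(path) && File.executable?(path)
  end
  nil
end

# Fail early, with a readable message, if an external tool is missing.
def require_tool!(tool)
  which(tool) or raise "#{tool} not found in $PATH"
end

# Fail if an expected input/output file is missing or has zero size.
def require_nonempty!(file)
  raise "#{file} is missing or empty" unless File.size?(file)
  file
end

begin
  require_tool!('no-such-aligner-xyz')   # deliberately bogus tool name
rescue RuntimeError => e
  puts e.message
end
```

Calling require_tool! for every external program at the top of the Rakefile,
and require_nonempty! after each step, gives an immediate error instead of a
confusing failure three steps later.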


> What about spending some energy on wrapping those commands in BioRuby classes?
> That way those applications could be available to other scripts.
I have two answers.
  - right now I cannot dedicate the time required to learn how to do that well. I need to understand how ants work first :)     (If I were developing a big uniprot-type web application that needs to be robust for users, making wrappers might be defensible.... for one-off hacks it's not)
  - call me conservative, but I'm also generally scared of wrappers. First, I want to have the raw input & output files that the programs use, because I may need to read or edit or rerun them in the future... I know I'll be able to read a raw text file. Thus I've never used bioruby's wrappers for blast or codeml or multiple sequence alignment (however, I have recently discovered the amazingly timesaving Bio::Tree -wow). Second, programs are constantly changing... and thus wrappers must too - they're a ton of work to maintain and -like the Boson thing- there is no guarantee that that will be done.
 

> If you want to keep the rake approach we should find a way to not replicate rakefiles.
> One idea could be to create a rakefile in your working directory, similar to Rails:
> 
> # Add your own tasks in files placed in lib/tasks ending in .rake,
> # for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.
> 
> require File.expand_path('../config/application', __FILE__)
> require 'rake'
> 
> #The user needs just to add the tasks he wants:
> Bio::SomeName.load_tasks
> Bio::SomeOtherName.load_tasks
> Bio::AnotherName.load_tasks


That sounds like a really cool approach. I want to hear more :)


-------------------------
 Ant Genomes & Evolution 
http://yannick.poulet.org
   skype://yannickwurm


From francesco.strozzi at gmail.com  Fri Jan 21 09:41:47 2011
From: francesco.strozzi at gmail.com (Francesco Strozzi)
Date: Fri, 21 Jan 2011 10:41:47 +0100
Subject: [BioRuby] BIO-NGS (and Rake/Thor for bioinformatics)
Message-ID: <AANLkTi=kyv-ynXXmsW2wD=uoaNjpWc6atQN=zgVPK339@mail.gmail.com>

Hi all,
in yesterday's IRC chat (http://bioruby.org/irc/?date=2011-01) we
discussed the bio-ngs plugin that Raoul wrote about in a previous
email.
Here is the Wiki page on BioRuby describing the general idea for this
plugin: http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing

We want to use wrappers and/or bindings to existing tools like
MAQ, BWA and SAMtools, and we want to use Rake or Thor to provide custom
tasks and let the user run NGS analyses. We would also like to include
the possibility to create reports using statsample and rubyvis. Maybe
some aspects are still a bit unclear at the moment (I think we need to
define some sort of guidelines), but I hope we could come up with a
useful (let me use this term) "framework" to run bioinformatics NGS
analyses with Ruby.

Any comment/help/feedback/suggestion is more than welcome!

Cheers

-- 

Francesco


From mictadlo at gmail.com  Wed Jan 26 00:41:11 2011
From: mictadlo at gmail.com (Michal)
Date: Wed, 26 Jan 2011 10:41:11 +1000
Subject: [BioRuby] marshal data too short
Message-ID: <4D3F6DA7.8050101@gmail.com>

Hi,
I installed Ruby 1.9.2 on Ubuntu 10.10 (Linux Mint) in the following way:

$ tar xvfz ruby-1.9.2-p136.tar.gz
$ cd ruby-1.9.2-p136/
$ ./configure --prefix=/home/mictadlo/apps/ruby
$ make
$ make install
$ vim ~/.bashrc
  export APPS=/home/mictadlo/apps
  export RUBY_HOME=$APPS/ruby
  export LD_LIBRARY_PATH=/RUBY_HOME/lib
  PATH=$RUBY_HOME/bin:$PATH
$ . ~/.bashrc
$ ruby -v
  ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux]

$ tar xvfz bioruby-1.4.1.tar.gz
$ cd bioruby-1.4.1/
$ ruby setup.rb
$ bioruby
     Loading config (/home/mitlox/.bioruby/shell/session/config) ... done
     Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short
     done

     . . . B i o R u b y   i n   t h e   s h e l l . . .

       Version : BioRuby 1.4.1 / Ruby 1.9.2

     bioruby> exit

How can I fix the error in BioRuby?

Thank you in advance.

Michal


From bonnalraoul at ingm.it  Wed Jan 26 15:23:03 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Wed, 26 Jan 2011 16:23:03 +0100
Subject: [BioRuby] IRC meeting
Message-ID: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it>

As usual, the IRC meeting is tomorrow.

--
R.J.P.B.


From ktym at hgc.jp  Wed Jan 26 16:09:02 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu, 27 Jan 2011 01:09:02 +0900
Subject: [BioRuby] marshal data too short
In-Reply-To: <4D3F6DA7.8050101@gmail.com>
References: <4D3F6DA7.8050101@gmail.com>
Message-ID: <ED694538-86D8-4D9A-B55A-A763C1853EAE@hgc.jp>

Hi Michal,

Could you give me some additional information?

% ls -l ~/.bioruby/shell/session/object
-rw-r--r--  1 ktym  staff  17401  1 19 13:09 /Users/ktym/.bioruby/shell/session/object

% ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object
[4, 8]

Have you ever used the bioruby shell with the old version of Ruby before?

If your file is not corrupted, this might be due to a backward
incompatibility of the Marshal file format (if so, does anyone know
whether there is any workaround to convert old marshal data into 1.9's?).

Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully
restore the saved objects:

% ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object
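
The same checks can be done from a script. The following is only a sketch
(the helper name safe_marshal_load is made up, and this is not what the
bioruby shell does internally): it compares the two header bytes with the
running Ruby's Marshal version and rescues the error raised for truncated
data.

```ruby
require 'tempfile'

# Return the loaded object, or nil (with a warning) if the file looks damaged.
def safe_marshal_load(path)
  data = File.binread(path)
  major, minor = data.unpack("CC")   # Marshal files start with major/minor version
  unless major == Marshal::MAJOR_VERSION && minor && minor <= Marshal::MINOR_VERSION
    warn "#{path}: unexpected Marshal header #{[major, minor].inspect}"
    return nil
  end
  Marshal.load(data)
rescue ArgumentError, TypeError, EOFError => e
  warn "#{path}: #{e.message}"       # e.g. "marshal data too short"
  nil
end

good = Tempfile.new('object')
good.binmode
good.write(Marshal.dump({ "seq" => "atgc" }))
good.close

bad = Tempfile.new('object')
bad.binmode
bad.write(Marshal.dump({ "seq" => "atgc" })[0, 5])   # truncated on purpose
bad.close

p safe_marshal_load(good.path)
p safe_marshal_load(bad.path)
```

If the saved session turns out to be truncated or empty, removing
~/.bioruby/shell/session/object and letting the shell create a fresh one is
probably the simplest way out, at the cost of losing the saved objects.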

Toshiaki


On 2011/01/26, at 9:41, Michal wrote:

> Hi,
> I installed on Ubuntu 10.10 (Linux Mint) Ruby 1.9.2 in the following way
> 
> $ tar xvfz ruby-1.9.2-p136.tar.gz
> $ cd ruby-1.9.2-p136/
> $ ./configure --prefix=/home/mictadlo/apps/ruby
> $ make
> $ make install
> $ vim ~/.bashrc
> export APPS=/home/mictadlo/apps
> export RUBY_HOME=$APPS/ruby
> export LD_LIBRARY_PATH=/RUBY_HOME/lib
> PATH=$RUBY_HOME/bin:$PATH
> $ . ~/.bashrc
> $ ruby -v
> ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux]
> 
> $ tar xvfz bioruby-1.4.1.tar.gz
> $ cd bioruby-1.4.1/
> $ ruby setup.rb
> $ bioruby
>    Loading config (/home/mitlox/.bioruby/shell/session/config) ... done
>    Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short
>    done
> 
>    . . . B i o R u b y   i n   t h e   s h e l l . . .
> 
>      Version : BioRuby 1.4.1 / Ruby 1.9.2
> 
>    bioruby> exit
> 
> How can I fix the error in BioRuby?
> 
> Thank you in advance.
> 
> Michal


From ktym at hgc.jp  Wed Jan 26 16:46:23 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Thu, 27 Jan 2011 01:46:23 +0900
Subject: [BioRuby] IRC meeting
In-Reply-To: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it>
References: <69560D81-323D-4273-8B2B-C722341BD578@ingm.it>
Message-ID: <E7627EE8-7AC6-45EF-B88E-F8600DE341F8@hgc.jp>

Raoul,

On 2011/01/27, at 0:23, Raoul Bonnal wrote:

> As usual, the IRC meeting is tomorrow.
> 
> --
> R.J.P.B.


Thank you for the reminder! The next will be our 6th IRC meeting.

In the 3rd (Jan 6) and 4th (Jan 13) meetings, we discussed the logging system.
As a result, Pjotr released the bio-logger plugin and his bio-gff3 plugin became
the first use case of the logger (he posted announcements to this list on Jan 17th).

We talked about a plan to develop NGS-related plugins at the 5th (Jan 20) meeting.
As posted by Francesco on Jan 21st, he contributed a Wiki page summarizing the idea:
http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing

As for the weekly BioRuby IRC meeting, please see
http://bioruby.open-bio.org/wiki/BioRuby_IRC_conference

Thanks,

Toshiaki


From bonnalraoul at ingm.it  Wed Jan 26 19:10:26 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Wed, 26 Jan 2011 20:10:26 +0100
Subject: [BioRuby] IRC meeting
In-Reply-To: <E7627EE8-7AC6-45EF-B88E-F8600DE341F8@hgc.jp>
Message-ID: <20110126191026.e71c169e@mail.ingm.it>

Hi all,
I have updated the page http://bioruby.open-bio.org/wiki/Next_Generation_Sequencing
I'll try to keep you up to date about samtools via the mailing list and that page.
I can't remember who is involved in the workflows; tomorrow we'll fix the page with the right names.


From mictadlo at gmail.com  Fri Jan 28 12:18:30 2011
From: mictadlo at gmail.com (Michal)
Date: Fri, 28 Jan 2011 22:18:30 +1000
Subject: [BioRuby] marshal data too short
In-Reply-To: <ED694538-86D8-4D9A-B55A-A763C1853EAE@hgc.jp>
References: <4D3F6DA7.8050101@gmail.com>
	<ED694538-86D8-4D9A-B55A-A763C1853EAE@hgc.jp>
Message-ID: <4D42B416.8010503@gmail.com>

Hi Toshiaki,
Ruby was not installed on my system before; I just installed the latest
version in my home directory:
$ ls -l ~/.bioruby/shell/session/object
-rw-r--r-- 1 mictadlo mictadlo 0 2011-01-25 19:40 /home/mictadlo/.bioruby/shell/session/object
$ ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object
[nil, nil]
$ ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object
-e:1:in `load': marshal data too short (ArgumentError)
     from -e:1:in `<main>'
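
For reference, the same observations can be reproduced without BioRuby. This is a minimal sketch, assuming only Ruby's built-in Marshal module; the variable names and the sample hash are illustrative and not taken from the BioRuby shell:

```ruby
# Marshal data always starts with a two-byte format version (major, minor).
dumped = Marshal.dump({ "seq" => "atgc" })
header = dumped.unpack("cc")     # the version bytes, 4 and 8

# A zero-byte file reads as an empty string, so there is no header at all.
empty_header = "".unpack("cc")   # both entries are nil

# Loading an empty string fails the same way as loading the empty file.
begin
  Marshal.load("")
  load_error = nil
rescue ArgumentError => e
  load_error = e.message
end

p header, empty_header, load_error
```

So a result of [nil, nil] points to an empty file rather than to an incompatible Marshal format.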

Do you need any other information?

Thank you in advance.

Michal


On 01/27/2011 02:09 AM, Toshiaki Katayama wrote:
> Hi Michal,
>
> Could you give me some additional information?
>
> % ls -l ~/.bioruby/shell/session/object
> -rw-r--r--  1 ktym  staff  17401  1 19 13:09 /Users/ktym/.bioruby/shell/session/object
>
> % ruby -e 'p ARGF.read.unpack("cc")' ~/.bioruby/shell/session/object
> [4, 8]
>
> Have you ever used the bioruby shell with an older version of Ruby?
>
> If your file is not corrupted, this might be due to a backward
> incompatibility of the Marshal file format (if so, does anyone know
> whether there is any workaround to convert old marshal data into 1.9's?).
>
> Note that, in my environment, both Ruby 1.8.7 and 1.9.2 can successfully
> restore the saved objects:
>
> % ruby -rubygems -rbio -e 'p Marshal.load(ARGF.read)' ~/.bioruby/shell/session/object
>
> Toshiaki
>
>
> On 2011/01/26, at 9:41, Michal wrote:
>
>> Hi,
>> I installed Ruby 1.9.2 on Ubuntu 10.10 (Linux Mint) in the following way:
>>
>> $ tar xvfz ruby-1.9.2-p136.tar.gz
>> $ cd ruby-1.9.2-p136/
>> $ ./configure --prefix=/home/mictadlo/apps/ruby
>> $ make
>> $ make install
>> $ vim ~/.bashrc
>> export APPS=/home/mictadlo/apps
>> export RUBY_HOME=$APPS/ruby
>> export LD_LIBRARY_PATH=$RUBY_HOME/lib
>> PATH=$RUBY_HOME/bin:$PATH
>> $ . ~/.bashrc
>> $ ruby -v
>> ruby 1.9.2p136 (2010-12-25 revision 30365) [i686-linux]
>>
>> $ tar xvfz bioruby-1.4.1.tar.gz
>> $ cd bioruby-1.4.1/
>> $ ruby setup.rb
>> $ bioruby
>>     Loading config (/home/mitlox/.bioruby/shell/session/config) ... done
>>     Loading object (/home/mitlox/.bioruby/shell/session/object) ... Error: Failed to load (/home/mitlox/.bioruby/shell/session/object) : marshal data too short
>>     done
>>
>>     . . . B i o R u b y   i n   t h e   s h e l l . . .
>>
>>       Version : BioRuby 1.4.1 / Ruby 1.9.2
>>
>>     bioruby>  exit
>>
>> How can I fix the error in BioRuby?
>>
>> Thank you in advance.
>>
>> Michal
>>
>


From ktym at hgc.jp  Sat Jan 29 12:18:04 2011
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Sat, 29 Jan 2011 21:18:04 +0900
Subject: [BioRuby] marshal data too short
In-Reply-To: <4D42B416.8010503@gmail.com>
References: <4D3F6DA7.8050101@gmail.com>
	<ED694538-86D8-4D9A-B55A-A763C1853EAE@hgc.jp>
	<4D42B416.8010503@gmail.com>
Message-ID: <8DFDDEA3-9B1D-44DC-BCB6-DCBA2C06BAF9@hgc.jp>

Hi Michal,

When I removed the ~/.bioruby directory, I could reproduce the same error with Ruby 1.9.2.

The ~/.bioruby/shell/session/object file was empty because the BioRuby shell failed to save the file:

Saving object (/Users/ktym/.bioruby/shell/session/object) ... Error: Failed to save (/Users/ktym/.bioruby/shell/session/object) : can't convert Symbol into String

I'll try to fix this.
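
Until then, the loading side can at least tolerate the empty file. The following is only a sketch under my own assumptions, not the actual BioRuby fix: `load_session_objects` is a hypothetical helper name, and it does not address the failed save itself.

```ruby
require "tempfile"

# Hypothetical helper (not BioRuby's actual code): load marshalled session
# objects, treating a missing or zero-byte file as "nothing saved yet".
def load_session_objects(path)
  # File.size? returns nil when the file is missing or empty.
  return {} unless File.size?(path)
  File.open(path, "rb") { |f| Marshal.load(f) }
rescue ArgumentError, TypeError => e
  warn "Ignoring unreadable session file #{path}: #{e.message}"
  {}
end

# Usage: an empty file, like the one left behind by the failed save.
empty = Tempfile.new("object")
empty.close
p load_session_objects(empty.path)   # falls back to {}
```

As a manual workaround, deleting the zero-byte ~/.bioruby/shell/session/object file should have the same effect on the next start.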

Toshiaki


From mictadlo at gmail.com  Sun Jan 30 11:42:09 2011
From: mictadlo at gmail.com (Michal)
Date: Sun, 30 Jan 2011 21:42:09 +1000
Subject: [BioRuby] samtools-ruby
Message-ID: <4D454E91.1080604@gmail.com>

Hi,
I tried to install samtools-ruby on Ruby 1.9.2, but it failed.
I have already posted this problem at
https://github.com/homonecloco/samtools-ruby/issues#issue/3 , but I have
not received any response.

What did I do wrong?

Michal


From bonnalraoul at ingm.it  Mon Jan 31 10:11:46 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 31 Jan 2011 11:11:46 +0100
Subject: [BioRuby] samtools-ruby
In-Reply-To: <4D454E91.1080604@gmail.com>
References: <4D454E91.1080604@gmail.com>
Message-ID: <A6CFF775-1E68-4FA1-A84F-8E489E2FCE96@ingm.it>

Dear Michal,
please check this out:

https://github.com/helios/bioruby-samtools

This is the initial port of samtools-ruby as a plugin. It comes with libraries for OS X and Linux, but not Windows.
I still need to test the Linux library because I'm developing under OS X.
If the libbam.a is wrong, please send me the right one and I'll add it to the repo.
Also note that the library has been compiled for 64-bit.

Ciao!

--
R.J.P.B.


From bonnalraoul at ingm.it  Mon Jan 31 10:27:51 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 31 Jan 2011 11:27:51 +0100
Subject: [BioRuby] BioGem and Rails
Message-ID: <1CCE39E6-F232-44C8-B95D-3C620443EF5C@ingm.it>

Dear All,
I've created a new branch in biogem.

https://github.com/helios/bioruby-gem/tree/rails_engine

It adds an option to the biogem script for creating a Rails engine with your gem (Rails 3 ONLY!).

The idea is to develop a gem that can be used in a script and then extend it so it can be integrated into a Rails project.
Which libraries can benefit from this approach? I think databases, parsers, or any data that you want to expose to a Rails application.
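
To illustrate the idea only: the sketch below is not what the rails_engine branch generates, and `BioExample` and `gc_content` are hypothetical names. It shows a gem whose plain library code works in any script and which defines an engine only when Rails 3 is loaded:

```ruby
# Hypothetical gem namespace; "BioExample" is not a real plugin.
module BioExample
  # Plain library code, usable from any script.
  def self.gc_content(seq)
    return 0.0 if seq.empty?
    seq.count("gcGC").to_f / seq.length
  end

  # Define the engine only when the gem is loaded inside a Rails 3 app.
  if defined?(::Rails::Engine)
    class Engine < ::Rails::Engine
    end
  end
end

puts BioExample.gc_content("atgcgc")
```

In a plain script the `Engine` class is never defined, so the gem carries no Rails dependency unless the host application provides one.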

It's at a very early stage, so don't use it yet; this message is just to let you know that we are adding new features.

from the help:
--with-engine                create a Rails engine with the namespace give in input. Set default database creation

Note: It is not yet possible to add the engine to an existing gem; I need to fix that and implement a generator to accomplish this task.

Any input is welcome.


Ciao.

--
R.J.P.B.


From jan.aerts at gmail.com  Mon Jan 31 15:07:39 2011
From: jan.aerts at gmail.com (Jan Aerts)
Date: Mon, 31 Jan 2011 16:07:39 +0100
Subject: [BioRuby] ruby-ensembl-api paper accepted by Bioinformatics
Message-ID: <AANLkTimEcTFHOZBLRiYDBKo-iDOGnqYU_7ypM+tp41dM@mail.gmail.com>

All,

FYI: There is now a Bioinformatics paper that describes the Ruby API to the
Ensembl databases. Thanks to Francesco Strozzi for working on this with me.
You can find it here: http://bit.ly/fzQamR

At this moment this API covers the core and variation databases. If anyone
is interested in working on the API for compara or functional, please let me
know.

Kind regards,
jan.


From bonnalraoul at ingm.it  Mon Jan 31 15:22:12 2011
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 31 Jan 2011 16:22:12 +0100
Subject: [BioRuby] ruby-ensembl-api paper accepted by Bioinformatics
In-Reply-To: <AANLkTimEcTFHOZBLRiYDBKo-iDOGnqYU_7ypM+tp41dM@mail.gmail.com>
References: <AANLkTimEcTFHOZBLRiYDBKo-iDOGnqYU_7ypM+tp41dM@mail.gmail.com>
Message-ID: <193864A0-D798-4737-83CE-7A7932E4552C@ingm.it>

well done!

--
R.J.P.B.