From jillyh0 at gmail.com  Mon Mar  1 16:42:25 2010
From: jillyh0 at gmail.com (Jillian E Kozyra)
Date: Mon, 1 Mar 2010 16:42:25 -0500
Subject: [BioRuby] Phylogenetic Trees or Hierarchical Clustering
Message-ID: <9d7d43131003011342s3de1f182oacf6ce1e612a452a@mail.gmail.com>

Dear Colleagues,

We are working on a linguistics project in which we will calculate language
similarities. From the language similarity matrix, we would like to create
either a hierarchical clustering output or phylogenetic tree. We seek a pure
Ruby plugin with which to do this. Could you give us some guidance?

Thanks,
Jillian

-- 
917-434-7511
http://sswl.railsplayground.net

From bonnalraoul at ingm.it  Mon Mar  8 08:28:16 2010
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Mon, 8 Mar 2010 14:28:16 +0100
Subject: [BioRuby] RVM: Ruby Version Manager
Message-ID: <cb6734ac-24c5-4320-9407-90976749bc1c@ingm.it>

Do you know RVM (http://rvm.beginrescueend.com/), a tool for keeping multiple Ruby environments installed at the same time?

RVM is a command-line tool that lets you easily install, manage and work with multiple Ruby environments, from interpreters to sets of gems. RVM itself is easy to install!

I'm using it on a VM for development and testing, and it is awesome how it handles everything :-)

Give it a try.

--
Raoul J.P. Bonnal
Life Science Informatics
Integrative Biology Program
Fondazione INGM
Via F. Sforza 28
20122 Milano, IT
phone: +39 0 200 662 326
fax: +39 0 200 662 346
http://www.ingm.it


From daniel.lundin at molbio.su.se  Tue Mar  9 14:48:11 2010
From: daniel.lundin at molbio.su.se (Daniel Lundin)
Date: Tue, 09 Mar 2010 20:48:11 +0100
Subject: [BioRuby] HMMER 3 parsers?
Message-ID: <4B96A5FB.9060607@molbio.su.se>

Hi,

HMMER 3 is currently available as a first release candidate. It brings
several changes, both new tools and new kinds of data, which means the
output formats have changed. Is anybody working on BioRuby parsers for
these?

/D

-- 
Daniel Lundin

Department of Molecular Biology & Functional Genomics
Arrhenius Laboratories for Natural Sciences
Stockholm University, SE-106 91 Stockholm, Sweden

tel. +46 (0)8 16 41 95, mobile: +46 (0)708 123 922, fax. +46 (0)8 16 64 88

Email: daniel.lundin at molbio.su.se

From rutgeraldo at gmail.com  Wed Mar 10 08:22:48 2010
From: rutgeraldo at gmail.com (Rutger Vos)
Date: Wed, 10 Mar 2010 13:22:48 +0000
Subject: [BioRuby] RDF Triples in BioRuby, a funding proposal to Google SoC
Message-ID: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>

Dear BioRuby-ites,

my apologies that my first email to this list is so long and
tangential. I am trying to find out how to express RDF triples in
BioRuby. In this email I'm explaining why I care enough to try to get
funding for someone to work on this. If you don't care about any of
this, you can stop reading now.

The National Evolutionary Synthesis Center (NESCent.org) is planning
to be a mentoring organization for the Google Summer of Code 2010. I
have submitted a project idea to this: to develop NeXML I/O and -
probably more importantly for you - RDF capabilities for BioRuby. If
funded, a student/coder will work on this full time over the summer,
under the shared supervision of Jan Aerts and myself. Here is the
link: http://tinyurl.com/biorubynexml

NeXML is a data format for phylogenetic data that can be read and
written in perl, python, java and (to some extent) c++ and javascript.
RDF is the cool "new" thing (as per BioHackathon2010), but as far as I
can tell BioRuby isn't completely up to speed for it, yet.

(As an aside: you might ask yourself why there is something like NeXML
when there is PhyloXML for BioRuby. The answer is that NeXML solves a
different problem: PhyloXML started essentially as a next generation
of New Hampshire eXtended (NHX) to meet the annotation needs of
comparative genomics, things such as gene duplications and other
molecular evolution events, on phylogenetic trees; NeXML started as a
complete XML representation of the NEXUS format, providing other
comparative data types such as categorical and continuous character
state matrices, restriction site matrices, and so on, in addition to
trees, taxa, sequence alignments. There is obviously some overlap
between the formats, but I guess that is not unique in bioinformatics
:))

NeXML has a semantic annotation facility that uses RDFa. This allows
us to add additional metadata to a fundamental phylogenetic data
object (a tree, taxon, character, etc.) to form a "triple": the
fundamental data object is the triple Subject, and the Predicate and
Object are added as RDFa attributes. Since NeXML can be transformed
using a standard XSL stylesheet to RDF/XML, we can express a limitless
number of statements about phylogenetics. However, this means that any
NeXML I/O library needs to be able to represent RDF triples. I have
studied the BioRuby API as best as I could (but: I don't know ruby)
and couldn't identify how to do this.
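
To make the data structure concrete, here is a minimal sketch in plain Ruby of the kind of object meant here. It is not BioRuby API: the Triple name, the to_ntriples method and the example URIs are assumptions made up for illustration, and the serialization only covers plain string literals and URIs.

```ruby
# Hypothetical sketch: Triple and to_ntriples are illustrative names,
# not part of BioRuby. URIs are kept as plain strings.
Triple = Struct.new(:subject, :predicate, :object) do
  # Serialize as one N-Triples-style line. An object that looks like an
  # absolute URI is written in angle brackets; anything else is written
  # as a quoted literal with backslashes and double quotes escaped.
  def to_ntriples
    obj = if object.to_s =~ %r{\A[a-z][a-z0-9+.-]*://}i
            "<#{object}>"
          else
            '"' + object.to_s.gsub('\\') { '\\\\' }.gsub('"') { '\\"' } + '"'
          end
    "<#{subject}> <#{predicate}> #{obj} ."
  end
end

# Usage: the tree is the subject, the predicate and object are metadata.
t = Triple.new("http://example.org/tree1",
               "http://purl.org/dc/elements/1.1/creator",
               "Jane Doe")
puts t.to_ntriples
```

Keeping the triple as a small value object like this would let any parser or writer pass statements around without knowing which file format they came from.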

My questions to you:

* is there a way to express triples in BioRuby?
* if there is not, what would be a good design to express triples in
BioRuby so that this would be more useful than just for NeXML?

Thank you!

Rutger

-- 
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com

From ktym at hgc.jp  Wed Mar 10 09:21:15 2010
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Wed, 10 Mar 2010 23:21:15 +0900
Subject: [BioRuby] RDF Triples in BioRuby,
	a funding proposal to Google SoC
In-Reply-To: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>
References: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>
Message-ID: <9081A9B5-611C-45C2-A099-44BAF1E524F4@hgc.jp>

Hi Rutger,

Thank you for your input on GSoC 2010!

> * is there a way to express triples in BioRuby?
> * if there is not, what would be a good design to express triples in
> BioRuby so that this would be more useful than just for NeXML?

This is what we discussed during the pre-BioHackathon 2010.

http://hackathon3.dbcls.jp/wiki/BioRuby

My first idea was to give all BioRuby objects a common output
method that renders the object contents in various formats
(such as RDF/XML, Turtle, HTML, GFF, FASTA etc. where appropriate).

Then we tried to separate view from logic using erb, but as you
can see on the above page, it still looks ugly, mainly because
view formatting itself requires some additional code specific
to each format.

Therefore, we don't have a solid conclusion on this yet, unfortunately.

Anyway, we already have a PubMed to RDF converter written in Ruby as
part of the TogoWS REST API (http://togows.dbcls.jp/site/en/rest.html) at

http://togows.dbcls.jp/entry/pubmed/16381885
--> http://togows.dbcls.jp/entry/pubmed/16381885.ttl

and we are also trying to support KEGG to RDF conversion in this
framework. I think we can put the code in BioRuby when we have finished.

Your suggestions are welcome. :)

Regards,
Toshiaki

> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From rutgeraldo at gmail.com  Thu Mar 11 05:22:04 2010
From: rutgeraldo at gmail.com (Rutger Vos)
Date: Thu, 11 Mar 2010 10:22:04 +0000
Subject: [BioRuby] RDF Triples in BioRuby,
	a funding proposal to Google 	SoC
In-Reply-To: <9081A9B5-611C-45C2-A099-44BAF1E524F4@hgc.jp>
References: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>
	<9081A9B5-611C-45C2-A099-44BAF1E524F4@hgc.jp>
Message-ID: <2bb9b24a1003110222h4bd642adv31d1975c9edc0bba@mail.gmail.com>

Hi Toshiaki,

great to hear there's already been a lot of discussion over this.
(Well, I'd be surprised if there hadn't been :))

It looks to me like some fairly major bookkeeping would need to be
implemented high up in the inheritance tree if *all* bioruby objects
are to be serialized into RDF. It also would require all of bioruby to
be ontologized in one fell swoop.

It is perhaps more likely that subdomains are going to be ontologized
more or less independently from one another (as you mention,
references->RDF, or in my case phylogenetics->RDF) based implicitly on
intermediate data formats (pubmed records and nexml, respectively).

That is probably OK, we do things as needs arise.

But it would be handy if the API were at least general enough that
this is extensible and we can make additional statements *about*
objects when we serialize them to RDF. For example, in your pubmed
turtle file, the subject is always
<http://togows.dbcls.jp/entry/ncbi-pubmed/16381885>. Is there a way
to add additional statements about
<http://togows.dbcls.jp/entry/ncbi-pubmed/16381885> programmatically?

Rutger



-- 
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com

From bonnalraoul at ingm.it  Thu Mar 11 08:02:23 2010
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 11 Mar 2010 14:02:23 +0100
Subject: [BioRuby] Ruby and Statistics
Message-ID: <2122bfdf-d902-4be1-aef2-95013cea31f6@ingm.it>

Hello Folks, 
I need to do statistical computations in Ruby, sometimes very basic operations like mean and standard deviation.
Which library do you suggest?
I don't want to use rsruby (R) for now, or to extend Array every time.

I found ruby-statsample, but I don't know if it is the best one.
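
For the very basic cases, mean and sample standard deviation can be written as plain module functions without touching Array. The following is a minimal sketch under that assumption; SimpleStats is a hypothetical name, not an existing library, and it does nothing beyond these two statistics.

```ruby
# Hypothetical helper module (not an existing gem). The functions take the
# list as an argument, so the Array class itself is left untouched.
module SimpleStats
  module_function

  # Arithmetic mean of a non-empty list of numbers.
  def mean(values)
    raise ArgumentError, "empty list" if values.empty?
    values.inject(0.0) { |sum, v| sum + v } / values.size
  end

  # Sample standard deviation (divides by n - 1); needs at least two values.
  def stdev(values)
    raise ArgumentError, "need at least two values" if values.size < 2
    m = mean(values)
    sum_sq = values.inject(0.0) { |sum, v| sum + (v - m)**2 }
    Math.sqrt(sum_sq / (values.size - 1))
  end
end

# Usage
data = [2, 4, 4, 4, 5, 5, 7, 9]
puts SimpleStats.mean(data)
puts SimpleStats.stdev(data)
```

Anything beyond this (tests, regression, distributions) is probably better served by one of the dedicated libraries.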

--
Raoul J.P. Bonnal
Life Science Informatics
Integrative Biology Program
Fondazione INGM
Via F. Sforza 28
20122 Milano, IT
phone: +39 0 200 662 326
fax: +39 0 200 662 346
http://www.ingm.it


From ngoto at gen-info.osaka-u.ac.jp  Thu Mar 11 08:53:02 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 11 Mar 2010 22:53:02 +0900
Subject: [BioRuby] Ruby and Statistics
In-Reply-To: <2122bfdf-d902-4be1-aef2-95013cea31f6@ingm.it>
References: <2122bfdf-d902-4be1-aef2-95013cea31f6@ingm.it>
Message-ID: <20100311135303.8C5201CBC41B@idnmail.gen-info.osaka-u.ac.jp>

Hi,

I found some modules, but I haven't used them.

math-statistics: http://www.notwork.org/~gotoken/ruby/p/statistics/

statarray: http://rubyforge.org/projects/statarray/

ruby-stats: http://pallas.telperion.info/ruby-stats/

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org



From ngoto at gen-info.osaka-u.ac.jp  Thu Mar 11 09:12:49 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 11 Mar 2010 23:12:49 +0900
Subject: [BioRuby] HMMER 3 parsers?
In-Reply-To: <4B96A5FB.9060607@molbio.su.se>
References: <4B96A5FB.9060607@molbio.su.se>
Message-ID: <20100311141250.789AA1CBC58F@idnmail.gen-info.osaka-u.ac.jp>

Hi,

Christian Zmasek is now working on HMMER 3 support.
It would be great if you could help us.

http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer/hmmer3report.rb
http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer3.rb

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org



From ngoto at gen-info.osaka-u.ac.jp  Thu Mar 11 09:59:11 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 11 Mar 2010 23:59:11 +0900
Subject: [BioRuby] Phylogenetic Trees or Hierarchical Clustering
In-Reply-To: <9d7d43131003011342s3de1f182oacf6ce1e612a452a@mail.gmail.com>
References: <9d7d43131003011342s3de1f182oacf6ce1e612a452a@mail.gmail.com>
Message-ID: <20100311145912.CF9091CBC3DA@idnmail.gen-info.osaka-u.ac.jp>

Hi,

I always use phylogenetic tree construction software such as
PHYLIP and MEGA4, and I don't know much about pure Ruby
solutions. The links below were found with a Google search.

There are some pure Ruby implementations of clustering algorithms,
though I haven't used them.

AI4R (Artificial Intelligence for Ruby):
http://ai4r.rubyforge.org/

clusterer: http://rubyforge.org/projects/clusterer/

I found a phylogenetic tree visualization implementation written
in JRuby, and I found it can also work with normal Ruby 1.8.7.

Egan A et al. (2008)
IDEA: Interactive Display for Evolutionary Analyses.
BMC Bioinformatics 2008, 9:524
http://www.biomedcentral.com/1471-2105/9/524
http://ideanalyses.sourceforge.net/
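
For a small similarity matrix, hierarchical clustering can also be sketched directly in plain Ruby. The example below is an illustration only, not taken from any of the libraries above: it assumes similarities in the range 0..1, converts them to distances as 1 - similarity, and uses average linkage (UPGMA-style) merging. The function name, the labels and the matrix values are made up, and the input is assumed to be non-empty.

```ruby
# Hypothetical sketch of average-linkage agglomerative clustering.
# labels:     array of item names (non-empty)
# similarity: symmetric matrix (array of arrays) with values in 0..1
# Returns a nested array describing the merge order.
def cluster(labels, similarity)
  # Each cluster remembers its member indices and its (nested) tree.
  clusters = labels.each_index.map { |i| { :members => [i], :tree => labels[i] } }

  # Average distance between two clusters, with distance = 1 - similarity.
  dist = lambda do |a, b|
    total = 0.0
    a[:members].each { |i| b[:members].each { |j| total += 1.0 - similarity[i][j] } }
    total / (a[:members].size * b[:members].size)
  end

  while clusters.size > 1
    # Find the closest pair of clusters.
    best = nil
    clusters.each_with_index do |a, x|
      clusters.each_with_index do |b, y|
        next unless x < y
        d = dist.call(a, b)
        best = [d, x, y] if best.nil? || d < best[0]
      end
    end
    _, x, y = best
    merged = { :members => clusters[x][:members] + clusters[y][:members],
               :tree    => [clusters[x][:tree], clusters[y][:tree]] }
    # Remove the higher index first so the lower one stays valid.
    clusters.delete_at(y)
    clusters.delete_at(x)
    clusters << merged
  end
  clusters.first[:tree]
end

# Usage with made-up values: "nl" and "de" are most similar, so they merge first.
labels = ["nl", "de", "fi"]
sim = [[1.0, 0.8, 0.1],
       [0.8, 1.0, 0.2],
       [0.1, 0.2, 1.0]]
p cluster(labels, sim)
```

This runs in roughly cubic time or worse, which is fine for tens of items; for larger matrices a dedicated package is the safer choice.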

Thanks,

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org


From daniel.lundin at molbio.su.se  Thu Mar 11 11:18:25 2010
From: daniel.lundin at molbio.su.se (Daniel Lundin)
Date: Thu, 11 Mar 2010 17:18:25 +0100
Subject: [BioRuby] HMMER 3 parsers?
In-Reply-To: <20100311141250.789AA1CBC58F@idnmail.gen-info.osaka-u.ac.jp>
References: <4B96A5FB.9060607@molbio.su.se>
	<20100311141250.789AA1CBC58F@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <4B9917D1.9000702@molbio.su.se>

Naohisa GOTO skrev:
> Hi,
> 
> Christian Zmasek is now working on HMMER 3 support.
> It would be great if you could help us.
> 
Certainly. Since my alternative is writing a parser for myself, I might 
as well put in my effort for the common good.

Christian, is there anything in particular I could help with? I have 
started collecting some test cases for my own needs.

/Daniel



-- 
Daniel Lundin

Department of Molecular Biology & Functional Genomics
Arrhenius Laboratories for Natural Sciences
Stockholm University, SE-106 91 Stockholm, Sweden

tel. +46 (0)8 16 41 95, mobile: +46 (0)708 123 922, fax. +46 (0)8 16 64 88

Email: daniel.lundin at molbio.su.se

From pjotr.public14 at thebird.nl  Thu Mar 11 12:17:27 2010
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 11 Mar 2010 18:17:27 +0100
Subject: [BioRuby] Ruby and Statistics
In-Reply-To: <20100311135303.8C5201CBC41B@idnmail.gen-info.osaka-u.ac.jp>
References: <2122bfdf-d902-4be1-aef2-95013cea31f6@ingm.it>
	<20100311135303.8C5201CBC41B@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <20100311171727.GD12523@thebird.nl>

Hi Raoul,

Biolib makes the GSL available for Ruby, as well as Rlib, so many
standard statistics can be used, including linear regression, etc. If
there are other libraries you want to use, we can consider mapping
those to Ruby (BOOST is a candidate).

The main problem is that I am still in the process of documenting biolib
before its 1.0 release.

If you are interested in using these tools, we can work it out between
us. Just tell me what functions you want, and I'll help map/document
them. That would be great for Biolib, as testing is a good thing.

Pj.


From rutgeraldo at gmail.com  Mon Mar 15 08:27:27 2010
From: rutgeraldo at gmail.com (Rutger Vos)
Date: Mon, 15 Mar 2010 12:27:27 +0000
Subject: [BioRuby] RDF Triples in BioRuby,
	a funding proposal to Google SoC
In-Reply-To: <2bb9b24a1003110222h4bd642adv31d1975c9edc0bba@mail.gmail.com>
References: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>
	<9081A9B5-611C-45C2-A099-44BAF1E524F4@hgc.jp>
	<2bb9b24a1003110222h4bd642adv31d1975c9edc0bba@mail.gmail.com>
Message-ID: <2bb9b24a1003150527p439c135dm1a164e6a5218835f@mail.gmail.com>

To follow up along more practical lines, I've had to deal with similar
design issues in Bio::Phylo (perl), TreeBASE and Mesquite (both java).
I've learned it makes sense to have:

- a simple "annotation" object, with getters and setters for the
predicate namespace uri, the predicate string, and the value object
(either a literal or a uri),

- a get_annotations method for all (fundamental) data objects in the
toolkit that returns a collection of these annotation objects.

This way, when you serialize any bioruby object into rdf, you can add
as many other statements about that object as you want.
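
A rough sketch of what this could look like in Ruby follows. It is only an illustration of the design: Annotation, Annotatable, Entry and annotations_to_ntriples are hypothetical names that do not exist in BioRuby, the accessor is called annotations rather than get_annotations to follow Ruby naming habits, and the rdfs:comment statement is an invented example.

```ruby
require 'uri'

# Hypothetical sketch: none of these names exist in BioRuby.
class Annotation
  attr_accessor :namespace_uri, :predicate, :value

  def initialize(namespace_uri, predicate, value)
    @namespace_uri = namespace_uri  # e.g. "http://www.w3.org/2000/01/rdf-schema#"
    @predicate     = predicate      # e.g. "comment"
    @value         = value          # a literal (String) or a URI object
  end
end

# Mixin that gives any data object a collection of annotations.
module Annotatable
  def annotations
    @annotations ||= []
  end

  def add_annotation(namespace_uri, predicate, value)
    annotations << Annotation.new(namespace_uri, predicate, value)
    self
  end
end

# Stand-in for a "fundamental data object" that knows its own URI.
class Entry
  include Annotatable
  attr_reader :uri

  def initialize(uri)
    @uri = uri
  end
end

# Render each annotation as one N-Triples-style line about the object.
def annotations_to_ntriples(obj)
  obj.annotations.map do |a|
    object = if a.value.is_a?(URI::Generic)
               "<#{a.value}>"
             else
               '"' + a.value.to_s.gsub('\\') { '\\\\' }.gsub('"') { '\\"' } + '"'
             end
    "<#{obj.uri}> <#{a.namespace_uri}#{a.predicate}> #{object} ."
  end
end

# Usage: add an extra statement about the pubmed subject mentioned earlier.
entry = Entry.new("http://togows.dbcls.jp/entry/ncbi-pubmed/16381885")
entry.add_annotation("http://www.w3.org/2000/01/rdf-schema#", "comment", "an example note")
puts annotations_to_ntriples(entry)
```

Using a mixin rather than a common base class means existing classes could opt in one at a time, without a change high up in the inheritance tree.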

Would a refactoring along those lines have a chance of being
acceptable to the bioruby community (of course subsequent to a more
detailed RFC, testing, discussion, proof of concept, etc.)?


-- 
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com
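
The two ideas in this thread (a subject/predicate/object triple, and
keeping the output format in an erb view separate from the logic) can be
sketched in a few lines of Ruby. This is only an illustration under stated
assumptions: Triple, NTRIPLES_VIEW and render_ntriples are hypothetical
names, not BioRuby API, the example.org IRIs are made up, and every term
is assumed to be an IRI (no literals, blank nodes or escaping).

```ruby
require "erb"

# Hypothetical minimal model: a triple of three IRIs.
# Nothing here is BioRuby API; the names are for illustration only.
Triple = Struct.new(:subject, :predicate, :object)

# View: one ERB template per output format. This one emits N-Triples
# lines (which are also valid Turtle). It assumes every term is an IRI.
NTRIPLES_VIEW = ERB.new(<<~TEMPLATE, trim_mode: "-")
  <%- triples.each do |t| -%>
  <<%= t.subject %>> <<%= t.predicate %>> <<%= t.object %>> .
  <%- end -%>
TEMPLATE

# Logic stays outside the template: it only hands data to the view.
def render_ntriples(triples)
  NTRIPLES_VIEW.result_with_hash(triples: triples)
end

# Example data with made-up example.org IRIs.
triples = [
  Triple.new("http://example.org/tree/1",
             "http://example.org/terms/hasTaxon",
             "http://example.org/taxon/42")
]
puts render_ntriples(triples)
```

Supporting another format would mean adding another template, which is
where the format-specific code mentioned above tends to accumulate.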

From ngoto at gen-info.osaka-u.ac.jp  Fri Mar 19 01:18:41 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Fri, 19 Mar 2010 14:18:41 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of Code
 is *ON* for OBF projects!
Message-ID: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>

Begin forwarded message:

Date: Thu, 18 Mar 2010 17:02:32 -0500
From: Chris Fields <cjfields at illinois.edu>
To: open-bio-l at lists.open-bio.org
Subject: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of Code is *ON* for OBF projects!


(forwarding to the Open-Bio list, as the original post is still clearing the OBF mail filters)

Hi all,

Great news: Google announced today that the Open Bioinformatics Foundation has been accepted as a mentoring organization for this summer's Google Summer of Code!

GSoC is a Google-sponsored student internship program for open-source projects, open to students from around the world (not just US residents).   Students are paid a $5000 USD stipend to work as a developer on an open-source project for the summer. For more on GSoC, see GSoC 2010 FAQ at http://tinyurl.com/yzemdfo

Student applications are due April 9, 2010 at 19:00 UTC.  Students who are interested in participating should look at the OBF's GSoC page at http://open-bio.org/wiki/Google_Summer_of_Code, which lists project ideas, and who to contact about applying.

For current developers on OBF projects, please consider volunteering to be a mentor if you have not already, and contribute project ideas.  Just list your name and project ideas on OBF wiki and on the relevant project's GSoC wiki page.

Thanks to all who helped make OBF's application to GSoC a success, and let's have a great, productive summer of code!

Rob Buels
OBF GSoC 2010 Administrator


_______________________________________________
Open-Bio-l mailing list
Open-Bio-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/open-bio-l


From k.hayashi.info at gmail.com  Tue Mar 23 08:20:52 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Tue, 23 Mar 2010 21:20:52 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>

Hi, all

My name is Kazuhiro Hayashi.
I'm a 1st-year master's degree student at the Department of Computational
Biology, Graduate School of Frontier Sciences, the University of
Tokyo.

The reason why I sent this mail is to ask you some questions about
Google Summer of Code 2010.

I'm interested in Google Summer of Code 2010, especially the projects
about BioRuby.
At the moment, I plan to apply for "Ruby 1.9.2 support of BioRuby", and I'd
like to contribute to the BioRuby community through Google Summer of Code
2010.
So, I have three questions.
Could you answer them?

The first is about the differences between Ruby 1.8.x and 1.9.2.
OBF's GSoC page says that the participant needs to know Ruby 1.9.2.
Until now, I've used only Ruby 1.8.7 and never used Ruby 1.9.2.
Honestly, I hardly know the differences between Ruby 1.8.x and Ruby 1.9.2.
Can I join this project?

The second is how many programs in BioRuby run on Ruby 1.9.2.
Could you tell me whether you already know this (and how to find out)?

The third is about the implementation of the unit tests.
Does this mean that the participant needs to implement unit tests for
all code that does not have them yet?

Currently, these are all my questions about GSoC 2010.

If you have some advice for the applicants, please send a reply to
this mailing list.

Thank you very much for reading my broken English. :-)

Best regards


-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From biopython at maubp.freeserve.co.uk  Tue Mar 23 09:20:57 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 23 Mar 2010 13:20:57 +0000
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
Message-ID: <320fb6e01003230620l58717628t4d12f67411805c48@mail.gmail.com>

On Tue, Mar 23, 2010 at 12:20 PM, Kazuhiro Hayashi
<k.hayashi.info at gmail.com> wrote:
> Hi, all
>
> My name is Kazuhiro Hayashi.
> I'm a 1st-year master's degree student at the Department of Computational
> Biology, Graduate School of Frontier Sciences, the University of
> Tokyo.
>
> The reason why I sent this mail is to ask you some questions about
> Google Summer of Code 2010.
>
> ...
>
> Thank you very much for reading my broken English. :-)

Hello Hayashi-san,

I don't know if the BioRuby team have any preference for which
language the Google Summer of Code projects will be discussed
in (English and/or Japanese). It will probably depend on the mentors.

However, there is also a Japanese BioRuby mailing list:
http://lists.open-bio.org/mailman/listinfo/bioruby-ja

Peter
(@Biopython)

From ngoto at gen-info.osaka-u.ac.jp  Tue Mar 23 11:21:33 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 24 Mar 2010 00:21:33 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
 Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
Message-ID: <20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>

Hi Kazuhiro,

On Tue, 23 Mar 2010 21:20:52 +0900
Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:

> Hi, all
> 
> My name is Kazuhiro Hayashi.
> I'm a 1st-year master's degree student at the Department of Computational
> Biology, Graduate School of Frontier Sciences, the University of
> Tokyo.
> 
> The reason why I sent this mail is to ask you some questions about
> Google Summer of Code 2010.
> 
> I'm interested in Google Summer of Code 2010, especially the projects
> about BioRuby.
> At the moment, I plan to apply for "Ruby 1.9.2 support of BioRuby", and I'd
> like to contribute to the BioRuby community through Google Summer of Code
> 2010.
> So, I have three questions.
> Could you answer them?
>
> The first is about the differences between Ruby 1.8.x and 1.9.2.
> OBF's GSoC page says that the participant needs to know Ruby 1.9.2.
> Until now, I've used only Ruby 1.8.7 and never used Ruby 1.9.2.
> Honestly, I hardly know the differences between Ruby 1.8.x and Ruby 1.9.2.
> Can I join this project?

Yes.
You will need to study the differences during the project, but not now.
I've modified the "needed skills" in the project wiki page
to clarify the point.
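
As one concrete illustration of the kind of difference met while porting,
string indexing and line iteration changed between Ruby 1.8 and 1.9. The
sketch below is not taken from BioRuby; first_residue and count_lines are
hypothetical helper names, chosen to show idioms that behave the same on
both versions.

```ruby
# Return the first residue of a sequence as a one-character String.
# seq[0, 1] behaves the same on Ruby 1.8 and 1.9, whereas seq[0]
# returned an Integer byte value on 1.8 and a String on 1.9.
def first_residue(seq)
  seq[0, 1]
end

# Count lines portably. String#each was removed in Ruby 1.9,
# but String#each_line is available on both 1.8 and 1.9.
def count_lines(text)
  n = 0
  text.each_line { n += 1 }
  n
end

puts first_residue("ACGT")
puts count_lines(">seq1\nACGT\n")
```

Running the existing unit tests under both interpreters is probably the
quickest way to find the places where such idioms matter.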

> The second is how many programs in BioRuby run on Ruby 1.9.2.
> Could you tell me whether you already know this (and how to find out)?

I don't know much. Some programs worked, but some didn't.

> The third is about the implementation of the unit tests.
> Does this mean that the participant needs to implement unit tests for
> all code that does not have them yet?

Yes or no; it depends on the plan. One idea is to implement tests for
almost everything with rough coding first, and to improve them after that.
I also think that classes and modules that strongly depend
on an external program or web service can be skipped.



Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org


From ngoto at gen-info.osaka-u.ac.jp  Wed Mar 24 10:22:23 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 24 Mar 2010 23:22:23 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
 Code is *ON* for OBF projects!
In-Reply-To: <320fb6e01003230620l58717628t4d12f67411805c48@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
	<320fb6e01003230620l58717628t4d12f67411805c48@mail.gmail.com>
Message-ID: <20100324142225.08B501CBC3D0@idnmail.gen-info.osaka-u.ac.jp>

Hi,

The objective of the project is software development.
I think it is OK to use Japanese for communicating with
Japanese-speaking mentors. Using the bioruby-ja mailing
list for discussion seems good.

Students still need to write the application form in English,
as required by Google.  It would be great if someone could help
with English proofreading for ESL students.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Tue, 23 Mar 2010 13:20:57 +0000
Peter <biopython at maubp.freeserve.co.uk> wrote:

> On Tue, Mar 23, 2010 at 12:20 PM, Kazuhiro Hayashi
> <k.hayashi.info at gmail.com> wrote:
> > Hi, all
> >
> > My name is Kazuhiro Hayashi.
> > I'm a 1st-year master's degree student at the Department of Computational
> > Biology, Graduate School of Frontier Sciences, the University of
> > Tokyo.
> >
> > The reason why I sent this mail is to ask you some questions about
> > Google Summer of Code 2010.
> >
> > ...
> >
> > Thank you very much for reading my broken English. :-)
> 
> Hello Hayashi-san,
> 
> I don't know if the BioRuby team have any preference for which
> language the Google Summer of Code projects will be discussed
> in (English and/or Japanese). It will probably depend on the mentors.
> 
> However, there is also a Japanese BioRuby mailing list:
> http://lists.open-bio.org/mailman/listinfo/bioruby-ja
> 
> Peter
> (@Biopython)
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From k.hayashi.info at gmail.com  Wed Mar 24 10:35:21 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Wed, 24 Mar 2010 23:35:21 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com> 
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>

Hi.

Thank you for your replies.

I'd like to communicate with you on this mailing list (and I will
write e-mails in English as much as possible). :-)
However, if I should do it somewhere else, I will do so.
I'm not sure where the best place to talk about GSoC 2010 is.
Anyway, I appreciate your advice.


By the way, I have one more question.
Could you tell me how concretely I have to write the proposal?
Do I have to write how I will implement the programs and when I will write each?

Best regards

Kazuhiro

-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From biopython at maubp.freeserve.co.uk  Wed Mar 24 10:51:46 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 24 Mar 2010 14:51:46 +0000
Subject: [BioRuby] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised
	E-utility Usage Policy
In-Reply-To: <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu>
References: <A9D8BF3D8A74DF4A925FB541C0F39D2A220D32B4@NIHMLBX15.nih.gov>
	<320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com>
	<38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu>
Message-ID: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com>

On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> On Mar 24, 2010, at 9:08 AM, Peter wrote:
>
>> Hi,
>>
>> This is probably of interest to all the Bio* projects offering access
>> to the NCBI Entrez utilities. See forwarded message below.
>>
>> I *think* the new guidelines basically say that the email & tool parameters are
>> optional BUT if your IP address ever gets banned for excessive use you then
>> have to register an email & tool combination.
>>
>> Regarding the email address, the NCBI say to use the email of the developer
>> (not the end user). However, they do not distinguish between the developers
>> of a library (like us), and the developers of an application or script using a
>> library (who may also be the end user).
>>
>> Currently we (Biopython) and I think BioPerl ask developers using our libraries
>> to populate the email address themselves. I *think* this is still the
>> right action.
>>
>> Peter
>
>
> Basically, that's the same tactic I'm using with Bio::DB::EUtilities (and I
> think with the SOAP-based ones as well). We're providing a specific set of
> tools for users to write their own end applications. I can try
> contacting them regarding this to get an official response to clarify this
> somewhat.

Please give the NCBI an email - you can CC me too if you like.

> Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a
> default, but always leave the email blank and issue a warning if it isn't
> set. We could just as easily leave both blank and issue warnings for both.

We currently leave out the email and set the tool parameter to "Biopython"
by default but this can be overridden. Currently leaving out the email does
cause Biopython to give a warning.

Peter
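
The convention discussed above (a library-level default for the tool
parameter, an email supplied by the developer of the calling script, and
a warning when the email is missing) can be sketched in Ruby as follows.
This is only an illustration: eutils_url, its keyword arguments and the
"MyTool" default are hypothetical names and are not part of BioRuby,
BioPerl or Biopython. The tool and email query parameters and the efetch
endpoint are the ones NCBI documents for the E-utilities.

```ruby
require "uri"

# Base URL of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Build an E-utilities request URL (hypothetical helper).
# The tool name has a library default that callers may override;
# the email is left to the developer of the calling script, and a
# warning is printed to stderr when it is missing.
def eutils_url(endpoint, params, tool: "MyTool", email: nil)
  missing_email = email.nil? || email.empty?
  warn "eutils_url: no email set; NCBI asks developers to supply one" if missing_email

  query = params.merge("tool" => tool)
  query["email"] = email unless missing_email
  "#{EUTILS_BASE}/#{endpoint}.fcgi?#{URI.encode_www_form(query)}"
end

puts eutils_url("efetch", { "db" => "pubmed", "id" => "16381885" },
                email: "dev@example.org")
```

Leaving the email unset by default keeps the library developers' address
out of requests made by other people's scripts.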


From hlapp at drycafe.net  Wed Mar 24 11:27:37 2010
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 24 Mar 2010 11:27:37 -0400
Subject: [BioRuby] [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce]
	NCBI Revised E-utility Usage Policy
In-Reply-To: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com>
References: <A9D8BF3D8A74DF4A925FB541C0F39D2A220D32B4@NIHMLBX15.nih.gov>
	<320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com>
	<38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu>
	<320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com>
Message-ID: <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net>


On Mar 24, 2010, at 10:51 AM, Peter wrote:

> Please give the NCBI an email - you can CC me too if you like.


Can't this be the developers' mailing list (or lists, the appropriate  
one for each toolkit)? We can even whitelist all NCBI sender addresses  
so they can easily email us if there are issues.

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From k.hayashi.info at gmail.com  Thu Mar 25 13:31:07 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Fri, 26 Mar 2010 02:31:07 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com> 
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>
Message-ID: <b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com>

Hi,

Thank you for your replies.

I'd like to communicate with you on this mailing list (and I will
write e-mails in English as much as possible). :-)
However, if I should do it somewhere else, I will do so.
I'm not sure where the best place to talk about GSoC 2010 is.
Anyway, I appreciate your advice.


By the way, I have one more question.
Could you tell me how concretely I have to write the proposal?
Do I have to write how I will implement the programs and when I will write each?

Best regards

Kazuhiro

(I'm sorry if you have already received the same mail. I sent it
yesterday, but I haven't received a copy from the list yet.)

-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From czmasek at burnham.org  Thu Mar 25 20:39:42 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Thu, 25 Mar 2010 17:39:42 -0700
Subject: [BioRuby] HMMER 3 parsers?
In-Reply-To: <4B9917D1.9000702@molbio.su.se>
References: <4B96A5FB.9060607@molbio.su.se>
	<20100311141250.789AA1CBC58F@idnmail.gen-info.osaka-u.ac.jp>
	<4B9917D1.9000702@molbio.su.se>
Message-ID: <4BAC024E.6000009@burnham.org>

Hi, Daniel:

Sorry for the late reply; for some reason my email reader has suddenly
started sorting messages wrongly.

In any case, the parser for hmmer3 hmmscan and hmmsearch is basically 
finished.

So, if I could somehow get access to your test cases, that would be great!

Thank you!

Christian


Daniel Lundin wrote:
> Naohisa GOTO skrev:
>> Hi,
>>
>> Christian Zmasek is now working on HMMER 3 support.
>> It would be great if you could help us.
>>
> Certainly. Since my alternative is writing a parser for myself, I might 
> as well put in my effort for the common good.
> 
> Christian, is there anything in particular I could help with? I have 
> started collecting some test cases for my own needs.
> 
> /Daniel
> 
>> http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer/hmmer3report.rb
>> http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer3.rb
>>
>> Naohisa Goto
>> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>>
>> On Tue, 09 Mar 2010 20:48:11 +0100
>> Daniel Lundin <daniel.lundin at molbio.su.se> wrote:
>>
>>> Hi,
>>>
>>> HMMER 3 is currently available as a first release candidate. With it 
>>> comes several news both in the form of new tools and new kinds of data, 
>>> which means output formats are changed. Is anybody working on BioRuby 
>>> parsers for these?
>>>
>>> /D
>>>
>>> -- 
>>> Daniel Lundin
>>>
>>> Department of Molecular Biology & Functional Genomics
>>> Arrhenius Laboratories for Natural Sciences
>>> Stockholm University, SE-106 91 Stockholm, Sweden
>>>
>>> tel. +46 (0)8 16 41 95, mobile: +46 (0)708 123 922, fax. +46 (0)8 16 64 88
>>>
>>> Email: daniel.lundin at molbio.su.se
>>> _______________________________________________
>>> BioRuby Project - http://www.bioruby.org/
>>> BioRuby mailing list
>>> BioRuby at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioruby
> 
> 


From ngoto at gen-info.osaka-u.ac.jp  Fri Mar 26 08:43:38 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Fri, 26 Mar 2010 21:43:38 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
 Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>
	<b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com>
Message-ID: <20100326124339.A02641CBC50D@idnmail.gen-info.osaka-u.ac.jp>

Hi,

It is generally good to write many specific details.
However, the most important thing now is whether the proposal
is accepted by Google.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org



From k.hayashi.info at gmail.com  Fri Mar 26 11:21:41 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Sat, 27 Mar 2010 00:21:41 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <20100326124339.A02641CBC50D@idnmail.gen-info.osaka-u.ac.jp>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com> 
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com> 
	<b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com> 
	<20100326124339.A02641CBC50D@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <b51ee1fd1003260821y4adac538t79d14bc75a8bbafa@mail.gmail.com>

Hi Goto-san,

> It is generally good to write many specific details.
> However, the most important thing now is whether the proposal
> is accepted by Google.

Is it possible to show you a draft of my proposal?
I'd like you to proofread my proposal before the deadline for application.

Best regards

Kazuhiro

On 26 March 2010 at 21:43, Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp> wrote:
> Hi,
>
> It is generally good to write many specific details.
> However, the most important thing now is whether the proposal
> is accepted by Google.
>
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>
> On Fri, 26 Mar 2010 02:31:07 +0900
> Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:
>
>> Hi,
>>
>> Thank you for your replies.
>>
>> I'd like to communicate with you on this mailing list (and I will
>> write e-mails in English as much as possible ). :- )
>> However, If I should do it on somewhere else, I will do so.
>> I'm not sure where is the best place to talk about GSoC 2010.
>> Anyway, I appreciate your advice.
>>
>>
>> By the way, I have one more question.
>> Could you tell me how much I have to write the proposal concretely?
>> I have to write how to implement the programs and when I write each?
>>
>> Best regards
>>
>> Kazuhiro
>>
>> ( I'm sorry if you have already received the same mail. I sent it
>> yesterday, but I haven't received yet....)
>>
>> --
>> Kazuhiro Hayashi
>> Department of Computational Biology,  The University of Tokyo
>> email: k_hayashi at cb.k.u-tokyo.ac.jp
>> tel: 04-7136-3988
>
>


-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From czmasek at burnham.org  Fri Mar 26 14:26:54 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Fri, 26 Mar 2010 11:26:54 -0700
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
 Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003260821y4adac538t79d14bc75a8bbafa@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>
	<b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com>
	<20100326124339.A02641CBC50D@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003260821y4adac538t79d14bc75a8bbafa@mail.gmail.com>
Message-ID: <4BACFC6E.4010303@burnham.org>

Hi,

Re. "Is it possible to show you a draft of my proposal?"

I think this is not only possible, it is highly recommended.
In my experience, a detailed, well-written, and realistic proposal is 
very important.

Remember, not all projects will get accepted (OBF currently has 14 
projects; I would be very surprised if more than half were accepted 
in the end). The better a student's proposal, the more likely it is that 
the project will get accepted.


Christian


Kazuhiro Hayashi wrote:
> Hi Goto-san,
> 
>> It is generally good to write many specific details.
>> However, the most important thing now is whether the proposal
>> is accepted by Google.
> 
> Is it possible to show you a draft of my proposal?
> I'd like you to proofread my proposal before the deadline for application.
> 
> Best regards
> 
> Kazuhiro
> 
> On 26 March 2010 at 21:43, Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp> wrote:
>> Hi,
>>
>> It is generally good to write many specific details.
>> However, the most important thing now is whether the proposal
>> is accepted by Google.
>>
>> Naohisa Goto
>> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>>
>> On Fri, 26 Mar 2010 02:31:07 +0900
>> Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Thank you for your replies.
>>>
>>> I'd like to communicate with you on this mailing list (and I will
>>> write e-mails in English as much as possible ). :- )
>>> However, If I should do it on somewhere else, I will do so.
>>> I'm not sure where is the best place to talk about GSoC 2010.
>>> Anyway, I appreciate your advice.
>>>
>>>
>>> By the way, I have one more question.
>>> Could you tell me how much I have to write the proposal concretely?
>>> I have to write how to implement the programs and when I write each?
>>>
>>> Best regards
>>>
>>> Kazuhiro
>>>
>>> ( I'm sorry if you have already received the same mail. I sent it
>>> yesterday, but I haven't received yet....)
>>>
>>> --
>>> Kazuhiro Hayashi
>>> Department of Computational Biology,  The University of Tokyo
>>> email: k_hayashi at cb.k.u-tokyo.ac.jp
>>> tel: 04-7136-3988
>>
> 
> 
> 


From sararayburn at gmail.com  Sat Mar 27 16:13:01 2010
From: sararayburn at gmail.com (Sara Rayburn)
Date: Sat, 27 Mar 2010 15:13:01 -0500
Subject: [BioRuby] GSOC 2010 preliminary proposal question
Message-ID: <C0EB196E-8786-482B-95BC-4F5064F8E77B@gmail.com>

Hello all. My name is Sara Rayburn. I'm a doctoral student at the University of Louisiana at Lafayette. I am planning to submit a proposal to implement the speciation/duplication inference algorithm this summer. I'd like to tackle both the implementation and the extension to non-binary trees. In reading the posted reference on reconciliation in non-binary trees, there are two types of duplications referenced, required and conditional duplications. In an implementation of this approach, would it be better to identify only required duplications and clear speciations, or should there be an additional distinction for the conditional duplications?

I hope to post a preliminary project plan and proposal for feedback in the next couple of days. Thanks in advance for your feedback.


Sara Rayburn
University of Louisiana at Lafayette
sararayburn at gmail.com


From czmasek at burnham.org  Mon Mar 29 19:32:12 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Mon, 29 Mar 2010 16:32:12 -0700
Subject: [BioRuby] Beta application for review: BioRuby - Simple
 duplication inference implementation
In-Reply-To: <C6EF422C-F22E-41D1-AF43-166C790502B8@gmail.com>
References: <C6EF422C-F22E-41D1-AF43-166C790502B8@gmail.com>
Message-ID: <4BB1387C.6090503@burnham.org>

Hi, Jure:

Your application seems to be on the right track.

In general, your time table needs to be more detailed.
For each step you should list:
1. Goal/deliverable (you have that)
2. Approach
3. Time estimation  (you have that)
4. Anticipated problems & possible alternative approaches


Some more comments:

> 
> *The idea:*
> 
> We would implement the simple and fast duplication inference algorithm 
> described by Zmasek and Eddy (Zmasek and Eddy, 2001, "A simple algorithm 
> to infer gene duplication and speciation events on a gene tree"). Finding 
> gene duplications is an extremely important part of bioinformatics and 
> biomedical research, as duplications are thought to be powerful drivers 
> in the evolution of new protein function. 

I think 'extremely important part of bioinformatics' is somewhat of an 
exaggeration and too vague. Better to write about how gene duplications 
complicate efforts at gene function prediction, and about their 
significance in (the theory of) molecular evolution.


> It is thus important to find 
> gene duplication sequences, which when translated are more likely to be 
> functionally different, and distinguish them from gene speciation 
> sequences, which are more likely functionally equivalent.

'gene duplication sequences' should be 'genes related by a duplication' 
or similar.
'gene speciation sequence' should be 'genes related by a speciation' or 
similar.


> Currently the algorithm supports rooted fully binary trees and we would 
> like to change that, by also implementing support for unrooted and 
> non-binary trees.

Goals are like this:
1. Implement algorithm as it is
2. Allow rooting of unrooted gene trees by minimizing sum of duplications.
Optional:
3. Extend algorithm to work on non-binary species trees
4. Extend algorithm to work on non-binary gene trees


> 
> *The work:*
> 
> There are several milestones to be reached in developing this idea and 
> this is the work plan I propose:
> 1. Development of unit tests with known species and gene trees (1 week).
> 
> 2. Making or reusing necessary data structures, made easier by last 
> year's GSoC contribution implementing phyloXML in BioRuby (1/2 weeks - 1 
> week):
> - gene tree,
> - species tree,
> - tree node,
> - children(),
> - parent().
> 
> 3. Developing checks for the correctness of input data for rooted fully 
> binary trees SDI (1/2 weeks - 1 week):
> - making sure trees are rooted and binary,
> - all species/gene tree nodes have at least one type of taxonomic data,
> - making a taxonomy base from a type of data present in all nodes 
> (scientific or common name, taxonomy code, id),
> - making sure taxonomic data is unique throughout external nodes.
> 4. Implementation of the recursive M function (1 week)
> - traverse the gene tree in postorder (left subtree, right subtree, root),
> - finding occurrences where M(parent) equals M(child 1 or 2) - this is 
> representative for finding a duplication. If M(parent) matches neither, 
> the processed node is a speciation.
> 
> 5. Milestone - finished implementation of SDI for rooted fully binary 
> trees (1/2 week):
> - Extensive testing,
> - cleaning up.
> 
> 6. Working on unrooted non-binary trees implementation (4-8 weeks):
> - Look to the forester java library SDI module for insight (by the 
> mentor of this project, Zmasek),
> - Doing some heavy lifting,
> - at this point I consider this implementation a possible pitfall, 
> because of substantially increased complexity.

This needs to be much more detailed.
Species trees are always rooted.
Unrooted gene trees can be handled naively by rooting them in all 
possible places, and running the SDI algorithm on each differently 
rooted tree, and keeping the gene tree which has the lowest number of 
duplications.
A more efficient approach for this is described in:
Zmasek and Eddy (2002). RIO: analyzing proteomes by automated 
phylogenomics using resampled inference of orthologs. BMC 
Bioinformatics. 2002 May 16;3:14.
See: 
http://evogsoc2010.wordpress.com/2010/03/25/references-for-gene-duplications-proposal/
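
For orientation, here is a minimal Ruby sketch of the mapping step from
item 4 of the quoted plan, for rooted, fully binary trees only. It is an
illustration under stated assumptions, not the forester code and not a
BioRuby API: the Node class and the methods number_preorder,
species_leaves and sdi are hypothetical names, and species are matched
by plain name strings.

```ruby
# Hypothetical minimal tree node; not the BioRuby tree classes.
class Node
  attr_reader :name, :children
  attr_accessor :parent, :order, :mapped, :duplication

  def initialize(name, *children)
    @name = name
    @children = children
    children.each { |child| child.parent = self }
  end

  def leaf?
    children.empty?
  end
end

# Number species-tree nodes in preorder, so that every node has a
# larger number than its ancestors.
def number_preorder(node, counter = [0])
  node.order = (counter[0] += 1)
  node.children.each { |child| number_preorder(child, counter) }
end

# Collect the external nodes of the species tree, keyed by species name.
def species_leaves(node, acc = {})
  if node.leaf?
    acc[node.name] = node
  else
    node.children.each { |child| species_leaves(child, acc) }
  end
  acc
end

# Postorder traversal of the gene tree. Sets node.mapped (the M function)
# and node.duplication on internal nodes; returns the duplication count.
# species_of maps gene leaf names to species names.
def sdi(gene_node, leaves, species_of)
  if gene_node.leaf?
    gene_node.mapped = leaves.fetch(species_of.fetch(gene_node.name))
    return 0
  end
  count = gene_node.children.sum { |child| sdi(child, leaves, species_of) }
  a, b = gene_node.children.map(&:mapped)
  # Climb from the deeper (higher-numbered) node until both paths meet.
  until a.equal?(b)
    if a.order > b.order
      a = a.parent
    else
      b = b.parent
    end
  end
  gene_node.mapped = a
  # Duplication if M(parent) equals M(child 1) or M(child 2).
  gene_node.duplication = gene_node.children.any? { |c| c.mapped.equal?(a) }
  count + (gene_node.duplication ? 1 : 0)
end

# Species tree ((A,B),C); gene tree ((a1,b1),(a2,c1)).
species = Node.new('root', Node.new('AB', Node.new('A'), Node.new('B')), Node.new('C'))
number_preorder(species)
gene = Node.new('g', Node.new('g1', Node.new('a1'), Node.new('b1')),
                     Node.new('g2', Node.new('a2'), Node.new('c1')))
species_of = { 'a1' => 'A', 'b1' => 'B', 'a2' => 'A', 'c1' => 'C' }
puts sdi(gene, species_leaves(species), species_of) # prints 1
```

The naive handling of unrooted gene trees described above would call
sdi once per candidate rooting and keep the rooting with the smallest
returned count.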


> 
> 7. Finishing up (1 week):
> - Extensive testing,
> - cleaning up.
> 
> *Why me?:*
> 
> I like to set foot on unknown territory and challenge myself constantly. 
> That being said, I have long searched for something that would connect 
> my love of medicine to my love of programming, and now, thanks to GSoC 
> and OBF, I think I found it - bioinformatics. I am at a stage of my 
> medical study, where I have to decide what my future will entail, and I 
> am (now, after thinking about it for a long time) positive that 
> bioinformatics will be a big part of it. What better way to get future 
> off to a good start, than with a Google Summer of Code project? Based on 
> this enthusiasm alone you can be assured that I'll work really hard on 
> this project and that I will be happy to see it done. As this would be 
> my first serious open source engagement, you also have a chance of 
> forming a completely new addition to the open source world and making an 
> excellent contributor out of me.
> 
> *Previous experience:*
> 
> 1. I have been working on a simulation of an analytical chemistry method 
> for the past 2 years now, more specifically we have modeled laser 
> ablation + inductively coupled plasma mass spectrometry with a simple 
> model, which aids our elemental mapping projects. For the write-up of 
> this project I have been awarded with a "Prešernovo priznanje" in 2008 
> (PDF upon request). This work entails several interesting components, 
> from basics such as: C# development, image input, output, multi-threaded 
> programming, UI development; to complex themes such as: genetic 
> algorithms and neural networks. All of which I learned as we worked on 
> the project without much hassle (source code upon request). This work is 
> not yet open source, because we are in the finalizing stages of the 
> paper and will release the source code after publication under an open 
> source license.  
> 
> 2. I have programmed since I was a child and I have developed a wide 
> specter of things in my lifetime (from a full CMS in PHP to an IRC 
> robot, source code upon request), but I have little experience in fully 
> open source projects, which I think so highly of.
> 
> *Biography:*
> 
> My name is Jure Triglav and I'm a 24 year old medical student from 
> Ljubljana, Slovenia. I was born in a small town of Murska Sobota in 
> Slovenia, where I went to grade school (graded excellent for all years, 
> awarded "Zoisova štipendija" for the gifted, which I still hold) and 
> high-school (excellent, finished as "Zlati maturant" in the company of 
> about 200 best students in the country). I moved to Ljubljana in 2004 to 
> study medicine. I am now in the last year of my medical study which I 
> find challenging and very interesting. 
> My hobbies are all over the place, from book design to photography, from 
> web design to typography, from guitar to poetry, from reading to 
> programming, from traveling to sports. 
> 
>   
> 
> *Other obligations for the summer:*
> 
> I have 5-hour daily clinical practice every weekday in June, July and 
> August, which is not nearly as serious as it sounds, especially since 
> this is the summer rotation which is known for its laid back feel. These 
> practice start at 8 am and finish at 1 pm, and for students are not 
> really stressful or exhausting at all. I have in the past juggled many 
> research obligations with clinical practice and my studies without 
> hiccups, but I will not do this this summer and will dedicate 8 hours 
> daily to Google Summer of Code, as I realize what a great opportunity 
> this is and how much work is required. I have no other work, research or 
> vacation obligations for the period of Google Summer of Code.

Nevertheless, this sounds like a serious concern.

> 
> *Contact information: *
> 
> (I will provide additional contact information in the final application)
> Name: Jure Triglav
> E-mail: juretriglav at gmail.com <mailto:juretriglav at gmail.com>
> IRC handle: x` on #obf-soc, #gsoc
> 


From czmasek at burnham.org  Mon Mar 29 19:39:29 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Mon, 29 Mar 2010 16:39:29 -0700
Subject: [BioRuby] Google summer of code 2010 - Stathis Kamperis
In-Reply-To: <2218b9af1003290119q1c6b2eeclc3c84ffdbaa97b2a@mail.gmail.com>
References: <2218b9af1003290119q1c6b2eeclc3c84ffdbaa97b2a@mail.gmail.com>
Message-ID: <4BB13A31.8020203@burnham.org>

Hi, Stathis:

Thank you for your interest in this proposal!

Stathis Kamperis wrote:
> Dear Dr. Zmasek,
> 
> my name is Stathis Kamperis and I'm interested in this year's Google
> Summer of Code project:
> "Implementation of algorithm to infer gene duplications in BioRuby".
> 
> I am a medicine graduate, physics undergraduate and computer
> enthusiast. I come from Greece and I am 26 years old.
> I have a long standing programming experience with a vast range of
> programming languages including, since recently, Ruby.
> I also have a decent molecular/biology background.
> 
> I successfully participated in last years Google Summer of Code
> working for the DragonFlyBSD[1] organisation. My work had to do with
> POSIX standard conformance audit, regression testing and quality
> assurance.
> 
> As I understand, the project is about implementing your algorithm to
> BioRuby. Is there any prototype implemented in any language/framework
> at the moment ?

Yes, there is:
See: 
http://forester-atv.cvs.sourceforge.net/viewvc/forester-atv/forester-atv/java/src/org/forester/sdi/

In particular, see SDI.java and SDIR.java (for unrooted trees).


> In your abstract you mention:
> "We show empirically, using 1750 gene trees constructed from the Pfam
> protein family database, that it appears to be a practical (and often
> superior) algorithm for analyzing real gene trees."
> So, I wonder, what does 'empirically' mean here or how did you conduct
> your tests ?

Essentially, my Java implementation was used to run these tests.

Hope this helps,

Christian


From czmasek at burnham.org  Mon Mar 29 20:01:10 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Mon, 29 Mar 2010 17:01:10 -0700
Subject: [BioRuby] GSOC 2010 preliminary proposal question
In-Reply-To: <C0EB196E-8786-482B-95BC-4F5064F8E77B@gmail.com>
References: <C0EB196E-8786-482B-95BC-4F5064F8E77B@gmail.com>
Message-ID: <4BB13F46.7010607@burnham.org>

Hi, Sara:

Thank you for your interest in this proposal!

I think focusing on 'required' duplications is appropriate, since 
non-binary species trees are often used to express uncertainty 
in the "tree-of-life", and thus to avoid introducing spurious 
duplications caused by that uncertainty.

Christian


Sara Rayburn wrote:
> Hello all. My name is Sara Rayburn. I'm a doctoral student at the University of Louisiana at Lafayette. I am planning to submit a proposal to implement the speciation/duplication inference algorithm this summer. I'd like to tackle both the implementation and the extension to non-binary trees. In reading the posted reference on reconciliation in non-binary trees, there are two types of duplications referenced, required and conditional duplications. In an implementation of this approach, would it be better to identify only required duplications and clear speciations, or should there be an additional distinction for the conditional duplications?
> 
> I hope to post a preliminary project plan and proposal for feedback in the next couple of days. Thanks in advance for your feedback.
> 
> 
> 
> Sara Rayburn
> University of Louisiana at Lafayette
> sararayburn at gmail.com
> 
> 
> 
> 
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby


From donttrustben at gmail.com  Wed Mar 31 20:33:27 2010
From: donttrustben at gmail.com (Ben Woodcroft)
Date: Thu, 1 Apr 2010 11:33:27 +1100
Subject: [BioRuby] FlatFile GFF
Message-ID: <q2wbb2b67d01003311733jf8cb9aa8q173c827b3d9ac56a@mail.gmail.com>

Hi,

I have a conceptual question for the list. When I open a gff2 file using
Bio::FlatFile, the next_entry method gives me all of the lines at once (in
the form of a Bio::GFF::GFF2 object).

require 'bio'
f = Bio::FlatFile.open(Bio::GFF::GFF2, "some.gff2")  # => Bio::FlatFile
g = f.next_entry                                      # => Bio::GFF::GFF2 object
g.records                                             # => array of GFF2 records

To me, this seems a little counter-intuitive. I expected to get info for a
single line of the GFF file from FlatFile#next_entry.

The other problem is that the whole file must be parsed at the beginning,
and this can cause memory problems when using large GFF files (e.g. the
current WormBase gff2 is 2.6GB).

To get around the problem I can use File.foreach('some.gff2') and then parse
each line using Bio::GFF::GFF2. I'm not sure what the situation is with
other file formats.

So, my question is: could we introduce a foreach method into FlatFile that
iterates over the GFF/etc entries in the file without parsing them all at
once, so that it is light on memory? Ideally we could change next_entry,
but I don't think that would be backwards compatible.
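
To make the idea concrete, here is a minimal sketch of line-at-a-time
iteration in plain Ruby. It is only an illustration: GffLine and
each_gff_line are hypothetical names, not part of Bio::FlatFile or
Bio::GFF, and the parsing is a bare tab split of the nine GFF2 columns
with no validation beyond integer coordinates.

```ruby
require 'stringio'

# Hypothetical record type for one GFF2 line; not Bio::GFF::GFF2::Record.
GffLine = Struct.new(:seqname, :source, :feature, :start, :stop,
                     :score, :strand, :frame, :attributes)

# Yields one GffLine per data line, reading the IO lazily so that only
# the current line is held in memory. Returns an Enumerator without a block.
def each_gff_line(io)
  return enum_for(:each_gff_line, io) unless block_given?

  io.each_line do |line|
    line = line.chomp
    next if line.empty? || line.start_with?('#') # skip blanks and comments

    fields = line.split("\t", 9)
    fields[3] = Integer(fields[3]) # start coordinate
    fields[4] = Integer(fields[4]) # end coordinate
    yield GffLine.new(*fields)
  end
end

# Demo on an in-memory string; a real file would be passed the same way,
# e.g. File.open('some.gff2') { |f| each_gff_line(f) { |rec| ... } }.
sample = StringIO.new("##gff-version 2\nchrI\tcurated\texon\t100\t200\t.\t+\t.\tGene \"abc\"\n")
each_gff_line(sample) { |rec| puts "#{rec.feature} #{rec.start}..#{rec.stop}" }
```

Because the method takes any IO and yields as it goes, memory use stays
flat regardless of file size, which is the property asked for above.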

Thanks,
ben

-- 
FYI: My email addresses at unimelb, uq and gmail all redirect to the same
place.

From rutgeraldo at gmail.com  Wed Mar 10 13:22:48 2010
From: rutgeraldo at gmail.com (Rutger Vos)
Date: Wed, 10 Mar 2010 13:22:48 +0000
Subject: [BioRuby] RDF Triples in BioRuby, a funding proposal to Google SoC
Message-ID: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>

Dear BioRuby-ites,

my apologies that my first email to this list is so long and
tangential. I am trying to find out how to express RDF triples in
BioRuby. In this email I'm explaining why I care enough to try to get
funding for someone to work on this. If you don't care about any of
this, you can stop reading now.

The National Evolutionary Synthesis Center (NESCent.org) is planning
to be a mentoring organization for the Google Summer of Code 2010. I
have submitted a project idea to this: to develop NeXML I/O and -
probably more importantly for you - RDF capabilities for BioRuby. If
funded, a student/coder will work on this full time over the summer,
under the shared supervision of Jan Aerts and myself. Here is the
link: http://tinyurl.com/biorubynexml

NeXML is a data format for phylogenetic data that can be read and
written in perl, python, java and (to some extent) c++ and javascript.
RDF is the cool "new" thing (as per BioHackathon2010), but as far as I
can tell BioRuby isn't completely up to speed for it, yet.

(As an aside: you might ask yourself why there is something like NeXML
when there is PhyloXML for BioRuby. The answer is that NeXML solves a
different problem: PhyloXML started essentially as a next generation
of New Hampshire eXtended (NHX) to meet the annotation needs of
comparative genomics, things such as gene duplications and other
molecular evolution events, on phylogenetic trees; NeXML started as a
complete XML representation of the NEXUS format, providing other
comparative data types such as categorical and continuous character
state matrices, restriction site matrices, and so on, in addition to
trees, taxa, sequence alignments. There is obviously some overlap
between the formats, but I guess that is not unique in bioinformatics
:))

NeXML has a semantic annotation facility that uses RDFa. This allows
us to add additional metadata to a fundamental phylogenetic data
object (a tree, taxon, character, etc.) to form a "triple": the
fundamental data object is the triple Subject, and the Predicate and
Object are added as RDFa attributes. Since NeXML can be transformed
using a standard XSL stylesheet to RDF/XML, we can express a limitless
number of statements about phylogenetics. However, this means that any
NeXML I/O library needs to be able to represent RDF triples. I have
studied the BioRuby API as best as I could (but: I don't know ruby)
and couldn't identify how to do this.

My questions to you:

* is there a way to express triples in BioRuby?
* if there is not, what would be a good design to express triples in
BioRuby so that this would be more useful than just for NeXML?

Thank you!

Rutger

-- 
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com


From ktym at hgc.jp  Wed Mar 10 14:21:15 2010
From: ktym at hgc.jp (Toshiaki Katayama)
Date: Wed, 10 Mar 2010 23:21:15 +0900
Subject: [BioRuby] RDF Triples in BioRuby,
	a funding proposal to Google SoC
In-Reply-To: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>
References: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>
Message-ID: <9081A9B5-611C-45C2-A099-44BAF1E524F4@hgc.jp>

Hi Rutger,

Thank you for your inputs on GSoC 2010!

> * is there a way to express triples in BioRuby?
> * if there is not, what would be a good design to express triples in
> BioRuby so that this would be more useful than just for NeXML?

This is what we discussed during the pre-BioHackathon 2010.

http://hackathon3.dbcls.jp/wiki/BioRuby

My first idea was to give all BioRuby objects a common output
method that renders the object contents in various formats
(such as RDF/XML, Turtle, HTML, GFF, FASTA etc., where appropriate).

Then we tried to separate view from logic using erb, but as you
can see on the above page, it still looks ugly. This is mainly because
view formatting itself requires some additional code specific
to each format.
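
As a rough illustration of the "common output method" idea, one possible
shape is a mixin that looks up a per-format ERB template. Everything
here is hypothetical (the Renderable module, the TEMPLATES constant, the
Entry class, its fields and the example.org URI); it is not existing
BioRuby code, and it uses only the standard erb library.

```ruby
require 'erb'

# Hypothetical mixin: any class that defines a TEMPLATES hash gains #output.
module Renderable
  def output(format)
    template = self.class::TEMPLATES.fetch(format) {
      raise ArgumentError, "unsupported format: #{format}"
    }
    # The template is evaluated in the object's own context via its binding.
    ERB.new(template).result(binding)
  end
end

# Hypothetical entry class with one template per output format.
class Entry
  include Renderable

  TEMPLATES = {
    fasta:  "><%= entry_id %> <%= definition %>\n<%= sequence %>\n",
    turtle: "<%= subject %> <http://purl.org/dc/elements/1.1/title> " \
            "\"<%= definition %>\" .\n"
  }.freeze

  attr_reader :entry_id, :definition, :sequence

  def initialize(entry_id, definition, sequence)
    @entry_id = entry_id
    @definition = definition
    @sequence = sequence
  end

  # Placeholder subject URI, for illustration only.
  def subject
    "<http://example.org/entry/#{entry_id}>"
  end
end

entry = Entry.new('X1', 'test entry', 'ACGT')
puts entry.output(:fasta)
puts entry.output(:turtle)
```

Even in this small form, the Turtle template would still need
format-specific escaping of the literal, which is the kind of extra
per-format code mentioned above.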

Therefore, we don't have a solid conclusion on this yet, unfortunately.

Anyway, we already have a PubMed-to-RDF converter written in Ruby as part
of the TogoWS REST API (http://togows.dbcls.jp/site/en/rest.html) at

http://togows.dbcls.jp/entry/pubmed/16381885
--> http://togows.dbcls.jp/entry/pubmed/16381885.ttl

and we are also trying to support KEGG-to-RDF conversion in this
framework. I think we can put the code in BioRuby when we have finished.

Your suggestions are welcome. :)

Regards,
Toshiaki

On 2010/03/10, at 22:22, Rutger Vos wrote:

> Dear BioRuby-ites,
> 
> my apologies that my first email to this list is so long and
> tangential. I am trying to find out how to express RDF triples in
> BioRuby. In this email I'm explaining why I care enough to try to get
> funding for someone to work on this. If you don't care about any of
> this, you can stop reading now.
> 
> The National Evolutionary Synthesis Center (NESCent.org) is planning
> to be a mentoring organization for the Google Summer of Code 2010. I
> have submitted a project idea to this: to develop NeXML I/O and -
> probably more importantly for you - RDF capabilities for BioRuby. If
> funded, a student/coder will work on this full time over the summer,
> under the shared supervision of Jan Aerts and myself. Here is the
> link: http://tinyurl.com/biorubynexml
> 
> NeXML is a data format for phylogenetic data that can be read and
> written in perl, python, java and (to some extent) c++ and javascript.
> RDF is the cool "new" thing (as per BioHackathon2010), but as far as I
> can tell BioRuby isn't completely up to speed for it, yet.
> 
> (As an aside: you might ask yourself why there is something like NeXML
> when there is PhyloXML for BioRuby. The answer is that NeXML solves a
> different problem: PhyloXML started essentially as a next generation
> of New Hampshire eXtended (NHX) to meet the annotation needs of
> comparative genomics, things such as gene duplications and other
> molecular evolution events, on phylogenetic trees; NeXML started as a
> complete XML representation of the NEXUS format, providing other
> comparative data types such as categorical and continuous character
> state matrices, restriction site matrices, and so on, in addition to
> trees, taxa, sequence alignments. There is obviously some overlap
> between the formats, but I guess that is not unique in bioinformatics
> :))
> 
> NeXML has a semantic annotation facility that uses RDFa. This allows
> us to add additional metadata to a fundamental phylogenetic data
> object (a tree, taxon, character, etc.) to form a "triple": the
> fundamental data object is the triple Subject, and the Predicate and
> Object are added as RDFa attributes. Since NeXML can be transformed
> using a standard XSL stylesheet to RDF/XML, we can express a limitless
> number of statements about phylogenetics. However, this means that any
> NeXML I/O library needs to be able to represent RDF triples. I have
> studied the BioRuby API as best as I could (but: I don't know ruby)
> and couldn't identify how to do this.
> 
> My questions to you:
> 
> * is there a way to express triples in BioRuby?
> * if there is not, what would be a good design to express triples in
> BioRuby so that this would be more useful than just for NeXML?
> 
> Thank you!
> 
> Rutger
> 
> -- 
> Dr. Rutger A. Vos
> School of Biological Sciences
> Philip Lyle Building, Level 4
> University of Reading
> Reading
> RG6 6BX
> United Kingdom
> Tel: +44 (0) 118 378 7535
> http://www.nexml.org
> http://rutgervos.blogspot.com


From rutgeraldo at gmail.com  Thu Mar 11 10:22:04 2010
From: rutgeraldo at gmail.com (Rutger Vos)
Date: Thu, 11 Mar 2010 10:22:04 +0000
Subject: [BioRuby] RDF Triples in BioRuby,
	a funding proposal to Google 	SoC
In-Reply-To: <9081A9B5-611C-45C2-A099-44BAF1E524F4@hgc.jp>
References: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>
	<9081A9B5-611C-45C2-A099-44BAF1E524F4@hgc.jp>
Message-ID: <2bb9b24a1003110222h4bd642adv31d1975c9edc0bba@mail.gmail.com>

Hi Toshiaki,

great to hear there's already been a lot of discussion over this.
(Well, I'd be surprised if there hadn't been :))

It looks to me like some fairly major bookkeeping would need to be
implemented high up in the inheritance tree if *all* bioruby objects
are to be serialized into RDF. It also would require all of bioruby to
be ontologized in one fell swoop.

It is perhaps more likely that subdomains are going to be ontologized
more or less independently from one another (as you mention,
references->RDF, or in my case phylogenetics->RDF) based implicitly on
intermediate data formats (pubmed records and nexml, respectively).

That is probably OK, we do things as needs arise.

But it would be handy if the API were at least general enough to be
extensible, so that we can make additional statements *about*
objects when we serialize them to RDF. For example, in your pubmed
turtle file, the subject is always
<http://togows.dbcls.jp/entry/ncbi-pubmed/16381885>. Is there a way
to programmatically add additional statements about
<http://togows.dbcls.jp/entry/ncbi-pubmed/16381885>?
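
To show the kind of API I have in mind, here is a minimal sketch in
plain Ruby. All of it is hypothetical (the IRI and Triple structs and
the Graph class are made-up names, not anything in BioRuby or TogoWS),
and the rdfs:comment statement is a placeholder rather than real data
about that record. Output is N-Triples, chosen only because it is the
simplest serialization to write by hand.

```ruby
# Hypothetical types for illustration; not part of BioRuby or TogoWS.
IRI    = Struct.new(:value)
Triple = Struct.new(:subject, :predicate, :object)

# A tiny in-memory collection of triples that callers can extend freely.
class Graph
  include Enumerable

  # Characters that must be escaped inside an N-Triples string literal.
  ESCAPES = { '"' => '\\"', '\\' => '\\\\', "\n" => '\\n' }.freeze

  def initialize
    @triples = []
  end

  # Adds one statement and returns the graph, so calls can be chained.
  def add(subject, predicate, object)
    @triples << Triple.new(subject, predicate, object)
    self
  end

  def each(&block)
    @triples.each(&block)
  end

  # All statements whose subject equals the given IRI.
  def about(subject)
    select { |t| t.subject == subject }
  end

  def to_ntriples
    map { |t|
      [t.subject, t.predicate, t.object].map { |x| term(x) }.join(' ') + " .\n"
    }.join
  end

  private

  # IRIs are written in angle brackets; anything else as a plain literal.
  def term(value)
    return "<#{value.value}>" if value.is_a?(IRI)

    escaped = value.to_s.gsub(/["\\\n]/) { |ch| ESCAPES[ch] }
    "\"#{escaped}\""
  end
end

record = IRI.new('http://togows.dbcls.jp/entry/ncbi-pubmed/16381885')
graph = Graph.new
graph.add(record, IRI.new('http://www.w3.org/2000/01/rdf-schema#comment'),
          'an extra statement')
puts graph.to_ntriples
```

The point of the design is that the serializer does not need to know in
advance which statements exist: any code holding the subject IRI can
add more before the graph is written out.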

Rutger

On Wed, Mar 10, 2010 at 2:21 PM, Toshiaki Katayama <ktym at hgc.jp> wrote:
> Hi Rutger,
>
> Thank you for your inputs on GSoC 2010!
>
>> * is there a way to express triples in BioRuby?
>> * if there is not, what would be a good design to express triples in
>> BioRuby so that this would be more useful than just for NeXML?
>
> This is what we discussed during the pre-BioHackathon 2010.
>
> http://hackathon3.dbcls.jp/wiki/BioRuby
>
> My first idea was to make all BioRuby object have common output
> method to render the object contents in various formats
> (such as RDF/XML, Turtle, HTML, GFF, FASTA etc. if appropriate).
>
> Then, we tried to separate view from logic using erb, but as you
> see in the above page, it still looks ugly. It is mainly because
> view formatting itself requires some additional codes, specific
> to each format.
>
> Therefore, we don't have a solid conclusion on this yet, unfortunately.
>
> Anyway, we already have PubMed to RDF converter written in Ruby as
> the TogoWS REST API (http://togows.dbcls.jp/site/en/rest.html) at
>
> http://togows.dbcls.jp/entry/pubmed/16381885
> --> http://togows.dbcls.jp/entry/pubmed/16381885.ttl
>
> and, we are also trying to support KEGG to RDF conversion in this
> framework as well. I think we can put the code in BioRuby when we finished.
>
> Your suggestions are welcome. :)
>
> Regards,
> Toshiaki
>
> On 2010/03/10, at 22:22, Rutger Vos wrote:
>
>> Dear BioRuby-ites,
>>
>> my apologies that my first email to this list is so long and
>> tangential. I am trying to find out how to express RDF triples in
>> BioRuby. In this email I'm explaining why I care enough to try to get
>> funding for someone to work on this. If you don't care about any of
>> this, you can stop reading now.
>>
>> The National Evolutionary Synthesis Center (NESCent.org) is planning
>> to be a mentoring organization for the Google Summer of Code 2010. I
>> have submitted a project idea to this: to develop NeXML I/O and -
>> probably more importantly for you - RDF capabilities for BioRuby. If
>> funded, a student/coder will work on this full time over the summer,
>> under the shared supervision of Jan Aerts and myself. Here is the
>> link: http://tinyurl.com/biorubynexml
>>
>> NeXML is a data format for phylogenetic data that can be read and
>> written in perl, python, java and (to some extent) c++ and javascript.
>> RDF is the cool "new" thing (as per BioHackathon2010), but as far as I
>> can tell BioRuby isn't completely up to speed for it, yet.
>>
>> (As an aside: you might ask yourself why there is something like NeXML
>> when there is PhyloXML for BioRuby. The answer is that NeXML solves a
>> different problem: PhyloXML started essentially as a next generation
>> of New Hampshire eXtended (NHX) to meet the annotation needs of
>> comparative genomics, things such as gene duplications and other
>> molecular evolution events, on phylogenetic trees; NeXML started as a
>> complete XML representation of the NEXUS format, providing other
>> comparative data types such as categorical and continuous character
>> state matrices, restriction site matrices, and so on, in addition to
>> trees, taxa, sequence alignments. There is obviously some overlap
>> between the formats, but I guess that is not unique in bioinformatics
>> :))
>>
>> NeXML has a semantic annotation facility that uses RDFa. This allows
>> us to add additional metadata to a fundamental phylogenetic data
>> object (a tree, taxon, character, etc.) to form a "triple": the
>> fundamental data object is the triple Subject, and the Predicate and
>> Object are added as RDFa attributes. Since NeXML can be transformed
>> using a standard XSL stylesheet to RDF/XML, we can express a limitless
>> number of statements about phylogenetics. However, this means that any
>> NeXML I/O library needs to be able to represent RDF triples. I have
>> studied the BioRuby API as best as I could (but: I don't know ruby)
>> and couldn't identify how to do this.
>>
>> My questions to you:
>>
>> * is there a way to express triples in BioRuby?
>> * if there is not, what would be a good design to express triples in
>> BioRuby so that this would be more useful than just for NeXML?
>>
>> Thank you!
>>
>> Rutger
>>
>> --
>> Dr. Rutger A. Vos
>> School of Biological Sciences
>> Philip Lyle Building, Level 4
>> University of Reading
>> Reading
>> RG6 6BX
>> United Kingdom
>> Tel: +44 (0) 118 378 7535
>> http://www.nexml.org
>> http://rutgervos.blogspot.com
>
>


-- 
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com


From bonnalraoul at ingm.it  Thu Mar 11 13:02:23 2010
From: bonnalraoul at ingm.it (Raoul Bonnal)
Date: Thu, 11 Mar 2010 14:02:23 +0100
Subject: [BioRuby] Ruby and Statistics
Message-ID: <2122bfdf-d902-4be1-aef2-95013cea31f6@ingm.it>

Hello Folks, 
I need to do statistical computations in Ruby, sometimes very basic operations like mean and standard deviation.
Which library do you suggest?
I don't want to use rsruby (R) for now, or to extend Array every time.

I found this: ruby-statsample, but I don't know if it is the best one.
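
For the very basic cases, something like the following sketch would be
enough without reopening Array. The module name SimpleStats is made up
for illustration and is not taken from any of the libraries mentioned
in this thread; the standard deviation here is the sample version
(dividing by n - 1), which is an assumption about what is wanted.

```ruby
# Hypothetical helper module; the name SimpleStats is made up.
module SimpleStats
  module_function

  # Arithmetic mean of a non-empty list of numbers.
  def mean(values)
    raise ArgumentError, 'values must not be empty' if values.empty?
    values.inject(0.0) { |sum, x| sum + x } / values.size
  end

  # Sample standard deviation (divides by n - 1); needs two or more values.
  def stddev(values)
    raise ArgumentError, 'need at least two values' if values.size < 2
    m = mean(values)
    sum_sq = values.inject(0.0) { |sum, x| sum + (x - m)**2 }
    Math.sqrt(sum_sq / (values.size - 1))
  end
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
puts SimpleStats.mean(data)
puts SimpleStats.stddev(data)
```

Because the functions take the values as an argument, they work on any
object that responds to empty?, size and inject, not only on Array.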

--
Raoul J.P. Bonnal
Life Science Informatics
Integrative Biology Program
Fondazione INGM
Via F. Sforza 28
20122 Milano, IT
phone: +39 0 200 662 326
fax: +39 0 200 662 346
http://www.ingm.it


From ngoto at gen-info.osaka-u.ac.jp  Thu Mar 11 13:53:02 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 11 Mar 2010 22:53:02 +0900
Subject: [BioRuby] Ruby and Statistics
In-Reply-To: <2122bfdf-d902-4be1-aef2-95013cea31f6@ingm.it>
References: <2122bfdf-d902-4be1-aef2-95013cea31f6@ingm.it>
Message-ID: <20100311135303.8C5201CBC41B@idnmail.gen-info.osaka-u.ac.jp>

Hi,

I found some modules, but I haven't used them.

math-statistics: http://www.notwork.org/~gotoken/ruby/p/statistics/

statarray: http://rubyforge.org/projects/statarray/

ruby-stats: http://pallas.telperion.info/ruby-stats/

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Thu, 11 Mar 2010 14:02:23 +0100
"Raoul Bonnal" <bonnalraoul at ingm.it> wrote:

> Hello Folks, 
> I need to do statistical computations in Ruby, some time very basic operations like mean and stdv
> Which library do you suggest ? 
> I don't want to use rsruby (R), for now. Er extend every time Array.
> 
> I found this: ruby-statsample but I don't know if is the best one.
> 
> --
> Raoul J.P. Bonnal
> Life Science Informatics
> Integrative Biology Program
> Fondazione INGM
> Via F. Sforza 28
> 20122 Milano, IT
> phone: +39 0 200 662 326
> fax: +39 0 200 662 346
> http://www.ingm.it
> 
> 
> 


From ngoto at gen-info.osaka-u.ac.jp  Thu Mar 11 14:12:49 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 11 Mar 2010 23:12:49 +0900
Subject: [BioRuby] HMMER 3 parsers?
In-Reply-To: <4B96A5FB.9060607@molbio.su.se>
References: <4B96A5FB.9060607@molbio.su.se>
Message-ID: <20100311141250.789AA1CBC58F@idnmail.gen-info.osaka-u.ac.jp>

Hi,

Christian Zmasek is now working on HMMER 3 support.
It would be great if you could help us.

http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer/hmmer3report.rb
http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer3.rb

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Tue, 09 Mar 2010 20:48:11 +0100
Daniel Lundin <daniel.lundin at molbio.su.se> wrote:

> Hi,
> 
> HMMER 3 is currently available as a first release candidate. With it 
> comes several news both in the form of new tools and new kinds of data, 
> which means output formats are changed. Is anybody working on BioRuby 
> parsers for these?
> 
> /D
> 
> -- 
> Daniel Lundin
> 
> Department of Molecular Biology & Functional Genomics
> Arrhenius Laboratories for Natural Sciences
> Stockholm University, SE-106 91 Stockholm, Sweden
> 
> tel. +46 (0)8 16 41 95, mobile: +46 (0)708 123 922, fax. +46 (0)8 16 64 88
> 
> Email: daniel.lundin at molbio.su.se


From ngoto at gen-info.osaka-u.ac.jp  Thu Mar 11 14:59:11 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Thu, 11 Mar 2010 23:59:11 +0900
Subject: [BioRuby] Phylogenetic Trees or Hierarchical Clustering
In-Reply-To: <9d7d43131003011342s3de1f182oacf6ce1e612a452a@mail.gmail.com>
References: <9d7d43131003011342s3de1f182oacf6ce1e612a452a@mail.gmail.com>
Message-ID: <20100311145912.CF9091CBC3DA@idnmail.gen-info.osaka-u.ac.jp>

Hi,

I always use phylogenetic tree construction software such as
PHYLIP and MEGA4, and I don't know much about pure Ruby
solutions. The following were found with a Google search.

There are some pure Ruby implementations of clustering algorithms,
though I haven't used them.

AI4R (Artificial Intelligence for Ruby):
http://ai4r.rubyforge.org/

clusterer: http://rubyforge.org/projects/clusterer/
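
If only a small matrix needs to be clustered, the basic agglomerative
procedure is short enough to write directly in pure Ruby. The sketch
below is not taken from either library above; the function name
single_linkage is made up, single linkage is chosen only because it is
the simplest rule, and it assumes the similarity matrix has already
been converted to distances (for example 1 - similarity). The labels
and numbers are toy values.

```ruby
# Hypothetical sketch of single-linkage agglomerative clustering.
# labels: array of names; dist: symmetric matrix of distances.
# Returns nested arrays describing the order in which clusters merged.
def single_linkage(labels, dist)
  # Each cluster is a pair: [subtree, indices of its members].
  clusters = labels.each_with_index.map { |label, i| [label, [i]] }

  while clusters.size > 1
    best = nil
    clusters.each_index do |a|
      ((a + 1)...clusters.size).each do |b|
        # Single linkage: distance between the two closest members.
        d = clusters[a][1].map { |i|
          clusters[b][1].map { |j| dist[i][j] }.min
        }.min
        best = [d, a, b] if best.nil? || d < best[0]
      end
    end

    _dist, first, second = best
    merged = [[clusters[first][0], clusters[second][0]],
              clusters[first][1] + clusters[second][1]]
    clusters.delete_at(second) # second > first, so delete it first
    clusters.delete_at(first)
    clusters << merged
  end

  clusters[0][0]
end

# Toy data, for illustration only.
labels = %w[A B C]
dist = [[0, 1, 4],
        [1, 0, 3],
        [4, 3, 0]]
p single_linkage(labels, dist) # => ["C", ["A", "B"]]
```

The nested arrays can be turned into a Newick string or drawn as a
dendrogram; for larger matrices a dedicated library or external
program will be much faster than this cubic-time loop.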

I found a phylogenetic tree visualization implementation written
in JRuby, and I found it can also work with normal Ruby 1.8.7.

Egan A et al. (2008)
IDEA: Interactive Display for Evolutionary Analyses.
BMC Bioinformatics 2008, 9:524
http://www.biomedcentral.com/1471-2105/9/524
http://ideanalyses.sourceforge.net/

Thanks,

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Mon, 1 Mar 2010 16:42:25 -0500
Jillian E Kozyra <jillyh0 at gmail.com> wrote:

> Dear Colleagues,
> 
> We are working on a linguistics project in which we will calculate language
> similarities. From the language similarity matrix, we would like to create
> either a hierarchical clustering output or phylogenetic tree. We seek a pure
> Ruby plugin with which to do this. Could you give us some guidance?
> 
> Thanks,
> Jillian
> 
> -- 
> 917-434-7511
> http://sswl.railsplayground.net


From daniel.lundin at molbio.su.se  Thu Mar 11 16:18:25 2010
From: daniel.lundin at molbio.su.se (Daniel Lundin)
Date: Thu, 11 Mar 2010 17:18:25 +0100
Subject: [BioRuby] HMMER 3 parsers?
In-Reply-To: <20100311141250.789AA1CBC58F@idnmail.gen-info.osaka-u.ac.jp>
References: <4B96A5FB.9060607@molbio.su.se>
	<20100311141250.789AA1CBC58F@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <4B9917D1.9000702@molbio.su.se>

Naohisa GOTO wrote:
> Hi,
> 
> Christian Zmasek are now working for the HMMER 3 support.
> It will be great if you can help us.
> 
Certainly. Since my alternative is writing a parser for myself, I might 
as well put in my effort for the common good.

Christian, is there anything in particular I could help with? I have 
started collecting some test cases for my own needs.

/Daniel

> http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer/hmmer3report.rb
> http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer3.rb
> 
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
> 
> On Tue, 09 Mar 2010 20:48:11 +0100
> Daniel Lundin <daniel.lundin at molbio.su.se> wrote:
> 
>> Hi,
>>
>> HMMER 3 is currently available as a first release candidate. With it 
>> comes several news both in the form of new tools and new kinds of data, 
>> which means output formats are changed. Is anybody working on BioRuby 
>> parsers for these?
>>
>> /D
>>
>> -- 
>> Daniel Lundin
>>
>> Department of Molecular Biology & Functional Genomics
>> Arrhenius Laboratories for Natural Sciences
>> Stockholm University, SE-106 91 Stockholm, Sweden
>>
>> tel. +46 (0)8 16 41 95, mobile: +46 (0)708 123 922, fax. +46 (0)8 16 64 88
>>
>> Email: daniel.lundin at molbio.su.se
> 


-- 
Daniel Lundin

Department of Molecular Biology & Functional Genomics
Arrhenius Laboratories for Natural Sciences
Stockholm University, SE-106 91 Stockholm, Sweden

tel. +46 (0)8 16 41 95, mobile: +46 (0)708 123 922, fax. +46 (0)8 16 64 88

Email: daniel.lundin at molbio.su.se


From pjotr.public14 at thebird.nl  Thu Mar 11 17:17:27 2010
From: pjotr.public14 at thebird.nl (Pjotr Prins)
Date: Thu, 11 Mar 2010 18:17:27 +0100
Subject: [BioRuby] Ruby and Statistics
In-Reply-To: <20100311135303.8C5201CBC41B@idnmail.gen-info.osaka-u.ac.jp>
References: <2122bfdf-d902-4be1-aef2-95013cea31f6@ingm.it>
	<20100311135303.8C5201CBC41B@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <20100311171727.GD12523@thebird.nl>

Hi Raoul,

Biolib makes the GSL available for Ruby, as well as Rlib, so many
standard statistics can be used, including linear regression. If
there are other libraries you want to use, we can consider mapping
those to Ruby (BOOST is a candidate).

The main problem is that I am still in the process of documenting
Biolib before its 1.0 release.

If you are interested in using these tools, we can work it out between
us. Just tell me what functions you want, and I'll help map/document
them. That would be great for Biolib, as testing is a good thing.

Pj.

On Thu, Mar 11, 2010 at 10:53:02PM +0900, Naohisa GOTO wrote:
> Hi,
> 
> I found some modules, but I haven't used them.
> 
> math-statistics: http://www.notwork.org/~gotoken/ruby/p/statistics/
> 
> statarray: http://rubyforge.org/projects/statarray/
> 
> ruby-stats: http://pallas.telperion.info/ruby-stats/
> 
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
> 
> On Thu, 11 Mar 2010 14:02:23 +0100
> "Raoul Bonnal" <bonnalraoul at ingm.it> wrote:
> 
> > Hello Folks, 
> > I need to do statistical computations in Ruby, some time very basic operations like mean and stdv
> > Which library do you suggest ? 
> > I don't want to use rsruby (R), for now. Er extend every time Array.
> > 
> > I found this: ruby-statsample but I don't know if is the best one.
> > 
> > --
> > Raoul J.P. Bonnal
> > Life Science Informatics
> > Integrative Biology Program
> > Fondazione INGM
> > Via F. Sforza 28
> > 20122 Milano, IT
> > phone: +39 0 200 662 326
> > fax: +39 0 200 662 346
> > http://www.ingm.it
> > 
> > 
> > 
> 


From rutgeraldo at gmail.com  Mon Mar 15 12:27:27 2010
From: rutgeraldo at gmail.com (Rutger Vos)
Date: Mon, 15 Mar 2010 12:27:27 +0000
Subject: [BioRuby] RDF Triples in BioRuby,
	a funding proposal to Google SoC
In-Reply-To: <2bb9b24a1003110222h4bd642adv31d1975c9edc0bba@mail.gmail.com>
References: <2bb9b24a1003100522p68330d6bu3f8e5f3a7f50dd6b@mail.gmail.com>
	<9081A9B5-611C-45C2-A099-44BAF1E524F4@hgc.jp>
	<2bb9b24a1003110222h4bd642adv31d1975c9edc0bba@mail.gmail.com>
Message-ID: <2bb9b24a1003150527p439c135dm1a164e6a5218835f@mail.gmail.com>

To follow up along more practical lines, I've had to deal with similar
design issues in Bio::Phylo (perl), TreeBASE and Mesquite (both java).
I've learned it makes sense to have:

- a simple "annotation" object, with getters and setters for the
predicate namespace uri, the predicate string, and the value object
(either a literal or a uri),

- a get_annotations method for all (fundamental) data objects in the
toolkit that returns a collection of these annotation objects.

This way, when you serialize any bioruby object into rdf, you can add
as many other statements about that object as you want.
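
In Ruby the two pieces above might look roughly like the sketch below.
I should stress that this is an assumption-laden illustration: the
names Annotation, Annotatable, add_annotation and ExampleTree are all
hypothetical and not part of BioRuby, and the example.org URI and the
creator value are made up.

```ruby
# Hypothetical sketch; Annotation and Annotatable are not BioRuby classes.
class Annotation
  # Getters and setters for the namespace URI, predicate and value.
  attr_accessor :namespace_uri, :predicate, :value

  def initialize(namespace_uri, predicate, value)
    @namespace_uri = namespace_uri
    @predicate = predicate
    @value = value
  end

  # Full predicate URI: namespace followed by the local name.
  def predicate_uri
    "#{namespace_uri}#{predicate}"
  end
end

# Mix-in that gives any data object a collection of annotations.
module Annotatable
  def annotations
    @annotations ||= []
  end
  alias get_annotations annotations

  def add_annotation(namespace_uri, predicate, value)
    annotations << Annotation.new(namespace_uri, predicate, value)
    self
  end
end

# Stand-in for a fundamental data object such as a tree or taxon.
class ExampleTree
  include Annotatable
  attr_reader :uri

  def initialize(uri)
    @uri = uri
  end
end

tree = ExampleTree.new('http://example.org/trees/1')
tree.add_annotation('http://purl.org/dc/elements/1.1/', 'creator', 'A. Curator')

# A serializer only needs the object's URI and its annotations.
tree.get_annotations.each do |a|
  puts "<#{tree.uri}> <#{a.predicate_uri}> \"#{a.value}\" ."
end
```

Using a mix-in rather than a change to a common base class would keep
the bookkeeping out of the inheritance tree, so individual subdomains
could opt in one at a time.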

Would a refactoring along those lines have a chance of being
acceptable to the bioruby community (of course subsequent to a more
detailed RFC, testing, discussion, proof of concept, etc.)?

On Thursday, March 11, 2010, Rutger Vos <rutgeraldo at gmail.com> wrote:
> Hi Toshiaki,
>
> great to hear there's already been a lot of discussion over this.
> (Well, I'd be surprised if there hadn't been :))
>
> It looks to me like some fairly major bookkeeping would need to be
> implemented high up in the inheritance tree if *all* bioruby objects
> are to be serialized into RDF. It also would require all of bioruby to
> be ontologized in one fell swoop.
>
> It is perhaps more likely that subdomains are going to be ontologized
> more or less independently from one another (as you mention,
> references->RDF, or in my case phylogenetics->RDF) based implicitly on
> intermediate data formats (pubmed records and nexml, respectively).
>
> That is probably OK, we do things as needs arise.
>
> But what would be handy if the API was at least general enough so that
> this was extensible and we can make additional statements *about*
> objects when we serialize them to RDF. For example, in your pubmed
> turtle file, the subject is always
> <http://togows.dbcls.jp/entry/ncbi-pubmed/16381885>. Is there a way,
> programmatically, where I can add additional statements about
> <http://togows.dbcls.jp/entry/ncbi-pubmed/16381885>?
>
> Rutger
>
> On Wed, Mar 10, 2010 at 2:21 PM, Toshiaki Katayama <ktym at hgc.jp> wrote:
>> Hi Rutger,
>>
>> Thank you for your inputs on GSoC 2010!
>>
>>> * is there a way to express triples in BioRuby?
>>> * if there is not, what would be a good design to express triples in
>>> BioRuby so that this would be more useful than just for NeXML?
>>
>> This is what we discussed during the pre-BioHackathon 2010.
>>
>> http://hackathon3.dbcls.jp/wiki/BioRuby
>>
>> My first idea was to make all BioRuby object have common output
>> method to render the object contents in various formats
>> (such as RDF/XML, Turtle, HTML, GFF, FASTA etc. if appropriate).
>>
>> Then, we tried to separate view from logic using erb, but as you
>> see in the above page, it still looks ugly. It is mainly because
>> view formatting itself requires some additional codes, specific
>> to each format.
>>
>> Therefore, we don't have a solid conclusion on this yet, unfortunately.
>>
>> Anyway, we already have PubMed to RDF converter written in Ruby as
>> the TogoWS REST API (http://togows.dbcls.jp/site/en/rest.html) at
>>
>> http://togows.dbcls.jp/entry/pubmed/16381885
>> --> http://togows.dbcls.jp/entry/pubmed/16381885.ttl
>>
>> and, we are also trying to support KEGG to RDF conversion in this
>> framework as well. I think we can put the code in BioRuby when we finished.
>>
>> Your suggestions are welcome. :)
>>
>> Regards,
>> Toshiaki
>>
>> On 2010/03/10, at 22:22, Rutger Vos wrote:
>>
>>> Dear BioRuby-ites,
>>>
>>> my apologies that my first email to this list is so long and
>>> tangential. I am trying to find out how to express RDF triples in
>>> BioRuby. In this email I'm explaining why I care enough to try to get
>>> funding for someone to work on this. If you don't care about any of
>>> this, you can stop reading now.
>>>
>>> The National Evolutionary Synthesis Center (NESCent.org) is planning
>>> to be a mentoring organization for the Google Summer of Code 2010. I
>>> have submitted a project idea to this: to develop NeXML I/O and -
>>> probably more importantly for you - RDF capabilities for BioRuby. If
>>> funded, a student/coder will work on this full time over the summer,
>>> under the shared supervision of Jan Aerts and myself. Here is the
>>> link: http://tinyurl.com/biorubynexml
>>>
>>> NeXML is a data format for phylogenetic data that can be read and
>>> written in perl, python, java and (to some extent) c++ and javascript.
>>> RDF is the cool "new" thing (as per BioHackathon2010), but as far as I
>>> can tell BioRuby isn't completely up to speed for it, yet.
>>>
>>> (As an aside: you might ask yourself why there is something like NeXML
>>> when there is PhyloXML for BioRuby. The answer is that NeXML solves a
>>> different problem: PhyloXML started essentially as a next generation
>>> of New Hampshire eXtended (NHX) to meet the annotation needs of
>>> comparative genomics, things such as gene duplications and other
>>> molecular evolution events, on phylogenetic trees; NeXML started as a
>>> complete XML representation of the NEXUS format, providing other
>>> comparative data types such as categorical and continuous character
>>> state matrices, restriction site matrices, and so on, in addition to
>>> trees, taxa, sequence alignments. There is obviously some overlap
>>> between the formats, but I guess that is not unique in bioinformatics
>>> :))
>>>
>>> NeXML has a semantic annotation facility that uses RDFa. This allows
>>> us to add additional metadata to a fundamental phylogenetic data
>>> object (a tree, taxon, character, etc.) to form a "triple": the
>>> fundamental data object is the triple Subject, and the Predicate and
>>> Object are added as RDFa attributes. Since NeXML can be transformed
>>> using a standard XSL stylesheet to RDF/XML, we can express a limitless
>>> number of statements about phylogenetics. H

-- 
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading
RG6 6BX
United Kingdom
Tel: +44 (0) 118 378 7535
http://www.nexml.org
http://rutgervos.blogspot.com


From ngoto at gen-info.osaka-u.ac.jp  Fri Mar 19 05:18:41 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Fri, 19 Mar 2010 14:18:41 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of Code
 is *ON* for OBF projects!
Message-ID: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>

Begin forwarded message:

Date: Thu, 18 Mar 2010 17:02:32 -0500
From: Chris Fields <cjfields at illinois.edu>
To: open-bio-l at lists.open-bio.org
Subject: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of Code is *ON* for OBF projects!


(forwarding to the Open-Bio list, as the original post is still clearing the OBF mail filters)

Hi all,

Great news: Google announced today that the Open Bioinformatics Foundation has been accepted as a mentoring organization for this summer's Google Summer of Code!

GSoC is a Google-sponsored student internship program for open-source projects, open to students from around the world (not just US residents).   Students are paid a $5000 USD stipend to work as a developer on an open-source project for the summer. For more on GSoC, see GSoC 2010 FAQ at http://tinyurl.com/yzemdfo

Student applications are due April 9, 2010 at 19:00 UTC.  Students who are interested in participating should look at the OBF's GSoC page at http://open-bio.org/wiki/Google_Summer_of_Code, which lists project ideas, and who to contact about applying.

For current developers on OBF projects, please consider volunteering to be a mentor if you have not already, and contribute project ideas.  Just list your name and project ideas on OBF wiki and on the relevant project's GSoC wiki page.

Thanks to all who helped make OBF's application to GSoC a success, and let's have a great, productive summer of code!

Rob Buels
OBF GSoC 2010 Administrator


_______________________________________________
Open-Bio-l mailing list
Open-Bio-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/open-bio-l


From k.hayashi.info at gmail.com  Tue Mar 23 12:20:52 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Tue, 23 Mar 2010 21:20:52 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>

Hi, all

My name is Kazuhiro Hayashi.
I'm a 1st-year master's degree student at the Department of Computational
Biology, Graduate School of Frontier Sciences, the University of
Tokyo.

I am writing to ask you some questions about Google Summer of Code
2010.

I'm interested in Google Summer of Code 2010, especially the projects
about BioRuby.
At the moment, I plan to apply for "Ruby 1.9.2 support of BioRuby", and
I'd like to contribute to the BioRuby community through Google Summer of
Code 2010.
So, I have three questions.
Could you answer them?

The first is about the differences between Ruby 1.8.x and 1.9.2.
OBF's GSoC page says that the participant needs to know Ruby 1.9.2.
Until now, I've used only Ruby 1.8.7 and have never used Ruby 1.9.2.
Honestly, I hardly know the differences between Ruby 1.8.x and Ruby 1.9.2.
Can I join this project?

The second is how many programs in BioRuby run on Ruby 1.9.2.
Could you tell me whether this is already known (and how to find out)?

The third is about the implementation of the unit tests.
Does this mean that the participant needs to implement unit tests for
all code that doesn't have them yet?

For now, these are all my questions about GSoC 2010.

If you have some advice for the applicants, please send a reply to
this mailing list.

Thank you very much for reading my broken English. :-)

Best regards


2010/3/19 Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp>:
> Begin forwarded message:
>
> Date: Thu, 18 Mar 2010 17:02:32 -0500
> From: Chris Fields <cjfields at illinois.edu>
> To: open-bio-l at lists.open-bio.org
> Subject: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of Code is *ON* for OBF projects!
>
>
> (forwarding to the Open-Bio list, as the original post is still clearing the OBF mail filters)
>
> Hi all,
>
> Great news: Google announced today that the Open Bioinformatics Foundation has been accepted as a mentoring organization for this summer's Google Summer of Code!
>
> GSoC is a Google-sponsored student internship program for open-source projects, open to students from around the world (not just US residents).   Students are paid a $5000 USD stipend to work as a developer on an open-source project for the summer. For more on GSoC, see GSoC 2010 FAQ at http://tinyurl.com/yzemdfo
>
> Student applications are due April 9, 2010 at 19:00 UTC.  Students who are interested in participating should look at the OBF's GSoC page at http://open-bio.org/wiki/Google_Summer_of_Code, which lists project ideas, and who to contact about applying.
>
> For current developers on OBF projects, please consider volunteering to be a mentor if you have not already, and contribute project ideas.  Just list your name and project ideas on OBF wiki and on the relevant project's GSoC wiki page.
>
> Thanks to all who helped make OBF's application to GSoC a success, and let's have a great, productive summer of code!
>
> Rob Buels
> OBF GSoC 2010 Administrator
>
>
>
>
>
>


-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From biopython at maubp.freeserve.co.uk  Tue Mar 23 13:20:57 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 23 Mar 2010 13:20:57 +0000
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
Message-ID: <320fb6e01003230620l58717628t4d12f67411805c48@mail.gmail.com>

On Tue, Mar 23, 2010 at 12:20 PM, Kazuhiro Hayashi
<k.hayashi.info at gmail.com> wrote:
> Hi, all
>
> My name is Kazuhiro Hayashi.
> I'm a 1st-year master's degree student at Depertment of Computational
> Biology, Graduate School of Frontier Sciences, the University of
> Tokyo.
>
> The reason why I sent this mail is to ask you some questions about
> Google Summer of Code 2010.
>
> ...
>
> Thank you very much for reading my broken English. :-)

Hello Hayashi-san,

I don't know if the BioRuby team have any preference for which
language the Google Summer of Code projects will be discussed
in (English and/or Japanese). It will probably depend on the mentors.

However, there is also a Japanese BioRuby mailing list:
http://lists.open-bio.org/mailman/listinfo/bioruby-ja

Peter
(@Biopython)


From ngoto at gen-info.osaka-u.ac.jp  Tue Mar 23 15:21:33 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 24 Mar 2010 00:21:33 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
 Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
Message-ID: <20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>

Hi Kazuhiro,

On Tue, 23 Mar 2010 21:20:52 +0900
Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:

> Hi, all
> 
> My name is Kazuhiro Hayashi.
> I'm a 1st-year master's degree student at Depertment of Computational
> Biology, Graduate School of Frontier Sciences, the University of
> Tokyo.
> 
> The reason why I sent this mail is to ask you some questions about
> Google Summer of Code 2010.
> 
> I'm interested in Google Summer of Code 2010, Especially, the projects
> about BioRuby.
> At the moment, I will apply "Ruby 1.9.2 support of BioRuby and I'd
> like to contribute BioRuby community through Google Summer of Code
> 2010.
> So, I have three questions.
> Could you answer them?
>
> One is about differences between Ruby 1.8.x and 1.9.2
> OBF's GSoC page says that the participant needs to know Ruby 1.9.2.
> Until now, I've used only Ruby 1.8.7 and never used Ruby 1.9.2.
> Honestly, I hardly know differences between Ruby 1.8.x and Ruby 1.9.2.
> Can I join this project?

Yes.
You will need to study them during the project, but not now.
I've modified the "needed skills" in the project wiki page
to clarify the point.

> Another is how many programs in BioRuby run on Ruby 1.9.2.
> Could you tell me weather you have already known it or not (and how to know it)?

I don't know much. Some programs worked, but some didn't.

> The other is implementation of the unit tests.
> Does this mean that the participant needs to implement unit tests for
> all codes which haven't had them yet.

Yes or no; it depends on the plan. One idea is to write rough tests
for almost all of the code first, and to improve them after that.
I also think that classes and modules that strongly depend
on external programs or web services can be skipped.

> Currently, These are all my questions about GSoC 2010.
> 
> If you have some advice for the applicants, please send a reply to
> this mailing list.
> 
> Thank you very much for reading my broken English. :-)
> 
> Best regards
> 
> 
> 2010/3/19 Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp>:
> > Begin forwarded message:
> >
> > Date: Thu, 18 Mar 2010 17:02:32 -0500
> > From: Chris Fields <cjfields at illinois.edu>
> > To: open-bio-l at lists.open-bio.org
> > Subject: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of Code is *ON* for OBF projects!
> >
> >
> > (forwarding to the Open-Bio list, as the original post is still clearing the OBF mail filters)
> >
> > Hi all,
> >
> > Great news: Google announced today that the Open Bioinformatics Foundation has been accepted as a mentoring organization for this summer's Google Summer of Code!
> >
> > GSoC is a Google-sponsored student internship program for open-source projects, open to students from around the world (not just US residents).   Students are paid a $5000 USD stipend to work as a developer on an open-source project for the summer. For more on GSoC, see GSoC 2010 FAQ at http://tinyurl.com/yzemdfo
> >
> > Student applications are due April 9, 2010 at 19:00 UTC.  Students who are interested in participating should look at the OBF's GSoC page at http://open-bio.org/wiki/Google_Summer_of_Code, which lists project ideas, and who to contact about applying.
> >
> > For current developers on OBF projects, please consider volunteering to be a mentor if you have not already, and contribute project ideas.  Just list your name and project ideas on OBF wiki and on the relevant project's GSoC wiki page.
> >
> > Thanks to all who helped make OBF's application to GSoC a success, and let's have a great, productive summer of code!
> >
> > Rob Buels
> > OBF GSoC 2010 Administrator
> >
> >
> >
> > _______________________________________________
> > Open-Bio-l mailing list
> > Open-Bio-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/open-bio-l
> >
> >
> > _______________________________________________
> > BioRuby Project - http://www.bioruby.org/
> > BioRuby mailing list
> > BioRuby at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioruby
> >
> 
> 
> -- 
> Kazuhiro Hayashi
> Department of Computational Biology,  The University of Tokyo
> email: k_hayashi at cb.k.u-tokyo.ac.jp
> tel: 04-7136-3988
> 


Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org


From ngoto at gen-info.osaka-u.ac.jp  Wed Mar 24 14:22:23 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Wed, 24 Mar 2010 23:22:23 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
 Code is *ON* for OBF projects!
In-Reply-To: <320fb6e01003230620l58717628t4d12f67411805c48@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
	<320fb6e01003230620l58717628t4d12f67411805c48@mail.gmail.com>
Message-ID: <20100324142225.08B501CBC3D0@idnmail.gen-info.osaka-u.ac.jp>

Hi,

The objective of the project is software development.
I think it is OK to use Japanese for communicating with
Japanese-speaking mentors. Using the bioruby-ja mailing
list for discussion seems good.

Students still need to write the application form in English,
as required by Google.  It would be great if someone could help
with English proofreading for ESL students.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Tue, 23 Mar 2010 13:20:57 +0000
Peter <biopython at maubp.freeserve.co.uk> wrote:

> On Tue, Mar 23, 2010 at 12:20 PM, Kazuhiro Hayashi
> <k.hayashi.info at gmail.com> wrote:
> > Hi, all
> >
> > My name is Kazuhiro Hayashi.
> > I'm a 1st-year master's degree student at Department of Computational
> > Biology, Graduate School of Frontier Sciences, the University of
> > Tokyo.
> >
> > The reason why I sent this mail is to ask you some questions about
> > Google Summer of Code 2010.
> >
> > ...
> >
> > Thank you very much for reading my broken English. :-)
> 
> Hello Hayashi-san,
> 
> I don't know if the BioRuby team have any preference for which
> language the Google Summer of Code projects will be discussed
> in (English and/or Japanese). It will probably depend on the mentors.
> 
> However, there is also a Japanese BioRuby mailing list:
> http://lists.open-bio.org/mailman/listinfo/bioruby-ja
> 
> Peter
> (@Biopython)


From k.hayashi.info at gmail.com  Wed Mar 24 14:35:21 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Wed, 24 Mar 2010 23:35:21 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com> 
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>

Hi.

Thank you for your replies.

I'd like to communicate with you on this mailing list (and I will
write e-mails in English as much as possible). :-)
However, if I should do it somewhere else, I will do so.
I'm not sure where the best place to talk about GSoC 2010 is.
Anyway, I appreciate your advice.


By the way, I have one more question.
Could you tell me how concretely I have to write the proposal?
Do I have to write how I will implement the programs and when I will write each?

Best regards

Kazuhiro

2010/3/23 Peter <biopython at maubp.freeserve.co.uk>:
> On Tue, Mar 23, 2010 at 12:20 PM, Kazuhiro Hayashi
> <k.hayashi.info at gmail.com> wrote:
>> Hi, all
>>
>> My name is Kazuhiro Hayashi.
>> I'm a 1st-year master's degree student at Department of Computational
>> Biology, Graduate School of Frontier Sciences, the University of
>> Tokyo.
>>
>> The reason why I sent this mail is to ask you some questions about
>> Google Summer of Code 2010.
>>
>> ...
>>
>> Thank you very much for reading my broken English. :-)
>
> Hello Hayashi-san,
>
> I don't know if the BioRuby team have any preference for which
> language the Google Summer of Code projects will be discussed
> in (English and/or Japanese). It will probably depend on the mentors.
>
> However, there is also a Japanese BioRuby mailing list:
> http://lists.open-bio.org/mailman/listinfo/bioruby-ja
>
> Peter
> (@Biopython)
>


2010/3/24 0:21 Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp>:
> Hi Kazuhiro,
>
> On Tue, 23 Mar 2010 21:20:52 +0900
> Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:
>
>> Hi, all
>>
>> My name is Kazuhiro Hayashi.
>> I'm a 1st-year master's degree student at Department of Computational
>> Biology, Graduate School of Frontier Sciences, the University of
>> Tokyo.
>>
>> The reason why I sent this mail is to ask you some questions about
>> Google Summer of Code 2010.
>>
>> I'm interested in Google Summer of Code 2010, Especially, the projects
>> about BioRuby.
>> At the moment, I will apply "Ruby 1.9.2 support of BioRuby and I'd
>> like to contribute BioRuby community through Google Summer of Code
>> 2010.
>> So, I have three questions.
>> Could you answer them?
>>
>> One is about differences between Ruby 1.8.x and 1.9.2
>> OBF's GSoC page says that the participant needs to know Ruby 1.9.2.
>> Until now, I've used only Ruby 1.8.7 and never used Ruby 1.9.2.
>> Honestly, I hardly know differences between Ruby 1.8.x and Ruby 1.9.2.
>> Can I join this project?
>
> Yes.
> You will need to study about them during the project, but not now.
> I've modified the "needed skills" in the project wiki page
> to clarify the point.
>
>> Another is how many programs in BioRuby run on Ruby 1.9.2.
>> Could you tell me whether you have already known it or not (and how to know it)?
>
> I don't know much. Some programs worked, but some didn't.
>
>> The other is implementation of the unit tests.
>> Does this mean that the participant needs to implement unit tests for
>> all codes which haven't had them yet.
>
> Yes or no, depends on planning. One idea is to implement
> almost all with rough coding, and to improve them after that.
> I also think that classes and modules that strongly depend
> on external program or web service can be skipped.
>
>> Currently, These are all my questions about GSoC 2010.
>>
>> If you have some advice for the applicants, please send a reply to
>> this mailing list.
>>
>> Thank you very much for reading my broken English. :-)
>>
>> Best regards
>>
>>
>> 2010/3/19 Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp>:
>> > Begin forwarded message:
>> >
>> > Date: Thu, 18 Mar 2010 17:02:32 -0500
>> > From: Chris Fields <cjfields at illinois.edu>
>> > To: open-bio-l at lists.open-bio.org
>> > Subject: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of Code is *ON* for OBF projects!
>> >
>> >
>> > (forwarding to the Open-Bio list, as the original post is still clearing the OBF mail filters)
>> >
>> > Hi all,
>> >
>> > Great news: Google announced today that the Open Bioinformatics Foundation has been accepted as a mentoring organization for this summer's Google Summer of Code!
>> >
>> > GSoC is a Google-sponsored student internship program for open-source projects, open to students from around the world (not just US residents).   Students are paid a $5000 USD stipend to work as a developer on an open-source project for the summer. For more on GSoC, see GSoC 2010 FAQ at http://tinyurl.com/yzemdfo
>> >
>> > Student applications are due April 9, 2010 at 19:00 UTC.  Students who are interested in participating should look at the OBF's GSoC page at http://open-bio.org/wiki/Google_Summer_of_Code, which lists project ideas, and who to contact about applying.
>> >
>> > For current developers on OBF projects, please consider volunteering to be a mentor if you have not already, and contribute project ideas.  Just list your name and project ideas on OBF wiki and on the relevant project's GSoC wiki page.
>> >
>> > Thanks to all who helped make OBF's application to GSoC a success, and let's have a great, productive summer of code!
>> >
>> > Rob Buels
>> > OBF GSoC 2010 Administrator
>> >
>> >
>> >
>> > _______________________________________________
>> > Open-Bio-l mailing list
>> > Open-Bio-l at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/open-bio-l
>> >
>> >
>> > _______________________________________________
>> > BioRuby Project - http://www.bioruby.org/
>> > BioRuby mailing list
>> > BioRuby at lists.open-bio.org
>> > http://lists.open-bio.org/mailman/listinfo/bioruby
>> >
>>
>>
>> --
>> Kazuhiro Hayashi
>> Department of Computational Biology,  The University of Tokyo
>> email: k_hayashi at cb.k.u-tokyo.ac.jp
>> tel: 04-7136-3988
>>
>
>
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>
>


-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From biopython at maubp.freeserve.co.uk  Wed Mar 24 14:51:46 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 24 Mar 2010 14:51:46 +0000
Subject: [BioRuby] [Bioperl-l] Fwd: [Utilities-announce] NCBI Revised
	E-utility Usage Policy
In-Reply-To: <38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu>
References: <A9D8BF3D8A74DF4A925FB541C0F39D2A220D32B4@NIHMLBX15.nih.gov>
	<320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com>
	<38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu>
Message-ID: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com>

On Wed, Mar 24, 2010 at 2:37 PM, Chris Fields <cjfields at illinois.edu> wrote:
>
> On Mar 24, 2010, at 9:08 AM, Peter wrote:
>
>> Hi,
>>
>> This is probably of interest to all the Bio* projects offering access
>> to the NCBI Entrez utilities. See forwarded message below.
>>
>> I *think* the new guidelines basically say that the email & tool parameters are
>> optional BUT if your IP address ever gets banned for excessive use you then
>> have to register an email & tool combination.
>>
>> Regarding the email address, the NCBI say to use the email of the developer
>> (not the end user). However, they do not distinguish between the developers
>> of a library (like us), and the developers of an application or script using a
>> library (who may also be the end user).
>>
>> Currently we (Biopython) and I think BioPerl ask developers using our libraries
>> to populate the email address themselves. I *think* this is still the
>> right action.
>>
>> Peter
>
>
> Basically, that's the same tactic I'm going with for Bio::DB::EUtilities (and I
> think with the SOAP-based ones as well). We're providing a specific set of
> tools for users to write their own end applications. I can try
> contacting them regarding this to get an official response to clarify this
> somewhat.

Please give the NCBI an email - you can CC me too if you like.

> Re: the tool parameter, we currently set the tool itself to 'BioPerl' as a
> default, but always leave the email blank and issue a warning if it isn't
> set. We could just as easily leave both blank and issue warnings for both.

We currently leave out the email and set the tool parameter to "Biopython"
by default but this can be overridden. Currently leaving out the email does
cause Biopython to give a warning.
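
The policy described above (a default tool name that can be overridden, no
default email, and a warning when the email is missing) can be sketched in
Ruby roughly as follows. This is only an illustration under stated
assumptions: the helper name eutils_url, the default tool name and the
example address are hypothetical, the base URL is the present-day
E-utilities endpoint rather than anything quoted in this thread, and none of
it is the actual BioRuby, BioPerl or Biopython code.

```ruby
require 'uri'

# Assumption: present-day E-utilities base URL (not taken from this thread).
EUTILS_BASE = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/'

# Hypothetical helper: builds an E-utilities request URL.
# `tool` has a library-level default that callers can override;
# `email` has no default and triggers a warning when it is missing.
def eutils_url(utility, params, tool: 'ExampleTool', email: nil)
  email_given = email && !email.empty?
  warn 'No email set; NCBI asks developers to supply one.' unless email_given

  query = params.merge('tool' => tool)
  query['email'] = email if email_given

  uri = URI.join(EUTILS_BASE, "#{utility}.fcgi")
  uri.query = URI.encode_www_form(query)
  uri.to_s
end

# Usage: the script author (not the library) supplies the email address.
puts eutils_url('esearch', { 'db' => 'pubmed', 'term' => 'hmmer' },
                email: 'dev@example.org')
```

Keeping the email as an explicit argument leaves the decision with whoever
writes the script or application, which matches the approach both projects
describe here.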

Peter


From hlapp at drycafe.net  Wed Mar 24 15:27:37 2010
From: hlapp at drycafe.net (Hilmar Lapp)
Date: Wed, 24 Mar 2010 11:27:37 -0400
Subject: [BioRuby] [Open-bio-l] [Bioperl-l] Fwd: [Utilities-announce]
	NCBI Revised E-utility Usage Policy
In-Reply-To: <320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com>
References: <A9D8BF3D8A74DF4A925FB541C0F39D2A220D32B4@NIHMLBX15.nih.gov>
	<320fb6e01003240708o48eeb30eq3b09110dcc2d1873@mail.gmail.com>
	<38D43B03-4A85-48CB-913A-CD564EB5168C@illinois.edu>
	<320fb6e01003240751v2afd5d5bwa39590afa9b13209@mail.gmail.com>
Message-ID: <5D427F97-706E-4F66-95BA-2B397520C4FA@drycafe.net>


On Mar 24, 2010, at 10:51 AM, Peter wrote:

> Please give the NCBI an email - you can CC me too if you like.


Can't this be the developers' mailing list (or lists, the appropriate  
one for each toolkit)? We can even whitelist all NCBI sender addresses  
so they can easily email us if there are issues.

	-hilmar
-- 
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net :
===========================================================


From k.hayashi.info at gmail.com  Thu Mar 25 17:31:07 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Fri, 26 Mar 2010 02:31:07 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com> 
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>
Message-ID: <b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com>

Hi,

Thank you for your replies.

I'd like to communicate with you on this mailing list (and I will
write e-mails in English as much as possible). :-)
However, if I should do it somewhere else, I will do so.
I'm not sure where the best place to talk about GSoC 2010 is.
Anyway, I appreciate your advice.


By the way, I have one more question.
Could you tell me how concretely I have to write the proposal?
Do I have to write how I will implement the programs and when I will write each?

Best regards

Kazuhiro

( I'm sorry if you have already received the same mail. I sent it
yesterday, but I haven't received yet....)

-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From czmasek at burnham.org  Fri Mar 26 00:39:42 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Thu, 25 Mar 2010 17:39:42 -0700
Subject: [BioRuby] HMMER 3 parsers?
In-Reply-To: <4B9917D1.9000702@molbio.su.se>
References: <4B96A5FB.9060607@molbio.su.se>
	<20100311141250.789AA1CBC58F@idnmail.gen-info.osaka-u.ac.jp>
	<4B9917D1.9000702@molbio.su.se>
Message-ID: <4BAC024E.6000009@burnham.org>

Hi, Daniel:

Sorry for the late reply; for some reason my email reader suddenly 
sorts messages incorrectly.

In any case, the parser for hmmer3 hmmscan and hmmsearch is basically 
finished.

So, if I could somehow get access to your test cases, that would be great!

Thank you!

Christian


Daniel Lundin wrote:
> Naohisa GOTO skrev:
>> Hi,
>>
>> Christian Zmasek is now working on HMMER 3 support.
>> It will be great if you can help us.
>>
> Certainly. Since my alternative is writing a parser for myself, I might 
> as well put in my effort for the common good.
> 
> Christian, is there anything in particular I could help with? I have 
> started collecting some test cases for my own needs.
> 
> /Daniel
> 
>> http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer/hmmer3report.rb
>> http://github.com/cmzmasek/bioruby/blob/master/lib/bio/appl/hmmer3.rb
>>
>> Naohisa Goto
>> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>>
>> On Tue, 09 Mar 2010 20:48:11 +0100
>> Daniel Lundin <daniel.lundin at molbio.su.se> wrote:
>>
>>> Hi,
>>>
>>> HMMER 3 is currently available as a first release candidate. With it 
>>> comes several news both in the form of new tools and new kinds of data, 
>>> which means output formats are changed. Is anybody working on BioRuby 
>>> parsers for these?
>>>
>>> /D
>>>
>>> -- 
>>> Daniel Lundin
>>>
>>> Department of Molecular Biology & Functional Genomics
>>> Arrhenius Laboratories for Natural Sciences
>>> Stockholm University, SE-106 91 Stockholm, Sweden
>>>
>>> tel. +46 (0)8 16 41 95, mobile: +46 (0)708 123 922, fax. +46 (0)8 16 64 88
>>>
>>> Email: daniel.lundin at molbio.su.se
> 
> 


From ngoto at gen-info.osaka-u.ac.jp  Fri Mar 26 12:43:38 2010
From: ngoto at gen-info.osaka-u.ac.jp (Naohisa GOTO)
Date: Fri, 26 Mar 2010 21:43:38 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
 Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>
	<b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com>
Message-ID: <20100326124339.A02641CBC50D@idnmail.gen-info.osaka-u.ac.jp>

Hi,

It is generally good to write many specific details.
However, the most important thing now is whether the proposal
is accepted by Google.

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org

On Fri, 26 Mar 2010 02:31:07 +0900
Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:

> Hi,
> 
> Thank you for your replies.
> 
> I'd like to communicate with you on this mailing list (and I will
> write e-mails in English as much as possible ). :- )
> However, If I should do it on somewhere else, I will do so.
> I'm not sure where is the best place to talk about GSoC 2010.
> Anyway, I appreciate your advice.
> 
> 
> By the way, I have one more question.
> Could you tell me how much I have to write the proposal concretely?
> I have to write how to implement the programs and when I write each?
> 
> Best regards
> 
> Kazuhiro
> 
> ( I'm sorry if you have already received the same mail. I sent it
> yesterday, but I haven't received yet....)
> 
> -- 
> Kazuhiro Hayashi
> Department of Computational Biology,  The University of Tokyo
> email: k_hayashi at cb.k.u-tokyo.ac.jp
> tel: 04-7136-3988


From k.hayashi.info at gmail.com  Fri Mar 26 15:21:41 2010
From: k.hayashi.info at gmail.com (Kazuhiro Hayashi)
Date: Sat, 27 Mar 2010 00:21:41 +0900
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
	Code is *ON* for OBF projects!
In-Reply-To: <20100326124339.A02641CBC50D@idnmail.gen-info.osaka-u.ac.jp>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com> 
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp> 
	<b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com> 
	<b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com> 
	<20100326124339.A02641CBC50D@idnmail.gen-info.osaka-u.ac.jp>
Message-ID: <b51ee1fd1003260821y4adac538t79d14bc75a8bbafa@mail.gmail.com>

Hi Goto-san,

> It is generally good to write many specific details.
> However, the most important thing now is whether the proposal
> is accepted by Google.

Is it possible to show you a draft of my proposal?
I'd like you to proofread my proposal before the deadline for application.

Best regards

Kazuhiro

2010/3/26 21:43 Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp>:
> Hi,
>
> It is generally good to write many specific details.
> However, the most important thing now is whether the proposal
> is accepted by Google.
>
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>
> On Fri, 26 Mar 2010 02:31:07 +0900
> Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:
>
>> Hi,
>>
>> Thank you for your replies.
>>
>> I'd like to communicate with you on this mailing list (and I will
>> write e-mails in English as much as possible ). :- )
>> However, If I should do it on somewhere else, I will do so.
>> I'm not sure where is the best place to talk about GSoC 2010.
>> Anyway, I appreciate your advice.
>>
>>
>> By the way, I have one more question.
>> Could you tell me how much I have to write the proposal concretely?
>> I have to write how to implement the programs and when I write each?
>>
>> Best regards
>>
>> Kazuhiro
>>
>> ( I'm sorry if you have already received the same mail. I sent it
>> yesterday, but I haven't received yet....)
>>
>> --
>> Kazuhiro Hayashi
>> Department of Computational Biology,  The University of Tokyo
>> email: k_hayashi at cb.k.u-tokyo.ac.jp
>> tel: 04-7136-3988
>
>


-- 
Kazuhiro Hayashi
Department of Computational Biology,  The University of Tokyo
email: k_hayashi at cb.k.u-tokyo.ac.jp
tel: 04-7136-3988


From czmasek at burnham.org  Fri Mar 26 18:26:54 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Fri, 26 Mar 2010 11:26:54 -0700
Subject: [BioRuby] Fw: [Open-bio-l] Fwd: [Bioperl-l] Google Summer of
 Code is *ON* for OBF projects!
In-Reply-To: <b51ee1fd1003260821y4adac538t79d14bc75a8bbafa@mail.gmail.com>
References: <20100319051842.7B8751CBC46A@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003230520t299734f3w2a117db3035edd92@mail.gmail.com>
	<20100323152133.B310E1CBC409@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003240735r28589250qc007fa3bdeb8db12@mail.gmail.com>
	<b51ee1fd1003251031n77fe4da4y9330eae958c8cec5@mail.gmail.com>
	<20100326124339.A02641CBC50D@idnmail.gen-info.osaka-u.ac.jp>
	<b51ee1fd1003260821y4adac538t79d14bc75a8bbafa@mail.gmail.com>
Message-ID: <4BACFC6E.4010303@burnham.org>

Hi,

Re. "Is it possible to show you a draft of my proposal?"

I think this is not only possible, it is highly recommended.
In my experience, a detailed, well-written, and realistic proposal is 
very important.

Remember, not all projects will get accepted (currently, OBF has 14 
projects; I would be very surprised if more than half were accepted 
in the end). The better a student's proposal, the more likely it is that 
the project will get accepted.


Christian


Kazuhiro Hayashi wrote:
> Hi Goto-san,
> 
>> It is generally good to write many specific details.
>> However, the most important thing now is whether the proposal
>> is accepted by Google.
> 
> Is it possible to show you a draft of my proposal?
> I'd like you to proofread my proposal before the deadline for application.
> 
> Best regards
> 
> Kazuhiro
> 
> 2010/3/26 21:43 Naohisa GOTO <ngoto at gen-info.osaka-u.ac.jp>:
>> Hi,
>>
>> It is generally good to write many specific details.
>> However, the most important thing now is whether the proposal
>> is accepted by Google.
>>
>> Naohisa Goto
>> ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
>>
>> On Fri, 26 Mar 2010 02:31:07 +0900
>> Kazuhiro Hayashi <k.hayashi.info at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Thank you for your replies.
>>>
>>> I'd like to communicate with you on this mailing list (and I will
>>> write e-mails in English as much as possible ). :- )
>>> However, If I should do it on somewhere else, I will do so.
>>> I'm not sure where is the best place to talk about GSoC 2010.
>>> Anyway, I appreciate your advice.
>>>
>>>
>>> By the way, I have one more question.
>>> Could you tell me how much I have to write the proposal concretely?
>>> I have to write how to implement the programs and when I write each?
>>>
>>> Best regards
>>>
>>> Kazuhiro
>>>
>>> ( I'm sorry if you have already received the same mail. I sent it
>>> yesterday, but I haven't received yet....)
>>>
>>> --
>>> Kazuhiro Hayashi
>>> Department of Computational Biology,  The University of Tokyo
>>> email: k_hayashi at cb.k.u-tokyo.ac.jp
>>> tel: 04-7136-3988
>>
> 
> 
> 


From sararayburn at gmail.com  Sat Mar 27 20:13:01 2010
From: sararayburn at gmail.com (Sara Rayburn)
Date: Sat, 27 Mar 2010 15:13:01 -0500
Subject: [BioRuby] GSOC 2010 preliminary proposal question
Message-ID: <C0EB196E-8786-482B-95BC-4F5064F8E77B@gmail.com>

Hello all. My name is Sara Rayburn. I'm a doctoral student at the University of Louisiana at Lafayette. I am planning to submit a proposal to implement the speciation/duplication inference algorithm this summer. I'd like to tackle both the implementation and the extension to non-binary trees. In reading the posted reference on reconciliation in non-binary trees, there are two types of duplications referenced, required and conditional duplications. In an implementation of this approach, would it be better to identify only required duplications and clear speciations, or should there be an additional distinction for the conditional duplications?

I hope to post a preliminary project plan and proposal for feedback in the next couple of days. Thanks in advance for your feedback.


Sara Rayburn
University of Louisiana at Lafayette
sararayburn at gmail.com


From czmasek at burnham.org  Mon Mar 29 23:32:12 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Mon, 29 Mar 2010 16:32:12 -0700
Subject: [BioRuby] Beta application for review: BioRuby - Simple
 duplication inference implementation
In-Reply-To: <C6EF422C-F22E-41D1-AF43-166C790502B8@gmail.com>
References: <C6EF422C-F22E-41D1-AF43-166C790502B8@gmail.com>
Message-ID: <4BB1387C.6090503@burnham.org>

Hi, Jure:

Your application seems to be on the right track.

In general, your time table needs to be more detailed.
For each step you should list:
1. Goal/deliverable (you have that)
2. Approach
3. Time estimate  (you have that)
4. Anticipated problems & possible alternative approaches


Some more comments:

> 
> *The idea:*
> 
> We would implement the simple and fast duplication inference algorithm 
> described by Zmasek and Eddy (Zmasek and Eddy, 2001, "A simple algorithm 
> to infer gene duplication and speciation events on a gene tree"). Finding 
> gene duplications is an extremely important part of bioinformatics and 
> biomedical research, as duplications are thought to be powerful drivers 
> in the evolution of new protein function. 

I think 'extremely important part of bioinformatics' is somewhat of an 
exaggeration and too vague. It would be better to write about how gene duplications 
complicate efforts on gene function prediction, and their significance 
in (the theory of) molecular evolution.


> It is thus important to find 
> gene duplication sequences, which when translated are more likely to be 
> functionally different, and distinguish them from gene speciation 
> sequences, which are more likely functionally equivalent.

'gene duplication sequences' should be 'genes related by a duplication' 
or similar.
'gene speciation sequences' should be 'genes related by a speciation' or 
similar.


> Currently the algorithm supports rooted fully binary trees and we would 
> like to change that, by also implementing support for unrooted and 
> non-binary trees.

Goals are like this:
1. Implement algorithm as it is
2. Allow rooting of unrooted gene trees by minimizing sum of duplications.
Optional:
3. Extend algorithm to work on non-binary species trees
4. Extend algorithm to work on non-binary gene trees
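
Goal 1 (the algorithm as it is, for rooted and fully binary trees) can be
sketched in Ruby roughly as follows. The mapping function M sends each gene
tree leaf to the species tree leaf of its species, and each internal gene
tree node to the lowest common ancestor (LCA) of the nodes its two children
map to; a node is a duplication when M(node) equals M of either child, and
a speciation otherwise. The names TreeNode, collect_leaves, lca and sdi are
illustrative assumptions and not BioRuby or forester API, and the LCA is
found naively by walking parent links, with no attempt at the efficiency of
the published algorithm. Rooting of unrooted gene trees (goal 2) is not
covered.

```ruby
# Minimal tree node for this sketch (hypothetical, not a BioRuby class).
# `species` is set only on gene tree leaves and names a species tree leaf.
class TreeNode
  attr_accessor :name, :children, :parent, :species, :mapped, :event

  def initialize(name, children = [], species: nil)
    @name = name
    @children = children
    @species = species
    @children.each { |child| child.parent = self }
  end

  def leaf?
    @children.empty?
  end
end

# Index the leaves of a tree by name.
def collect_leaves(node, acc = {})
  if node.leaf?
    acc[node.name] = node
  else
    node.children.each { |child| collect_leaves(child, acc) }
  end
  acc
end

# Number of edges between a node and the root.
def depth(node)
  d = 0
  while node.parent
    node = node.parent
    d += 1
  end
  d
end

# Naive lowest common ancestor of two nodes of the same tree.
def lca(a, b)
  da = depth(a)
  db = depth(b)
  while da > db
    a = a.parent
    da -= 1
  end
  while db > da
    b = b.parent
    db -= 1
  end
  until a.equal?(b)
    a = a.parent
    b = b.parent
  end
  a
end

# Postorder traversal of the gene tree: sets `mapped` (the value of M) and
# `event` on every internal node and returns the number of duplications.
def sdi(gene_node, species_leaves)
  if gene_node.leaf?
    gene_node.mapped = species_leaves.fetch(gene_node.species)
    return 0
  end
  raise ArgumentError, 'gene tree must be binary' unless gene_node.children.size == 2

  count = gene_node.children.inject(0) { |sum, child| sum + sdi(child, species_leaves) }
  left, right = gene_node.children.map(&:mapped)
  gene_node.mapped = lca(left, right)
  if gene_node.mapped.equal?(left) || gene_node.mapped.equal?(right)
    gene_node.event = :duplication
    count + 1
  else
    gene_node.event = :speciation
    count
  end
end

# Usage: species tree ((A,B),C); gene tree ((a1,b1),(a2,c1)).
species_root = TreeNode.new('root', [TreeNode.new('AB', [TreeNode.new('A'), TreeNode.new('B')]),
                                     TreeNode.new('C')])
x = TreeNode.new('x', [TreeNode.new('a1', [], species: 'A'), TreeNode.new('b1', [], species: 'B')])
y = TreeNode.new('y', [TreeNode.new('a2', [], species: 'A'), TreeNode.new('c1', [], species: 'C')])
gene_root = TreeNode.new('g', [x, y])
puts sdi(gene_root, collect_leaves(species_root)) # prints 1 (the gene tree root)
```

In this toy example both x and y are speciations, while the gene tree root
maps to the species tree root, which is also where y maps, so the root is
the single duplication.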


> 
> *The work:*
> 
> There are several milestones to be reached in developing this idea and 
> this is the work plan I propose:
> 1. Development of unit tests with known species and gene trees (1 week).
> 
> 2. Making or reusing necessary data structures, made easier by last 
> year's GSoC contribution implementing phyloXML in BioRuby (1/2 week - 1 
> week):
> - gene tree,
> - species tree,
> - tree node,
> - children(),
> - parent().
> 
> 3. Developing checks for the correctness of input data for rooted fully 
> binary trees SDI (1/2 week - 1 week):
> - making sure trees are rooted and binary,
> - all species/gene tree nodes have at least one type of taxonomic data.
> - making a taxonomy base from a type of data present in all nodes 
> (scientific or common name, taxonomy code, id),
> - making sure taxonomic data is unique throughout external nodes.
> 4. Implementation of the recursive M function (1 week)
> - traverse the gene tree in postorder (left subtree, right subtree, root),
> - finding occurrences where M(parent) equals M(child 1 or 2) - this is 
> representative for finding a duplication. If M(parent) matches neither, 
> the processed node is a speciation.
> 
> 5. Milestone - finished implementation of SDI for rooted fully binary 
> trees (1/2 week):
> - Extensive testing,
> - cleaning up.
> 
> 6. Working on unrooted non-binary trees implementation (4-8 weeks):
> - Look to the forester java library SDI module for insight (by the 
> mentor of this project, Zmasek),
> - Doing some heavy lifting,
> - at this point I consider this implementation a possible pitfall, 
> because of substantially increased complexity.

This needs to be much more detailed.
Species trees are always rooted.
Unrooted gene trees can be handled naively by rooting them in all 
possible places, and running the SDI algorithm on each differently 
rooted tree, and keeping the gene tree which has the lowest number of 
duplications.
A more efficient approach for this is described in:
Zmasek and Eddy (2002). RIO: analyzing proteomes by automated 
phylogenomics using resampled inference of orthologs. BMC 
Bioinformatics. 2002 May 16;3:14.
See: 
http://evogsoc2010.wordpress.com/2010/03/25/references-for-gene-duplications-proposal/


> 
> 7. Finishing up (1 week):
> - Extensive testing,
> - cleaning up.
> 
> *Why me?:*
> 
> I like to set foot on unknown territory and challenge myself constantly. 
> That being said, I have long searched for something that would connect 
> my love of medicine to my love of programming, and now, thanks to GSoC 
> and OBF, I think I found it - bioinformatics. I am at a stage of my 
> medical study, where I have to decide what my future will entail, and I 
> am (now, after thinking about it for a long time) positive that 
> bioinformatics will be a big part of it. What better way to get future 
> off to a good start, than with a Google Summer of Code project? Based on 
> this enthusiasm alone you can be assured that I'll work really hard on 
> this project and that I will be happy to see it done. As this would be 
> my first serious open source engagement, you also have a chance of 
> forming a completely new addition to the open source world and making an 
> excellent contributor out of me.
> 
> *Previous experience:*
> 
> 1. I have been working on a simulation of an analytical chemistry method 
> for the past 2 years now, more specifically we have modeled laser 
> ablation + inductively coupled plasma mass spectrometry with a simple 
> model, which aids our elemental mapping projects. For the write-up of 
> this project I have been awarded with a "Prešernovo priznanje" in 2008 
> (PDF upon request). This work entails several interesting components, 
> from basics such as: C# development, image input, output, multi-threaded 
> programming, UI development; to complex themes such as: genetic 
> algorithms and neural networks. All of which I learned as we worked on 
> the project without much hassle (source code upon request). This work is 
> not yet open source, because we are in the finalizing stages of the 
> paper and will release the source code after publication under an open 
> source license.  
> 
> 2. I have programmed since I was a child and I have developed a wide 
> spectrum of things in my lifetime (from a full CMS in PHP to an IRC 
> robot, source code upon request), but I have little experience in fully 
> open source projects, which I think so highly of.
> 
> *Biography:*
> 
> My name is Jure Triglav and I'm a 24 year old medical student from 
> Ljubljana, Slovenia. I was born in a small town of Murska Sobota in 
> Slovenia, where I went to grade school (graded excellent for all years, 
> awarded "Zoisova štipendija" for the gifted, which I still hold) and 
> high-school (excellent, finished as "Zlati maturant" in the company of 
> about 200 best students in the country). I moved to Ljubljana in 2004 to 
> study medicine. I am now in the last year of my medical study which I 
> find challenging and very interesting. 
> My hobbies are all over the place, from book design to photography, from 
> web design to typography, from guitar to poetry, from reading to 
> programming, from traveling to sports. 
> 
>   
> 
> *Other obligations for the summer:*
> 
> I have 5-hour daily clinical practice every weekday in June, July and 
> August, which is not nearly as serious as it sounds, especially since 
> this is the summer rotation which is known for its laid back feel. These 
> practice sessions start at 8 am and finish at 1 pm, and for students are not 
> really stressful or exhausting at all. I have in the past juggled many 
> research obligations with clinical practice and my studies without 
> hiccups, but I will not do this this summer and will dedicate 8 hours 
> daily to Google Summer of Code, as I realize what a great opportunity 
> this is and how much work is required. I have no other work, research or 
> vacation obligations for the period of Google Summer of Code.

Nevertheless, this sounds like a serious concern.

> 
> *Contact information: *
> 
> (I will provide additional contact information in the final application)
> Name: Jure Triglav
> E-mail: juretriglav at gmail.com <mailto:juretriglav at gmail.com>
> IRC handle: x` on #obf-soc, #gsoc
> 


From czmasek at burnham.org  Mon Mar 29 23:39:29 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Mon, 29 Mar 2010 16:39:29 -0700
Subject: [BioRuby] Google summer of code 2010 - Stathis Kamperis
In-Reply-To: <2218b9af1003290119q1c6b2eeclc3c84ffdbaa97b2a@mail.gmail.com>
References: <2218b9af1003290119q1c6b2eeclc3c84ffdbaa97b2a@mail.gmail.com>
Message-ID: <4BB13A31.8020203@burnham.org>

Hi, Stathis:

Thank you for your interest in this proposal!

Stathis Kamperis wrote:
> Dear Dr. Zmasek,
> 
> my name is Stathis Kamperis and I'm interested in this year's Google
> Summer of Code project:
> "Implementation of algorithm to infer gene duplications in BioRuby".
> 
> I am a medicine graduate, physics undergraduate and computer
> enthusiast. I come from Greece and I am 26 years old.
> I have a long standing programming experience with a vast range of
> programming languages including, since recently, Ruby.
> I also have a decent molecular/biology background.
> 
> I successfully participated in last year's Google Summer of Code
> working for the DragonFlyBSD[1] organisation. My work had to do with
> POSIX standard conformance audit, regression testing and quality
> assurance.
> 
> As I understand, the project is about implementing your algorithm in
> BioRuby. Is there any prototype implemented in any language/framework
> at the moment ?

Yes, there is:
See: 
http://forester-atv.cvs.sourceforge.net/viewvc/forester-atv/forester-atv/java/src/org/forester/sdi/

Especially, SDI.java and SDIR.java (for unrooted trees)


> In your abstract you mention:
> "We show empirically, using 1750 gene trees constructed from the Pfam
> protein family database, that it appears to be a practical (and often
> superior) algorithm for analyzing real gene trees."
> So, I wonder, what does 'empirically' mean here or how did you conduct
> your tests ?

Essentially, my Java implementation was used to run these tests.

Hope this helps,

Christian


From czmasek at burnham.org  Tue Mar 30 00:01:10 2010
From: czmasek at burnham.org (Christian M Zmasek)
Date: Mon, 29 Mar 2010 17:01:10 -0700
Subject: [BioRuby] GSOC 2010 preliminary proposal question
In-Reply-To: <C0EB196E-8786-482B-95BC-4F5064F8E77B@gmail.com>
References: <C0EB196E-8786-482B-95BC-4F5064F8E77B@gmail.com>
Message-ID: <4BB13F46.7010607@burnham.org>

Hi, Sara:

Thank you for your interest in this proposal!

I think focusing on 'required' duplications is appropriate, since 
non-binary species trees are oftentimes a means to express uncertainty 
in the "tree-of-life" and to prevent introduction of spurious 
duplications due to this.

Christian


Sara Rayburn wrote:
> Hello all. My name is Sara Rayburn. I'm a doctoral student at the
> University of Louisiana at Lafayette. I am planning to submit a
> proposal to implement the speciation/duplication inference algorithm
> this summer. I'd like to tackle both the implementation and the
> extension to non-binary trees. In reading the posted reference on
> reconciliation in non-binary trees, there are two types of
> duplications referenced, required and conditional duplications. In an
> implementation of this approach, would it be better to identify only
> required duplications and clear speciations, or should there be an
> additional distinction for the conditional duplications?
> 
> I hope to post a preliminary project plan and proposal for feedback in the next couple of days. Thanks in advance for your feedback.
> 
> 
> 
> Sara Rayburn
> University of Louisiana at Lafayette
> sararayburn at gmail.com
> 
> 
> 
> 
> 
> _______________________________________________
> BioRuby Project - http://www.bioruby.org/
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby