From biopython at maubp.freeserve.co.uk  Thu Apr 15 13:54:56 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 15 Apr 2010 18:54:56 +0100
Subject: [BioSQL-l] [Biojava-l] Issue with SimpleNCBITaxon class
In-Reply-To: <D75549FE-6866-4397-ACEE-A897C719C441@eaglegenomics.com>
References: <4BC2200D.8000109@gmail.com>
	<B2DECC5B-650E-434E-8955-5F02DB4297AC@eaglegenomics.com>
	<4BC23A46.7090304@gmail.com>
	<D75549FE-6866-4397-ACEE-A897C719C441@eaglegenomics.com>
Message-ID: <m2o320fb6e01004151054rcb57a28fvad135dffbe35d5fa@mail.gmail.com>

Hi,

I've CC'd this to the BioSQL mailing list for cross project
discussion.

On Mon, Apr 12, 2010 at 7:57 AM, Richard Holland  wrote:
> Thanks Deepak.
>
> I've had a look at the code and I believe its due to the
> different ways in which BioJava and BioPerl load the
> taxon table.
>
> BioJava sets the ncbi_taxon_id and parent_taxon_id
> columns based on the values from the NCBI taxonomy
> file. The taxon_id column in BioJava is a meaningless
> auto-generated value that is never used.
>
> BioPerl however is generating taxon_id values and
> linking them by setting parent_taxon_id to the
> generated value. The parent value from the NCBI
> taxonomy file is therefore replaced with the BioPerl
> generated parent ID, meaning that instead of linking
> from parent_taxon_id to ncbi_taxon_id as per BioJava,
> the link is to taxon_id instead. (I'm basing this
> comment on looking at load_ncbi_taxonomy.pl from
> the BioSQL archives.)

Note that old versions of load_ncbi_taxonomy.pl
(which is part of BioSQL, not part of BioPerl) would
set taxon_id equal to ncbi_taxon_id, see:
http://bugzilla.open-bio.org/show_bug.cgi?id=2470

This may help explain the confusion.

> I believe if you load the taxonomy table using BioJava,
> you should see BioJava giving correct behaviour.
> Likewise if you load it using BioPerl, BioPerl will
> behave correctly. But if you load with one then query
> with the other, you'll get incorrect results.
>
> This sounds like a case for discussion on both lists -
> a matter of standardisation between the two projects.
> Not quickly/easily solvable for now.

It's not just two projects (BioPerl & BioJava) (grin).
It's at least five projects (BioSQL itself plus BioRuby
and Biopython).

I'm not sure about BioRuby's implementation, but
currently I think BioJava is the odd one out - BioPerl,
Biopython, and the BioSQL's load_ncbi_taxonomy.pl
all make entries in parent_taxon_id reference the
automatically generated taxon_id (please correct
me if I am wrong).
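
To make the difference concrete, here is a minimal sketch of the
two conventions in Python with sqlite3. It is an illustration only,
not code from any of the projects: the table is a cut-down stand-in
for the BioSQL taxon table, and the names load(), parent_ncbi_id(),
"internal" and "ncbi" are made up for this example.

```python
import sqlite3

# Cut-down stand-in for the BioSQL taxon table (assumption: the real
# table has more columns, which do not matter for this illustration).
SCHEMA = """
CREATE TABLE taxon (
    taxon_id        INTEGER PRIMARY KEY,
    ncbi_taxon_id   INTEGER UNIQUE,
    parent_taxon_id INTEGER
)
"""

# (ncbi_taxon_id, NCBI parent id) pairs; 9606 is Homo sapiens.
NCBI_NODES = [(9606, 9605), (9605, 207598), (207598, None)]


def load(convention):
    """Load NCBI_NODES using one of two hypothetical conventions.

    "internal": parent_taxon_id holds the parent's generated taxon_id.
    "ncbi":     parent_taxon_id holds the parent's ncbi_taxon_id.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    # Generated ids start at 1000 so they are visibly not NCBI ids.
    local = {ncbi_id: 1000 + i for i, (ncbi_id, _) in enumerate(NCBI_NODES)}
    for ncbi_id, ncbi_parent in NCBI_NODES:
        if convention == "internal":
            parent = local.get(ncbi_parent)
        else:
            parent = ncbi_parent
        conn.execute(
            "INSERT INTO taxon VALUES (?, ?, ?)",
            (local[ncbi_id], ncbi_id, parent),
        )
    return conn


def parent_ncbi_id(conn, ncbi_id, convention):
    """Return the parent's ncbi_taxon_id, assuming the given convention."""
    column = "taxon_id" if convention == "internal" else "ncbi_taxon_id"
    row = conn.execute(
        "SELECT p.ncbi_taxon_id FROM taxon c "
        f"JOIN taxon p ON c.parent_taxon_id = p.{column} "
        "WHERE c.ncbi_taxon_id = ?",
        (ncbi_id,),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    for loaded in ("internal", "ncbi"):
        for queried in ("internal", "ncbi"):
            result = parent_ncbi_id(load(loaded), 9606, queried)
            print(f"loaded={loaded} queried={queried} parent={result}")
```

When the loading and querying conventions agree, the parent of 9606
comes back as 9605. When they disagree, this toy data gives no
parent at all; in a real database, where generated and NCBI ids can
overlap, a mismatch could instead return the wrong taxon.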

My personal view is that bioperl-db is the reference
implementation and should be followed in the event
of any ambiguity within BioSQL. In this particular
case, there is actually a BioSQL script to check
against too (load_ncbi_taxonomy.pl).

Hopefully Hilmar can give us an official verdict...

Peter

From rmb32 at cornell.edu  Sat Apr  3 16:09:27 2010
From: rmb32 at cornell.edu (Robert Buels)
Date: Sat, 03 Apr 2010 13:09:27 -0700
Subject: [BioSQL-l] Google Summer of Code is *ON* for OBF projects!
Message-ID: <4BB7A077.4070802@cornell.edu>

Hi all,

Reminder:  GSoC student proposals must be submitted to Google by April 
9th, 19:00 UTC.  That's less than a week away.

Students: you should ALREADY be working with mentors on the project 
mailing lists, they can help you get your proposal into shape.

So far, we have 5 proposals submitted to our org in Google's web app. 
Keep them coming, and let's see some really good ones!

Rob Buels
OBF GSoC 2010 Administrator


From rmb32 at cornell.edu  Sun Apr  4 00:37:38 2010
From: rmb32 at cornell.edu (Robert Buels)
Date: Sat, 03 Apr 2010 21:37:38 -0700
Subject: [BioSQL-l] Reminder: GSoC student applications due April 9,
	19:00 UTC
Message-ID: <4BB81792.8060001@cornell.edu>

Hi all,

Sending this again with a different subject line, just in case.

GSoC student proposals must be submitted to Google through their web 
application by *April 9th, 19:00 UTC*.  That's less than a week away.

Students: you should ALREADY be working with mentors on the project
mailing lists, they can help you get your proposal into shape.

So far, we have 6 proposals submitted to our org in Google's web app.
Keep them coming, and keep them good!

Rob Buels
OBF GSoC 2010 Administrator


From rohitrrj at gmail.com  Mon Apr  5 14:14:30 2010
From: rohitrrj at gmail.com (Rohit Jadhav)
Date: Mon, 5 Apr 2010 23:44:30 +0530
Subject: [BioSQL-l] Student internship program for open-source projects
Message-ID: <r2s5cbb43251004051114ud8bfc0eer4869a766e77c24e2@mail.gmail.com>

Dear Sir/Madam,
This is in reference to the Google Summer of Code programme.

I am Rohit Jadhav, a Masters student in Bioinformatics at Indiana University
Purdue University Indianapolis (USA). I am looking for a Co-op/Internship
position for the Summer 2010. My areas of interest include Bioinformatics,
Data Mining and Systems Biology.

It is worth mentioning some of the courses I took recently in the Fall
semester of 2009, which inspired me to work in the areas of data mining
and systems biology. The introductory courses in Informatics and
Bioinformatics were instrumental in giving me current state-of-the-art
knowledge in these areas. The advanced course in Biostatistics helped me
solve the statistical problems addressed in most bioinformatics papers.
In Spring 2010 I am taking a course on biological database management,
which will help me improve my knowledge of biological databases. The
course on computational systems biology is a research-oriented course
which will help me keep up with current advances in the area. The
translational bioinformatics course will help me build on my current
knowledge of dealing with high-throughput techniques like microarrays.


I have also worked as a web developer at the university's information
technology services department, where I was part of the web tech services
team. It was a really valuable experience, as it exposed me to almost all
the stages of the website development life cycle, from understanding the
complex problem through data retrieval, design, development and testing,
while applying my knowledge of Perl, C#, Java and other languages.

I have an undergraduate degree in Bioinformatics and am currently pursuing
my further studies in the field as a graduate student.


I am an Indian national with Indian citizenship. I have already applied
online and I'll be glad to furnish any more details about the projects I
have undertaken.

I am looking forward to hearing from you.

Sincerely,

-- 
Rohit Jadhav
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RohitJadhav-resume.doc
Type: application/msword
Size: 48128 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biosql-l/attachments/20100405/e67899b9/attachment-0001.doc>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RohitJadhav-SOP.pdf
Type: application/pdf
Size: 27547 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biosql-l/attachments/20100405/e67899b9/attachment-0001.pdf>

From biopython at maubp.freeserve.co.uk  Thu Apr 15 14:23:52 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 15 Apr 2010 19:23:52 +0100
Subject: [BioSQL-l] SQLite support
In-Reply-To: <D7EF9032-2760-42A5-94E1-E88DBB2C0146@gmx.net>
References: <1f864af10812150224y540f1ba6y6b30168102885fcd@mail.gmail.com>
	<320fb6e00907050324i6d64d3abreb4d0c256bf1bdc4@mail.gmail.com>
	<320fb6e00907090529t61239952y1c86963f13c1db78@mail.gmail.com>
	<320fb6e00907280458q56f74ec6iefa420ac1caab8da@mail.gmail.com>
	<320fb6e00911240627o49bc1ec9nc0d26065ebc23423@mail.gmail.com>
	<070E8BA8-B2C1-4E44-AA2D-9934B3742406@illinois.edu>
	<320fb6e00911240907u32dca751ldb488cbc38f0e035@mail.gmail.com>
	<320fb6e00912100703g4e2b7068jb4fea67df3ebd8a8@mail.gmail.com>
	<320fb6e01001130337p1e0a361ci7ea1a5b5a9639731@mail.gmail.com>
	<D7EF9032-2760-42A5-94E1-E88DBB2C0146@gmx.net>
Message-ID: <i2j320fb6e01004151123u7dd6b1a1le39c57dd2deef035@mail.gmail.com>

On Wed, Jan 13, 2010 at 6:06 PM, Hilmar Lapp <hlapp at gmx.net> wrote:
>
> Hi Peter, yes, I know I'm remiss on doing that. Will do shortly. Please
> don't stop pestering if I seem to have forgotten :-)
>
>        -hilmar

Cough cough ;-)

Peter


From biopython at maubp.freeserve.co.uk  Thu Apr 15 14:34:26 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 15 Apr 2010 19:34:26 +0100
Subject: [BioSQL-l] Student internship program for open-source projects
In-Reply-To: <r2s5cbb43251004051114ud8bfc0eer4869a766e77c24e2@mail.gmail.com>
References: <r2s5cbb43251004051114ud8bfc0eer4869a766e77c24e2@mail.gmail.com>
Message-ID: <w2i320fb6e01004151134o7e8c6cc4u365e3b243974fe09@mail.gmail.com>

On Mon, Apr 5, 2010 at 7:14 PM, Rohit Jadhav <rohitrrj at gmail.com> wrote:
> Dear Sir/Madam,
> This is in reference to the Google Summer of Code programme.
>

Hi Rohit,

It seems you (and a few other students) had tried emailing the
BioSQL mailing list without first subscribing, and your messages
were held in a moderation queue until recently. Hopefully Robert
or Hilmar replied to you directly since the GSoC application
deadline has now passed.

Peter

From rmb32 at cornell.edu  Tue Apr 27 01:52:57 2010
From: rmb32 at cornell.edu (Robert Buels)
Date: Mon, 26 Apr 2010 22:52:57 -0700
Subject: [BioSQL-l] Google Summer of Code - accepted students
Message-ID: <4BD67BB9.3000804@cornell.edu>

Hi all,

I'm pleased to announce the acceptance of OBF's 2010 Google Summer of
Code students, listed in alphabetical order with their project titles
and primary mentors:

Mark Chapman (PM Andreas Prlic) - Improvements to BioJava including
Implementation of Multiple Sequence Alignment Algorithms

Jianjiong Gao (PM Peter Rose) - BioJava Packages for Identification,
Classification, and Visualization of Posttranslational Modification of
Proteins

Kazuhiro Hayashi (PM Naohisa Goto) - Ruby 1.9.2 support of BioRuby

Sara Rayburn (PM Christian Zmasek) - Implementing Speciation &
Duplication Inference Algorithm for Binary and Non-binary Species Tree

Joao Pedro Garcia Lopes Maia Rodrigues (PM Eric Talevich) - Extending
Bio.PDB: broadening the usefulness of BioPython's Structural Biology module

Jun Yin (PM Chris Fields) - BioPerl Alignment Subsystem Refactoring

Congratulations to our accepted students!

All told, we had 52 applications submitted for the 6 slots (5 originally
assigned, plus 1 extra) allotted to us by Google.  Proposals were
extremely competitive: 6 out of 52 translates to an 11.5% acceptance
rate.  We received a lot of really excellent proposals, the decisions
were not easy.

Thanks very much to all the students who applied, we very much
appreciate your hard work.

Here's to a great 2010 Summer of Code, I'm sure these students will do
some wonderful work.

Rob Buels
OBF GSoC 2010 Administrator


From sheoran143 at gmail.com  Fri Apr 16 14:43:55 2010
From: sheoran143 at gmail.com (Deepak Sheoran)
Date: Fri, 16 Apr 2010 18:43:55 -0000
Subject: [BioSQL-l] [Biojava-l] Issue with SimpleNCBITaxon class
In-Reply-To: <m2o320fb6e01004151054rcb57a28fvad135dffbe35d5fa@mail.gmail.com>
References: <4BC2200D.8000109@gmail.com>	
	<B2DECC5B-650E-434E-8955-5F02DB4297AC@eaglegenomics.com>	
	<4BC23A46.7090304@gmail.com>	
	<D75549FE-6866-4397-ACEE-A897C719C441@eaglegenomics.com>
	<m2o320fb6e01004151054rcb57a28fvad135dffbe35d5fa@mail.gmail.com>
Message-ID: <4BC8AFEF.70107@gmail.com>

In my experience we should make use of taxon_id, because it is a unique
key within a local instance of BioSQL. ncbi_taxon_id should be used for
mapping purposes only, so that a person can map their local taxon_id to
an ncbi_taxon_id; otherwise it defeats the purpose of having taxon_id as
the primary key of the taxon table. I think the main goal when BioSQL
was designed was to make it independent of any other organization such
as GenBank or NCBI; ncbi_taxon_id is a feature that lets us map a number
given by a known authority to a local number (taxon_id).
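
As a rough sketch of what I mean (Python with sqlite3; the table is
a simplified stand-in for the real taxon table, the rows are made-up
sample data, and local_id() and lineage() are hypothetical helper
names), ncbi_taxon_id is used once to find the local row, and
everything after that follows taxon_id only:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified taxon table: parent_taxon_id refers to the local taxon_id.
conn.execute(
    "CREATE TABLE taxon ("
    " taxon_id INTEGER PRIMARY KEY,"
    " ncbi_taxon_id INTEGER UNIQUE,"
    " parent_taxon_id INTEGER REFERENCES taxon(taxon_id))"
)
# Made-up sample rows. Row 4 is a local taxon with no NCBI id at all,
# which only works because the hierarchy is keyed on taxon_id.
conn.executemany(
    "INSERT INTO taxon VALUES (?, ?, ?)",
    [(1, 207598, None), (2, 9605, 1), (3, 9606, 2), (4, None, 3)],
)


def local_id(conn, ncbi_id):
    """Map an ncbi_taxon_id to the local taxon_id, or None if absent."""
    row = conn.execute(
        "SELECT taxon_id FROM taxon WHERE ncbi_taxon_id = ?", (ncbi_id,)
    ).fetchone()
    return row[0] if row else None


def lineage(conn, taxon_id):
    """Return local taxon_ids from the given node up to the root."""
    path = []
    while taxon_id is not None:
        path.append(taxon_id)
        row = conn.execute(
            "SELECT parent_taxon_id FROM taxon WHERE taxon_id = ?",
            (taxon_id,),
        ).fetchone()
        taxon_id = row[0] if row else None
    return path


if __name__ == "__main__":
    print(lineage(conn, local_id(conn, 9606)))
    print(lineage(conn, 4))
```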

Deepak Sheoran

On 4/15/2010 12:54 PM, Peter wrote:
> Hi,
>
> I've CC'd this to the BioSQL mailing list for cross project
> discussion.
>
> On Mon, Apr 12, 2010 at 7:57 AM, Richard Holland  wrote:
>    
>> Thanks Deepak.
>>
>> I've had a look at the code and I believe its due to the
>> different ways in which BioJava and BioPerl load the
>> taxon table.
>>
>> BioJava sets the ncbi_taxon_id and parent_taxon_id
>> columns based on the values from the NCBI taxonomy
>> file. The taxon_id column in BioJava is a meaningless
>> auto-generated value that is never used.
>>
>> BioPerl however is generating taxon_id values and
>> linking them by setting parent_taxon_id to the
>> generated value. The parent value from the NCBI
>> taxonomy file is therefore replaced with the BioPerl
>> generated parent ID, meaning that instead of linking
>> from parent_taxon_id to ncbi_taxon_id as per BioJava,
>> the link is to taxon_id instead. (I'm basing this
>> comment on looking at load_ncbi_taxonomy.pl from
>> the BioSQL archives.)
>>      
> Note that old versions of load_ncbi_taxonomy.pl
> (which is part of BioSQL, not part of BioPerl) would
> set taxon_id equal to ncbi_taxon_id, see:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2470
>
> This may help explain the confusion.
>
>    
>> I believe if you load the taxonomy table using BioJava,
>> you should see BioJava giving correct behaviour.
>> Likewise if you load it using BioPerl, BioPerl will
>> behave correctly. But if you load with one then query
>> with the other, you'll get incorrect results.
>>
>> This sounds like a case for discussion on both lists -
>> a matter of standardisation between the two projects.
>> Not quickly/easily solvable for now.
>>      
> Its not just two projects (BioPerl & BioJava) (grin).
> Its at least five projects (BioSQL itself plus BioRuby
> and Biopython).
>
> I'm not sure about BioRuby's implementation, but
> currently I think BioJava is the odd one out - BioPerl,
> Biopython, and the BioSQL's load_ncbi_taxonomy.pl
> all make entries in parent_taxon_id reference the
> automatically generated taxon_id (please correct
> me if I am wrong).
>
> My personal view is that bioperl-db is the reference
> implementation and should be followed in the event
> of any ambiguity within BioSQL. In this particular
> case, there is actually a BioSQL script to check
> against too (load_ncbi_taxonomy.pl).
>
> Hopefully Hilmar can give us an official verdict...
>
> Peter
>    

