From rmb32 at cornell.edu  Sun Aug  1 15:17:14 2010
From: rmb32 at cornell.edu (Robert Buels)
Date: Sun, 01 Aug 2010 12:17:14 -0700
Subject: [Biojava-l] GMOD Evo Hackathon Open Call for Participation
Message-ID: <4C55C83A.3060700@cornell.edu>

We are seeking participants for the GMOD Tools for Evolutionary Biology 
Hackathon, held November 8-12, 2010 at the US National Evolutionary 
Synthesis Center (NESCent) in Durham, NC.

This hackathon targets three critical gaps in the capabilities of the 
GMOD toolbox that currently limit its utility for evolutionary research:

  1. Visualization of comparative genomics data
  2. Visualization of phylogenetic data and trees
  3. Support for population diversity and phenotype data

If you are interested in these areas and have relevant expertise, you 
are strongly encouraged to apply. Relevant areas of expertise include 
more than just software development: if you are a GMOD power user, 
visualization guru, domain expert (comparative, phylogenetics, 
population, ...), or documentation wizard, then your skills are needed!

How To Apply:

Fill out the online application form at http://bit.ly/gmodevohack. 
Applications are due August 25.

About GMOD:

GMOD is an intercompatible suite of open-source software components for 
storing, managing, analyzing, and visualizing genome-scale data. GMOD 
includes many widely-used software components: GBrowse and JBrowse, both 
genome viewers; GBrowse_syn, a comparative genomics viewer; Chado, a 
generic and modular database schema; CMap, a comparative map viewer; as 
well as many other components including Apollo, MAKER, BioMart, 
InterMine, and Galaxy. We hope to extend the functionality of existing 
GMOD components, and integrate new components as well.

About Hackathons:

A hackathon is an intense event at which a group of programmers with 
different backgrounds and skills collaborate hands-on and face-to-face 
to develop working code that is of utility to the community as a whole. 
The mix of people will include domain experts and computer-savvy end-users.

More details about the event, its motivation, organization, procedures, 
and attendees, as well as URLs to the hackathon and related websites are 
included below.

Sincerely,

The GMOD EvoHack Organizing Committee (and project affiliations as
relevant):

Nicole Washington, Chair (LBNL, modENCODE, Phenote)

Robert Buels (SGN, Chado NatDiv)

Scott Cain (OICR, GMOD)

Dave Clements (NESCent, GMOD)

Hilmar Lapp (NESCent, Phenoscape, Chado NatDiv)

Sheldon McKay (University of Arizona, iPlant, GBrowse_syn)


-----------------------------

About the GMOD Evo Hackathon

Overview

We are organizing a hackathon to fill critical gaps in the capabilities 
of the Generic Model Organism Database (GMOD) toolbox that currently 
limit its utility for evolutionary research. Specifically, we will focus 
on tools for

   1) viewing comparative genomics data;
   2) visualizing phylogenomic data; and
   3) supporting population diversity data and phenotype annotation.

The event will be hosted at NESCent and bring together a group of 20 or
more software developers, end-user representatives, and documentation 
experts who would otherwise not meet. The participants will include key 
developers of GMOD components that currently lack features critical for 
emerging evolutionary biology research, developers of informatics tools 
in evolutionary research that lack GMOD integration, and 
informatics-savvy biologists who can represent end-user requirements.

The event will provide a unique opportunity to infuse the GMOD developer 
community with a heightened awareness of unmet needs in evolutionary 
biology that GMOD components have the potential to fill, and for tool 
developers in evolutionary biology to better understand how best to 
extend or integrate with already existing GMOD components.

Before the Event

Discussion of ideas and sometimes even design actually starts well 
before the hackathon, on mailing lists, wiki pages, and conference calls 
set up among accepted attendees.  This advance work lays the foundation 
for participants to be productive from the very first day.  This also 
means that participants should be willing to contribute some time in 
advance of the hackathon itself to participate in this preparatory 
discussion.

During the Event

Typically, hackathon participants use the morning of the first day of 
the event to organize themselves into working groups of between 3 and 6 
people, each with a focused implementation objective.  Ideas and 
objectives are discussed, and attendees coalesce around the projects in 
which they have the most experience or interest.


Deliverables / Event Results

The meeting's attendance, working groups, and outcomes will be fully 
logged and documented on the GMOD wiki (http://gmod.org). Each working 
group during the event will typically have its own wiki page, linked 
from the main EvoHack page, where it documents its minutes and design 
notes, and provides links to the code and documentation it produces. 
Also, since GMOD and NESCent are both committed to open source 
principles, all code and documentation produced by participants during 
the hackathon must be published under an OSI-approved open source 
license. As contributions to existing GMOD tools, all hackathon products 
will most likely satisfy this requirement automatically.

NESCent

This event is sponsored by the US National Evolutionary Synthesis Center 
(NESCent, http://www.nescent.org) through its Informatics Whitepapers 
program (http://www.nescent.org/informatics/whitepapers.php). NESCent 
promotes the synthesis of information, concepts and knowledge to address 
significant, emerging, or novel questions in evolutionary science and 
its applications. NESCent achieves this by supporting research and 
education across disciplinary, institutional, geographic, and 
demographic boundaries (see http://www.nescent.org/science/proposals.php).

Links

Main GMOD EvoHack page, and full proposal:
http://gmod.org/wiki/GMOD_Evo_Hackathon

NESCent: http://www.nescent.org/
GMOD: http://gmod.org/
Similar past NESCent events, see: http://hackathon.nescent.org/
GMOD hackathon application:  http://bit.ly/gmodevohack

-- 
http://gmod.org/wiki/GMOD_News
http://gmod.org/wiki/GMOD_Europe_2010
http://gmod.org/wiki/Help_Desk_Feedback


From darnells at dnastar.com  Mon Aug 16 18:26:13 2010
From: darnells at dnastar.com (Steve Darnell)
Date: Mon, 16 Aug 2010 17:26:13 -0500
Subject: [Biojava-l] SITE records in PDBFileReader
Message-ID: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>

I'm sorry for reposting this message.  I accidentally sent the previous one as HTML.

________________________________________
From: Steve Darnell 
Sent: Monday, August 16, 2010 5:19 PM
To: 'biojava-l at lists.open-bio.org'
Subject: SITE records in PDBFileReader

Greetings,

I am interested in parsing SITE records from a PDB file. I looked over the org.biojava.bio.structure API, but I was unable to find reference to this functionality. Does the PDBFileReader in BioJava extract SITE record information? If not, would it be possible to add this capability to PDBFileReader and the Structure class?

SITE record format at wwPDB: http://www.wwpdb.org/documentation/format32/sect7.html 

Regards,
Steve Darnell


From darnells at dnastar.com  Mon Aug 16 18:19:28 2010
From: darnells at dnastar.com (Steve Darnell)
Date: Mon, 16 Aug 2010 17:19:28 -0500
Subject: [Biojava-l] SITE records in PDBFileReader
Message-ID: <A4009967D1886D4286A9B7931FD586100258B100@FS1.dnastar.com>

Greetings,

 
I am interested in parsing SITE records from a PDB file.  I looked over
the org.biojava.bio.structure API, but I was unable to find reference to
this functionality.  Does the PDBFileReader in BioJava extract SITE
record information?  If not, would it be possible to add this capability
to PDBFileReader and the Structure class?

 
SITE record format at wwPDB:
http://www.wwpdb.org/documentation/format32/sect7.html 

 
Regards,

Steve Darnell


From andreas at sdsc.edu  Mon Aug 16 18:49:56 2010
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 16 Aug 2010 15:49:56 -0700
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
Message-ID: <AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>

Hi Steve,

thanks for the feature request. I will probably be able to add this at some
point in September. If you need it before then, I will be happy to
commit a patch if somebody else provides it...

Andreas



-- 
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------

From andreas at sdsc.edu  Mon Aug 16 19:58:48 2010
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 16 Aug 2010 16:58:48 -0700
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
	<AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
	<BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
Message-ID: <AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>

- Take a look at PDBFileParser.java and at
http://www.wwpdb.org/documentation/format32/sect7.html
- It needs a new handler method for the SITE records that builds up the data
containers.
- Create a new bean that will contain the data for the SITE record.
- Instead of having separate fields for insertion code, residue number, and
chain ID, you can use the new PDBResidueNumber.java class to group these together.
- Add a get/set method for the Site beans to the Structure class.
- Create a JUnit test that makes sure the parsing works OK.
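
As a rough illustration of the handler and bean steps listed above, here is a minimal sketch that parses a single SITE line by fixed columns. SiteRecordSketch, SiteLine, and SiteResidue are hypothetical names, not BioJava classes, and the column positions are an assumption based on the SITE table in the wwPDB format 3.2 document linked above; they should be checked against that table. In BioJava itself the chain ID, residue number, and insertion code would presumably go into PDBResidueNumber rather than a separate holder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these classes exist in BioJava.
class SiteRecordSketch {

    /** One residue listed on a SITE line (stand-in for PDBResidueNumber plus a residue name). */
    static class SiteResidue {
        final String resName;
        final char chainId;
        final int seqNum;
        final char insCode; // ' ' when the record carries no insertion code

        SiteResidue(String resName, char chainId, int seqNum, char insCode) {
            this.resName = resName;
            this.chainId = chainId;
            this.seqNum = seqNum;
            this.insCode = insCode;
        }
    }

    /** The data carried by a single SITE line. */
    static class SiteLine {
        final String siteId;
        final int numRes; // residues in the whole site, possibly spread over several lines
        final List<SiteResidue> residues = new ArrayList<SiteResidue>();

        SiteLine(String siteId, int numRes) {
            this.siteId = siteId;
            this.numRes = numRes;
        }
    }

    /** Parses one SITE line; assumed 1-based columns: siteID 12-14, numRes 16-17. */
    static SiteLine parseSiteLine(String line) {
        if (!line.startsWith("SITE")) {
            throw new IllegalArgumentException("not a SITE record: " + line);
        }
        // Pad to 80 columns so short lines can be sliced without bounds checks.
        String padded = String.format("%-80s", line);
        SiteLine site = new SiteLine(padded.substring(11, 14).trim(),
                Integer.parseInt(padded.substring(15, 17).trim()));
        // Up to four residues per line, each in an 11-column slot starting at column 19.
        for (int slot = 0; slot < 4; slot++) {
            int start = 18 + 11 * slot;
            String resName = padded.substring(start, start + 3).trim();
            if (resName.isEmpty()) {
                break; // no more residues on this line
            }
            char chainId = padded.charAt(start + 4);
            int seqNum = Integer.parseInt(padded.substring(start + 5, start + 9).trim());
            char insCode = padded.charAt(start + 9);
            site.residues.add(new SiteResidue(resName, chainId, seqNum, insCode));
        }
        return site;
    }
}
```

A JUnit test for the real parser could feed a line such as the one in the wwPDB example through the handler and compare the site ID, residue count, and residue numbers, much as one would check parseSiteLine here.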

Hope that makes sense...
Andreas




From amr_alhossary at hotmail.com  Mon Aug 16 19:48:18 2010
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Tue, 17 Aug 2010 01:48:18 +0200
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
	<AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
Message-ID: <BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>

If you like, it would be my pleasure to do it for you.
Just tell me where to start (in the code).

Amr



From jbdundas at gmail.com  Mon Aug 16 22:43:02 2010
From: jbdundas at gmail.com (jitesh dundas)
Date: Tue, 17 Aug 2010 08:13:02 +0530
Subject: [Biojava-l] BioJava 3Proposal tasks
Message-ID: <AANLkTinzPg8r-veyi+LYZ4pvtSbuTnso=idtVGGV15ta@mail.gmail.com>

Dear All,

Sorry for sending this again, but I don't see it anywhere on the list.
Please post it.

I went through the BioJava3 proposal, as you suggested earlier. There
are a few things that I could take up without much difficulty.

 I can find out how Hibernate can best be deployed for BioJava. Please
note that I suggest we use only Hibernate 3 or higher;
Hibernate 2 has implementation and performance issues.
 I can also look at Spring after this task is done.

 I can look into the architectural and implementation issues in
BioJava. I am strong in analysis and could do all this reasonably
well.

 I just want someone to share my concerns with and to validate the findings.

 Analyse how BioJava is being used by the community. See the UsageAnalysis page.
 I can do this.
 To start from scratch, creating a number of smaller jars as
sub-projects within an umbrella BioJava3 project. Each jar would
provide tools for a specific purpose. Additional jars would provide
cross-purpose tools such as format converters or text-to-object
interfaces. Possibly built using Maven instead of Ant.
 Although starting from scratch, much existing code could be reused or
refactored to suit the new design.
 We would take full advantage of Java 6, including generics,
(@)annotations, the built-in property change support. Everything would
be a bean - absolutely everything.
 We would aim to be fully Java EE compliant, with the majority of
components fully reusable as a bean in any other application, just
like Spring's components are.
 We would adhere rigidly to a common coding style and heavily comment the code.
 We should make it able to focus on any aspect the user requires and
keep its efficiency, removing its dependency on everything being
sequence-related.
 SymbolLists and Alphabets to be rethought as these are the most
common stumbling block.
 Make methods parallel-aware and take advantage of this when possible,
and provide a global variable to specify how much parallelisation can
take place. - I am very interested in this and would like to take it
up as soon as possible. JDK 1.5 has a parallel programming extension
that we can use, and we can define a common method or mode for
executing existing code or functionality. However, an impact analysis
will be needed, since not all code can be made parallel-compliant due
to implementation issues; this will need thorough checking. I can do this.
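
A minimal sketch of the "global variable" idea from the last item above, using java.util.concurrent (part of the JDK since Java 5). ParallelismSketch, setMaxThreads, and runAll are hypothetical names and not part of BioJava; the sketch only shows one possible way a single setting could cap how many threads a library call may use, and it assumes the submitted tasks are independent of one another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: not part of BioJava.
class ParallelismSketch {

    // The "global variable": an upper bound on worker threads for library calls.
    private static volatile int maxThreads = Runtime.getRuntime().availableProcessors();

    static void setMaxThreads(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("need at least one thread");
        }
        maxThreads = n;
    }

    static int getMaxThreads() {
        return maxThreads;
    }

    /** Runs independent tasks on at most maxThreads threads; results keep task order. */
    static <T> List<T> runAll(List<? extends Callable<T>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<T> results = new ArrayList<T>();
            // invokeAll blocks until every task has finished and returns futures in task order.
            for (Future<T> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Setting the limit to 1 gives serial execution, which is one simple way to compare results during the impact analysis mentioned above.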
 Please reply and advise which I should take up first. Points in bold
are of particular interest to me. Tasks beyond this list are also
welcome.

 Regards,
 JD

From darnells at dnastar.com  Tue Aug 17 12:00:33 2010
From: darnells at dnastar.com (Steve Darnell)
Date: Tue, 17 Aug 2010 11:00:33 -0500
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com><AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com><BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
	<AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
Message-ID: <A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>

Andreas and Amr,

Thank you very much for agreeing to add this feature.  May I make one additional refinement to my request?

REMARK 800 provides a very useful SITE_DESCRIPTION for each SITE_IDENTIFIER code in use in the SITE records.  Could the site name also be associated with the site identifier and residues?  There is precedent for parsing REMARK records in BioJava (e.g. experiment type, resolution), but this is a special case where REMARK 800 and SITE records are dependent on one another and physically separated in the header.
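
To make the dependency concrete, here is a minimal sketch of collecting the REMARK 800 descriptions into a map keyed by site identifier, so that they can be attached to SITE records parsed later in the file. Remark800Sketch and siteDescriptions are hypothetical names, not BioJava API. The sketch assumes each site block has a "SITE_IDENTIFIER:" line followed by a "SITE_DESCRIPTION:" line, and it does not handle descriptions that continue onto a further line.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not part of BioJava.
class Remark800Sketch {

    private static final String ID_KEY = "SITE_IDENTIFIER:";
    private static final String DESC_KEY = "SITE_DESCRIPTION:";

    /** Maps each site identifier found in REMARK 800 to its description. */
    static Map<String, String> siteDescriptions(List<String> headerLines) {
        Map<String, String> byId = new LinkedHashMap<String, String>();
        String currentId = null;
        for (String line : headerLines) {
            if (!line.startsWith("REMARK 800")) {
                continue; // other record types are ignored here
            }
            // Drop the 10-character "REMARK 800" prefix and surrounding blanks.
            String body = line.substring(10).trim();
            if (body.startsWith(ID_KEY)) {
                currentId = body.substring(ID_KEY.length()).trim();
            } else if (body.startsWith(DESC_KEY) && currentId != null) {
                byId.put(currentId, body.substring(DESC_KEY.length()).trim());
            }
        }
        return byId;
    }
}
```

Because REMARK records come before SITE records in the header, the map would already be filled by the time the SITE lines are reached, and the description could be looked up by the site ID read from each SITE line.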

Regards,
Steve


From amr_alhossary at hotmail.com  Tue Aug 17 13:36:55 2010
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Tue, 17 Aug 2010 19:36:55 +0200
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com><AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com><BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
	<AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
	<A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>
Message-ID: <BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>

I'll look at it in a couple of days. First I need to be able to check the
source code out and in.
All I have found so far is anonymous access.

Amr


From andreas at sdsc.edu  Tue Aug 17 14:04:19 2010
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 17 Aug 2010 11:04:19 -0700
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
	<AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
	<BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
	<AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
	<A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>
	<BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
Message-ID: <AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>

Hi Amr,

thanks for taking this on.  For a first-time contributor, it is probably
best to post your patches to the list, so somebody else can take a look at
them first and commit them for you.

Andreas




From andreas at sdsc.edu  Wed Aug 18 14:26:23 2010
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 18 Aug 2010 11:26:23 -0700
Subject: [Biojava-l] Last week of Google Summer of Code
Message-ID: <AANLkTi=2Cug=mQpB3k3FuthW3GcdoNs07mHTjvoqVkuB@mail.gmail.com>

Hi,

This is the last week of this year's Google Summer of Code project and
I am happy to announce that our two students Mark Chapman and
Jianjiong Gao did an amazing job on their two projects "All Java
Multiple Sequence Alignment" (MSA) and "Identification and
Classification of Posttranslational Modification of Proteins" (PTM).

For Multiple Sequence Alignments we now have a flexible and
multi-threaded MSA implementation that works in linear space and that,
as an option, allows the users to define anchors that are used in the
build up of the multiple alignment. The code is available as part of
the new biojava3-alignment module.

The Posttranslational Modification module (biojava3-protmod) can
detect three different types of protein modifications in protein
structures. It comes with an XML file & Java data structures to store
information about different types of protein modifications, and
contains entries from RESID, PDBCC and PSI-MOD. There is also a
visualisation component to display cross linked PTM on a sequence
viewer.

Both Mark and Jianjiong have expressed their interest in maintaining
and further developing their modules and I am looking forward to
interacting more with them in the future. I want to thank the Mentors
and Co-Mentors Peter Rose, Kyle Ellrott and Scooter Willis for their
help and guidance for the projects; without them this would not have
been possible. Thanks also to Robert Buels and the Open
Bioinformatics Foundation for organizing our applications for GSoC and,
last but not least, Google for sponsoring this Summer of Code.

Happy BioJava-ing,

Andreas


From andrew.mcsweeny at rockets.utoledo.edu  Wed Aug 18 18:53:54 2010
From: andrew.mcsweeny at rockets.utoledo.edu (McSweeny, Andrew J)
Date: Wed, 18 Aug 2010 22:53:54 +0000
Subject: [Biojava-l] Annotations question
Message-ID: <469B4CD3D7690A418E8F96B7BA4585F81202C15E@BL2PRD0103MB052.prod.exchangelabs.com>

Hello,

I am interested in using BioJava to determine which features are located where on the assembled chromosome 21 (chr21.fa) from the UCSC genome browser website.

An example of something I would like to do is to pick a position at random (1-48,129,895) and then determine whether there are any exons or introns on the plus or minus strand.

What classes do I need to be familiar with to do this?
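
To make the goal concrete, here is a minimal plain-Java sketch of the lookup described above. It does not use any BioJava classes; FeatureLookupSketch, Feature, featuresAt, and randomPosition are hypothetical names. It assumes the exon and intron coordinates have already been loaded from a separate annotation source, since a FASTA file such as chr21.fa holds only the sequence, and it assumes 1-based inclusive coordinates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: plain Java, no BioJava classes involved.
class FeatureLookupSketch {

    /** An annotated interval with 1-based inclusive coordinates. */
    static class Feature {
        final String type;  // e.g. "exon" or "intron"
        final int start;
        final int end;
        final char strand;  // '+' or '-'

        Feature(String type, int start, int end, char strand) {
            this.type = type;
            this.start = start;
            this.end = end;
            this.strand = strand;
        }

        boolean covers(int pos) {
            return start <= pos && pos <= end;
        }
    }

    /** Returns the features on the given strand that cover the position (linear scan). */
    static List<Feature> featuresAt(List<Feature> features, int pos, char strand) {
        List<Feature> hits = new ArrayList<Feature>();
        for (Feature f : features) {
            if (f.strand == strand && f.covers(pos)) {
                hits.add(f);
            }
        }
        return hits;
    }

    /** Picks a position uniformly from 1..chromosomeLength. */
    static int randomPosition(Random rng, int chromosomeLength) {
        return 1 + rng.nextInt(chromosomeLength);
    }
}
```

With the length quoted above, a position would be drawn with randomPosition(new Random(), 48129895) and then passed to featuresAt once for '+' and once for '-'. A linear scan is fine for a few lookups; many lookups would call for sorted intervals or an interval tree.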

-Andrew


From rmb32 at cornell.edu  Thu Aug 19 13:09:45 2010
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 19 Aug 2010 10:09:45 -0700
Subject: [Biojava-l] reminder: Aug 25 deadline for GMOD Hackathon application
Message-ID: <4C6D6559.3080809@cornell.edu>

Hi all,

This is your one-week reminder: the deadline for open applications to 
the GMOD Evo hackathon is Wednesday, August 25th.

Rob

========================================

We are seeking participants for the GMOD Tools for Evolutionary Biology
Hackathon, held November 8-12, 2010 at the US National Evolutionary
Synthesis Center (NESCent) in Durham, NC.

This hackathon targets three critical gaps in the capabilities of the
GMOD toolbox that currently limit its utility for evolutionary research:

  1. Visualization of comparative genomics data
  2. Visualization of phylogenetic data and trees
  3. Support for population diversity and phenotype data

If you are interested in these areas and have relevant expertise, you
are strongly encouraged to apply. Relevant areas of expertise include
more than just software development: if you are a GMOD power user,
visualization guru, domain expert (comparative, phylogenetics,
population, ...), or documentation wizard, then your skills are needed!

How To Apply:

Fill out the online application form at http://bit.ly/gmodevohack.
Applications are due August 25.

About GMOD:

GMOD is an intercompatible suite of open-source software components for
storing, managing, analyzing, and visualizing genome-scale data. GMOD
includes many widely-used software components: GBrowse and JBrowse, both
genome viewers; GBrowse_syn, a comparative genomics viewer; Chado, a
generic and modular database schema; CMap, a comparative map viewer; as
well as many other components including Apollo, MAKER, BioMart,
InterMine, and Galaxy. We hope to extend the functionality of existing
GMOD components, and integrate new components as well.

About Hackathons:

A hackathon is an intense event at which a group of programmers with
different backgrounds and skills collaborate hands-on and face-to-face
to develop working code that is of utility to the community as a whole.
The mix of people will include domain experts and computer-savvy end-users.

More details about the event, its motivation, organization, procedures,
and attendees, as well as URLs to the hackathon and related websites are
included below.

Sincerely,

The GMOD EvoHack Organizing Committee (and project affiliations as
relevant):

Nicole Washington, Chair (LBNL, modENCODE, Phenote)

Robert Buels (SGN, Chado NatDiv)

Scott Cain (OICR, GMOD)

Dave Clements (NESCent, GMOD)

Hilmar Lapp (NESCent, Phenoscape, Chado NatDiv)

Sheldon McKay (University of Arizona, iPlant, GBrowse_syn)


-----------------------------

About the GMOD Evo Hackathon

Overview

We are organizing a hackathon to fill critical gaps in the capabilities
of the Generic Model Organism Database (GMOD) toolbox that currently
limit its utility for evolutionary research. Specifically, we will focus
on tools for

   1) viewing comparative genomics data;
   2) visualizing phylogenomic data; and
   3) supporting population diversity data and phenotype annotation.

The event will be hosted at NESCent and bring together a group of 20 or
more software developers, end-user representatives, and documentation
experts who would otherwise not meet. The participants will include key
developers of GMOD components that currently lack features critical for
emerging evolutionary biology research, developers of informatics tools
in evolutionary research that lack GMOD integration, and
informatics-savvy biologists who can represent end-user requirements.

The event will provide a unique opportunity to infuse the GMOD developer
community with a heightened awareness of unmet needs in evolutionary
biology that GMOD components have the potential to fill, and for tool
developers in evolutionary biology to better understand how best to
extend or integrate with already existing GMOD components.

Before the Event

Discussion of ideas and sometimes even design actually starts well
before the hackathon, on mailing lists, wiki pages, and conference calls
set up among accepted attendees.  This advance work lays the foundation
for participants to be productive from the very first day.  This also
means that participants should be willing to contribute some time in
advance of the hackathon itself to participate in this preparatory
discussion.

During the Event

Typically, hackathon participants use the morning of the first day of
the event to organize themselves into working groups of between 3 and 6
people, each with a focused implementation objective.  Ideas and
objectives are discussed, and attendees coalesce around the projects in
which they have the most experience or interest.


Deliverables / Event Results

The meeting's attendance, working groups, and outcomes will be fully
logged and documented on the GMOD wiki (http://gmod.org). Each working
group during the event will typically have its own wiki page, linked
from the main EvoHack page, where it documents its minutes and design
notes, and provides links to the code and documentation it produces.
Also, since GMOD and NESCent are both committed to open source
principles, all code and documentation produced by participants during
the hackathon must be published under an OSI-approved open source
license. As contributions to existing GMOD tools, all hackathon products
will most likely satisfy this requirement automatically.

NESCent

This event is sponsored by the US National Evolutionary Synthesis Center
(NESCent, http://www.nescent.org) through its Informatics Whitepapers
program (http://www.nescent.org/informatics/whitepapers.php). NESCent
promotes the synthesis of information, concepts and knowledge to address
significant, emerging, or novel questions in evolutionary science and
its applications. NESCent achieves this by supporting research and
education across disciplinary, institutional, geographic, and
demographic boundaries (see http://www.nescent.org/science/proposals.php).

Links

Main GMOD EvoHack page, and full proposal:
http://gmod.org/wiki/GMOD_Evo_Hackathon

NESCent: http://www.nescent.org/
GMOD: http://gmod.org/
Similar past NESCent events, see: http://hackathon.nescent.org/
GMOD hackathon application:  http://bit.ly/gmodevohack

-- 
http://gmod.org/wiki/GMOD_News
http://gmod.org/wiki/GMOD_Europe_2010
http://gmod.org/wiki/Help_Desk_Feedback


From amr_alhossary at hotmail.com  Fri Aug 27 07:57:16 2010
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Fri, 27 Aug 2010 13:57:16 +0200
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com><AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com><BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl><AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com><A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com><BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
	<AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
Message-ID: <BLU150-ds19733EBDE2961925A331288E860@phx.gbl>

I sent the updated code as an attachment to the group, as well as to Andreas 
Prlic<andreas at sdsc.edu>; Steve Darnell<darnells at dnastar.com>; 
jacobsen at ebi.ac.uk<jacobsen at ebi.ac.uk>; to be reviewed for submission.

It seems that the group daemon blocks attachments, however small they are.
Please let me know if it wasn't delivered correctly.

The submitted updates handle "SITE" records to a sufficient degree (but do
not handle REMARK 800 yet).

To achieve this goal I had to create a new bean called "Residue". It is
implemented as a static nested class inside PDBSite (and it can be extracted
to be a top-level class if needed).

I created it because I couldn't use any of the subclasses of the Group class
(e.g. HOH is neither an amino acid nor a nucleotide).
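
A minimal sketch of the bean layout described above, for readers who cannot see the attachment. The class name PDBSiteSketch and every field and method name are assumptions made for illustration; this is not the submitted code. It shows a site bean that owns a list of residues, with the residue type as a static nested class so that waters and other hetero groups can be listed without being amino acids or nucleotides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative stand-in for a SITE bean; every member name here is an assumption. */
class PDBSiteSketch {

    /** Plain residue reference with no amino-acid or nucleotide semantics. */
    public static class Residue {
        private final String name;   // three-letter code, e.g. "HIS" or "HOH"
        private final char chainId;
        private final int seqNum;
        private final char insCode;  // ' ' when there is no insertion code

        public Residue(String name, char chainId, int seqNum, char insCode) {
            this.name = name;
            this.chainId = chainId;
            this.seqNum = seqNum;
            this.insCode = insCode;
        }

        public String getName() { return name; }
        public char getChainId() { return chainId; }
        public int getSeqNum() { return seqNum; }
        public char getInsCode() { return insCode; }

        @Override
        public String toString() {
            String suffix = (insCode == ' ') ? "" : String.valueOf(insCode);
            return name + " " + chainId + seqNum + suffix;
        }
    }

    private final String siteId;
    private final List<Residue> residues = new ArrayList<>();

    public PDBSiteSketch(String siteId) { this.siteId = siteId; }

    public String getSiteId() { return siteId; }

    public void addResidue(Residue residue) { residues.add(residue); }

    /** Read-only view, so callers cannot change the site's residue list directly. */
    public List<Residue> getResidues() { return Collections.unmodifiableList(residues); }
}
```

Because the nested class is static, it holds no reference to the enclosing bean, so extracting it to a top-level class later would be a mechanical move.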

I guess this should be discussed on the biojava-dev mailing list if anybody
is interested and if it suits the list policy.
I also have some comments on the existing code that need to be discussed.
To whom shall I address my comments?

Regards

Amr
From: Andreas Prlic
Sent: Tuesday, August 17, 2010 8:04 PM
To: Amr AL-Hossary
Cc: Steve Darnell ; biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] SITE records in PDBFileReader


Hi Amr,

thanks for taking this on.  For a first time contributor, it is probably 
best to post your patches to the list, so somebody else can take a look at 
them first and commit them for you.

Andreas


On Tue, Aug 17, 2010 at 10:36 AM, Amr AL-Hossary <amr_alhossary at hotmail.com> 
wrote:

I'll look at it in a couple of days. First I have to be able to check the
source code out and in.
All I have found so far is anonymous access.

Amr

--------------------------------------------------
From: "Steve Darnell" <darnells at dnastar.com>
Sent: Tuesday, August 17, 2010 6:00 PM
To: "Andreas Prlic" <andreas at sdsc.edu>; "Amr AL-Hossary" 
<amr_alhossary at hotmail.com>
Cc: <biojava-l at lists.open-bio.org>
Subject: RE: [Biojava-l] SITE records in PDBFileReader


Andreas and Amr,

Thank you very much for agreeing  to add this feature.  May I make one 
additional refinement to my request?

REMARK 800 provides a very useful SITE_DESCRIPTION for each SITE_IDENTIFIER 
code in use in the SITE records.  Could the site name also be associated 
with the site identifier and residues?  There is precedent for parsing 
REMARK records in BioJava (e.g. experiment type, resolution), but this is a 
special case where REMARK 800 and SITE records are dependent on one another 
and physically separated in the header.
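
One way the association asked for above could be sketched: collect the REMARK 800 descriptions into a map keyed by site identifier, then look them up when the SITE records are parsed later in the header. This assumes REMARK 800 lines use a "KEY: value" layout with SITE_IDENTIFIER appearing before its SITE_DESCRIPTION; the class and method names are hypothetical and not part of BioJava, and descriptions continued over several lines are not handled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not BioJava API: maps site identifiers to descriptions. */
class Remark800Sketch {

    private static final String ID_KEY = "SITE_IDENTIFIER:";
    private static final String DESC_KEY = "SITE_DESCRIPTION:";

    public static Map<String, String> siteDescriptions(List<String> headerLines) {
        Map<String, String> descriptions = new LinkedHashMap<>();
        String currentId = null;  // identifier of the REMARK 800 block being read
        for (String line : headerLines) {
            if (!line.startsWith("REMARK 800")) {
                continue;         // ignore every other record type
            }
            String text = line.substring("REMARK 800".length()).trim();
            if (text.startsWith(ID_KEY)) {
                currentId = text.substring(ID_KEY.length()).trim();
            } else if (text.startsWith(DESC_KEY) && currentId != null) {
                descriptions.put(currentId, text.substring(DESC_KEY.length()).trim());
            }
        }
        return descriptions;
    }
}
```

Since the map is filled from the REMARK section and read when SITE records are reached, the physical separation of the two record types in the header stops being a problem, provided the REMARK lines are seen first or the lookup is deferred until the whole header has been read.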

Regards,
Steve

________________________________________
From: andreas.prlic at gmail.com [mailto:andreas.prlic at gmail.com] On Behalf Of 
Andreas Prlic
Sent: Monday, August 16, 2010 6:59 PM
To: Amr AL-Hossary
Cc: Steve Darnell; biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] SITE records in PDBFileReader


- Take a look at PDBFileParser.java and at
http://www.wwpdb.org/documentation/format32/sect7.html

- It needs a new Handler method for the Site records that builds up the data 
containers.
- Create a new bean that will contain the data for the SITE record

- Instead of having fields for insertion code, residue number, and chain ID,
you can use the new PDBResidueNumber.java class to group these together.

- Add a get/set method for the Site beans to the Structure class
- Create a JUnit test that makes sure the parsing works OK.
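
The handler step in the list above could look roughly like the following sketch, which reads one SITE line by fixed columns. The column positions are my reading of the wwPDB format 3.2 SITE record (site name in columns 12-14, then up to four residue slots of 11 columns each starting at column 19) and should be checked against the specification linked above. SiteLineSketch and its methods are hypothetical names, not BioJava API, and the string-encoded result stands in for whatever bean the real handler would fill.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical fixed-column reader for one SITE record line. */
class SiteLineSketch {

    /** Pads to 80 columns so fixed-column access never runs past the end. */
    private static String pad(String line) {
        return String.format("%-80s", line);
    }

    /** Site name, taken from columns 12-14. */
    public static String siteId(String line) {
        return pad(line).substring(11, 14).trim();
    }

    /** Returns entries of the form "name:chain:number:insertionCode". */
    public static List<String> residues(String line) {
        String padded = pad(line);
        List<String> result = new ArrayList<>();
        for (int slot = 0; slot < 4; slot++) {
            int base = 18 + 11 * slot;  // 0-based start of this residue slot
            String resName = padded.substring(base, base + 3).trim();
            if (resName.isEmpty()) {
                continue;               // unused slot on a partly filled line
            }
            char chain = padded.charAt(base + 4);
            String seqNum = padded.substring(base + 5, base + 9).trim();
            String insCode = padded.substring(base + 9, base + 10).trim();
            result.add(resName + ":" + chain + ":" + seqNum + ":" + insCode);
        }
        return result;
    }
}
```

A JUnit test for the last bullet could feed a known SITE line through such a reader and compare the site name and residue list with the expected values.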

Hope that makes sense...
Andreas


-
On Mon, Aug 16, 2010 at 4:48 PM, Amr AL-Hossary <amr_alhossary at hotmail.com> 
wrote:
If you like, it would be my pleasure to do it for you.
Just tell me where to start (in the code).

Amr


-- 
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
----------------------------------------------------------------------- 


From jbdundas at gmail.com  Fri Aug 27 10:44:46 2010
From: jbdundas at gmail.com (jitesh dundas)
Date: Fri, 27 Aug 2010 20:14:46 +0530
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <BLU150-ds19733EBDE2961925A331288E860@phx.gbl>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
	<AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
	<BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
	<AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
	<A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>
	<BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
	<AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
	<BLU150-ds19733EBDE2961925A331288E860@phx.gbl>
Message-ID: <AANLkTikXAPTEWgHbuXovQ+VgW1CF4pQpzgYvDJaUVmzr@mail.gmail.com>

Hi,

Thanks & nice work. I think you need to tell your module lead about that.

Including Hibernate is not a good idea for BioJava. It is slow & XML
based, so the handling of big data files will be affected.
I think we need a plugin framework with better features that deploys
the functionality biologists look for.

I have been analysing the BioJava 3 proposal and have some concerns
about it, besides the other analysis that is already there. I will be
sending them to my lead, Andreas Sir (not Andreas Prlic).

Regards,
JD

On 8/27/10, Amr AL-Hossary <amr_alhossary at hotmail.com> wrote:
> I sent the updated code as an attachment to the group, as well as to Andreas
> Prlic<andreas at sdsc.edu>; Steve Darnell<darnells at dnastar.com>;
> jacobsen at ebi.ac.uk<jacobsen at ebi.ac.uk>; to be reviewed for submission.
>
> It seems that the group daemon blocks attachments, however small they are.
> Please let me know if it wasn't delivered correctly.
>
> The submitted updates handle "SITE" records to a sufficient degree (but do
> not handle REMARK 800 yet).
>
> To achieve this goal I had to create a new bean called "Residue". It is
> implemented as a static nested class inside PDBSite (and it can be extracted
> to be a top-level class if needed).
>
> I created it because I couldn't use any of the subclasses of the Group class
> (e.g. HOH is neither an amino acid nor a nucleotide).
>
> I guess this should be discussed on the biojava-dev mailing list if anybody
> is interested and if it suits the list policy.
> I also have some comments on the existing code that need to be discussed.
> To whom shall I address my comments?
>
> Regards
>
> Amr
> From: Andreas Prlic
> Sent: Tuesday, August 17, 2010 8:04 PM
> To: Amr AL-Hossary
> Cc: Steve Darnell ; biojava-l at lists.open-bio.org
> Subject: Re: [Biojava-l] SITE records in PDBFileReader
>
>
> Hi Amr,
>
> thanks for taking this on.  For a first time contributor, it is probably
> best to post your patches to the list, so somebody else can take a look at
> them first and commit them for you.
>
> Andreas
>
>
>
> On Tue, Aug 17, 2010 at 10:36 AM, Amr AL-Hossary <amr_alhossary at hotmail.com>
> wrote:
>
> I'll look at it in a couple of days. First I have to be able to check the
> source code out and in.
> All I have found so far is anonymous access.
>
> Amr
>
> --------------------------------------------------
> From: "Steve Darnell" <darnells at dnastar.com>
> Sent: Tuesday, August 17, 2010 6:00 PM
> To: "Andreas Prlic" <andreas at sdsc.edu>; "Amr AL-Hossary"
> <amr_alhossary at hotmail.com>
> Cc: <biojava-l at lists.open-bio.org>
> Subject: RE: [Biojava-l] SITE records in PDBFileReader
>
>
> Andreas and Amr,
>
> Thank you very much for agreeing  to add this feature.  May I make one
> additional refinement to my request?
>
> REMARK 800 provides a very useful SITE_DESCRIPTION for each SITE_IDENTIFIER
> code in use in the SITE records.  Could the site name also be associated
> with the site identifier and residues?  There is precedent for parsing
> REMARK records in BioJava (e.g. experiment type, resolution), but this is a
> special case where REMARK 800 and SITE records are dependent on one another
> and physically separated in the header.
>
> Regards,
> Steve
>
> ________________________________________
> From: andreas.prlic at gmail.com [mailto:andreas.prlic at gmail.com] On Behalf Of
> Andreas Prlic
> Sent: Monday, August 16, 2010 6:59 PM
> To: Amr AL-Hossary
> Cc: Steve Darnell; biojava-l at lists.open-bio.org
> Subject: Re: [Biojava-l] SITE records in PDBFileReader
>
>
> - Take a look at PDBFileParser.java and at
> http://www.wwpdb.org/documentation/format32/sect7.html
>
> - It needs a new Handler method for the Site records that builds up the data
> containers.
> - Create a new bean that will contain the data for the SITE record
>
> - Instead of having fields for insertion code, residue number, and chain ID,
> you can use the new PDBResidueNumber.java class to group these together.
>
> - Add a get/set method for the Site beans to the Structure class
> - Create a JUnit test that makes sure the parsing works OK.
>
> Hope that makes sense...
> Andreas
>
>
> -
> On Mon, Aug 16, 2010 at 4:48 PM, Amr AL-Hossary <amr_alhossary at hotmail.com>
> wrote:
> If you like, it would be my pleasure to do it for you.
> Just tell me where to start (in the code).
>
> Amr
>
>
>
>
>
>
> --
> -----------------------------------------------------------------------
> Dr. Andreas Prlic
> Senior Scientist, RCSB PDB Protein Data Bank
> University of California, San Diego
> (+1) 858.246.0526
> -----------------------------------------------------------------------
>
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
>

From amr_alhossary at hotmail.com  Fri Aug 27 04:55:11 2010
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Fri, 27 Aug 2010 08:55:11 -0000
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com><AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com><BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl><AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com><A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com><BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
	<AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
Message-ID: <BLU150-ds816FE9A83292D0511EF948E860@phx.gbl>

Dear all,

Please, could somebody review the attached code and check it in if it is OK, or contact me with any questions.

The submitted updates handle "SITE" records to a sufficient degree (but do not handle REMARK 800 yet).

To achieve this goal I had to create a new bean called "Residue". It is implemented as a static nested class inside PDBSite (and it can be extracted to be a top-level class if needed).

Why did I create it? Because I couldn't use any of the subclasses of the Group class (e.g. HOH is neither an amino acid nor a nucleotide). In case somebody has another idea, let's open the discussion about it.

Regards

Amr


  From: Andreas Prlic 
  Sent: Tuesday, August 17, 2010 8:04 PM
  To: Amr AL-Hossary 
  Cc: Steve Darnell ; biojava-l at lists.open-bio.org 
  Subject: Re: [Biojava-l] SITE records in PDBFileReader


  Hi Amr,

  thanks for taking this on.  For a first time contributor, it is probably best to post your patches to the list, so somebody else can take a look at them first and commit them for you.

  Andreas


  On Tue, Aug 17, 2010 at 10:36 AM, Amr AL-Hossary <amr_alhossary at hotmail.com> wrote:

    I'll look at it in a couple of days. First I have to be able to check the source code out and in.
    All I have found so far is anonymous access.

    Amr

    --------------------------------------------------
    From: "Steve Darnell" <darnells at dnastar.com>
    Sent: Tuesday, August 17, 2010 6:00 PM
    To: "Andreas Prlic" <andreas at sdsc.edu>; "Amr AL-Hossary" <amr_alhossary at hotmail.com>
    Cc: <biojava-l at lists.open-bio.org>
    Subject: RE: [Biojava-l] SITE records in PDBFileReader


      Andreas and Amr,

      Thank you very much for agreeing  to add this feature.  May I make one additional refinement to my request?

      REMARK 800 provides a very useful SITE_DESCRIPTION for each SITE_IDENTIFIER code in use in the SITE records.  Could the site name also be associated with the site identifier and residues?  There is precedent for parsing REMARK records in BioJava (e.g. experiment type, resolution), but this is a special case where REMARK 800 and SITE records are dependent on one another and physically separated in the header.

      Regards,
      Steve

      ________________________________________
      From: andreas.prlic at gmail.com [mailto:andreas.prlic at gmail.com] On Behalf Of Andreas Prlic
      Sent: Monday, August 16, 2010 6:59 PM
      To: Amr AL-Hossary
      Cc: Steve Darnell; biojava-l at lists.open-bio.org
      Subject: Re: [Biojava-l] SITE records in PDBFileReader


      - Take a look at PDBFileParser.java and at http://www.wwpdb.org/documentation/format32/sect7.html

      - It needs a new Handler method for the Site records that builds up the data containers.
      - Create a new bean that will contain the data for the SITE record

      - Instead of having fields for insertion code, residue number, and chain ID, you can use the new PDBResidueNumber.java class to group these together.

      - Add a get/set method for the Site beans to the Structure class
      - Create a JUnit test that makes sure the parsing works OK.

      Hope that makes sense...
      Andreas


      -
      On Mon, Aug 16, 2010 at 4:48 PM, Amr AL-Hossary <amr_alhossary at hotmail.com> wrote:
      If you like, it would be my pleasure to do it for you.
      Just tell me where to start (in the code).

      Amr


  -- 
  -----------------------------------------------------------------------
  Dr. Andreas Prlic
  Senior Scientist, RCSB PDB Protein Data Bank
  University of California, San Diego
  (+1) 858.246.0526
  -----------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SITE-specific commits.zip
Type: application/x-zip-compressed
Size: 34069 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20100827/7926d6c9/attachment-0001.bin>

From sheoran143 at gmail.com  Thu Aug 19 20:45:29 2010
From: sheoran143 at gmail.com (Deepak Sheoran)
Date: Fri, 20 Aug 2010 00:45:29 -0000
Subject: [Biojava-l] Required Correction in GenbankLocationParser class
Message-ID: <4C6DD03C.1080909@gmail.com>

There is a problem with the GenbankLocationParser class: it does not
process the GenBank record with accession M32882. The parser fails at
the following lines of the record:

     gene            join((8298.8300)..10206,1..855)
                     /gene="bcn"
     mRNA            join((8298.8300)..10206,1..855)
                     /gene="bcn"
                     /note="alternative transcript"


The exception stack trace is as follows:

    Could not understand position: 10206,1..855
    org.biojava.bio.seq.io.ParseException: Could not understand position: 10206,1..855
        at org.biojavax.bio.seq.io.GenbankLocationParser.parsePosition(GenbankLocationParser.java:285)
        at org.biojavax.bio.seq.io.GenbankLocationParser.parsePosition(GenbankLocationParser.java:285)
        at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocString(GenbankLocationParser.java:277)
        at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocString(GenbankLocationParser.java:244)
        at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocation(GenbankLocationParser.java:131)

I investigated the matter and found the defect in the regular
expression named "gp" in the GenbankLocationParser class.

This error can be fixed by applying the attached patch. For testing, I
have created a method which shows that the parser can now understand all
the possible combinations of location. The test class is also attached
so that you can test before and after applying my patch.
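
The patch itself was scrubbed from the archive, so the following is only an illustration of the location syntax involved; it is not the fix and it does not reproduce the "gp" regular expression. The error message ("10206,1..855") suggests that the comma separating the two join members was not treated as a separator once the parenthesised start position (8298.8300) had been consumed. The sketch splits a join(...) location at top-level commas by tracking parenthesis depth; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of splitting a GenBank location at top-level commas. */
class LocationSplitSketch {

    /** Splits at commas that are not enclosed in parentheses. */
    public static List<String> splitTopLevel(String s) {
        List<String> parts = new ArrayList<>();
        int depth = 0;   // current parenthesis nesting level
        int start = 0;   // start index of the member being collected
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(s.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(s.substring(start));
        return parts;
    }

    /** Returns the members of a join(...) location. */
    public static List<String> joinMembers(String location) {
        String prefix = "join(";
        if (!location.startsWith(prefix) || !location.endsWith(")")) {
            throw new IllegalArgumentException("not a join location: " + location);
        }
        return splitTopLevel(location.substring(prefix.length(), location.length() - 1));
    }
}
```

Applied to the failing location, joinMembers("join((8298.8300)..10206,1..855)") yields the two members (8298.8300)..10206 and 1..855, each of which can then be parsed as an ordinary range.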

I don't have access to svn, so please apply this patch for me, and let me
know whether you approve it or not.

Thanks
Deepak Sheoran

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: GenbankLocationParser.patch
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20100820/11dbea0f/attachment.pl>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: LocationParserTest.java
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20100820/11dbea0f/attachment-0001.pl>

From darnells at dnastar.com  Mon Aug 16 22:26:13 2010
From: darnells at dnastar.com (Steve Darnell)
Date: Mon, 16 Aug 2010 17:26:13 -0500
Subject: [Biojava-l] SITE records in PDBFileReader
Message-ID: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>

I'm sorry for reposting this message.  I accidentally sent the previous one as HTML.

________________________________________
From: Steve Darnell 
Sent: Monday, August 16, 2010 5:19 PM
To: 'biojava-l at lists.open-bio.org'
Subject: SITE records in PDBFileReader

Greetings,

I am interested in parsing SITE records from a PDB file.  I looked over the org.biojava.bio.structure API, but I was unable to find reference to this functionality.  Does the PDBFileReader in BioJava extract SITE record information?  If not, would it be possible to add this capability to PDBFileReader and the Structure class?

SITE record format at wwPDB: http://www.wwpdb.org/documentation/format32/sect7.html 

Regards,
Steve Darnell


From darnells at dnastar.com  Mon Aug 16 22:19:28 2010
From: darnells at dnastar.com (Steve Darnell)
Date: Mon, 16 Aug 2010 17:19:28 -0500
Subject: [Biojava-l] SITE records in PDBFileReader
Message-ID: <A4009967D1886D4286A9B7931FD586100258B100@FS1.dnastar.com>

Greetings,

I am interested in parsing SITE records from a PDB file.  I looked over
the org.biojava.bio.structure API, but I was unable to find reference to
this functionality.  Does the PDBFileReader in BioJava extract SITE
record information?  If not, would it be possible to add this capability
to PDBFileReader and the Structure class?

SITE record format at wwPDB:
http://www.wwpdb.org/documentation/format32/sect7.html

Regards,
Steve Darnell


From andreas at sdsc.edu  Mon Aug 16 22:49:56 2010
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 16 Aug 2010 15:49:56 -0700
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
Message-ID: <AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>

Hi Steve,

Thanks for the feature request. I will probably be able to add this at some
point in September. If you need it before then, I will be happy to
commit a patch if somebody else provides one...

Andreas




-- 
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------


From andreas at sdsc.edu  Mon Aug 16 23:58:48 2010
From: andreas at sdsc.edu (Andreas Prlic)
Date: Mon, 16 Aug 2010 16:58:48 -0700
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
	<AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
	<BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
Message-ID: <AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>

- Take a look at PDBFileParser.java and at
http://www.wwpdb.org/documentation/format32/sect7.html
- It needs a new handler method for the SITE records that builds up the data
containers.
- Create a new bean that will contain the data for the SITE record.
- Instead of having separate fields for insertion code, residue number, and
chain ID, you can use the new PDBResidueNumber.java class to group these
together.
- Add get/set methods for the SITE beans to the Structure class.
- Create a JUnit test that makes sure the parsing works.
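
As a rough illustration of the handler step in the list above, here is a
minimal sketch of fixed-column parsing for a SITE line. It assumes the
column layout given on the wwPDB format 3.2 page linked above (site name in
columns 12-14, then up to four residues per line, the first starting at
column 19 and each 11 columns wide). The names SiteResidue,
SiteRecordSketch and handleSiteLine are made up for this example; they are
not BioJava API and this is not the code that went into PDBFileParser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for one residue named in a SITE record. */
final class SiteResidue {
    final String resName;  // e.g. "HIS" or "HOH"
    final char chainId;
    final int seqNum;
    final char insCode;    // ' ' when no insertion code is present

    SiteResidue(String resName, char chainId, int seqNum, char insCode) {
        this.resName = resName;
        this.chainId = chainId;
        this.seqNum = seqNum;
        this.insCode = insCode;
    }
}

/** Hypothetical handler sketch; not part of PDBFileParser. */
final class SiteRecordSketch {

    /** Parses one SITE line and appends its residues under the line's site ID. */
    static void handleSiteLine(String line, Map<String, List<SiteResidue>> sites) {
        // Pad so that fixed-column substring calls never run past the end.
        String padded = String.format("%-80s", line);
        String siteId = padded.substring(11, 14).trim(); // columns 12-14

        List<SiteResidue> residues = sites.get(siteId);
        if (residues == null) {
            residues = new ArrayList<SiteResidue>();
            sites.put(siteId, residues);
        }

        // Up to four residues per line, 11 columns apart, first at column 19.
        for (int slot = 0; slot < 4; slot++) {
            int start = 18 + slot * 11;
            String resName = padded.substring(start, start + 3).trim();
            if (resName.isEmpty()) {
                break; // no more residues on this line
            }
            char chainId = padded.charAt(start + 4);
            int seqNum = Integer.parseInt(padded.substring(start + 5, start + 9).trim());
            char insCode = padded.charAt(start + 9);
            residues.add(new SiteResidue(resName, chainId, seqNum, insCode));
        }
    }
}
```

A site that spans several SITE lines ends up as one list, because every
line carrying the same site ID appends to the same map entry. A JUnit test
could feed a few known lines through handleSiteLine and check the residue
counts per site.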

Hope that makes sense...
Andreas




-- 
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------


From amr_alhossary at hotmail.com  Mon Aug 16 23:48:18 2010
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Tue, 17 Aug 2010 01:48:18 +0200
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
	<AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
Message-ID: <BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>

If you like, it would be my pleasure to do it for you.
Just tell me where to start (in the code).

Amr




From jbdundas at gmail.com  Tue Aug 17 02:43:02 2010
From: jbdundas at gmail.com (jitesh dundas)
Date: Tue, 17 Aug 2010 08:13:02 +0530
Subject: [Biojava-l] BioJava 3Proposal tasks
Message-ID: <AANLkTinzPg8r-veyi+LYZ4pvtSbuTnso=idtVGGV15ta@mail.gmail.com>

Dear All,

Sorry for sending this again, but I don't see it anywhere on the
list. Please post it.

I went through the BioJava3 proposal as you mentioned earlier. There
are a few things that I could take up without much trouble:

 I can find out how Hibernate can best be deployed for BioJava. Please
note that I suggest we use only Hibernate 3 or higher;
Hibernate 2 has implementation and performance issues.
 I can also look at Spring after this task is done.

 I can find out the architectural and implementation issues in
BioJava. I am strong in analysis and could do all this reasonably
well.

 I just want someone to share my concerns with and to validate the findings.

 Analyse how BioJava is being used by the community. See the UsageAnalysis page.
 I can do these.
 To start from scratch, creating a number of smaller jars as
sub-projects within an umbrella BioJava3 project. Each jar would
provide tools for a specific purpose. Additional jars would provide
cross-purpose tools such as format converters or text-to-object
interfaces. Possibly built using Maven instead of Ant.
 Although starting from scratch, much existing code could be reused or
refactored to suit the new design.
 We would take full advantage of Java 6, including generics,
(@)annotations, the built-in property change support. Everything would
be a bean - absolutely everything.
 We would aim to be fully Java EE compliant, with the majority of
components fully reusable as a bean in any other application, just
like Spring's components are.
 We would adhere rigidly to a common coding style and heavily comment the code.
 We should make it able to focus on any aspect the user requires and
keep its efficiency, removing its dependency on everything being
sequence-related.
 SymbolLists and Alphabets to be rethought as these are the most
common stumbling block.
 Make methods parallel-aware and take advantage of this when possible,
and provide a global variable to specify how much parallelisation can
take place. - I am very interested in this and would like to take it
up as soon as possible. JDK 1.5 has parallel programming extensions
that we can use, and we can define a common method or mode for
executing existing code or functionality. However, an impact analysis
will be needed, as not all code can be made parallel-compliant due to
implementation issues. This will need thorough checking, which I can do.
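
To make the "global variable" idea above concrete, here is a minimal
sketch built only on java.util.concurrent from the JDK. The class names
ParallelSettings and ParallelRunner are hypothetical and nothing here is
taken from BioJava; it only shows one way a library-wide thread limit
could be honoured by code that runs independent tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical global setting that caps how many threads may be used. */
final class ParallelSettings {
    private static volatile int maxThreads = Runtime.getRuntime().availableProcessors();

    static int getMaxThreads() {
        return maxThreads;
    }

    static void setMaxThreads(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("need at least one thread");
        }
        maxThreads = n;
    }
}

/** Hypothetical helper that runs independent tasks within the global limit. */
final class ParallelRunner {

    /** Runs all tasks and returns their results in the order the tasks were given. */
    static <T> List<T> runAll(List<? extends Callable<T>> tasks)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(ParallelSettings.getMaxThreads());
        try {
            List<T> results = new ArrayList<T>();
            // invokeAll blocks until every task has finished or failed.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Setting the limit to 1 gives sequential behaviour through the same code
path, which may help with the impact analysis mentioned above: code that
is not safe to run in parallel can be checked against the sequential
setting first.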
 Please reply and advise which I should take up first. The points in bold
are of particular interest to me, and tasks beyond this list are
welcome too.

 Regards,
 JD


From darnells at dnastar.com  Tue Aug 17 16:00:33 2010
From: darnells at dnastar.com (Steve Darnell)
Date: Tue, 17 Aug 2010 11:00:33 -0500
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com><AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com><BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
	<AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
Message-ID: <A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>

Andreas and Amr,

Thank you very much for agreeing to add this feature.  May I make one additional refinement to my request?

REMARK 800 provides a very useful SITE_DESCRIPTION for each SITE_IDENTIFIER code in use in the SITE records.  Could the site name also be associated with the site identifier and residues?  There is precedent for parsing REMARK records in BioJava (e.g. experiment type, resolution), but this is a special case where REMARK 800 and SITE records are dependent on one another and physically separated in the header.
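
One way to link the two record types is to collect the REMARK 800 text
into a map keyed by site identifier and look it up when the SITE records
are read later. The sketch below assumes that REMARK 800 lines carry
"SITE_IDENTIFIER:" and "SITE_DESCRIPTION:" keys followed by free text, and
that a description fits on one line. The class and method names are
hypothetical and are not BioJava API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; assumes "KEY: value" pairs on REMARK 800 lines. */
final class Remark800Sketch {

    private static final String ID_KEY = "SITE_IDENTIFIER:";
    private static final String DESC_KEY = "SITE_DESCRIPTION:";

    /** Returns a map from site identifier to site description. */
    static Map<String, String> parseSiteDescriptions(Iterable<String> headerLines) {
        Map<String, String> descriptions = new LinkedHashMap<String, String>();
        String currentId = null;
        for (String line : headerLines) {
            if (!line.startsWith("REMARK 800")) {
                continue; // ignore every other record type
            }
            String text = line.substring(10).trim();
            if (text.startsWith(ID_KEY)) {
                // Remember the identifier until its description shows up.
                currentId = text.substring(ID_KEY.length()).trim();
            } else if (text.startsWith(DESC_KEY) && currentId != null) {
                descriptions.put(currentId, text.substring(DESC_KEY.length()).trim());
            }
        }
        return descriptions;
    }
}
```

Because the map is keyed by the same identifier that appears in the SITE
records, the description can be attached to a site bean whichever of the
two record types is read first. Descriptions that continue over several
lines are not handled here.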

Regards,
Steve



From amr_alhossary at hotmail.com  Tue Aug 17 17:36:55 2010
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Tue, 17 Aug 2010 19:36:55 +0200
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com><AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com><BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
	<AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
	<A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>
Message-ID: <BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>

I'll look at it in a couple of days. First I need to be able to check the 
source code out and back in.
All I have found so far is anonymous access.

Amr



From andreas at sdsc.edu  Tue Aug 17 18:04:19 2010
From: andreas at sdsc.edu (Andreas Prlic)
Date: Tue, 17 Aug 2010 11:04:19 -0700
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
	<AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
	<BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
	<AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
	<A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>
	<BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
Message-ID: <AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>

Hi Amr,

Thanks for taking this on.  For a first-time contributor, it is probably
best to post your patches to the list, so that somebody else can look at
them first and commit them for you.

Andreas




-- 
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
-----------------------------------------------------------------------


From andreas at sdsc.edu  Wed Aug 18 18:26:23 2010
From: andreas at sdsc.edu (Andreas Prlic)
Date: Wed, 18 Aug 2010 11:26:23 -0700
Subject: [Biojava-l] Last week of Google Summer of Code
Message-ID: <AANLkTi=2Cug=mQpB3k3FuthW3GcdoNs07mHTjvoqVkuB@mail.gmail.com>

Hi,

This is the last week of this year's Google Summer of Code project and
I am happy to announce that our two students Mark Chapman and
Jianjiong Gao did an amazing job on their two projects "All Java
Multiple Sequence Alignment" (MSA) and "Identification and
Classification of Posttranslational Modification of Proteins" (PTM).

For Multiple Sequence Alignments we now have a flexible and
multi-threaded MSA implementation that works in linear space and that,
as an option, allows the users to define anchors that are used in the
build up of the multiple alignment. The code is available as part of
the new biojava3-alignment module.

The Posttranslational Modification module (biojava3-protmod) can
detect three different types of protein modifications in protein
structures. It comes with an XML file & Java data structures to store
information about different types of protein modifications, and
contains entries from RESID, PDBCC and PSI-MOD. There is also a
visualisation component to display cross linked PTM on a sequence
viewer.

Both Mark and Jianjiong have expressed their interest in maintaining
and further developing their modules and I am looking forward to
interacting more with them in the future. I want to thank the Mentors
and Co-Mentors Peter Rose, Kyle Ellrott and Scooter Willis for their
help and guidance on the projects; without them this would not have
been possible. Thanks also to Robert Buels and the Open
Bioinformatics Foundation for organizing our applications for GSoC and,
last but not least, to Google for sponsoring this Summer of Code.

Happy BioJava-ing,

Andreas


From andrew.mcsweeny at rockets.utoledo.edu  Wed Aug 18 22:53:54 2010
From: andrew.mcsweeny at rockets.utoledo.edu (McSweeny, Andrew J)
Date: Wed, 18 Aug 2010 22:53:54 +0000
Subject: [Biojava-l] Annotations question
Message-ID: <469B4CD3D7690A418E8F96B7BA4585F81202C15E@BL2PRD0103MB052.prod.exchangelabs.com>

Hello,

I am interested in using BioJava to determine which features are located where on the assembled chromosome 21 (chr21.fa) from the UCSC genome browser website.

An example of something I would like to do is to pick a position at random (1-48,129,895) and then determine whether there are any exons or introns on the plus or minus strand.

What classes do I need to be familiar with to do this?

-Andrew


From rmb32 at cornell.edu  Thu Aug 19 17:09:45 2010
From: rmb32 at cornell.edu (Robert Buels)
Date: Thu, 19 Aug 2010 10:09:45 -0700
Subject: [Biojava-l] reminder: Aug 25 deadline for GMOD Hackathon application
Message-ID: <4C6D6559.3080809@cornell.edu>

Hi all,

This is your one-week reminder: the deadline for open applications to 
the GMOD Evo hackathon is Wednesday, August 25th.

Rob



From amr_alhossary at hotmail.com  Fri Aug 27 11:57:16 2010
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Fri, 27 Aug 2010 13:57:16 +0200
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com><AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com><BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl><AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com><A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com><BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
	<AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
Message-ID: <BLU150-ds19733EBDE2961925A331288E860@phx.gbl>

I sent the updated code as an attachment to the group, as well as to Andreas 
Prlic<andreas at sdsc.edu>; Steve Darnell<darnells at dnastar.com>; 
jacobsen at ebi.ac.uk<jacobsen at ebi.ac.uk>; to be reviewed for submission.

It seems that the list daemon blocks attachments, however small they are.
Please let me know if it wasn't delivered correctly.

The submitted updates handle "SITE" records to a sufficient degree (but do 
not handle REMARK 800 yet).

To achieve this I had to create a new bean called "Residue". It is 
implemented as a static inner class inside PDBSite (and it can be extracted 
to a top-level class if needed).

I created it because I couldn't use any of the subclasses of the Group class 
(e.g. HOH is neither an amino acid nor a nucleotide).
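
For readers without the attachment, the design described above might look
roughly like the sketch below. Only the names PDBSite and Residue come
from this message; every field and accessor is an assumption made for
illustration, and this is not the submitted code.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a SITE bean; fields and accessors are assumed, not the submitted code. */
class PDBSite {

    /** One residue of a site: a plain bean rather than a Group subclass. */
    static class Residue {
        private String name;       // three-letter residue name, e.g. "HOH"
        private String chainId;
        private int seqNum;
        private Character insCode; // null when there is no insertion code

        String getName() { return name; }
        void setName(String name) { this.name = name; }

        String getChainId() { return chainId; }
        void setChainId(String chainId) { this.chainId = chainId; }

        int getSeqNum() { return seqNum; }
        void setSeqNum(int seqNum) { this.seqNum = seqNum; }

        Character getInsCode() { return insCode; }
        void setInsCode(Character insCode) { this.insCode = insCode; }
    }

    private String siteID;
    private final List<Residue> residues = new ArrayList<Residue>();

    String getSiteID() { return siteID; }
    void setSiteID(String siteID) { this.siteID = siteID; }

    List<Residue> getResidues() { return residues; }
    void addResidue(Residue residue) { residues.add(residue); }
}
```

Keeping Residue as a simple name, chain and number holder means a water
such as HOH can be listed in a site without deciding whether it is an
amino acid or a nucleotide.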

I guess this should be discussed on the biojava-dev mailing list, if anybody 
is interested and if it suits the list policy.
I also have some comments on the existing code that need to be 
discussed. To whom shall I address my comments?

Regards

Amr
From: Andreas Prlic
Sent: Tuesday, August 17, 2010 8:04 PM
To: Amr AL-Hossary
Cc: Steve Darnell ; biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] SITE records in PDBFileReader


Hi Amr,

thanks for taking this on.  For a first time contributor, it is probably 
best to post your patches to the list, so somebody else can take a look at 
them first and commit them for you.

Andreas


On Tue, Aug 17, 2010 at 10:36 AM, Amr AL-Hossary <amr_alhossary at hotmail.com> 
wrote:

I'll see it in a couple of days. I have first to be able to check out & in 
the source code.
All I found till now is anonymous access.

Amr

--------------------------------------------------
From: "Steve Darnell" <darnells at dnastar.com>
Sent: Tuesday, August 17, 2010 6:00 PM
To: "Andreas Prlic" <andreas at sdsc.edu>; "Amr AL-Hossary" 
<amr_alhossary at hotmail.com>
Cc: <biojava-l at lists.open-bio.org>
Subject: RE: [Biojava-l] SITE records in PDBFileReader


Andreas and Amr,

Thank you very much for agreeing to add this feature.  May I make one 
additional refinement to my request?

REMARK 800 provides a very useful SITE_DESCRIPTION for each SITE_IDENTIFIER 
code in use in the SITE records.  Could the site name also be associated 
with the site identifier and residues?  There is precedent for parsing 
REMARK records in BioJava (e.g. experiment type, resolution), but this is a 
special case where REMARK 800 and SITE records are dependent on one another 
and physically separated in the header.
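
To illustrate the association asked for here, a minimal sketch follows. It 
assumes REMARK 800 lines carry "SITE_IDENTIFIER:" and "SITE_DESCRIPTION:" 
keys, as in the wwPDB format guide, and that each description fits on one 
line (continuation lines are not handled). The class and method names 
(Remark800Sketch, siteDescriptions) are hypothetical and are not part of 
BioJava.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not BioJava code: maps site identifiers to descriptions. */
public class Remark800Sketch {

    private static final String ID_KEY = "SITE_IDENTIFIER:";
    private static final String DESC_KEY = "SITE_DESCRIPTION:";

    /**
     * Scans header lines and pairs each SITE_IDENTIFIER with the
     * SITE_DESCRIPTION that follows it. Lines other than REMARK 800 are skipped.
     */
    public static Map<String, String> siteDescriptions(List<String> headerLines) {
        Map<String, String> result = new LinkedHashMap<>();
        String currentId = null;
        for (String line : headerLines) {
            if (!line.startsWith("REMARK 800")) {
                continue;
            }
            String text = line.substring(10).trim();
            if (text.startsWith(ID_KEY)) {
                // Remember the identifier until its description shows up.
                currentId = text.substring(ID_KEY.length()).trim();
            } else if (text.startsWith(DESC_KEY) && currentId != null) {
                result.put(currentId, text.substring(DESC_KEY.length()).trim());
            }
        }
        return result;
    }
}
```

The resulting map could then be consulted while the SITE records are parsed, 
so that each site bean receives its description even though the two record 
types sit in different parts of the header.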

Regards,
Steve

________________________________________
From: andreas.prlic at gmail.com [mailto:andreas.prlic at gmail.com] On Behalf Of 
Andreas Prlic
Sent: Monday, August 16, 2010 6:59 PM
To: Amr AL-Hossary
Cc: Steve Darnell; biojava-l at lists.open-bio.org
Subject: Re: [Biojava-l] SITE records in PDBFileReader


- Take a look at PDBFileParser.java and at 
http://www.wwpdb.org/documentation/format32/sect7.html

- It needs a new handler method for the SITE records that builds up the data 
containers.
- Create a new bean that will contain the data for the SITE record.

- Instead of having fields for insertion code, residue number and chain ID, 
you can use the new PDBResidueNumber.java class to group these together.

- Add a get/set method for the Site beans to the Structure class.
- Create a JUnit test that makes sure the parsing works OK.
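
As a rough illustration of the handler and bean steps above, here is a 
minimal sketch of parsing one SITE line. The column positions are taken from 
my reading of the wwPDB format description linked above and should be checked 
against it. All class, field and method names (SiteRecordSketch, Site, 
SiteResidue, parseSiteLine) are hypothetical; this is not the PDBFileParser 
code, and it uses its own small residue holder rather than PDBResidueNumber.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a SITE record handler; not part of BioJava. */
public class SiteRecordSketch {

    /** One residue slot of a SITE line. */
    public static class SiteResidue {
        public final String resName;
        public final char chainId;
        public final int seqNum;
        public final char insCode; // ' ' when no insertion code is given

        SiteResidue(String resName, char chainId, int seqNum, char insCode) {
            this.resName = resName;
            this.chainId = chainId;
            this.seqNum = seqNum;
            this.insCode = insCode;
        }
    }

    /** Bean holding the data read from one SITE line. */
    public static class Site {
        public final String siteId;
        public final int declaredCount; // numRes: total residues in the site
        public final List<SiteResidue> residues = new ArrayList<>();

        Site(String siteId, int declaredCount) {
            this.siteId = siteId;
            this.declaredCount = declaredCount;
        }
    }

    /** Parses a single SITE line using fixed columns (assumed layout). */
    public static Site parseSiteLine(String line) {
        if (!line.startsWith("SITE")) {
            throw new IllegalArgumentException("Not a SITE record: " + line);
        }
        // Pad to 80 columns so that substring() never runs past the end.
        StringBuilder sb = new StringBuilder(line);
        while (sb.length() < 80) {
            sb.append(' ');
        }
        String l = sb.toString();

        String siteId = l.substring(11, 14).trim();                // columns 12-14
        int numRes = Integer.parseInt(l.substring(15, 17).trim()); // columns 16-17
        Site site = new Site(siteId, numRes);

        // Up to four residue slots per line, 11 columns apart, from column 19.
        for (int start = 18; start <= 51; start += 11) {
            String resName = l.substring(start, start + 3).trim();
            if (resName.isEmpty()) {
                break; // unused slot
            }
            char chain = l.charAt(start + 4);
            int seq = Integer.parseInt(l.substring(start + 5, start + 9).trim());
            char insCode = l.charAt(start + 9);
            site.residues.add(new SiteResidue(resName, chain, seq, insCode));
        }
        return site;
    }
}
```

A site with more than four residues continues on further SITE lines with the 
same site identifier, so a real handler would merge lines by identifier 
before storing the bean on the Structure.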

Hope that makes sense...
Andreas


-
On Mon, Aug 16, 2010 at 4:48 PM, Amr AL-Hossary <amr_alhossary at hotmail.com> 
wrote:
If you like, it would be my pleasure to do it for you.
Just tell me where to start (in the code).

Amr


-- 
-----------------------------------------------------------------------
Dr. Andreas Prlic
Senior Scientist, RCSB PDB Protein Data Bank
University of California, San Diego
(+1) 858.246.0526
----------------------------------------------------------------------- 


From jbdundas at gmail.com  Fri Aug 27 14:44:46 2010
From: jbdundas at gmail.com (jitesh dundas)
Date: Fri, 27 Aug 2010 20:14:46 +0530
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <BLU150-ds19733EBDE2961925A331288E860@phx.gbl>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com>
	<AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com>
	<BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl>
	<AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com>
	<A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com>
	<BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
	<AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
	<BLU150-ds19733EBDE2961925A331288E860@phx.gbl>
Message-ID: <AANLkTikXAPTEWgHbuXovQ+VgW1CF4pQpzgYvDJaUVmzr@mail.gmail.com>

Hi,

Thanks, and nice work. I think you need to tell your module lead about that.


Including Hibernate is not a good idea for BioJava. It is slow and XML
based, so large data files will be affected.
I think we need a plugin framework with better features that deploys
the functionality biologists look for.

I have been analysing the BioJava 3 proposal and have some
concerns about it, in addition to the other analysis already available. I
will be sending them to my lead, Andreas Sir (not Andreas Prlic).

Regards,
JD



From amr_alhossary at hotmail.com  Fri Aug 27 08:55:11 2010
From: amr_alhossary at hotmail.com (Amr AL-Hossary)
Date: Fri, 27 Aug 2010 08:55:11 -0000
Subject: [Biojava-l] SITE records in PDBFileReader
In-Reply-To: <AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
References: <A4009967D1886D4286A9B7931FD586100258B102@FS1.dnastar.com><AANLkTin8BWY2=HY9xERE82pwHXz3wcYJob7PiDEW0JNV@mail.gmail.com><BLU150-ds60E744200BBCD8B9BE5EC8E9B0@phx.gbl><AANLkTim3BCru36+BQf_tjLfJtSiTykX9FvRQfgLHgpJC@mail.gmail.com><A4009967D1886D4286A9B7931FD586100258B1BE@FS1.dnastar.com><BLU150-ds9B842D2AF5F69F56BDDBA8E9C0@phx.gbl>
	<AANLkTinannR03cE015=C0fVS39rd8XbtSphnfzvPhWGx@mail.gmail.com>
Message-ID: <BLU150-ds816FE9A83292D0511EF948E860@phx.gbl>

Dear all,

Could somebody please review the attached code and check it in if it is OK, or contact me with any questions?

These updates handle "SITE" records to a sufficient degree (but do not handle REMARK 800 yet).

To achieve this goal I had to create a new bean called "Residue". It is implemented as a static inner class inside PDBSite (and it can be extracted to a top-level class if needed).

Why did I create it? Because I couldn't use any of the subclasses of the Group class (e.g. HOH is neither an amino acid nor a nucleotide). If somebody has another idea, let's open a discussion about it.

Regards

Amr


-------------- next part --------------
A non-text attachment was scrubbed...
Name: SITE-specific commits.zip
Type: application/x-zip-compressed
Size: 34069 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20100827/7926d6c9/attachment-0002.bin>

From sheoran143 at gmail.com  Fri Aug 20 00:45:29 2010
From: sheoran143 at gmail.com (Deepak Sheoran)
Date: Fri, 20 Aug 2010 00:45:29 -0000
Subject: [Biojava-l] Required Correction in GenbankLocationParser class
Message-ID: <4C6DD03C.1080909@gmail.com>

There is a problem with the GenbankLocationParser class: it fails to 
process the GenBank record with accession M32882. The location parser 
fails at the following lines of the GenBank record:

     gene            join((8298.8300)..10206,1..855)
                     /gene="bcn"
     mRNA            join((8298.8300)..10206,1..855)
                     /gene="bcn"
                     /note="alternative transcript"


The exception stack trace is as follows:

	Could not understand position: 10206,1..855
	org.biojava.bio.seq.io.ParseException: Could not understand position: 10206,1..855
	at org.biojavax.bio.seq.io.GenbankLocationParser.parsePosition(GenbankLocationParser.java:285)
	at org.biojavax.bio.seq.io.GenbankLocationParser.parsePosition(GenbankLocationParser.java:285)
	at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocString(GenbankLocationParser.java:277)
	at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocString(GenbankLocationParser.java:244)
	at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocation(GenbankLocationParser.java:131)

I investigated the matter and found the defect in the regular 
expression named "gp" in the GenbankLocationParser class.
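
For readers without the attachment, the kind of location that fails can be 
handled along the following lines. This is only a sketch under my own 
assumptions: it is neither the "gp" expression from GenbankLocationParser nor 
the attached patch, the names (LocationSketch, splitTopLevel, parseJoin) are 
made up for illustration, and it covers only plain numbers, "<"/">" prefixes 
and parenthesised pairs such as (8298.8300) inside a single join(...).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not the BioJava GenbankLocationParser. */
public class LocationSketch {

    // A position: a base number (optionally prefixed by < or >),
    // or a parenthesised fuzzy pair such as (8298.8300).
    private static final String POS = "(?:[<>]?\\d+|\\(\\d+\\.\\d+\\))";

    // A range: one position, optionally followed by ".." or "^" and a second one.
    private static final Pattern RANGE =
            Pattern.compile("(" + POS + ")(?:(\\.\\.|\\^)(" + POS + "))?");

    /** Splits on commas that are not nested inside parentheses. */
    public static List<String> splitTopLevel(String s) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        int last = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(s.substring(last, i));
                last = i + 1;
            }
        }
        parts.add(s.substring(last));
        return parts;
    }

    /** Returns {start, end} position strings for each member of a join(...). */
    public static List<String[]> parseJoin(String loc) {
        if (!loc.startsWith("join(") || !loc.endsWith(")")) {
            throw new IllegalArgumentException("Not a join: " + loc);
        }
        String inner = loc.substring(5, loc.length() - 1);
        List<String[]> ranges = new ArrayList<>();
        for (String part : splitTopLevel(inner)) {
            Matcher m = RANGE.matcher(part.trim());
            if (!m.matches()) {
                throw new IllegalArgumentException("Could not understand position: " + part);
            }
            // A single position is treated as a range that starts and ends there.
            String end = (m.group(3) == null) ? m.group(1) : m.group(3);
            ranges.add(new String[] { m.group(1), end });
        }
        return ranges;
    }
}
```

The point of the sketch is that the parenthesised pair is consumed as one 
position before ".." is considered, and commas are split only at the top 
level, so "10206,1..855" is never handed to the position parser as a unit.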

This error can be fixed by applying the attached patch. For testing, 
I have written a method which shows that the parser can now understand 
all the possible combinations of locations. The test class is also 
attached so that you can test before and after applying my patch.

I don't have SVN access, so please apply this patch for me, and let me 
know whether you approve it.

Thanks
Deepak Sheoran

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: GenbankLocationParser.patch
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20100820/11dbea0f/attachment.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: LocationParserTest.java
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20100820/11dbea0f/attachment-0001.ksh>
