[Biojava-l] Required Correction in GenbankLocationParser class

Deepak Sheoran sheoran143 at gmail.com
Thu Aug 19 20:48:23 EDT 2010


There is a problem with the GenbankLocationParser class: it cannot 
process the GenBank record with accession M32882. The parser fails on 
the following lines of that record:

     gene            join((8298.8300)..10206,1..855)
                     /gene="bcn"
     mRNA            join((8298.8300)..10206,1..855)
                     /gene="bcn"
                     /note="alternative transcript"


The exception stack trace is as follows:

	Could not understand position: 10206,1..855
	org.biojava.bio.seq.io.ParseException: Could not understand position: 10206,1..855
	at org.biojavax.bio.seq.io.GenbankLocationParser.parsePosition(GenbankLocationParser.java:285)
	at org.biojavax.bio.seq.io.GenbankLocationParser.parsePosition(GenbankLocationParser.java:285)
	at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocString(GenbankLocationParser.java:277)
	at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocString(GenbankLocationParser.java:244)
	at org.biojavax.bio.seq.io.GenbankLocationParser.parseLocation(GenbankLocationParser.java:131)

I investigated the matter and found that the defect is in the regular 
expression named "gp" in the GenbankLocationParser class.
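
To make the failing case concrete, here is a minimal, standalone sketch 
of how a location such as join((8298.8300)..10206,1..855) breaks down 
into spans. It is an illustration only: it is not the "gp" expression 
from GenbankLocationParser and not the attached patch. The class name 
LocationSketch, the POS/SPAN/JOIN patterns and the parse method are all 
hypothetical, and the sketch covers only plain numbers, < and > markers, 
(a.b) positions and a single join(...) wrapper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not part of BioJava and not the attached patch.
public class LocationSketch {
    // One position: a plain base number such as 10206, optionally prefixed
    // with < or >, or a parenthesised position such as (8298.8300).
    private static final String POS = "(?:[<>]?\\d+|\\(\\d+\\.\\d+\\))";

    // One span: a single position, or two positions joined by "..".
    private static final Pattern SPAN =
            Pattern.compile("(" + POS + ")(?:\\.\\.(" + POS + "))?");

    // A join(...) wrapper around a comma-separated list of spans.
    private static final Pattern JOIN = Pattern.compile("join\\((.*)\\)");

    /**
     * Returns each span as a {start, end} pair; end is null when the span
     * is a single position.
     */
    public static List<String[]> parse(String location) {
        Matcher join = JOIN.matcher(location);
        String body = join.matches() ? join.group(1) : location;
        List<String[]> spans = new ArrayList<>();
        // Splitting on commas is enough here because this sketch does not
        // support nested operators, so commas only separate spans.
        for (String part : body.split(",")) {
            Matcher m = SPAN.matcher(part.trim());
            if (!m.matches()) {
                throw new IllegalArgumentException(
                        "Could not understand position: " + part);
            }
            spans.add(new String[] { m.group(1), m.group(2) });
        }
        return spans;
    }

    public static void main(String[] args) {
        for (String[] span : parse("join((8298.8300)..10206,1..855)")) {
            System.out.println(span[0] + " to " + span[1]);
        }
    }
}
```

For the M32882 location, this sketch yields two spans: (8298.8300) to 
10206, and 1 to 855. The error message above shows "10206,1..855" being 
handed over as one position, so the end of the first span and the whole 
second span were not separated.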

This error can be fixed by applying the attached patch. For testing, I 
have written a method that checks that the parser now understands all 
the possible combinations of locations. The test class is also attached, 
so you can run it before and after applying my patch.

I don't have access to svn, so please apply this patch for me and let me 
know whether you approve it.

Thanks
Deepak Sheoran

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: GenbankLocationParser.patch
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20100820/259f3ec6/attachment.pl>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: LocationParserTest.java
URL: <http://lists.open-bio.org/pipermail/biojava-l/attachments/20100820/259f3ec6/attachment-0001.pl>

