[Bioperl-l] Bio/SeqIO/genbank.pm patch

Erik er at xs4all.nl
Thu Nov 16 21:01:12 UTC 2006


Hi all,

Using bioperl-live, I noticed a problem with the parsing in
Bio/SeqIO/genbank.pm.

It occurs in the DBSOURCE section, where the 'dblink' annotation gets its
values. I got several values that had a double colon, like
InterPro::IPR011000 etc. Not all 'dblink' values were affected.

Here is a patch that seems to fix it (it works for me). It makes the
regex consume the colon after the database name as well:

=======
--- Bio/SeqIO/genbank.pm.orig	2006-11-16 18:33:30.060417520 +0100
+++ Bio/SeqIO/genbank.pm	2006-11-16 20:29:59.014934936 +0100
@@ -504,7 +504,7 @@
 				my $db;
 				# this is because GenBank dropped the spaces!!!
 				# I'm sure we're not going to get this right
-				if( $id =~ s/^(EchoBASE|IntAct|SWISS-2DPAGE|ECO2DBASE|ECOGENE|TIGRFAMs|TIGR|GO|InterPro|Pfam|PROSITE|SGD|GermOnline|HSSP|PhosSite)//i ) {
+				if( $id =~ s/^(EchoBASE|IntAct|SWISS-2DPAGE|ECO2DBASE|ECOGENE|TIGRFAMs|TIGR|GO|InterPro|Pfam|PROSITE|SGD|GermOnline|HSSP|PhosSite)://i ) {
 				    $db = $1;
 				}
 				$annotation->add_Annotation
=======
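
To show what the one-character change does, here is a minimal standalone sketch. It assumes a DBSOURCE entry arrives in the form 'InterPro:IPR011000' (my guess, based on the doubled colon reported above). The helper split_dblink is hypothetical and is not part of genbank.pm; only the alternation is copied from the patch.

```perl
use strict;
use warnings;

# Database names copied from the patched regex in genbank.pm.
my $dbs = 'EchoBASE|IntAct|SWISS-2DPAGE|ECO2DBASE|ECOGENE|TIGRFAMs|TIGR|GO|'
        . 'InterPro|Pfam|PROSITE|SGD|GermOnline|HSSP|PhosSite';

# Hypothetical helper: strip a known database prefix from $id and
# return (database, remaining id). With $strip_colon false it behaves
# like the old regex, with $strip_colon true like the patched one.
sub split_dblink {
    my ($id, $strip_colon) = @_;
    my $re = $strip_colon ? qr/^($dbs):/i : qr/^($dbs)/i;
    my $db;
    if ( $id =~ s/$re// ) {
        $db = $1;
    }
    return ($db, $id);
}

my $entry = 'InterPro:IPR011000';    # assumed input form

my ($db_old, $id_old) = split_dblink($entry, 0);
my ($db_new, $id_new) = split_dblink($entry, 1);

# Old regex leaves the colon on the id, so joining gives a double colon.
print 'old: ', join(':', $db_old, $id_old), "\n";   # old: InterPro::IPR011000
print 'new: ', join(':', $db_new, $id_new), "\n";   # new: InterPro:IPR011000
```

An entry with no colon after the database name no longer matches the patched regex at all, so $db stays undefined for it. I have not checked whether such entries still occur in current GenBank files.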


I also wrote a few tests for the problem; they need an extra file in
t/data.

I will attach the lot.

hth,

Erik


-------------- next part --------------
A non-text attachment was scrubbed...
Name: P35527.gb
Type: application/octet-stream
Size: 14346 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061116/de8ee1cd/attachment-0012.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: genbank.t.diff
Type: application/octet-stream
Size: 2562 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061116/de8ee1cd/attachment-0013.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: genbank.pm.diff
Type: application/octet-stream
Size: 608 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061116/de8ee1cd/attachment-0014.obj>

