[Biopython] Is this a valid genbank record?

Michael Thon mike.thon at gmail.com
Wed Jan 18 11:11:52 UTC 2012


On Jan 18, 2012, at 12:03 PM, Peter Cock wrote:

> On Wed, Jan 18, 2012 at 10:14 AM, Michael Thon <mike.thon at gmail.com> wrote:
>> Does anyone know if these GenBank records are valid:
>> 
>> http://www.ncbi.nlm.nih.gov/protein/323463153
>> http://www.ncbi.nlm.nih.gov/protein/93279336
>> 
>> ...because Biopython raised an exception when trying to parse them. They have weird feature locations:
>> 
>>     Het             join(bond(127),bond(127),bond(130),bond(130),bond(138),
>>                     bond(138),bond(139),bond(138))
>> 
>> 
>> thanks
>> Mike
> 
> See also: http://www.bioperl.org/wiki/BioPerl_Locations#bond.28location.2Clocation...location.29
> 
> The use of "bond" in a feature location isn't described in the official
> GenBank/EMBL/DDBJ Feature Table definition, but that is aimed at
> nucleotide sequences only. I'm unaware of any official documentation
> on GenPept variations.
> 
> Being practical we'd better update the parser to cope with it, even
> though it does seem to be a rare corner case.
> 
> I'd have to go back and check, but I suspect prior to the parser rewrite
> back in Biopython 1.55 (released August 2010) we might have allowed
> for this.
> 
> Are you happy to test an updated parser?
> 
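For reference, the location string quoted above can be pulled apart with a plain regular expression. This is only a rough sketch for inspecting such records by hand, not Biopython's parser and not the planned fix; the names BOND_RE and bond_positions are made up for illustration, and accepting comma-separated positions inside bond(...) is an assumption based on the BioPerl page linked above.

```python
import re

# One bond(...) term holding one or more comma-separated residue positions.
# Hypothetical helper pattern, not part of Biopython.
BOND_RE = re.compile(r"bond\((\d+(?:,\d+)*)\)")


def bond_positions(location):
    """Return one tuple of 1-based residue positions per bond() term."""
    # Feature locations wrap across lines in the flat file, so strip all
    # whitespace before matching.
    compact = "".join(location.split())
    return [
        tuple(int(pos) for pos in match.group(1).split(","))
        for match in BOND_RE.finditer(compact)
    ]


if __name__ == "__main__":
    het = """join(bond(127),bond(127),bond(130),bond(130),bond(138),
                    bond(138),bond(139),bond(138))"""
    print(bond_positions(het))
    # [(127,), (127,), (130,), (130,), (138,), (138,), (139,), (138,)]
```

Each tuple keeps the positions of one bond() term together, so repeated terms such as bond(127) stay visible instead of being merged.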
I'm happy to volunteer my student to test it :)  Just post here when it's ready and we'll try it.  I'll have to think about how to get a script to use a local installation of Biopython instead of the system-installed one. Do I need to mess with PYTHONPATH?

Mike
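
On the PYTHONPATH question: setting PYTHONPATH in the shell is one way to do it, and the same effect can be had from inside the script by putting the local directory at the front of sys.path before the first "import Bio". A minimal sketch follows. The directory ~/src/biopython is a made-up example location, and the helper names are hypothetical. It assumes the local copy is usable from that directory; if the compiled extensions are needed, the path would more likely be the build/lib.* directory produced by "python setup.py build".

```python
import importlib
import os
import sys


def prefer_local_checkout(path):
    """Move `path` to the front of sys.path so packages there win."""
    path = os.path.abspath(os.path.expanduser(path))
    if path in sys.path:
        sys.path.remove(path)
    sys.path.insert(0, path)
    return path


def module_location(name):
    """Import `name` and return the file it was actually loaded from."""
    return importlib.import_module(name).__file__


if __name__ == "__main__":
    # Hypothetical location of the local Biopython copy; adjust as needed.
    # This must run before anything else in the script imports Bio.
    prefer_local_checkout("~/src/biopython")
    try:
        print("Using Bio from:", module_location("Bio"))
    except ImportError:
        print("Bio is not importable from any directory on sys.path")
```

Printing the module's file location is a quick check that the script really picked up the local copy and not the system-installed one. The shell equivalent is to run the script with PYTHONPATH set to the same directory.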




