[Biopython] Is this a valid genbank record?
Michael Thon
mike.thon at gmail.com
Wed Jan 18 11:11:52 UTC 2012
On Jan 18, 2012, at 12:03 PM, Peter Cock wrote:
> On Wed, Jan 18, 2012 at 10:14 AM, Michael Thon <mike.thon at gmail.com> wrote:
>> Does anyone know if these GenBank records are valid:
>>
>> http://www.ncbi.nlm.nih.gov/protein/323463153
>> http://www.ncbi.nlm.nih.gov/protein/93279336
>>
>> ...because Biopython raised an exception when trying to parse them. They have unusual feature locations:
>>
>> Het join(bond(127),bond(127),bond(130),bond(130),bond(138),
>> bond(138),bond(139),bond(138))
>>
>>
>> thanks
>> Mike
>
> See also: http://www.bioperl.org/wiki/BioPerl_Locations#bond.28location.2Clocation...location.29
>
> The use of "bond" in a feature location isn't described in the official
> GenBank/EMBL/DDBJ Feature Table definition, but that is aimed at
> nucleotide sequences only. I'm unaware of any official documentation
> on GenPept variations.
>
> Being practical, we'd better update the parser to cope with it, even
> though it does seem to be a rare corner case.
>
> I'd have to go back and check, but I suspect prior to the parser rewrite
> back in Biopython 1.55 (released August 2010) we might have allowed
> for this.
>
> Are you happy to test an updated parser?
>
I'm happy to volunteer my student to test it :) Just post here when it's ready and we'll try it. I'll have to work out how to get a script to use a local installation of Biopython instead of the system-installed one. Do I need to mess with PYTHONPATH?
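
On the PYTHONPATH question, a rough sketch of one way it could work without touching the system install: put the local copy at the front of sys.path before anything imports Bio, then check which file Python would actually load. The helper names and the path under __main__ are made up for illustration, and this assumes the directory given is the one that contains the "Bio" package (for a source checkout with compiled extensions, that would probably be the build output directory rather than the raw checkout).

```python
import importlib.util
import sys


def prefer_local_checkout(checkout_dir):
    """Move checkout_dir to the front of sys.path so its packages are found first."""
    if checkout_dir in sys.path:
        sys.path.remove(checkout_dir)
    sys.path.insert(0, checkout_dir)


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Hypothetical location of the local copy; adjust to the real directory.
    prefer_local_checkout("/home/mike/src/biopython")
    # Prints the path of Bio/__init__.py that would be used, or None.
    print(module_origin("Bio"))
```

Setting PYTHONPATH to the same directory before running the script should have the same effect, since PYTHONPATH entries are searched ahead of site-packages. Either way it only helps if it happens before the first "import Bio"; a module that is already imported stays loaded from wherever it was found.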
Mike