[Biopython] Is this a valid genbank record?
Fields, Christopher J
cjfields at illinois.edu
Wed Jan 18 16:31:12 UTC 2012
On Jan 18, 2012, at 9:34 AM, Peter Cock wrote:
> On Wed, Jan 18, 2012 at 3:24 PM, Fields, Christopher J
> <cjfields at illinois.edu> wrote:
>> On Jan 18, 2012, at 8:50 AM, Peter Cock wrote:
>>
>>> On Wed, Jan 18, 2012 at 11:11 AM, Michael Thon <mike.thon at gmail.com> wrote:
>>>>
>>>> On Jan 18, 2012, at 12:03 PM, Peter Cock wrote:
>>>>
>>>>> On Wed, Jan 18, 2012 at 10:14 AM, Michael Thon <mike.thon at gmail.com> wrote:
>>>>>> They have weird feature locations:
>>>>>>
>>>>>> Het join(bond(127),bond(127),bond(130),bond(130),bond(138),
>>>>>> bond(138),bond(139),bond(138))
>>>>>>
>>>
>>> Do you actually need to do anything with this feature? If not, then
>>> the pragmatic solution is we issue a warning but otherwise ignore
>>> the feature and continue parsing. I'm struggling to grok exactly
>>> what this location is trying to convey - maybe I should read the
>>> associated paper?
>>
>> GenPept is littered with these. With bioperl we only attempt to
>> support 'bond' types for round-tripping, but I don't recall whether
>> this has been extensively tested, though it would be easy enough
>> to add this in to see if the location factory will handle this properly
>> (both to and from a location string).
>>
>> Do wish NCBI would document this more...
>
> +1
>
>
> join(bond(127),bond(127),bond(130),bond(130),bond(138),
> bond(138),bond(139),bond(138))
>
> rather than:
>
> bond(127,127,130,130,138,138,139,138)
>
> or indeed why so many of the residues are bonded more than
> once?
>
> Peter
No, that one is particularly odd, but I can't see a reason why it couldn't be supported; it's just a join of simple locations. It looks like something that may be auto-generated, so I wouldn't be surprised to see more of these.
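To illustrate the "join of simple locations" point, here is a rough Python sketch of round-tripping such a string. It is only an assumption about how bond() might be read (a comma-separated list of residue positions), since the feature table documentation doesn't spell it out. The names parse_bond_location and format_bond_location are hypothetical and are not part of Biopython or BioPerl.

```python
import re

# Assumption: bond(...) holds one or more comma-separated residue positions.
_BOND = r"bond\(\d+(?:,\d+)*\)"
_INNER_RE = re.compile(rf"{_BOND}(?:,{_BOND})*")
_BOND_RE = re.compile(r"bond\((\d+(?:,\d+)*)\)")


def parse_bond_location(location):
    """Parse 'bond(...)' or 'join(bond(...),...)' into a list of tuples.

    Hypothetical helper: each tuple holds the positions of one bond().
    """
    # Location strings wrap across lines in flat files, so drop whitespace.
    text = "".join(location.split())
    if text.startswith("join(") and text.endswith(")"):
        text = text[len("join("):-1]
    if not _INNER_RE.fullmatch(text):
        raise ValueError(f"unsupported location: {location!r}")
    return [
        tuple(int(n) for n in m.group(1).split(","))
        for m in _BOND_RE.finditer(text)
    ]


def format_bond_location(bonds):
    """Write the parsed form back out as a location string."""
    if not bonds:
        raise ValueError("need at least one bond")
    parts = ["bond(" + ",".join(str(n) for n in b) + ")" for b in bonds]
    return parts[0] if len(parts) == 1 else "join(" + ",".join(parts) + ")"


het = ("join(bond(127),bond(127),bond(130),bond(130),bond(138),\n"
       "bond(138),bond(139),bond(138))")
parsed = parse_bond_location(het)
rebuilt = format_bond_location(parsed)
```

Keeping each bond() as its own tuple means the join form and the single bond(127,127,...) form stay distinguishable, so either one can be written back out unchanged apart from line wrapping.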
As to whether it's a valid GenBank record, well, considering the source of the record is NCBI, I think it's safe to say it's valid.
(though again, this all comes back to how helpful it would be to have documentation re: how bond() is defined within the context of the feature table)
chris