[Biopython-dev] Genbank structured comments

Fields, Christopher J cjfields at illinois.edu
Thu Sep 10 17:06:41 UTC 2015


This is very similar to the issue BioPerl had with nested annotations, namely that some annotation data from SwissProt (GENE NAME, I believe) had a hierarchical structure. It seems a bit thornier in this case, as the annotation would have both a standard comment field and a named collection of metadata tied together.
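
To make the nesting concrete, here is a rough sketch of one way the two pieces could be kept apart. It is not what Biopython or BioPerl actually does. It assumes the structured blocks are delimited by ##<name>-START## and ##<name>-END## lines with "key :: value" rows in between, as described on the NCBI structured comment page quoted below. The function name split_comment and the sample key/value are made up for illustration.

```python
import re

# Assumed delimiters: ##<name>-START## and ##<name>-END##
START = re.compile(r"^##(.+)-START##$")
END = re.compile(r"^##(.+)-END##$")


def split_comment(lines):
    """Split COMMENT lines into free text and named structured blocks.

    Hypothetical helper: returns a dict that may contain a 'comment'
    string and/or a 'structured_comment' dict keyed by block name.
    Keys are omitted entirely when there is nothing to store.
    """
    free_text = []
    structured = {}
    current = None  # name of the block we are inside, if any
    for raw in lines:
        line = raw.strip()
        start = START.match(line)
        if start:
            current = start.group(1)
            structured[current] = {}
        elif END.match(line):
            current = None
        elif current is not None:
            # Inside a block: keep "key :: value" rows, ignore anything else
            if "::" in line:
                key, value = line.split("::", 1)
                structured[current][key.strip()] = value.strip()
        elif line:
            free_text.append(line)
    annotations = {}
    if free_text:
        annotations["comment"] = "\n".join(free_text)
    if structured:
        annotations["structured_comment"] = structured
    return annotations


# Illustrative input only; the key and value are placeholders.
example = [
    "Some free-text remark.",
    "##FluData-START##",
    "Example Key :: example value",
    "##FluData-END##",
]
annotations = split_comment(example)
```

With this layout the header string ("FluData") becomes the top-level key of the structured part, and a record with no structured comment simply has no 'structured_comment' entry.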

Brian, how is this implemented in BioPerl? 

chris

> On Sep 10, 2015, at 10:47 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> 
> Good question...
> 
> e.g. http://www.ncbi.nlm.nih.gov/nuccore/291609868
> and http://www.ncbi.nlm.nih.gov/nuccore/FJ966082
> 
> It almost makes me wonder if that should have top level
> keys of MIENS-Data and FluData - or is that too nested?
> 
> Peter
> 
> On Thu, Sep 10, 2015 at 4:37 PM, Brian Osborne <bosborne11 at verizon.net> wrote:
>> Peter,
>> 
>> Another question, maybe the last one: what do we do with the “header” and “footer” strings, things like “FluData”, “GISAID_EpiFlu(TM)Data”, and “Assembly-Data”?
>> 
>> They could also be keys in the dict, of course. Values are ‘’?
>> 
>> Thanks again,
>> 
>> Brian O.
>> 
>> 
>>> On Sep 10, 2015, at 1:25 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>>> 
>>> On Wed, Sep 9, 2015 at 11:37 PM, Brian Osborne <bosborne11 at verizon.net> wrote:
>>>> Chris,
>>>> 
>>>> This is the documentation I’m familiar with, but there may be more:
>>>> 
>>>> http://www.ncbi.nlm.nih.gov/genbank/structuredcomment
>>>> 
>>>> Peter, I can definitely separate these using ‘comment’ and
>>>> ‘structured_comment’ keys in the record.annotations dict.
>>>> 
>>>> If there’s no structured comment in the Genbank file, would
>>>> there simply be an empty dict in the SeqRecord?
>>>> 
>>>> E.g.
>>>> 
>>>> >>> record.annotations['structured_comment']
>>>> {}
>>> 
>>> That makes sense - equally no entry in the annotation dictionary
>>> would be reasonable.
>>> 
>>> Peter
>> 



