[BioPython] Uniprot Parser

Ruchira Datta ruchira.datta at gmail.com
Sun Feb 24 17:53:10 UTC 2008


On Sun, Feb 24, 2008 at 9:48 AM, Peter <biopython at maubp.freeserve.co.uk>
wrote:

> On Sun, Feb 24, 2008 at 5:36 PM, Ruchira Datta <ruchira.datta at gmail.com>
> wrote:
> > I just found another bug, which would be a bit trickier to fix properly.
> >
> >  This code:
> >
> >     def database_cross_reference(self, line):
> >         # From CLD1_HUMAN, Release 39:
> >         # DR   EMBL; [snip]; -. [EMBL / GenBank / DDBJ] [CoDingSequence]
> >         # DR   PRODOM [Domain structure / List of seq. sharing at least 1 domai
> >         # DR   SWISS-2DPAGE; GET REGION ON 2D PAGE.
> >         line = line[5:]
> >         # Remove the comments at the end of the line
> >         i = line.find('[')
> >         if i >= 0:
> >             line = line[:i]
> >         cols = line.rstrip(_CHOMP).split(';')
> >         cols = [col.lstrip() for col in cols]
> >         self.data.cross_references.append(tuple(cols))
> >
> >  applied to this line of the TrEMBL record for A2RB21_ASPNG:
> >
> >  DR   GO; GO:0016277; F:[myelin basic protein]-arginine N-methyltra...; IEA:EC.
> >
> >  got me this tuple:
> >
> >  ('GO', 'GO:0016277', 'F:')
> >
> >  The bracketed term was interpreted as a comment, so the rest of the
> >  line was stripped.
>
> That does look tricky... especially if we want to preserve backwards
> compatibility.  This "F" cross reference looks like the partial text
> for the GO term.  I wonder how common this is? (square brackets in the
> cross references themselves).  I can't see the use of "F" mentioned
> here: http://www.expasy.org/sprot/userman.html#DR_line
>
> Could you file a bug and add a few more examples if you find them?
>
> Thanks
>
> Peter
>

Here 'F:' means the annotation refers to the molecular function part of the
Gene Ontology (as opposed to, e.g., 'P:' for biological process).

I think this is quite rare, but I'll see if any other examples come up.
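
For reference, the behaviour can be reproduced outside the parser with a
minimal standalone sketch. The function names parse_dr_current and
parse_dr_sketch are hypothetical, and the value of _CHOMP is assumed here
rather than copied from the module. The second function shows one possible
heuristic (only treat '[' as the start of a comment when it follows
whitespace); it is an untested idea, not a proposed patch, and it may not
preserve backwards compatibility for every record.

```python
import re

# Assumed value: stands in for the parser's module-level constant.
_CHOMP = " \n\r\t.,;"

def parse_dr_current(line):
    """Mimic the quoted method: cut the line at the first '['."""
    line = line[5:]
    i = line.find('[')
    if i >= 0:
        line = line[:i]
    cols = line.rstrip(_CHOMP).split(';')
    return tuple(col.lstrip() for col in cols)

def parse_dr_sketch(line):
    """Hypothetical variant: a comment starts only at whitespace + '['."""
    line = line[5:]
    match = re.search(r'\s\[', line)
    if match:
        line = line[:match.start()]
    cols = line.rstrip(_CHOMP).split(';')
    return tuple(col.lstrip() for col in cols)

# The line from the A2RB21_ASPNG record, truncated as quoted above.
go_line = ("DR   GO; GO:0016277; "
           "F:[myelin basic protein]-arginine N-methyltra...; IEA:EC.")
# One of the comment-style lines from the parser's own docstring comments.
prodom_line = ("DR   PRODOM [Domain structure / "
               "List of seq. sharing at least 1 domai")

print(parse_dr_current(go_line))    # ('GO', 'GO:0016277', 'F:')
print(parse_dr_sketch(go_line))     # keeps the full GO term and 'IEA:EC'
print(parse_dr_sketch(prodom_line)) # ('PRODOM',)
```

With the whitespace check, the bracket inside the GO term is left alone,
while the trailing bracketed comment on the PRODOM line is still removed.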

--Ruchira


