[Biopython] UniprotXML dbReference parser

Peter Cock p.j.a.cock at googlemail.com
Thu Oct 6 22:26:19 UTC 2011


2011/10/6 Tiago Antão <tiagoantao at gmail.com>:
> Hi,
>
> Do I understand wrongly or the UniprotXML parser for
>
> <dbReference type="RefSeq" id="NP_001117940.1" key="6">
> <property type="nucleotide sequence ID" value="NM_001124468.1"/>
> </dbReference>
>
> simply ignores the "property type" information?

Probably... I think it emulates the very simple list of
db:acc strings produced by the GenBank parser etc.,
but try dir(...) on the record to see what is there.
PDB references, though, look to get part of their
information dumped in the record's annotations dictionary.

I guess we could return a list of DB reference objects
which happen to act like the old-style strings for
backward compatibility.
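
A minimal sketch of that idea, not anything Biopython currently
provides: a str subclass whose value is the old "db:acc" text but
which also carries the property values. The class name DBReference
and the attributes db, acc and properties are hypothetical.

```python
class DBReference(str):
    """Hypothetical cross-reference that still behaves like a 'db:acc' string."""

    def __new__(cls, db, acc, properties=None):
        # The string value is the old-style "db:acc" form, so existing
        # code that compares or searches these strings keeps working.
        obj = super().__new__(cls, f"{db}:{acc}")
        obj.db = db
        obj.acc = acc
        # Extra <property type=... value=...> pairs, keyed by type.
        obj.properties = dict(properties or {})
        return obj


ref = DBReference(
    "RefSeq",
    "NP_001117940.1",
    {"nucleotide sequence ID": "NM_001124468.1"},
)

print(ref)                                       # old-style string form
print(ref.properties["nucleotide sequence ID"])  # the extra property value
```

String methods such as upper() or slicing would return plain str
objects without the extra attributes, so this only helps code that
keeps hold of the original objects.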

> If so, is there any way to get access to the XML raw data
> (so that I can grep it)?

Are you asking for XML parsing library recommendations?
Or you could hack the SeqIO parser instead... I've CC'd
Andrea who wrote it in case he can add something
more practical.
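
For getting at the raw XML without touching the SeqIO parser, here
is a rough sketch using only the standard library's
xml.etree.ElementTree. The wrapping <entry> element and the helper
names local_name and db_references are assumptions made for
illustration. Tags are compared by local name because a real
UniProt file is expected to declare a default XML namespace, which
ElementTree prepends to every tag.

```python
import xml.etree.ElementTree as ET

# The fragment quoted in the question, wrapped in an assumed root element.
XML = """<entry>
<dbReference type="RefSeq" id="NP_001117940.1" key="6">
<property type="nucleotide sequence ID" value="NM_001124468.1"/>
</dbReference>
</entry>"""


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def db_references(root):
    """Yield (db, accession, properties) for every dbReference element."""
    for elem in root.iter():
        if local_name(elem.tag) != "dbReference":
            continue
        # Collect the child <property> elements the db:acc string omits.
        props = {
            child.get("type"): child.get("value")
            for child in elem
            if local_name(child.tag) == "property"
        }
        yield elem.get("type"), elem.get("id"), props


refs = list(db_references(ET.fromstring(XML)))
for db, acc, props in refs:
    print(f"{db}:{acc}", props)
```

For a file on disk, ET.parse(filename).getroot() could be passed to
db_references in place of ET.fromstring(XML).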

Peter



