[Biopython] access Uniprot record by different ids

Peter Cock p.j.a.cock at googlemail.com
Thu Jul 12 18:37:11 UTC 2012


On Thu, Jul 12, 2012 at 12:06 PM, Sheila the angel
<from.d.putto at gmail.com> wrote:
> Thanks for the reply.
> I have now made two dictionaries, one for uniprot_sprot.dat and another
> mapping secondary ids to primary ids.
> However, it takes too long to do this and I can't pickle my_dict.
> I would like to know whether it is possible to dump my_dict (the
> uniprot.dat data) to a MySQL database.

Have you tried the Bio.SeqIO.index_db(...) function? This builds
an SQLite database holding the lookup table of file offsets, keyed
on the primary accession only. Creating the index is a little slow,
but reusing it is very fast.
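
As a rough sketch of what that could look like (the helper name and the
index filename "uniprot_sprot.idx" are just placeholders I picked, and
the accession in the comment is only an example):

```python
def open_uniprot_index(dat_path, index_path="uniprot_sprot.idx"):
    """Open (or build on first use) an SQLite-backed index of a UniProt file.

    The helper name and default index filename are hypothetical.
    Biopython is imported here so this snippet loads even where it
    is not installed; normally the import would sit at the top.
    """
    from Bio import SeqIO

    # "swiss" is the SeqIO format name for UniProt/Swiss-Prot flat files.
    # The first call scans the whole file and writes the index; later
    # calls with the same index_path just reopen the existing index.
    return SeqIO.index_db(index_path, dat_path, "swiss")


# Example use (assumes uniprot_sprot.dat is in the current directory):
#   records = open_uniprot_index("uniprot_sprot.dat")
#   record = records["P12345"]   # lookup by primary accession
#   print(record.id, len(record.seq))
```

The returned object behaves like a read-only dictionary, so only the
records you ask for are parsed from the file.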

For your second dictionary mapping secondary accessions to
the primary accession, you should be able to use pickle.
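
Something along these lines might do for the second dictionary. This is
only a sketch: the function names are made up, and it assumes you
already have, for each record, its list of accessions with the primary
accession first.

```python
import pickle


def build_secondary_map(accession_lists):
    """Map each secondary accession to its record's primary accession.

    accession_lists: iterable of lists, one per record, with the
    primary accession first (an assumption of this sketch).
    """
    mapping = {}
    for accessions in accession_lists:
        if not accessions:
            continue  # skip records with no accessions at all
        primary = accessions[0]
        for secondary in accessions[1:]:
            mapping[secondary] = primary
    return mapping


def save_map(mapping, path):
    """Write the plain dict of strings to disk with pickle."""
    with open(path, "wb") as handle:
        pickle.dump(mapping, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_map(path):
    """Read the dict back from disk."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Since this dictionary only holds strings, it should pickle without
trouble. At lookup time you would first try the accession in the index,
and if it is missing, translate it via this mapping and try again.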

> I looked at the Biopython BioSQL page but didn't understand much
> (I am new to SQL).
> Thanks

BioSQL is a bit complicated to get started with (although
using SQLite is a lot simpler than MySQL or PostgreSQL).

Peter


