[Biopython] Dealing with Non-RefSeq IDs / InParanoid
Matthew Strand
stran104 at chapman.edu
Wed Jul 1 03:01:14 UTC 2009
For the benefit of future users who find this thread through a search, I
would like to share how to retrieve a sequence from NCBI given a non-NCBI
protein ID (or other ID). This was question 3 in my original message.
Suppose you have a non-NCBI protein ID, say CE23997 (from WormBase) and you
want to retrieve the sequence from NCBI.
You can use Bio.Entrez.esearch(db='protein', term='CE23997') to get a list
of NCBI GIs that reference this identifier. In this case there is only one
(17554770).
Then you can get the sequence using Entrez.efetch(db="protein",
id='17554770', rettype="fasta").
This may be obvious to some, but it was not to me, primarily because I was
unaware of the esearch functionality.
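
For anyone who wants the two steps in one place, here is a rough sketch of
how they might be combined. The function name fetch_fasta and its optional
entrez argument are my own inventions for illustration (the argument only
exists so a stand-in object can be passed when testing offline), and the
email address is a placeholder. It assumes Biopython is installed and NCBI
is reachable, and the IDs NCBI returns today may differ from the one above.

```python
def fetch_fasta(external_id, email, entrez=None):
    """Return FASTA text for NCBI protein records that mention external_id.

    fetch_fasta and the `entrez` parameter are hypothetical names used for
    this sketch; they are not part of Biopython.
    """
    if entrez is None:
        # Imported lazily so a stand-in object can be supplied instead.
        from Bio import Entrez as entrez
    entrez.email = email  # NCBI asks callers to identify themselves

    # Step 1: search the protein database for records mentioning the ID.
    handle = entrez.esearch(db="protein", term=external_id)
    result = entrez.read(handle)
    handle.close()
    ids = [str(uid) for uid in result["IdList"]]
    if not ids:
        return ""  # nothing matched, so there is nothing to fetch

    # Step 2: fetch all matching records as FASTA text in one request.
    handle = entrez.efetch(
        db="protein", id=",".join(ids), rettype="fasta", retmode="text"
    )
    fasta = handle.read()
    handle.close()
    return fasta


# Example call (needs network access; the address is a placeholder):
# print(fetch_fasta("CE23997", "you@example.org"))
```

If the search returns more than one ID, this fetches all of them, so you
may want to inspect the list before trusting the first record.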
--
Matthew Strand