[BioPython] Need help to get Fasta sequence of Gis !

Brad Chapman chapmanb at uga.edu
Thu Apr 8 17:15:42 EDT 2004


Hey Jonathon, Iddo;

Jonathon:
> >I'm a newbie to Biopython and I would like to get the fasta sequences of 
> >a huge list of Gis. Any suggestions ?

Iddo:
> Since the list is huge, I guess you should do it standalone, rather than 
> via the net.

That's the best idea. But if you want to do it over the web and it is
feasible (depends a lot on your definition of huge), you can use the
Biopython EUtils interface. If your list of GIs is in a variable
called my_gis, you could do something like this:

from Bio.EUtils import DBIds
from Bio.EUtils import DBIdsClient

# assuming they are GIs for DNA sequence
db_ids = DBIds("nucleotide", my_gis)
eutils_client = DBIdsClient.from_dbids(db_ids)
# ask for the records back as plain-text FASTA
fasta_handle = eutils_client.efetch(retmode="text",
                                    rettype="fasta")
output_handle = open("my_output.fasta", "w")
output_handle.write(fasta_handle.read())
output_handle.close()

EUtils is pretty good about giving you back a lot of sequences, so that
might work for you.
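
If the list turns out to be too big for a single request, one option is
to build my_gis from a file and hand it over in smaller pieces. This is
only a rough sketch in plain Python: the helper names read_gis and
batches, the sample GI numbers, and the batch size are all made up for
illustration, and a StringIO stands in for a real file with one GI per
line.

```python
from io import StringIO


def read_gis(handle):
    """Collect GI numbers from a file-like object, one per line, skipping blanks."""
    return [line.strip() for line in handle if line.strip()]


def batches(items, size):
    """Yield consecutive slices of items, each holding at most size entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in for open("gi_list.txt"); the file name and contents are hypothetical.
gi_file = StringIO("12345\n\n67890\n13579\n")
my_gis = read_gis(gi_file)

# A batch size of 2 is only for demonstration; pick whatever the server tolerates.
gi_batches = list(batches(my_gis, 2))
```

Each entry of gi_batches could then be used in place of my_gis in the
code above, appending each result to the same output file.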

Best of luck.
Brad

