[Biopython] BLAST database access

Peter biopython at maubp.freeserve.co.uk
Sun Jan 24 13:47:14 UTC 2010


On Sun, Jan 24, 2010 at 2:14 AM, xyz <mitlox at op.pl> wrote:
> Hello,
> I have run MegaBLAST and can parse the results, for example with:
>
> with open("megablastres.txt") as input_file:
>     for line in input_file:
>         if line.startswith("#"):
>             continue  # header line, ignore
>         parts = line.rstrip().split()
>         print("Subject id = %s" % parts[1])

If all you want is the subject ID, that looks simple. I guess
you are using one of the simple tabular output formats?

> How can I retrieve the sequence belonging to a subject ID
> from the BLAST database with Biopython?

Are you using a local BLAST database, or an online one?
If online, I would try using the hit ID to search via the NCBI
Entrez interface; see the Bio.Entrez chapter in our tutorial.
If the database is local, the NCBI provides a tool called
fastacmd for this as part of the BLAST suite.
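
As a rough sketch of both routes (not a tested recipe): the function names
below are made up for illustration, db="nucleotide" is an assumption based on
MegaBLAST being a nucleotide search, and the subject ID is assumed to be an
accession or GI number that the database recognises. The local route assumes
fastacmd is on your PATH and that the database was built with ID indexing
(e.g. formatdb -o T). In the newer BLAST+ suite the equivalent tool is
blastdbcmd.

```python
import subprocess


def fastacmd_args(database, subject_id):
    """Build a fastacmd command line that looks up one entry by ID."""
    # -d names the formatted BLAST database, -s gives the ID to search for.
    return ["fastacmd", "-d", database, "-s", subject_id]


def fetch_local(database, subject_id):
    """Return the FASTA text for subject_id from a local BLAST database."""
    # Assumes the fastacmd binary is installed and on PATH.
    result = subprocess.run(
        fastacmd_args(database, subject_id),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if fastacmd fails
    )
    return result.stdout


def fetch_online(subject_id, email):
    """Return a SeqRecord for subject_id fetched through NCBI Entrez."""
    # Imported here so the local route works without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address
    # db="nucleotide" is an assumption; use "protein" for protein hits.
    handle = Entrez.efetch(
        db="nucleotide", id=subject_id, rettype="fasta", retmode="text"
    )
    try:
        return SeqIO.read(handle, "fasta")
    finally:
        handle.close()
```

With the loop from your script, you would call fetch_local("mydb", parts[1])
or fetch_online(parts[1], "you@example.com") for each hit, where "mydb" and
the email address are placeholders.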

Peter


