[Biopython] From genome to lineage with Entrez

Peter Cock p.j.a.cock at googlemail.com
Wed Mar 23 18:01:32 UTC 2011


On Wed, Mar 23, 2011 at 5:43 PM, Fabio Gori <gori at cs.ru.nl> wrote:
> Hi all,
>
> I have downloaded all the bacterial genomes
> (ftp://ftp.ncbi.nih.gov/genomes/Bacteria/all.fna.tar.gz) and I want to compare
> their taxonomic lineages.
>
> I'm looking for a way to get their lineages with Entrez. From the files I can
> get the accession numbers and GIs, but I don't know how to get their taxonomic
> ids.
> I know that I can map GIs to taxids by processing the file
> gi_taxid_nucl.dmp, but I'd prefer to use Entrez.
>

I think you can do it with ELink, but personally I'd use the taxid dump file,
since it sounds like you'll want to process hundreds of lineages and a local
lookup avoids making an Entrez request for every genome.
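
As a rough illustration of the dump-file route, here is a minimal sketch using
only the Python standard library. It assumes gi_taxid_nucl.dmp holds two
tab-separated columns (GI, taxid) and that nodes.dmp from the NCBI taxonomy
dump uses "<tab>|<tab>" as its field separator, with the taxid and its parent
taxid in the first two fields. The function names are made up for this example
and are not part of Biopython.

```python
def load_gi_to_taxid(lines, wanted_gis):
    """Return {gi: taxid} for the requested GIs only.

    `lines` is any iterable of text lines in gi_taxid_nucl.dmp layout
    (assumed: "gi<TAB>taxid"). Filtering on `wanted_gis` keeps memory
    use small, since the full dump file is very large.
    """
    wanted = set(wanted_gis)
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        gi, taxid = int(parts[0]), int(parts[1])
        if gi in wanted:
            mapping[gi] = taxid
    return mapping


def load_parents(lines):
    """Return {taxid: parent_taxid} from lines in nodes.dmp layout."""
    parents = {}
    for line in lines:
        fields = line.split("\t|\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        parents[int(fields[0])] = int(fields[1])
    return parents


def lineage(taxid, parents):
    """Return the list of taxids from the root down to `taxid`.

    The walk stops at a node that is its own parent (the root) or at a
    taxid that is missing from `parents`.
    """
    path = [taxid]
    while taxid in parents and parents[taxid] != taxid:
        taxid = parents[taxid]
        path.append(taxid)
    return path[::-1]
```

In practice you would pass open file handles, for example
load_gi_to_taxid(open("gi_taxid_nucl.dmp"), my_gis) and
load_parents(open("nodes.dmp")), where my_gis is the list of GIs taken from
the FASTA headers. Mapping the taxids in a lineage to names would additionally
need names.dmp, which is not covered here.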

Peter


