[Biopython] How to run esearch in BioPython without specifying any filtering terms

Brad Chapman chapmanb at 50mail.com
Wed Jul 15 20:16:55 UTC 2009


Hello;

> The BioPython tutorial (p.86) shows how once the available fields of an
> Entrez database have been found with Einfo, queries can be run that use
> those fields in the term argument of Esearch (for instance Jones[AUTH]).
> 
> However, I'd like to retrieve all IDs from a database without specifying any
> filtering term.
> 
> If I leave the term argument out in the Entrez.efetch method, BioPython
> returns an error.
[..]
> How can you run esearch in BioPython with no filtering terms?

Retrieving all IDs isn't practical for most of the databases because
of the large number of entries. That's why a term is required in
Biopython, and why most NCBI databases likely won't have an option to
return everything. For example, 'pcsubstance' appears to contain about
81 million records, judging from the available downloads:

ftp://ftp.ncbi.nlm.nih.gov/pubchem/Substance/CURRENT-Full/XML/

To loop over the results of a query realistically, you'll need to
restrict your search to the subset of records you are actually
interested in, so that the numbers become manageable.
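
As a rough sketch of what that could look like (not something tested
against every database): the code below first asks esearch how many
records match a term, then pages through the IDs in batches with
retstart and retmax. The helper names batch_starts and fetch_ids, the
batch size, the placeholder email address, and the example term are
all assumptions made for illustration. NCBI may also cap how far you
can page for some databases.

```python
def batch_starts(total, batch_size):
    """Return the retstart offsets needed to page through `total` records."""
    return list(range(0, total, batch_size))


def fetch_ids(db, term, batch_size=500, email="you@example.com"):
    """Collect the IDs matching `term` in `db`, one page at a time.

    Hypothetical helper; the email default is a placeholder to replace.
    """
    # Imported here so batch_starts() can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address

    # First request: retmax=0 returns no IDs, only the number of matches.
    handle = Entrez.esearch(db=db, term=term, retmax=0)
    total = int(Entrez.read(handle)["Count"])
    handle.close()

    ids = []
    for start in batch_starts(total, batch_size):
        handle = Entrez.esearch(db=db, term=term,
                                retstart=start, retmax=batch_size)
        ids.extend(Entrez.read(handle)["IdList"])
        handle.close()
    return ids


# Example call (needs network access), using the term from the tutorial:
# ids = fetch_ids("pubmed", "Jones[AUTH]")
```

Checking the count first lets you decide whether the term is narrow
enough before downloading any IDs at all.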

Hope this helps,
Brad


