[Biopython] How to run esearch in BioPython without specifying any filtering terms
Brad Chapman
chapmanb at 50mail.com
Wed Jul 15 20:16:55 UTC 2009
Hello;
> The BioPython tutorial (p.86) shows how once the available fields of an
> Entrez database have been found with Einfo , queries can be run that use
> those fields in the term argument of Esearch (for instance Jones[AUTH]).
>
> However, I'd like to retrieve all IDs from a database without specifying any
> filtering term.
>
> If I leave the term argument out in the Entrez.efetch method, BioPython
> returns an error.
[..]
> How can you run esearch in BioPython with no filtering terms?
Retrieving every ID isn't practical for most of the databases because
they contain so many entries. That's why Biopython requires a term,
and why most NCBI databases likely don't offer an option to return
everything. For example, judging by the available downloads,
'pcsubstance' looks to contain about 81 million records:
ftp://ftp.ncbi.nlm.nih.gov/pubchem/Substance/CURRENT-Full/XML/
To loop over the results realistically, you'll need to restrict your
search to the subset of records you are actually interested in, which
keeps the numbers manageable.
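
As a rough sketch of looping over a restricted query, something like the
following pages through the matching IDs with the retstart and retmax
arguments of Entrez.esearch. The helper names (entrez_search_page,
iter_ids), the page size, and the e-mail address are placeholders of mine,
not part of Biopython, and the 'pubmed' database in the usage note is only
an example paired with the Jones[AUTH] term from the tutorial:

```python
def entrez_search_page(db, term, retstart, retmax):
    """Fetch one page of esearch results (requires Biopython and network access)."""
    # Imported here so the paging logic below can be exercised offline.
    from Bio import Entrez

    # Placeholder address; NCBI asks for a real contact e-mail.
    Entrez.email = "you@example.org"
    handle = Entrez.esearch(db=db, term=term, retstart=retstart, retmax=retmax)
    try:
        return Entrez.read(handle)
    finally:
        handle.close()


def iter_ids(db, term, page_size=500, search_page=entrez_search_page):
    """Yield every ID matching `term`, requesting `page_size` IDs per call."""
    start = 0
    while True:
        record = search_page(db, term, start, page_size)
        ids = record["IdList"]
        if not ids:
            break  # no (more) matches
        for uid in ids:
            yield uid
        start += len(ids)
        # "Count" is the total number of matches reported by esearch.
        if start >= int(record["Count"]):
            break
```

You would then write something like

    for uid in iter_ids("pubmed", "Jones[AUTH]"):
        ...

and tighten the term until the total count is something you can
reasonably page through.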
Hope this helps,
Brad