[Biopython-dev] Online access, Bio.PubMed & Bio.GenBank vs Bio.Entre

Peter biopython at maubp.freeserve.co.uk
Tue Aug 19 09:07:47 UTC 2008


> I do think though that having one Bio.Entrez.email is better
> than having to specify the email address on each call to Entrez.

I agree with you on this change. Having one place to set the email
address should make it much easier to use Bio.Entrez and to follow
the NCBI guidelines.

On Tue, Aug 19, 2008 at 12:46 AM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
>> Thirdly, assuming we don't deprecate it, perhaps
>> Bio.PubMed.search_for() should just use Bio.Entrez.read()
>> to parse the XML rather than its own mini-parser?
>
> Now that Bio.Entrez is available, the mini-parser in Bio.PubMed is no longer needed.
>

OK.  It's not urgent, but it is worth doing.

>> Finally, perhaps Bio.Entrez needs its own version of
>> search_for() which would parse the XML results into a
>> list of IDs, and download them in batches.  However,
>> this might be best done in combination with some
>> history helper functions to make a combined esearch
>> and efetch easier, which is a bigger job.
>
> It is not entirely clear to me if a search_for function
> (in Bio.PubMed, Bio.GenBank, or Bio.Entrez) is a good
> idea. The search_for function provides a higher-level
> interface to the low-level functionality in Entrez. But
> there is a reason that Entrez only provides low-level
> functions: it cannot provide higher-level functions
> without knowing what the user wants. We as biopython
> don't know much more than Entrez (except that they'll
> want to parse the result using Python).

You are right if we are talking about all possible uses of Entrez.

> Maybe I'm being too pessimistic, but I think the result
> will be either an over-engineered function that tries to
> cater to all possible user wishes, or a more
> straightforward function that is useful only for a minority
> of users.

I was thinking that the "search for some sequences and then download
them" task might be common enough and straightforward enough to
warrant a simple helper function.  However, as I haven't yet made any
serious use of the Entrez module in real code, I may not be the best
person to judge this (I prefer to download multiple genomes
automatically by FTP).  I guess we can wait and see what feedback we
get from Bio.Entrez users.
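
To make the idea concrete, here is a minimal sketch of what such a
helper might look like.  It is only an illustration, not existing
Biopython code: the names batched() and search_and_fetch() are
hypothetical, and the search and fetch steps are passed in as plain
callables so that the helper itself makes no assumption about which
database or return format the user wants.

```python
from typing import Callable, Iterator, List, Sequence


def batched(ids: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of ids, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


def search_and_fetch(
    term: str,
    search: Callable[[str], Sequence[str]],
    fetch: Callable[[List[str]], str],
    batch_size: int = 100,
) -> Iterator[str]:
    """Search once, then download the matching records batch by batch.

    Both callables are hypothetical and supplied by the caller:
    search(term) returns a sequence of record IDs, and fetch(ids)
    returns the downloaded text for one batch of IDs.  Nothing in
    this function contacts the NCBI itself.
    """
    ids = search(term)
    for batch in batched(ids, batch_size):
        yield fetch(batch)
```

In real use, search would presumably wrap Entrez.esearch() followed by
Entrez.read() to pull out the ID list, and fetch would wrap
Entrez.efetch().  Keeping those two steps outside the helper is one
way to avoid the over-engineering concern: the helper only handles
the batching, and everything else stays under the user's control.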

Peter


