[Bioperl-l] Appending efetch results to a file
Chris Fields
cjfields at illinois.edu
Fri May 8 14:22:31 UTC 2009
On May 8, 2009, at 8:42 AM, Mark A. Jensen wrote:
>> I need to request 100's of records, and to avoid stressing the Entrez
>> server I do my fetching inside a loop that increments the
>> -retstart parameter in the factory.
>
> This raises a question in my mind: should EUtilities use
> Bio::WebAgent rather than LWP::UserAgent directly, and doesn't
> Bio::WebAgent have magical properties that ease the server burden
> without having to build it into the user code directly?
I thought about that originally, but there is a significant difference
between the two agent implementations. Bio::WebAgent is-a
LWP::UserAgent subclass, whereas Bio::DB::GenericWebAgent and its ilk
contain a user agent instance (has-a). I chose the latter course b/c
I favor composition over inheritance, and LWP::UserAgent uses
different named parameter handling than BioPerl (no '-');
Bio::WebAgent code works around that in the constructor. I'd rather
take the has-a route than risk running into odd parameter issues down
the road. Not to mention, I may genericize it more in the future to be
capable of using SOAP-based methods, so being able to switch out the ua
made more sense in the long run (still a lot to do on that end).
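
To make the has-a idea concrete, here is a minimal sketch, not the actual
Bio::DB::GenericWebAgent code. Both package names (My::PlainAgent,
My::GenericAgent) are hypothetical; My::PlainAgent only stands in for an
agent that, like LWP::UserAgent, takes named parameters without a leading
'-'. The wrapper accepts BioPerl-style '-name' parameters, strips the dash,
and holds the agent as an attribute so it can be swapped out later.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an agent that takes plain named
# parameters (no leading '-'), as LWP::UserAgent does.
package My::PlainAgent;
sub new {
    my ($class, %args) = @_;
    my $self = { timeout => 180, %args };
    return bless $self, $class;
}
sub timeout { $_[0]{timeout} }
sub agent   { $_[0]{agent} }

# Hypothetical has-a wrapper: takes BioPerl-style '-name' parameters
# and contains the agent instead of inheriting from it.
package My::GenericAgent;
sub new {
    my ($class, %args) = @_;
    my %plain;
    for my $key (keys %args) {
        (my $name = $key) =~ s/^-//;    # '-timeout' becomes 'timeout'
        $plain{ lc $name } = $args{$key};
    }
    my $self = { ua => My::PlainAgent->new(%plain) };
    return bless $self, $class;
}
# Getter/setter: passing a new agent object replaces the contained one.
sub ua {
    my ($self, $new) = @_;
    $self->{ua} = $new if defined $new;
    return $self->{ua};
}

package main;

my $wrapper = My::GenericAgent->new(-timeout => 30, -agent => 'demo/0.1');
print $wrapper->ua->timeout, "\n";    # prints 30
```

Because the agent is only an attribute, calling ua() with a different
object replaces it without touching the wrapper's own interface.
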
I haven't discussed this extensively on the list before, but when I
redesigned EUtilities I wanted to separate out the various tasks, e.g.
ua, parser, parameter handling, etc. So, for the specific eutil
tools, parser = Bio::Tools::EUtilities, parameter =
Bio::Tools::EUtilities::EUtilParameters, ua = LWP::UserAgent. For
other DBs one could switch out the relevant bits for DB-specific
implementations. Then, Bio::DB::EUtilities basically decorates all
three, acts as the traffic cop to get the various bits playing well
together, delegates as needed, etc.
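
The split described above could be sketched roughly like this. It is an
illustration of the general shape only, not the Bio::DB::EUtilities
implementation: every package and method name here (My::Params, My::Agent,
My::Parser, My::Factory, to_query, fetch, parse, get_raw, get_parsed) is
hypothetical, and the agent just echoes the query rather than contacting a
server.

```perl
use strict;
use warnings;

# All classes below are hypothetical stand-ins for the three roles.

package My::Params;    # parameter handling
sub new {
    my ($class, %p) = @_;
    my $self = {%p};
    return bless $self, $class;
}
sub to_query {
    my $self = shift;
    return join '&', map { "$_=$self->{$_}" } sort keys %$self;
}

package My::Agent;     # user agent
sub new { my $class = shift; return bless {}, $class }
# Pretend fetch: echoes the query instead of making a request.
sub fetch { my ($self, $query) = @_; return "<result>$query</result>" }

package My::Parser;    # parser
sub new { my $class = shift; return bless {}, $class }
sub parse {
    my ($self, $raw) = @_;
    my ($body) = $raw =~ m{<result>(.*)</result>};
    return $body;
}

# Coordinator: owns the three parts and delegates to each in turn.
package My::Factory;
sub new {
    my ($class, %parts) = @_;
    my $self = {%parts};
    return bless $self, $class;
}
sub get_raw {
    my $self = shift;
    return $self->{ua}->fetch( $self->{params}->to_query );
}
sub get_parsed {
    my $self = shift;
    return $self->{parser}->parse( $self->get_raw );
}

package main;

my $factory = My::Factory->new(
    params => My::Params->new(db => 'protein', retstart => 0),
    ua     => My::Agent->new,
    parser => My::Parser->new,
);
print $factory->get_parsed, "\n";    # prints db=protein&retstart=0
```

Swapping in a DB-specific parser or parameter class then only means
passing a different object to the coordinator's constructor.
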
This'll allow additional components to be added in at later points if
needed, and the basic tool can be used for retrieving raw data or as a
souped-up agent for retrieving remote data in a new set of modules
(Bio::Entrez::*, maybe). There are some experimental bits in there
still (repeated requests with the exact same params do not spam
eutils, for instance, and there is some 'lazy' code in the parser),
but it seems to largely work, and those bits can be removed fairly
easily if they prove problematic.
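
As a rough illustration of the "same params do not spam eutils" idea, one
possible approach is to key stored responses on the serialized parameters.
This is a sketch under that assumption and not necessarily how
Bio::DB::EUtilities does it; remote_fetch and cached_fetch are hypothetical
names, and remote_fetch merely stands in for a real request.

```perl
use strict;
use warnings;

my %cache;           # serialized parameters -> stored response
my $requests = 0;    # counts calls that would really hit the server

# Hypothetical stand-in for the remote call.
sub remote_fetch {
    my (%params) = @_;
    $requests++;
    return "response for id=$params{id}";
}

sub cached_fetch {
    my (%params) = @_;
    # Sort keys so the same parameters always give the same cache key.
    my $key = join ';', map { "$_=$params{$_}" } sort keys %params;
    $cache{$key} = remote_fetch(%params) unless exists $cache{$key};
    return $cache{$key};
}

cached_fetch(db => 'protein', id => 1234);
cached_fetch(id => 1234, db => 'protein');    # same params, served from cache
cached_fetch(db => 'protein', id => 5678);    # new params, new request
print "$requests\n";                          # prints 2
```
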
chris