[Bioperl-l] RPSblast and existing BLAST packages (WAS: RemoteBlast)

Richard Adams Richard.Adams at ed.ac.uk
Tue Nov 4 13:16:07 EST 2003


Will,

That's great. Just from looking at your docs, having separate modules for 
the specific parts of each blast program definitely sounds like the best 
option. You have obviously spent a lot of time working on this, so I'd 
imagine that most of your methods for setting up the query could be put 
straight in with little change. The base class would contain 
remote_blast() and local_blast() methods instead of your run() method, 
and these would send the query to RemoteBlast / StandAloneBlast to 
actually run it.
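
A minimal sketch of that base-class idea, under stated assumptions: the 
package name My::BlastQuery, the -remote_runner / -local_runner arguments 
and the _submit helper are all hypothetical, and the two runners are plain 
code refs standing in for RemoteBlast and StandAloneBlast so that the 
example runs without BioPerl installed.

```perl
use strict;
use warnings;

# Hypothetical base class: holds the query set-up shared by every blast
# flavour and hands the actual run to one of two injected runners.
package My::BlastQuery {
    sub new {
        my ($class, %args) = @_;
        return bless {
            seq           => undef,
            database      => undef,
            remote_runner => $args{-remote_runner},  # stand-in for RemoteBlast
            local_runner  => $args{-local_runner},   # stand-in for StandAloneBlast
        }, $class;
    }

    # Combined getter/setters for the shared query parameters.
    sub seq {
        my $self = shift;
        $self->{seq} = shift if @_;
        return $self->{seq};
    }

    sub database {
        my $self = shift;
        $self->{database} = shift if @_;
        return $self->{database};
    }

    # The two public entry points differ only in which runner they pick.
    sub remote_blast { my $self = shift; return $self->_submit('remote') }
    sub local_blast  { my $self = shift; return $self->_submit('local') }

    sub _submit {
        my ($self, $where) = @_;
        my $runner = $self->{"${where}_runner"}
            or die "no $where runner configured\n";
        return $runner->($self->seq, $self->database);
    }
}

# Usage: the runners here only report what they were asked to do.
my $query = My::BlastQuery->new(
    -remote_runner => sub { return join ':', 'remote', @_ },
    -local_runner  => sub { return join ':', 'local',  @_ },
);
$query->seq('ACGT');
$query->database('nt');
print $query->local_blast,  "\n";
print $query->remote_blast, "\n";
```

The query set-up is written once, and existing RemoteBlast / 
StandAloneBlast code would not need to change for this to work.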

Maybe, to answer Donald's point, the module for running hmmer could also 
be included?

If you'd be willing to send your code for the ncbi-blast modules, I'll 
try to put together a draft plan of module organisation and methods for 
discussion.

Cheers

Richard





Will Spooner wrote:

>Hi Richard,
>
>I recently implemented a BioPerl-based generic sequence search API
>for Ensembl (http://www.ensembl.org/Multi/blastview). This seems very
>similar to what you propose below. The approach I used, however,  was to
>abstract the differences between search methods (wu-blast, ncbi-blast,
>fasta etc) into different  perl modules. This is similar to the way that
>SeqIO and SearchIO handle different formats. For example:
>
>  # This lazy-loads Bio/Tools/Run/Search/wu_blastn.pm
>  my $search = Bio::Tools::Run::Search->new( -method=>'wu_blastn' );
>
>  # This lazy-loads Bio/Tools/Run/Search/fasta.pm
>  my $search = Bio::Tools::Run::Search->new( -method=>'fasta' );
>
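
A minimal sketch of how such lazy-loading can work, not the actual 
Ensembl code: the My::Search namespace and the _init helper are 
hypothetical. The factory turns -method into a module name, loads that 
module on demand with require, and lets it construct the object. The 
example keeps the 'fasta' module in the same file, so it marks it as 
already loaded in %INC.

```perl
use strict;
use warnings;

package My::Search {
    # Factory constructor: map -method to a module, load it on demand,
    # then hand construction over to that module.
    sub new {
        my ($class, %args) = @_;
        my $method = delete $args{-method}
            or die "-method is required\n";
        $method =~ /\A\w+\z/
            or die "bad method name '$method'\n";

        my $module = "My::Search::$method";
        (my $file = "$module.pm") =~ s{::}{/}g;

        # require() does nothing if the module has already been loaded.
        eval { require $file; 1 }
            or die "cannot load search method '$method': $@";

        return $module->_init(%args);
    }

    # Shared initialiser, inherited by every method module.
    sub _init {
        my ($class, %args) = @_;
        return bless { %args }, $class;
    }
}

# A method module; normally this would live in My/Search/fasta.pm.
package My::Search::fasta {
    our @ISA = ('My::Search');
    sub program { return 'fasta' }
}

# Single-file example only: tell require() the module is already loaded.
BEGIN { $INC{'My/Search/fasta.pm'} = __FILE__ }

my $search = My::Search->new(-method => 'fasta', -database => 'swissprot');
print ref($search), "\n";
```

Asking for a method with no matching module fails inside the eval, so the 
caller gets one clear error rather than a half-built object.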
>Bio::Tools::Run::Search has the following methods:
>
>  'seq'      - adds the query sequence
>  'database' - configures the database location
>  'command'  - generates the command to run
>  'option'   - configures command options
>  'dispatch' - launches the command
>  'environment_variables' - configures environment variables
>  'run'      - combines 'command' + 'dispatch'
>  'status'   - reports job status (PENDING, RUNNING, COMPLETED etc)
>  'report'   - returns the raw search report
>  'next_result' - returns a Bio::Search::Result object
>  (N.b. Bio::Tools::Run::Search ISA Bio::Tools::Run::WrapperBase)
>
>This approach is pretty nice because you can easily subclass the '-method'
>modules to change the search behaviour. For example,
>-method=>'wu_blastn_bsub' is the same as -method=>'wu_blastn', except
>that the 'dispatch' method has been overridden to use the bsub job
>submission system. In addition, new '-methods' can be added without
>editing existing code.
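
The override described above can be sketched as follows. This is an 
illustration under assumptions, not the real modules: the My::Search 
package names, the option keys and the injected 'launcher' code ref are 
hypothetical, and the command line is a placeholder rather than a checked 
wu-blast invocation. The launcher only returns a string, so nothing is 
executed.

```perl
use strict;
use warnings;

package My::Search::wu_blastn {
    sub new {
        my ($class, %opt) = @_;
        return bless { %opt }, $class;
    }

    # Build the command line (placeholder arguments only).
    sub command {
        my $self = shift;
        return join ' ', 'blastn', $self->{database}, $self->{query_file};
    }

    # Default behaviour: pass the command to the launcher unchanged.
    sub dispatch {
        my ($self, $command) = @_;
        return $self->{launcher}->($command);
    }

    # 'run' combines 'command' + 'dispatch'.
    sub run {
        my $self = shift;
        return $self->dispatch($self->command);
    }
}

package My::Search::wu_blastn_bsub {
    our @ISA = ('My::Search::wu_blastn');

    # Only dispatch changes: the same command is wrapped in a bsub call.
    sub dispatch {
        my ($self, $command) = @_;
        return $self->SUPER::dispatch("bsub $command");
    }
}

my %opt = (
    database   => 'est_db',
    query_file => 'query.fa',
    launcher   => sub { return "would run: $_[0]" },
);
my $inline = My::Search::wu_blastn->new(%opt);
my $queued = My::Search::wu_blastn_bsub->new(%opt);
print $inline->run, "\n";
print $queued->run, "\n";
```

Because 'run' always goes through 'dispatch', the subclass inherits 
everything else untouched, which is why new '-method' modules can be added 
without editing existing ones.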
>
>Whilst I'm still developing the core of the system, I have functioning
>modules for wu-blast (inline, offline, bsub), ncbi-blast (inline,
>offline, bsub), blat (gfClient) and ssaha (ssahaClient).
>
>A lot more detail can be found at:
>  http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/EnsemblBlastView.html
>
>If this style of approach is of interest to you moving forward, then I
>would be very interested in contributing.
>
>Kind regards,
>
>Will
>
>
>On Tue, 4 Nov 2003, Richard Adams wrote:
>
>  
>
>>I'm not sure - here are some random musings.
>>It seems easier to organize the modules by program package rather than
>>by program function - for example,
>>Smith-Waterman modules are distinct from Blast modules even though the
>>programs have similar aims. If we're going to have uniform access to
>>Remote Blast and standalone blast, then one way might be to have a
>>BlastQuery class with common parameter-setting methods, and methods
>>such as
>>   run_remote_blast
>>   run_local_blast
>>which access the implementing code as appropriate. But this might be
>>a pain to implement without breaking everyone's existing code.
>>
>>Or, since StandAloneBlast uses AUTOLOAD, we could just add alternative
>>allowable names for methods so that $factory->p('blastn') and
>>$factory->program('blastn') are treated the same. Having method names
>>the same as the header names in the blast URI documentation might be
>>best, as I would suspect that everyone has used the web interface but
>>not everyone uses standalone blast.
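
A minimal sketch of the aliasing idea, assuming a stand-alone class 
rather than the real StandAloneBlast code: My::BlastFactory and its alias 
table are hypothetical. A single AUTOLOAD accessor translates a long name 
to its short option letter before storing the value, so both spellings 
read and write the same slot.

```perl
use strict;
use warnings;

package My::BlastFactory {
    our $AUTOLOAD;

    # Long names mapped onto short option letters (illustrative table only).
    my %ALIAS = (program => 'p', database => 'd', expect => 'e');
    my %OK    = map { $_ => 1 } values %ALIAS;

    sub new { return bless {}, shift }

    # One accessor for every parameter, whichever name it is called by.
    sub AUTOLOAD {
        my $self = shift;
        (my $name = $AUTOLOAD) =~ s/.*:://;   # strip the package prefix
        return if $name eq 'DESTROY';         # never treat DESTROY as a parameter

        $name = $ALIAS{$name} if exists $ALIAS{$name};
        die "unknown parameter '$name'\n" unless $OK{$name};

        $self->{$name} = shift if @_;
        return $self->{$name};
    }
}

my $factory = My::BlastFactory->new;
$factory->program('blastn');     # long name
print $factory->p, "\n";         # short name reads the same value
```

Existing calls that use the short names keep working, which addresses the 
worry about breaking everyone's code.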
>>
>>Richard
>>
>>
>>
>>--
>>Dr Richard Adams
>>Bioinformatician,
>>Psychiatric Genetics Group,
>>Medical Genetics,
>>Molecular Medicine Centre,
>>Western General Hospital,
>>Crewe Rd West,
>>Edinburgh UK
>>EH4 2XU
>>
>>Tel: 44 131 651 1084
>>richard.adams at ed.ac.uk
>>
>
>---
>Dr William Spooner                          whs at sanger.ac.uk
>Ensembl Web Developer                 http://www.ensembl.org
>
>  
>

-- 
Dr Richard Adams
Bioinformatician,
Psychiatric Genetics Group,
Medical Genetics,
Molecular Medicine Centre,
Western General Hospital,
Crewe Rd West,
Edinburgh UK
EH4 2XU

Tel: 44 131 651 1084
richard.adams at ed.ac.uk




