[Bioperl-l] Still working with biopipe....

Fri Nov 21 10:44:28 EST 2003

Hi,
I am still working with biopipe.
I am now trying to build a pipeline from scratch to find orthologues between
Oryza sativa (Os) and Arabidopsis thaliana (At) by BBMH (best BLAST mutual
hit), before developing something more efficient and more complicated.
I started with a simple BLAST of one Os protein against the At protein
multi-FASTA file, using Bio::DB::Fasta and the other BioPerl methods needed,
looping over all the Os proteins instead of running one massive BLAST on a
chunk of Os proteins. I would like to take a sequence from oriz_mfasta.txt
(using the get_Seq_by_id function), blast it against arabido_mfasta.txt, and
so on for every Oryza sequence. This is the first step. But it is not
working! Probably because the role of all the XML I am working with is not
really clear to me (especially the <data_monger> tag).
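
To make the goal concrete, the mutual-best-hit selection I am aiming for can be
sketched as follows. This is only a rough illustration in plain Python,
independent of biopipe and BioPerl: the function names, the sequence IDs and
the scores are all made up, and it assumes the BLAST results of both
directions have already been parsed into (query, subject, bit score) tuples.

```python
def best_hits(hits):
    """Map each query ID to the subject ID of its highest-scoring hit.

    `hits` is an iterable of (query_id, subject_id, bit_score) tuples.
    Ties keep the first hit seen (an arbitrary choice for this sketch).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def mutual_best_hits(forward_hits, reverse_hits):
    """Return sorted (query, subject) pairs that are each other's best hit."""
    forward = best_hits(forward_hits)   # e.g. Os proteins against At
    reverse = best_hits(reverse_hits)   # e.g. At proteins against Os
    return sorted(
        (query, subject)
        for query, subject in forward.items()
        if reverse.get(subject) == query
    )


# Hypothetical parsed hits: (query, subject, bit score)
os_vs_at = [
    ("Os1", "At1", 250.0),
    ("Os1", "At2", 90.0),
    ("Os2", "At2", 120.0),
    ("Os3", "At1", 80.0),
]
at_vs_os = [
    ("At1", "Os1", 245.0),
    ("At2", "Os2", 118.0),
    ("At2", "Os1", 95.0),
]

pairs = mutual_best_hits(os_vs_at, at_vs_os)
print(pairs)
```

Here Os3 is dropped because the best hit of At1 in the reverse direction is
Os1, not Os3, so only the pairs that agree in both directions are kept.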

You will find the code and the biopipe output below.
Thanks in advance

<pipeline_setup>

<!-- FILES  -->
<global
         rootdir="/home/conte/test_blast"
         datadir="$rootdir/datahope"
         workdir="$rootdir/blasthope"
         inputfile="$datadir/oriz_mfasta.txt"
         blastpath = ""
         blast_param1="-p blastp -e 1e-5"
         blastdb1="$datadir/arabido_mfasta.txt"
         resultdir1="$rootdir/resulthope/analysis1"
/>
<pipeline_flow_setup>
<!--CALL  MODULES  -->
  <database_setup>
    <streamadaptor>
      <module>Bio::Pipeline::Dumper</module>
    </streamadaptor>
    <streamadaptor>
      <module>Bio::DB::Fasta</module>
    </streamadaptor>
   </database_setup>

<!-- IOHANDLER PICK UP iDs-->
     <iohandler_setup>
    <iohandler>
     <adaptor_id>2</adaptor_id>
     <adaptor_type>STREAM</adaptor_type>
     <iohandler_type>INPUT</iohandler_type>
     <method>
       <name>new</name>
       <rank>1</rank>
       <argument>
         <value>$inputfile</value>
       </argument>
     </method>
     <method>
       <name>get_Seq_by_id</name>
     <argument>
     <value>INPUT</value>
     </argument>
       <rank>2</rank>
     </method>
   </iohandler>

    <iohandler>
     <adaptor_id>2</adaptor_id>
     <adaptor_type>STREAM</adaptor_type>
     <iohandler_type>INPUT</iohandler_type>
    <method>
       <name>new</name>
       <rank>1</rank>
       <argument>
           <value>$inputfile</value>
       </argument>
    </method>
    <method>
       <name>get_all_ids</name>
       <rank>2</rank>
    </method>
   </iohandler>

<!-- PARAMETRES OUTPUT (DUMPER) -->
   <iohandler>
     <adaptor_id>1</adaptor_id>
     <adaptor_type>STREAM</adaptor_type>
     <iohandler_type>OUTPUT</iohandler_type>
     <method>
       <name>new</name>
       <rank>1</rank>
       <argument>
         <tag>-dir</tag>
         <value>$resultdir1</value>
         <type>SCALAR</type>
         <rank>1</rank>
       </argument>
       <argument>
         <tag>-module</tag>
         <value>generic</value>
         <type>SCALAR</type>
         <rank>1</rank>
       </argument>
       <argument>
         <tag>-prefix</tag>
         <type>SCALAR</type>
         <value>INPUT</value>
         <rank>2</rank>
       </argument>
       <argument>
         <tag>-format</tag>
         <type>SCALAR</type>
         <value>gff</value>
         <rank>3</rank>
       </argument>
       <argument>
         <tag>-file_suffix</tag>
         <type>SCALAR</type>
         <value>gff</value>
         <rank>4</rank>
       </argument>
     </method>
     <method>
       <name>dump</name>
       <rank>2</rank>
       <argument>
        <value>OUTPUT</value>
         <type>ARRAY</type>
         <rank>1</rank>
       </argument>
      </method>
     </iohandler>
  </iohandler_setup>

<!-- ANALYSIS -->
    <analysis>
     <data_monger>
       <initial></initial>
       <input>
         <name>protein_ids</name>
         <iohandler>1</iohandler>
       </input>
       <input_create>
          <module>setup_initial</module>
          <rank>1</rank>
          <argument>
               <tag>protein_ids</tag>
               <value>2</value>
           </argument>
        </input_create>
     </data_monger>
     <input_iohandler></input_iohandler>
   </analysis>

<!-- BLAST-->
   <analysis>
     <logic_name>Blast</logic_name>
     <runnable>Bio::Pipeline::Runnable::Blast</runnable>
     <db>family</db>
     <db_file>$blastdb1</db_file>
     <program>blastall</program>

<!-- BLASTPATH-->
     <program_file>$blastpath</program_file>
     <analysis_parameters>$blast_param1</analysis_parameters>
     <runnable_parameters>-formatdb 1 -result_dir $resultdir1</runnable_parameters>

     <input_iohandler></input_iohandler>

     <output_iohandler></output_iohandler>
   </analysis>

<!-- RULES -->
<rule>
     <current_analysis_id>1</current_analysis_id>
     <next_analysis_id>2</next_analysis_id>
     <action>NOTHING</action>

</rule>

</pipeline_flow_setup>
<job_setup>
</job_setup>

</pipeline_setup>

And this is the output I get:
Creating biopipe
  Loading Schema...
Reading Data_setup xml   : /home/conte/xml/newhope.xml
Doing DBAdaptor and IOHandler setup
Doing Pipeline Flow Setup
Doing Analysis..

------------- EXCEPTION  -------------
MSG: Need to store analysis first
STACK Bio::Pipeline::SQL::JobAdaptor::store /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/Pipeline/SQL/JobAdaptor.pm:459
STACK Bio::Pipeline::XMLImporter::_create_initial_input_and_job /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/Pipeline/XMLImporter.pm:837
STACK Bio::Pipeline::XMLImporter::run /usr/local/ActivePerl-5.8/lib/site_perl/5.8.0/Bio/Pipeline/XMLImporter.pm:484
STACK toplevel PipelineManager:120

Matthieu CONTE
M. Sc. in Bioinformatics from SIB
CIRAD
00 33 06.68.90.28.70
m_conte at hotmail.com
