[Bioperl-l] Bioperl and matcher

Vilanova,David,LAUSANNE,NRC/BS david.vilanova@rdls.nestle.com
Tue, 26 Nov 2002 16:58:32 +0100


 
Hello,
I have problems retrieving the alignments from EMBOSS output.
The program below reads two files and runs matcher all against all.
Matcher gives me MSF output, and then I try to parse this alignment with
Bio::AlignIO.
However, I get an exception:
 
Processing sequence 1..vs..3...done
 
------------- EXCEPTION  -------------
MSG: 1 exists as an alignment line but not in the header. Not confident of
what is going on!
STACK Bio::AlignIO::msf::next_aln
/usr/local/lib/perl5/site_perl/5.8.0/Bio/AlignIO/msf.pm:106
STACK toplevel Run_Emboss.pl:50
 
--------------------------------------
 
Here is the output from matcher:
!!NA_MULTIPLE_ALIGNMENT 1.0
 
  out MSF: 5 Type: N 26/11/02 CompCheck: 2090 ..
 
  Name: EMBOSS_001 Len: 5  Check: 1045 Weight: 1.00
  Name: EMBOSS_002 Len: 5  Check: 1045 Weight: 1.00
 
//
 
           1   5
EMBOSS_001 CGGCG
EMBOSS_002 CGGCG
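
A possible explanation, offered as a guess rather than a confirmed diagnosis: the coordinate line "1   5" after the "//" may be read as an alignment line for a sequence named "1", which has no "Name:" entry in the header. The sketch below is a small standalone check, not BioPerl code. The helper name unknown_alignment_names is hypothetical, and it assumes the MSF layout shown above ("Name:" lines before "//", one "name residues" pair per line after it).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the first token of every
# line after "//" whose name was not declared on a "Name:" header line.
sub unknown_alignment_names {
    my ($text) = @_;
    my (%declared, %seen, @unknown);
    my $in_body = 0;
    for my $line (split /\n/, $text) {
        if (!$in_body) {
            # The "//" line separates the header from the alignment body
            if ($line =~ m{^\s*//\s*$}) { $in_body = 1; next; }
            $declared{$1} = 1 if $line =~ /Name:\s+(\S+)/;
            next;
        }
        # Only consider lines that look like "token  something"
        next unless $line =~ /^\s*(\S+)\s+\S/;
        my $name = $1;
        push @unknown, $name unless $declared{$name} || $seen{$name}++;
    }
    return @unknown;
}

# The matcher output from this message, pasted verbatim
my $msf = <<'END_MSF';
!!NA_MULTIPLE_ALIGNMENT 1.0

  out MSF: 5 Type: N 26/11/02 CompCheck: 2090 ..

  Name: EMBOSS_001 Len: 5  Check: 1045 Weight: 1.00
  Name: EMBOSS_002 Len: 5  Check: 1045 Weight: 1.00

//

           1   5
EMBOSS_001 CGGCG
EMBOSS_002 CGGCG
END_MSF

print join(",", unknown_alignment_names($msf)), "\n";
```

On the output above this reports the single token "1", which matches the name in the exception message.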

 
###########################################################
It doesn't work with fasta format either in my script (see output below):
Processing sequence 1..vs..3...done
Use of uninitialized value in sprintf at
/usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 257, <GEN2>
line 4.
Use of uninitialized value in hash element at
/usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 268, <GEN2>
line 4.
Use of uninitialized value in hash element at
/usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 268, <GEN2>
line 4.
Use of uninitialized value in hash element at
/usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 270, <GEN2>
line 4.

#########################
 
 
#Script
#! /usr/bin/perl -w
 
use Bio::Factory::EMBOSS;
use Bio::SeqIO;
use Bio::AlignIO;
 
die "Usage: perl script.pl [seqfileA] [seqfileB] [outfile]\n"
    unless @ARGV == 3;
 
#Read input files
($seqfileA,$seqfileB,$outfile) = @ARGV;
 
#Initialize Object
$EMBOSS = new Bio::Factory::EMBOSS;
 
#Define emboss program to run
$application = $EMBOSS->program('matcher');
 
#Manipulate SeqfileA file
$seqA = new Bio::SeqIO (-file => $seqfileA,
   -format => 'fasta');
 

while ($seqinA = $seqA->next_seq){
    $inseqA = "asis::".$seqinA->seq;
    $seqidA = $seqinA->id;
 
    
    print "####$seqidA\n";
    #Initialize seqB at every iteration of SeqA
    $seqB = new Bio::SeqIO (-file => $seqfileB,
       -format => 'fasta');
    
    while ($seqinB = $seqB->next_seq){
        $inseqB = "asis::".$seqinB->seq; #Format like asis::ATGCGA (required for emboss)
        $seqidB = $seqinB->id;
 
        print "Processing sequence $seqidA..vs..$seqidB...";
 
        #Define program parameters and run...
        $application->run({
            -sequencea => $inseqA,
            -sequenceb => $inseqB,
            -aformat => 'msf',
            -outfile => $outfile });
        print "done\n";
 
        $alnin = new Bio::AlignIO(-format => 'msf',
                                  -file   => $outfile);
 
        while ($aln = $alnin->next_aln){
            print $aln->no_residues,"\n";
            #print $aln->consensus_string,"\n";
        }
    }
}