[Bioperl-l] RE: Bioperl and matcher

Vilanova,David,LAUSANNE,NRC/BS david.vilanova@rdls.nestle.com
Tue, 26 Nov 2002 17:14:19 +0100


OK, I use:

$alnin = new Bio::AlignIO(-format => 'emboss',
                          -file   => $outfile);
while ($aln = $alnin->next_aln){
      print $aln->no_residues,"\n";
}

I don't pass any format option to EMBOSS, so I get the standard alignment
output. In this case it doesn't work: it never enters this loop, but the
program doesn't crash. It runs all the alignments and stores the alignment in
outfile, but doesn't seem to read it back. Bizarre!
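
One way to narrow this down might be to look at the file itself before
handing it to Bio::AlignIO. The following is only a minimal sketch in core
Perl: check_alignment_file and head_lines are hypothetical helper names (not
part of BioPerl), and it assumes the path given is the same $outfile that
matcher wrote.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): report the state of a file
# before trying to parse it. Returns 'missing', 'empty', or 'ok'.
sub check_alignment_file {
    my ($path) = @_;
    return 'missing' unless -e $path;   # file was never written
    return 'empty'   unless -s $path;   # file exists but has zero size
    return 'ok';
}

# Hypothetical helper: return up to the first $n lines of a file, so the
# header can be compared by eye with what the chosen parser expects.
sub head_lines {
    my ($path, $n) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my @lines;
    while (my $line = <$fh>) {
        push @lines, $line;
        last if @lines >= $n;
    }
    close $fh;
    return @lines;
}

# Usage: perl check_outfile.pl outfile
if (@ARGV) {
    my $outfile = $ARGV[0];
    my $status  = check_alignment_file($outfile);
    print "$outfile: $status\n";
    print head_lines($outfile, 10) if $status eq 'ok';
}
```

If the status is 'ok' and the loop is still never entered, the first lines
printed show what the 'emboss' parser is actually being given.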

David




-----Original Message-----
From: Jason Stajich [mailto:jason@cgt.mc.duke.edu]
Sent: Tuesday, 26 November 2002 17:05
To: Vilanova,David,LAUSANNE,NRC/BS
Cc: 'bioperl-l@bioperl.org'; 'emboss@embnet.org'
Subject: Re: Bioperl and matcher


Our msf parser is seeing something it isn't expecting, though I'm not sure
why. What happens when you just use the straight 'emboss' parser with
standard emboss alignment output, which is the route that has been most
heavily tested?

-jason

Jason Stajich
Duke University
jason at cgt.mc.duke.edu

On Tue, 26 Nov 2002, Vilanova,David,LAUSANNE,NRC/BS wrote:

>
> Hello,
> I have problems retrieving the alignments from an emboss output.
> The program below reads 2 files and runs matcher on all against all.
> Matcher gives me an msf output and then I try to parse this alignment with
> Bio::AlignIO.
> However I get an exception...
>
> Processing sequence 1..vs..3...done
>
> ------------- EXCEPTION  -------------
> MSG: 1 exists as an alignment line but not in the header. Not confident of
> what is going on!
> STACK Bio::AlignIO::msf::next_aln
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/AlignIO/msf.pm:106
> STACK toplevel Run_Emboss.pl:50
>
> --------------------------------------
>
> Here is the output from matcher:
> !!NA_MULTIPLE_ALIGNMENT 1.0
>
>   out MSF: 5 Type: N 26/11/02 CompCheck: 2090 ..
>
>   Name: EMBOSS_001 Len: 5  Check: 1045 Weight: 1.00
>   Name: EMBOSS_002 Len: 5  Check: 1045 Weight: 1.00
>
> //
>
>            1   5
> EMBOSS_001 CGGCG
> EMBOSS_002 CGGCG
>
>
> ###########################################################
> It doesn't work for fasta format either in my script (see output below):
> Processing sequence 1..vs..3...done
> Use of uninitialized value in sprintf at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 257, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 268, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 268, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 270, <GEN2>
> line 4.
>
> #########################
>
>
> #Script
> #! /usr/bin/perl -w
>
> use Bio::Factory::EMBOSS;
> use Bio::SeqIO;
> use Bio::AlignIO;
>
> die "Usage: perl script.pl [seqfileA] [seqfileB] [outfile]\n"
>     unless @ARGV == 3;
>
> #Read input files
> ($seqfileA,$seqfileB,$outfile) = @ARGV;
>
> #Initialize Object
> $EMBOSS = new Bio::Factory::EMBOSS;
>
> #Define emboss program to run
> $application = $EMBOSS->program('matcher');
>
> #Manipulate SeqfileA file
> $seqA = new Bio::SeqIO (-file => $seqfileA,
>    -format => 'fasta');
>
>
> while ($seqinA = $seqA->next_seq){
>     $inseqA = "asis::".$seqinA->seq;
>     $seqidA = $seqinA->id;
>
>
>     print "####$seqidA\n";
>     #Initialize seqB at every iteration of SeqA
>     $seqB = new Bio::SeqIO (-file => $seqfileB,
>        -format => 'fasta');
>
>     while ($seqinB = $seqB->next_seq){
>         #Format like asis::ATGCGA (required for emboss)
>         $inseqB = "asis::".$seqinB->seq;
>         $seqidB = $seqinB->id;
>
>         print "Processing sequence $seqidA..vs..$seqidB...";
>
>         #Define program parameters and run...
>         $application->run({
>             -sequencea => $inseqA,
>             -sequenceb => $inseqB,
>             -aformat => 'msf',
>             -outfile => $outfile });
>         print "done\n";
>
>         $alnin = new Bio::AlignIO(-format => 'msf',
>                                   -file   => $outfile);
>
>         while ($aln = $alnin->next_aln){
>             print $aln->no_residues,"\n";
>             #print $aln->consensus_string,"\n";
>
>         }
>     }
> }
>