[Bioperl-l] AlignIO clustal

Bernd Web bernd.web at gmail.com
Fri Apr 3 14:11:44 UTC 2009


Hi,

I noticed this issue is not specific to Clustal; it also occurs for Fasta.
The "problem" arises from a final check in next_aln that is applied only to
the last sequence; it is still present in the current code (webcvs).

In fasta.pm:
	#  If $end <= 0, we have either reached the end of
	#  file in <> or we have encountered some other error
	if ( $end <= 0 ) {
		undef $aln;
		return $aln;
	}

In clustalw.pm:

    # not sure if this should be a default option - or we can pass in
    # an option to do this in the future? --jason stajich
    # $aln->map_chars('\.','-');
    undef $aln if ( !defined $end || $end <= 0 );
    return $aln;

And the last sequence actually got an end of zero. It came from an
$aln->slice call in which gap-only sequences are retained. The sequence
will also get an end of "0" in next_aln itself if no coordinates are present.

1_3265047/1-0          ---------------------------

For now, commenting out "undef $aln if ( !defined $end || $end <= 0 );" works.
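
To illustrate the behaviour, here is a minimal standalone sketch. It is not
the BioPerl code: parse_block and its $check_last_end flag are hypothetical
names made up for this example, and only the final test is copied from
clustalw.pm. It assumes each row looks like "name/start-end residues".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the row loop in next_aln (NOT the BioPerl code).
# It keeps the same pattern: $end is set per row and tested once at the end.
sub parse_block {
    my ($check_last_end, @lines) = @_;
    my (@seqs, $end);
    for my $line (@lines) {
        # name/start-end followed by the aligned residues
        my ($name, $start, $e, $residues) =
            $line =~ m{^(\S+)/(\d+)-(\d+)\s+(\S+)$} or next;
        $end = $e;    # overwritten on every row, so only the last value survives
        push @seqs, { name => $name, start => $start, end => $e, seq => $residues };
    }
    # Final test as quoted from clustalw.pm; it sees only the last row's $end
    return if $check_last_end && ( !defined $end || $end <= 0 );
    return \@seqs;
}

# Rows taken from the test.aln example in the original message
my @rows = (
    'QUERY/7-143            PETLE-ARINRATNPLNKEL--DWASI',
    '7082547/1-128          ---------ERATNDMLIGP--DWAVN',
    '1_3265048/1-0          ---------------------------',
    '3265047/2-138          QTSLE-ALLLKATNSQNQNI--DTAAV',
    '1_3265047/1-0          ---------------------------',
);

my $with_check    = parse_block(1, @rows);               # gap-only row last
my $without_check = parse_block(0, @rows);               # final test disabled
my $reordered     = parse_block(1, @rows[0, 1, 2, 4, 3]); # gap-only row not last

for my $case ( [ 'with check', $with_check ],
               [ 'without check', $without_check ],
               [ 'reordered', $reordered ] ) {
    my ($label, $result) = @$case;
    printf "%-14s %s\n", $label,
        defined $result ? scalar(@$result) . ' rows' : 'undef';
}
```

With the rows above, the first call returns undef, while the other two return
all five rows: the gap-only row in the middle of the block is harmless, and
only the one in last position triggers the check.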


Regards,
Bernd

On Fri, Apr 3, 2009 at 3:47 PM, Bernd Web <bernd.web at gmail.com> wrote:
> Hi,
>
> Using Bioperl 1.5.2 and AlignIO, I now run into an issue with a
> clustalw alignment.
> At the moment, I cannot update to a newer version, so I am not sure whether
> this problem still exists.
>
> The problem is that the $aln object does not exist when the last
> sequence in a block contains gaps only.
> Has anybody seen this, or does anyone know a fix? Code and example input
> follow below.
>
>
> Regards,
> Bernd
>
>
> use Bio::AlignIO;
> my $in = Bio::AlignIO->new(-file => 'test.aln',
>                           -format => 'clustalw');
>
> my $out = Bio::AlignIO->new(-file => '>testerr.ALN',
>                            -format => 'clustalw');
>
> my $aln = $in->next_aln();
> print $aln->length, "\n";
>
> test.aln contains:
>
> CLUSTAL W(1.81) multiple sequence alignment
>
>
> QUERY/7-143            PETLE-ARINRATNPLNKEL--DWASI
> 7082547/1-128          ---------ERATNDMLIGP--DWAVN
> 1_3265048/1-0          ---------------------------
> 3265047/2-138          QTSLE-ALLLKATNSQNQNI--DTAAV
> 1_3265047/1-0          ---------------------------
>


