[Bioperl-l] How to change a fasta format alignment into clustalw format?
Tao Zhu
taozhu at mail.bnu.edu.cn
Wed Sep 12 12:28:31 UTC 2012
Hello, everyone
I have a multiple protein sequence alignment in FASTA format:
>SPOG_04578#scry
MESRMTNSVRIRSITKKDVSVVFQFI2IELADFEDARDQVEATEESLLHAFGFT-
>SOCG_01498#soct
----MTNSVRVRPITNKDISTVIQFI2IELADFEEARDQVEATEESLLNVFGFNE
>SPAC1002.07c#spom
-----MGSVRIRSVIKEDLPTVYQFI2KELAEFEKCEDQVEATIPNLEVAFGFID
>SJAG_03288#sjap
--MTNKTTAVVRRLKREDCPVVLQFI2KELAEYQKEPQQVEATVEKLEKAFGFVE
I want to convert it to CLUSTALW format. It should have been easy:
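use strict;
use warnings;
use Bio::AlignIO;

# Input FASTA alignment and output CLUSTALW file, taken from the command line
my $in = shift;
my $out = shift;
my $alignio = Bio::AlignIO->new(-file => $in, -format => 'fasta');
my $writeio = Bio::AlignIO->new(-file => ">$out", -format => 'clustalw');
# Copy every alignment from the input file to the output file
while ( my $align_obj = $alignio->next_aln ) {
    $writeio->write_aln($align_obj);
}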
That looks fine, but it doesn't work: it fails with the message "seq
doesn't validate". The cause is the character "2" in the alignment. I
inserted each "2" deliberately to mark a position where a phase-2 intron
exists, and I would like to keep these markers in the CLUSTALW output.
Is there any way to do this?
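
One possible workaround, given only as a rough sketch, is to skip the sequence validation altogether and write the CLUSTAL-style blocks by hand in plain Perl. The helper names parse_fasta and format_clustal below are hypothetical and are not part of BioPerl. The header string, the 60-column block width and the omission of the conservation line are assumptions about what downstream tools will accept, so the output may need adjusting.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse FASTA text into a list of [id, sequence] pairs. Sequence characters
# are kept exactly as written, so markers such as "2" survive.
sub parse_fasta {
    my ($text) = @_;
    my @records;
    for my $line (split /\r?\n/, $text) {
        next if $line =~ /^\s*$/;
        if ($line =~ /^>(\S+)/) {
            push @records, [$1, ''];
        }
        elsif (@records) {
            $line =~ s/\s+//g;
            $records[-1][1] .= $line;
        }
    }
    return @records;
}

# Format the records as CLUSTAL-style text in blocks of $width columns.
# No conservation line is written (an assumption; many readers accept this).
sub format_clustal {
    my ($records, $width) = @_;
    $width ||= 60;
    die "No sequences found\n" unless @$records;
    my $len = length $records->[0][1];
    for my $rec (@$records) {
        die "Sequence $rec->[0] has a different length\n"
            if length($rec->[1]) != $len;
    }
    # Longest identifier decides how far the sequence column is indented
    my ($name_width) = sort { $b <=> $a } map { length $_->[0] } @$records;
    my $out = "CLUSTAL W (1.81) multiple sequence alignment\n\n";
    for (my $pos = 0; $pos < $len; $pos += $width) {
        $out .= "\n";
        for my $rec (@$records) {
            $out .= sprintf "%-*s%s\n", $name_width + 6, $rec->[0],
                substr($rec->[1], $pos, $width);
        }
    }
    return $out;
}

# Command-line use: script.pl input.fasta output.aln
if (@ARGV == 2) {
    my ($in, $out) = @ARGV;
    open my $ifh, '<', $in or die "Cannot read $in: $!\n";
    my $text = do { local $/; <$ifh> } // '';
    close $ifh;
    open my $ofh, '>', $out or die "Cannot write $out: $!\n";
    print {$ofh} format_clustal([ parse_fasta($text) ]);
    close $ofh;
}
```

Because nothing here checks the alphabet, the "2" markers pass through untouched. The price is that none of BioPerl's other checks or alignment methods are applied either.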
--
Tao Zhu, College of Life Sciences, Beijing Normal University, Beijing
100875, China
Email: tzhu at mail.bnu.edu.cn