[Bioperl-l] converting to clustalw alignment format

Tao Zhu bill_zt at sina.com
Tue Sep 18 05:28:19 EDT 2012



I have a DNA seq alignment in FASTA format:

> scry
TATAAAAACATTAAGTGCTACTGCACTGATATAATTGTTAAC
> soct
TACAAAAATTTCAAATGCTATTGTACTGATATCATTGTCGAC
> spom
TTTAAAAATGTAGAGTGTTATTGCACTAATTTAAGCATTGAC
> sjap
GAAAAAGAAATCGAGTGTTATTGTACAGACTTGATTGTAAAC

I converted it to CLUSTALW format using Bio::AlignIO:

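use strict;
use warnings;
use Bio::AlignIO;

my $in  = shift;    # input FASTA alignment file
my $out = shift;    # output CLUSTALW file

# Read the first alignment from the FASTA file
my $align_obj = Bio::AlignIO->new(-file => $in, -format => 'fasta')->next_aln;

# Write the alignment out in CLUSTALW format
my $writeio_obj = Bio::AlignIO->new(-file => ">$out", -format => 'clustalw');
$writeio_obj->write_aln($align_obj);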

The result is:
scry/1-139             TATAAAAACATTAAGTGCTACTGCACTGATATAATTGTTAAC
soct/1-137             TACAAAAATTTCAAATGCTATTGTACTGATATCATTGTCGAC
spom/1-135             TTTAAAAATGTAGAGTGTTATTGCACTAATTTAAGCATTGAC
sjap/1-87              GAAAAAGAAATCGAGTGTTATTGTACAGACTTGATTGTAAAC
                          *** *  *  * ** ** ** **  *  * *   *  **

I don't want position ranges like "/1-139" to be shown; I just want the same
sequence names as in the original file.

How can I remove the "/1-XXX" position suffixes? Thank you!
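
For reference, one approach that may address this is to flatten the display names on the alignment object before writing it. The sketch below assumes the installed BioPerl provides the set_displayname_flat method on Bio::SimpleAlign (the object returned by next_aln); the helper name convert_flat is hypothetical and not part of the original script.

```perl
use strict;
use warnings;
use Bio::AlignIO;

# Hypothetical helper (not from the original post): convert a FASTA
# alignment to CLUSTALW format while keeping the bare sequence names.
sub convert_flat {
    my ($in, $out) = @_;

    # Read the first alignment from the FASTA file
    my $align_obj = Bio::AlignIO->new(-file => $in, -format => 'fasta')->next_aln;

    # Assumption: set_displayname_flat is available on the alignment object.
    # It sets each display name to the sequence id, without "/start-end".
    $align_obj->set_displayname_flat;

    # Write the alignment; the writer is closed when it goes out of scope
    my $writeio_obj = Bio::AlignIO->new(-file => ">$out", -format => 'clustalw');
    $writeio_obj->write_aln($align_obj);
    return;
}

# Run as a script: perl convert.pl input.fasta output.aln
convert_flat(@ARGV) if @ARGV == 2;
```

With this change, the names in the output would be expected to appear as "scry", "soct", and so on, rather than "scry/1-139".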

