[Bioperl-l] Parsing clustalw alignments

Ryan Golhar golharam at umdnj.edu
Mon Jan 30 12:40:58 EST 2006


Thanks.  Here's what I ended up doing:

my $seqio = Bio::AlignIO->new(-format => 'msf', -file => "alnfile.clustalw");
my $aln = $seqio->next_aln();
# each_seq_with_id returns a list; take the first sequence with this ID
my ($seq) = $aln->each_seq_with_id('org1');
my $seq1 = $seq->seq;

-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org
[mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Jason Stajich
Sent: Sunday, January 29, 2006 2:49 PM
To: golharam at umdnj.edu
Cc: 'bioperl-l'
Subject: Re: [Bioperl-l] Parsing clustalw alignments


See the Bio::SimpleAlign documentation for information on how to
interact with an alignment.

Here is some code from the SYNOPSIS:
# Extract sequences and check values for the alignment column $pos
   foreach $seq ($aln->each_seq) {
       $res = $seq->subseq($pos, $pos);
       $count{$res}++;
   }


So for your question:
# get the aln parser
my $alnio = Bio::AlignIO->new(-format => 'clustalw', -file => "alnfile.aln");
while( my $aln = $alnio->next_aln ) {
  # get the alignments one by one
  for my $seq ( $aln->each_seq ) {
    # get the sequences out from the alignment
    print "sequence as a string: ", $seq->seq, "\n";
  }
}


next_seq is an API for sequence streams, not something we have
implemented for alignments, since you can get all the sequences out
with the each_seq method.
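
As a rough sketch of the each_seq approach, here is one way to collect
the gapped sequence strings into a hash keyed by sequence ID. The helper
name gapped_seqs and the file name in the usage comment are made up for
illustration, and it only reads the first alignment in the file, so
treat it as a starting point rather than tested code for your data.

```perl
use strict;
use warnings;
use Bio::AlignIO;

# Hypothetical helper (not part of BioPerl): returns a hash reference
# mapping each sequence's display ID to its gapped sequence string,
# taken from the first alignment found in $file.
sub gapped_seqs {
    my ($file, $format) = @_;

    my $alnio = Bio::AlignIO->new(-format => $format, -file => $file);
    my $aln   = $alnio->next_aln
        or die "no alignment found in $file\n";

    my %gapped;
    for my $seq ( $aln->each_seq ) {
        # seq() on an aligned sequence returns the string with gaps
        $gapped{ $seq->display_id } = $seq->seq;
    }
    return \%gapped;
}

# Example call (assumed file name and format):
#   my $seqs = gapped_seqs("alnfile.aln", 'clustalw');
#   print "$_\t$seqs->{$_}\n" for sort keys %$seqs;
```

If your files are really in MSF format, pass 'msf' as the format
instead of 'clustalw'.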

-jason
On Jan 29, 2006, at 12:48 PM, Ryan Golhar wrote:

> I can't figure this out from the documentation.  In fact, I'm not sure
> it's possible:
>
> I have a bunch of clustalw alignments in GCG (MSF) format.  Each 
> alignment consists of three sequences.  I want to get the sequences 
> including the gaps from the alignment.
>
> I'm trying to use Bio::AlignIO to read the alignment file, then trying
> to get each sequence from the alignment. I tried doing this:
>
> $seqio = Bio::AlignIO->new(-format => 'clustalw', -file =>
> "align$x.clustalw");
> my $aln = $seqio->next_aln();
> $seq1 = $aln->next_seq()->seq;
>
> Getting the sequence from the alignment isn't working and I'm not sure
> how to do it.  Does anyone have any ideas as to what I might try?
>
> --
> Ryan Golhar  -  golharam at umdnj.edu
> The Informatics Institute of UMDNJ
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org 
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12

