[Bioperl-l] handling nexus files

Jason Stajich jason at bioperl.org
Mon Oct 5 18:16:50 EDT 2009


You can manipulate the alignments to do this, so see what you can do
with Bio::Align objects like Bio::SimpleAlign, which is what you get
when parsing with Bio::AlignIO.

The concatenation problem basically requires concatenating, for each
sequence ID, the data in the sequence objects.

Here is a really basic solution. You'd want to add some error checking
in there for consistency of the data (all IDs present in all files, etc.).

http://gist.github.com/202528

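use strict;
use warnings;
use Bio::AlignIO;
use Bio::SimpleAlign;
use Bio::LocatableSeq;

# Concatenate the sequences from each NEXUS file on the command line, keyed by ID.
my %seqs;
for my $file ( @ARGV ) {
  my $in = Bio::AlignIO->new(-format => 'nexus', -file => $file);
  # Only the first alignment in each file is used.
  if ( my $aln = $in->next_aln ) {
    for my $seq ( $aln->each_seq ) {
      $seqs{$seq->display_id} .= $seq->seq;
    }
  }
}

# Build one alignment from the concatenated sequences (IDs sorted for stable output).
my $newaln = Bio::SimpleAlign->new;
for my $id ( sort keys %seqs ) {
  $newaln->add_seq(Bio::LocatableSeq->new(-id => $id, -seq => $seqs{$id}));
}

# With no -file or -fh argument, the alignment is written to STDOUT.
my $out = Bio::AlignIO->new(-format => 'nexus');
$out->write_aln($newaln);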
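
The script above does not write the "begin sets" block from your
combinedFile.nex example, and it does not check the IDs. Below is a rough
sketch of both pieces in plain Perl, not a tested part of the gist. The
helper names charset_block and missing_ids are made up for illustration;
they are not BioPerl functions.

```perl
use strict;
use warnings;

# Hypothetical helper: given [label, length] pairs in concatenation order,
# return the text of a NEXUS sets block with one charset per input.
sub charset_block {
    my @parts = @_;
    my $start = 1;
    my @lines = ('begin sets;');
    for my $part (@parts) {
        my ($label, $len) = @$part;
        my $end = $start + $len - 1;
        push @lines, "charset $label=$start-$end;";
        $start = $end + 1;
    }
    push @lines, 'end;';
    return join("\n", @lines) . "\n";
}

# Hypothetical helper: return the IDs from @$expected that are absent
# from the hash of sequences read from one file.
sub missing_ids {
    my ($expected, $seen) = @_;
    return grep { !exists $seen->{$_} } @$expected;
}

# Two inputs of 3 columns each, as in the file1.nex/file2.nex example.
print charset_block(['a1', 3], ['a2', 3]);
```

To use this with the script, you could record a label and $aln->length for
each file inside the loop and print the block after write_aln. Whether
downstream programs accept the block in that position is something you
would need to check.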

-jason

On Oct 5, 2009, at 11:35 AM, Denzel Li wrote:

> Hello all:
> Does bioperl support functions for handling nexus files? More
> specifically, I need two functions: 1) combine multiple nexus files
> into one, 2) split a nexus file into multiple nexus files. For
> example, given the following two files (file1.nex, file2.nex), is
> there a function to combine them into one file as shown in
> "combinedFile.nex", or to split "combinedFile.nex" into two files
> (file1.nex, file2.nex)?
>
> ------------------------------
> # file1.nex
> begin data;
>  dimensions ntax=2 nchar=3
> b1 GGG
> b2 GGT
> ;end;
> ---------------------------------
> # file2.nex
> begin data;
>  dimensions ntax=2 nchar=3
> b1 AAA
> b2 AAT
> ;end;
> -------------------------------
>
> # combinedFile.nex
> begin data;
>  dimensions ntax=2 nchar=6
> [alignment from file1.nex]
> b1 GGG
> b2 GGT
> [alignment from file2.nex]
> b1 AAA
> b2 AAT
> ;end;
>
> begin sets;
> charset a1=1-3;
> charset a2=4-6;
> end;
> --------------------------------
> Any suggestion is highly appreciated. Thank you.
>
> Best,
> Denzel
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org


