[Bioperl-l] How do you create a Bio::Align::AlignI object?

Josh Lauricha laurichj at bioinfo.ucr.edu
Fri Oct 31 12:15:19 EST 2003


On Fri 10/31/03 15:58, Wes Barris wrote:
> Hi,
> 
> I need to convert the contents of an ACE file into a series of MSF files
> (one for each contig).  Since BioPerl does not offer a straight-forward
> conversion process for this task, I have to manually read each contig
> section of the ACE file and manually create the structures for an
> AlignI object.  The only way I have found to create an AlignI object
> was by opening an input stream to an existing file.  As a result, I
> had to write the consensus to a file just so that I could open it again
> and gain access to an AlignI object.
> 
> Here is my code.  Is there a slicker way to do this?

Without reading through your code, the way to create and AlignI
implementing object is useing Bio::SimpleAlign.

> 
> acetomsf.pl:
> 
> #!/usr/local/bin/perl -w
> #
> #
> use strict;
> use Bio::Assembly::IO;
> use Bio::AlignIO;
> use Bio::SeqIO;
> #
> my $usage = "Usage: $0 <infile.ace>\n";
> my $infile = shift or die $usage;
> my $prefix = 'btcn';
> my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
> my $assembly = $io->next_assembly;
> 
> foreach my $contig ($assembly->all_contigs()) {
>    my $contigName = $prefix.$contig->id;
> #  $contig->match();    # not yet implemented
> #
> # Write the consensus to a file.
> #
>    my $consensusSeq = new Bio::Seq(
>                         -seq=>$contig->get_consensus_sequence->seq,
>                         -id=>$prefix.$contig->id);
>    my $seqout = new Bio::SeqIO(-file=>">$contigName.fa", -format=>'fasta');
>    $seqout->write_seq($consensusSeq);
> #
> # Snarf the consensus back in so that we can create a Bio::Align::AlignI 
> object.
> #
>    my $instream  = new Bio::AlignIO(-format=>'fasta', 
>    -file=>"$contigName.fa");
>    my $outstream = new Bio::AlignIO(-format=>'msf', 
>    -file=>">$contigName.msf");
>    my $aln = $instream->next_aln();
> #
> # Loop through each sequence in the contig adding it to the new alignment.
> #
>    foreach my $seq ($contig->each_seq) {
> #
> # Some of the ACE starting locations are negative.
> #
>       if ($contig->get_seq_coord($seq)->start <= 0) {
>          my $offset = -$contig->get_seq_coord($seq)->start + 2;
>          $seq->seq($seq->subseq($offset,$seq->length));
>          print($seq->primary_id," was clipped at the beginning by 
>          $offset\n");
>          }
> #
> # Pad each sequence so it aligns with the consensus.
> #
>       my $before = $contig->get_seq_coord($seq)->start - 1;
>       my $after = $contig->get_consensus_length - 
>       $contig->get_seq_coord($seq)->end;
>       my $alignedSequence = '-' x $before . $seq->seq . '-' x $after;
> #
> # Some of the ACE ending locations go beyond the consensus.
> #
>       if (length($alignedSequence) > $contig->get_consensus_length) {
>          $alignedSequence = substr($alignedSequence, 1 
>          ,$contig->get_consensus_length);
>          print($seq->primary_id," was clipped at the end\n");
>          }
>       $seq->seq($alignedSequence);
>       $aln->add_seq($seq);
>       }
>    $aln->set_displayname_flat;
>    $outstream->write_aln($aln);
>    undef $outstream;
>    }
> 
> 
> -- 
> Wes Barris
> E-Mail: Wes.Barris at csiro.au
> 
> 
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 

-- 

----------------------------
| Josh Lauricha            |
| laurichj at bioinfo.ucr.edu |
| Bioinformatics, UCR      |
|--------------------------|


More information about the Bioperl-l mailing list