[Bioperl-l] Parsing needle/water output

shalabh sharma shalabh.sharma7 at gmail.com
Fri May 15 14:13:50 EDT 2009


Hi Mahmut,                   Thanks a lot, actually this is exactly what i
was looking for. Its really helpful.
One thing more, is there any way i can get the full id of the sequence from
the water output.
 like ->
         My sequece id is: JCVI_READ_1103769852490

and the water output gives something like this :

 4206170-42077    409 CCATGCCGCGTGTATGAAGAAGGCCTTCGG   458

        |||  .|||     ||.|          ||||||| |||.||||| ||.|

JCVI_READ_110      1 CCA--ACGC-----TGCA----------GGGTTGT-AAG   31

So when i run the script i get output:
                                                      JCVI_READ_110     1
 1078

Instead of getting JCVI_READ_1103769852490 i am getting   just JCVI_READ_110

Thanks
Shalabh

On Fri, May 15, 2009 at 1:45 PM, Mahmut Uludag <uludag at ebi.ac.uk> wrote:

>
> > > > yes, i tried to read the documentation about
> > > > Bio::AlignIO::emboss, but there
> > > > is not much in it.So is like i can call the same functions
> > > > which are used in
> > > > searchIO. I need start and end position of the pairwise alignment.
>
> Hi Shalabh,
>
> I copied below an example script that prints start and end position of
> sequences used for constructing pairwise alignment by EMBOSS 'water'
> program. After your last email this became an unnecessary example but i
> thought it would be useful for people with similar questions in the
> future.
>
> Regards,
> Mahmut
>
>
> use Bio::Factory::EMBOSS;
> use Bio::Seq;
> use Bio::AlignIO;
>
> my $aseq = Bio::Seq->new( -id => 'seq1', -seq => 'AACATGTAGGGATAG' );
> my $bseq = Bio::Seq->new( -id => 'seq2', -seq => 'GCATGTTTAGATAG' );
> my $factory      = new Bio::Factory::EMBOSS;
> my $water        = $factory->program("water");
> my $wateroutfile = 'out.water';
>
> $water->run(
>        {
>                -asequence => $aseq,
>                -bsequence => $bseq,
>                -gapopen   => '10.0',
>                -gapextend => '0.5',
>                -outfile   => $wateroutfile
>        }
> );
>
> my $alnin = new Bio::AlignIO(
>        -format => 'emboss',
>        -file   => $wateroutfile
> );
>
> while ( my $aln = $alnin->next_aln ) {
>        # process the alignment -- Bio::SimpleAlign objects
>        foreach $seq ( $aln->each_seq() ) {
>                print "\n" . $seq->display_id;
>                print " " . $seq->start . " " . $seq->end;
>        }
>        print "\n" . $aln->percentage_identity;
>        print "\n" . $aln->consensus_string(50) . "\n";
> }
>
>
>



More information about the Bioperl-l mailing list