[Bioperl-l] Parsing needle/water output
shalabh sharma
shalabh.sharma7 at gmail.com
Fri May 15 14:13:50 EDT 2009
Hi Mahmut, Thanks a lot, actually this is exactly what i
was looking for. Its really helpful.
One thing more, is there any way i can get the full id of the sequence from
the water output.
like ->
My sequece id is: JCVI_READ_1103769852490
and the water output gives something like this :
4206170-42077 409 CCATGCCGCGTGTATGAAGAAGGCCTTCGG 458
||| .||| ||.| ||||||| |||.||||| ||.|
JCVI_READ_110 1 CCA--ACGC-----TGCA----------GGGTTGT-AAG 31
So when i run the script i get output:
JCVI_READ_110 1
1078
Instead of getting JCVI_READ_1103769852490 i am getting just JCVI_READ_110
Thanks
Shalabh
On Fri, May 15, 2009 at 1:45 PM, Mahmut Uludag <uludag at ebi.ac.uk> wrote:
>
> > > > yes, i tried to read the documentation about
> > > > Bio::AlignIO::emboss, but there
> > > > is not much in it.So is like i can call the same functions
> > > > which are used in
> > > > searchIO. I need start and end position of the pairwise alignment.
>
> Hi Shalabh,
>
> I copied below an example script that prints start and end position of
> sequences used for constructing pairwise alignment by EMBOSS 'water'
> program. After your last email this became an unnecessary example but i
> thought it would be useful for people with similar questions in the
> future.
>
> Regards,
> Mahmut
>
>
> use Bio::Factory::EMBOSS;
> use Bio::Seq;
> use Bio::AlignIO;
>
> my $aseq = Bio::Seq->new( -id => 'seq1', -seq => 'AACATGTAGGGATAG' );
> my $bseq = Bio::Seq->new( -id => 'seq2', -seq => 'GCATGTTTAGATAG' );
> my $factory = new Bio::Factory::EMBOSS;
> my $water = $factory->program("water");
> my $wateroutfile = 'out.water';
>
> $water->run(
> {
> -asequence => $aseq,
> -bsequence => $bseq,
> -gapopen => '10.0',
> -gapextend => '0.5',
> -outfile => $wateroutfile
> }
> );
>
> my $alnin = new Bio::AlignIO(
> -format => 'emboss',
> -file => $wateroutfile
> );
>
> while ( my $aln = $alnin->next_aln ) {
> # process the alignment -- Bio::SimpleAlign objects
> foreach $seq ( $aln->each_seq() ) {
> print "\n" . $seq->display_id;
> print " " . $seq->start . " " . $seq->end;
> }
> print "\n" . $aln->percentage_identity;
> print "\n" . $aln->consensus_string(50) . "\n";
> }
>
>
>
More information about the Bioperl-l
mailing list