[Bioperl-l] how to not count gaps in the multiple sequence alignment

Jun Yin jun.yin at ucd.ie
Wed Nov 2 12:29:45 EDT 2011


Hi,
 
You need to calculate the coordinates of the protein-coding gene in the alignment yourself, i.e. convert the ungapped reference positions into alignment column numbers. After that, you can use the slice function to get the alignment block for the selected gene, e.g.
 
$aln2 = $aln->slice(20, 30);
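
To get the column numbers to pass to slice, the ungapped reference coordinates have to be mapped onto the gapped reference row. Below is a minimal sketch of that mapping in plain Perl. The helper name residue_to_column, the toy sequence, and the gene coordinates are made up for illustration, and it assumes gaps are written as '-' or '.' and that the reference row starts at residue 1.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 1-based alignment column that holds the
# given 1-based ungapped position of a gapped sequence string.
# Assumes gap characters are '-' or '.'.
sub residue_to_column {
    my ($gapped, $residue_pos) = @_;
    my $seen = 0;    # number of non-gap characters passed so far
    for my $col (1 .. length $gapped) {
        my $char = substr($gapped, $col - 1, 1);
        next if $char eq '-' || $char eq '.';
        return $col if ++$seen == $residue_pos;
    }
    return;          # position lies beyond the end of the sequence
}

# Toy data: the gapped reference row and one gene's coordinates on the
# ungapped reference (both invented for this example).
my $reference = 'AT--GCC-GTA';
my ($gene_start, $gene_end) = (3, 6);

my $col_start = residue_to_column($reference, $gene_start);
my $col_end   = residue_to_column($reference, $gene_end);
print "columns $col_start-$col_end\n";    # prints "columns 5-9"

# With a Bio::SimpleAlign object in $aln, the block would then be:
#   my $aln2 = $aln->slice($col_start, $col_end);
```

Bio::SimpleAlign also documents a column_from_residue_number method that may do this mapping for you directly; it is worth checking the documentation of your installed BioPerl version before rolling your own.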
 
Cheers,
Jun


----- Original Message -----
From: wenbin mei <wenbinmei at gmail.com>
Date: Wednesday, November 2, 2011 5:51 am
Subject: [Bioperl-l] how to not count gaps in the multiple sequence alignment
To: bioperl-l at lists.open-bio.org

> Hi,
> 
> I need some help with coding. I have a multiple sequence alignment
> that contains gaps. The alignment includes a reference genome sequence
> for which I know the coordinates of all the protein-coding genes. I
> want to extract the alignments of all these protein-coding genes from
> the big alignment. I am using Bio::SimpleAlign, but because of the gaps
> the coordinates have shifted in the alignment. Is there a way to not
> count the gaps and still extract the protein alignments? One approach
> would be to remove the gaps in the reference first and then extract the
> sequence, but I don't like this way ... Thank you for your help.
> 
> -best,
> wenbin