[BioPython] ClustalwAlignment Object

Fri Dec 17 09:18:15 EST 2004

Hi Martina,

I had exactly the same problem and couldn't find a ready-made solution
for it. If your unaligned short sequence doesn't already have gaps at
its beginning or end (which would make this harder), you can find the
start and end of the matching region with something like

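from Bio import Clustalw

# cline is the clustalw command line you have already set up
alignment = Clustalw.do_alignment(cline)
recs = alignment.get_all_seqs()
# if the long sequence was the first one, the short one is the second
longsequence = recs[0].seq.tostring()
shortsequence = recs[1].seq.tostring()
# number of leading gaps = column index where the short sequence starts
matchstart = len(shortsequence) - len(shortsequence.lstrip('-'))
# column index of the last residue of the short sequence (inclusive)
matchend = matchstart + len(shortsequence.strip('-')) - 1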
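
To actually cut out the matching region, you can slice both aligned
strings with those two indices. Below is a minimal sketch on plain
strings, so it runs without Biopython or clustalw. The function name
match_bounds and the two example rows are made up for illustration, and
it assumes '-' is the gap character and that both rows have the same
length, as they do in an alignment.

```python
def match_bounds(aligned_short, gap="-"):
    """Return (start, end), inclusive column indices of the non-gap region."""
    stripped = aligned_short.strip(gap)
    if not stripped:
        raise ValueError("sequence contains only gaps")
    # Leading gaps tell us where the short sequence starts in the alignment.
    start = len(aligned_short) - len(aligned_short.lstrip(gap))
    # Internal gaps stay inside 'stripped', so they count as columns.
    end = start + len(stripped) - 1
    return start, end


# Made-up aligned rows standing in for the two sequences of the alignment.
long_row = "ACGTTGCAAGGCTTAACG"
short_row = "-----GCAAGGC------"

start, end = match_bounds(short_row)
# end is inclusive, so add 1 for Python slicing.
long_part = long_row[start:end + 1]
short_part = short_row[start:end + 1]
print(start, end, long_part, short_part)
```

Gaps inside the short sequence are kept as alignment columns, so the
two slices always have the same length.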

Hope that helps,
Frank

On Fri, 2004-12-17 at 05:16, Martina wrote:
> Hi,
> 
> When aligning 2 sequences with clustalw you get a ClustalAlignment 
> Object. Because I'm aligning a long one with a shorter sequence, I'm 
> only interested in the matching region. Are there any methods to get 
> the position of the first and last match and then do something like: 
> ClustalAlignment.get_part_of_alignment(start_position, end_position)?
> Of course I could parse the *.aln files, but is there a simpler 
> solution? I'm aware of AlignInfo, but that seems to cover only more 
> sophisticated analyses.
> I'm new to Python, so I might be missing some basics here.
> 
> Thanks.
> Martina
-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net/member/frankkauff.shtml