[BioPython] ClustalwAlignment Object

Kevin Berney kevin_berney at yahoo.com
Fri Dec 17 13:52:13 EST 2004


so there is the __str__() function, which reproduces the
output file.  i don't know of any easy way to reduce this to
just the aligned region ...

if you really want to dive in, you can look in
__init__.py under the Clustalw folder in the python
source.  under: 

class ClustalAlignment(Alignment):

and

        def __str__(self):


is something that you can play with.



that being said, when i want to reproduce a subset of
my alignment, i usually run a script similar to yours
below.
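
for reference, here is a minimal sketch of that kind of script. it works
on plain strings rather than on the ClustalAlignment object, so it does
not depend on any Biopython call. the function names (star_span,
gap_span, trim) and the sample data are made up for illustration, and it
assumes the star line has the same length as the aligned rows.

```python
# Sketch only: operates on plain strings, not on Biopython objects.
# All names below are hypothetical helpers, not part of any library.

def star_span(star_line):
    """Return (start, end) covering the first to the last '*' in the
    conservation line, with end exclusive, or None if there is no '*'."""
    start = star_line.find('*')
    if start == -1:
        return None
    end = star_line.rfind('*') + 1  # +1 so the last matching column is kept
    return start, end


def gap_span(aligned_short):
    """Return (start, end) of the shorter aligned sequence once leading and
    trailing '-' are ignored, with end exclusive, or None if it is all gaps."""
    start = len(aligned_short) - len(aligned_short.lstrip('-'))
    end = len(aligned_short.rstrip('-'))
    if start >= end:
        return None
    return start, end


def trim(rows, span):
    """Cut every aligned row down to the columns in span."""
    start, end = span
    return [row[start:end] for row in rows]


# Made-up example data: a long sequence, a short one padded with gaps,
# and the matching star line.
rows = ["ACGTTGCAAGGT",
        "---TTGCTAG--"]
stars = "   **** **  "

for row in trim(rows, star_span(stars)):
    print(row)
```

the two span functions correspond to the two ideas in this thread: the
first uses the star line, the second strips the gaps around the short
sequence. on the sample data both give the same region.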


--- Martina <boehme at mpiib-berlin.mpg.de> wrote:

> Thanks to both of you for your quick answers!
> It is working fine, but to print out the match there
> must be a simpler 
> solution, no?
> 
> start = alignment._star_info.find('*')
> end   = alignment._star_info.rfind('*')
> all_records = alignment.get_all_seqs()
> match = all_records[0].seq
> # end+1 so that the column of the last '*' is included
> vars = [ match[i] for i in range (start,end+1) ]
> print "".join(vars)
> match = all_records[1].seq
> vars = [ match[i] for i in range (start,end+1) ]
> print "".join(vars)
> 
> Thanks again!
> Martina
> 
> Frank Kauff wrote:
> 
> > Hi Martina,
> > 
> > I had exactly the same problem, and couldn't find any ready-made
> > solution for this. If your unaligned short sequence doesn't already
> > have gaps at beginning or end (which can make it hard...), you could
> > easily check for the start and end of matching regions with
> > something like
> > 
> > alignment=Clustalw.do_alignment(cline)
> > recs=alignment.get_all_seqs()
> > # if the long sequence was the first one
> > longsequence=recs[0].seq.tostring()
> > shortsequence=recs[1].seq.tostring()
> > # index of the first and last non-gap column of the short sequence
> > matchstart=len(shortsequence)-len(shortsequence.lstrip('-'))
> > matchend=matchstart+len(shortsequence.strip('-'))-1
> > 
> > 
> > Hope that helps,
> > Frank
> > 
> > On Fri, 2004-12-17 at 05:16, Martina wrote:
> > 
> >>Hi,
> >>
> >>When aligning 2 sequences with clustalw you get a ClustalAlignment
> >>Object. Because I'm aligning a long one with a shorter sequence, I'm
> >>only interested in the matching region. Are there any methods to get
> >>the position of the first and last match and then do something like:
> >>ClustalAlignment.get_part_of_alignment(start_position, end_position)?
> >>Of course I could parse the *.aln files, but is there a simpler
> >>solution? I'm aware of AlignInfo - but that seems to be only more
> >>sophisticated stuff.
> >>I'm new to Python, so I might miss some basics here.
> >>
> >>Thanks.
> >>Martina
> 

