[Biopython-dev] Alignment object

Brad Chapman chapmanb at 50mail.com
Tue Mar 2 15:03:08 UTC 2010


Peter and Kevin;

> >> My code does not (yet) attempt to deal with next-gen sequencing
> >> alignments, 
[...]
> >> Perhaps until this is settled, it would be premature to merge my
> >> alignment class to the trunk. After all, we may need to tweak the
> >> alignment object class hierarchy.

My vote would be to merge what you've done for handling
standard multiple alignments, and then look at next-generation read
representation as an analogous but separate problem. The SeqRecord
objects that are useful for drilling in on multiple alignments are
likely going to be memory hogs for any real-world next-gen work.

> > I'm just jumping in here and have not yet read all of the background
> > material.  However, I am working with next-gen alignments and am
> > curious as to what you have in mind.  At first glance, it sounds like
> > you want to access aligned reads in a 'pileup' format (i.e., an object
> > model akin to http://samtools.sourceforge.net/pileup.shtml).  Or are
> > you thinking of something different entirely?

This is a good way to go. SAM is an emerging standard that people
are adopting, and samtools and the pysam module do a good job of
dealing with SAM/BAM files:

http://code.google.com/p/pysam/

pysam exposes a pileup-style API for sorted and indexed BAM files
and scales well to large alignment files:

http://wwwfgu.anat.ox.ac.uk/~andreas/documentation/samtools/api.html

This is a good starting point for providing interoperability with
Biopython; it would be great to re-use what we can from these
projects.
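
For anyone not familiar with the pileup view, here is a rough sketch
of the idea in plain Python. It is not pysam's API: the Read record
and the pileup() function are hypothetical names made up for
illustration, reads are assumed to be ungapped, and everything is
held in memory, which is exactly what an indexed BAM file lets you
avoid.

```python
from collections import defaultdict, namedtuple

# Hypothetical minimal read record (not a Biopython or pysam class):
# read name, 0-based start position on the reference, and aligned bases.
Read = namedtuple("Read", ["name", "start", "seq"])


def pileup(reads):
    """Yield (ref_pos, [(read_name, base), ...]) for each covered column.

    Assumes ungapped reads, so base i of a read sits at start + i.
    """
    columns = defaultdict(list)
    for read in reads:
        for offset, base in enumerate(read.seq):
            columns[read.start + offset].append((read.name, base))
    # Report columns in reference order, as a pileup does.
    for pos in sorted(columns):
        yield pos, columns[pos]


reads = [
    Read("r1", 0, "ACGT"),
    Read("r2", 2, "GTTA"),
    Read("r3", 3, "TT"),
]

for pos, column in pileup(reads):
    # Position, depth of coverage, and the bases seen in that column.
    print(pos, len(column), "".join(base for _, base in column))
```

The point is that each column only needs the bases covering one
reference position, rather than a full SeqRecord per read.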

Brad


