[Biopython-dev] Newbler ACE file to SAM?

Nick Loman n.j.loman at bham.ac.uk
Tue Aug 17 16:35:45 UTC 2010


Kevin Jacobs <jacobs at bioinformed.com> wrote:
> I'm stuck with the ACE conversion for exactly the same reason.  The 
> consensus and reads are  gapped for multiple alignments so that there 
> are no mismatches at all.  I will have to recompute the Smith-Waterman 
> alignments of each read against the ungapped consensus in order to 
> produce SAM/BAM output.  I'm surprised that the 
> pairwise alignments for the de novo assembly are so problematic.  My 
> understanding was that they were pairwise against the consensus 
> contigs and would be exactly what you'd want for SAM/BAM. 
> Unfortunately, I'm mainly dealing with human data and don't have 
> any direct examples to know for sure.  I can re-process some of our 
> EBV data with the de novo aligner and see what can be done.

Hi Kevin

I was expecting it to be similar to the gsMapper output but it isn't. 
When you supply -pt to gsAssembler (to request that 454PairAlign.txt 
be output), each pair in the file relates to reads from the original 
SFF files, not the contigs. I guess this makes sense, as it probably 
represents a stage of the de novo assembly process (an all-against-all 
pairwise comparison of the reads).

I guess I can get around this by running gsMapper against the assembly 
using the SFF files as a second stage, and then using Newbler2SAM on 
this instead, but I was kind of hoping to avoid this (as I would expect 
it to give slightly different results).

Another possible workaround is to use GAP5 from the Staden package; 
I understand it can read ACE and output SAM.
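
A third thought, untested on real Newbler output: since the padded read 
and the padded consensus in an ACE contig share the same columns, it may 
be possible to read a CIGAR string straight off the pads rather than 
realigning. A rough sketch, assuming you already have the two padded 
strings cut to the same column range. The function names are my own 
(hypothetical, not part of Biopython or Newbler), and clipping and 
reverse-complemented reads are ignored:

```python
from itertools import groupby

PAD = "*"  # ACE files mark alignment pads with '*'


def padded_to_cigar(padded_consensus, padded_read):
    """Build a CIGAR string from a read and consensus that share padded columns.

    Hypothetical helper: both strings must cover the same column range.
    """
    if len(padded_consensus) != len(padded_read):
        raise ValueError("consensus and read must span the same columns")
    ops = []
    for cons_base, read_base in zip(padded_consensus, padded_read):
        if cons_base == PAD and read_base == PAD:
            continue  # column exists only because of some other read
        if cons_base == PAD:
            ops.append("I")  # read has a base the consensus lacks
        elif read_base == PAD:
            ops.append("D")  # consensus has a base the read lacks
        else:
            ops.append("M")  # aligned base (match or mismatch)
    # Run-length encode consecutive identical operations.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


def unpadded_position(padded_consensus, padded_start):
    """Convert a 1-based padded start into a 1-based unpadded position."""
    return padded_start - padded_consensus[:padded_start - 1].count(PAD)


if __name__ == "__main__":
    print(padded_to_cigar("AC*GT*A", "ACTG**A"))
    print(unpadded_position("A*CG", 3))
```

The unpadded position would go in the SAM POS column, measured against 
the consensus with its pads stripped out.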

Cheers,

Nick




