[Biopython-dev] Sequential SFF IO
Peter Cock
p.j.a.cock at googlemail.com
Thu Jan 27 13:32:45 UTC 2011
On Wed, Jan 26, 2011 at 11:24 PM, Kevin wrote:
> On Wed, Jan 26, 2011 at 2:44 PM, Peter wrote:
>>
>> I'm currently looking at trimming 5' and 3' PCR primer sequences -
>> which could equally be used for barcodes etc. I'd probably wrap this
>> as a Galaxy tool (using Biopython).
>>
>
> I have 90% of such a tool written. I use a banded Smith-Waterman
> alignment to match barcodes and generic PCR adapters/consensus
> sequence to ensure that adapters and barcodes can be detected at
> both ends of reads.
Interesting - and yes, we do seem to have similar aims here. I have
been doing ungapped alignments, allowing 0 or 1 (maybe in future 2)
mismatches, and am working on getting this running at a reasonable speed.
Gapped alignments would be particularly important in 454 reads
with homopolymer errors, but most barcodes and PCR primers
will avoid homopolymer runs so I don't expect this to be a common
problem in this use case. Do you have good reasons to go to the
expense of a gapped alignment?
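
To make the ungapped approach concrete, here is a minimal sketch in plain
Python of matching a 5' primer while allowing a small number of mismatches.
It is only an illustration: the function names (count_mismatches,
find_primer, trim_five_prime) are hypothetical and not part of Biopython,
and it uses a simple sliding window rather than anything tuned for speed.

```python
def count_mismatches(a, b, limit):
    """Count mismatching positions between two equal-length strings.

    Stops early once the count exceeds `limit`, since the caller only
    cares whether the window is within the allowed mismatch budget.
    """
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > limit:
                break
    return mismatches


def find_primer(seq, primer, max_mismatches=1):
    """Return the leftmost offset of an ungapped primer match, or -1.

    A match is any window of len(primer) letters differing from the
    primer at no more than `max_mismatches` positions (no gaps allowed).
    """
    seq = seq.upper()
    primer = primer.upper()
    for offset in range(len(seq) - len(primer) + 1):
        window = seq[offset:offset + len(primer)]
        if count_mismatches(window, primer, max_mismatches) <= max_mismatches:
            return offset
    return -1


def trim_five_prime(seq, primer, max_mismatches=1):
    """Remove everything up to and including the 5' primer, if found."""
    offset = find_primer(seq, primer, max_mismatches)
    if offset == -1:
        return seq  # primer not found, leave the read untouched
    return seq[offset + len(primer):]
```

With Biopython the same offset could be used to slice a SeqRecord, which
keeps the per-letter qualities in step with the trimmed sequence. Note that
an insertion or deletion inside the primer (e.g. a homopolymer error) would
shift the remaining letters and typically exceed the mismatch budget, which
is the case a gapped alignment would catch.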
Peter