[Biopython] consensus for forward and reverse reads from a sequencing run

Ivan Gregoretti ivangreg at gmail.com
Tue Feb 25 03:34:24 UTC 2014


Hello Leo,

Besides pandaseq, also consider FLASH from the Salzberg lab.
http://ccb.jhu.edu/software/FLASH/

I've been using it for over a year without problems. I wish there were
a Biopython tool though.
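
For a quick check on a handful of reads, something along these lines
might be enough. It is only a minimal sketch, not a replacement for
FLASH or pandaseq: merge_pair and min_overlap are names I made up, it
only accepts exact overlaps (no mismatches, no quality scores), and it
assumes the reverse read is already in the same orientation as the
forward read, as in your example. Real MiSeq reverse reads would
normally need to be reverse-complemented first.

```python
def merge_pair(forward, reverse, min_overlap=3):
    """Join two reads on their longest exact suffix/prefix overlap.

    Hypothetical helper: returns the merged sequence, or None when the
    end of `forward` and the start of `reverse` share fewer than
    `min_overlap` identical bases.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest = min(len(forward), len(reverse))
    # Try the longest candidate overlap first and shrink from there.
    for n in range(longest, min_overlap - 1, -1):
        if forward[-n:] == reverse[:n]:
            # Keep the forward read whole and append the non-overlapping tail.
            return forward + reverse[n:]
    return None


# The pair from Leo's message.
print(merge_pair("AATCGTCGGTTACTCTG", "CTCTGAGGGAGAGATC"))
```

With the two reads from your message this prints
AATCGTCGGTTACTCTGAGGGAGAGATC. Pairs with no sufficiently long exact
overlap come back as None, so they can be counted or written to a
separate file.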

Cheers,

Ivan



Ivan Gregoretti, PhD
Bioinformatics



On Mon, Feb 24, 2014 at 9:21 PM, Willis, Jordan R
<jordan.r.willis at vanderbilt.edu> wrote:
> Hi Leo,
>
> I know this is not what you asked, and I'm not sure whether Biopython has a module for it, but I would really recommend pandaseq (https://github.com/neufeld/pandaseq). It's written in C, so it's much faster than Python, and it could hardly be simpler to use. I typically use it for HiSeq and MiSeq runs; it just requires the forward and reverse paired-end reads and outputs a consensus (with PHRED scores if you want).
>
> Jordan
>
> On Feb 24, 2014, at 7:59 PM, Leo Alexander Hansmann <leo2 at stanford.edu> wrote:
>
> Hi,
> I'm getting two FASTA files from an Illumina MiSeq run. One contains the forward reads, the other the reverse reads. The records in the two files correspond, meaning the first sequence in the forward read file should pair with the first sequence in the reverse read file. The two sequences overlap in the middle by a varying number of nucleotides. How can I get Python or Biopython to generate a file with the consensus sequence for each read pair? For example:
> sequence in the forward read file: AATCGTCGGTTACTCTG
> corresponding line in the reverse read file: CTCTGAGGGAGAGATC
> I want: AATCGTCGGTTACTCTGAGGGAGAGATC
> Thank you so much!
> Leo
>
>
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython



