[Bioperl-l] SeqIO: paired end reads

Wed Aug 3 21:17:01 UTC 2011

It depends on the assembler. For Illumina, the paired ends usually share the same ID and end with /1 and /2. It also depends on whether you are using interleaved paired reads or two separate files. Some assemblers just expect the paired reads to be mated by virtue of being in the same order in the two files. The ABySS and Velvet manuals both explain what is expected, so you will want to check what Newbler's assumptions are about how the paired ends are encoded.
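
A rough sketch of the two layouts described above, in plain Python rather than Bio::SeqIO. It only illustrates the /1 and /2 naming with interleaved versus same-order separate files. The function names, the example IDs and sequences, and the constant placeholder quality character are all assumptions made for illustration, and none of this is confirmed to be what Newbler expects.

```python
def fastq_record(read_id, seq, qual_char="I"):
    """Format one FASTQ record. The constant quality string is a placeholder."""
    return f"@{read_id}\n{seq}\n+\n{qual_char * len(seq)}\n"


def pair_records(pair_id, fwd_seq, rev_seq):
    """Return the two mates of a pair, sharing one ID with /1 and /2 suffixes."""
    return (
        fastq_record(f"{pair_id}/1", fwd_seq),
        fastq_record(f"{pair_id}/2", rev_seq),
    )


def interleaved(pairs):
    """One file: each /1 record is immediately followed by its /2 mate."""
    return "".join(r1 + r2 for r1, r2 in (pair_records(*p) for p in pairs))


def separate_files(pairs):
    """Two files: mates are matched by appearing in the same order in each."""
    records = [pair_records(*p) for p in pairs]
    return (
        "".join(r1 for r1, _ in records),
        "".join(r2 for _, r2 in records),
    )


if __name__ == "__main__":
    # Hypothetical pairs: (shared ID, forward read, reverse read).
    example = [("pairA", "ACGT", "TTGA"), ("pairB", "GGCC", "AATT")]
    print(interleaved(example), end="")
```

Note that neither layout records the distance between the mates. In these formats that information is not part of the read file, so how it is supplied is something to look up in each assembler's manual.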

There are simulator tools, if that is what you are trying to do in the end.
Check out wgsim, which comes with samtools, or try dnaa.

On Aug 3, 2011, at 1:01 PM, Lee Katz wrote:

> Hi all!  I was wondering how to construct paired end reads from scratch.  I
> know the locations of certain sequences across the genome with a high degree
> of confidence and so I want to give them to my assembler as paired end
> reads, along with my other sequence runs (454 and Illumina runs).  I plan to
> use Newbler.
> 
> My only problem is that I do not know the correct format in order to specify
> distance and sequences for a paired end reads run, and so I hope that there
> is a SeqIO solution.  At the least, I hope that one bioperl member can point
> me to where the definition of the paired end reads file format is...?
> 
> Thank you!
> 
> --Lee