[Bioperl-l] Merging separate sequence and quality files to FASTQ ?

Chris Fields cjfields at illinois.edu
Thu Dec 3 13:47:32 UTC 2009


Dan,

On Dec 3, 2009, at 7:07 AM, Dan Bolser wrote:

> 2009/12/3 Peter <biopython at maubp.freeserve.co.uk>:
>> On Thu, Dec 3, 2009 at 11:44 AM, Dan Bolser <dan.bolser at gmail.com> wrote:
>>> Hi, can someone test the script here on zero length fasta / qual files?
>>> 
>>> http://www.bioperl.org/wiki/Merging_separate_sequence_and_quality_files_to_FASTQ
>>> 
>>> It seems the output has an extra newline in the sequence part of the
>>> output, which throws off scripts that rely on the 'four lines per
>>> record' structure of the FASTQ (although I'm not sure if it's illegal
>>> FASTQ).
>> 
>> Hi Dan,
>> 
>> The OBF consensus was FASTQ records with a zero length
>> sequence might be useful, and should be output as exactly
>> four lines (one blank sequence line, one blank quality line).
>> However for parsing, any number of blank lines should be OK.
>> http://lists.open-bio.org/pipermail/open-bio-l/2009-July/000522.html
>> 
>> I can confirm the perl script currently outputs a FASTQ file
>> with TWO blank lines for the sequence, giving five lines in
>> total for the zero length record. That does suggest a bug.
>> What version of BioPerl are you running?
> 
> Hi Peter,
> 
> Basically, I'm not running the 'latest' version of BP, which is why I
> asked this question of the list rather than filing a bug report. What
> version are you running? ;-)
> 
> Sounds like 5 lines instead of the expected 4 is a minor bug. (Thanks
> for the info).

FASTQ parsing underwent a major revision prior to 1.6.1 (the latest release on CPAN).  Basically, it now parses all three FASTQ variants.  However, Peter indicates there may still be a problem, and it's likely he's running 1.6.1.  Peter, can you confirm that?
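
To make the four-line convention discussed above concrete, here is a minimal sketch in Python. It is not the wiki script and not BioPerl's implementation: the names read_records and merge_to_fastq are hypothetical, and it assumes Sanger-style quality encoding (Phred score + 33) and that both inputs list their records in the same order. It keeps the description after the identifier and writes exactly four lines per record, including for a zero-length sequence.

```python
from itertools import zip_longest


def read_records(text):
    """Yield (header, body_lines) for each '>' record in FASTA/QUAL text.

    Blank lines are skipped, so a zero-length record yields an empty list.
    """
    header, body = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, body
            header, body = line[1:], []
        elif line and header is not None:
            body.append(line)
    if header is not None:
        yield header, body


def merge_to_fastq(fasta_text, qual_text, offset=33):
    """Merge FASTA and QUAL text into FASTQ text (assumed Phred+33)."""
    out = []
    pairs = zip_longest(read_records(fasta_text), read_records(qual_text))
    for fasta_rec, qual_rec in pairs:
        if fasta_rec is None or qual_rec is None:
            raise ValueError("FASTA and QUAL have different record counts")
        (fa_header, seq_lines), (q_header, q_lines) = fasta_rec, qual_rec
        # Compare identifiers only (first word of each header).
        if fa_header.split()[:1] != q_header.split()[:1]:
            raise ValueError(f"identifier mismatch: {fa_header!r} vs {q_header!r}")
        seq = "".join(seq_lines)
        scores = [int(tok) for line in q_lines for tok in line.split()]
        if len(seq) != len(scores):
            raise ValueError(f"length mismatch for {fa_header!r}")
        quality = "".join(chr(score + offset) for score in scores)
        # Always four lines: header (with description), sequence, '+', quality.
        out.append(f"@{fa_header}\n{seq}\n+\n{quality}\n")
    return "".join(out)


if __name__ == "__main__":
    fasta = ">r1 first read\nACGT\n>empty zero length\n\n"
    qual = ">r1 first read\n30 30 40 40\n>empty zero length\n\n"
    print(merge_to_fastq(fasta, qual), end="")
```

With the two-record input above, the zero-length record comes out as a header line, one blank sequence line, a '+' line, and one blank quality line, so a downstream script that counts four lines per record stays in step.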

>> Peter
>> 
>> P.S. The script is throwing away any description after the
>> identifier.
> 
> That's probably bad. Feel free to edit the script on the wiki. Sadly,
> MediaWiki's diff features are less than optimal, so developing scripts
> on the wiki isn't ideal. Anyone know how to plug git-hub into a script
> apparently hosted on a wiki?
> 
> Or is git-hub basically designed to be 'wiki for code'?

It's more an integrated solution for hosting code via git, with a wiki, bug queue, etc.  Think SourceForge, but a lot nicer and with no ads ;>

BitBucket/Hg is another (very nice) solution along the same lines, developed in Python (GitHub is Ruby-centric).

> I'm wondering, because with the FlaggedRevs extension you could
> basically build a whole release in the wiki. Which would be fun if
> nothing else!

I'm not following you there.  Could you elaborate on why that would be beneficial?  I could see (

chris





