[Bioperl-l] SeqIO & multi-line fastq
Joel Martin
j_martin at lbl.gov
Fri Nov 7 22:45:34 UTC 2008
Hello,
Multi-line fastq seems broken by design: '@' is both a valid quality
character and the record id delimiter, so a wrapped quality line can
start with '@' and look like the start of a new record. The script that
ships with maq for converting fastq to fasta can't parse the multi-line
fastq that maq itself outputs, so I'd say it's maq that's wrong.
I did this to parse them, but wasn't sure enough about /^\+/ to
suggest it for bioperl.
while (<$fh>) {
    if (/^@(\S+)/) {                # read name
        print ">$1\n";
        my $lines = 0;
        while ( <$fh> ) {           # read sequence
            if ( ! (/^\+/) ) {      # stop at '+' line
                print;
                $lines++;
            }
            else {
                last;
            }
        }
        while ( $lines-- ) {        # skip quals
            <$fh>;
        }
    }
}
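
The loop above skips as many quality lines as it printed sequence lines,
so it relies on both blocks being wrapped at the same width. A minimal
sketch of a variant that counts characters instead of lines is below. It
is only an illustration, not what Bio::SeqIO::fastq does:
next_fastq_record is a hypothetical helper name, $fh is assumed to be an
already opened filehandle, and the quality string is assumed to have
exactly one character per base.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): read one record from a
# possibly multi-line fastq stream.
# Returns (id, sequence, quality), or an empty list at end of file.
sub next_fastq_record {
    my ($fh) = @_;

    # Skip blank lines between records, then expect an '@' header.
    my $header;
    while ( defined( $header = <$fh> ) ) {
        last if $header =~ /\S/;
    }
    return unless defined $header;
    $header =~ /^@(\S+)/
        or die "Expected '\@' header line, got: $header";
    my $id = $1;

    # Sequence lines run up to the '+' separator. '+' is not a
    # sequence symbol, so /^\+/ is unambiguous in this block.
    my $seq      = '';
    my $saw_plus = 0;
    while ( defined( my $line = <$fh> ) ) {
        $line =~ s/\s+\z//;
        if ( $line =~ /^\+/ ) { $saw_plus = 1; last; }
        $seq .= $line;
    }
    die "Record '$id' has no '+' line" unless $saw_plus;

    # Quality lines: read until we have one character per base.
    # A leading '@' or '+' here is quality data, not a delimiter.
    my $qual = '';
    while ( length($qual) < length($seq) ) {
        my $line = <$fh>;
        die "Record '$id' is truncated in the quality block"
            unless defined $line;
        $line =~ s/\s+\z//;
        $qual .= $line;
    }
    die "Record '$id': quality and sequence lengths differ"
        if length($qual) != length($seq);

    return ( $id, $seq, $qual );
}
```

Called in a loop such as

    while ( my ($id, $seq, $qual) = next_fastq_record($fh) ) { print ">$id\n$seq\n" }

it writes one FASTA record per read, with the sequence joined onto a
single line rather than wrapped as in the input.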
Joel
On Fri, Nov 07, 2008 at 03:59:07PM -0500, Tristan Lefebure wrote:
> Hi there,
>
> I'm parsing with SeqIO a FastQ file made by MAQ. SeqIO complains because
> this is a multiline fastq file. By looking at the Bio::SeqIO::fastq,
> it's pretty obvious that it can't handle multilines. Who is wrong? MAQ,
> SeqIO, or am I missing something?
>
> Some more details below:
>
> ###
> [tristan at trudy maq_easyrun] seq2seq.pl cns.fq fastq cns.fna fasta
>
> ------------- EXCEPTION -------------
> MSG: AACTATTTATCAAATTTAAAATTCAACGAAAAACAAAGCAAAGCAGATCTTTTAGTTTTT
> doesn't match fastq descriptor line type
> STACK
> Bio::SeqIO::fastq::next_seq /usr/local/share/perl/5.10.0/Bio/SeqIO/fastq.pm:113
> STACK toplevel /home/tristan/bin/seq2seq.pl:25
> -------------------------------------
> ###
>
> The fastq file looks like that:
> -----------
> @nctc11168
> atgAATCCAAGCCAAATACTTGAAAATTTAAAAAAAGAATTAAGTGAAAACGAATACGAA
> AACTATTTATCAAATTTAAAATTCAACGAAAAACAAAGCAAAGCAGATCTTTTAGTTTTT
> AATGCTCCAAATGAACTCATGGCTAAATTCATACAAACAAAATACGGCAAAAAAATCGCG
> CATTTTTATGAAGTGCAAAGCGGAAATAAAGCCATCATAAATATACAAGCACAAAGTGCT
> AAACAAAGCAACAAAAGCACAAAAATCGACATAGCTCATATAAAAGCACAAAGCACGATT
> TTAAATC[...]
> [some 20000 lines later]
> AACCTTTTTTTATAAAATTTAAGATAAAATTTATACATTATGCAAAATTTAAAGAGAgat
> n
> +
> EQWWZ`cffilmu~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> ~~~~~[...]
> ---------
>
> Thanks!
>
> -Tristan