[Bioperl-l] fastq parsing problem

Sat May 9 01:45:18 UTC 2009

Hi Michael--
Can you send along the exception? The record you sent seems to 
parse as advertised in the debugger (as long as the last newline,
the one that breaks up the string of %'s, is not really there).
thanks, Mark
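
For anyone who wants to repeat that check outside the debugger, here is a
minimal sketch. It is an illustration under stated assumptions, not the
fastq.pm code path: split_entry is a hypothetical helper, both records are
made-up toy data (not the maq record), and the pattern is the one quoted in
the original message with the mail-wrapped lines rejoined.

```perl
use strict;
use warnings;

# Pattern as quoted from fastq.pm in the original message
# (wrapped lines rejoined; nothing else changed).
my $fastq_re = qr/^
    \@?(.+?)\n
    ([^\@]*?)\n
    \+?(.+?)\n
    (.*)\n
/xs;

# Hypothetical helper: returns the four captures,
# or an empty list when the entry does not match.
sub split_entry {
    my ($entry) = @_;
    return $entry =~ $fastq_re;
}

# Toy record; the quality line starts with '@', like the problem record.
my $intact  = "\@read1\nACGTNACGTN\n+\n\@,AB=>%%%%\n";

# Same record with a stray newline inside the quality string.
my $wrapped = "\@read1\nACGTNACGTN\n+\n\@,AB=>%\n%%%\n";

for my $case ([intact => $intact], [wrapped => $wrapped]) {
    my ($label, $entry) = @$case;
    my ($top, $seq, $top2, $qual) = split_entry($entry);
    unless (defined $top) {
        print "$label: no match\n";
        next;
    }
    printf "%-8s seq length=%d  qual length=%d\n",
        $label, length($seq), length($qual);
}
```

With these toy records the intact entry gives a 10-character quality string
to go with its 10 bases. The wrapped entry still matches, but because /s lets
. cross newlines, the third capture swallows the first part of the quality
line and only the trailing %%% lands in the fourth, so the lengths no longer
agree.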
----- Original Message ----- 
From: "Michael Muratet" <mmuratet at hudsonalpha.org>
To: <bioperl-l at lists.open-bio.org>; <maq-help at lists.sourceforge.net>
Sent: Friday, May 08, 2009 3:29 PM
Subject: [Bioperl-l] fastq parsing problem


> Greetings
> 
> I've got a problem parsing fastq output from the maq aligner. The  
> parser is throwing an exception for the following record:
> 
> @HWI-EAS146:3:1:2:177#0/1
> CTCCGCTNNCTTCTCAGCTTTCTTGTAGGCGATAGACTTCCCGAGCCTANCCAGAGCAACGAGCNTNNNGNNNNTN
> +
> @,AB=>-&&:5).;+*=<*8?%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
> %%%%%
> 
> I looked up the line in fastq.pm that does the parsing:
> 
>    my ($top,$sequence,$top2,$qualsequence) = $entry =~ /^
>                                              \@?(.+?)\n
>                                              ([^\@]*?)\n
>                                              \+?(.+?)\n
>                                              (.*)\n
>                                            /xs
> 
> I don't consider myself a regex-pert, but I would interpret the above  
> as "put everything after one or zero @ characters on the first line in  
> $top; then put anything that is not @ on the second line in $sequence;  
> then everything after one or zero + characters on the third line in  
> $top2; then everything on the fourth line in $qualsequence; and don't  
> be greedy".
> 
> It seems like the fastq record above should parse with these rules. I  
> note that the @ character is escaped in the regex and appears in  
> several of the problem records, but not all. Has anyone come across  
> this before? I don't see this exact problem in the list archives.
> 
> Thanks
> 
> Mike