[Bioperl-l] About FASTQ parser
Abhishek Pratap
abhishek.vit at gmail.com
Thu Sep 17 18:16:33 UTC 2009
Hi Chris
I am just wondering whether the following is intentionally excluded from a
FASTQ record or is a bug.
After reading each record from a FASTQ file, the output of the
same record ( $out->write_seq($seq) ) is missing the text after
the + sign.
Eg:
@HWI-EAS397:1:1:11:252#NNNTNN/1
NACAATATCAATTAGAGGATTGCTTNGTTNAAGGNNTNGNTNNNANTNT
+
DNXPMXNYXMPVXZVTXYZ[[BBBBBBBBBBBBBBBBBBBBBBBBBBBB
PS: In our case we need the exact record to be printed out as we need
to split the fastq file into multiple fastq files based on the read
index in the @ Line. So exact output is needed to avoid conflicts with
downstream processing pipelines.
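
To make the requirement concrete, here is a minimal sketch of the kind of split I mean, in plain Perl without the BioPerl parser, so every line is passed through untouched. It assumes each record is exactly four lines (no wrapped sequence or quality lines) and that the index is the text between '#' and '/' on the @ line. The sub names read_index and bin_by_index are my own, not part of any module.

```perl
use strict;
use warnings;

# Extract the index tag between '#' and '/' in an Illumina-style header,
# e.g. '@HWI-EAS397:1:1:11:252#NNNTNN/1' gives 'NNNTNN'.
# Returns undef if the header does not follow that layout (an assumption
# about the header format, not something the parser guarantees).
sub read_index {
    my ($header) = @_;
    return $header =~ m{^\@.*#([^/\s]+)(?:/\d+)?\s*\z} ? $1 : undef;
}

# Group 4-line FASTQ records by index, keeping every line byte-for-byte,
# including whatever follows the '+' sign.
# Returns a hash ref mapping index => concatenated record text.
sub bin_by_index {
    my ($fh) = @_;
    my %bins;
    while (defined(my $header = <$fh>)) {
        my $seq  = <$fh>;
        my $plus = <$fh>;
        my $qual = <$fh>;
        die "Truncated FASTQ record starting at: $header"
            unless defined $qual;
        my $index = read_index($header);
        $index = 'unknown' unless defined $index;
        $bins{$index} .= $header . $seq . $plus . $qual;
    }
    return \%bins;
}

# Example usage (output file names are hypothetical):
#   open my $in, '<', 'mydata.fq' or die "Cannot open input: $!";
#   my $bins = bin_by_index($in);
#   for my $index (keys %$bins) {
#       open my $out, '>', "reads.$index.fq" or die "Cannot open output: $!";
#       print {$out} $bins->{$index};
#       close $out;
#   }
```

This holds all records in memory, which is only reasonable for small files. For a full lane it would be better to keep one open output filehandle per index and print each record as it is read.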
Thanks,
-Abhi
On Thu, Sep 17, 2009 at 12:39 AM, Chris Fields <cjfields at illinois.edu> wrote:
> Abhi,
>
> The FASTQ parser hasn't been released to CPAN yet. It is available via
> bioperl-live. We haven't added any code yet to the HOWTO's, but the
> SYNOPSIS example in Bio::SeqIO::fastq should be sufficient to get you
> started.
>
> Bio::Seq::Quality is the object returned via next_seq(); it can be queried
> for PHRED qual scores and other bits. If you want to split things up you
> should call next_seq(), then generate a FASTQ output stream in the variant
> you want:
>
> my $outfasta = Bio::SeqIO->new(-format => 'fastq-sanger',
>                                -file   => '>fasta.file');
> my $outqual  = Bio::SeqIO->new(-format => 'fastq-sanger',
>                                -file   => '>qual.file');
>
> while (my $seq = $in->next_seq) {
>     $outfasta->write_fasta($seq);
>     $outqual->write_qual($seq);
> }
>
> Note I haven't tested that yet, but it should work. Let me know if it
> doesn't.
>
> chris
>
> On Sep 16, 2009, at 3:13 PM, Abhishek Pratap wrote:
>
>> Hi Chris
>>
>> I remember seeing a recent email about the new BioPerl FASTQ parser. Is it
>> part of the BioPerl 1.6 distribution? I installed it and, based on the doc
>> here
>> (http://doc.bioperl.org/releases/bioperl-current/bioperl-live/Bio/SeqIO/fastq.html),
>> I am a bit lost.
>>
>> I see two methods there: using Bio::SeqIO::fastq and
>> Bio::Seq::Quality. Do both return the same data, with the latter
>> being faster?
>>
>> This is not meant to offend any developer, but small examples in the
>> HOWTOs help a lot.
>>
>> The current example (copied below) is not working. I guess it is based
>> on a previous version of code.
>>
>> # grabs the FASTQ parser, specifies the Illumina variant
>> my $in = Bio::SeqIO->new(-format => 'fastq-illumina',
>>                          -file   => 'mydata.fq');
>>
>>
>> My basic requirement is to read each record in a FASTQ file and split it
>> into header, read, and quality.
>>
>>
>> Thanks,
>> -Abhi
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>