[Bioperl-l] Next-gen modules

Chris Fields cjfields at illinois.edu
Fri Jul 24 08:19:42 EDT 2009


On Jul 24, 2009, at 4:28 AM, Peter wrote:

> On Thu, Jul 23, 2009 at 11:58 PM, Chris Fields <cjfields at illinois.edu> wrote:
>>> i.e. Something like this four line Biopython script would be  
>>> perfect:
>>> http://biopython.org/wiki/Reading_from_unix_pipes
>>
>> We use named parameters so it's a little more verbose.
>>
>> use Bio::SeqIO;
>> my $in  = Bio::SeqIO->new(-fh => \*STDIN, -format => 'fastq-sanger');
>> my $out = Bio::SeqIO->new(-format => 'fastq-solexa');
>> while (my $seq = $in->next_seq) { $out->write_seq($seq) }
>
> Thanks. So that implicitly uses STDOUT for the output?

Yes.

>> Don't be surprised if there are still bugs lurking about, just let me
>> know and I'll fix 'em.
>
> Have you guys (BioPerl) also gone for "fastq-sanger" instead of
> just "fastq" for the Sanger Standard version of FASTQ (like EMBOSS)?
> Does BioPerl use just "fastq" to mean anything?

Short answer: yes, and yes.

Slightly longer answer: I've set up SeqIO so it converts
"new(-format => 'foo-bar')" to "new(-format => 'foo', -variant => 'bar')".
In the fastq constructor, if the variant is expected but isn't defined
(i.e. for plain 'fastq') it defaults to sanger.  That makes maintenance a
bit easier if a new variant pops up.
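
Roughly, the idea looks like the sketch below. This is a standalone
illustration, not the actual SeqIO code: split_format() is a made-up helper
name, and lowercasing the format string is an assumption made for the
example.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of Bio::SeqIO): split a 'format-variant'
# string into its two parts, falling back to a default variant when none
# is given.
sub split_format {
    my ($spec, $default_variant) = @_;

    # Split on the first hyphen only, so 'foo-bar' gives ('foo', 'bar').
    my ($format, $variant) = split /-/, lc($spec), 2;

    # Plain 'fastq' carries no variant, so use the default.
    $variant = $default_variant
        unless defined $variant && length $variant;

    return ($format, $variant);
}

for my $spec (qw(fastq fastq-sanger fastq-solexa)) {
    my ($format, $variant) = split_format($spec, 'sanger');
    print "$spec => format=$format, variant=$variant\n";
}
```

With 'sanger' as the default, 'fastq' and 'fastq-sanger' resolve to the same
format/variant pair, which is what keeps plain 'fastq' working.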

> If BioPerl and EMBOSS are using "fastq-sanger", I think Biopython will
> have to support that as an alias too:
> http://lists.open-bio.org/pipermail/biopython-dev/2009-July/006416.html
>
> Thanks,
>
> Peter

It's consistent with the 'format-variant' usage, but 'fastq' for us is  
backwards-compatible, so we'll likely support both.

chris



