[Bioperl-l] Bio::SeqIO::new possible weirdness

Peter van Heusden pvh at egenetics.com
Wed Jan 28 15:45:29 EST 2004


Jason Stajich wrote:

>On Wed, 28 Jan 2004, Donald G. Jackson wrote:
>
>>Personally, I like the fall-back but agree that $ARGV[0] shouldn't be it.
>>I'd suggest STDIN - if somebody calls new without a file/handle I think
>>they're more likely to be reading.  OTOH, guessing format would be tough.
>>
>
>The format guessing tries to read off the top of the file, I think. We
>support a 'peek' type of reading into the file via the _pushback
>functionality in Root::IO.
>
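To make the 'peek' idea concrete, here is a minimal sketch of guessing a
format from the first line and then pushing that line back so the real
parser still sees it. PeekReader, next_line, pushback and guess_format are
names invented for illustration; this is not the actual Root::IO code, and
the three signatures checked are only examples.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the pushback idea; not the actual Root::IO code.
package PeekReader;

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh, buffer => [] }, $class;
}

# Return the next line, serving pushed-back lines before reading the handle.
sub next_line {
    my ($self) = @_;
    return shift @{ $self->{buffer} } if @{ $self->{buffer} };
    my $fh = $self->{fh};
    return scalar <$fh>;
}

# Put a line back so that the next call to next_line() returns it again.
sub pushback {
    my ($self, $line) = @_;
    unshift @{ $self->{buffer} }, $line if defined $line;
    return;
}

package main;

# Guess a format from the first line, then restore that line.
# Only three illustrative signatures are checked.
sub guess_format {
    my ($reader) = @_;
    my $line = $reader->next_line;
    return unless defined $line;
    $reader->pushback($line);
    return 'fasta'   if $line =~ /^>/;
    return 'genbank' if $line =~ /^LOCUS\s/;
    return 'embl'    if $line =~ /^ID\s/;
    return;
}
```

Because the peeked line goes into a buffer rather than relying on seek(),
the same approach should also work on STDIN or a pipe.
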
>I would like to see something like this go into Root::IO rather than in
>SeqIO - and have Root::IO give back a filename if it knows what it is.
>
>Also, the Root::IO code could do something like this:
> $file = "-" unless defined $file;
> open my $fh, $file or die $!;   # two-arg open: "-" means STDIN
>
>Which will then read from stdin if no filename is sent in - right now we
>don't really support that anymore because it was causing clog-ups in some
>of the DB::GFF code/tests I think.
>
>Maybe we localize this to 'FormattedReaderWriters' -- all the
>XXXIO(-format => 'XXX') modules so as to avoid the problems Lincoln saw.
>
Can you point me to where Lincoln "saw" this problem? The BioPerl mailing
list archive is not searchable, and searching via Google doesn't turn
anything up.

Anyway, I'll look into Root::IO tomorrow and see what I come up with.
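
As a rough illustration of the stricter behaviour proposed in the original
message (quoted below), the following sketch resolves the input source from
-file and -fh and throws instead of falling back to $ARGV[0]. resolve_input
is a hypothetical helper, not part of Bio::SeqIO; it covers the reading
case only and uses plain croak rather than BioPerl's typed exceptions.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper, not part of Bio::SeqIO: pick the input source from
# named arguments and refuse to guess when none is usable.
sub resolve_input {
    my (%args) = @_;
    my ($file, $fh) = @args{qw(-file -fh)};

    croak "Specify only one of -file or -fh"
        if defined $file && defined $fh;

    if (defined $fh) {
        # A closed handle, or something that is not a handle at all,
        # has no file descriptor.
        my $fd = eval { fileno($fh) };
        croak "-fh is not an open filehandle" unless defined $fd;
        return $fh;
    }

    if (defined $file) {
        open my $in, '<', $file or croak "Cannot read '$file': $!";
        return $in;
    }

    # No fallback to $ARGV[0] or STDIN: fail loudly instead.
    croak "Either -file or -fh must be specified";
}
```

With this, a call such as resolve_input(-file => $path) where $path was
never set dies with "Either -file or -fh must be specified" rather than
quietly reading whatever happens to be the first command line argument.
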

Peter

>
>>At the very least a warning would be appropriate, perhaps indicating the
>>course of action.
>>
>>For xml handlers we can check the dtd and throw an error.  I will modify
>>my SeqIO::tinyseq::tinyseqHandler to do so.
>>
>>Don Jackson
>>
>>
>>
>>Peter van Heusden wrote:
>>
>>>My review of the Bio::SeqIO::new method shows the following behaviour:
>>>
>>>Missing both -file and -fh arguments: falls back to using $ARGV[0]
>>>(the first command line argument) as sequence filename. If this fails,
>>>gives an exception about "Unknown format".
>>>-file argument (without -fh argument):
>>>  * given, but file unreadable: throws exception
>>>  * undefined: reads $ARGV[0], as above.
>>>-fh argument (without -file argument):
>>>  * given, but not a filehandle: gives exception
>>>  * given, but an invalid filehandle (not open): gives exception
>>>  * undefined: reads $ARGV[0], as above.
>>>-format argument: if the sequence file doesn't correspond to the given
>>>format, some parsers give an error (e.g. EMBL), while others do not
>>>(GenBank), instead silently give wrong results.
>>>-format argument without -file argument: Silently creates a SeqIO
>>>object which writes to STDOUT.
>>>
>>>I don't think that this $ARGV[0] shortcut should be in there - it
>>>causes unnecessary potential confusion. Imagine a situation where -fh
>>>or -file is specified (using a variable), but that variable somehow
>>>does not get defined. In that case, the $ARGV[0] fallback behaviour
>>>would be used, which might lead to a non-obvious error behaviour.
>>>
>>>I'd like to propose that either -file or -fh should be specified,
>>>otherwise an exception is thrown. While I'm about it, I'm thinking of
>>>migrating the exceptions to the new 'typed exceptions' that BioPerl
>>>now provides - is there any consensus on exception type names?
>>>
>>>Peter
>>>_______________________________________________
>>>Bioperl-l mailing list
>>>Bioperl-l at portal.open-bio.org
>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
>
>--
>Jason Stajich
>Duke University
>jason at cgt.mc.duke.edu
>