[Bioperl-l] genpept/swiss

Ewan Birney birney@ebi.ac.uk
Mon, 4 Sep 2000 15:52:01 +0100 (GMT)


On Mon, 4 Sep 2000 hilmar.lapp@pharma.novartis.com wrote:

> 
> 
> 
> 
> This describes exactly my situation in which I have to read in data in
> all sorts of different formats (and people's interpretations of these
> formats).
> The problem now is that BioPerl throws a warning if a sequence does not
> comply 100% with the standards and exits. While at that moment I want to
> 
>      You mean it throws an exception. (Issuing a warning shouldn't cause an
>      exit.)

I am all for throwing warnings in reading formats...

> 
> be able to say that he can ignore the warning if (e.g.) he has read the
> sequence correctly.
> 
>      Does this sound like a call for a callback a client program can
>      provide? The question then is what should be passed to the callback
>      routine? The sequence object as it has been constructed so far? Sounds
>      fragile, and may be useless in many cases. The complete offending
>      source record? Would discard the parse done so far (for the callback),
>      and would require a partial rewrite of the parsers because they read
>      line-by-line (at least most if not all of the rich format parsers).
> 
> Something that would be really nice to have is a more modular approach
> in which it would be easy to say:  'this data is in a format which is
> EMBL, with the following quirks, additional fields, ... '.
> 
>      Yes. But this needs a careful design of how can you split up the parse
>      of a sequence record into subtasks that are a) fairly independent (and
>      can thus be overridden by your QuirkyEMBL parser), and b) common to
>      all (rich) formats. Anyone's done any work in this direction so far?
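
To make the split into overridable subtasks concrete, here is a minimal sketch in Python rather than Perl. Everything in it is an assumption made for illustration: the class names, the `handle_XX` method convention, and the `XQ` line code are hypothetical and are not part of bioperl or biopython. The base parser dispatches each two-letter line code to its own small method, so a quirky variant only overrides the pieces that differ.

```python
class EmblLikeParser:
    """Hypothetical parser: one small, overridable method per line code."""

    def parse(self, lines):
        record = {"id": None, "description": [], "unhandled": []}
        for line in lines:
            # EMBL-style lines: two-letter code, three spaces, then the value.
            code, value = line[:2], line[5:].rstrip()
            handler = getattr(self, "handle_" + code, None)
            if handler is None:
                # Unknown line codes are kept rather than treated as fatal.
                record["unhandled"].append(line.rstrip())
            else:
                handler(record, value)
        return record

    def handle_ID(self, record, value):
        record["id"] = value.split(";")[0].strip()

    def handle_DE(self, record, value):
        record["description"].append(value)


class QuirkyEmblParser(EmblLikeParser):
    """'EMBL, with the following quirks': only the differences live here."""

    def handle_ID(self, record, value):
        # Assumed quirk: ID fields separated by whitespace instead of ';'.
        record["id"] = value.split()[0]

    def handle_XQ(self, record, value):
        # Assumed extra field that the standard format does not define.
        record["quirk"] = value
```

The same idea would carry over to Perl with method overriding in a subclass; the hard part raised above, finding subtasks common to all rich formats, is not addressed by this sketch.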


The biopython guys are big into this sort of multi-layered parser. I
am impressed at what they are doing, but not motivated enough to put this
into bioperl: there are more important things to go in first.

I think making specific subclasses which the parsers build
(Bio::Seq::Swiss etc.), giving us places to put swissprot-specific stuff,
would solve 50% of these sorts of problems.
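
A rough sketch of what I mean, written in Python only to keep it short. The class names `Seq` and `SwissSeq`, the function `parse_swiss`, and the handful of line codes handled are all hypothetical stand-ins, not existing bioperl code. The parser builds a format-specific subclass, which has a place for swissprot-only data, and issues a warning for lines it does not understand instead of throwing an exception.

```python
import warnings


class Seq:
    """Generic sequence object (hypothetical stand-in for a base class)."""

    def __init__(self, seq_id, residues):
        self.id = seq_id
        self.residues = residues


class SwissSeq(Seq):
    """Format-specific subclass: a home for swissprot-only fields."""

    def __init__(self, seq_id, residues):
        super().__init__(seq_id, residues)
        self.keywords = []


def parse_swiss(lines):
    """Parse one simplified swissprot-style record into a SwissSeq."""
    seq_id, keywords, residues, in_seq = None, [], [], False
    for line in lines:
        code = line[:2]
        if in_seq and code == "  ":
            # Sequence lines are indented and may contain spaces between blocks.
            residues.append(line.replace(" ", "").strip())
        elif code == "ID":
            seq_id = line[5:].split()[0]
        elif code == "KW":
            parts = (k.strip(" .") for k in line[5:].split(";"))
            keywords.extend(k for k in parts if k)
        elif code == "SQ":
            in_seq = True
        elif code == "//":
            break
        else:
            # Non-fatal: the caller decides whether to care about this.
            warnings.warn(f"skipping unrecognised line: {line.rstrip()!r}")
    entry = SwissSeq(seq_id, "".join(residues))
    entry.keywords = keywords
    return entry
```

A caller who knows the sequence was read correctly can filter or ignore the warnings, while a strict caller can turn them into errors.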



> 
> 
>           Hilmar
> 
> 
> 
> 
> 

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------