[Biopython-dev] WIT and KEGG
Cayte
katel at worldpath.net
Sun Aug 12 01:52:22 EDT 2001
----- Original Message -----
From: "Tarjei S Mikkelsen" <tarjei at genome.wi.mit.edu>
> I'm not too fond of adding this to the format file. HTML markup isn't
> part of the KEGG format description, so this seems a bit ad hoc.
>
> Instead I suggest that you either run the input through
> File.SGMLHandle or File.SGMLStripper before you pass the
> WIT record to KEGG.Enzyme.Parser OR write a separate Parser
> class in your WIT module that wraps a ParserSupport.SGMLStrippingConsumer
> around KEGG.Enzyme._Consumer.
>
The problem is that I'm experimenting with a filter to strip out junk (not
necessarily HTML) between records.
The motivation is that I've had Martel fail on just an extraneous line feed.
Somehow the idea of chaining two filters together trips a watch-for-bugs
alarm in my mind.
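
A minimal sketch of the single-pass filter idea, not the Martel or
Biopython code. It assumes that each record starts with an ENTRY line and
ends with a "///" line, and that markup can be removed with a crude regular
expression. The names clean_records and TAG_RE are hypothetical.

```python
import re

# Crude markup matcher (an assumption for illustration, not a real SGML parser).
TAG_RE = re.compile(r"<[^>]+>")


def clean_records(lines):
    """Yield each record as a list of lines, dropping junk between records.

    Assumes a record begins with a line starting with "ENTRY" and ends
    with a line that is exactly "///" once markup is stripped.
    """
    record = None  # None means we are currently between records
    for raw in lines:
        line = TAG_RE.sub("", raw).rstrip()
        if record is None:
            # Between records: blank lines, stray text and bare tags are skipped.
            if line.startswith("ENTRY"):
                record = [line]
        else:
            record.append(line)
            if line == "///":
                yield record
                record = None
```

Markup removal and inter-record cleanup happen in one loop, so there is
only one filter to debug. Each yielded record could then be joined with
newlines and handed to the parser.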
> > The format failed halfway through the file. I think the problem is the
> > order of entries. The format specifies GENES before MOTIF but
> > this order is
> > reversed in the test file. Maybe the format should be less sensitive to
> > order, where it doesn't convey information.
>
> Yeah, the entries are supposed to come in a specified order, but even
> the KEGG people don't follow that rule. I've committed a change to
> KEGG.Enzyme.enzyme_format.py that assumes very little about entry
> ordering. If that's the error, it should work for you now.
>
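
A rough sketch of what "assuming very little about entry ordering" could
look like in plain Python. This is not the committed Martel format. It
assumes the keyword sits in the first 12 columns of a line, that
continuation lines leave those columns blank, and that "///" ends the
record. The name group_fields is hypothetical.

```python
def group_fields(record_lines):
    """Collect a record's values by keyword, whatever order the keywords come in.

    Assumes the keyword occupies the first 12 columns and that
    continuation lines leave those columns blank.
    """
    fields = {}
    current = None  # keyword that continuation lines belong to
    for line in record_lines:
        if not line.strip():
            continue  # ignore blank lines inside the record
        if line.strip() == "///":
            break  # end-of-record marker
        keyword, value = line[:12].strip(), line[12:].strip()
        if keyword:
            current = keyword
        if current is None:
            continue  # continuation line before any keyword: nothing to attach to
        fields.setdefault(current, []).append(value)
    return fields
```

Because the values land in a dictionary keyed by keyword, a record with
GENES before MOTIF and one with MOTIF before GENES give the same result.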
Now it's stopping on files with db links like this example:
PIR: B49338 B49935 E64239 KIECAA
These are quibbles, but the computer doesn't understand quibbles :).
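
For illustration, a small sketch of a pattern that accepts a db link line
with several identifiers after one database name. Whether the multiple
identifiers are what actually stops the format is a guess. This is a plain
regular expression rather than a Martel expression, and parse_dblink and
DBLINK_RE are hypothetical names.

```python
import re

# One database name, a colon, then one or more whitespace-separated identifiers.
DBLINK_RE = re.compile(r"^\s*(?P<db>[\w-]+):\s+(?P<ids>\S+(?:\s+\S+)*)\s*$")


def parse_dblink(line):
    """Split a db link line into (database, [identifiers])."""
    match = DBLINK_RE.match(line)
    if match is None:
        raise ValueError("not a db link line: %r" % line)
    return match.group("db"), match.group("ids").split()


db, ids = parse_dblink("PIR: B49338 B49935 E64239 KIECAA")
# db is "PIR"; ids is ["B49338", "B49935", "E64239", "KIECAA"]
```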
Cayte
> Tarjei
>
>