[Biopython-dev] KEGG support
Renato Alves
rjalves at igc.gulbenkian.pt
Wed Feb 10 19:44:59 EST 2010
From Peter on 02/10/2010 10:27 PM:
> Excellent news. Have you looked at the existing KEGG parsers in
> Biopython, and do you think the current style is suitable? (I haven't
> looked at the code recently myself, but will do).
The style seems good enough, but I was thinking of taking a more
functional approach, at least for the parser, to get away from the
massive if/elif/else cascades. The writer would be a second priority
and would follow a similar design, although I would also try to keep
code duplication lower than what we see in the Enzyme/__init__.py
file. I would also consider using Genes.py instead of
Genes/__init__.py ... I don't see the need for packages here.
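To illustrate the kind of design that avoids an if/elif/else cascade, here is a minimal sketch of a dispatch-table parser. It is not the code proposed in this thread. It assumes the usual KEGG flat-file layout (keyword in the first 12 columns, continuation lines indented, records ending with `///`), and the handler names, the `HANDLERS` table and the dict-based record are all hypothetical.

```python
def _parse_entry(record, data):
    # The first token on the ENTRY line is taken as the identifier.
    record["entry"] = data.split()[0]


def _parse_name(record, data):
    # Names are comma separated and may continue on further lines.
    record["name"].extend(n.strip() for n in data.split(",") if n.strip())


def _parse_definition(record, data):
    # Continuation lines are joined with a single space.
    record["definition"] = (record["definition"] + " " + data).strip()


# Hypothetical dispatch table: one small function per keyword,
# instead of one branch per keyword in a long if/elif/else chain.
HANDLERS = {
    "ENTRY": _parse_entry,
    "NAME": _parse_name,
    "DEFINITION": _parse_definition,
}


def new_record():
    return {"entry": "", "name": [], "definition": ""}


def parse(lines):
    """Yield one dict per record from an iterable of flat-file lines."""
    record = new_record()
    keyword = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("///"):
            # End-of-record marker.
            yield record
            record = new_record()
            keyword = None
            continue
        if not line.strip():
            continue
        key = line[:12].strip()
        if key:
            keyword = key  # a new field starts; otherwise it is a continuation
        handler = HANDLERS.get(keyword)
        if handler is not None:
            handler(record, line[12:].strip())
```

Supporting a new keyword then means adding one function and one table entry; keywords without a handler are skipped.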
> Regarding the SeqIO interface (for KEGG GENES only?), I would be
> happy to advise. Initially I suggest you work on adding a parser much
> like the other KEGG parsers, returning gene records. Then we can
> add a Bio/SeqIO/KeggGeneIO.py wrapper to turn these into SeqRecord
> objects.
Yes, for now my main goal is GENES. The other formats can probably
grow from there. Your suggestion on SeqIO seems reasonable. I'll try
to have a prototype in the next few days or over the weekend, and we
can discuss from there.
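The two-layer plan (a parser that returns gene records, then a thin SeqIO-style wrapper on top) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `GeneRecord` stands in for whatever the parser would return, `SimpleSeqRecord` stands in for Biopython's SeqRecord (mirroring only its `seq`, `id`, `name` and `description` attributes) so that the example runs without Biopython, and `kegg_gene_iterator` is a made-up name.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class GeneRecord:
    """Hypothetical output of a KEGG GENES parser."""
    entry: str
    name: List[str] = field(default_factory=list)
    definition: str = ""
    ntseq: str = ""


@dataclass
class SimpleSeqRecord:
    """Stand-in for Bio.SeqRecord.SeqRecord (same attribute names only)."""
    seq: str
    id: str
    name: str
    description: str


def kegg_gene_iterator(genes: Iterable[GeneRecord]) -> Iterator[SimpleSeqRecord]:
    """Wrap parsed gene records as sequence records, one at a time."""
    for gene in genes:
        yield SimpleSeqRecord(
            seq=gene.ntseq,
            id=gene.entry,
            # Fall back to the entry identifier when no gene name is given.
            name=gene.name[0] if gene.name else gene.entry,
            description=gene.definition,
        )
```

Keeping the wrapper this thin means the format-specific logic stays in the parser, and the wrapper only maps fields.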
> I have not used SOAP, and have a personal preference for REST style
> APIs. However, if that is what KEGG offers, this is worth considering.
> I think Brad has some experience with (other) SOAP services in Python.
> Note the KEGG documentation suggests using SOAPpy for Python.
The http://www.genome.jp/kegg/docs/weblink.html page does mention a
REST-like URL scheme for generic entries, pathways and BRITE, but it
seems more useful for external linking than as an API. I couldn't
even figure out how to get the information back as plain text instead
of the default HTML. As for SOAPpy, I have nothing against it, except
that I ran into a few problems when I first tried it. Anyway, that was
a long time ago... I've only played with suds since.
> Interestingly, KEGG are however looking into providing RDF (and
> perhaps one day SPARQL endpoints). I will try and find out what sort
> of time scale they have in mind while I am at the BioHackathon 2010
> this week - http://hackathon3.dbcls.jp/
We'll be waiting on your feedback on this :)
> For now, I would prioritise the KEGG flat file parsers.
Agreed.
> Peter