[Biopython-dev] Accessing ExPASy through Bio.SwissProt / Bio.SeqIO

Peter biopython at maubp.freeserve.co.uk
Fri Dec 7 10:46:32 UTC 2007


> To summarize, I rewrote the chapter on SwissProt/Prosite/Prodoc/ExPASy and
> put it here:
>
> http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html#htoc51
> (chapter 6 in the tutorial)
>
> This is merely a proposal on how this should work; none of this is in CVS
> yet. Please let us know if you have any objections.

I would add a note saying that doing it this way gives
Bio.SwissProt.SProt.Record objects, while you could alternatively get
SeqRecord objects as described in the SeqIO chapter (use a
cross-reference to that chapter).

> If there are no objections, I can upload the new code to CVS. That would
> conclude my work on Bio.WWW.ExPASy; the final (and biggest) part of my work
> on Bio.WWW will be to look at the various Biopython modules to interact with
> NCBI (Genbank, EUtils).

That will be "fun"!

> Two comments:
> 1) In this proposal, I am using SwissProt.parse instead of SeqIO.parse since
> the latter does not (yet) store all information contained in a SwissProt
> file. I'd be happy though to move to SeqIO.parse for SwissProt also once it
> does.
> 2) It may be nice to have a SwissProt.read and SeqIO.read to read and return
> exactly one record from the handle, in addition to parse() to create an
> iterator to read multiple records.

I'd suggested a Bio.SeqIO function, with a name like parse1() or
parse_sole(), which would return a single SeqRecord and raise an error
if the handle didn't contain exactly one record. We could call this
function read() if you prefer.
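
As a rough illustration of the "one and only one record" check, here is a
minimal sketch. It is not code from CVS: the name read_single() is
hypothetical, and it works on any iterator of records, such as the one
returned by a parse() function.

```python
def read_single(records):
    """Return the sole item from an iterable of records.

    Hypothetical helper for illustration only: raises ValueError if the
    iterable yields no records, or more than one record.
    """
    iterator = iter(records)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No records found in handle") from None
    try:
        next(iterator)
    except StopIteration:
        # Exactly one record was present, so return it.
        return first
    raise ValueError("More than one record found in handle")


# Possible usage (assumes Biopython is installed and that "example.dat"
# is a SwissProt file holding a single entry; both are assumptions):
#
#     from Bio import SeqIO
#     with open("example.dat") as handle:
#         record = read_single(SeqIO.parse(handle, "swiss"))
```

Wrapping the existing iterator like this would keep parse() unchanged, so
the single-record function is just a thin layer on top of it.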

Peter


