[Biopython-dev] Bio.SeqIO.convert function?

Peter biopython at maubp.freeserve.co.uk
Sat Aug 8 07:14:18 EDT 2009


On Wed, Jul 29, 2009 at 8:43 AM, Peter<biopython at maubp.freeserve.co.uk> wrote:
> On Tue, Jul 28, 2009 at 11:09 PM, Brad Chapman<chapmanb at 50mail.com> wrote:
>> Extending this to AlignIO and TreeIO as Eric suggested is
>> also great.
>
> Whatever we do for Bio.SeqIO, we can follow the same pattern
> for Bio.AlignIO etc.
>
>> So +1 from me,
>> Brad
>
> And we basically had a +0 from Michiel, and a +1 from Eric.
> And I like the idea but am not convinced we need it. Maybe
> we should put the suggestion forward on the main discussion
> list for debate?

I've stuck a branch up on github which (thus far) simply defines
the Bio.SeqIO.convert and Bio.AlignIO.convert functions.
Adding optimised code can come later.

http://github.com/peterjc/biopython/commits/convert

Right now (based on the other thread), I've experimented
with making the convert functions accept either handles
or filenames. This will make the convert function even
more of a convenience wrapper, in addition to its role as a
standardised API to allow file format specific optimisations.
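
To make the handle-or-filename idea concrete, here is a minimal
sketch. It is not the code on the branch. The names
_open_if_filename and fasta_to_tab are hypothetical, as is this
convert signature, and the toy converter stands in for the real
parse/write machinery so the example runs without Biopython:

```python
import io


def _open_if_filename(target, mode):
    """Return (handle, opened_here) for a filename or an open handle.

    Hypothetical helper: only plain str filenames are treated as paths.
    """
    if isinstance(target, str):
        return open(target, mode), True
    return target, False


def convert(in_file, out_file, transform):
    """Hypothetical convenience wrapper, not the Biopython function.

    transform(in_handle, out_handle) does the real work and returns
    the number of records written.
    """
    in_handle, close_in = _open_if_filename(in_file, "r")
    try:
        out_handle, close_out = _open_if_filename(out_file, "w")
        try:
            return transform(in_handle, out_handle)
        finally:
            # Only close handles this function opened itself.
            if close_out:
                out_handle.close()
    finally:
        if close_in:
            in_handle.close()


def fasta_to_tab(in_handle, out_handle):
    """Toy converter: FASTA in, one 'title<TAB>sequence' line per record out."""
    count = 0
    title, chunks = None, []
    for line in in_handle:
        line = line.strip()
        if line.startswith(">"):
            if title is not None:
                out_handle.write(title + "\t" + "".join(chunks) + "\n")
                count += 1
            title, chunks = line[1:].strip(), []
        elif line and title is not None:
            chunks.append(line)
    if title is not None:
        out_handle.write(title + "\t" + "".join(chunks) + "\n")
        count += 1
    return count


# Handles passed in by the caller are left open for the caller to manage.
source = io.StringIO(">seq1\nACGT\nTTAA\n>seq2\nGGCC\n")
target = io.StringIO()
records_written = convert(source, target, fasta_to_tab)
```

The same call works with filenames in place of the StringIO
objects, in which case the wrapper opens and closes the files itself.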

Taking handles and/or filenames does rather complicate
things, and not just for remembering to close the handles.
There are issues like should we silently replace any existing
output file (I went for yes), and should the output file be
deleted if the conversion fails part way (I went for no)?
Dealing with just handles would free us from all these
considerations.
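
As a rough illustration of those two decisions, the sketch below
makes each policy an explicit argument. The function name and both
flags (overwrite, cleanup_on_error) are hypothetical and exist only
for this example. The defaults mirror the choices described above:
replace an existing file silently, and leave a partial file behind
if writing fails.

```python
import os


def write_output(path, lines, overwrite=True, cleanup_on_error=False):
    """Write lines to path and return how many were written.

    Hypothetical helper. With overwrite=True the file is opened in
    "w" mode, which silently truncates any existing file. With
    overwrite=False it is opened in "x" mode, which raises
    FileExistsError instead of replacing the file.
    """
    # Open outside the try block so that a FileExistsError never
    # triggers the cleanup below and deletes someone else's file.
    handle = open(path, "w" if overwrite else "x")
    count = 0
    try:
        with handle:
            for line in lines:
                handle.write(line)
                count += 1
    except Exception:
        # The handle is already closed here, so removal is safe.
        if cleanup_on_error:
            os.remove(path)
        raise
    return count
```

With the defaults, a failure part way through leaves whatever was
already written on disk, which can help when debugging a bad record
but means the caller cannot assume an existing output file is complete.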

You could even consider using Python's temporary file support
to write the file to a temp location, and only at the end move
it to the desired location. However, that is getting far too
complicated for my liking (and may run into permissions
issues on Unix). If anyone wants to do this, they can do it
explicitly in the calling script.
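
For anyone who does want that behaviour in their own script, one
possible shape is sketched below using only the standard library.
The function name write_via_temp is hypothetical. It assumes the
temporary file can be created in the same directory as the final
file, so the caller needs write permission there, and it relies on
os.replace, which needs Python 3.3 or later.

```python
import os
import tempfile


def write_via_temp(path, lines):
    """Write lines to a temp file, then move it over path on success.

    Hypothetical helper. If writing fails, the temp file is removed
    and any existing file at path is left untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # delete=False because the file must survive being closed so it
    # can be moved. Note the temp file is created with restrictive
    # permissions, and the moved file keeps them.
    tmp = tempfile.NamedTemporaryFile(
        "w", dir=directory, suffix=".tmp", delete=False
    )
    count = 0
    try:
        with tmp:
            for line in lines:
                tmp.write(line)
                count += 1
        # Same directory, so this is a rename rather than a copy.
        os.replace(tmp.name, path)
    except Exception:
        if os.path.exists(tmp.name):
            os.remove(tmp.name)
        raise
    return count
```

Keeping the temp file next to the destination avoids a move across
filesystems, at the cost of needing write access to that directory.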

How does this look so far?

Peter

