[Biopython-dev] questions for next release

Mon Dec 18 14:51:10 EST 2000

On Sat, 16 Dec 2000, Andrew Dalke wrote:

> Jeff:
> > Do you want to develop it as part of the biopython CVS (and
> > release), or do you want it to be a dependency that's bundled
> > and installed together?
> 
> I would rather it be the latter.  I expect people will want
> to use Martel independent of the other biopython code.

Sounds reasonable.  One thing, though: Martel currently ships with a few
database formats.  I do want the ones used by biopython to be kept in the
biopython CVS repository, so that developers with read/write access to
biopython can work on them.  I don't think biopython should depend on
format definitions that live in Martel.

> I can move the development repository to biopython.org.  My concerns
> with that are two-fold.  First, I haven't figured out how to connect
> to my ISP from under Linux, so I don't have a direct connection to the
> rest of the world.  That makes it hard to talk to CVS.

Either way doesn't make a difference to me.

> >- Are we going to use/bundle mxTextTools 1.2?
> 
> Martel should work fine with 1.1.1 or 1.2, so the first is
> not a concern.  I'll ask Marc-Andre about his release plans.

Thanks!

> >- setup.py now accepts earlier versions (<0.8?) of distutils.
> > Should we require the version that comes with Python 2.0?
> 
> Yes.  We now have dependencies on 2.0 beyond just setup.py.

OK.
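
For reference, a guard near the top of setup.py could look roughly like
the sketch below.  The helper name python_is_new_enough and the
MINIMUM_PYTHON constant are hypothetical, made up for illustration; the
check only compares the major and minor version numbers.

```python
import sys

# Assumed minimum version, following the decision to require Python 2.0.
MINIMUM_PYTHON = (2, 0)

def python_is_new_enough(version_info, minimum=MINIMUM_PYTHON):
    """Return True if version_info is at least the minimum (major, minor)."""
    return tuple(version_info[:2]) >= tuple(minimum)

# Stop early with a readable message instead of failing later on.
if not python_is_new_enough(sys.version_info):
    sys.exit("Python %d.%d or later is required." % MINIMUM_PYTHON)
```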

> >- Any objections to moving more code into __init__.py?  For
> > example, the code in Prosite/Prosite.py would be moved to
> > Prosite/__init__.py.  This would definitely BREAK CODE, but
> > the fix would be trivial.
> 
> I have no problems, but I think I'm the one who introduced
> using __init__.py to biopython so I'm not the best of sources.
> 
> Brad correctly pointed out that some people don't know about
> that use so may get somewhat confused about it.  As I recall,
> others here and elsewhere have had that problem so it shouldn't
> be ignored.

Yeah, definitely.  I still overlook __init__.py when looking for code.  
What can be done about this?  Documentation?

> On the other hand, I have had problems with another library
> which had a module of the form "X.X" (like Prosite.Prosite).

Oh, I see what you're getting at.  That's definitely bad.

> What this means is if you have Prosite/Prosite.py then do not
> put anything into Prosite/__init__.py and vice versa.

Yep.  I'll interpret that as evidence to move stuff into __init__.py.  :)

> >  If this does happen, does anyone know how to move code
> > between files without losing the CVS logs of the changes?
> 
> I don't know.  Also, is there any way to import the CVS logs
> of Martel?

I suspect both will require some surgery on the CVS repository and its
RCS files.

> >- Should we check in Brad's new GenBank code?
> 
> I think Brad and I still need to do a bit more work on the
> parser definition.  Neither his original code nor my modified
> version passes the "fully parses an NCBI file" test, although
> it's getting pretty close.

Is this a "no" vote, then?  Rebuttals?

> A related question, and one which was raised earlier, is,
> where should the format definitions be located in biopython?
> 
> There are also database-specific builders (which convert the
> format definitions to a database-specific data structure) and
> generic builders (e.g., which make a generic data structure,
> possibly discarding some data).

There are two places they can go.  First, you can put each one in the
package where it belongs.  That means the fasta format would go in
Bio/Fasta, genbank in Bio/GenBank, swissprot in Bio/SwissProt, etc.  This
would be consistent with the current design, and it would be clear where
to look for a given format.

Second, we can have a formats package (could be called something else),
where we put all the Martel stuff.  This would make it easier to check to
see what formats exist, which could be helpful for SeqIO-type
functionality.  All you'd have to do is scan the directory and suck up all
the formats in there.  The other way, we'd have to specify them manually.
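
The directory scan for the second option could look roughly like this.
It is only a sketch: the function name find_format_names is made up, and
it assumes each format lives in its own .py file directly inside the
formats directory, which is not something we have decided.

```python
import os

def find_format_names(formats_dir):
    """Return the sorted module names of the format files in formats_dir."""
    names = []
    for filename in os.listdir(formats_dir):
        base, ext = os.path.splitext(filename)
        # Keep only Python source files; skip __init__.py and other
        # underscore-prefixed helpers.
        if ext == ".py" and not base.startswith("_"):
            names.append(base)
    return sorted(names)
```

With the first option there is nothing to scan, so the equivalent would be
a hand-maintained list of format names.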

Any votes?  Comments?

> >- Andrew, I've submitted a bug report (more of a feature request)
> >in Jitterbug about making the regression tests indifferent to
> >EOL conventions.  This would be nice if people are developing
> >and testing on different platforms, which breaks the tests.
> >Could you look at it and let me know what you think?
> 
> Umm, I don't see it.  There are none assigned to me ... Oh!
> with br_regrtest.  Sorry, I thought you were talking about
> Martel.  Some of its regression tests are also newline
> specific.  Okay, it shouldn't be too hard.  Could either see
> what changes are in the 2.0 distribution or replace the line
> reader with something which understands the different styles.

Great!  This will be nice, because some of the regression tests are
breaking on differing newline conventions.  Different styles can occur
within the same file.
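
One way to make the comparison indifferent to EOL style, as a minimal
sketch.  The names split_lines and same_output are hypothetical and are
not part of br_regrtest; the idea is just to normalize "\r\n" and "\r"
to "\n" before comparing, which also copes with mixed styles in one file.

```python
def split_lines(text):
    """Split text into lines, treating \\r\\n, \\r and \\n as equivalent."""
    # Replace \r\n first so it is not turned into two line breaks.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.split("\n")

def same_output(expected, actual):
    """Compare two outputs line by line, ignoring the EOL convention."""
    return split_lines(expected) == split_lines(actual)
```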

> Jeff in a reply to Brad:
> > Are you also including Andrew's location parser?
> 
> Remember, that parser hasn't been seriously tested.  Also,
> including it requires inclusion of SPARK.  That's not hard
> because it's a single, pure-python file.  I think it should
> be included because of its general usefulness and because
> it isn't a real distribution in its own right.

Good, we'll include it, as well as SPARK, then.  Judging from the recent
traffic on the bioperl list, this is a feature we should have.

Jeff