[BioPython] what should I do next?

Jeffrey Chang jchang@SMI.Stanford.EDU
Fri, 14 Apr 2000 00:34:37 -0700 (PDT)


There are many ways to answer this question.  I'll summarize what I think
are the tough, outstanding issues that need to be resolved before our
first release, which I hope will be around the middle of this year.
This isn't directed specifically at you; it's a general list of things
that need to be figured out for biopython.  Hopefully something here
interests you and coincides with your goals.  Or you may have something
else in mind entirely, which would be great too!




Scanner generator
-----------------

There's been some discussion on the list about developing a general way
to create parsers (or scanners or consumers) from a simple syntax, one
that is easier to write than python code.  I think this is a great idea,
because it would make
the development and maintenance of new parsers a lot easier.  We've hashed
out some issues on this list, and even have some prototype
implementations, but I think more work needs to be done.  We still need to
agree on a syntax, and find a fast implementation.
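
To make the scanner/consumer idea concrete, here is a minimal sketch.  It
is not one of the prototypes mentioned above, and it does not settle the
syntax question.  The record format, the RULES table, and the class and
method names are all assumptions made up for illustration: the "simple
syntax" is just a table mapping line prefixes to event names, the scanner
turns matching lines into events, and the consumer decides what to do
with them.

```python
class Scanner:
    """Match each line against (prefix, event) rules and notify a consumer."""

    def __init__(self, rules):
        # rules: list of (line_prefix, event_name) pairs, checked in order.
        self.rules = rules

    def feed(self, lines, consumer):
        for line in lines:
            line = line.rstrip("\n")
            for prefix, event in self.rules:
                if line.startswith(prefix):
                    # Call consumer.<event>(text) if the consumer defines it.
                    handler = getattr(consumer, event, None)
                    if handler is not None:
                        handler(line[len(prefix):].strip())
                    break


class RecordConsumer:
    """Collect scanner events into a plain dictionary."""

    def __init__(self):
        self.data = {"id": None, "description": None, "sequence": ""}

    def identifier(self, text):
        self.data["id"] = text

    def description(self, text):
        self.data["description"] = text

    def sequence(self, text):
        # Sequence lines may repeat; strip spaces and concatenate.
        self.data["sequence"] += text.replace(" ", "")


# A made-up, line-oriented record format used only for this illustration.
RULES = [
    ("ID", "identifier"),
    ("DE", "description"),
    ("SQ", "sequence"),
]

consumer = RecordConsumer()
Scanner(RULES).feed(
    ["ID   X123\n", "DE   example protein\n", "SQ   MKV LAA\n", "SQ   GTW\n"],
    consumer,
)
print(consumer.data)
```

Keeping the format description in a data table, separate from the code
that consumes events, is what would make new parsers cheap to add: a new
format needs a new table, not a new scanner.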



Dealing with structure 
----------------------

This will likely be a very difficult or impossible task.  People working
on structure applications have different needs, both in what has to be
modelled and in performance requirements.  Thus, I suspect biopython will
need to define several general representations for structure and let
people choose the one that fits their needs.

Andrew has a lot of experience with this and would be a good person to
either drive this effort or advise on it.


Regression testing framework
----------------------------

Discussions ongoing...



Tools integration
-----------------

There are other packages that are useful to people likely to use
biopython.  Examples include NumPy and MMTK.  I also really like Andrew
Sterian's wrapper to Matlab.  Should we be writing tools that are
integrated with these?  How can we write tools that are aware of these,
and will use them if present?
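
One common way to write a tool that uses a package only if it is present
is to guard the import and fall back to plain python otherwise.  The
sketch below assumes NumPy as the optional package; the function name
mean_score is hypothetical and only there to show the pattern.

```python
try:
    import numpy  # optional dependency
except ImportError:
    numpy = None


def mean_score(scores):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if numpy is not None:
        # Use NumPy's implementation when it is installed.
        return float(numpy.mean(scores))
    # Pure-python fallback that gives the same answer.
    return sum(scores) / float(len(scores))


print(mean_score([1.0, 2.0, 3.0, 6.0]))
```

Callers see the same function either way, so the rest of the code does
not need to know whether the optional package was found.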

Also, to what extent should we be distributing tools that are more
peripheral to biology, but still useful for bioinformaticians?  For
example, I have some machine learning tools that may be useful to some
scientists, but aren't biology specific.  biojava seems to include these
kinds of things, but bioperl doesn't.



Distribution/Installation issues
--------------------------------

We need some framework for configuration and installation.  I think we
should still be aiming to be cross-platform.  It would also be nice to
have python implementations of everything, with parallel implementations
of the performance-critical sections in C.  This would make use of
jpython easier.  However, I think this is less of a priority now, because
biojava now fulfills the need for biology code in java.



Documentation
-------------

We need to figure out how to handle documentation.  Right now, I've been
collecting it all as text files.  However, that's not the best solution.
Without some sort of semantic markup (html, xml, tex, other), you can't
usefully convert it into other formats such as html or postscript.

Python currently uses tex.  I have a vague recollection somewhere that
they may be moving away from that, but I can't seem to find a reference
for that now.

It would also be useful to have a docstring browser.
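
As a rough idea of what a docstring browser does, the sketch below lists
the first docstring line of each public function and class in a module,
using the standard inspect module.  The function name
summarize_docstrings is hypothetical, and textwrap is used only because
it is a small standard-library module to demonstrate on.

```python
import inspect
import textwrap


def summarize_docstrings(module):
    """Return {name: first docstring line} for public functions and classes."""
    summary = {}
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private names
        if inspect.isfunction(obj) or inspect.isclass(obj):
            doc = inspect.getdoc(obj) or "(undocumented)"
            summary[name] = doc.splitlines()[0]
    return summary


for name, line in sorted(summarize_docstrings(textwrap).items()):
    print("%-12s %s" % (name, line))
```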


Infrastructure work
-------------------

There's been some discussion on zope-ifying the web site.  Supposedly,
it'll make maintenance of the site much easier.

We need to start using a bug database.  This probably won't be hard, as
systems have already been set up for bioperl and biojava.  However, when
that happens, we will need to set up a biopython-devel mailing list, and
move some of the more technical discussions there.



Those are some of the main areas that stick out in my mind.  There's also
some more low-level stuff in the README file in the distribution.

Jeff




On Tue, 11 Apr 2000, Andrew Dalke wrote:

> Hello,
> 
>   I was wondering who here might be using Python for bioinformatics
> soon.  I ask because I've been working on the code I proposed to
> be part of the biopython core, and I really would like to work with
> people using it so I can see what needs to be changed and improved.
> 
>   If there are people who need this sort of code, I would also like
> to know more about specific needs.  I'm doing this partially as
> a hobby now, which makes it hard to focus on specific requirements
> since I don't have good use cases.  Also, my background is
> structural biology/modeling, so I don't know what sequence people
> really want.
> 
>   That is, what should I work on next?  I have framework for all
> sorts of sequence analysis tools, there's some existing parsing
> code of Jeff's, there's the parsing framework we talked about last
> year, there's the development of a generic sequence record class,
> there's the corba interface.  There's a lot of possibilities, but
> which are most useful?
> 
>   I'm also doing this in hope of real employment as a contractor,
> so if anyone wants to pay me for this, please contact me :)
> 
>                     Andrew
>                     dalke@acm.org