[Biopython-dev] next release closer (?)

Andrew Dalke dalke at acm.org
Thu Nov 30 22:43:46 EST 2000


Brad:
>Doh! Sorry, that I didn't say anything about this -- I'd actually
>forgotten about this fix and it didn't cross my mind that
>Cayte's problem could be related to it. This is my fault, I should
>have posted to the dev list about this...

Couple of points.  I know I tend to forget the details of things
after a couple of months, and I don't expect others to have better
memories.  In this case, my first thought about the problem was that
some string wasn't being converted to an integer - which wasn't
the case - so there wasn't much of a clue to jog your memory.

Secondly, even if you had posted to the list two months ago when you
did the fix, the odds of me (or anyone else) remembering it are also
pretty low.


>This brings up a point -- does anyone think it would be worthwhile to
>have CVS commits and log messages sent to the dev list? Bioperl has
>this and I think it's very worthwhile

A couple of months ago there was a bug report on the bioperl list (not
the dev list, the general one).  As I recall, someone reported a problem
in BLAST parsing where it didn't understand one of the fasta|id|label
forms.  It turns out the code had been fixed and the problem was that
the person who reported the bug hadn't tried the newer bioperl release.

It took a while for there to be any response regarding the problem.
Part of the reason was the poor bug report, but the other, more major
part was likely that no one remembered that there had been a change/fix.
After all, it had been made 6 months earlier.  This is despite the fact
that bioperl has the CVS email notifications, and that they have both
more developers and more people using the BLAST parser.

It was much easier just to go to the CVS logs for the appropriate file
and see all the changes at once, which is what I did to track down how
Cayte's problem disappeared.

Therefore, I do not think that CVS email notifications would really
help out for this case.

That's not saying that email notifications don't have other uses.  Two
I can think of are "hey, what's this punk doing messing with my code?"
and status updates.

The first of these can be done with other tools, like looking at which
files changed when doing a cvs update, or using the cvs log to see the
list of changes.

I didn't use the best of phrases for the latter of these.  It's an idea
I picked up from McConnell's "Rapid Development" (a book which I fully
recommend, btw).  He suggests breaking a project up into "mini-milestones",
which are tasks that can be completed within a couple of days.  When the
task is completed, the developer sends out a short email to the group
saying it's done.  It might also point out how to use the new feature or
describe that it's 100 times faster than the older code or ....  The
result helps improve communications, helps the project manager track
the task timelines, and gives everyone a bit of good news that things
are getting done.

I think CVS updates are too fine grained for this level of communication.
They report on the changes done on a per-file basis and not on a per-task
or per-bug basis.  When you read the email notification you need to
reconstruct what's going on.  (You still need to do that when looking at
the cvs log, but then you can use cvs diff to see the actual code changes
and you have the code right there to look through.)

Also, I get enough email as it is now - I don't want to get email for
every fix (esp. ones like "Oops, fixed typo in 'protien'").

Therefore, I still don't think that automatic email notification of CVS
changes is all that useful an ability.

> -- then for cases like this I
>would feel more comfortable going ahead with a small "fix" because I
>know Andrew would read the log... Then he could think: and go in and
>check up on the fix, if he feels like it. Just an idea, but maybe
>posting patches is better...

>I would really like to have bugs sent to the dev list when they come
>in -- I just noticed a couple from Iddo that I should have dealt with
>(I think that is all fixed now, regardless), but didn't realize were
>there. Whadda you all think about this?

Bugs are different.  Unless there's someone willing to triage bugs and
pass them on to the right person (and hopefully the person will respond)
they might as well go to everyone.  Plus, as I've said, I don't like
having a lot of email so there's a negative feedback loop to reduce the
bug count :)

So I've no problems with this.  Though in the future if there are both
a lot of bugs and a lot of different development, something will need
to be done to make sure there is some way to direct the right messages
to the right people.  (Improving signal to noise.)

>> Also, BTW, when we make the change to Python 2.0, I suggest changing
>> Pattern.py's Prosite.search so that endpos defaults to sys.maxint
>> instead of the None it does now.  This keeps it compatible with the
>> Python API and prevents the if-branches in the code - I don't like
>> branches since they are harder to test fully.


>As far as I can tell, we are officially requiring 2.0 and no one seems
>to mind,

I thought the switchover to 2.0 wasn't going to occur until after the
next release (the one that's coming closer (?) :)  So I was going to
wait until then - so long as I remember.
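
For reference, the endpos change suggested above could look roughly
like the sketch below.  The search function here is a hypothetical
stand-in, not the actual Prosite.search from Pattern.py, whose
signature may differ.  It also uses sys.maxsize, the Python 3
counterpart of the sys.maxint mentioned above, so that it runs on a
current interpreter.

```python
import re
import sys


def search(pattern, text, pos=0, endpos=sys.maxsize):
    """Hypothetical stand-in for a search method with a sentinel default.

    Because endpos defaults to a very large integer rather than None,
    it can be passed straight to the compiled pattern with no if-branch.
    """
    # Compiled patterns accept pos and endpos; an endpos beyond the end
    # of the string is treated as the end of the string.
    return pattern.search(text, pos, endpos)


if __name__ == "__main__":
    pat = re.compile("C.C")
    print(search(pat, "AACACAA"))            # whole string searched
    print(search(pat, "AACACAA", 0, 4))      # match would cross endpos
```

With this default there is only one code path to test, which is the
point of avoiding the None special case.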

> This way I won't have to stay up nights worrying about more bugs
> in my "fixes" :-).

There is an extreme viewpoint to this.  As I understand XP (Extreme
Programming), any desired behaviour should have a test for it.  This
allows people to change the code and - so long as the tests still
pass - assume the changes are valid.

This doesn't work in the most literal sense since I could have code like

  if endpos == 87655:
      endpos = endpos + 8

and there's no way people will write a test for every possible input
combination.  On the other hand, it is a good practice to test boundary
conditions, so there could (should?) be a test for endpos = None and
endpos = 0.  Had they been present, your bug would have been found right
away.
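
A minimal sketch of what such boundary tests might look like, using
the standard unittest module.  The search function below is a
hypothetical stand-in with a None default, written only so the tests
have something to run against; it is not the actual Pattern.py code,
and the pattern and test string are made up for illustration.

```python
import re
import unittest


def search(pattern, text, pos=0, endpos=None):
    """Hypothetical stand-in for the function under test."""
    # Compare against None explicitly: a truthiness check such as
    # "if not endpos" would wrongly treat endpos=0 like endpos=None.
    if endpos is None:
        endpos = len(text)
    return pattern.search(text, pos, endpos)


class EndposBoundaryTest(unittest.TestCase):
    pattern = re.compile("C.C")
    text = "AACACAA"

    def test_endpos_none_searches_whole_string(self):
        match = search(self.pattern, self.text, endpos=None)
        self.assertIsNotNone(match)
        self.assertEqual(match.span(), (2, 5))

    def test_endpos_zero_finds_nothing(self):
        # An empty search window cannot contain a three-character match.
        self.assertIsNone(search(self.pattern, self.text, endpos=0))


if __name__ == "__main__":
    unittest.main()
```

Two small tests like these cover both ends of the endpos range without
trying to enumerate every possible input.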

So one way to sleep more comfortably is to add regression tests.  While
you then lose sleep worrying that you aren't testing everything, I've
found I gain more than I lose.

                    Andrew
                    dalke at acm.org





