From dag at sonsorol.org  Sun Nov  4 17:29:33 2001
From: dag at sonsorol.org (chris dagdigian)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Computational biology course members looking for project ideas
Message-ID: <3BE5C14D.4080808@sonsorol.org>


Hi folks,

This email came through to our newly established 
"volunteer@open-bio.org" mail address. I'm forwarding it to the various 
lists and to people who may have some project ideas for Jace.

Rather than individually bombarding Jace with requests, it would probably 
be best for anyone who has a project idea to email their proposals back 
to volunteer@open-bio.org. We'll put the ideas together and respond 
to Princeton.

Regards,
Chris
-------------- next part --------------
An embedded message was scrubbed...
From: "Jace Kohlmeier" <jkohlmei@Princeton.EDU>
Subject: [Volunteer] project request
Date: Fri, 2 Nov 2001 16:11:42 -0500
Size: 2959
Url: http://portal.open-bio.org/pipermail/biopython-dev/attachments/20011104/e59ec35b/Volunteerprojectrequest.eml
From tarjei_mikkelsen at hotmail.com  Sun Nov  4 23:42:13 2001
From: tarjei_mikkelsen at hotmail.com (Tarjei Mikkelsen)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Pathway module
Message-ID: <F56943xJJCqEGfeNl9H000286ad@hotmail.com>

I've committed the following modules to the CVS tree:

Bio.Pathway
Bio.KEGG.Map
Bio.MetaTool.Input

Together they form my first (and rather rudimentary) attempt at creating 
classes for representing and working with metabolic and signalling pathways.

Bio.Pathway contains classes for representing collections of biochemical 
reactions of the type A + B <-> C (Reaction/System), and classes for 
representing explicit networks of arbitrary interactions 
(Interaction/Network).
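
For readers who want a concrete picture of the Reaction/System idea described above, here is a minimal sketch. It is a hypothetical, simplified model written for illustration only; the class layout, the `stoichiometry` field and the method names are assumptions and are not taken from the Bio.Pathway code.

```python
# Hypothetical, simplified model; NOT the actual Bio.Pathway classes.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reaction:
    """A reaction such as A + B <-> C, stored as signed coefficients."""
    # Negative coefficients are consumed, positive ones are produced.
    stoichiometry: tuple  # e.g. (("A", -1), ("B", -1), ("C", 1))
    reversible: bool = False

    def species(self):
        """Return the set of species names taking part in this reaction."""
        return {name for name, _ in self.stoichiometry}


@dataclass
class System:
    """A collection of reactions with no explicit network structure."""
    reactions: list = field(default_factory=list)

    def add_reaction(self, reaction):
        self.reactions.append(reaction)

    def species(self):
        """Return every species mentioned by any reaction, sorted by name."""
        found = set()
        for reaction in self.reactions:
            found |= reaction.species()
        return sorted(found)


system = System()
# A + B <-> C
system.add_reaction(Reaction((("A", -1), ("B", -1), ("C", 1)), reversible=True))
# C -> D
system.add_reaction(Reaction((("C", -1), ("D", 1))))
print(system.species())
```

The committed modules and their test files remain the reference for the real interface.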

Bio.KEGG.Map contains a parser for reading a KEGG metabolic pathway map into 
Reaction/System objects.

Bio.MetaTool.Input contains a function for converting a System object into a 
string that can be used as input to the MetaTool program.

Sample usage can be deduced from the corresponding test files.

This is very much a prototype so I welcome anyone interested to have a look 
and poke at it (and rip it apart). I don't recommend that it is included in 
the actual Biopython distribution until it is a bit more fleshed out and 
tested, but I'll leave that up to whoever makes those decisions.

thanks,

Tarjei Mikkelsen
tarjei@genome.wi.mit.edu



From jchang at SMI.Stanford.EDU  Mon Nov  5 19:42:41 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/48
In-Reply-To: <01102418073807.14420@sienna.berkeley.edu>
References: <200110242350.f9ONoEB24543@pw600a.bioperl.org>
 <01102418073807.14420@sienna.berkeley.edu>
Message-ID: <p05101002b80ce2411f89@[192.168.0.4]>

Good catch!  It's fixed in the repository.

Thanks,
Jeff

At 6:03 PM -0700 10/24/01, Gavin E. Crooks wrote:
>The new code doesn't work as intended, since parse() may raise an exception.
>
>This
>
>     def parse_file(self, filename):
>         h = open(filename)
>         retval = self.parse(h)
>         h.close()
>         return retval
>
>should be
>
>     def parse_file(self, filename):
>         h = open(filename)
>         try:
>             return self.parse(h)
>         finally:
>             h.close()
>
>Gavin
>
>p.s. The viewcvs diff appears to be broken.
>
>
>On Wed, 24 Oct 2001, you wrote:
>>  JitterBug notification
>>
>>  jchang changed notes
>>
>>  Message summary for PR#48
>>	From: gec@compbio.berkeley.edu
>>	Subject: Unclosed file
>>	Date: Wed, 24 Oct 2001 13:17:43 -0400
>>	0 replies	0 followups
>>	Notes: It gets closed implicitly as the reference in parse 
>>goes out of scope.  However,
>>  you're right that it's better to be done explicitly, so I've made 
>>the changes in
>>  the file.
>>
>>  Thanks,
>>  Jeff
>>
>>
>>  ====> ORIGINAL MESSAGE FOLLOWS <====
>>
>>  From gec@compbio.berkeley.edu Wed Oct 24 13:17:43 2001
>>  Received: from localhost (localhost [127.0.0.1])
>>	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9OHHgB21133
>>	for <biopython-bugs@pw600a.bioperl.org>; Wed, 24 Oct 2001 
>>13:17:43 -0400
>>  Date: Wed, 24 Oct 2001 13:17:43 -0400
>>  Message-Id: <200110241717.f9OHHgB21133@pw600a.bioperl.org>
>>  From: gec@compbio.berkeley.edu
>>  To: biopython-bugs@bioperl.org
>>  Subject: Unclosed file
>>
>>  Full_Name: Gavin Crooks
>>  Module: ParserSupport.AbstractParser
>>  Version:
>>  OS:
>>  Submission from: sdn-ar-005casfrmp182.dialsprint.net (158.252.212.184)
>>
>>
>>  AbstractParser.parse_file(self,filename) does not close the file it opens.
>>
>>
>_______________________________________________
>Biopython-dev mailing list
>Biopython-dev@biopython.org
>http://biopython.org/mailman/listinfo/biopython-dev
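
The point of the try/finally version quoted in this message can be checked with a small self-contained sketch. `FailingParser` and `parse_handle` are hypothetical names used only for this illustration, and an in-memory `io.StringIO` stands in for the opened file so that the example needs nothing on disk.

```python
import io


class FailingParser:
    """Hypothetical stand-in for a parser whose parse() raises."""

    def parse(self, handle):
        raise ValueError("malformed record")

    def parse_handle(self, handle):
        # Same shape as the quoted fix: close the handle whether or not
        # parse() raises, then let any exception propagate to the caller.
        try:
            return self.parse(handle)
        finally:
            handle.close()


handle = io.StringIO("not a valid record")
try:
    FailingParser().parse_handle(handle)
except ValueError:
    pass

# The handle was closed even though parse() raised.
print(handle.closed)
```

In later Python versions (2.5 and up) the same guarantee is usually written with a `with` statement around the open handle.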


From biopython-bugs at bioperl.org  Mon Nov  5 19:59:53 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/31
Message-ID: <200111060059.fA60xrB09668@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#31
	From: hy263book@263.net
	Subject: When I encounter "No hits found"
	Date: Wed, 16 May 2001 04:14:35 -0400
	0 replies 	0 followups
	Notes: fixed in subsequent emails
- jchang


====> ORIGINAL MESSAGE FOLLOWS <====

>From hy263book@263.net Wed May 16 04:14:35 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f4G8EYb32187
	for <biopython-bugs@pw600a.bioperl.org>; Wed, 16 May 2001 04:14:35 -0400
Date: Wed, 16 May 2001 04:14:35 -0400
Message-Id: <200105160814.f4G8EYb32187@pw600a.bioperl.org>
From: hy263book@263.net
To: biopython-bugs@bioperl.org
Subject: When I encounter "No hits found"

Full_Name: Huang Ying
Module: Bio.Blast.NCBIStandalone
Version: 
OS: Win2k
Submission from: (NULL) (166.111.30.26)


I use Bio.Blast.NCBIStandalone.BlastParser to analyze a Blast report. When the
blast result is "No hits found", python sends the wrong message


From biopython-bugs at bioperl.org  Mon Nov  5 19:59:53 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/31
Message-ID: <200111060059.fA60xrB09672@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#31 from incoming to fixed-bugs
Message summary for PR#31
	From: hy263book@263.net
	Subject: When I encounter "No hits found"
	Date: Wed, 16 May 2001 04:14:35 -0400
	0 replies 	0 followups
	Notes: fixed in subsequent emails
- jchang




From jchang at SMI.Stanford.EDU  Mon Nov  5 20:07:30 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Pathway module
In-Reply-To: <F56943xJJCqEGfeNl9H000286ad@hotmail.com>
References: <F56943xJJCqEGfeNl9H000286ad@hotmail.com>
Message-ID: <p05101005b80ce7d96f29@[192.168.0.4]>

>This is very much a prototype so I welcome anyone interested to have 
>a look and poke at it (and rip it apart). I don't recommend that it 
>is included in the actual Biopython distribution until it is a bit 
>more fleshed out and tested, but I'll leave that up to whoever makes 
>those decisions.

I won't include it if it's against your wishes.  However, since 
Biopython is still at alpha, and likely to remain there at least 
through the next release, I think it's OK to put in experimental 
code.  The people using it now are early adopters that are likely to 
be able to help flesh things out.

Jeff

From katel at worldpath.net  Tue Nov  6 02:25:53 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Pathway module
References: <F56943xJJCqEGfeNl9H000286ad@hotmail.com>
Message-ID: <001b01c16694$45c02da0$010a0a0a@cadence.com>

----- Original Message -----
From: "Tarjei Mikkelsen" <tarjei_mikkelsen@hotmail.com>
To: <biopython-dev@biopython.org>
Sent: Sunday, November 04, 2001 8:42 PM
Subject: [Biopython-dev] Pathway module


>
> I've committed the following modules to the CVS tree:
>
> Bio.Pathway
> Bio.KEGG.Map
> Bio.MetaTool.Input
>
> Together they form my first (and rather rudimentary) attempt at creating
> classes for representing and working with metabolic and signalling
> pathways.
>
> Bio.Pathway contains classes for representing collections of biochemical
> reactions of the type A + B <-> C (Reaction/System), and classes for
> representing explicit networks of arbitrary interactions
> (Interaction/Network).
>
> Bio.KEGG.Map contains a parser for reading a KEGG metabolic pathway map
> into Reaction/System objects.
>
> Bio.MetaTool.Input contains a function for converting a System object into
> a string that can be used as input to the MetaTool program.
>
> Sample usage can be deduced from the corresponding test files.
>
> This is very much a prototype so I welcome anyone interested to have a
> look and poke at it (and rip it apart). I don't recommend that it is
> included in the actual Biopython distribution until it is a bit more
> fleshed out and tested, but I'll leave that up to whoever makes those
> decisions.
>
  Great!!! I hope to have time to look into it Wednesday.

                                     Cayte


From gec at compbio.berkeley.edu  Tue Nov  6 14:09:04 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
Message-ID: <01110611111207.04148@sienna.berkeley.edu>


I have just installed biopython using the latest code in CVS. A whole bunch of
tests fail. Are these my problems, or biopython's?


Gavin Crooks
gec@compbio.berkeley.edu
http://threeplusone.com

=====================================================================
ERROR: test_KEGG
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "./test_KEGG.py", line 8, in ?
    from Bio.KEGG import Map
ImportError: cannot import name Map
======================================================================
ERROR: test_Pathway
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "./test_Pathway.py", line 10, in ?
    from Bio.Pathway import *
ImportError: No module named Pathway
======================================================================
ERROR: test_intelligenetics
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "./test_intelligenetics.py", line 29, in ?
    src_handle = open( datafile )
IOError: [Errno 2] No such file or directory: 'IntelliGenetics/TAT_mase_nuc.txt'
======================================================================
ERROR: test_metatool
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "./test_metatool.py", line 29, in ?
    src_handle = open( datafile )
IOError: [Errno 2] No such file or directory: 'MetaTool/meta.out'
======================================================================
FAIL: test_GenBank
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 153, in runTest
    expected_handle)
  File "./run_tests.py", line 247, in compare_output
    assert expected_line == output_line, \
AssertionError:
Output  : "keys: ['L31939', 'AJ237582', 'X62281', 'AF297471', 'M81224', 'X55053']\n"
Expected: "keys: ['X55053', 'M81224', 'AF297471', 'X62281', 'L31939', 'AJ237582']\n"
======================================================================
FAIL: test_SubsMat
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 153, in runTest
    expected_handle)
  File "./run_tests.py", line 247, in compare_output
    assert expected_line == output_line, \
AssertionError:
Output  : 'M -0.0 0.4 0.7 0.8 1.0\n'
Expected: 'M 0.0 0.4 0.7 0.8 1.0\n'
----------------------------------------------------------------------
Ran 31 tests in 71.317s                                                 

From jchang at SMI.Stanford.EDU  Tue Nov  6 14:45:16 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To: <01110611111207.04148@sienna.berkeley.edu>
References: <01110611111207.04148@sienna.berkeley.edu>
Message-ID: <p05101001b80deda07e5b@[171.65.33.250]>

The import errors seem to apply to new modules that Tarjei put in. 
Do you have those in your directory?

It looks like there are missing files in intelligenetics and 
metatool.  Cayte, could you check on those?

GenBank is a bug in the regression tests that should be fixed.

SubsMat is a known problem that hasn't been fixed yet.
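
For context on the SubsMat failure, the "-0.0" versus "0.0" difference is what you get when a tiny negative value is rounded for printing. The sketch below shows one possible way to normalise it; `format_score` is a hypothetical helper, and whether this is the actual cause or the right fix for SubsMat is an assumption.

```python
def format_score(value, digits=1):
    """Format a float so that negative zero prints as plain zero."""
    # round() can yield -0.0 for small negative inputs; adding 0.0
    # turns -0.0 into 0.0 and leaves every other value unchanged.
    rounded = round(value, digits) + 0.0
    return "%.*f" % (digits, rounded)


print("%.1f" % -0.04)        # plain formatting keeps the sign
print(format_score(-0.04))   # normalised
print(format_score(-0.44))   # genuine negatives are untouched
```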

Jeff

From tarjei_mikkelsen at hotmail.com  Tue Nov  6 14:51:35 2001
From: tarjei_mikkelsen at hotmail.com (Tarjei Mikkelsen)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
Message-ID: <F34E6JZp9f4jx17oiTa0002b6c2@hotmail.com>

The KEGG and Pathway errors are probably caused by the setup script failing 
to install them. I've committed a quick fix for that. Alternatively, you can 
just copy the Bio/Pathway and Bio/KEGG directories to the 
/lib/pythonX/site-packages/Bio directory.


- Tarjei

>=====================================================================
>ERROR: test_KEGG
>----------------------------------------------------------------------
>Traceback (most recent call last):
>   File "./run_tests.py", line 136, in runTest
>     __import__(self.test_name)
>   File "./test_KEGG.py", line 8, in ?
>     from Bio.KEGG import Map
>ImportError: cannot import name Map
>======================================================================
>ERROR: test_Pathway
>----------------------------------------------------------------------
>Traceback (most recent call last):
>   File "./run_tests.py", line 136, in runTest
>     __import__(self.test_name)
>   File "./test_Pathway.py", line 10, in ?
>     from Bio.Pathway import *
>ImportError: No module named Pathway
>======================================================================




From tarjei_mikkelsen at hotmail.com  Tue Nov  6 14:54:40 2001
From: tarjei_mikkelsen at hotmail.com (Tarjei Mikkelsen)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Pathway module
Message-ID: <F38WLp7dmtnlaMTlpWw0002b071@hotmail.com>


>I won't include it if it's against your wishes.  However, since
>Biopython is still at alpha, and likely to remain there at least
>through the next release, I think it's OK to put in experimental
>code.  The people using it now are early adopters that are likely to
>be able to help flesh things out.

Okay, that's fine. I've added them to the setup.py script as noted in my 
previous email.

- Tarjei



From chapmanb at arches.uga.edu  Tue Nov  6 15:02:21 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To: <p05101001b80deda07e5b@[171.65.33.250]>
References: <01110611111207.04148@sienna.berkeley.edu> <p05101001b80deda07e5b@[171.65.33.250]>
Message-ID: <20011106150221.A26736@ci350185-a.athen1.ga.home.com>

Jeff:
> GenBank is a bug in the regression tests that should be fixed.

Yup. I'm actually working on the GenBank modules right now and
noticed this problem. I'll fix it when I update those modules
(hopefully tonight :-).

Thanks for the heads up.
Brad
-- 
PGP public key available from http://pgp.mit.edu/

From gec at compbio.berkeley.edu  Tue Nov  6 15:04:08 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar  5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To: <p05101001b80deda07e5b@[171.65.33.250]>
References: <01110611111207.04148@sienna.berkeley.edu> <p05101001b80deda07e5b@[171.65.33.250]>
Message-ID: <01110612111908.04148@sienna.berkeley.edu>

> The import errors seem to apply to new modules that Tarjei put in. 
> Do you have those in your directory?

The import errors seem to have fixed themselves. Perhaps python was
looking at an older biopython version?!
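
One way to check the "older biopython version" suspicion is to ask Python where it would import each module from. This is a generic sketch using the standard library's `importlib.util.find_spec` (available in current Python 3, not in the Python of this thread); `module_origin` is a hypothetical helper name.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be imported from, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return None
    return spec.origin if spec is not None else None


# Prints a path for each module that can be found, None otherwise.
for name in ("Bio", "Bio.KEGG.Map", "Bio.Pathway"):
    print(name, "->", module_origin(name))
```

If the printed paths point at a different site-packages directory than the one just installed into, a stale copy is shadowing the new one.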

> It looks like there are missing files in intelligenetics and 
> metatool.  Cayte, could you check on those?

IntelliGenetics/TAT_mase_nuc.txt is IntelliGenetics/TAT_mase-nuc.txt in CVS, and
there is no meta.out file.

Gavin

From jchang at SMI.Stanford.EDU  Tue Nov  6 15:13:32 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To: <01110611111207.04148@sienna.berkeley.edu>
References: <01110611111207.04148@sienna.berkeley.edu>
Message-ID: <p05101003b80df4be2951@[171.65.33.250]>

I've fixed the GenBank bug (was printing out the keys to a 
dictionary) and one I found in Fasta (name of string type changed in 
Python 2.2).
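
For anyone writing similar regression tests: output that depends on dictionary key order is not stable across Python versions, which matches the GenBank failure reported in this thread. A common way to make such output deterministic is to sort the keys before printing. The sketch below uses a made-up dictionary with three of the accession strings from the failing output; it is not the actual change made to the test.

```python
# Values are placeholders; only the keys matter here.
records = {"X55053": 1, "M81224": 2, "AF297471": 3}

# sorted() on a dict yields its keys in a fixed, reproducible order.
line = "keys: %s" % sorted(records)
print(line)
```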

Jeff


At 11:09 AM -0800 11/6/01, Gavin E. Crooks wrote:
>I have just installed biopython using the latest code in CVS. A whole bunch of
>tests fail. Are these my problems, or biopython's?
>
>
>Gavin Crooks
>gec@compbio.berkeley.edu
>http://threeplusone.com
>
>=====================================================================
>ERROR: test_KEGG
>----------------------------------------------------------------------
>Traceback (most recent call last):
>   File "./run_tests.py", line 136, in runTest
>     __import__(self.test_name)
>   File "./test_KEGG.py", line 8, in ?
>     from Bio.KEGG import Map
>ImportError: cannot import name Map
>======================================================================
>ERROR: test_Pathway
>----------------------------------------------------------------------
>Traceback (most recent call last):
>   File "./run_tests.py", line 136, in runTest
>     __import__(self.test_name)
>   File "./test_Pathway.py", line 10, in ?
>     from Bio.Pathway import *
>ImportError: No module named Pathway
>======================================================================
>ERROR: test_intelligenetics
>----------------------------------------------------------------------
>Traceback (most recent call last):
>   File "./run_tests.py", line 136, in runTest
>     __import__(self.test_name)
>   File "./test_intelligenetics.py", line 29, in ?
>     src_handle = open( datafile )
>IOError: [Errno 2] No such file or directory: 'IntelliGenetics/TAT_mase_nuc.txt'
>======================================================================
>ERROR: test_metatool
>----------------------------------------------------------------------
>Traceback (most recent call last):
>   File "./run_tests.py", line 136, in runTest
>     __import__(self.test_name)
>   File "./test_metatool.py", line 29, in ?
>     src_handle = open( datafile )
>IOError: [Errno 2] No such file or directory: 'MetaTool/meta.out'
>======================================================================
>FAIL: test_GenBank
>----------------------------------------------------------------------
>Traceback (most recent call last):
>   File "./run_tests.py", line 153, in runTest
>     expected_handle)
>   File "./run_tests.py", line 247, in compare_output
>     assert expected_line == output_line, \
>AssertionError:
>Output  : "keys: ['L31939', 'AJ237582', 'X62281', 'AF297471', 
>'M81224', 'X55053']\n"
>Expected: "keys: ['X55053', 'M81224', 'AF297471', 'X62281', 
>'L31939', 'AJ237582']\n"
>======================================================================
>FAIL: test_SubsMat
>----------------------------------------------------------------------
>Traceback (most recent call last):
>   File "./run_tests.py", line 153, in runTest
>     expected_handle)
>   File "./run_tests.py", line 247, in compare_output
>     assert expected_line == output_line, \
>AssertionError:
>Output  : 'M -0.0 0.4 0.7 0.8 1.0\n'
>Expected: 'M 0.0 0.4 0.7 0.8 1.0\n'
>----------------------------------------------------------------------
>Ran 31 tests in 71.317s                                                


From gec at compbio.berkeley.edu  Tue Nov  6 15:21:14 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To: <F34E6JZp9f4jx17oiTa0002b6c2@hotmail.com>
References: <F34E6JZp9f4jx17oiTa0002b6c2@hotmail.com>
Message-ID: <01110612225009.04148@sienna.berkeley.edu>

On Tue, 06 Nov 2001, you wrote:
> The KEGG and Pathway errors are probably caused by the setup script failing 
> to install them. I've committed a quick fix for that. Alternatively, you can 
> just copy the Bio/Pathway and Bio/KEGG directories to the 
> /lib/pythonX/site-packages/Bio directory.
> 
> 
> - Tarjei
>
Yes, that makes sense now.

Thanks,

Gavin

From gec at compbio.berkeley.edu  Tue Nov  6 15:32:12 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Python version
In-Reply-To: <p05101002b80dee5aa9e4@[171.65.33.250]>
References: <01110517352006.04148@sienna.berkeley.edu> <v04011703b80d29e9d68d@[158.252.219.163]> <p05101002b80dee5aa9e4@[171.65.33.250]>
Message-ID: <0111061236540B.04148@sienna.berkeley.edu>

On Tue, 06 Nov 2001, you wrote:
> >And is there a particular version of python we should be programming 
> >to? I just loaded up the latest version, and assumed that anything 
> >available would be available. Perhaps I should downgrade?
> 
> We're currently only requiring Python 2.0, but perhaps it's time to 
> reevaluate that.
> 
> Jeff

One advantage of moving biopython to python 2.1 is that you can
presumably remove all of the PyUnit code that's in Biopython, since PyUnit
is now included in the standard library.

Perhaps we could always use the latest stable python version, at least
so long as biopython is in alpha!

Gavin

From gec at compbio.berkeley.edu  Tue Nov  6 16:09:59 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Fwd: Unit tests
Message-ID: <0111061310310D.04148@sienna.berkeley.edu>


I have been working on updating the SCOP module. Now, I am a big fan of
unit tests, and I find myself writing unit tests for almost everything. Not
simple regression tests, but proper unit tests using the PyUnit framework.

I am having problems understanding how to integrate my tests into biopython's
framework. Most of the things in biopython/tests appear to be regression tests
with a PyUnit coating.

After looking around a bit, I found this unit test description from Zope.

http://cvs.zope.org/Zope/doc/UNITTEST.txt?rev=1.2&content-type=text/vnd.viewcvs-markup

Not only does this document clearly describe what a unit test should be, it also
describes how the Zope tests are organized. Here is the important bit.

>Writing Unit Tests For The Zope Core
>
> If you're writing core code, you probably don't need to listen to
>  any more of this.  :-) The rules for writing tests for Zope core
>  code are simple:
>
>  - The testing code should make use of PyUnit
>     (/lib/python/unittest.py).  Instructions for using PyUnit are
>     available at http://pyunit.sourceforge.net.
>
>  - Tests must be placed in a "tests" subdirectory of the package or
>     directory in which the core code you're testing lives.
>
>   - Test modules should be named something which represents the
>     functionality they test, and should begin with the prefix "test."
>     E.g., a test module for BTree should be named testBTree.py.
>
>   - An individual test module should take no longer than 60 seconds
>     to complete.

This is very similar to one of the two main ways of organizing JUnit tests in
the Java community.

I think this would be a good way to organize biopython's unit tests. Thoughts?
Comments?
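
To make the distinction from regression tests concrete, here is a minimal sketch of a PyUnit-style test module of the kind described above. `reverse_complement` is a hypothetical function invented for the example and is not part of Biopython; under the Zope convention the test class would live in a file such as tests/testReverseComplement.py.

```python
import unittest


def reverse_complement(seq):
    """Hypothetical function under test; not part of Biopython."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


class ReverseComplementTest(unittest.TestCase):
    """Each method checks one behaviour and asserts on it directly."""

    def test_known_sequence(self):
        self.assertEqual(reverse_complement("AACG"), "CGTT")

    def test_round_trip(self):
        seq = "GATTACA"
        self.assertEqual(reverse_complement(reverse_complement(seq)), seq)

    def test_rejects_unknown_base(self):
        self.assertRaises(KeyError, reverse_complement, "AXG")


# Run the suite programmatically so the result object can be inspected.
suite = unittest.TestLoader().loadTestsFromTestCase(ReverseComplementTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Unlike an output-comparison script, a failure here names the exact method and assertion that broke.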

Gavin Crooks

gec@compbio.berkeley.edu
http://threeplusone.com/


On Tue, 06 Nov 2001, you wrote:
> No objections here.  Brad can probably give you better insight about 
> the regression tests, since he did the coding for it.
> 
> Jeff
> 
> 
> >Perhaps we should move this conversation to biopython-dev?
> >
> >On Tue, 06 Nov 2001, you wrote:
> >>  >Could you explain what the relation of Tests/UnitTest is to the
> >>  >rest of the tests? It's all very confusing.
> >>
> >>  Yeah.  We're using python-style regression testing, which means that
> >>  a test suite is just a python script whose name begins with "test_"
> >>  and outputs a bunch of text.  Doing a regression test means running
> >>  the script and checking it against previous output.
> >>
> >>  We're using unittest.py under the hood to do the checking.  Thus,
> >>  we're not taking advantage of all the nice features that it provides.
> >>  While it would be worth considering moving the whole system over, I'm
> >>  not sure anybody wants to go back and rewrite all the old tests.
> >>
> >>
> >>
> >>  >And is there a particular version of python we should be programming
> >>  >to? I just loaded up the latest version, and assumed that anything
> >>  >available would be available. Perhaps I should downgrade?
> >>
> >>  We're currently only requiring Python 2.0, but perhaps it's time to
> >>  reevaluate that.
> >>
> >>  Jeff
-------------------------------------------------------
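
The regression-testing scheme described in the quoted text (run a script, compare what it prints with previously saved output) can be sketched roughly as follows. This is a simplified illustration, not the code in run_tests.py; `run_and_capture`, `compare_output` as written here, and `example_script` are assumptions made for the example.

```python
import contextlib
import io


def run_and_capture(script_func):
    """Run a callable and return everything it printed, as a list of lines."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        script_func()
    return buffer.getvalue().splitlines(keepends=True)


def compare_output(expected_lines, output_lines):
    """Raise AssertionError at the first line that differs."""
    for number, (expected, output) in enumerate(
            zip(expected_lines, output_lines), 1):
        assert expected == output, \
            "line %d\nOutput  : %r\nExpected: %r" % (number, output, expected)
    assert len(expected_lines) == len(output_lines), \
        "different number of lines"


def example_script():
    # Stands in for a test_*.py script that just prints its results.
    print("M 0.0 0.4 0.7 0.8 1.0")


# In a real setup the expected lines would be read from a saved file.
expected = ["M 0.0 0.4 0.7 0.8 1.0\n"]
compare_output(expected, run_and_capture(example_script))
print("ok")
```

Any change in formatting, however harmless, shows up as a failure in this style of test, which is part of the trade-off being discussed.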

From pewilkinson at informaxinc.com  Tue Nov  6 17:29:36 2001
From: pewilkinson at informaxinc.com (Peter Wilkinson)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] RE: Refseq Data
In-Reply-To: <200110311701.f9VH1cB31871@pw600a.bioperl.org>
Message-ID: <002301c16712$848f9880$331ea8c0@l001696w00>

Hi Brad,

I tried your update (the most recent changes, you said), and it is not working
with the nucleotide records either right now. I get the following error:

  File "D:\Program Files\Python21\Bio\GenBank\__init__.py", line 1205, in feed
    self._parser.parseFile(handle)
  File "D:\Program Files\Python21\Martel\Parser.py", line 226, in parseFile
    self.parseString(fileobj.read())
  File "D:\Program Files\Python21\Martel\Parser.py", line 254, in parseString
    self._err_handler.fatalError(result)
  File "D:\Program Files\Python21\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 379

I am puzzled about something though. I just downloaded the 'latest' files
from the web from the Bio/GenBank directory. However the CVS viewer from the
web shows that the files are 3 weeks old. I will try to go in with the
command line and see what I can find ...

Did you commit the changes to the CVS tree, or is the webCVS viewer doing
something funky?


Peter
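
When a parser reports only a character offset, as in the error above, it can help to look at the text around that offset in the record. The sketch below is generic Python; `show_context` is a hypothetical helper and the sample text is made up. It assumes the reported offset counts characters from the start of the string handed to the parser, which may not hold for every input.

```python
def show_context(text, offset, width=40):
    """Return (line number, text before offset, text after offset)."""
    start = max(0, offset - width)
    end = min(len(text), offset + width)
    # Lines are counted from 1; each newline before the offset adds one.
    line_number = text.count("\n", 0, offset) + 1
    return line_number, text[start:offset], text[offset:end]


# Made-up stand-in for the contents of a record file.
sample = "LOCUS line\nDEFINITION line\nDBSOURCE line\n"

line_number, before, after = show_context(sample, 30, width=10)
print("line", line_number)
print(repr(before), "<-- here -->", repr(after))
```

With a real record, `sample` would be the file contents and the offset the number from the exception message.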


> --__--__--
>
> Message: 2
> From: "Jeong Joung" <j.joung@AptusGenomics.com>
> To: "Brad Chapman" <chapmanb@arches.uga.edu>
> Cc: <biopython-dev@biopython.org>
> Date: Tue, 30 Oct 2001 15:48:19 -0500
> Subject: [Biopython-dev] Parsing Protein GenBank Records
>
> Hi,
>
> I just found out that this problem occurs on some REFSEQ nucleotide
> records as well.
>
> Thank You,
> Jeong
>
> -----Original Message-----
> From: Jeong Joung [mailto:j.joung@AptusGenomics.com]
> Sent: Tuesday, October 30, 2001 2:17 PM
> To: Brad Chapman
> Cc: biopython-dev@biopython.org
> Subject: RE: Parsing Protein GenBank Records
>
>
> Hello,
>
> Thanks for your help. The updated parser now works well for most REFSEQ
> proteins. I came across several REFSEQ protein records where the parser
> still fails on a UNIX machine. The following is the error message:
>
> Traceback (most recent call last):
> entry = parser.parse(gb_handle)
> File "/usr/.../Bio/GenBank/__init__.py", line 281, in parse
> self._scanner.feed(handle, self_consumer)
> File "/usr/.../Bio/GenBank/__init__.py", line 1143, in feed
> self._parser.parseFile(handle)
> File "/usr/.../Martel/Parser.py", line 226, in parseFile
> self.parseString(fileobj.read())
> File "/usr/.../Martel/Parser.py", line 254, in parseString
> self._err_handler.fatalError(result)
> File "/usr/.../python2.1/xml/sax/handler.py", line 38, in fatalError
> raise exception
> ParserPositionException: error parsing at or beyond character 2889
>
> Any help will be greatly appreciated.
>
> Thank You,
> Jeong
>
> -----Original Message-----
> From: Brad Chapman [mailto:chapmanb@arches.uga.edu]
> Sent: Tuesday, September 18, 2001 9:26 PM
> To: Jeong Joung
> Cc: biopython-dev@biopython.org
> Subject: Re: Parsing Protein GenBank Records
>
>
> Hi Joung;
> (ccing this to biopython-dev since this is relevant to everyone)
>
> > I'm having trouble parsing GenBank records obtained from the protein
> > database. The parser works fine for nucleotide GenBank records, but not
> > for protein records. I would appreciate it very much if you can guide
> > me in right direction for parsing such records.
> >
> > Here is the code and the error that I get back.
> >
> > >>> parser = GenBank.RecordParser()
> > >>> ncbi = GenBank.NCBIDictionary(database='Protein')
> > >>> rec = ncbi['6754304']
>
> The parser does work for proteins in general, but does fail badly on
> this particular REFSEQ sequence. In the past, REFSEQ stuff has been
> only "sort of" GenBank format, and this record is no exception. It
> has a lot of formatting problems (has no identifier for the sequence
> type in the LOCUS line, has extra DBSOURCE tag, has non-standard
> feature table types and keys (Protein, Region, region_name)).
> Anyways, it is a big non-standard formatting mess.
>
> I've fixed the GenBank parser to be able to handle this, and checked
> the changes into CVS. Diffs to the relevant files (Record.py,
> __init__.py and genbank_format.py in Bio.GenBank) are also attached
> to this file in case you don't have CVS access.
>
> Thanks for the bug report. Hope this works for you!
>
> Brad
> --
> PGP public key available from http://pgp.mit.edu/
>
>
>
> --__--__--
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev@biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
>
>
> End of Biopython-dev Digest


From chapmanb at arches.uga.edu  Tue Nov  6 19:58:10 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Python version
In-Reply-To: <0111061236540B.04148@sienna.berkeley.edu>
References: <01110517352006.04148@sienna.berkeley.edu> <v04011703b80d29e9d68d@[158.252.219.163]> <p05101002b80dee5aa9e4@[171.65.33.250]> <0111061236540B.04148@sienna.berkeley.edu>
Message-ID: <20011106195809.C26903@ci350185-a.athen1.ga.home.com>

Gavin:
> > >And is there a particular version of python we should be programming 
> > >to?

Jeff:
> > We're currently only requiring Python 2.0, but perhaps it's time to 
> > reevaluate that.

Gavin:
> One advantage of moving biopython to python 2.1 is that you can
> presumably remove all of the PyUnit code that's in Biopython, since PyUnit
> is now included.
> 
> Perhaps we could always use the latest stable python version, at least
> so long as biopython is in alpha!

I think you may have misunderstood Jeff. Python 2.0 is the minimum
version needed for biopython. I use biopython with 2.0, 2.1 and
2.2pre-releases (depending on how lazy I am at updating the python on
the machine), and everything works fine. 

The regression tests sometimes report problems between different
python versions and on different architectures (and in different
phases of the moon :-), and we try to take care of these when they
are noticed. If you run a test itself (i.e. python test_Whatever.py),
then it should work fine, but the regression comparison with the old
output is what will fail.

We do our best to keep these up to date, but I know I have been
especially slack in working on this recently due to the excessive
amount of lab work I've had to do.

Too-many-mini-preps-to-think-clearly-about-regression-tests-ly yr's,
Brad
-- 
PGP public key available from http://pgp.mit.edu/

From chapmanb at arches.uga.edu  Tue Nov  6 20:15:12 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Fwd: Unit tests
In-Reply-To: <0111061310310D.04148@sienna.berkeley.edu>
References: <0111061310310D.04148@sienna.berkeley.edu>
Message-ID: <20011106201512.D26903@ci350185-a.athen1.ga.home.com>

Hi Gavin;

> I have been working on updating the SCOP module. 

Great!

> Now, I am a big fan of unit tests, and I find myself writing 
> unit tests for almost everything. 

Me too. My testing ability has improved dramatically since starting
on biopython -- and now for all of my independent work (and
biopython-corba), I code unit tests like crazy.

> I am having problems understanding how to integrate my tests 
> into biopython's framework. Most of the things in biopython/tests 
> appear to be regression tests with a PyUnit coating.

Yes, I used PyUnit to build up the regression testing framework
(based heavily on the regression tests that Andrew already had). I
mostly just use PyUnit to deal with printing the output, etc, but it
is definitely a regression test framework. The regression testing
framework and integrating tests into it is described on a wiki page:

http://biopython.org/wiki/html/BioPython/RegressionTests.html

(damn, this still uses br_regression.py instead of run_tests.py. I
need to update this).

> After looking around a bit, I found this unit test description from Zope.
[...]
> I think this would be a good way to organize biopython's unit tests. 
> Thoughts? Comments?

I think it's great to write an individual test for a module
(test_whatever.py) in whatever way you feel most comfortable with. I
agree with Jeff that we don't want to rewrite all of the tests now
(gack!) but definitely feel free to write new tests as Unit Tests;
they will still plug into the high-level regression testing
framework just fine.

Personally, I think that although the regression test stuff can be a
pain sometimes, it is quite useful for tests like the BLAST output
tests or the GenBank tests. It is much easier to dump the output of
a parse to a file and make sure it remains the same than to
individually have 'assert genbank_record.locus == "WHATEVER"' for
a hundred different attributes on a record.
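
To make the contrast concrete, here is a minimal sketch of the
regression-style comparison: dump the parse output as text and diff it
against a stored copy. This is not the run_tests.py code; the function
name diff_against_expected and the sample record dump are made up for
illustration.

```python
import difflib


def diff_against_expected(actual, expected):
    """Return unified-diff lines between the stored and the new output.

    An empty list means the output has not changed.
    """
    return list(difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    ))


# Hypothetical dump of a parsed record, one attribute per line.
expected_output = "locus: WHATEVER\nsize: 1234\nresidue_type: DNA"
unchanged_output = "locus: WHATEVER\nsize: 1234\nresidue_type: DNA"
changed_output = "locus: WHATEVER\nsize: 1235\nresidue_type: DNA"

same = diff_against_expected(unchanged_output, expected_output)
changed = diff_against_expected(changed_output, expected_output)
```

One comparison covers every attribute in the dump, at the cost of a
failure whenever the output format itself changes.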

So, I guess I don't think there is a need to be heavy-handed about
how you "have" to do tests. Personally, I'd rather leave it up to
the individual person writing the tests.

The-important-thing-is-having-some-tests-ly yr's,
Brad
-- 
PGP public key available from http://pgp.mit.edu/

From adalke at mindspring.com  Wed Nov  7 00:04:59 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Python version
Message-ID: <0dbe01c16749$c0a38520$0301a8c0@josiah.dalkescientific.com>

Brad:
>I think you may have misunderstood Jeff. Python 2.0 is the minimum
>version needed for biopython. I use biopython with 2.0, 2.1 and
>2.2pre-releases (depending on how lazy I am at updating the python on
>the machine), and everything works fine. 

I must add that iterators in Python 2.2 make me want to
rethink how readers are done in Martel and Biopython.

Backwards compatibility?  Who needs *that*?! :)

So probably a year before 2.2 is mainstream enough to
consider a switchover.  *sigh*

                    Andrew
                    dalke@dalkescientific.com


From chapmanb at arches.uga.edu  Wed Nov  7 12:23:18 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] debug_level=2 problem in Martel.Generate
Message-ID: <20011107122318.B40277@ci350185-a.athen1.ga.home.com>

Hello all;
While updating GenBank for RefSeq, I got knee-deep into debugging with
Martel and noticed that when I set debug_level=2, I was getting an
unexpected amount of output. Instead of the normal 20ish characters of
text in the file being parsed, I was getting the entire file parsed up
to that point. 

Digging into this, I realized that this seems to be due to a change
between versions 1.5 and 1.6 of Martel/Generate.py. Andrew's notes state
that:

Fixed debug error where text[x-8:x+8] failed when x < 8, since x-8 is
negative, which pulls from the end.

The fix was min(0, x-8) instead of just x-8. Unfortunately, once x is
greater than 8 this always slices from the start of the text, which gives
all the output I was seeing. I also don't think it fixes the original
problem, since min(0, -2) is still -2, which gives the negative index you
were seeing.

If I'm getting this right, the fix should be max(0, x-8), which seems to
give the correct output. I've attached a patch for this. If anyone
(Andrew :-), can verify that I'm thinking about this right, I'll be
happy to check it in. Thanks!
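
A small standalone sketch of the two slice expressions, for anyone who
wants to check the behaviour without running Martel. The helper names
context_min and context_max and the sample text are made up here; only
the slice expressions come from Generate.py and the patch.

```python
def context_min(text, x, width=8):
    """Slice as in version 1.6: the start index is never above 0."""
    return text[min(0, x - width):x + width]


def context_max(text, x, width=8):
    """Slice with the start index clamped at 0, as in the patch."""
    return text[max(0, x - width):x + width]


text = "abcdefghijklmnopqrstuvwxyz" * 2  # 52 characters

near_start_min = context_min(text, 2)   # start is -6, counted from the end
near_start_max = context_max(text, 2)   # start is 0
later_min = context_min(text, 20)       # start is 0: everything up to x+8
later_max = context_max(text, 20)       # start is 12: a 16-character window
```

With min() the window either starts from a negative index (x < 8) or
from 0 (x >= 8); with max() it stays at most 16 characters wide.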

Brad

-------------- next part --------------
--- Generate.py.orig	Sat Oct 20 19:47:47 2001
+++ Generate.py	Wed Nov  7 01:40:31 2001
@@ -492,7 +492,7 @@
             s = s[:17] + " ... " + s[-17:]
         self.msg = s
     def __call__(self, text, x, end):
-        print "Match %s (x=%d): %s" % (repr(text[min(0, x-8):x+8]), x,
+        print "Match %s (x=%d): %s" % (repr(text[max(0, x-8):x+8]), x,
                                             repr(self.msg))
         return x
 
From chapmanb at arches.uga.edu  Wed Nov  7 12:32:07 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Parsing Protein GenBank Records
In-Reply-To: <OIEFKBIBGGKFMCHEPFMJIEPNCAAA.j.joung@AptusGenomics.com>
References: <OIEFKBIBGGKFMCHEPFMJIEPNCAAA.j.joung@AptusGenomics.com>
Message-ID: <20011107123207.C40277@ci350185-a.athen1.ga.home.com>

[Talking about the GenBank parser]
Jeong:
> The updated parser now works well for most REFSEQ
> proteins. I came across several REFSEQ protein records where the parser
> still fails on UNIX machine. The following is the error message:
>
> I just found out that this problem occurs on some REFSEQ nucleotide records
> as well.

Thanks for the heads up. I've done a lot of work on the GenBank parser
and run it across a lot of human chromosome 1 from RefSeq, and it now
seems to be acting right for me. There were a bunch of fixes to files in
Bio/GenBank (genbank_format.py, __init__.py, LocationParser.py and
Record.py) and to Bio/SeqFeature.py to handle RefSeq decently.

I've committed all of the changes to CVS, so if you have the most recent
CVS everything should work smoothly with RefSeq (I hope :-). I hope this
will also fix the problems that Peter was reporting.

If you still have problems, please don't hesitate to let me know and
I'll work more on it. Let me know what files you're having problems
with, and I can take a look at those specifically.

Enjoy!
Brad

From gec at threeplusone.com  Wed Nov  7 12:41:47 2001
From: gec at threeplusone.com (Gavin)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Python version
In-Reply-To: <20011106195809.C26903@ci350185-a.athen1.ga.home.com>
References: <0111061236540B.04148@sienna.berkeley.edu>
 <01110517352006.04148@sienna.berkeley.edu>
 <v04011703b80d29e9d68d@[158.252.219.163]>
 <p05101002b80dee5aa9e4@[171.65.33.250]>
 <0111061236540B.04148@sienna.berkeley.edu>
Message-ID: <v04011700b80f1fdbdc74@[158.252.219.163]>

>Brad:
>I think you may have misunderstood Jeff. Python 2.0 is the minimum
>version needed for biopython. I use biopython with 2.0, 2.1 and
>2.2pre-releases (depending on how lazy I am at updating the python on
>the machine), and everything works fine. 

Not at all. If python 2.0 is the minimum version, then I cannot use python 2.1 features when writing biopython code. The only way of making sure of that (that I can think of) is to program against python 2.0.
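
For illustration, a minimal sketch of a version guard, assuming the minimum stays at 2.0. The names MINIMUM_PYTHON and check_python are hypothetical and nothing like this is claimed to exist in biopython.

```python
import sys

MINIMUM_PYTHON = (2, 0)  # the minimum version stated in this thread


def check_python(minimum=MINIMUM_PYTHON, current=None):
    """Raise RuntimeError if the interpreter is older than `minimum`."""
    if current is None:
        current = tuple(sys.version_info[:2])
    if current < tuple(minimum):
        raise RuntimeError(
            "Python %d.%d or later is required, found %d.%d"
            % (minimum[0], minimum[1], current[0], current[1])
        )
    return current
```

A guard like this only reports an interpreter that is too old. It does not catch accidental use of newer features, so the code would still have to be run under the minimum version to be sure.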

As a case in point; It was a long while before I realized that PyUnit had only been added to the core python libraries with version 2.1. I only found that out by looking at the PyUnit web site.  Before that revelation I was very puzzled as to why biopython contains PyUnit code.

(This still seems odd. Why isn't PyUnit installed as a separate package? (If you need it.) )

Anyways, the impression that I am getting is that it is too early in the development of biopython to be worrying about the details of which python versions are usable. Yes No?

Gavin

From jchang at SMI.Stanford.EDU  Wed Nov  7 13:53:31 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Python version
In-Reply-To: <v04011700b80f1fdbdc74@[158.252.219.163]>
References: <0111061236540B.04148@sienna.berkeley.edu>
 <01110517352006.04148@sienna.berkeley.edu>
 <v04011703b80d29e9d68d@[158.252.219.163]>
 <p05101002b80dee5aa9e4@[171.65.33.250]>
 <0111061236540B.04148@sienna.berkeley.edu>
 <v04011700b80f1fdbdc74@[158.252.219.163]>
Message-ID: <p05101000b80f2f327b98@[192.168.0.4]>

At 9:41 AM -0800 11/7/01, Gavin wrote:
>As a case in point; It was a long while before I realized that 
>PyUnit had only been added to the core python libraries with version 
>2.1. I only found that out by looking at the PyUnit web site. 
>Before that revelation I was very puzzled as to why biopython 
>contains PyUnit code.
>
>(This still seems odd. Why isn't PyUnit installed as a separate 
>package? (If you need it.) )

Since Python doesn't have a CPAN-like package manager yet, requiring 
extra packages creates a barrier to entry for new users.  For small 
packages (as far as the license allows) we've just been bundling them 
in biopython to make installation easier.  Otherwise, we'd have more 
dependencies in pyunit, spark, (older) br_regrtest, more?, etc... 
When we move the requirement onto 2.1, we can remove the pyunit stuff.
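
As a rough sketch of how code can prefer an installed module and fall 
back to a bundled copy: the helper import_first_available and the module 
name bundled_pyunit_placeholder are made up for illustration and are not 
the names used in biopython.

```python
import importlib


def import_first_available(*names):
    """Return the first module in `names` that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules could be imported: %s"
                      % ", ".join(names))


# Prefer the standard library module; the second name is a made-up
# placeholder for a copy shipped inside the package and is only tried
# if the first import fails.
unittest = import_first_available("unittest", "bundled_pyunit_placeholder")
```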


>Anyways, the impression that I am getting is that it is too early in 
>the development of biopython to be worrying about the details of 
>which python versions are usable. Yes No?

No.  While the package is under heavy development, there are a lot of 
people using it for production work.  Requiring the latest version 
would place the burden on them to upgrade, possibly before they're 
ready.

It might be possible to fork biopython into stable and development 
versions, where the development version could require the latest 
version of python.  However, that would take a lot of resources -- 
probably more than we have.

Jeff

From katel at worldpath.net  Wed Nov  7 17:00:25 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Failed tests.
References: <01110611111207.04148@sienna.berkeley.edu> <p05101001b80deda07e5b@[171.65.33.250]>
Message-ID: <001c01c167d7$a4d470a0$010a0a0a@cadence.com>

----- Original Message -----
From: "Jeffrey Chang" <jchang@SMI.Stanford.EDU>
To: <gec@compbio.berkeley.edu>; <biopython-dev@biopython.org>
Sent: Tuesday, November 06, 2001 11:45 AM
Subject: Re: [Biopython-dev] Failed tests.


> The import errors seem to apply to new modules that Tarjei put in.
> Do you have those in your directory?
>
> It looks like there are missing files in intelligenetics and
> metatool.  Cayte, could you check on those?
>
  I just committed meta.out.  It slipped through the cracks.  The test
passes on my local system.
Now I'll check IntelliGenetics.

                                                Cayte


From katel at worldpath.net  Wed Nov  7 17:34:31 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Python version
References: <0111061236540B.04148@sienna.berkeley.edu> <01110517352006.04148@sienna.berkeley.edu> <v04011703b80d29e9d68d@[158.252.219.163]> <p05101002b80dee5aa9e4@[171.65.33.250]> <0111061236540B.04148@sienna.berkeley.edu> <v04011700b80f1fdbdc74@[158.252.219.163]> <p05101000b80f2f327b98@[192.168.0.4]>
Message-ID: <002d01c167dc$5f41d320$010a0a0a@cadence.com>

> No.  While the package is under heavy development, there are a lot of
> people using it for production work.  Requiring the latest version
> would place the burden on them to upgrade, possibly before they're
> ready.
>
>
   It would be nice to have feedback from these users about what is useful,
what new features are needed and what could be easier to use.

                                              Cayte


From adalke at mindspring.com  Wed Nov  7 17:24:31 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] debug_level=2 problem in Martel.Generate
Message-ID: <0e9501c167da$fb8c6c60$0301a8c0@josiah.dalkescientific.com>

Brad:
>If I'm getting this right, the fix should be max(0, x-8), which seems to
>give the correct output. I've attached a patch for this. If anyone
>(Andrew :-), can verify that I'm thinking about this right, I'll be
>happy to check it in. Thanks!

Oops! Yep.  That's what I get for not testing.  *chagrin*  :)

Patch away.

                        Andrew


From katel at worldpath.net  Wed Nov  7 21:31:46 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Python version
References: <0dbe01c16749$c0a38520$0301a8c0@josiah.dalkescientific.com>
Message-ID: <007201c167fd$83c69160$010a0a0a@cadence.com>

----- Original Message -----
From: "Andrew Dalke" <adalke@mindspring.com>
To: <biopython-dev@biopython.org>
Sent: Tuesday, November 06, 2001 9:04 PM
Subject: Re: [Biopython-dev] Python version


> Brad:
> >I think you may have misunderstood Jeff. Python 2.0 is the minimum
> >version needed for biopython. I use biopython with 2.0, 2.1 and
> >2.2pre-releases (depending on how lazy I am at updating the python on
> >the machine), and everything works fine.
>
> I must add that iterators in Python 2.2 make me want to
> rethink how readers are done in Martel and Biopython.
>
  Today, when I tested  my NBRF parser, I fed my RecordFile handle into the
Martel RecordReader.  I found one bug that took all day to fix, but it
worked by the end of the day.

  I feel that something like RecordFile is needed, because, for example, it
would be a major hassle to remove all the blank lines from my NBRF file.  I
should add a lot of test cases to RecordFile.  But if python is going to
handle iteration for us, maybe I should not put my time into it?
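
  For what it's worth, here is a rough sketch of what a reader could look
like with generators, skipping blank lines along the way.  This is not
RecordFile or the Martel RecordReader; iter_records is a hypothetical name,
the sample data only loosely follows the NBRF layout, and it assumes every
record starts with a line beginning with ">".

```python
import io


def iter_records(handle, start=">"):
    """Yield each record as a list of non-blank lines.

    A new record begins at every line that starts with `start`.
    """
    record = []
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # drop blank lines instead of editing the file
        if line.startswith(start) and record:
            yield record
            record = []
        record.append(line)
    if record:
        yield record


sample = io.StringIO(
    ">P1;EXAMPLE1\n"
    "first example entry\n"
    "\n"
    "ACDEFG*\n"
    "\n"
    ">P1;EXAMPLE2\n"
    "second example entry\n"
    "HIKLMN*\n"
)
records = list(iter_records(sample))
```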

                                                  Cayte


From biopython-bugs at bioperl.org  Thu Nov  8 14:53:43 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/45
Message-ID: <200111081953.fA8JrhB06066@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#45
	From: gec@compbio.berkeley.edu
	Subject: PDB sequence numbers can be negative
	Date: Tue, 23 Oct 2001 18:54:38 -0400
	0 replies 	0 followups
	Notes: Gavin reports this bug as fixed with his new SCOP module.
- jchang


====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@compbio.berkeley.edu Tue Oct 23 18:54:38 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMscB13266
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 18:54:38 -0400
Date: Tue, 23 Oct 2001 18:54:38 -0400
Message-Id: <200110232254.f9NMscB13266@pw600a.bioperl.org>
From: gec@compbio.berkeley.edu
To: biopython-bugs@bioperl.org
Subject: PDB sequence numbers can be negative

Full_Name: Gavin Crooks
Module: SCOP/Location.py
Version: 
OS: 
Submission from: sienna.berkeley.edu (128.32.236.51)


PDB residue sequence numbers can, on occasion, be
negative. e.g. 1B9N. SCOP domains sometimes start
on negative sequence numbers. This breaks the
location parser in Bio.SCOP.Location.py


From biopython-bugs at bioperl.org  Thu Nov  8 14:53:44 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/45
Message-ID: <200111081953.fA8JriB06070@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#45 from incoming to fixed-bugs
Message summary for PR#45
	From: gec@compbio.berkeley.edu
	Subject: PDB sequence numbers can be negative
	Date: Tue, 23 Oct 2001 18:54:38 -0400
	0 replies 	0 followups
	Notes: Gavin reports this bug as fixed with his new SCOP module.
- jchang




From biopython-bugs at bioperl.org  Thu Nov  8 14:54:15 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/49
Message-ID: <200111081954.fA8JsFB06083@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#49
	From: Jeffrey Chang <jchang@SMI.Stanford.EDU>
	Subject: Re: [Biopython-dev] Notification: incoming/46
	Date: Wed, 24 Oct 2001 16:58:30 -0700
	0 replies 	0 followups
	Notes: dup of #46


====> ORIGINAL MESSAGE FOLLOWS <====

>From jchang@SMI.Stanford.EDU Wed Oct 24 19:57:15 2001
Received: from crg-gw.Stanford.EDU (root@crg-gw.Stanford.EDU [171.65.32.201])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9ONvAB24866
	for <biopython-bugs@bioperl.org>; Wed, 24 Oct 2001 19:57:15 -0400
Received: from [171.65.33.250] (air11-smi.Stanford.EDU [171.65.33.250])
	by crg-gw.Stanford.EDU (8.11.5/8.11.5) with ESMTP id f9ONvEC09544
	for <biopython-bugs@bioperl.org>; Wed, 24 Oct 2001 16:57:14 -0700 (PDT)
Mime-Version: 1.0
X-Sender: jchang@smi.stanford.edu (Unverified)
Message-Id: <p05101004b7fd060a5fa5@[171.65.33.250]>
In-Reply-To: <200110232256.f9NMuiB13336@pw600a.bioperl.org>
References: <200110232256.f9NMuiB13336@pw600a.bioperl.org>
Date: Wed, 24 Oct 2001 16:58:30 -0700
To: biopython-bugs@bioperl.org
From: Jeffrey Chang <jchang@SMI.Stanford.EDU>
Subject: Re: [Biopython-dev] Notification: incoming/46
Content-Type: text/plain; charset="us-ascii" ; format="flowed"

Hi Gavin,

Could you send me a sample of this?  It'll be helpful to have a test 
case to test fixes.

Thanks,
Jeff

>JitterBug notification
>
>new message incoming/46
>
>Message summary for PR#46
>	From: gec@compbio.berkeley.edu
>	Subject: PDB sequence numbers can be negative
>	Date: Tue, 23 Oct 2001 18:56:44 -0400
>	0 replies	0 followups
>
>====> ORIGINAL MESSAGE FOLLOWS <====
>
>>From gec@compbio.berkeley.edu Tue Oct 23 18:56:44 2001
>Received: from localhost (localhost [127.0.0.1])
>	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMuiB13330
>	for <biopython-bugs@pw600a.bioperl.org>; Tue, 23 Oct 2001 
>18:56:44 -0400
>Date: Tue, 23 Oct 2001 18:56:44 -0400
>Message-Id: <200110232256.f9NMuiB13330@pw600a.bioperl.org>
>From: gec@compbio.berkeley.edu
>To: biopython-bugs@bioperl.org
>Subject: PDB sequence numbers can be negative
>
>Full_Name: Gavin Crooks
>Module: SCOP/Location.py
>Version:
>OS:
>Submission from: sienna.berkeley.edu (128.32.236.51)
>
>
>
>PDB residue sequence numbers can, on occasion, be
>negative. e.g. 1B9N. SCOP domains sometimes start
>on negative sequence numbers. This breaks the
>location parser in Bio.SCOP.Location.py
>
>
>_______________________________________________
>Biopython-dev mailing list
>Biopython-dev@biopython.org
>http://biopython.org/mailman/listinfo/biopython-dev


From biopython-bugs at bioperl.org  Thu Nov  8 14:54:15 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/49
Message-ID: <200111081954.fA8JsFB06087@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#49 from incoming to fixed-bugs
Message summary for PR#49
	From: Jeffrey Chang <jchang@SMI.Stanford.EDU>
	Subject: Re: [Biopython-dev] Notification: incoming/46
	Date: Wed, 24 Oct 2001 16:58:30 -0700
	0 replies 	0 followups
	Notes: dup of #46




From biopython-bugs at bioperl.org  Thu Nov  8 14:54:48 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/50
Message-ID: <200111081954.fA8JslB06100@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#50
	From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
	Subject: Re: [Biopython-dev] Notification: incoming/49
	Date: Wed, 24 Oct 2001 17:40:52 -0700
	0 replies 	0 followups
	Notes: Gavin reports this as fixed.
- jchang


====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@sienna.berkeley.edu Wed Oct 24 20:49:43 2001
Received: from sienna.berkeley.edu (IDENT:root@sienna.Berkeley.EDU [128.32.236.51])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9P0ngB25247
	for <biopython-bugs@bioperl.org>; Wed, 24 Oct 2001 20:49:42 -0400
Received: from localhost (localhost [[UNIX: localhost]])
	by sienna.berkeley.edu (8.9.3/8.9.3) id RAA03432
	for biopython-bugs@bioperl.org; Wed, 24 Oct 2001 17:49:42 -0700
From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
Reply-To: gec@compbio.berkeley.edu
Organization: Very Little
To: biopython-bugs@bioperl.org
Subject: Re: [Biopython-dev] Notification: incoming/49
Date: Wed, 24 Oct 2001 17:40:52 -0700
X-Mailer: KMail [version 1.0.29]
Content-Type: text/plain
References: <200110242357.f9ONvGB24883@pw600a.bioperl.org>
In-Reply-To: <200110242357.f9ONvGB24883@pw600a.bioperl.org>
MIME-Version: 1.0
Message-Id: <01102417494205.14420@sienna.berkeley.edu>
Content-Transfer-Encoding: 8bit


How about  "A:-1-126", direct from SCOP...
16118	px	a.4.5.8	d1b9ma1	1b9m A:-1-126

I am in the middle of updating the SCOP module, and I have already
refactored that code, and fixed this bug. And I've written a nice shiny
unit test.  But I was concerned that this same bug could crop up elsewhere.
It's the kind of obscure boundary case that could trip up any code working 
with PDB sequence numbers.
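
To show why the case is easy to trip over, here is a minimal sketch of
parsing such a range. It is not the code in Bio.SCOP; RANGE_RE and
parse_range are made-up names, and insertion codes are left out.

```python
import re

# Optional chain id and colon, then start and end, each with an optional
# leading minus sign. Insertion codes are not handled in this sketch.
RANGE_RE = re.compile(r"^(?:(\w):)?(-?\d+)-(-?\d+)$")


def parse_range(text):
    """Return (chain, start, end) for a string such as 'A:-1-126'."""
    match = RANGE_RE.match(text)
    if match is None:
        raise ValueError("unrecognised residue range: %r" % text)
    chain, start, end = match.groups()
    return chain or "", int(start), int(end)


naive = "-1-126".split("-")       # ['', '1', '126']: the sign is lost
parsed = parse_range("A:-1-126")  # ('A', -1, 126)
```

Splitting on "-" treats the sign as a separator, whereas the pattern
keeps it attached to the number.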

Gavin

gec@compbio.berkeley.edu
http://threeplusone.com

> Hi Gavin,
> 
> Could you send me a sample of this?  It'll be helpful to have a test 
> case to test fixes.
> 
> Thanks,
> Jeff
>
> >Full_Name: Gavin Crooks
> >Module: SCOP/Location.py
> >Version:
> >OS:
> >Submission from: sienna.berkeley.edu (128.32.236.51)
> >
> >PDB residue sequence numbers can, on occasion, be
> >negative. e.g. 1B9N. SCOP domains sometimes start
> >on negative sequence numbers. This breaks the
> >location parser in Bio.SCOP.Location.py
>


From biopython-bugs at bioperl.org  Thu Nov  8 14:54:48 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/50
Message-ID: <200111081954.fA8JsmB06104@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#50 from incoming to fixed-bugs
Message summary for PR#50
	From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
	Subject: Re: [Biopython-dev] Notification: incoming/49
	Date: Wed, 24 Oct 2001 17:40:52 -0700
	0 replies 	0 followups
	Notes: Gavin reports this as fixed.
- jchang




From biopython-bugs at bioperl.org  Thu Nov  8 18:51:08 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/53
Message-ID: <200111082351.fA8Np8B08921@pw600a.bioperl.org>

JitterBug notification

new message incoming/53

Message summary for PR#53
	From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
	Subject: SCOP
	Date: Thu, 8 Nov 2001 15:35:13 -0800
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From gec@sienna.berkeley.edu Thu Nov  8 18:51:07 2001
Received: from sienna.berkeley.edu (IDENT:root@sienna.Berkeley.EDU [128.32.236.51])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id fA8Np6B08915
	for <biopython-bugs@bioperl.org>; Thu, 8 Nov 2001 18:51:07 -0500
Received: from localhost (localhost [[UNIX: localhost]])
	by sienna.berkeley.edu (8.9.3/8.9.3) id PAA18718
	for biopython-bugs@bioperl.org; Thu, 8 Nov 2001 15:51:11 -0800
From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
Reply-To: gec@compbio.berkeley.edu
Organization: Very Little
To: biopython-bugs@bioperl.org
Subject: SCOP
Date: Thu, 8 Nov 2001 15:35:13 -0800
X-Mailer: KMail [version 1.0.29]
Content-Type: text/plain
MIME-Version: 1.0
Message-Id: <0111081551110I.04148@sienna.berkeley.edu>
Content-Transfer-Encoding: 8bit


The SCOP package has been updated and extended. 

Additions include parsers for the new CLA, HIE and DES files, a Residues class
to represent SCOP domain definitions, a Scop class to hold the Scop hierarchy
itself, a Raf module to handle ASTRAL RAF files, and a script, 
/Scripts/scop_pdb.py, that can extract a SCOP domain's ATOM and HETATOM records
from the relevant PDB file.

Enjoy,

    Gavin


From biopython-bugs at bioperl.org  Thu Nov  8 18:56:47 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/53
Message-ID: <200111082356.fA8NulB09018@pw600a.bioperl.org>

JitterBug notification

gec changed notes

Message summary for PR#53
	From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
	Subject: SCOP
	Date: Thu, 8 Nov 2001 15:35:13 -0800
	0 replies 	0 followups
	Notes: Stupid user error: emailed to wrong address




From biopython-bugs at bioperl.org  Thu Nov  8 18:56:47 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/53
Message-ID: <200111082356.fA8NulB09022@pw600a.bioperl.org>

JitterBug notification

gec moved PR#53 from incoming to trash
Message summary for PR#53
	From: "Gavin E. Crooks" <gec@compbio.berkeley.edu>
	Subject: SCOP
	Date: Thu, 8 Nov 2001 15:35:13 -0800
	0 replies 	0 followups
	Notes: Stupid user error: emailed to wrong address




From gec at compbio.berkeley.edu  Thu Nov  8 18:57:08 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Fwd: SCOP
Message-ID: <0111081558350J.04148@sienna.berkeley.edu>


The SCOP package has been updated and extended. 

Additions include parsers for the new CLA, HIE and DES files, a Residues class
to represent SCOP domain definitions, a Scop class to hold the Scop hierarchy
itself, a Raf module to handle ASTRAL RAF files, and a script, 
/Scripts/scop_pdb.py, that can extract a SCOP domain's ATOM and HETATOM records
from the relevant PDB file.

Enjoy,

    Gavin

From katel at worldpath.net  Fri Nov  9 23:51:03 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] NBRF
Message-ID: <001501c169a3$4e5d2f00$010a0a0a@cadence.com>

  I just added the NBRF parser to CVS.  It has passed a sanity check but no
more.

  I'm firmly convinced that humans were never meant to stare at screens of
a's, c's, g's and t's. :) But I see no alternative for checking the baseline.
Any better ideas?

  The unit tests seem to mesh better with a computation-intensive target
than with a data-intensive one.  RecordFile could benefit from unit tests,
but I'd like to know if we plan to switch to Python 2.2 iteration handling.
If RecordFile is a temporary solution I'll focus elsewhere.

                                         Cayte


From mkc at mathdogs.com  Sat Nov 10 15:52:38 2001
From: mkc at mathdogs.com (Mike Coleman)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] doc typos
Message-ID: <20011110205238.9342C3407E@debian>

Here's some typos I spotted in the tutorial:

cutomize -> customize
ExtenedIUPACDNA -> ExtendedIUPACDNA ?
Mitochondriall -> Mitochondrial ?

subleties -> subtleties

definately -> definitely
humungous -> humongous ?

"What the heck in a handle?" -> "What the heck is a handle?"

From katel at worldpath.net  Wed Nov 14 02:37:09 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] biotech fair
Message-ID: <001f01c16cdf$2c2036a0$010a0a0a@cadence.com>

  Some of you may be interested in this event
http://www.bioitworld.com/aboutus/index.shtml

                          Cayte


From katel at worldpath.net  Wed Nov 14 22:29:46 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] First impressions of Pathway
Message-ID: <006d01c16d85$c73fa700$010a0a0a@cadence.com>

   First we should all thank Tarjei for his work.  It looks like a great
start.  The next step, I think, would be to create some examples and see how
it plays.

Quibbles:
   I'd prefer more neutral nomenclature than parent-child because it biases
the reader toward a tree structure.

   df_search seems to assume a connected graph (at least it looks like it
would conk out early with a disconnected graph).  All assumptions should
be documented.
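
To make the concern concrete, here is a small sketch of a depth-first
traversal that restarts from every unvisited node, so a disconnected graph
is still fully covered. The dict-of-lists graph layout and the function
name are assumptions for this example and are not taken from Bio.Pathway.

```python
def depth_first_order(graph, roots=None):
    """Return nodes in depth-first order.

    graph maps each node to a list of successor nodes.  When roots is
    None, every node is tried as a starting point, which is what makes
    the traversal cover disconnected components.
    """
    seen = set()
    order = []
    starts = list(roots) if roots is not None else list(graph)
    for start in starts:
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            order.append(node)
            # reversed() so that successors are visited in listed order
            stack.extend(reversed(graph.get(node, [])))
    return order


graph = {"a": ["b"], "b": [], "c": ["d"], "d": []}
print(depth_first_order(graph, roots=["a"]))  # only the component containing "a"
print(depth_first_order(graph))               # every component
```

A search started from a single root stops once that root's component is
exhausted, which is the behaviour worth stating in the docstring.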

  The following line needs a description of what each tuple contains.  Since
Python is dynamically typed it requires more documentation at the interfaces.

    catalysts   -- list of tuples of catalysts involved in the same reaction
                     step


  In MultiNetwork.remove_node the sense of the filter is reversed from what
I'd expect if you want to remove dangling edges.   My understanding is that
filter returns items that make the condition true?

           self.__adjacency_list[node] = filter(lambda x,node=node: x[0] is node,
                                                self.__adjacency_list[node].list())
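
A short sketch of the point about filter: it keeps the items for which
the test is true, so removing dangling edges needs the negated test. The
adjacency layout below (node mapped to a list of (target, label) tuples) is
an assumption for illustration only.

```python
def remove_node(adjacency, node):
    """Drop `node` and every edge that points at it."""
    adjacency.pop(node, None)
    for source in adjacency:
        # filter() KEEPS items for which the test is true, so the test
        # must be "is not node" to discard the dangling edges.
        adjacency[source] = list(
            filter(lambda edge: edge[0] is not node, adjacency[source])
        )


a, b, c = "a", "b", "c"
adjacency = {a: [(b, "x"), (c, "y")], b: [(c, "z")], c: []}
remove_node(adjacency, c)
print(adjacency)  # no entry for c and no edge targeting c remains
```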

In the following sequence, node may be redefined before it is used.  It
looks to me that you intend to use the initial definition.


        for node in self.__adjacency_list.keys():
            self.__adjacency_list[node] = filter(lambda x,node=node: x[0] is node,
                                                 self.__adjacency_list[node].list())
        # remove all referring pairs in label map
        for label in self.__label_map.keys():
            self.__label_map[label] = filter(lambda x,node=node: x[0] is node or x[1] is node,
                                             self.__label_map[label].list())
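
The rebinding problem can be reproduced in a few lines. This is a
simplified sketch with made-up function names and a plain dict of lists,
not the Bio.Pathway code.

```python
def purge_buggy(adjacency, node):
    # The loop variable reuses the name "node", so the argument is
    # overwritten on every pass and each list is only checked against
    # its own key.
    for node in adjacency:
        adjacency[node] = [t for t in adjacency[node] if t != node]


def purge_fixed(adjacency, node):
    # A distinct loop variable leaves the argument untouched.
    for source in adjacency:
        adjacency[source] = [t for t in adjacency[source] if t != node]


graph = {"a": ["b", "c"], "b": ["c"], "c": []}
purge_buggy(graph, "c")
print(graph)  # edges to "c" are still present

purge_fixed(graph, "c")
print(graph)  # edges to "c" are gone
```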


Cayte


From katel at worldpath.net  Fri Nov 16 19:38:21 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] new modules
Message-ID: <004b01c16f00$29c2cd00$010a0a0a@cadence.com>

   If anyone has started work on Msf please let me know.  Otherwise, I'll
write it.

  I'm reading a paper with the jaw-crunching title "Creating Metabolic
Pathway Models Using Data Mining and Expert Knowledge" in the hope it will
contain insight into how to provide effective tools for pathway analysis.


Cayte


From katel at worldpath.net  Fri Nov 16 22:49:04 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] saf instead of msf?
Message-ID: <007301c16f1a$cdffa2c0$010a0a0a@cadence.com>

  One format site ominously described Msf format as "obsolete".  It
explained that the checksum field made msf hard to update.  Another site
listed "recommended" formats in bold lettering.  Msf was listed in faded
lettering.
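
To see why a checksum field makes hand edits awkward, here is a sketch of
a GCG-style position-weighted checksum. The weighting used (positions
cycling from 1 to 57, result taken modulo 10000) is my understanding of the
scheme and should be checked against the format documentation before
relying on it; the function name is made up for this example.

```python
def gcg_style_checksum(sequence):
    """Position-weighted checksum: weights cycle 1..57, result mod 10000."""
    total = 0
    for index, char in enumerate(sequence):
        weight = (index % 57) + 1
        total += weight * ord(char.upper())
    return total % 10000


print(gcg_style_checksum("ACGT"))  # 748
print(gcg_style_checksum("ACGA"))  # 672: one changed residue, new checksum
```

Any edit to a sequence changes its checksum, so the stored value has to be
recomputed every time the file is touched.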

  So I may look into SAF instead.

                                               Cayte


From tarjei_mikkelsen at hotmail.com  Sun Nov 18 02:53:03 2001
From: tarjei_mikkelsen at hotmail.com (Tarjei Mikkelsen)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] First impressions of Pathway
Message-ID: <F72Ezf2HL9OqNHKEn8d0000bde4@hotmail.com>

>Quibbles:
>    I'd prefer more neutral  nomenclature than parent-child because they bias
>the reader toward a tree structure.

'parent' and 'child' are only used for the MultiGraph class, which is a 
generic directed graph representation intended for internal use. It is the 
Network class that is exposed to the user. Network uses 'source' and 'sink' 
as the corresponding terms - although I'm not convinced that is the best 
naming either.

>    df_search seems to assume a connected graph ( at least it looks like it
>would konk out early with a disconnected graph )..  All assumptions should
>be documented.

I've updated the documentation to better reflect what it does.

>   The following line needs a description of what each tuple contains?  Since
>python is typeless it requires more documentation at the interfaces.
>
>     catalysts   -- list of tuples of catalysts involved in the same reaction
>                      step

This has been simplified to a list of catalysts. The type is arbitrary by 
design - it could be anything from a string descriptor to an Enzyme record 
object, depending on the needs of the user.

>   In MultiNetwork.remove_node the sense of the filter is reversed from what
>I'd expect if you want to remove dangling edges.   My understanding is that
>filter returns items that make the condition true?

Yup, good catch. The whole remove_edges method was faulty. It's been 
corrected and a test has been added.


thanks,

Tarjei


_________________________________________________________________
Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp


From biopython-bugs at bioperl.org  Tue Nov 20 17:49:49 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Notification: incoming/54
Message-ID: <200111202249.fAKMnnA22346@pw600a.bioperl.org>

JitterBug notification

new message incoming/54

Message summary for PR#54
	From: <toner@fastmail.ca>
	Subject: toner cartridges
	Date: Tue, 20 Nov 2001 17:48:48
	0 replies 	0 followups


From katel at worldpath.net  Wed Nov 28 20:40:31 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Is Martel always appropriate?
Message-ID: <002601c17876$d5e65280$010a0a0a@cadence.com>

It looks to me that Martel is not a good fit for SAF.  That is because the
appropriate action depends on state information.  If the line starts with
"zebra" append the sequence to the zebra sequence, if it starts with
"giraffe" append to the giraffe sequence, if it starts with "unicorn"
append to the unicorn sequence.    Martel is not oriented to state machines
except as implied by the ordering of expressions.

A simpler approach would be to filter the comments out, then split each line
and use the label component as
a selector.  Please share your opinions.
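
The approach described above can be sketched roughly as follows. The
function name is made up, and the handling of numbering lines (anything
starting with whitespace is skipped) is an assumption about the SAF layout
rather than something taken from the format definition.

```python
def read_labelled_lines(lines):
    """Collect sequence text per label from SAF-like lines."""
    sequences = {}
    for line in lines:
        if line.startswith("#"):
            continue  # comment
        if not line.strip() or line[0].isspace():
            continue  # blank line, or an indented numbering line
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # a label with no sequence text
        label, residues = parts
        # the label selects which running sequence gets extended
        sequences[label] = sequences.get(label, "") + "".join(residues.split())
    return sequences


sample = [
    "# comment\n",
    "zebra    ACGT ACGT\n",
    "giraffe  ACG- ACGT\n",
    "                 10\n",
    "\n",
    "zebra    TTTT\n",
]
print(read_labelled_lines(sample))
```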


Cayte


From katel at worldpath.net  Thu Nov 29 23:03:41 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Useful pathway support
Message-ID: <002701c17954$00971680$010a0a0a@cadence.com>

  A useful project for Pathway would be a converter to E-Cell format, plus
a port of their script interpreter.

  Since E-Cell is released under the GNU General Public License, it should
not be a problem.

                                                                     Cayte


From adalke at mindspring.com  Fri Nov 30 01:04:12 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar  5 14:43:07 2005
Subject: [Biopython-dev] Is Martel always appropriate?
Message-ID: <0c3901c17964$d59b4da0$0301a8c0@josiah.dalkescientific.com>

Cayte:
>It looks to me that Martel is not a good fit for SAF.  That is because the
>appropriate action depends on state information.  If the line starts with
>"zebra" append the sequence to the zebra sequence, if it starts with
>"giraffe" append to the giraffe sequence , if it starts with "unicorn"
>append to the unicorn sequence.
>
>A simpler approach would be to filter the comments out, then split each line
>and use the label component as
>a selector.  Please share your opinions.

I've had a hard time trying to figure out an answer to this.  Perhaps
the best is to start with the philosophy behind Martel.

Many formats need to be read in bioinformatics.  Nearly everyone
writes parsers from scratch.  That wastes time and gets boring.  What
I want is a tool to help identify parts of the text and provide
a standard framework for building a parser.

Strictly speaking, Martel is a tokenizer, not a parser.  That means
it is used to identify parts of the text, but all it does with that
information is pass the subtext and some description off to the real
parser.  The parser takes these two things and does whatever is
appropriate, which is usually building up some sort of data structure.
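
The division of labour can be sketched without Martel at all. In this
simplified illustration (plain regular expressions, made-up function
names), the tokenizer only labels pieces of text and the parser is the only
part that builds a data structure.

```python
import re

_NAME_SEQ = re.compile(r"(?P<name>\S+)[ \t]+(?P<sequence>.*)")


def tokenize(lines):
    """Yield (tag, text) pairs; nothing is interpreted or stored here."""
    for line in lines:
        match = _NAME_SEQ.match(line)
        if line.startswith("#") or match is None:
            yield ("other", line.rstrip("\n"))
        else:
            yield ("name", match.group("name"))
            yield ("sequence", match.group("sequence"))


def build(events):
    """Consume (tag, text) pairs and build a name -> sequence dict."""
    result = {}
    current = None
    for tag, text in events:
        if tag == "name":
            current = text
        elif tag == "sequence":
            result[current] = result.get(current, "") + text.replace(" ", "")
    return result


lines = ["# header\n", "a  AC GT\n", "b  GG\n", "a  TT\n"]
print(build(tokenize(lines)))
```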

The boundary between the two is vague and learned by experience.
For example, at one extreme, here's a format definition for SAF and
every other format.

format = Martel.Group("character", Martel.Re(r"[\000-\377]"))

In this case, the tokenizer isn't doing anything to help understand
the format -- all the work is passed off on the parser, and all
Martel does is provide a SAX-based parser framework.

Martel can help provide better tokenization, even when it doesn't
know everything in the system.  For example, it seems there are three
types of lines in the SAF format:

from Martel import Group, Re, Rep, Str, ToEol

# name_1   EFQEDQENVN PEKAAPAQQP RTRAGLAVLR AGNSRGAGGA PTLPETLNVA
line1_format = Group("line",
                     Group("name", Re(r"[\w]{1,14}")) + \
                     Re("[ \t]+") + \
                     ToEol("sequence"))

#                     10         20         30         40
line2_format = Group("numbers",
                     Re(r" (\d+ *)*\R"))

# any line starting with "#"
comment_line_format = Group("comment", Str("#") + ToEol())

# a line is any ONE of the three kinds, so this is an alternation ("|"),
# not a sequence ("+")
line_format = line1_format | line2_format | comment_line_format

format = Rep(line_format)

All this does is help figure out which lines contain sequences
and which should be ignored, and of those with sequences, it
says which characters belong to the "name" and which belong
to the "sequence".

The parser (the SAX handler) is the one in charge of turning this
information into something useful, and is the one which does the
state transitions.  For example, here's one which might work

from xml.sax import handler

class SAFHandler(handler.ContentHandler):
  def startDocument(self):
    self.capture = 0
    self.text = None
    self.sequences = {}
    self.guide_name = None
    self.current_name = None
    self.block = {}

  def startElement(self, tag, attrs):
    if tag == "name" or tag == "sequence":
      self.capture = 1
      self.text = ""

  def characters(self, text):
    # may be called several times for one element, so accumulate
    if self.capture:
      self.text = self.text + text

  def _new_block(self):
    # append each sequence in the finished block to the running totals
    for name, seq in self.block.items():
      self.sequences[name] = self.sequences.get(name, "") + seq
    self.block.clear()

  def endElement(self, tag):
    if tag == "name":
      self.current_name = self.text
      if self.text == self.guide_name:  # start a new block
        self._new_block()
      elif self.guide_name is None:     # keep track of the guide name
        self.guide_name = self.text
    elif tag == "sequence":
      if self.current_name not in self.block:  # no duplicates in a block
        self.block[self.current_name] = self.text.replace(" ", "")
    self.capture = 0

  def endDocument(self):
    self._new_block()

After parsing, the handler's 'sequences' should contain all the
sequence entries, and its 'guide_name' tells which of those should be
listed first.  (The format definition at
  http://www.embl-heidelberg.de/predictprotein/Dexa/optin_safDes.html
implies the order of the other items is arbitrary.)


So in this case, Martel isn't powerful enough to detect when new
blocks arise, but it is able to provide some context to simplify
matters for the parser.  On the other hand, here's how the parser
would be written in a more standard style

import re

_line_pat = re.compile(r"([^ ]{1,14})[ \t]+(.*)")

def _new_block(block, sequences):
  # append each sequence in the finished block to the running totals
  for name, seq in block.items():
    sequences[name] = sequences.get(name, "") + seq
  block.clear()

def parse(infile):
  block = {}
  sequences = {}
  guide_name = None
  for line in infile.readlines():  # assume enough memory
    if line.startswith("#"):
      continue
    # skip blank lines; a leading space is an approximation for the
    # 'numbers' lines
    if not line.strip() or line.startswith(" "):
      continue
    # Split into the name and the rest of the line
    m = _line_pat.match(line)
    if m is None:
      raise ValueError("bad format")
    name = m.group(1)
    seq = m.group(2)
    if name == guide_name:  # start a new block
      _new_block(block, sequences)
    elif guide_name is None:
      guide_name = name
    if name not in block:  # no duplicates in a block
      block[name] = seq.replace(" ", "")
  _new_block(block, sequences)
  return sequences

This is about 30 lines, compared to about 55.  It's shorter and
easier to understand.  As you say, it's simpler.

A reason it's simpler is because the format is very simple.
There's almost no structure to it.  Another reason is
because the Martel handler needs to store possibly multiple
'characters' callbacks.  It's also simpler because the parser
only needs to do one thing -- build up a data structure.  This
SAF format doesn't need things like an XML markup generator or an
indexer or support for multiple variations of a format.  Martel
loses partially because it is too flexible.
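
The point about multiple 'characters' callbacks applies to any SAX
handler, and can be shown with the standard library alone. This sketch uses
xml.sax on a small XML string as a stand-in for Martel's event stream; the
class name is made up.

```python
import xml.sax
from xml.sax import handler


class TextCollector(handler.ContentHandler):
    """Accumulate text, since one element may arrive in several chunks."""

    def __init__(self):
        handler.ContentHandler.__init__(self)
        self.chunks = []
        self.calls = 0

    def characters(self, content):
        # A SAX parser is free to split an element's text across calls,
        # so each piece is stored and joined only at the end.
        self.calls += 1
        self.chunks.append(content)


collector = TextCollector()
xml.sax.parseString(b"<name>zebra &amp; giraffe</name>", collector)
text = "".join(collector.chunks)
print(collector.calls, repr(text))
```

A handler that simply overwrote its buffer in characters() would keep only
the last chunk, which is why the bookkeeping cannot be skipped.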

Addressing your statement:
>Martel is not oriented to state machines
>except as implied by the ordering of expressions.

You are correct.  But Martel is only intended to be half the
solution.  The other half is the callback handler.  Together
they handle SAF just fine, except with about twice the complexity
if all you want to do is turn SAF into a data structure.

It's possible to conceive of a different way to receive callbacks
which is specialized for this case.  Consider something like this:

from Martel import SimpleHandler

# "SimpleHandler" is a hypothetical base class which stores all
# the events inside a set of named elements and returns them as
# a simple dictionary data structure where the key is the
# tag name and the values are all of the matching text.  Hierarchical
# data structures are ignored.  This is similar to Sean McGrath's
# RAX approach.

class MyHandler(SimpleHandler.SimpleHandler):
  def startDocument(self):
    self.sequences = {}
    self.block = {}
    self.guide_name = None

  def parseLine(self, terms):
    name = terms["name"]
    if self.guide_name == name:  # start a new block
      self._new_block()
    elif self.guide_name is None:
      self.guide_name = name
    if name not in self.block:  # no duplicates in a block
      self.block[name] = terms["sequence"].replace(" ", "")

  def endDocument(self):
    self._new_block()

  def _new_block(self):
    for name, seq in self.block.items():
      self.sequences[name] = self.sequences.get(name, "") + seq
    self.block.clear()

handler = MyHandler(want = ("line",))
parser.setContentHandler(handler)
parser.parse(file)

and this is of comparable length to the hand-rolled code.

If this task of parsing simple data structures is common
enough then perhaps the best solution is to write a specialized
handler base class which abstracts out the generic dirty work.
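
One way such a base class might look, sketched on top of the standard
xml.sax ContentHandler. Everything here is hypothetical: the class name,
the "want" argument and the record() hook are illustrative choices, and a
small XML string stands in for the events Martel would generate.

```python
import xml.sax
from xml.sax import handler


class SimpleHandler(handler.ContentHandler):
    """Hypothetical base class: gather text per tag, report per record."""

    def __init__(self, want):
        handler.ContentHandler.__init__(self)
        self.want = set(want)   # tags that delimit one record
        self._stack = []        # currently open tags
        self._terms = None      # tag -> text for the record in progress

    def startElement(self, tag, attrs):
        if tag in self.want:
            self._terms = {}
        self._stack.append(tag)

    def characters(self, content):
        # text may arrive in pieces, so append to what is already stored
        if self._terms is not None and self._stack:
            tag = self._stack[-1]
            self._terms[tag] = self._terms.get(tag, "") + content

    def endElement(self, tag):
        self._stack.pop()
        if tag in self.want and self._terms is not None:
            self.record(tag, self._terms)
            self._terms = None

    def record(self, tag, terms):
        """Called once per wanted element; override in subclasses."""


class Collect(SimpleHandler):
    def __init__(self):
        SimpleHandler.__init__(self, want=("line",))
        self.rows = []

    def record(self, tag, terms):
        self.rows.append((terms["name"], terms["sequence"].replace(" ", "")))


doc = (b"<doc>"
       b"<line><name>zebra</name><sequence>AC GT</sequence></line>"
       b"<line><name>giraffe</name><sequence>GG</sequence></line>"
       b"</doc>")
collector = Collect()
xml.sax.parseString(doc, collector)
print(collector.rows)
```

The subclass only sees one flat dictionary per record, which is the
"generic dirty work" being abstracted away.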

                    Andrew
                    dalke@dalkescientific.com
P.S.
  And no, I didn't test any of this code :)