From dag at sonsorol.org Sun Nov 4 17:29:33 2001
From: dag at sonsorol.org (chris dagdigian)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Computational biology course members looking for project ideas
Message-ID: <3BE5C14D.4080808@sonsorol.org>

Hi folks,

This email came through to our newly established "volunteer@open-bio.org" mail address. I'm forwarding it to the various lists and to people who may have some project ideas for Jace.

Rather than individually bombarding Jace with requests, it would probably be best for anyone who has a project idea to email their proposals back to volunteer@open-bio.org. We'll put the ideas together and respond back to Princeton.

Regards,
Chris
-------------- next part --------------
An embedded message was scrubbed...
From: "Jace Kohlmeier"
Subject: [Volunteer] project request
Date: Fri, 2 Nov 2001 16:11:42 -0500
Size: 2959
Url: http://portal.open-bio.org/pipermail/biopython-dev/attachments/20011104/e59ec35b/Volunteerprojectrequest.eml

From tarjei_mikkelsen at hotmail.com Sun Nov 4 23:42:13 2001
From: tarjei_mikkelsen at hotmail.com (Tarjei Mikkelsen)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Pathway module
Message-ID:

I've committed the following modules to the CVS tree:

Bio.Pathway
Bio.KEGG.Map
Bio.MetaTool.Input

Together they form my first (and rather rudimentary) attempt at creating classes for representing and working with metabolic and signalling pathways.

Bio.Pathway contains classes for representing collections of biochemical reactions of the type A + B <-> C (Reaction/System), and classes for representing explicit networks of arbitrary interactions (Interaction/Network).

Bio.KEGG.Map contains a parser for reading a KEGG metabolic pathway map into Reaction/System objects.

Bio.MetaTool.Input contains a function for converting a System object into a string that can be used as input to the MetaTool program.

Sample usage can be deduced from the corresponding test files.
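To make the Reaction/System idea concrete, here is a minimal, self-contained sketch of one way a collection of reactions of the type A + B <-> C could be represented. The names SketchReaction and SketchSystem are hypothetical and are not the Bio.Pathway API; the committed modules and their test files remain the reference for actual usage.

```python
class SketchReaction:
    """Hypothetical stand-in for a reaction (NOT the Bio.Pathway class).

    stoichiometry maps a species name to a signed integer coefficient:
    negative for species that are consumed, positive for species produced.
    """

    def __init__(self, stoichiometry, reversible=True):
        self.stoichiometry = dict(stoichiometry)
        self.reversible = reversible

    def species(self):
        """Return the names of all species taking part, sorted."""
        return sorted(self.stoichiometry)

    def _side(self, sign):
        # Collect the terms on one side of the arrow (sign -1: left, +1: right).
        terms = []
        for name in sorted(self.stoichiometry):
            coefficient = self.stoichiometry[name]
            if coefficient * sign > 0:
                count = abs(coefficient)
                terms.append(name if count == 1 else "%d %s" % (count, name))
        return " + ".join(terms)

    def __str__(self):
        arrow = "<->" if self.reversible else "->"
        return "%s %s %s" % (self._side(-1), arrow, self._side(+1))


class SketchSystem:
    """Hypothetical container for a collection of reactions."""

    def __init__(self):
        self.reactions = []

    def add_reaction(self, reaction):
        self.reactions.append(reaction)

    def species(self):
        """Return every species mentioned by any reaction, sorted."""
        found = set()
        for reaction in self.reactions:
            found.update(reaction.species())
        return sorted(found)


system = SketchSystem()
system.add_reaction(SketchReaction({"A": -1, "B": -1, "C": 1}))
system.add_reaction(SketchReaction({"C": -1, "D": 1}, reversible=False))

for reaction in system.reactions:
    print(reaction)
print(system.species())
```

Storing signed coefficients in one mapping keeps reactants and products in a single structure; that is only one possible design and says nothing about how the committed code does it.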
This is very much a prototype so I welcome anyone interested to have a look and poke at it (and rip it apart). I don't recommend that it is included in the actual Biopython distribution until it is a bit more fleshed out and tested, but I'll leave that up to whoever makes those decisions.

thanks,
Tarjei Mikkelsen
tarjei@genome.wi.mit.edu

From jchang at SMI.Stanford.EDU Mon Nov 5 19:42:41 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/48
In-Reply-To: <01102418073807.14420@sienna.berkeley.edu>
References: <200110242350.f9ONoEB24543@pw600a.bioperl.org> <01102418073807.14420@sienna.berkeley.edu>
Message-ID:

Good catch! It's fixed in the repository.

Thanks,
Jeff

At 6:03 PM -0700 10/24/01, Gavin E. Crooks wrote:
>The new code doesn't work as intended, since parse() may raise an exception.
>
>This
>
>    def parse_file(self, filename):
>        h = open(filename)
>        retval = self.parse(h)
>        h.close()
>        return retval
>
>should be
>
>    def parse_file(self, filename):
>        h = open(filename)
>        try:
>            return self.parse(h)
>        finally:
>            h.close()
>
>Gavin
>
>p.s. The viewcvs diff appears to be broken.
>
>
>On Wed, 24 Oct 2001, you wrote:
>> JitterBug notification
>>
>> jchang changed notes
>>
>> Message summary for PR#48
>>   From: gec@compbio.berkeley.edu
>>   Subject: Unclosed file
>>   Date: Wed, 24 Oct 2001 13:17:43 -0400
>>   0 replies 0 followups
>>   Notes: It gets closed implicitly as the reference in parse goes out of scope. However,
>>   you're right that it's better to be done explicitly, so I've made the changes in
>>   the file.
>>
>> Thanks,
>> Jeff
>>
>>
>> ====> ORIGINAL MESSAGE FOLLOWS <====
>>
>> From gec@compbio.berkeley.edu Wed Oct 24 13:17:43 2001
>> Received: from localhost (localhost [127.0.0.1])
>>     by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9OHHgB21133
>>     for ; Wed, 24 Oct 2001 13:17:43 -0400
>> Date: Wed, 24 Oct 2001 13:17:43 -0400
>> Message-Id: <200110241717.f9OHHgB21133@pw600a.bioperl.org>
>> From: gec@compbio.berkeley.edu
>> To: biopython-bugs@bioperl.org
>> Subject: Unclosed file
>>
>> Full_Name: Gavin Crooks
>> Module: ParserSupport.AbstractParser
>> Version:
>> OS:
>> Submission from: sdn-ar-005casfrmp182.dialsprint.net (158.252.212.184)
>>
>>
>> AbstractParser.parse_file(self,filename) does not close the file it opens.
>>
>>
>> _______________________________________________
>> Biopython-dev mailing list
>> Biopython-dev@biopython.org
>> http://biopython.org/mailman/listinfo/biopython-dev

From biopython-bugs at bioperl.org Mon Nov 5 19:59:53 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/31
Message-ID: <200111060059.fA60xrB09668@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#31
  From: hy263book@263.net
  Subject: When I encounter "No hits found"
  Date: Wed, 16 May 2001 04:14:35 -0400
  0 replies 0 followups
  Notes: fixed in subsequent emails - jchang

====> ORIGINAL MESSAGE FOLLOWS <====

From hy263book@263.net Wed May 16 04:14:35 2001
Received: from localhost (localhost [127.0.0.1])
    by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f4G8EYb32187
    for ; Wed, 16 May 2001 04:14:35 -0400
Date: Wed, 16 May 2001 04:14:35 -0400
Message-Id: <200105160814.f4G8EYb32187@pw600a.bioperl.org>
From: hy263book@263.net
To: biopython-bugs@bioperl.org
Subject: When I encounter "No hits found"

Full_Name: Huang Ying
Module: Bio.Blast.NCBIStandalone
Version:
OS: Win2k
Submission from: (NULL) (166.111.30.26)

I use Bio.Blast.NCBIStandalone.BlastParser to analyse Blast reports. When the blast result is "No hits found", python sends the wrong message.

From biopython-bugs at bioperl.org Mon Nov 5 19:59:53 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Notification: incoming/31
Message-ID: <200111060059.fA60xrB09672@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#31 from incoming to fixed-bugs

Message summary for PR#31
  From: hy263book@263.net
  Subject: When I encounter "No hits found"
  Date: Wed, 16 May 2001 04:14:35 -0400
  0 replies 0 followups
  Notes: fixed in subsequent emails - jchang

====> ORIGINAL MESSAGE FOLLOWS <====

From hy263book@263.net Wed May 16 04:14:35 2001
Received: from localhost (localhost [127.0.0.1])
    by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f4G8EYb32187
    for ; Wed, 16 May 2001 04:14:35 -0400
Date: Wed, 16 May 2001 04:14:35 -0400
Message-Id: <200105160814.f4G8EYb32187@pw600a.bioperl.org>
From: hy263book@263.net
To: biopython-bugs@bioperl.org
Subject: When I encounter "No hits found"

Full_Name: Huang Ying
Module: Bio.Blast.NCBIStandalone
Version:
OS: Win2k
Submission from: (NULL) (166.111.30.26)

I use Bio.Blast.NCBIStandalone.BlastParser to analyse Blast reports. When the blast result is "No hits found", python sends the wrong message.

From jchang at SMI.Stanford.EDU Mon Nov 5 20:07:30 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Pathway module
In-Reply-To:
References:
Message-ID:

>This is very much a prototype so I welcome anyone interested to have
>a look and poke at it (and rip it apart). I don't recommend that it
>is included in the actual Biopython distribution until it is a bit
>more fleshed out and tested, but I'll leave that up to whoever makes
>those decisions.
I won't include it if it's against your wishes. However, since Biopython is still at alpha, and likely to remain there at least through the next release, I think it's OK to put in experimental code. The people using it now are early adopters that are likely to be able to help flesh things out.

Jeff

From katel at worldpath.net Tue Nov 6 02:25:53 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Pathway module
References:
Message-ID: <001b01c16694$45c02da0$010a0a0a@cadence.com>

----- Original Message -----
From: "Tarjei Mikkelsen"
To:
Sent: Sunday, November 04, 2001 8:42 PM
Subject: [Biopython-dev] Pathway module

>
> I've committed the following modules to the CVS tree:
>
> Bio.Pathway
> Bio.KEGG.Map
> Bio.MetaTool.Input
>
> Together they form my first (and rather rudimentary) attempt at creating
> classes for representing and working with metabolic and signalling pathways.
>
> Bio.Pathway contains classes for representing collections of biochemical
> reactions of the type A + B <-> C (Reaction/System), and classes for
> representing explicit networks of arbitrary interactions
> (Interaction/Network).
>
> Bio.KEGG.Map contains a parser for reading a KEGG metabolic pathway map into
> Reaction/System objects.
>
> Bio.MetaTool.Input contains a function for converting a System object into a
> string that can be used as input to the MetaTool program.
>
> Sample usage can be deduced from the corresponding test files.
>
> This is very much a prototype so I welcome anyone interested to have a look
> and poke at it (and rip it apart). I don't recommend that it is included in
> the actual Biopython distribution until it is a bit more fleshed out and
> tested, but I'll leave that up to whoever makes those decisions.
>

Great!!! I hope to have time to look into it Wednesday.

Cayte

From gec at compbio.berkeley.edu Tue Nov 6 14:09:04 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
Message-ID: <01110611111207.04148@sienna.berkeley.edu>

I have just installed biopython using the latest code in CVS. A whole bunch of tests fail. Are these my problems, or biopython's?

Gavin Crooks
gec@compbio.berkeley.edu
http://threeplusone.com

======================================================================
ERROR: test_KEGG
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "./test_KEGG.py", line 8, in ?
    from Bio.KEGG import Map
ImportError: cannot import name Map
======================================================================
ERROR: test_Pathway
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "./test_Pathway.py", line 10, in ?
    from Bio.Pathway import *
ImportError: No module named Pathway
======================================================================
ERROR: test_intelligenetics
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "./test_intelligenetics.py", line 29, in ?
    src_handle = open( datafile )
IOError: [Errno 2] No such file or directory: 'IntelliGenetics/TAT_mase_nuc.txt'
======================================================================
ERROR: test_metatool
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "./test_metatool.py", line 29, in ?
    src_handle = open( datafile )
IOError: [Errno 2] No such file or directory: 'MetaTool/meta.out'
======================================================================
FAIL: test_GenBank
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 153, in runTest
    expected_handle)
  File "./run_tests.py", line 247, in compare_output
    assert expected_line == output_line, \
AssertionError:
Output  : "keys: ['L31939', 'AJ237582', 'X62281', 'AF297471', 'M81224', 'X55053']\n"
Expected: "keys: ['X55053', 'M81224', 'AF297471', 'X62281', 'L31939', 'AJ237582']\n"
======================================================================
FAIL: test_SubsMat
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./run_tests.py", line 153, in runTest
    expected_handle)
  File "./run_tests.py", line 247, in compare_output
    assert expected_line == output_line, \
AssertionError:
Output  : 'M -0.0 0.4 0.7 0.8 1.0\n'
Expected: 'M 0.0 0.4 0.7 0.8 1.0\n'
----------------------------------------------------------------------
Ran 31 tests in 71.317s

From jchang at SMI.Stanford.EDU Tue Nov 6 14:45:16 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To: <01110611111207.04148@sienna.berkeley.edu>
References: <01110611111207.04148@sienna.berkeley.edu>
Message-ID:

The import errors seem to apply to new modules that Tarjei put in. Do you have those in your directory?

It looks like there are missing files in intelligenetics and metatool. Cayte, could you check on those?

GenBank is a bug in the regression tests that should be fixed.

SubsMat is a known problem that hasn't been fixed yet.

Jeff

From tarjei_mikkelsen at hotmail.com Tue Nov 6 14:51:35 2001
From: tarjei_mikkelsen at hotmail.com (Tarjei Mikkelsen)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
Message-ID:

The KEGG and Pathway errors are probably caused by the setup script failing to install them. I've committed a quick fix for that. Alternatively, you can just copy the Bio/Pathway and Bio/KEGG directories to the /lib/pythonX/site-packages/Bio directory.

- Tarjei

>=====================================================================
>ERROR: test_KEGG
>----------------------------------------------------------------------
>Traceback (most recent call last):
>  File "./run_tests.py", line 136, in runTest
>    __import__(self.test_name)
>  File "./test_KEGG.py", line 8, in ?
>    from Bio.KEGG import Map
>ImportError: cannot import name Map
>======================================================================
>ERROR: test_Pathway
>----------------------------------------------------------------------
>Traceback (most recent call last):
>  File "./run_tests.py", line 136, in runTest
>    __import__(self.test_name)
>  File "./test_Pathway.py", line 10, in ?
>    from Bio.Pathway import *
>ImportError: No module named Pathway
>======================================================================

From tarjei_mikkelsen at hotmail.com Tue Nov 6 14:54:40 2001
From: tarjei_mikkelsen at hotmail.com (Tarjei Mikkelsen)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Pathway module
Message-ID:

>I won't include it if it's against your wishes. However, since
>Biopython is still at alpha, and likely to remain there at least
>through the next release, I think it's OK to put in experimental
>code. The people using it now are early adopters that are likely to
>be able to help flesh things out.

Okay, that's fine. I've added them to the setup.py script as noted in my previous email.

- Tarjei

From chapmanb at arches.uga.edu Tue Nov 6 15:02:21 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To:
References: <01110611111207.04148@sienna.berkeley.edu>
Message-ID: <20011106150221.A26736@ci350185-a.athen1.ga.home.com>

Jeff:
> GenBank is a bug in the regression tests that should be fixed.

Yup. I'm actually working on the GenBank modules right now and noticed this problem. I'll fix it when I update those modules (hopefully tonight :-).

Thanks for the heads up.

Brad
--
PGP public key available from http://pgp.mit.edu/

From gec at compbio.berkeley.edu Tue Nov 6 15:04:08 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar 5 14:43:06 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To:
References: <01110611111207.04148@sienna.berkeley.edu>
Message-ID: <01110612111908.04148@sienna.berkeley.edu>

> The import errors seem to apply to new modules that Tarjei put in.
> Do you have those in your directory?

The import errors seem to have fixed themselves. Perhaps python was looking at an older biopython version?!

> It looks like there are missing files in intelligenetics and
> metatool. Cayte, could you check on those?

IntelliGenetics/TAT_mase_nuc.txt is IntelliGenetics/TAT_mase-nuc.txt in CVS, and there is no meta.out file.

Gavin

From jchang at SMI.Stanford.EDU Tue Nov 6 15:13:32 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To: <01110611111207.04148@sienna.berkeley.edu>
References: <01110611111207.04148@sienna.berkeley.edu>
Message-ID:

I've fixed the GenBank bug (was printing out the keys to a dictionary) and one I found in Fasta (name of string type changed in Python 2.2).

Jeff

At 11:09 AM -0800 11/6/01, Gavin E.
Crooks wrote:
>I have just installed biopython using the latest code in CVS. A whole bunch of
>tests fail. Are these my problems, or biopython's?

From gec at compbio.berkeley.edu Tue Nov 6 15:21:14 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Failed tests.
In-Reply-To:
References:
Message-ID: <01110612225009.04148@sienna.berkeley.edu>

On Tue, 06 Nov 2001, you wrote:
> The KEGG and Pathway errors are probably caused by the setup script failing
> to install them. I've committed a quick fix for that. Alternatively, you can
> just copy the Bio/Pathway and Bio/KEGG directories to
> /lib/pythonX/site-packages/Bio directory.
>
>
> - Tarjei
>

Yes, that makes sense now.
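When import errors like these come and go, it can help to ask Python which files it would actually import. The following is a minimal sketch using the standard library's importlib (available in current Python 3, not in the Python 2.x releases discussed in this thread); the helper name locate is hypothetical, and the printed paths depend entirely on the local installation.

```python
import importlib.util


def locate(module_name):
    """Return the file a module would be imported from, or None if not found.

    Note: for a dotted name such as "Bio.Pathway", find_spec imports the
    parent package ("Bio") in order to search inside it.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is itself missing.
        return None
    return spec.origin if spec is not None else None


# Report where each package would come from on this machine.
for name in ("Bio", "Bio.KEGG", "Bio.Pathway"):
    print(name, "->", locate(name))
```

A result of None for Bio.Pathway alongside a real path for Bio would be consistent with an installation that predates the new modules, or with an older copy earlier on the import path.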
Thanks,
Gavin

From gec at compbio.berkeley.edu Tue Nov 6 15:32:12 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Python version
In-Reply-To:
References: <01110517352006.04148@sienna.berkeley.edu>
Message-ID: <0111061236540B.04148@sienna.berkeley.edu>

On Tue, 06 Nov 2001, you wrote:
> >And is there a particular version of python we should be programming
> >to? I just loaded up the latest version, and assumed that anything
> >available would be available. Perhaps I should downgrade?
>
> We're currently only requiring Python 2.0, but perhaps it's time to
> reevaluate that.
>
> Jeff

One advantage of moving biopython to python 2.1 is that you can presumably remove all of the PyUnit code that's in Biopython, since PyUnit is now included.

Perhaps we could always use the latest stable python version, at least so long as biopython is in alpha!

Gavin

From gec at compbio.berkeley.edu Tue Nov 6 16:09:59 2001
From: gec at compbio.berkeley.edu (Gavin E. Crooks)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Fwd: Unit tests
Message-ID: <0111061310310D.04148@sienna.berkeley.edu>

I have been working on updating the SCOP module. Now, I am a big fan of unit tests, and I find myself writing unit tests for almost everything. Not simple regression tests, but proper unit tests using the PyUnit framework.

I am having problems understanding how to integrate my tests into biopython's framework. Most of the things in biopython/tests appear to be regression tests with a PyUnit coating.

After looking around a bit, I found this unit test description from Zope.
http://cvs.zope.org/Zope/doc/UNITTEST.txt?rev=1.2&content-type=text/vnd.viewcvs-markup

Not only does this document clearly describe what a unit test should be, it also describes how the Zope tests are organized. Here is the important bit.

>Writing Unit Tests For The Zope Core
>
>  If you're writing core code, you probably don't need to listen to
>  any more of this. :-) The rules for writing tests for Zope core
>  code are simple:
>
>    - The testing code should make use of PyUnit
>      (/lib/python/unittest.py). Instructions for using PyUnit are
>      available at http://pyunit.sourceforge.net.
>
>    - Tests must be placed in a "tests" subdirectory of the package or
>      directory in which the core code you're testing lives.
>
>    - Test modules should be named something which represents the
>      functionality they test, and should begin with the prefix "test."
>      E.g., a test module for BTree should be named testBTree.py.
>
>    - An individual test module should take no longer than 60 seconds
>      to complete.

This is very similar to one of the two main ways of organizing JUnit tests in the Java community.

I think this would be a good way to organize biopython's unit tests. Thoughts? Comments?

Gavin Crooks
gec@compbio.berkeley.edu
http://threeplusone.com/

On Tue, 06 Nov 2001, you wrote:
> No objections here. Brad can probably give you better insight about
> the regression tests, since he did the coding for it.
>
> Jeff
>
>
> >Perhaps we should move this conversation to biopython-dev?
> >
> >On Tue, 06 Nov 2001, you wrote:
> >> >Could you explain what the relation of Tests/UnitTest is to the
> >> >rest of the tests? It's all very confusing.
> >>
> >> Yeah. We're using python-style regression testing, which means that
> >> a test suite is just a python script whose name begins with "test_"
> >> and outputs a bunch of text. Doing a regression test means running
> >> the script and checking it against previous output.
> >>
> >> We're using unittest.py under the hood to do the checking. Thus,
> >> we're not taking advantage of all the nice features that it provides.
> >> While it would be worth considering moving the whole system over, I'm
> >> not sure anybody wants to go back and rewrite all the old tests.
> >>
> >>
> >>
> >> >And is there a particular version of python we should be programming
> >> >to? I just loaded up the latest version, and assumed that anything
> >> >available would be available. Perhaps I should downgrade?
> >>
> >> We're currently only requiring Python 2.0, but perhaps it's time to
> >> reevaluate that.
> >>
> >> Jeff

-------------------------------------------------------

From pewilkinson at informaxinc.com Tue Nov 6 17:29:36 2001
From: pewilkinson at informaxinc.com (Peter Wilkinson)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] RE: Refseq Data
In-Reply-To: <200110311701.f9VH1cB31871@pw600a.bioperl.org>
Message-ID: <002301c16712$848f9880$331ea8c0@l001696w00>

Hi Brad,

I tried your update (the most recent changes you mentioned); it is not working with the nucleotide records either right now. I get the following error:

  File "D:\Program Files\Python21\Bio\GenBank\__init__.py", line 1205, in feed
    self._parser.parseFile(handle)
  File "D:\Program Files\Python21\Martel\Parser.py", line 226, in parseFile
    self.parseString(fileobj.read())
  File "D:\Program Files\Python21\Martel\Parser.py", line 254, in parseString
    self._err_handler.fatalError(result)
  File "D:\Program Files\Python21\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 379

I am puzzled about something though. I just downloaded the 'latest' files from the web from the Bio/GenBank directory. However, the CVS viewer from the web shows that the files are 3 weeks old. I will try to go in with the command line and see what I can find ...

Did you commit the changes to the CVS tree, or is the webCVS viewer doing something funky?
Peter

> --__--__--
>
> Message: 2
> From: "Jeong Joung"
> To: "Brad Chapman"
> Cc:
> Date: Tue, 30 Oct 2001 15:48:19 -0500
> Subject: [Biopython-dev] Parsing Protein GenBank Records
>
> Hi,
>
> I just found out that this problem occurs on some REFSEQ nucleotide
> records as well.
>
> Thank You,
> Jeong
>
> -----Original Message-----
> From: Jeong Joung [mailto:j.joung@AptusGenomics.com]
> Sent: Tuesday, October 30, 2001 2:17 PM
> To: Brad Chapman
> Cc: biopython-dev@biopython.org
> Subject: RE: Parsing Protein GenBank Records
>
>
> Hello,
>
> Thanks for your help. The updated parser now works well for most REFSEQ
> proteins. I came across several REFSEQ protein records where the parser
> still fails on a UNIX machine. The following is the error message:
>
> Traceback (most recent call last):
>     entry = parser.parse(gb_handle)
>   File "/usr/.../Bio/GenBank/__init__.py", line 281, in parse
>     self._scanner.feed(handle, self_consumer)
>   File "/usr/.../Bio/GenBank/__init__.py", line 1143, in feed
>     self._parser.parseFile(handle)
>   File "/usr/.../Martel/Parser.py", line 226, in parseFile
>     self.parseString(fileobj.read())
>   File "/usr/.../Martel/Parser.py", line 254, in parseString
>     self._err_handler.fatalError(result)
>   File "/usr/.../python2.1/xml/sax/handler.py", line 38, in fatalError
>     raise exception
> ParserPositionException: error parsing at or beyond character 2889
>
> Any help will be greatly appreciated.
>
> Thank You,
> Jeong
>
> -----Original Message-----
> From: Brad Chapman [mailto:chapmanb@arches.uga.edu]
> Sent: Tuesday, September 18, 2001 9:26 PM
> To: Jeong Joung
> Cc: biopython-dev@biopython.org
> Subject: Re: Parsing Protein GenBank Records
>
>
> Hi Joung;
> (ccing this to biopython-dev since this is relevant to everyone)
>
> > I'm having trouble parsing GenBank records obtained from the protein
> > database. The parser works fine for nucleotide GenBank records, but not
> > for protein records. I would appreciate it very much if you can guide me
> > in the right direction for parsing such records.
> >
> > Here is the code and the error that I get back.
> >
> > >>> parser = GenBank.RecordParser()
> > >>> ncbi = GenBank.NCBIDictionary(database='Protein')
> > >>> rec = ncbi['6754304']
>
> The parser does work for proteins in general, but does fail badly on
> this particular REFSEQ sequence. In the past, REFSEQ stuff has been
> only "sort of" GenBank format, and this record is no exception. It
> has a lot of formatting problems (has no identifier for the sequence
> type in the LOCUS line, has extra DBSOURCE tag, has non-standard
> feature table types and keys (Protein, Region, region_name)).
> Anyways, it is a big non-standard formatting mess.
>
> I've fixed the GenBank parser to be able to handle this, and checked
> the changes into CVS. Diffs to the relevant files (Record.py,
> __init__.py and genbank_format.py in Bio.GenBank) are also attached
> to this file in case you don't have CVS access.
>
> Thanks for the bug report. Hope this works for you!
>
> Brad
> --
> PGP public key available from http://pgp.mit.edu/
>
> End of Biopython-dev Digest

From chapmanb at arches.uga.edu Tue Nov 6 19:58:10 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Python version
In-Reply-To: <0111061236540B.04148@sienna.berkeley.edu>
References: <01110517352006.04148@sienna.berkeley.edu> <0111061236540B.04148@sienna.berkeley.edu>
Message-ID: <20011106195809.C26903@ci350185-a.athen1.ga.home.com>

Gavin:
> > >And is there a particular version of python we should be programming
> > >to?

Jeff:
> > We're currently only requiring Python 2.0, but perhaps it's time to
> > reevaluate that.

Gavin:
> One advantage of moving biopython to python 2.1 is that you can
> presumably remove all of the PyUnit code that's in Biopython, since PyUnit
> is now included.
>
> Perhaps we could always use the latest stable python version, at least
> so long as biopython is in alpha!

I think you may have misunderstood Jeff. Python 2.0 is the minimum version needed for biopython. I use biopython with 2.0, 2.1 and 2.2 pre-releases (depending on how lazy I am at updating the python on the machine), and everything works fine.

The regression tests sometimes report problems between different python versions and on different architectures (and in different phases of the moon :-), and we try to take care of these when they are noticed. If you run a test itself (i.e. python test_Whatever.py), then it should work fine, but the regression comparison with the old output is what will fail. We do our best to keep these up to date, but I know I have been especially slack in working on this recently due to the excessive amount of lab work I've had to do.
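The test_GenBank failure reported earlier in this thread (dictionary keys printed in a different order than the stored expected output) is an instance of exactly this: the code under test is fine, but the text comparison is not. Below is a minimal sketch of the effect and of the usual remedy, sorting before printing. The helper names (first_mismatch, fragile_report, stable_report) are hypothetical and are not taken from run_tests.py, and the sketch is written for current Python 3.

```python
from itertools import zip_longest


def first_mismatch(expected_lines, output_lines):
    """Compare two outputs line by line.

    Returns the first (expected, output) pair that differs, or None when
    the outputs are identical. A missing line shows up as None in the pair.
    """
    for expected, output in zip_longest(expected_lines, output_lines):
        if expected != output:
            return expected, output
    return None


def fragile_report(index):
    # The order of the printed keys depends on how the dictionary was built
    # (and, in older Pythons, on the interpreter version and platform).
    return ["keys: %s\n" % list(index.keys())]


def stable_report(index):
    # Sorting makes the text independent of dictionary ordering.
    return ["keys: %s\n" % sorted(index.keys())]


# Two dictionaries with identical contents but different key order.
stored_run = {"X55053": 1, "M81224": 2, "L31939": 3}
current_run = {"L31939": 3, "X55053": 1, "M81224": 2}

print(stored_run == current_run)
print(first_mismatch(fragile_report(stored_run), fragile_report(current_run)))
print(first_mismatch(stable_report(stored_run), stable_report(current_run)))
```

The two dictionaries compare equal, so a direct test of the data passes; only the fragile text report differs between runs, and the sorted report does not.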
Too-many-mini-preps-to-think-clearly-about-regression-tests-ly yr's,

Brad
--
PGP public key available from http://pgp.mit.edu/

From chapmanb at arches.uga.edu Tue Nov 6 20:15:12 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Fwd: Unit tests
In-Reply-To: <0111061310310D.04148@sienna.berkeley.edu>
References: <0111061310310D.04148@sienna.berkeley.edu>
Message-ID: <20011106201512.D26903@ci350185-a.athen1.ga.home.com>

Hi Gavin;

> I have been working on updating the SCOP module.

Great!

> Now, I am a big fan of unit tests, and I find myself writing
> unit tests for almost everything.

Me too. My testing ability has improved dramatically since starting on
biopython -- and now for all of my independent work (and
biopython-corba), I code unit tests like crazy.

> I am having problems understanding how to integrate my tests
> into biopython's framework. Most of the things in biopython/tests
> appear to be regression tests with a PyUnit coating.

Yes, I used PyUnit to build up the regression testing framework (based
heavily on the regression tests that Andrew already had). I mostly just
use PyUnit to deal with printing the output, etc., but it is definitely
a regression test framework.

The regression testing framework, and how to integrate tests into it,
is described on a wiki page:

http://biopython.org/wiki/html/BioPython/RegressionTests.html

(damn, this still uses br_regression.py instead of run_tests.py. I need
to update this).

> After looking around a bit, I found this unit test description from Zope.
[...]
> I think this would be a good way to organize biopython's unit tests.
> Thoughts? Comments?

I think it's great to write an individual test for a module
(test_whatever.py) in whatever way you feel most comfortable with. I
agree with Jeff that we don't want to rewrite all of the tests now
(gack!) but definitely feel free to write new tests as unit tests; they
will still plug into the high-level regression testing framework just
fine.

Personally, I think that although the regression test stuff can be a
pain sometimes, it is quite useful for tests like the BLAST output tests
or the GenBank tests. It is much easier to dump the output of a parse to
a file and make sure it remains the same than to individually have
'assert genbank_record.locus == "WHATEVER"' for a hundred different
attributes on a record.

So, I guess I don't think there is a need to be heavy-handed about how
you "have" to do tests. Personally, I'd rather leave it up to the
individual person writing the tests.

The-important-thing-is-having-some-tests-ly yr's,

Brad
--
PGP public key available from http://pgp.mit.edu/

From adalke at mindspring.com Wed Nov 7 00:04:59 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Python version
Message-ID: <0dbe01c16749$c0a38520$0301a8c0@josiah.dalkescientific.com>

Brad:
>I think you may have misunderstood Jeff. Python 2.0 is the minimum
>version needed for biopython. I use biopython with 2.0, 2.1 and
>2.2pre-releases (depending on how lazy I am at updating the python
>the machine), and everything works fine.

I must add that iterators in Python 2.2 make me want to rethink how
readers are done in Martel and Biopython. Backwards compatibility? Who
needs *that*?! :)

So probably a year before 2.2 is mainstream enough to consider a
switchover.
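As a rough illustration of why 2.2-style iteration is tempting for readers, here is a minimal sketch of a record reader written as a generator, in present-day Python. The name `iter_records` and the "//" record terminator are assumptions made for this example; this is not Martel's or Biopython's reader API.

```python
import io


def iter_records(handle, terminator="//"):
    """Yield one record (a list of lines) at a time from a file-like object.

    Hypothetical helper: a record ends at a line equal to `terminator`,
    and blank lines are skipped.
    """
    record = []
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines between or inside records
        if line == terminator:
            if record:
                yield record
            record = []
        else:
            record.append(line)
    if record:
        yield record  # last record had no trailing terminator


# Made-up two-record input, just to exercise the generator.
sample = io.StringIO("LOCUS A\nseq1\n//\n\nLOCUS B\nseq2\n//\n")
records = list(iter_records(sample))
```

The caller can then write `for record in iter_records(handle):` and never sees the bookkeeping, which is the part that makes a separate reader class look less necessary.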
*sigh*

Andrew
dalke@dalkescientific.com

From chapmanb at arches.uga.edu Wed Nov 7 12:23:18 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] debug_level=2 problem in Martel.Generate
Message-ID: <20011107122318.B40277@ci350185-a.athen1.ga.home.com>

Hello all;

While updating GenBank for RefSeq, I got knee-deep into debugging with
Martel and noticed that when I set debug_level=2, I was getting an
unexpected amount of output. Instead of the normal 20ish characters of
text in the file being parsed, I was getting the entire file parsed up
to that point.

Digging into this, I realized that this seems to be due to a change
between versions 1.5 and 1.6 of Martel/Generate.py. Andrew's notes state
that:

    Fixed debug error where text[x-8:x+8] failed when x < 8, since x-8
    is negative, which pulls from the end. The fix was min(0, x-8),
    instead of just x-8.

Unfortunately, min(0, x-8) is 0 whenever x >= 8, so the slice always
starts at the beginning of the text and grows as x advances through the
file, which gives all the output I was seeing. (I also don't think it
fixes the original problem, since min(0, -2) is still -2 and gives the
negative index you were seeing.)

If I'm getting this right, the fix should be max(0, x-8), which seems to
give the correct output. I've attached a patch for this. If anyone
(Andrew :-) can verify that I'm thinking about this right, I'll be happy
to check it in.

Thanks!

Brad
-------------- next part --------------
--- Generate.py.orig Sat Oct 20 19:47:47 2001
+++ Generate.py Wed Nov 7 01:40:31 2001
@@ -492,7 +492,7 @@
             s = s[:17] + " ... " + s[-17:]
         self.msg = s
     def __call__(self, text, x, end):
-        print "Match %s (x=%d): %s" % (repr(text[min(0, x-8):x+8]), x,
+        print "Match %s (x=%d): %s" % (repr(text[max(0, x-8):x+8]), x,
                                        repr(self.msg))
         return x

From chapmanb at arches.uga.edu Wed Nov 7 12:32:07 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Parsing Protein GenBank Records
In-Reply-To:
References:
Message-ID: <20011107123207.C40277@ci350185-a.athen1.ga.home.com>

[Talking about the GenBank parser]

Jeong:
> The updated parser now works well for most REFSEQ
> proteins. I came across several REFSEQ protein records where the parser
> still fails on UNIX machine. The following is the error message:
>
> I just found out that this problem occurs on some REFSEQ nucleotide records
> as well.

Thanks for the heads up. I've done a lot of work on the GenBank parser
and run it across a lot of human chromosome 1 from RefSeq, and it now
seems to be acting right for me. There were a bunch of fixes to files in
Bio/GenBank (genbank_format.py, __init__.py, LocationParser.py and
Record.py) and to Bio/SeqFeature.py to handle RefSeq decently.

I've committed all of the changes to CVS, so if you have the most recent
CVS everything should work smoothly with RefSeq (I hope :-). I hope this
will also fix the problems that Peter was reporting.

If you still have problems, please don't hesitate to let me know and
I'll work more on it. Let me know what files you're having problems
with, and I can take a look at those specifically.

Enjoy!

Brad

From gec at threeplusone.com Wed Nov 7 12:41:47 2001
From: gec at threeplusone.com (Gavin)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Python version
In-Reply-To: <20011106195809.C26903@ci350185-a.athen1.ga.home.com>
References: <0111061236540B.04148@sienna.berkeley.edu> <01110517352006.04148@sienna.berkeley.edu> <0111061236540B.04148@sienna.berkeley.edu>
Message-ID:

>Brad:
>I think you may have misunderstood Jeff.
Python 2.0 is the minimum >version needed for biopython. I use biopython with 2.0, 2.1 and >2.2pre-releases (depending on how lazy I am at updating the python >the machine), and everything works fine. Not at all. If python 2.0 is the minimum version, then I cannot use python 2.1 features when writing biopython code. The only way of making sure of that (that I can think of) is to program against python 2.0. As a case in point; It was a long while before I realized that PyUnit had only been added to the core python libraries with version 2.1. I only found that out by looking at the PyUnit web site. Before that revelation I was very puzzled as to why biopython contains PyUnit code. (This still seems odd. Why isn't PyUnit installed as a separate package? (If you need it.) ) Anyways, the impression that I am getting is that it is too early in the development of biopython to be worrying about the details of which python versions are usable. Yes No? Gavin From jchang at SMI.Stanford.EDU Wed Nov 7 13:53:31 2001 From: jchang at SMI.Stanford.EDU (Jeffrey Chang) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Python version In-Reply-To: References: <0111061236540B.04148@sienna.berkeley.edu> <01110517352006.04148@sienna.berkeley.edu> <0111061236540B.04148@sienna.berkeley.edu> Message-ID: At 9:41 AM -0800 11/7/01, Gavin wrote: >As a case in point; It was a long while before I realized that >PyUnit had only been added to the core python libraries with version >2.1. I only found that out by looking at the PyUnit web site. >Before that revelation I was very puzzled as to why biopython >contains PyUnit code. > >(This still seems odd. Why isn't PyUnit installed as a separate >package? (If you need it.) ) Since Python doesn't have a CPAN-like package manager yet, requiring extra packages creates a barrier to entry for new users. For small packages (as far as the license allows) we've just been bundling them in biopython to make installation easier. 
Otherwise, we'd have more dependencies in pyunit, spark, (older) br_regrtest, more?, etc... When we move the requirement onto 2.1, we can remove the pyunit stuff. >Anyways, the impression that I am getting is that it is too early in >the development of biopython to be worrying about the details of >which python versions are usable. Yes No? No. While the package is under heavy development, there are a lot of people using it for production work. Requiring the latest version would place the burden on them to upgrade, possibly before they're ready. It might be possible to fork biopython into stable and development version, where the development version could require the latest version of python. However, that would take a lot of resources -- probably more than we have. Jeff From katel at worldpath.net Wed Nov 7 17:00:25 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Failed tests. References: <01110611111207.04148@sienna.berkeley.edu> Message-ID: <001c01c167d7$a4d470a0$010a0a0a@cadence.com> ----- Original Message ----- From: "Jeffrey Chang" To: ; Sent: Tuesday, November 06, 2001 11:45 AM Subject: Re: [Biopython-dev] Failed tests. > The import errors seem to apply to new modules that Terjei put in. > Do you have those in your directory? > > It looks like there are missing files in intelligenetics and > metatool. Cayte, could you check on those? > I just committed meta.out. It slipped through the cracks. The test passes on my local system. No I'll check IntelliGenetics. Cayte From katel at worldpath.net Wed Nov 7 17:34:31 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Python version References: <0111061236540B.04148@sienna.berkeley.edu> <01110517352006.04148@sienna.berkeley.edu> <0111061236540B.04148@sienna.berkeley.edu> Message-ID: <002d01c167dc$5f41d320$010a0a0a@cadence.com> > No. 
While the package is under heavy development, there are a lot of > people using it for production work. Requiring the latest version > would place the burden on them to upgrade, possibly before they're > ready. > > It would be nice to have feedback from these users about what is useful, what new features are needed and what could be easier to use. Cayte From adalke at mindspring.com Wed Nov 7 17:24:31 2001 From: adalke at mindspring.com (Andrew Dalke) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] debug_level=2 problem in Martel.Generate Message-ID: <0e9501c167da$fb8c6c60$0301a8c0@josiah.dalkescientific.com> Brad: >If I'm getting this right, the fix should be max(0, x-8), which seems to >give the correct output. I've attached a patch for this. If anyone >(Andrew :-), can verify that I'm thinking about this right, I'll be >happy to check it in. Thanks! Oops! Yep. That's what I get for not testing. *chagrin* :) Patch away. Andrew From katel at worldpath.net Wed Nov 7 21:31:46 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Python version References: <0dbe01c16749$c0a38520$0301a8c0@josiah.dalkescientific.com> Message-ID: <007201c167fd$83c69160$010a0a0a@cadence.com> ----- Original Message ----- From: "Andrew Dalke" To: Sent: Tuesday, November 06, 2001 9:04 PM Subject: Re: [Biopython-dev] Python version > Brad: > >I think you may have misunderstood Jeff. Python 2.0 is the minimum > >version needed for biopython. I use biopython with 2.0, 2.1 and > >2.2pre-releases (depending on how lazy I am at updating the python > >the machine), and everything works fine. > > I must add that iterators in Python 2.2 make me want to > rethink how readers are done in Martel and Biopython. > Today, when I tested my NBRF parser, I fed my RecordFile handle into the Martel RecordReader. I found one bug that took all day to fix, but it worked by the end of the day. 
I feel that something like RecordFile is needed, because, for example, it would be a major hassle to remove all the blank lines from my NBRF file. I should add a lot of test cases to RecordFile. But if python is going to handle iteration for us, maybe I should not put my time into it? Cayte From biopython-bugs at bioperl.org Thu Nov 8 14:53:43 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Notification: incoming/45 Message-ID: <200111081953.fA8JrhB06066@pw600a.bioperl.org> JitterBug notification jchang changed notes Message summary for PR#45 From: gec@compbio.berkeley.edu Subject: PDB sequence numbers can be negative Date: Tue, 23 Oct 2001 18:54:38 -0400 0 replies 0 followups Notes: Gavin reports this bug as fixed with his new SCOP module. - jchang ====> ORIGINAL MESSAGE FOLLOWS <==== >From gec@compbio.berkeley.edu Tue Oct 23 18:54:38 2001 Received: from localhost (localhost [127.0.0.1]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMscB13266 for ; Tue, 23 Oct 2001 18:54:38 -0400 Date: Tue, 23 Oct 2001 18:54:38 -0400 Message-Id: <200110232254.f9NMscB13266@pw600a.bioperl.org> From: gec@compbio.berkeley.edu To: biopython-bugs@bioperl.org Subject: PDB sequence numbers can be negative Full_Name: Gavin Crooks Module: SCOP/Location.py Version: OS: Submission from: sienna.berkeley.edu (128.32.236.51) PDB residue sequence numbers can, on occasion, be negative. e.g. 1B9N. SCOP domains sometimes start on negative sequence numbers. 
This breaks the location parser in Bio.SCOP.Location.py From biopython-bugs at bioperl.org Thu Nov 8 14:54:15 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Notification: incoming/49 Message-ID: <200111081954.fA8JsFB06083@pw600a.bioperl.org> JitterBug notification jchang changed notes Message summary for PR#49 From: Jeffrey Chang Subject: Re: [Biopython-dev] Notification: incoming/46 Date: Wed, 24 Oct 2001 16:58:30 -0700 0 replies 0 followups Notes: dup of #46 ====> ORIGINAL MESSAGE FOLLOWS <==== >From jchang@SMI.Stanford.EDU Wed Oct 24 19:57:15 2001 Received: from crg-gw.Stanford.EDU (root@crg-gw.Stanford.EDU [171.65.32.201]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9ONvAB24866 for ; Wed, 24 Oct 2001 19:57:15 -0400 Received: from [171.65.33.250] (air11-smi.Stanford.EDU [171.65.33.250]) by crg-gw.Stanford.EDU (8.11.5/8.11.5) with ESMTP id f9ONvEC09544 for ; Wed, 24 Oct 2001 16:57:14 -0700 (PDT) Mime-Version: 1.0 X-Sender: jchang@smi.stanford.edu (Unverified) Message-Id: In-Reply-To: <200110232256.f9NMuiB13336@pw600a.bioperl.org> References: <200110232256.f9NMuiB13336@pw600a.bioperl.org> Date: Wed, 24 Oct 2001 16:58:30 -0700 To: biopython-bugs@bioperl.org From: Jeffrey Chang Subject: Re: [Biopython-dev] Notification: incoming/46 Content-Type: text/plain; charset="us-ascii" ; format="flowed" Hi Gavin, Could you send me a sample of this? It'll be helpful to have a test case to test fixes. 
Thanks, Jeff >JitterBug notification > >new message incoming/46 > >Message summary for PR#46 > From: gec@compbio.berkeley.edu > Subject: PDB sequence numbers can be negative > Date: Tue, 23 Oct 2001 18:56:44 -0400 > 0 replies 0 followups > >====> ORIGINAL MESSAGE FOLLOWS <==== > >>From gec@compbio.berkeley.edu Tue Oct 23 18:56:44 2001 >Received: from localhost (localhost [127.0.0.1]) > by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMuiB13330 > for ; Tue, 23 Oct 2001 >18:56:44 -0400 >Date: Tue, 23 Oct 2001 18:56:44 -0400 >Message-Id: <200110232256.f9NMuiB13330@pw600a.bioperl.org> >From: gec@compbio.berkeley.edu >To: biopython-bugs@bioperl.org >Subject: PDB sequence numbers can be negative > >Full_Name: Gavin Crooks >Module: SCOP/Location.py >Version: >OS: >Submission from: sienna.berkeley.edu (128.32.236.51) > > > >PDB residue sequence numbers can, on occasion, be >negative. e.g. 1B9N. SCOP domains sometimes start >on negative sequence numbers. This breaks the >location parser in Bio.SCOP.Location.py > > >_______________________________________________ >Biopython-dev mailing list >Biopython-dev@biopython.org >http://biopython.org/mailman/listinfo/biopython-dev From biopython-bugs at bioperl.org Thu Nov 8 14:54:15 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Notification: incoming/49 Message-ID: <200111081954.fA8JsFB06087@pw600a.bioperl.org> JitterBug notification jchang moved PR#49 from incoming to fixed-bugs Message summary for PR#49 From: Jeffrey Chang Subject: Re: [Biopython-dev] Notification: incoming/46 Date: Wed, 24 Oct 2001 16:58:30 -0700 0 replies 0 followups Notes: dup of #46 ====> ORIGINAL MESSAGE FOLLOWS <==== >From jchang@SMI.Stanford.EDU Wed Oct 24 19:57:15 2001 Received: from crg-gw.Stanford.EDU (root@crg-gw.Stanford.EDU [171.65.32.201]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9ONvAB24866 for ; Wed, 24 Oct 2001 19:57:15 -0400 Received: from 
[171.65.33.250] (air11-smi.Stanford.EDU [171.65.33.250]) by crg-gw.Stanford.EDU (8.11.5/8.11.5) with ESMTP id f9ONvEC09544 for ; Wed, 24 Oct 2001 16:57:14 -0700 (PDT) Mime-Version: 1.0 X-Sender: jchang@smi.stanford.edu (Unverified) Message-Id: In-Reply-To: <200110232256.f9NMuiB13336@pw600a.bioperl.org> References: <200110232256.f9NMuiB13336@pw600a.bioperl.org> Date: Wed, 24 Oct 2001 16:58:30 -0700 To: biopython-bugs@bioperl.org From: Jeffrey Chang Subject: Re: [Biopython-dev] Notification: incoming/46 Content-Type: text/plain; charset="us-ascii" ; format="flowed" Hi Gavin, Could you send me a sample of this? It'll be helpful to have a test case to test fixes. Thanks, Jeff >JitterBug notification > >new message incoming/46 > >Message summary for PR#46 > From: gec@compbio.berkeley.edu > Subject: PDB sequence numbers can be negative > Date: Tue, 23 Oct 2001 18:56:44 -0400 > 0 replies 0 followups > >====> ORIGINAL MESSAGE FOLLOWS <==== > >>From gec@compbio.berkeley.edu Tue Oct 23 18:56:44 2001 >Received: from localhost (localhost [127.0.0.1]) > by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9NMuiB13330 > for ; Tue, 23 Oct 2001 >18:56:44 -0400 >Date: Tue, 23 Oct 2001 18:56:44 -0400 >Message-Id: <200110232256.f9NMuiB13330@pw600a.bioperl.org> >From: gec@compbio.berkeley.edu >To: biopython-bugs@bioperl.org >Subject: PDB sequence numbers can be negative > >Full_Name: Gavin Crooks >Module: SCOP/Location.py >Version: >OS: >Submission from: sienna.berkeley.edu (128.32.236.51) > > > >PDB residue sequence numbers can, on occasion, be >negative. e.g. 1B9N. SCOP domains sometimes start >on negative sequence numbers. 
This breaks the >location parser in Bio.SCOP.Location.py > > >_______________________________________________ >Biopython-dev mailing list >Biopython-dev@biopython.org >http://biopython.org/mailman/listinfo/biopython-dev From biopython-bugs at bioperl.org Thu Nov 8 14:54:48 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Notification: incoming/50 Message-ID: <200111081954.fA8JslB06100@pw600a.bioperl.org> JitterBug notification jchang changed notes Message summary for PR#50 From: "Gavin E. Crooks" Subject: Re: [Biopython-dev] Notification: incoming/49 Date: Wed, 24 Oct 2001 17:40:52 -0700 0 replies 0 followups Notes: Gavin reports this as fixed. - jchang ====> ORIGINAL MESSAGE FOLLOWS <==== >From gec@sienna.berkeley.edu Wed Oct 24 20:49:43 2001 Received: from sienna.berkeley.edu (IDENT:root@sienna.Berkeley.EDU [128.32.236.51]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9P0ngB25247 for ; Wed, 24 Oct 2001 20:49:42 -0400 Received: from localhost (localhost [[UNIX: localhost]]) by sienna.berkeley.edu (8.9.3/8.9.3) id RAA03432 for biopython-bugs@bioperl.org; Wed, 24 Oct 2001 17:49:42 -0700 From: "Gavin E. Crooks" Reply-To: gec@compbio.berkeley.edu Organization: Very Little To: biopython-bugs@bioperl.org Subject: Re: [Biopython-dev] Notification: incoming/49 Date: Wed, 24 Oct 2001 17:40:52 -0700 X-Mailer: KMail [version 1.0.29] Content-Type: text/plain References: <200110242357.f9ONvGB24883@pw600a.bioperl.org> In-Reply-To: <200110242357.f9ONvGB24883@pw600a.bioperl.org> MIME-Version: 1.0 Message-Id: <01102417494205.14420@sienna.berkeley.edu> Content-Transfer-Encoding: 8bit How about "A:-1-126", direct from SCOP... 16118 px a.4.5.8 d1b9ma1 1b9m A:-1-126 I am in the middle of updating the SCOP module, and I have already refactored that code, and fixed this bug. And I've written a nice shiny unit test. But I was concerned that this same bug could crop up elsewhere. 
Its the kind of obscure boundary case that could trip up any code working with PDB sequence numbers. Gavin gec@compbio.berkeley.edu http://threeplusone.com > Hi Gavin, > > Could you send me a sample of this? It'll be helpful to have a test > case to test fixes. > > Thanks, > Jeff > > >Full_Name: Gavin Crooks > >Module: SCOP/Location.py > >Version: > >OS: > >Submission from: sienna.berkeley.edu (128.32.236.51) > > > >PDB residue sequence numbers can, on occasion, be > >negative. e.g. 1B9N. SCOP domains sometimes start > >on negative sequence numbers. This breaks the > >location parser in Bio.SCOP.Location.py > From biopython-bugs at bioperl.org Thu Nov 8 14:54:48 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Notification: incoming/50 Message-ID: <200111081954.fA8JsmB06104@pw600a.bioperl.org> JitterBug notification jchang moved PR#50 from incoming to fixed-bugs Message summary for PR#50 From: "Gavin E. Crooks" Subject: Re: [Biopython-dev] Notification: incoming/49 Date: Wed, 24 Oct 2001 17:40:52 -0700 0 replies 0 followups Notes: Gavin reports this as fixed. - jchang ====> ORIGINAL MESSAGE FOLLOWS <==== >From gec@sienna.berkeley.edu Wed Oct 24 20:49:43 2001 Received: from sienna.berkeley.edu (IDENT:root@sienna.Berkeley.EDU [128.32.236.51]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f9P0ngB25247 for ; Wed, 24 Oct 2001 20:49:42 -0400 Received: from localhost (localhost [[UNIX: localhost]]) by sienna.berkeley.edu (8.9.3/8.9.3) id RAA03432 for biopython-bugs@bioperl.org; Wed, 24 Oct 2001 17:49:42 -0700 From: "Gavin E. 
Crooks" Reply-To: gec@compbio.berkeley.edu Organization: Very Little To: biopython-bugs@bioperl.org Subject: Re: [Biopython-dev] Notification: incoming/49 Date: Wed, 24 Oct 2001 17:40:52 -0700 X-Mailer: KMail [version 1.0.29] Content-Type: text/plain References: <200110242357.f9ONvGB24883@pw600a.bioperl.org> In-Reply-To: <200110242357.f9ONvGB24883@pw600a.bioperl.org> MIME-Version: 1.0 Message-Id: <01102417494205.14420@sienna.berkeley.edu> Content-Transfer-Encoding: 8bit How about "A:-1-126", direct from SCOP... 16118 px a.4.5.8 d1b9ma1 1b9m A:-1-126 I am in the middle of updating the SCOP module, and I have already refactored that code, and fixed this bug. And I've written a nice shiny unit test. But I was concerned that this same bug could crop up elsewhere. Its the kind of obscure boundary case that could trip up any code working with PDB sequence numbers. Gavin gec@compbio.berkeley.edu http://threeplusone.com > Hi Gavin, > > Could you send me a sample of this? It'll be helpful to have a test > case to test fixes. > > Thanks, > Jeff > > >Full_Name: Gavin Crooks > >Module: SCOP/Location.py > >Version: > >OS: > >Submission from: sienna.berkeley.edu (128.32.236.51) > > > >PDB residue sequence numbers can, on occasion, be > >negative. e.g. 1B9N. SCOP domains sometimes start > >on negative sequence numbers. This breaks the > >location parser in Bio.SCOP.Location.py > From biopython-bugs at bioperl.org Thu Nov 8 18:51:08 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Notification: incoming/53 Message-ID: <200111082351.fA8Np8B08921@pw600a.bioperl.org> JitterBug notification new message incoming/53 Message summary for PR#53 From: "Gavin E. 
Crooks" Subject: SCOP Date: Thu, 8 Nov 2001 15:35:13 -0800 0 replies 0 followups ====> ORIGINAL MESSAGE FOLLOWS <==== >From gec@sienna.berkeley.edu Thu Nov 8 18:51:07 2001 Received: from sienna.berkeley.edu (IDENT:root@sienna.Berkeley.EDU [128.32.236.51]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id fA8Np6B08915 for ; Thu, 8 Nov 2001 18:51:07 -0500 Received: from localhost (localhost [[UNIX: localhost]]) by sienna.berkeley.edu (8.9.3/8.9.3) id PAA18718 for biopython-bugs@bioperl.org; Thu, 8 Nov 2001 15:51:11 -0800 From: "Gavin E. Crooks" Reply-To: gec@compbio.berkeley.edu Organization: Very Little To: biopython-bugs@bioperl.org Subject: SCOP Date: Thu, 8 Nov 2001 15:35:13 -0800 X-Mailer: KMail [version 1.0.29] Content-Type: text/plain MIME-Version: 1.0 Message-Id: <0111081551110I.04148@sienna.berkeley.edu> Content-Transfer-Encoding: 8bit The SCOP package has been updated and extended. Additions include parsers for the new CLA, HIE and DES files, a Residues class to represent SCOP domain definitions, a Scop class to hold the Scop hierarchy itself, a Raf module to handle ASTRAL RAF files, and a script, /Scripts/scop_pdb.py, that can extract a SCOP domain's ATOM and HETATOM records from the relevant PDB file. Enjoy, Gavin From biopython-bugs at bioperl.org Thu Nov 8 18:56:47 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Notification: incoming/53 Message-ID: <200111082356.fA8NulB09018@pw600a.bioperl.org> JitterBug notification gec changed notes Message summary for PR#53 From: "Gavin E. 
Crooks" Subject: SCOP Date: Thu, 8 Nov 2001 15:35:13 -0800 0 replies 0 followups Notes: Stupid user error: emailed to wrong address ====> ORIGINAL MESSAGE FOLLOWS <==== >From gec@sienna.berkeley.edu Thu Nov 8 18:51:07 2001 Received: from sienna.berkeley.edu (IDENT:root@sienna.Berkeley.EDU [128.32.236.51]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id fA8Np6B08915 for ; Thu, 8 Nov 2001 18:51:07 -0500 Received: from localhost (localhost [[UNIX: localhost]]) by sienna.berkeley.edu (8.9.3/8.9.3) id PAA18718 for biopython-bugs@bioperl.org; Thu, 8 Nov 2001 15:51:11 -0800 From: "Gavin E. Crooks" Reply-To: gec@compbio.berkeley.edu Organization: Very Little To: biopython-bugs@bioperl.org Subject: SCOP Date: Thu, 8 Nov 2001 15:35:13 -0800 X-Mailer: KMail [version 1.0.29] Content-Type: text/plain MIME-Version: 1.0 Message-Id: <0111081551110I.04148@sienna.berkeley.edu> Content-Transfer-Encoding: 8bit The SCOP package has been updated and extended. Additions include parsers for the new CLA, HIE and DES files, a Residues class to represent SCOP domain definitions, a Scop class to hold the Scop hierarchy itself, a Raf module to handle ASTRAL RAF files, and a script, /Scripts/scop_pdb.py, that can extract a SCOP domain's ATOM and HETATOM records from the relevant PDB file. Enjoy, Gavin From biopython-bugs at bioperl.org Thu Nov 8 18:56:47 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Notification: incoming/53 Message-ID: <200111082356.fA8NulB09022@pw600a.bioperl.org> JitterBug notification gec moved PR#53 from incoming to trash Message summary for PR#53 From: "Gavin E. 
Crooks" Subject: SCOP Date: Thu, 8 Nov 2001 15:35:13 -0800 0 replies 0 followups Notes: Stupid user error: emailed to wrong address ====> ORIGINAL MESSAGE FOLLOWS <==== >From gec@sienna.berkeley.edu Thu Nov 8 18:51:07 2001 Received: from sienna.berkeley.edu (IDENT:root@sienna.Berkeley.EDU [128.32.236.51]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id fA8Np6B08915 for ; Thu, 8 Nov 2001 18:51:07 -0500 Received: from localhost (localhost [[UNIX: localhost]]) by sienna.berkeley.edu (8.9.3/8.9.3) id PAA18718 for biopython-bugs@bioperl.org; Thu, 8 Nov 2001 15:51:11 -0800 From: "Gavin E. Crooks" Reply-To: gec@compbio.berkeley.edu Organization: Very Little To: biopython-bugs@bioperl.org Subject: SCOP Date: Thu, 8 Nov 2001 15:35:13 -0800 X-Mailer: KMail [version 1.0.29] Content-Type: text/plain MIME-Version: 1.0 Message-Id: <0111081551110I.04148@sienna.berkeley.edu> Content-Transfer-Encoding: 8bit The SCOP package has been updated and extended. Additions include parsers for the new CLA, HIE and DES files, a Residues class to represent SCOP domain definitions, a Scop class to hold the Scop hierarchy itself, a Raf module to handle ASTRAL RAF files, and a script, /Scripts/scop_pdb.py, that can extract a SCOP domain's ATOM and HETATOM records from the relevant PDB file. Enjoy, Gavin From gec at compbio.berkeley.edu Thu Nov 8 18:57:08 2001 From: gec at compbio.berkeley.edu (Gavin E. Crooks) Date: Sat Mar 5 14:43:07 2005 Subject: [Biopython-dev] Fwd: SCOP Message-ID: <0111081558350J.04148@sienna.berkeley.edu> The SCOP package has been updated and extended. Additions include parsers for the new CLA, HIE and DES files, a Residues class to represent SCOP domain definitions, a Scop class to hold the Scop hierarchy itself, a Raf module to handle ASTRAL RAF files, and a script, /Scripts/scop_pdb.py, that can extract a SCOP domain's ATOM and HETATOM records from the relevant PDB file. 
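The negative residue numbers from PR#45 show why a domain definition such as "A:-1-126" cannot simply be split on "-". A minimal sketch of one way to parse such a range follows; `parse_residue_range` is a hypothetical helper written for this example, not the code in Bio.SCOP, and it assumes a single fragment with plain integer residue numbers.

```python
import re

# Optional one-character chain ID followed by ":", then start-end, where
# either number may carry a leading minus sign.
# Hypothetical pattern for illustration only.
_RANGE = re.compile(r"^(?:(?P<chain>\w):)?(?P<start>-?\d+)-(?P<end>-?\d+)$")


def parse_residue_range(text):
    """Return (chain, start, end) for strings like 'A:-1-126' or '10-20'."""
    match = _RANGE.match(text)
    if match is None:
        raise ValueError("unrecognised residue range: %r" % (text,))
    return match.group("chain"), int(match.group("start")), int(match.group("end"))


# Splitting on "-" loses the sign of the first residue number.
naive = "-1-126".split("-")

parsed = parse_residue_range("A:-1-126")
```

Here `naive` comes out as three pieces with an empty first element, while `parsed` keeps -1 as the start of the range.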
Enjoy,

Gavin

From katel at worldpath.net Fri Nov 9 23:51:03 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] NBRF
Message-ID: <001501c169a3$4e5d2f00$010a0a0a@cadence.com>

I just added the NBRF parser to CVS. It's passed a sanity check but no
more. I'm firmly convinced that humans were never meant to stare at
screens of a's, c's, g's and t's. :) But I see no alternative for
checking the baseline. Any better ideas? The unit tests seem to mesh
better with a computation-intensive target than with a data-intensive
one.

RecordFile could benefit from unit tests, but I'd like to know if we
plan to switch to python 2.2 iteration handling. If RecordFile is a
temporary solution I'll focus elsewhere.

Cayte

From mkc at mathdogs.com Sat Nov 10 15:52:38 2001
From: mkc at mathdogs.com (Mike Coleman)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] doc typos
Message-ID: <20011110205238.9342C3407E@debian>

Here are some typos I spotted in the tutorial:

cutomize -> customize
ExtenedIUPACDNA -> ExtendedIUPACDNA ?
Mitochondriall -> Mitochondrial ?
subleties -> subtleties
definately -> definitely
humungous -> humongous ?
"What the heck in a handle?" -> "What the heck is a handle?"

From katel at worldpath.net Wed Nov 14 02:37:09 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] biotech fair
Message-ID: <001f01c16cdf$2c2036a0$010a0a0a@cadence.com>

Some of you may be interested in this event:

http://www.bioitworld.com/aboutus/index.shtml

Cayte

From katel at worldpath.net Wed Nov 14 22:29:46 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] First impressions of Pathway
Message-ID: <006d01c16d85$c73fa700$010a0a0a@cadence.com>

First, we should all thank Tarjei for his work. It looks like a great
start. The next step, I think, would be to create some examples and see
how it plays.
Quibbles:

I'd prefer more neutral nomenclature than parent-child because they bias the reader toward a tree structure.

df_search seems to assume a connected graph (at least it looks like it would conk out early with a disconnected graph). All assumptions should be documented.

The following line needs a description of what each tuple contains. Since Python is typeless it requires more documentation at the interfaces.

    catalysts -- list of tuples of catalysts involved in the same reaction step

In MultiNetwork.remove_node the sense of the filter is reversed from what I'd expect if you want to remove dangling edges. My understanding is that filter returns the items that make the condition true?

    self.__adjacency_list[node] = filter(lambda x,node=node: x[0] is node,
                                         self.__adjacency_list[node].list())

In the following sequence, node may be redefined before it is used. It looks to me that you intend to use the initial definition.

    for node in self.__adjacency_list.keys():
        self.__adjacency_list[node] = filter(lambda x,node=node: x[0] is node,
                                             self.__adjacency_list[node].list())
    # remove all referring pairs in label map
    for label in self.__label_map.keys():
        self.__label_map[label] = filter(lambda x,node=node: x[0] is node or x[1] is node,
                                         self.__label_map[label].list())

Cayte

From katel at worldpath.net Fri Nov 16 19:38:21 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] new modules
Message-ID: <004b01c16f00$29c2cd00$010a0a0a@cadence.com>

If anyone has started work on Msf please let me know. Otherwise, I'll write it.

I'm reading a paper with the jaw-crunching title "Creating Metabolic Pathway Models Using Data Mining and Expert Knowledge" in the hope it will contain insight into how to provide effective tools for pathway analysis.

Cayte

From katel at worldpath.net Fri Nov 16 22:49:04 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] saf instead of msf?
Message-ID: <007301c16f1a$cdffa2c0$010a0a0a@cadence.com>

One format site ominously described Msf format as "obsolete". It explained that the checksum field made Msf hard to update. Another site listed "recommended" formats in bold lettering; Msf was listed in faded lettering. So I may look into SAF instead.

Cayte

From tarjei_mikkelsen at hotmail.com Sun Nov 18 02:53:03 2001
From: tarjei_mikkelsen at hotmail.com (Tarjei Mikkelsen)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] First impressions of Pathway
Message-ID:

>Quibbles:
> I'd prefer more neutral nomenclature than parent-child because they bias
>the reader toward a tree structure.

'parent' and 'child' are only used for the MultiGraph class, which is a generic directed graph representation intended for internal use. It is the Network class that is exposed to the user. Network uses 'source' and 'sink' as the corresponding terms, although I'm not convinced that is the best naming either.

> df_search seems to assume a connected graph ( at least it looks like it
>would konk out early with a disconnected graph ).. All assumptions should
>be documented.

I've updated the documentation to better reflect what it does.

> The following line needs a description of what each tuple contains?
>Since Python is typeless it requires more documentation at the interfaces.
>
> catalysts -- list of tuples of catalysts involved in the same reaction
> step

This has been simplified to a list of catalysts. The type is arbitrary by design: it could be anything from a string descriptor to an Enzyme record object, depending on the needs of the user.

> In MultiNetwork.remove_node the sense of the filter is reversed from what
>I'd expect if you want to remove dangling edges. My understanding is that
>filter returns items that make the condition true?

Yup, good catch. The whole remove_edges method was faulty. It's been corrected and a test has been added.
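To make the filter point concrete, here is a minimal sketch. It assumes a hypothetical flat list of (parent, child) edge tuples rather than Bio.Pathway's actual internal structures, and it compares with equality where the original code used identity ("is"):

```python
# Hypothetical edge list of (parent, child) pairs; not Bio.Pathway's real data.
edges = [("a", "b"), ("b", "c"), ("c", "a")]
node = "b"

# filter() keeps the items for which the predicate is true, so this
# KEEPS the edges leaving `node` instead of removing them.
kept_wrong = list(filter(lambda e: e[0] == node, edges))

# Removing a node needs the negated test, applied to both endpoints.
kept_right = [e for e in edges if e[0] != node and e[1] != node]
```

With the negated predicate only the ("c", "a") edge survives, which is what removing "b" should leave behind.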
thanks,
Tarjei

From katel at worldpath.net Wed Nov 28 20:40:31 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Is Martel always appropriate?
Message-ID: <002601c17876$d5e65280$010a0a0a@cadence.com>

It looks to me that Martel is not a good fit for SAF. That is because the appropriate action depends on state information. If the line starts with "zebra", append the sequence to the zebra sequence; if it starts with "giraffe", append to the giraffe sequence; if it starts with "unicorn", append to the unicorn sequence. Martel is not oriented to state machines except as implied by the ordering of expressions.
A simpler approach would be to filter the comments out, then split each line and use the label component as a selector. Please share your opinions.

Cayte

From katel at worldpath.net Thu Nov 29 23:03:41 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Useful pathway support
Message-ID: <002701c17954$00971680$010a0a0a@cadence.com>

A useful project for Pathway would be a converter to E-Cell format, plus a port of their script interpreter. Since E-Cell is released under the GNU General Public License, it should not be a problem.

Cayte

From adalke at mindspring.com Fri Nov 30 01:04:12 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar 5 14:43:07 2005
Subject: [Biopython-dev] Is Martel always appropriate?
Message-ID: <0c3901c17964$d59b4da0$0301a8c0@josiah.dalkescientific.com>

Cayte:
>It looks to me that Martel is not a good fit for SAF. That is because the
>appropriate action depends on state information. If the line starts with
>"zebra" append the sequence to the zebra sequence, if it starts with
>"giraffe" append to the giraffe sequence , if it starts with "unicorn"
>append to the unicorn sequence.
>
>A simpler approach would be to filter the comments out, then split each line
>and use the label component as
>a selector. Please share your opinions.

I've had a hard time trying to figure out an answer to this. Perhaps the best is to start with the philosophy behind Martel.

Many formats need to be read in bioinformatics. Nearly everyone writes parsers from scratch. That wastes time and gets boring. What I want is a tool to help identify parts of the text and provide a standard framework for building a parser.

Strictly speaking, Martel is a tokenizer, not a parser. That means it is used to identify parts of the text, but all it does with that information is pass the subtext and some description off to the real parser.
The parser takes these two things and does whatever is appropriate, which is usually building up some sort of data structure.

The boundary between the two is vague and learned by experience. For example, at one extreme, here's a format definition for SAF and every other format.

    import Martel
    # any number of single characters, each reported as a "character" element
    format = Martel.Rep(Martel.Group("character", Martel.Re(r"[\000-\377]")))

In this case, the tokenizer isn't doing anything to help understand the format -- all the work is passed off on the parser, and all Martel does is provide a SAX-based parser framework.

Martel can help provide better tokenization, even when it doesn't know everything in the system. For example, it seems there are three types of lines in the SAF format

    from Martel import Group, Re, Rep, Str, ToEol

    # name_1 EFQEDQENVN PEKAAPAQQP RTRAGLAVLR AGNSRGAGGA PTLPETLNVA
    line1_format = Group("line", Group("name", Re(r"[\w]{1,14}")) + \
                                 Re("[ \t]+") + \
                                 ToEol("sequence"))

    # 10 20 30 40
    line2_format = Group("numbers", Re(r" (\d+ *)*\R"))

    comment_line_format = Group("comment", Str("#") + ToEol())

    # each line is one of the three kinds, so alternate rather than concatenate
    line_format = line1_format | line2_format | comment_line_format
    format = Rep(line_format)

All this does is help figure out which lines contain sequences and which should be ignored, and of those with sequences, it says which characters belong to the "name" and which belong to the "sequence". The parser (the SAX handler) is the one in charge of turning this information into something useful, and is the one which does the state transitions.
For example, here's one which might work

    from xml.sax import handler

    class SAFHandler(handler.ContentHandler):
        def startDocument(self):
            self.capture = 0
            self.text = None
            self.sequences = {}
            self.guide_name = None
            self.current_name = None
            self.block = {}

        def startElement(self, tag, attrs):
            if tag == "name" or tag == "sequence":
                self.capture = 1
                self.text = ""

        def characters(self, text):
            # may be called several times for one element
            if self.capture:
                self.text = self.text + text

        def _new_block(self):
            # append each chunk of the finished block to its full sequence
            for name, seq in self.block.items():
                self.sequences[name] = self.sequences.get(name, "") + seq
            self.block.clear()

        def endElement(self, tag):
            if tag == "name":
                self.current_name = self.text
                if self.text == self.guide_name:
                    # start a new block
                    self._new_block()
                elif self.guide_name is None:
                    # keep track of the guide name
                    self.guide_name = self.text
            elif tag == "sequence":
                if self.current_name not in self.block:
                    # no duplicates in a block
                    self.block[self.current_name] = self.text.replace(" ", "")
            self.capture = 0

        def endDocument(self):
            self._new_block()

After parsing, the handler's 'sequences' should contain all the sequence entries, and its 'guide_name' tells which of those should be listed first. (The format definition at http://www.embl-heidelberg.de/predictprotein/Dexa/optin_safDes.html implies the order of the other items is arbitrary.)

So in this case, Martel isn't powerful enough to detect when new blocks arise, but it is able to provide some context to simplify matters for the parser.
On the other hand, here's how the parser would be written in a more standard style

    import re

    _line_pat = re.compile(r"([^ ]{1,14})[ \t]+(.*)")

    def _new_block(block, sequences):
        for name, seq in block.items():
            sequences[name] = sequences.get(name, "") + seq
        block.clear()

    def parse(infile):
        block = {}
        sequences = {}
        guide_name = None
        for line in infile.readlines():  # assume enough memory
            if line.startswith("#"):
                continue
            if line.startswith(" "):
                # an approximation for the 'numbers' lines
                continue
            # Find the first space or tab
            m = _line_pat.match(line)
            if m is None:
                raise ValueError("bad format")
            name = m.group(1)
            seq = m.group(2)
            if name == guide_name:
                _new_block(block, sequences)
            elif guide_name is None:
                guide_name = name
            if name not in block:
                block[name] = seq.replace(" ", "")
        _new_block(block, sequences)
        return sequences

This is about 30 lines, compared to about 55. It's shorter and easier to understand. As you say, it's simpler.

A reason it's simpler is because the format is very simple. There's almost no structure to it. Another reason is because the Martel handler needs to store possibly multiple 'characters' callbacks.

It's also simpler because the parser only needs to do one thing -- build up a data structure. This SAF format doesn't need things like an XML markup generator or an indexer or support for multiple variations of a format. Martel loses partially because it is too flexible.

Addressing your statement:
>Martel is not oriented to state machines
>except as implied by the ordering of expressions.

You are correct. But Martel is only intended to be half the solution. The other half is the callback handler. Together they handle SAF just fine, except with about twice the complexity if all you want to do is turn SAF into a data structure.

It's possible to conceive of a different way to receive callbacks which is specialized for this case.
Consider something like this:

    from Martel import SimpleHandler

    # "SimpleHandler" is a hypothetical base class which stores all
    # the events inside a set of named elements and returns them as
    # a simple dictionary data structure where the key is the
    # tag name and the values are all of the matching text. Hierarchical
    # data structures are ignored. This is similar to Sean McGrath's
    # RAX approach.

    class MyHandler(SimpleHandler.SimpleHandler):
        def startDocument(self):
            self.sequences = {}
            self.block = {}
            self.guide_name = None

        def parseLine(self, terms):
            name = terms["name"]
            if self.guide_name == name:
                self._new_block()
            elif self.guide_name is None:
                self.guide_name = name
            if name not in self.block:
                self.block[name] = terms["sequence"].replace(" ", "")

        def endDocument(self):
            self._new_block()

        def _new_block(self):
            for name, seq in self.block.items():
                self.sequences[name] = self.sequences.get(name, "") + seq
            self.block.clear()

    # 'parser' and 'file' stand for a Martel parser and an open SAF file
    handler = MyHandler(want = ("line",))
    parser.setHandler(handler)
    parser.parse(file)

and this is of comparable length to the hand-rolled code. If this task of parsing simple data structures is common enough then perhaps the best solution is to write a specialized handler base class which abstracts out the generic dirty work.

Andrew
dalke@dalkescientific.com

P.S. And no, I didn't test any of this code :)
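Since none of the code in this thread was tested, here is a small self-contained sketch of the hand-rolled approach written for current Python. It is an illustration under the thread's own assumptions, not the Biopython implementation: the first label seen is treated as the guide sequence, its reappearance starts a new block, and lines starting with a space are taken to be numbering lines. The name parse_saf and the sample data are made up for this example.

```python
import io
import re

# Label (1-14 non-space characters), whitespace, then the sequence chunk.
_LINE_PAT = re.compile(r"(\S{1,14})[ \t]+(.*)")


def parse_saf(infile):
    """Return (guide_name, sequences) for a SAF-style alignment."""
    sequences = {}
    block = {}
    guide_name = None

    def flush():
        # Append each chunk of the finished block to its full sequence.
        for name, seq in block.items():
            sequences[name] = sequences.get(name, "") + seq
        block.clear()

    for line in infile:
        if not line.strip() or line.startswith("#") or line.startswith(" "):
            continue  # blank, comment, or numbering line
        m = _LINE_PAT.match(line)
        if m is None:
            raise ValueError("bad SAF line: %r" % line)
        name, seq = m.group(1), m.group(2)
        if guide_name is None:
            guide_name = name
        elif name == guide_name:
            flush()  # the guide label again: a new block begins
        # Only the first occurrence of a name within a block counts.
        block.setdefault(name, seq.replace(" ", ""))
    flush()
    return guide_name, sequences


sample = io.StringIO(
    "# a comment\n"
    "zebra    ACGT ACGT\n"
    "giraffe  TTTT CCCC\n"
    "         10\n"
    "\n"
    "zebra    GGGG\n"
    "giraffe  AAAA\n"
)
guide, seqs = parse_saf(sample)
```

Returning the guide name alongside the dictionary keeps the one piece of ordering information the format guarantees, which the SAX handler above stored as 'guide_name'.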