From biopython-bug-admin at bioperl.org Sat Jul 8 01:05:14 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/7
Message-ID: <200007080505.BAA12351@pw600a.bioperl.org>

JitterBug notification

new message incoming/7

Message summary for PR#7
    From: katel@worldpath.net
    Subject: prosite alphabet too inclusive
    Date: Sat, 8 Jul 2000 01:05:14 -0400
    0 replies 0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

From katel at worldpath.net Sat Jul 8 01:05:14 2000
From: katel at worldpath.net (katel@worldpath.net)
Date: Sat Mar 5 14:42:50 2005
Subject: prosite alphabet too inclusive
Message-ID: <200007080505.BAA12335@pw600a.bioperl.org>

Full_Name: Katharine Lindner
Module: pattern.py
Version:
OS: win98
Submission from: pm41-112-174.worldpath.net (209.187.112.174)

The alphabet used allows A-Z. According to:

http://swift.embl-heidelberg.de/7tm/query/userman.html

J, O and U have no assigned amino acids?

From dalke at acm.org Sat Jul 8 15:21:54 2000
From: dalke at acm.org (Andrew Dalke)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/7
Message-ID: <000c01bfe911$c848e700$0991cdcf@josiah.daylight.com>

>katel@worldpath.net
> Subject: prosite alphabet too inclusive
[...]
>The alphabet used allows A-Z. According to:
>
>http://swift.embl-heidelberg.de/7tm/query/userman.html
>
> J, O and U have no assigned amino acids?

That's my doing. I was being lenient because I figured if you use an
unusual character then there's probably a reason. I just checked against
ExPASy's scanprosite, and they explicitly don't allow J, O or U (or 1,
or ~ or chr(15) :). So I'll fix my code.

Hmm, that means I need to figure out how to log in to CVS for write
access.

And just how do I register with Jitterbug? I don't see a place to
create a new login.
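A minimal sketch of the stricter alphabet check described in this message,
assuming the allowed alphabet is simply the uppercase letters A-Z minus J, O
and U, as the thread reports for scanprosite. The names PROSITE_ALPHABET and
invalid_residues are hypothetical and are not taken from pattern.py.

```python
import string

# Assumption: allowed residue codes are the uppercase letters minus J, O and U,
# which is what the thread reports ExPASy's scanprosite accepted in 2000.
PROSITE_ALPHABET = frozenset(string.ascii_uppercase) - frozenset("JOU")


def invalid_residues(residues):
    """Return the sorted list of characters in `residues` outside the alphabet."""
    return sorted(set(residues) - PROSITE_ALPHABET)
```

A caller could reject a pattern whenever invalid_residues() returns a
non-empty list, and report the offending characters back to the user.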
                    Andrew
                    dalke@acm.org

From biopython-bug-admin at bioperl.org Sat Jul 8 20:37:38 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/8
Message-ID: <200007090037.UAA14296@pw600a.bioperl.org>

JitterBug notification

new message incoming/8

Message summary for PR#8
    From: katel@worldpath.net
    Subject: nr_false_neg initialized as a tuple
    Date: Sat, 8 Jul 2000 20:37:37 -0400
    0 replies 0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

From katel at worldpath.net Sat Jul 8 20:37:37 2000
From: katel at worldpath.net (katel@worldpath.net)
Date: Sat Mar 5 14:42:50 2005
Subject: nr_false_neg initialized as a tuple
Message-ID: <200007090037.UAA14280@pw600a.bioperl.org>

Full_Name: Katharine Lindner
Module: prosite.py
Version:
OS: win98
Submission from: (NULL) (209.187.114.189)

In prosite, self.nr_false_neg is initialized as the tuple ( None, None )
even though it is an integer in the prosite file.

From biopython-bug-admin at bioperl.org Sat Jul 8 20:41:57 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/9
Message-ID: <200007090041.UAA14421@pw600a.bioperl.org>

JitterBug notification

new message incoming/9

Message summary for PR#9
    From: katel@worldpath.net
    Subject: empty __init__.py
    Date: Sat, 8 Jul 2000 20:41:56 -0400
    0 replies 0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

From katel at worldpath.net Sat Jul 8 20:41:56 2000
From: katel at worldpath.net (katel@worldpath.net)
Date: Sat Mar 5 14:42:50 2005
Subject: empty __init__.py
Message-ID: <200007090041.UAA14405@pw600a.bioperl.org>

Full_Name: Katharine Lindner
Module: Tools
Version:
OS: win98
Submission from: (NULL) (209.187.114.189)

__init__.py in Tools directory is empty.
From katel at worldpath.net Sat Jul 8 21:55:27 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] new site
Message-ID: <002101bfe948$c2655080$bd72bbd1@g0fjl>

The site, http://rebase.neb.com/rebase/rebase.html, may be worth adding
to our list of parsers.

Cayte

From biopython-bug-admin at bioperl.org Sat Jul 8 23:40:09 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/9
Message-ID: <200007090340.XAA14819@pw600a.bioperl.org>

JitterBug notification

dalke changed notes

Message summary for PR#9
    From: katel@worldpath.net
    Subject: empty __init__.py
    Date: Sat, 8 Jul 2000 20:41:56 -0400
    0 replies 0 followups
    Notes: An empty __init__.py is okay - it exists to tell Python that
    the directory is a submodule. It is customary, however, to put a
    comment like "This is a Python module" or some such to prevent
    confusion. Also, some programs (like at least one version of pkzip)
    will not include an empty file in an archive.

    I'll see about fixing this once I figure out how to get write access.
From biopython-bug-admin at bioperl.org Sun Jul 9 10:31:24 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/8
Message-ID: <200007091431.KAA15803@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#8
    From: katel@worldpath.net
    Subject: nr_false_neg initialized as a tuple
    Date: Sat, 8 Jul 2000 20:37:37 -0400
    0 replies 0 followups
    Notes: Hmmm... nr_false_neg should be initialized to 'None', so I'll
    fix that.

    However, not setting a value specified in a record is a more serious
    problem. Which prosite file is this happening on?
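The initialization question raised in PR#8 might be sketched as follows. This
is an illustration only: Record and describe are hypothetical stand-ins, not
the Bio.Prosite classes, and the 2-tuple default for nr_total is an assumption
made here so that both kinds of field appear side by side.

```python
class Record:
    """Hypothetical stand-in for a Prosite record (not the Bio.Prosite class)."""

    def __init__(self):
        # Scalar count: stays None until the parser sees a value in the file.
        self.nr_false_neg = None
        # Assumption: a paired count (hits, sequences) keeps a 2-tuple default.
        self.nr_total = (None, None)


def describe(value):
    """Format a field that may be None, a scalar, or a tuple/list of values."""
    if value is None:
        return "unset"
    if isinstance(value, (tuple, list)):
        return ", ".join(describe(item) for item in value)
    return str(value)
```

Client code that checks the type before iterating, as describe() does, keeps
working whether a field holds a single integer or a pair.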
From biopython-bug-admin at bioperl.org Sun Jul 9 10:31:51 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/7
Message-ID: <200007091431.KAA15901@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#7 from incoming to andrew

Message summary for PR#7
    From: katel@worldpath.net
    Subject: prosite alphabet too inclusive
    Date: Sat, 8 Jul 2000 01:05:14 -0400
    0 replies 0 followups

From biopython-bug-admin at bioperl.org Sun Jul 9 10:31:52 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/8
Message-ID: <200007091431.KAA15928@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#8 from incoming to cayte

Message summary for PR#8
    From: katel@worldpath.net
    Subject: nr_false_neg initialized as a tuple
    Date: Sat, 8 Jul 2000 20:37:37 -0400
    0 replies 0 followups

From biopython-bug-admin at bioperl.org Sun Jul 9 10:31:53 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/9
Message-ID: <200007091431.KAA15947@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#9 from incoming to andrew

Message summary for PR#9
    From: katel@worldpath.net
    Subject: empty __init__.py
    Date: Sat, 8 Jul 2000 20:41:56 -0400
    0 replies 0 followups
From jchang at SMI.Stanford.EDU Sun Jul 9 11:40:18 2000
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] new jitterbug directories
Message-ID:

Hello developers,

I've created directories for Andrew, Brad, Jeffrey, and Cayte. All bugs
should still come into "incoming", but after that, the bugs will move into
specific directories, so people will know what they're working on.

Do we need a formal process for moving bugs around, or should people just
take their own bugs?

Jeff

From katel at worldpath.net Sun Jul 9 18:19:39 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/8
References: <200007091431.KAA15803@pw600a.bioperl.org>
Message-ID: <002501bfe9f3$c7932fe0$2ddc85d0@g0fjl>

>
> Message summary for PR#8
> From: katel@worldpath.net
> Subject: nr_false_neg initialized as a tuple
> Date: Sat, 8 Jul 2000 20:37:37 -0400
> 0 replies 0 followups
> Notes: Hmmm... nr_false_neg should be initialized to 'None', so I'll fix that.
>
> However, not setting a value specified in a record is a more serious problem.
> Which prosite file is this happening on?
>
>
I saw it on all the files in my Test\Prosite directory.

Cayte

From katel at worldpath.net Sun Jul 9 18:22:59 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] new jitterbug directories
References:
Message-ID: <003501bfe9f4$3e163cc0$2ddc85d0@g0fjl>

----- Original Message -----
From: Jeffrey Chang
To:
Sent: Sunday, July 09, 2000 11:40 AM
Subject: [Biopython-dev] new jitterbug directories

> Hello developers,
>
> I've created directories for Andrew, Brad, Jeffrey, and Cayte. All bugs
> should still come into "incoming", but after that, the bugs will move into
> specific directories, so people will know what they're working on.
>
> Do we need a formal process for moving bugs around, or should people just
> take their own bugs?
>
We can't assume that the developer will necessarily fix the bug in an
open-source environment. The developer may be too busy, for a while, and
new people may pitch in.

Cayte

From biopython-bug-admin at bioperl.org Sun Jul 9 17:17:33 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/10
Message-ID: <200007092117.RAA17038@pw600a.bioperl.org>

JitterBug notification

new message incoming/10

Message summary for PR#10
    From: katel@worldpath.net
    Subject: continuation lines in references
    Date: Sun, 9 Jul 2000 17:17:32 -0400
    0 replies 0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

From katel at worldpath.net Sun Jul 9 17:17:32 2000
From: katel at worldpath.net (katel@worldpath.net)
Date: Sat Mar 5 14:42:50 2005
Subject: continuation lines in references
Message-ID: <200007092117.RAA17022@pw600a.bioperl.org>

Full_Name: Katharine Lindner
Module: Prodoc.py
Version:
OS:
Submission from: pm41-220-45.worldpath.net (208.133.220.45)

Line 348 assumes a colon for a continuation line in a reference. The data I
found uses indentation.
From biopython-bug-admin at bioperl.org Sun Jul 9 17:19:42 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/11
Message-ID: <200007092119.RAA17153@pw600a.bioperl.org>

JitterBug notification

new message incoming/11

Message summary for PR#11
    From: katel@worldpath.net
    Subject: tranlate by name
    Date: Sun, 9 Jul 2000 17:19:41 -0400
    0 replies 0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

From katel at worldpath.net Sun Jul 9 17:19:41 2000
From: katel at worldpath.net (katel@worldpath.net)
Date: Sat Mar 5 14:42:50 2005
Subject: tranlate by name
Message-ID: <200007092119.RAA17137@pw600a.bioperl.org>

Full_Name: Katharine Lindner
Module: Translate.py
Version:
OS: win98
Submission from: pm41-220-45.worldpath.net (208.133.220.45)

The following code does not work:

    trans = Translate.unambiguous_dna_by_name[ 'Vertebrate Mitochondrial' ]

From jchang at SMI.Stanford.EDU Mon Jul 10 00:09:28 2000
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/8
In-Reply-To: <002501bfe9f3$c7932fe0$2ddc85d0@g0fjl>
Message-ID:

On Sun, 9 Jul 2000, Cayte wrote:

> >
> > Message summary for PR#8
> > From: katel@worldpath.net
> > Subject: nr_false_neg initialized as a tuple
> > Date: Sat, 8 Jul 2000 20:37:37 -0400
> > 0 replies 0 followups
> > Notes: Hmmm... nr_false_neg should be initialized to 'None', so I'll fix that.
> >
> > However, not setting a value specified in a record is a more serious problem.
> > Which prosite file is this happening on?
> >
> >
> I saw it on all the files in my Test\Prosite directory.
>
>
> Cayte

I still can't duplicate it:

taiyang:~/remotecvs/biopython/Tests/Prosite> python
Python 1.5.2 (#1, Sep 26 1999, 16:32:39) [C] on sunos5
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> from Bio.Prosite import Prosite
>>> rec = Prosite.RecordParser().parse(open('ps001'))
>>> print rec.nr_false_neg
33
>>> rec = Prosite.RecordParser().parse(open('ps00107.txt'))
>>> print rec.nr_false_neg
142
>>>

Can you send some code that will reproduce the bug?

Jeff

From jchang at SMI.Stanford.EDU Mon Jul 10 00:23:48 2000
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/10
In-Reply-To: <200007092117.RAA17038@pw600a.bioperl.org>
Message-ID:

> Full_Name: Katharine Lindner
> Module: Prodoc.py
> Version:
> OS:
> Submission from: pm41-220-45.worldpath.net (208.133.220.45)
>
>
> Line 348 assumes a colon for a continuation line in a reference. The data I
> found uses indentation.

According to Prodoc.py,v in the main repository, Prodoc.py has always
looked for spaces.

Prodoc.py (1.2)
    347         elif line[:4] == '    ':
    348             if not self._ref:
    349                 raise SyntaxError, "Unnumbered reference lines\n%s" % line
    350             self._ref.citation = self._ref.citation + line[5:]

Do you mean that you found some data that uses colons? If so, please send
it, and we can add it to the test suite.

Jeff

From katel at worldpath.net Mon Jul 10 04:02:00 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/8
References:
Message-ID: <002201bfea45$21318640$010a0a0a@0q6vm>

> > > However, not setting a value specified in a record is a more serious problem.
> > > Which prosite file is this happening on?
> > >
> > >
> > I saw it on all the files in my Test\Prosite directory.
> >
> >
> > Cayte
>
>
> I still can't duplicate it:
>
> taiyang:~/remotecvs/biopython/Tests/Prosite> python
> Python 1.5.2 (#1, Sep 26 1999, 16:32:39) [C] on sunos5
> Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
> >>> from Bio.Prosite import Prosite
> >>> rec = Prosite.RecordParser().parse(open('ps001'))
> >>> print rec.nr_false_neg
> 33
> >>> rec = Prosite.RecordParser().parse(open('ps00107.txt'))
> >>> print rec.nr_false_neg
> 142
> >>>
>
>
> Can you send some code that will reproduce the bug?
>
> Jeff
>
>
I used code that assumes a tuple or list

    def print_list( list ):
        for item in list:
            print( ' ' + str( item ) )
    f

Cayte

From dalke at acm.org Mon Jul 10 01:12:36 2000
From: dalke at acm.org (Andrew Dalke)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/7
Message-ID: <002a01bfea2d$776997e0$363ce1cf@josiah.daylight.com>

>Hmm, that means I need to figure out how to login to CVS for write
>access.

To make things even more fun, I was connecting from behind a firewall.
Not that you all want to hear the details, but it took me an hour so
I want to blab. Yes, an hour to get the following few lines.

Ended up having to run a tcp forwarder on a non-privileged port, since
I don't have root on that machine.

firewall> tcpxd 2200 biopython.org 22
    (22 is the ssh port)

mybox> setenv CVS_RSH ssh
mybox> cvs -d :ext:dalke@firewall:/home/repository/biopython checkout biopython

And no, I don't know how to have it not ask for my password each time.

                    Andrew

From katel at worldpath.net Tue Jul 11 02:28:46 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/10
References:
Message-ID: <002f01bfeb01$45e9ad20$010a0a0a@0q6vm>

> According to Prodoc.py,v in the main repository, Prodoc.py has always
> looked for spaces.
>
> Prodoc.py (1.2)
>     347         elif line[:4] == '    ':
>     348             if not self._ref:
>     349                 raise SyntaxError, "Unnumbered reference lines\n%s" %
> line
>     350             self._ref.citation = self._ref.citation + line[5:]
>
OK, erase everything I wrote. The files contain a copyright banner,
after the references but before {END}, that messes up prodoc.py.

Cayte

From klindner at jlc.net Sat Jul 15 18:00:35 2000
From: klindner at jlc.net (Katharine Lindner)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] lookup by name
Message-ID: <3970DF03.8888C7F0@jlc.net>

To proceed with the gui, I need the codon lookup by name feature to
be fixed. My picklist uses the long names, because they are easier to
recognize and the picklist eliminates the typing. Should I do the fix
myself?

Cayte

From katel at worldpath.net Sat Jul 15 23:21:54 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase and regexp
Message-ID: <39712A52.CCD936F2@worldpath.net>

I was thinking about using the parser/consumer template to parse a
rebase file. It would be useful if _fail_condition in ParserSupport.py
could handle regular expressions. The rebase tags are indented, so
white space needs to be skipped before the start field.

Cayte

From jchang at SMI.Stanford.EDU Sun Jul 16 00:39:47 2000
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase and regexp
In-Reply-To: <39712A52.CCD936F2@worldpath.net>
Message-ID:

Great! Glad you're interested in adding rebase.

Yes, we can add this if it's really needed. However, I've purposefully
left it out mostly to discourage the use of regular expressions for
performance reasons.

Is this something that you can do with either the 'start' and/or
'contains' parameters?
For example, in the Blast.NCBIStandalone module, I inserted the whitespace
directly into the string:

    read_and_call(uhandle, consumer.score, start=' Score')
    read_and_call(uhandle, consumer.identities, start=' Identities')

Alternative ways to do this might be:

    read_and_call(uhandle, consumer.score, contains='Score')

or:

    read_and_call(uhandle, consumer.score, start=' ', contains='Score')

Please let me know.

Jeff

On Sat, 15 Jul 2000, Cayte wrote:

> I was thinking about using the parser/consumer template to parse a
> rebase file. It would be useful if _fail_condition in ParserSupport.py
> could handle regular expressions. The rebase tags are indented, so
> white space needs to be skipped before the start field.
>
> Cayte
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev@biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
>

From jchang at SMI.Stanford.EDU Sun Jul 16 19:51:16 2000
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] lookup by name
Message-ID:

It's Andrew's baby, so he's got first say in things. However, if he
doesn't have the time, have at it.

Andrew?

Jeff

> To proceed with the gui, I need the codon lookup by name feature to
> be fixed. My picklist uses the long names, because they are easier to
> recognize and the picklist eliminates the typing. Should I do the fix
> myself?
>
>
> Cayte

From biopython-bug-admin at bioperl.org Wed Jul 19 15:08:17 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/10
Message-ID: <200007191908.PAA16959@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#10 from incoming to fixed-bugs

Message summary for PR#10
    From: katel@worldpath.net
    Subject: continuation lines in references
    Date: Sun, 9 Jul 2000 17:17:32 -0400
    0 replies 0 followups
    Notes:
    [Jeff]
    > According to Prodoc.py,v in the main repository, Prodoc.py has always
    > looked for spaces.
    >
    > Prodoc.py (1.2)
    >     347         elif line[:4] == '    ':
    >     348             if not self._ref:
    >     349                 raise SyntaxError, "Unnumbered reference lines\n%s" %
    > line
    >     350             self._ref.citation = self._ref.citation + line[5:]
    >

    [Cayte]
    OK, erase everything I wrote. The files contain a copyright banner,
    after the references but before {END}, that messes up prodoc.p
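For reference, the indentation rule at issue in PR#10 can be sketched in a few
lines. This is not the Prodoc.py implementation: join_reference_lines and the
"[1] ..." sample layout used to exercise it are hypothetical, and only the
four-space test mirrors the Prodoc.py lines quoted in the thread. Unlike the
quoted code, which appends line[5:], the sketch strips the line and joins with
a single space.

```python
def join_reference_lines(lines):
    """Merge indented continuation lines into the preceding reference."""
    references = []
    for line in lines:
        if line[:4] == "    ":
            # A continuation line with nothing before it is a format error.
            if not references:
                raise ValueError("Unnumbered reference lines\n%s" % line)
            references[-1] += " " + line.strip()
        else:
            # Any non-indented line starts a new reference.
            references.append(line.strip())
    return references
```

Trailing material such as a copyright banner would be treated here as just
another non-indented line, so a real parser would need an explicit stop
condition before the end-of-record marker.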
From biopython-bug-admin at bioperl.org Wed Jul 19 15:08:17 2000
From: biopython-bug-admin at bioperl.org (biopython-bug-admin@bioperl.org)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] Notification: incoming/10
Message-ID: <200007191908.PAA16945@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#10
    From: katel@worldpath.net
    Subject: continuation lines in references
    Date: Sun, 9 Jul 2000 17:17:32 -0400
    0 replies 0 followups

From katel at worldpath.net Sun Jul 23 15:08:04 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase
Message-ID: <000a01bff4d9$558f52c0$989403cf@g0fjl>

An alternative to regular expressions, that would work with rebase,
would be read_and_calls with an lstrip? The only reason for a regular
expression is lots of white space before every field. Except for the
recognition site.

The recognition site may deserve a class of its own, with cutting
sites, methyl sites, overhang, etc.?
Some of the fields in rebase, like enzyme number and source, stay
the same but some vary or appear only in a few files. Should I use an
on-the-fly dictionary for the fields that only appear in a few files?

I didn't know what restriction enzymes were, but I found a web page
www.csun.edu/~hcbio027/Bio572_F97/L2/Framemain2.html
Then I could see why the rebase page showed a pair of scissors!

Cayte

From katel at worldpath.net Mon Jul 24 00:46:24 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase
References: <000a01bff4d9$558f52c0$989403cf@g0fjl>
Message-ID: <001801bff52a$20684880$3a70bbd1@g0fjl>

Looking at the text rebase files, I noticed a difference between the
Internet Explorer conversion to text and the Netscape Navigator
version. The Netscape version tries to preserve more of the look and
feel of the html file, but both try to preserve indentation. It occurred
to me that it might be useful to have our own converter to prevent
bugs caused by variations in browsers. It would also eliminate the
need for stripping whitespace. The utility would simply remove the
angle-bracketed stuff and forget about how it looks on a page. But
the converter could be written most efficiently in perl. Are we having
mixed language applications? The advantage is that you can use each
language for what it's best at. The disadvantage is that users have to
install lots of compilers.

The utility could be useful in a lot of places, since many databases
use HTML. This is nice for human viewers but it's a hassle if it's
being used as input to other software.

Cayte

From jchang at SMI.Stanford.EDU Mon Jul 24 03:21:16 2000
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase
In-Reply-To: <000a01bff4d9$558f52c0$989403cf@g0fjl>
Message-ID:

> An alternative to regular expressions, that would work with rebase,
> would be read_and_calls with an lstrip? The only reason for a regular
> expression is lots of white space before every field. Except for the
> recognition site.

Well, I've already added in support for regular expressions, so that's an
option now. If you need to lstrip the string before read_and_call, you
can write a wrapper function that does that.

> The recognition site may deserve a class of its own, with cutting
> sites, methyl sites, overhang, etc.?

Probably the correct thing to do is to have a SeqFeature class (like
bioperl) to annotate locations in a sequence.

> Some of the fields in rebase, like enzyme number and source, stay
> the same but some vary or appear only in a few files. Should I use an
> on-the-fly dictionary for the fields that only appear in a few files?

What I've been doing is creating classes that can hold every field, and
initializing fields to a reasonable default value. It seems friendlier
than to require the client to check for the existence in a dictionary.

Jeff

From jchang at SMI.Stanford.EDU Mon Jul 24 03:28:01 2000
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase
In-Reply-To: <001801bff52a$20684880$3a70bbd1@g0fjl>
Message-ID:

There's already a class that strips HTML tags in:
Bio.File.SGMLHandle

It decorates a file handle to HTML data (e.g. a socket to a web page) and
returns only the non-tag data. It uses Python's built-in sgmllib library,
since stripping tags is non-trivial.
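A rough sketch of the tag-stripping idea, not the Bio.File.SGMLHandle code
itself. sgmllib is no longer part of Python 3, so this assumes the standard
html.parser module in its place; TagStripper and strip_tags are hypothetical
names chosen for the illustration.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect the text between tags and discard the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called by the parser for every run of non-tag text.
        self.chunks.append(data)


def strip_tags(markup):
    """Return `markup` with all tags removed."""
    stripper = TagStripper()
    stripper.feed(markup)
    stripper.close()  # flush any text still buffered by the parser
    return "".join(stripper.chunks)
```

Leaning on a real parser rather than a regular expression is what lets this
cope with attributes, comments and entity references.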
There's also a consumer decorator so that you can build consumers that
don't have to deal with tags:
Bio.ParserSupport.SGMLStrippingConsumer

Jeff

On Mon, 24 Jul 2000, Cayte wrote:

> Looking at the text rebase files, I noticed a difference between the
> Internet Explorer conversion to text and the Netscape Navigator
> version. The Netscape version tries to preserve more of the look and
> feel of the html file, but both try to preserve indentation. It occurred
> to me that it might be useful to have our own converter to prevent
> bugs caused by variations in browsers. It would also eliminate the
> need for stripping whitespace. The utility would simply remove the
> angle-bracketed stuff and forget about how it looks on a page. But
> the converter could be written most efficiently in perl. Are we having
> mixed language applications? The advantage is that you can use each
> language for what it's best at. The disadvantage is that users have to
> install lots of compilers.
>
> The utility could be useful in a lot of places, since many databases
> use HTML. This is nice for human viewers but it's a hassle if it's
> being used as input to other software.
>
>
> Cayte
>

From katel at worldpath.net Mon Jul 24 12:45:08 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase
References:
Message-ID: <001601bff58e$887826c0$45dc85d0@g0fjl>

----- Original Message -----
From: Jeffrey Chang
To: Cayte
Cc:
Sent: Monday, July 24, 2000 3:21 AM
Subject: Re: [Biopython-dev] rebase

> > Some of the fields in rebase, like enzyme number and source, stay
> > the same but some vary or appear only in a few files. Should I use an
> > on-the-fly dictionary for the fields that only appear in a few files?
>
> What I've been doing is creating classes that can hold every field, and
> initializing fields to a reasonable default value. It seems friendlier
> than to require the client to check for the existence in a dictionary.
>
I agree.
What I'm talking about is a just-in-case dictionary. Without
looking at every entry, we may miss some fields. When I find out about
them, I can set defaults.

Cayte

From katel at worldpath.net Sat Jul 29 21:20:46 2000
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase
References:
Message-ID: <002201bff9c4$6b8ef360$64dc85d0@g0fjl>

----- Original Message -----
From: Jeffrey Chang
To: Cayte
Cc: ;
Sent: Monday, July 24, 2000 3:28 AM
Subject: Re: [Biopython-dev] rebase

> There's already a class that strips HTML tags in:
> Bio.File.SGMLHandle
>
> It decorates a file handle to HTML data (e.g. a socket to a web page) and
> returns only the non-tag data. It uses Python's built-in sgmllib library,
> since stripping tags is non-trivial.
>
> There's also a consumer decorator so that you can build consumers that
> don't have to deal with tags:
> Bio.ParserSupport.SGMLStrippingConsumer
>
> Jeff
>
The consumer decorator doesn't solve the problem, because it occurs in the
_Scanner. SGMLHandle works, except the linefeeds are placed in such a way
that there may be no separation between a key word and data from a
previous field.

As an experiment, I hacked handle_data in a copy of File.py and I was
able to solve the problem. But to do it cleanly in production code, I
would need to be able to pass my own parser to SGMLStripper, as an
optional parameter. The alternative would be to subclass both
SGMLStripper and SGMLHandle, because handle_data is deeply buried in
these classes.

Isn't it spooky, the way the coding problems we deal with echo the
problems our molecules solve?
Cayte

From jchang at SMI.Stanford.EDU Mon Jul 31 18:41:31 2000
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:42:50 2005
Subject: [Biopython-dev] rebase
In-Reply-To: <002201bff9c4$6b8ef360$64dc85d0@g0fjl>
Message-ID:

[Jeff, saying that biopython has tools to strip HTML]

[Cayte]
> The consumer decorator doesn't solve the problem, because it occurs in the
> _Scanner.

I'm not sure what problem you're trying to solve...

> SGMLHandle works, except the linefeeds are placed in such a way
> that there may be no separation between a key word and data from a
> previous field.

Ah, yes, that's true. If you have text fields that are separated only by
HTML tags, then it's insufficient to just strip the tags because then
you'd have no separator.

> As an experiment, I hacked handle_data in a copy of File.py and I was
> able to solve the problem. But to do it cleanly in production code, I
> would need to be able to pass my own parser to SGMLStripper, as an
> optional parameter. The alternative would be to subclass both
> SGMLStripper and SGMLHandle, because handle_data is deeply buried in
> these classes.

Allowing an optional parser parameter for the SGMLStripper seems like the
way to go. I'll fix the File.py file.

Jeff
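One way the optional parser parameter agreed on above could look, as a sketch
under stated assumptions rather than the actual File.py change. It uses the
standard html.parser module in place of sgmllib, and PlainStripper,
SeparatingStripper, strip_markup and the parser_factory argument are all
hypothetical names introduced for the illustration.

```python
from html.parser import HTMLParser


class PlainStripper(HTMLParser):
    """Default behaviour: keep the text between tags, drop the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


class SeparatingStripper(PlainStripper):
    """Variant that leaves a newline wherever a tag stood."""

    def handle_starttag(self, tag, attrs):
        self.chunks.append("\n")

    def handle_endtag(self, tag):
        self.chunks.append("\n")


def strip_markup(markup, parser_factory=PlainStripper):
    """Strip tags using a caller-supplied parser class (hypothetical API)."""
    parser = parser_factory()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)
```

With the default parser, adjacent fields separated only by tags run together;
passing SeparatingStripper keeps them apart without subclassing the code that
calls the parser.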