From tarjei at genome.wi.mit.edu Wed Aug 1 02:02:41 2001
From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen)
Date: Sat Mar 5 14:43:01 2005
Subject: [Biopython-dev] Pathway Module
In-Reply-To: <002101c1194c$d997d5e0$010a0a0a@cadence.com>
Message-ID:

Hi,

thanks to Cayte for taking the initiative on getting a Pathway module
discussion going. Below are my ramblings on what I think such a module
should be like. This is all off the top of my head, so any feedback
would be greatly appreciated.

First of all, I think it is a useful exercise to consider what kind of
tasks would benefit from the availability of reaction/pathway classes.
I can think of the following:

 * Elementary mode analysis and MCA.
   - Involves converting a set of reactions to a stoichiometry matrix

 * Mapping genes clustered by location or expression to pathways
 * Route queries (how can we transform A to B given a set of enzymes?)
 * Neighborhood queries (which enzymes are k-separated from enzyme Y?)
   - All three of these focus on the graph structure of the pathways.

 * Dynamic simulations

The last task is beyond the scope of anything we could do on this
project, not only because of the technical challenges, but also because
of the lack of information about kinetics. There is a fair amount of
kinetic information in databases like EMP and Brenda, but these numbers
are extremely context specific and irregular.

I therefore think that information like reaction temperature, free
energies, experimentally determined kinetics, and even which organism a
reaction has been observed in is best left in the Record objects of the
individual database modules.

I think the core of a biopython pathway module should be a relatively
lightweight abstraction for pathway connectivity, and not much more.

Below is a quick description of what I imagine it could look like. Note
that this is a description of an *abstraction*, not a python
*implementation*.

CLASSES:

Species:
 - A very light class for representing any biochemical species that is
   present in the system we're interested in. It could be a small
   molecule, an enzyme, whatever.

   a unique name or id
     - identifies what this species is (EC number, CAS number,
       something like that)
   a user-defined reference
     - ref to object containing further information, probably an
       appropriate Record

Reaction:
 - Represents any biochemical transformation that can take place in the
   system, such as an enzymatic reaction, or a spontaneous
   transformation.

   a set S of Species objects - the substrates
   a set P of Species objects - the products
   a set E of Species objects - the enzymes
   a set F of Species objects - the factors (cofactors, effectors,
                                inhibitors?)

System:
 - Represents the biochemical system we're interested in. It is
   essentially a directed multi-graph where the vertices are Species
   and the edges are labeled with references to the reaction that links
   the parent vertex to the child vertex.

   a set V of Species objects
     - these are all biochemical species in this system, including
       metabolites, enzymes and whatnot
   a set E of tuples (from, to, reaction) where from, to refer to
   elements in V and reaction is a (not necessarily unique) Reaction
   object where from is a substrate and to is a product.
     - these are the 'edges' that collectively define a multi-graph
       representing the network connectivity

So for example, in a system with Species A,B,C,D,E and one Reaction

  R1: A + B -E-> C + D,

the System object would be

  S1: V = {A,B,C,D,E}
      E = {(A,C,E), (A,D,E), (B,C,E), (B,D,E)}

USAGE:

This is a collection of imagined user interactions with the pathway
module:

First we create a bunch of Species objects which refer to descriptions
of them, such as KEGG or WIT records. This step will usually happen
inside a database parser:

  A = Species('A',ref1)
  B = Species('B',ref2)
  C = Species('C',ref2)
  ...

Then we create any Reaction objects.
This will also usually happen inside a parser module:

  R1 = Reaction(name='smelly',substrates=[A,B],enzymes=[E],products=[C,D])
  R2 = Reaction(name='decay',substrates=[C])
  R3 = R1.reverse()

It should be easy to create a System object from a collection of
Reactions. Connectivity should be inferred automatically when several
reactions are combined:

  >>> S = System()
  >>> S.add_reaction(R1)
  >>> S.add_reaction(R2)
  >>> repr(S.species())
  [Species('A'), Species('B'), ..., Species('E')]

We might be interested in only some of the species:

  >>> repr(S.enzymes())
  [Species('E')]
  >>> repr(S.metabolites())
  [Species('A'), Species('B'), Species('C'), Species('D')]

Other useful information:

  >>> S.stochiometry()
  [[-1 -1 1 1], [0 0 -1 0]]

Putting the information to use:

flux analysis:

  >>> import Bio.Pathway.Metatool
  >>> Metatool.find_elementary_modes(S, exterals=[A,D])
  ...Metatool output...

neighborhood queries:

  >>> import Bio.Pathway.Graph
  >>> Graph.find_neighbours(S, E1, separation=3)
  [[E2, E3], [E4], []]

..and so on. You get the picture.

Appendix :) - reply to Cayte:

> Step is separate from reaction, because a reaction could occur in
> more than one pathway.

I'm not sure I see the rationale for this. It is true that a reaction
can occur in several pathways, but unless there is information about a
reaction that only applies to a specific pathway there is no need to
keep a separate Step object - you can just let two different pathway
objects reference the same reaction object.

> There may be other information associated with reaction, like
> temperature, but I haven't come across it yet in the WIT or
> EMP databases.

As I said above, I don't think we should represent kinetics and other
"volatile" information in the core pathway objects.

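For concreteness, the Species/Reaction/System abstraction and the R1/R2
example above could be sketched in present-day Python roughly as
follows. This is only an illustration under assumptions: the dataclass
layout, the use of tuples instead of lists, the helper _collect, and the
method name stoichiometry are choices made for this sketch and are not
part of any agreed Biopython API. Edges here are labelled with the
Reaction object, following the (from, to, reaction) definition.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Species:
    """A biochemical species identified by a unique name."""
    name: str
    # Optional link to further information (e.g. a database Record);
    # excluded from equality and hashing.
    ref: object = field(default=None, compare=False)


@dataclass(frozen=True)
class Reaction:
    """A transformation of substrates into products (assumed layout)."""
    name: str
    substrates: tuple = ()
    products: tuple = ()
    enzymes: tuple = ()


class System:
    """Directed multigraph: Species are vertices, Reactions label edges."""

    def __init__(self):
        self.reactions = []
        self.edges = set()  # tuples of (from_species, to_species, reaction)

    def add_reaction(self, reaction):
        self.reactions.append(reaction)
        # Connectivity is inferred: one edge per substrate/product pair.
        for substrate in reaction.substrates:
            for product in reaction.products:
                self.edges.add((substrate, product, reaction))

    def _collect(self, attribute_names):
        # Gather species in first-seen order, without duplicates.
        seen = []
        for reaction in self.reactions:
            for attribute in attribute_names:
                for species in getattr(reaction, attribute):
                    if species not in seen:
                        seen.append(species)
        return seen

    def metabolites(self):
        return self._collect(("substrates", "products"))

    def enzymes(self):
        return self._collect(("enzymes",))

    def stoichiometry(self):
        # One row per reaction, one column per metabolite.
        columns = self.metabolites()
        matrix = []
        for reaction in self.reactions:
            row = [0] * len(columns)
            for species in reaction.substrates:
                row[columns.index(species)] -= 1
            for species in reaction.products:
                row[columns.index(species)] += 1
            matrix.append(row)
        return matrix


A, B, C, D, E = (Species(name) for name in "ABCDE")
R1 = Reaction("smelly", substrates=(A, B), products=(C, D), enzymes=(E,))
R2 = Reaction("decay", substrates=(C,))

system = System()
system.add_reaction(R1)
system.add_reaction(R2)
print(system.stoichiometry())
```

With these definitions the matrix has one row per reaction and one
column per metabolite (A, B, C, D), which is the same layout as the
S.stochiometry() output shown in the usage section.
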
- Tarjei From m.1.robinson at herts.ac.uk Wed Aug 1 03:41:44 2001 From: m.1.robinson at herts.ac.uk (Mark Robinson) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: [BioPython] tests failing References: <3B66BE5E.7020204@herts.ac.uk> Message-ID: <3B67B2B8.1020901@herts.ac.uk> Thanks Jeff, I'll get stuck in using it then ;). Hope thats been some help, I have to say so far I am really impressed by what I am seeing. Great work!! blobby Jeffrey Chang wrote: > Hey Mark, > > Thanks for letting us know about these. I'm moving this thread onto > the "biopython-dev" list, as it's probably more appropriate there. > >> Failure: test_SubsMat >> >> AssertionError: >> output: 'M0.00 0.40 0.70 0.80 1.00\n' >> Expected: 'M -0.00 0.40 0.70 0.80 1.00\n' > > > It looks like this is from a difference in how windows and Iddo's OS > handles 0's. It's probably not serious, but should be fixed. Iddo, > can you write some code that will check for this? > > >> Error: test_gobase >> >> from Bio import Sequence >> ImportError: cannot import name Sequence >> >> Error: test_rebase >> >> from Bio import Sequence >> ImportError: cannot import name Sequence > > > These seem to be from some legacy code that hasn't been cleaned up. > It's now fixed in the CVS and will be incorporated into the next release. > > > >> Failure: test_prodoc >> >> AssertionError: >> Output: 'J. \n' >> Expected: 'J. \n' > > > Brad, this looks pretty odd. Is it a newline problem? > > Jeff > > > From chapmanb at arches.uga.edu Wed Aug 1 05:28:30 2001 From: chapmanb at arches.uga.edu (Brad Chapman) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] "Features" of Bio.Clustalw In-Reply-To: References: <15203.3182.97701.271322@taxus.athen1.ga.home.com> Message-ID: <15207.52158.379530.574917@taxus.athen1.ga.home.com> Hi Davide; [Clustalw bugs] > Here I send the patches I was able to cook up, these are only minor > changes, anyway I hope it will help. Great! I applied these to CVS. Thanks much for the contribution! 
> I think that having a class like MultipleAlignCL is superior to passing
> the alignment arguments to a function as is for blastpgp or blastall.

I'm glad you like it :-). This is just an idea I came up with because
clustalw had so many options. It seemed less confusing than trying to
pass in all of those options through a function.

blastall and blastpgp are Jeff Chang's functions, so maybe he could
comment on your idea to have classes to encompass their options. I'm
not positive if he even likes the "command line in a class" idea :-).

> Finally it is a general mechanism and could be used to give a uniform
> interface to functions invoking external programs.
>
> Do you think you would be interested in a patch implementing such
> behaviour? I think one could also retain compatibility with the current
> interface.

As I mentioned above, it is really Jeff's call about whether or not
he'd like to see something like this in blastall() and friends; but I
do think having a general interface would be nice. There was a lot of
talk at the BOSC/ISMB conference this year about other programs that it
would be nice for biopython to interface to (EMBOSS in particular) so
there is definitely interest and a lot of work that could be done
along these lines, if you are interested.

Also, during one of the talks at the ISMB conference I got inspired
and had an idea for a generic class for running Applications. Based on
what I scrawled on a piece of notebook paper during the talk, I wrote
up something that kind of sketches out the ideas I had and attached it
to this mail. This isn't working code or anything -- just enough to
show the ideas. I'm not really sure if this is good, but I thought you
might be interested in looking at it if you want to work further on
this. Feel free to use it or not use it.

Thanks again for the patches and interest!

Brad

-------------- next part --------------
"""Rough ideas for a general way to access applications in biopython.
"""
import os

# --- the general classes

class AbstractApplication:
    """Generic interface for running applications from biopython.

    This class shouldn't be called directly; it should be subclassed to
    provide an implementation for a specific application.
    """
    def __init__(self):
        self.program_name = ""
        self.parameters = []

    def run(self):
        """Construct the commandline and run the program.
        """
        pass

    def construct_commandline(self):
        """Make the commandline with the currently set options.
        """
        commandline = "%s " % self.program_name
        for parameter in self.parameters:
            # refuse to build a commandline with a missing required value
            if parameter.is_required and not(parameter.is_set):
                raise ValueError("Parameter %s is not set." % parameter.names)
            if parameter.is_set:
                commandline += str(parameter)

        return commandline

    def set_parameter(self, name, value = None):
        """Set a commandline option for a program.
        """
        set_option = 0
        for parameter in self.parameters:
            if name in parameter.names:
                if value is not None:
                    # validate the value before storing it
                    if parameter.checker_function is not None:
                        parameter.checker_function(value)
                    parameter.value = value
                parameter.is_set = 1
                set_option = 1

        if set_option == 0:
            raise ValueError("Option name %s was not found." % name)

class _AbstractParameter:
    """A class to hold information about a parameter for a commandline.

    Do not use this directly, instead use one of the subclasses.

    Attributes:

    o names -- a list of string names by which the parameter can be
    referenced (ie. ["-a", "--append", "append"]). The first name in
    the list is considered to be the one that goes on the commandline,
    for those parameters that print the option.

    o checker_function -- a reference to a function that will determine
    if a given value is valid for this parameter.

    o description -- a description of the option.

    o is_required -- a flag to indicate if the parameter must be set for
    the program to be run.

    o is_set -- if the parameter has been set

    o value -- the value of a parameter
    """
    def __init__(self, names = [], checker_function = None, is_required = 0,
                 description = ""):
        self.names = names
        self.checker_function = checker_function
        self.description = description
        # store the flag that was passed in (the first draft always set 0,
        # so required parameters were never enforced)
        self.is_required = is_required

        self.is_set = 0
        self.value = None

class _Option(_AbstractParameter):
    """Represent an option that can be set for a program.

    This holds UNIXish options like --append=yes and -a yes
    """
    def __str__(self):
        """Return the value of this option for the commandline.
        """
        # first deal with long options
        if self.names[0].find("--") >= 0:
            output = "%s" % self.names[0]
            if self.value is not None:
                output += "=%s " % self.value
            else:
                # keep a separator so the next parameter is not glued on
                output += " "
        # now short options
        elif self.names[0].find("-") >= 0:
            output = "%s " % self.names[0]
            if self.value is not None:
                output += "%s " % self.value
        else:
            raise ValueError("Unrecognized option type: %s" % self.names[0])

        return output

class _Argument(_AbstractParameter):
    """Represent an argument on a commandline.
    """
    def __str__(self):
        if self.value is not None:
            return "%s " % self.value
        else:
            return " "

# --- Example program for Clustalw

class ClustalwApplication(AbstractApplication):
    """Accessing Clustalw through the Application interface.

    XXX This is not done at all -- just meant as an example of how
    the AbstractApplication stuff might work.

    This class could also have the same 'helper functions' as the current
    MultipleAlignCL class.
    """
    def __init__(self):
        AbstractApplication.__init__(self)

        self.program_name = "clustalw"

        self.parameters = \
          [_Argument(["sequence_file"], self._file_exists, 1),
           _Option(["-USETREE=", "guide_tree"], self._file_exists, 0),
           _Option(["-TYPE=", "output_type"], self._valid_output_type, 0)
          ]

    def run(self):
        commandline = self.construct_commandline()

        # just put in the stuff from Bio/Clustalw/__init__.py.do_alignment()

    # --- functions to check for valid parameters

    def _file_exists(self, filename):
        """Make sure that a passed filename exists.
        """
        if not(os.path.exists(filename)):
            raise ValueError("File %s does not exist." % filename)

    def _valid_output_type(self, type):
        OUTPUT_TYPES = ['GCG', 'GDE', 'PHYLIP', 'PIR', 'NEXUS']
        if type not in OUTPUT_TYPES:
            raise ValueError("Output type %s not valid. Options are %s" %
                             (type, OUTPUT_TYPES))

From chapmanb at arches.uga.edu Wed Aug 1 05:42:58 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Re: [BioPython] tests failing
In-Reply-To:
References: <3B66BE5E.7020204@herts.ac.uk>
Message-ID: <15207.53026.138744.787675@taxus.athen1.ga.home.com>

Jeff:
> Thanks for letting us know about these. I'm moving this thread onto
> the "biopython-dev" list, as it's probably more appropriate there.

I'd like to second the thanks -- it's all around nice to have people
using biopython regularly on non-UNIX platforms.

> >Failure: test_SubsMat
> >
> >AssertionError:
> >output: 'M0.00 0.40 0.70 0.80 1.00\n'
> >Expected: 'M -0.00 0.40 0.70 0.80 1.00\n'
>
> It looks like this is from a difference in how windows and Iddo's OS
> handles 0's. It's probably not serious, but should be fixed. Iddo,
> can you write some code that will check for this?

I think this actually might be a python version difference and not an
OS difference.
I'm also seeing it right now on my machine: $ uname -a NetBSD taxus.athen1.ga.home.com 1.5.1 NetBSD 1.5.1 (TAXUS) #1: Tue Jun 12 09:13:48 EDT 2001 chapmanb@taxus:/usr/src/sys/arch/macppc/compile/TAXUS macppc $ python Python 2.1 (#6, Jul 8 2001, 17:18:01) [GCC egcs-2.91.66 19990314 (egcs-1.1.2 release)] on netbsd1 FAIL: test_SubsMat ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 153, in runTest expected_handle) File "run_tests.py", line 247, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'M 0.00 0.40 0.70 0.80 1.00\n' Expected: 'M -0.00 0.40 0.70 0.80 1.00\n' This is just one I've thrown my hands up in the air about. It's not really a bug in SubsMat (hey, 0.00 and -0.00 are still the same, right :-), but I'm not sure how to make the regression checker recognize this. > >Failure: test_prodoc > > > >AssertionError: > >Output: 'J. \n' > >Expected: 'J. \n' > > Brad, this looks pretty odd. Is it a newline problem? This is another one I've seen on Windows and also on Yair's Mac stuff, but have to throw my hands up in the air about. What Mark reported here is different from what I've seen -- my error looks like: Output: 'J. \n' Expected: 'J.\n' So, there is, for some unknown reason, as extra space generated at the end of the line, that we don't see on UNIX platforms. I'm not sure what is going on here, or how we can make the regression tester stop choking on it (other than reintroducing my "end of the line whitespace isn't important stuff" :-). Any ideas for anyone? I'd definately like to clear up these two problems if we could. Brad From idoerg at cc.huji.ac.il Wed Aug 1 08:31:57 2001 From: idoerg at cc.huji.ac.il (Iddo Friedberg) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: [BioPython] tests failing In-Reply-To: <15207.53026.138744.787675@taxus.athen1.ga.home.com> Message-ID: Hi, OK, I'm on to the test_SubsMat problem. 
I'll see what I can do to accomodate this. Seems like a format-string handling problem, which may arise from different OS versions. Doesn't seem to be from different python versions, as I'm also using the 2.1, and the test was good in both 2.1 and 2.0. Brad is using a 2.1 on a FreeBSD machine, and is getting different output than me. On another matter: got a problem with test_unigene: idoerg@arrakis:biopython/Tests> python run_tests.py test_unigene.py test_unigene ... FAIL ====================================================================== FAIL: test_unigene ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 153, in runTest expected_handle) File "run_tests.py", line 247, in compare_output assert expected_line == output_line, \ AssertionError: Output : ' key is AI616857\n' Expected: ' key is AA495266\n' ---------------------------------------------------------------------- Ran 1 tests in 0.732s FAILED (failures=1) My machine: idoerg@arrakis:biopython/Tests> uname -a Linux arrakis.md.huji.ac.il 2.2.16-22enterprise #1 SMP Tue Aug 22 16:29:32 EDT 2000 i686 unknown My Python: idoerg@arrakis:biopython/Tests> python Python 2.1 (#1, Jul 11 2001, 11:27:29) [GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-85)] on linux2 Platform-independence-means-that-some-platforms-are-more-independent-than-others'ly yours, Iddo On Wed, 1 Aug 2001, Brad Chapman wrote: : Jeff: : > Thanks for letting us know about these.I'm moving this thread onto : > the "biopython-dev" list, as it's probably more appropriate there. : : I'd like to second the thanks -- it's all around nice to have people : using biopython regularly on non-UNIX platforms. 
: : > >Failure: test_SubsMat : > > : > >AssertionError: : > >output: 'M0.00 0.40 0.70 0.80 1.00\n' : > >Expected: 'M -0.00 0.40 0.70 0.80 1.00\n' : > : > It looks like this is from a difference in how windows and Iddo's OS : > handles 0's.It's probably not serious, but should be fixed.Iddo, : > can you write some code that will check for this? : : I think this actually might be a python version difference and not an : OS difference. I'm also seeing it right now on my machine: : : $ uname -a : NetBSD taxus.athen1.ga.home.com 1.5.1 NetBSD 1.5.1 (TAXUS) #1: Tue Jun 12 09:13:48 EDT 2001 chapmanb@taxus:/usr/src/sys/arch/macppc/compile/TAXUS macppc : : $ python : Python 2.1 (#6, Jul8 2001, 17:18:01) : [GCC egcs-2.91.66 19990314 (egcs-1.1.2 release)] on netbsd1 : : FAIL: test_SubsMat : ---------------------------------------------------------------------- : Traceback (most recent call last): : File "run_tests.py", line 153, in runTest : expected_handle) : File "run_tests.py", line 247, in compare_output : assert expected_line == output_line, \ : AssertionError: : Output: 'M 0.00 0.40 0.70 0.80 1.00\n' : Expected: 'M -0.00 0.40 0.70 0.80 1.00\n' : : This is just one I've thrown my hands up in the air about. It's not : really a bug in SubsMat (hey, 0.00 and -0.00 are still the same, right : :-), but I'm not sure how to make the regression checker recognize this. : : > >Failure: test_prodoc : > > : > >AssertionError: : > >Output: 'J. \n' : > >Expected: 'J. \n' : > : > Brad, this looks pretty odd.Is it a newline problem? : : This is another one I've seen on Windows and also on Yair's Mac stuff, : but have to throw my hands up in the air about. What Mark reported : here is different from what I've seen -- my error looks like: : : Output: 'J. \n' : Expected: 'J.\n' : : So, there is, for some unknown reason, as extra space generated at the : end of the line, that we don't see on UNIX platforms. 
I'm not sure : what is going on here, or how we can make the regression tester stop : choking on it (other than reintroducing my "end of the line whitespace : isn't important stuff" :-). : : Any ideas for anyone? I'd definately like to clear up these two : problems if we could. : : Brad : : _______________________________________________ : Biopython-dev mailing list : Biopython-dev@biopython.org : http://biopython.org/mailman/listinfo/biopython-dev : -- Iddo Friedberg | Tel: +972-2-6758647 Dept. of Molecular Genetics and Biotechnology | Fax: +972-2-6757308 The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il POB 12272, Jerusalem 91120 | Israel | http://bioinfo.md.huji.ac.il/marg/people-home/iddo/ From jchang at SMI.Stanford.EDU Wed Aug 1 11:01:35 2001 From: jchang at SMI.Stanford.EDU (Jeffrey Chang) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] "Features" of Bio.Clustalw In-Reply-To: <15207.52158.379530.574917@taxus.athen1.ga.home.com> References: <15203.3182.97701.271322@taxus.athen1.ga.home.com> <15207.52158.379530.574917@taxus.athen1.ga.home.com> Message-ID: > [Davide Marchignoli] > > I think that having a class like MultipleAlignCL is superior to passing > > the alignment arguments to a function as is for blastpgp or blastall. [Brad Chapman] [...] >blastall and blastpgp are Jeff Chang's functions, so maybe he could >comment on your idea to have classes to encompass their options. I'm >not positive if he even likes the "command line in a class" idea :-). > >> Finally it is a general mechanism and could be used to give a uniform >> interface to functions invoking external programs. >> >> Do you think you would be interested in a patch implementing such >> behaviour? I think one could also retain compatibilty with the current >> interface. Yes, I think that's a good idea, and one that I've used in other modules I've written. 
However, I do still want a low-level interface mapped closely to the program where you pass in variables as parameters to the function. If you have that, it's always possible to build other interfaces on top of it, as you suggest. However, it's harder to go the other way around. >class AbstractApplication: > """Generic interface for running applications from biopython. > > This class shouldn't be called directly; it should be subclassed to > provide an implementation for a specific application. > """ Looks pretty cool. The only thing that might be missing is some way of dealing with the output. That way, you can pass around applications that you can call, and it will return usable objects. But maybe that should be done in a decorator class. Jeff From idoerg at cc.huji.ac.il Wed Aug 1 12:23:20 2001 From: idoerg at cc.huji.ac.il (Iddo Friedberg) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: [BioPython] tests failing In-Reply-To: <3B66BE5E.7020204@herts.ac.uk> Message-ID: Hi, I just read Mark's post a bit more carefully: On Tue, 31 Jul 2001, Mark Robinson wrote: : Hi guys, : [Description of a couple of bugs in the regression tests] : : === : : The two AssertionErrors don't occur if I run the individual test script : only if I run it from the graphical interface, and I guess it looks like : the newline error you flag in the tutorial. Can anybody say why the AssertionErrors do not occur when the individual scripts are run, but only when the graphical interface is used? This sounds a bit weird... I should add my thanks here to all those who are involved in porting & checking Biopython on various platforms. This is very important. Iddo -- Iddo Friedberg | Tel: +972-2-6758647 Dept. 
of Molecular Genetics and Biotechnology | Fax: +972-2-6757308 The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il POB 12272, Jerusalem 91120 | Israel | http://bioinfo.md.huji.ac.il/marg/people-home/iddo/ From chapmanb at arches.uga.edu Wed Aug 1 12:43:41 2001 From: chapmanb at arches.uga.edu (Brad Chapman) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: [BioPython] tests failing In-Reply-To: References: <3B66BE5E.7020204@herts.ac.uk> Message-ID: <15208.12733.310043.228233@taxus.athen1.ga.home.com> Mark: > : The two AssertionErrors don't occur if I run the individual test script > : only if I run it from the graphical interface, and I guess it looks like > : the newline error you flag in the tutorial. Iddo: > Can anybody say why the AssertionErrors do not occur when the individual > scripts are run, but only when the graphical interface is used? This > sounds a bit weird... Sure, if you just run the test script: python test_SubsMat.py the test itself runs fine. But if you add on the comparison of the generated output to the old output, then that's where you'll get the error (an AssertionError in this case, since the regression testing framework just asserts that the lines are the same). If you want to just run the regression testing stuff on a single test, you can do: python run_tests.py test_SubsMat to just do SubsMat. Back-to-work-ly yr's, Brad From idoerg at cc.huji.ac.il Wed Aug 1 12:55:10 2001 From: idoerg at cc.huji.ac.il (Iddo Friedberg) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: [BioPython] tests failing In-Reply-To: <15208.12733.310043.228233@taxus.athen1.ga.home.com> Message-ID: On Wed, 1 Aug 2001, Brad Chapman wrote: : Mark: : > : The two AssertionErrors don't occur if I run the individual test script : > : only if I run it from the graphical interface, and I guess it looks like : > : the newline error you flag in the tutorial. 
: : Iddo: : > Can anybody say why the AssertionErrors do not occur when the individual : > scripts are run, but only when the graphical interface is used? This : > sounds a bit weird... : : Sure, if you just run the test script: : : python test_SubsMat.py : : the test itself runs fine. But if you add on the comparison of the : generated output to the old output, then that's where you'll get the : error (an AssertionError in this case, since the regression testing : framework just asserts that the lines are the same). : Oh, I thought that Mark meant: python run_tests.py test_SubsMat.py (As you show later, which compares with the old output). That explains stuff. Iddo -- Iddo Friedberg | Tel: +972-2-6758647 Dept. of Molecular Genetics and Biotechnology | Fax: +972-2-6757308 The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il POB 12272, Jerusalem 91120 | Israel | http://bioinfo.md.huji.ac.il/marg/people-home/iddo/ From tarjei at genome.wi.mit.edu Wed Aug 1 23:49:33 2001 From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Pathway Module In-Reply-To: <003c01c11b17$477f3000$010a0a0a@cadence.com> Message-ID: > > > Step is separate from reaction, because a reaction could occur in > > > more than one pathway. > > > > I'm not sure I see the rationale for this. It is true that a reaction > > can occur in several pathways, but unless there is information about a > > reaction that only applies to a specific pathway there is no need to > > keep a separate Step object - you can just let two different pathway > > objects reference the same reaction object. > > > > The information that applies to just one pathway is the branching and > sequence, the in links and out links to other steps.. Maybe you can tease > this information out of the products and substrates for each > reaction, but I > thought of using explicit links from one step to the next step(s). 
So there are two issues here: 1) When is there a link between two reactions in a pathway? and 2) How do we represent those links? For 1) my understanding is that a pathway is uniquely defined by the substrates and products of its constituent reactions. That is, there is always a link from A->B to B->A and from C + D -> E to E -> F, and there is never a link from B->A to A->B, or from E -> C + D to A -> B. Because of this I think it is important that a Pathway/System class infer links between reactions automatically. That is, if a user combines two reactions A -> B and B -> C into a pathway, s/he should not have to explicitly define the link between them. For 2) there are several equivalent options. My proposed classes would keep a (kind of) adjacency list/matrix in the System class that explicitly define these links. If I understand your proposal, your idea is to keep an array of Step objects that reference their neighbors internally. These two representations are essentially equivalent and shouldn't make any difference to the end-user, so I don't have a particularly strong opinion about which is the better. Keep the ideas flowing :) Tarjei From jchang at SMI.Stanford.EDU Wed Aug 1 19:57:22 2001 From: jchang at SMI.Stanford.EDU (Jeffrey Chang) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] add dynamic programming alignment modules Message-ID: On the flight home from ISMB, I coded up some modules to do pairwise alignments. I went ahead and put them into the Bio.Align package because they seem most appropriate there -- I hope nobody objects! There are two main modules: pairwise.py and fastpairwise.py. The first one implements a slower, more general alignment algorithm. The second is faster, but requires an affine gap penalty. Right now, they're both implemented in python. However, I broke the code up in such a way so that it won't be hard to swap out a piece of it with fast C code. 
I didn't have time to do this on the flight, but might get to it at a
later date.

Enjoy!

Jeff

From jchang at SMI.Stanford.EDU Wed Aug 1 13:40:16 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Re: [BioPython] tests failing
In-Reply-To:
References:
Message-ID:

>On another matter: got a problem with test_unigene:
>
>idoerg@arrakis:biopython/Tests> python run_tests.py test_unigene.py
>test_unigene ... FAIL
>My machine:
>
>idoerg@arrakis:biopython/Tests> uname -a
>Linux arrakis.md.huji.ac.il 2.2.16-22enterprise #1 SMP Tue Aug 22 16:29:32
>EDT 2000 i686 unknown
>
>My Python:
>
>idoerg@arrakis:biopython/Tests> python
>Python 2.1 (#1, Jul 11 2001, 11:27:29)
>[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-85)] on linux2

Uh, oh. That's bad. Are you sure you have a current CVS? Mine works.
I'm on:

SunOS helio 5.6 Generic_105181-25 sun4u sparc SUNW,Ultra-Enterprise
Python 2.1 (#7, Apr 17 2001, 18:53:25)
[GCC 2.8.1] on sunos5

Cayte, could you look into this?

Jeff

From jchang at SMI.Stanford.EDU Wed Aug 1 13:30:46 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Re: [BioPython] tests failing
In-Reply-To: <15207.53026.138744.787675@taxus.athen1.ga.home.com>
References: <3B66BE5E.7020204@herts.ac.uk> <15207.53026.138744.787675@taxus.athen1.ga.home.com>
Message-ID:

I think this may be a problem in test_prodoc.py rather than the
regression testing framework. This output is generated in a function
called print_references:

def print_references( list ):
    for item in list:
        text = item.number + ' ' + item.authors + ' ' + item.citation
        while text:
            print text[ :80 ]
            text = text[ 80: ]

It prints some text out 80 characters at a time. Perhaps this boundary
is falling on different characters depending on the OS's line breaking
convention. To make things more difficult, the text file itself has
different types of line breaks.

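As an aside on the 80-character boundary: the effect is easy to
reproduce in isolation. The sketch below is Python 3 and uses a
hypothetical helper, chunks, plus made-up sample strings; none of it is
taken from test_prodoc.py. It only shows that a single stray character
before the boundary (here a carriage return, as might survive from DOS
line endings) changes where every later line is cut.

```python
def chunks(text, width=80):
    """Split text into consecutive pieces of at most `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]


# Two copies of the same made-up text; the second carries one stray
# carriage return before the 80-character boundary.
unix_text = "x" * 77 + " J. Mol. Biol."
dos_text = "x" * 77 + "\r J. Mol. Biol."

unix_lines = chunks(unix_text)
dos_lines = chunks(dos_text)

# The first line is cut at a different character in each case.
print(repr(unix_lines[0][-3:]), repr(dos_lines[0][-3:]))
```

Printing one field per line, instead of slicing a joined string at a
fixed width, avoids depending on where such a boundary happens to fall.
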
I've gone through and changed this code to:

def print_references( list ):
    for item in list:
        print item.number
        print item.authors
        print item.citation

and submitted the changes and the new output to CVS. I can't reproduce
the problem here, so could someone with a Windows machine take a look
at it?

Thanks,
Jeff

> > >Failure: test_prodoc
>> >
>> >AssertionError:
>> >Output: 'J. \n'
>> >Expected: 'J. \n'
>>
>> Brad, this looks pretty odd. Is it a newline problem?
>
>This is another one I've seen on Windows and also on Yair's Mac stuff,
>but have to throw my hands up in the air about. What Mark reported
>here is different from what I've seen -- my error looks like:
>
>Output: 'J. \n'
>Expected: 'J.\n'
>
>So, there is, for some unknown reason, an extra space generated at the
>end of the line, that we don't see on UNIX platforms. I'm not sure
>what is going on here, or how we can make the regression tester stop
>choking on it (other than reintroducing my "end of the line whitespace
>isn't important stuff" :-).
>
>Any ideas from anyone? I'd definitely like to clear up these two
>problems if we could.
>
>Brad

From katel at worldpath.net Thu Aug 2 01:52:10 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Pathway Module
References:
Message-ID: <003c01c11b17$477f3000$010a0a0a@cadence.com>

----- Original Message -----

  S1: V = {A,B,C,D,E}
      E = {(A,C,E), (A,D,E), (B,C,E), (B,D,E)}

> > Step is separate from reaction, because a reaction could occur in
> > more than one pathway.
>
> I'm not sure I see the rationale for this. It is true that a reaction
> can occur in several pathways, but unless there is information about a
> reaction that only applies to a specific pathway there is no need to
> keep a separate Step object - you can just let two different pathway
> objects reference the same reaction object.
>

The information that applies to just one pathway is the branching and
sequence, the in links and out links to other steps..
Maybe you can tease this information out of the products and substrates for each reaction, but I thought of using explicit links from one step to the next step(s). Cayte From davide at biodec.com Thu Aug 2 04:29:11 2001 From: davide at biodec.com (Davide Marchignoli) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Abstract application wrapper In-Reply-To: <15207.52158.379530.574917@taxus.athen1.ga.home.com> Message-ID: Hi, On Wed, 1 Aug 2001, Brad Chapman wrote: > Hi Davide; > > As I mentioned above, it is really Jeff's call about whether or not > he'd like to see something like this in blastall() and friends; but I > do think having a general interface would be nice. There was a lot of > talk as BOSC/ISMB conference this year about other programs that it > would nice for biopython to interface to (EMBOSS in particular) so > there is definately interest and a lot of work that could be done > along these lines, if you are interested. > > Also, during one of the talks at the ISMB conference I got inspired > and had an idea for a generic class for running Applications. Based on > what I scrawled on a piece of notebook paper during the talk, I wrote > up something that kind of sketches out the ideas I had and attached it > to this mail. This isn't working code or anything -- just enough to > show the ideas. I'm not really sure if this is good, but I thought you > might be interested in looking at it if you want to work further on > this. Feel free to use it or not use it. > > Thanks again for the patches and interest! > > Brad > I think it is really very nice! In my opinion it is general enough to encapsulate most (if not all) external programs used within biopython. If there is an agreement on the interface I think it should not be a problem to fix the implementation details. 
However, I slightly prefer the lighter version in which you have a class

AbstractApplicationCommandLine (yes, to be shortened) instead of AbstractApplication

where the only difference is that you do not have a run method and also have a __str__ method behaving as construct_commandline (or maybe better, something returning a list of strings?).

In my opinion the advantage of such an architecture is that you do not have a wrapper around the function running the application; rather, your class works side by side with the function running the application. You retain the lowest-level interface given by the function, to which you can pass the os.system string, and also a higher-level interface in which you pass an instance of some class derived from AbstractApplicationCommandLine.

In my opinion, at the moment the interface provided by blastpgp is not completely low level. For instance, you cannot pass to blastpgp a parameter that is not listed in att2param. The blastpgp function already does some kind of parsing.

With this approach you would not repeat work (parameter parsing would be done only at the level of the CommandLine class), you would retain an interface at a lower level than the one you have now, and finally you would have a high-level interface provided by the AbstractApplicationCommandLine class.

One of the nice things it would allow would be the following:

# NON working code, for example purpose only
def blastpgp(commandline):
    args = str(commandline).split()
    ...
    r, w, e = popen2.popen3(args)
    if isinstance(commandline, BlastpgpCommandline) and commandline.streaminput:
        commandline.write_input(w)
    else:
        w.close()

where BlastpgpCommandline implements:

    def set_seqinput(self, seq_record):
        self.input_seq_record = seq_record
        self.streaminput = 1

    def set_streaminput(self, stream):
        self.input_stream = stream
        self.streaminput = 1

    def write_input(self, outstream):
        if self.input_stream:
            outstream.write(self.input_stream.read())
        elif self.input_seq_record:
            SeqIO.Fasta.FastaWriter(outstream).write(self.input_seq_record)
        else:
            raise ValueError

so the user could write something like

args = BlastpgpCommandline(...)
args.set_seqinput(seqrecord)  # passing a SeqRecord as input
align = blastpgp(args)

Let me know what you think about it.

Bye,
Davide Marchignoli

From idoerg at cc.huji.ac.il Thu Aug 2 02:51:22 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Re: [BioPython] tests failing
In-Reply-To:
Message-ID:

Hi,

[Iddo]
: >On another matter: got a problem with test_unigene:
: >
: >idoerg@arrakis:biopython/Tests> python run_tests.py test_unigene.py
: >test_unigene ... FAIL

False alarm. Sorry.

Iddo

On Wed, 1 Aug 2001, Jeffrey Chang wrote:

:
:
: >My machine:
: >
: >idoerg@arrakis:biopython/Tests> uname -a
: >Linux arrakis.md.huji.ac.il 2.2.16-22enterprise #1 SMP Tue Aug 22 16:29:32
: >EDT 2000 i686 unknown
: >
: >My Python:
: >
: >idoerg@arrakis:biopython/Tests> python
: >Python 2.1 (#1, Jul 11 2001, 11:27:29)
: >[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-85)] on linux2
:
:
: Uh, oh. That's bad. Are you sure you have a current CVS? Mine
: works. I'm on:
:
: SunOS helio 5.6 Generic_105181-25 sun4u sparc SUNW,Ultra-Enterprise
:
: Python 2.1 (#7, Apr 17 2001, 18:53:25)
: [GCC 2.8.1] on sunos5
:
:
: Cayte, could you look into this?
:
: Jeff
:

--
Iddo Friedberg | Tel: +972-2-6758647
Dept.
of Molecular Genetics and Biotechnology | Fax: +972-2-6757308 The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il POB 12272, Jerusalem 91120 | Israel | http://bioinfo.md.huji.ac.il/marg/people-home/iddo/ From jefftc at Stanford.EDU Thu Aug 2 20:43:30 2001 From: jefftc at Stanford.EDU (Jeffrey Chang) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] test (apologies) Message-ID: My previous emails to biopython did not go through, so I'm sending this to check if there's a problem with my mail. Sorry about the spam! Jeff From katel at worldpath.net Fri Aug 3 00:00:17 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: [BioPython] tests failing References: Message-ID: <001c01c11bd0$d0c05e20$010a0a0a@cadence.com> > Cayte, could you look into this? > Its failing on a dictionary item. I think I need to sort the items before printing. Andrew fixed this on another file and I was going to put the fix in unigene and kabat when I became bogged down on a Martel issue. Cayte From katel at worldpath.net Fri Aug 3 03:40:26 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: [BioPython] tests failing References: <001c01c11bd0$d0c05e20$010a0a0a@cadence.com> Message-ID: <002301c11bef$935c06a0$010a0a0a@cadence.com> ----- Original Message ----- From: "Cayte" To: "Iddo Friedberg" ; ; "Jeffrey Chang" Sent: Thursday, August 02, 2001 9:00 PM Subject: Re: [Biopython-dev] Re: [BioPython] tests failing > > Cayte, could you look into this? > > > Its failing on a dictionary item. I think I need to sort the items before > printing. Andrew fixed this on another file and I was going to put the fix > in unigene and kabat when I became bogged down on a Martel issue. > I'm confused. Someone( Andrew? ) must have fixed the code. The latest CVS code ( 8/2 ) doesn't have the problem. I just downloaded it and tried it Andrew said he made a fix but that was a while ago. 
The latest unigene code is dated 8/2. !?!?!? The heat must be getting to me. :)!

Cayte

From pewilkinson at informaxinc.com Fri Aug 3 14:34:04 2001
From: pewilkinson at informaxinc.com (Peter Wilkinson)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Pathway module
In-Reply-To: <200108031601.f73G1qq22184@pw600a.bioperl.org>
Message-ID: <001c01c11c4a$e04f8530$7d0210ac@l001696w00>

OK guys,

if you are not aware, look up BIND on the net. This is written by Christopher Hogue's group (yes, the same one who has written a chapter in the Baxevanis text Bioinformatics).

His web site is at the Samuel Lunenfeld institute in Toronto, Ontario, and there is a paper explaining how it works on the site. It is easy to find with www.google.com.

The BIND data model works very well, and it involves interactions that can be between protein:protein, protein:molecule, protein:photon, etc. The pathway can then be built from all these interactions.

I strongly suggest that you have a good look at the BIND model, since, like GenBank, it will become the standard public archive for molecular interactions and pathway data.

Peter

From katel at worldpath.net Sat Aug 4 00:04:40 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Pathway module
References: <001c01c11c4a$e04f8530$7d0210ac@l001696w00>
Message-ID: <000501c11c9a$97c07b80$010a0a0a@cadence.com>

>
> if you are not aware, look up BIND on the net. This is written by
> Christopher Hogue's group (yes, the same one who has written a chapter in the
> Baxevanis text Bioinformatics).
>
>

I found the text and my first impression is that it's too ambitious for what we are doing. We are not supporting kinetics or simulation. Our mandate, as I understand it, is to provide tools that make it easier to work with databases like BIND, to complement not duplicate their functionality. BIND supports detail at the atomic level, e.g. this enzyme tweaks three electrons on the 4th carbon.
We could use a subset, maybe Interaction, Pathway and Action objects. But we would either carry a lot of extra baggage or have lots of empty fields. Even if the fields are empty we would need code to fish through them and pull out the data of interest.

On the positive side, it would be extensible and compatible with a standard format. But I'd be concerned that the format is so rich, you'd lose the forest for the trees.

What do others think?

Cayte
>

From tarjei at genome.wi.mit.edu Fri Aug 3 21:59:38 2001
From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Pathway module
In-Reply-To: <000501c11c9a$97c07b80$010a0a0a@cadence.com>
Message-ID:

> > if you are not aware, look up BIND on the net.
> >
>
> I found the text and my first impression is that it's too ambitious for
> what we are doing. We are not supporting kinetics or simulation. Our
> mandate, as I understand it, is to provide tools that make it
> easier to work with databases like BIND, to complement not duplicate
> their functionality.

I totally agree on this point. What we need is something that is both much more lightweight and more "processed" than the BIND data model. Our "Pathway" should be a data structure that makes it simple to operate on data selectively extracted from BIND or other databases. Basically it's the difference between the Bio.GenBank.Record class and the Bio.Seq class.

On the other hand, BIND appears to have matured and gained some momentum since the last time I heard of it, and there is no doubt that a module for parsing and selecting BIND data would be very useful. It would be something to look at after our WIT/EMP and KEGG modules. (After all, with 6 pathways stored the BIND database is currently much less useful than these two).

On a related note, the EcoCyc ontology paper (Karp P. (2000) Bioinformatics, v16, n3, p269-285) is also worth a look for anyone interested in this topic.
Tarjei From chapmanb at arches.uga.edu Sat Aug 4 10:41:11 2001 From: chapmanb at arches.uga.edu (Brad Chapman) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: [BioPython] tests failing In-Reply-To: References: <3B66BE5E.7020204@herts.ac.uk> <15207.53026.138744.787675@taxus.athen1.ga.home.com> Message-ID: <15212.2439.368738.975068@taxus.athen1.ga.home.com> Jeff: > I think this may be a problem in test_prodoc.py rather than the > regression testing framework. This output is generated in a function > called print_references: [...] > It prints some text out 80 characters at a time. Perhaps this > boundary is falling on different characters depending on the OS' line > breaking convention. To make things more difficult, the text file > itself has different types of line breaks. [...] > I don't have a reproducible here, so could someone with a Windows > machine take a look at it? I just tested it out (after much swearing while attempting to get CVS working on Windows. Grrrrrr...) and it turns out, as I've always suspected, that you are a genius. No more failing for test_prodoc, yippeee! Excellent deduction, Jeff. My current test status is: ==> Windows 98 w/ Python 2.1 test_MultiProc is failing with a complaint about os.fork not existing on Windows. I guess there is not much we can do about this. test_GenBank is failing with a parse error. I'll investigate this further once I manage to get CVS working properly. test_SubsMat is failing with the -0.00 and 0.00 thing. ==> UNIX (my NetBSD machine) w/ Python 2.1 test_SubsMat is failing with the -0.00 and 0.00 thing. test_interpro was failing with: IOError: [Errno 2] No such file or directory: 'InterPro/ipr001064.htm' which seems to be due to the fact the test files are named IPR001064.htm (ie. capital IPR). This probably didn't fail on Windows, but does on my machine. I just checked in a fix for this to test_interpro, so it's taken care of. So that's where I'm at. Thanks again Jeff for the prodoc fix! 
Brad From chapmanb at arches.uga.edu Sat Aug 4 10:51:13 2001 From: chapmanb at arches.uga.edu (Brad Chapman) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Re: Abstract application wrapper In-Reply-To: References: <15207.52158.379530.574917@taxus.athen1.ga.home.com> Message-ID: <15212.3041.23351.55042@taxus.athen1.ga.home.com> Hi Davide, Jeff; [AbstractApplication ideas] > I think it is really very nice! Thanks, I'm glad that I used my time during that conference talk productively :-) > However I slightly prefer the lighter version in which you have a class > > AbstractApplicationCommandLine (yes to be shortened) instead of > AbstractApplication > > where the only difference is that you do not have a run method and have > also a __str__ method behaving as construct_commandline. (or maybe better > something returning a list of strings?) I've been thinking about your points while at work (I've got lots of time to think while grinding up cactus), and I totally agree with you. I like the idea of the class representing a command-line, and so __str__ returns the string representation of that class (I do prefer the actual commandline being returned over a list of string). So I also think it would be better to just have an AbstractCommandLine class that only represents the class, and then have the functions to run the programs separate from the class. [....snip... Lots of good justifications] > Let me know what you think about it. You've convinced me :-). I'd be very happy if you'd like to work on this and get something together. 
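
A minimal sketch of the design discussed above, assuming Python 3 rather than the Python 2.1 of the thread: a class that only represents a command line (with __str__ giving the string form and a second method giving the list form Davide asked about), and a separate function that runs it. The names CommandLine, set_parameter, as_list and run are hypothetical, not an existing Biopython API, and subprocess stands in for popen2, which modern Python no longer ships. The flags in the usage lines are for illustration only.

```python
import subprocess


class CommandLine:
    """Hypothetical sketch: describes a command line, but does not run it."""

    def __init__(self, program):
        self.program = program
        self.parameters = []  # (flag, value) pairs, kept in insertion order

    def set_parameter(self, flag, value=None):
        """Record a flag, optionally with a value."""
        self.parameters.append((flag, value))

    def as_list(self):
        """Return the command as a list of strings (safe for arguments with spaces)."""
        args = [self.program]
        for flag, value in self.parameters:
            args.append(flag)
            if value is not None:
                args.append(str(value))
        return args

    def __str__(self):
        """Return the command as a single string."""
        return " ".join(self.as_list())


def run(commandline):
    """Run a CommandLine; kept outside the class, as suggested in the thread."""
    completed = subprocess.run(
        commandline.as_list(), capture_output=True, text=True, check=True
    )
    return completed.stdout


cmd = CommandLine("blastpgp")
cmd.set_parameter("-d", "nr")
cmd.set_parameter("-i", "query.fasta")
print(cmd)
```

Keeping run() outside the class means a caller can still build the string by hand and pass it to whatever launches the program, which is the low-level interface Davide wanted to retain.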
Having a common way to deal with command-lines would be very nice, and might convince us to get support together for more programs :-) Brad From chapmanb at arches.uga.edu Sat Aug 4 11:02:56 2001 From: chapmanb at arches.uga.edu (Brad Chapman) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] add dynamic programming alignment modules In-Reply-To: References: Message-ID: <15212.3744.940607.190511@taxus.athen1.ga.home.com> Hey Jeff; > On the flight home from ISMB, I coded up some modules to do pairwise > alignments. I went ahead and put them into the Bio.Align package > because they seem most appropriate there -- I hope nobody objects! Sweet! You are the man. And to think, I spent my whole time on the flight nursing a bad headache caused by staying up the entire night before (whoops, forgot to book a hotel room for that last night in Denmark!), and reading Hunter S. Thompson books. Seriously, I'm very happy to have this. I also have some dynamic programming stuff in my HMM module (which I am getting ready for potential submission right now -- working myself through the fun of writing up docs); once I get that ready we can see if there is anything there we can generalize and merge together. Brad From ybenita at mac.com Sat Aug 4 18:57:52 2001 From: ybenita at mac.com (Yair Benita) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Mac stuff Message-ID: Hi Guys, I am glad to be joining this list. I will do my best to contribute to this great project and especially keep you aware of some Mac issues. Almost everything works on the Mac. I have some problems with all WWW modules but I can't put my finger on it yet. Local BLAST is not working on the Mac because OS.py does not have an attribute pipe(). Before I dive into the C code of pipe() and try to make a similar Mac attribute, does any of you have a nice and easy alternative? 
Thanks,
Yair

--
Yair Benita
Pharmaceutical Proteomics
Utrecht University
Netherlands

From katel at worldpath.net Mon Aug 6 02:56:58 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Pathway Module
References:
Message-ID: <000f01c11e44$fe30b960$010a0a0a@cadence.com>

I found another paper that may be interesting, at least, to keep the ideas flowing.

http://www.ebi.ac.uk/research/pfmp/publications/biol_chem_2000/Biol_Chem-MS-revised.html

The approach is based on an entity relationship model. What I liked about this approach is that it represents interaction on any level of granularity, without mixing levels. You can zoom into the level of molecular reactions or zoom out to the level of pathways. This is done with two basic elements, entities and interactions, that can be combined in a variety of ways, subclassed, nested, chained and combined to build representations in a flexible way. A subclass of entity also provides evidence objects so you can see if different techniques converge or assess the certainty of the conclusions offered.

IMHO it's worth an hour or so to read.

Cayte

From tarjei at genome.wi.mit.edu Mon Aug 6 02:06:34 2001
From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Pathway Module
In-Reply-To: <000f01c11e44$fe30b960$010a0a0a@cadence.com>
Message-ID:

> I found another paper that may be interesting, at least, to
> keep the ideas flowing.
>

Thanks for the pointer. It looks like interesting work. I'll take a look as soon as I get a chance.

I played around with some quick and dirty code this weekend to test whether my initial ideas sucked (quick answer: yup ;) )

This might be obvious, but it occurred to me that there are two different "pathway" concepts that are useful in different circumstances:

The first, a System, is a set of reactions that are implicitly connected through their products and substrates.
This is essentially equivalent to a stoichiometric matrix, which is useful for things like flux/mode analysis.

The second, a Pathway, is a set of species/metabolites that are explicitly linked through reactions. This is equivalent to a graph, which is useful for things like route searches, neighbor analysis and so on.

You can convert from a System to a Pathway by specifying which of the products and substrates from the System reactions are to be used as nodes in the Pathway graph. The reverse conversion is trivial.

I think that in our module it might be useful to make a distinction between these two concepts. The reason is that they are each useful for different kinds of analyses, and that databases like KEGG, WIT and BIND seem to contain many more individual reactions - which can be grouped into a System - than are used in their "curated" pathways.

Does this make sense?

Tarjei

From katel at worldpath.net Mon Aug 6 19:32:24 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Pathway Module
References:
Message-ID: <002601c11ed0$0df25260$010a0a0a@cadence.com>

> This might be obvious, but it occurred to me that there are two
> different "pathway" concepts that are useful in different circumstances:
>
> The first, a System, is a set of reactions that are
> implicitly connected through their products and substrates. This is
> essentially equivalent to a stoichiometric matrix, which is useful for
> things like flux/mode analysis.
>
> The second, a Pathway, is a set of species/metabolites that are
> explicitly linked through reactions. This is equivalent to a graph,
> which is useful for things like route searches, neighbor analysis
> and so on.
>
> You can convert from a System to a Pathway by specifying which of
> the products and substrates from the System reactions are to be
> used as nodes in the Pathway graph. The reverse conversion is trivial.
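
The System/Pathway distinction quoted above can be sketched roughly as follows (Python 3; every name here is hypothetical and nothing is taken from an actual module). A System is modelled as a plain dict of reactions with stoichiometric coefficients; one function builds the matrix view, another builds the graph view restricted to a chosen set of node species.

```python
def stoichiometry_matrix(reactions):
    """Return (species, reaction_names, matrix) for a dict of reactions.

    reactions maps a name to (substrates, products), each a dict of
    species -> coefficient. Rows are species, columns are reactions;
    substrates count negative and products positive.
    """
    species = sorted({s for subs, prods in reactions.values()
                      for s in list(subs) + list(prods)})
    names = sorted(reactions)
    matrix = [[0] * len(names) for _ in species]
    for col, name in enumerate(names):
        subs, prods = reactions[name]
        for s, n in subs.items():
            matrix[species.index(s)][col] -= n
        for p, n in prods.items():
            matrix[species.index(p)][col] += n
    return species, names, matrix


def pathway_edges(reactions, nodes):
    """Return graph edges (substrate, product, reaction) among the chosen nodes."""
    edges = set()
    for name, (subs, prods) in reactions.items():
        for s in subs:
            for p in prods:
                if s in nodes and p in nodes:
                    edges.add((s, p, name))
    return edges


# Hypothetical system: R1: A + B -> C + D, R2: C -> E
system = {
    "R1": ({"A": 1, "B": 1}, {"C": 1, "D": 1}),
    "R2": ({"C": 1}, {"E": 1}),
}
species, names, matrix = stoichiometry_matrix(system)
edges = pathway_edges(system, nodes={"A", "C", "E"})
print(species, names, matrix)
print(sorted(edges))
```

Choosing the node set is the step that needs a human decision (for example, leaving out cofactors), which is why the System-to-Pathway direction takes an extra argument.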
> > I think that in our module it might be useful to make a distinction > between these two concepts. The reason being that they are each useful > for different kind of analyses, and that databases like KEGG, WIT > and BIND seem to contain many more individual reactions - which can > be grouped into a System - than are used in their "curated" pathways. > > Does this make sense? > The user is boss. If the separation of modules is the most efficient way to support the typical user scenarios, I think we should go with it. Cayte > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev@biopython.org > http://biopython.org/mailman/listinfo/biopython-dev > > From jchang at SMI.Stanford.EDU Mon Aug 6 17:22:30 2001 From: jchang at SMI.Stanford.EDU (Jeffrey Chang) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Mac stuff In-Reply-To: References: Message-ID: >Local BLAST is not working on the Mac because OS.py does not have an >attribute pipe(). Before I dive into the C code of pipe() and try to make a >similar Mac attribute, does any of you have a nice and easy alternative? blast gets launched with a call to the popen2 module, which seems to be supported only on Unix and Windows. How do you exec a process on a Mac? Does Python have a module to do stuff like this? Jeff From jchang at SMI.Stanford.EDU Mon Aug 6 18:30:09 2001 From: jchang at SMI.Stanford.EDU (Jeffrey Chang) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] add dynamic programming alignment modules In-Reply-To: <15212.3744.940607.190511@taxus.athen1.ga.home.com> References: <15212.3744.940607.190511@taxus.athen1.ga.home.com> Message-ID: At 11:02 AM -0400 8/4/01, Brad Chapman wrote: >Hey Jeff; > >> On the flight home from ISMB, I coded up some modules to do pairwise >> alignments. I went ahead and put them into the Bio.Align package >> because they seem most appropriate there -- I hope nobody objects! > >Sweet! You are the man. 
And to think, I spent my whole time on the
>flight nursing a bad headache caused by staying up the entire night
>before (whoops, forgot to book a hotel room for that last night in
>Denmark!), and reading Hunter S. Thompson books.

Yeah, I don't know why you did that. There were plenty of places you could have stayed!

>Seriously, I'm very happy to have this. I also have some dynamic
>programming stuff in my HMM module (which I am getting ready for
>potential submission right now -- working myself through the fun of
>writing up docs); once I get that ready we can see if there is
>anything there we can generalize and merge together.

Sounds good. Although the boundary conditions are different, I believe the recurrences are the same, so we can share that part. We only have to write it in C once!

Jeff

From julio at hpcf.upr.edu Tue Aug 7 16:23:16 2001
From: julio at hpcf.upr.edu (julio@hpcf.upr.edu)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] Fixed Bug
Message-ID: <200108072023.QAA13527@astraeus.hpcf.upr.edu>

I am attaching the file FASTA.py with some changes that correct a few errors.

The first thing is: the method write_records(records) was missing the self argument. Corrected:

write_records(self, records)

The second thing is: write(self, record) did not support mutable objects. With the following change it supports mutable objects:

data = self.tostring

This line also had an additional error before the fixed version. The statement before was:

data = self.seq

This is not correct because there is no seq attribute in Seq.py.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/octet-stream Size: 2 bytes Desc: not available Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20010807/efdc9d5d/attachment.obj From tarjei at genome.wi.mit.edu Wed Aug 8 14:27:41 2001 From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] Microarray jamboree Message-ID: Hi people, If there is anyone interested in microarray data and all that good stuff you might want to check this out: There is a jamboree planned in Toronto on September 14-19 where people from academia (UC Berkeley, EBI) and industry (Affymetrix, Rosetta, etc.) will gather to implement open source tools to work with the new MAGE-ML (an XML format for microarray data that is set to be the successor of various existing standards, I forget their names) that is being released by the Object Management Group some time soon. The plan is to develop an API for the MAGE object model in several different languages. Currently there are people signed up to work on C/C++, Perl and CORBA implementations - *but no Python*. If any of you biopythoneers are interested in doing a Python implementation there you should sign up on the microarray-format mailing list at www.mged.org, and then notify the organizer (Paul Spellman) ASAP. thanks, Tarjei From katel at worldpath.net Wed Aug 8 20:19:53 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:02 2005 Subject: [Biopython-dev] WIT and KEGG References: Message-ID: <002001c12069$05af1620$010a0a0a@cadence.com> Yesterday, I downloaded your new code for enzymes. I started code, but WIT uses the KEGG format for enzymes. So we may be able to get by with one piece of code for both. In your test files, what is the difference between the irregular and the sample files?. Did you manually strip out the HTML stuff? I'll try to upload more test cases because sometimes I've seen bugs on the tenth case. I hope its cooler where you live than in my area( 88 F ). 
Cayte

From katel at worldpath.net Wed Aug 8 20:35:26 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:02 2005
Subject: [Biopython-dev] WIT and KEGG
References: <002001c12069$05af1620$010a0a0a@cadence.com>
Message-ID: <002601c1206b$303bcee0$010a0a0a@cadence.com>

>
> In your test files, what is the difference between the irregular and the
> sample files? Did you manually strip out the HTML stuff? I'll try to
> upload more test cases because sometimes I've seen bugs on the tenth case.
>

The only difference I can see between the WIT text and KEGG is an HTML tag embedded in the entry line in WIT. The format needs to strip out the angle-bracketed stuff between ENTRY and the EC number.

Cayte

From tarjei at genome.wi.mit.edu Wed Aug 8 17:32:03 2001
From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] WIT and KEGG
In-Reply-To: <002001c12069$05af1620$010a0a0a@cadence.com>
Message-ID:

> Yesterday, I downloaded your new code for enzymes. I started
> code, but WIT uses the KEGG format for enzymes. So we may be
> able to get by with one piece of code for both.

Yeah, I noticed that when I played around with WIT the other day. I suspect that they're not only using the same format, but that the enzyme records there are in fact the same as those in KEGG (or maybe it's the other way around, I don't know). - I haven't verified this though.

It makes sense to not duplicate the code, so we can either move the shared parts into a module by itself, or you can just import my KEGG code in your modules.

> In your test files, what is the difference between the irregular and the
> sample files?

The KEGG distribution comes with a text file describing the record format. The .irregular files contain records distributed by KEGG that do not conform to their description.

> Did you manually strip out the HTML stuff?

There was no HTML. You can download all the enzyme records in KEGG in one big flatfile with no markup.
If you want to pull down records directly from a web page you can just strip the tags off in a simple preprocessing step. There might even be a standard library call for that. > I'll try to > upload more test cases because sometimes I've seen bugs > on the tenth case. That would be great. > I hope its cooler where you live than in my area( 88 F ). A little, it's about 80 at Logan now. They've warned us that we might get up into the 90ies before the weekend though. We'll just have to make sure the good old heat-shock proteins are working :) Tarjei From katel at worldpath.net Thu Aug 9 00:26:17 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] Rebase Message-ID: <003201c1208b$7055d1e0$010a0a0a@cadence.com> I just changed the print routines to sort keys before printing and renamed the top routine to __str__ to make it consistent with python style. Cayte From katel at worldpath.net Thu Aug 9 01:50:18 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] Rebase References: <003201c1208b$7055d1e0$010a0a0a@cadence.com> Message-ID: <006901c12097$2cd9b420$010a0a0a@cadence.com> I just uploaded some test files to biopython/tests/WIT, both the text and htm versions. They should work for KEGG except for the embedded html tag on the ENTRY line. Cayte From katel at worldpath.net Fri Aug 10 01:22:30 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] RecordFile Message-ID: <002001c1215c$7565d3c0$010a0a0a@cadence.com> I updated a fix to RecordFile, to check for an end of file condition I had previously missed. I spliced it into my local version of test_KEGG and it passed. The next step is to see if it can strip out gibberish between records Unless the gibberish contains a start tag . I don't know how to make it absolutely bulletproof, but hopefully I can make it useful. 
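
As a rough illustration of the record-filtering idea in the RecordFile message above (Python 3; this is not RecordFile's actual code, and iter_records is a made-up name): lines are kept only between a start tag and an end tag, and anything between records is dropped. The defaults assume KEGG-style flat-file records that open with ENTRY and close with ///.

```python
def iter_records(lines, start_tag="ENTRY", end_tag="///"):
    """Yield each record as a list of lines, skipping junk between records."""
    record = None
    for line in lines:
        if record is None:
            # Outside a record: ignore everything until a start tag appears.
            if line.startswith(start_tag):
                record = [line]
        else:
            record.append(line)
            if line.startswith(end_tag):
                yield record
                record = None
    # A record left unterminated at end of file is silently dropped here.


sample = [
    "stray text before the first record\n",
    "ENTRY       EC 1.1.1.1\n",
    "NAME        Alcohol dehydrogenase\n",
    "///\n",
    "\n",
    "more gibberish\n",
    "ENTRY       EC 1.1.1.2\n",
    "///\n",
]
records = list(iter_records(sample))
print(len(records))
```

The weakness is the one already noted in the message: junk that happens to begin with the start tag is indistinguishable from a real record, so this can be useful without being bulletproof.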
Cayte

From katel at worldpath.net Fri Aug 10 20:58:48 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] WIT and KEGG
References:
Message-ID: <003b01c12200$c8ff3f40$010a0a0a@cadence.com>

I made these changes to a copy of KEGG/enzyme_format.py,

html_tag = Expression.Literal( '<' ) + Rep( AnyBut( '>\n\r' ) ) + Expression.Literal( '>' )

entry = Group("entry",
              Str1("EC ") +
              Rep( Str( " " ) ) + Opt( html_tag ) +
              Rep(Rep1(Integer()) + point) +
              Rep1(Integer()) +
              Rep( Str( " " ) ) + Opt( html_tag ) )

The format failed halfway through the file. I think the problem is the order of entries. The format specifies GENES before MOTIF but this order is reversed in the test file. Maybe the format should be less sensitive to order, where it doesn't convey information.

Cayte

From tarjei at genome.wi.mit.edu Sat Aug 11 00:35:04 2001
From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] WIT and KEGG
In-Reply-To: <003b01c12200$c8ff3f40$010a0a0a@cadence.com>
Message-ID:

> I made these changes to a copy of KEGG/enzyme_format.py,
>
> html_tag = Expression.Literal( '<' ) + Rep( AnyBut( '>\n\r' ) ) +
> Expression.Literal( '>' )
>
> entry = Group("entry",
> Str1("EC ") +
> Rep( Str( " " ) ) + Opt( html_tag ) +
> Rep(Rep1(Integer()) + point) +
> Rep1(Integer()) +
> Rep( Str( " " ) ) + Opt( html_tag ) )

I'm not too fond of adding this to the format file. HTML markup isn't part of the KEGG format description, so this seems a bit ad hoc.

Instead I suggest that you either run the input through File.SGMLHandle or File.SGMLStripper before you pass the WIT record to KEGG.Enzyme.Parser OR write a separate Parser class in your WIT module that wraps a ParserSupport.SGMLStrippingConsumer around KEGG.Enzyme._Consumer.

> The format failed halfway through the file. I think the problem is the
> order of entries. The format specifies GENES before MOTIF but
> this order is
> reversed in the test file.
Maybe the format should be less sensitive to > order ,where it doesn't convey information. Yeah, the entries are supposed to come in a specified order, but even the KEGG people don't follow that rule. I've committed a change to KEGG.Enzyme.enzyme_format.py that assumes very little about entry ordering. If that's the error, it should work for you now. Tarjei From katel at worldpath.net Sun Aug 12 01:52:22 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] WIT and KEGG References: Message-ID: <001001c122f2$f602e760$010a0a0a@cadence.com> ----- Original Message ----- From: "Tarjei S Mikkelsen" > I'm not too fond of adding this to the format file. HTML markup isn't > part of the KEGG format description, so this seems a bit ad hoc. > > Instead I suggest that you either run the input through > File.SGMLHandle or File.SGMLStripper before you pass the > WIT record to KEGG.Enzyme.Parser OR write a separate Parser > class in your WIT module that wraps a ParserSupport.SGMLStrippingConsumer > around KEGG.Enzyme._Consumer. > The problem is I'm experimenting with a filter to strip out junk ( not necessarily html ) between records. The motivation is that I've had Martel fail on just an extraneous line feed. Somehow the idea of chaining two filters together trips a watch for bugs alarm in my mind. > > The format failed halfway through the file. I think the problem is the > > order of entries. The format specifies GENES before MOTIF but > > this order is > > reversed in the test file. Maybe the format should be less sensitive to > > order ,where it doesn't convey information. > > Yeah, the entries are supposed to come in a specified order, but even > the KEGG people don't follow that rule. I've committed a change to > KEGG.Enzyme.enzyme_format.py that assumes very little about entry > ordering. If that's the error, it should work for you now. 
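
The preprocessing alternative suggested above (remove markup before the record ever reaches the KEGG parser) might look roughly like this in Python 3. This is a stand-alone regular-expression sketch, not the File.SGMLStripper code; the pattern mirrors the html_tag expression from the thread ('<', anything except '>' or a line break, '>'), and the sample ENTRY line is invented.

```python
import re

# '<', then any run of characters other than '>' or a line break, then '>'.
TAG = re.compile(r"<[^>\n\r]*>")


def strip_tags(line):
    """Return the line with every angle-bracketed tag removed."""
    return TAG.sub("", line)


# Hypothetical WIT-style line with markup around the EC number.
wit_line = 'ENTRY       <a href="example">EC 1.1.1.1</a>\n'
clean_line = strip_tags(wit_line)
print(repr(clean_line))
```

This keeps the format description free of HTML, at the cost of also deleting any legitimate text that happens to sit between a '<' and a '>' on the same line, and it does nothing about character entities such as &amp;.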
> Now its stopping on files with db links like this example: PIR: B49338 B49935 E64239 KIECAA These are quibbles but the computer doesn't understand quibbles:). Cayte > Tarjei > > From biopython-bugs at bioperl.org Mon Aug 13 21:57:24 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] Notification: incoming/39 Message-ID: <200108140157.f7E1vOq28569@pw600a.bioperl.org> JitterBug notification new message incoming/39 Message summary for PR#39 From: cirano@chollian.net Subject: Parsing Problem of GenBank format Date: Mon, 13 Aug 2001 21:57:23 -0400 0 replies 0 followups ====> ORIGINAL MESSAGE FOLLOWS <==== >From cirano@chollian.net Mon Aug 13 21:57:24 2001 Received: from localhost (localhost [127.0.0.1]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f7E1vIq28563 for ; Mon, 13 Aug 2001 21:57:23 -0400 Date: Mon, 13 Aug 2001 21:57:23 -0400 Message-Id: <200108140157.f7E1vIq28563@pw600a.bioperl.org> From: cirano@chollian.net To: biopython-bugs@bioperl.org Subject: Parsing Problem of GenBank format Full_Name: Chang Gyeom, Kim Module: Bio/File.py/saveline module Version: Biopython1.00a2 OS: Redhat7.1 Submission from: (NULL) (203.248.117.3) My Source code: from Bio import GenBank search_term = "Lupine leghemoglobin" gi_list = GenBank.search_for(search_term) ncbi_dict = GenBank.NCBIDictionary() gb_seqrecord = ncbi_dict[ gi_list[0] ] print gb_seqrecord When I run this code, I lost first 5 lines of GenBank Record. I think this problem is caused by the function of "saveline" located in Bio/File.py module So I revised the code like this: def saveline(self, line): if line: handle_contents = self.read() self._saved = line + handle_contents self._handle = StringIO.StringIO(self._saved) Although I fixed my problem, I'm not sure this is the right way. 
From biopython-bugs at bioperl.org Tue Aug 14 02:07:19 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] Notification: incoming/32 Message-ID: <200108140607.f7E67Jq29518@pw600a.bioperl.org> JitterBug notification jchang changed notes Message summary for PR#32 From: Jeffrey Chang Subject: Re: [Biopython-dev] Notification: incoming/31 Date: Wed, 16 May 2001 11:58:00 -0700 0 replies 0 followups Notes: duplicate of Bug #31. How did this get split? ====> ORIGINAL MESSAGE FOLLOWS <==== >From jchang@SMI.Stanford.EDU Wed May 16 13:53:23 2001 Received: from crg-gw.Stanford.EDU (root@crg-gw.Stanford.EDU [171.65.32.201]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f4GHrJb11642 for ; Wed, 16 May 2001 13:53:23 -0400 Received: from [171.65.33.127] (chang-smi.Stanford.EDU [171.65.33.127]) by crg-gw.Stanford.EDU (8.9.1a/8.9.1) with ESMTP id LAA23878; Wed, 16 May 2001 11:58:23 -0700 (PDT) User-Agent: Microsoft-Outlook-Express-Macintosh-Edition/5.02.2022 Date: Wed, 16 May 2001 11:58:00 -0700 Subject: Re: [Biopython-dev] Notification: incoming/31 From: Jeffrey Chang To: CC: Message-ID: In-Reply-To: <200105160814.f4G8EZb32193@pw600a.bioperl.org> Mime-version: 1.0 Content-type: text/plain; charset="US-ASCII" Content-transfer-encoding: 7bit Content-Transfer-Encoding: 7bit Hi Huang, Could you send the file that's generating the output? We have regression tests that check for behavior for "No hits found", and it does not generate any error message, as designed. helio:~/remotecvs/biopython/Tests/Blast> python Python 2.1 (#7, Apr 17 2001, 18:53:25) [GCC 2.8.1] on sunos5 Type "copyright", "credits" or "license" for more information. 
>>> from Bio.Blast import NCBIStandalone
>>> rec = NCBIStandalone.BlastParser().parse_file('bt002')
>>> print rec.alignments
[]
>>>

Thanks,
Jeff

> From: biopython-bugs@bioperl.org
> Date: Wed, 16 May 2001 04:14:35 -0400
> To: biopython-dev@biopython.org
> Subject: [Biopython-dev] Notification: incoming/31
>
> JitterBug notification
>
> new message incoming/31
>
> Message summary for PR#31
> From: hy263book@263.net
> Subject: When I encounter "No hits found"
> Date: Wed, 16 May 2001 04:14:35 -0400
> 0 replies 0 followups
>
> ====> ORIGINAL MESSAGE FOLLOWS <====
>
>> From hy263book@263.net Wed May 16 04:14:35 2001
> Received: from localhost (localhost [127.0.0.1])
> by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f4G8EYb32187
> for ; Wed, 16 May 2001 04:14:35 -0400
> Date: Wed, 16 May 2001 04:14:35 -0400
> Message-Id: <200105160814.f4G8EYb32187@pw600a.bioperl.org>
> From: hy263book@263.net
> To: biopython-bugs@bioperl.org
> Subject: When I encounter "No hits found"
>
> Full_Name: Huang Ying
> Module: Bio.Blast.NCBIStandalond
> Version:
> OS: Win2k
> Submission from: (NULL) (166.111.30.26)
>
>
> I use Bio.Blast.NCBIStandalone.BlastParser to analysis Blast report.When blast
> result is "No hits found",python send the wrong message
>
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev@biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
>

From biopython-bugs at bioperl.org Tue Aug 14 02:07:20 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Notification: incoming/32
Message-ID: <200108140607.f7E67Kq29522@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#32 from incoming to fixed-bugs

Message summary for PR#32
From: Jeffrey Chang
Subject: Re: [Biopython-dev] Notification: incoming/31
Date: Wed, 16 May 2001 11:58:00 -0700
0 replies 0 followups
Notes: duplicate of Bug #31. How did this get split?

From biopython-bugs at bioperl.org Tue Aug 14 02:10:10 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Notification: incoming/39
Message-ID: <200108140610.f7E6AAq29623@pw600a.bioperl.org>

JitterBug notification

jchang changed notes

Message summary for PR#39
From: cirano@chollian.net
Subject: Parsing Problem of GenBank format
Date: Mon, 13 Aug 2001 21:57:23 -0400
0 replies 0 followups
Notes: Thanks for the bug report.
Andrew Dalke noted this earlier and submitted the following fix for UndoHandle.read:

    def read(self, size=-1):
        if size == -1:
            saved = string.join(self._saved, "")
            self._saved[:] = []
        else:

It's checked into CVS and will go out with the next release. Actually, enough people are getting tripped up by it that that should happen sooner rather than later.
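To see the effect of that fix in isolation, here is a minimal sketch of a push-back file handle. It is only an illustration: the class is a simplified stand-in written for Python 3, not the actual Bio.File.UndoHandle source. The `size == -1` branch follows the snippet in the notes; the partial-read loop in the `else` branch and the `saveline`/`readline` bodies are assumptions about how the rest might look.

```python
import io


class UndoHandle:
    """Hypothetical, simplified stand-in for a handle with push-back."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []  # lines pushed back by saveline(), oldest first

    def saveline(self, line):
        # Push a line back so that the next read returns it first.
        if line:
            self._saved.insert(0, line)

    def readline(self):
        # Serve pushed-back lines before touching the underlying handle.
        if self._saved:
            return self._saved.pop(0)
        return self._handle.readline()

    def read(self, size=-1):
        if size == -1:
            # "Read everything": emit all saved lines, then the rest.
            saved = "".join(self._saved)
            self._saved[:] = []
        else:
            # Partial read (assumed logic): drain saved lines up to size.
            saved = ""
            while size > 0 and self._saved:
                if len(self._saved[0]) <= size:
                    size -= len(self._saved[0])
                    saved += self._saved.pop(0)
                else:
                    saved += self._saved[0][:size]
                    self._saved[0] = self._saved[0][size:]
                    size = 0
        return saved + self._handle.read(size)


# A parser peeks at the first line, pushes it back, then reads the record.
handle = UndoHandle(io.StringIO("LOCUS\nDEFINITION\nORIGIN\n"))
first = handle.readline()
handle.saveline(first)
record = handle.read()
```

With the special case for size == -1 in place, `record` still begins with the pushed-back LOCUS line. Without it, the `while size > 0` loop never runs for the default argument, so any saved lines are silently skipped, which matches the "lost the first 5 lines" symptom in the report.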
From biopython-bugs at bioperl.org Tue Aug 14 02:10:10 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Notification: incoming/39
Message-ID: <200108140610.f7E6AAq29627@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#39 from incoming to fixed-bugs

Message summary for PR#39
From: cirano@chollian.net
Subject: Parsing Problem of GenBank format
Date: Mon, 13 Aug 2001 21:57:23 -0400
0 replies 0 followups
From biopython-bugs at bioperl.org Tue Aug 14 02:11:32 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] Notification: incoming/35 Message-ID: <200108140611.f7E6BWq29758@pw600a.bioperl.org> JitterBug notification jchang changed notes Message summary for PR#35 From: tarjei@mit.edu Subject: NCBIStandalone.BlastParser bug Date: Tue, 19 Jun 2001 10:57:42 -0400 0 replies 0 followups Notes: format change, got fixed and released in biopython 1.0a2 -Jeff ====> ORIGINAL MESSAGE FOLLOWS <==== >From tarjei@mit.edu Tue Jun 19 10:57:42 2001 Received: from localhost (localhost [127.0.0.1]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f5JEvg826272 for ; Tue, 19 Jun 2001 10:57:42 -0400 Date: Tue, 19 Jun 2001 10:57:42 -0400 Message-Id: <200106191457.f5JEvg826272@pw600a.bioperl.org> From: tarjei@mit.edu To: biopython-bugs@bioperl.org Subject: NCBIStandalone.BlastParser bug Full_Name: Tarjei Mikkelsen Module: Bio.Blast.NCBIStandalone.BlastParser Version: 1.00a OS: Dec/Alpha OSF1 Submission from: incognito.mit.edu (18.246.0.239) The standalone BLAST record parser (Bio.Blast.NCBISTandalone.BlastParser) fails with a SyntaxError when the (path)name of the database spans more than one line. The following code stub/BLAST output will reproduce the bug: (Even though this example is from BLAST 2.0.5 the same thing happens in newer versions) <<<<>>>> from Bio.Blast import NCBIStandalone blast_out = open("blast_parser_bug.out", "r") blast_parser = NCBIStandalone.BlastParser() blast_record = blast_parser.parse(blast_out) <<<<>>>> <<<<>>>> BLASTP 2.0.5 [May-5-1998] Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402. 
Query= eco:b1416 (83 letters) Database: /home/strontium/tarjei/pathway/src/Bio/Pathway/data/2.7.1.11 .fa 39 sequences; 18,779 total letters Searching......................................done Score E Sequences producing significant alignments: (bits) Value spy:SPy1283 20 0.64 lla:L0002 20 0.84 >spy:SPy1283 Length = 337 Score = 20.4 bits (41), Expect = 0.64 Identities = 10/26 (38%), Positives = 17/26 (64%), Gaps = 1/26 (3%) Query: 21 GYTDEEIVSSDIIG-SHFGSVFDATQ 45 G +EE+V S I+G + G++F T+ Sbjct: 287 GIHNEELVESPILGTAEEGALFSLTE 312 >lla:L0002 Length = 340 Score = 20.0 bits (40), Expect = 0.84 Identities = 10/25 (40%), Positives = 16/25 (64%), Gaps = 1/25 (4%) Query: 21 GYTDEEIVSSDIIG-SHFGSVFDAT 44 G +EE+V S I+G + G++F T Sbjct: 286 GIRNEELVESPILGTAEEGALFSLT 310 Score = 18.8 bits (37), Expect = 1.9 Identities = 9/29 (31%), Positives = 17/29 (58%), Gaps = 1/29 (3%) Query: 28 VSSDIIGSHFGSVFD-ATQTEITAVGDLQ 55 + +DI+G+ F FD A T + A+ ++ Sbjct: 126 IDNDIVGTDFTIGFDTAVSTVVDALDKIR 154 Database: /home/strontium/tarjei/pathway/src/Bio/Pathway/data/2.7.1. 
11.fa
Posted date: Jun 18, 2001 1:19 PM
Number of letters in database: 18,779
Number of sequences in database: 39

Lambda K H
0.313 0.129 0.352

Gapped
Lambda K H
0.270 0.0470 0.230

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 2788
Number of Sequences: 39
Number of extensions: 119
Number of successful extensions: 3
Number of sequences better than 10: 2
Number of HSP's better than 10.0 without gapping: 2
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 0
Number of HSP's gapped (non-prelim): 3
length of query: 83
length of database: 18779
effective HSP length: 33
effective length of query: 50
effective length of database: 17492
effective search space: 874600
T: 11
A: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.8 bits)
X3: 64 (24.9 bits)
S1: 34 (18.3 bits)
S2: 31 (16.5 bits)
<<<<>>>>

From biopython-bugs at bioperl.org Tue Aug 14 02:11:32 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Notification: incoming/35
Message-ID: <200108140611.f7E6BWq29762@pw600a.bioperl.org>

JitterBug notification

jchang moved PR#35 from incoming to fixed-bugs

Message summary for PR#35
From: tarjei@mit.edu
Subject: NCBIStandalone.BlastParser bug
Date: Tue, 19 Jun 2001 10:57:42 -0400
0 replies 0 followups
Notes: format change, got fixed and released in biopython 1.0a2 -Jeff

From biopython-bugs at bioperl.org Tue Aug 14 16:44:35 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Notification: incoming/40
Message-ID: <200108142044.f7EKiZq02776@pw600a.bioperl.org>

JitterBug notification

new message incoming/40

Message summary for PR#40
From: joungjh@AptusGenomics.com
Subject: retrieving GenBank records from NCBI
Date: Tue, 14 Aug 2001 16:44:34 -0400
0 replies 0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From joungjh@AptusGenomics.com Tue Aug 14 16:44:35 2001
Received: from localhost (localhost [127.0.0.1])
by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f7EKiYq02770
for ; Tue, 14 Aug 2001 16:44:34 -0400
Date: Tue, 14 Aug 2001 16:44:34 -0400
Message-Id: <200108142044.f7EKiYq02770@pw600a.bioperl.org>
From: joungjh@AptusGenomics.com
To: biopython-bugs@bioperl.org
Subject: retrieving GenBank records from NCBI

Full_Name: J. Joung
Module: GenBank
Version: biopython-1.00a2
OS: UNIX
Submission from: gw-aptusgen1.cust.fast.net (209.92.248.166)

I'm using GenBank NCBIDictionary to retrieve a GenBank record.
The retrived record is missing the following information: LOCUS, DEFINITION, ACCESSION, VERSION, and KEYWORDS. Is there a way of obtaining the GenBank id from a known locuslink id in biopython? From katel at worldpath.net Tue Aug 14 21:43:32 2001 From: katel at worldpath.net (Cayte) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] WIT and KEGG References: <001001c122f2$f602e760$010a0a0a@cadence.com> Message-ID: <005801c1252b$b3274040$010a0a0a@cadence.com> ----- Original Message ----- From: "Cayte" To: Cc: Sent: Saturday, August 11, 2001 10:52 PM Subject: Re: [Biopython-dev] WIT and KEGG > > ----- Original Message ----- > From: "Tarjei S Mikkelsen" > > I'm not too fond of adding this to the format file. HTML markup isn't > > part of the KEGG format description, so this seems a bit ad hoc. > > > > Instead I suggest that you either run the input through > > File.SGMLHandle or File.SGMLStripper before you pass the > > WIT record to KEGG.Enzyme.Parser OR write a separate Parser > > class in your WIT module that wraps a ParserSupport.SGMLStrippingConsumer > > around KEGG.Enzyme._Consumer. > > > The problem is I'm experimenting with a filter to strip out junk ( not > necessarily html ) between records. > The motivation is that I've had Martel fail on just an extraneous line feed. > Somehow the idea of chaining two filters together trips a watch for bugs > alarm in my mind. > > > > The format failed halfway through the file. I think the problem is > the > > > order of entries. The format specifies GENES before MOTIF but > > > this order is > > > reversed in the test file. Maybe the format should be less sensitive to > > > order ,where it doesn't convey information. > > > > Yeah, the entries are supposed to come in a specified order, but even > > the KEGG people don't follow that rule. I've committed a change to > > KEGG.Enzyme.enzyme_format.py that assumes very little about entry > > ordering. If that's the error, it should work for you now. 
> > > > Now its stopping on files with db links like this example: > > PIR: B49338 B49935 E64239 KIECAA > > These are quibbles but the computer doesn't understand quibbles:). > > Cayte > > Tarjei > > > > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev@biopython.org > http://biopython.org/mailman/listinfo/biopython-dev > > From tarjei at genome.wi.mit.edu Tue Aug 14 19:15:16 2001 From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] WIT and KEGG In-Reply-To: <005801c1252b$b3274040$010a0a0a@cadence.com> Message-ID: > > > Instead I suggest that you either run the input through > > > File.SGMLHandle or File.SGMLStripper before you pass the > > > WIT record to KEGG.Enzyme.Parser OR write a separate Parser > > > class in your WIT module that wraps a > ParserSupport.SGMLStrippingConsumer > > > around KEGG.Enzyme._Consumer. > > > > > The problem is I'm experimenting with a filter to strip out junk ( not > > necessarily html ) between records. > > The motivation is that I've had Martel fail on just an extraneous line > feed. > > Somehow the idea of chaining two filters together trips a watch for bugs > > alarm in my mind. Sure, for experimentation that's fine, but I'd prefer to keep it the way it is in the distribution version. Especially because the HTML versions of these records are full of other markup _in_ the record that has to be cleaned out anyway - and adding regexps for all of those would be a mess. > > > > The format failed halfway through the file. I think the > problem is > > the > > > > order of entries. The format specifies GENES before MOTIF but > > > > this order is > > > > reversed in the test file. Maybe the format should be less > sensitive > to > > > > order ,where it doesn't convey information. > > > > > > Yeah, the entries are supposed to come in a specified order, but even > > > the KEGG people don't follow that rule. 
I've committed a change to
> > > KEGG.Enzyme.enzyme_format.py that assumes very little about entry
> > > ordering. If that's the error, it should work for you now.
> > >
> >
> > Now its stopping on files with db links like this example:
> >
> > PIR: B49338 B49935 E64239 KIECAA
> >
> > These are quibbles but the computer doesn't understand quibbles:).

Yeah, I missed this case because it doesn't appear in KEGG. I've
committed another change which appears to deal well with it.

Btw, I'm going away for a couple of weeks, so I won't be very
responsive during that time. But I'm planning to bring my laptop
to do some more experiments with reaction/pathway classes.

take care,
Tarjei

From katel at worldpath.net Wed Aug 15 02:14:04 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] METATOOL
Message-ID: <001b01c12551$7d784fe0$010a0a0a@cadence.com>

The WIT files work fine with the KEGG parser now. In the next couple of
weeks, I plan to look into METATOOL, and maybe start a Martel parser for its
output. Pathway researchers use it a lot, much as genomic researchers use BLAST.
The output of METATOOL is flat - no html tags.
Cayte From biopython-bugs at bioperl.org Wed Aug 15 01:45:12 2001 From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org) Date: Sat Mar 5 14:43:03 2005 Subject: [Biopython-dev] Notification: incoming/41 Message-ID: <200108150545.f7F5jCq05973@pw600a.bioperl.org> JitterBug notification new message incoming/41 Message summary for PR#41 From: Jeffrey Chang Subject: Re: [Biopython-dev] Notification: incoming/40 Date: Tue, 14 Aug 2001 22:46:45 -0700 0 replies 0 followups ====> ORIGINAL MESSAGE FOLLOWS <==== >From jchang@SMI.Stanford.EDU Wed Aug 15 01:45:11 2001 Received: from crg-gw.Stanford.EDU (root@crg-gw.Stanford.EDU [171.65.32.201]) by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f7F5jAq05966 for ; Wed, 15 Aug 2001 01:45:11 -0400 Received: from [192.168.0.4] (c1128134-a.stcla1.sfba.home.com [24.176.209.55]) by crg-gw.Stanford.EDU (8.11.5/8.11.5) with ESMTP id f7F5jDU24945; Tue, 14 Aug 2001 22:45:13 -0700 (PDT) Mime-Version: 1.0 X-Sender: jchang@smi.stanford.edu (Unverified) Message-Id: In-Reply-To: <200108142044.f7EKiZq02776@pw600a.bioperl.org> References: <200108142044.f7EKiZq02776@pw600a.bioperl.org> Date: Tue, 14 Aug 2001 22:46:45 -0700 To: biopython-bugs@bioperl.org, biopython-dev@biopython.org, joungjh@aptusgenomics.com From: Jeffrey Chang Subject: Re: [Biopython-dev] Notification: incoming/40 Content-Type: text/plain; charset="us-ascii" ; format="flowed" At 4:44 PM -0400 8/14/01, biopython-bugs@bioperl.org wrote: >Full_Name: J. Joung >I'm using GenBank NCBIDictionary to retrieve a GenBank record. The retrived >record is missing the following information: LOCUS, DEFINITION, ACCESSION, >VERSION, and KEYWORDS. Is this information that's in the Genbank record? It should be returning whatever NCBI returns, or raising an exception. Dropping information would be odd. Do you have a reproducible? What is the accession you're using? >Is there a way of obtaining the GenBank id from a known locuslink id in >biopython? 
No, we don't have any locuslink functionality at the moment.

Jeff

From jchang at SMI.Stanford.EDU Wed Aug 15 01:50:20 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] WIT and KEGG
In-Reply-To: <001001c122f2$f602e760$010a0a0a@cadence.com>
References: <001001c122f2$f602e760$010a0a0a@cadence.com>
Message-ID:

At 10:52 PM -0700 8/11/01, Cayte wrote:
>From: "Tarjei S Mikkelsen"
> > Instead I suggest that you either run the input through
>> File.SGMLHandle or File.SGMLStripper before you pass the
>> WIT record to KEGG.Enzyme.Parser OR write a separate Parser
>> class in your WIT module that wraps a ParserSupport.SGMLStrippingConsumer
>> around KEGG.Enzyme._Consumer.
>>
> The problem is I'm experimenting with a filter to strip out junk ( not
>necessarily html ) between records.
>The motivation is that I've had Martel fail on just an extraneous line feed.
>Somehow the idea of chaining two filters together trips a watch for bugs
>alarm in my mind.
I agree with Tarjei that these should be separated out, if possible.
Yes, there's a possibility of bugs when chaining filters together,
but having two entities developed and debugged separately should yield
fewer bugs (and easier maintenance) than a system where all the
functionality is munged together.

Jeff

From biopython-bugs at bioperl.org Wed Aug 15 08:22:26 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Notification: incoming/42
Message-ID: <200108151222.f7FCMQq08880@pw600a.bioperl.org>

JitterBug notification

new message incoming/42

Message summary for PR#42
    From: joungjh@email.com
    Subject: Re: [Biopython-dev] Notification: incoming/40
    Date: Wed, 15 Aug 2001 08:22:26 -0400
    0 replies 0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From joungjh@email.com Wed Aug 15 08:22:26 2001
Received: from localhost (localhost [127.0.0.1])
    by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f7FCMPq08874
    for ; Wed, 15 Aug 2001 08:22:26 -0400
Date: Wed, 15 Aug 2001 08:22:26 -0400
Message-Id: <200108151222.f7FCMPq08874@pw600a.bioperl.org>
From: joungjh@email.com
To: biopython-bugs@bioperl.org
Subject: Re: [Biopython-dev] Notification: incoming/40

Full_Name:
Module:
Version:
OS:
Submission from: gw-aptusgen1.cust.fast.net (209.92.248.166)

>>I'm using GenBank NCBIDictionary to retrieve a GenBank record. The retrieved
>>record is missing the following information: LOCUS, DEFINITION, ACCESSION,
>>VERSION, and KEYWORDS.

>Is this information that's in the GenBank record? It should be
>returning whatever NCBI returns, or raising an exception. Dropping
>information would be odd. Do you have a reproducible case? What is the
>accession you're using?

Yes, the LOCUS, DEFINITION, ACCESSION, VERSION, and KEYWORDS information
is in the GenBank record. Any GenBank id would drop this information on
UNIX. You can try GenBank id '15145772'.

I have installed the biopython-1.00a1 Windows version on my PC and this
seems to return all information correctly.

Thank you for your quick response.

From chapmanb at arches.uga.edu Wed Aug 15 08:55:37 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Notification: incoming/42
In-Reply-To: <200108151222.f7FCMQq08880@pw600a.bioperl.org>
Message-ID:

Hey all;

Bug report from J. Joung:
> >>I'm using GenBank NCBIDictionary to retrieve a GenBank record. The retrieved
> >>record is missing the following information: LOCUS, DEFINITION, ACCESSION,
> >>VERSION, and KEYWORDS.

Jeff:
> >Is this information that's in the GenBank record? It should be
> >returning whatever NCBI returns, or raising an exception. Dropping
> >information would be odd. Do you have a reproducible case? What is the
> >accession you're using?

I think this is the infamous "lose the first 5 lines of the file" bug that
popped up in biopython-1.00a2 (which would also explain why 1.00a1 works
just fine). This has been fixed in the current CVS, so the next release
should be bug free (well, at least in regards to this bug :-).

The solution for now is to fix Bio/File.py. I'm not exactly sure how this
would be done with diffs on Windows, but attached is the change which
fixes the problem.

I hope I've picked up on your problem correctly -- if this change doesn't
help please let us know! Thanks for the bug report, and sorry about the
problem!

Hope this helps.
Brad

$ more File.diff
Index: File.py
===================================================================
RCS file: /home/repository/biopython/biopython/Bio/File.py,v
retrieving revision 1.12
retrieving revision 1.13
diff -c -r1.12 -r1.13
*** File.py 2001/06/04 04:44:09 1.12
--- File.py 2001/07/14 23:48:51 1.13
***************
*** 46,60 ****
          return line
  
      def read(self, size=-1):
!         saved = ''
!         while size > 0 and self._saved:
!             if len(self._saved[0]) <= size:
!                 size = size - len(self._saved[0])
!                 saved = saved + self._saved.pop(0)
!             else:
!                 saved = saved + self._saved[0][:size]
!                 self._saved[0] = self._saved[0][size:]
!                 size = 0
          return saved + self._handle.read(size)
  
      def saveline(self, line):
--- 46,64 ----
          return line
  
      def read(self, size=-1):
!         if size == -1:
!             saved = string.join(self._saved, "")
!             self._saved[:] = []
!         else:
!             saved = ''
!             while size > 0 and self._saved:
!                 if len(self._saved[0]) <= size:
!                     size = size - len(self._saved[0])
!                     saved = saved + self._saved.pop(0)
!                 else:
!                     saved = saved + self._saved[0][:size]
!                     self._saved[0] = self._saved[0][size:]
!                     size = 0
          return saved + self._handle.read(size)
  
      def saveline(self, line):

From katel at worldpath.net Wed Aug 15 19:12:15 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] WIT and KEGG
References: <001001c122f2$f602e760$010a0a0a@cadence.com>
Message-ID: <002f01c125df$bc14d0a0$010a0a0a@cadence.com>

----- Original Message -----
From: "Jeffrey Chang"

> I agree with Tarjei that these should be separated out, if possible.
> Yes, there's a possibility of bugs when chaining filters together,
> but having two entities developed and debugged separately should have
> fewer bugs (and easier maintenance) than a system where all the
> functionality is munged together.
>
I'm not sure what two entities you are referring to. Two filters? I can
see the case for not cluttering the KEGG format with html filters. Two
modules? There may be no need for a separate WIT module because the 10
( filtered ) WIT files are accepted by the KEGG parser. And the WIT
documentation claims to be using KEGG format. Of course I need to take a
close, byte by byte look to see if any problem lurks in the details. So
WIT may just need a preprocessor consisting of chained filters.

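[A rough sketch of the "preprocessor consisting of chained filters" idea, in present-day Python. This is only an illustration: strip_tags, drop_blank_lines and chain are hypothetical names, not part of Bio.File or Martel, the regex is a crude stand-in for a real SGML stripper, and the sample lines are made up.]

```python
import re
from functools import reduce

# Crude tag matcher; an assumption for this sketch, not a real SGML parser.
TAG = re.compile(r"<[^>]*>")


def strip_tags(lines):
    """Filter 1: remove anything that looks like an SGML/HTML tag."""
    for line in lines:
        yield TAG.sub("", line)


def drop_blank_lines(lines):
    """Filter 2: drop stray empty lines (junk between records)."""
    for line in lines:
        if line.strip():
            yield line


def chain(lines, *filters):
    """Apply each filter in order; every filter maps lines to lines."""
    return reduce(lambda acc, f: f(acc), filters, lines)


# Made-up input resembling a KEGG-style record wrapped in markup.
raw = [
    "<pre>ENTRY       EC 1.1.1.1\n",
    "\n",
    "<b>NAME</b>        Alcohol dehydrogenase\n",
    "///\n",
]
cleaned = list(chain(raw, strip_tags, drop_blank_lines))
```

[Because each filter only consumes and produces lines, each one can be written and tested on its own, and the cleaned lines can then be handed to whatever parser already accepts the format.]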
Cayte

From katel at worldpath.net Wed Aug 15 19:18:47 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] MetaTool and Martel
Message-ID: <003701c125e0$a3bc1940$010a0a0a@cadence.com>

Does Martel handle embedded size fields? The MetaTool output contains
lots of matrixes preceded by column and row counts. It would be hard:
Martel would have to catch and store data on the fly.

It's not strictly necessary but without it Martel would accept matrixes
that were not consistent with the size fields.

Cayte

From adalke at mindspring.com Wed Aug 15 10:17:37 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] MetaTool and Martel
Message-ID: <017301c12595$098a54e0$0201a8c0@josiah.dalkescientific.com>

> Does Martel handle embedded size fields?

Yes! I needed it for support for the MDL file format.

Suppose you have something like

2 1 this and that
3 1 but not the other
1 3 this is a test

which should be turned into

record 1 == ("this and", "that")
record 2 == ("but not the", "other")
record 3 == ("this", "is a test")

Then you can use something like

>>> from Martel import Integer, Str, RepN, Group, AnyEol, Re, Rep
>>> word = Group("word", Re("[^ \R]+"))
>>>
>>> record = Integer("n1") + Str(" ") + Integer("n2") + \
...          Group("group1", RepN(Str(" ") + word, "n1")) + \
...          Group("group2", RepN(Str(" ") + word, "n2")) + \
...          AnyEol()
>>>
>>> from xml.sax import saxutils
>>> format = Rep(record)
>>> parser = format.make_parser()
>>> parser.setContentHandler(saxutils.XMLGenerator())
>>> parser.parseString("""\
... 2 1 this and that
... 3 1 but not the other
... 1 3 this is a test
... """)
2 1 this and that
3 1 but not the other
1 3 this is a test
>>>

A couple more details are at:
http://www.dalkescientific.com/Martel/ebi-talk/img35.htm

This is only usable if the number and the repeat count are the same.
E.g., if the count value is N to mean N-1 repeats then it isn't possible
to support it. (N+1 is doable as a repeat of N then a repeat of 1.)
But I've not come across that case. Yet.

> It's not strictly necessary but without it Martel would accept matrixes
>that were not consistent with the size fields.

There are other formats (MDL mol format) where the counts are required,
else things get out of sync.

Andrew

From jchang at SMI.Stanford.EDU Thu Aug 16 01:32:16 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Notification: incoming/42
In-Reply-To:
References:
Message-ID:

At 8:55 AM -0400 8/15/01, Brad Chapman wrote:
>Hey all;
>
>Bug report from J. Joung:
>> >>I'm using GenBank NCBIDictionary to retrieve a GenBank record.
>>The retrieved
>> >>record is missing the following information: LOCUS, DEFINITION, ACCESSION,
>> >>VERSION, and KEYWORDS.
>
>Jeff:
>> >Is this information that's in the GenBank record? It should be
>> >returning whatever NCBI returns, or raising an exception. Dropping
>> >information would be odd. Do you have a reproducible case? What is the
>> >accession you're using?
>
>I think this is the infamous "lose the first 5 lines of the file" bug that
>popped up in biopython-1.00a2 (which would also explain why 1.00a1 works
>just fine). This has been fixed in the current CVS, so the next release
>should be bug free (well, at least in regards to this bug :-).

Hey, good call! I completely forgot about that. It looks like we
really should release a fix soon...

Jeff

From dagdigian at blackstonecomputing.com Wed Aug 22 13:40:45 2001
From: dagdigian at blackstonecomputing.com (Chris Dagdigian)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Need help debugging our python-based viewcvs.cgi
Message-ID: <3B83EE9D.4060802@blackstonecomputing.com>

Hey folks,

Our python-based web CVS front end breaks as soon as you traverse into a
CVSROOT and then try to click on one of the links meant to aid in
traversing the directory tree.

The central problem is that the URLs that are constructed are wrong
after you get to a certain depth in the CVS tree.

As an example check out:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Martel/formats/?cvsroot=biopython

That link will put you into biopython/Martel/formats...

Now, on that same page, try clicking on one of the links next to the
"Current Directory:" navigation line. The URL link back to
biopython/biopython is just plain wrong and it causes the CGI to bomb
out with an error. It seems to be appending extra path info to the
arguments that get passed back to the CGI.

At this point I'm not sure if this is a python bug in the code or
perhaps an artifact of how our virtual website and cgi-bin directories
are configured.

Does anyone have the spare cycles to fool around with this app and try
to debug it? I don't know enough python to feel comfortable diving
around in the URL construction codebase.

I'll set up account access and permissions (if necessary) if anyone
wants to help out in debugging this app.

Regards,
Chris

From michal at orfeus.bioinfo.pl Wed Aug 22 13:59:35 2001
From: michal at orfeus.bioinfo.pl (Michal Kurowski)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] swissprot not working ?
Message-ID: <20010822195935.A1749@orfeus>

Hi,

I've installed biopython-1.00a2 recently and I'm having some
unexpected problems:

1) swissprot module has some serious problems. Running "swissprot.py"
from the "examples" directory gives the traceback I am attaching.

2) installation won't go smoothly. ( I'm sure I've got TextTools
installed ;-). The log is in an attachment.

My python is:

Python 2.0 (#1, Dec 20 2000, 15:28:16)
[GCC 2.96 20000731 (Red Hat Linux 7.0)] on linux2
Type "copyright", "credits" or "license" for more information.
>>>

from the redhat rpm package.

Cheers,

--
Michal Kurowski

-------------- next part --------------
Script started on Wed Aug 22 19:53:20 2001
[michal@a7 michal]$ python2 /home/seals/michal/bin/swiss_kinase.py
Traceback (most recent call last):
  File "/home/seals/michal/bin/swiss_kinase.py", line 23, in ?
    cur_record = s_iterator.next()
  File "/usr/lib/python2.0/site-packages/Bio/SwissProt/SProt.py", line 168, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/lib/python2.0/site-packages/Bio/SwissProt/SProt.py", line 289, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.0/site-packages/Bio/SwissProt/SProt.py", line 332, in feed
    self._scan_record(uhandle, consumer)
  File "/usr/lib/python2.0/site-packages/Bio/SwissProt/SProt.py", line 337, in _scan_record
    fn(self, uhandle, consumer)
  File "/usr/lib/python2.0/site-packages/Bio/SwissProt/SProt.py", line 369, in _scan_id
    self._scan_line('ID', uhandle, consumer.identification, exactly_one=1)
  File "/usr/lib/python2.0/site-packages/Bio/SwissProt/SProt.py", line 359, in _scan_line
    read_and_call(uhandle, event_fn, start=line_type)
  File "/usr/lib/python2.0/site-packages/Bio/ParserSupport.py", line 326, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Line does not start with 'ID':
AC P54646;
[michal@a7 michal]$ exit
exit

Script done on Wed Aug 22 19:53:25 2001
-------------- next part --------------
Script started on Wed Aug 22 19:50:29 2001
[root@a7 biopython-1.00a2]# python2 setup.py test
running test
test_Enzyme ... ok
test_FSSP ... ok
test_Fasta ... ok
test_Fasta2 ... ok
test_File ... ok
test_GenBank ... ok
test_GenBankFormat ... ok
test_KeyWList ... ok
test_Location ... ok
test_LocationParser ... ok
test_NCBIStandalone ... ok
test_NCBIWWW ... ok
test_ParserSupport ... ok
test_SProt ... ok
test_SubsMat ... ok
test_align ... ok
test_gobase ... ERROR
test_kabat ... ok
test_prodoc ... ok
test_property_manager ... ok
test_prosite ... ok
test_prosite2 ... ok
test_rebase ... ERROR
test_seq ... ok
test_translate ... ok
test_unigene ... FAIL

======================================================================
ERROR: test_gobase
----------------------------------------------------------------------
Traceback (most recent call last):
  File "run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "test_gobase.py", line 12, in ?
    from Bio import Gobase
  File "/usr/lib/python2.0/site-packages/Bio/Gobase/__init__.py", line 33, in ?
    from Bio import Sequence
ImportError: cannot import name Sequence
======================================================================
ERROR: test_rebase
----------------------------------------------------------------------
Traceback (most recent call last):
  File "run_tests.py", line 136, in runTest
    __import__(self.test_name)
  File "test_rebase.py", line 12, in ?
    from Bio.Rebase import Rebase
  File "/usr/lib/python2.0/site-packages/Bio/Rebase/__init__.py", line 32, in ?
    from Bio import Sequence
ImportError: cannot import name Sequence
======================================================================
FAIL: test_unigene
----------------------------------------------------------------------
Traceback (most recent call last):
  File "run_tests.py", line 153, in runTest
    expected_handle)
  File "run_tests.py", line 247, in compare_output
    assert expected_line == output_line, \
AssertionError:
Output : ' key is D61454\012'
Expected: ' key is F10922\012'
----------------------------------------------------------------------
Ran 26 tests in 49.226s

FAILED (failures=1, errors=2)
[root@a7 biopython-1.00a2]# exit
exit

Script done on Wed Aug 22 19:51:36 2001

From chapmanb at arches.uga.edu Wed Aug 22 14:40:05 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] swissprot not working ?
In-Reply-To: <20010822195935.A1749@orfeus>
References: <20010822195935.A1749@orfeus>
Message-ID: <15235.64645.89742.341180@taxus.athen1.ga.home.com>

Hi Michal;
Thanks for writing. Since you caught me right at the end of writing
e-mails, you get an extra fast response :-)

> I've installed biopython-1.00a2 recently and I'm having some
> unexpected problems:

In short, the problems you are having look like bugs that we have
noticed and squashed since the 1.00a2 release. I'll get into more
detail below, but if you want to "just fix the problems," getting the
latest CVS version should work for you. The biopython source is
available via anonymous CVS, with instructions at:

http://cvs.biopython.org/

We also hope to make a new release relatively soon.

Anyways, the problems you are seeing are due to us, and not you :-)

> 1) swissprot module has some serious problems. Running "swissprot.py"
> from the "examples" directory gives the traceback I am attaching.

There is a (now infamous) bug that snuck into 1.00a2 in which the
first 5 lines of a file will be eaten (under some conditions). The
traceback you are seeing is caused during retrieval of the swissprot
records in the swissprot.py example. The record is retrieved, but is
short the first 5 lines, so the swissprot parser thinks it is malformed.

> 2) installation won't go smoothly. ( I'm sure I've got TextTools
> installed ;-). The log is in an attachment.

The installation looks good (hey, a majority of the tests passed :-),
but there are also a few bugs in the tests:

> ImportError: cannot import name Sequence

This is caused by an old module Bio.Sequence (which has been replaced
by Bio.Seq), which was referenced in a few places we didn't
expect. This has been fixed.

> Output : ' key is D61454\012'
> Expected: ' key is F10922\012'

This is caused by different dictionary key orderings under different
versions of python. The module itself works fine, but when the output
generated by your version of python is compared to the "golden output"
produced by a different version, the key orderings differ so the
comparison fails. I believe this problem has also been fixed (by
sorting the dictionary keys so they are always standard). But, at any
rate, this is a regression test bug, and shouldn't affect your use of
the module.

Thanks for reporting these problems. We definitely like to get
feedback about this sort of thing. I hope this clears things up
and that you enjoy using Biopython!

Brad

From michal at orfeus.bioinfo.pl Wed Aug 22 14:52:44 2001
From: michal at orfeus.bioinfo.pl (Michal Kurowski)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] Re: swissprot not working ?
In-Reply-To: <15235.64645.89742.341180@taxus.athen1.ga.home.com>; from chapmanb@arches.uga.edu on Wed, Aug 22, 2001 at 02:40:05PM -0400
References: <20010822195935.A1749@orfeus> <15235.64645.89742.341180@taxus.athen1.ga.home.com>
Message-ID: <20010822205244.A6217@orfeus>

Brad Chapman [chapmanb@arches.uga.edu] wrote:
> Hi Michal;
> Thanks for writing. Since you caught me right at the end of writing
> e-mails, you get an extra fast response :-)

Seems I'm really lucky ;-).

> In short, the problems you are having look like bugs that we have
> noticed and squashed since the 1.00a2 release. I'll get into more
> detail below, but if you want to "just fix the problems," getting the
> latest CVS version should work for you. The biopython source is
> available via anonymous CVS, with instructions at:
>
> http://cvs.biopython.org/
>

I'm going there right away.

> There is a (now infamous) bug that snuck into 1.00a2 in which the
> first 5 lines of a file will be eaten (under some conditions). The
> traceback you are seeing is caused during retrieval of the swissprot
> records in the swissprot.py example. The record is retrieved, but is
> short the first 5 lines, so the swissprot parser thinks it is malformed.

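[A minimal sketch of how a push-back handle can eat lines in this way, written for present-day Python rather than the Python 2.0 used in this thread. The class is modelled on the read() change posted earlier in this thread for Bio/File.py; it is not the actual Biopython source. The sample record is made up apart from the accession quoted in the traceback, and the size < 0 test is a small generalisation of the patch's size == -1.]

```python
import io


class UndoHandle:
    """Sketch of a file-like wrapper that lets a parser push lines back."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []  # pushed-back lines; index 0 is read next

    def saveline(self, line):
        if line:
            self._saved.insert(0, line)

    def readline(self):
        if self._saved:
            return self._saved.pop(0)
        return self._handle.readline()

    def read(self, size=-1):
        if size < 0:
            # "Read everything": return all pushed-back lines first.
            # Without this branch the loop below never runs for size=-1,
            # so the saved lines are silently lost.
            saved = "".join(self._saved)
            self._saved[:] = []
        else:
            saved = ""
            while size > 0 and self._saved:
                if len(self._saved[0]) <= size:
                    size -= len(self._saved[0])
                    saved += self._saved.pop(0)
                else:
                    saved += self._saved[0][:size]
                    self._saved[0] = self._saved[0][size:]
                    size = 0
        return saved + self._handle.read(size)


# Made-up two-line record; only the AC value comes from the traceback.
record = "ID   EXAMPLE\nAC   P54646;\n"
handle = UndoHandle(io.StringIO(record))
first = handle.readline()   # a scanner peeks at the first line...
handle.saveline(first)      # ...and pushes it back
whole = handle.read()       # the ID line must still be at the front

# A bounded read that spans a saved line and the underlying handle.
partial_handle = UndoHandle(io.StringIO("cdef"))
partial_handle.saveline("ab")
partial = partial_handle.read(3)
```

[With the unbounded branch removed, whole would begin at the AC line, which is the shape of the "Line does not start with 'ID'" error in the attached traceback.]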
I was having the same type of errors when trying my own scripts.
After a small investigation I found that the SProt.py module is the
culprit. At least it seems to be ;-).

> > 2) installation won't go smoothly. ( I'm sure I've got TextTools
> > installed ;-). The log is in an attachment.
>
> The installation looks good (hey, a majority of the tests passed :-),
> but there are also a few bugs in the tests:

I was using the last "alpha" release previously and I don't remember
anything like that ( but now I've got a different TextTools ).

> Thanks for reporting these problems. We definitely like to get
> feedback about this sort of thing. I hope this clears things up
> and that you enjoy using Biopython!

I surely do.

Thanks a lot,

--
Michal Kurowski

From katel at worldpath.net Sun Aug 26 02:10:58 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] MetaTool and Martel
References: <017301c12595$098a54e0$0201a8c0@josiah.dalkescientific.com>
Message-ID: <002301c12df5$e16c67a0$010a0a0a@cadence.com>

> >>> from Martel import Integer, Str, RepN, Group, AnyEol, Re, Rep
> >>> word = Group("word", Re("[^ \R]+"))
> >>>
> >>> record = Integer("n1") + Str(" ") + Integer("n2") + \
> ...          Group("group1", RepN(Str(" ") + word, "n1")) + \
> ...          Group("group2", RepN(Str(" ") + word, "n2")) + \
> ...          AnyEol()

Can the variable be reassigned within a single record? MetaTool
outputs a lot of matrixes. It would be simpler to reassign row_count
and column_count for each matrix than to invent a new variable name for
each matrix and clutter up the code with repetitive, nearly identical
matrix definitions.

Cayte

From katel at worldpath.net Sun Aug 26 22:07:39 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] MetaTool
Message-ID: <000f01c12e9d$0dad7820$010a0a0a@cadence.com>

The MetaTool parser will need to represent matrixes. Before writing my
own class, I found an extension called Numeric Python, that provides
powerful support for matrix representation and manipulation. The only
drawback I can see is that it requires bundling yet another tool with
the distribution. But MetaTool is new and having powerful matrix
features will allow users to experiment in unanticipated ways.

Of course I'd need to investigate more to see how reliable the
extension is.

Is this the way to go?

Cayte

From adalke at mindspring.com Sun Aug 26 11:53:02 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] MetaTool and Martel
Message-ID: <049801c12e47$30e8bf80$0201a8c0@josiah.dalkescientific.com>

Cayte:
> Can the variable be reassigned within a single record?

Yes. It uses the most recently matched value, including if there was a
partial match on a path that requires backtracking.

> It would be simpler to reassign row_count and column_count
> for each matrix than invent a new variable name for each
> matrix and clutter up the code with repetitive, almost
> the same matrix definitions.

No problem. Go ahead.

Andrew

From adalke at mindspring.com Sun Aug 26 11:56:26 2001
From: adalke at mindspring.com (Andrew Dalke)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] MetaTool
Message-ID: <04a101c12e47$a9e075e0$0201a8c0@josiah.dalkescientific.com>

>I found an extension called Numeric Python, that provides
>powerful support for matrix representation and manipulation.

Numeric Python is pretty widely used, and rather easy to install.

> Of course I'd need to investigate more to see how reliable
> the extension is.

One of my clients uses it all the time. Years ago there used to be a
lot of things (almost all non-standard uses) that would cause it to
fail, but they've been long ago cleaned up. I think Numeric was one of
the first common non-Guido extensions to Python.

> Is this the way to go?

Yes.
If you're doing non-trivial matrix numerics it's best to use Numeric,
even given the extra dependency.

Andrew
dalke@dalkescientific.com

From jchang at SMI.Stanford.EDU Mon Aug 27 01:15:36 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] MetaTool
In-Reply-To: <000f01c12e9d$0dad7820$010a0a0a@cadence.com>
Message-ID:

> The MetaTool parser will need to represent matrixes. Before
> writing my own class, I found an extension called Numeric Python, that
> provides powerful support for matrix representation and manipulation.
> The only drawback I can see is that it requires bundling yet another
> tool with the distribution. But MetaTool is new and having powerful
> matrix features will allow users to experiment in unanticipated ways.
>
> Of course I'd need to investigate more to see how reliable the
> extension is.
>
> Is this the way to go?

Yes. It's already a dependency for Biopython, for some of the more
algorithmic code.

Jeff

From jchang at SMI.Stanford.EDU Tue Aug 28 20:18:49 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] next release
Message-ID:

Hello everyone,

I'd like to roll a new biopython release. This will also be the final
release before I move around the directory structure as discussed at
BOSC. This release will not contain a lot of new functionality, but
will mostly fix bugs, including the now infamous UndoHandle bug.

The code for the release should be working correctly, so all the core
developers should let me know if your stuff is ready to be released,
and if not, when it will be. The regression tests seem to all pass...

Thanks,
Jeff

From jchang at SMI.Stanford.EDU Fri Aug 31 18:15:22 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar 5 14:43:03 2005
Subject: [Biopython-dev] next release imminent
Message-ID:

If nobody has any objections, I'm going to put together the next
release this weekend. Please let me know if I should hold off...

Thanks,
Jeff