From tiagoantao at gmail.com  Sun Sep  2 10:29:47 2007
From: tiagoantao at gmail.com (Tiago Antão)
Date: Sun, 2 Sep 2007 15:29:47 +0100
Subject: [Biopython-dev] Jython and sqlite
In-Reply-To:
References: <6d941f120708300731k78443be1x858513d0f56e07ca@mail.gmail.com>
Message-ID: <6d941f120709020729i783d106bq919646143d34d1bc@mail.gmail.com>

Hi!

On 8/30/07, Sebastian Bassi wrote:
> There is also a sqlite module for previous version of Python. So I
> guess you could check python version at the beginning of your code and
> then set the import properly. The code will just run with python

Thanks for the suggestion. What I will do is the following:

1. Go to the BioSQL mailing list and ask for their opinion (I think
   not much will happen).
2. After the feedback I will try to figure out a solution that is not
   Python 2.5 dependent (like yours). My main goals will be:
   a. No 2.5 dependence
   b. Should not be a big hassle to users (I would prefer not to
      require installing a full-blown database, which might scare off
      users with less knowledge)
   c. Should be easy to develop and maintain
   d. Be sure that nobody in biopython-dev will have strong feelings
      against it

In any case, this is only relevant to HapMap stuff, which is probably
a couple of months down the road still, so there is plenty of time to
discuss...

-- 
http://www.tiago.org/ps

From mdehoon at c2b2.columbia.edu  Sun Sep  9 10:18:34 2007
From: mdehoon at c2b2.columbia.edu (Michiel de Hoon)
Date: Sun, 09 Sep 2007 23:18:34 +0900
Subject: [Biopython-dev] New Biopython release
Message-ID: <46E400BA.90504@c2b2.columbia.edu>

Hi everybody,

Let's make a new release (1.44) of Biopython. Biopython has received a
lot of improvements and bug fixes in recent months, and the current
(1.43) release hangs on some platforms during the Biopython tests (due
to a bug in my own Bio.Cluster module, ahem). I am planning to make a
new release during the next weekend (around 9/15).

A number of bugs in Bugzilla currently have a partial solution that has
not yet made it into CVS. I suggest to commit those partial solutions to
CVS if possible so that they can be included in the next release.

On my machine, only test_SeqIO fails its test. The error seems to be
trivial, but we should fix it before making the next release.

--Michiel.

From bugzilla-daemon at portal.open-bio.org  Sun Sep  9 16:13:53 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Sep 2007 16:13:53 -0400
Subject: [Biopython-dev] [Bug 2090] Blast.NCBIStandalone BlastParser fails with blastall 2.2.14
In-Reply-To:
Message-ID: <200709092013.l89KDrUW014767@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2090

biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #520 is|0                           |1
           obsolete|                            |

------- Comment #15 from biopython-bugzilla at maubp.freeserve.co.uk  2007-09-09 16:13 EST -------
(From update of attachment 520)
As per comment 14, I've checked something based on this into CVS.

I'm leaving this bug open, as we still can't read the plain text output from
multiple queries.  Now that we have moved over to XML as the default, this
isn't such a problem.

-- 
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org  Sun Sep  9 17:54:31 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Sep 2007 17:54:31 -0400
Subject: [Biopython-dev] [Bug 2351] Make SeqRecord subclass Seq subclass string?
In-Reply-To:
Message-ID: <200709092154.l89LsVjS022303@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2351

------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk  2007-09-09 17:54 EST -------
Created an attachment (id=751)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=751&action=view)
Make str() work intuitively on Seq and MutableSeq objects

This patch only changes the Seq and MutableSeq objects:
1. Makes str() return the full sequence for Seq and MutableSeq objects
2. Adds docstrings to encourage str(my_seq) instead of my_seq.tostring()
3. Fixes the repr() of a MutableSeq object

This patch updates Bio/Seq.py and the unit test case test_seq.py and its
output.

It does not add any .short() method to give a truncated representation string
like the one the current str() method gives.

It does not do anything to Bio/SeqRecord.py

From biopython-dev at maubp.freeserve.co.uk  Sun Sep  9 16:47:17 2007
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Sun, 09 Sep 2007 21:47:17 +0100
Subject: [Biopython-dev] New Biopython release
In-Reply-To: <46E400BA.90504@c2b2.columbia.edu>
References: <46E400BA.90504@c2b2.columbia.edu>
Message-ID: <46E45BD5.1060905@maubp.freeserve.co.uk>

Michiel de Hoon wrote:
> Hi everybody,
>
> Let's make a new release (1.44) of Biopython. Biopython has received
> a lot of improvements and bug fixes in recent months, and the current
> (1.43) release hangs on some platforms during the Biopython tests
> (due to a bug in my own Bio.Cluster module, ahem). I am planning to
> make a new release during the next weekend (around 9/15).

Good idea - I was wondering about suggesting this.  By the way, I
updated the NEWS file fairly recently.

> A number of bugs in Bugzilla currently have a partial solution that
> has not yet made it into CVS. I suggest to commit those partial
> solutions to CVS if possible so that they can be included in the next
> release.

Some of those are enhancement bugs where things are still a little
undecided.  I'm fairly happy with the __getitem__ method for the
alignment, but would want to change it slightly if we added a
__getitem__ to the SeqRecord itself.

> On my machine, only test_SeqIO fails its test. The error seems to be
> trivial, but we should fix it before making the next release.

It was trivial ;)
I had forgotten to check in a SwissProt test case.  Sorry.

We should get Tiago's input before making a release - see if he's happy
for his initial Bio.PopGen code to be released yet.  If not, then it
shouldn't be too hard to remove that module, its test cases, and its
section in the tutorial.  Just a little fiddly...

I've also done a little work on writing EMBL and GenBank files (based in
part on Howard Salis' patches on Bug 2294) but I don't think they are
ready yet.  As part of this I am planning to make some small changes
to the EMBL and GenBank parsers to record a few more bits of annotation.
Rather than rush this I'll hold back until after release 1.44 is done.

Peter

From bugzilla-daemon at portal.open-bio.org  Sun Sep  9 18:25:02 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Sep 2007 18:25:02 -0400
Subject: [Biopython-dev] [Bug 1963] Adding __str__ method to codon tables and translators
In-Reply-To:
Message-ID: <200709092225.l89MP28C025915@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1963

biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk  2007-09-09 18:25 EST -------
Checked in a version based on this, but which copes with a generic translator
(which will accept either RNA or DNA).

See Bio/Data/CodonTable.py revision 1.4 and Bio/Translate.py revision 1.2

From bugzilla-daemon at portal.open-bio.org  Sun Sep  9 20:10:37 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Sep 2007 20:10:37 -0400
Subject: [Biopython-dev] [Bug 2351] Make SeqRecord subclass Seq subclass string?
In-Reply-To:
Message-ID: <200709100010.l8A0Abet032204@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2351

------- Comment #4 from sbassi at gmail.com  2007-09-09 20:10 EST -------
(In reply to comment #3)
> It does not add any .short() method to give a truncated representation string
> like the current str() method gives.

Why not? This new method should not cause any compatibility problems.
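
To make the str() versus .short() discussion in Bug 2351 concrete, here is a
minimal sketch. ToySeq, its short() method and the max_length parameter are
hypothetical names used for illustration only; this is not Biopython's Seq
class, and the patch in attachment 751 may well differ.

```python
class ToySeq:
    """Hypothetical stand-in for a sequence object (not Bio.Seq.Seq)."""

    def __init__(self, data):
        self.data = data

    def __str__(self):
        # Return the full sequence, as comment #3 proposes for Seq.
        return self.data

    def short(self, max_length=60):
        # Assumed behaviour: truncate long sequences and mark the cut
        # with "..." so the result is at most max_length characters.
        if len(self.data) <= max_length:
            return self.data
        return self.data[:max_length - 3] + "..."


my_seq = ToySeq("ACGT" * 25)
print(str(my_seq))     # the full 100-letter sequence
print(my_seq.short())  # the first 57 letters followed by "..."
```

Because short() is a new, separate method in this sketch, code that already
calls str() is only affected by the change to str() itself, which seems to be
the point raised in comment #4.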

From tiagoantao at gmail.com  Mon Sep 10 12:53:58 2007
From: tiagoantao at gmail.com (Tiago Antão)
Date: Mon, 10 Sep 2007 17:53:58 +0100
Subject: [Biopython-dev] New Biopython release - Bio.PopGen?
In-Reply-To: <46E55CDE.2020805@warwick.ac.uk>
References: <46E55CDE.2020805@warwick.ac.uk>
Message-ID: <6d941f120709100953r16b6b695q2e02b8f10c39d1f7@mail.gmail.com>

Hi

My apologies for the delay in answering, but I am traveling for a
couple of weeks as I am helping to organize a conservation genetics
course. Until the 21st I will probably be slow in answering.

You can go ahead and include the code if you want (any decision that
you take is OK), but please note the following:
1. Without a statistics module the functionality is limited (a good
   reason to delay the release)
2. I believe that the code is in an acceptable state.
3. A diff with suggested fdist documentation is attached to the
   bugzilla entry.

I would lean towards a delay until statistics are in, but if it is too
much hassle please go ahead and make it public.

Tiago

On 9/10/07, Peter Cock wrote:
> Hi Tiago,
>
> I'm forwarding this email in case you had missed it on the mailing list.
>
> Basically what do you want to do with Bio.PopGen given that Michiel is
> hoping to do a Biopython release by the end of this week?
>
> Thanks
>
> Peter
>
> -------- Original Message --------
> Subject: Re: [Biopython-dev] New Biopython release
> Date: Sun, 09 Sep 2007 21:47:17 +0100
> From: Peter
> Reply-To: biopython-dev at lists.open-bio.org
> To: Michiel de Hoon
> CC: biopython-dev at lists.open-bio.org
> References: <46E400BA.90504 at c2b2.columbia.edu>
>
> Michiel de Hoon wrote:
> > Hi everybody,
> >
> > Let's make a new release (1.44) of Biopython. Biopython has received
> > a lot of improvements and bug fixes in recent months, and the current
> > (1.43) release hangs on some platforms during the Biopython tests
> > (due to a bug in my own Bio.Cluster module, ahem).
> > I am planning to make a new release during the next weekend
> > (around 9/15).
>
> Good idea - I was wondering about suggesting this.  By the way, I
> updated the NEWS file fairly recently.
>
> > A number of bugs in Bugzilla currently have a partial solution that
> > has not yet made it into CVS. I suggest to commit those partial
> > solutions to CVS if possible so that they can be included in the next
> > release.
>
> Some of those are enhancement bugs where things are still a little
> undecided.  I'm fairly happy with the __getitem__ method for the
> alignment, but would want to change it slightly if we added a
> __getitem__ to the SeqRecord itself.
>
> > On my machine, only test_SeqIO fails its test. The error seems to be
> > trivial, but we should fix it before making the next release.
>
> It was trivial ;)
> I had forgotten to check in a SwissProt test case.  Sorry.
>
> We should get Tiago's input before making a release - see if he's happy
> for his initial Bio.PopGen code to be released yet.  If not, then it
> shouldn't be too hard to remove that module, its test cases, and its
> section in the tutorial.  Just a little fiddly...
>
> I've also done a little work on writing EMBL and GenBank files (based in
> part on Howard Salis' patches on Bug 2294) but I don't think they are
> ready yet.  As part of this I am planning to make some small changes
> to the EMBL and GenBank parsers to record a few more bits of annotation.
> Rather than rush this I'll hold back until after release 1.44 is done.
>
> Peter
>
>

-- 
http://www.tiago.org/ps

From bugzilla-daemon at portal.open-bio.org  Tue Sep 11 06:10:38 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 06:10:38 -0400
Subject: [Biopython-dev] [Bug 2174] FDist Support in BioPython
In-Reply-To:
Message-ID: <200709111010.l8BAAcDC010284@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2174

------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk  2007-09-11 06:10 EST -------
I'm only commenting on the English.  All this Controller stuff sounds
complicated, but I guess it's based on how FDist works.

I would clarify this:

> The FDist data format is application specific and is not used at
> all by other applications, as such, it is always necessary to
> convert from an external format. There is code to convert a GenePop
> format to FDist, here is an example of usage (along with imports that
> will be needed on examples further below):

Perhaps:

> The FDist data format is application specific and is not used at
> all by other applications, as such you will probably have to convert
> your data for use with FDist.  Biopython can help you do this.
> Here is an example converting from GenePop format to FDist format
> (along with imports that will be needed on examples further below):

Small point, there/the:

Before:
>> In practice, when there number of populations is low, the mutation model
>> is stepwise and the sample size increases, fdist will not be able to
>> simulate an acceptable approximate average $F_{st}$.

After:
>> In practice, when the number of populations is low, the mutation model
>> is stepwise and the sample size increases, fdist will not be able to
>> simulate an acceptable approximate average $F_{st}$.

If you fix the there/the, then I think that can be committed to CVS.  Would
you like me or Michiel to do this for you, as you are travelling this week?

From mdehoon at c2b2.columbia.edu  Tue Sep 11 10:37:57 2007
From: mdehoon at c2b2.columbia.edu (Michiel de Hoon)
Date: Tue, 11 Sep 2007 23:37:57 +0900
Subject: [Biopython-dev] Bio.MultiProc
Message-ID: <46E6A845.3030601@c2b2.columbia.edu>

Hi everybody,

In preparation for the upcoming release, I was running the Biopython
test suite and found that test_copen.py hangs on Cygwin. It doesn't
fail, it just sits there forever. This may be related to the use of
fork() instead of select() in Bio/MultiProc/copen.py.

Anyway, while it is probably possible to fix this, I'd have to dig
fairly deep into the code, and I am not sure if it is worth it. It looks
like the copen functions are used only in Bio/config, which is needed
for Bio.db. A description of the functionality of this module can be
found in the tutorial section 4.7.2.

Now, I don't remember users asking about this module on the mailing
list. From the tutorial documentation, it seems to be a nice piece of
code, but I doubt that it is being used often in practice. So I was
wondering:
1) Is anybody on this list using this code?
2) If not, can I mark it as deprecated for the upcoming release?
Hopefully, people who are using this code will notice, and let us know
that they need it.

--Michiel.
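
One common way to mark a module as deprecated, as proposed in question 2
above, is to emit a DeprecationWarning from Python's standard warnings module
when the module is imported. The following is only a sketch of that idea: the
helper name warn_deprecated and the message wording are assumptions, and
nothing here is taken from the actual Biopython sources.

```python
import warnings


def warn_deprecated(module_name, note=""):
    """Emit a DeprecationWarning naming the deprecated module.

    Hypothetical helper: module_name and note are free-form strings.
    """
    message = "%s is deprecated and may be removed in a future release." % module_name
    if note:
        message += " " + note
    # stacklevel=2 attributes the warning to the caller of this helper.
    warnings.warn(message, DeprecationWarning, stacklevel=2)


# A deprecated module could call the helper once, near the top of the file:
warn_deprecated("Bio.MultiProc",
                "Please contact the mailing list if you still need it.")
```

Note that newer Python versions hide DeprecationWarning by default in many
situations, so users may need to run with -W default to see the message.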

From bugzilla-daemon at portal.open-bio.org  Tue Sep 11 16:00:32 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 16:00:32 -0400
Subject: [Biopython-dev] [Bug 2361] New: Test Suite Failures
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

           Summary: Test Suite Failures
           Product: Biopython
           Version: 1.43
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: sjf4 at u.washington.edu

Your test suite (or your software) does not seem to function correctly.  I've
been trying to make several of the tests succeed for days, but have not been
able to.  I've tried many different versions of biopython and the related
modules on RHEL4 and RHEL5.  I finally thought to try the tests on Windows, and
many of the same tests fail there as well.  Many of the failing tests use the
mxTextTools module, of which only v3.0.0 can be obtained now, whereas the
documentation refers to 2.0.x.

If the software works properly, but just the tests fail, I would appreciate you
updating the tests.  As a non-scientist, I have no way to test this software
before I pass it on to my users, so I rely upon test suites like these to
assure some level of functionality before handing it off.

The output of the failed tests is below.

======================================================================
ERROR: test_CodonUsage
----------------------------------------------------------------------
Traceback (most recent call last):
  File "run_tests.py", line 149, in runTest
    self.runSafeTest()
  File "run_tests.py", line 162, in runSafeTest
    cur_test = __import__(self.test_name)
  File "/scratch/sjf4/temp/biopython-1.43/Tests/test_CodonUsage.py", line 10, in ?
X.generate_index("./CodonUsage/HighlyExpressedGenes.txt") File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/SeqUtils/ CodonUsage.py", line 74, in generate_index self._count_codons(FastaFile) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/SeqUtils/ CodonUsage.py", line 117, in _count_codons cur_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__i nit__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterPa rser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_Fasta2 ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_Fasta2.py", line 44, in ? 
data = record_parser.parse( src_handle ) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__i nit__.py", line 100, in parse return self.convert_lax(iterator.next()) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterPa rser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_KEGG ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_KEGG.py", line 67, in ? 
t_KEGG_Enzyme(test_KEGG_Enzyme_files) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_KEGG.py", line 23, in t_KEG G_Enzyme record = records.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/KEGG/Enzy me/__init__.py", line 225, in next data = self._reader.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Record Reader.py", line 295, in next positions = _find_end_positions(lookahead, self.tagtable) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Record Reader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' ====================================================================== ERROR: test_align ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_align.py", line 129, in ? 
alignment = FastaAlign.parse_file(to_parse, 'PROTEIN') File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/Fas taAlign.py", line 48, in parse_file cur_align = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__i nit__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterPa rser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_format_registry ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_format_registry.py", line 4 9, in ? 
parser.parseFile(_open('EDD_RAT.dat')) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 452, in parseFile self._err_handler.fatalError(ParserRecordException( File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserRecordException: Traceback (most recent call last): File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 444, in parseFile record = reader.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Record Reader.py", line 295, in next positions = _find_end_positions(lookahead, self.tagtable) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Record Reader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' ====================================================================== ERROR: test_geo ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_geo.py", line 24, in ? 
record = records.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Geo/__ini t__.py", line 79, in next return self._parser.parse(File.StringHandle(data)) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Geo/__ini t__.py", line 228, in parse self._scanner.feed(handle, self._consumer) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Geo/__ini t__.py", line 126, in feed self._parser.parseFile(handle) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 328, in parseFile self.parseString(fileobj.read()) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 1427 ====================================================================== FAIL: test_Fasta ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'Basic operation of the Record Parser. ... ERROR\n' Expected: 'Basic operation of the Record Parser. ... 
ok\n' ====================================================================== FAIL: test_GenBankFormat ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 166, in runSafeTest cur_test.run_tests([]) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_GenBankFormat.py", line 588 , in run_tests test_list.test() File "/scratch/sjf4/temp/biopython-1.43/Tests/martel_support.py", line 51, in test raise AssertionError, "cannot parse" AssertionError: cannot parse ====================================================================== FAIL: test_NNGene ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'Find all motifs in a set of sequences. ... ERROR\n' Expected: 'Find all motifs in a set of sequences. ... ok\n' ====================================================================== FAIL: test_SCOP_Astral ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'testConstructWithCustomFile (test_SCOP_Astral.AstralTests) ... ERROR\ n' Expected: 'testConstructWithCustomFile (test_SCOP_Astral.AstralTests) ... 
ok\n'

----------------------------------------------------------------------
Ran 91 tests in 99.339s

FAILED (failures=4, errors=6)

From bugzilla-daemon at portal.open-bio.org  Tue Sep 11 16:23:03 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 16:23:03 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures
In-Reply-To:
Message-ID: <200709112023.l8BKN38E018573@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk  2007-09-11 16:23 EST -------
You seem to be the bearer of bad news :(

My first impression is that you are right in that mxTextTools 3.0 is to blame:
http://www.egenix.com/products/python/mxBase/mxTextTools/changelog.html

e.g. from their changes from 2.0.3 to 3.0.0

> Restructured tag commands and their numbering so that
> low-level commands come before the special ones. Old
> tag tables need to be "recompiled" due to this change!

If you can still download mxTextTools 2.x.x from their website, it's not at
all obvious how.

As to the impact in Biopython, most of the unit test failures are clearly
problems in Biopython's Martel library and/or the Python SAX library:

test_CodonUsage - depends on Bio.Fasta which depends on Martel
test_Fasta2 - depends on Martel
test_KEGG - depends on Martel
test_align - depends on Bio.Fasta which depends on Martel
test_format_registry - depends on Martel
test_geo - depends on Martel
test_Fasta - unclear from the error, but known to depend on Martel
test_GenBankFormat - depends on Martel
test_NNGene - failure unclear
test_SCOP_Astral - failure unclear

Most of these Martel based parsers are less commonly used (IMO), with the
exception of Bio.Fasta, where at least removing the Martel dependence would be
fairly easy.

From bugzilla-daemon at portal.open-bio.org  Tue Sep 11 16:54:14 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 16:54:14 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures
In-Reply-To:
Message-ID: <200709112054.l8BKsEHk022566@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk  2007-09-11 16:54 EST -------
I don't see why test_NNGene and test_SCOP_Astral would fail - as far as I can
see, there is no link to Martel.

Stephen - Would you be able to post the output of the following (run in the
Tests subdirectory):

python test_NNGene.py
python test_SCOP_Astral.py

Thank you.
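
Since comment #1 on Bug 2361 points at the mxTextTools 2.0.3 to 3.0.0 change
as the likely cause, a test runner could in principle flag the Martel-based
tests up front. The sketch below only shows the version comparison. The
function names are hypothetical, the 3.0.0 threshold is an assumption based on
the changelog excerpt quoted in comment #1, and how to obtain the installed
version string is deliberately left out because the thread does not say.

```python
def parse_version(version):
    """Turn a dotted version string such as '3.0.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:3])


def martel_probably_affected(mxtexttools_version):
    """Return True for mxTextTools 3.0.0 or later.

    Assumption: the tag table renumbering mentioned in the changelog
    starts with 3.0.0, so older 2.x releases are treated as unaffected.
    """
    return parse_version(mxtexttools_version) >= (3, 0, 0)


for candidate in ("2.0.3", "3.0.0"):
    if martel_probably_affected(candidate):
        print("mxTextTools %s: Martel-based tests may fail" % candidate)
    else:
        print("mxTextTools %s: no known tag table change" % candidate)
```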

From bugzilla-daemon at portal.open-bio.org  Tue Sep 11 17:00:54 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 17:00:54 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures
In-Reply-To:
Message-ID: <200709112100.l8BL0sW5023454@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #3 from sjf4 at u.washington.edu  2007-09-11 17:00 EST -------
NNGene Test Output
= = = = = = = = = = = =
SchemaMatchingTest:Matching schema to strings works correctly. ... ok
Reading and writing motifs to a file ... ok
Reading and writing schemas to a file. ... ok
Reading and writing signatures to a file. ... ok
Retrieve counts for particular patterns in the repository. ... ok
Retrieve all patterns from a repository. ... ok
Retrieve patterns from both sides of the list (top and bottom). ... ok
Retrieve random patterns from the repository. ... ok
Retrieve a certain number of the top patterns. ... ok
Retrieve the top percentge of patterns from the repository. ... ok
Test the ability to remove A rich patterns from the repository. ... ok
Find all motifs in a set of sequences. ... ERROR
Find the difference in motif counts between two sets of sequences. ... ERROR
Convert a sequence into its motif representation. ... ok
Return all unambiguous characters that can be in a motif. ... ok
Find the positions of ambiguous items in a sequence. ... ok
Find all matches in a sequence. ... ok
Make sure motif compiled regular expressions are cached properly. ... ok
Find the number of ambiguous items in a sequence. ... ok
Find how many matches are present in a sequence. ... ok
Convert a string into a representation of motifs. ... ok
Find schemas from sequence inputs. ... ERROR
Find schemas that differentiate between two sets of sequences. ... ERROR
Generating schema from a simple list of motifs. ... ok
Generating schema from a real life set of motifs. ...
ERROR Convert sequences into schema representations. ... ERROR Find signatures from sequence inputs. ... ERROR Convert a sequence into its signature representation. ... ok ====================================================================== ERROR: Find all motifs in a set of sequences. ---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 253, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Find the difference in motif counts between two sets of sequences. 
---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 253, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Find schemas from sequence inputs. ---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 416, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Find schemas that differentiate between two sets of sequences. 
---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 416, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Generating schema from a real life set of motifs. ---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 546, in t_hard_from_motifs schema_bank = self._load_schema_repository() File "test_NNGene.py", line 573, in _load_schema_repository seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Convert sequences into schema representations. 
---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 599, in t_schema_representation schema_bank = self._load_schema_repository() File "test_NNGene.py", line 573, in _load_schema_repository seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Find signatures from sequence inputs. 
---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 636, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ---------------------------------------------------------------------- Ran 28 tests in 0.250s FAILED (errors=7) ==== ==== ==== ==== ==== SCOP_Astral Test Ouput = = = = = = = = = = = = E..E ====================================================================== ERROR: testConstructWithCustomFile (__main__.AstralTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "test_SCOP_Astral.py", line 55, in testConstructWithCustomFile assert astral.getSeqBySid('d3sdha_').data == "AAAAA" File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/SCOP/__init__.py", line 806, in getSeqBySid return self.fasta_dict[domain].seq File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 214, in __getitem__ return dict.__getitem__(self,key) KeyError: 'd3sdha_' ====================================================================== ERROR: testGetSeq (__main__.AstralTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "test_SCOP_Astral.py", line 43, in testGetSeq assert self.astral.getSeqBySid('d3sdha_').data == "AAAAA" File 
"/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/SCOP/__init__.py", line 806, in getSeqBySid return self.fasta_dict[domain].seq File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 214, in __getitem__ return dict.__getitem__(self,key) KeyError: 'd3sdha_' ---------------------------------------------------------------------- Ran 4 tests in 0.114s FAILED (errors=2) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Sep 11 17:21:15 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 17:21:15 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures In-Reply-To: Message-ID: <200709112121.l8BLLFcZ025354@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Component|Main Distribution |Martel/Mindy ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 17:21 EST ------- Thanks Stephen. The good news is that test_NNGene is also failing in the Martel/Sax library (called via Bio.Fasta). Its not so clear cut, but this likely that root cause of the test_SCOP_Astral failure too. Its not a full solution, but we may be able to minimise this problem by changing Bio.Fasta not to use Martel, e.g. see bug 2058 [For what its worth, I have filed this bug under the Martel/Mindy component] -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Tue Sep 11 17:56:20 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 17:56:20 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709112156.l8BLuKaL027005@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 biopython-bugzilla at maubp.freeserve.co.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Test Suite Failures         |Test Suite Failures from
                   |                            |Martel/Sax with egenix
                   |                            |mxTextTools 3.0
------- Comment #5 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 17:56 EST ------- I am able to reproduce this on Ubuntu Dapper Drake - switching from mxTextTools 2.0.6 (from the Ubuntu binary packages) to mxTextTools 3.0.0 (installed from source under my home directory only). -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Sep 11 18:33:34 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 18:33:34 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709112233.l8BMXYpP028597@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 18:33 EST ------- After applying the updated patch on bug 2058 to remove the Martel dependency from Bio.Fasta, things are a LOT better. http://bugzilla.open-bio.org/attachment.cgi?id=755 Note that with mxTextTools 3, test_Fasta would still fail on indexing a file as a simple database using Mindy.
With the patch I think we only have five unit test failures: test_Fasta test_GenBankFormat test_KEGG test_format_registry test_geo -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Sep 11 21:47:52 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 21:47:52 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120147.l8C1lqPt004412@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #7 from mdehoon at ims.u-tokyo.ac.jp 2007-09-11 21:47 EST ------- With the patch on bug 2058, I am finding the same five unit test failures on Cygwin (plus test_copen.py, but that is a Cygwin-specific failure unrelated to mxTextTools). -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 01:45:22 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 01:45:22 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120545.l8C5jMSg017128@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #8 from mdehoon at ims.u-tokyo.ac.jp 2007-09-12 01:45 EST ------- I've written a Martel-free parser for Bio.Geo. With this parser, test_geo.py now passes. I've uploaded the new parser to CVS; feel free to comment. 
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 04:47:26 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 04:47:26 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120847.l8C8lQFn030017@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #9 from mdehoon at ims.u-tokyo.ac.jp 2007-09-12 04:47 EST ------- I've uploaded a Martel-free parser for Bio.KEGG.Compound to CVS. I'm still working on the same for Bio.KEGG.Enzyme and Bio.KEGG.Map. Just to let you know what I'm working on, to avoid duplicated efforts. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 04:57:24 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 04:57:24 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120857.l8C8vOrm030713@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #10 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 04:57 EST ------- Regarding GEO, your changes look sensible. I see you have removed Bio.Geo.RecordParser() and deprecated Bio.Geo.Iterator() to use a parse() function. That makes sense given recent Biopython developments. For the long term, and thinking as a potential end user, something much more like Sean Davis' GEOquery package for R/BioConductor would be much nicer. 
http://www.warwick.ac.uk/go/peter_cock/r/geo/ Right now, breaking the GEO files at each ^ (caret) line is perhaps a little too low level. In particular, in Biopython I would like to be able to take a GDS file (GEO Dataset), and have it loaded as an annotated matrix of expression levels (genes as rows, samples as columns) suitable for use with Bio.Cluster. But that is probably best left as a future enhancement bug. -------------------------------------------------------------------- We still have the Bio.Fasta.Dictionary and Bio.GenBank.Dictionary classes (and anything else like them) to worry about. These use Mindy to build a set of lookup tables as files on disk, allowing keyed access to records WITHOUT having all the records in memory. I'm a bit hazy on the implementation details. I personally don't use them, and it's not something currently supported by Bio.SeqIO either. -------------------------------------------------------------------- If you replace the KEGG parser then I fear the remaining problems are very tightly linked to Martel, but also in my opinion not key features of Biopython. We could try asking on the egenix mailing list to see if they have some examples of updating Python code to work with mxTextTools 3.0 ... -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
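As an illustration of what "breaking the GEO files at each ^ (caret) line" in comment #10 amounts to, here is a minimal sketch in present-day Python. It is not the Bio.Geo parser; the function name `split_at_caret_lines` and the sample lines in the usage note are made up for the example.

```python
def split_at_caret_lines(lines):
    """Group lines into chunks, starting a new chunk at each '^' line.

    Hypothetical helper for illustration only; lines that appear before
    the first caret line end up in a chunk of their own.
    """
    chunks = []
    current = []
    for line in lines:
        if line.startswith("^") and current:
            # A caret line opens a new entity, so close the previous chunk.
            chunks.append(current)
            current = []
        current.append(line.rstrip("\n"))
    if current:
        chunks.append(current)
    return chunks
```

Calling it on a list such as `["^A = 1\n", "!x = y\n", "^B = 2\n"]` gives two chunks, one per caret line. Anything richer, such as the annotated expression matrix wished for in the comment, would have to be built on top of chunks like these.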
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 05:29:15 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 05:29:15 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120929.l8C9TFqk032621@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 05:29 EST ------- We could add an mxTextTools version check to Martel/setup.py using something like this:
    from mx import TextTools
    good = int(TextTools.__version__.split(".")[0]) < 3
Note - I would just print a warning rather than refusing to install. Possibly do a run-time check in Martel/Parser.py, Generate.py and RecordReader.py and raise an ImportError? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 06:29:39 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 06:29:39 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709121029.l8CATdt7004244@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 biopython-bugzilla at maubp.freeserve.co.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         OS/Version|Linux                       |All
------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 06:29 EST ------- Confirmed problem on Windows XP with Python 2.3, changing bug's OS field to All.
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 07:11:03 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 07:11:03 -0400 Subject: [Biopython-dev] [Bug 2362] New: test_copen fails on Windows XP as tries os.fork() Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2362 Summary: test_copen fails on Windows XP as tries os.fork() Product: Biopython Version: 1.43 Platform: PC OS/Version: Windows XP Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk I'm using Biopython from CVS (1.43+), fresh install compiled from source using MSVC 6.0 Output from: python test_copen.py opening handle Traceback (most recent call last): File "test_copen.py", line 14, in ? handle = copen.copen_fn(print_args, *(range(2) + ['a', 'b', 'c'])) File "C:\TEMP\biopython_cvs\biopython_all\biopython\build\lib.win32-2.3\Bio\MultiProc\copen.py", line 66, in copen_fn pid = os.fork() AttributeError: 'module' object has no attribute 'fork' Michiel has seen a similar problem on Windows using Cygwin (hangs rather than the attribute error), see bug 2361 comment 7 and this mailing list post: http://lists.open-bio.org/pipermail/biopython/2007-September/003722.html -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
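The AttributeError reported in bug 2362 comes from calling os.fork() on a platform whose Python build does not provide it. The following is only a rough sketch of one way a caller might guard against that, written in present-day Python. It is not the Bio.MultiProc.copen code, and the helper name `run_in_child` is invented for the example.

```python
import os


def run_in_child(func, *args):
    """Run func(*args) in a forked child where possible (hypothetical helper).

    Returns the child's exit status on platforms that have os.fork(), or
    None when the function had to be run in the current process instead.
    """
    if not hasattr(os, "fork"):
        # For example native Windows builds of Python: no fork(), so fall back.
        func(*args)
        return None
    pid = os.fork()
    if pid == 0:
        # Child process: do the work, then exit without parent cleanup code.
        status = 0
        try:
            func(*args)
        except Exception:
            status = 1
        os._exit(status)
    # Parent process: wait for the child and report its exit status.
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)
```

The design choice here is to test for the attribute rather than for the operating system name, since what matters is whether fork() is available. Whether an in-process fallback is acceptable for copen is a separate question that the bug report does not settle.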
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 07:12:46 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 07:12:46 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709121112.l8CBCkem008143@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #13 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 07:12 EST ------- (In reply to comment #7) > With the patch on bug 2058, I am finding the same five unit test failures on > Cygwin (plus test_copen.py, but that is a Cygwin-specific failure unrelated to > mxTextTools). See Bug 2362, I find test_copen.py fails on Windows XP (non-Cygwin) with an AttributeError rather than hanging. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 07:34:19 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 07:34:19 -0400 Subject: [Biopython-dev] [Bug 2363] New: Bio.Pathway files not stored as plain text in CVS? Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2363 Summary: Bio.Pathway files not stored as plain text in CVS? Product: Biopython Version: Not Applicable Platform: PC OS/Version: Windows XP Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk I've just checked out a fresh clean copy of Biopython from CVS, and built it from source on Windows XP with Python 2.3 and MSVC 6.0 as the compiler. test_pathway.py and test_KEGG.py both failed (complaining about syntax errors on import statements which looked fine).
This can be fixed by editing the files Bio/Pathway/__init__.py and Bio/Pathway/Rep/*.py to use Windows/DOS/PC line endings (odd!) I suspect that the Bio.Pathway python files have been checked into CVS as binary files rather than as plain text. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 08:20:31 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 08:20:31 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709121220.l8CCKVST016224@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #14 from mdehoon at ims.u-tokyo.ac.jp 2007-09-12 08:20 EST ------- I have now uploaded Martel-free parsers for Bio.KEGG.Compound/Enzyme/Map. With these new parsers, test_KEGG.py now passes. I also updated some data files in the Tests/output and Tests/KEGG directories. Three more to go. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 09:19:13 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 09:19:13 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709121319.l8CDJDGH022220@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #15 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 09:19 EST ------- Good work Michiel :) I can confirm using Windows XP, Python 2.3, and mxTextTools 3.0 the following all pass: test_align test_CodonUsage test_Fasta2 test_geo test_KEGG test_NNGene test_SCOP_Astral Note you may need to delete the Mindy index directory Tests\SCOP\scopseq-test\astral-scopdom-seqres-all-test.fa.idx to force its recreation in test_SCOP_Astral.py The following still fail with mxTextTools 3.0, but do work with mxTextTools 2.0: test_format_registry - ReaderError: invalid format starting with '' test_GenBankFormat - AssertionError, "cannot parse" test_Fasta - fails indexing files I'll double check this on Linux in a few hours time. I don't see why test_Fasta is failing, doing "python test_Fasta.py" looks fine. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 11:12:54 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 11:12:54 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709121512.l8CFCsJJ031631@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #16 from mdehoon at ims.u-tokyo.ac.jp 2007-09-12 11:12 EST ------- From looking at test_format_registry, I doubt that this code is still being used by anybody. Rather than banging our heads over how to fix this, I suggest that we remove the corresponding code for the next Biopython release and see if anybody complains. If not, our problem is solved. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 12:34:13 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 12:34:13 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709121634.l8CGYDPB004823@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #17 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 12:34 EST ------- Firstly, I can confirm my findings in comment 15 also apply on Linux. Further to comment 16, I also doubt that anybody uses the GenBank Martel expression which test_GenBankFormat.py checks. Isn't removing this code a little drastic?
We could release Biopython 1.44 with a warning that Martel and some minor parts of Biopython which use it will not work with mxTextTools 3.0. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 16:51:20 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 16:51:20 -0400 Subject: [Biopython-dev] [Bug 2364] New: New version of MeltingTemp.py Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2364 Summary: New version of MeltingTemp.py Product: Biopython Version: Not Applicable Platform: PC OS/Version: Linux Status: NEW Severity: enhancement Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: sbassi at gmail.com Removed string and some cosmetic changes in the code. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 16:52:52 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 16:52:52 -0400 Subject: [Biopython-dev] [Bug 2364] New version of MeltingTemp.py In-Reply-To: Message-ID: <200709122052.l8CKqqpC018192@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2364 ------- Comment #1 from sbassi at gmail.com 2007-09-12 16:52 EST ------- Created an attachment (id=756) --> (http://bugzilla.open-bio.org/attachment.cgi?id=756&action=view) New version of MeltingTemp.py This file should replace old MeltingTemp.py -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 00:44:04 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 00:44:04 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709130444.l8D4i41Q014333@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #18 from mdehoon at ims.u-tokyo.ac.jp 2007-09-13 00:44 EST ------- On my computer, "python test_Fasta.py" does fail. Three of the four tests in test_Fasta.py succeed, the fourth one (DictionaryTest) fails. The error occurs in the index_file function in Bio/Fasta/__init__.py, which is needed to create a Fasta.Dictionary. This code is used to create your own Fasta database, along the lines of the Genbank example in section 4.3.4 in the Tutorial. I think that this stuff can be done more cleanly with the new Bio.SeqIO. I'll ask on the Biopython mailing list if somebody is using index_file. If not, we can deprecate only that function in Bio.Fasta, and remove the corresponding test in test_Fasta.py. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
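Comment #18 proposes deprecating a single function while leaving the rest of the module alone. A minimal sketch of how that is commonly done with the standard `warnings` module follows, in present-day Python. The name `index_file_stub`, its arguments, its return value and the warning text are placeholders for illustration, not the actual Bio.Fasta code.

```python
import warnings


def index_file_stub(filename, indexname):
    """Stand-in for a function that is being phased out (hypothetical)."""
    warnings.warn(
        "index_file_stub is deprecated and may be removed in a later release",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this line
    )
    # The existing behaviour would carry on here unchanged; this placeholder
    # simply echoes its arguments.
    return (filename, indexname)
```

Callers keep working and only see a warning, which fits the "see if anybody complains" approach suggested earlier in the thread. Note that current Python versions hide DeprecationWarning by default outside of test runners and `__main__`, so users may need `-W default` to see it.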
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 04:51:09 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 04:51:09 -0400 Subject: [Biopython-dev] [Bug 2174] FDist Support in BioPython In-Reply-To: Message-ID: <200709130851.l8D8p9de031499@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2174 biopython-bugzilla at maubp.freeserve.co.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #740 is|0                           |1
           obsolete|                            |
------- Comment #5 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 04:51 EST ------- (From update of attachment 740) I checked this in yesterday -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Sep 13 06:55:22 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 06:55:22 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709131055.l8DAtMdr006501@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #19 from mdehoon at ims.u-tokyo.ac.jp 2007-09-13 06:55 EST ------- Looking at this error: test_GenBankFormat - AssertionError, "cannot parse" This error occurs due to the last test in test_GenBankFormat.py. If I remove add_test("ncbi_format", ncbi_format, header_s + record_s1+record_s2+record_s3) then the test passes. I didn't see ncbi_format being used anywhere in Biopython. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 07:27:28 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 07:27:28 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709131127.l8DBRSwv008497@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #20 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 07:27 EST ------- Note that there are other index_file functions in Bio.GenBank and Bio.SwissProt. The Bio.Fasta index_file/Dictionary is also used in several other modules including the SCOP Astral class (indexing a Fasta file to serve as a database). So deprecating it isn't quite as trivial as it could be! For anyone unfamiliar with the details, note that while Bio.SeqIO.to_dict() achieves a similar aim, it is done in memory. The Mindy-based index_file/Dictionary classes parse the file once to create a lookup table on disk, allowing random access to any record in the file. This functionality was probably more important historically (lower memory on desktop computers), and seems to be a midpoint between the simple in-memory dictionary and a full-blown SQL database. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
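The distinction drawn in comment #20, between holding every record in memory and keeping only a lookup table that allows random access into the file, can be sketched roughly as below in present-day Python. This is not how Mindy works internally, and unlike Mindy the lookup table here is an ordinary dict rather than a set of files on disk. The function names `build_offset_index` and `fetch_record` are invented for the example.

```python
def build_offset_index(path):
    """Map each FASTA record id to the byte offset of its '>' line.

    Hypothetical helper: one pass over the file, storing only offsets.
    """
    index = {}
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                # Use the first word after '>' as the lookup key.
                words = line[1:].split()
                if words:
                    index[words[0].decode("ascii")] = offset
    return index


def fetch_record(path, index, key):
    """Seek straight to one record and return (title, sequence)."""
    with open(path, "rb") as handle:
        handle.seek(index[key])
        title = handle.readline()[1:].strip().decode("ascii")
        chunks = []
        for line in handle:
            if line.startswith(b">"):
                # Next record reached, so this one is complete.
                break
            chunks.append(line.strip().decode("ascii"))
    return title, "".join(chunks)
```

Only the offsets are kept, so memory use grows with the number of records rather than with total sequence length. Writing the index out to disk, as the Mindy-based classes do, would be a further step that this sketch leaves out.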
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 07:57:03 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 07:57:03 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709131157.l8DBv3bO010568@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #21 from mdehoon at ims.u-tokyo.ac.jp 2007-09-13 07:57 EST ------- > The Bio.Fasta index_file/Dictionary is also used in several other modules > including the SCOP Astral class (indexing a Fasta file to serve as a database). > So deprecating it isn't quite as trivial as it could be! Yes, but these can be trivially replaced by the corresponding Bio.SeqIO code. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Sep 13 08:04:16 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 08:04:16 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709131204.l8DC4G2V011062@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #22 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 08:04 EST ------- Following up comment 19, if the error handler in Tests/martel_support.py is commented out, we can actually see the stack error triggered in test_GenBankFormat.py as follows: Traceback (most recent call last): File "C:\temp\biopython_cvs\biopython_all\biopython\Tests\test_GenBankFormat.py", line 603, in -toplevel- test_list.test() File "C:\temp\biopython_cvs\biopython_all\biopython\Tests\martel_support.py", line 41, in test parser.parseString(s) File 
"c:\python23\lib\site-packages\Martel\Parser.py", line 557, in parseString self.parseFile(strfile) File "c:\python23\lib\site-packages\Martel\Parser.py", line 587, in parseFile self._err_handler.fatalError(exc) File "c:\python23\lib\xml\sax\handler.py", line 38, in fatalError raise exception ParserRecordException: Traceback (most recent call last): File "c:\python23\lib\site-packages\Martel\Parser.py", line 578, in parseFile header = header_reader.next() File "C:\TEMP\biopython_cvs\biopython_all\biopython\build\lib.win32-2.3\Martel\RecordReader.py", line 413, in next positions = _find_end_positions(lookahead, _tag_lines_tagtable) File "C:\TEMP\biopython_cvs\biopython_all\biopython\build\lib.win32-2.3\Martel\RecordReader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' This error is very similar to some of the others in the original bug report (parsers we have since moved to pure python). Looking at the stack for this in IDLE, the Martel.Parser.parseFile() function has a cStringIO.StringI object as its fileobj variable. I signed up to the egenix mailing list, and asked them to clarify what they meant by "Removed support for buffer-compatible input objects" in their change log, and specifically if this meant we can't use Python's StringIO handles? The reply was: > Yes, we had to do this as a result of the restructuring of the > underlying code which no longer works on a char* pointer, but > instead uses the object type information to see whether it needs > to compile a Unicode tag table or a string one. I suspect the use of StringIO / cStringIO in Biopython would explains most/all of the Martel based test failures. I'm not sure if we can work around this... -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 08:36:51 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 08:36:51 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709131236.l8DCapcZ012966@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #23 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 08:36 EST ------- Follow up from egenix - I'm not sure quite what this means in Martel... I asked: > Do you have any suggested workarounds for using mxTextTools to parse > data held in a string (rather than read from a handle to an opened file)? And Marc-Andre Lemburg replied: > I think I lost you there :-) > > mxTextTools *does* work on Python strings and Unicode. It no longer works > on objects that just expose the buffer API. We'll likely add support for > that at some later stage, but for now, the Unicode support was more > important to get right. > > You can easily convert a StringIO instance to a Python string using > .getvalue() method. > > For larger amounts of data, it's also a good idea to process the data > in chunks. mxTextTools allows for this by returning the index of where > it stopped parsing the input. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
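The workaround suggested in the reply above (hand the parser a real string rather than a StringIO handle) could look something like the following. This is only a sketch under assumptions: it is written for current Python with io.StringIO rather than the Python 2 cStringIO discussed in this thread, it does not touch Martel or mxTextTools at all, and as_text is a made-up helper name.

```python
import io


def as_text(source):
    """Return plain string data from a str, a StringIO-like object, or a handle.

    Illustrative helper only: a str is passed through unchanged, an object
    with getvalue() (such as io.StringIO) is converted with that method,
    and anything else is assumed to be a file-like handle and is read.
    """
    if isinstance(source, str):
        return source
    getvalue = getattr(source, "getvalue", None)
    if callable(getvalue):
        return getvalue()
    return source.read()


def demo():
    """Show that a string and a StringIO handle yield the same text."""
    text = "LOCUS       TEST\n//\n"
    handle = io.StringIO(text)
    return as_text(text) == as_text(handle) == text
```

Note that getvalue() returns the whole buffer regardless of the current read position, whereas read() only returns what is left after that position, so the two are not interchangeable for a handle that has already been partly consumed.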
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 18:38:32 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 18:38:32 -0400 Subject: [Biopython-dev] [Bug 2348] Slicing the Seq object (returns a string when use a stride) In-Reply-To: Message-ID: <200709132238.l8DMcWBc028839@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2348 ------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 18:38 EST ------- P.S. I still think slicing a Seq object with a stride should return another Seq object, but some of the functions/methods in Bio/Seq.py actually expected a string. I have now fixed those, and extended test_seq.py to actually check these functions. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Sep 14 05:17:50 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Sep 2007 05:17:50 -0400 Subject: [Biopython-dev] [Bug 2366] New: Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2366 Summary: Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq Product: Biopython Version: Not Applicable Platform: PC OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk Currently the (Reverse)complement functions/methods in Bio.Seq do NOT support ambiguous nucleotides. For example, the complement of H={ACU} should be D={UGA}. I'll upload a patch to Bio/Seq.py and its unit test in a moment... bugzilla doesn't let you do this as part of filing a bug. 
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Sep 14 05:21:59 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Sep 2007 05:21:59 -0400 Subject: [Biopython-dev] [Bug 2366] Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq In-Reply-To: Message-ID: <200709140921.l8E9LxDf004471@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2366 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-14 05:21 EST ------- Created an attachment (id=758) --> (http://bugzilla.open-bio.org/attachment.cgi?id=758&action=view) Patch to Bio/Seq.py and Tests/test_seq.py and Tests/output/test_se
* Fixes (reverse) complement of ambiguous sequences
* Removes some code duplication (at the cost of extra function calls)
* Adds some missing doc strings
* Includes a mini-test in Bio/Seq.py (which can be removed)
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Sep 14 05:39:00 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Sep 2007 05:39:00 -0400 Subject: [Biopython-dev] [Bug 2364] New version of MeltingTemp.py In-Reply-To: Message-ID: <200709140939.l8E9d0UZ005615@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2364 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-14 05:38 EST ------- So this updates Bio/SeqUtils/MeltingTemp.py to use string methods instead of the string module. Seems fine to me. 
It may just be my imagination (working on Linux), but it seems Bio/SeqUtils/MeltingTemp.py has been checked into CVS as a binary file with Windows/DOS new lines. After running dos2unix on things I can get a sensible diff between local copies. If I run unix2dos on your new version, then cvs diff gives sensible output. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Sep 14 06:25:58 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Sep 2007 06:25:58 -0400 Subject: [Biopython-dev] [Bug 2366] Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq In-Reply-To: Message-ID: <200709141025.l8EAPwWk009283@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2366 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-14 06:25 EST ------- Part of my updated test_seq.py unit test fails when run with the entire test suite; it appears some other unit test is polluting the Bio.Data.IUPACData.ambiguous_dna_values dictionary. Adding this to test_seq.py (after applying the patch) seems to fix this:
    # When running the full test suite, some other unit test is polluting this dict:
    for ambig_char in ["-", "?"]:
        if ambig_char in ambiguous_dna_values:
            del ambiguous_dna_values[ambig_char]
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
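To make the ambiguous complement rule from bug 2366 concrete (the complement of H should be D), here is a small self-contained sketch. It is not the patch attached to the bug and does not use Bio.Seq or Bio.Data.IUPACData; the IUPAC DNA ambiguity table is typed out by hand, T is used in place of the U from the bug text, and the names AMBIGUOUS_DNA, build_complement_table, complement and reverse_complement are chosen for this example only.

```python
# IUPAC ambiguity codes for DNA, written out by hand for this sketch.
AMBIGUOUS_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT",
    "N": "ACGT",
}

# Watson-Crick pairs for the four unambiguous bases.
BASE_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def build_complement_table(ambiguous_values):
    """Derive the complement of each code from the set of bases it stands for."""
    code_for_set = {
        frozenset(bases): code for code, bases in ambiguous_values.items()
    }
    table = {}
    for code, bases in ambiguous_values.items():
        complemented = frozenset(BASE_COMPLEMENT[base] for base in bases)
        table[code] = code_for_set[complemented]
    return table


COMPLEMENT = build_complement_table(AMBIGUOUS_DNA)


def complement(seq):
    """Complement a DNA string, including ambiguity codes (upper-cased)."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq):
    """Reverse complement of a DNA string, including ambiguity codes."""
    return complement(seq)[::-1]
```

Deriving the table from the base sets, instead of typing the complement pairs by hand, also shows why stray keys such as "-" or "?" in a shared ambiguity dictionary would be a problem: any entry whose bases have no Watson-Crick partner makes the table construction fail.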
From bugzilla-daemon at portal.open-bio.org Fri Sep 14 08:53:00 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Sep 2007 08:53:00 -0400 Subject: [Biopython-dev] [Bug 2364] New version of MeltingTemp.py In-Reply-To: Message-ID: <200709141253.l8ECr0hP020991@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2364 ------- Comment #3 from sbassi at gmail.com 2007-09-14 08:52 EST ------- (In reply to comment #2) > It may just be my imagination (working on Linux), but it seems > Bio/SeqUtils/MeltingTemp.py has been checked into CVS as a binary file with > Windows/DOS new lines. The original version was made under Windows, now I work with Linux. I evolved, and so did my code and platform :) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Sep 14 09:51:28 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Sep 2007 09:51:28 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709141351.l8EDpShE025015@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #24 from mdehoon at ims.u-tokyo.ac.jp 2007-09-14 09:51 EST ------- Would it be possible to remove the dependence of Bio.SeqIO on Bio.GenBank? I am trying to disentangle the mxTextTools-dependent stuff from the code unaffected by the recent mxTextTools update. Often, the easiest way to do this is to replace the Martel-dependent code with Bio.SeqIO (for example, see my update of Bio/SeqUtils/__init__.py). But if Bio.SeqIO then relies on a Martel-based parser, we're back to square one. 
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Fri Sep 14 15:29:18 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Sep 2007 15:29:18 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709141929.l8EJTIFT012178@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #25 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-14 15:29 EST ------- Bio.SeqIO does depend on Bio.GenBank for both EMBL and GenBank parsing. The good news is the only bit of Bio.GenBank which depends on Martel is the index_file() function and Dictionary class in Bio/GenBank/__init__.py which work in the same way as the equivalent functions in Bio.Fasta. Note that I did find one excess import statement in Bio/GenBank/__init__.py which I have now removed in CVS revision 1.74 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sun Sep 16 07:25:52 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 16 Sep 2007 07:25:52 -0400 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200709161125.l8GBPqLc012096@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 biopython-bugzilla at maubp.freeserve.co.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         OS/Version|Windows XP                  |All
           Platform|PC                          |All
            Summary|Bio.Pathway files not stored|Some python files not stored
                   |as plain text in CVS?       |as plain text in CVS?
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-16 07:25 EST ------- Additionally, Sebastián Bassi's Bio/SeqUtils/MeltingTemp.py appears to be stored in CVS with DOS/Windows newlines. So far this has only caused problems with the diff command. See bug 2364. And in the other direction, Doc/Images/BlastRecord.png, PSIBlastRecord.png and smcra.png appear to be checked in as text: They work fine on Linux, but when checked out on Windows the images are corrupt. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sun Sep 16 07:27:47 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 16 Sep 2007 07:27:47 -0400 Subject: [Biopython-dev] [Bug 2364] New version of MeltingTemp.py In-Reply-To: Message-ID: <200709161127.l8GBRlF1012214@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2364 biopython-bugzilla at maubp.freeserve.co.uk changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-16 07:27 EST ------- I've checked this in, /Bio/SeqUtils/MeltingTemp.py revision 1.6 - Thanks Sebastian. I've made a note of the new line problem (CVS text vs. binary) on Bug 2363 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Sep 17 06:25:48 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Sep 2007 06:25:48 -0400 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200709171025.l8HAPm6R011219@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-17 06:25 EST ------- Also Bio\ECell\__init__.py seems to need its new lines "fixed" for the unit tests to pass on Windows. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Sep 17 10:36:14 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Sep 2007 10:36:14 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709171436.l8HEaECE028354@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #26 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-17 10:36 EST ------- Marked the index_file() function and Dictionary classes in Bio.Fasta and Bio.GenBank as deprecated, and removed the corresponding test in test_Fasta.py. test_Fasta.py now passes with mxTextTools 3.0 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
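As an illustration of what marking a function as deprecated can look like in Python, here is a minimal sketch using the standard warnings module. It is not the code that was committed to Bio.Fasta or Bio.GenBank: this index_file is a hypothetical stand-in that does no indexing, and the warning text and the helper call_and_collect_warnings are invented for the example.

```python
import warnings


def index_file(filename, indexname):
    """Hypothetical stand-in for a deprecated indexing function."""
    warnings.warn(
        "index_file is deprecated; consider building a dictionary "
        "with Bio.SeqIO.to_dict() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this function
    )
    return None  # the real work is omitted in this sketch


def call_and_collect_warnings():
    """Call the deprecated function and return the warning messages it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        index_file("example.fasta", "example.idx")
    return [
        str(item.message)
        for item in caught
        if issubclass(item.category, DeprecationWarning)
    ]
```

Issuing the warning inside the function body, rather than at module import time, means that code which merely imports the module does not see it; only callers of the deprecated function do.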
From bugzilla-daemon at portal.open-bio.org Mon Sep 17 10:40:55 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Sep 2007 10:40:55 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709171440.l8HEetmT028671@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #27 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-17 10:40 EST ------- Created an attachment (id=759) --> (http://bugzilla.open-bio.org/attachment.cgi?id=759&action=view) Patch to remove Bio.FormatIO This patch affects the following files: Bio/FormatIO.py Now just a place holder that raises an ImportError, to help anyone work out what is wrong if they have any old code using Bio.FormatIO Bio/SeqRecord.py Removes the code which needed Bio.FormatIO, means Bio.SeqRecord.io is no longer defined. Bio/Search.py Removes the code which needed Bio.FormatIO, means Bio.Search.io is no longer defined. Bio/Search.py is still used from Bio/builders/Search/search.py and that appears to be OK still (?) Tests/test_format_registry.py Removed bits using Bio.SeqRecord.io and Bio.Search.io I think this means test_format_registry.py now passes with mxTextTools 3.0 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Sep 17 11:48:49 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Sep 2007 11:48:49 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709171548.l8HFmnUM001364@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #28 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-17 11:48 EST ------- Following comment 26, I have updated Bio/SCOP/__init__.py to use Bio.SeqIO.to_dict() instead of Bio.Fasta.index_file() and the Bio.Fasta.Dictionary class. Now test_SCOP_Astral.py passes without triggering the deprecation warnings I added to Bio.Fasta. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From mdehoon at c2b2.columbia.edu Tue Sep 18 04:25:47 2007 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Tue, 18 Sep 2007 04:25:47 -0400 Subject: [Biopython-dev] Status of the upcoming release Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B623@mail2.exch.c2b2.columbia.edu> Hi everybody, Originally I was planning to create a new Biopython release during last weekend. However, as you have seen from the discussions on the mailing list, while we were preparing for this new release, we discovered that Biopython does not work well with the new version of mxTextTools (3.0). This code is being used by Martel, which is used for various parsers in Biopython. In particular Peter and I have been trying to find solutions for this problem, but we're not quite there yet. Currently, I am getting two remaining errors from the Biopython test suite (I believe there were ten when we started). I feel that we should postpone the release until we sort this out. 
The difficulty of solving these bugs is that they are located in various interdependent modules. None of the currently active developers are familiar with this code. To make matters worse, some of the code cannot even be deprecated without causing spurious deprecation warnings all over Biopython (even in totally unrelated code). On the bright side, there seem to be few (if any) users of the code that are causing the mxTextTools problems. Therefore I think that in practice, few users will actually run into problems if we remove the offending modules. So it may not be worth banging our heads over this. Unfortunately I will be out of town for the next ten days (I had been hoping to finish the release before), so I'm afraid the next release will have to wait until after that. In the mean time, feel free to download current Biopython versions from CVS to see if all your favorite modules are still there. If not, let us know which module you'd like to retain (and why). --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From bugzilla-daemon at portal.open-bio.org Mon Sep 24 10:08:51 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 24 Sep 2007 10:08:51 -0400 Subject: [Biopython-dev] [Bug 2372] New: installing with non-admin permissions Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2372 Summary: installing with non-admin permissions Product: Biopython Version: Not Applicable Platform: Other OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: gould at embl.de I have a scenario where I want to install python 2.5 and biopython 1.43(also dependencies egenix-mx-base-3.0.0 and Numeric-24.2) in a non-standard install directory as I have only non-admin permissions on a particular machine. 
I have selected a single directory into which I have installed everything with the PATH env variable now pointing to this version of python as opposed to one in /usr/bin. I have followed the instructions as per: http://biopython.org/DIST/docs/install/Installation.html However, there seems to be something missing as some of the tests in biopython 1.43 fail as outlined below: ERROR: test_CodonUsage ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_CodonUsage.py", line 10, in ? X.generate_index("./CodonUsage/HighlyExpressedGenes.txt") File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/SeqUtils/CodonUsage.py", line 74, in generate_index self._count_codons(FastaFile) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/SeqUtils/CodonUsage.py", line 117, in _count_codons cur_record = iterator.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_Fasta2 ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_Fasta2.py", 
line 44, in ? data = record_parser.parse( src_handle ) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Fasta/__init__.py", line 100, in parse return self.convert_lax(iterator.next()) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_KEGG ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_KEGG.py", line 67, in ? 
t_KEGG_Enzyme(test_KEGG_Enzyme_files) File "test_KEGG.py", line 23, in t_KEGG_Enzyme record = records.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/KEGG/Enzyme/__init__.py", line 225, in next data = self._reader.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/RecordReader.py", line 295, in next positions = _find_end_positions(lookahead, self.tagtable) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/RecordReader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' ====================================================================== ERROR: test_align ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_align.py", line 129, in ? 
alignment = FastaAlign.parse_file(to_parse, 'PROTEIN') File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Fasta/FastaAlign.py", line 48, in parse_file cur_align = iterator.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_format_registry ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_format_registry.py", line 49, in ? 
parser.parseFile(_open('EDD_RAT.dat')) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 452, in parseFile self._err_handler.fatalError(ParserRecordException( File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserRecordException: Traceback (most recent call last): File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 444, in parseFile record = reader.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/RecordReader.py", line 295, in next positions = _find_end_positions(lookahead, self.tagtable) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/RecordReader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' ====================================================================== ERROR: test_geo ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_geo.py", line 24, in ? 
record = records.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Geo/__init__.py", line 79, in next return self._parser.parse(File.StringHandle(data)) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Geo/__init__.py", line 228, in parse self._scanner.feed(handle, self._consumer) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Geo/__init__.py", line 126, in feed self._parser.parseFile(handle) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 328, in parseFile self.parseString(fileobj.read()) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 1427 ====================================================================== FAIL: test_Fasta ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'Basic operation of the Record Parser. ... ERROR\n' Expected: 'Basic operation of the Record Parser. ... 
ok\n' ====================================================================== FAIL: test_GenBankFormat ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 166, in runSafeTest cur_test.run_tests([]) File "test_GenBankFormat.py", line 588, in run_tests test_list.test() File "martel_support.py", line 51, in test raise AssertionError, "cannot parse" AssertionError: cannot parse ====================================================================== FAIL: test_NNGene ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'Find all motifs in a set of sequences. ... ERROR\n' Expected: 'Find all motifs in a set of sequences. ... ok\n' ====================================================================== FAIL: test_SCOP_Astral ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'testConstructWithCustomFile (test_SCOP_Astral.AstralTests) ... ERROR\n' Expected: 'testConstructWithCustomFile (test_SCOP_Astral.AstralTests) ... ok\n' ---------------------------------------------------------------------- Ran 92 tests in 81.465s I've tried various versions of python with biopython but without success at getting these tests to run. 
I basically need the following piece of code to run:
from Bio.WWW import ExPASy
from Bio.SwissProt import SProt
from Bio import File
results = ExPASy.get_sprot_raw('P12931')
all_results = results.read()
sp_parser = SProt.RecordParser()
sp_iterator = SProt.Iterator(File.StringHandle(all_results), sp_parser)
Record = sp_iterator.next()
but it crashes out at the last line with error:
File "/g/gibson/gould/submaster/python/lib/python2.4/site-packages/Bio/ParserSupport.py", line 300, in read_and_call
raise SyntaxError, errmsg
SyntaxError: Line does not start with 'SQ': PE 1: Evidence at protein level;
any suggestions as to what the problem might be would be appreciated. thanks in advance -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Sep 24 10:24:34 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 24 Sep 2007 10:24:34 -0400 Subject: [Biopython-dev] [Bug 2372] installing with non-admin permissions In-Reply-To: Message-ID: <200709241424.l8OEOYsj011002@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2372 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |DUPLICATE ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-24 10:24 EST ------- This isn't a problem with the install location - it looks like a duplicate of bug 2361, egenix-mx-base-3.0.0 isn't fully backwards compatible. We hope to have a new release out within a few weeks which will address (most of) the egenix mxTextTools trouble; however if you don't want to wait then you could install biopython from CVS.
Your short example does work for me using Biopython CVS and egenix base 3.0.0 *** This bug has been marked as a duplicate of bug 2361 *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Sep 24 10:24:37 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 24 Sep 2007 10:24:37 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709241424.l8OEObYj011015@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |gould at embl.de ------- Comment #29 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-24 10:24 EST ------- *** Bug 2372 has been marked as a duplicate of this bug. *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From biopython-dev at maubp.freeserve.co.uk Tue Sep 25 07:15:48 2007 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Tue, 25 Sep 2007 12:15:48 +0100 Subject: [Biopython-dev] poor man's databases for large sequence files In-Reply-To: <46F8C37A.1000005@maubp.freeserve.co.uk> References: <46F83061.3090207@maubp.freeserve.co.uk> <46F86705.1090109@mail.nih.gov> <46F8C37A.1000005@maubp.freeserve.co.uk> Message-ID: <46F8EDE4.7030702@maubp.freeserve.co.uk> On the discussion list I wrote: > I've been thinking about extending Bio.SeqIO to support a (read only) > dictionary like interface for large sequence files (WITHOUT having > everything in memory). 
> > Some of the older Biopython sequence format specific modules have an > index_file function and matching Dictionary class to do this (based > internally on either Martel/Mindy or a DIY Biopython indexer based on > pickle).
Some thoughts and timings using Bio.SwissProt.SProt, and the 1.1 GB UniProt file. I have enough RAM that Linux has probably cached the entire flat file for me. Just in case, I have run these timings a few times to be fair. Note that just counting the records takes about 6 mins using the SeqRecord parser. I think we can do a lot better. Anyway, I wanted to talk about indexing files as simple read only databases.
Using the current (old) SProt indexing functions:
index_file - about 7 or 8 mins, one file of 34 MB (small!)
Dictionary - about 16s
random access - well under 0.1s
This old code works using Bio.Index to store the start (seek position) and length of each record (as determined by parsing the entire file) using cPickle. In theory, any sequential file format could be handled this way - provided the parser leaves the handle's seek position in a sensible place when returning records. This approach will not work for non-sequential file formats (e.g. most alignments).
My experimental code instead stores every SeqRecord object in full using cPickle (in one large file), and the seek positions for these pickled records in a second small index file (as a dict stored with cPickle).
Experimental code with pickled SeqRecord objects:
indexing file - about 7 or 8 mins (similar), two files, 554 MB (big!)
loading index - under 1s (much faster)
random access - well under 0.1s (similar, maybe faster)
This approach will work on any file format (and even for objects other than SeqRecord objects, provided they can be pickled). It seems to be a lot faster when loading the index, at the expense of requiring a LARGE index file. The indexing times for the two methods are very similar - about 6 mins of this is parsing the records in the first place.
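The experimental two-file layout described above (pickled records in one file, a pickled dict of seek positions in another) could be sketched roughly as follows. This is only an illustration in present-day Python 3 rather than the Python 2.4/cPickle of the thread; plain dicts stand in for SeqRecord objects, and the names build_index and fetch and the file names are hypothetical, not Biopython functions.

```python
import os
import pickle
import tempfile


def build_index(records, data_path, index_path):
    """Pickle each record into data_path and save key -> seek offset in index_path."""
    offsets = {}
    with open(data_path, "wb") as data:
        for key, record in records:
            offsets[key] = data.tell()      # position where this pickle starts
            pickle.dump(record, data)
    with open(index_path, "wb") as index:
        pickle.dump(offsets, index)         # the small second file


def fetch(key, data_path, index_path):
    """Load the offsets dict, seek to one record and unpickle only that record."""
    with open(index_path, "rb") as index:
        offsets = pickle.load(index)
    with open(data_path, "rb") as data:
        data.seek(offsets[key])
        return pickle.load(data)


# Toy data standing in for parsed records (hypothetical identifiers).
workdir = tempfile.mkdtemp()
data_file = os.path.join(workdir, "records.pkl")
index_file = os.path.join(workdir, "records.idx")
toy = [("rec1", {"id": "rec1", "seq": "MKVLAA"}),
       ("rec2", {"id": "rec2", "seq": "MKV"})]
build_index(toy, data_file, index_file)
result = fetch("rec2", data_file, index_file)
```

A real dictionary-like wrapper would presumably load the offsets once and keep the data file open, which is where the "under 1s to load, well under 0.1s per lookup" pattern would come from. As with any pickle-based store, it should only be pointed at files you created yourself.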
I haven't yet looked at using the python shelve library to provide a read only dictionary. Also python's marshal library may be useful. Then there is the Mindy back end, used in Bio.Fasta and Bio.GenBank for their index_file and Dictionary classes (which replaced previous Bio.Index based code). I haven't timed these. Peter P.S. Using any of pickle, shelve or marshal does leave a potential security hole if anyone could prepare a malicious index file. From biopython-dev at maubp.freeserve.co.uk Tue Sep 25 14:48:37 2007 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Tue, 25 Sep 2007 19:48:37 +0100 Subject: [Biopython-dev] [BioPython] poor man's databases for large sequence files In-Reply-To: <46F8F3E5.5020802@mail.nih.gov> References: <46F83061.3090207@maubp.freeserve.co.uk> <46F86705.1090109@mail.nih.gov> <46F8C37A.1000005@maubp.freeserve.co.uk> <46F8F3E5.5020802@mail.nih.gov> Message-ID: <46F95805.5030906@maubp.freeserve.co.uk> I wrote: >> What I had in mind was say indexing all of UniProt which is currently >> 1.1 GB in the SwissProt flat file format, but each record is pretty small. 
I have written some experimental code to store SeqRecord objects using pickle (and zlib), and tried this on the 283454 UniProt records from here (both fasta and swiss-prot flat file format):
ftp://ftp.uniprot.org/pub/databases/uniprot_datafiles_by_format/fasta/uniprot_sprot.fasta.gz
ftp://ftp.uniprot.org/pub/databases/uniprot_datafiles_by_format/flatfile/uniprot_sprot.dat.gz
Fasta file, "uniprot_sprot.fasta", 125 MB
* my pickled SeqRecord database needs about 230 MB (two files), takes about 30s to build the index, 1s to load it
* my zlib-pickled SeqRecord database needs about 147 MB (two files), takes about 75s to build the index, 2s to load it
* existing Bio.Fasta index using Mindy needs 73 MB (four files) takes about 90s to build the index, 2s to load it
SwissProt file, "uniprot_sprot.dat", 1.1 GB
* my pickled SeqRecord database needs about 550 MB (two files) takes about 7min to build the index, 1s to load it
* my zlib-pickled SeqRecord database needs about 295 MB (two files) takes about 8min to build the index, 3s to load it
* existing Bio.SwissProt.SProt index needs only 35 MB (one file) takes about 7.5min to build the index, 16s to load it
Note that just parsing the big SwissProt format file takes about 6min, indexing it adds only a comparatively modest overhead. In all cases, once the index has been built and loaded, accessing records by key is almost instantaneous. In terms of run time, my experimental (zlib) pickled read only dictionary is comparable to the existing Biopython functionality - they are both sub-second. However, is the overhead of the bigger index files too much? We appear to be talking about between twice and ten times the size required by the old format specific indexing. Comments? The reason my index files are big is that I am storing complete records - not just their position within the original file. The motivation is this will work with any file format (regardless of the parser), or even any collection of records.
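For readers wondering what "zlib-pickled" amounts to, here is a minimal sketch under the assumption that each record is pickled and then compressed individually. It is written for Python 3, uses a plain dict in place of a SeqRecord, and the helper names pack and unpack are made up for illustration.

```python
import pickle
import zlib


def pack(record):
    """Pickle a record, then compress the resulting bytes with zlib."""
    return zlib.compress(pickle.dumps(record))


def unpack(blob):
    """Reverse of pack(): decompress, then unpickle."""
    return pickle.loads(zlib.decompress(blob))


# A deliberately repetitive toy record, so the size difference is visible.
record = {"id": "rec1", "description": "toy entry", "seq": "ACGT" * 500}
plain = pickle.dumps(record)
packed = pack(record)
restored = unpack(packed)
```

With zlib.decompress the exact bytes of each blob are needed, so an index built this way would probably have to store a length alongside each offset. The extra compression step is also a plausible reason for the longer build times quoted above, although the thread does not break the timings down.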
Peter From bugzilla-daemon at portal.open-bio.org Tue Sep 25 16:56:52 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 25 Sep 2007 16:56:52 -0400 Subject: [Biopython-dev] [Bug 1944] Align.Generic adding iterator and more In-Reply-To: Message-ID: <200709252056.l8PKuq8N007917@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1944 ------- Comment #10 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-25 16:56 EST ------- I've checked in the __iter__ part of the patch, which addresses the main thrust of this bug. I have not yet checked in the __getitem__ bit because I think the behaviour of the splicing options should match whatever we decide to do for SeqRecord and Seq objects. I'm currently considering creating a new Alignment class to live in Bio/Align/__init__.py (which will make it easier to import - much more discoverable) which would subclass list directly. In particular I want to allow creation of an alignment directly from a list/iterator/generator of SeqRecord objects - something impossible with the current __init__ arguments. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
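The idea in comment #10 of an alignment that subclasses list and can be built from any list, iterator or generator might look something like the sketch below. It is not the attached patch: it is written for Python 3, (id, sequence) tuples stand in for SeqRecord objects, only length (not alphabet) is checked, and the class name ToyAlignment is hypothetical.

```python
class ToyAlignment(list):
    """List of (id, sequence) pairs that must all share one sequence length."""

    def __init__(self, records=()):
        list.__init__(self)
        self.extend(records)            # accepts a list, iterator or generator

    def _check(self, record):
        name, seq = record
        # The first row fixes the alignment length; later rows must match it.
        if self and len(seq) != len(self[0][1]):
            raise ValueError("%s has length %d, expected %d"
                             % (name, len(seq), len(self[0][1])))

    def append(self, record):
        self._check(record)
        list.append(self, record)

    def extend(self, records):
        for record in records:
            self.append(record)


# Build directly from a generator, then try to add a row of the wrong length.
rows = (("seq%d" % i, s) for i, s in enumerate(["ACGT", "AC-T", "A-GT"]))
alignment = ToyAlignment(rows)
try:
    alignment.append(("bad", "ACGTTT"))
    rejected = False
except ValueError:
    rejected = True
```

Note that other list mutators (insert, item assignment, +=) are not overridden here and would bypass the check, so a real class would need to decide how to handle editing.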
From bugzilla-daemon at portal.open-bio.org Tue Sep 25 18:19:32 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 25 Sep 2007 18:19:32 -0400 Subject: [Biopython-dev] [Bug 1944] Align.Generic adding iterator and more In-Reply-To: Message-ID: <200709252219.l8PMJWvc012510@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1944 ------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-25 18:19 EST ------- Created an attachment (id=768) --> (http://bugzilla.open-bio.org/attachment.cgi?id=768&action=view) Replacement Bio/Align/__init__.py (alignment update v4) This is based on attachment 732, my 3rd version of a patch to Bio/Align/Generic.py but handled as new alignment class in Bio/Align/__init__.py This implements a new alignment class which: * directly subclasses the python list (as a list of SeqRecords) * allows flexible subscripting using __getitem__ * enforces strict alphabet and length checking in __init__, append and extend There is plenty more polish needed - including tackling tricky questions like __setitem__ (or __setslice__) and the related questions about editing alignments. As per my comment 10, I would like to get SeqRecord to support splicing giving SeqRecords with (partial) annotation. If this is done, then the alignment class can exploit this (i.e. only have one set of code dealing with the annotation when splicing SeqRecords). Right now only the id/name/description are preserved when splicing alignments. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 26 00:50:09 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Sep 2007 00:50:09 -0400 Subject: [Biopython-dev] [Bug 2374] New: Uppdated lcc code. 
Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2374 Summary: Uppdated lcc code. Product: Biopython Version: 1.43 Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: sbassi at gmail.com Here is a revised version of the lcc code. changes: 1) clean up some code (removed global var, string module). 2) works for both lower and uppercase sequences. 3) both functions inside this module expect just the sequence to calculate the lcc and not a sequence to be sliced. So now is up to the coder to pass the string sliced. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 26 00:52:33 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Sep 2007 00:52:33 -0400 Subject: [Biopython-dev] [Bug 2374] Uppdated lcc code. In-Reply-To: Message-ID: <200709260452.l8Q4qXS5032351@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2374 ------- Comment #1 from sbassi at gmail.com 2007-09-26 00:52 EST ------- Created an attachment (id=769) --> (http://bugzilla.open-bio.org/attachment.cgi?id=769&action=view) New version of LCC -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 26 04:03:56 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Sep 2007 04:03:56 -0400 Subject: [Biopython-dev] [Bug 2374] Uppdated lcc code. 
In-Reply-To: Message-ID: <200709260803.l8Q83ubg009768@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2374 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-26 04:03 EST ------- Could you change the docstrings to follow PEP 257 more closely? http://www.python.org/dev/peps/pep-0257/ In particular I'd like it to: * explicitly say if the input can be either a Seq object or a plain string. * state wsize should be an integer * describe the return value (list of floats, and float, I believe?) * give the full name - which I am guessing is low composition complexity (LCC) Would make sense to move this from Bio/lcc.py to Bio/SeqUtils/lcc.py (like Michiel recently moved the crc.py module). Would you have any objections to this? The code clearly only looks at ACTG; extending it to unambiguous nucleotides is possible right (DNA or RNA)? What about ambiguous nucleotides? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
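As an illustration of the docstring points raised in comment #2 (accepted input types, wsize being an integer, and the return value), here is a sketch of a PEP 257 style docstring on a stand-in function. The function name composition_complexity and its entropy-based score are assumptions made for this example; it is not the attached lcc code and is not claimed to reproduce its formula.

```python
import math


def composition_complexity(seq, wsize):
    """Return a per-window composition complexity profile.

    Arguments:
    seq -- a plain string, or any object whose str() gives the sequence;
           upper and lower case letters are treated alike.
    wsize -- window size, a positive integer no larger than len(seq).

    Returns a list of floats, one per window start position.  Each value
    lies between 0.0 (a single repeated base) and 1.0 (A, C, G and T
    equally frequent).  Characters other than A, C, G and T are ignored
    when counting.
    """
    text = str(seq).upper()
    scores = []
    for start in range(len(text) - wsize + 1):
        window = text[start:start + wsize]
        score = 0.0
        for base in "ACGT":
            freq = window.count(base) / float(wsize)
            if freq > 0:
                # Shannon entropy term, using log base 4 so the maximum is 1.0.
                score -= freq * math.log(freq) / math.log(4)
        scores.append(score)
    return scores


profile = composition_complexity("aaaaACGT", 4)
```

The first line is a one-sentence summary, followed by a blank line and then the argument and return descriptions, which is the layout PEP 257 recommends for multi-line docstrings.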
From bugzilla-daemon at portal.open-bio.org Thu Sep 27 13:36:51 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Sep 2007 13:36:51 -0400 Subject: [Biopython-dev] [Bug 1944] Align.Generic adding iterator and more In-Reply-To: Message-ID: <200709271736.l8RHaphd019477@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1944 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #768 is|0 |1 obsolete| | ------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-27 13:36 EST ------- Created an attachment (id=770) --> (http://bugzilla.open-bio.org/attachment.cgi?id=770&action=view) Replacement Bio/Align/__init__.py (alignment update v5) This implements a new alignment class which: * directly subclasses the python list (as a list of SeqRecords) * should be fully backwards compatible with Bio.Align.Generic.Alignment * implements __str__ and __repr__ methods which are usable on large alignments * allows flexible subscripting using __getitem__ * enforces strict alphabet and length checking in __init__, append, extend, __add__ and __radd__ (the last two give list like addition) Provisos from comment 11 still apply. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From biopython-dev at maubp.freeserve.co.uk Sat Sep 29 08:02:02 2007 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Sat, 29 Sep 2007 13:02:02 +0100 Subject: [Biopython-dev] Code review?
Reverse complements etc Message-ID: <46FE3EBA.1010907@maubp.freeserve.co.uk> Would anyone have a chance to go over my patch on Bug 2366, Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq http://bugzilla.open-bio.org/show_bug.cgi?id=2366 It would be great to have some comments on this before Michiel starts getting Biopython 1.44 ready. Thanks Peter
From bugzilla-daemon at portal.open-bio.org Sun Sep 9 21:54:31 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 9 Sep 2007 17:54:31 -0400 Subject: [Biopython-dev] [Bug 2351] Make SeqRecord subclass Seq subclass string? In-Reply-To: Message-ID: <200709092154.l89LsVjS022303@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2351 ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-09 17:54 EST ------- Created an attachment (id=751) --> (http://bugzilla.open-bio.org/attachment.cgi?id=751&action=view) Make str() work intuitively on Seq and MutableSeq objects This patch only changes the Seq and MutableSeq objects: 1. Makes the str() method give the full sequence for Seq and MutableSeq objects 2. Added docstrings to encourage str(my_seq) instead of my_seq.tostring() 3. Fixed the repr() of a MutableSeq object This patch updates Bio/Seq.py and the unit test case test_seq.py and its output. It does not add any .short() method to give a truncated representation string like the current str() method gives. It does not do anything to Bio/SeqRecord.py -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From biopython-dev at maubp.freeserve.co.uk Sun Sep 9 20:47:17 2007 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Sun, 09 Sep 2007 21:47:17 +0100 Subject: [Biopython-dev] New Biopython release In-Reply-To: <46E400BA.90504@c2b2.columbia.edu> References: <46E400BA.90504@c2b2.columbia.edu> Message-ID: <46E45BD5.1060905@maubp.freeserve.co.uk> Michiel de Hoon wrote: > Hi everybody, > > Let's make a new release (1.44) of Biopython.
Biopython has received > a lot of improvements and bug fixes in recent months, and the current > (1.43) release hangs on some platforms during the Biopython tests > (due to a bug in my own Bio.Cluster module, ahem). I am planning to > make a new release during the next weekend (around 9/15). Good idea - I was wondering about suggesting this. By the way, I updated the NEWS file fairly recently. > A number of bugs in Bugzilla currently have a partial solution that > has not yet made it into CVS. I suggest to commit those partial > solutions to CVS if possible so that they can be included in the next > release. Some of those are enhancement bugs where things are still a little undecided. I'm fairly happy with the __getitem__ method for the alignment, but would want to change it slightly if we added a __getitem__ to the SeqRecord itself. > On my machine, only test_SeqIO fails its test. The error seems to be > trivial, but we should fix it before making the next release. It was trivial ;) I had forgotten to check in a SwissProt test case. Sorry. We should get Tiago's input before making a release - see if he's happy for his initial Bio.PopGen code to be released yet. If not, then it shouldn't be too hard to remove that module, its test cases, and its section in the tutorial. Just a little fiddly... I've also done a little work on writing EMBL and GenBank files (based in part on Howard Salis' patches on Bug 2294) but I don't think they are ready yet. As part of this I am planning to make some small changes to the EMBL and GenBank parsers to record a few more bits of annotation. Rather than rush this I'll hold back until after release 1.44 is done.
Peter From bugzilla-daemon at portal.open-bio.org Sun Sep 9 22:25:02 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 9 Sep 2007 18:25:02 -0400 Subject: [Biopython-dev] [Bug 1963] Adding __str__ method to codon tables and translators In-Reply-To: Message-ID: <200709092225.l89MP28C025915@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1963 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-09 18:25 EST ------- Checked in a version based on this, but which copes with a generic translator (which will accept either RNA or DNA). See Bio/Data/CodonTable.py revision 1.4 and Bio/Translate.py revision 1.2 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Sep 10 00:10:37 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 9 Sep 2007 20:10:37 -0400 Subject: [Biopython-dev] [Bug 2351] Make SeqRecord subclass Seq subclass string? In-Reply-To: Message-ID: <200709100010.l8A0Abet032204@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2351 ------- Comment #4 from sbassi at gmail.com 2007-09-09 20:10 EST ------- (In reply to comment #3) > It does not add any .short() method to give a truncated representation string > like the current str() method gives. Why not? This new method should not cause any compatibility problem -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
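The combination being discussed in Bug 2351 (str() returning the full sequence, plus a separate .short() method for a truncated form) could be sketched like this. The class ToySeq is a hypothetical stand-in rather than Bio.Seq.Seq, and the particular truncation format (start, "...", end) is an assumption, since the thread does not say how a .short() method would abbreviate.

```python
class ToySeq(object):
    """Hypothetical stand-in for a sequence object; not Bio.Seq.Seq."""

    def __init__(self, data):
        self.data = data

    def __str__(self):
        # Proposed behaviour: str() gives the complete sequence.
        return self.data

    def short(self, width=60):
        # Truncated form for display: keep both ends, elide the middle.
        if len(self.data) <= width:
            return self.data
        keep = (width - 3) // 2
        return self.data[:keep] + "..." + self.data[-keep:]


seq = ToySeq("ACGT" * 50)
full = str(seq)
brief = seq.short(20)
```

Because short() is a new method name, adding it would not change the behaviour of any existing call, which seems to be the point made in comment #4.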
From tiagoantao at gmail.com Mon Sep 10 16:53:58 2007 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 10 Sep 2007 17:53:58 +0100 Subject: [Biopython-dev] New Biopython release - Bio.PopGen? In-Reply-To: <46E55CDE.2020805@warwick.ac.uk> References: <46E55CDE.2020805@warwick.ac.uk> Message-ID: <6d941f120709100953r16b6b695q2e02b8f10c39d1f7@mail.gmail.com> Hi My apologies for the delay in answering, but I am traveling for a couple of weeks as I am in the organization of a conservation genetics course. Until the 21st I will probably be sloppy in answering. You can go ahead and include the code if you want (any decision that you take is OK), please note the following: 1. Without a statistics module the functionality is limited (a good reason to delay the release) 2. I believe that the code is in an acceptable state. 3. A diff with suggested fdist documentation is attached to the bugzilla entry. I would lean towards a delay until statistics are in, but if it is too much hassle please go ahead and make it public. Tiago On 9/10/07, Peter Cock wrote: > Hi Tiago, > > I'm forwarding this email in case you had missed it on the mailing list. > > Basically what do you want to do with Bio.PopGen given that Michiel is > hoping to do a Biopython release by the end of this week? > > Thanks > > Peter > > -------- Original Message -------- > Subject: Re: [Biopython-dev] New Biopython release > Date: Sun, 09 Sep 2007 21:47:17 +0100 > From: Peter > Reply-To: biopython-dev at lists.open-bio.org > To: Michiel de Hoon > CC: biopython-dev at lists.open-bio.org > References: <46E400BA.90504 at c2b2.columbia.edu> > > Michiel de Hoon wrote: > > Hi everybody, > > > > Let's make a new release (1.44) of Biopython. Biopython has received > > a lot of improvements and bug fixes in recent months, and the current > > (1.43) release hangs on some platforms during the Biopython tests > > (due to a bug in my own Bio.Cluster module, ahem).
I am planning to > > make a new release during the next weekend (around 9/15). > > Good idea - I was wondering about suggesting this. By the way, I > updated the NEWS file fairly recently. > > > A number of bugs in Bugzilla currently have a partial solution that > > has not yet made it into CVS. I suggest to commit those partial > > solutions to CVS if possible so that they can be included in the next > > release. > > Some of those are enhancement bugs where things are still a little > undecided. I'm fairly happy with the __getitem__ method for the > alignment, but would want to change it slightly if we added a > __getitem__ to the SeqRecord itself. > > > On my machine, only test_SeqIO fails its test. The error seems to be > > trivial, but we should fix it before making the next release. > > It was trivial ;) > I had forgotten to check in a SwissProt test case. Sorry. > > We should get Tiago's input before making a release - see if he's happy > for his initial Bio.PopGen code to be released yet. If not, then it > shouldn't be too hard to removed that module, its test cases, and > section in the tutorial. Just a little fiddly... > > I've also done a little work on writing EMBL and GenBank files (based in > part on Howard Salis' patches on Bug 2294) but I don't think they are > ready yet. As part of this I am planning to making some small changed > to the EMBL and GenBank parsers to record a few more bits of annotation. > Rather than rush this I'll hold back until after release 1.44 is done. 
> > Peter > > > -- http://www.tiago.org/ps From bugzilla-daemon at portal.open-bio.org Tue Sep 11 10:10:38 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 06:10:38 -0400 Subject: [Biopython-dev] [Bug 2174] FDist Support in BioPython In-Reply-To: Message-ID: <200709111010.l8BAAcDC010284@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2174 ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 06:10 EST ------- I'm only commenting on the English. All this Controller stuff sounds complicated, but I guess it's based on how FDist works. I would clarify this: > The FDist data format is application specific and is not used at > all by other applications, as such, it is always necessary to > convert from an external format. There is code to convert a GenePop > format to FDist, here is an example of usage (along with imports that > will be needed on examples further below): Perhaps: > The FDist data format is application specific and is not used at > all by other applications, as such you will probably have to convert > your data for use with FDist. Biopython can help you do this. > Here is an example converting from GenePop format to FDist format > (along with imports that will be needed on examples further below): Small point, there/the: Before: >> In practice, when there number of populations is low, the mutation model >> is stepwise and the sample size increases, fdist will not be able to >> simulate an acceptable approximate average $F_{st}$. After: >> In practice, when the number of populations is low, the mutation model >> is stepwise and the sample size increases, fdist will not be able to >> simulate an acceptable approximate average $F_{st}$. If you fix the there/the, then I think that can be committed to CVS. Would you like me or Michiel to do this for you, as you are travelling this week?
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From mdehoon at c2b2.columbia.edu Tue Sep 11 14:37:57 2007 From: mdehoon at c2b2.columbia.edu (Michiel de Hoon) Date: Tue, 11 Sep 2007 23:37:57 +0900 Subject: [Biopython-dev] Bio.MultiProc Message-ID: <46E6A845.3030601@c2b2.columbia.edu> Hi everybody, In preparation for the upcoming release, I was running the Biopython test suite and found that test_copen.py hangs on Cygwin. It doesn't fail, it just sits there forever. This may be related to the use of fork() instead of select() in Bio/MultiProc/copen.py. Anyway, while it is probably possible to fix this, I'd have to dig fairly deep into the code, and I am not sure if it is worth it. It looks like the copen functions are used only in Bio/config, which is needed for Bio.db. A description of the functionality of this module can be found in the tutorial section 4.7.2. Now, I don't remember users asking about this module on the mailing list. From the tutorial documentation, it seems to be a nice piece of code, but I doubt that it is being used often in practice. So I was wondering: 1) Is anybody on this list using this code? 2) If not, can I mark it as deprecated for the upcoming release? Hopefully, people who are using this code will notice, and let us know that they need it. --Michiel.
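One common way to "mark as deprecated" so that remaining users notice is to emit a DeprecationWarning from the standard warnings module. The sketch below shows the idea on a made-up function, deprecated_copen_sys, which is only a placeholder and not the real Bio.MultiProc.copen code; the same warnings.warn call placed at module level would instead fire once when the module is imported.

```python
import warnings


def deprecated_copen_sys(*args):
    """Hypothetical stand-in for a function in a module being retired."""
    warnings.warn("this module is deprecated and may be removed in a "
                  "future release; please tell the mailing list if you "
                  "still need it", DeprecationWarning, stacklevel=2)
    return args


# Capture warnings so the effect can be inspected rather than printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")     # make sure the warning is not filtered out
    deprecated_copen_sys("blastall")
    categories = [w.category for w in caught]
```

The code keeps working while warning, which matches the stated hope that anyone still relying on it will see the message and speak up before it is removed.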
From bugzilla-daemon at portal.open-bio.org Tue Sep 11 20:00:32 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 16:00:32 -0400 Subject: [Biopython-dev] [Bug 2361] New: Test Suite Failures Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2361 Summary: Test Suite Failures Product: Biopython Version: 1.43 Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: sjf4 at u.washington.edu Your test suite does not seem to function correctly (or your software). I've been trying to make several of the tests succeed for days, but have not been able to. I've tried many different versions of biopython and the related modules on RHEL4 and RHEL5. I finally thought to try the tests on Windows, and many of the same tests fail there as well. Many of the failing tests use the mxtexttools module, which you can only obtain v3.0.0 of now, but the documentation refers to 2.0.x. If the software works properly, but just the tests fail, I would appreciate you updating the tests. As a non-scientist, I have no way to test this software before I pass it on to my users, so I rely upon test suites like these to assure some level of functionality before handing it off. The output of the failed tests is below. ====================================================================== ERROR: test_CodonUsage ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_CodonUsage.py", line 10, in ? 
X.generate_index("./CodonUsage/HighlyExpressedGenes.txt") File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/SeqUtils/ CodonUsage.py", line 74, in generate_index self._count_codons(FastaFile) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/SeqUtils/ CodonUsage.py", line 117, in _count_codons cur_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__i nit__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterPa rser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_Fasta2 ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_Fasta2.py", line 44, in ? 
data = record_parser.parse( src_handle ) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__i nit__.py", line 100, in parse return self.convert_lax(iterator.next()) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterPa rser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_KEGG ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_KEGG.py", line 67, in ? 
t_KEGG_Enzyme(test_KEGG_Enzyme_files) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_KEGG.py", line 23, in t_KEG G_Enzyme record = records.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/KEGG/Enzy me/__init__.py", line 225, in next data = self._reader.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Record Reader.py", line 295, in next positions = _find_end_positions(lookahead, self.tagtable) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Record Reader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' ====================================================================== ERROR: test_align ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_align.py", line 129, in ? 
alignment = FastaAlign.parse_file(to_parse, 'PROTEIN') File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/Fas taAlign.py", line 48, in parse_file cur_align = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__i nit__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterPa rser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_format_registry ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_format_registry.py", line 4 9, in ? 
parser.parseFile(_open('EDD_RAT.dat')) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 452, in parseFile self._err_handler.fatalError(ParserRecordException( File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserRecordException: Traceback (most recent call last): File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 444, in parseFile record = reader.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Record Reader.py", line 295, in next positions = _find_end_positions(lookahead, self.tagtable) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Record Reader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' ====================================================================== ERROR: test_geo ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_geo.py", line 24, in ? 
record = records.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Geo/__ini t__.py", line 79, in next return self._parser.parse(File.StringHandle(data)) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Geo/__ini t__.py", line 228, in parse self._scanner.feed(handle, self._consumer) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Geo/__ini t__.py", line 126, in feed self._parser.parseFile(handle) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 328, in parseFile self.parseString(fileobj.read()) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser .py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", l ine 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 1427 ====================================================================== FAIL: test_Fasta ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'Basic operation of the Record Parser. ... ERROR\n' Expected: 'Basic operation of the Record Parser. ... 
ok\n' ====================================================================== FAIL: test_GenBankFormat ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 166, in runSafeTest cur_test.run_tests([]) File "/scratch/sjf4/temp/biopython-1.43/Tests/test_GenBankFormat.py", line 588 , in run_tests test_list.test() File "/scratch/sjf4/temp/biopython-1.43/Tests/martel_support.py", line 51, in test raise AssertionError, "cannot parse" AssertionError: cannot parse ====================================================================== FAIL: test_NNGene ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'Find all motifs in a set of sequences. ... ERROR\n' Expected: 'Find all motifs in a set of sequences. ... ok\n' ====================================================================== FAIL: test_SCOP_Astral ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'testConstructWithCustomFile (test_SCOP_Astral.AstralTests) ... ERROR\ n' Expected: 'testConstructWithCustomFile (test_SCOP_Astral.AstralTests) ... 
ok\n'

----------------------------------------------------------------------
Ran 91 tests in 99.339s

FAILED (failures=4, errors=6)

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Tue Sep 11 20:23:03 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 16:23:03 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures
In-Reply-To:
Message-ID: <200709112023.l8BKN38E018573@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 16:23 EST -------
You seem to be the bearer of bad news :(

My first impression is that you are right in that mxTextTools 3.0 is to blame:
http://www.egenix.com/products/python/mxBase/mxTextTools/changelog.html

e.g. from their changes from 2.0.3 to 3.0.0

> Restructured tag commands and their numbering so that
> low-level commands come before the special ones. Old
> tag tables need to be "recompiled" due to this change!

If you can still download mxTextTools 2.x.x from their website, it's not at all obvious.
As to the impact in Biopython, most of the unit test failures are clearly problems in Biopython's Martel library and/or the python Sax library:

test_CodonUsage - depends on Bio.Fasta which depends on Martel
test_Fasta2 - depends on Martel
test_KEGG - depends on Martel
test_align - depends on Bio.Fasta which depends on Martel
test_format_registry - depends on Martel
test_geo - depends on Martel
test_Fasta - unclear from the error, but known to depend on Martel
test_GenBankFormat - depends on Martel
test_NNGene - failure unclear
test_SCOP_Astral - failure unclear

Most of these Martel based parsers are less commonly used (IMO), with the exception of Bio.Fasta, where at least removing the Martel dependence would be fairly easy.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Tue Sep 11 20:54:14 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 16:54:14 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures
In-Reply-To:
Message-ID: <200709112054.l8BKsEHk022566@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 16:54 EST -------
I don't see why test_NNGene and test_SCOP_Astral would fail - as far as I can see, there is no link to Martel.

Stephen - Would you be able to post the output of the following (run in the Tests subdirectory):

python test_NNGene.py
python test_SCOP_Astral.py

Thank you.

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
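
[Editorial sketch] Comment #1 above suggests that removing the Martel dependence from Bio.Fasta would be fairly easy. To give a rough idea of why, here is a minimal pure-Python reader for FASTA records. It is not the Bio.Fasta code nor any patch attached to the bug tracker; the function name iter_fasta and the (title, sequence) tuple it yields are assumptions made for this example.

```python
def iter_fasta(handle):
    """Yield (title, sequence) tuples from a FASTA-formatted file handle.

    Hypothetical illustration; lines before the first '>' are ignored.
    """
    title = None
    chunks = []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if title is not None:
                yield title, "".join(chunks)
            title = line[1:].strip()
            chunks = []
        elif title is not None and line.strip():
            chunks.append(line.strip())
    # Emit the final record once the input is exhausted.
    if title is not None:
        yield title, "".join(chunks)
```

Because this uses only string methods, it has no dependency on mxTextTools, which is the property the comment is after.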
From bugzilla-daemon at portal.open-bio.org Tue Sep 11 21:00:54 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 17:00:54 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures In-Reply-To: Message-ID: <200709112100.l8BL0sW5023454@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #3 from sjf4 at u.washington.edu 2007-09-11 17:00 EST ------- NNGene Test Ouput = = = = = = = = = = = = SchemaMatchingTest:Matching schema to strings works correctly. ... ok Reading and writing motifs to a file ... ok Reading and writing schemas to a file. ... ok Reading and writing signatures to a file. ... ok Retrieve counts for particular patterns in the repository. ... ok Retrieve all patterns from a repository. ... ok Retrieve patterns from both sides of the list (top and bottom). ... ok Retrieve random patterns from the repository. ... ok Retrieve a certain number of the top patterns. ... ok Retrieve the top percentge of patterns from the repository. ... ok Test the ability to remove A rich patterns from the repository. ... ok Find all motifs in a set of sequences. ... ERROR Find the difference in motif counts between two sets of sequences. ... ERROR Convert a sequence into its motif representation. ... ok Return all unambiguous characters that can be in a motif. ... ok Find the positions of ambiguous items in a sequence. ... ok Find all matches in a sequence. ... ok Make sure motif compiled regular expressions are cached properly. ... ok Find the number of ambiguous items in a sequence. ... ok Find how many matches are present in a sequence. ... ok Convert a string into a representation of motifs. ... ok Find schemas from sequence inputs. ... ERROR Find schemas that differentiate between two sets of sequences. ... ERROR Generating schema from a simple list of motifs. ... ok Generating schema from a real life set of motifs. ... 
ERROR Convert sequences into schema representations. ... ERROR Find signatures from sequence inputs. ... ERROR Convert a sequence into its signature representation. ... ok ====================================================================== ERROR: Find all motifs in a set of sequences. ---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 253, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Find the difference in motif counts between two sets of sequences. 
---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 253, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Find schemas from sequence inputs. ---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 416, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Find schemas that differentiate between two sets of sequences. 
---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 416, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Generating schema from a real life set of motifs. ---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 546, in t_hard_from_motifs schema_bank = self._load_schema_repository() File "test_NNGene.py", line 573, in _load_schema_repository seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Convert sequences into schema representations. 
---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 599, in t_schema_representation schema_bank = self._load_schema_repository() File "test_NNGene.py", line 573, in _load_schema_repository seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: Find signatures from sequence inputs. 
---------------------------------------------------------------------- Traceback (most recent call last): File "test_NNGene.py", line 636, in setUp seq_record = iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/nfs/gs/software/rhel5/python-2.4.4/lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ---------------------------------------------------------------------- Ran 28 tests in 0.250s FAILED (errors=7) ==== ==== ==== ==== ==== SCOP_Astral Test Ouput = = = = = = = = = = = = E..E ====================================================================== ERROR: testConstructWithCustomFile (__main__.AstralTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "test_SCOP_Astral.py", line 55, in testConstructWithCustomFile assert astral.getSeqBySid('d3sdha_').data == "AAAAA" File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/SCOP/__init__.py", line 806, in getSeqBySid return self.fasta_dict[domain].seq File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 214, in __getitem__ return dict.__getitem__(self,key) KeyError: 'd3sdha_' ====================================================================== ERROR: testGetSeq (__main__.AstralTests) ---------------------------------------------------------------------- Traceback (most recent call last): File "test_SCOP_Astral.py", line 43, in testGetSeq assert self.astral.getSeqBySid('d3sdha_').data == "AAAAA" File 
"/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/SCOP/__init__.py", line 806, in getSeqBySid return self.fasta_dict[domain].seq File "/scratch/sjf4/temp/biopython-1.43/build/lib.linux-i686-2.4/Bio/Fasta/__init__.py", line 214, in __getitem__ return dict.__getitem__(self,key) KeyError: 'd3sdha_' ---------------------------------------------------------------------- Ran 4 tests in 0.114s FAILED (errors=2) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Sep 11 21:21:15 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 17:21:15 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures In-Reply-To: Message-ID: <200709112121.l8BLLFcZ025354@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Component|Main Distribution |Martel/Mindy ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 17:21 EST ------- Thanks Stephen. The good news is that test_NNGene is also failing in the Martel/Sax library (called via Bio.Fasta). Its not so clear cut, but this likely that root cause of the test_SCOP_Astral failure too. Its not a full solution, but we may be able to minimise this problem by changing Bio.Fasta not to use Martel, e.g. see bug 2058 [For what its worth, I have filed this bug under the Martel/Mindy component] -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Tue Sep 11 21:56:20 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 17:56:20 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709112156.l8BLuKaL027005@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Test Suite Failures         |Test Suite Failures from
                   |                            |Martel/Sax with egenix
                   |                            |mxTextTools 3.0

------- Comment #5 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 17:56 EST -------
I am able to reproduce this on Ubuntu Dapper Drake - switching from mxTextTools 2.0.6 (from the Ubuntu binary packages) to mxTextTools 3.0.0 (installed from source under my home directory only).

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Tue Sep 11 22:33:34 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Sep 2007 18:33:34 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709112233.l8BMXYpP028597@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-11 18:33 EST -------
After applying the updated patch on bug 2058 to remove the Martel dependency from Bio.Fasta, things are a LOT better.
http://bugzilla.open-bio.org/attachment.cgi?id=755

Note that with mxTextTools 3, test_Fasta would still fail on indexing a file as a simple database using Mindy.
With the patch I think we only have five unit test failures: test_Fasta test_GenBankFormat test_KEGG test_format_registry test_geo -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 01:47:52 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 11 Sep 2007 21:47:52 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120147.l8C1lqPt004412@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #7 from mdehoon at ims.u-tokyo.ac.jp 2007-09-11 21:47 EST ------- With the patch on bug 2058, I am finding the same five unit test failures on Cygwin (plus test_copen.py, but that is a Cygwin-specific failure unrelated to mxTextTools). -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 05:45:22 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 01:45:22 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120545.l8C5jMSg017128@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #8 from mdehoon at ims.u-tokyo.ac.jp 2007-09-12 01:45 EST ------- I've written a Martel-free parser for Bio.Geo. With this parser, test_geo.py now passes. I've uploaded the new parser to CVS; feel free to comment. 
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 08:47:26 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 04:47:26 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120847.l8C8lQFn030017@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #9 from mdehoon at ims.u-tokyo.ac.jp 2007-09-12 04:47 EST ------- I've uploaded a Martel-free parser for Bio.KEGG.Compound to CVS. I'm still working on the same for Bio.KEGG.Enzyme and Bio.KEGG.Map. Just to let you know what I'm working on, to avoid duplicated efforts. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 08:57:24 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 04:57:24 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709120857.l8C8vOrm030713@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #10 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 04:57 EST ------- Regarding GEO, your changes look sensible. I see you have removed Bio.Geo.RecordParser() and deprecated Bio.Geo.Iterator() to use a parse() function. That makes sense given recent Biopython developments. For the long term, and thinking as a potential end user, something much more like Sean Davis' GEOquery package for R/BioConductor would be much nicer. 
http://www.warwick.ac.uk/go/peter_cock/r/geo/

Right now, breaking the GEO files at each ^ (caret) line is perhaps a little too low level. In particular, in Biopython I would like to be able to take a GDS file (GEO Dataset), and have it loaded as an annotated matrix of expression levels (genes as rows, samples as columns) suitable for use with Bio.Cluster.

But that is probably best left as a future enhancement bug.

--------------------------------------------------------------------

We still have the Bio.Fasta.Dictionary and Bio.GenBank.Dictionary classes (and anything else like it) to worry about. These use Mindy to build a set of lookup tables as files on disk, allowing dictionary-like keyed access to records WITHOUT having all the records in memory. I'm a bit hazy on the implementation details.

I personally don't use them, and it's not something currently supported by Bio.SeqIO either.

--------------------------------------------------------------------

If you replace the KEGG parser then I fear the remaining problems are very tightly linked to Martel, but also in my opinion not key features of Biopython.

We could try asking on the egenix mailing list to see if they have some examples of updating python code to work with mxTextTools 3.0 ...

--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
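
[Editorial sketch] Comment #10 describes the current approach as breaking the GEO files at each ^ (caret) line. The following sketch shows what that low-level splitting could look like in plain Python. It is not the Bio.Geo parser that was committed to CVS; the function name split_at_caret and the list-of-lines output are assumptions made for this example.

```python
def split_at_caret(handle):
    """Yield lists of lines, starting a new list at each line beginning with '^'.

    Hypothetical illustration. Blank lines are dropped, and any lines that
    appear before the first caret line form their own leading block.
    """
    block = []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith("^") and block:
            # A caret line closes the block collected so far.
            yield block
            block = []
        if line:
            block.append(line)
    if block:
        yield block
```

Each yielded block is still just raw text lines, which is the "too low level" point made in the comment: turning a GDS file into an annotated expression matrix would need a further layer on top of this.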
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 09:29:15 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 05:29:15 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709120929.l8C9TFqk032621@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 05:29 EST -------
We could add an mxTextTools version check to Martel/setup.py using something like this:

from mx import TextTools
good = int(TextTools.__version__.split(".")[0]) < 3

Note - I would just print a warning rather than refusing to install.

Possibly do a run-time check in Martel/Parser.py, Generate.py and RecordReader.py and raise an ImportError?

From bugzilla-daemon at portal.open-bio.org Wed Sep 12 10:29:39 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 06:29:39 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709121029.l8CATdt7004244@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         OS/Version|Linux                       |All

------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 06:29 EST -------
Confirmed problem on Windows XP with Python 2.3, changing bug's OS field to All.
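The version check proposed in comment 11 could be factored into a small helper, sketched below in current Python. This is only an illustration: the function names are hypothetical, and the version string is passed in as an argument (instead of being read from mx.TextTools.__version__) so that the sketch runs without mxTextTools installed.

```python
import warnings


def major_version(version_string):
    """Return the leading integer of a dotted version string ('3.0.0' gives 3)."""
    return int(version_string.split(".")[0])


def is_supported_mxtexttools(version_string):
    """True if the major version is below 3, the cut-off proposed in comment 11."""
    return major_version(version_string) < 3


def warn_if_unsupported(version_string):
    """Warn instead of refusing to install, as suggested in the thread.

    Hypothetical helper: returns True when the version looks usable.
    """
    if not is_supported_mxtexttools(version_string):
        warnings.warn(
            "mxTextTools %s detected; Martel may not work with 3.0 or later"
            % version_string
        )
        return False
    return True
```

In a real setup.py the argument would presumably come from TextTools.__version__ after a guarded import; that wiring is left out here.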
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 11:11:03 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 07:11:03 -0400
Subject: [Biopython-dev] [Bug 2362] New: test_copen fails on Windows XP as tries os.fork()
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2362

           Summary: test_copen fails on Windows XP as tries os.fork()
           Product: Biopython
           Version: 1.43
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk

I'm using Biopython from CVS (1.43+), fresh install compiled from source using MSVC 6.0

Output from: python test_copen.py

opening handle
Traceback (most recent call last):
  File "test_copen.py", line 14, in ?
    handle = copen.copen_fn(print_args, *(range(2) + ['a', 'b', 'c']))
  File "C:\TEMP\biopython_cvs\biopython_all\biopython\build\lib.win32-2.3\Bio\MultiProc\copen.py", line 66, in copen_fn
    pid = os.fork()
AttributeError: 'module' object has no attribute 'fork'

Michiel has seen a similar problem on Windows using cygwin (hangs rather than the attribute error), see bug 2361 comment 7 and this mailing list post:
http://lists.open-bio.org/pipermail/biopython/2007-September/003722.html
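The AttributeError above arises because os.fork() does not exist on native Windows. A minimal sketch of a portable alternative follows, written for current Python; the helper names can_fork and run_in_child are hypothetical and are not part of Bio.MultiProc.copen. It uses the standard subprocess module in place of os.fork().

```python
import os
import subprocess
import sys


def can_fork():
    """Return True if this platform's os module provides fork()."""
    return hasattr(os, "fork")


def run_in_child(code):
    """Run a snippet of Python source in a child process and return its stdout.

    Hypothetical helper: it starts a fresh interpreter with subprocess,
    which works on platforms with and without os.fork().
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A test that genuinely needs fork() could check can_fork() first and skip itself on platforms where it returns False.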
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 11:12:46 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 07:12:46 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709121112.l8CBCkem008143@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #13 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 07:12 EST -------
(In reply to comment #7)
> With the patch on bug 2058, I am finding the same five unit test failures on
> Cygwin (plus test_copen.py, but that is a Cygwin-specific failure unrelated to
> mxTextTools).

See Bug 2362: I find test_copen.py fails on Windows XP (non-cygwin) with an AttributeError rather than hanging.

From bugzilla-daemon at portal.open-bio.org Wed Sep 12 11:34:19 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 07:34:19 -0400
Subject: [Biopython-dev] [Bug 2363] New: Bio.Pathway files not stored as plain text in CVS?
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2363

           Summary: Bio.Pathway files not stored as plain text in CVS?
           Product: Biopython
           Version: Not Applicable
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk

I've just checked out a fresh clean copy of Biopython from CVS, and built it from source on Windows XP with Python 2.3 and MSVC 6.0 as the compiler.

test_pathway.py and test_KEGG.py both failed (complaining about syntax errors on import statements which looked fine).
This can be fixed by editing the files Bio/Pathway/__init__.py and Bio/Pathway/Rep/*.py to use Windows/DOS/PC line endings (odd!) I suspect that the Bio.Pathway python files have been checked into CVS as binary files rather than as plain text. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 12 12:20:31 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 08:20:31 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709121220.l8CCKVST016224@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #14 from mdehoon at ims.u-tokyo.ac.jp 2007-09-12 08:20 EST ------- I have now uploaded Martel-free parsers for Bio.KEGG.Compound/Enzyme/Map. With these new parsers, test_KEGG.py now passes. I also updated some data files in the Tests/output and Tests/KEGG directories. Three more to go. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 13:19:13 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 12 Sep 2007 09:19:13 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709121319.l8CDJDGH022220@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #15 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 09:19 EST ------- Good work Michiel :) I can confirm using Windows XP, Python 2.3, and mxTextTools 3.0 the following all pass: test_align test_CodonUsage test_Fasta2 test_geo test_KEGG test_NNGene test_SCOP_Astral Note you may need to delete the Mindy index directory Tests\SCOP\scopseq-test\astral-scopdom-seqres-all-test.fa.idx to force its recreation in test_SCOP_Astral.py The following still fail with mxTextTools 3.0, but do work with mxTextTools 2.0: test_format_registry - ReaderError: invalid format starting with '' test_GenBankFormat - AssertionError, "cannot parse" test_Fasta - fails indexing files I'll double check this on Linux in a few hours time. I don't see why test_Fasta is failing, doing "python test_Fasta.py" looks fine. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Wed Sep 12 15:12:54 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 11:12:54 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709121512.l8CFCsJJ031631@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #16 from mdehoon at ims.u-tokyo.ac.jp 2007-09-12 11:12 EST -------
From looking at test_format_registry, I doubt that this code is still being used by anybody. Rather than banging our heads over how to fix this, I suggest that we remove the corresponding code for the next Biopython release and see if anybody complains. If not, our problem is solved.

From bugzilla-daemon at portal.open-bio.org Wed Sep 12 16:34:13 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 12:34:13 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709121634.l8CGYDPB004823@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #17 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-12 12:34 EST -------
Firstly, I can confirm my findings in comment 15 also apply on Linux.

Further to comment 16, I also doubt that anybody uses the GenBank Martel expression which test_GenBankFormat.py checks.

Isn't removing this code a little drastic?
We could release Biopython 1.44 with a warning that Martel and some minor parts of Biopython which use it will not work with mxTextTools 3.0

From bugzilla-daemon at portal.open-bio.org Wed Sep 12 20:51:20 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 16:51:20 -0400
Subject: [Biopython-dev] [Bug 2364] New: New version of MeltingTemp.py
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2364

           Summary: New version of MeltingTemp.py
           Product: Biopython
           Version: Not Applicable
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: sbassi at gmail.com

Removed the string module and made some cosmetic changes to the code.

From bugzilla-daemon at portal.open-bio.org Wed Sep 12 20:52:52 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Sep 2007 16:52:52 -0400
Subject: [Biopython-dev] [Bug 2364] New version of MeltingTemp.py
In-Reply-To:
Message-ID: <200709122052.l8CKqqpC018192@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2364

------- Comment #1 from sbassi at gmail.com 2007-09-12 16:52 EST -------
Created an attachment (id=756)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=756&action=view)
New version of MeltingTemp.py

This file should replace the old MeltingTemp.py
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 04:44:04 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 00:44:04 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709130444.l8D4i41Q014333@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #18 from mdehoon at ims.u-tokyo.ac.jp 2007-09-13 00:44 EST ------- On my computer, "python test_Fasta.py" does fail. Three of the four tests in test_Fasta.py succeed, the fourth one (DictionaryTest) fails. The error occurs in the index_file function in Bio/Fasta/__init__.py, which is needed to create a Fasta.Dictionary. This code is used to create your own Fasta database, along the lines of the Genbank example in section 4.3.4 in the Tutorial. I think that this stuff can be done more cleanly with the new Bio.SeqIO. I'll ask on the Biopython mailing list if somebody is using index_file. If not, we can deprecate only that function in Bio.Fasta, and remove the corresponding test in test_Fasta.py. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 08:51:09 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 04:51:09 -0400 Subject: [Biopython-dev] [Bug 2174] FDist Support in BioPython In-Reply-To: Message-ID: <200709130851.l8D8p9de031499@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2174 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #740 is|0 |1 obsolete| | ------- Comment #5 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 04:51 EST ------- (From update of attachment 740) I checked this in yesterday -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Sep 13 10:55:22 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Sep 2007 06:55:22 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709131055.l8DAtMdr006501@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #19 from mdehoon at ims.u-tokyo.ac.jp 2007-09-13 06:55 EST ------- Looking at this error: test_GenBankFormat - AssertionError, "cannot parse" This error occurs due to the last test in test_GenBankFormat.py. If I remove add_test("ncbi_format", ncbi_format, header_s + record_s1+record_s2+record_s3) then the test passes. I didn't see ncbi_format being used anywhere in Biopython. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 11:27:28 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Sep 2007 07:27:28 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709131127.l8DBRSwv008497@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #20 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 07:27 EST -------
Note that there are other index_file functions in Bio.GenBank and Bio.SwissProt.

The Bio.Fasta index_file/Dictionary is also used in several other modules including the SCOP Astral class (indexing a Fasta file to serve as a database). So deprecating it isn't quite as trivial as it could be!

For anyone unfamiliar with the details, note that while Bio.SeqIO.to_dict() achieves a similar aim, it is done in memory. The Mindy-based index_file/Dictionary classes parse the file once to create a lookup table on disk, allowing random access to any record in the file.

This functionality was probably more important historically (lower memory on desktop computers), and seems to be a midpoint between the simple in-memory dictionary and a full-blown SQL database.
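The idea described in comment 20 (parse the file once, then use a lookup table for random access) can be illustrated with a minimal sketch in current Python. This is not the Mindy implementation and not a Biopython API; the function names are hypothetical, and for brevity the offset table is held in a dict where Mindy wrote its tables to disk. Only byte offsets are stored, so the records themselves stay on disk.

```python
def build_offset_index(path):
    """Scan a FASTA file once and map each record ID to its byte offset."""
    index = {}
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                fields = line[1:].split()
                if fields:
                    # The first word of the title line is used as the key.
                    index[fields[0].decode("ascii")] = offset
    return index


def fetch_record(path, index, record_id):
    """Seek to one record and return (title, sequence) without reading the rest."""
    with open(path, "rb") as handle:
        handle.seek(index[record_id])
        title = handle.readline()[1:].strip().decode("ascii")
        chunks = []
        for line in handle:
            if line.startswith(b">"):
                break
            chunks.append(line.strip().decode("ascii"))
    return title, "".join(chunks)
```

To avoid rescanning the file on every run, the offset table could be saved between runs, for example with the standard shelve module; that step is omitted here.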
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 11:57:03 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Sep 2007 07:57:03 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709131157.l8DBv3bO010568@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #21 from mdehoon at ims.u-tokyo.ac.jp 2007-09-13 07:57 EST -------
> The Bio.Fasta index_file/Dictionary is also used in several other modules
> including the SCOP Astral class (indexing a Fasta file to serve as a database).
> So deprecating it isn't quite as trivial as it could be!

Yes, but these can be trivially replaced by the corresponding Bio.SeqIO code.

From bugzilla-daemon at portal.open-bio.org Thu Sep 13 12:04:16 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Sep 2007 08:04:16 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709131204.l8DC4G2V011062@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #22 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 08:04 EST -------
Following up comment 19, if the error handler in Tests/martel_support.py is commented out, we can actually see the stack trace of the error triggered in test_GenBankFormat.py as follows:

Traceback (most recent call last):
  File "C:\temp\biopython_cvs\biopython_all\biopython\Tests\test_GenBankFormat.py", line 603, in -toplevel-
    test_list.test()
  File "C:\temp\biopython_cvs\biopython_all\biopython\Tests\martel_support.py", line 41, in test
    parser.parseString(s)
  File "c:\python23\lib\site-packages\Martel\Parser.py", line 557, in parseString
    self.parseFile(strfile)
  File "c:\python23\lib\site-packages\Martel\Parser.py", line 587, in parseFile
    self._err_handler.fatalError(exc)
  File "c:\python23\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
ParserRecordException: Traceback (most recent call last):
  File "c:\python23\lib\site-packages\Martel\Parser.py", line 578, in parseFile
    header = header_reader.next()
  File "C:\TEMP\biopython_cvs\biopython_all\biopython\build\lib.win32-2.3\Martel\RecordReader.py", line 413, in next
    positions = _find_end_positions(lookahead, _tag_lines_tagtable)
  File "C:\TEMP\biopython_cvs\biopython_all\biopython\build\lib.win32-2.3\Martel\RecordReader.py", line 239, in _find_end_positions
    raise ReaderError("invalid format starting with %s" % repr(text[:50]))
ReaderError: invalid format starting with ''

This error is very similar to some of the others in the original bug report (parsers we have since moved to pure python).

Looking at the stack for this in IDLE, the Martel.Parser.parseFile() function has a cStringIO.StringI object as its fileobj variable.

I signed up to the egenix mailing list, and asked them to clarify what they meant by "Removed support for buffer-compatible input objects" in their change log, and specifically if this meant we can't use Python's StringIO handles. The reply was:

> Yes, we had to do this as a result of the restructuring of the
> underlying code which no longer works on a char* pointer, but
> instead uses the object type information to see whether it needs
> to compile a Unicode tag table or a string one.

I suspect the use of StringIO / cStringIO in Biopython would explain most/all of the Martel-based test failures. I'm not sure if we can work around this...
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 12:36:51 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Sep 2007 08:36:51 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709131236.l8DCapcZ012966@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #23 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 08:36 EST -------
Follow up from egenix - I'm not sure quite what this means in Martel...

I asked:

> Do you have any suggested workarounds for using mxTextTools to parse
> data held in a string (rather than read from a handle to an opened file)?

And Marc-Andre Lemburg replied:

> I think I lost you there :-)
>
> mxTextTools *does* work on Python strings and Unicode. It no longer works
> on objects that just expose the buffer API. We'll likely add support for
> that at some later stage, but for now, the Unicode support was more
> important to get right.
>
> You can easily convert a StringIO instance to a Python string using
> .getvalue() method.
>
> For larger amounts of data, it's also a good idea to process the data
> in chunks. mxTextTools allows for this by returning the index of where
> it stopped parsing the input.
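The conversion suggested in the reply (calling .getvalue() on a StringIO instance) might be wrapped as follows. This is a sketch in current Python, where io.StringIO stands in for the Python 2 StringIO/cStringIO classes discussed above; the helper name as_text is hypothetical and is not part of Martel or mxTextTools.

```python
import io


def as_text(source):
    """Return the full contents of `source` as a plain string.

    Accepts a string, an in-memory handle that offers getvalue(),
    or any other object with a read() method.
    """
    if isinstance(source, str):
        return source
    if hasattr(source, "getvalue"):
        # getvalue() returns the whole buffer, whatever the current position.
        return source.getvalue()
    # read() returns only what lies after the current position.
    return source.read()
```

A parser entry point could call such a helper once on its input and hand the resulting string to the tagging engine, so that no buffer-style object is passed in.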
From bugzilla-daemon at portal.open-bio.org Thu Sep 13 22:38:32 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Sep 2007 18:38:32 -0400
Subject: [Biopython-dev] [Bug 2348] Slicing the Seq object (returns a string when use a stride)
In-Reply-To:
Message-ID: <200709132238.l8DMcWBc028839@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2348

------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-13 18:38 EST -------
P.S. I still think slicing a Seq object with a stride should return another Seq object, but some of the functions/methods in Bio/Seq.py actually expected a string. I have now fixed those, and extended test_seq.py to actually check these functions.

From bugzilla-daemon at portal.open-bio.org Fri Sep 14 09:17:50 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Sep 2007 05:17:50 -0400
Subject: [Biopython-dev] [Bug 2366] New: Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2366

           Summary: Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq
           Product: Biopython
           Version: Not Applicable
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk

Currently the (reverse) complement functions/methods in Bio.Seq do NOT support ambiguous nucleotides.

For example, the complement of H={ACU} should be D={UGA}

I'll upload a patch to Bio/Seq.py and its unit test in a moment... bugzilla doesn't let you do this as part of filing a bug.
From bugzilla-daemon at portal.open-bio.org Fri Sep 14 09:21:59 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Sep 2007 05:21:59 -0400
Subject: [Biopython-dev] [Bug 2366] Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq
In-Reply-To:
Message-ID: <200709140921.l8E9LxDf004471@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2366

------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-14 05:21 EST -------
Created an attachment (id=758)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=758&action=view)
Patch to Bio/Seq.py and Tests/test_seq.py and Tests/output/test_se

* Fixes (reverse) complement of ambiguous sequences
* Removes some code duplication (at the cost of extra function calls)
* Adds some missing doc strings
* Includes a mini-test in Bio/Seq.py (which can be removed)

From bugzilla-daemon at portal.open-bio.org Fri Sep 14 09:39:00 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Sep 2007 05:39:00 -0400
Subject: [Biopython-dev] [Bug 2364] New version of MeltingTemp.py
In-Reply-To:
Message-ID: <200709140939.l8E9d0UZ005615@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2364

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-14 05:38 EST -------
So this updates Bio/SeqUtils/MeltingTemp.py to use string methods instead of the string module. Seems fine to me.
It may just be my imagination (working on Linux), but it seems Bio/SeqUtils/MeltingTemp.py has been checked into CVS as a binary file with Windows/DOS new lines. After running dos2unix on things I can get a sensible diff between local copies. If I run unix2dos on your new version, then cvs diff gives sensible output.

From bugzilla-daemon at portal.open-bio.org Fri Sep 14 10:25:58 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Sep 2007 06:25:58 -0400
Subject: [Biopython-dev] [Bug 2366] Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq
In-Reply-To:
Message-ID: <200709141025.l8EAPwWk009283@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2366

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-14 06:25 EST -------
Part of my updated test_seq.py unit test fails when run with the entire test suite; it appears some other unit test is polluting the Bio.Data.IUPACData.ambiguous_dna_values dictionary.

Adding this to test_seq.py (after applying the patch) seems to fix this.

#When running the full test suite, some other unit test is polluting this dict:
for ambig_char in ["-", "?"] :
    if ambig_char in ambiguous_dna_values :
        del ambiguous_dna_values[ambig_char]
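For readers following bug 2366, here is a minimal sketch of complementing ambiguous DNA with the standard IUPAC codes. It is not the attached patch and does not use Bio.Seq or Bio.Data.IUPACData; the table and function names are written out here purely for illustration, and T is used where the bug report writes U.

```python
# Each IUPAC code maps to the code for the complementary base set,
# e.g. H = {A, C, T} complements to D = {T, G, A}.
AMBIGUOUS_DNA_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "M": "K", "K": "M", "R": "Y", "Y": "R",
    "W": "W", "S": "S",
    "V": "B", "B": "V", "H": "D", "D": "H",
    "N": "N",
}


def complement(sequence):
    """Complement a DNA string, preserving case; unknown letters raise KeyError."""
    result = []
    for letter in sequence:
        comp = AMBIGUOUS_DNA_COMPLEMENT[letter.upper()]
        result.append(comp if letter.isupper() else comp.lower())
    return "".join(result)


def reverse_complement(sequence):
    """Complement the string, then reverse it."""
    return complement(sequence)[::-1]
```

Because the table is a private module-level constant that nothing mutates, this sketch sidesteps the shared-dictionary pollution described in comment 2.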
From bugzilla-daemon at portal.open-bio.org Fri Sep 14 12:53:00 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Sep 2007 08:53:00 -0400
Subject: [Biopython-dev] [Bug 2364] New version of MeltingTemp.py
In-Reply-To:
Message-ID: <200709141253.l8ECr0hP020991@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2364

------- Comment #3 from sbassi at gmail.com 2007-09-14 08:52 EST -------
(In reply to comment #2)
> It may just be my imagination (working on Linux), but it seems
> Bio/SeqUtils/MeltingTemp.py has been checked into CVS as a binary file with
> Windows/DOS new lines.

The original version was made under Windows; now I work with Linux. I evolved, and so did my code and platform :)

From bugzilla-daemon at portal.open-bio.org Fri Sep 14 13:51:28 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Sep 2007 09:51:28 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709141351.l8EDpShE025015@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #24 from mdehoon at ims.u-tokyo.ac.jp 2007-09-14 09:51 EST -------
Would it be possible to remove the dependence of Bio.SeqIO on Bio.GenBank?

I am trying to disentangle the mxTextTools-dependent stuff from the code unaffected by the recent mxTextTools update. Often, the easiest way to do this is to replace the Martel-dependent code with Bio.SeqIO (for example, see my update of Bio/SeqUtils/__init__.py). But if Bio.SeqIO then relies on a Martel-based parser, we're back to square one.
From bugzilla-daemon at portal.open-bio.org Fri Sep 14 19:29:18 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Sep 2007 15:29:18 -0400
Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0
In-Reply-To:
Message-ID: <200709141929.l8EJTIFT012178@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2361

------- Comment #25 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-14 15:29 EST -------
Bio.SeqIO does depend on Bio.GenBank for both EMBL and GenBank parsing.

The good news is the only bit of Bio.GenBank which depends on Martel is the index_file() function and Dictionary class in Bio/GenBank/__init__.py, which work in the same way as the equivalent functions in Bio.Fasta.

Note that I did find one excess import statement in Bio/GenBank/__init__.py, which I have now removed in CVS revision 1.74

From bugzilla-daemon at portal.open-bio.org Sun Sep 16 11:25:52 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 16 Sep 2007 07:25:52 -0400
Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS?
In-Reply-To:
Message-ID: <200709161125.l8GBPqLc012096@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2363

biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         OS/Version|Windows XP                  |All
           Platform|PC                          |All
            Summary|Bio.Pathway files not stored|Some python files not stored
                   |as plain text in CVS?       |as plain text in CVS?

------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-16 07:25 EST -------
Additionally, Sebastián Bassi's Bio/SeqUtils/MeltingTemp.py appears to be stored in CVS with DOS/Windows newlines. So far this has only caused problems with the diff command. See bug 2364

And in the other direction, Doc/Images/BlastRecord.png, PSIBlastRecord.png and smcra.png appear to be checked in as text: they work fine on Linux, but when checked out on Windows the images are corrupt.

From bugzilla-daemon at portal.open-bio.org Sun Sep 16 11:27:47 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 16 Sep 2007 07:27:47 -0400
Subject: [Biopython-dev] [Bug 2364] New version of MeltingTemp.py
In-Reply-To:
Message-ID: <200709161127.l8GBRlF1012214@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2364

biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-16 07:27 EST -------
I've checked this in, /Bio/SeqUtils/MeltingTemp.py revision 1.6 - Thanks Sebastian

I've made a note of the new line problem (CVS text vs. binary) on Bug 2363
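One way to spot files affected by the new line problem noted on Bug 2363 is to count line-ending styles in the raw bytes. The sketch below is in current Python; the function names are hypothetical and nothing here is part of CVS or Biopython.

```python
def count_line_endings(data):
    """Count CRLF, bare LF and bare CR line endings in a bytes object."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


def has_dos_newlines(path):
    """Return True if the file at `path` contains any CRLF line ending."""
    with open(path, "rb") as handle:
        return count_line_endings(handle.read())["crlf"] > 0
```

Opening the file in binary mode matters here: text mode would translate the line endings before they could be counted.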
From bugzilla-daemon at portal.open-bio.org Mon Sep 17 10:25:48 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Sep 2007 06:25:48 -0400 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200709171025.l8HAPm6R011219@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-17 06:25 EST ------- Also Bio\ECell\__init__.py seems to need its new lines "fixed" for the unit tests to pass on Windows. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Sep 17 14:36:14 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Sep 2007 10:36:14 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709171436.l8HEaECE028354@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #26 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-17 10:36 EST ------- Marked the index_file() function and Dictionary classes in Bio.Fasta and Bio.GenBank as deprecated, and removed the corresponding test in test_Fasta.py. test_Fasta.py now passes with mxTextTools 3.0 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
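Marking a function as deprecated, as in comment 26 above, is commonly done with the standard warnings module. The following is only a minimal sketch written for current Python, not the change that was checked into CVS; the name legacy_index_file and the wording of its message are hypothetical stand-ins.

```python
import warnings


def legacy_index_file(filename, indexname):
    """Hypothetical stand-in for a function that is being phased out."""
    warnings.warn(
        "legacy_index_file() is deprecated; build a dictionary of records instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this function
    )
    # The real indexing work would go here; this sketch does nothing further.
    return None
```

Because the warning is raised inside the function body, only code that actually calls the function sees it. A warning raised at module import time would instead reach every module that imports it, directly or indirectly.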
From bugzilla-daemon at portal.open-bio.org Mon Sep 17 14:40:55 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Sep 2007 10:40:55 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709171440.l8HEetmT028671@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #27 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-17 10:40 EST ------- Created an attachment (id=759) --> (http://bugzilla.open-bio.org/attachment.cgi?id=759&action=view) Patch to remove Bio.FormatIO This patch affects the following files: Bio/FormatIO.py Now just a place holder that raises an ImportError, to help anyone work out what is wrong if they have any old code using Bio.FormatIO Bio/SeqRecord.py Removes the code which needed Bio.FormatIO, means Bio.SeqRecord.io is no longer defined. Bio/Search.py Removes the code which needed Bio.FormatIO, means Bio.Search.io is no longer defined. Bio/Search.py is still used from Bio/builders/Search/search.py and that appears to be OK still (?) Tests/test_format_registry.py Removed bits using Bio.SeqRecord.io and Bio.Search.io I think this means test_format_registry.py now passes with mxTextTools 3.0 -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
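The "place holder that raises an ImportError" idea from comment 27 can be sketched as follows. This is an illustration only, not the contents of the attached patch; the message text and the helper name raise_removed are assumptions.

```python
# Hypothetical body of a placeholder module left behind after a removal.
REMOVED_MESSAGE = (
    "Bio.FormatIO has been removed from Biopython; "
    "please update any code that still imports it."
)


def raise_removed():
    """Raise the error that the placeholder would raise at import time."""
    raise ImportError(REMOVED_MESSAGE)
```

In an actual placeholder module the raise statement would sit at module level, so that a plain import of the old name fails with the explanatory message rather than with a generic "No module named ..." error. It is wrapped in a function here only so the sketch can be loaded and tried safely.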
From bugzilla-daemon at portal.open-bio.org Mon Sep 17 15:48:49 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Sep 2007 11:48:49 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709171548.l8HFmnUM001364@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 ------- Comment #28 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-17 11:48 EST ------- Following comment 26, I have updated Bio/SCOP/__init__.py to use Bio.SeqIO.to_dict() instead of Bio.Fasta.index_file() and the Bio.Fasta.Dictionary class. Now test_SCOP_Astral.py passes without triggering the deprecation warnings I added to Bio.Fasta. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From mdehoon at c2b2.columbia.edu Tue Sep 18 08:25:47 2007 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Tue, 18 Sep 2007 04:25:47 -0400 Subject: [Biopython-dev] Status of the upcoming release Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B623@mail2.exch.c2b2.columbia.edu> Hi everybody, Originally I was planning to create a new Biopython release during last weekend. However, as you have seen from the discussions on the mailing list, while we were preparing for this new release, we discovered that Biopython does not work well with the new version of mxTextTools (3.0). This code is being used by Martel, which is used for various parsers in Biopython. In particular Peter and I have been trying to find solutions for this problem, but we're not quite there yet. Currently, I am getting two remaining errors from the Biopython test suite (I believe there were ten when we started). I feel that we should postpone the release until we sort this out. 
The difficulty of solving these bugs is that they are located in various interdependent modules. None of the currently active developers are familiar with this code. To make matters worse, some of the code cannot even be deprecated without causing spurious deprecation warnings all over Biopython (even in totally unrelated code). On the bright side, there seem to be few (if any) users of the code that are causing the mxTextTools problems. Therefore I think that in practice, few users will actually run into problems if we remove the offending modules. So it may not be worth banging our heads over this. Unfortunately I will be out of town for the next ten days (I had been hoping to finish the release before), so I'm afraid the next release will have to wait until after that. In the mean time, feel free to download current Biopython versions from CVS to see if all your favorite modules are still there. If not, let us know which module you'd like to retain (and why). --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From bugzilla-daemon at portal.open-bio.org Mon Sep 24 14:08:51 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 24 Sep 2007 10:08:51 -0400 Subject: [Biopython-dev] [Bug 2372] New: installing with non-admin permissions Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2372 Summary: installing with non-admin permissions Product: Biopython Version: Not Applicable Platform: Other OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: gould at embl.de I have a scenario where I want to install python 2.5 and biopython 1.43(also dependencies egenix-mx-base-3.0.0 and Numeric-24.2) in a non-standard install directory as I have only non-admin permissions on a particular machine. 
I have selected a single directory into which I have installed everything with the PATH env variable now pointing to this version of python as opposed to one in /usr/bin. I have followed the instructions as per: http://biopython.org/DIST/docs/install/Installation.html However, there seems to be something missing as some of the tests in biopython 1.43 fail as outlined below: ERROR: test_CodonUsage ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_CodonUsage.py", line 10, in ? X.generate_index("./CodonUsage/HighlyExpressedGenes.txt") File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/SeqUtils/CodonUsage.py", line 74, in generate_index self._count_codons(FastaFile) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/SeqUtils/CodonUsage.py", line 117, in _count_codons cur_record = iterator.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_Fasta2 ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_Fasta2.py", 
line 44, in ? data = record_parser.parse( src_handle ) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Fasta/__init__.py", line 100, in parse return self.convert_lax(iterator.next()) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_KEGG ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_KEGG.py", line 67, in ? 
t_KEGG_Enzyme(test_KEGG_Enzyme_files) File "test_KEGG.py", line 23, in t_KEGG_Enzyme record = records.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/KEGG/Enzyme/__init__.py", line 225, in next data = self._reader.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/RecordReader.py", line 295, in next positions = _find_end_positions(lookahead, self.tagtable) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/RecordReader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' ====================================================================== ERROR: test_align ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_align.py", line 129, in ? 
alignment = FastaAlign.parse_file(to_parse, 'PROTEIN') File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Fasta/FastaAlign.py", line 48, in parse_file cur_align = iterator.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Fasta/__init__.py", line 72, in next result = self._iterator.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/IterParser.py", line 152, in iterateFile self.header_parser.parseString(rec) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 0 ====================================================================== ERROR: test_format_registry ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_format_registry.py", line 49, in ? 
parser.parseFile(_open('EDD_RAT.dat')) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 452, in parseFile self._err_handler.fatalError(ParserRecordException( File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserRecordException: Traceback (most recent call last): File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 444, in parseFile record = reader.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/RecordReader.py", line 295, in next positions = _find_end_positions(lookahead, self.tagtable) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/RecordReader.py", line 239, in _find_end_positions raise ReaderError("invalid format starting with %s" % repr(text[:50])) ReaderError: invalid format starting with '' ====================================================================== ERROR: test_geo ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 162, in runSafeTest cur_test = __import__(self.test_name) File "test_geo.py", line 24, in ? 
record = records.next() File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Geo/__init__.py", line 79, in next return self._parser.parse(File.StringHandle(data)) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Geo/__init__.py", line 228, in parse self._scanner.feed(handle, self._consumer) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Bio/Geo/__init__.py", line 126, in feed self._parser.parseFile(handle) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 328, in parseFile self.parseString(fileobj.read()) File "/home/gould/biopython-1.43/build/lib.linux-x86_64-2.4/Martel/Parser.py", line 356, in parseString self._err_handler.fatalError(result) File "/g/gibson/gould/submaster/python//lib/python2.4/xml/sax/handler.py", line 38, in fatalError raise exception ParserPositionException: error parsing at or beyond character 1427 ====================================================================== FAIL: test_Fasta ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'Basic operation of the Record Parser. ... ERROR\n' Expected: 'Basic operation of the Record Parser. ... 
ok\n' ====================================================================== FAIL: test_GenBankFormat ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 166, in runSafeTest cur_test.run_tests([]) File "test_GenBankFormat.py", line 588, in run_tests test_list.test() File "martel_support.py", line 51, in test raise AssertionError, "cannot parse" AssertionError: cannot parse ====================================================================== FAIL: test_NNGene ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'Find all motifs in a set of sequences. ... ERROR\n' Expected: 'Find all motifs in a set of sequences. ... ok\n' ====================================================================== FAIL: test_SCOP_Astral ---------------------------------------------------------------------- Traceback (most recent call last): File "run_tests.py", line 149, in runTest self.runSafeTest() File "run_tests.py", line 186, in runSafeTest expected_handle) File "run_tests.py", line 286, in compare_output assert expected_line == output_line, \ AssertionError: Output : 'testConstructWithCustomFile (test_SCOP_Astral.AstralTests) ... ERROR\n' Expected: 'testConstructWithCustomFile (test_SCOP_Astral.AstralTests) ... ok\n' ---------------------------------------------------------------------- Ran 92 tests in 81.465s I've tried various versions of python with biopython but without success at getting these tests to run. 
I basically need the following piece of code to run:

from Bio.WWW import ExPASy
from Bio.SwissProt import SProt
from Bio import File
results = ExPASy.get_sprot_raw('P12931')
all_results = results.read()
sp_parser = SProt.RecordParser()
sp_iterator = SProt.Iterator(File.StringHandle(all_results), sp_parser)
Record = sp_iterator.next()

but it crashes out at the last line with error:

  File "/g/gibson/gould/submaster/python/lib/python2.4/site-packages/Bio/ParserSupport.py", line 300, in read_and_call
    raise SyntaxError, errmsg
SyntaxError: Line does not start with 'SQ': PE 1: Evidence at protein level;

any suggestions as to what the problem might be would be appreciated. thanks in advance -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Sep 24 14:24:34 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 24 Sep 2007 10:24:34 -0400 Subject: [Biopython-dev] [Bug 2372] installing with non-admin permissions In-Reply-To: Message-ID: <200709241424.l8OEOYsj011002@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2372 biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE

------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-24 10:24 EST ------- This isn't a problem with the install location - it looks like a duplicate of bug 2361, egenix-mx-base-3.0.0 isn't fully backwards compatible. We hope to have a new release out within a few weeks which will address (most of) the egenix mxTextTools trouble; however if you don't want to wait then you could install biopython from CVS.
Your short example does work for me using Biopython CVS and egenix base 3.0.0 *** This bug has been marked as a duplicate of bug 2361 *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Sep 24 14:24:37 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 24 Sep 2007 10:24:37 -0400 Subject: [Biopython-dev] [Bug 2361] Test Suite Failures from Martel/Sax with egenix mxTextTools 3.0 In-Reply-To: Message-ID: <200709241424.l8OEObYj011015@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2361 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |gould at embl.de ------- Comment #29 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-24 10:24 EST ------- *** Bug 2372 has been marked as a duplicate of this bug. *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From biopython-dev at maubp.freeserve.co.uk Tue Sep 25 11:15:48 2007 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Tue, 25 Sep 2007 12:15:48 +0100 Subject: [Biopython-dev] poor man's databases for large sequence files In-Reply-To: <46F8C37A.1000005@maubp.freeserve.co.uk> References: <46F83061.3090207@maubp.freeserve.co.uk> <46F86705.1090109@mail.nih.gov> <46F8C37A.1000005@maubp.freeserve.co.uk> Message-ID: <46F8EDE4.7030702@maubp.freeserve.co.uk> On the discussion list I wrote: > I've been thinking about extending Bio.SeqIO to support a (read only) > dictionary like interface for large sequence files (WITHOUT having > everything in memory). 
> > Some of the older Biopython sequence format specific modules have an > index_file function and matching Dictionary class to do this (based > internally on either Martel/Mindy or a DIY Biopython indexer based on > pickle).

Some thoughts and timings using Bio.SwissProt.SProt, and the 1.1 GB UniProt file. I have enough RAM that Linux has probably cached the entire flat file for me. Just in case, I have run these timings a few times to be fair. Note that just counting the records takes about 6 mins using the SeqRecord parser. I think we can do a lot better. Anyway, I wanted to talk about indexing files as simple read only databases.

Using the current (old) SProt indexing functions:
index_file - about 7 or 8 mins, one file of 34 MB (small!)
Dictionary - about 16s
random access - well under 0.1s

This old code works using Bio.Index to store the start (seek position) and length of each record (as determined by parsing the entire file) using cPickle. In theory, any sequential file format could be handled this way - provided the parser leaves the handle's seek position in a sensible place when returning records. This approach will not work for non-sequential file formats (e.g. most alignments).

My experimental code instead stores every SeqRecord object in full using cPickle (in one large file), and the seek positions for these pickled records in a second small index file (as a dict stored with cPickle).

Experimental code with pickled SeqRecord objects:
indexing file - about 7 or 8 mins (similar), two files, 554 MB (big!)
loading index - under 1s (much faster)
random access - well under 0.1s (similar, maybe faster)

This approach will work on any file format (and even for objects other than SeqRecord objects, provided they can be pickled). It seems to be a lot faster when loading the index, at the expense of requiring a LARGE index file. The indexing times for the two methods are very similar - about 6 mins of this is parsing the records in the first place.
I haven't yet looked at using the python shelve library to provide a read only dictionary. Also python's marshal library may be useful. Then there is the Mindy back end, used in Bio.Fasta and Bio.GenBank for their index_file and Dictionary classes (which replaced previous Bio.Index based code). I haven't timed these. Peter P.S. Using any of pickle, shelve or marshal does leave a potential security hole if anyone could prepare a malicious index file. From biopython-dev at maubp.freeserve.co.uk Tue Sep 25 18:48:37 2007 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Tue, 25 Sep 2007 19:48:37 +0100 Subject: [Biopython-dev] [BioPython] poor man's databases for large sequence files In-Reply-To: <46F8F3E5.5020802@mail.nih.gov> References: <46F83061.3090207@maubp.freeserve.co.uk> <46F86705.1090109@mail.nih.gov> <46F8C37A.1000005@maubp.freeserve.co.uk> <46F8F3E5.5020802@mail.nih.gov> Message-ID: <46F95805.5030906@maubp.freeserve.co.uk> I wrote: >> What I had in mind was say indexing all of UniProt which is currently >> 1.1 GB in the SwissProt flat file format, but each record is pretty small. 
I have written some experimental code to store SeqRecord objects using pickle (and zlib), and tried this on the 283454 UniProt records from here (both fasta and swiss-prot flat file format):

ftp://ftp.uniprot.org/pub/databases/uniprot_datafiles_by_format/fasta/uniprot_sprot.fasta.gz
ftp://ftp.uniprot.org/pub/databases/uniprot_datafiles_by_format/flatfile/uniprot_sprot.dat.gz

Fasta file, "uniprot_sprot.fasta", 125 MB
* my pickled SeqRecord database needs about 230 MB (two files), takes about 30s to build the index, 1s to load it
* my zlib-pickled SeqRecord database needs about 147 MB (two files), takes about 75s to build the index, 2s to load it
* existing Bio.Fasta index using Mindy needs 73 MB (four files), takes about 90s to build the index, 2s to load it

SwissProt file, "uniprot_sprot.dat", 1.1 GB
* my pickled SeqRecord database needs about 550 MB (two files), takes about 7min to build the index, 1s to load it
* my zlib-pickled SeqRecord database needs about 295 MB (two files), takes about 8min to build the index, 3s to load it
* existing Bio.SwissProt.SProt index needs only 35 MB (one file), takes about 7.5min to build the index, 16s to load it

Note that just parsing the big SwissProt format file takes about 6min; indexing it adds only a comparatively modest overhead. In all cases, once the index has been built and loaded, accessing records by key is almost instantaneous. In terms of run time, my experimental (zlib) pickled read only dictionary is comparable to the existing Biopython functionality - they are both sub-second.

However, is the overhead of the bigger index files too much? We appear to be talking about between twice and ten times the size required by the old format specific indexing. Comments?

The reason my index files are big is that I am storing complete records - not just their position within the original file. The motivation is that this will work with any file format (regardless of the parser), or even any collection of records.
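For readers who want to see the shape of the approach, here is a minimal sketch of a zlib-compressed pickle store with a separate offset index. It is written for illustration in current Python and is not the experimental code timed above; the names build_database and PickledRecordDict are hypothetical, and plain dictionaries stand in for SeqRecord objects.

```python
import os
import pickle
import tempfile
import zlib


def build_database(records, data_path, index_path):
    """Store (key, record) pairs as compressed pickles plus an offset index."""
    offsets = {}
    with open(data_path, "wb") as data:
        for key, record in records:
            blob = zlib.compress(pickle.dumps(record, pickle.HIGHEST_PROTOCOL))
            offsets[key] = (data.tell(), len(blob))  # seek position and length
            data.write(blob)
    with open(index_path, "wb") as index:
        pickle.dump(offsets, index, pickle.HIGHEST_PROTOCOL)


class PickledRecordDict:
    """Read-only, dictionary-like view of a database made by build_database."""

    def __init__(self, data_path, index_path):
        with open(index_path, "rb") as index:
            self._offsets = pickle.load(index)  # only the small index stays in memory
        self._data = open(data_path, "rb")

    def __getitem__(self, key):
        start, length = self._offsets[key]
        self._data.seek(start)
        return pickle.loads(zlib.decompress(self._data.read(length)))

    def __contains__(self, key):
        return key in self._offsets

    def __len__(self):
        return len(self._offsets)

    def keys(self):
        return self._offsets.keys()

    def close(self):
        self._data.close()


# Small usage example with made-up records in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    data_path = os.path.join(folder, "records.dat")
    index_path = os.path.join(folder, "records.idx")
    build_database(
        [("P1", {"seq": "MKV"}), ("P2", {"seq": "MAL"})], data_path, index_path
    )
    database = PickledRecordDict(data_path, index_path)
    print(database["P2"])
    database.close()
```

Storing whole records is what makes the data file large. Unpickling also executes whatever the file contains, so such a database should only be loaded from a trusted source.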
Peter From bugzilla-daemon at portal.open-bio.org Tue Sep 25 20:56:52 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 25 Sep 2007 16:56:52 -0400 Subject: [Biopython-dev] [Bug 1944] Align.Generic adding iterator and more In-Reply-To: Message-ID: <200709252056.l8PKuq8N007917@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1944 ------- Comment #10 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-25 16:56 EST ------- I've checked in the __iter__ part of the patch, which addresses the main thrust of this bug. I have not yet checked in the __getitem__ bit because I think the behaviour of the splicing options should match whatever we decide to do for SeqRecord and Seq objects. I'm currently considering creating a new Alignment class to live in Bio/Align/__init__.py (which will make it easier to import - much more discoverable) which would subclass list directly. In particular I want to allow creation of an alignment directly from a list/iterator/generator of SeqRecord objects - something impossible with the current __init__ arguments. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
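The idea in comment 10 of an alignment that subclasses list and can be built from any list, iterator or generator can be sketched as below. This is a toy written for current Python, not the class proposed on the bug: SimpleAlignment and width are hypothetical names, plain strings stand in for SeqRecord objects, and the equal-length check is an assumption added for illustration.

```python
class SimpleAlignment(list):
    """A list of equal-length rows; plain strings stand in for SeqRecord objects."""

    def __init__(self, rows=()):
        super().__init__()
        self.extend(rows)  # accepts any iterable, including a generator

    def append(self, row):
        # Every row must match the length of the first row already stored.
        if self and len(row) != len(self[0]):
            raise ValueError(
                "row of length %i does not match alignment length %i"
                % (len(row), len(self[0]))
            )
        super().append(row)

    def extend(self, rows):
        for row in rows:
            self.append(row)

    def width(self):
        """Return the number of columns, or 0 for an empty alignment."""
        return len(self[0]) if self else 0


alignment = SimpleAlignment(row for row in ["ACGT", "AC-T"])
print(len(alignment), alignment.width())
```

Subclassing list gives iteration, len() and indexing for free, but other inherited mutators such as insert(), slice assignment and += are not guarded in this sketch and would need the same check.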
From bugzilla-daemon at portal.open-bio.org Tue Sep 25 22:19:32 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 25 Sep 2007 18:19:32 -0400 Subject: [Biopython-dev] [Bug 1944] Align.Generic adding iterator and more In-Reply-To: Message-ID: <200709252219.l8PMJWvc012510@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1944 ------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-25 18:19 EST ------- Created an attachment (id=768) --> (http://bugzilla.open-bio.org/attachment.cgi?id=768&action=view) Replacement Bio/Align/__init__.py (alignment update v4) This is based on attachment 732, my 3rd version of a patch to Bio/Align/Generic.py but handled as new alignment class in Bio/Align/__init__.py This implements a new alignment class which: * directly subclasses the python list (as a list of SeqRecords) * allows flexible subscripting using __getitem__ * enforces strict alphabet and length checking in __init__, append and extend There is plenty more polish needed - including tackling tricky questions like __setitem__ (or __setslice__) and the related questions about editing alignments. As per my comment 10, I would like to get SeqRecord to support splicing giving SeqRecords with (partial) annotation. If this is done, then the alignment class can exploit this (i.e. only have one set of code dealing with the annotation when splicing SeqRecords). Right now only the id/name/description are preserved when splicing alignments. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 26 04:50:09 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Sep 2007 00:50:09 -0400 Subject: [Biopython-dev] [Bug 2374] New: Uppdated lcc code. 
Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2374

Summary: Uppdated lcc code.
Product: Biopython
Version: 1.43
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: sbassi at gmail.com

Here is a revised version of the lcc code. Changes:
1) Cleaned up some code (removed the global variable and the string module).
2) Works for both lowercase and uppercase sequences.
3) Both functions in this module now expect just the sequence for which to calculate the lcc, not a sequence to be sliced. So it is now up to the caller to pass the sliced string.

-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 26 04:52:33 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Sep 2007 00:52:33 -0400 Subject: [Biopython-dev] [Bug 2374] Uppdated lcc code. In-Reply-To: Message-ID: <200709260452.l8Q4qXS5032351@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2374 ------- Comment #1 from sbassi at gmail.com 2007-09-26 00:52 EST ------- Created an attachment (id=769) --> (http://bugzilla.open-bio.org/attachment.cgi?id=769&action=view) New version of LCC -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Sep 26 08:03:56 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Sep 2007 04:03:56 -0400 Subject: [Biopython-dev] [Bug 2374] Uppdated lcc code.
In-Reply-To: Message-ID: <200709260803.l8Q83ubg009768@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2374 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-26 04:03 EST ------- Could you change the docstrings to follow PEP 257 more closely? http://www.python.org/dev/peps/pep-0257/

In particular I'd like it to:
* explicitly say if the input can be either a Seq object or a plain string
* state wsize should be an integer
* describe the return value (list of floats, and float, I believe?)
* give the full name - which I am guessing is low composition complexity (LCC)

It would make sense to move this from Bio/lcc.py to Bio/SeqUtils/lcc.py (like Michiel recently moved the crc.py module). Would you have any objections to this? The code clearly only looks at ACTG; extending it to unambiguous nucleotides is possible, right (DNA or RNA)? What about ambiguous nucleotides? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
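As an illustration of the docstring points raised in comment 2, here is a sketch of a PEP 257 style docstring that states the accepted input types, the type of wsize and the return value. The function window_complexity is a hypothetical stand-in written for current Python and is not the attached LCC code; its base-4 entropy score is an assumption chosen only to give the docstring something concrete to describe.

```python
import math


def window_complexity(seq, wsize):
    """Return a complexity score for every window of a nucleotide sequence.

    Arguments:
    seq -- the sequence, given as a plain string or as any object whose
           str() is the sequence (upper or lower case).
    wsize -- window size, a positive integer no larger than len(seq).

    Returns a list of floats, one per window start position, each between
    0.0 (a single repeated letter) and 1.0 (A, C, G and T equally common).
    Letters other than A, C, G and T are ignored when counting.
    """
    text = str(seq).upper()
    if not isinstance(wsize, int) or not 0 < wsize <= len(text):
        raise ValueError("wsize must be an integer between 1 and len(seq)")
    scores = []
    for start in range(len(text) - wsize + 1):
        window = text[start:start + wsize]
        score = 0.0
        for letter in "ACGT":
            fraction = window.count(letter) / wsize
            if fraction > 0:
                score -= fraction * math.log(fraction, 4)
        scores.append(score)
    return scores
```

The one-line summary, the blank line after it, and the explicit description of arguments and return value follow the layout that PEP 257 recommends for multi-line docstrings.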
From bugzilla-daemon at portal.open-bio.org Thu Sep 27 17:36:51 2007 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Sep 2007 13:36:51 -0400 Subject: [Biopython-dev] [Bug 1944] Align.Generic adding iterator and more In-Reply-To: Message-ID: <200709271736.l8RHaphd019477@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1944 biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #768 is|0                           |1
           obsolete|                            |

------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2007-09-27 13:36 EST ------- Created an attachment (id=770) --> (http://bugzilla.open-bio.org/attachment.cgi?id=770&action=view) Replacement Bio/Align/__init__.py (alignment update v5)

This implements a new alignment class which:
* directly subclasses the python list (as a list of SeqRecords)
* should be fully backwards compatible with Bio.Align.Generic.Alignment
* implements __str__ and __repr__ methods which are usable on large alignments
* allows flexible subscripting using __getitem__
* enforces strict alphabet and length checking in __init__, append, extend, __add__ and __radd__ (the last two give list like addition)

Provisos from comment 11 still apply. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From biopython-dev at maubp.freeserve.co.uk Sat Sep 29 12:02:02 2007 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Sat, 29 Sep 2007 13:02:02 +0100 Subject: [Biopython-dev] Code review? Reverse complements etc Message-ID: <46FE3EBA.1010907@maubp.freeserve.co.uk> Would anyone have a chance to go over my patch on Bug 2366, Ambiguous nucleotides in (Reverse)complement functions in Bio.Seq? http://bugzilla.open-bio.org/show_bug.cgi?id=2366 It would be great to have some comments on this before Michiel starts getting Biopython 1.44 ready. Thanks Peter