From p.j.a.cock at googlemail.com Sat Oct 5 06:15:33 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 5 Oct 2013 11:15:33 +0100
Subject: [Biopython-dev] GenePop tests - No such file or directory: 'big.gen.IN2'
Message-ID:

Hi Tiago,

The buildbot has often been failing, particularly on Jython 2.7 on
Windows XP, but also on Linux from time to time. The failure is
often stochastic - a rerun can fix it. e.g.

http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Jython%202.7/builds/267/steps/shell/logs/stdio

======================================================================
ERROR: test_get_heterozygosity_info (test_PopGen_GenePop_EasyController.AppTest)
Test heterozygosity info.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home_local/buildslave/BuildBot_Biopython/jython27lin64/build/Tests/test_PopGen_GenePop_EasyController.py", line 53, in test_get_heterozygosity_info
    hz_info = self.ctrl.get_heterozygosity_info(0, "Locus2")
  File "/home_local/buildslave/BuildBot_Biopython/jython27lin64/build/Bio/PopGen/GenePop/EasyController.py", line 76, in get_heterozygosity_info
    geno_freqs = self._controller.calc_allele_genotype_freqs(self._fname)
  File "/home_local/buildslave/BuildBot_Biopython/jython27lin64/build/Bio/PopGen/GenePop/Controller.py", line 679, in calc_allele_genotype_freqs
    locf = open(fname + ".IN2")
IOError: [Errno 2] No such file or directory: 'big.gen.IN2'
----------------------------------------------------------------------

The error appears to be in Bio/PopGen/GenePop/Controller.py where
sometimes copying a file and then opening it immediately can fail:

    popf = open(fname + ".INF")
    shutil.copyfile(fname + ".INF", fname + ".IN2")
    locf = open(fname + ".IN2")
    pop_iter = _FileIterator(pop_parser, popf, fname + ".INF")
    locus_iter = _FileIterator(locus_parser, locf, fname + ".IN2")

It seems the _FileIterator class is a wrapper to loop over a
file and then delete the file - and since you need to parse the
file in two ways (pop_parser and locus_parser) you've made a
copy of the file. I presume these files are too big to load into
memory and then delete immediately?

In terms of fixing the symptoms, we could try something crude
like this (untested):

    popf = open(fname + ".INF")
    shutil.copyfile(fname + ".INF", fname + ".IN2")
    while not os.path.isfile(fname + ".IN2"):
        sleep(0.5)
    locf = open(fname + ".IN2")
    pop_iter = _FileIterator(pop_parser, popf, fname + ".INF")
    locus_iter = _FileIterator(locus_parser, locf, fname + ".IN2")

I'm not familiar with the output files so can't immediately
see a more satisfactory solution.

Peter

P.S. As an aside, I would refactor this so that _FileIterator
opens the handle itself from the filename given, which seems
cleaner (opening and closing the handle in one place) and
makes the calling code shorter too:

    shutil.copyfile(fname + ".INF", fname + ".IN2")
    pop_iter = _FileIterator(pop_parser, fname + ".INF")
    locus_iter = _FileIterator(locus_parser, fname + ".IN2")

(If you have no objections, I'm happy to make that change)

From p.j.a.cock at googlemail.com Sat Oct 5 15:02:47 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 5 Oct 2013 20:02:47 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Mon, Sep 30, 2013 at 5:18 PM, Peter Cock wrote:
> On Mon, Sep 30, 2013 at 12:22 AM, Peter Cock wrote:
>
>> Assuming my methodology isn't flawed, we're about half way
>> in terms of getting every file in Biopython do be dual Python 2
>> and Python 3 code:
>>
>> 262 no change, 290 need fixers
>> Troublesome ones at 52.5%
>
> New numbers with Bio._py3k.urllib changes which should
> have dropped the number of troublesome files by at most
> 13 files:
>
> 374 no change, 177 need fixers
> Troublesome ones 32.1%
>
> I think my markup script is a bit fragile in terms of the exact
> sequence of steps with do2to3.py etc.
But much better > numbers than Sunday night :) I wasn't using the -B switch in diff until now, that makes things easier: 383 no change, 171 need fixers Troublesome ones 30.9% Revised branch here: https://github.com/peterjc/biopython/tree/mark2to3b https://travis-ci.org/peterjc/biopython/builds/12175589 This is rebased on the master where I've also cut down the number of fixers in use, so together we get a good speed up for the Python 3 install time. I've rebased the urllib changes (include in the above test branch) and made a pull request for comment: https://github.com/biopython/biopython/pull/245 Peter From p.j.a.cock at googlemail.com Sat Oct 5 17:36:25 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sat, 5 Oct 2013 22:36:25 +0100 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: On Sat, Oct 5, 2013 at 8:02 PM, Peter Cock wrote: > On Mon, Sep 30, 2013 at 5:18 PM, Peter Cock wrote: >> On Mon, Sep 30, 2013 at 12:22 AM, Peter Cock wrote: >> >>> Assuming my methodology isn't flawed, we're about half way >>> in terms of getting every file in Biopython do be dual Python 2 >>> and Python 3 code: >>> >>> 262 no change, 290 need fixers >>> Troublesome ones at 52.5% >> >> New numbers with Bio._py3k.urllib changes which should >> have dropped the number of troublesome files by at most >> 13 files: >> >> 374 no change, 177 need fixers >> Troublesome ones 32.1% >> >> I think my markup script is a bit fragile in terms of the exact >> sequence of steps with do2to3.py etc. 
But much better >> numbers than Sunday night :) > > I wasn't using the -B switch in diff until now, that makes > things easier: > > 383 no change, 171 need fixers > Troublesome ones 30.9% > > Revised branch here: > > https://github.com/peterjc/biopython/tree/mark2to3b > https://travis-ci.org/peterjc/biopython/builds/12175589 > > This is rebased on the master where I've also cut down the > number of fixers in use, so together we get a good speed > up for the Python 3 install time. > > I've rebased the urllib changes (include in the above > test branch) and made a pull request for comment: > https://github.com/biopython/biopython/pull/245 > > Peter Incorporating another new feature branch gives: 387 no change, 161 need fixers Troublesome ones 29.4% The new batch of 2to3 issues solved is changes to built in functions like range, zip, map, filter. Branch: https://github.com/peterjc/biopython/tree/builtins https://github.com/biopython/biopython/pull/246 Peter From p.j.a.cock at googlemail.com Sun Oct 6 10:03:00 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 6 Oct 2013 15:03:00 +0100 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: On Sat, Oct 5, 2013 at 10:36 PM, Peter Cock wrote: > > Incorporating another new feature branch gives: > > 387 no change, 161 need fixers > Troublesome ones 29.4% > > The new batch of 2to3 issues solved is changes to > built in functions like range, zip, map, filter. Branch: > https://github.com/peterjc/biopython/tree/builtins > https://github.com/biopython/biopython/pull/246 I've added basestring and input to the builtins branch (pull request updated), helps even more. 
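As a rough illustration of the kind of shim being described for
range, zip, map, filter, basestring and input (a sketch only, assuming
a hypothetical stand-alone helper module; this is not the actual
contents of Bio._py3k or of the builtins branch):

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3: the built-ins are already lazy, so simply re-export
    # them and provide the one name that no longer exists.
    basestring = str
    input = input
    range = range
    zip = zip
    map = map
    filter = filter
else:
    # Python 2: swap in the iterator-based equivalents so that calling
    # code behaves the same way on both versions.
    from itertools import imap as map, izip as zip, ifilter as filter
    basestring = basestring
    input = raw_input
    range = xrange


def is_text(value):
    """Return True for string-like values on Python 2 and Python 3."""
    return isinstance(value, basestring)


# Calling code wraps in list() wherever a real list is needed.
pairs = list(zip(range(3), "ACG"))
```

Code elsewhere would then import these names from the helper rather
than relying on the version-specific built-ins.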
However, I realised I am effectively reimplementing the MIT licensed 'six' library with 'Bio._py3k' and it would be simpler to just use that instead (and that would make life easier for contributors already using 'six' on other projects): https://pypi.python.org/pypi/six/ https://bitbucket.org/gutworth/six http://pythonhosted.org/six/ Expect a slight reworking of these branches to appear later, bundling a copy of 'six' as Bio/_py3k/__init__.py Peter From p.j.a.cock at googlemail.com Sun Oct 6 16:23:57 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 6 Oct 2013 21:23:57 +0100 Subject: [Biopython-dev] Moving from thread to threading in Bio.PopGen? Message-ID: Hi Tiago, The Python 2 thread library is _thread under Python 3, hinting they'd like us to use the new threading library instead. How easy do you think that would be? http://docs.python.org/2.6/library/threading.html Here's a relevant looking snippet of code: http://stackoverflow.com/questions/4003783/translate-thread-start-new-thread-to-the-new-threading-api Thanks, Peter From p.j.a.cock at googlemail.com Sun Oct 6 17:50:18 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 6 Oct 2013 22:50:18 +0100 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: On Sun, Oct 6, 2013 at 3:03 PM, Peter Cock wrote: > On Sat, Oct 5, 2013 at 10:36 PM, Peter Cock wrote: >> >> Incorporating another new feature branch gives: >> >> 387 no change, 161 need fixers >> Troublesome ones 29.4% >> >> The new batch of 2to3 issues solved is changes to >> built in functions like range, zip, map, filter. Branch: >> https://github.com/peterjc/biopython/tree/builtins >> https://github.com/biopython/biopython/pull/246 > > I've added basestring and input to the builtins branch > (pull request updated), helps even more. 
> > However, I realised I am effectively reimplementing the > MIT licensed 'six' library with 'Bio._py3k' and it would > be simpler to just use that instead (and that would make > life easier for contributors already using 'six' on other > projects): > > https://pypi.python.org/pypi/six/ > https://bitbucket.org/gutworth/six > http://pythonhosted.org/six/ > > Expect a slight reworking of these branches to appear > later, bundling a copy of 'six' ... New branch is https://github.com/peterjc/biopython/tree/six with 'six' bundled and using this for more import fixes. Using that work, we're now at under a quarter of the files needing 2to3 changes using the modified do2to3.py, https://github.com/peterjc/biopython/tree/mark2to3c https://travis-ci.org/peterjc/biopython/builds/12208302 416 no change, 132 need fixers Troublesome ones 24.1% Progress :) Peter From tiagoantao at gmail.com Mon Oct 7 05:51:18 2013 From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=) Date: Mon, 7 Oct 2013 10:51:18 +0100 Subject: [Biopython-dev] Moving from thread to threading in Bio.PopGen? In-Reply-To: References: Message-ID: Hi, I will change this today to get rid of that import thread. Tiago On 6 October 2013 21:23, Peter Cock wrote: > Hi Tiago, > > The Python 2 thread library is _thread under Python 3, > hinting they'd like us to use the new threading library > instead. How easy do you think that would be? 
>
> http://docs.python.org/2.6/library/threading.html
>
> Here's a relevant looking snippet of code:
>
>
> http://stackoverflow.com/questions/4003783/translate-thread-start-new-thread-to-the-new-threading-api
>
> Thanks,
>
> Peter
>

--
"The truth may be out there, but the lies are already in your head" - Terry Pratchett

From pizzadave108 at gmail.com Tue Oct 8 10:51:53 2013
From: pizzadave108 at gmail.com (pizza Dave)
Date: Tue, 8 Oct 2013 15:51:53 +0100
Subject: [Biopython-dev] Thesis Ideas
Message-ID:

Hi,

I am currently doing a postgrad in data analytics. I come from a plant
biotechnology background, so I would like to work in the bioinformatics
area in the future. I am currently trying to think of a thesis project.
I am a strong programmer, currently using mostly Python but also
experienced in C++ and Java. I would like to get some ideas for
contributions I could make to Biopython as a thesis. I would like to do
something with a lot of programming because that's what I'm good at, and
it's also a lot of fun. It would also be great to contribute to the
development of Biopython, so I am wondering whether you have any ideas
for a project I could do for my thesis. I have about 10 weeks to
complete it, starting in January, and it's worth 10 credits, so it's
quite a small, short project. Any ideas would be greatly appreciated.

Thanks,
David Morrisroe

From p.j.a.cock at googlemail.com Wed Oct 9 02:27:31 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 Oct 2013 07:27:31 +0100
Subject: [Biopython-dev] PyTennessee 2014, Nashville, February 22nd and 23rd
Message-ID:

Any Biopythoneers based in/near Tennessee?

http://www.pytennessee.org/speaking/cfp/

"PyTennessee 2014, taking place for the first time in Nashville on
February 22nd and 23rd, will be accepting all types of proposals
starting Oct 1st, 2013 through Nov 1st, 2013. Due to the competitive
nature of the selection process, we encourage prospective speakers to
submit their proposals as early as possible.
We're looking forward to receiving your best proposals for tutorials
and talks. Lightning talk sign ups will be available at the
registration desk the day of the event."

According to the organisers on Twitter they'd be interested in a
Biopython talk submission,
https://twitter.com/ed_dodds/status/387664414305820672
https://twitter.com/PyTennessee/status/387666325612396545

Peter

From p.j.a.cock at googlemail.com Mon Oct 14 11:00:46 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 14 Oct 2013 16:00:46 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

Hello all,

Despite a nasty cold, I've made further progress over the
weekend. Switching to assuming Python 3 style dictionaries
is the single biggest step forward - and as long as we have
good test coverage I think this is low risk. I think a dual
code base without needing 2to3 may be attainable for
the next Biopython release.

However, before that, I'd like to take a moment to discuss
changing imports, e.g. Doc/examples/getgene.py

Do people prefer something explicit like this,

    try:
        import gdbm  # Python 2
    except ImportError:
        from dbm import gnu as gdbm  # Python 3

Or something via a helper library (e.g. our Bio._py3k or a
bundled copy of the six library):

    from six.moves import dbm_gnu as gdbm

That's a rare example, something far more common is
StringIO, which also crops up in our doctests. e.g.

Python 2 only:

    >>> from StringIO import StringIO

Both 2 and 3:

    >>> try:
    ...     from StringIO import StringIO # Python 2
    ... except ImportError:
    ...     from io import StringIO # Python 3
    ...

Both via Bio._py3k, not ideal for a doctest as it is a private
module intended as an implementation detail:

    >>> from Bio._py3k import StringIO

Both via six, not ideal if we're bundling it as Bio._six or similar:

    >>> from six import StringIO

Or, for a more common and more complex example, have a look at
how urllib has changed under Python 3.
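As a sketch of what the explicit try/except style looks like for
urllib (the particular names imported and the example URL are
illustrative assumptions, not taken from the Biopython commits):

```python
try:
    # Python 2: the functionality is spread over three modules.
    from urllib2 import urlopen, Request
    from urllib import urlencode
    from urlparse import urlparse
except ImportError:
    # Python 3: everything lives inside the urllib package.
    from urllib.request import urlopen, Request
    from urllib.parse import urlencode, urlparse

# Hypothetical query, only to show the imported names in use.
# Nothing is fetched here, so no network access is needed.
params = urlencode({"db": "nucleotide"})
url = "http://example.org/fetch?" + params
host = urlparse(url).netloc
```

Every module that touches the network needs a block like this, which
is the repetition a central helper module would avoid.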
See some of the commits here: https://github.com/peterjc/biopython/tree/six For docstrings, I actually prefer the explicit commented version with the try/except. For the main code, using a central helper like Bio._py3k or a bundled copy of six makes sense from a code management perspective - it would ensure consistency (and be easy to remove once we drop Python 2 support). Any thoughts? Thanks, Peter From eric.talevich at gmail.com Mon Oct 14 16:34:00 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Mon, 14 Oct 2013 13:34:00 -0700 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: On Mon, Oct 14, 2013 at 8:00 AM, Peter Cock wrote: > Hello all, > > Despite a nasty cold, I've made further progress over the > weekend. Switching to assuming Python 3 style dictionaries > is a single biggest step forward - and as long as we have > good test coverage I think this is low risk. I think a dual > code base without needing 2to3 may be attainable for > the next Biopython release. Nice :) > [...] something far more common is > StringIO, which also crops up in our doctests. e.g. > > Python 2 only: > > >>> from StringIO import StringIO > > Both 2 and 3: > > >>> try: > ... from StringIO import StringIO # Python 2 > ... except ImportError: > ... from io import StringIO # Python 3 > ... > For the case of StringIO and BytesIO, the top-level io module was added in Python 2.6: http://docs.python.org/2/library/io.html In Py2.6, the implementation of io.StringIO is slow, and io.BytesIO not meaningfully different from io.StringIO, but both should be fine for doctests. -Eric From p.j.a.cock at googlemail.com Mon Oct 14 19:36:21 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 15 Oct 2013 00:36:21 +0100 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: On Mon, Oct 14, 2013 at 9:34 PM, Eric Talevich wrote: > On Mon, Oct 14, 2013 at 8:00 AM, Peter Cock wrote: > >> >> [...] 
something far more common is >> >> StringIO, which also crops up in our doctests. e.g. >> >> Python 2 only: >> >> >>> from StringIO import StringIO >> >> Both 2 and 3: >> >> >>> try: >> ... from StringIO import StringIO # Python 2 >> ... except ImportError: >> ... from io import StringIO # Python 3 >> ... > > For the case of StringIO and BytesIO, the top-level > io module was added in Python 2.6: > http://docs.python.org/2/library/io.html Yes, but under Python 2 io.StringIO is unicode based, while StringIO.StringIO & cStringIO.StringIO are (bytes) string based. It isn't a drop in replacement for our text based parsers. Where we do want byte strings (e.g. binary formats like SFF) then we can and now do use io.BytesIO in place of the old StringIO usage - and that then works nicely without change under Python3 as well. > In Py2.6, the implementation of io.StringIO is > slow, and io.BytesIO not meaningfully different > from io.StringIO, but both should be fine for > doctests. Yeah - for doctests the speed of the different StringIO options is immaterial. Peter From harijay at gmail.com Fri Oct 18 19:43:34 2013 From: harijay at gmail.com (hari jayaram) Date: Fri, 18 Oct 2013 19:43:34 -0400 Subject: [Biopython-dev] sphinx build with github source has error in Restriction.py Message-ID: Hi , This is my first post here..so hope I am following the conventions/practices. I am trying to create a Dash.app docset for biopython. Dash.app is a very fast documentation browser , sadly for OSX alone. One way to getting a docset built is to point it at sphinx documentation. I am using the sphinx-apidoc to auto-generate the documentation ( ver Sphinx==1.2b3). Sphinx can generate the documentation just fine for most of the source tree. It has an issue with Restriction.py where it does not like cls.size. 
The error is:

  File "/Users/hari/.virtualenvs/pyvectormapdraw/lib/python2.7/site-packages/Bio/Restriction/Restriction.py", line 324, in __len__
    return cls.size
AttributeError: type object 'RestrictionType' has no attribute 'size'

I could cheat and change it to "return 1" in the code, and then the
documentation built just fine.
I don't know why Sphinx cares about that line enough to throw an error,
or how to get around that without cheating.

Thanks
Hari

On a side note: The Sphinx documentation looks fine in a browser, but
the doc2dash app could not index it to yield a docset. The problem was
somewhere in the HTML; Beautiful Soup entered an infinite loop.
I could however use pydoctor to generate a docset. It seems to work OK,
but while the Sphinx docs created 15,804 index entries, the pydoctor
index had only around 5000 entries.

Ref: http://kapeli.com/docsets

From p.j.a.cock at googlemail.com Sun Oct 20 10:18:42 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 20 Oct 2013 15:18:42 +0100
Subject: [Biopython-dev] sphinx build with github source has error in Restriction.py
In-Reply-To:
References:
Message-ID:

On Sat, Oct 19, 2013 at 12:43 AM, hari jayaram wrote:
> Hi ,
> This is my first post here..so hope I am following the
> conventions/practices.

Welcome Hari,

> I am trying to create a Dash.app docset for biopython. Dash.app is a very
> fast documentation browser , sadly for OSX alone.

This one http://kapeli.com/dash ?

> One way to getting a docset built is to point it at sphinx documentation.
> I am using the sphinx-apidoc to auto-generate the documentation ( ver
> Sphinx==1.2b3).
>
> Sphinx can generate the documentation just fine for most of the source
> tree.

How are you invoking Sphinx here? (i.e. what command line
so we can try to reproduce the problem locally)

> It has an issue with Restriction.py where it does not like cls.size.
> The error is: > > File > "/Users/hari/.virtualenvs/pyvectormapdraw/lib/python2.7/site-packages/Bio/Restriction/Restriction.py", > line 324, in __len__ > return cls.size > AttributeError: type object 'RestrictionType' has no attribute 'size' > > I could cheat..and change it to "return 1" in the code and I got the > documentation to build just fine. The RestrictionType.__init__ documentation warns it is not intended for direct use (see the auto-generated file Restriction_Dictionary.py and associated magic to create a class for each enzyme). It seems Sphinx is trying to call the methods of the class. > I dont know why sphinx cares about that line to throw an error , > or how to get around that without cheating. I'm not sure either - there must be other strange Python classes where __len__ doesn't work without special initialisation. That might be worth asking on the Sphinx mailing list? > Thanks > Hari > > On a side note: The sphinx documentation , looks fine in a browser , but > the dash2doc app could not index it to yield a docset. The problem was > somehwere in the html , beautiful soup entered an infinite loop. Could be a bug in the HTML output from sphinx, perhaps a malformed tag from an unusual string in our comments? > I could however use pydoctor to generate a docset. It seems > to work OK..but while sphinx doc created 15,804 index entries. > The pydoctor index had only around 5000 entries. > > Ref: http://kapeli.com/docsets I don't know if those numbers should be the same, or if one counts modules while the other counts classes/functions etc? Peter P.S. 
The HTML API docs we distribute are generated with epydoc, see http://biopython.org/wiki/Building_a_release From p.j.a.cock at googlemail.com Sun Oct 20 10:32:16 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 20 Oct 2013 15:32:16 +0100 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: Hi all, I've just made a pull request on dictionary method handling: https://github.com/biopython/biopython/pull/248 Some comments over on GitHub (or here) would be great. Thanks, Peter From p.j.a.cock at googlemail.com Sun Oct 20 15:16:32 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sun, 20 Oct 2013 20:16:32 +0100 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: Hi all, I've made a pull request which solves all the remaining Python 2 vs 3 imports using try/except: https://github.com/biopython/biopython/pull/249 Some comments over on GitHub (or here) would be great. Thanks, Peter [Yes, I've also got a 'six' branch which did this using a bundled copy of the six library, but I'm not sure we really need to bother with that. This seems lighter.] From p.j.a.cock at googlemail.com Tue Oct 22 09:59:58 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 22 Oct 2013 14:59:58 +0100 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: On Sun, Oct 20, 2013 at 3:32 PM, Peter Cock wrote: > Hi all, > > I've just made a pull request on dictionary method handling: > https://github.com/biopython/biopython/pull/248 > > Some comments over on GitHub (or here) would be great. 
> > Thanks, > > Peter Thanks for looking over that Eric, if there are no objections I intend to rebase and apply the dictionary changes later this week: https://github.com/biopython/biopython/pull/248 Separately, regarding the imports issue - do people have a preference on the try/except as demonstrated here https://github.com/biopython/biopython/pull/249 versus a compatibility layer in Bio._py3k, or a bundled copy of 'six'? e.g. https://github.com/peterjc/biopython/tree/builtins e.g. https://github.com/peterjc/biopython/tree/six Thanks, Peter From eric.talevich at gmail.com Tue Oct 22 12:35:25 2013 From: eric.talevich at gmail.com (Eric Talevich) Date: Tue, 22 Oct 2013 09:35:25 -0700 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: On Tue, Oct 22, 2013 at 6:59 AM, Peter Cock wrote: > On Sun, Oct 20, 2013 at 3:32 PM, Peter Cock wrote: > > Hi all, > > > > I've just made a pull request on dictionary method handling: > > https://github.com/biopython/biopython/pull/248 > > > > Some comments over on GitHub (or here) would be great. > > > > Thanks, > > > > Peter > > Thanks for looking over that Eric, if there are no objections > I intend to rebase and apply the dictionary changes later this > week: https://github.com/biopython/biopython/pull/248 > > Separately, regarding the imports issue - do people have > a preference on the try/except as demonstrated here > https://github.com/biopython/biopython/pull/249 versus > a compatibility layer in Bio._py3k, or a bundled copy of > 'six'? > > e.g. https://github.com/peterjc/biopython/tree/builtins > e.g. https://github.com/peterjc/biopython/tree/six > > Thanks, > > Peter > I just looked at the source code for six: https://bitbucket.org/gutworth/six/src/db5564076aa8/six.py?at=default It's very compact, much shorter than I expected but also quite dense. I get the sense they've had enough eyes on the codebase to sort out performance issues and edge cases, e.g. sys.MAXSIZE on Jython. 
For docstrings, I agree that directly showing the try/except block is more informative for users on either genus of Python. For the rest of the codebase, I would favor using a bundled copy of six (e.g. Bio._six). The benefits are (a) not having to discover and fix all the subtle bugs ourselves, (b) to be explicit about where we've done something for Py2/3 compatibility and not as an essential part of the way the code is supposed to work, and (c) six has its own documentation. I also see some virtue in not relying on six/Bio._py3k where it's not necessary, since six is compatible back to Python 2.4 and we only go back to Python 2.6 now. Halfway approach: just look at six and copy only the bits we need into _py3k? -Eric From p.j.a.cock at googlemail.com Tue Oct 22 12:42:49 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 22 Oct 2013 17:42:49 +0100 Subject: [Biopython-dev] Python 2 and 3 migration thoughts In-Reply-To: References: Message-ID: On Tue, Oct 22, 2013 at 5:35 PM, Eric Talevich wrote: > On Tue, Oct 22, 2013 at 6:59 AM, Peter Cock wrote: >> >> Separately, regarding the imports issue - do people have >> a preference on the try/except as demonstrated here >> https://github.com/biopython/biopython/pull/249 versus >> a compatibility layer in Bio._py3k, or a bundled copy of >> 'six'? >> >> e.g. https://github.com/peterjc/biopython/tree/builtins >> e.g. https://github.com/peterjc/biopython/tree/six >> >> Thanks, >> >> Peter > > I just looked at the source code for six: > https://bitbucket.org/gutworth/six/src/db5564076aa8/six.py?at=default > > It's very compact, much shorter than I expected but also quite dense. I get > the sense they've had enough eyes on the codebase to sort out performance > issues and edge cases, e.g. sys.MAXSIZE on Jython. 
They've fixed two little bugs I reported, but this remains open:
https://bitbucket.org/gutworth/six/issue/41/from-sixmovestkinter-import-and-similar

I'm avoiding it though:
https://github.com/biopython/biopython/commit/c36fdbaad432d477c64ad5768df7062340530176

> For docstrings, I agree that directly showing the try/except block is more
> informative for users on either genus of Python.

Agreed.

> For the rest of the
> codebase, I would favor using a bundled copy of six (e.g. Bio._six). The
> benefits are (a) not having to discover and fix all the subtle bugs
> ourselves, (b) to be explicit about where we've done something for Py2/3
> compatibility and not as an essential part of the way the code is supposed
> to work, and (c) six has its own documentation.
>
> I also see some virtue in not relying on six/Bio._py3k where it's not
> necessary, since six is compatible back to Python 2.4 and we only go back to
> Python 2.6 now. Halfway approach: just look at six and copy only the bits we
> need into _py3k?

OK, I'll focus in that direction then. Six is MIT licensed so we should
be fine bundling it or extracting snippets.

Thanks,

Peter

From harijay at gmail.com Tue Oct 22 15:38:21 2013
From: harijay at gmail.com (hari jayaram)
Date: Tue, 22 Oct 2013 15:38:21 -0400
Subject: [Biopython-dev] sphinx build with github source has error in Restriction.py
In-Reply-To:
References:
Message-ID:

Hi Peter..

Thanks for your comments. I could get a docset to build after all using
pydoctor. The Dash.app is from Kapeli: http://kapeli.com/docsets
It is very fast and works very well with the pydoctor generated docset.

For Sphinx, I was using sphinx-apidoc with the "full" switch.

1) sphinx-apidoc -o apigenout -F Bio

2) In the apigenout directory I ran a "make html" to build the sphinx
documentation. This had some errors depending on the sphinx and python
version. This is also the step where it complained about the
Restriction.py code, until I changed the code to get it to stop
complaining.
From p.j.a.cock at googlemail.com Sat Oct 5 10:15:33 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 5 Oct 2013 11:15:33 +0100
Subject: [Biopython-dev] GenePop tests - No such file or directory: 'big.gen.IN2'
Message-ID:

Hi Tiago,

The buildbot has often been failing, particularly on Jython 2.7
on Windows XP, but also on Linux from time to time.
The failure is often stochastic - a rerun can fix it. e.g.

http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Jython%202.7/builds/267/steps/shell/logs/stdio

======================================================================
ERROR: test_get_heterozygosity_info (test_PopGen_GenePop_EasyController.AppTest)
Test heterozygosity info.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home_local/buildslave/BuildBot_Biopython/jython27lin64/build/Tests/test_PopGen_GenePop_EasyController.py", line 53, in test_get_heterozygosity_info
    hz_info = self.ctrl.get_heterozygosity_info(0, "Locus2")
  File "/home_local/buildslave/BuildBot_Biopython/jython27lin64/build/Bio/PopGen/GenePop/EasyController.py", line 76, in get_heterozygosity_info
    geno_freqs = self._controller.calc_allele_genotype_freqs(self._fname)
  File "/home_local/buildslave/BuildBot_Biopython/jython27lin64/build/Bio/PopGen/GenePop/Controller.py", line 679, in calc_allele_genotype_freqs
    locf = open(fname + ".IN2")
IOError: [Errno 2] No such file or directory: 'big.gen.IN2'
----------------------------------------------------------------------

The error appears to be in Bio/PopGen/GenePop/Controller.py where
sometimes copying a file and then opening it immediately can fail:

        popf = open(fname + ".INF")
        shutil.copyfile(fname + ".INF", fname + ".IN2")
        locf = open(fname + ".IN2")
        pop_iter = _FileIterator(pop_parser, popf, fname + ".INF")
        locus_iter = _FileIterator(locus_parser, locf, fname + ".IN2")

It seems the _FileIterator class is a wrapper to loop over a file
and then delete the file - and since you need to parse the file
in two ways (pop_parser and locus_parser) you've made a copy of
the file. I presume these files are too big to load into memory
and then delete immediately?
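
If the files did fit in memory, the copy could be avoided by reading
the lines once, deleting the file, and then parsing the in-memory
lines twice. The following is only a rough sketch of that idea, not
code from Controller.py: load_and_remove, the file name and the
"Pop"/"Locus" line prefixes are hypothetical stand-ins, and the real
GenePop output format is not shown in this thread.

```python
import os
import tempfile


def load_and_remove(path):
    """Read every line of path into memory, then delete the file."""
    with open(path) as handle:
        lines = handle.readlines()
    os.remove(path)
    return lines


# Hypothetical stand-in for a GenePop ".INF" output file.
tmp_dir = tempfile.mkdtemp()
inf_path = os.path.join(tmp_dir, "example.INF")
with open(inf_path, "w") as handle:
    handle.write("Pop: 1\nLocus: A\nPop: 2\nLocus: B\n")

# One read, two independent passes over the same in-memory lines,
# so no second copy of the file is needed on disk.
lines = load_and_remove(inf_path)
pop_lines = [line.strip() for line in lines if line.startswith("Pop")]
locus_lines = [line.strip() for line in lines if line.startswith("Locus")]
```

The trade-off is memory: this only makes sense when the output is
small enough to hold in a list, which is exactly the open question.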

In terms of fixing the symptoms, we could try something crude like
this (untested):

        popf = open(fname + ".INF")
        shutil.copyfile(fname + ".INF", fname + ".IN2")
        while not os.path.isfile(fname + ".IN2"):
            sleep(0.5)
        locf = open(fname + ".IN2")
        pop_iter = _FileIterator(pop_parser, popf, fname + ".INF")
        locus_iter = _FileIterator(locus_parser, locf, fname + ".IN2")

I'm not familiar with the output files so can't immediately see
a more satisfactory solution.

Peter

P.S. As an aside, I would refactor this so that _FileIterator
opens the handle itself from the filename given, which seems
cleaner (opening and closing the handle in one place) and
makes the calling code shorter too:

        shutil.copyfile(fname + ".INF", fname + ".IN2")
        pop_iter = _FileIterator(pop_parser, fname + ".INF")
        locus_iter = _FileIterator(locus_parser, fname + ".IN2")

(If you have no objections, I'm happy to make that change)

From p.j.a.cock at googlemail.com Sat Oct 5 19:02:47 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 5 Oct 2013 20:02:47 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Mon, Sep 30, 2013 at 5:18 PM, Peter Cock wrote:
> On Mon, Sep 30, 2013 at 12:22 AM, Peter Cock wrote:
>
>> Assuming my methodology isn't flawed, we're about half way
>> in terms of getting every file in Biopython to be dual Python 2
>> and Python 3 code:
>>
>> 262 no change, 290 need fixers
>> Troublesome ones at 52.5%
>
> New numbers with Bio._py3k.urllib changes which should
> have dropped the number of troublesome files by at most
> 13 files:
>
> 374 no change, 177 need fixers
> Troublesome ones 32.1%
>
> I think my markup script is a bit fragile in terms of the exact
> sequence of steps with do2to3.py etc.
> But much better numbers than Sunday night :)

I wasn't using the -B switch in diff until now, that makes
things easier:

383 no change, 171 need fixers
Troublesome ones 30.9%

Revised branch here:

https://github.com/peterjc/biopython/tree/mark2to3b
https://travis-ci.org/peterjc/biopython/builds/12175589

This is rebased on the master where I've also cut down the
number of fixers in use, so together we get a good speed
up for the Python 3 install time.

I've rebased the urllib changes (included in the above
test branch) and made a pull request for comment:
https://github.com/biopython/biopython/pull/245

Peter

From p.j.a.cock at googlemail.com Sat Oct 5 21:36:25 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 5 Oct 2013 22:36:25 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Sat, Oct 5, 2013 at 8:02 PM, Peter Cock wrote:
> On Mon, Sep 30, 2013 at 5:18 PM, Peter Cock wrote:
>> On Mon, Sep 30, 2013 at 12:22 AM, Peter Cock wrote:
>>
>>> Assuming my methodology isn't flawed, we're about half way
>>> in terms of getting every file in Biopython to be dual Python 2
>>> and Python 3 code:
>>>
>>> 262 no change, 290 need fixers
>>> Troublesome ones at 52.5%
>>
>> New numbers with Bio._py3k.urllib changes which should
>> have dropped the number of troublesome files by at most
>> 13 files:
>>
>> 374 no change, 177 need fixers
>> Troublesome ones 32.1%
>>
>> I think my markup script is a bit fragile in terms of the exact
>> sequence of steps with do2to3.py etc.
>> But much better numbers than Sunday night :)
>
> I wasn't using the -B switch in diff until now, that makes
> things easier:
>
> 383 no change, 171 need fixers
> Troublesome ones 30.9%
>
> Revised branch here:
>
> https://github.com/peterjc/biopython/tree/mark2to3b
> https://travis-ci.org/peterjc/biopython/builds/12175589
>
> This is rebased on the master where I've also cut down the
> number of fixers in use, so together we get a good speed
> up for the Python 3 install time.
>
> I've rebased the urllib changes (include in the above
> test branch) and made a pull request for comment:
> https://github.com/biopython/biopython/pull/245
>
> Peter

Incorporating another new feature branch gives:

387 no change, 161 need fixers
Troublesome ones 29.4%

The new batch of 2to3 issues solved is changes to
built in functions like range, zip, map, filter. Branch:
https://github.com/peterjc/biopython/tree/builtins
https://github.com/biopython/biopython/pull/246

Peter

From p.j.a.cock at googlemail.com Sun Oct 6 14:03:00 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 6 Oct 2013 15:03:00 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Sat, Oct 5, 2013 at 10:36 PM, Peter Cock wrote:
>
> Incorporating another new feature branch gives:
>
> 387 no change, 161 need fixers
> Troublesome ones 29.4%
>
> The new batch of 2to3 issues solved is changes to
> built in functions like range, zip, map, filter. Branch:
> https://github.com/peterjc/biopython/tree/builtins
> https://github.com/biopython/biopython/pull/246

I've added basestring and input to the builtins branch
(pull request updated), helps even more.
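
For readers unfamiliar with the built-in differences mentioned
here, a compatibility shim for them might look roughly like the
sketch below. This is an illustration only and not the contents of
the builtins branch or of Bio._py3k: the names text_types,
lazy_range, lazy_zip and read_line are hypothetical, chosen so the
example does not shadow the real built-ins.

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3: range and zip are already lazy, str is the text type,
    # and input() returns the typed line as a string.
    text_types = (str,)
    lazy_range = range
    lazy_zip = zip
    read_line = input
else:
    # Python 2: pick the lazy or string-returning equivalents.
    from itertools import izip as lazy_zip
    text_types = (basestring,)  # noqa: F821 (Python 2 only name)
    lazy_range = xrange  # noqa: F821 (Python 2 only name)
    read_line = raw_input  # noqa: F821 (Python 2 only name)


def is_text(value):
    """Return True for any text string on either Python version."""
    return isinstance(value, text_types)


# The same calling code now runs unchanged on Python 2 and Python 3.
pairs = list(lazy_zip(lazy_range(3), "abc"))
```

Calling code then imports these names from one place instead of
relying on 2to3 to rewrite each call site at install time.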

However, I realised I am effectively reimplementing the
MIT licensed 'six' library with 'Bio._py3k' and it would
be simpler to just use that instead (and that would make
life easier for contributors already using 'six' on other
projects):

https://pypi.python.org/pypi/six/
https://bitbucket.org/gutworth/six
http://pythonhosted.org/six/

Expect a slight reworking of these branches to appear
later, bundling a copy of 'six' as Bio/_py3k/__init__.py

Peter

From p.j.a.cock at googlemail.com Sun Oct 6 20:23:57 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 6 Oct 2013 21:23:57 +0100
Subject: [Biopython-dev] Moving from thread to threading in Bio.PopGen?
Message-ID:

Hi Tiago,

The Python 2 thread library is _thread under Python 3,
hinting they'd like us to use the new threading library
instead. How easy do you think that would be?

http://docs.python.org/2.6/library/threading.html

Here's a relevant looking snippet of code:

http://stackoverflow.com/questions/4003783/translate-thread-start-new-thread-to-the-new-threading-api

Thanks,

Peter

From p.j.a.cock at googlemail.com Sun Oct 6 21:50:18 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 6 Oct 2013 22:50:18 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Sun, Oct 6, 2013 at 3:03 PM, Peter Cock wrote:
> On Sat, Oct 5, 2013 at 10:36 PM, Peter Cock wrote:
>>
>> Incorporating another new feature branch gives:
>>
>> 387 no change, 161 need fixers
>> Troublesome ones 29.4%
>>
>> The new batch of 2to3 issues solved is changes to
>> built in functions like range, zip, map, filter. Branch:
>> https://github.com/peterjc/biopython/tree/builtins
>> https://github.com/biopython/biopython/pull/246
>
> I've added basestring and input to the builtins branch
> (pull request updated), helps even more.
>
> However, I realised I am effectively reimplementing the
> MIT licensed 'six' library with 'Bio._py3k' and it would
> be simpler to just use that instead (and that would make
> life easier for contributors already using 'six' on other
> projects):
>
> https://pypi.python.org/pypi/six/
> https://bitbucket.org/gutworth/six
> http://pythonhosted.org/six/
>
> Expect a slight reworking of these branches to appear
> later, bundling a copy of 'six' ...

New branch is https://github.com/peterjc/biopython/tree/six
with 'six' bundled and using this for more import fixes.

Using that work, we're now at under a quarter of the files
needing 2to3 changes using the modified do2to3.py,

https://github.com/peterjc/biopython/tree/mark2to3c
https://travis-ci.org/peterjc/biopython/builds/12208302

416 no change, 132 need fixers
Troublesome ones 24.1%

Progress :)

Peter

From tiagoantao at gmail.com Mon Oct 7 09:51:18 2013
From: tiagoantao at gmail.com (Tiago Antão)
Date: Mon, 7 Oct 2013 10:51:18 +0100
Subject: [Biopython-dev] Moving from thread to threading in Bio.PopGen?
In-Reply-To:
References:
Message-ID:

Hi,

I will change this today to get rid of that import thread.

Tiago

On 6 October 2013 21:23, Peter Cock wrote:
> Hi Tiago,
>
> The Python 2 thread library is _thread under Python 3,
> hinting they'd like us to use the new threading library
> instead. How easy do you think that would be?
>
> http://docs.python.org/2.6/library/threading.html
>
> Here's a relevant looking snippet of code:
>
> http://stackoverflow.com/questions/4003783/translate-thread-start-new-thread-to-the-new-threading-api
>
> Thanks,
>
> Peter
>

--
"The truth may be out there, but the lies are already in your head"
- Terry Pratchett

From pizzadave108 at gmail.com Tue Oct 8 14:51:53 2013
From: pizzadave108 at gmail.com (pizza Dave)
Date: Tue, 8 Oct 2013 15:51:53 +0100
Subject: [Biopython-dev] Thesis Ideas
Message-ID:

Hi,

I am currently doing a postgrad in data analytics. I come from a plant
biotechnology background, so I would like to work in the bioinformatics
area in the future. I am currently trying to think of a thesis project.
I am a strong programmer, currently using mostly Python but also
experienced in C++ and Java. I would like to get some ideas of
contributions I could make to Biopython as a thesis. I would like to do
something with a lot of programming because that's what I'm good at and
it's also a lot of fun. It would also be great to contribute to the
development of Biopython, so I am wondering whether you have any ideas
for a project I could do for my thesis. I have about 10 weeks to
complete it, starting in January, and it's worth 10 credits, so it's
quite a small, short project. Any ideas would be greatly appreciated.

Thanks,
David Morrisroe.

From p.j.a.cock at googlemail.com Wed Oct 9 06:27:31 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 9 Oct 2013 07:27:31 +0100
Subject: [Biopython-dev] PyTennessee 2014, Nashville, February 22nd and 23rd
Message-ID:

Any Biopythoneers based in/near Tennessee?

http://www.pytennessee.org/speaking/cfp/

"PyTennessee 2014, taking place for the first time in Nashville
on February 22nd and 23rd, will be accepting all types of
proposals starting Oct 1st, 2013 through Nov 1st, 2013. Due
to the competitive nature of the selection process, we encourage
prospective speakers to submit their proposals as early as possible.
We're looking forward to receiving your best proposals for
tutorials and talks. Lightning talk sign ups will be available
at the registration desk the day of the event."

According to the organisers on Twitter they'd be interested
in a Biopython talk submission,
https://twitter.com/ed_dodds/status/387664414305820672
https://twitter.com/PyTennessee/status/387666325612396545

Peter

From p.j.a.cock at googlemail.com Mon Oct 14 15:00:46 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 14 Oct 2013 16:00:46 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

Hello all,

Despite a nasty cold, I've made further progress over the
weekend. Switching to assuming Python 3 style dictionaries
is the single biggest step forward - and as long as we have
good test coverage I think this is low risk. I think a dual
code base without needing 2to3 may be attainable for
the next Biopython release.

However, before that, I'd like to take a moment to discuss
changing imports, e.g. Doc/examples/getgene.py

Do people prefer something explicit like this,

    try:
        import gdbm # Python 2
    except ImportError:
        from dbm import gnu as gdbm # Python 3

Or something via a helper library (e.g. our Bio._py3k or
a bundled copy of the six library):

    from six import dbm_gnu as gdbm

That's a rare example, something far more common is
StringIO, which also crops up in our doctests. e.g.

Python 2 only:

    >>> from StringIO import StringIO

Both 2 and 3:

    >>> try:
    ...     from StringIO import StringIO # Python 2
    ... except ImportError:
    ...     from io import StringIO # Python 3
    ...

Both via Bio._py3k, not ideal for a doctest as it is
a private module intended as an implementation
detail:

    >>> from Bio._py3k import StringIO

Both via six, not ideal if we're bundling it as Bio._six
or similar:

    >>> from six import StringIO

Or, for a more common and more complex example, have
a look at how urllib has changed under Python 3.
See some of the commits here:
https://github.com/peterjc/biopython/tree/six

For docstrings, I actually prefer the explicit commented
version with the try/except.

For the main code, using a central helper like Bio._py3k
or a bundled copy of six makes sense from a code
management perspective - it would ensure consistency
(and be easy to remove once we drop Python 2 support).

Any thoughts?

Thanks,

Peter

From eric.talevich at gmail.com Mon Oct 14 20:34:00 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Mon, 14 Oct 2013 13:34:00 -0700
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Mon, Oct 14, 2013 at 8:00 AM, Peter Cock wrote:
> Hello all,
>
> Despite a nasty cold, I've made further progress over the
> weekend. Switching to assuming Python 3 style dictionaries
> is a single biggest step forward - and as long as we have
> good test coverage I think this is low risk. I think a dual
> code base without needing 2to3 may be attainable for
> the next Biopython release.

Nice :)

> [...] something far more common is
> StringIO, which also crops up in our doctests. e.g.
>
> Python 2 only:
>
>     >>> from StringIO import StringIO
>
> Both 2 and 3:
>
>     >>> try:
>     ...     from StringIO import StringIO # Python 2
>     ... except ImportError:
>     ...     from io import StringIO # Python 3
>     ...
>

For the case of StringIO and BytesIO, the top-level io module was added
in Python 2.6:
http://docs.python.org/2/library/io.html

In Py2.6, the implementation of io.StringIO is slow, and io.BytesIO not
meaningfully different from io.StringIO, but both should be fine for
doctests.

-Eric

From p.j.a.cock at googlemail.com Mon Oct 14 23:36:21 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 15 Oct 2013 00:36:21 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Mon, Oct 14, 2013 at 9:34 PM, Eric Talevich wrote:
> On Mon, Oct 14, 2013 at 8:00 AM, Peter Cock wrote:
>
>>
>> [...]
>> something far more common is
>>
>> StringIO, which also crops up in our doctests. e.g.
>>
>> Python 2 only:
>>
>>     >>> from StringIO import StringIO
>>
>> Both 2 and 3:
>>
>>     >>> try:
>>     ...     from StringIO import StringIO # Python 2
>>     ... except ImportError:
>>     ...     from io import StringIO # Python 3
>>     ...
>
> For the case of StringIO and BytesIO, the top-level
> io module was added in Python 2.6:
> http://docs.python.org/2/library/io.html

Yes, but under Python 2 io.StringIO is unicode based,
while StringIO.StringIO & cStringIO.StringIO are
(bytes) string based. It isn't a drop in replacement
for our text based parsers.

Where we do want byte strings (e.g. binary formats
like SFF) then we can and now do use io.BytesIO in
place of the old StringIO usage - and that then works
nicely without change under Python 3 as well.

> In Py2.6, the implementation of io.StringIO is
> slow, and io.BytesIO not meaningfully different
> from io.StringIO, but both should be fine for
> doctests.

Yeah - for doctests the speed of the different
StringIO options is immaterial.

Peter

From harijay at gmail.com Fri Oct 18 23:43:34 2013
From: harijay at gmail.com (hari jayaram)
Date: Fri, 18 Oct 2013 19:43:34 -0400
Subject: [Biopython-dev] sphinx build with github source has error in Restriction.py
Message-ID:

Hi,
This is my first post here, so I hope I am following the
conventions/practices.

I am trying to create a Dash.app docset for biopython. Dash.app is a very
fast documentation browser, sadly for OSX alone.

One way of getting a docset built is to point it at sphinx documentation.
I am using sphinx-apidoc to auto-generate the documentation (ver
Sphinx==1.2b3).

Sphinx can generate the documentation just fine for most of the source
tree.

It has an issue with Restriction.py where it does not like cls.size.
The error is:

  File "/Users/hari/.virtualenvs/pyvectormapdraw/lib/python2.7/site-packages/Bio/Restriction/Restriction.py", line 324, in __len__
    return cls.size
AttributeError: type object 'RestrictionType' has no attribute 'size'

I could cheat and change it to "return 1" in the code, and then I got the
documentation to build just fine.

I don't know why sphinx cares about that line enough to throw an error,
or how to get around that without cheating.

Thanks
Hari

On a side note: the sphinx documentation looks fine in a browser, but
the doc2dash app could not index it to yield a docset. The problem was
somewhere in the html; beautiful soup entered an infinite loop.
I could however use pydoctor to generate a docset. It seems
to work OK, but while the sphinx doc created 15,804 index entries,
the pydoctor index had only around 5000 entries.

Ref: http://kapeli.com/docsets

From p.j.a.cock at googlemail.com Sun Oct 20 14:18:42 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 20 Oct 2013 15:18:42 +0100
Subject: [Biopython-dev] sphinx build with github source has error in Restriction.py
In-Reply-To:
References:
Message-ID:

On Sat, Oct 19, 2013 at 12:43 AM, hari jayaram wrote:
> Hi,
> This is my first post here, so I hope I am following the
> conventions/practices.

Welcome Hari,

> I am trying to create a Dash.app docset for biopython. Dash.app is a very
> fast documentation browser, sadly for OSX alone.

This one http://kapeli.com/dash ?

> One way of getting a docset built is to point it at sphinx documentation.
> I am using sphinx-apidoc to auto-generate the documentation (ver
> Sphinx==1.2b3).
>
> Sphinx can generate the documentation just fine for most of the source
> tree.

How are you invoking Sphinx here? (i.e. what command line
so we can try to reproduce the problem locally)

> It has an issue with Restriction.py where it does not like cls.size.
> The error is:
>
> File
> "/Users/hari/.virtualenvs/pyvectormapdraw/lib/python2.7/site-packages/Bio/Restriction/Restriction.py",
> line 324, in __len__
> return cls.size
> AttributeError: type object 'RestrictionType' has no attribute 'size'
>
> I could cheat and change it to "return 1" in the code, and then I got the
> documentation to build just fine.

The RestrictionType.__init__ documentation warns it is not
intended for direct use (see the auto-generated file
Restriction_Dictionary.py and associated magic to create
a class for each enzyme). It seems Sphinx is trying to call
the methods of the class.

> I don't know why sphinx cares about that line enough to throw an error,
> or how to get around that without cheating.

I'm not sure either - there must be other strange Python
classes where __len__ doesn't work without special
initialisation. That might be worth asking on the Sphinx
mailing list?

> Thanks
> Hari
>
> On a side note: the sphinx documentation looks fine in a browser, but
> the doc2dash app could not index it to yield a docset. The problem was
> somewhere in the html; beautiful soup entered an infinite loop.

Could be a bug in the HTML output from sphinx, perhaps a
malformed tag from an unusual string in our comments?

> I could however use pydoctor to generate a docset. It seems
> to work OK, but while the sphinx doc created 15,804 index entries,
> the pydoctor index had only around 5000 entries.
>
> Ref: http://kapeli.com/docsets

I don't know if those numbers should be the same, or if one
counts modules while the other counts classes/functions etc?

Peter

P.S.

The HTML API docs we distribute are generated with epydoc,
see http://biopython.org/wiki/Building_a_release

From p.j.a.cock at googlemail.com Sun Oct 20 14:32:16 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 20 Oct 2013 15:32:16 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

Hi all,

I've just made a pull request on dictionary method handling:
https://github.com/biopython/biopython/pull/248

Some comments over on GitHub (or here) would be great.

Thanks,

Peter

From p.j.a.cock at googlemail.com Sun Oct 20 19:16:32 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 20 Oct 2013 20:16:32 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

Hi all,

I've made a pull request which solves all the remaining
Python 2 vs 3 imports using try/except:
https://github.com/biopython/biopython/pull/249

Some comments over on GitHub (or here) would be great.

Thanks,

Peter

[Yes, I've also got a 'six' branch which did this using a
bundled copy of the six library, but I'm not sure we really
need to bother with that. This seems lighter.]

From p.j.a.cock at googlemail.com Tue Oct 22 13:59:58 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 22 Oct 2013 14:59:58 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Sun, Oct 20, 2013 at 3:32 PM, Peter Cock wrote:
> Hi all,
>
> I've just made a pull request on dictionary method handling:
> https://github.com/biopython/biopython/pull/248
>
> Some comments over on GitHub (or here) would be great.
>
> Thanks,
>
> Peter

Thanks for looking over that Eric, if there are no objections
I intend to rebase and apply the dictionary changes later this
week: https://github.com/biopython/biopython/pull/248

Separately, regarding the imports issue - do people have
a preference on the try/except as demonstrated here
https://github.com/biopython/biopython/pull/249 versus
a compatibility layer in Bio._py3k, or a bundled copy of
'six'?

e.g. https://github.com/peterjc/biopython/tree/builtins
e.g. https://github.com/peterjc/biopython/tree/six

Thanks,

Peter

From eric.talevich at gmail.com Tue Oct 22 16:35:25 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Tue, 22 Oct 2013 09:35:25 -0700
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Tue, Oct 22, 2013 at 6:59 AM, Peter Cock wrote:
> On Sun, Oct 20, 2013 at 3:32 PM, Peter Cock wrote:
> > Hi all,
> >
> > I've just made a pull request on dictionary method handling:
> > https://github.com/biopython/biopython/pull/248
> >
> > Some comments over on GitHub (or here) would be great.
> >
> > Thanks,
> >
> > Peter
>
> Thanks for looking over that Eric, if there are no objections
> I intend to rebase and apply the dictionary changes later this
> week: https://github.com/biopython/biopython/pull/248
>
> Separately, regarding the imports issue - do people have
> a preference on the try/except as demonstrated here
> https://github.com/biopython/biopython/pull/249 versus
> a compatibility layer in Bio._py3k, or a bundled copy of
> 'six'?
>
> e.g. https://github.com/peterjc/biopython/tree/builtins
> e.g. https://github.com/peterjc/biopython/tree/six
>
> Thanks,
>
> Peter
>

I just looked at the source code for six:
https://bitbucket.org/gutworth/six/src/db5564076aa8/six.py?at=default

It's very compact, much shorter than I expected but also quite dense. I get
the sense they've had enough eyes on the codebase to sort out performance
issues and edge cases, e.g. sys.MAXSIZE on Jython.

For docstrings, I agree that directly showing the try/except block is more
informative for users on either genus of Python. For the rest of the
codebase, I would favor using a bundled copy of six (e.g. Bio._six). The
benefits are (a) not having to discover and fix all the subtle bugs
ourselves, (b) to be explicit about where we've done something for Py2/3
compatibility and not as an essential part of the way the code is supposed
to work, and (c) six has its own documentation.

I also see some virtue in not relying on six/Bio._py3k where it's not
necessary, since six is compatible back to Python 2.4 and we only go back to
Python 2.6 now. Halfway approach: just look at six and copy only the bits we
need into _py3k?

-Eric

From p.j.a.cock at googlemail.com Tue Oct 22 16:42:49 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 22 Oct 2013 17:42:49 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Tue, Oct 22, 2013 at 5:35 PM, Eric Talevich wrote:
> On Tue, Oct 22, 2013 at 6:59 AM, Peter Cock wrote:
>>
>> Separately, regarding the imports issue - do people have
>> a preference on the try/except as demonstrated here
>> https://github.com/biopython/biopython/pull/249 versus
>> a compatibility layer in Bio._py3k, or a bundled copy of
>> 'six'?
>>
>> e.g. https://github.com/peterjc/biopython/tree/builtins
>> e.g. https://github.com/peterjc/biopython/tree/six
>>
>> Thanks,
>>
>> Peter
>
> I just looked at the source code for six:
> https://bitbucket.org/gutworth/six/src/db5564076aa8/six.py?at=default
>
> It's very compact, much shorter than I expected but also quite dense. I get
> the sense they've had enough eyes on the codebase to sort out performance
> issues and edge cases, e.g. sys.MAXSIZE on Jython.

They've fixed two little bugs I reported, but this remains open:
https://bitbucket.org/gutworth/six/issue/41/from-sixmovestkinter-import-and-similar

I'm avoiding it though:
https://github.com/biopython/biopython/commit/c36fdbaad432d477c64ad5768df7062340530176

> For docstrings, I agree that directly showing the try/except block is more
> informative for users on either genus of Python.

Agreed.

> For the rest of the
> codebase, I would favor using a bundled copy of six (e.g. Bio._six). The
> benefits are (a) not having to discover and fix all the subtle bugs
> ourselves, (b) to be explicit about where we've done something for Py2/3
> compatibility and not as an essential part of the way the code is supposed
> to work, and (c) six has its own documentation.
>
> I also see some virtue in not relying on six/Bio._py3k where it's not
> necessary, since six is compatible back to Python 2.4 and we only go back to
> Python 2.6 now. Halfway approach: just look at six and copy only the bits we
> need into _py3k?

OK, I'll focus in that direction then. Six is MIT licensed so we should
be fine bundling it or extracting snippets.

Thanks,

Peter

From harijay at gmail.com Tue Oct 22 19:38:21 2013
From: harijay at gmail.com (hari jayaram)
Date: Tue, 22 Oct 2013 15:38:21 -0400
Subject: [Biopython-dev] sphinx build with github source has error in Restriction.py
In-Reply-To:
References:
Message-ID:

Hi Peter,
Thanks for your comments. I could get a docset to build after all using
pydoctor.

The Dash.app is from Kapeli: http://kapeli.com/docsets

It is very fast and works very well with the pydoctor generated docset.

For sphinx, I was using sphinx-apidoc with the "full" switch.

1) sphinx-apidoc -o apigenout -F Bio

2) In the apigenout directory I ran a "make html" to build the sphinx
documentation. This had some errors depending on the sphinx and python
version. This is also the step where it complained about Restriction.py
code. When I changed the code to get it to stop complaining.
I think these document creators do some kind of static analysis on the
code, which may be getting caught up in Restriction.py

3) Then you point the doc2dash application (which uses beautifulsoup4 and
lxml) to build the docset (from: https://github.com/hynek/doc2dash/)

doc2dash did not yield a docset because of an infinite loop. I got some
help with it from Hynek Schlawack, but in the end used pydoctor, which
worked.

doc2dash needs to be pointed to the directory where sphinx puts all the
html files. It automatically adds the docset to the Dash app
~/Library/Application\ Support/doc2dash directory

The command line I used for doc2dash was:

doc2dash --name biopython -A html/

I have a working docset. If anyone wants to give it a go I can gladly
share it with you.

Thanks a tonne
Hari

On Sun, Oct 20, 2013 at 10:18 AM, Peter Cock wrote:
> On Sat, Oct 19, 2013 at 12:43 AM, hari jayaram wrote:
> > Hi,
> > This is my first post here, so I hope I am following the
> > conventions/practices.
>
> Welcome Hari,
>
> > I am trying to create a Dash.app docset for biopython. Dash.app is a very
> > fast documentation browser, sadly for OSX alone.
>
> This one http://kapeli.com/dash ?
>
> > One way of getting a docset built is to point it at sphinx documentation.
> > I am using sphinx-apidoc to auto-generate the documentation (ver
> > Sphinx==1.2b3).
> >
> > Sphinx can generate the documentation just fine for most of the source
> > tree.
>
> How are you invoking Sphinx here? (i.e. what command line
> so we can try to reproduce the problem locally)
>
> > It has an issue with Restriction.py where it does not like cls.size.
> > The error is:
> >
> >   File "/Users/hari/.virtualenvs/pyvectormapdraw/lib/python2.7/site-packages/Bio/Restriction/Restriction.py", line 324, in __len__
> >     return cls.size
> > AttributeError: type object 'RestrictionType' has no attribute 'size'
> >
> > I could cheat and change it to "return 1" in the code, and I got the
> > documentation to build just fine.
>
> The RestrictionType.__init__ documentation warns it is not
> intended for direct use (see the auto-generated file
> Restriction_Dictionary.py and associated magic to create
> a class for each enzyme). It seems Sphinx is trying to call
> the methods of the class.
>
> > I don't know why sphinx cares about that line enough to throw an error,
> > or how to get around that without cheating.
>
> I'm not sure either - there must be other strange Python
> classes where __len__ doesn't work without special
> initialisation. That might be worth asking on the Sphinx
> mailing list?
>
> > Thanks
> > Hari
> >
> > On a side note: The sphinx documentation looks fine in a browser, but
> > the doc2dash app could not index it to yield a docset. The problem was
> > somewhere in the html; beautiful soup entered an infinite loop.
>
> Could be a bug in the HTML output from sphinx, perhaps a
> malformed tag from an unusual string in our comments?
>
> > I could however use pydoctor to generate a docset. It seems
> > to work OK, but while the sphinx doc created 15,804 index entries,
> > the pydoctor index had only around 5000 entries.
> >
> > Ref: http://kapeli.com/docsets
>
> I don't know if those numbers should be the same, or if one
> counts modules while the other counts classes/functions etc?
>
> Peter
>
> P.S.
>
> The HTML API docs we distribute are generated with epydoc,
> see http://biopython.org/wiki/Building_a_release
>

From p.j.a.cock at googlemail.com Sat Oct 26 18:44:59 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 26 Oct 2013 19:44:59 +0100
Subject: [Biopython-dev] Python 2 and 3 migration thoughts
In-Reply-To:
References:
Message-ID:

On Tue, Oct 22, 2013 at 5:42 PM, Peter Cock wrote:
> On Tue, Oct 22, 2013 at 5:35 PM, Eric Talevich wrote:
>> For docstrings, I agree that directly showing the try/except block is more
>> informative for users on either genus of Python.
>
> Agreed.
>
>> For the rest of the
>> codebase, I would favor using a bundled copy of six (e.g. Bio._six). The
>> benefits are (a) not having to discover and fix all the subtle bugs
>> ourselves, (b) to be explicit about where we've done something for Py2/3
>> compatibility and not as an essential part of the way the code is supposed
>> to work, and (c) six has its own documentation.
>>
>> I also see some virtue in not relying on six/Bio._py3k where it's not
>> necessary, since six is compatible back to Python 2.4 and we only go back to
>> Python 2.6 now. Halfway approach: just look at six and copy only the bits we
>> need into _py3k?
>
> OK, I'll focus in that direction then. Six is MIT licensed so we should
> be fine bundling it or extracting snippets.

A new pull request for people to comment on, which eliminates all
but two important fixers. As a bonus this makes installation under
Python 3 much much quicker:

https://github.com/biopython/biopython/pull/250

I've not (yet) needed anything from the 'six' library.

Peter
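
For readers following the thread, the "try/except block" discussed for
docstrings might look roughly like the sketch below. It is an illustration
only and is not taken from Biopython or from the pull request; the choice
of urlopen as the example import is an assumption.

```python
# Illustrative only: import a name that moved between Python 2 and Python 3.
# The use of urlopen here is an assumed example, not Biopython code.
try:
    # Python 3 location
    from urllib.request import urlopen
except ImportError:
    # Python 2 fallback
    from urllib2 import urlopen

# Either way, the rest of the code can use the same name.
print(callable(urlopen))
```

Showing the block directly lets a reader on either version see which import
applies to them, whereas a helper such as six or Bio._py3k keeps that
detail in one place.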