From hoffman at ebi.ac.uk Fri Oct 3 12:02:44 2003
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:27 2005
Subject: [Biopython-dev] Python 2.3.2 breaks setup.py test
Message-ID:

In Python 2.3.2, the current directory '' is not included in the
path. Instead the current directory at the time of Python startup is
included in the path. This breaks setup.py test which relies on the
first behavior to import the tests. OK to check in this patch?

Index: setup.py
===================================================================
RCS file: /home/repository/biopython/biopython/setup.py,v
retrieving revision 1.67
diff -u -r1.67 setup.py
--- setup.py 16 Sep 2003 16:59:30 -0000 1.67
+++ setup.py 3 Oct 2003 15:57:12 -0000
@@ -174,6 +174,7 @@
         # change to the test dir and run the tests
         os.chdir("Tests")
+        sys.path.insert(0, '')
         import run_tests
         run_tests.main([])
--
Michael Hoffman
European Bioinformatics Institute

From jchang at jeffchang.com Wed Oct 8 00:47:07 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:27 2005
Subject: [Biopython-dev] settling in to new apartment
Message-ID: <78BAC6AB-F94A-11D7-A798-000A956845CE@jeffchang.com>

Hello Everybody,

I've just finished moving cross country, and had been out of net
contact for a while. It looks like there are a few messages piling up
that might fall under my domain, so I will be getting to those over the
next few days.

Jeff

From idoerg at burnham.org Wed Oct 8 01:37:33 2003
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] settling in to new apartment
In-Reply-To: <78BAC6AB-F94A-11D7-A798-000A956845CE@jeffchang.com>
Message-ID:

I hope your move was not too harrowing, and that you are settling in
OK. How do you like the East? See you at PSB, maybe?

Best of luck,

Iddo

PS: Can I congratulate you as Dr. Chang?

./I

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037, USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo

On Wed, 8 Oct 2003, Jeffrey Chang wrote:

> Hello Everybody,
>
> I've just finished moving cross country, and had been out of net
> contact for a while. It looks like there are a few messages piling up
> that might fall under my domain, so I will be getting to those over the
> next few days.
>
> Jeff
>
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev@biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
>

From jchang at jeffchang.com Wed Oct 8 20:55:56 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] NCBIDictionary and staff ...
In-Reply-To: <10663.1064667909@www68.gmx.net>
Message-ID: <56E2892C-F9F3-11D7-9B72-000A956845CE@jeffchang.com>

OK. I've modified the NCBIDictionary code so that the default behavior
is to return 1 sequence. However, there is still an optional parameter
to disable this, if you want the old behavior. I have committed this
to CVS.

Jeff

On Saturday, September 27, 2003, at 09:05 AM, Andreas Kuntzagk wrote:

> Hi Jeffrey and others,
>
> I'm sending this from my private account because I'm in Paris for
> ECCB and haven't figured out how to access my work smtp-server, but
> can read email.
>
> Regarding the retmax value on efetch, yeah should be retmax=1, I
> _think_ it gives the maximum number of returned entries. I have to
> recheck later on the etools manual. If I remember right, for my
> example of two entries with same accession, the most recent came back
> if I used retmax = 1.
> So this was maybe more of a workaround for my problem than a general
> fix.
>
> Bye from Paris,
>
> Andreas
>
> --
> Andreas "Murple" Kuntzagk
> mailto:the_murple@gmx.de
> snail_mail: Andreas Kuntzagk, Glatzer Straße 5, 10247 Berlin

From bugzilla-daemon at portal.open-bio.org Wed Oct 8 22:15:53 2003
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] [Bug 1531] Bio.Fasta.RecordParser, SequenceParser
Message-ID: <200310090215.h992Fr0L002739@portal.open-bio.org>

http://bugzilla.bioperl.org/show_bug.cgi?id=1531

jchang@biopython.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Additional Comments From jchang@biopython.org 2003-10-08 22:15 -------
The RecordParser and SequenceParser classes are not meant to be applied
to files containing multiple sequences. They should be applied to the
sequences individually. Files of sequences should be handled with the
Iterator, and one of those classes passed as the parser.

However, it is true that the Iterator does not handle spaces between
records, or at the beginning and end of files. I have fixed it so that
now blank lines are ignored.

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From jefftc at stanford.edu Thu Oct 9 23:18:44 2003
From: jefftc at stanford.edu (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Biopython 1.22 available
Message-ID: <7445CC8C-FAD0-11D7-A5AC-000A956845CE@stanford.edu>

Hello Everybody,

Biopython 1.22 is now available from the website at:
http://www.biopython.org/

This is mostly a maintenance release. The installation process is
improved, and now distributes Martel and DTD files correctly.
The changes made in this release are:

Added Peter Slicker's patches for speeding up modules under Python 2.3
Fixed Martel installation.
Does not install Bio.Cluster without Numeric.
Distribute EUtils DTDs.
Yves Bastide patched NCBIStandalone.Iterator to be Python 2.0 iterator
Ashleigh's string coercion fixes in Clustalw.
Yair Benita added precision to the protein molecular weights.
Bartek updated AlignAce.Parser and added Motif.sim method
bug fixes in Michiel De Hoon's clustering library
Iddo's bug fixes to Bio.Enzyme and new RecordConsumer
Guido Draheim added patches for fixing import path to xbb scripts
regression tests updated to be Python 2.3 compatible
GenBank.NCBIDictionary is smarter about guessing the format

As usual, please report bugs to biopython-dev@biopython.org, or the bug
database also available from the website.

Jeff

From thamelry at vub.ac.be Fri Oct 10 08:14:30 2003
From: thamelry at vub.ac.be (Thomas Hamelryck)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] mmCIF parser added to Bio.PDB
In-Reply-To:
References:
Message-ID: <200310101414.30793.thamelry@vub.ac.be>

Hi everybody,

Due to popular demand (by Cath Lawrence :-), I've added mmCIF support
to Bio.PDB. mmCIF in short is a file format that is used to describe
crystal structures. The mmCIF format solves many problems that are
associated with the older PDB format (or at least that's what I'm
told :-).

Usage:

>>> from Bio.PDB.MMCIFParser import MMCIFParser
>>> parser=MMCIFParser()
>>> structure=parser.get_structure("test", "1FAT.cif")

In addition, there is also MMCIF2Dict, which makes the contents of an
mmCIF file available as a Python dictionary (with the data tags as
keys), so you can easily address all data in the mmCIF file.
Usage:

>>> from Bio.PDB.MMCIF2Dict import MMCIF2Dict
>>> d=MMCIF2Dict("1FAT.cif")
>>> print d["_database_PDB_matrix.entry_id"]
1FAT
>>> print d["_struct_site.id"]
['CAA', 'MNA', 'CAB', 'MNB', 'CAC', 'MNC', 'CAD', 'MND']
>>> d["_computing.structure_solution"]
"'X-PLOR 3.1'"

The modules use C/Lex code to parse the file, so it's reasonably fast.
Note that compilation requires C and GNU Lex (i.e. Flex).

There is no support for writing mmCIF files, and I'm not planning to
work on that either. I'd be interested to hear about possible bugs,
requested features etc, but it should work reasonably as is.

Cheers,

---
Thomas Hamelryck
COMO-ULTR
Vrije Universiteit Brussel (VUB)
Belgium
http://homepages.vub.ac.be/~thamelry

From hoffman at ebi.ac.uk Fri Oct 10 10:11:19 2003
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
Message-ID:

I have long wondered about how much the use of Bio.File.UndoHandle
slows things down (it has additional checks for every read
operation). Here are some results:

I wrote two scripts, filetest.py and filetest-undo.py. They both read
in every line of Homo sapiens chromosome 1 and do nothing with
it. This file is 4086733 lines and 249290633 bytes.

**** filetest.py
input = file("/scratch/hoffman/1.fa")
line = 1
while line:
    line = input.readline()

**** filetest-undo.py
import Bio.File
input = Bio.File.UndoHandle(file("/scratch/hoffman/1.fa"))
line = 1
while line:
    line = input.readline()
****

Timing the run of these files gives the following results (real):

filetest.py:
0m12.703s
0m12.215s
0m12.331s

filetest-undo.py:
0m30.135s
0m29.676s
0m30.165s

There is about a 150% increase in the amount of time it takes to do
input using readline() with UndoHandle.
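
For anyone who wants to reproduce this kind of comparison without a
chromosome-sized file, a minimal sketch follows. It is written in
current Python 3 syntax rather than 2.3, and the names MiniUndoHandle,
make_data, count_lines and time_reads are hypothetical, made up for
illustration. MiniUndoHandle only imitates the idea of a handle with an
extra check on each read; it is not the Bio.File.UndoHandle source, and
absolute timings will differ by machine and interpreter.

```python
import io
import timeit


class MiniUndoHandle:
    """Hypothetical stand-in for an undo-capable handle (not Bio.File.UndoHandle)."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []  # lines pushed back by saveline()

    def readline(self):
        # The extra check on every call is the overhead being measured.
        if self._saved:
            return self._saved.pop(0)
        return self._handle.readline()

    def saveline(self, line):
        # Push a line back so the next readline() returns it again.
        self._saved.insert(0, line)


def make_data(n_lines):
    """Build FASTA-like text: one header plus n_lines sequence lines."""
    return ">test\n" + ("ACGT" * 15 + "\n") * n_lines


def count_lines(handle):
    """Read to EOF with readline() and return the number of lines read."""
    count = 0
    while handle.readline():
        count += 1
    return count


def time_reads(data, wrap, repeat=3):
    """Return the best wall-clock time for reading all of data once."""
    def run():
        handle = io.StringIO(data)
        count_lines(MiniUndoHandle(handle) if wrap else handle)
    return min(timeit.repeat(run, number=1, repeat=repeat))


if __name__ == "__main__":
    data = make_data(200000)
    plain = time_reads(data, wrap=False)
    wrapped = time_reads(data, wrap=True)
    print("plain:   %.3fs" % plain)
    print("wrapped: %.3fs" % wrapped)
    print("ratio:   %.2fx" % (wrapped / plain))
```

Reading from an in-memory buffer removes disk effects, so the ratio
printed here isolates the per-call cost of the wrapper and is not
directly comparable to the wall-clock numbers above.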

The overhead of loading Bio.File is minimal:

$ time python -c "import Bio.File"
real    0m0.418s
user    0m0.090s
sys     0m0.080s

$ time python -c "None"
real    0m0.070s
user    0m0.010s
sys     0m0.030s

This kind of increase on basic I/O means much when one is doing big
jobs, in my opinion. I wasn't volunteering to rewrite anything to not
use UndoHandle but people might consider it when writing future
stuff. And I might try rewriting some stuff anyway. Any thoughts?
--
Michael Hoffman
European Bioinformatics Institute

From jchang at jeffchang.com Mon Oct 13 00:38:49 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To:
Message-ID: <23E5E458-FD37-11D7-8578-000A956845CE@jeffchang.com>

On Friday, October 10, 2003, at 10:11 AM, Michael Hoffman wrote:

> I have long wondered about how much the use of Bio.File.UndoHandle
> slows things down (it has additional checks for every read
> operation). Here are some results:

[cut, reading a file is slower when using an UndoHandle]

> There is about a 150% increase in the amount of time it takes to do
> input using readline() with UndoHandle.

> This kind of increase on basic I/O means much when one is doing big
> jobs, in my opinion. I wasn't volunteering to rewrite anything to not
> use UndoHandle but people might consider it when writing future
> stuff. And I might try rewriting some stuff anyway. Any thoughts?

The UndoHandle creates overhead on readline due to its extra if checks
and function calls.

    def readline(self, *args, **keywds):
        if self._saved:
            line = self._saved.pop(0)
        else:
            line = self._handle.readline(*args,**keywds)
        return line

Also, passing *args and **keywds may incur another performance penalty,
but I don't know how much.

The best way to speed this up might be to recode the class in C as a
type.
This would help because the if statement would be evaluated in
C, and also you can cache the self._handle.readline for a faster
function lookup.

Jeff

From mdehoon at ims.u-tokyo.ac.jp Tue Oct 14 10:45:15 2003
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
Message-ID: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>

Dear Biopythoneers,

This evening I set out to create the Windows installers for Biopython
1.22. The good news is that there were no compilation errors. The bad
news is that the __init__.py and data.py files are missing from
Bio/Cluster in the Biopython 1.22 source distribution. I checked in
CVS, and found them there. Can these two files be added to the
Biopython 1.22 package?

I then added the __init__.py and data.py files from CVS, and made the
Windows installers. You can find them at
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.win32-py2.2.exe
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.win32-py2.3.exe
(I will remove these once they are available from the Biopython site).
While I was at it, I also made a complete Biopython 1.22 distribution:
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.tar.gz

There was one more warning message that I got when compiling Biopython:

Bio/trie.c: In function `Trie_has_prefix':
Bio/trie.c:443: warning: return makes integer from pointer without a cast

As far as I can tell from trie.c, this warning is not serious; it is
due to returning a NULL instead of a 0 where the return type is int.
But we may as well fix it.

--Michiel.

From thamelry at vub.ac.be Tue Oct 14 10:51:21 2003
From: thamelry at vub.ac.be (Thomas Hamelryck)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>
References: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>
Message-ID: <200310141651.21996.thamelry@vub.ac.be>

> This evening I set out to create the Windows installers for Biopython
> 1.22. The good news is that there were no compilation errors.

Hi Michiel,

Did you compile the KDTree module too? Some time ago somebody
asked for it for Windows. It's commented out in setup.py because
of a bug in distutils on some platforms.

Regards,

-Thomas

---
Thomas Hamelryck
Computational modeling lab (COMO)
Vrije Universiteit Brussel (VUB)
Belgium
http://homepages.vub.ac.be/~thamelry

From mdehoon at ims.u-tokyo.ac.jp Tue Oct 14 21:42:56 2003
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <200310141651.21996.thamelry@vub.ac.be>
References: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp> <200310141651.21996.thamelry@vub.ac.be>
Message-ID: <3F8CA620.3030202@ims.u-tokyo.ac.jp>

Thomas Hamelryck wrote:
> Hi Michiel,
>
> Did you compile the KDTree module too? Some time ago somebody
> asked for it for Windows. It's commented out in setup.py because
> of a bug in distutils on some platforms.

I just tried to compile KDTree on Windows. It seems that the file
Bio/KDTree/_KDTree.swig.C is missing in the source distribution. I
picked it up from CVS; however, I was not able to compile KDTree on
Cygwin/MinGW (which I am using to build the Windows installer) nor
using Microsoft's compiler. It seems that distutils doesn't realize
that this is a C++ file, but I don't know how to fix that. Changing
the file extensions to .cpp or .cc didn't help.

--Michiel.

C:\cygwin\usr\local\bin\gcc.exe -mno-cygwin -mdll -O -Wall -Ic:\Python22\include -c Bio/KDTree/_KDTree.C -o build\temp.win32-2.2\Release\_kdtree.o
In file included from /usr/local/include/c++/3.3/bits/locale_facets.h:166,
                 from /usr/local/include/c++/3.3/bits/basic_ios.h:44,
                 from /usr/local/include/c++/3.3/ios:51,
                 from /usr/local/include/c++/3.3/ostream:45,
                 from /usr/local/include/c++/3.3/iostream:45,
                 from Bio/KDTree/_KDTree.h:1,
                 from Bio/KDTree/_KDTree.C:1:
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:46: error: `_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:47: error: `_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:48: error: `_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:48: error: `_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:49: error: `_N' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:50: error: `_X' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:50: error: `_N' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:51: error: `_S' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: `_P' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: `_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: `_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: `_N' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: `_B' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:53: error: `_P' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:53: error: `_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:53: error: `_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:53: error: `_N' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:54: error: `_C' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:55: error: `_P' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:56: error: `_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:56: error: `_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:56: error: `_N' was not declared in this scope
Bio/KDTree/_KDTree.C: In member function `void KDTree::neighbor_simple_search(float)':
Bio/KDTree/_KDTree.C:914: warning: comparison between signed and unsigned integer expressions
Bio/KDTree/_KDTree.C:923: warning: comparison between signed and unsigned integer expressions
error: command 'gcc' failed with exit status 1

>
> Regards,
>
> -Thomas
>
> ---
> Thomas Hamelryck
> Computational modeling lab (COMO)
> Vrije Universiteit Brussel (VUB)
> Belgium
> http://homepages.vub.ac.be/~thamelry
>
>

--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon

From jchang at jeffchang.com Tue Oct 14 23:42:33 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>
Message-ID: <9C8384CF-FEC1-11D7-A8B2-000A956845CE@jeffchang.com>

Yes, they were missing. I've fixed the MANIFEST.in file so that they
will be included in the next release.
I'll roll a 1.23 release this weekend.

Jeff

On Tuesday, October 14, 2003, at 10:45 AM, Michiel Jan Laurens de Hoon
wrote:

> Dear Biopythoneers,
>
> This evening I set out to create the Windows installers for Biopython
> 1.22. The good news is that there were no compilation errors. The bad
> news is that the __init__.py and data.py files are missing from
> Bio/Cluster in the Biopython 1.22 source distribution. I checked in
> CVS, and found them there. Can these two files be added to the
> Biopython 1.22 package?
>
> I then added the __init__.py and data.py files from CVS, and made the
> Windows installers. You can find them at
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.win32-py2.2.exe
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.win32-py2.3.exe
> (I will remove these once they are available from the Biopython site).
> While I was at it, I also made a complete Biopython 1.22 distribution:
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.tar.gz
>
> There was one more warning message that I got when compiling Biopython:
>
> Bio/trie.c: In function `Trie_has_prefix':
> Bio/trie.c:443: warning: return makes integer from pointer without a
> cast
>
> As far as I can tell from trie.c, this warning is not serious; it is
> due to returning a NULL instead of a 0 where the return type is int.
> But we may as well fix it.
>
> --Michiel.

From jchang at jeffchang.com Tue Oct 14 23:47:28 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <3F8CA620.3030202@ims.u-tokyo.ac.jp>
Message-ID: <4C602AF6-FEC2-11D7-A8B2-000A956845CE@jeffchang.com>

On Tuesday, October 14, 2003, at 09:42 PM, Michiel Jan Laurens de Hoon
wrote:

> Thomas Hamelryck wrote:
> > Hi Michiel,
> >
> > Did you compile the KDTree module too? Some time ago somebody
> > asked for it for Windows. It's commented out in setup.py because
> > of a bug in distutils on some platforms.
>
> I just tried to compile KDTree on Windows. It seems that the file
> Bio/KDTree/_KDTree.swig.C is missing in the source distribution. I
> picked it up from CVS; however, I was not able to compile KDTree on
> Cygwin/MinGW (which I am using to build the Windows installer) nor
> using Microsoft's compiler. It seems that distutils doesn't realize
> that this is a C++ file, but I don't know how to fix that. Changing
> the file extensions to .cpp or .cc didn't help.
>
> --Michiel.

Thomas,

Should the _KDTree.swig.C file be distributed? If so, then does the
_KDTree.i file need to be distributed, or just kept in the CVS? I'll
distribute it anyway, since it doesn't appear to be hurting anything.

Jeff

From thamelry at vub.ac.be Wed Oct 15 04:32:01 2003
From: thamelry at vub.ac.be (Thomas Hamelryck)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <4C602AF6-FEC2-11D7-A8B2-000A956845CE@jeffchang.com>
References: <4C602AF6-FEC2-11D7-A8B2-000A956845CE@jeffchang.com>
Message-ID: <200310151032.01495.thamelry@vub.ac.be>

On Wednesday 15 October 2003 05:47 am, Jeffrey Chang wrote:
> Should the _KDTree.swig.C file be distributed? If so, then does the
> _KDTree.i file need to be distributed, or just kept in the CVS? I'll
> distribute it anyway, since it doesn't appear to be hurting anything.

The _KDTree.swig.C is generated by swig from _KDTree.i, so the latter
should definitely be included. I'd also put _KDTree.swig.C in there for
those who do not have swig installed.

Meanwhile I'll look into the distutils problem again....
Thanks for trying, Michiel!

-Thomas

From hoffman at ebi.ac.uk Wed Oct 15 10:51:21 2003
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To: <23E5E458-FD37-11D7-8578-000A956845CE@jeffchang.com>
Message-ID:

On Mon, 13 Oct 2003, Jeffrey Chang wrote:

> The UndoHandle creates overhead on readline due to its extra if checks
> and function calls.
>
> [...]
>
> The best way to speed this up might be to recode the class in C as a
> type. This would help because the if statement would be evaluated in
> C, and also you can cache the self._handle.readline for a faster
> function lookup.

Actually, I was thinking along the lines of recoding the class that
calls UndoHandle instead (see below). This new implementation does not
seem to be significantly faster than Bio.Fasta.Iterator when the
latter is used without a parser. However you get the parsing done for
free with this implementation! It seems to be about twice as fast as
using Bio.Fasta.Iterator with Bio.Fasta.RecordParser, and provides the
same functionality in a more lightweight package--a tuple of
(defline, data) instead of a Bio.Record object. What do you think?

class LightIterator(object):
    def __init__(self, handle):
        self._handle = handle
        # defline of the record that the next call to next() will return
        self._defline = None

    def __iter__(self):
        return self

    def next(self):
        lines = []
        defline_old = self._defline

        while 1:
            line = self._handle.readline()
            if not line:
                # end of file: stop if there is nothing left to return
                if not defline_old and not lines:
                    raise StopIteration
                if defline_old:
                    self._defline = None
                break
            elif line[0] == '>':
                # remember this defline for the following record
                self._defline = line[1:].rstrip()
                if defline_old or lines:
                    break
                else:
                    defline_old = self._defline
            else:
                lines.append(line.rstrip())

        return defline_old, ''.join(lines)
--
Michael Hoffman
European Bioinformatics Institute

From jchang at jeffchang.com Wed Oct 15 15:28:18 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <200310151032.01495.thamelry@vub.ac.be>
Message-ID:

I have edited the MANIFEST.in file so that _KDTree.swig.C is
distributed. It will go out with the next release.

Jeff

On Wednesday, October 15, 2003, at 04:32 AM, Thomas Hamelryck wrote:

> On Wednesday 15 October 2003 05:47 am, Jeffrey Chang wrote:
>> Should the _KDTree.swig.C file be distributed? If so, then does the
>> _KDTree.i file need to be distributed, or just kept in the CVS? I'll
>> distribute it anyway, since it doesn't appear to be hurting anything.
>
> The _KDTree.swig.C is generated by swig from _KDTree.i, so the latter
> should definitely be included. I'd also put _KDTree.swig.C in there
> for those who do not have swig installed.
>
> Meanwhile I'll look into the distutils problem again....
> Thanks for trying, Michiel!
>
> -Thomas

From jchang at jeffchang.com Wed Oct 15 23:10:39 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To:
Message-ID: <51A00D19-FF86-11D7-8AF5-000A956845CE@jeffchang.com>

On Wednesday, October 15, 2003, at 10:51 AM, Michael Hoffman wrote:

> On Mon, 13 Oct 2003, Jeffrey Chang wrote:
>
>> The UndoHandle creates overhead on readline due to its extra if checks
>> and function calls.
>>
>> [...]
>>
>> The best way to speed this up might be to recode the class in C as a
>> type. This would help because the if statement would be evaluated in
>> C, and also you can cache the self._handle.readline for a faster
>> function lookup.
>
> Actually, I was thinking along the lines of recoding the class that
> calls UndoHandle instead (see below). This new implementation does not
> seem to be significantly faster than Bio.Fasta.Iterator when the
> latter is used without a parser. However you get the parsing done for
> free with this implementation! It seems to be about twice as fast as
> using Bio.Fasta.Iterator with Bio.Fasta.RecordParser, and provides the
> same functionality in a more lightweight package--a tuple of
> (defline, data) instead of a Bio.Record object. What do you think?

[cut code]

That is a nice implementation. However, Biopython already has at least
3 Fasta parsers!
Bio/Fasta
Bio/SeqIO/FASTA
Bio/expressions/fasta

Bio/Fasta, the one you compared against, is easily the slowest one.
Bio/SeqIO/FASTA is very similar to your implementation and not likely
to be significantly faster or slower. Bio/expressions/fasta uses
Martel. I don't know how well that will perform. The parsing part
should be blazingly fast (since it is mostly in C), but building the
object will be slow. It might be a wash.
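
For reference, the lightweight (defline, data) approach under
discussion can be sketched as a generator in current Python 3. This is
only an illustrative sketch under my own assumptions: iter_fasta is a
hypothetical name, it is not the code of Bio/Fasta, Bio/SeqIO/FASTA or
Bio/expressions/fasta, and it skips any sequence lines that appear
before the first '>' header.

```python
import io


def iter_fasta(handle):
    """Yield (defline, sequence) tuples from a FASTA-formatted handle."""
    defline = None
    chunks = []
    for line in handle:
        line = line.rstrip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if defline is not None:
                yield defline, "".join(chunks)
            defline = line[1:]
            chunks = []
        elif line and defline is not None:
            # Blank lines and lines before the first header are skipped.
            chunks.append(line)
    # Emit the final record once the handle is exhausted.
    if defline is not None:
        yield defline, "".join(chunks)


if __name__ == "__main__":
    text = ">seq1 demo\nACGT\nTTGA\n\n>seq2\nGGCC\n"
    for defline, data in iter_fasta(io.StringIO(text)):
        print(defline, data)
```

Because it returns plain tuples and builds no record objects, a reader
of this shape avoids the object-construction cost mentioned above.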

Jeff

From mdehoon at ims.u-tokyo.ac.jp Thu Oct 16 00:19:50 2003
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Software demonstration at GIW 2003 in Japan
Message-ID: <3F8E1C66.4060803@ims.u-tokyo.ac.jp>

Dear Biopython developers,

I am volunteering to give a software demonstration of Biopython at the
International Conference on Genome Informatics (GIW) in Tokyo/Yokohama
this December. GIW is the largest annual conference on bioinformatics
in Asia: see http://giw.ims.u-tokyo.ac.jp/giw2003 for more information.
The software demonstrations are set up like a poster session (instead
of an oral presentation such as at BOSC), allowing easy communication
with potential users. Such a software demonstration comes with a
two-page paper describing the software. This paper is not
medline-indexed and there is no rigorous refereeing involved for such
short papers. However, it appears together with the full-length papers
in the proceedings, so there will be a permanent record. The
proceedings will be publicly available on the web after the conference.
(http://www.jsbi.org/journal/GIW02/GIW02SS05.pdf is the paper for the
software demonstration by our Bioruby colleagues last year at GIW).

The deadline for submissions has already passed, however since my lab
is organizing this conference we have some leeway. I'll be happy to
write the paper, but I will need some help:

1) Are there any other papers on Biopython from which I can plagiarize
the parts of Biopython that I am not very familiar with?
2) Would somebody be willing to have a look at the paper before I
submit it, in case I write something wrong?
3) Who should I include as co-authors? Anybody is welcome, as far as I
am concerned.
4) Does anybody have any cool scripts that make use of Biopython that I
can show off at the conference?

Thanks in advance,

--Michiel, U Tokyo.

--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon

From hoffman at ebi.ac.uk Thu Oct 16 05:45:07 2003
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To: <51A00D19-FF86-11D7-8AF5-000A956845CE@jeffchang.com>
Message-ID:

On Wed, 15 Oct 2003, Jeffrey Chang wrote:

> That is a nice implementation. However, Biopython already has at least
> 3 Fasta parsers!
> Bio/Fasta
> Bio/SeqIO/FASTA
> Bio/expressions/fasta

There sure are. We should probably be cutting them rather than adding
them I suppose. :-) Have you thought of deprecating Bio.Fasta since it
is the slowest?

I know that the official path is to get people towards FormatIO but
Bio.expressions.fasta is more than 12x slower than my
implementation/Bio.SeqIO.FASTA (comparable as you predicted)! For one
test:

FormatIO: 3.085s/3.094s/3.154s
LightIterator: 0.246s/0.243s/0.245s

Unless of course, I am using Bio.expressions.fasta incorrectly. It is
a bit hard to figure out what to do as there are no docstrings, unit
tests, or other documentation that I can see. Here is the code,
anyway. Please let me know if I did this in an inefficient way (this
is a slight speedup over using SeqRecord.io).

=====
from Bio import FormatIO

iterator = FormatIO.FormatIO("SeqRecord",
                             default_input_format = "fasta").readFile(file("/scratch/test.fna"))
for record in iterator:
    pass
=====
--
Michael Hoffman
European Bioinformatics Institute

From idoerg at burnham.org Thu Oct 16 12:23:56 2003
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Software demonstration at GIW 2003 in Japan
Message-ID: <3F8EC61C.6090601@burnham.org>

Michiel Jan Laurens de Hoon wrote:

> Dear Biopython developers,
>
> I am volunteering to give a software demonstration of Biopython at the
> International Conference on Genome Informatics (GIW) in Tokyo/Yokohama
> this December. GIW is the largest annual conference on bioinformatics in
> Asia: see http://giw.ims.u-tokyo.ac.jp/giw2003 for more information.
> The software demonstrations are set up like a poster session (instead of
> an oral presentation such as at BOSC), allowing easy communication with
> potential users. Such a software demonstration comes with a two-page
> paper describing the software. This paper is not medline-indexed and
> there is no rigorous refereeing involved for such short papers. However,
> it appears together with the full-length papers in the proceedings, so
> there will be a permanent record. The proceedings will be publicly
> available on the web after the conference.
> (http://www.jsbi.org/journal/GIW02/GIW02SS05.pdf is the paper for the
> software demonstration by our Bioruby colleagues last year at GIW).

Fantastic! I'm reminded of the "Made in" albums by Deep Purple:
"Biopython: Made in Japan".

>
> The deadline for submissions has already passed, however since my lab is
> organizing this conference we have some leeway. I'll be happy to write
> the paper, but I will need some help:

Nothing like having friends in high places... or being in one yourself.
> 1) Are there any other papers on Biopython from which I can plagiarize > the parts of Biopython that I am not very familiar with? Jeff & Brad wrote something, but back in 2000. A lot has changed since. Still: http://biopython.org/docs/acm/ACMbiopy.pdf > 2) Would somebody be willing to have a look at the paper before I submit > it, in case I write something wrong? I can. > 3) Who should I include as co-authors? Anybody is welcome, as far as I > am concerned. Ummm.. definitely the triumvirate: Jeff, Brad & Andrew. Here's what I would do: take the top N posters to biopython-dev, N being the number of people you want on the paper. > 4) Does anybody have any cool scripts that make use of Biopython that I > can show off at the conference? > I have two websites: http://bioinformatics.org/pecop http://ffas.ljcrf.edu:8080/Fragnostic Which use the FASTA parsing, GenBank parsing, PDB parsing, GO module, and some other stuff. Nothing really cool in the source codes though. If you want to show code, I would suggest using something basic, like the manual. However, those are biopython powered sites, which is always good for PR. > Thanks in advance, > > --Michiel, U Tokyo. > > -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 646 3171 http://ffas.ljcrf.edu/~iddo From jchang at jeffchang.com Fri Oct 17 15:03:10 2003 From: jchang at jeffchang.com (Jeffrey Chang) Date: Sat Mar 5 14:43:28 2005 Subject: [Biopython-dev] Performance of Bio.File.UndoHandle In-Reply-To: Message-ID: <8CC4C52A-00D4-11D8-84A9-000A956845CE@jeffchang.com> On Thursday, October 16, 2003, at 05:45 AM, Michael Hoffman wrote: > On Wed, 15 Oct 2003, Jeffrey Chang wrote: > >> That is a nice implementation. However, Biopython already has at >> least >> 3 Fasta parsers! >> Bio/Fasta >> Bio/SeqIO/FASTA >> Bio/expressions/fasta > > There sure are. We should probably be cutting them rather than adding > them I suppose. 
> :-) Have you thought of deprecating Bio.Fasta since it is the slowest?

Yes, that will probably be done eventually. However, it does have a nice
interface that's consistent with the other parsers, e.g. for GenBank, and
it's documented. We'd be deprecating the best documented parser for faster
ones that aren't documented. (As you noticed, not even docstrings.) It's a
trade-off. The decision would be much clearer if the other parsers had
better documentation! ;)

> I know that the official path is to get people towards FormatIO but
> Bio.expressions.fasta is more than 12x slower than my
> implementation/Bio.SeqIO.FASTA (comparable as you predicted)! For one
> test:
>
> FormatIO: 3.085s/3.094s/3.154s
> LightIterator: 0.246s/0.243s/0.245s

Yikes! Your code is correct. However, in fairness, the fasta parser that
FormatIO uses is doing more work, such as trying to detect database IDs
(GenBank, EMBL, DDBJ, NBRF) in the description line. However, if that's
something that's not generally needed, perhaps that functionality should
be off by default, so that the parser would be faster. Everybody likes
that, right?

Jeff

From mdehoon at ims.u-tokyo.ac.jp Sat Oct 18 03:49:23 2003
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Software demonstration at GIW 2003 in Japan
Message-ID: <3F90F083.9070206@ims.u-tokyo.ac.jp>

Dear Biopythoneers,

Thank you for your feedback on my proposed software demonstration at GIW
this year. I have put together a draft of the paper describing the
software; it can be downloaded from
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython.pdf. Please let me
know if any changes should be made to it.

I agree with Iddo that it is better to include the original Biopython
developers Jeff, Brad and Andrew as co-authors of this paper. I feel kind
of uncomfortable writing a paper on Biopython and putting only my name on
it, as I am not one of the main developers.
So for now, I have added Jeff, Brad, Andrew, and Iddo as co-authors. To ensure that I don't include anybody as co-author against their will, please let me know if you agree to be a co-author. If I don't hear from you, I will assume that you didn't get this email or that you prefer not to be a co-author, and I will remove your name for the final version. To me, it doesn't matter if you actually contributed to this paper or not, because without you guys there would be no Biopython and no paper to write about it. Thanks again, particularly for the pecop and fragnostic websites, I'll use those at the software demo to show Biopython in action. --Michiel. > I am volunteering to give a software demonstration of Biopython at > the International Conference on Genome Informatics (GIW) in > Tokyo/Yokohama this December. GIW is the largest annual conference on > bioinformatics in Asia: see http://giw.ims.u-tokyo.ac.jp/giw2003 for > more information. The software demonstrations are set up like a > poster session (instead of an oral presentation such as at BOSC), > allowing easy communication with potential users. Such a software > demonstration comes with a two-page paper describing the software. > This paper is not medline-indexed and there is no rigorous refereeing > involved for such short papers. However, it appears together with the > full-length papers in the proceedings, so there will be a permanent > record. The proceedings will be publicly available on the web after > the conference. (http://www.jsbi.org/journal/GIW02/GIW02SS05.pdf is > the paper for the software demonstration by our Bioruby colleagues > last year at GIW). 
--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon

From chapmanb at uga.edu Sat Oct 18 12:05:17 2003
From: chapmanb at uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Software demonstration at GIW 2003 in Japan
In-Reply-To: <3F90F083.9070206@ims.u-tokyo.ac.jp>
References: <3F90F083.9070206@ims.u-tokyo.ac.jp>
Message-ID: <20031018160517.GA306@evostick.agtec.uga.edu>

Michiel;

> Thank you for your feedback on my proposed software demonstration at GIW
> this year. I have put together a draft of the paper describing the
> software; it can be downloaded from
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython.pdf. Please let me
> know if any changes should be made to it.

Looks great. Thanks for doing this -- it's definitely a positive thing
to get the word out about Biopython.

> I agree with Iddo that it is better to include the original Biopython
> developers Jeff, Brad and Andrew as co-authors of this paper.

Thanks. Sure I'll take my name on more papers :-).

> Thanks again, particularly for the pecop and fragnostic websites, I'll
> use those at the software demo to show Biopython in action.

If you want more examples of real life applications using Biopython,
I've recently stuck up a page with code I'm using for my graduate
research:

http://plantgenome.agtec.uga.edu/bioinformatics/dating/

It's not really a software demo kind of thing but is at least an
application of using Biopython for something "real."

Also, if you want to steal stuff from my BOSC biopython talk you are
more than welcome -- there's a tarball of the talk plus the LaTeX and
associated figures at:

http://evostick.agtec.uga.edu/~chapmanb/bp/bosc_biopython_2003.tar.gz

Good luck with the presentation!
Brad

From jchang at jeffchang.com Sun Oct 19 11:37:12 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Biopython 1.23 available
Message-ID: <1BAC9E6A-024A-11D8-8924-000A956845CE@jeffchang.com>

Hello Everybody,

Biopython 1.23 is now available from the website at:
http://www.biopython.org/

This is mostly a maintenance release, which fixes some problems in the
installation. You do not need to update from 1.22 unless you are using
the Bio.Cluster, Bio.KDTree, or Bio.PDB.mmCIF packages.

The changes made in this release are:

    Fixed distribution of files in Bio/Cluster
    Now distributing Bio/KDTree/_KDTree.swig.C
    minor updates in installation code
    added mmCIF support for PDB files

As usual, please report bugs to biopython-dev@biopython.org, or the bug
database also available from the website.

Jeff

From kristian.rother at charite.de Mon Oct 20 06:16:04 2003
From: kristian.rother at charite.de (Kristian Rother)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Updates on PDB entries
Message-ID: <200310201216.04502.kristian.rother@charite.de>

Hello BioPython Maintainers,

I have just written some code that retrieves the weekly distributed
files of new or modified protein structures from the PDB server or its
mirrors. There was no such service in the last BioPython release I used
(1.10, I think).

If you are interested in the code, I could provide you with an
object-oriented, cleaned-up and documented source file (which I would
not write otherwise).

Keep up the good work!
Kristian Rother From thamelry at vub.ac.be Mon Oct 20 04:09:14 2003 From: thamelry at vub.ac.be (Thomas Hamelryck) Date: Sat Mar 5 14:43:28 2005 Subject: [Biopython-dev] Updates on PDB entries In-Reply-To: <200310201216.04502.kristian.rother@charite.de> References: <200310201216.04502.kristian.rother@charite.de> Message-ID: <200310200807.h9K87KtL031792@sarek.skynet.be> Hi Kristian, > I have just written some code that retrieves the weekly distributed files > of new or modified protein structures from the PDB server or its mirrors. > There was no such service in the last BioPython release i used. (1.10, i > think). > > If You are interested in the code, i could provide You with an object > oriented, cleaned-up and documented source file (which i would not write > otherwise). That sounds very useful. Would indeed be a good addition to Biopython. Cheers, --- Thomas Hamelryck ULTR/COMO Institute for molecular biology/Computer Science Department Vrije Universiteit Brussel (VUB) Brussels, Belgium http://homepages.vub.ac.be/~thamelry From idoerg at burnham.org Mon Oct 20 12:52:51 2003 From: idoerg at burnham.org (Iddo Friedberg) Date: Sat Mar 5 14:43:28 2005 Subject: [Biopython-dev] Updates on PDB entries In-Reply-To: <200310201216.04502.kristian.rother@charite.de> References: <200310201216.04502.kristian.rother@charite.de> Message-ID: <3F9412E3.30500@burnham.org> Holy cheese, Cleaned-up _and_ documented??!!! What is this project coming to? Good show Kristian. If you email me the code I'll see that it gets into the CVS :) Iddo Kristian Rother wrote: > Hello BioPython Maintainers, > > I have just written some code that retrieves the weekly distributed files of > new or modified protein structures from the PDB server or its mirrors. There > was no such service in the last BioPython release i used. (1.10, i think). > > If You are interested in the code, i could provide You with an object > oriented, cleaned-up and documented source file (which i would not write > otherwise). 
> > Keep up the good work! > > Kristian Rother > > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev@biopython.org > http://biopython.org/mailman/listinfo/biopython-dev > > -- Iddo Friedberg, Ph.D. The Burnham Institute 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 646 3171 http://ffas.ljcrf.edu/~iddo From rhf22 at mole.bio.cam.ac.uk Wed Oct 22 10:34:39 2003 From: rhf22 at mole.bio.cam.ac.uk (Rasmus Fogh) Date: Sat Mar 5 14:43:28 2005 Subject: [Biopython-dev] Possible contributor Message-ID: Dear BioPython, We (the CCPN project - www.ccpn.ac.uk) think we might make a natural contributor to BioPython, We are making a standard datamodel in the areas of BioMolecular NMR, macromolecular structure, and Biochemistry LIMS programs, including a model in UML, extensive, highly functional APIs to support the data model, a standard XML data format, I/O libraries, and some utility programs (e.g. NMR area format converters). We release all this under LGPL. To give you an impression, we have over 300 000 lines of python code and over 500 000 lines of HTML documentation to contribute (most of which is autogenerated from our UML model). We are currently working on extending our model from Python/XML to include also Java APIs and relational database storage. We do have a few questions: Where do I find a copy of the license and conditions of distribution? Is this an integrated project (thus with extensive coordination requirements) or a collection of independent deposited software? What obligations would we be taking on? Do we have to deposit to your CVS repository and who would get write access? We follow a slightly different set of style guidelines, in that we use internalUpperCase instead of separated_by_underscore, and we generate our own HTML documentation format. Would this be a problem? 
Can we just put our name and URL on the ScriptingCentral page, and might
this be better than contributing to the CVS?

Thanks for your help,

Rasmus

---------------------------------------------------------------------------
Dr. Rasmus H. Fogh                  Email: r.h.fogh@bioc.cam.ac.uk
Dept. of Biochemistry, University of Cambridge,
80 Tennis Court Road, Cambridge CB2 1GA, UK.     FAX (01223)766002

From cyli at MIT.EDU Wed Oct 22 17:24:40 2003
From: cyli at MIT.EDU (Ying Li)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Wrong comment, perhaps?
Message-ID: <1066857880.13608.4.camel@tandem.mit.edu>

Hi,

I'm sorry to bother everyone over such a minor technicality, but in the
docs and comments in Bio.Blast.Record.Blast, in the class
DatabaseReport, it says that the attribute num_sequences_in_database is
the number of sequences in the database, which is an int.

However:

Python 2.3.2 (#2, Oct 6 2003, 08:02:06)
[GCC 3.3.2 20030908 (Debian prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio.Blast.NCBIStandalone import BlastParser
>>> rec = BlastParser().parse(open("blastOutput01.txt"))
>>> rec.num_sequences_in_database
[1]

Just wondering if the docs are wrong and it's really supposed to be a
list (since the database names are also a list) or if the code is wrong
and it's supposed to just be a number.

(this is not the CVS version, but the tarballed release 1.23 for linux)

Thanks!
-Ying

From jchang at jeffchang.com Wed Oct 22 18:03:20 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Wrong comment, perhaps?
In-Reply-To: <1066857880.13608.4.camel@tandem.mit.edu>
Message-ID: <8C11A9F7-04DB-11D8-A674-000A956845CE@jeffchang.com>

Yes, you're right. It should be a list, because there can be multiple
databases. I've updated the code in CVS.

Thanks!
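
To make the "one entry per database" answer concrete, here is a minimal sketch of a report object that keeps database names and sequence counts as parallel lists. It does not use Biopython: the class name DatabaseReportSketch, its methods, and the sample names and counts are hypothetical and only illustrate why a single int cannot describe a search over several databases.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatabaseReportSketch:
    """Hypothetical stand-in for a BLAST database report (not Biopython's class)."""

    database_name: List[str] = field(default_factory=list)
    num_sequences_in_database: List[int] = field(default_factory=list)

    def add_database(self, name: str, num_sequences: int) -> None:
        # Keep the two lists parallel: entry i of each describes database i.
        self.database_name.append(name)
        self.num_sequences_in_database.append(num_sequences)

    def total_sequences(self) -> int:
        # Callers that want a single number have to sum the per-database counts.
        return sum(self.num_sequences_in_database)


report = DatabaseReportSketch()
report.add_database("db_one", 1)    # made-up names and counts
report.add_database("db_two", 250)

for name, count in zip(report.database_name, report.num_sequences_in_database):
    print(f"{name}: {count} sequences")
print("total:", report.total_sequences())
```

With only one database searched, the count list has a single element, which matches the `[1]` seen in the session above.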
jeff

On Wednesday, October 22, 2003, at 05:24 PM, Ying Li wrote:

> Hi,
>
> I'm sorry to bother everyone over such a minor technicality, but in the
> docs and comments in Bio.Blast.Record.Blast, in the class
> DatabaseReport, it says that the attribute num_sequences_in_database is
> the number of sequences in the database, which is an int.
>
> However:
>
> Python 2.3.2 (#2, Oct 6 2003, 08:02:06)
> [GCC 3.3.2 20030908 (Debian prerelease)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from Bio.Blast.NCBIStandalone import BlastParser
> >>> rec = BlastParser().parse(open("blastOutput01.txt"))
> >>> rec.num_sequences_in_database
> [1]
>
> Just wondering if the docs are wrong and it's really supposed to be a
> list (since the database names are also a list) or if the code is wrong
> and it's supposed to just be a number.
>
> (this is not the CVS version, but the tarballed release 1.23 for linux)
>
> Thanks!
> -Ying

From jchang at jeffchang.com Wed Oct 22 18:28:07 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Possible contributor
In-Reply-To:
Message-ID: <0291CE5E-04DF-11D8-A674-000A956845CE@jeffchang.com>

Hi Rasmus,

> We (the CCPN project - www.ccpn.ac.uk) think we might make a natural
> contributor to BioPython [... description of project]
> We do have a few questions:
>
> Where do I find a copy of the license and conditions of distribution?

Biopython is distributed with the Biopython license. I have put it
online at:
http://www.biopython.org/static/LICENSE

It's basically the Python license. I don't believe it can be distributed
with the LGPL, so we have a license incompatibility. The Biopython
license would not mind being distributed with LGPL, but the LGPL
wouldn't like it much!
> Is this an integrated project (thus with extensive coordination > requirements) or a collection of independent deposited software? What > obligations would we be taking on? It's in-between the two. People are essentially independently in-charge of their own code, but there is some oversight on what goes into the project. People do make minor changes (e.g. bug fixes) to other portions of the code base, but major changes and changes to the API are discouraged without discussion. More coordination comes in when it's time to make releases, when I need to make sure that everybody's code is in working order and ready to be released. If you were to submit your code as part of biopython, we would probably give you your own package in the Bio namespace, probably Bio.CCPN. You'd be free to do whatever you wanted under there. > Do we have to deposit to your CVS repository and who would get write > access? If you want the code to be distributed with Biopython, then it would have to be in the CVS repository. We have thus far been relatively liberal about handing out write access, and haven't run into any problems yet. We would likely be able to give out accounts to anyone in your project that needs it, within reason, I suppose. > We follow a slightly different set of style guidelines, in that we use > internalUpperCase instead of separated_by_underscore, and we generate > our > own HTML documentation format. Would this be a problem? If you're familiar with our code base, you'll notice that we don't always follow our own guidelines consistently! :) It is unlikely that it will ever get unified, unless we happen to come upon a large increase in resources. As for documentation, that's not a problem. Brad has been wanting to move to a more distributed documentation format, to make it easier for package maintainers to write their own documentation. > Can we just put our name and URL on the ScriptingCentral page, and > might > this be better than contributing to the CVS? 
Yes, it would certainly be appropriate for you to do that! I don't know
if it would be better than contributing to CVS -- depends on what you
want to get out of it. While the projects cover the same general area,
I'm not sure how much overlap there is between the two projects, that
is, how many people now are using both Biopython and CCPN. If the
overlap is low, then many people would end up downloading a lot of code
they don't intend to use. If the overlap is high, then distributing
together would simplify the installation process.

Also, please consider the distribution cycle. Historically, Biopython
has had releases about once every 6 months. That's about the amount of
time to accumulate enough new code, fix bugs, and for someone (me
currently) to make the release. If you want faster releases, you'll
probably have to help out with the builds!

That said, why would someone want to contribute to Biopython? Biopython
does have a stable, full-featured infrastructure, with nice net access,
and web, CVS, mailing lists.

Also, please read over the Contribution Guide, which talks about other
considerations and requirements for contributing to Biopython:
http://www.biopython.org/docs/developer/contrib.html

Jeff

From Y.Benita at pharm.uu.nl Tue Oct 28 05:37:52 2003
From: Y.Benita at pharm.uu.nl (Yair Benita)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] Updated SeqUtils
Message-ID:

Dear All,

The SeqUtils module has been updated with some new functions.

CodonUsage module: can be used to generate a codon adaptation index for
a set of genes or compute a codon adaptation index from an existing
index.

IsoelectricPoint module: can be used to determine the isoelectric point
of a protein from its sequence.

ProtParam module: can be used to compute various properties of a
protein, such as: aromaticity, stability, flexibility and more.

More information is available as docstrings in each module.

Special thanks to Iddo Friedberg for all the help.
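
As a rough illustration of the kind of per-sequence property mentioned for ProtParam, the sketch below computes aromaticity with the standard library only. It is not the SeqUtils code: the function name and the example peptide are made up, and it assumes the common definition of aromaticity as the relative frequency of Phe, Trp and Tyr, which the announcement itself does not spell out.

```python
from collections import Counter

# Assumed definition: aromatic residues are Phe (F), Trp (W) and Tyr (Y).
AROMATIC_RESIDUES = "FWY"


def aromaticity(protein_sequence: str) -> float:
    """Return the fraction of residues that are F, W or Y (hypothetical helper)."""
    seq = protein_sequence.upper()
    if not seq:
        raise ValueError("cannot compute aromaticity of an empty sequence")
    counts = Counter(seq)
    # Counter returns 0 for residues that never occur, so no KeyError here.
    return sum(counts[aa] for aa in AROMATIC_RESIDUES) / len(seq)


example_peptide = "MFWYAAAAAA"  # made-up 10-residue peptide, 3 aromatic residues
print(aromaticity(example_peptide))
```

The docstrings in the real module remain the authority on which definition and which input types it actually uses.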
Yair
--
Yair Benita
Pharmaceutical Proteomics
Utrecht University

From bugzilla-daemon at portal.open-bio.org Fri Oct 31 18:59:37 2003
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] [Bug 1550] New: python setup.py install --home=~ fails
Message-ID: <200310312359.h9VNxbZs024573@portal.open-bio.org>

http://bugzilla.bioperl.org/show_bug.cgi?id=1550

           Summary: python setup.py install --home=~ fails
           Product: Biopython
           Version: Not Applicable
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev@biopython.org
        ReportedBy: tvinar@math.uwaterloo.ca

The installation to home directory fails with the following message:

running install_data
creating /usr/lib/python2.2/site-packages/Bio
error: could not create '/usr/lib/python2.2/site-packages/Bio': Permission denied

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org Fri Oct 31 19:34:47 2003
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:28 2005
Subject: [Biopython-dev] [Bug 1550] python setup.py install --home=~ fails
Message-ID: <200311010034.hA10Ylqb026714@portal.open-bio.org>

http://bugzilla.bioperl.org/show_bug.cgi?id=1550

idoerg@burnham.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
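
The report does not say why install_data ignores --home. One possible explanation, offered here purely as an assumption, is that the data files were registered with an absolute destination: distutils interprets a relative data_files directory relative to the installation prefix, but uses an absolute one as given. The sketch below mimics that rule with a hypothetical helper (data_install_dir) and made-up paths; it is not code from setup.py.

```python
import posixpath


def data_install_dir(install_base: str, data_dir: str) -> str:
    """Hypothetical helper: where a data directory ends up for a given base."""
    if posixpath.isabs(data_dir):
        # An absolute destination is used as-is, so the chosen base is ignored.
        return data_dir
    # A relative destination is placed under the chosen base.
    return posixpath.join(install_base, data_dir)


home_base = "/home/user"  # what --home=~ would expand to for a made-up user

# Absolute destination: still points at the system directory from the report.
print(data_install_dir(home_base, "/usr/lib/python2.2/site-packages/Bio"))

# Relative destination: follows the base (the relative path is illustrative).
print(data_install_dir(home_base, "lib/python/Bio"))
```

If that assumption holds, the first case would reproduce the "Permission denied" for a non-root user, while the second would stay inside the home directory.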