From hoffman at ebi.ac.uk  Fri Oct  3 12:02:44 2003
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar  5 14:43:27 2005
Subject: [Biopython-dev] Python 2.3.2 breaks setup.py test
Message-ID: <Pine.LNX.4.44.0310031658480.8927-100000@c3po.ebi.ac.uk>

In Python 2.3.2, the current directory '' is not included in the path.  
Instead the current directory at the time of Python startup is included in
the path. This breaks setup.py test which relies on the first behavior to 
import the tests.

OK to check in this patch?

Index: setup.py
===================================================================
RCS file: /home/repository/biopython/biopython/setup.py,v
retrieving revision 1.67
diff -u -r1.67 setup.py
--- setup.py    16 Sep 2003 16:59:30 -0000      1.67
+++ setup.py    3 Oct 2003 15:57:12 -0000
@@ -174,6 +174,7 @@
 
         # change to the test dir and run the tests
         os.chdir("Tests")
+        sys.path.insert(0, '')
         import run_tests
         run_tests.main([])
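
A minimal sketch of why the one-line change helps, written for a current Python 3 interpreter rather than 2.3. The module name demo_run_tests and the helper import_from_dir are made up for illustration; only os.chdir() and sys.path.insert(0, '') come from the patch itself.

```python
import importlib
import os
import sys
import tempfile


def import_from_dir(directory, module_name):
    """chdir into `directory`, put '' first on sys.path, then import."""
    old_cwd = os.getcwd()
    old_path = list(sys.path)
    os.chdir(directory)
    # '' means "whatever the current directory is right now", so the
    # import below is resolved relative to the directory we just entered.
    sys.path.insert(0, "")
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        # Restore the interpreter state so the sketch has no side effects.
        sys.path[:] = old_path
        os.chdir(old_cwd)


with tempfile.TemporaryDirectory() as tests_dir:
    # Hypothetical stand-in for Tests/run_tests.py.
    with open(os.path.join(tests_dir, "demo_run_tests.py"), "w") as fh:
        fh.write("def main(argv):\n    return 'ran %d args' % len(argv)\n")
    demo = import_from_dir(tests_dir, "demo_run_tests")
    result = demo.main([])

print(result)
```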
--
Michael Hoffman <hoffman@ebi.ac.uk>
European Bioinformatics Institute


From jchang at jeffchang.com  Wed Oct  8 00:47:07 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:27 2005
Subject: [Biopython-dev] settling in to new apartment
Message-ID: <78BAC6AB-F94A-11D7-A798-000A956845CE@jeffchang.com>

Hello Everybody,

I've just finished moving cross country, and had been out of net 
contact for a while.  It looks like there are a few messages piling up 
that might fall under my domain, so I will be getting to those over the 
next few days.

Jeff


From idoerg at burnham.org  Wed Oct  8 01:37:33 2003
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] settling in to new apartment
In-Reply-To: <78BAC6AB-F94A-11D7-A798-000A956845CE@jeffchang.com>
Message-ID: <Pine.SGI.4.10.10310072235490.827535-100000@pines2.ljcrf.edu>

I hope your move was not too harrowing, and that you are settling in OK.
How do you like the East?

See you at PSB, maybe?

Best of luck,

Iddo

PS: Can I congratulate you as Dr. Chang?

./I

--
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037, USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo

On Wed, 8 Oct 2003, Jeffrey Chang wrote:

> Hello Everybody,
> 
> I've just finished moving cross country, and had been out of net 
> contact for a while.  It looks like there are a few messages piling up 
> that might fall under my domain, so I will be getting to those over the 
> next few days.
> 
> Jeff
> 
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev@biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
> 


From jchang at jeffchang.com  Wed Oct  8 20:55:56 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] NCBIDictionary and staff ...
In-Reply-To: <10663.1064667909@www68.gmx.net>
Message-ID: <56E2892C-F9F3-11D7-9B72-000A956845CE@jeffchang.com>

OK.  I've modified the NCBIDictionary code so that it returns 1 
sequence by default.  However, there is still an optional parameter 
to disable this, if you want the old behavior.

I have committed this to CVS.

Jeff


On Saturday, September 27, 2003, at 09:05  AM, Andreas Kuntzagk wrote:

> Hi Jeffrey and others,
>
> I'm sending this from my private account because I'm in Paris for ECCB
> and haven't figured out how to access my work SMTP server, though I can
> read email.
>
> Regarding the retmax value on efetch: yes, it should be retmax=1. I
> _think_ it gives the maximum number of returned entries; I have to
> recheck the etools manual later. If I remember right, for my example of
> two entries with the same accession, the most recent one came back when
> I used retmax = 1. So this was maybe more of a workaround for my problem
> than a general fix.
>
> Bye from Paris,
>
> Andreas
>
> -- 
> Andreas "Murple" Kuntzagk
> mailto:the_murple@gmx.de
> snail_mail: Andreas Kuntzagk, Glatzer Straße 5, 10247 Berlin
>


From bugzilla-daemon at portal.open-bio.org  Wed Oct  8 22:15:53 2003
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] [Bug 1531] Bio.Fasta.RecordParser, SequenceParser
Message-ID: <200310090215.h992Fr0L002739@portal.open-bio.org>

http://bugzilla.bioperl.org/show_bug.cgi?id=1531

jchang@biopython.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


------- Additional Comments From jchang@biopython.org  2003-10-08 22:15 -------
The RecordParser and SequenceParser classes are not meant to be applied to files containing 
multiple sequences.  They should be applied to the sequences individually.  

Files of sequences should be handled with the Iterator, with one of those classes passed as the 
parser.  However, it is true that the Iterator does not handle blank lines between records, or at the 
beginning and end of files.  I have fixed it so that blank lines are now ignored.
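
A rough sketch of the blank-line handling described above, in present-day Python 3. This is not the Bio.Fasta.Iterator code; the function name iter_fasta_blocks is hypothetical, and it only shows the idea of skipping empty lines while splitting a handle into one text block per record, each of which could then be handed to a per-record parser.

```python
import io


def iter_fasta_blocks(handle):
    """Yield the raw text of each FASTA record, ignoring blank lines."""
    block = []
    for line in handle:
        if not line.strip():
            # Blank lines between records, or at the start or end of the
            # file, carry no information and are simply dropped.
            continue
        if line.startswith(">") and block:
            # A new title line closes the record collected so far.
            yield "".join(block)
            block = []
        block.append(line)
    if block:
        yield "".join(block)


text = "\n\n>seq1\nACGT\n\n>seq2\nGG\nCC\n\n"
blocks = list(iter_fasta_blocks(io.StringIO(text)))
print(blocks)
```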


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From jefftc at stanford.edu  Thu Oct  9 23:18:44 2003
From: jefftc at stanford.edu (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Biopython 1.22 available
Message-ID: <7445CC8C-FAD0-11D7-A5AC-000A956845CE@stanford.edu>

Hello Everybody,

Biopython 1.22 is now available from the website at:
http://www.biopython.org/

This is mostly a maintenance release.  The installation process is 
improved, and now distributes Martel and DTD files correctly.  The 
changes made in this release are:
   Added Peter Slicker's patches for speeding up modules under Python 2.3
   Fixed Martel installation.
   Does not install Bio.Cluster without Numeric.
   Distribute EUtils DTDs.
   Yves Bastide patched NCBIStandalone.Iterator to be Python 2.0 iterator
   Ashleigh's string coercion fixes in Clustalw.
   Yair Benita added precision to the protein molecular weights.
   Bartek updated AlignAce.Parser and added Motif.sim method
   bug fixes in Michiel De Hoon's clustering library
   Iddo's bug fixes to Bio.Enzyme and new RecordConsumer
   Guido Draheim added patches for fixing import path to xbb scripts
   regression tests updated to be Python 2.3 compatible
   GenBank.NCBIDictionary is smarter about guessing the format

As usual, please report bugs to biopython-dev@biopython.org, or the bug 
database also available from the website.

Jeff


From thamelry at vub.ac.be  Fri Oct 10 08:14:30 2003
From: thamelry at vub.ac.be (Thomas Hamelryck)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] mmCIF parser added to Bio.PDB
In-Reply-To: <D5353706-EE2D-11D7-A915-00039390F614@anu.edu.au>
References: <D5353706-EE2D-11D7-A915-00039390F614@anu.edu.au>
Message-ID: <200310101414.30793.thamelry@vub.ac.be>

Hi everybody,

Due to popular demand (by Cath Lawrence :-), I've added mmCIF support
to Bio.PDB. mmCIF in short is a file format that is used to describe crystal
structures. The mmCIF format solves many problems that are associated with 
the older PDB format (or at least that's what I'm told :-). 

Usage:

>>> from Bio.PDB.MMCIFParser import MMCIFParser
>>> parser=MMCIFParser()
>>> structure=parser.get_structure("test", "1FAT.cif")

In addition, there is also MMCIF2Dict, which makes the contents of an
mmCIF file available as a Python dictionary (with the data tags as keys),
so you can easily address all data in the mmCIF file.

Usage:

>>> from Bio.PDB.MMCIF2Dict import MMCIF2Dict
>>> d=MMCIF2Dict("1FAT.cif")

>>> print d["_database_PDB_matrix.entry_id"]
1FAT

>>> print d["_struct_site.id"]
['CAA', 'MNA', 'CAB', 'MNB', 'CAC', 'MNC', 'CAD', 'MND']

>>> d["_computing.structure_solution"]
"'X-PLOR 3.1'"


The modules use C/Lex code to parse the file, so it's reasonably fast. Note 
that compilation requires C and GNU Lex (ie. Flex). There is no support for 
writing mmCIF files, and I'm not planning to work on that either. I'd be 
interested to hear about possible bugs, requested features etc, but it 
should work reasonably as is.

Cheers,

---
Thomas Hamelryck
COMO-ULTR
Vrije Universiteit Brussel (VUB)
Belgium
http://homepages.vub.ac.be/~thamelry


From hoffman at ebi.ac.uk  Fri Oct 10 10:11:19 2003
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
Message-ID: <Pine.LNX.4.44.0310101458000.29967-100000@c3po.ebi.ac.uk>

I have long wondered about how much the use of Bio.File.UndoHandle
slows things down (it has additional checks for every read
operation). Here are some results:

I wrote two scripts, filetest.py and filetest-undo.py. They both read
in every line of Homo sapiens chromosome 1 and do nothing with
it. This file is 4086733 lines and 249290633 bytes.

**** filetest.py

input = file("/scratch/hoffman/1.fa")

line = 1
while line:
    line = input.readline()

**** filetest-undo.py

import Bio.File

input = Bio.File.UndoHandle(file("/scratch/hoffman/1.fa"))

line = 1
while line:
    line = input.readline()

****

Timing the run of these files gives the following results (real):

     filetest.py: 0m12.703s 0m12.215s 0m12.331s
filetest-undo.py: 0m30.135s 0m29.676s 0m30.165s

There is roughly a 140% increase (about 2.4 times as long) in the amount
of time it takes to do input using readline() with UndoHandle. The
overhead of loading Bio.File is minimal:

$ time python -c "import Bio.File"
 
real    0m0.418s
user    0m0.090s
sys     0m0.080s
 
$ time python -c "None"
 
real    0m0.070s
user    0m0.010s
sys     0m0.030s

This kind of increase on basic I/O means much when one is doing big
jobs, in my opinion. I wasn't volunteering to rewrite anything to not
use UndoHandle but people might consider it when writing future
stuff. And I might try rewriting some stuff anyway. Any thoughts?
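
For anyone who wants to repeat this kind of measurement without a 250 MB chromosome file, here is a self-contained sketch for Python 3. SavedLineHandle is a hypothetical stand-in that only mimics the readline() check of UndoHandle; it is not the Bio.File class, and the in-memory data is invented. Absolute numbers will differ from the timings above.

```python
import io
import time


class SavedLineHandle:
    """Hypothetical wrapper with an UndoHandle-style readline()."""

    def __init__(self, handle):
        self._handle = handle
        self._saved = []

    def readline(self, *args, **keywds):
        # The extra check and the extra call are the overhead being measured.
        if self._saved:
            return self._saved.pop(0)
        return self._handle.readline(*args, **keywds)


def count_lines(handle):
    """Read every line with readline() and do nothing with it."""
    count = 0
    line = handle.readline()
    while line:
        count += 1
        line = handle.readline()
    return count


def time_reader(make_handle):
    """Return (lines read, elapsed seconds) for a freshly made handle."""
    start = time.perf_counter()
    n = count_lines(make_handle())
    return n, time.perf_counter() - start


# Invented test data: 20000 two-line FASTA-like records held in memory.
data = (">chr_demo\n" + "ACGT" * 15 + "\n") * 20000

plain_n, plain_t = time_reader(lambda: io.StringIO(data))
undo_n, undo_t = time_reader(lambda: SavedLineHandle(io.StringIO(data)))
print("plain:   %d lines in %.4fs" % (plain_n, plain_t))
print("wrapped: %d lines in %.4fs" % (undo_n, undo_t))
```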
-- 
Michael Hoffman <hoffman@ebi.ac.uk>
European Bioinformatics Institute


From jchang at jeffchang.com  Mon Oct 13 00:38:49 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To: <Pine.LNX.4.44.0310101458000.29967-100000@c3po.ebi.ac.uk>
Message-ID: <23E5E458-FD37-11D7-8578-000A956845CE@jeffchang.com>

On Friday, October 10, 2003, at 10:11  AM, Michael Hoffman wrote:

> I have long wondered about how much the use of Bio.File.UndoHandle
> slows things down (it has additional checks for every read
> operation). Here are some results:

[cut, reading a file is slower when using an UndoHandle]

> There is about a 150% increase in the amount of time it takes to do
> input using readline() with UndoHandle.

> This kind of increase on basic I/O means much one one is doing big
> jobs, in my opinion. I wasn't volunteering to rewrite anything to not
> use UndoHandle but people might consider it when writing future
> stuff. And I might try rewriting some stuff anyway. Any thoughts?

The UndoHandle creates overhead on readline due to its extra if checks 
and function calls.

     def readline(self, *args, **keywds):
         if self._saved:
             line = self._saved.pop(0)
         else:
             line = self._handle.readline(*args,**keywds)
         return line

Also, passing *args and **keywds may incur another performance penalty, 
but I don't know how much.

The best way to speed this up might be to recode the class in C as a 
type.  This would help because the if statement would be evaluated in 
C, and also you can cache the self._handle.readline for a faster 
function lookup.
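
The caching half of that idea can also be tried in pure Python before anyone writes C. The sketch below is an assumption-laden illustration in Python 3, not a proposed replacement: the class name CachedReadlineHandle and the method push_back are hypothetical, and it drops the *args/**keywds pass-through, so it supports fewer call forms than the method shown above.

```python
import io


class CachedReadlineHandle:
    """Hypothetical variant that looks up handle.readline only once."""

    def __init__(self, handle):
        self._saved = []
        # Cache the bound method so each call skips two attribute lookups
        # (self._handle, then .readline).
        self._readline = handle.readline

    def push_back(self, line):
        """Make `line` the next thing readline() returns."""
        if line:
            self._saved.insert(0, line)

    def readline(self):
        if self._saved:
            return self._saved.pop(0)
        return self._readline()


handle = CachedReadlineHandle(io.StringIO("first\nsecond\n"))
first = handle.readline()
handle.push_back(first)          # undo the read
replayed = handle.readline()     # the same line comes back
rest = [handle.readline(), handle.readline()]
print(replayed, rest)
```

Whether this recovers a meaningful share of the overhead would have to be measured; the if check still runs in the interpreter on every call.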

Jeff


From mdehoon at ims.u-tokyo.ac.jp  Tue Oct 14 10:45:15 2003
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
Message-ID: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>

Dear Biopythoneers,

This evening I set out to create the Windows installers for Biopython 
1.22. The good news is that there were no compilation errors. The bad 
news is that the __init__.py and data.py files are missing from 
Bio/Cluster in the Biopython 1.22 source distribution. I checked in CVS, 
and found them there. Can these two files be added to the Biopython 1.22 
package?

I then added the __init__.py and data.py files from CVS, and made the 
Windows installers. You can find them at
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.win32-py2.2.exe
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.win32-py2.3.exe
(I will remove these once they are available from the Biopython site). 
While I was at it, I also made a complete Biopython 1.22 distribution:
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.tar.gz

There was one more warning message that I got when compiling Biopython:

Bio/trie.c: In function `Trie_has_prefix':
Bio/trie.c:443: warning: return makes integer from pointer without a cast

As far as I can tell from trie.c, this warning is not serious, it is due to 
returning a NULL instead of a 0 where the return type is int. But we may 
as well fix it.

--Michiel.


From thamelry at vub.ac.be  Tue Oct 14 10:51:21 2003
From: thamelry at vub.ac.be (Thomas Hamelryck)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>
References: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>
Message-ID: <200310141651.21996.thamelry@vub.ac.be>


> This evening I set out to create the Windows installers for Biopython
> 1.22. The good news is that there were no compilation errors. 

Hi Michiel,

Did you compile the KDTree module too? Some time ago somebody
asked for it for Windows. It's commented out in setup.py because
of a bug in distutils on some platforms.

Regards,

-Thomas

---
Thomas Hamelryck
Computational modeling lab (COMO)
Vrije Universiteit Brussel (VUB)
Belgium
http://homepages.vub.ac.be/~thamelry


From mdehoon at ims.u-tokyo.ac.jp  Tue Oct 14 21:42:56 2003
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <200310141651.21996.thamelry@vub.ac.be>
References: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>
	<200310141651.21996.thamelry@vub.ac.be>
Message-ID: <3F8CA620.3030202@ims.u-tokyo.ac.jp>

Thomas Hamelryck wrote:
 > Hi Michiel,
 >
 > Did you compile the KDTree module too? Some time ago somebody
 > asked for it for Windows. It's uncommented in setup.py because
 > of a bug in distutils on some platforms.

I just tried to compile KDTree on Windows. It seems that the file 
Bio/KDTree/_KDTree.swig.C is missing in the source distribution. I 
picked it up from CVS, however I was not able to compile KDTree on 
Cygwin/MinGW (which I am using to build the Windows installer) nor using 
Microsoft's compiler. It seems that distutils doesn't realize that this 
is a C++ file, but I don't know how to fix that. Changing the file 
extensions to .cpp or .cc didn't help.

--Michiel.

C:\cygwin\usr\local\bin\gcc.exe -mno-cygwin -mdll -O -Wall 
-Ic:\Python22\include -c Bio/KDTree/_KDTree.C -o 
build\temp.win32-2.2\Release\_kdtree.o
In file included from /usr/local/include/c++/3.3/bits/locale_facets.h:166,
                  from /usr/local/include/c++/3.3/bits/basic_ios.h:44,
                  from /usr/local/include/c++/3.3/ios:51,
                  from /usr/local/include/c++/3.3/ostream:45,
                  from /usr/local/include/c++/3.3/iostream:45,
                  from Bio/KDTree/_KDTree.h:1,
                  from Bio/KDTree/_KDTree.C:1:
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:46: error: 
`_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:47: error: 
`_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:48: error: 
`_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:48: error: 
`_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:49: error: 
`_N' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:50: error: 
`_X' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:50: error: 
`_N' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:51: error: 
`_S' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: 
`_P' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: 
`_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: 
`_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: 
`_N' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:52: error: 
`_B' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:53: error: 
`_P' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:53: error: 
`_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:53: error: 
`_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:53: error: 
`_N' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:54: error: 
`_C' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:55: error: 
`_P' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:56: error: 
`_U' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:56: error: 
`_L' was not declared in this scope
/usr/local/include/c++/3.3/i686-pc-cygwin/bits/ctype_base.h:56: error: 
`_N' was not declared in this scope
Bio/KDTree/_KDTree.C: In member function `void
    KDTree::neighbor_simple_search(float)':
Bio/KDTree/_KDTree.C:914: warning: comparison between signed and 
unsigned integer expressions
Bio/KDTree/_KDTree.C:923: warning: comparison between signed and 
unsigned integer expressions
error: command 'gcc' failed with exit status 1



-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon


From jchang at jeffchang.com  Tue Oct 14 23:42:33 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <3F8C0BFB.9060303@ims.u-tokyo.ac.jp>
Message-ID: <9C8384CF-FEC1-11D7-A8B2-000A956845CE@jeffchang.com>

Yes, they were missing.  I've fixed the MANIFEST.in file so that they 
will be included in the next release.  I'll roll a 1.23 release this 
weekend.

Jeff


On Tuesday, October 14, 2003, at 10:45  AM, Michiel Jan Laurens de Hoon 
wrote:

> Dear Biopythoneers,
>
> This evening I set out to create the Windows installers for Biopython 
> 1.22. The good news is that there were no compilation errors. The bad 
> news is that the __init__.py and data.py files are missing from 
> Bio/Cluster in the Biopython 1.22 source distribution. I checked in 
> CVS, and found them there. Can these two files be added to the 
> Biopython 1.22 package?
>
> I then added the __init__.py and data.py files from CVS, and made the 
> Windows installers. You can find them at
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.win32-py2.2.exe
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.win32-py2.3.exe
> (I will remove these once they are available from the Biopython site). 
> While I was at it, I also made a complete Biopython 1.22 distribution:
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython-1.22.tar.gz
>
> There was one more warning message that I got when compiling Biopython:
>
> Bio/trie.c: In function `Trie_has_prefix':
> Bio/trie.c:443: warning: return makes integer from pointer without a 
> cast
>
> As far I can tell from trie.c, this warning is not serious, it is due 
> to returning a NULL instead of a 0 where the return type is int. But 
> we may as well fix it.
>
> --Michiel.


From jchang at jeffchang.com  Tue Oct 14 23:47:28 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <3F8CA620.3030202@ims.u-tokyo.ac.jp>
Message-ID: <4C602AF6-FEC2-11D7-A8B2-000A956845CE@jeffchang.com>

On Tuesday, October 14, 2003, at 09:42  PM, Michiel Jan Laurens de Hoon 
wrote:

> Thomas Hamelryck wrote:
> > Hi Michiel,
> >
> > Did you compile the KDTree module too? Some time ago somebody
> > asked for it for Windows. It's uncommented in setup.py because
> > of a bug in distutils on some platforms.
>
> I just tried to compile KDTree on Windows. It seems that the file 
> Bio/KDTree/_KDTree.swig.C is missing in the source distribution. I 
> picked it up from CVS, however I was not able to compile KDTree on 
> Cygwin/MinGW (which I am using to build the Windows installer) nor 
> using Microsoft's compiler. It seems that distutils doesn't realize 
> that this is a C++ file, but I don't know how to fix that. Changing 
> the file extensions to .cpp or .cc didn't help.
>
> --Michiel.

Thomas,

Should the _KDTree.swig.C file be distributed?  If so, then does the 
_KDTree.i file need to be distributed, or just kept in the CVS?  I'll 
distribute it anyway, since it doesn't appear to be hurting anything.

Jeff


From thamelry at vub.ac.be  Wed Oct 15 04:32:01 2003
From: thamelry at vub.ac.be (Thomas Hamelryck)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <4C602AF6-FEC2-11D7-A8B2-000A956845CE@jeffchang.com>
References: <4C602AF6-FEC2-11D7-A8B2-000A956845CE@jeffchang.com>
Message-ID: <200310151032.01495.thamelry@vub.ac.be>

On Wednesday 15 October 2003 05:47 am, Jeffrey Chang wrote:
> Should the _KDTree.swig.C file be distributed?  If so, then does the
> _KDTree.i file need to be distributed, or just kept in the CVS?  I'll
> distribute it anyway, since it doesn't appear to be hurting anything.

The _KDTree.swig.C is generated by swig from _KDTree.i, so the latter
should definitely be included. I'd also put _KDTree.swig.C in there for those
who do not have swig installed.

Meanwhile I'll look into the distutils problem again.... 
Thanks for trying, Michiel!

-Thomas


From hoffman at ebi.ac.uk  Wed Oct 15 10:51:21 2003
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To: <23E5E458-FD37-11D7-8578-000A956845CE@jeffchang.com>
Message-ID: <Pine.LNX.4.44.0310151535040.18529-100000@damiana.ebi.ac.uk>

On Mon, 13 Oct 2003, Jeffrey Chang wrote:

> The UndoHandle creates overhead on readline due to its extra if checks 
> and function calls.
>
> [...] 
>
> The best way to speed this up might be to recode the class in C as a 
> type.  This would help because the if statement would be evaluated in 
> C, and also you can cache the self._handle.readline for a faster 
> function lookup.

Actually, I was thinking along the lines of recoding the class that
calls UndoHandle instead (see below). This new implementation does not
seem to be significantly faster than Bio.Fasta.Iterator when the
latter is used without a parser. However you get the parsing done for
free with this implementation! It seems to be about twice as fast as
using Bio.Fasta.Iterator with Bio.Fasta.RecordParser, and provides the
same functionality in a more lightweight package--a tuple of
(defline, data) instead of a Bio.Record object. What do you think?

class LightIterator(object):
    def __init__(self, handle):
        self._handle = handle
        self._defline = None

    def __iter__(self):
        return self

    def next(self):
        lines = []
        defline_old = self._defline

        while 1:
            line = self._handle.readline()
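            if not line:
                if not defline_old and not lines:
                    raise StopIteration
                # End of file: emit the last record. Breaking unconditionally
                # also covers sequence data that has no defline, which would
                # otherwise loop forever here.
                self._defline = None
                break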
            elif line[0] == '>':
                self._defline = line[1:].rstrip()
                if defline_old or lines:
                    break
                else:
                    defline_old = self._defline
            else:
                lines.append(line.rstrip())
            
        return defline_old, ''.join(lines)
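
For readers on Python 3, roughly the same idea can be written as a generator. This is a sketch under assumptions, not a port that has been benchmarked: the name light_fasta is hypothetical, and it yields the same (defline, data) tuples as the class above, with a defline of None for data that has no title line.

```python
import io


def light_fasta(handle):
    """Yield (defline, sequence) tuples from a FASTA-formatted handle."""
    defline = None
    lines = []
    for line in handle:
        if line.startswith(">"):
            # A new title line ends the previous record, if there was one.
            if defline is not None or lines:
                yield defline, "".join(lines)
            defline = line[1:].rstrip()
            lines = []
        else:
            lines.append(line.rstrip())
    # Emit whatever was collected when the file ran out.
    if defline is not None or lines:
        yield defline, "".join(lines)


records = list(light_fasta(io.StringIO(">a desc\nAC\nGT\n>b\nTT\n")))
print(records)
```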
-- 
Michael Hoffman <hoffman@ebi.ac.uk>
European Bioinformatics Institute


From jchang at jeffchang.com  Wed Oct 15 15:28:18 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Missing files in Biopython 1.22
In-Reply-To: <200310151032.01495.thamelry@vub.ac.be>
Message-ID: <BB0A6FB0-FF45-11D7-B861-000A956845CE@jeffchang.com>

I have edited the MANIFEST.in file so that _KDTree.swig.C is 
distributed.  It will go out with the next release.

Jeff

On Wednesday, October 15, 2003, at 04:32  AM, Thomas Hamelryck wrote:

> On Wednesday 15 October 2003 05:47 am, Jeffrey Chang wrote:
>> Should the _KDTree.swig.C file be distributed?  If so, then does the
>> _KDTree.i file need to be distributed, or just kept in the CVS?  I'll
>> distribute it anyway, since it doesn't appear to be hurting anything.
>
> The _KDTree.swig.C is generated by swig from _KDTree.i, so the latter
> should definitely be included. I'd also put _KDTree.swig.C in there 
> for those
> who do not have swig installed.
>
> Meanwhile I'll look into the distutils problem again....
> Thanks for trying, Michiel!
>
> -Thomas


From jchang at jeffchang.com  Wed Oct 15 23:10:39 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To: <Pine.LNX.4.44.0310151535040.18529-100000@damiana.ebi.ac.uk>
Message-ID: <51A00D19-FF86-11D7-8AF5-000A956845CE@jeffchang.com>

On Wednesday, October 15, 2003, at 10:51  AM, Michael Hoffman wrote:

> On Mon, 13 Oct 2003, Jeffrey Chang wrote:
>
>> The UndoHandle creates overhead on readline due to its extra if checks
>> and function calls.
>>
>> [...]
>>
>> The best way to speed this up might be to recode the class in C as a
>> type.  This would help because the if statement would be evaluated in
>> C, and also you can cache the self._handle.readline for a faster
>> function lookup.
>
> Actually, I was thinking along the lines of recoding the class that
> calls UndoHandle instead (see below). This new implementation does not
> seem to be significantly faster than Bio.Fasta.Iterator when the
> latter is used without a parser. However you get the parsing done for
> free with this implementation! It seems to be about twice as fast as
> using Bio.Fasta.Iterator with Bio.Fasta.RecordParser, and provides the
> same functionality in a more lightweight package--a tuple of
> (defline, data) instead of a Bio.Record object. What do you think?

[cut code]

That is a nice implementation.  However, Biopython already has at least 
3 Fasta parsers!
   Bio/Fasta
   Bio/SeqIO/FASTA
   Bio/expressions/fasta

Bio/Fasta, the one you compared against, is easily the slowest one.  
Bio/SeqIO/FASTA is very similar to your implementation and not likely 
to be significantly faster or slower.  Bio/expressions/fasta uses 
Martel.  I don't know how well that will perform.  The parsing part 
should be blazingly fast (since it is mostly in C), but building the 
object will be slow.  It might be a wash.

Jeff


From mdehoon at ims.u-tokyo.ac.jp  Thu Oct 16 00:19:50 2003
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Software demonstration at GIW 2003 in Japan
Message-ID: <3F8E1C66.4060803@ims.u-tokyo.ac.jp>

Dear Biopython developers,

I am volunteering to give a software demonstration of Biopython at the 
International Conference on Genome Informatics (GIW) in Tokyo/Yokohama 
this December. GIW is the largest annual conference on bioinformatics in 
Asia: see http://giw.ims.u-tokyo.ac.jp/giw2003 for more information.
The software demonstrations are set up like a poster session (instead of 
an oral presentation such as at BOSC), allowing easy communication with 
potential users. Such a software demonstration comes with a two-page 
paper describing the software. This paper is not medline-indexed and 
there is no rigorous refereeing involved for such short papers. However, 
it appears together with the full-length papers in the proceedings, so 
there will be a permanent record. The proceedings will be publicly 
available on the web after the conference. 
(http://www.jsbi.org/journal/GIW02/GIW02SS05.pdf is the paper for the 
software demonstration by our Bioruby colleagues last year at GIW).

The deadline for submissions has already passed, however since my lab is 
organizing this conference we have some leeway. I'll be happy to write 
the paper, but I will need some help:
1) Are there any other papers on Biopython from which I can plagiarize 
the parts of Biopython that I am not very familiar with?
2) Would somebody be willing to have a look at the paper before I submit 
it, in case I write something wrong?
3) Who should I include as co-authors? Anybody is welcome, as far as I 
am concerned.
4) Does anybody have any cool scripts that make use of Biopython that I 
can show off at the conference?

Thanks in advance,

--Michiel, U Tokyo.


-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon


From hoffman at ebi.ac.uk  Thu Oct 16 05:45:07 2003
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To: <51A00D19-FF86-11D7-8AF5-000A956845CE@jeffchang.com>
Message-ID: <Pine.LNX.4.44.0310161027190.18529-100000@damiana.ebi.ac.uk>

On Wed, 15 Oct 2003, Jeffrey Chang wrote:

> That is a nice implementation.  However, Biopython already has at least 
> 3 Fasta parsers!
>    Bio/Fasta
>    Bio/SeqIO/FASTA
>    Bio/expressions/fasta

There sure are. We should probably be cutting them rather than adding
them I suppose. :-) Have you thought of deprecating Bio.Fasta since it
is the slowest?

I know that the official path is to get people towards FormatIO but
Bio.expressions.fasta is more than 12x slower than my
implementation/Bio.SeqIO.FASTA (comparable as you predicted)! For one
test:

FormatIO: 3.085s/3.094s/3.154s
LightIterator: 0.246s/0.243s/0.245s
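
As a quick check of the "more than 12x" figure, using only the six timings above. Taking the median of each set is an assumption about how to summarize them; the variable names are made up.

```python
import statistics

# Wall-clock seconds copied from the two timing lines above.
formatio_runs = [3.085, 3.094, 3.154]
light_runs = [0.246, 0.243, 0.245]

# Median of three runs per implementation, then the slowdown factor.
formatio_median = statistics.median(formatio_runs)
light_median = statistics.median(light_runs)
slowdown = formatio_median / light_median

print("FormatIO is about %.1fx slower" % slowdown)
```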

Unless of course, I am using Bio.expressions.fasta incorrectly. It is
a bit hard to figure out what to do as there are no docstrings, unit
tests, or other documentation that I can see. Here is the code,
anyway. Please let me know if I did this in an inefficient way (this
is a slight speedup over using SeqRecord.io).

=====
from Bio import FormatIO

iterator = FormatIO.FormatIO("SeqRecord", default_input_format = "fasta").readFile(file("/scratch/test.fna"))
for record in iterator:
    pass
=====
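
For anyone who wants to reproduce this kind of three-run comparison, a
minimal sketch of a timing harness follows. It does not use LightIterator
or FormatIO: simple_fasta_iter and time_three_runs are hypothetical
stand-ins written only for illustration, the input is synthetic, and the
code assumes a current Python rather than 2.3.

```python
import io
import time


def simple_fasta_iter(handle):
    """Yield (title, sequence) pairs from a FASTA-formatted handle."""
    title, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new title line closes the previous record, if any.
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line.strip())
    if title is not None:
        yield title, "".join(chunks)


def time_three_runs(parse, text):
    """Time three full passes of `parse` over `text`, in seconds."""
    timings = []
    for _ in range(3):
        start = time.perf_counter()
        for _record in parse(io.StringIO(text)):
            pass
        timings.append(time.perf_counter() - start)
    return timings


# Synthetic input, made up for this example.
sample = "".join(
    ">seq%d test record\nACGTACGTAC\nGGTTAACC\n" % i for i in range(1000)
)
records = list(simple_fasta_iter(io.StringIO(sample)))
timings = time_three_runs(simple_fasta_iter, sample)
print("/".join("%.3fs" % t for t in timings))
```

Parsing from an in-memory buffer keeps disk caching out of the numbers;
timing a real file would also measure I/O.
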
-- 
Michael Hoffman <hoffman@ebi.ac.uk>
European Bioinformatics Institute


From idoerg at burnham.org  Thu Oct 16 12:23:56 2003
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Software demonstration at GIW 2003 in Japan
Message-ID: <3F8EC61C.6090601@burnham.org>


Michiel Jan Laurens de Hoon wrote:
> Dear Biopython developers,
> 
> I am volunteering to give a software demonstration of Biopython at the 
> International Conference on Genome Informatics (GIW) in Tokyo/Yokohama 
> this December. GIW is the largest annual conference on bioinformatics in 
> Asia: see http://giw.ims.u-tokyo.ac.jp/giw2003 for more information.
> The software demonstrations are set up like a poster session (instead of 
> an oral presentation such as at BOSC), allowing easy communication with 
> potential users. Such a software demonstration comes with a two-page 
> paper describing the software. This paper is not medline-indexed and 
> there is no rigorous refereeing involved for such short papers. However, 
> it appears together with the full-length papers in the proceedings, so 
> there will be a permanent record. The proceedings will be publicly 
> available on the web after the conference. 
> (http://www.jsbi.org/journal/GIW02/GIW02SS05.pdf is the paper for the 
> software demonstration by our Bioruby colleagues last year at GIW).

Fantastic! I'm reminded of the "Made in" albums by Deep Purple:
"Biopython: Made in Japan".

> 
> The deadline for submissions has already passed, however since my lab is 
> organizing this conference we have some leeway. I'll be happy to write 
> the paper, but I will need some help:

Nothing like having friends in high places... or being in one yourself.

> 1) Are there any other papers on Biopython from which I can plagiarize 
> the parts of Biopython that I am not very familiar with?

Jeff & Brad wrote something, but back in 2000. A lot has changed since.
Still:

http://biopython.org/docs/acm/ACMbiopy.pdf

> 2) Would somebody be willing to have a look at the paper before I submit 
> it, in case I write something wrong?

I can.

> 3) Who should I include as co-authors? Anybody is welcome, as far as I 
> am concerned.

Ummm.. definitely the triumvirate: Jeff, Brad & Andrew.

Here's what I would do: take the top N posters to biopython-dev, N being
the number of people you want on the paper.
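
A rough sketch of how that count could be taken from the list archive,
assuming the pipermail-style "From: user at host (Full Name)" header
lines; top_posters is a hypothetical helper and the sample text is made
up:

```python
import re
from collections import Counter

# Matches archive header lines like "From: user at host (Full Name)".
# Quoted lines ("> From: ...") do not match because of the ^ anchor.
_FROM = re.compile(r"^From: \S+ at \S+ \((.+)\)\s*$")


def top_posters(archive_text, n):
    """Return the n most frequent poster names in the archive text."""
    counts = Counter()
    for line in archive_text.splitlines():
        match = _FROM.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(n)


sample = """From: a at example.org (Alice Example)
Subject: one
From: b at example.org (Bob Example)
Subject: two
From: a at example.org (Alice Example)
Subject: three
"""
print(top_posters(sample, 1))
```
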

> 4) Does anybody have any cool scripts that make use of Biopython that I 
> can show off at the conference?
> 

I have two websites:

http://bioinformatics.org/pecop

http://ffas.ljcrf.edu:8080/Fragnostic

Which use the FASTA parsing, GenBank parsing, PDB parsing, GO module,
and some other stuff. Nothing really cool in the source codes though. If
you want to show code, I would suggest using something basic, like the
manual. However, those are biopython powered sites, which is always good
for PR.

> Thanks in advance,
> 
> --Michiel, U Tokyo.
> 
> 

-- 
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo


From jchang at jeffchang.com  Fri Oct 17 15:03:10 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Performance of Bio.File.UndoHandle
In-Reply-To: <Pine.LNX.4.44.0310161027190.18529-100000@damiana.ebi.ac.uk>
Message-ID: <8CC4C52A-00D4-11D8-84A9-000A956845CE@jeffchang.com>

On Thursday, October 16, 2003, at 05:45  AM, Michael Hoffman wrote:

> On Wed, 15 Oct 2003, Jeffrey Chang wrote:
>
>> That is a nice implementation.  However, Biopython already has at 
>> least
>> 3 Fasta parsers!
>>    Bio/Fasta
>>    Bio/SeqIO/FASTA
>>    Bio/expressions/fasta
>
> There sure are. We should probably be cutting them rather than adding
> them I suppose. :-) Have you thought of deprecating Bio.Fasta since it
> is the slowest?

Yes, that will probably be done eventually.   However, it does have a 
nice interface that's consistent with the other parsers, e.g. for 
GenBank, and it's documented.  We'd be deprecating the best documented 
parser for faster ones that aren't documented.  (As you noticed, not 
even docstrings.)  It's a trade-off.  The decision would be much clearer 
if the other parsers had better documentation!  ;)


> I know that the official path is to get people towards FormatIO but
> Bio.expressions.fasta is more than 12x slower than my
> implementation/Bio.SeqIO.FASTA (comparable as you predicted)! For one
> test:
>
> FormatIO: 3.085s/3.094s/3.154s
> LightIterator: 0.246s/0.243s/0.245s

Yikes!  Your code is correct.  However, in fairness, the fasta parser 
that FormatIO uses is doing more work, such as trying to detect database IDs 
(GenBank, EMBL, DDBJ, NBRF) in the description line.  However, if 
that's something that's not generally needed, perhaps that 
functionality should be off by default, so that the parser would be 
faster.  Everybody likes that, right?
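
As a rough sketch of what "off by default" could look like: the function
below only searches the title line for database IDs when asked to.
parse_title and detect_ids are hypothetical names, the regular expression
covers only an assumed NCBI-style "tag|accession|" layout, and none of
this is the actual Bio.expressions.fasta code.

```python
import re

# Assumed NCBI-style pattern such as "gb|U00096.3|"; the database tags
# listed here are illustrative, not the set the real parser recognises.
_DBID = re.compile(r"\b(gb|emb|dbj|pir)\|([^|\s]+)\|")


def parse_title(title, detect_ids=False):
    """Split a FASTA title line; look for database IDs only on request."""
    name, _, description = title.partition(" ")
    record = {"name": name, "description": description, "dbids": []}
    if detect_ids:
        # The extra regex scan is the cost that the default path skips.
        record["dbids"] = [
            (m.group(1), m.group(2)) for m in _DBID.finditer(title)
        ]
    return record


# Example title line, made up for illustration.
title = "gi|12345|gb|U00096.3| Escherichia coli"
fast = parse_title(title)
full = parse_title(title, detect_ids=True)
print(fast["dbids"], full["dbids"])
```
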

Jeff


From mdehoon at ims.u-tokyo.ac.jp  Sat Oct 18 03:49:23 2003
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Software demonstration at GIW 2003 in Japan
Message-ID: <3F90F083.9070206@ims.u-tokyo.ac.jp>

Dear Biopythoneers,

Thank you for your feedback on my proposed software demonstration at GIW
this year. I have put together a draft of the paper describing the 
software; it can be downloaded from 
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython.pdf. Please let me 
know if any changes should be made to it.

I agree with Iddo that it is better to include the original Biopython 
developers Jeff, Brad and Andrew as co-authors of this paper. I feel 
kind of uncomfortable writing a paper on Biopython and putting only my 
name on it, as I am not one of the main developers. So for now, I have 
added Jeff, Brad, Andrew, and Iddo as co-authors. To ensure that I don't 
include anybody as co-author against their will, please let me know if 
you agree to be a co-author. If I don't hear from you, I will assume 
that you didn't get this email or that you prefer not to be a co-author, 
and I will remove your name for the final version. To me, it doesn't 
matter if you actually contributed to this paper or not, because without 
you guys there would be no Biopython and no paper to write about it.

Thanks again, particularly for the pecop and fragnostic websites, I'll 
use those at the software demo to show Biopython in action.

--Michiel.


> I am volunteering to give a software demonstration of Biopython at
> the International Conference on Genome Informatics (GIW) in
> Tokyo/Yokohama this December. GIW is the largest annual conference on
> bioinformatics in Asia: see http://giw.ims.u-tokyo.ac.jp/giw2003 for
> more information. The software demonstrations are set up like a
> poster session (instead of an oral presentation such as at BOSC),
> allowing easy communication with potential users. Such a software
> demonstration comes with a two-page paper describing the software.
> This paper is not medline-indexed and there is no rigorous refereeing
> involved for such short papers. However, it appears together with the
> full-length papers in the proceedings, so there will be a permanent
> record. The proceedings will be publicly available on the web after
> the conference. (http://www.jsbi.org/journal/GIW02/GIW02SS05.pdf is
> the paper for the software demonstration by our Bioruby colleagues
> last year at GIW).


-- 
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon


From chapmanb at uga.edu  Sat Oct 18 12:05:17 2003
From: chapmanb at uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Software demonstration at GIW 2003 in Japan
In-Reply-To: <3F90F083.9070206@ims.u-tokyo.ac.jp>
References: <3F90F083.9070206@ims.u-tokyo.ac.jp>
Message-ID: <20031018160517.GA306@evostick.agtec.uga.edu>

Michiel;

> Thank you for your feedback on my proposed software demonstration at GIW
> this year. I have put together a draft of the paper describing the 
> software; it can be downloaded from 
> http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/biopython.pdf. Please let me 
> know if any changes should be made to it.

Looks great. Thanks for doing this -- it's definitely a positive
thing to get the word out about Biopython.


> I agree with Iddo that it is better to include the original Biopython 
> developers Jeff, Brad and Andrew as co-authors of this paper.

Thanks. Sure I'll take my name on more papers :-).

> Thanks again, particularly for the pecop and fragnostic websites, I'll 
> use those at the software demo to show Biopython in action.

If you want more examples of real life applications using Biopython,
I've recently stuck up a page with code I'm using for my graduate
research:

http://plantgenome.agtec.uga.edu/bioinformatics/dating/

It's not really a software demo kind of thing but is at least an
application of using Biopython for something "real." Also, if you
want to steal stuff from my BOSC biopython talk you are more than
welcome -- there's a tarball of the talk plus the LaTeX and
associated figures at:

http://evostick.agtec.uga.edu/~chapmanb/bp/bosc_biopython_2003.tar.gz

Good luck with the presentation!
Brad 

From jchang at jeffchang.com  Sun Oct 19 11:37:12 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Biopython 1.23 available
Message-ID: <1BAC9E6A-024A-11D8-8924-000A956845CE@jeffchang.com>

Hello Everybody,

Biopython 1.23 is now available from the website at:
http://www.biopython.org/

This is mostly a maintenance release, which fixes some problems in the 
installation.  You do not need to update from 1.22 unless you are using 
the Bio.Cluster, Bio.KDTree, or Bio.PDB.mmCIF packages.  The changes 
made in this release are:
   Fixed distribution of files in Bio/Cluster
   Now distributing Bio/KDTree/_KDTree.swig.C
   minor updates in installation code
   added mmCIF support for PDB files

As usual, please report bugs to biopython-dev@biopython.org, or the bug 
database also available from the website.

Jeff


From kristian.rother at charite.de  Mon Oct 20 06:16:04 2003
From: kristian.rother at charite.de (Kristian Rother)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Updates on PDB entries
Message-ID: <200310201216.04502.kristian.rother@charite.de>


Hello BioPython Maintainers,

I have just written some code that retrieves the weekly distributed files of 
new or modified protein structures from the PDB server or its mirrors. There 
was no such service in the last BioPython release I used (1.10, I think).

If you are interested in the code, I could provide you with an 
object-oriented, cleaned-up and documented source file (which I would not 
write otherwise).

Keep up the good work!

Kristian Rother


From thamelry at vub.ac.be  Mon Oct 20 04:09:14 2003
From: thamelry at vub.ac.be (Thomas Hamelryck)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Updates on PDB entries
In-Reply-To: <200310201216.04502.kristian.rother@charite.de>
References: <200310201216.04502.kristian.rother@charite.de>
Message-ID: <200310200807.h9K87KtL031792@sarek.skynet.be>

Hi Kristian,

> I have just written some code that retrieves the weekly distributed files
> of new or modified protein structures from the PDB server or its mirrors.
> There was no such service in the last BioPython release I used (1.10, I
> think).
>
> If you are interested in the code, I could provide you with an
> object-oriented, cleaned-up and documented source file (which I would not
> write otherwise).


That sounds very useful. Would indeed be a good addition to Biopython.

Cheers,

---
Thomas Hamelryck      
ULTR/COMO
Institute for molecular biology/Computer Science Department
Vrije Universiteit Brussel (VUB)
Brussels, Belgium
http://homepages.vub.ac.be/~thamelry

From idoerg at burnham.org  Mon Oct 20 12:52:51 2003
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Updates on PDB entries
In-Reply-To: <200310201216.04502.kristian.rother@charite.de>
References: <200310201216.04502.kristian.rother@charite.de>
Message-ID: <3F9412E3.30500@burnham.org>


Holy cheese, Cleaned-up _and_ documented??!!! What is this project 
coming to?

Good show Kristian. If you email me the code I'll see that it gets into 
the CVS :)

Iddo


Kristian Rother wrote:
> Hello BioPython Maintainers,
> 
> I have just written some code that retrieves the weekly distributed files of 
> new or modified protein structures from the PDB server or its mirrors. There 
> was no such service in the last BioPython release I used (1.10, I think).
> 
> If you are interested in the code, I could provide you with an 
> object-oriented, cleaned-up and documented source file (which I would not 
> write otherwise).
> 
> Keep up the good work!
> 
> Kristian Rother
> 
> 
> 
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev@biopython.org
> http://biopython.org/mailman/listinfo/biopython-dev
> 
> 

-- 
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo


From rhf22 at mole.bio.cam.ac.uk  Wed Oct 22 10:34:39 2003
From: rhf22 at mole.bio.cam.ac.uk (Rasmus Fogh)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Possible contributor
Message-ID: <Pine.SGI.4.33.0310221519460.31166922-100000@mole.bio.cam.ac.uk>

Dear BioPython,

We (the CCPN project - www.ccpn.ac.uk) think we might make a natural
contributor to BioPython. We are making a standard datamodel in the areas
of BioMolecular NMR, macromolecular structure, and Biochemistry LIMS
programs, including a model in UML, extensive, highly functional APIs to
support the data model, a standard XML data format, I/O libraries, and
some utility programs (e.g. NMR area format converters). We release all
this under LGPL.

To give you an impression, we have over 300 000 lines of python code and
over 500 000 lines of HTML documentation to contribute (most of which is
autogenerated from our UML model). We are currently working on extending
our model from Python/XML to include also Java APIs and relational
database storage.

We do have a few questions:

Where do I find a copy of the license and conditions of distribution?

Is this an integrated project (thus with extensive coordination
requirements) or a collection of independent deposited software? What
obligations would we be taking on?

Do we have to deposit to your CVS repository and who would get write
access?

We follow a slightly different set of style guidelines, in that we use
internalUpperCase instead of separated_by_underscore, and we generate our
own HTML documentation format. Would this be a problem?

Can we just put our name and URL on the ScriptingCentral page, and might
this be better than contributing to the CVS?

Thanks for your help,

Rasmus

---------------------------------------------------------------------------
Dr. Rasmus H. Fogh                  Email: r.h.fogh@bioc.cam.ac.uk
Dept. of Biochemistry, University of Cambridge,
80 Tennis Court Road, Cambridge CB2 1GA, UK.     FAX (01223)766002


From cyli at MIT.EDU  Wed Oct 22 17:24:40 2003
From: cyli at MIT.EDU (Ying Li)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Wrong comment, perhaps?
Message-ID: <1066857880.13608.4.camel@tandem.mit.edu>

Hi,

I'm sorry to bother everyone over such a minor technicality, but in the
docs and comments in Bio.Blast.Record.Blast, in the class
DatabaseReport, it says that the attribute num_sequences_in_database is
the number of sequences in the database, which is an int.

However:

Python 2.3.2 (#2, Oct  6 2003, 08:02:06) 
[GCC 3.3.2 20030908 (Debian prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio.Blast.NCBIStandalone import BlastParser
>>> rec = BlastParser().parse(open("blastOutput01.txt"))
>>> rec.num_sequences_in_database
[1]

Just wondering if the docs are wrong and it's really supposed to be a
list (since the database names is also a list) or if the code is wrong
and it's supposed to just be a number.

(this is not the CVS version, but the tarballed release 1.23 for linux)
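
Until the docs and the code agree, calling code can be written to accept
either form. A minimal sketch, assuming only the attribute name shown
above; DatabaseReportStandIn is a hypothetical stand-in class, not the
real Bio.Blast.Record class, and total_sequences is an illustrative
helper:

```python
class DatabaseReportStandIn:
    """Hypothetical stand-in, not the real DatabaseReport class."""

    def __init__(self, counts):
        # May be a single int (as documented) or a list (as observed).
        self.num_sequences_in_database = counts


def total_sequences(report):
    """Sum sequence counts whether the attribute is an int or a list."""
    value = report.num_sequences_in_database
    if isinstance(value, int):
        return value
    return sum(value)


as_list = DatabaseReportStandIn([1])
as_int = DatabaseReportStandIn(1)
print(total_sequences(as_list), total_sequences(as_int))
```
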

Thanks!
-Ying


From jchang at jeffchang.com  Wed Oct 22 18:03:20 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Wrong comment, perhaps?
In-Reply-To: <1066857880.13608.4.camel@tandem.mit.edu>
Message-ID: <8C11A9F7-04DB-11D8-A674-000A956845CE@jeffchang.com>

Yes, you're right.  It should be a list, because there can be multiple 
databases.  I've updated the code in CVS.  Thanks!

jeff


On Wednesday, October 22, 2003, at 05:24  PM, Ying Li wrote:

> Hi,
>
> I'm sorry to bother everyone over such a minor technicality, but in the
> docs and comments in Bio.Blast.Record.Blast, in the class
> DatabaseReport, it says that the attribute num_sequences_in_database is
> the number of sequences in the database, which is an int.
>
> However:
>
> Python 2.3.2 (#2, Oct  6 2003, 08:02:06)
> [GCC 3.3.2 20030908 (Debian prerelease)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> from Bio.Blast.NCBIStandalone import BlastParser
>>>> rec = BlastParser().parse(open("blastOutput01.txt"))
>>>> rec.num_sequences_in_database
> [1]
>
> Just wondering if the docs are wrong and it's really supposed to be a
> list (since the database names is also a list) or if the code is wrong
> and it's supposed to just be a number.
>
> (this is not the CVS version, but the tarballed release 1.23 for linux)
>
> Thanks!
> -Ying


From jchang at jeffchang.com  Wed Oct 22 18:28:07 2003
From: jchang at jeffchang.com (Jeffrey Chang)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Possible contributor
In-Reply-To: <Pine.SGI.4.33.0310221519460.31166922-100000@mole.bio.cam.ac.uk>
Message-ID: <0291CE5E-04DF-11D8-A674-000A956845CE@jeffchang.com>

Hi Rasmus,

> We (the CCPN project - www.ccpn.ac.uk) think we might make a natural
> contributor to BioPython
[... description of project]

> We do have a few questions:
>
> Where do I find a copy of the license and conditions of distribution?

Biopython is distributed with the Biopython license.  I have put it 
online at:
http://www.biopython.org/static/LICENSE

It's basically the Python license.  I don't believe it can be 
distributed with the LGPL, so we have a license incompatibility.  The 
Biopython license would not mind being distributed with LGPL, but the 
LGPL wouldn't like it much!


> Is this an integrated project (thus with extensive coordination
> requirements) or a collection of independent deposited software? What
> obligations would we be taking on?

It's in-between the two.  People are essentially independently 
in charge of their own code, but there is some oversight on what goes 
into the project.  People do make minor changes (e.g. bug fixes) to 
other portions of the code base, but major changes and changes to the 
API are discouraged without discussion.  More coordination comes in 
when it's time to make releases, when I need to make sure that 
everybody's code is in working order and ready to be released.

If you were to submit your code as part of biopython, we would probably 
give you your own package in the Bio namespace, probably Bio.CCPN.  
You'd be free to do whatever you wanted under there.


> Do we have to deposit to your CVS repository and who would get write
> access?

If you want the code to be distributed with Biopython, then it would 
have to be in the CVS repository.  We have thus far been relatively 
liberal about handing out write access, and haven't run into any 
problems yet.  We would likely be able to give out accounts to anyone 
in your project that needs it, within reason, I suppose.


> We follow a slightly different set of style guidelines, in that we use
> internalUpperCase instead of separated_by_underscore, and we generate 
> our
> own HTML documentation format. Would this be a problem?

If you're familiar with our code base, you'll notice that we don't 
always follow our own guidelines consistently!  :)  It is unlikely that 
it will ever get unified, unless we happen to come upon a large 
increase in resources.

As for documentation, that's not a problem.  Brad has been wanting to 
move to a more distributed documentation format, to make it easier for 
package maintainers to write their own documentation.


> Can we just put our name and URL on the ScriptingCentral page, and 
> might
> this be better than contributing to the CVS?

Yes, it would certainly be appropriate for you to do that!

I don't know if it would be better than contributing to CVS -- depends 
on what you want to get out of it.  While the projects cover the same 
general area, I'm not sure how much overlap there is between the two 
projects, that is, how many people now are using both Biopython and 
CCPN.  If the overlap is low, then many people would end up downloading 
a lot of code they don't intend to use.  If the overlap is high, then 
distributing together would simplify the installation process.

Also, please consider the distribution cycle.  Historically, Biopython 
has had releases about once every 6 months.  That's about the amount of 
time to accumulate enough new code, fix bugs, and for someone (me 
currently) to make the release.  If you want faster releases, you'll 
probably have to help out with the builds!

That said, why would someone want to contribute to Biopython?  
Biopython does have a stable, full-featured infrastructure, with nice 
net access, and web, CVS, mailing lists.

Also, please read over the Contribution Guide, which talks about other 
considerations and requirements for contributing to Biopython:
http://www.biopython.org/docs/developer/contrib.html

Jeff


From Y.Benita at pharm.uu.nl  Tue Oct 28 05:37:52 2003
From: Y.Benita at pharm.uu.nl (Yair Benita)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] Updated SeqUtils
Message-ID: <BBC40590.34EB%Y.Benita@pharm.uu.nl>

Dear All,
The SeqUtils module has been updated with some new functions.

CodonUsage module: can be used to generate a codon adaptation index for a
set of genes or compute a codon adaptation index from an existing index.

IsoelectricPoint module: can be used to determine the isoelectric point of a
protein from its sequence.

ProtParam module: can be used to compute various properties of a protein,
such as: aromaticity, stability, flexibility and more.

More information is available as docstrings in each module.
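
To give a flavour of the kind of property involved, here is a minimal
standalone sketch of aromaticity, taken here as the fraction of residues
that are Phe, Trp or Tyr. It is written from that definition as an
illustration and is not copied from the ProtParam module; the function
name and the example sequence are made up.

```python
def aromaticity(protein_sequence):
    """Fraction of residues that are Phe (F), Trp (W) or Tyr (Y)."""
    sequence = protein_sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    aromatic = sum(sequence.count(residue) for residue in "FWY")
    return aromatic / len(sequence)


# Ten residues, three of them aromatic.
print(aromaticity("MFWYAAAAAA"))
```
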

Special thanks to Iddo Friedberg for all the help.

Yair
-- 
Yair Benita
Pharmaceutical Proteomics
Utrecht University


From bugzilla-daemon at portal.open-bio.org  Fri Oct 31 18:59:37 2003
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] [Bug 1550] New: python setup.py install --home=~
	fails
Message-ID: <200310312359.h9VNxbZs024573@portal.open-bio.org>

http://bugzilla.bioperl.org/show_bug.cgi?id=1550

           Summary: python setup.py install --home=~ fails
           Product: Biopython
           Version: Not Applicable
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev@biopython.org
        ReportedBy: tvinar@math.uwaterloo.ca


The installation to home directory fails with the following message:
running install_data
creating /usr/lib/python2.2/site-packages/Bio
error: could not create '/usr/lib/python2.2/site-packages/Bio': Permission denied


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org  Fri Oct 31 19:34:47 2003
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar  5 14:43:28 2005
Subject: [Biopython-dev] [Bug 1550] python setup.py install --home=~ fails
Message-ID: <200311010034.hA10Ylqb026714@portal.open-bio.org>

http://bugzilla.bioperl.org/show_bug.cgi?id=1550

idoerg@burnham.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.