From chapmanb at arches.uga.edu  Sun Jul  1 13:40:51 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:00 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <a0510100fb76079968dd4@[192.168.0.4]>
References: <a0510100fb76079968dd4@[192.168.0.4]>
Message-ID: <15167.24739.293433.283046@taxus.athen1.ga.home.com>

Jeff:
> It smells like time to put together another release to get some bug 
> fixes and new functionality out to the public.  

Sounds like a great idea! Thanks for pushing for it to happen.

> I currently have the 
> middle of next week in mind, although that's still up for debate.

Fine with me.

> Here's my undoubtedly incomplete list of stuff that's been updated 
> since the last release (on Mar 3!):
[...]

A few other things I can think of:

o Can output GenBank.Record objects in GenBank format
o Fixes and updates in SubsMat code.

>    1) you're currently working on something and really want to hold 
> off until it's done.

Iddo and I were working on fixes and additions he suggested for the
Align stuff. He just sent me a revised version today, so I think we
should be able to have it in within the deadline without a problem.

>    - dynamic programming code (Brad, where's yours? :)

Um...my cat ate it. 

Jeff: 
> Stuff I'd like, but may not get done:
>    - PDB parser

Andrew:
> I use UPDB to generate Martel format definitions.  However, it's
> not really useful unless the parser can build real data structures,

Thomas:
> I think, this month I can spend significantly more time on the biopython
> project than the last months - so is anything I mentioned worth to pull in
> the next release ?

Don't know if this can make it into the next release, but it seems like
there is:

1. Interest in a PDB parser.
2. Code available from Andrew for a partial PDB parser, that needs
someone to help finish and integrate it.
3. Someone named Thomas with extra time on his hands :-)

Don't know if this is a good match with your interests, Thomas, but
it's at least an idea (man, and I don't get them very often).

In general, sounds good about the release. Please let me know if there
are specific things I can do to help out.

Brad


From biopython-bugs at bioperl.org  Sun Jul  1 13:56:27 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:00 2005
Subject: [Biopython-dev] Notification: incoming/34
Message-ID: <200107011756.f61HuQ402618@pw600a.bioperl.org>

JitterBug notification

chapmanb moved PR#34 from incoming to trash
Message summary for PR#34
	From: <dragon13@dwp.net>
	Subject: specials of the day
	Date: Mon, 28 May 2001 15:02:24
	0 replies 	0 followups



From biopython-bugs at bioperl.org  Sun Jul  1 13:56:27 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:00 2005
Subject: [Biopython-dev] Notification: incoming/36
Message-ID: <200107011756.f61HuR402622@pw600a.bioperl.org>

JitterBug notification

chapmanb moved PR#36 from incoming to trash
Message summary for PR#36
	From: <br56@peopleweb.com>
	Subject: toner supplies
	Date: Fri, 29 Jun 2001 04:13:00
	0 replies 	0 followups



From idoerg at cc.huji.ac.il  Mon Jul  2 02:15:12 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:00 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <15167.24739.293433.283046@taxus.athen1.ga.home.com>
Message-ID: <Pine.GSO.4.33_heb2.09.0107020837440.25614-100000@new-shum>

Hi,

Jeff:
:
: > Here's my undoubtedly incomplete list of stuff that's been updated
: > since the last release (on Mar 3!):
: [...]
:
: A few other things I can think of:
:
: o Can output GenBank.Record objects in GenBank format
: o Fixes and updates in SubsMat code.

I fixed the SubsMat code, committed it + regression tests, and updated the
Wiki docs. AFAIK, should be OK. Let me know if not so.

Also, I added a bit of functionality to FSSP.FSSPTools

Brad:

: Iddo and I were working on fixes and additions he suggested for the
: Align stuff. He just sent me a revised version today, so I think we
: should be able to have it in within the deadline without a problem.

Yep. I think I can finish this by Tuesday.


: Jeff:
: > Stuff I'd like, but may not get done:
: >  - PDB parser
:
: Andrew:
: > I use UPDB to generate Martel format definitions.  However, it's
: > not really useful unless the parser can build real data structures,
:

Is anybody familiar with Konrad Hinsen's PDB module, which is used, I
believe, as part of MMTK? I downloaded it a couple of years ago, and it
seems to stand out well on its own. It might be worth checking out:

http://starship.python.net/crew/hinsen/MMTK/

(Haven't really looked at it for a while, though).

Iddo

--

Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/


From idoerg at cc.huji.ac.il  Wed Jul  4 12:19:46 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <Pine.GSO.4.33_heb2.09.0107020837440.25614-100000@new-shum>
Message-ID: <Pine.GSO.4.33_heb2.09.0107041918410.8943-100000@new-shum>

Hi,

FSSP, Align, and SubsMat have all been committed for the new version. Inc.
updated regression tests.

Iddo

--

Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/


From chapmanb at arches.uga.edu  Wed Jul  4 16:06:54 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <Pine.GSO.4.33_heb2.09.0107041918410.8943-100000@new-shum>
References: <Pine.GSO.4.33_heb2.09.0107020837440.25614-100000@new-shum>
	<Pine.GSO.4.33_heb2.09.0107041918410.8943-100000@new-shum>
Message-ID: <15171.30558.888426.472221@taxus.athen1.ga.home.com>

Hi Iddo;

> FSSP, Align, and SubsMat have all been committed for the new version. Inc.
> updated regression tests.

Sweet. Thanks for the fixes and updates.

Can you also update the generated output in Tests/output for
test_FSSP, test_align and test_SubsMat if you've verified it to be
correct (you can do 'python run_tests.py -g test_FSSP' to get the
output updated). 

Right now running python run_tests.py in the Tests directory gives the
following errors:

======================================================================
ERROR: test_FSSP
----------------------------------------------------------------------
Traceback (most recent call last):
  File "run_tests.py", line 155, in runTest
    raise IOError, "Warning: Can't open %s for test %s" % \
IOError: Warning: Can't open ./output/test_FSSP for test test_FSSP
======================================================================
FAIL: test_SubsMat
----------------------------------------------------------------------
Traceback (most recent call last):
  File "run_tests.py", line 153, in runTest
    expected_handle)
  File "run_tests.py", line 247, in compare_output
    assert expected_line == output_line, \
AssertionError: 
Output  : '\n'
Expected: 'A 1.60\n'
======================================================================
FAIL: test_align
----------------------------------------------------------------------
Traceback (most recent call last):
  File "run_tests.py", line 153, in runTest
    expected_handle)
  File "run_tests.py", line 247, in compare_output
    assert expected_line == output_line, \
AssertionError: 
Output  : 'part of alignment: 88.4230990854\n'
Expected: 'part of alignment: 1.57690091462\n'
----------------------------------------------------------------------
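
The two FAIL tracebacks above come from a line-by-line comparison of the
generated output against the stored expected output. A minimal sketch of
that kind of check follows, written for current Python. It is not the
actual code in run_tests.py; the function body and the use of in-memory
handles are assumptions made for illustration:

```python
import io


def compare_output(test_name, output_handle, expected_handle):
    """Compare two file-like objects line by line (hypothetical sketch).

    Raises AssertionError on the first line that differs and returns
    True when both handles run out of lines at the same time.
    """
    while True:
        expected_line = expected_handle.readline()
        output_line = output_handle.readline()
        # readline() returns '' at end of file; stop when both are done.
        if not expected_line and not output_line:
            return True
        if expected_line != output_line:
            raise AssertionError(
                "Mismatch in %s\nOutput  : %r\nExpected: %r"
                % (test_name, output_line, expected_line))


# Reproduce the shape of the test_SubsMat failure with in-memory handles.
try:
    compare_output("test_SubsMat",
                   io.StringIO("\n"),
                   io.StringIO("A 1.60\n"))
except AssertionError as err:
    print(err)
```

Because the check stops at the first differing line, a single stale line
in Tests/output is enough to fail the whole test, which is why the stored
output has to be regenerated after an intended change.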

Thanks again for the updates!
Brad


From idoerg at cc.huji.ac.il  Thu Jul  5 04:56:15 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <15171.30558.888426.472221@taxus.athen1.ga.home.com>
Message-ID: <Pine.GSO.4.33_heb2.09.0107051152230.29652-100000@new-shum>

Hi,

Incidentally: Happy (belated) 4th of July!

On Wed, 4 Jul 2001, Brad Chapman wrote:

:
: Can you also update the generated output in Tests/output for
: test_FSSP, test_align and test_SubsMat if you've verified it to be
: correct (you can do 'python run_tests.py -g test_FSSP' to get the
: output updated).

OK, done. Sorry about that.

Iddo

--

Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/


From jchang at SMI.Stanford.EDU  Thu Jul  5 15:57:29 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <Pine.GSO.4.33_heb2.09.0107051152230.29652-100000@new-shum>
References: <Pine.GSO.4.33_heb2.09.0107051152230.29652-100000@new-shum>
Message-ID: <a05101002b76a757e8e74@[171.65.33.250]>

test_FSSP is still failing the regression tests for me.  It seems to 
be caused by:
f.write("\nRecords filtered in %s\n" % sum_ge_15.keys())

Since dictionaries are unordered, it is outputting them in a 
different order than they did when the output was created.  Thus, I 
added a little code so that it outputs the sorted keys, which should 
be the same every time:
k = sum_ge_15.keys()
k.sort()
f.write("\nRecords filtered in %s\n" % k)

I've committed this change and updated the output.  I'm going to start 
putting together the release today.
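
For reference, a minimal sketch of the same idea in current Python, where
keys() returns a view with no sort() method, so sorted() is used instead.
The helper name write_filtered and the dictionary contents are made up
for illustration and are not taken from test_FSSP:

```python
import io


def write_filtered(handle, records):
    """Write the record keys in sorted order so the output is reproducible."""
    keys = sorted(records.keys())
    handle.write("\nRecords filtered in %s\n" % keys)


# Placeholder data; the real sum_ge_15 dictionary comes from the FSSP test.
sum_ge_15 = {"2hvm": 1, "1cnv": 2, "1edt": 3}
out = io.StringIO()
write_filtered(out, sum_ge_15)
print(out.getvalue())
```

Sorting before formatting makes the written line independent of whatever
order the dictionary happens to iterate in.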

Jeff

From chapmanb at arches.uga.edu  Thu Jul  5 16:44:05 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <a05101002b76a757e8e74@[171.65.33.250]>
References: <Pine.GSO.4.33_heb2.09.0107051152230.29652-100000@new-shum>
	<a05101002b76a757e8e74@[171.65.33.250]>
Message-ID: <15172.53653.811395.427800@taxus.athen1.ga.home.com>

Jeff:
> test_FSSP is still failing the regression tests for me.  It seems to 
> be caused by:
> f.write("\nRecords filtered in %s\n" % sum_ge_15.keys())
> 
> Since dictionaries are unordered, it is outputting them in a 
> different order than they did when the output was created. 

Ya, I also did this myself on some tests. It is very easy to get
tricked on this (well, at least it was easy for me to get tricked :-)
because for the same dictionary in multiple tests, python will give
you the same order. It is when you switch versions of python that the
ordering will start to change and you'll realize it. Sneaky!

> I'm going to start 
> putting together the release today.

Sweet! Give a heads up when you're done and I can build the windows
installer and update the HappyDoc documentation.

Also, can we try to get the pdf documentation in the release this
time? :-). You can grab it from the normal place:

http://www.bioinformatics.org/bradstuff/bp/tut/Tutorial.pdf

If you put it in the Doc directory it would automagically be included
(I hope!). 

Thanks!
Brad


From idoerg at cc.huji.ac.il  Thu Jul  5 18:12:01 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <15172.53653.811395.427800@taxus.athen1.ga.home.com>
Message-ID: <Pine.GSO.4.33_heb2.09.0107060047160.10392-100000@new-shum>

Hi,

On Thu, 5 Jul 2001, Brad Chapman wrote:

: Jeff:
: > test_FSSP is still failing the regression tests for me.  It seems to
: > be caused by:
: > f.write("\nRecords filtered in %s\n" % sum_ge_15.keys())
: >
: > Since dictionaries are unordered, it is outputting them in a
: > different order than they did when the output was created.
:
: Ya, I also did this myself on some tests. It is very easy to get
: tricked on this (well, at least it was easy for me to get tricked :-)

Same here. Didn't know that, (though I should have thought about it).
Jeff's fix is fine. Thanks for taking the time to do this.

One more thing: can the following correction please be inserted in the
manual?

(Section 3.5.5)
Amend the paragraph beginning with "Qi - " to:

Qi - is automatically assigned to 0.05 for a protein alphabet, and 0.25
for a nucleic acid alphabet. This is for getting the information content
without any assumption of prior distributions. When assuming priors, or
when using a non-standard alphabet, the user should supply the values for Qi.

Scratch the whole paragraph (just following) about gap characters,
including the 2nd equation.  We're not doing it that way anyway.
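
To illustrate the defaults above: 0.05 is 1/20 and 0.25 is 1/4, i.e. a
uniform background over the alphabet. A minimal sketch, assuming the
information content of a column is the usual relative entropy
sum(Pi * log2(Pi / Qi)); the function name and arguments are hypothetical
and this is not the Biopython implementation:

```python
import math


def column_information(column, alphabet_size, background=None):
    """Information content (in bits) of one alignment column.

    column        -- string of letters observed in the column
    alphabet_size -- 20 for protein, 4 for nucleic acid
    background    -- optional dict of Qi values; when omitted, Qi is
                     uniform, 1 / alphabet_size (0.05 or 0.25)
    """
    counts = {}
    for letter in column:
        counts[letter] = counts.get(letter, 0) + 1
    total = float(len(column))
    uniform_q = 1.0 / alphabet_size
    info = 0.0
    for letter, count in counts.items():
        p = count / total
        q = uniform_q if background is None else background[letter]
        info += p * math.log2(p / q)
    return info


# A fully conserved nucleotide column and a fully mixed one.
print(column_information("AAAA", 4))
print(column_information("ACGT", 4))
```

With a uniform Qi, a perfectly conserved column scores log2 of the
alphabet size and a column matching the background scores zero; supplying
a background dict changes Qi per letter.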


Thanks,

Iddo

--

Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/


From chapmanb at arches.uga.edu  Thu Jul  5 20:41:25 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <Pine.GSO.4.33_heb2.09.0107060047160.10392-100000@new-shum>
References: <15172.53653.811395.427800@taxus.athen1.ga.home.com>
	<Pine.GSO.4.33_heb2.09.0107060047160.10392-100000@new-shum>
Message-ID: <15173.2357.484152.295717@taxus.athen1.ga.home.com>

Iddo:
> One more thing: can the following correction please be inserted in the
> manual?

Shorely! All fixed in CVS and on the web site documentation. Thanks!

Brad


From jchang at SMI.Stanford.EDU  Thu Jul  5 22:24:17 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <15172.53653.811395.427800@taxus.athen1.ga.home.com>
References: <Pine.GSO.4.33_heb2.09.0107051152230.29652-100000@new-shum>
 <a05101002b76a757e8e74@[171.65.33.250]>
 <15172.53653.811395.427800@taxus.athen1.ga.home.com>
Message-ID: <a05101007b76ad1a0267c@[171.65.33.250]>

>> I'm going to start
>> putting together the release today.
>
>Sweet! Give a heads up when you're done and I can build the windows
>installer and update the HappyDoc documentation.

Yep, it's all there now!

>Also, can we try to get the pdf documentation in the release this
>time? :-). You can grab it from the normal place:
>
>http://www.bioinformatics.org/bradstuff/bp/tut/Tutorial.pdf

Thanks.  I had meant to include it last time, but there was a failure 
in the build process and it accidentally got left out.  Sorry about 
that!

Jeff

From jchang at SMI.Stanford.EDU  Thu Jul  5 22:26:46 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] build process online
Message-ID: <a05101008b76ad1f0392c@[171.65.33.250]>

Hello everybody,

I've written down the build process I use to make biopython releases 
and put them online:
http://www.biopython.org/wiki/html/BioPython/BuildProcess.html

This is a reference to make sure everything gets done for the builds!

Jeff

From chapmanb at arches.uga.edu  Sat Jul  7 14:48:04 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <a05101007b76ad1a0267c@[171.65.33.250]>
References: <Pine.GSO.4.33_heb2.09.0107051152230.29652-100000@new-shum>
	<a05101002b76a757e8e74@[171.65.33.250]>
	<15172.53653.811395.427800@taxus.athen1.ga.home.com>
	<a05101007b76ad1a0267c@[171.65.33.250]>
Message-ID: <15175.22884.286737.563372@taxus.athen1.ga.home.com>

[Jeff on the new release]
> Yep, it's all there now!

I've added windows installers for python2.0 and python2.1 to the
download page. They seem to install fine for me, but if anyone here
uses windows and wants to test them that would be excellent!

I'm probably just going to lay off on making rpms right
now, unless people complain.

[pdf docs]
> Thanks.  I had meant to include it last time, but there was a failure 
> in the build process and it accidentally got left out.  Sorry about 
> that!

No problem -- I just thought it was funny that we've been trying to
put it in for the last 12 releases or so and keep forgetting about it
somehow :-). Thanks for getting it in there! Also, the build process
wiki page is quite interesting reading -- thanks for adding it!

Brad


From johann at egenetics.com  Mon Jul  9 09:34:26 2001
From: johann at egenetics.com (Johann Visagie)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Biopython 1.00a2 release
In-Reply-To: <a0510100fb76079968dd4@[192.168.0.4]>; from jchang@SMI.Stanford.EDU on Wed, Jun 27, 2001 at 11:19:32PM -0700
References: <a0510100fb76079968dd4@[192.168.0.4]>
Message-ID: <20010709153426.E9831@fling.sanbi.ac.za>

FYI,

I've just updated the FreeBSD port of Biopython to 1.00a2, and committed the
changes back to the FreeBSD CVS tree.

  http://www.freebsd.org/cgi/cvsweb.cgi/ports/biology/py-biopython/

-- Johann

From biopython-bugs at bioperl.org  Tue Jul 10 10:40:46 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Notification: incoming/37
Message-ID: <200107101440.f6AEek416252@pw600a.bioperl.org>

JitterBug notification

new message incoming/37

Message summary for PR#37
	From: idoerg@cc.huji.ac.il
	Subject: MutableSeq
	Date: Tue, 10 Jul 2001 10:40:46 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From idoerg@cc.huji.ac.il Tue Jul 10 10:40:46 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f6AEej416246
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 10 Jul 2001 10:40:46 -0400
Date: Tue, 10 Jul 2001 10:40:46 -0400
Message-Id: <200107101440.f6AEej416246@pw600a.bioperl.org>
From: idoerg@cc.huji.ac.il
To: biopython-bugs@bioperl.org
Subject: MutableSeq

Full_Name: Iddo Friedberg
Module: CVS
Version: 
OS: RedHat 7.0
Submission from: nv5.huji.ac.il (62.0.54.61)


>>> from Bio import Seq
>>> l=Seq.Seq('ACDEFGHIKL')phabet())
>>> j=l.tomutable()
>>> print j
MutableSeq('ACDEFGHIKL', Alphabet())
>>> j
Segmentation fault
(back to shell prompt)

BTW, I patched my RH 7.0 system up.. I don't believe
this is due to the overflow problems that are known
in this system.

Iddo


From idoerg at cc.huji.ac.il  Wed Jul 11 03:49:51 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Re: bug 37 (fwd)
Message-ID: <Pine.GSO.4.33_heb2.09.0107111044000.19762-100000@new-shum>

Hi Andrew & all,

Sorry, I mangled the thing during cut/paste. Seems like you reproduced the
code correctly, Andrew.


Ok. Here's the minimal code. It causes a segfault with python 2.0 / 2.0.1
on Linux RH 7.0 and RH 6.2

idoerg@arrakis:noh/sw> python
Python 2.0 (#1, Oct 16 2000, 18:10:03)
[GCC 2.95.2 19991024 (release)] on linux2
Type "copyright", "credits" or "license" for more information.
imported cPickle, math, string, os
imported dirs
>>> from Bio import Seq
>>> l=Seq.MutableSeq('ACDEFGHIKL')
>>> print j
MutableSeq('ACDEFG', Alphabet())
>>> j
Segmentation fault (core dumped)


Boo Hoo!!

Iddo

--

Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/


From dalke at dalkescientific.com  Wed Jul 11 04:59:39 2001
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Re: bug 37 (fwd)
Message-ID: <006f01c109e7$d1108260$c1c63ec1@josiah.ebi.ac.uk>

That can't be right, since there's no 'j':

Iddo:
> >>> from Bio import Seq
> >>> l=Seq.MutableSeq('ACDEFGHIKL')
> >>> print j
> MutableSeq('ACDEFG', Alphabet())
> >>> j
>Segmentation fault (core dumped)

[dalke@pw600a biopython]$ python
Python 2.0 (#4, Dec  8 2000, 21:23:00)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> from Bio import Seq
>>> l = Seq.MutableSeq('ACDEFGHIKL')
>>> print l
MutableSeq('ACDEFGHIKL', Alphabet())
>>> l
MutableSeq(array('c', 'ACDEFGHIKL'), Alphabet())
>>>

If I were to make a guess, what happens when you do

>>> import array
>>> a = array.array('c', 'ACDEFGHIKL')
>>> a
array('c', 'ACDEFGHIKL')
>>>

 ?
                    Andrew


From idoerg at cc.huji.ac.il  Wed Jul 11 05:16:57 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Re: bug 37 (fwd)
In-Reply-To: <006f01c109e7$d1108260$c1c63ec1@josiah.ebi.ac.uk>
Message-ID: <Pine.GSO.4.33_heb2.09.0107111208520.21099-100000@new-shum>

On Wed, 11 Jul 2001, Andrew Dalke wrote:

: That can't be right, since there's no 'j':

Of course it's wrong! Silly me...

Here's the real thing:

>>> from Bio import Seq
>>> l=Seq.MutableSeq('ACDEFGHIKL')
>>> print l
MutableSeq('ACDEFG', Alphabet())
>>> l
(segfault)

Here's the summary (for now):

Problem appears on python2.0, 2.0.1
Linux RH 6.2, RH 7.0

Does not appear on python2.1, which I have just installed.

[Andrew]

: If I were to make a guess, what happens when you do
:
: >>> import array
: >>> a = array.array('c', 'ACDEFGHIKL')
: >>> a
: array('c', 'ACDEFGHIKL')
: >>>

This bit of code works fine on python2.0

Can anyone reproduce the segfault bug on python2.0?

Iddo

--

Iddo Friedberg                                  | Tel: +972-2-6758647
Dept. of Molecular Genetics and Biotechnology   | Fax: +972-2-6757308
The Hebrew University - Hadassah Medical School | email: idoerg@cc.huji.ac.il
POB 12272, Jerusalem 91120                      |
Israel                                          |
http://bioinfo.md.huji.ac.il/marg/people-home/iddo/


From dalke at dalkescientific.com  Wed Jul 11 05:31:03 2001
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Re: bug 37 (fwd)
Message-ID: <008401c109ec$34367300$c1c63ec1@josiah.ebi.ac.uk>

Iddo:
>Can anyone reproduce the segfault bug on python2.0?

The biopython.org machine is an Alpha machine running Linux
and Python 2.0 - no segfault:

[dalke@pw600a ~]$ uname -a
Linux pw600a.bioperl.org 2.2.14-6.0 #1 Tue Mar 28 16:56:56 EST 2000 alpha unknown
[dalke@pw600a ~]$ python
Python 2.0 (#4, Dec  8 2000, 21:23:00)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> from Bio import Seq
>>> l = Seq.MutableSeq('ACDEFGHIKL')
>>> print l
MutableSeq('ACDEFGHIKL', Alphabet())
>>> l
MutableSeq(array('c', 'ACDEFGHIKL'), Alphabet())
>>>

                    Andrew


From chapmanb at arches.uga.edu  Wed Jul 11 05:44:58 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Re: bug 37 (fwd)
In-Reply-To: <008401c109ec$34367300$c1c63ec1@josiah.ebi.ac.uk>
References: <008401c109ec$34367300$c1c63ec1@josiah.ebi.ac.uk>
Message-ID: <15180.8218.54340.15962@taxus.athen1.ga.home.com>

> Iddo:
> >Can anyone reproduce the segfault bug on python2.0?

Andrew:
> The biopython.org machine is an Alpha machine running Linux
> and Python 2.0 - no segfault:

I do get the segfault with Python 2.0 on FreeBSD:

[chapmanb]$ uname -a
FreeBSD insomniac.athen1.ga.home.com 4.3-STABLE FreeBSD 4.3-STABLE #5: Fri Jun  8 01:44:22 EDT 2001     chapmanb@insomniac.athen1.ga.home.com:/usr/src/sys/compile/INSOMNIAC  i386
[chapmanb]$ python
Python 2.0 (#2, Oct 31 2000, 15:45:46) 
[GCC 2.95.2 19991024 (release)] on freebsd4
Type "copyright", "credits" or "license" for more information.
>>> from Bio import Seq
>>> l = Seq.MutableSeq('ACDEFG')            
>>> print l
MutableSeq('ACDEFG', Alphabet())
>>> l
Segmentation fault (core dumped)

but I don't see it with NetBSD/Python2.1:

[chapmanb]$ uname -a
NetBSD taxus.athen1.ga.home.com 1.5.1 NetBSD 1.5.1 (TAXUS) #1: Tue Jun 12 09:13:48 EDT 2001     chapmanb@taxus:/usr/src/sys/arch/macppc/compile/TAXUS macppc
[chapmanb]$ python
Python 2.1 (#6, Jul  8 2001, 17:18:01) 
[GCC egcs-2.91.66 19990314 (egcs-1.1.2 release)] on netbsd1
Type "copyright", "credits" or "license" for more information.
>>> from Bio import Seq
>>> l = Seq.MutableSeq('ACDEFGHIKL')
>>> print l
MutableSeq('ACDEFGHIKL', Alphabet())
>>> l
MutableSeq(array('c', 'ACDEFGHIKL'), Alphabet())

I'd suspect this is a tricky bug in 2.0 that got fixed somewhere along
the line in 2.1.

only-use-old-versions-of-software-if-you-like-to-run-into-fixed-bugs<wink>-ly
yr's,

Brad


From idoerg at cc.huji.ac.il  Wed Jul 11 06:03:01 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Re: bug 37 (fwd)
In-Reply-To: <15180.8218.54340.15962@taxus.athen1.ga.home.com>
Message-ID: <Pine.GSO.4.33_heb2.09.0107111300580.22088-100000@new-shum>

Hi,

You're up bright and early, Brad!

OK, I'm out of ideas. This should be documented somewhere (where do we
document known bugs?), but I'm out of ideas as to a fix. Besides a
python2.1 update.


Iddo


From chapmanb at arches.uga.edu  Wed Jul 11 06:15:42 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Re: bug 37 (fwd)
In-Reply-To: <Pine.GSO.4.33_heb2.09.0107111300580.22088-100000@new-shum>
References: <15180.8218.54340.15962@taxus.athen1.ga.home.com>
	<Pine.GSO.4.33_heb2.09.0107111300580.22088-100000@new-shum>
Message-ID: <15180.10062.710172.68319@taxus.athen1.ga.home.com>

Hi Iddo;

> You're up bright and early, Brad!

:-). Up late actually; ah, the fun of trying to finish a project
update that you start at the very last moment!
 
> OK, I'm out of ideas. This should be documented somewhere (where do we
> document known bugs?), but I'm out of ideas as to a fix. Besides a
> python2.1 update.

I can add something to the Tutorial about it (I can add a "known bugs"
section). I don't think this problem should affect anyone's scripts or
anything, but it is good to bring it up so that we know it's
happening.

Brad


From dalke at dalkescientific.com  Wed Jul 11 10:00:57 2001
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Martel now supports attributes
Message-ID: <001901c10a11$e8877dc0$c1c63ec1@josiah.ebi.ac.uk>

I finally got a chance to do something I proposed a long time ago.
Martel now supports attributes for the XML events.

[dalke@pw600a biopython]$ python
Python 2.0 (#4, Dec  8 2000, 21:23:00)
[GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> from Martel import *
>>> format = Group("seq", Re("[ATCG]+"), {"type": "dna"}) + AnyEol()
>>> from xml.sax import saxutils
>>> gen = saxutils.XMLGenerator()
>>> parser = format.make_parser()
>>> parser.setContentHandler(gen)
>>> parser.parseString("GATTACA\n")
<?xml version="1.0" encoding="iso-8859-1"?>
<seq type="dna">GATTACA</seq>
>>>

So the new part here is the optional 3rd arg to Martel.Group,
which is the dictionary to use for the attributes.  The result
is shown in the <seq> tag, which now includes the attribute
'type="dna"'.

The regular expression pattern language was modified to allow
persisting the attributes to/from the ?P<> group name.

>>> str(format)
'(?P<seq?type=dna>[ACGT]+)(\\n|\\r\\n?)'
>>>

This is actually encoded like the query component of a URL,
so the following is allowed

   (?P<spam?a=8&homedir=%7Edalke>...)

and corresponds to a startElement of:

   <spam a="8" homedir="~dalke">
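
The decoding side of that encoding can be sketched with the standard library. This is only an illustration of the URL-query idea, not Martel's own code; the helper name split_group_name is made up for the example.

```python
from urllib.parse import parse_qsl


def split_group_name(name):
    """Split 'tag?key=value&...' into (tag, attribute dict).

    Hypothetical helper: it mimics the URL-query style encoding
    described above, it is not taken from Martel.
    """
    tag, _, query = name.partition("?")
    # parse_qsl undoes the %XX escapes, so %7E becomes "~".
    return tag, dict(parse_qsl(query))


print(split_group_name("spam?a=8&homedir=%7Edalke"))
print(split_group_name("seq?type=dna"))
print(split_group_name("seq"))
```

A group name without a "?" simply yields an empty attribute dictionary.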

The reason for this change is to make it easier to support different
formats and versions.  Currently I've been using tags like

<swissprot38><swissprot38_record> ...

Now I can do:

<seqdb dbname="swissprot" release="38">
  <record format="swissprot">
ID   <id type="id">100K_RAT</id>
AC   <id type="accession">Q12345</id>;  ...
SQ   <sequence>EKLADWERDN ADEDLE</sequence>
  </record>
</seqdb>

and if tag names are chosen consistently across the databases then
something like a FASTA conversion can be made very generic - just
get the 'id type="id"' and <sequence> fields of each <record>.
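
To make the generic conversion concrete, here is a rough sketch using xml.etree.ElementTree on a hand-written sample modelled on the record above. The sample text and the to_fasta name are assumptions for illustration; real Martel output would arrive as SAX events rather than a string.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, modelled on the tagged record shown above.
SAMPLE = """<seqdb dbname="swissprot" release="38">
  <record format="swissprot">
ID   <id type="id">100K_RAT</id>
AC   <id type="accession">Q12345</id>;
SQ   <sequence>EKLADWERDN ADEDLE</sequence>
  </record>
</seqdb>"""


def to_fasta(xml_text):
    """Emit one FASTA entry per <record>, using only generic tag names."""
    root = ET.fromstring(xml_text)
    entries = []
    for record in root.iter("record"):
        ident = record.find("id[@type='id']").text
        # Drop the blanks that separate blocks of ten residues.
        sequence = record.find("sequence").text.replace(" ", "")
        entries.append(">%s\n%s\n" % (ident, sequence))
    return "".join(entries)


print(to_fasta(SAMPLE), end="")
```

Nothing in to_fasta mentions SWISS-PROT, which is the point of choosing consistent tag names.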

I've also added the old Martel-specific regression tests back to
the main biopython CVS tree.  This doesn't have the format specific
tests (like for PIR, BLAST, etc.), excepting SWISS-PROT.

I changed the 'sre_parse.py' and 'sre_constants.py' files to be
'msre_parse.py' and 'msre_constants.py' because I was all too often
running into conflicts between those files and the ones in the
now standard Python distribution.

                    Andrew
                    dalke@acm.org


From dalke at dalkescientific.com  Thu Jul 12 20:13:48 2001
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] bioperl-db
Message-ID: <003401c10b30$b0831180$c1c63ec1@josiah.ebi.ac.uk>

As I think I've mentioned before, I'm visiting EBI before going
to BOSC next week.  Part of my hope while here is to get better
integration between biopython and bioperl.  This afternoon (and
evening) I worked on being able to use a MySQL database loaded
from the bioperl-db schema.

Here's some of what it looks like, with comments starting with ">>> #"

josiah> python
Python 2.1a2 (#1, Feb 11 2001, 00:48:26)
[GCC 2.95.3 19991030 (prerelease)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import BioSeqDatabase
>>> server = BioSeqDatabase.open_database(user = "root", db="minidb")
>>> server.keys()
['genbank']
>>> db = server["genbank"]
>>> # can also do 'display_id' and 'primary_id'
>>> record = db.lookup(accession = "L26462")
>>> # size of the sequence
>>> len(record)
3002
>>> # This is akin to the bioperl way to access the sequence record
>>> seq = record.primary_seq.seq
>>> # Except that biopython's Seq is a first-class object
>>> # In bioperl this is a primary_seq.subseq(0, 1)
>>> seq[0]
'A'
>>> # The first fetches the full string from the database,
>>> # then shows the 1st 5 chars; the second only fetches 5 characters
>>> # from the database
>>> seq.tostring()[:5], seq[:5].tostring()
('ACCTC', 'ACCTC')
>>> # More showing off :)
>>> seq[-1]
'C'
>>> seq.tostring()[-5:], seq[-5:].tostring()
('CTATC', 'CTATC')
>>> # Yep, subslices can be subsliced
>>> seq.tostring()[-5:], seq[-10:][5:].tostring()
('CTATC', 'CTATC')
>>> from Bio import utils
>>> # This shows that the Alphabet is correct
>>> utils.translate_to_stop(record.primary_seq.seq)
Seq('TSYLTPLITPLIVTLWVVSDFLFICIFDCIKRSLVFYLLFPKT', IUPACProtein())
>>> # Added a new 'Species' object for this, based on bioperl's
>>> record.species
<BioSeq.Species instance at 0x817afbc>
>>> str(record.species)
'Eukaryota Metazoa Chordata Craniata Vertebrata Mammalia Eutheria Primates Catarrhini Hominidae Homo sapiens'
>>> record.species.binomial()
'Homo sapiens'
>>> # Loaded sequence features into the biopython Bio.SeqFeature.SeqFeature
>>> len(record.seq_features)
26
>>> for feature in record.seq_features[:4]:
...     print str(feature)
...
type: source
location: (0..3002)
ref: None:None
strand: 1
qualifiers:
        Key: db_xref, Value: taxon:9606
        Key: haplotype, Value: C4
        Key: note, Value: sequence found in a Melanesian population
        Key: organism, Value: Homo sapiens

type: variation
location: (110..111)
ref: None:None
strand: 1
qualifiers:
        Key: replace, Value: t

type: variation
location: (262..263)
ref: None:None
strand: 1
qualifiers:
        Key: note, Value: Rsa I polymorphism
        Key: replace, Value: t

type: variation
location: (272..273)
ref: None:None
strand: 1
qualifiers:
        Key: replace, Value: c

>>> # Much more is supported - see the source!
>>>
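
The "only fetches 5 characters" behaviour in the session above can be sketched as a lazy sequence that turns slices into substring requests. This is a toy illustration, not the code in BioSeqDatabase.py; FakeBackend stands in for the MySQL table and every name here is hypothetical.

```python
class FakeBackend:
    """Stand-in for the database; counts how many characters are fetched."""

    def __init__(self, text):
        self.text = text
        self.fetched = 0

    def length(self):
        return len(self.text)

    def substring(self, start, end):
        self.fetched += end - start
        return self.text[start:end]


class LazySeq:
    """Sequence view that only touches the backend in tostring()/indexing."""

    def __init__(self, backend, start=0, end=None):
        self._backend = backend
        self._start = start
        self._end = backend.length() if end is None else end

    def __len__(self):
        return self._end - self._start

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("this sketch only supports a step of 1")
            stop = max(stop, start)
            # A slice is just a narrower view; nothing is fetched yet.
            return LazySeq(self._backend, self._start + start, self._start + stop)
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("sequence index out of range")
        return self._backend.substring(self._start + index, self._start + index + 1)

    def tostring(self):
        return self._backend.substring(self._start, self._end)


backend = FakeBackend("ACCTC" + "G" * 20 + "CTATC")
seq = LazySeq(backend)
print(seq[:5].tostring(), backend.fetched)
print(seq[-10:][5:].tostring(), backend.fetched)
```

Subslices of subslices work because each view only stores absolute start and end offsets.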

If you are interested, the files are at
  http://www.biopython.org/~dalke/BioSeqDatabase.py
  http://www.biopython.org/~dalke/BioSeq.py

It may be somewhat harder reading than usual because
  - this is prototype code
  - I've not used the bioperl object layer before
  - I've not used the bioperl-db schema before
  - I've not used MySQL before
  - there are almost no comments in the 550+ LOC


I found out some things in the process:

1) The biopython SeqFeature currently must be used like:

   feature = SeqFeature()
   feature.type = "variation"
   ...

I would much prefer allowing the values to be set through
the constructor, as in

   feature = SeqFeature(type = "variation", ...)

This is part of my design philosophy, which is to minimize
modification of a data structure after it has been created.
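
As a sketch of what such a constructor could look like (a stand-alone toy class, not the actual Bio.SeqFeature.SeqFeature; the parameter names are assumptions):

```python
class SeqFeature:
    """Hypothetical sketch of a feature that is fully set up at creation."""

    def __init__(self, location=None, type="", strand=None, qualifiers=None):
        self.location = location
        self.type = type
        self.strand = strand
        # Copy so that later changes to the caller's dict do not leak in.
        self.qualifiers = dict(qualifiers or {})

    def __repr__(self):
        return "SeqFeature(location=%r, type=%r, strand=%r)" % (
            self.location, self.type, self.strand)


feature = SeqFeature(location=(110, 111), type="variation", strand=1,
                     qualifiers={"replace": "t"})
print(feature)
```

Every argument has a default, so the existing "create, then assign" style would keep working alongside it.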


2) is strand part of the feature or the location?

In bioperl-db and in bioperl, the strand information is part
of the Range.  In SeqFeature, the strand is part of the SeqFeature
and not of the FeatureLocation (which is our equivalent to the
Range).

This is a problem when a SeqFeature has several subfeatures.
In the current scheme, the parent SeqFeature keeps track of
a global start/end location for all its children.  It also needs
a strand.  What strand is used if the subchildren are on
different strands?  (Does that possibility make sense?)

In any case, I don't know enough about how a SeqFeature is meant to
be used to tell whether this is really wrong or not.

3) Related to that, what's the type used when there are subfeatures?
The SeqFeature documentation says in the case of a join( ... )
the subfeatures have a typename of parent.type + "_span"

>     CDS    join(1..10,30..40,50..60)
>
>    Then the top level feature would be a CDS from 1 to 60, and the sub
>    features would be of 'CDS_span' type and would be from 1 to 10, 30 to
>    40 and 50 to 60, respectively

but in Genbank._FeatureConsumer it says:
>             # add _join or _order to the name to make the type clear

I don't like either one.  Why does the SeqFeature need a type at all?

4) feature qualifiers shouldn't really be a dictionary

The SeqFeature keeps track of the qualifiers through a dictionary.
That's a problem because if there are two qualifiers with the
same name then there's a collision, and one gets over written.
(For example,

LOCUS       HUM2C18X01   1668 bp    DNA             PRI       24-AUG-1993
DEFINITION  Homo sapiens cytochrome P4502C18 (CYP2C18) gene, 5' flank and
            exon 1.
ACCESSION   L16869
VERSION     L16869.1  GI:291599

has two /citations

     exon            <1270..1437
                     /gene="CYP2C18"
                     /note="transcription start site not determined"
                     /citation=[1]
                     /citation=[3]
                     /number=1
                     /evidence=experimental

)

To get around this problem, the GenBank parser has:

> If there are multiple qualifier keys with the same name we
> would lose some info in the dictionary, so we append a unique
> number to the end of the name in case of conflicts.
> """
> # if we've got a key from before, add it to the dictionary of
> # qualifiers
> if self._cur_qualifier_key:
>     # get a unique name
>     unique_name = self._cur_qualifier_key
>     counter = 1
>     while self._cur_feature.qualifiers.has_key(unique_name):
>         unique_name = self._cur_qualifier_key + str(counter)
>         counter = counter + 1
>
>     self._cur_feature.qualifiers[unique_name] = \
>                                              self._cur_qualifier_value

which means the qualifier key has a number on the end of it.
That's messing with user visible data in a way that I definitely
don't regard as kosher.  (In my new bioperl-db integration code I
still keep the workaround but minimize it somewhat by not appending
a 0 the first time around.  That isn't the right solution either.)

One way I've solved this problem before is to make a hybrid dict/list
class, where you can refer to
  obj["key"]
to get the value associated with the named key - a string - or
  obj[4]
to get the value in the 5th position.

This only works if the key can never be an integer.

I'm not suggesting it as a solution, only pointing out that other
alternatives exist.
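
A minimal sketch of such a hybrid, assuming qualifier keys are always strings so that integers can be treated as positions. The class name and the getall method are inventions for this example, not existing Biopython code.

```python
class Qualifiers:
    """Ordered (key, value) pairs with dict-style and list-style lookup."""

    def __init__(self, pairs=()):
        self._pairs = list(pairs)

    def add(self, key, value):
        self._pairs.append((key, value))

    def __len__(self):
        return len(self._pairs)

    def __getitem__(self, index):
        if isinstance(index, int):
            # Integer: positional lookup, like a list.
            return self._pairs[index][1]
        # Anything else: first value stored under that key, like a dict.
        for key, value in self._pairs:
            if key == index:
                return value
        raise KeyError(index)

    def getall(self, key):
        """Return every value stored under key, in file order."""
        return [value for k, value in self._pairs if k == key]


quals = Qualifiers([
    ("gene", "CYP2C18"),
    ("note", "transcription start site not determined"),
    ("citation", "[1]"),
    ("citation", "[3]"),
    ("number", "1"),
    ("evidence", "experimental"),
])
print(quals["citation"], quals.getall("citation"), quals[4])
```

Both /citation values survive and no key is renamed, at the cost of no longer being a plain dictionary.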

- What's the numbering system of the FeatureLocation?
When the GenBank parser uses it there is no conversion from
GenBank's numbering system
(where [i,j] means i<=position<=j and position == 1 is the first term)
to Python's
(where [i,j] means i<=position<j and position == 0 is the first term)

I strongly insist that all numbers be converted to normal Python
semantics and only translated at I/O boundaries (to/from files, to/from
GUIs, etc.)  Otherwise people will get confused.  Okay, at least *I'll*
get confused.
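
For reference, the conversion between the two conventions is a one-line shift of the start coordinate. The function names below are made up for the example.

```python
def genbank_to_python(start, end):
    """1-based inclusive [start, end] -> 0-based half-open [start, end)."""
    return start - 1, end


def python_to_genbank(start, end):
    """0-based half-open [start, end) -> 1-based inclusive [start, end]."""
    return start + 1, end


# GenBank 2..4 of 'ACDEFG' is 'CDE'; the converted pair slices directly.
start, end = genbank_to_python(2, 4)
print("ACDEFG"[start:end])
print(python_to_genbank(start, end))
```

With Python semantics the length of a location is simply end - start, with no "+ 1" to remember.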

                    Andrew
                    dalke@acm.org


From johann at egenetics.com  Fri Jul 13 04:27:20 2001
From: johann at egenetics.com (Johann Visagie)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] bioperl-db
In-Reply-To: <003401c10b30$b0831180$c1c63ec1@josiah.ebi.ac.uk>; from dalke@dalkescientific.com on Fri, Jul 13, 2001 at 01:13:48AM +0100
References: <003401c10b30$b0831180$c1c63ec1@josiah.ebi.ac.uk>
Message-ID: <20010713102720.J66477@fling.sanbi.ac.za>

Andrew Dalke on 2001-07-13 (Fri) at 01:13:48 +0100:
> 
>   - I've not used MySQL before

Accessing MySQL via Andy Dustman's MySQLdb module is a strange twilight
experience that is and yet isn't quite unlike most other MySQL APIs.

On the one hand, MySQLdb implements many of the functions of the standard
MySQL C API as methods.  On the other, the emulated cursor class (which
satisfies the Python DB API spec) makes using it very different from any
other MySQL API I've used, and more in line with what you'd expect of an
"enterprise" RDBMS.  Somewhat strange...  but overall MySQLdb is extremely
Pythonic and quite clean.
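
For anyone who has not seen the DB API cursor style, here is a small sketch. It uses the standard sqlite3 module instead of MySQLdb so that it runs without a server; the table and column names are invented, and as far as I know MySQLdb expects %s placeholders where sqlite3 uses ?.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE entry (accession TEXT, length INTEGER)")
# Parameters are passed separately from the SQL text.
cur.execute("INSERT INTO entry VALUES (?, ?)", ("L26462", 3002))
conn.commit()

cur.execute("SELECT accession, length FROM entry WHERE accession = ?",
            ("L26462",))
row = cur.fetchone()
print(row)

conn.close()
```

The connect / cursor / execute / fetch sequence is the part the DB API spec standardizes across database modules.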

This all just aside and FYI and all that...

-- V

From dalke at dalkescientific.com  Sat Jul 14 20:06:53 2001
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] selenocysteines
Message-ID: <003d01c10cc2$0e9f6380$c1c63ec1@josiah.ebi.ac.uk>

Hey all,

  Just noticed ftp://ftp.ncbi.nlm.nih.gov/genbank/gbrel.txt says:


>  Selenocysteine residues within the protein translations of coding
> region features have been represented in GenBank via the letter 'X'
> and a /transl_except qualifier. At the May 1999 DDBJ/EMBL/GenBank
> collaborative meeting, it was learned that IUPAC plans to adopt the
> letter 'U' for selenocysteine.

Any knowledge on whether that has occurred?

Also, I noticed the GenBank parser is using the generic DNA, RNA
and protein alphabets when it looks like it should use the IUPAC
versions.  Even if there are a few places where it fails, it should
be more useful than what there is now.  I'll go ahead and change
it but if there are complaints (Brad? You did that code) I'll change
it back.

                    Andrew


From idoerg at cc.huji.ac.il  Sun Jul 15 03:43:28 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] bug in SeqIO.FASTA?
Message-ID: <Pine.GSO.4.33_heb2.09.0107151038440.24332-100000@new-shum>

Hi,

Line 80 in SeqIO.FASTA:
Method write_records is defined as:


   def write_records(records):
     .
     .
     .

Needs to accept a list of SeqRecords, as far as I can tell. Seems to abort
though:

fasta_1.write_records(rec_list_1)
TypeError: write_records() takes exactly 1 argument (2 given)


Shouldn't the definition line be:
def write_records(self,records):

?????
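
The error can be reproduced outside Biopython with two toy classes (the class names are made up; newer Python versions word the TypeError slightly differently from the message above):

```python
class BrokenWriter:
    def write_records(records):  # 'self' is missing
        return len(records)


class FixedWriter:
    def write_records(self, records):
        return len(records)


error = None
try:
    # The instance is passed implicitly, so the method receives
    # two arguments although it only declares one.
    BrokenWriter().write_records(["rec1", "rec2"])
except TypeError as exc:
    error = str(exc)

print(error)
print(FixedWriter().write_records(["rec1", "rec2"]))
```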

Iddo


From dalke at dalkescientific.com  Sun Jul 15 07:26:49 2001
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] bug in SeqIO.FASTA?
Message-ID: <004601c10d21$09f641a0$c1c63ec1@josiah.ebi.ac.uk>

Iddo:
>Shouldn't the definition line be:
>def write_records(self,records):

Yep.  I updated CVS.

                    Andrew


From idoerg at cc.huji.ac.il  Sun Jul 15 10:04:34 2001
From: idoerg at cc.huji.ac.il (Iddo Friedberg)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] selenocysteines
In-Reply-To: <003d01c10cc2$0e9f6380$c1c63ec1@josiah.ebi.ac.uk>
Message-ID: <Pine.GSO.4.33_heb2.09.0107151652170.479-100000@new-shum>

On Sun, 15 Jul 2001, Andrew Dalke wrote:

: Hey all,
:
: Just noticed ftp://ftp.ncbi.nlm.nih.gov/genbank/gbrel.txt says:
:
:
: >Selenocysteine residues within the protein translations of coding
: > region features have been represented in GenBank via the letter 'X'
: > and a /transl_except qualifier. At the May 1999 DDBJ/EMBL/GenBank
: > collaborative meeting, it was learned that IUPAC plans to adopt the
: > letter 'U' for selenocysteine.
:
: Any knowledge on whether that has occurred?
:

Well, I looked at Swiss-Prot's FDHF_ECOLI, which contains selenocysteine
in position 140. (Just to make things even happier: SwissProt marks this
with a "C"). Moseying on to M13563, one of the 3 GenBank entries it points
to, I indeed found an 'X' in the position of the selenocysteine. Same thing
for the E.coli chromosome file (U0006) and for what appears to be a contig.

Guess the switch hasn't happened yet.

Iddo


From dalke at dalkescientific.com  Tue Jul 17 05:27:06 2001
From: dalke at dalkescientific.com (Andrew Dalke)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] selenocysteines
Message-ID: <00c801c10ea2$a57ce0c0$c1c63ec1@josiah.ebi.ac.uk>

Looks like U for selenocysteines is the right thing to do, so I
added that to the extended protein alphabet.  That also allowed
me to add 'X' as the symbol for unknown protein.

Anything which uses X for selenocysteines should translate it
to U on import.

SWISS-PROT is using 'C' + /translation comment for selenocysteines.
Since 'C' != 'X' it doesn't fall under that previous paragraph. :)
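
One way the import-time translation might look, as a rough sketch: it assumes the importer already knows (for instance from /transl_except) which 0-based positions are selenocysteines, so that any other 'X' keeps its "unknown" meaning. The function name is hypothetical.

```python
def mark_selenocysteines(protein, positions):
    """Replace 'X' with 'U' at the given 0-based positions only."""
    residues = list(protein)
    for pos in positions:
        if residues[pos] != "X":
            raise ValueError("expected 'X' at position %d, found %r"
                             % (pos, residues[pos]))
        residues[pos] = "U"
    return "".join(residues)


# Only the annotated position changes; the trailing X stays "unknown".
print(mark_selenocysteines("MAXKX", [2]))
```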

                    Andrew


From biopython-bugs at bioperl.org  Tue Jul 17 23:30:33 2001
From: biopython-bugs at bioperl.org (biopython-bugs@bioperl.org)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Notification: incoming/38
Message-ID: <200107180330.f6I3UXw20222@pw600a.bioperl.org>

JitterBug notification

new message incoming/38

Message summary for PR#38
	From: katel@worldpath.net
	Subject: skipping Martel fields
	Date: Tue, 17 Jul 2001 23:30:31 -0400
	0 replies 	0 followups

====> ORIGINAL MESSAGE FOLLOWS <====

>From katel@worldpath.net Tue Jul 17 23:30:31 2001
Received: from localhost (localhost [127.0.0.1])
	by pw600a.bioperl.org (8.11.2/8.11.2) with ESMTP id f6I3UVw20215
	for <biopython-bugs@pw600a.bioperl.org>; Tue, 17 Jul 2001 23:30:31 -0400
Date: Tue, 17 Jul 2001 23:30:31 -0400
Message-Id: <200107180330.f6I3UVw20215@pw600a.bioperl.org>
From: katel@worldpath.net
To: biopython-bugs@bioperl.org
Subject: skipping Martel fields

Full_Name: Katharine Lindner
Module: 
Version: 1.00a
OS: Windows 98
Submission from: (NULL) (207.3.148.253)


Set up a test case with the format at the end.  The parser skips the line
after a short codon.

Examples:
SEQTPA         92    83 act THR T
SEQTPA         93    84 cc
SEQTPA         94    85 gag GLU E
SEQTPA         95    86 ggc GLY G

Skips residue 94
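
For comparison, the expected behaviour can be written down with the standard re module. This is only an approximation of the Martel format below (the whitespace handling is an assumption, since blank_space and floop are not shown), but it parses all four example lines, including residue 94.

```python
import re

RESIDUE_LINE = re.compile(
    r"SEQTPA\s+(?P<residue_num>\d+)\s+(?P<kabat_num>\d+[A-Za-z]?)"
    r"(?:\s+(?P<codon>[gcat-]{1,3})"            # codon may be 1 to 3 letters
    r"(?:\s+(?P<amino3>[A-Z]{3})"               # optional 3-letter code
    r"\s+(?P<amino1>[ACDEFGHIKLMNPQRSTVWY-]))?)?\s*$"
)

LINES = [
    "SEQTPA         92    83 act THR T",
    "SEQTPA         93    84 cc",
    "SEQTPA         94    85 gag GLU E",
    "SEQTPA         95    86 ggc GLY G",
]

parsed = [RESIDUE_LINE.match(line).groupdict() for line in LINES]
for fields in parsed:
    print(fields["residue_num"], fields["codon"], fields["amino3"])
```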

Format:

alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
amino_1_letter_codes = 'ACDEFGHIKLMNPQRSTVWY-'

nucleotides = 'gcat-'

amino_alts = map( Martel.Str, amino_3_letter_codes )

codon = Martel.Group( "codon", Martel.MaxRepeat( Martel.Any( nucleotides ), 1, 3 ) )
amino_3_letter_code = Martel.Group( "amino_3_letter_code", \
    reduce( Martel.Alt, amino_alts ) )
amino_1_letter_code = Martel.Group( "amino_1_letter_code", \
    Martel.Any( amino_1_letter_codes ) )
amino_acid = Martel.Group( "amino_acid",
                           blank_space +
                           amino_3_letter_code +
                           blank_space +
                           amino_1_letter_code )
residue = Martel.Group( "residue",
                           blank_space +
                           codon +
                           Martel.Opt( amino_acid ) )


kabatid = Martel.Group("kabatid",
                    Martel.Rep1(Martel.Integer()))
pubmed_num = Martel.Group("pubmed_num",
                    Martel.Rep1(Martel.Integer()))
residue_num = Martel.Group("residue_num",
                    Martel.Rep1(Martel.Integer()))
kabat_num = Martel.Group("kabat_num",
                    Martel.Rep1(Martel.Integer()) +
                    Martel.Opt( Martel.Any( alpha ) ) )
id_line = Martel.Group("id_line",
                          Martel.Str("KADBID") +
                          blank_space +
                          kabatid +
                          Martel.ToEol() )
residue_line = Martel.Group( "residue_line",
                             Martel.Str( "SEQTPA" ) +
                             blank_space +
                             residue_num +
                             blank_space +
                             kabat_num +
                             Martel.Opt( residue ) +
                             floop )
kabat_record_end_line =      Martel.Group( "record_end",
                             Martel.Str( "RECEND" ) +
                             Martel.ToEol() +
                             Martel.ToEol() +
                             Martel.ToEol())
residue_lines = Martel.Group( "residue_lines", Martel.Rep( residue_line ) )
kabat_record = id_line + residue_lines + kabat_record_end_line


From katel at worldpath.net  Sat Jul 21 18:35:11 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] WIT database
Message-ID: <001701c11235$68b5dc00$010a0a0a@cadence.com>

   After some cleanup of old parsers, I plan to write a parser for the WIT
database.  It should be interesting because it is a database of metabolic
steps instead of sequences.

                                           Cayte


From tarjei at genome.wi.mit.edu  Sat Jul 21 16:14:36 2001
From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] WIT database
In-Reply-To: <001701c11235$68b5dc00$010a0a0a@cadence.com>
Message-ID: <NEBBJGOKPAACGBJLMGCDOEHACBAA.tarjei@genome.wi.mit.edu>

 Hi,

 I'm new to this project, but I'm currently in the process of
writing a BioPython parser for the KEGG database, which is very
similar to WIT. I'll soon be posting a module for parsing the
KEGG/Ligand database here, which I hope you'll integrate into
BioPython.

 I've also been toying with the idea of writing a parser for the
pathway section of KEGG, but I think that to maximize the usefulness
of such a module there should first be a set of objects for representing
reactions and pathways (a Bio.Pathway module). This would make it
possible to manipulate information from databases like KEGG and WIT
in a uniform manner - just like all sequence information is parsed into
a Bio.Seq object.
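
 To give the idea some shape, here is a very rough sketch of what such
objects might look like. Nothing like this exists yet; the class and
attribute names are placeholders, and the example data is just the first
two steps of glycolysis.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reaction:
    """One step: substrates are converted to products by an enzyme."""
    substrates: tuple
    products: tuple
    enzyme: str = ""


@dataclass
class Pathway:
    """A named, ordered collection of reactions."""
    name: str
    reactions: list = field(default_factory=list)

    def compounds(self):
        """Every compound that appears anywhere in the pathway."""
        found = set()
        for reaction in self.reactions:
            found.update(reaction.substrates)
            found.update(reaction.products)
        return found


glycolysis = Pathway("glycolysis (first two steps)", [
    Reaction(("glucose", "ATP"), ("glucose 6-phosphate", "ADP"),
             enzyme="EC 2.7.1.1"),
    Reaction(("glucose 6-phosphate",), ("fructose 6-phosphate",),
             enzyme="EC 5.3.1.9"),
])
print(sorted(glycolysis.compounds()))
```

 A KEGG parser and a WIT parser could then both fill in the same two
classes.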

 If there is interest I'd be happy to collaborate, or even take the
lead, in developing a Bio.Pathway module. As Cayte says below,
this is an interesting challenge because, as far as I know, this is
not a functionality that is currently present in any of the other bio*
projects.


 Thanks,

 Tarjei Mikkelsen
 tarjei@genome.wi.mit.edu

> -----Original Message-----
> From: biopython-dev-admin@biopython.org
> [mailto:biopython-dev-admin@biopython.org]On Behalf Of Cayte
> Sent: Saturday, July 21, 2001 6:35 PM
> To: biopython-dev@biopython.org
> Subject: [Biopython-dev] WIT database
>
>
>    After some cleanup of old parsers, I plan to write a parser for the WIT
> database.  It should be interesting because it is a database of metabolic
> steps instead of sequences.
>
>                                            Cayte


From katel at worldpath.net  Sat Jul 21 20:23:41 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] WIT database
References: <NEBBJGOKPAACGBJLMGCDOEHACBAA.tarjei@genome.wi.mit.edu>
Message-ID: <002501c11244$90bd84a0$010a0a0a@cadence.com>

>
>  I've also been toying with the idea of writing a parser for the
> pathway section of KEGG, but I think that to maximize the usefulness
> of such a module there should first be a set of objects for representing
> reactions and pathways (a Bio.Pathway module). This would make it
> possible to manipulate information from databases like KEGG and WIT
> in a uniform manner - just like all sequence information is parsed into
> a Bio.Seq object.
>
   I think it's a great idea!  My first step is usually just to poke around
the database and get ideas.  Hopefully, other biopythons will be interested
and after some brainstorming the ideas will converge to some reasonable
Bio.Pathways objects.

                                                    Cayte


From tarjei at genome.wi.mit.edu  Mon Jul 23 02:36:42 2001
From: tarjei at genome.wi.mit.edu (Tarjei S Mikkelsen)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Bio.KEGG
Message-ID: <NEBBJGOKPAACGBJLMGCDEEHECBAA.tarjei@genome.wi.mit.edu>


 Hi,

 as I mentioned earlier this weekend I have begun writing a BioPython
compatible parser for the KEGG database that I would be happy to contribute
to the project. Currently only the Ligand/Enzyme section is supported, 
but the Ligand/Compound and the Genes section should follow soon.

 Since I don't have CVS access I've attached the code here. It is organized
to plug directly into the current BioPython distribution. The source is 
heavily inspired by the Bio.GenBank module (which makes sense since KEGG 
has (unfortunately) modeled its flatfile format on the GenBank format).

 thanks,

 Tarjei Mikkelsen
 tarjei@genome.wi.mit.edu

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bio.kegg.v0.1a.tar.gz
Type: application/x-gzip
Size: 14261 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20010723/b784991b/bio.kegg.v0.1a.tar.bin
From marchign at di.unipi.it  Mon Jul 23 09:08:18 2001
From: marchign at di.unipi.it (Davide Marchignoli)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] "Features" of Bio.Clustalw
Message-ID: <Pine.LNX.4.31.0107231218580.949-100000@mercurio.localdomain>

I wanted to point out some (minor) problems I had using Bio.Clustalw.

In an alignment run clustalw generates two files:
- the output file (with default extension aln), and
- the tree file (with default extension dnd)
MultipleAlignCL lets you change the output file with the set_output
method, but there is no way to change the tree file.

Is it possible to add a method set_tree that adds the -newtree option and
the -align option to the command line? (At the moment I am subclassing
MultipleAlignCL.)

The do_alignment function executes
  print "executing %s..." % command_line
and I think not everybody wants it printed (I do not). Yes, one could
temporarily redirect sys.stdout, but why?

Even if one sets the alignment type to PROTEIN (via set_type method of
MultipleAlignCL) the resulting alignment has DNA alphabet.
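
To illustrate the first two points, here is a stand-alone sketch, not a patch against Bio.Clustalw: the function names and file names are made up. It builds the command line with an optional tree file and reports it through the logging module, so callers decide whether the message is shown.

```python
import logging

log = logging.getLogger("clustalw_sketch")


def build_clustalw_command(input_file, output_file=None, tree_file=None,
                           seq_type=None, program="clustalw"):
    """Return the clustalw argument list; optional parts are only
    added when the caller asked for them."""
    args = [program, "-infile=" + input_file, "-align"]
    if output_file:
        args.append("-outfile=" + output_file)
    if tree_file:
        args.append("-newtree=" + tree_file)
    if seq_type:
        args.append("-type=" + seq_type)
    return args


def announce(command):
    # Silent by default; shown only if the caller enables INFO logging.
    log.info("executing %s...", " ".join(command))


cmd = build_clustalw_command("input.fasta", output_file="test.aln",
                             tree_file="test.dnd", seq_type="PROTEIN")
announce(cmd)
print(" ".join(cmd))
```

The command is only built here, not run, so the sketch works without clustalw installed.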

Thanks for your attention,

				Davide Marchignoli

PS: Is it possible to submit patches to biopython? How can I contribute?


From thomas at genome.cbs.dtu.dk  Tue Jul 24 09:09:04 2001
From: thomas at genome.cbs.dtu.dk (Thomas Sicheritz-Ponten)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] New: sequtils.py
Message-ID: <Pine.SGI.3.95.1010724145841.68983162A-100000@genome.cbs.dtu.dk>

Hej All,

After the Biopython BoF meeting at ISMB01 in Copenhagen we decided to
temporarily collect sequence utilities/functions in Bio/sequtils.py
Cessie (our new biopython member) and I started by collecting some functions 
(some of them are just aliases to existing - but deeply hidden functions).

Currently included:
	  ProteinX, makeTableX for error free translation of ambiguous DNA
	  complement, reverse, antiparallel and translate
	  nice six_frame_translations ala DNA Strider/XBBtools
	  GC, GC123, GC_skew, Accumulated_GC_skew
	  fasta_uniqids for getting unique identifiers in the FASTA file (useful for clustalw)
	  quick_FASTA_reader for reading huge FASTA files (e.g. genomes)
	  apply_on_multi_fasta: use any function (e.g. GC) and apply it on all entries in a multiple FASTA file
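
To make the GC entries above concrete, here is a minimal sketch of a GC
percentage and a windowed GC skew, (G - C) / (G + C). It is an assumption
about what such helpers compute, not the code in Bio/sequtils.py; the names
gc_content and gc_skew and the default window size are illustrative only.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in seq (0.0 if seq is empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq, window=100):
    """Return (G - C) / (G + C) for each consecutive window of seq."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # A window with no G or C has an undefined skew; report 0.0 here.
        values.append((g - c) / (g + c) if g + c else 0.0)
    return values
```

Ambiguous bases are simply ignored in this sketch; a real implementation
would have to decide how to count them.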
	  
Questions: 
1) should we move ProteinX and makeTableX somewhere else?
2) we included a quick_FASTA_reader hack; the FASTA parser is cool and nice,
   but because of all its checking it takes ages for e.g. a complete genome.
   Should we create a faster alternative (compatible with the normal one)?
3) some functions exist in utils.py. Could we move sequence-based functions
   to sequtils.py and use utils.py for other non-sequence-based functions?
   (e.g. I'd like to put my hyper-geometric distribution code there for expression data)
4) anyone got a hangover from yesterday's banquet?

cheers
-thomas

Sicheritz-Ponten Thomas, Ph.D  CBS, Department of Biotechnology
thomas@biopython.org           The Technical University of Denmark
CBS:  +45 45 252489            Building 208, DK-2800 Lyngby
Fax   +45 45 931585            http://www.cbs.dtu.dk/thomas

	De Chelonian Mobile ... The Turtle Moves ...


From jchang at SMI.Stanford.EDU  Wed Jul 25 10:17:53 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Bio.KEGG
In-Reply-To: <NEBBJGOKPAACGBJLMGCDEEHECBAA.tarjei@genome.wi.mit.edu>
Message-ID: <Pine.GSO.4.31.0107250717060.8865-100000@taiyang>

Hi Tarjei,

Thanks very much.  I'm in Copenhagen right now, so I'll take a look at it
next week after I get back.

Jeff


On Mon, 23 Jul 2001, Tarjei S Mikkelsen wrote:

>
>  Hi,
>
>  as I mentioned earlier this weekend I have begun writing a BioPython
> compatible parser for the KEGG database that I would be happy to contribute
> to the project. Currently only the Ligand/Enzyme section is supported,
> but the Ligand/Compound and the Genes section should follow soon.
>
>  Since I don't have CVS access I've attached the code here. It is organized
> to plug directly into the current BioPython distribution. The source is
> heavily inspired by the Bio.GenBank module (which makes sense since KEGG
> has (unfortunately) modeled its flatfile format on the GenBank format).
>
>  thanks,
>
>  Tarjei Mikkelsen
>  tarjei@genome.wi.mit.edu
>
>


From jchang at SMI.Stanford.EDU  Sat Jul 28 12:40:55 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] "Features" of Bio.Clustalw
In-Reply-To: <Pine.LNX.4.31.0107231218580.949-100000@mercurio.localdomain>
References: <Pine.LNX.4.31.0107231218580.949-100000@mercurio.localdomain>
Message-ID: <p05101002b7889b45c46c@[192.168.0.4]>

Hi Davide,

Thanks for the comments.  We're all getting back from ISMB and 
catching up on email, so the proper developer will comment soon.

>PS: Is it possible to submit patches to biopython? How can I contribute?

Yes!  We accept patches and new modules.  Patches are preferably done 
against a current CVS repository.

Jeff

From chapmanb at arches.uga.edu  Sat Jul 28 15:03:10 2001
From: chapmanb at arches.uga.edu (Brad Chapman)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] "Features" of Bio.Clustalw
In-Reply-To: <Pine.LNX.4.31.0107231218580.949-100000@mercurio.localdomain>
References: <Pine.LNX.4.31.0107231218580.949-100000@mercurio.localdomain>
Message-ID: <15203.3182.97701.271322@taxus.athen1.ga.home.com>

Hi Davide;
> I wanted to point out some (minor) problem I had using Bio.Clustalw.

Thanks for sending the comments!

Jeff:
> Thanks for the comments.  We're all getting back from ISMB and 
> catching up on email, so the proper developer will comment soon.

I think I'm the "proper developer" since I wrote the Clustalw stuff, 
although this is the first time I've had someone refer to me like that
:-)
 
> In an alignment run clustalw generates two files:
> - the output file (with default extension aln), and
> - the tree file (with default extension dnd)
> MultipleAlignCL lets you change the output file with the set_output
> method, but there is no way to change the tree file.

Yup, this is bad. Thanks for catching it!
 
> Is it possible to add a method set_tree that adds the -newtree option and
> the -align option to the command line?

Definitely -- this sounds exactly right.
 
> The do_alignment function executes
>   print "executing %s..." % command_line
> and I think not everybody wants it printed (I do not). Yes, one could
> temporarily redirect sys.stdout, but why?

Oops, another good catch. It is definitely bad for libraries to print
things. 
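
One way a library can stay quiet by default, sketched in present-day Python
with the standard logging module. The logger name and the body of
do_alignment here are assumptions for illustration, not the actual fix
applied to Bio.Clustalw.

```python
import logging

# Hypothetical logger name; a real module would normally use __name__.
logger = logging.getLogger("clustalw_sketch")


def do_alignment(command_line):
    """Stand-in: report the command through logging instead of print."""
    logger.debug("executing %s...", command_line)
    return str(command_line)
```

Nothing is shown unless the caller opts in, for example with
logging.basicConfig(level=logging.DEBUG).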
 
> Even if one sets the alignment type to PROTEIN (via set_type method of
> MultipleAlignCL) the resulting alignment has DNA alphabet.

Bleah. You caught all of my bugs :-). Bad me!

> PS: Is it possible to submit patches to biopython? How can I contribute?

Definitely. We loooooove contributions. If you have fixes for the
above problems, the best way to submit the patch is to send a context
or unified diff ('diff -c' or 'diff -u') to this list, and I'll be
happy to apply it and make sure things are worked out. You can check
out the CVS tree anonymously (instructions for this are at
http://cvs.biopython.org/ ), and submitting patches against this would
be extra-great.

Thanks for your feedback on this. If you can send patches for whatever
you can fix, I'll take care of the rest of the bugs, and beef up the
test suite to catch them. 

Thanks again!
Brad


From katel at worldpath.net  Mon Jul 30 19:10:37 2001
From: katel at worldpath.net (Cayte)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Pathway Module
Message-ID: <002101c1194c$d997d5e0$010a0a0a@cadence.com>

  I'm posting some sketchy ideas about what the Pathway modules may contain,
mostly to get the discussion going.  I used arrays because most of the
relations are many to many.    Maybe reaction should have a pathway array?
Step is separate from reaction, because a reaction could occur in more than
one pathway. These classes are scaffolding, so please don't hesitate  to
suggest better ideas.

There may be other information associated with a reaction, like temperature,
but I haven't come across it yet in the WIT or EMP databases.  It looks like
a lot of the meat is in the enzyme structures.  In EMP, the structure of an
Enzyme looks straightforward because it is a flat two-column file: the
first column holds the name of a field and the second its contents.

                                                                       Cayte


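class Reaction:
    def __init__(self):
        self.substrates = []
        self.products = []
        self.enzymes = []
        self.factors = []

class PathStep:
    def __init__(self):
        # one Reaction may be shared by steps in several pathways
        self.reaction = None
        self.inlinks = []
        self.outlinks = []

class Pathway:
    def __init__(self):
        self.name = ''
        self.organisms = []
        self.steps = []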


From davide at biodec.com  Mon Jul 30 19:22:00 2001
From: davide at biodec.com (Davide Marchignoli)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] "Features" of Bio.Clustalw
In-Reply-To: <15203.3182.97701.271322@taxus.athen1.ga.home.com>
Message-ID: <Pine.LNX.4.31.0107310101210.7704-100000@mercurio.localdomain>


On Sat, 28 Jul 2001, Brad Chapman wrote:

> Hi Davide;
>
> Thanks for your feedback on this. If you can send patches for whatever
> you can fix, I'll take care of the rest of the bugs, and beef up the
> test suite to catch them.
>
> Thanks again!
> Brad
>

Hi Brad,

Here are the patches I was able to cook up. These are only minor
changes, but I hope they help.

I think that having a class like MultipleAlignCL is superior to passing
the alignment arguments to a function, as is done for blastpgp or blastall.

First, it allows you to see which command will be executed (which can be
interesting for debugging purposes) simply by applying str to the
object.

It also lets you build a set of parameters to be used on different
files.

Finally it is a general mechanism and could be used to give a uniform
interface to functions invoking external programs.
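
The three points above can be sketched with a small generic class. This is a
hypothetical illustration, not the MultipleAlignCL code: the class name
CommandLine, the set_option method and the -NAME=value option style are all
assumptions.

```python
class CommandLine:
    """Hypothetical generic builder for an external program's command line."""

    def __init__(self, command):
        self.command = command
        self.input_file = None
        self.options = {}  # option name -> value, kept in insertion order

    def set_option(self, name, value):
        self.options[name] = value

    def __str__(self):
        # str() shows exactly what would be executed, which helps debugging.
        parts = [self.command]
        if self.input_file:
            parts.append(self.input_file)
        for name, value in self.options.items():
            parts.append("-%s=%s" % (name, value))
        return " ".join(parts)


cline = CommandLine("clustalw")
cline.set_option("TYPE", "PROTEIN")
commands = []
for name in ("a.fasta", "b.fasta"):
    cline.input_file = name  # same parameters, different input file
    commands.append(str(cline))
```

The same object could describe any external program, since only the command
name and the option table change.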

Do you think you would be interested in a patch implementing such
behaviour? I think one could also retain compatibility with the current
interface.

Thanks again,

				Davide Marchignoli


From jchang at SMI.Stanford.EDU  Mon Jul 30 19:34:28 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] New: sequtils.py
In-Reply-To: 
 <Pine.SGI.3.95.1010724145841.68983162A-100000@genome.cbs.dtu.dk>
References: 
 <Pine.SGI.3.95.1010724145841.68983162A-100000@genome.cbs.dtu.dk>
Message-ID: <p05101003b78b9e44e56c@[171.65.33.250]>

Hi Thomas,

>2) we included a quick_fasta_reader hack, the FASTA parser is cool and nice
>    but because of all checkings it takes ages for e.g. a complete genome
>    Should we create a faster alternative ? (compatible with the normal one)

Yes!  The current Fasta readers appear to be implemented in Python 
and might even have some overhead in parsing the description lines.  
We should update our FASTA handling to 1) run faster and 2) allow 
optional in-depth parsing of description lines. 
Perhaps it's time to redo it in Martel.
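
As a rough idea of what a validation-free reader could look like, here is a
minimal sketch in present-day Python. It is not the quick_FASTA_reader from
sequtils.py or a Martel-based parser; the name fast_fasta_records is
hypothetical, and the description line is returned unparsed so that
in-depth parsing stays optional.

```python
import io


def fast_fasta_records(handle):
    """Yield (title, sequence) pairs from a FASTA handle with no validation."""
    title, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif title is not None:
            chunks.append(line.strip())
    # Emit the final record, if the file contained any.
    if title is not None:
        yield title, "".join(chunks)


demo = io.StringIO(">seq1 demo\nACGT\nAC\n>seq2\nGG\n")
records = list(fast_fasta_records(demo))
```

Because it is a generator, only one record is held in memory at a time,
which matters for genome-sized files.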


>3) some functions exist in utils.py. Could we move sequence-based functions
>    to sequtils.py and use utils.py for other non-sequence-based functions?
>    (e.g. I'd like to put my hyper-geometric distribution code there 
>for expression data)

Sure.  However, I'm not sure hyper-geometric distributions should go 
into utils.py.  Perhaps we should start a new package to handle 
statistics and probability-type stuff.
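
For context, the kind of function such a statistics package might hold can be
sketched as below. This is the textbook hypergeometric distribution written
in present-day Python (math.comb needs Python 3.8 or later), not Thomas's
code; the function and parameter names are assumptions.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k) successes when drawing `draws` items without replacement."""
    lowest = max(0, draws - (population - successes))
    highest = min(draws, successes)
    if k < lowest or k > highest:
        return 0.0
    ways = comb(successes, k) * comb(population - successes, draws - k)
    return ways / comb(population, draws)


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k), e.g. the chance of at least k annotated genes in a cluster."""
    top = min(draws, successes)
    return sum(hypergeom_pmf(i, population, successes, draws)
               for i in range(k, top + 1))
```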


>4) anyone got a hangover from yesterday's banquet?

Looks like it...  :)

Jeff

From davide at biodec.com  Mon Jul 30 19:45:40 2001
From: davide at biodec.com (Davide Marchignoli)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] "Features" of Bio.Clustalw (fwd)
Message-ID: <Pine.LNX.4.31.0107310143490.7893-200000@mercurio.localdomain>

Oops, I forgot the patch.


Bye,
				Davide Marchignoli

-------------- next part --------------
diff -Nacr /usr/src/biopython-300701/Bio/Blast/Record.py Bio/Blast/Record.py
*** /usr/src/biopython-300701/Bio/Blast/Record.py	Mon Jul 30 23:14:13 2001
--- Bio/Blast/Record.py	Tue Jul 31 00:22:22 2001
***************
*** 291,294 ****
          DatabaseReport.__init__(self)
          Parameters.__init__(self)
          self.rounds = []
!         converged = 0
--- 291,294 ----
          DatabaseReport.__init__(self)
          Parameters.__init__(self)
          self.rounds = []
!         self.converged = 0
diff -Nacr /usr/src/biopython-300701/Bio/Clustalw/__init__.py Bio/Clustalw/__init__.py
*** /usr/src/biopython-300701/Bio/Clustalw/__init__.py	Mon Jul 30 23:14:13 2001
--- Bio/Clustalw/__init__.py	Tue Jul 31 00:14:24 2001
***************
*** 62,79 ****
  
      return align_handler.align
  
! def do_alignment(command_line):
      """Perform an alignment with the given command line.
  
      Arguments:
      o command_line - A command line object that can give out
      the command line we will input into clustalw.
      
      Returns:
      o A clustal alignment object corresponding to the created alignment.
      If the alignment type was not a clustal object, None is returned.
      """
-     print "executing %s..." % command_line
      run_clust = os.popen(str(command_line))
      value = run_clust.close()
  
--- 62,81 ----
  
      return align_handler.align
  
! def do_alignment(command_line, alphabet=None):
      """Perform an alignment with the given command line.
  
      Arguments:
      o command_line - A command line object that can give out
      the command line we will input into clustalw.
+     o alphabet - the alphabet to use in the created alignment. If not
+     specified IUPAC.unambiguous_dna and IUPAC.protein will be used for
+     dna and protein alignment respectively.
      
      Returns:
      o A clustal alignment object corresponding to the created alignment.
      If the alignment type was not a clustal object, None is returned.
      """
      run_clust = os.popen(str(command_line))
      value = run_clust.close()
  
***************
*** 107,113 ****
          return None
      # otherwise parse it into a ClustalAlignment object
      else:
!         return parse_file(out_file)
  
  
  class ClustalAlignment(Alignment):
--- 109,118 ----
          return None
      # otherwise parse it into a ClustalAlignment object
      else:
!         if not alphabet:
!             alphabet = (IUPAC.unambiguous_dna, IUPAC.protein)[
!                 command_line.type == 'PROTEIN']
!         return parse_file(out_file, alphabet)
  
  
  class ClustalAlignment(Alignment):
***************
*** 361,366 ****
--- 366,372 ----
  
          # 2. a guide tree to use
          self.guide_tree = None
+         self.new_tree = None
  
          # 3. matrices
          self.protein_matrix = None
***************
*** 369,375 ****
          # 4. type of residues
          self.type = None
  
!     def __repr__(self):
          """Write out the command line as a string."""
          cline = self.command + " " + self.sequence_file
  
--- 375,381 ----
          # 4. type of residues
          self.type = None
  
!     def __str__(self):
          """Write out the command line as a string."""
          cline = self.command + " " + self.sequence_file
  
***************
*** 392,397 ****
--- 398,406 ----
              cline = cline + " -CASE=" + self.change_case
          if self.add_seqnos:
              cline = cline + " -SEQNOS=" + self.add_seqnos
+         if self.new_tree:
+             # clustal does not work if -align is written -ALIGN
+             cline = cline + " -NEWTREE=" + self.new_tree + " -align"
  
          # multiple alignment options
          if self.guide_tree:
***************
*** 478,483 ****
--- 487,497 ----
          else:
              self.guide_tree = tree_file
  
+     def set_new_guide_tree(self, tree_file):
+         """Set the name of the guide tree file generated in the alignment.
+         """
+         self.new_tree = tree_file
+         
      def set_protein_matrix(self, protein_matrix):
          """Set the type of protein matrix to use.
  
From jchang at SMI.Stanford.EDU  Tue Jul 31 12:56:04 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] Re: [BioPython] tests failing
In-Reply-To: <3B66BE5E.7020204@herts.ac.uk>
References: <3B66BE5E.7020204@herts.ac.uk>
Message-ID: <p05101004b78c9282b250@[171.65.33.250]>

Hey Mark,

Thanks for letting us know about these.  I'm moving this thread onto 
the "biopython-dev" list, as it's probably more appropriate there.

>Failure: test_SubsMat
>
>AssertionError:
>output: 'M0.00 0.40 0.70 0.80 1.00\n'
>Expected: 'M -0.00 0.40 0.70 0.80 1.00\n'

It looks like this is from a difference in how Windows and Iddo's OS 
handle zeros.  It's probably not serious, but should be fixed.  Iddo, 
can you write some code that will check for this?
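
One possible way to make the sign of zero platform-independent is to
normalize it after formatting. This is a sketch in present-day Python, not
the SubsMat code; format_score is a hypothetical helper, and it deals only
with the sign, not with the column spacing seen in the failing output.

```python
def format_score(value, places=2):
    """Format value with fixed decimals, never printing a negative zero."""
    text = "%.*f" % (places, value)
    # Values such as -0.0 or -0.001 round to "-0.00"; drop the stray sign.
    if float(text) == 0.0:
        text = text.lstrip("-")
    return text
```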


>Error: test_gobase
>
>from Bio import Sequence
>ImportError: cannot import name Sequence
>
>Error: test_rebase
>
>from Bio import Sequence
>ImportError: cannot import name Sequence

These seem to be from some legacy code that hasn't been cleaned up. 
It's now fixed in the CVS and will be incorporated into the next 
release.


>Failure: test_prodoc
>
>AssertionError:
>Output: 'J. \n'
>Expected: 'J. \n'

Brad, this looks pretty odd.  Is it a newline problem?

Jeff

From jchang at SMI.Stanford.EDU  Tue Jul 31 13:12:55 2001
From: jchang at SMI.Stanford.EDU (Jeffrey Chang)
Date: Sat Mar  5 14:43:01 2005
Subject: [Biopython-dev] biopython-dev BOSC BoF redux
Message-ID: <p05101007b78c976cd90d@[171.65.33.250]>

Hello everybody,

Here is a summary of the Developer's BoF at BOSC.  It'll provide a
good roadmap of things that need to be done in the next few months.

Jeff


Andrew's work at EBI

- Andrew met bioperl and biojava developers
- Andrew wrote code to connect biopython objects to Ewan's bioperl-db


Documentation

- Brad needs some feedback, what's important to work on?
- we need documentation for Martel
- manual and tutorial should be separate
- A cookbook with mini blocks of working code is a good idea.
   Brad will investigate
- general dissatisfaction with Wiki.  Organization problems?


Reorganizing modules

Currently, modules are organized:
   Bio/
     [databases]
     Data/
     Tools/
       Classification/
       Clustering/
       Transcribe.py
       Translate.py
       listfns.py
       mathfns.py
       stringfns.py
       MultiProc/
     Seq.py
     SeqFeature.py
     utils.py
   Martel/

Discussion points:
- want a broad tree rather than deep tree -- stuff easier to find
- packages for end users should be on top, rather than buried deep inside
- should try to keep stuff in Bio namespace to avoid clashes with
other packages (Martel is an exception)
- directories within Tools/ should be moved up one level
- should add sequtils.py to include general sequence calculations


Johann's stuff
- has a DAS server, almost done
- has code to work with controlled vocabularies on expression data
- needs to get approval to release to biopython


Integrating current parsing API with Martel

We considered two alternatives for building the parsing API.  Both are
compatible with Martel:
Current API:
   parser = XXXParser()
   obj = parser.parse(handle)
Alternative API:
   p = format.make_parser()
   p.setContentHandler(obj)
   p.parseFile(handle)
   return obj.built

We agreed to map Martel into the current API, because it is simpler
and less tied to the Martel/SAX way of doing things.
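
To illustrate the mapping, here is a sketch that hides a SAX-style parser
behind the simpler parse(handle) call. It uses the standard library's xml.sax
on a small XML document as a stand-in for Martel, so the method names follow
xml.sax (parse rather than parseFile); RecordParser, _RecordBuilder and the
"name" element are hypothetical.

```python
import io
import xml.sax


class _RecordBuilder(xml.sax.ContentHandler):
    """Content handler that collects the text of every <name> element."""

    def __init__(self):
        super().__init__()
        self.built = []
        self._text = None

    def startElement(self, name, attrs):
        if name == "name":
            self._text = []

    def characters(self, content):
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "name":
            self.built.append("".join(self._text))
            self._text = None


class RecordParser:
    """Current-style API: the SAX plumbing stays inside parse()."""

    def parse(self, handle):
        builder = _RecordBuilder()
        sax_parser = xml.sax.make_parser()
        sax_parser.setContentHandler(builder)
        sax_parser.parse(handle)
        return builder.built


data = io.BytesIO(b"<records><name>alpha</name><name>beta</name></records>")
names = RecordParser().parse(data)
```

Callers only see RecordParser().parse(handle), so the event-driven machinery
can change without affecting them.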


Biopython.com

Technically, we talked about this in the general BoF, but I'm
including it here because of its importance.  Andrew had some
proposals for what to do with the biopython.com domain name.  We could
1) allow him to use it for his consulting company, or 2) reserve it
for companies that provide support and/or services for biopython.
We agreed that the second would be the better choice, and
trust him to maintain the domain for those purposes.