From bugzilla-daemon at portal.open-bio.org  Sat Mar  4 16:17:25 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar  4 16:20:26 2006
Subject: [Biopython-dev] [Bug 1933] Iterator support for Standalone XML
	blast output with multiple querys
Message-ID: <200603042117.k24LHPWQ012440@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1933


------- Comment #8 from mdehoon@ims.u-tokyo.ac.jp  2006-03-04 16:17 -------
> I'm wondering if it would be better to create a separate "XML Blast Iterator"
> to live in the NCBIXML.py file, rather than enhance the existing iterator in
> NCBIStandalone.py
> 
> Comments?

Since the proposed patch needs to add only one line to the existing iterator in
NCBIStandalone.py, I think this solution is better than adding a separate XML
Blast Iterator.
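
As a rough illustration of what iterating over multi-query XML output involves,
here is a minimal sketch. It assumes (this is not stated in the bug report)
that the multi-query output is a concatenation of complete XML documents, each
starting with an "<?xml" declaration line. The function name
iter_xml_documents is hypothetical and is not part of NCBIStandalone.py or
NCBIXML.py; the code is written for current Python, not the Python of 2006.

```python
from io import StringIO


def iter_xml_documents(handle):
    """Yield each XML document from a handle of concatenated documents.

    Assumption: every document starts with a line beginning "<?xml".
    """
    current = []
    for line in handle:
        # A new XML declaration marks the start of the next document.
        if line.startswith("<?xml") and current:
            yield "".join(current)
            current = []
        current.append(line)
    # Emit the last document, if any.
    if current:
        yield "".join(current)


if __name__ == "__main__":
    text = '<?xml version="1.0"?>\n<a/>\n<?xml version="1.0"?>\n<b/>\n'
    for number, document in enumerate(iter_xml_documents(StringIO(text)), 1):
        print(number, len(document))
```

Each yielded string could then be handed to an XML parser one query at a time.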


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org  Tue Mar  7 15:00:43 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Tue Mar  7 15:20:24 2006
Subject: [Biopython-dev] [Bug 1968] GenBank parsing fails if REFERNCE
	(bases.. )line is split
Message-ID: <200603072000.k27K0hq6007815@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1968


------- Comment #1 from kael.fischer@gmail.com  2006-03-07 15:00 -------
Note that the snippet in my patch breaks some other REFERENCE entries unless it
is moved down several lines, to right after:

consumer.reference_num(data[:data.find(' ')])

and before:
consumer.reference_bases(data[data.find(' ')+1:])


From bugzilla-daemon at portal.open-bio.org  Tue Mar  7 14:36:26 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Tue Mar  7 15:20:42 2006
Subject: [Biopython-dev] [Bug 1968] New: GenBank parsing fails if REFERNCE
	(bases.. )line is split
Message-ID: <200603071936.k27JaQeH007228@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1968

           Summary: GenBank parsing fails if REFERNCE (bases.. )line is
                    split
           Product: Biopython
           Version: Not Applicable
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev@biopython.org
        ReportedBy: kael.fischer@gmail.com


Using Bio/GenBank/__init__.py from CVS HEAD, parsing J01917.1 (GI:209811)
fails.

The file pointer is at the end of the record and the traceback is:

Traceback (most recent call last):
  File "gb2fasta.py", line 21, in ?
    for gbr in GenBank.Iterator(f,parser=gbParser):
  File "/usr/local/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 146, in next
    return self._parser.parse(File.StringHandle(data))
  File "/usr/local/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 212, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/local/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1518, in feed
    line = self._feed_header(handle, consumer)
  File "/usr/local/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1386, in _feed_header
    assert line[0:GENBANK_INDENT] <> GENBANK_SPACER, \
AssertionError: Unexpected continuation of an entry:
            28259)


The _feed_header method does not deal with REFERENCE ... (bases ....) being
split across lines.

This diff fixes it (the submission form word-wrapped it):
***************
*** 1425,1431 ****
--- 1425,1437 ----
                  #Need to call consumer.reference_num() and consumer.reference_bases()
                  #e.g.
                  # REFERENCE   1  (bases 1 to 86436)
+ 
                  data = data.strip()
+ 
+                 #check for closing pren
+                 while data.find(')') == -1:
+                     data=data+handle.readline().strip()
+                 
                  while data.find('  ')<>-1:
                      data = data.replace('  ',' ')
                  if data.find(' ')==-1 :
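
For readers who want to try the idea outside the parser, here is a standalone
sketch of the approach in the diff above. It is not the patch itself: the
function name read_reference_data is hypothetical, and the sketch makes three
changes that are assumptions of this example. It only keeps reading when an
opening parenthesis has been seen (so a REFERENCE line with no "(bases ...)"
part does not swallow the next line), it stops at end of file, and it joins
the pieces with a space.

```python
from io import StringIO


def read_reference_data(data, handle):
    """Return the REFERENCE data with a split "(bases ...)" part rejoined.

    data   -- text after the REFERENCE keyword on the first line
    handle -- file-like object positioned at the following line
    """
    data = data.strip()
    # Keep reading while a parenthesis is open but not yet closed.
    while "(" in data and ")" not in data:
        line = handle.readline()
        if not line:
            # End of file: give up rather than loop forever.
            break
        data = data + " " + line.strip()
    # Collapse any runs of whitespace into single spaces.
    return " ".join(data.split())


if __name__ == "__main__":
    first = "42 (bases 1517 to 1696; 3932 to 4112; 17880 to 17975; 21142 to"
    rest = StringIO("            28259)\n  AUTHORS   Fraser,N.W.\n")
    data = read_reference_data(first, rest)
    number, bases = data.split(" ", 1)
    print(number)
    print(bases)
```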


From bugzilla-daemon at portal.open-bio.org  Wed Mar  8 05:48:44 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Wed Mar  8 06:20:58 2006
Subject: [Biopython-dev] [Bug 1968] GenBank parsing fails if REFERNCE
	(bases.. )line is split
Message-ID: <200603081048.k28Amid3025128@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1968


biopython-bugzilla@maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED


------- Comment #2 from biopython-bugzilla@maubp.freeserve.co.uk  2006-03-08 05:48 -------
This GenBank file?

http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=209811

..
REFERENCE   42 (bases 1517 to 1696; 3932 to 4112; 17880 to 17975; 21142 to
            28259)
  AUTHORS   Fraser,N.W., Baker,C.C., Moore,M.A. and Ziff,E.B.
  TITLE     Poly(A) sites of adenovirus serotype 2 transcription units
  JOURNAL   J. Mol. Biol. 155 (3), 207-233 (1982)
   PUBMED   6176714
REFERENCE   43 (bases 7869 to 8420)
..

You said in comment 1 that your original fix caused problems in other
references unless the snippet of code was moved slightly.  Was that with the
same GenBank file or a different one, and if different, which one?

Thank you


From bugzilla-daemon at portal.open-bio.org  Wed Mar  8 12:10:07 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Wed Mar  8 12:20:33 2006
Subject: [Biopython-dev] [Bug 1968] GenBank parsing fails if REFERNCE
	(bases.. )line is split
Message-ID: <200603081710.k28HA7Bl030345@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1968


------- Comment #3 from kael.fischer@gmail.com  2006-03-08 12:10 -------
Created an attachment (id=296)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=296&action=view)
proposed patch handles splits like: REFERENCE (base ..... \n ... )

Tested against all viral sequences in GenBank release 152.  No parsing failures.


From bugzilla-daemon at portal.open-bio.org  Thu Mar  9 10:22:05 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Thu Mar  9 11:21:13 2006
Subject: [Biopython-dev] [Bug 1968] GenBank parsing fails if REFERNCE
	(bases.. )line is split
Message-ID: <200603091522.k29FM5rR014209@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1968


biopython-bugzilla@maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


------- Comment #4 from biopython-bugzilla@maubp.freeserve.co.uk  2006-03-09 10:22 -------
I have checked in my own fix (I didn't want to depend on bracket/parentheses
counting) and tested J01917.1 GI:209811 and worse versions where I edited the
file to split the reference onto many lines.
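
For illustration only, one way to avoid counting parentheses is to rely on the
GenBank layout, where continuation lines start with a fixed-width blank
indent. The sketch below is an assumption about how that could look; it is not
the code checked in as revision 1.59. The names GENBANK_INDENT and
GENBANK_SPACER appear in the traceback in the original report, but the value
12 is inferred from the sample record, and join_continuation_lines is a
hypothetical helper.

```python
GENBANK_INDENT = 12  # assumed width of the blank prefix on continuation lines
GENBANK_SPACER = " " * GENBANK_INDENT


def join_continuation_lines(lines):
    """Merge each continuation line into the header line that precedes it."""
    merged = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(GENBANK_SPACER) and merged:
            # Continuation: append its text to the previous entry.
            merged[-1] += " " + line.strip()
        else:
            merged.append(line)
    return merged


if __name__ == "__main__":
    record = [
        "REFERENCE   42 (bases 1517 to 1696; 3932 to 4112; 17880 to 17975; 21142 to\n",
        "            28259)\n",
        "  AUTHORS   Fraser,N.W., Baker,C.C., Moore,M.A. and Ziff,E.B.\n",
    ]
    for entry in join_continuation_lines(record):
        print(entry)
```

Because the join depends only on indentation, it also handles a reference that
has been split onto many lines.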

See Bio/GenBank/__init__.py revision 1.59

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/__init__.py?cvsroot=biopython

If you wouldn't mind, Kael, it would be great if you could try this version on
all viral sequences in GenBank release 152.

Thanks.  Peter


From bugzilla-daemon at portal.open-bio.org  Thu Mar  9 10:28:54 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Thu Mar  9 11:21:22 2006
Subject: [Biopython-dev] [Bug 1965] GenBank FeatureParser converts dates
	from 4 digits to TWO!
Message-ID: <200603091528.k29FSsG3014393@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1965


------- Comment #1 from biopython-bugzilla@maubp.freeserve.co.uk  2006-03-09 10:28 -------
I've tried both GenBank.FeatureParser and GenBank.RecordParser but I'm not able
to reproduce this using the CVS GenBank parser, I see the full four digit
years.

Could you provide some more information - e.g. a specific GenBank file ID, and
the version of Python and BioPython you are using.

A short example script wouldn't hurt ;)


From bugzilla-daemon at portal.open-bio.org  Thu Mar  9 11:39:19 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Thu Mar  9 12:20:42 2006
Subject: [Biopython-dev] [Bug 1965] GenBank FeatureParser converts dates
	from 4 digits to TWO!
Message-ID: <200603091639.k29GdJeX016365@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1965


mcolosimo@mitre.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |LATER


------- Comment #2 from mcolosimo@mitre.org  2006-03-09 11:39 -------
Now I'm at a loss. You are correct that they do generate the correct 4-digit
year. But while building my scripts, I got it to output 2-digit years. Those
were only test scripts, though, which I didn't save.

I've resolved this and changed the resolution to LATER (whatever that means).


From bugzilla-daemon at portal.open-bio.org  Thu Mar  9 11:24:00 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Thu Mar  9 12:21:00 2006
Subject: [Biopython-dev] [Bug 1933] Iterator support for Standalone XML
	blast output with multiple querys
Message-ID: <200603091624.k29GO0SN015960@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1933


biopython-bugzilla@maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


------- Comment #9 from biopython-bugzilla@maubp.freeserve.co.uk  2006-03-09 11:24 -------
Fix checked in, see Bio/Blast/NCBIStandalone.py revision 1.60

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/Blast/NCBIStandalone.py?cvsroot=biopython

I have not yet updated the test scripts.


From bugzilla-daemon at portal.open-bio.org  Thu Mar  9 11:24:11 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Thu Mar  9 12:21:41 2006
Subject: [Biopython-dev] [Bug 1715] Bio.Blast.NCBIStandalone does not
	support standalone NCBI RPS-Blast (rpsblast) output
Message-ID: <200603091624.k29GOBcN015972@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1715


biopython-bugzilla@maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


------- Comment #12 from biopython-bugzilla@maubp.freeserve.co.uk  2006-03-09 11:24 -------
Marking as fixed - as long as you use XML output, not plain text.

See also Bug 1933 which fixes standalone blast XML iteration, and has some
examples attached using rpsblast output.

I have checked in a new "rpsblast" function to Bio/Blast/NCBIStandalone.py
based on the existing "blastall" and "blastpgp" functions.

However, this will by default call standalone RPS-BLAST with the option -m 7 to
produce XML output.
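
As a sketch of what such a call amounts to on the command line, the example
below builds an argument list that requests XML with -m 7. It does not show
the signature of the new "rpsblast" function in NCBIStandalone.py, which is
not given here. The helper name build_rpsblast_command is hypothetical, and
the -d (database) and -i (query file) flags are assumed from the usual
conventions of the standalone NCBI tools of that era.

```python
def build_rpsblast_command(executable, database, query_file, xml=True):
    """Return an argument list for running standalone RPS-BLAST.

    Assumed flags: -d database, -i query file, -m 7 for XML output.
    """
    command = [executable, "-d", database, "-i", query_file]
    if xml:
        # -m 7 selects XML output instead of the default plain text.
        command += ["-m", "7"]
    return command


if __name__ == "__main__":
    # "Cdd" and "query.fasta" are placeholder names for this example.
    print(" ".join(build_rpsblast_command("rpsblast", "Cdd", "query.fasta")))
```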

See Bio/Blast/NCBIStandalone.py revision 1.60

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/Blast/NCBIStandalone.py?cvsroot=biopython

NOTE - BioPython does not currently support the "plain text" default output
from standalone rpsblast, especially for multiple queries.


From bugzilla-daemon at portal.open-bio.org  Thu Mar  9 12:29:35 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Thu Mar  9 13:20:29 2006
Subject: [Biopython-dev] [Bug 1968] GenBank parsing fails if REFERNCE
	(bases.. )line is split
Message-ID: <200603091729.k29HTZ4f017584@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1968


------- Comment #5 from kael.fischer@gmail.com  2006-03-09 12:29 -------
Tested against GenBank-Vrl rel. 152 (>300,000 records).
No errors generated.


From jara.zimmermann at seznam.cz  Thu Mar  9 18:41:04 2006
From: jara.zimmermann at seznam.cz (=?us-ascii?Q?Jan=20Zimmermann?=)
Date: Thu Mar  9 18:42:38 2006
Subject: [Biopython-dev] biopython programmer
Message-ID: <623.1367-1717-1791254803-1141947664@seznam.cz>

Hi,

my name is Jan Zimmermann and I'm a student at Charles University, Faculty of Mathematics and Physics, in the Czech Republic. I am now choosing a topic for my bachelor's work, for which I have 14 months (until May 2007). I'd be interested in creating a new part of Biopython. If you are interested in my offer, please send me part of your TODO list so that I can choose what I could program.

my interests:
* AI
* genetic algorithms
* neural networks
* machine learning

my programming skills:
* Delphi, PHP, MySQL, C, C++, Java, UNIX shell script, Prolog, Scheme, Haskell, Python (I've begun to study it)...
* experience with team programming

Thanks for your reply.

Regards

Jan Zimmermann
From dag at sonsorol.org  Sat Mar 11 15:00:33 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Sat, 11 Mar 2006 15:00:33 -0500
Subject: [Biopython-dev] hey biopython devs ...
Message-ID: <E6FDC1AE-15AB-4326-8867-D21146ACCE18@sonsorol.org>


Some server and site news for you ...

Our hosting time at Wyeth is coming to an end very shortly which now  
forces us to greatly speed up our planned transition to the new OBF  
server hardware sitting in our other Boston area datacenter.

I've just completed moving all mailing lists off of the old
portal.open-bio.org server. All OBF lists are hosted off of
"http://lists.open-bio.org" now and the python lists were the last to be
moved (a few minutes ago). Until biopython can move its website to
the new server, the website will no longer be on the same machine as
the mailing list and mailing list archive. I've left behind aliases
on the old server to catch and forward list-related messages.

This brings me to the other issue -- moving your website. Can you  
elect someone among yourselves to work with me on moving  
biopython.org entirely over to the new servers? I want to be able to  
work with someone who knows where everything is and can test things  
out once the transition is complete.

Regards,
Chris
OBF


From idoerg at burnham.org  Sat Mar 11 16:41:08 2006
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sat, 11 Mar 2006 13:41:08 -0800
Subject: [Biopython-dev] hey biopython devs ...
In-Reply-To: <E6FDC1AE-15AB-4326-8867-D21146ACCE18@sonsorol.org>
Message-ID: <Pine.SGI.4.10.10603111333070.4145365-100000@pines2.ljcrf.edu>

Thanks Chris.

I understand the Biopython site is based on Quixote, which personally I
always found rather flummoxing. I see three options:

1) Getting someone to move the site "as-is", Quixote & all. I don't really
know how to do this.... Michiel?

2) Migrate to another web content management system. I favor Plone,
personally. I am willing to work on this, but I will not be able to get
around to it before Thursday or Friday and possibly only Monday next.

3) OK, two options.

Best,

Iddo

--
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037, USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://iddo-friedberg.org
http://BioFunctionPrediction.org



From dag at sonsorol.org  Sun Mar 12 07:25:02 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Sun, 12 Mar 2006 07:25:02 -0500
Subject: [Biopython-dev] hey biopython devs ...
In-Reply-To: <Pine.SGI.4.10.10603111333070.4145365-100000@pines2.ljcrf.edu>
References: <Pine.SGI.4.10.10603111333070.4145365-100000@pines2.ljcrf.edu>
Message-ID: <E94DE393-4315-43F6-A468-C41F090EC34D@sonsorol.org>


One more option to throw into the mix --

bioperl and other projects are moving to wiki based websites with  
blog engines to handle news posts etc. So far the effect has been  
very positive. I think your quixote system could be converted as-is  
to a mediawiki site fairly quickly.

-chris




From mdehoon at c2b2.columbia.edu  Sun Mar 12 14:41:26 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Sun, 12 Mar 2006 14:41:26 -0500
Subject: [Biopython-dev] hey biopython devs ...
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEBB@cgcmail.cgc.cpmc.columbia.edu>

Thanks, Chris.

> I've just completed moving all mailing lists off of the old  
> portal.open-bio.org server. All OBF lists are hosted off of "http:// 
> lists.open-bio.org" now and the python lists were the last to be  
> moved (a few minutes ago).

Note though that the biopython mailing archive source file was damaged about
a year ago (see this post:
http://www.biopython.org/pipermail/biopython/2005-March/002557.html). I have
bits and pieces of the old mailing archive from which we can probably put
together a full archive, but I don't have the necessary file permissions to
fix this. Is there some way to solve this?

> I understand the Biopython site is based on Quixote, which personally I
> always found rather flummoxing. I see three options:

I don't have a strong opinion about this. I don't know much about Quixote
except for using it a couple of times to make some minor changes to the
Biopython website. If somebody likes Plone or Wiki better, I'm all for it.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


From idoerg at burnham.org  Sun Mar 12 18:20:23 2006
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sun, 12 Mar 2006 15:20:23 -0800
Subject: [Biopython-dev] hey biopython devs ...
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEBB@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEBB@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <4414ACB7.80508@burnham.org>

OK, so if nobody objects, let's go with the wiki option. This will
minimize work for all involved, and from what I've seen the wiki seems
like an improvement.

I understand there is a time constraint. But if there is any work needed 
on my part, can it keep until Thursday?

thanks,

Iddo




-- 

Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: (858) 646 3100 x3516
Fax: (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org


From dag at sonsorol.org  Sun Mar 12 20:20:37 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Sun, 12 Mar 2006 20:20:37 -0500
Subject: [Biopython-dev] hey biopython devs ...
In-Reply-To: <4414ACB7.80508@burnham.org>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEBB@cgcmail.cgc.cpmc.columbia.edu>
	<4414ACB7.80508@burnham.org>
Message-ID: <01850143-9644-4C6F-8570-9A6FB7C1D2CA@sonsorol.org>


Sure!

I'm CC'ing this to Jason as he has set up most of the wiki and blog  
engines on our new system.

If you guys don't mind, we'll set up a wiki (and/or blog if needed)  
similar to what bioperl and biomoby are already using.

We'll set the site up at http://biopython.open-bio.org  for you to  
work on and when things are set we can simply make the switch.

The only real time constraint is our upcoming loss of IP connectivity
at Wyeth for the existing Open Bio web and CVS source code servers. I
really don't want to move your python based website code over to the
new server hardware since you are not seriously committed to using
it. I also can't move the old server from Wyeth to our new colo cage
simply due to space and power reasons.

We'll let you know when the Wiki is up. Thanks again!

Regards,
Chris




From dag at sonsorol.org  Tue Mar 14 09:41:34 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Tue, 14 Mar 2006 09:41:34 -0500
Subject: [Biopython-dev] help test the new open-bio anonymous source code
	server
Message-ID: <BD8D0472-3DFC-401C-928A-CBBBE448FC75@sonsorol.org>


Hi folks,

Apologies for the cross-post. I need some help testing out a new
open-bio server.

The new server is http://code.open-bio.org and it has been purpose  
built to replace our existing anonymous CVS server (cvs.open-bio.org,  
etc. etc.)

In addition to anonymous CVS, this new system also offers anonymous  
rsync mirrors of all our source code.

For security reasons (the anonymous CVS pserver protocol is
considered insecure) we run these anon access methods on a locked
down machine that only has a read only copy of the codebase. The
webserver page (except for the viewcvs CGI) actually redirects to a
wiki entry on a different server so we don't have to maintain a
website on the new box.

The anonymous access repository is currently updated every 30 minutes  
from the main developer system.

Things I need help with:

  - check out http://code.open-bio.org -- does the documentation make  
sense? Can it be made better? If so, change it! (it is a wiki after  
all...)

  - please experiment with anonymous CVS, confirm that you can check  
out code

  - experiment with rsync! this is a new feature for us that we could  
not offer on the old server (due to upstream port ACLs on a core router)

  - provide feedback on the "speed" and bandwidth of the anoncvs  
server, does it seem reasonable?

Let me know what you think. The reason for this move is that one
of our core hosting facilities (datacenter at Wyeth Research) is no  
longer going to be useable by us as they are changing their WAN links  
in such a way that our servers will not be able to directly access  
the internet. We are in the process of moving *every* open-bio.org  
service onto new hardware located in a different datacenter.

Our mailing lists have already been migrated, websites are moving as  
well. Expect some big changes once we tackle the task of moving all  
the developers and the writable CVS repositories to the new  
datacenter - that will happen probably within the next 2 weeks.

Regards,
Chris Dagdigian
OBF


From bugzilla-daemon at portal.open-bio.org  Thu Mar 16 13:00:25 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 16 Mar 2006 13:00:25 -0500
Subject: [Biopython-dev] [Bug 1971] Bio.SeqUtils.GC123 raises
	ZeroDivisionError on lower case sequence
Message-ID: <200603161800.k2GI0P9w008247@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1971


------- Comment #1 from bill at barnard-engineering.com  2006-03-16 13:00 -------
Created an attachment (id=297)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=297&action=view)
This simple patch permits GC123 to handle a lower case sequence

Additionally, I removed a variable named l ("ell") which is only used once in a
range statement where it strongly resembles a 1 ("one"). I replaced l in the
range statement with its value, len(seq).
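
To make the failure mode concrete, here is a small case-insensitive sketch of
a GC-by-codon-position calculation. It is not the Biopython GC123 code or the
attached patch. It assumes the error arises because only upper-case letters
are counted, so every total is zero for a lower-case sequence; the function
name gc123 and the choice to return 0.0 for an empty position are choices made
for this example.

```python
def gc123(seq):
    """Return GC percentages: overall, then codon positions 1, 2 and 3."""
    seq = seq.upper()  # normalise case so "atgc" and "ATGC" agree
    gc_counts = [0, 0, 0]
    totals = [0, 0, 0]
    for i in range(len(seq)):
        base = seq[i]
        if base in "ACGT":
            totals[i % 3] += 1
            if base in "GC":
                gc_counts[i % 3] += 1

    def percent(gc, total):
        # Guard against dividing by zero when nothing was counted.
        return 100.0 * gc / total if total else 0.0

    overall = percent(sum(gc_counts), sum(totals))
    by_position = tuple(percent(g, t) for g, t in zip(gc_counts, totals))
    return (overall,) + by_position


if __name__ == "__main__":
    print(gc123("atgcgc"))
    print(gc123("ATGCGC"))
```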



From bugzilla-daemon at portal.open-bio.org  Thu Mar 16 12:58:03 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 16 Mar 2006 12:58:03 -0500
Subject: [Biopython-dev] [Bug 1971] New: Bio.SeqUtils.GC123 raises
	ZeroDivisionError on lower case sequence
Message-ID: <200603161758.k2GHw3iT008224@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1971

           Summary: Bio.SeqUtils.GC123 raises ZeroDivisionError on lower
                    case sequence
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: trivial
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: bill at barnard-engineering.com


The GC123 utility will raise ZeroDivisionError: float division if you pass it a
lowercase sequence.


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org  Thu Mar 16 14:35:44 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 16 Mar 2006 14:35:44 -0500
Subject: [Biopython-dev] [Bug 1971] Bio.SeqUtils.GC123 raises
	ZeroDivisionError on lower case sequence
Message-ID: <200603161935.k2GJZig4009442@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1971


mdehoon at ims.u-tokyo.ac.jp changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp  2006-03-16 14:35 -------
Committed to CVS, thanks.


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 02:28:46 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 02:28:46 -0500
Subject: [Biopython-dev] [Bug 1972] New Prosite comment qualifier breaks
	Bio.Prosite.RecordParser
Message-ID: <200603210728.k2L7Sk87005724@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1972


------- Comment #1 from bill at barnard-engineering.com  2006-03-21 02:28 -------
Created an attachment (id=298)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=298&action=view)
Proposed patch to update the parser

The fix is quite simple; add a clause to Bio.Prosite._RecordConsumer.comment()
to assign the new version qualifier to a new Bio.Prosite.Record data member
named following the existing naming convention.
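
The shape of such a change can be sketched roughly as follows. This is a
standalone illustration, not the contents of the attached patch: the
function parse_comment_line, the KNOWN_QUALIFIERS table and the cc_*
attribute names are assumptions made for the example, and it raises
ValueError where the real parser reports a SyntaxError.

```python
# Map each qualifier seen on a Prosite "CC" line to the record field that
# stores it. The field names are illustrative assumptions.
KNOWN_QUALIFIERS = {
    "/TAXO-RANGE": "cc_taxo_range",
    "/MAX-REPEAT": "cc_max_repeat",
    "/SKIP-FLAG": "cc_skip_flag",
    "/VERSION": "cc_version",  # the newly introduced qualifier
}


def parse_comment_line(line, record):
    """Store the qualifiers found on one "CC" line in the dict `record`."""
    body = line[5:].strip()  # drop the leading "CC   " line code
    for item in body.split(";"):
        item = item.strip()
        if not item:
            continue
        qualifier, _, value = item.partition("=")
        if qualifier not in KNOWN_QUALIFIERS:
            raise ValueError(
                "Unknown qual %r in comment line\n%s" % (qualifier, line)
            )
        record[KNOWN_QUALIFIERS[qualifier]] = value
    return record
```

Without the "/VERSION" entry in the table, the line "CC   /VERSION=1;"
would hit the error branch, which mirrors the failure reported in this bug.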

Most of the patch consists of updates to the data files in the Tests directory
that are tested by Tests/test_prosite2.py

I would further recommend removing the following files from the repository
since they are not used in (or refer to) any existing test that I found. These
are all in the Tests/Prosite directory:

README
ps001
ps002
ps003
ps00107.htm
ps00123.htm
ps00123.txt
ps00159.htm
ps00165.htm
ps00213.html
ps00213.txt
ps00432.htm
ps00812.txt
ps01213.txt

The test_prosite2 tests pass before and after applying the patch to the current
CVS tree.

Patch application instructions:

Run from the top of the tree, i.e. in the biopython root directory
% patch -p0 < path_to_patches/biopython-Bio_Prosite_parser_fix.patch


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 02:07:57 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 02:07:57 -0500
Subject: [Biopython-dev] [Bug 1972] New: New Prosite comment qualifier
	breaks Bio.Prosite.RecordParser
Message-ID: <200603210707.k2L77vsJ005498@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1972

           Summary: New Prosite comment qualifier breaks
                    Bio.Prosite.RecordParser
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: bill at barnard-engineering.com


To reproduce:

import Bio.Prosite
prosite = Bio.Prosite.ExPASyDictionary(parser=Bio.Prosite.RecordParser())
entry = prosite['PS00079']

which raises:

SyntaxError: Unknown qual '/VERSION' in comment line
CC   /VERSION=1;


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 03:25:42 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 03:25:42 -0500
Subject: [Biopython-dev] [Bug 1972] New Prosite comment qualifier breaks
	Bio.Prosite.RecordParser
Message-ID: <200603210825.k2L8Pg3S006352@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1972


------- Comment #2 from bill at barnard-engineering.com  2006-03-21 03:25 -------
I thought of this after I submitted the patch. Even though the existing test
simply compares the output to the stored output of a previous test, this bug
could be added as a test case such that if an unsupported change occurs to the
Prosite format, an exception will be raised when running the test.

Add the lines to Tests/test_prosite2.py:

expasy_dict = Prosite.ExPASyDictionary(parser=record_parser)
entry = expasy_dict['PS00079']
print entry.accession

or some such, then regenerate the test output file.


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From biopython-dev at maubp.freeserve.co.uk  Tue Mar 21 07:17:42 2006
From: biopython-dev at maubp.freeserve.co.uk (Peter (BioPython-dev))
Date: Tue, 21 Mar 2006 12:17:42 +0000
Subject: [Biopython-dev]  "Online" tests, was [Bug 1972]
In-Reply-To: <200603210825.k2L8Pg3S006352@portal.open-bio.org>
References: <200603210825.k2L8Pg3S006352@portal.open-bio.org>
Message-ID: <441FEEE6.8040402@maubp.freeserve.co.uk>

bugzilla-daemon at portal.open-bio.org wrote:
> http://bugzilla.open-bio.org/show_bug.cgi?id=1972
> 
> ------- Comment #2 from bill at barnard-engineering.com  2006-03-21 03:25 -------
> I thought of this after I submitted the patch. Even though the existing test
> simply compares the output to the stored output of a previous test, this bug
> could be added as a test case such that if an unsupported change occurs to the
> Prosite format, an exception will be raised when running the test.
> 
> Add the lines to Tests/test_prosite2.py:
> 
> expasy_dict = Prosite.ExPASyDictionary(parser=record_parser)
> entry = expasy_dict['PS00079']
> print entry.accession
> 
> or some such, then regenerate the test output file.

This change would add a test requiring internet access, right?

I think BioPython has previously avoided doing this...

Perhaps we should have some additional "online tests" for some modules - 
but handled separately so that people can test most of BioPython WITHOUT 
having a network connection (e.g. due to using a standalone machine, 
proxy complications, dialup etc).

I had wondered about this as a general point for the BioPython test 
framework (as there are lots of other modules that can fetch information 
or query online tools - e.g. online blast).

Comments?

Peter


From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 14:26:39 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 14:26:39 -0500
Subject: [Biopython-dev] [Bug 1972] New Prosite comment qualifier breaks
	Bio.Prosite.RecordParser
Message-ID: <200603211926.k2LJQdN1016415@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1972


mdehoon at ims.u-tokyo.ac.jp changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


------- Comment #3 from mdehoon at ims.u-tokyo.ac.jp  2006-03-21 14:26 -------
Fixed in CVS, thanks.
About the test case, I think it is better to avoid requiring internet access in
order to run the tests, at least for the default run suite. See the
Biopython-dev mailing list for further discussion.


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 15:27:49 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 15:27:49 -0500
Subject: [Biopython-dev] [Bug 1767] Bio/trie.c can crash on Windows
Message-ID: <200603212027.k2LKRnnE017090@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1767


mdehoon at ims.u-tokyo.ac.jp changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         OS/Version|Windows 2000                |All


------- Comment #1 from mdehoon at ims.u-tokyo.ac.jp  2006-03-21 15:27 -------
This problem can be fixed by adding

#ifdef __MINGW32__
#  define strdup _strdup
#endif

near the top of trie.c (after the #include's). This will cause trie.pyd to be
linked to msvcr71.dll only, as it's supposed to be.


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bugzilla-daemon at portal.open-bio.org  Wed Mar 22 06:36:12 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 22 Mar 2006 06:36:12 -0500
Subject: [Biopython-dev] [Bug 1974] New: Missing flex dependency crashes
	build with unclear error message "/usr/bin/ld: cannot find -lfl"
Message-ID: <200603221136.k2MBaC1W030806@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1974

           Summary: Missing flex dependency crashes build with unclear error
                    message "/usr/bin/ld: cannot find -lfl"
           Product: Biopython
           Version: Not Applicable
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: yohell at ifm.liu.se


Biopython depends on flex, which is not mentioned in the docs. 

$ python setup.py build

produces a lot of text and at the end comes:
/usr/bin/ld: cannot find -lfl 
collect2: ld returned 1 exit status 
error: command 'gcc' failed with exit status 1 

A google on "biopython lfl" led me to this post:
http://mail.python.org/pipermail/distutils-sig/2004-March/003795.html

After reading this I installed flex using Synaptic (Ubuntu Linux 5.10) and
rebuilt using

$ python setup.py build

with no problem.

The missing flex dependency should be added to the list of required software on
the biopython download page:
http://www.biopython.org/download/
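
Until the docs are updated, a pre-build check along these lines might
give a clearer message than the linker error. It is only a sketch: the
helper name missing_build_tools is hypothetical, and finding the flex
executable on PATH is merely a proxy for the libfl library that the
"-lfl" linker flag actually needs.

```python
import shutil


def missing_build_tools(tools=("flex",), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]
```

A build script could call missing_build_tools() before compiling and stop
with a message naming the missing package if the returned list is not
empty.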

Cheers,
Joel Hedlund


------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From bill at barnard-engineering.com  Wed Mar 22 11:38:34 2006
From: bill at barnard-engineering.com (Bill Barnard)
Date: Wed, 22 Mar 2006 08:38:34 -0800
Subject: [Biopython-dev] "Online" tests, was [Bug 1972]
In-Reply-To: <441FEEE6.8040402@maubp.freeserve.co.uk>
References: <200603210825.k2L8Pg3S006352@portal.open-bio.org>
	<441FEEE6.8040402@maubp.freeserve.co.uk>
Message-ID: <1143045514.813.26.camel@tioga.barnard-engineering.com>

On Tue, 2006-03-21 at 12:17 +0000, Peter (BioPython-dev) wrote:
> bugzilla-daemon at portal.open-bio.org wrote:
> > http://bugzilla.open-bio.org/show_bug.cgi?id=1972
> > 
> > ------- Comment #2 from bill at barnard-engineering.com  2006-03-21 03:25 -------
> > I thought of this after I submitted the patch. Even though the existing test
> > simply compares the output to the stored output of a previous test, this bug
> > could be added as a test case such that if an unsupported change occurs to the
> > Prosite format, an exception will be raised when running the test.
> > 
> > Add the lines to Tests/test_prosite2.py:
> > 
> > expasy_dict = Prosite.ExPASyDictionary(parser=record_parser)
> > entry = expasy_dict['PS00079']
> > print entry.accession
> > 
> > or some such, then regenerate the test output file.
> 
> This change would add a test requiring internet access, right?

That is correct. The test I suggest belongs elsewhere than in
test_prosite2; see below...

> 
> I think BioPython has previously avoided doing this...

Tests.test_Registry checks on-line retrieval from several remote DBs.

Further investigation shows this:

[billb at tioga Tests]$ grep -l requires_internet *.py
requires_internet.py
test_ais.py
test_EUtils.py
test_HotRand.py
test_Registry.py

> 
> Perhaps we should have some additional "online tests" for some modules - 
> but handled separately so that people can test most of BioPython WITHOUT 
> having a network connection (e.g. due to using a standalone machine, 
> proxy complications, dialup etc).

Perhaps another test case similar to test_Registry should be implemented
with the purpose of detecting DB format changes. I'll be happy to have a
go at it, should you want such a test.

test_Registry appears to check basic connectivity, and that the record
retrieved matches the one requested, but does no parsing other than to
look at the first N bytes from the data returned to see if it contains
the requested ID.

> 
> I had wondered about this as a general point for the BioPython test 
> framework (as there are lots of other modules that can fetch information 
> or query online tools - e.g. online blast).

Any test that needs a remote resource should import requires_internet
and then can fail gracefully if the internet is not accessible.
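
The general pattern can be sketched as below. This is not the code in
Tests/requires_internet.py; the names internet_available and
OnlineTestCase, the default host and the timeout are assumptions chosen
for the example.

```python
import socket
import unittest


def internet_available(host="www.biopython.org", port=80, timeout=3.0,
                       connect=socket.create_connection):
    """Return True if a TCP connection to host:port can be opened.

    `connect` is injectable so the check can be exercised without a network.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
    conn.close()
    return True


class OnlineTestCase(unittest.TestCase):
    """Base class for tests that need a remote resource."""

    def setUp(self):
        # Skip, rather than fail, when the network is unreachable.
        if not internet_available():
            self.skipTest("internet not accessible")
```

Online tests would then derive from OnlineTestCase, so an offline run
reports them as skipped instead of as failures.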

This does not address the bandwidth issue for those on dial-up, however
the tests so far don't appear to retrieve very much data. A parser test
could retrieve small records to minimize bandwidth and memory.

As to proxies; I didn't look too hard but probably most, if not all, the
remote resources will be accessed through http which should be no
problem. If http is not accessible, requires_internet handles that case.

Bill


From dag at sonsorol.org  Wed Mar 22 14:47:05 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Wed, 22 Mar 2006 14:47:05 -0500
Subject: [Biopython-dev] biopython website has moved to its new home
Message-ID: <706CDF3D-CC40-49FC-ACC9-BD195A519132@sonsorol.org>


I still can't believe I tried this, even harder to believe it  
actually worked ...

Your python Quixote based website has been moved to its new server  
home and IP address. While DNS for biopython.org is still updating  
you can confirm the new site is online by surfing over to
http://testsite.biopython.org

Considering I don't know python, have never used quixote and did not  
realize until far into the migration project that many critical bits  
of python required for the website to render under the application  
server are actually part of your biopython CVS repository it is a  
miracle that this worked at all. heh.

Anyway, you guys were the last major website needing to be moved (in  
addition to bugzilla.open-bio.org which also moved today).

This means that you are continuing to use the "old" site -- you can  
use this as long as you like. Your new wiki based site at  
biopython.open-bio.org is still available and if you choose to migrate  
into that new setup just let me know (or support at open-bio.org) when  
you want the switchover to occur.

Please keep an eye on biopython.org for me, especially after it  
starts returning 207.154.17.70 as the resolved IP address. Let me  
know if there are any issues
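
For anyone who wants to script that check, a minimal sketch follows. The
helper name resolves_to is made up for illustration; it only compares the
address the local resolver returns with the expected one.

```python
import socket


def resolves_to(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True if `hostname` currently resolves to `expected_ip`.

    `resolve` is injectable so the function can be tested offline.
    """
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # Name resolution failed entirely.
        return False
```

Calling resolves_to("biopython.org", "207.154.17.70") would then report
whether the local resolver has picked up the new address yet.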


Regards,
Chris


From biopython-dev at maubp.freeserve.co.uk  Thu Mar 23 03:50:18 2006
From: biopython-dev at maubp.freeserve.co.uk (Peter (BioPython-dev))
Date: Thu, 23 Mar 2006 08:50:18 +0000
Subject: [Biopython-dev] biopython website has moved to its new home
In-Reply-To: <706CDF3D-CC40-49FC-ACC9-BD195A519132@sonsorol.org>
References: <706CDF3D-CC40-49FC-ACC9-BD195A519132@sonsorol.org>
Message-ID: <4422614A.4060508@maubp.freeserve.co.uk>

Chris Dagdigian wrote:
> Please keep an eye on biopython.org for me, especially after it  
> starts returning 207.154.17.70 as the resolved IP address. Let me  
> know if there are any issues

Umm, http://207.154.17.70/ goes to a BioMoby page at the moment, at 
first glance the content looks identical to http://www.biomoby.org/

Did you mean a different IP address Chris?

Peter


From bill at barnard-engineering.com  Thu Mar 30 15:33:55 2006
From: bill at barnard-engineering.com (Bill Barnard)
Date: Thu, 30 Mar 2006 12:33:55 -0800
Subject: [Biopython-dev] "Online" tests, was [Bug 1972]
In-Reply-To: <442403DE.7090607@biopython.org>
References: <200603210825.k2L8Pg3S006352@portal.open-bio.org>
	<441FEEE6.8040402@maubp.freeserve.co.uk>
	<1143045514.813.26.camel@tioga.barnard-engineering.com>
	<442403DE.7090607@biopython.org>
Message-ID: <1143750836.5736.27.camel@tioga.barnard-engineering.com>

On Fri, 2006-03-24 at 14:36 +0000, Peter wrote:
> Bill Barnard wrote:
> > Perhaps another test case similar to test_Registry should be implemented
> > with the purpose of detecting DB format changes. I'll be happy to have a
> > go at it, should you want such a test.

> Peter wrote:
> I think this is an excellent idea - but it would be good to have an 
> opinion from some of the more seasoned BioPython developers.
> 
> Putting these online tests into separate unit test(s) will make tracing 
> unit test failures simply due to being offline much easier.
> 
> I could probably help out with some of the formats - but I am by no 
> means familiar with them all.

I've made a first cut unit test, tentatively named
test_Parsers_for_newest_formats, which retrieves and parses some small
Prosite, Prodoc, SwissProt, and Medline records. I tried
these types first, based on a quick search of the code tree to see where
there was existing code that makes use of Bio.WWW.

[billb at tioga Bio]$ find . -name "*.py" | xargs grep "Bio\.WWW"

yields (in part)

./Prosite/Prodoc.py:from Bio.WWW import ExPASy
./Prosite/__init__.py:from Bio.WWW import ExPASy
./SwissProt/SProt.py:from Bio.WWW import ExPASy
./PubMed.py:from Bio.WWW import NCBI

./Blast/NCBIWWW.py:    from Bio.WWW import NCBI

The first four are easy to test with code like:

class ExpasyTest(unittest.TestCase):
    """Test that Expasy parsers can read the current database formats
    """
    def setUp(self):
        self.prosite_dict = Prosite.ExPASyDictionary \
                            (parser=Prosite.RecordParser())

    def t_read_record(self):
        """Retrieve a Prosite record and parse it
        """
        accession = 'PS00159'
        entry = self.prosite_dict[accession]
        self.assertEqual(entry.accession, accession)

Testing Blast in the same way doesn't seem sensible to me, and it looks
as though any effort there should be in the XML Parser area, rather than
in the thankless task of parsing HTML. (I suspect that's what you've
already decided.)

> In some cases (e.g. GenBank, Fasta) once the sample file is downloaded 
> there are multiple parsers to be checked (e.g. record and feature parsers).

I'll take a look at more parsers, as I figure out where they are. I will
take the same approach of looking through the code tree for existing
parsers using find/grep. It looks as though there are a fair number
which may be obsolete. I would appreciate any guidance in figuring out
which ones would be most useful to check.

(Is this exercise useful? I was just learning my way around the code
using the on-line course at the Pasteur Institute, and found a minor bug
which I fixed. Since any bug should really be covered by a test as well
as being fixed, I wanted to now add the test. I like cleaning up
problems as I find them, but I may not be doing anything that's of more
than minor utility for Biopython...)

> We should probably produce a streamlined test output file WITHOUT 
> details which are likely to change in later versions of the test file 
> e.g. revisions to genbank files.

Since the test only verifies the record can be retrieved, parsed, and is
the actual record requested it emits very little output. My last run
emitted:

[billb at tioga Tests]$ python test_Parsers_for_newest_formats.py
Retrieve a Prodoc record and parse it ... ok
Retrieve a Prosite record and parse it ... ok
Retrieve a SwissProt record and parse it into Record format ... WARNING
- Ignoring line: DT   20-DEC-2005, integrated into UniProtKB/Swiss-Prot.

WARNING - Ignoring line: DT   07-DEC-2004, sequence version 1.

WARNING - Ignoring line: DT   07-FEB-2006, entry version 10.

ok
Retrieve a SwissProt record and parse it into Sequence format ... ok
Retrieve a PubMed record and parse it ... ok

----------------------------------------------------------------------
Ran 5 tests in 3.085s

OK


Is this what you mean by "streamlined test output"?

> One question is should the test "cache" any downloaded files (say for a 
> day) which would be helpful for anyone trying to debug a particular 
> issue and re-running the online tests?  Or is this just making life too 
> complicated.

This could be done, but I doubt I would do it unless it really seemed
useful...
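
For reference, if it did turn out to be useful, a day-long cache could be
sketched roughly like this. Nothing here exists in the test suite: the
name cached_fetch, its parameters and the one-file-per-record layout are
all assumptions for the example, and fetch stands in for whatever
function downloads the record text.

```python
import os
import time


def cached_fetch(name, fetch, cache_dir, max_age=24 * 60 * 60, now=time.time):
    """Return the text for `name`, calling fetch(name) only when needed.

    A cached copy in `cache_dir` is reused while it is younger than
    `max_age` seconds; otherwise the record is downloaded again and saved.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name)
    if os.path.exists(path) and now() - os.path.getmtime(path) < max_age:
        with open(path) as handle:
            return handle.read()
    data = fetch(name)
    with open(path, "w") as handle:
        handle.write(data)
    return data
```

Re-running a failing online test within the day would then parse the same
downloaded text instead of fetching it again.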

Lazily yours,

Bill


From biopython-dev at maubp.freeserve.co.uk  Fri Mar 31 15:41:18 2006
From: biopython-dev at maubp.freeserve.co.uk (Peter (BioPython Dev))
Date: Fri, 31 Mar 2006 21:41:18 +0100
Subject: [Biopython-dev] "Online" tests, was [Bug 1972]
In-Reply-To: <1143750836.5736.27.camel@tioga.barnard-engineering.com>
References: <200603210825.k2L8Pg3S006352@portal.open-bio.org>	<441FEEE6.8040402@maubp.freeserve.co.uk>	<1143045514.813.26.camel@tioga.barnard-engineering.com>	<442403DE.7090607@biopython.org>
	<1143750836.5736.27.camel@tioga.barnard-engineering.com>
Message-ID: <442D93EE.9040707@maubp.freeserve.co.uk>

Bill Barnard wrote:
> I've made a first cut unit test, tentatively named
> test_Parsers_for_newest_formats, which retrieves and parses some small
> Prosite, Prodoc, SwissProt, and Medline records. I tried
> these types first, based on a quick search of the code tree to see where
> there was existing code that makes use of Bio.WWW.

Sounds good to me.  But not a very snappy name - how about something 
shorter like test_OnlineFormats.py instead?

> Testing Blast in the same way doesn't seem sensible to me, and it looks
> as though any effort there should be in the XML Parser area, rather than
> in the thankless task of parsing HTML. (I suspect that's what you've
> already decided.)

It was decided fairly recently to prioritise XML output for Blast.  The
plain text output is fairly stable, but my impression is that the HTML
was/is a moving target and a thankless job.

I think the Blast test should actually submit a short protein/nucleotide 
sequence known to be in the online database.  Maybe do some basic sanity 
testing like check it returns at least N results and the best hit is at 
least a certain score.
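
Something along these lines is what I have in mind for the sanity check.
It is only a sketch: the helper name check_blast_hits is made up, and it
assumes the hits have already been parsed into (title, score) pairs, so
it says nothing about how the search is submitted or parsed.

```python
def check_blast_hits(hits, min_hits, min_best_score):
    """Check a list of (title, score) pairs and return the best score.

    Raises AssertionError if there are fewer than `min_hits` hits or the
    best score is below `min_best_score`.
    """
    if not hits:
        raise AssertionError("no hits returned")
    if len(hits) < min_hits:
        raise AssertionError(
            "expected at least %d hits, got %d" % (min_hits, len(hits))
        )
    best = max(score for _title, score in hits)
    if best < min_best_score:
        raise AssertionError(
            "best score %s is below %s" % (best, min_best_score)
        )
    return best
```

The thresholds would have to be loose enough that normal growth of the
online database does not break the test.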

>>In some cases (e.g. GenBank, Fasta) once the sample file is downloaded 
>>there are multiple parsers to be checked (e.g. record and feature parsers).
> 
> I'll take a look at more parsers, as I figure out where they are. I will
> take the same approach of looking through the code tree for existing
> parsers using find/grep. It looks as though there are a fair number
> which may be obsolete. I would appreciate any guidance in figuring out
> which ones would be most useful to check.
> 
> (Is this exercise useful? I was just learning my way around the code
> using the on-line course at the Pasteur Institute, and found a minor bug
> which I fixed. Since any bug should really be covered by a test as well
> as being fixed, I wanted to now add the test. I like cleaning up
> problems as I find them, but I may not be doing anything that's of more
> than minor utility for Biopython...)

Like yourself, I'm only familiar with a fraction of the BioPython code.

I'll volunteer to add cases for GenBank, Fasta and GEO files.

>>We should probably produce a streamlined test output file WITHOUT 
>>details which are likely to change in later versions of the test file 
>>e.g. revisions to genbank files.
> 
> Since the test only verifies the record can be retrieved, parsed, and is
> the actual record requested it emits very little output. My last run
> emitted:
> 
> ...
> WARNING - Ignoring line: DT   20-DEC-2005, integrated into
> UniProtKB/Swiss-Prot.
> ...

Those WARNING lines are my doing, see bug 1946

http://bugzilla.open-bio.org/show_bug.cgi?id=1946

> Is this what you mean by "streamlined test output"?

Pretty much.

>>One question is should the test "cache" any downloaded files (say for a 
>>day) which would be helpful for anyone trying to debug a particular 
>>issue and re-running the online tests?  Or is this just making life too 
>>complicated.
> 
> This could be done, but I doubt I would do it unless it really seemed
> useful...

OK :)

Peter


From dag at sonsorol.org  Sat Mar 11 20:00:33 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Sat, 11 Mar 2006 15:00:33 -0500
Subject: [Biopython-dev] hey biopython devs ...
Message-ID: <E6FDC1AE-15AB-4326-8867-D21146ACCE18@sonsorol.org>


Some server and site news for you ...

Our hosting time at Wyeth is coming to an end very shortly which now  
forces us to greatly speed up our planned transition to the new OBF  
server hardware sitting in our other Boston area datacenter.

I've just completed moving all mailing lists off of the old  
portal.open-bio.org server. All OBF lists are hosted off of
"http://lists.open-bio.org" now and the python lists were the last to be  
moved (a few minutes ago).   Until biopython can move its website to  
the new server, the website will no longer be on the same machine as  
the mailing list and mailing list archive. I've left behind aliases  
on the old server to catch and forward list related messages.

This brings me to the other issue -- moving your website. Can you  
elect someone among yourselves to work with me on moving  
biopython.org entirely over to the new servers? I want to be able to  
work with someone who knows where everything is and can test things  
out once the transition is complete.

Regards,
Chris
OBF


From idoerg at burnham.org  Sat Mar 11 21:41:08 2006
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sat, 11 Mar 2006 13:41:08 -0800
Subject: [Biopython-dev] hey biopython devs ...
In-Reply-To: <E6FDC1AE-15AB-4326-8867-D21146ACCE18@sonsorol.org>
Message-ID: <Pine.SGI.4.10.10603111333070.4145365-100000@pines2.ljcrf.edu>

Thanks Chris.

I understand the Biopython site is based on Quixote, which personally I
always found rather flummoxing. I see three options:

1) Getting someone to move the site "as-is", Quixote & all. I don't really
know how to do this.... Michiel?

2) Migrate to another web content management system. I favor Plone,
personally. I am willing to work on this, but I will not be able to get
around to it before Thursday or Friday and possibly only Monday next.

3) OK, two options.

Bst,

Iddo

--
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037, USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://iddo-friedberg.org
http://BioFunctionPrediction.org

On Sat, 11 Mar 2006, Chris Dagdigian wrote:

> 
> Some server and site news for you ...
> 
> Our hosting time at Wyeth is coming to an end very shortly which now  
> forces us to greatly speed up our planned transition to the new OBF  
> server hardware sitting in our other Boston area datacenter.
> 
> I've just completed moving all mailing lists off of the old  
> portal.open-bio.org server. All OBF lists are hosted off of
> "http://lists.open-bio.org" now and the python lists were the last to be  
> moved (a few minutes ago).   Until biopython can move its website to  
> the new server, the website will no longer be on the same machine as  
> the mailing list and mailing list archive. I've left behind aliases  
> on the old server to catch and forward list related messages.
> 
> This brings me to the other issue -- moving your website. Can you  
> elect someone among yourselves to work with me on moving  
> biopython.org entirely over to the new servers? I want to be able to  
> work with someone who knows where everything is and can test things  
> out once the transition is complete.
> 
> Regards,
> Chris
> OBF
> 
> 
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
> 


From dag at sonsorol.org  Sun Mar 12 12:25:02 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Sun, 12 Mar 2006 07:25:02 -0500
Subject: [Biopython-dev] hey biopython devs ...
In-Reply-To: <Pine.SGI.4.10.10603111333070.4145365-100000@pines2.ljcrf.edu>
References: <Pine.SGI.4.10.10603111333070.4145365-100000@pines2.ljcrf.edu>
Message-ID: <E94DE393-4315-43F6-A468-C41F090EC34D@sonsorol.org>


One more option to throw into the mix --

bioperl and other projects are moving to wiki based websites with  
blog engines to handle news posts etc. So far the effect has been  
very positive. I think your quixote system could be converted as-is  
to a mediawiki site fairly quickly.

-chris


On Mar 11, 2006, at 4:41 PM, Iddo Friedberg wrote:

> Thanks Chris.
>
> I understand the Biopython site is based on Quixote, which  
> personally I
> always found rather flummoxing. I see three options:
>
> 1) Getting someone to move the site "as-is", Quixote & all. I don't  
> really
> know how to do this.... Michiel?
>
> 2) Migrate to another web content management system. I favor Plone,
> personally. I am willing to work on this, but I will not be able to  
> get
> around to it before Thursday or Friday and possibly only Monday next.
>
> 3) OK, two options.
>
> Bst,
>
> Iddo
>
> --
> Iddo Friedberg, Ph.D.
> Burnham Institute for Medical Research
> 10901 N. Torrey Pines Rd.
> La Jolla, CA 92037, USA
> Tel: +1 (858) 646 3100 x3516
> Fax: +1 (858) 646 3171
> http://iddo-friedberg.org
> http://BioFunctionPrediction.org
>
> On Sat, 11 Mar 2006, Chris Dagdigian wrote:
>
>>
>> Some server and site news for you ...
>>
>> Our hosting time at Wyeth is coming to an end very shortly which now
>> forces us to greatly speed up our planned transition to the new OBF
>> server hardware sitting in our other Boston area datacenter.
>>
>> I've just completed moving all mailing lists off of the old
>> portal.open-bio.org server. All OBF lists are hosted off of
>> "http://lists.open-bio.org" now and the python lists were the last to be
>> moved (a few minutes ago).   Until biopython can move its website to
>> the new server, the website will no longer be on the same machine as
>> the mailing list and mailing list archive. I've left behind aliases
>> on the old server to catch and forward list related messages.
>>
>> This brings me to the other issue -- moving your website. Can you
>> elect someone among yourselves to work with me on moving
>> biopython.org entirely over to the new servers? I want to be able to
>> work with someone who knows where everything is and can test things
>> out once the transition is complete.
>>
>> Regards,
>> Chris
>> OBF
>>
>>
>> _______________________________________________
>> Biopython-dev mailing list
>> Biopython-dev at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>>


From mdehoon at c2b2.columbia.edu  Sun Mar 12 19:41:26 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Sun, 12 Mar 2006 14:41:26 -0500
Subject: [Biopython-dev] hey biopython devs ...
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEBB@cgcmail.cgc.cpmc.columbia.edu>

Thanks, Chris.

> I've just completed moving all mailing lists off of the old  
> portal.open-bio.org server. All OBF lists are hosted off of
> "http://lists.open-bio.org" now and the python lists were the last to be  
> moved (a few minutes ago).

Note though that the biopython mailing archive source file was damaged about
a year ago (see this post:
http://www.biopython.org/pipermail/biopython/2005-March/002557.html). I have
bits and pieces of the old mailing archive from which we can probably put
together a full archive, but I don't have the necessary file permissions to
fix this. Is there some way to solve this?

> I understand the Biopython site is based on Quixote, which personally I
> always found rather flummoxing. I see three options:

I don't have a strong opinion about this. I don't know much about Quixote
except for using it a couple of times to make some minor changes to the
Biopython website. If somebody likes Plone or Wiki better, I'm all for it.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


From idoerg at burnham.org  Sun Mar 12 23:20:23 2006
From: idoerg at burnham.org (Iddo Friedberg)
Date: Sun, 12 Mar 2006 15:20:23 -0800
Subject: [Biopython-dev] hey biopython devs ...
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEBB@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEBB@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <4414ACB7.80508@burnham.org>

OK, so if nobody objects, let's go with the wiki option. This will
minimize work for all involved, and from what I've seen the wiki seems 
like an improvement.

I understand there is a time constraint. But if there is any work needed 
on my part, can it keep until Thursday?

thanks,

Iddo


Michiel De Hoon wrote:
> Thanks, Chris.
>
>   
>> I've just completed moving all mailing lists off of the old  
>> portal.open-bio.org server. All OBF lists are hosted off of
>> "http://lists.open-bio.org" now and the python lists were the last to be
>> moved (a few minutes ago).
>>     
>
> Note though that the biopython mailing archive source file was damaged about
> a year ago (see this post:
> http://www.biopython.org/pipermail/biopython/2005-March/002557.html). I have
> bits and pieces of the old mailing archive from which we can probably put
> together a full archive, but I don't have the necessary file permissions to
> fix this. Is there some way to solve this?
>
>   
>> I understand the Biopython site is based on Quixote, which personally I
>> always found rather flummoxing. I see three options:
>>     
>
> I don't have a strong opinion about this. I don't know much about Quixote
> except for using it a couple of times to make some minor changes to the
> Biopython website. If somebody likes Plone or Wiki better, I'm all for it.
>
> --Michiel.
>
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
>
>
>
>
>
>   


-- 

Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: (858) 646 3100 x3516
Fax: (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org


From dag at sonsorol.org  Mon Mar 13 01:20:37 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Sun, 12 Mar 2006 20:20:37 -0500
Subject: [Biopython-dev] hey biopython devs ...
In-Reply-To: <4414ACB7.80508@burnham.org>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEBB@cgcmail.cgc.cpmc.columbia.edu>
	<4414ACB7.80508@burnham.org>
Message-ID: <01850143-9644-4C6F-8570-9A6FB7C1D2CA@sonsorol.org>


Sure!

I'm CC'ing this to Jason as he has set up most of the wiki and blog  
engines on our new system.

If you guys don't mind, we'll set up a wiki (and/or blog if needed)  
similar to what bioperl and biomoby are already using.

We'll set the site up at http://biopython.open-bio.org  for you to  
work on and when things are set we can simply make the switch.

The only real time constraint is our upcoming loss of IP connectivity
at Wyeth for the existing Open Bio web and CVS source code servers. I
really don't want to move your Python-based website code over to the
new server hardware since you are not seriously committed to using
it. I also can't move the old server from Wyeth to our new colo cage,
simply for space and power reasons.

We'll let you know when the Wiki is up. Thanks again!

Regards,
Chris


On Mar 12, 2006, at 6:20 PM, Iddo Friedberg wrote:

> OK, so if nobody objects, let's go with the wiki option. This will
> minimize work for all involved, and from what I've seen the wiki  
> seems like an improvement.
>
> I understand there is a time constraint. But if there is any work  
> needed on my part, can it keep until Thursday?
>
> thanks,
>
> Iddo
>
>
> Michiel De Hoon wrote:
>> Thanks, Chris.
>>
>>
>>> I've just completed moving all mailing lists off of the old   
>>> portal.open-bio.org server. All OBF lists are hosted off of
>>> "http://lists.open-bio.org" now and the python lists were the
>>> last to be  moved (a few minutes ago).
>>>
>>
>> Note though that the biopython mailing archive source file was damaged about
>> a year ago (see this post:
>> http://www.biopython.org/pipermail/biopython/2005-March/002557.html). I have
>> bits and pieces of the old mailing archive from which we can probably put
>> together a full archive, but I don't have the necessary file permissions to
>> fix this. Is there some way to solve this?
>>
>>
>>> I understand the Biopython site is based on Quixote, which personally I
>>> always found rather flummoxing. I see three options:
>>>
>>
>> I don't have a strong opinion about this. I don't know much about Quixote
>> except for using it a couple of times to make some minor changes to the
>> Biopython website. If somebody likes Plone or Wiki better, I'm all for it.
>>
>> --Michiel.
>>
>> Michiel de Hoon
>> Center for Computational Biology and Bioinformatics
>> Columbia University
>> 1150 St Nicholas Avenue
>> New York, NY 10032
>>
>>
>>
>>
>>
>>
>
>
> -- 
>
> Iddo Friedberg, Ph.D.
> Burnham Institute for Medical Research
> 10901 N. Torrey Pines Rd.
> La Jolla, CA 92037
> Tel: (858) 646 3100 x3516
> Fax: (858) 713 9949
> http://iddo-friedberg.org
> http://BioFunctionPrediction.org
>


From dag at sonsorol.org  Tue Mar 14 14:41:34 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Tue, 14 Mar 2006 09:41:34 -0500
Subject: [Biopython-dev] help test the new open-bio anonymous source code
	server
Message-ID: <BD8D0472-3DFC-401C-928A-CBBBE448FC75@sonsorol.org>


Hi folks,

Apologies for the cross-post. I need some help testing out a new
open-bio server.

The new server is http://code.open-bio.org and it has been purpose  
built to replace our existing anonymous CVS server (cvs.open-bio.org,  
etc. etc.)

In addition to anonymous CVS, this new system also offers anonymous  
rsync mirrors of all our source code.

For security reasons (the anonymous CVS pserver protocol is  
considered insecure) we run these anon access methods on a locked  
down machine that only has a read only copy of the codebase. The  
webserver page (except for the viewcvs CGI) actually redirects to a  
wiki entry on a different server so we don't have to maintain a
website on the new box.

The anonymous access repository is currently updated every 30 minutes  
from the main developer system.

Things I need help with:

  - check out http://code.open-bio.org -- does the documentation make  
sense? Can it be made better? If so, change it! (it is a wiki after  
all...)

  - please experiment with anonymous CVS, confirm that you can check  
out code

  - experiment with rsync! this is a new feature for us that we could  
not offer on the old server (due to upstream port ACLs on a core router)

  - provide feedback on the "speed" and bandwidth of the anoncvs  
server, does it seem reasonable?

Let me know what you think. The reason for this move is that one
of our core hosting facilities (datacenter at Wyeth Research) is no  
longer going to be useable by us as they are changing their WAN links  
in such a way that our servers will not be able to directly access  
the internet. We are in the process of moving *every* open-bio.org  
service onto new hardware located in a different datacenter.

Our mailing lists have already been migrated, websites are moving as  
well. Expect some big changes once we tackle the task of moving all  
the developers and the writable CVS repositories to the new  
datacenter - that will happen probably within the next 2 weeks.

Regards,
Chris Dagdigian
OBF


From bugzilla-daemon at portal.open-bio.org  Thu Mar 16 18:00:25 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 16 Mar 2006 13:00:25 -0500
Subject: [Biopython-dev] [Bug 1971] Bio.SeqUtils.GC123 raises
	ZeroDivisionError on lower case sequence
Message-ID: <200603161800.k2GI0P9w008247@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1971


------- Comment #1 from bill at barnard-engineering.com  2006-03-16 13:00 -------
Created an attachment (id=297)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=297&action=view)
This simple patch permits GC123 to handle a lower case sequence

Additionally, I removed a variable named l ("ell") which is only used once in a
range statement where it strongly resembles a 1 ("one"). I replaced l in the
range statement with its value, len(seq).
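
A minimal sketch of the kind of fix described above, assuming the error arose
because only upper-case letters were being counted (the report does not show
the original code). The name gc_by_codon_position is hypothetical and this is
not the Bio.SeqUtils.GC123 source; it only illustrates normalising case and
guarding the division:

```python
def gc_by_codon_position(seq):
    """Return (overall, first, second, third) GC percentages.

    Hypothetical helper, not the Biopython implementation.
    """
    seq = seq.upper()       # normalise case so 'g' and 'c' are counted too
    gc_counts = [0, 0, 0]   # G/C seen at each codon position
    totals = [0, 0, 0]      # A/C/G/T seen at each codon position
    for start in range(0, len(seq), 3):
        for pos, base in enumerate(seq[start:start + 3]):
            if base in "ACGT":
                totals[pos] += 1
                if base in "GC":
                    gc_counts[pos] += 1

    def percent(gc, n):
        # Guard against empty positions instead of dividing by zero
        return 100.0 * gc / n if n else 0.0

    per_position = [percent(g, t) for g, t in zip(gc_counts, totals)]
    overall = percent(sum(gc_counts), sum(totals))
    return (overall, *per_position)
```

With this shape, upper-case and lower-case input give the same result, and an
empty sequence yields zeros rather than an exception.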




From bugzilla-daemon at portal.open-bio.org  Thu Mar 16 17:58:03 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 16 Mar 2006 12:58:03 -0500
Subject: [Biopython-dev] [Bug 1971] New: Bio.SeqUtils.GC123 raises
	ZeroDivisionError on lower case sequence
Message-ID: <200603161758.k2GHw3iT008224@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1971

           Summary: Bio.SeqUtils.GC123 raises ZeroDivisionError on lower
                    case sequence
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: trivial
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: bill at barnard-engineering.com


The GC123 utility will raise ZeroDivisionError: float division if you pass it a
lowercase sequence.




From bugzilla-daemon at portal.open-bio.org  Thu Mar 16 19:35:44 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 16 Mar 2006 14:35:44 -0500
Subject: [Biopython-dev] [Bug 1971] Bio.SeqUtils.GC123 raises
	ZeroDivisionError on lower case sequence
Message-ID: <200603161935.k2GJZig4009442@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1971


mdehoon at ims.u-tokyo.ac.jp changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp  2006-03-16 14:35 -------
Committed to CVS, thanks.




From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 07:28:46 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 02:28:46 -0500
Subject: [Biopython-dev] [Bug 1972] New Prosite comment qualifier breaks
	Bio.Prosite.RecordParser
Message-ID: <200603210728.k2L7Sk87005724@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1972


------- Comment #1 from bill at barnard-engineering.com  2006-03-21 02:28 -------
Created an attachment (id=298)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=298&action=view)
Proposed patch to update the parser

The fix is quite simple: add a clause to Bio.Prosite._RecordConsumer.comment()
to assign the new version qualifier to a new Bio.Prosite.Record data member
named following the existing naming convention.
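
The attached patch is not reproduced in this message. As a rough illustration
only, a qualifier-dispatching comment parser could look like the sketch below.
The function name, the mapping table, and the attribute names are assumptions
made for the example, and it raises ValueError where the real parser reports a
SyntaxError:

```python
# Hypothetical mapping from CC-line qualifiers to record attribute names.
KNOWN_QUALIFIERS = {
    "/TAXO-RANGE": "cc_taxo_range",
    "/MAX-REPEAT": "cc_max_repeat",
    "/SKIP-FLAG": "cc_skip_flag",
    "/VERSION": "cc_version",   # the newly introduced qualifier
}


def parse_comment_line(line):
    """Parse a Prosite-style 'CC' line into {attribute_name: value}."""
    if not line.startswith("CC"):
        raise ValueError("not a comment line: %r" % line)
    result = {}
    # Qualifiers are separated by ';', e.g. "CC   /VERSION=1;"
    for field in line[2:].strip().split(";"):
        field = field.strip()
        if not field:
            continue
        qualifier, _, value = field.partition("=")
        if qualifier not in KNOWN_QUALIFIERS:
            raise ValueError(
                "Unknown qual %r in comment line %r" % (qualifier, line))
        result[KNOWN_QUALIFIERS[qualifier]] = value
    return result
```

Keeping the known qualifiers in one table means a future format change needs a
one-line addition, while anything unrecognised still fails loudly.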

Most of the patch consists of updates to the data files in the Tests directory
that are tested by Tests/test_prosite2.py

I would further recommend removing the following files from the repository
since they are not used in (or refer to) any existing test that I found. These
are all in the Tests/Prosite directory:

README
ps001
ps002
ps003
ps00107.htm
ps00123.htm
ps00123.txt
ps00159.htm
ps00165.htm
ps00213.html
ps00213.txt
ps00432.htm
ps00812.txt
ps01213.txt

The test_prosite2 tests pass before and after applying the patch to the current
CVS tree.

Patch application instructions:

Run from the top of the tree, i.e. in the biopython root directory
% patch -p0 < path_to_patches/biopython-Bio_Prosite_parser_fix.patch




From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 07:07:57 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 02:07:57 -0500
Subject: [Biopython-dev] [Bug 1972] New: New Prosite comment qualifier
	breaks Bio.Prosite.RecordParser
Message-ID: <200603210707.k2L77vsJ005498@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1972

           Summary: New Prosite comment qualifier breaks
                    Bio.Prosite.RecordParser
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: bill at barnard-engineering.com


To reproduce:

import Bio.Prosite
prosite = Bio.Prosite.ExPASyDictionary(parser=Bio.Prosite.RecordParser())
entry = prosite['PS00079']

which raises:

SyntaxError: Unknown qual '/VERSION' in comment line
CC   /VERSION=1;




From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 08:25:42 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 03:25:42 -0500
Subject: [Biopython-dev] [Bug 1972] New Prosite comment qualifier breaks
	Bio.Prosite.RecordParser
Message-ID: <200603210825.k2L8Pg3S006352@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1972


------- Comment #2 from bill at barnard-engineering.com  2006-03-21 03:25 -------
I thought of this after I submitted the patch. Even though the existing test
simply compares the output to the stored output of a previous test, this bug
could be added as a test case such that if an unsupported change occurs to the
Prosite format, an exception will be raised when running the test.

Add the lines to Tests/test_prosite2.py:

expasy_dict = Prosite.ExPASyDictionary(parser=record_parser)
entry = expasy_dict['PS00079']
print entry.accession

or some such, then regenerate the test output file.




From biopython-dev at maubp.freeserve.co.uk  Tue Mar 21 12:17:42 2006
From: biopython-dev at maubp.freeserve.co.uk (Peter (BioPython-dev))
Date: Tue, 21 Mar 2006 12:17:42 +0000
Subject: [Biopython-dev]  "Online" tests, was [Bug 1972]
In-Reply-To: <200603210825.k2L8Pg3S006352@portal.open-bio.org>
References: <200603210825.k2L8Pg3S006352@portal.open-bio.org>
Message-ID: <441FEEE6.8040402@maubp.freeserve.co.uk>

bugzilla-daemon at portal.open-bio.org wrote:
> http://bugzilla.open-bio.org/show_bug.cgi?id=1972
> 
> ------- Comment #2 from bill at barnard-engineering.com  2006-03-21 03:25 -------
> I thought of this after I submitted the patch. Even though the existing test
> simply compares the output to the stored output of a previous test, this bug
> could be added as a test case such that if an unsupported change occurs to the
> Prosite format, an exception will be raised when running the test.
> 
> Add the lines to Tests/test_prosite2.py:
> 
> expasy_dict = Prosite.ExPASyDictionary(parser=record_parser)
> entry = expasy_dict['PS00079']
> print entry.accession
> 
> or some such, then regenerate the test output file.

This change would add a test requiring internet access, right?

I think BioPython has previously avoided doing this...

Perhaps we should have some additional "online tests" for some modules - 
but handled separately so that people can test most of BioPython WITHOUT 
having a network connection (e.g. due to using a standalone machine, 
proxy complications, dialup etc).

I had wondered about this as a general point for the BioPython test 
framework (as there are lots of other modules that can fetch information 
or query online tools - e.g. online blast).

Comments?

Peter


From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 19:26:39 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 14:26:39 -0500
Subject: [Biopython-dev] [Bug 1972] New Prosite comment qualifier breaks
	Bio.Prosite.RecordParser
Message-ID: <200603211926.k2LJQdN1016415@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1972


mdehoon at ims.u-tokyo.ac.jp changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


------- Comment #3 from mdehoon at ims.u-tokyo.ac.jp  2006-03-21 14:26 -------
Fixed in CVS, thanks.
About the test case, I think it is better to avoid requiring internet access in
order to run the tests, at least for the default run suite. See the
Biopython-dev mailing list for further discussion.




From bugzilla-daemon at portal.open-bio.org  Tue Mar 21 20:27:49 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 21 Mar 2006 15:27:49 -0500
Subject: [Biopython-dev] [Bug 1767] Bio/trie.c can crash on Windows
Message-ID: <200603212027.k2LKRnnE017090@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1767


mdehoon at ims.u-tokyo.ac.jp changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         OS/Version|Windows 2000                |All


------- Comment #1 from mdehoon at ims.u-tokyo.ac.jp  2006-03-21 15:27 -------
This problem can be fixed by adding

#ifdef __MINGW32__
#  define strdup _strdup
#endif

near the top of trie.c (after the #include's). This will cause trie.pyd to be
linked to msvcr71.dll only, as it's supposed to be.




From bugzilla-daemon at portal.open-bio.org  Wed Mar 22 11:36:12 2006
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 22 Mar 2006 06:36:12 -0500
Subject: [Biopython-dev] [Bug 1974] New: Missing flex dependency crashes
	build with unclear error message "/usr/bin/ld: cannot find -lfl"
Message-ID: <200603221136.k2MBaC1W030806@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1974

           Summary: Missing flex dependency crashes build with unclear error
                    message "/usr/bin/ld: cannot find -lfl"
           Product: Biopython
           Version: Not Applicable
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: yohell at ifm.liu.se


Biopython depends on flex, which is not mentioned in the docs. 

$ python setup.py build

produces a lot of text and at the end comes:
/usr/bin/ld: cannot find -lfl 
collect2: ld returned 1 exit status 
error: command 'gcc' failed with exit status 1 

A Google search for "biopython lfl" led me to this post:
http://mail.python.org/pipermail/distutils-sig/2004-March/003795.html

After reading this I installed flex using Synaptic (Ubuntu Linux 5.10) and
rebuilt using

$ python setup.py build

with no problem.

The missing flex dependency should be added to the list of required software on
the biopython download page:
http://www.biopython.org/download/

Cheers,
Joel Hedlund




From bill at barnard-engineering.com  Wed Mar 22 16:38:34 2006
From: bill at barnard-engineering.com (Bill Barnard)
Date: Wed, 22 Mar 2006 08:38:34 -0800
Subject: [Biopython-dev] "Online" tests, was [Bug 1972]
In-Reply-To: <441FEEE6.8040402@maubp.freeserve.co.uk>
References: <200603210825.k2L8Pg3S006352@portal.open-bio.org>
	<441FEEE6.8040402@maubp.freeserve.co.uk>
Message-ID: <1143045514.813.26.camel@tioga.barnard-engineering.com>

On Tue, 2006-03-21 at 12:17 +0000, Peter (BioPython-dev) wrote:
> bugzilla-daemon at portal.open-bio.org wrote:
> > http://bugzilla.open-bio.org/show_bug.cgi?id=1972
> > 
> > ------- Comment #2 from bill at barnard-engineering.com  2006-03-21 03:25 -------
> > I thought of this after I submitted the patch. Even though the existing test
> > simply compares the output to the stored output of a previous test, this bug
> > could be added as a test case such that if an unsupported change occurs to the
> > Prosite format, an exception will be raised when running the test.
> > 
> > Add the lines to Tests/test_prosite2.py:
> > 
> > expasy_dict = Prosite.ExPASyDictionary(parser=record_parser)
> > entry = expasy_dict['PS00079']
> > print entry.accession
> > 
> > or some such, then regenerate the test output file.
> 
> This change would add a test requiring internet access, right?

That is correct. The test I suggest belongs elsewhere than in
test_prosite2; see below...

> 
> I think BioPython has previously avoided doing this...

Tests.test_Registry checks on-line retrieval from several remote DBs.

Further investigation shows this:

[billb at tioga Tests]$ grep -l requires_internet *.py
requires_internet.py
test_ais.py
test_EUtils.py
test_HotRand.py
test_Registry.py

> 
> Perhaps we should have some additional "online tests" for some modules - 
> but handled separately so that people can test most of BioPython WITHOUT 
> having a network connection (e.g. due to using a standalone machine, 
> proxy complications, dialup etc).

Perhaps another test case similar to test_Registry should be implemented
with the purpose of detecting DB format changes. I'll be happy to have a
go at it, should you want such a test.

test_Registry appears to check basic connectivity, and that the record
retrieved matches the one requested, but does no parsing other than to
look at the first N bytes from the data returned to see if it contains
the requested ID.

> 
> I had wondered about this as a general point for the BioPython test 
> framework (as there are lots of other modules that can fetch information 
> or query online tools - e.g. online blast).

Any test that needs a remote resource should import requires_internet
and then can fail gracefully if the internet is not accessible.
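
The contents of requires_internet.py are not shown in this thread, so the
following is only a sketch of the general idea. The helper name
internet_available, the base class OnlineTestCase, and the probe host
www.expasy.org are all assumptions made for the example:

```python
import socket
import unittest


def internet_available(host="www.expasy.org", port=80, timeout=5.0,
                       connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) can be opened.

    `connect` is injectable so the check itself can be tested offline.
    """
    try:
        connection = connect((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections, and timeouts
        return False
    connection.close()
    return True


class OnlineTestCase(unittest.TestCase):
    """Base class whose tests are skipped, not failed, when offline."""

    def setUp(self):
        if not internet_available():
            self.skipTest("internet not accessible")
```

A test module that needs a remote resource would subclass OnlineTestCase, so
an offline run reports skips instead of spurious failures.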

This does not address the bandwidth issue for those on dial-up, however
the tests so far don't appear to retrieve very much data. A parser test
could retrieve small records to minimize bandwidth and memory.

As to proxies: I didn't look too hard, but probably most, if not all, of the
remote resources will be accessed through HTTP, which should be no
problem. If HTTP is not accessible, requires_internet handles that case.

Bill


From dag at sonsorol.org  Wed Mar 22 19:47:05 2006
From: dag at sonsorol.org (Chris Dagdigian)
Date: Wed, 22 Mar 2006 14:47:05 -0500
Subject: [Biopython-dev] biopython website has moved to its new home
Message-ID: <706CDF3D-CC40-49FC-ACC9-BD195A519132@sonsorol.org>


I still can't believe I tried this, even harder to believe it  
actually worked ...

Your python Quixote based website has been moved to its new server
home and IP address. While DNS for biopython.org is still updating
you can confirm the new site is online by surfing over to
http://testsite.biopython.org

Considering I don't know python, have never used quixote, and did not
realize until far into the migration project that many critical bits
of python required for the website to render under the application
server are actually part of your biopython CVS repository, it is a
miracle that this worked at all. heh.

Anyway, you guys were the last major website needing to be moved (in  
addition to bugzilla.open-bio.org which also moved today).

This means that you are continuing to use the "old" site -- you can
use this as long as you like. Your new wiki based site at
biopython.open-bio.org is still available and if you choose to migrate
into that new setup just let me know (or support at open-bio.org) when
you want the switchover to occur.

Please keep an eye on biopython.org for me, especially after it  
starts returning 207.154.17.70 as the resolved IP address. Let me  
know if there are any issues.


Regards,
Chris


From biopython-dev at maubp.freeserve.co.uk  Thu Mar 23 08:50:18 2006
From: biopython-dev at maubp.freeserve.co.uk (Peter (BioPython-dev))
Date: Thu, 23 Mar 2006 08:50:18 +0000
Subject: [Biopython-dev] biopython website has moved to its new home
In-Reply-To: <706CDF3D-CC40-49FC-ACC9-BD195A519132@sonsorol.org>
References: <706CDF3D-CC40-49FC-ACC9-BD195A519132@sonsorol.org>
Message-ID: <4422614A.4060508@maubp.freeserve.co.uk>

Chris Dagdigian wrote:
> Please keep an eye on biopython.org for me, especially after it  
> starts returning 207.154.17.70 as the resolved IP address. Let me  
> know if there are any issues.

Umm, http://207.154.17.70/ goes to a BioMoby page at the moment; at
first glance the content looks identical to http://www.biomoby.org/

Did you mean a different IP address Chris?

Peter


From bill at barnard-engineering.com  Thu Mar 30 20:33:55 2006
From: bill at barnard-engineering.com (Bill Barnard)
Date: Thu, 30 Mar 2006 12:33:55 -0800
Subject: [Biopython-dev] "Online" tests, was [Bug 1972]
In-Reply-To: <442403DE.7090607@biopython.org>
References: <200603210825.k2L8Pg3S006352@portal.open-bio.org>
	<441FEEE6.8040402@maubp.freeserve.co.uk>
	<1143045514.813.26.camel@tioga.barnard-engineering.com>
	<442403DE.7090607@biopython.org>
Message-ID: <1143750836.5736.27.camel@tioga.barnard-engineering.com>

On Fri, 2006-03-24 at 14:36 +0000, Peter wrote:
> Bill Barnard wrote:
> > Perhaps another test case similar to test_Registry should be implemented
> > with the purpose of detecting DB format changes. I'll be happy to have a
> > go at it, should you want such a test.

> Peter wrote:
> I think this is an excellent idea - but it would be good to have an 
> opinion from some of the more seasoned BioPython developers.
> 
> Putting these online tests into separate unit test(s) will make tracing 
> unit test failures simply due to being offline much easier.
> 
> I could probably help out with some of the formats - but I am by no 
> means familiar with them all.

I've made a first cut unit test, tentatively named
test_Parsers_for_newest_formats, which retrieves and parses some small
Prosite, Prodoc, SwissProt, and Medline records. I tried
these types first, based on a quick search of the code tree to see where
there was existing code that makes use of Bio.WWW.

[billb at tioga Bio]$ find . -name "*.py" | xargs grep "Bio\.WWW"

yields (in part)

./Prosite/Prodoc.py:from Bio.WWW import ExPASy
./Prosite/__init__.py:from Bio.WWW import ExPASy
./SwissProt/SProt.py:from Bio.WWW import ExPASy
./PubMed.py:from Bio.WWW import NCBI

./Blast/NCBIWWW.py:    from Bio.WWW import NCBI

The first four are easy to test with code like:

import unittest
from Bio import Prosite

class ExpasyTest(unittest.TestCase):
    """Test that Expasy parsers can read the current database formats
    """
    def setUp(self):
        # Dictionary-style access that fetches and parses records from ExPASy
        self.prosite_dict = Prosite.ExPASyDictionary(
            parser=Prosite.RecordParser())

    def t_read_record(self):
        """Retrieve a Prosite record and parse it
        """
        accession = 'PS00159'
        entry = self.prosite_dict[accession]
        # The parsed record should be the one that was requested
        self.assertEqual(entry.accession, accession)

Testing Blast in the same way doesn't seem sensible to me, and it looks
as though any effort there should be in the XML Parser area, rather than
in the thankless task of parsing HTML. (I suspect that's what you've
already decided.)

> In some cases (e.g. GenBank, Fasta) once the sample file is downloaded 
> there are multiple parsers to be checked (e.g. record and feature parsers).

I'll take a look at more parsers, as I figure out where they are. I will
take the same approach of looking through the code tree for existing
parsers using find/grep. It looks as though there are a fair number
which may be obsolete. I would appreciate any guidance in figuring out
which ones would be most useful to check.

(Is this exercise useful? I was just learning my way around the code
using the on-line course at the Pasteur Institute, and found a minor bug
which I fixed. Since any bug should really be covered by a test as well
as being fixed, I wanted to now add the test. I like cleaning up
problems as I find them, but I may not be doing anything that's of more
than minor utility for Biopython...)

> We should probably produce a streamlined test output file WITHOUT 
> details which are likely to change in later versions of the test file 
> e.g. revisions to genbank files.

Since the test only verifies that the record can be retrieved and parsed, and
that it is the record actually requested, it emits very little output. My last
run emitted:

[billb at tioga Tests]$ python test_Parsers_for_newest_formats.py
Retrieve a Prodoc record and parse it ... ok
Retrieve a Prosite record and parse it ... ok
Retrieve a SwissProt record and parse it into Record format ... WARNING - Ignoring line: DT   20-DEC-2005, integrated into UniProtKB/Swiss-Prot.

WARNING - Ignoring line: DT   07-DEC-2004, sequence version 1.

WARNING - Ignoring line: DT   07-FEB-2006, entry version 10.

ok
Retrieve a SwissProt record and parse it into Sequence format ... ok
Retrieve a PubMed record and parse it ... ok

----------------------------------------------------------------------
Ran 5 tests in 3.085s

OK


Is this what you mean by "streamlined test output"?

> One question is should the test "cache" any downloaded files (say for a 
> day) which would be helpful for anyone trying to debug a particular 
> issue and re-running the online tests?  Or is this just making life too
> complicated?

This could be done, but I doubt I would do it unless it really seemed
useful...
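
For reference, the day-long cache raised in the question above could be
sketched roughly as follows. Everything here is hypothetical: fetch_cached,
its parameters, and the one-file-per-record layout are illustration only and
do not correspond to existing Biopython code:

```python
import os
import time


def fetch_cached(name, fetch, cache_dir, max_age=24 * 60 * 60, now=time.time):
    """Return the text for `name`, calling fetch(name) only when the cached
    copy is missing or older than `max_age` seconds."""
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name)
    # Reuse the cached file while it is still fresh enough
    if os.path.exists(path) and now() - os.path.getmtime(path) < max_age:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    # Otherwise download again and refresh the cache
    text = fetch(name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text
```

The download function is passed in as `fetch`, so the same wrapper would work
for any of the online sources and can be exercised without a network.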

Lazily yours,

Bill


From biopython-dev at maubp.freeserve.co.uk  Fri Mar 31 20:41:18 2006
From: biopython-dev at maubp.freeserve.co.uk (Peter (BioPython Dev))
Date: Fri, 31 Mar 2006 21:41:18 +0100
Subject: [Biopython-dev] "Online" tests, was [Bug 1972]
In-Reply-To: <1143750836.5736.27.camel@tioga.barnard-engineering.com>
References: <200603210825.k2L8Pg3S006352@portal.open-bio.org>	<441FEEE6.8040402@maubp.freeserve.co.uk>	<1143045514.813.26.camel@tioga.barnard-engineering.com>	<442403DE.7090607@biopython.org>
	<1143750836.5736.27.camel@tioga.barnard-engineering.com>
Message-ID: <442D93EE.9040707@maubp.freeserve.co.uk>

Bill Barnard wrote:
> I've made a first cut unit test, tentatively named
> test_Parsers_for_newest_formats, which retrieves and parses some small
> Prosite, Prodoc, SwissProt, and Medline records. I tried
> these types first, based on a quick search of the code tree to see where
> there was existing code that makes use of Bio.WWW.

Sounds good to me.  But not a very snappy name - how about something 
shorter like test_OnlineFormats.py instead?

> Testing Blast in the same way doesn't seem sensible to me, and it looks
> as though any effort there should be in the XML Parser area, rather than
> in the thankless task of parsing HTML. (I suspect that's what you've
> already decided.)

It was decided fairly recently to prioritise XML output for Blast.  The
plain text output is fairly stable, but my impression is that the HTML
was/is a moving target and a thankless job.

I think the Blast test should actually submit a short protein/nucleotide 
sequence known to be in the online database.  Maybe do some basic sanity 
testing, such as checking that it returns at least N results and that the 
best hit has at least a certain score.
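
As a rough sketch of that sanity check (an assumption-laden illustration,
not Biopython's API): suppose the parsed Blast results have been reduced
to a list of (title, score) pairs. The function name and thresholds
below are hypothetical.

```python
def check_blast_hits(hits, min_hits, min_best_score):
    """Return a list of problems found; an empty list means the hits look sane.

    hits is a list of (title, score) pairs, a simplified stand-in for
    whatever the real Blast parser returns.
    """
    problems = []
    if len(hits) < min_hits:
        problems.append(
            "expected at least %d hits, got %d" % (min_hits, len(hits))
        )
    # default=None covers the case where no hits came back at all.
    best = max((score for _title, score in hits), default=None)
    if best is None or best < min_best_score:
        problems.append(
            "best score %r is below the minimum %r" % (best, min_best_score)
        )
    return problems


# Example with made-up hits and thresholds.
good = check_blast_hits([("hit A", 250.0), ("hit B", 80.0)], 2, 200)
bad = check_blast_hits([("hit C", 50.0)], 2, 200)
```

Checking only lower bounds keeps the test tolerant of new sequences being
added to the online database.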

>>In some cases (e.g. GenBank, Fasta) once the sample file is downloaded 
>>there are multiple parsers to be checked (e.g. record and feature parsers).
> 
> I'll take a look at more parsers, as I figure out where they are. I will
> take the same approach of looking through the code tree for existing
> parsers using find/grep. It looks as though there are a fair number
> which may be obsolete. I would appreciate any guidance in figuring out
> which ones would be most useful to check.
> 
> (Is this exercise useful? I was just learning my way around the code
> using the on-line course at the Pasteur Institute, and found a minor bug
> which I fixed. Since any bug should really be covered by a test as well
> as being fixed, I wanted to now add the test. I like cleaning up
> problems as I find them, but I may not be doing anything that's of more
> than minor utility for Biopython...)

Like you, I'm only familiar with a fraction of the Biopython code.

I'll volunteer to add cases for GenBank, Fasta and GEO files.

>>We should probably produce a streamlined test output file WITHOUT 
>>details which are likely to change in later versions of the test file 
>>e.g. revisions to genbank files.
> 
> Since the test only verifies that the record can be retrieved and parsed,
> and that it is the actual record requested, it emits very little output.
> My last run emitted:
> 
> ...
> WARNING - Ignoring line: DT   20-DEC-2005, integrated into UniProtKB/Swiss-Prot.
> ...

Those WARNING lines are my doing; see bug 1946:

http://bugzilla.open-bio.org/show_bug.cgi?id=1946

> Is this what you mean by "streamlined test output"?

Pretty much.

>>One question is should the test "cache" any downloaded files (say for a 
>>day) which would be helpful for anyone trying to debug a particular 
>>issue and re-running the online tests?  Or is this just making life too 
>>complicated?
> 
> This could be done, but I doubt I would do it unless it really seemed
> useful...

OK :)

Peter