From jdiggans at gmail.com Fri Sep 8 15:26:19 2006 From: jdiggans at gmail.com (James Diggans) Date: Fri, 8 Sep 2006 15:26:19 -0400 Subject: [Biopython-dev] Parsing PubMed XML records In-Reply-To: References: Message-ID: Just began a small project to parse records from a few PubMed searches and am using the Bio.PubMed and Bio.Medline packages. The method used in the documentation (once patched according to the link below) seems to use the plain-text Medline format, which doesn't seem to include Affiliation, a field in which I'm interested. The XML parsers *do* include this field in their parse but it doesn't look as if they were ever finished (e.g. NLMMedlineXML.py has a 'Citation' object while PubMed.py uses a 'Record' object; I don't see any hierarchical relationships between the two). Can someone provide a brief overview as to the status of this package? Is the XML interface usable (even if I perhaps have to write a new format)? Regards, James http://lists.open-bio.org/pipermail/biopython-dev/2003-July/001348.html From biopython-dev at maubp.freeserve.co.uk Wed Sep 13 14:23:56 2006 From: biopython-dev at maubp.freeserve.co.uk (Peter (BioPython Dev)) Date: Wed, 13 Sep 2006 19:23:56 +0100 Subject: [Biopython-dev] Fasta.SequenceParser slower on python 2.4 than 2.3 Message-ID: <45084CBC.9080103@maubp.freeserve.co.uk> I've been looking at sequence parsing again, and was a little puzzled to notice that the stock Fasta.SequenceParser (which uses Martel internally) is about three to four times slower on Python 2.4 than on Python 2.3 (on my Windows XP laptop). Has anyone else noticed this? For comparison, SeqIO.FASTA.FastaReader takes about the same time on both versions (maybe even a fraction faster). I've been using rat.protein.faa as a test case, a 22 MB file with approx 36000 entries. The sequences are split into 80 character lines. 
Available here: ftp://ftp.ncbi.nlm.nih.gov/refseq/R_norvegicus/mRNA_Prot/rat.protein.faa.gz On python 2.3.3 the attached script takes about 12s to parse, on python 2.4.3 it takes about 56s. Explicitly caching the file using cStringIO makes no real difference. Using SeqIO.FASTA.FastaReader takes about 10s or 11s (regardless of the version of python). It is possible that this "slow down" is Windows only - I know they switched from MSVC version 6 to version 7 (or something) instead, which may be to blame. Peter -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: simple_no_cache.py Url: http://lists.open-bio.org/pipermail/biopython-dev/attachments/20060913/a69fedfd/attachment.pl From biopython-dev at maubp.freeserve.co.uk Sun Sep 17 07:05:14 2006 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Sun, 17 Sep 2006 12:05:14 +0100 Subject: [Biopython-dev] Bio.GenBank FeatureParser vs RecordParser In-Reply-To: <450C8966.3030106@maubp.freeserve.co.uk> References: <450C8966.3030106@maubp.freeserve.co.uk> Message-ID: <450D2BEA.6040903@maubp.freeserve.co.uk> Peter wrote: > I've been looking at some timings for parsing GenBank files, in > particular FeatureParser vs RecordParser > > The test file I'm using is one of the largest bacterial genomes, the > GenBank file is almost 24MB: > > ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Streptomyces_coelicolor/NC_003888.gbk > > On my nice new desktop: > > RecordParser takes about 5s to return a Bio.GenBank.Record object. > > FeatureParser takes about 45 to 50s to return a SeqRecord object. > > ... > > The other option (which I do plan to look into) is improving the > location parser so that it doesn't cause such a slow down. > I started this thread on the discussion list, but this follow up is probably better off on the development list... 
With the following fairly small change to Bio/GenBank/LocationParser.py the time taken by the FeatureParser is almost halved (from about 45 to 50s down to about 27 or 28s).

Old code:

def scan(input):
    scanner = LocationScanner()
    return scanner.tokenize(input)

def parse(tokens):
    #print "I have", tokens
    parser = LocationParser()
    return parser.parse(tokens)

New code:

_cached_scanner = LocationScanner()
def scan(input):
    return _cached_scanner.tokenize(input)

_cached_parser = LocationParser()
def parse(tokens):
    #print "I have", tokens
    return _cached_parser.parse(tokens)

These two functions are called for every feature by the location method of the _FeatureConsumer class in Bio/GenBank/__init__.py

I checked that test_GenBank and test_GenBankFormat still pass.

My change means the LocationScanner() and LocationParser() objects are created once and then reused, rather than being recreated for each feature.

Alternatively, the _FeatureConsumer could create its own copies of these objects (once) and call them directly instead of using the scan and parse functions. This also works and takes a similar amount of time.

If no one objects, I'll double check this works (and is worthwhile) on my older, slower Windows machine, and check it in at some point next week.
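
The caching idea above can be sketched in isolation. This is a minimal Python 3 illustration, not the Biopython code: ExpensiveScanner, scan_uncached and scan_cached are hypothetical stand-ins for LocationScanner and the two versions of scan(), and the construction counter exists only to make the difference visible.

```python
import re


class ExpensiveScanner:
    """Hypothetical stand-in for LocationScanner (not the Biopython class)."""

    # Counts how many scanner objects have been built so far.
    instances_created = 0

    def __init__(self):
        ExpensiveScanner.instances_created += 1
        # Set-up work that is repeated every time an instance is constructed.
        self._token_re = re.compile(r"\d+|\.\.|[(),]|[a-z]+")

    def tokenize(self, text):
        # Split a location string into numbers, "..", punctuation and names.
        return self._token_re.findall(text)


def scan_uncached(text):
    # Old pattern: a fresh scanner is constructed for every call.
    return ExpensiveScanner().tokenize(text)


# New pattern: one module-level scanner, built once and then reused.
_cached_scanner = ExpensiveScanner()


def scan_cached(text):
    return _cached_scanner.tokenize(text)
```

Calling scan_uncached() three times builds three scanners, while scan_cached() keeps reusing the single module-level instance. The trade-off is shared module-level state, which is only safe if tokenize() keeps no per-call state on the object (an assumption in this sketch).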
Peter From biopython-dev at maubp.freeserve.co.uk Sun Sep 17 18:06:32 2006 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Sun, 17 Sep 2006 23:06:32 +0100 Subject: [Biopython-dev] Bio.GenBank FeatureParser vs RecordParser In-Reply-To: <450D2BEA.6040903@maubp.freeserve.co.uk> References: <450C8966.3030106@maubp.freeserve.co.uk> <450D2BEA.6040903@maubp.freeserve.co.uk> Message-ID: <450DC6E8.5030100@maubp.freeserve.co.uk> Peter wrote: > Peter wrote: >> I've been looking at some timings for parsing GenBank files, in >> particular FeatureParser vs RecordParser >> >> The test file I'm using is one of the largest bacterial genomes, the >> GenBank file is almost 24MB: >> >> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Streptomyces_coelicolor/NC_003888.gbk >> >> On my nice new desktop: >> >> RecordParser takes about 5s to return a Bio.GenBank.Record object. >> >> FeatureParser takes about 45 to 50s to return a SeqRecord object. >> >> ... >> >> The other option (which I do plan to look into) is improving the >> location parser so that it doesn't cause such a slow down. >> > > I started this thread on the discussion list, but this follow up is > probably better off on the development list... > > With the following fairly small change to Bio/GenBank/LocationParser.py > the time taken by the FeatureParser is almost halved (from about 45 to > 50s to about 27 or 28s). 
> > Old code: > > def scan(input): > scanner = LocationScanner() > return scanner.tokenize(input) > > def parse(tokens): > #print "I have", tokens > parser = LocationParser() > return parser.parse(tokens) > > > New code: > > _cached_scanner = LocationScanner() > def scan(input): > return _cached_scanner.tokenize(input) > > _cached_parser = LocationParser() > def parse(tokens): > #print "I have", tokens > return _cached_parser.parse(tokens) > > > These two functions are called for every feature by the location method > of the _FeatureConsumer class in Bio/GenBank/__init__.py > > I checked that test_GenBank and test_GenBankFormat still pass. > > My change means the LocationScanner() and LocationParser() objects are > created once and then reused - rather than being recreated for each feature. > > Alternatively, the _FeatureConsumer could create its own copies of these > objects (once) and call them directly instead of using the scan and > parse functions. This also works and takes a similar amount of time. > > If no one objects, I'll double check this works (and is worthwhile) on > my older slower windows machine, and check it in at some point next week. I still plan to check in the above fairly minor change. I've also looked deeper, and I have tweaked LocationParser.py to handle the typical (exact) cases using regular expressions as special cases (falling back on the existing spark parser otherwise): "123..456" "function(123..456)" e.g. "complement(123..456)" The above are enough for most bacteria, I then added: "function(123..456,789..1066,1999..2006)" to cover joins, and: "function(function(123..456,789..1066,1999..2006))" to cover the complement of joins for non-bacteria. With this in place the parsing time for the large example falls from about 27s to about 7s (compared to the 45s or more taken by the CVS edition of the parser). I'm not ready to check in this hybrid regular expressions/spark parser, as I think it could be done more cleanly... 
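
The hybrid approach described above (regular expressions for the common exact cases, with a fallback to the full parser) might look roughly like the following. This is a minimal Python 3 sketch under stated assumptions, not the code that was under discussion: parse_location, the fallback argument and the (function names, ranges) return shape are all hypothetical choices made for illustration.

```python
import re

# One exact range such as "123..456".
_RANGE = r"\d+\.\.\d+"

# "123..456"
_SIMPLE_RE = re.compile(r"^(\d+)\.\.(\d+)$")
# "function(123..456)" or "function(123..456,789..1066,...)"
_FUNC_RE = re.compile(r"^(\w+)\((%s(?:,%s)*)\)$" % (_RANGE, _RANGE))
# "function(function(123..456,789..1066,...))"
_NESTED_RE = re.compile(r"^(\w+)\((\w+)\((%s(?:,%s)*)\)\)$" % (_RANGE, _RANGE))


def _ranges(text):
    # Turn "1..2,3..4" into [(1, 2), (3, 4)].
    return [tuple(int(n) for n in part.split("..")) for part in text.split(",")]


def parse_location(text, fallback):
    """Return (function_names, ranges) for the common exact cases.

    Anything the regular expressions do not recognise (fuzzy positions,
    references to other records, and so on) is handed to `fallback`, which
    stands in for the existing slower general-purpose parser.
    """
    match = _SIMPLE_RE.match(text)
    if match:
        return ([], [(int(match.group(1)), int(match.group(2)))])
    match = _FUNC_RE.match(text)
    if match:
        return ([match.group(1)], _ranges(match.group(2)))
    match = _NESTED_RE.match(text)
    if match:
        return ([match.group(1), match.group(2)], _ranges(match.group(3)))
    return fallback(text)
```

The fast paths are tried in order of how often they are expected to occur, so a typical bacterial feature is resolved by the first or second pattern and never reaches the general parser.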
Peter From idoerg at burnham.org Mon Sep 18 18:09:08 2006 From: idoerg at burnham.org (Iddo Friedberg) Date: Mon, 18 Sep 2006 15:09:08 -0700 Subject: [Biopython-dev] Biopython for Ubuntu Message-ID: <450F1904.3070601@burnham.org> Apparently we have a Debian / Ubuntu package for Biopython. If there was an announcement here then I am sorry, but it went past me. Anyhow, thanks very much to Philipp Benner for creating the Ubuntu package. Currently Biopython 1.41, and you need to add the universe repository to get it. It's in the universe/python section. I'll add something to the Wiki Iddo -- Iddo Friedberg, Ph.D. Burnham Institute for Medical Research 10901 N. Torrey Pines Rd. La Jolla, CA 92037, USA T: +1 858 646 3100 x3516 http://iddo-friedberg.org http://BioFunctionPrediction.org From bugzilla-daemon at portal.open-bio.org Mon Sep 25 10:53:40 2006 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Sep 2006 10:53:40 -0400 Subject: [Biopython-dev] [Bug 2076] EMBL to GenBank converter should fix unterminated lines In-Reply-To: Message-ID: <200609251453.k8PEredO017998@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2076 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |INVALID -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Sep 25 11:02:10 2006 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Sep 2006 11:02:10 -0400 Subject: [Biopython-dev] [Bug 2035] fast/approximate clustalw parameter set incorrectly In-Reply-To: Message-ID: <200609251502.k8PF2A1j018510@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2035 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2006-09-25 11:02 ------- Fix checked in, revision 1.14 http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/Clustalw/__init__.py?cvsroot=biopython -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From kvddrift at earthlink.net Tue Sep 26 14:32:56 2006 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 26 Sep 2006 14:32:56 -0400 (GMT-04:00) Subject: [Biopython-dev] biopython instructions for Mac OS X Message-ID: <19508413.1159295576644.JavaMail.root@elwamui-wigeon.atl.sa.earthlink.net> Hi, I was reading your wiki page and noticed that the instructions for installing biopython on Mac OS X are quite elaborate. I would like to bring to your attention that it is very easy to install the package using the fink package manager (similar to debian, see also http://fink.sf.net). Fink will take care of getting the source tarballs and installing all additional packages needed for biopython. If you would like to add this to your wiki page, I can write a few sentences for this. Also, does the most recent version of biopython work with python 2.5? thanks, - Koen. 
From mdehoon at c2b2.columbia.edu Tue Sep 26 21:02:58 2006 From: mdehoon at c2b2.columbia.edu (Michiel de Hoon) Date: Tue, 26 Sep 2006 21:02:58 -0400 Subject: [Biopython-dev] biopython instructions for Mac OS X In-Reply-To: <19508413.1159295576644.JavaMail.root@elwamui-wigeon.atl.sa.earthlink.net> References: <19508413.1159295576644.JavaMail.root@elwamui-wigeon.atl.sa.earthlink.net> Message-ID: <4519CDC2.3080805@c2b2.columbia.edu> Koen van der Drift wrote: > If you would like to add this to your wiki page, I can write a few > sentences for this. You can make an account to be able to edit the wiki page by going to "Log in / create account" at the top of the biopython home page. Let me know if this doesn't work for you. > Also, does the most recent version of biopython work with python 2.5? Yes, as far as I can tell. At least I didn't experience any problems with Biopython with python 2.5 on Cygwin or Mac OS X. Some deprecation warnings (which should be fixed for the next release), but nothing serious. --Michiel. From kvddrift at earthlink.net Tue Sep 26 22:25:55 2006 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 26 Sep 2006 22:25:55 -0400 Subject: [Biopython-dev] biopython instructions for Mac OS X In-Reply-To: <4519CDC2.3080805@c2b2.columbia.edu> References: <19508413.1159295576644.JavaMail.root@elwamui-wigeon.atl.sa.earthlink.net> <4519CDC2.3080805@c2b2.columbia.edu> Message-ID: <8B1F1014-DF4E-4F0A-A6E7-5172C6404FF1@earthlink.net> On Sep 26, 2006, at 9:02 PM, Michiel de Hoon wrote: > You can make an account to be able to edit the wiki page by going > to "Log in / create account" at the top of the biopython home page. > Let me know if this doesn't work for you. Thanks, I was able to create an account. However, I just noticed that the install instructions are only linked from the wiki page, and are on an external HTML document created by Brad Chapman. I will email him and ask him to update the instructions. 
FYI, I was also able to build biopython 1.42 with python 2.5 on Mac OS X. - Koen. From mdehoon at c2b2.columbia.edu Tue Sep 26 22:55:50 2006 From: mdehoon at c2b2.columbia.edu (Michiel de Hoon) Date: Tue, 26 Sep 2006 22:55:50 -0400 Subject: [Biopython-dev] biopython instructions for Mac OS X In-Reply-To: <8B1F1014-DF4E-4F0A-A6E7-5172C6404FF1@earthlink.net> References: <19508413.1159295576644.JavaMail.root@elwamui-wigeon.atl.sa.earthlink.net> <4519CDC2.3080805@c2b2.columbia.edu> <8B1F1014-DF4E-4F0A-A6E7-5172C6404FF1@earthlink.net> Message-ID: <4519E836.3030204@c2b2.columbia.edu> Koen van der Drift wrote: > Thanks, I was able to create an account. However, I just noticed that > the install instructions are only linked from the wiki page, and are on > an external HTML document created by Brad Chapman. I will email him and > ask him to update the instructions. I'm not sure if Brad is still very actively involved with Biopython (sorry Brad if this statement is incorrect). But, we can also help you with fixing the install instructions. The TeX source for this is in CVS; you can access it here: http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/install/?cvsroot=biopython The simplest thing may be to add your instructions to the .tex file, send me (not the mailing list) the result, and then I'll upload the new tex, pdf, html to CVS and the website. It makes sense though for these installation instructions to be part of the wiki. So: Who prefers the current tex/pdf/html form to having a wiki for the installation instructions? --Michiel. 
From lpritc at scri.sari.ac.uk Wed Sep 27 04:36:40 2006 From: lpritc at scri.sari.ac.uk (Leighton Pritchard) Date: Wed, 27 Sep 2006 09:36:40 +0100 Subject: [Biopython-dev] biopython instructions for Mac OS X In-Reply-To: <19508413.1159295576644.JavaMail.root@elwamui-wigeon.atl.sa.earthlink.net> References: <19508413.1159295576644.JavaMail.root@elwamui-wigeon.atl.sa.earthlink.net> Message-ID: <1159346200.4794.38.camel@lplinuxdev> Hi Koen, I notice you're maintaining the fink distribution of biopython ;) Thanks for doing that. On Tue, 2006-09-26 at 14:32 -0400, Koen van der Drift wrote: > Also, does the most recent version of biopython work with python 2.5? It works for me with Python2.5 on OS X, as do all the dependencies. (Getting matplotlib/pylab installed correctly with 2.5 was a much more involved matter, but that story belongs on a different mailing list.) L. -- Dr Leighton Pritchard AMRSC D131, Plant-Pathogen Interactions, Scottish Crop Research Institute Invergowrie, Dundee, Scotland, DD2 5DA, UK T: +44 (0)1382 562731 x2405 F: +44 (0)1382 568578 E: lpritc at scri.sari.ac.uk W: http://bioinf.scri.sari.ac.uk/lp GPG/PGP: FEFC205C E58BA41B http://www.keyserver.net (If the signature does not verify, please remove the SCRI disclaimer) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.sari.ac.uk quoting the name of the sender and delete the email from your system. 
Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any).