From hubin.keio at gmail.com Thu Feb 2 09:35:50 2006
From: hubin.keio at gmail.com (Bin Hu)
Date: Thu Feb 2 09:38:37 2006
Subject: [BioPython] Bug in Bio.SeqUtils ?
Message-ID: <71dea9850602020635u37a7294dv15911521deeee656@mail.gmail.com>

Hi,

When using Bio.SeqUtils to estimate the isoelectric point for PDB entry 1a8y, the function isoelectric_point() never seems to return, although it worked well for all the other entries that I've tested. Could this be a bug in Bio.SeqUtils?

If anyone wants to test it, below is the sequence of 1a8y:

eegldfpeydgvdrvinvnaknyknvfkkyevlallyheppeddkasqrqfemeelilel
aaqvledkgvgfglvdsekdaavakklglteedsiyvfkedevieydgefsadtlvefll
dvledpveliegerelqafeniedeikligyfknkdsehykafkeaaeefhpyipffatf
dskvakkltlklneidfyeafmeepvtipdkpnseeeivnfveehrrstlrklkpesmye
tweddmdgihivafaeeadpdgyefleilksvaqdntdnpdlsiiwidpddfpllvpywe
ktfdidlsapqigvvnvtdadsvwmemddeedlpsaeeledwledvlegeintedddded
ddddddd

For PDB entry 1rb9, the hydrophilicity of the protein cannot be estimated because its sequence starts with "X", which is not in the key list used by SeqUtils. It produces the following error message:

Traceback (most recent call last):
  File "./dataGen.py", line 62, in ?
    aHydrophilicityList = aSeqObj.protein_scale(ProtParamData.hw, 5)
  File "/usr/lib/python2.4/site-packages/Bio/SeqUtils/ProtParam.py", line 206, in protein_scale
    score += weight[j] * ParamDict[subsequence[j]] + weight[j] * ParamDict[subsequence[Window-j-1]]
KeyError: 'X'

Although I can delete the "X" from this protein, could the author emit a warning and work around the error instead of stopping? Thank you.

Bin

From idoerg at burnham.org Thu Feb 2 13:35:17 2006
From: idoerg at burnham.org (Iddo Friedberg)
Date: Thu Feb 2 13:31:22 2006
Subject: [BioPython] Bug in Bio.SeqUtils ?
In-Reply-To: <71dea9850602020635u37a7294dv15911521deeee656@mail.gmail.com>
References: <71dea9850602020635u37a7294dv15911521deeee656@mail.gmail.com>
Message-ID: <43E250E5.6030901@burnham.org>

Which version are you using?
I tried the 1a8y sequence which you gave, and also a sequence with an 'X', and they worked fine for me. CVS version.

# seq is a Record object. seq.sequence is a string with the protein sequence
>>> from Bio.SeqUtils import ProtParam
>>> ps = ProtParam.ProteinAnalysis(seq.sequence)
>>> ps.isoelectric_point()
3.9298931884765151

# and for a sequence with an 'x'
>>> ps2 = ProtParam.ProteinAnalysis('xsdfgvcrtyip')
>>> ps2.isoelectric_point()
5.8285980224609375

Bin Hu wrote:

>Hi,
>
>When using Bio.SeqUtils to estimate isoelectric point for PDB entry 1a8y, it
>seems the function isoelectric_point() cannot reach an end, although it
>worked pretty well for all the other entries that I've tested. Could this be
>a bug in Bio.SeqUtils?
>
>If anyone want to test it, blow is the sequence of 1a8y:
>
>eegldfpeydgvdrvinvnaknyknvfkkyevlallyheppeddkasqrqfemeelilel
>aaqvledkgvgfglvdsekdaavakklglteedsiyvfkedevieydgefsadtlvefll
>dvledpveliegerelqafeniedeikligyfknkdsehykafkeaaeefhpyipffatf
>dskvakkltlklneidfyeafmeepvtipdkpnseeeivnfveehrrstlrklkpesmye
>tweddmdgihivafaeeadpdgyefleilksvaqdntdnpdlsiiwidpddfpllvpywe
>ktfdidlsapqigvvnvtdadsvwmemddeedlpsaeeledwledvlegeintedddded
>ddddddd
>
>For PDB entry 1rb9, the hydrophilicity of this protein cannot be estimated
>because its sequence starts with "X", which is not in the key list used by
>SeqUtils. It will bring the following error message:
>
>Traceback (most recent call last):
>  File "./dataGen.py", line 62, in ?
>    aHydrophilicityList = aSeqObj.protein_scale(ProtParamData.hw, 5)
>  File "/usr/lib/python2.4/site-packages/Bio/SeqUtils/ProtParam.py", line 206, in protein_scale
>    score += weight[j] * ParamDict[subsequence[j]] + weight[j] * ParamDict[subsequence[Window-j-1]]
>KeyError: 'X'
>
>Although I can delete the "X" in this protein, could the author implement a
>warning message and work around this error stop? Thank you.
>
>Bin
>
>_______________________________________________
>BioPython mailing list - BioPython@biopython.org
>http://biopython.org/mailman/listinfo/biopython
>

-- 
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org

From idoerg at burnham.org Thu Feb 2 16:14:57 2006
From: idoerg at burnham.org (Iddo Friedberg)
Date: Thu Feb 2 16:11:02 2006
Subject: [BioPython] Bug in Bio.SeqUtils ?
In-Reply-To: <43E250E5.6030901@burnham.org>
References: <71dea9850602020635u37a7294dv15911521deeee656@mail.gmail.com>
 <43E250E5.6030901@burnham.org>
Message-ID: <43E27651.3010504@burnham.org>

Oh, sorry.

Your second problem was with protein_scale, which does indeed break on any letter not of the 20 regular amino acids.

I inserted this into a try/except clause which produces a warning to stderr, instead of raising an exception. It is now in CVS.

Yair, is that OK, or would we rather leave the exception raising bit there? There are arguments either way...

./I

Iddo Friedberg wrote:

> Which version are you using? I tried the 1a8y sequence which you gave,
> and also a sequence with an 'X', and they worked fine for me. CVS
> version.
>
> # seq is a Record object. seq.sequence is a string with the protein
> sequence
>
> >>> from Bio.SeqUtils import ProtParam
> >>> ps = ProtParam.ProteinAnalysis(seq.sequence)
> >>> ps.isoelectric_point()
> 3.9298931884765151
>
>
> # and for a sequence with an 'x'
> >>> ps2 = ProtParam.ProteinAnalysis('xsdfgvcrtyip')
> >>> ps2.isoelectric_point()
> 5.8285980224609375
>
> Bin Hu wrote:
>
>> Hi,
>>
>> When using Bio.SeqUtils to estimate isoelectric point for PDB entry
>> 1a8y, it seems the function isoelectric_point() cannot reach an end,
>> although it worked pretty well for all the other entries that I've
>> tested. Could this be a bug in Bio.SeqUtils?

-- 
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037 USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 713 9949
http://iddo-friedberg.org
http://BioFunctionPrediction.org

From hubin.keio at gmail.com Fri Feb 3 09:27:10 2006
From: hubin.keio at gmail.com (Bin Hu)
Date: Fri Feb 3 09:23:16 2006
Subject: [BioPython] Bug in Bio.SeqUtils ?
In-Reply-To: <43E27651.3010504@burnham.org>
References: <71dea9850602020635u37a7294dv15911521deeee656@mail.gmail.com>
 <43E250E5.6030901@burnham.org> <43E27651.3010504@burnham.org>
Message-ID: <71dea9850602030627g7452bc7rcb86e097f9328cf6@mail.gmail.com>

Thank you for your reply. I am using Python 2.4 and BioPython 1.41 (installed from source). I will check the CVS version when I get some time.

Regards,
Bin

On 2/3/06, Iddo Friedberg wrote:
> Oh, sorry.
>
> Your second problem was with protein_scale, which does indeed break on
> any letter not of the 20 regular amino acids.
>
> I inserted this into a try/except clause which produces a warning to
> stderr, instead of raising an exception. It is now in CVS.
>
> Yair, is that OK, or would we rather leave the exception raising bit
> there? There are arguments either way...
>
>
> ./I
>
>
> Iddo Friedberg wrote:
>
> > Which version are you using? I tried the 1a8y sequence which you gave,
> > and also a sequence with an 'X', and they worked fine for me. CVS
> > version.
> >
> > # seq is a Record object. seq.sequence is a string with the protein
> > sequence
> >
> > >>> from Bio.SeqUtils import ProtParam
> > >>> ps = ProtParam.ProteinAnalysis(seq.sequence)
> > >>> ps.isoelectric_point()
> > 3.9298931884765151
> >
> >
> > # and for a sequence with an 'x'
> > >>> ps2 = ProtParam.ProteinAnalysis('xsdfgvcrtyip')
> > >>> ps2.isoelectric_point()
> > 5.8285980224609375
> >
> > Bin Hu wrote:
> >
> >> Hi,
> >>
> >> When using Bio.SeqUtils to estimate isoelectric point for PDB entry
> >> 1a8y, it seems the function isoelectric_point() cannot reach an end,
> >> although it worked pretty well for all the other entries that I've
> >> tested. Could this be a bug in Bio.SeqUtils?
>
> --
> Iddo Friedberg, Ph.D.
> Burnham Institute for Medical Research
> 10901 N. Torrey Pines Rd.
> La Jolla, CA 92037 USA
> Tel: +1 (858) 646 3100 x3516
> Fax: +1 (858) 713 9949
> http://iddo-friedberg.org
> http://BioFunctionPrediction.org

From omid9dr18 at hotmail.com Sat Feb 4 16:24:30 2006
From: omid9dr18 at hotmail.com (Omid Khalouei)
Date: Sat Feb 4 16:37:34 2006
Subject: [BioPython] PDBParser looking for Numeric module
Message-ID:

Hello,

I am trying to use the PDBParser, but I get the following error message:

>>> from Bio.PDB.PDBParser import PDBParser

Traceback (most recent call last):
  File "", line 1, in -toplevel-
    from Bio.PDB.PDBParser import PDBParser
  File "C:\Python24\Lib\site-packages\Bio\PDB\__init__.py", line 10, in -toplevel-
    from PDBParser import PDBParser
  File "C:\Python24\Lib\site-packages\Bio\PDB\PDBParser.py", line 10, in -toplevel-
    from Numeric import array, Float0
ImportError: No module named Numeric

The only "numeric" module I found in my folders is in the "numarray" folder in site-packages, but its first letter is in lower case. So I'm not sure which Numeric module it is looking for. Am I missing something?

Thanks for your help,
Omid K.

From biopython at maubp.freeserve.co.uk Sun Feb 5 06:29:47 2006
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Sun Feb 5 07:44:59 2006
Subject: [BioPython] PDBParser looking for Numeric module
In-Reply-To:
References:
Message-ID: <43E5E1AB.7000801@maubp.freeserve.co.uk>

Omid Khalouei wrote:
> I am trying to use the PDBParser, but I get the following error message:
>
>>>> from Bio.PDB.PDBParser import PDBParser
> ...
> ImportError: No module named Numeric
>
> The only "numeric" module I found in my folders is in the "numarray"
> folder in site-packages, but the first letter is in lower case. So I'm
> not sure which Numeric module it is looking for. Am I missing somehting?

You need to install Numeric python to get the numeric library.

http://sourceforge.net/projects/numpy/
http://numeric.scipy.org/

The numeric libraries have been around for a long time (reaching version 24.2) and are well tested.

However, its developers are moving to numpy instead - at some point BioPython will eventually need to convert its code to the replacement library.

numarray is yet another array library.

Peter

From mdehoon at c2b2.columbia.edu Sun Feb 5 10:55:40 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Sun Feb 5 10:58:42 2006
Subject: [BioPython] PDBParser looking for Numeric module
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE43@cgcmail.cgc.cpmc.columbia.edu>

Use this link:
> http://sourceforge.net/projects/numpy/

Don't use this one:
> http://numeric.scipy.org/

> The numeric libraries have been around for a long time (reaching version
> 24.2) and are well tested.

> However, its developers are moving to numpy instead - at some point
> BioPython will eventually need to convert its code to the replacement
> library.

Note that currently the Numerical Python maintainers are the same people as the SciPy developers -- the new numpy module was originally called scipy-core; one of its goals was to force people to start using SciPy by removing parts of Numerical Python, such as LinearAlgebra and FFT. Note also that the documentation for numpy is not free. Given the success of good old Numerical Python, I doubt that numpy will replace Numerical Python any time soon, if ever. Therefore, I don't think that Biopython should convert to numpy unless most other Python packages relevant to computational biology do so.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

From omid9dr18 at hotmail.com Mon Feb 6 21:50:41 2006
From: omid9dr18 at hotmail.com (Omid Khalouei)
Date: Mon Feb 6 22:03:31 2006
Subject: [BioPython] Locating setup.py
In-Reply-To:
Message-ID:

Hello,

Could you please help me locate the setup.py file used to compile Biopython on Windows? I have to modify it a bit, but when I search for setup.py almost 15 of them are returned in different folders.

Thanks for your help

From solberg at berkeley.edu Tue Feb 7 00:02:39 2006
From: solberg at berkeley.edu (Owen Solberg)
Date: Mon Feb 6 23:58:28 2006
Subject: [BioPython] BioPython "location->feature" lookup function for GenBank files?
Message-ID:

Does anyone know if there is a way to look up what features lie within a sequence range of a GenBank file, using biopython?

Specifically, I have a GenBank record for an entire bacterial genome, and I want to be able to ask, "What features, if any, are annotated between bases 110,000 and 120,000?" Or "What gene, if any, includes position 534,213?" And so on.

This seems like a kind of 'reverse lookup', since most tools seem geared toward iterating through features and then returning the qualifiers for each feature. Am I really going to have to write my own reverse lookup to do this? Haven't other people already written this kind of code?

Thanks in advance for your help!

Owen

From biopython at maubp.freeserve.co.uk Tue Feb 7 05:27:12 2006
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue Feb 7 06:05:07 2006
Subject: [BioPython] Locating setup.py
In-Reply-To:
References:
Message-ID: <43E87600.6090004@maubp.freeserve.co.uk>

Omid Khalouei wrote:
> Hello,
>
> Could you please help me out locate the setup.py file used to compile
> the Biopython on Windows. I have to modify it a bit, but when I search
> for setup.py almost 15 of them are returned in different folders.
>
> Thanks for your help

It's this file in CVS:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/setup.py?cvsroot=biopython

If you get the BioPython 1.41 ZIP file:

http://www.biopython.org/files/biopython-1.41.zip

If you unzip this to C:\temp then the setup.py you need to run should be C:\temp\biopython-141\setup.py

If you go to C:\temp\biopython-141 at the command prompt, then:

python setup.py build

And if that works, to install on your machine:

python setup.py install

Or to build a Windows installation setup EXE program:

python setup.py bdist_wininst

Good luck - and try checking the mailing list archives if you get stuck.

I've just tried this with Python 2.3 and MSVC 6.0 (Microsoft Visual C++) and it fails to build the object file:

build\temp.win32-2.3\Release\Bio/PDB/mmCIF/lex.yy.obj

I found that if I comment out the Bio.PDB.mmCIF.MMCIFlex extension module in setup.py (lines 432 to 437) then it seems to build fine.

Peter

P.S. If python.exe isn't on your path, you may need to do this:

c:\python23\python.exe setup.py build

From rhaygood at duke.edu Tue Feb 7 12:10:11 2006
From: rhaygood at duke.edu (Ralph Haygood)
Date: Tue Feb 7 13:04:47 2006
Subject: [BioPython] PopGen module for Biopython?
Message-ID:

Fellow Biopythoneers,

I'm a population geneticist (currently working at Duke University). Since last spring, I've written a Python module for computing a variety of population-genetic statistics from DNA sequences, including popular favorites such as Tajima's D, Fu and Li's D, and Fay and Wu's H. It can compute statistics for a whole alignment, or it can slide a window along an alignment, or it can compare a pair of congruent alignments (e.g., transcription factor binding sites versus other sites in a cis-regulatory region). It runs under Biopython, in that it works on Bio.Align.Generic.Alignment objects.

I've used this module extensively (it has contributed to two manuscripts currently making their ways toward publication).
Where possible, I've compared its output with that of DnaSP, a widely used Windows application for population-genetic analyses (however, I wrote the module mostly to do things that can't be done with DnaSP or any other canned program I know of). So I'm confident it's largely correct.

Now I'm wondering whether there would be interest in adding it to Biopython. Bioperl has a module for population-genetic analyses (written by my colleague Jason Stajich). I think it would be nice for Biopython to have one too.

Before my code could be added, I would need to spend a little time on stylistic modifications and more on documentation. I'm willing to spend the time if there would be interest.

There are no intellectual property obstructions. Distribution of my code under the Biopython License Agreement would be fine.

Ralph Haygood

From omid9dr18 at hotmail.com Wed Feb 8 01:54:40 2006
From: omid9dr18 at hotmail.com (Omid Khalouei)
Date: Wed Feb 8 02:07:37 2006
Subject: [BioPython] Locating setup.py
In-Reply-To: <43E87600.6090004@maubp.freeserve.co.uk>
Message-ID:

Yes I just checked the link that you sent:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/setup.py?cvsroot=biopython

and that's exactly the setup.py that I'm looking for. I'll have to uncomment the lines related to _CKDTree in order to make my NeighborSearch function functional.

But I still can't find this setup.py, could the reason be that I downloaded the biopython using the Windows Installer and so it doesn't contain the compiler?!

Thanks for your help.

From biopython at maubp.freeserve.co.uk Wed Feb 8 05:03:33 2006
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed Feb 8 05:03:52 2006
Subject: [BioPython] Locating setup.py
In-Reply-To:
References:
Message-ID: <43E9C1F5.4020407@maubp.freeserve.co.uk>

> But I still can't find this setup.py, could the reason be that I
> downloaded the biopython using the Windows Installer and so it doesn't
> contain the compiler?!

OK, I understand now.

If you get the BioPython 1.41 ZIP file (or the TAR.GZ file) they contain all the source code (both Python and C), plus the build files like setup.py, and the biopython tests:

http://www.biopython.org/files/biopython-1.41.zip

If you download the Windows installation program, you ONLY get the bare minimum. You don't get the C source to anything (you get pre-compiled binaries instead), you don't get the setup.py (because you don't need it), and you also don't get the test cases.

As you want to compile BioPython, you should download the ZIP file.

Most Windows people just want to use BioPython, and don't even have a compiler. They should use the Windows Installer program.

I hope that explains things.

Peter

From alexl at users.sourceforge.net Wed Feb 8 20:56:27 2006
From: alexl at users.sourceforge.net (Alex Lancaster)
Date: Wed Feb 8 21:18:01 2006
Subject: [BioPython] PopGen module for Biopython?
In-Reply-To: (Ralph Haygood's message of "Tue, 7 Feb 2006 12:10:11 -0500 (EST)")
References:
Message-ID:

>>>>> "RH" == Ralph Haygood writes:

RH> Fellow Biopythoneers, I'm a population geneticist (currently
RH> working at Duke University). Since last spring, I've written a
RH> Python module for computing a variety of population-genetic
RH> statistics from DNA sequences, including popular favorites such as
RH> Tajima's D, Fu and Li's D, and Fay and Wu's H. It can compute
RH> statistics for a whole alignment, or it can slide a window along
RH> an alignment, or it can compare a pair of congruent alignments
RH> (e.g., transcription factor binding sites versus other sites in a
RH> cis-regulatory region). It runs under Biopython, in that it works
RH> on Bio.Align.Generic.Alignment objects.

Ralph,

On related population genetic tools in Python, I have a project that uses Python, but is not yet integrated to use biopython, but I would like to explore ways that it could do so.

PyPop: Python for Population Genetics:

http://www.pypop.org/

It currently does analyses such as (1) conformity to Hardy-Weinberg expectations; (2) tests for balancing or directional selection; (3) estimates of haplotype frequencies (and their distributions) and measures and tests of significance for linkage disequilibrium (LD). It's licensed using the GNU GPL.

It would be nice to integrate these tools, or somehow be able to access them via biopython. Since PyPop doesn't do the tests that you are proposing, it would also be nice to integrate it with your tools. (Some of the tests mentioned above are computationally intensive and are therefore written in C, and are accessed from Python using SWIG.)

Way back in 2001, I posted to this list a query about standardising input files for population data, but I guess that there weren't that many pop. gen folks using biopython back then!

http://portal.open-bio.org/pipermail/biopython/2001-September/000723.html

Anyway, thoughts on this would be appreciated.

Alex
-- 
Alex Lancaster | Dept. of Integrative Biology, UC Berkeley: ib.berkeley.edu

From idoerg at burnham.org Thu Feb 9 00:46:28 2006
From: idoerg at burnham.org (Iddo Friedberg)
Date: Thu Feb 9 00:42:32 2006
Subject: [BioPython] PopGen module for Biopython?
In-Reply-To:
Message-ID:

Hi Alex,

I already wrote Ralph regarding his contribution. I think the integration idea is terrific, if it is appropriate, of course. I am not exactly sure about the GPL licensing of pypop into software with the Biopython license... maybe someone here can provide an answer.

Anyhow, let me know if you need any help on this. And if you want to communicate more regarding pypop and Ralph's-yet-to-be-named package, it will be great to do that in biopython-dev@biopython.org

Cheers,

Iddo

-- 
Iddo Friedberg, Ph.D.
Burnham Institute for Medical Research
10901 N. Torrey Pines Rd.
La Jolla, CA 92037, USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://iddo-friedberg.org
http://BioFunctionPrediction.org

From joel at macresearcher.com Thu Feb 9 13:16:00 2006
From: joel at macresearcher.com (Joel Dudley)
Date: Thu Feb 9 13:43:41 2006
Subject: [BioPython] MacResearch.org announces iPod giveaway contest
Message-ID: <32AF109D-763B-4689-A530-7070DEE52975@macresearcher.com>

Help MacResearch.org expand its Script Repository and you could win a black 2GB iPod Nano. Eligible contestants must submit a research-oriented script that can run natively (no emulators) on Mac OS X 10.3 or higher without modification before the contest end date. Scripts for all scientific domains are welcome including scripts written for High Performance Computing (grid, cluster, etc) setup and management. If your script does not meet the aforementioned criteria then you will not be eligible to win the iPod Nano. Winners will be chosen by random drawing. The contest begins 2/8/2006 and ends 2/28/2006.

The ultimate goal of this contest, and the script repository in general, is to create a valuable community resource that can be used to benefit endeavors in research and education. Please don't be shy about your coding style or lack of documentation. Your script will make someone's life easier. To learn more about MacResearch.org and the MacResearch.org Script Repository visit http://www.macresearch.org and http://www.macresearch.org/script_repository.

About MacResearch.org:

MacResearch.org is the premier community for scientists using Mac OS X and related hardware in their research.
It is the mission of MacResearch.org to cultivate a knowledgeable and vibrant community of researchers to exchange ideas and information, build a community knowledge-base, and collectively escalate the prominence of Apple technologies in the scientific research community. Official Rules: Eligible entrants must submit a script to the MacResearch.org Script Repository using the script submission form available through MacResearch.org (see http://www.macresearch.org/script_repository). The submitter becomes eligible for the drawing when their script is approved by the MacResearch.org executive committee and published in the public Script Repository. This sweepstakes is open to persons over 18 years of age. Limit one entry per person. No purchase necessary. All entries must be received before 5:00 pm PST on February 28th, 2006. The prize is one (1) Apple iPod Nano Black 2GB One winner will be selected within forty-eight (48) hours of Contest End Date in a random drawing. Drawing will be conducted by the MacResearch.org executive committee, whose decision is final on all matters relating to this sweepstakes. The winner need not be present at the drawing to win. Odds of winning are dependent upon the total number of entries received. Limit one prize per person. Winner will be notified by e-mail within seventy-two (72) hours of the drawing date. Prizes must be claimed within two weeks of the drawing date. Winners are responsible for all applicable taxes. If the Sponsor is unable to locate a given winner, an alternate winner will be selected by a random drawing. All prizes will be awarded and are non-transferable. No cash or other substitutions are allowed except by sponsors sole election due to prize unavailability. By submitting an entry for this Sweepstakes, participants agree to abide by these official rules and any decision Sponsor makes regarding this promotion. 
Sponsor reserves the right to disqualify from the Sweepstakes, and to prosecute to the fullest extent permitted by law, any participant or winner who, in Sponsors reasonable suspicion, tampers with the MacResearch.org website, the entry process, intentionally submits more than a single entry or mechanical entries, violates these official rules, or acts in an unsportsmanlike or disruptive manner. By entering the sweepstakes, the entrant (a) agrees to the Official Rules and the decisions of the Sponsor shall be final in all respects; (b) consents to the use of winners names and likenesses and any statements, quotes or testimonials provided by the winners for advertising and publicity purposes without further compensation, except where prohibited by law; (c) and releases Sponsor, its subsidiaries, and affiliates, and their directors, officers, employees and agents from any and all liability for any injuries, losses or damages of any kind caused by any prize or resulting from acceptance, possession or use of any prize. The promotion and the rights and obligations of Sponsor and participants will be governed and controlled by the laws of the State of Arizona, applicable to contracts made and preformed therein without reference to the applicable choice of law provisions. All actions, proceedings, and litigation relating hereto will be instituted and prosecuted solely within the State of Arizona, Maricopa County. The parties consent to the jurisdiction of the state courts of Arizona and federal court located within the state and county with respect to any action, dispute, or other matter pertaining to or arising out of this promotion. This promotion is not affiliated in any way with Apple Computer, Inc. Apple, the Apple logo, and iPod are trademarks of Apple Computer, Inc. registered in the U.S. and other countries. 
From omid9dr18 at hotmail.com Thu Feb 9 21:40:48 2006 From: omid9dr18 at hotmail.com (Omid Khalouei) Date: Thu Feb 9 21:53:34 2006 Subject: [BioPython] C compiler for Biopython 2.4 In-Reply-To: <43E5D703.9070807@maubp.freeserve.co.uk> Message-ID: Hello, Could you please let me know which C compiler (preferably free) I could use to compile the biopython version 2.4. I want to install it on Windows XP. Thanks for your help, Omid K. From ymasuda at ethercube.com Thu Feb 9 22:34:26 2006 From: ymasuda at ethercube.com (Yasushi Masuda) Date: Thu Feb 9 23:32:02 2006 Subject: [BioPython] PDBParser looking for Numeric module In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE43@cgcmail.cgc.cpmc.columbia.edu> References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE43@cgcmail.cgc.cpmc.columbia.edu> Message-ID: <43EC09C2.4020809@ethercube.com> Hello, I agree it's not the time to move numpy (yet!), but I'd like to note that now numpy has fft and linalg, corresponding to FFT and LinearAlgebra, respectively. Michiel De Hoon wrote: > Use this link: >> http://sourceforge.net/projects/numpy/ > > Don't use this one: >> http://numeric.scipy.org/ > >> The numeric libraries have been around for a long time (reaching version >> 24.2) and are well tested. > >> However, its developers are moving to numpy instead - at some point >> BioPython will eventually need to convert its code to the replacement >> library. > > Note that currently the Numerical Python maintainers are the same people as > the SciPy developers -- the new numpy module was originally called > scipy-core; one of its goals was to force people to start using SciPy by > removing parts of Numerical Python, such as LinearAlgebra and FFT. Note also > that the documentation for numpy is not free. Given the success of good old > Numerical Python, I doubt that numpy will replace Numerical Python any time > soon, if ever. 
Therefore, I don't think that Biopython should convert to > numpy unless most other Python packages relevant to computational biology do > so. > -- Yasushi Masuda ymasuda@ethercube.com From mdehoon at c2b2.columbia.edu Fri Feb 10 10:14:00 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Fri Feb 10 10:11:21 2006 Subject: [BioPython] C compiler for Biopython 2.4 Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE54@cgcmail.cgc.cpmc.columbia.edu> You can use Cygwin to compile Biopython for Windows. That's how Biopython's installer for Windows is built. See http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/python for some hints on how to build C extension modules for Windows using Cygwin. --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 -----Original Message----- From: biopython-bounces@portal.open-bio.org on behalf of Omid Khalouei Sent: Thu 2/9/2006 9:40 PM To: biopython@biopython.org Subject: [BioPython] C compiler for Biopython 2.4 Hello, Could you please let me know which C compiler (preferably free) I could use to compile the biopython version 2.4. I want to install it on Windows XP. Thanks for your help, Omid K. _______________________________________________ BioPython mailing list - BioPython@biopython.org http://biopython.org/mailman/listinfo/biopython From biopython at maubp.freeserve.co.uk Fri Feb 10 11:06:25 2006 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri Feb 10 11:17:56 2006 Subject: [BioPython] Compiling Bio.PDB.mmCIF.MMCIFlex on Windows In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE54@cgcmail.cgc.cpmc.columbia.edu> References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE54@cgcmail.cgc.cpmc.columbia.edu> Message-ID: <43ECBA01.2010403@maubp.freeserve.co.uk> Michiel De Hoon wrote: > You can use Cygwin to compile Biopython for Windows. That's how Biopython's > installer for Windows is built. 
See > http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/python for some hints on > how to build C extension modules for Windows using Cygwin. > > --Michiel. Your website is currently not responding, so I haven't been able to read it yet. Does Cygwin cope with the Bio.PDB.mmCIF.MMCIFlex extension module? I tried to compiled BioPython 1.41 with Python 2.3.3 on Windows XP SP2 using MSVC 6.0 (Microsoft Visual C++ 6.0). There is a problem with unistd.h missing while trying to compile file lex.yy.c into lex.yy.obj as shown below: building 'Bio.PDB.mmCIF.MMCIFlex' extension C:\Program Files\Microsoft Visual Studio\VC98\BIN\cl.exe /c /nologo /Ox /MD /W3 /GX /DNDEBUG -IBio -Ic:\python23\include -Ic:\python23\PC /TcBio/PDB/mmCIF/lex.yy.c /Fobuild\temp.win32-2.3\Release\Bio/PDB/mmCIF/lex.yy.obj lex.yy.c Bio/PDB/mmCIF/lex.yy.c(12) : fatal error C1083: Cannot open include file: 'unistd.h': No such file or directory error: command '"C:\Program Files\Microsoft Visual Studio\VC98\BIN\cl.exe"' failed with exit status 2 Thank you Peter From mdehoon at c2b2.columbia.edu Fri Feb 10 11:32:54 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Fri Feb 10 11:28:36 2006 Subject: [BioPython] Compiling Bio.PDB.mmCIF.MMCIFlex on Windows Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE55@cgcmail.cgc.cpmc.columbia.edu> > Does Cygwin cope with the Bio.PDB.mmCIF.MMCIFlex extension module? I don't think it does. > I tried to compiled BioPython 1.41 with Python 2.3.3 on Windows XP SP2 > using MSVC 6.0 (Microsoft Visual C++ 6.0). > There is a problem with unistd.h missing while trying to compile file > lex.yy.c into lex.yy.obj as shown below: On cygwin, there is no problem with unistd.h. 
However, since Bio.PDB.mmCIF.MMCIFlex needs flex to be installed, the linking fails:

C:\cygwin\bin\gcc.exe -mno-cygwin -shared -s build\temp.win32-2.4\Release\bio\pdb\mmcif\lex.yy.o build\temp.win32-2.4\Release\bio\pdb\mmcif\mmciflexmodule.o build\temp.win32-2.4\Release\bio\pdb\mmcif\MMCIFlex.def -Lc:\Python24\libs -Lc:\Python24\PCBuild -lfl -lpython24 -lmsvcr71 -o build\lib.win32-2.4\Bio\PDB\mmCIF\MMCIFlex.pyd
/usr/lib/gcc/i686-pc-mingw32/3.4.4/../../../../i686-pc-mingw32/bin/ld: cannot find -lfl

This is a recurring problem and is not limited to Windows, but to any machine without flex installed. So ... is there a chance of getting Bio.PDB.mmCIF.MMCIFlex rewritten without relying on flex?

--Michiel.

From rhaygood at duke.edu Fri Feb 10 11:57:42 2006
From: rhaygood at duke.edu (Ralph Haygood)
Date: Fri Feb 10 12:08:35 2006
Subject: [BioPython] PopGen module for Biopython?
Message-ID:

I've received encouraging responses to my message of this Tuesday proposing a Biopython module for population-genetic analyses. So I'll go ahead and revise my module for incorporation into Biopython. Unless someone objects, I'll name the package Bio.PopGen. Until it's ready to use, I suggest further discussion of it happen on Biopython-dev.

From rhaygood at duke.edu Fri Feb 10 13:12:28 2006
From: rhaygood at duke.edu (Ralph Haygood)
Date: Fri Feb 10 13:01:53 2006
Subject: [BioPython] PopGen module for Biopython?
In-Reply-To: <43ECD3A4.9010709@mitre.org>
References: <43ECD3A4.9010709@mitre.org>
Message-ID:

Marc,

To me, Bio.Systematics sounds like phylogeny estimation, which my module doesn't do at all, and which would deserve its own package(s). I think the name of my package needs to indicate population genetics, like BioPerl's Bio::PopGen. I had in mind that some people might prefer Bio.PopulationGenetics to Bio.PopGen, which would be fine with me if there were majority support for it.
Ralph On Fri, 10 Feb 2006, Marc Colosimo wrote: > Ralph Haygood wrote: > > >I've received encouraging responses to my message of this Tuesday proposing a > >Biopython module for population-genetic analyses. So I'll go ahead and revise > >my module for incorporation into Biopython. Unless someone objects, I'll name > >the package Bio.PopGen. Until it's ready to use, I suggest further discussion > >of it happen on Biopython-dev. > > > Just a suggestion, what about Bio.Systematics as the package name? I > think this is more general and other people could add similar things > under it. > > Marc > > > > From mcolosimo at mitre.org Fri Feb 10 12:55:48 2006 From: mcolosimo at mitre.org (Marc Colosimo) Date: Fri Feb 10 13:10:48 2006 Subject: [BioPython] PopGen module for Biopython? In-Reply-To: References: Message-ID: <43ECD3A4.9010709@mitre.org> Ralph Haygood wrote: >I've received encouraging responses to my message of this Tuesday proposing a >Biopython module for population-genetic analyses. So I'll go ahead and revise >my module for incorporation into Biopython. Unless someone objects, I'll name >the package Bio.PopGen. Until it's ready to use, I suggest further discussion >of it happen on Biopython-dev. > Just a suggestion, what about Bio.Systematics as the package name? I think this is more general and other people could add similar things under it. Marc From cariaso at yahoo.com Sat Feb 11 03:40:53 2006 From: cariaso at yahoo.com (Michael Cariaso) Date: Sat Feb 11 03:43:31 2006 Subject: [BioPython] biopython and jython Message-ID: <43EDA315.50901@yahoo.com> Is anyone running any parts of biopython under jython? I'm hoping to get biopython compiled into a .jar via jython. At the moment jython is unable to import either of Bio.EUtils.HistoryClient or DBIdsClient. Specifically I get a ClassCastException during the import. Some other modules seem to work. I expect that biopython's dependence on numpy could also cause problems. 
If so, might JNumeric be used from jython in place of numpy? Any experiences you've had to indicate if this is feasible would be appreciated. Mike Cariaso From thamelry at binf.ku.dk Sat Feb 11 07:58:21 2006 From: thamelry at binf.ku.dk (Thomas Hamelryck) Date: Sat Feb 11 08:20:13 2006 Subject: [BioPython] Compiling Bio.PDB.mmCIF.MMCIFlex on Windows In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE55@cgcmail.cgc.cpmc.columbia.edu> References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE55@cgcmail.cgc.cpmc.columbia.edu> Message-ID: <34204.87.72.27.226.1139662701.squirrel@www.binf.ku.dk> > This is a recurring problem and is not limited to Windows, but to any > machine without flex installed. So ... is there a chance of getting > Bio.PDB.mmCIF.MMCIFlex rewritten without > relying on flex? Nope - that would be a lot of work. If the mmCIF module causes problems it can just be commented out. Best regards, -Thomas From luan_sheng1 at 163.com Mon Feb 13 02:26:07 2006 From: luan_sheng1 at 163.com (luan_sheng) Date: Mon Feb 13 03:52:14 2006 Subject: [BioPython] hello, i want to join the mail list, is it right by sending a mail? Message-ID: <000801c6306e$c17fc5b0$0802a8c0@LocalHost> hello,i want to join the mail list, is it right by sending this mail? From e.picardi at unical.it Wed Feb 15 04:17:44 2006 From: e.picardi at unical.it (Ernesto Picardi) Date: Wed Feb 15 04:22:35 2006 Subject: [BioPython] error in installing Biopython Message-ID: <200602151017.44604.e.picardi@unical.it> Dear all, I'm trying to install Biopython on my Laptop, under SUSE linux 10.0. Firstly I tried biopython-1.23 release (I worked with it on Windows PC). After the command: python setup.py install Python starts to install the package till this error: /usr/lib/gcc/i586-suse-linux/4.0.2/../../../../i586-suse-linux/bin/ld: cannot find -lfl collect2: ld returned 1 exit status error: command 'gcc' failed with exit status 1 SUSE 10.0 uses Python 2.4 and gcc 4.0 The error is the same if I used biopython-1.41. 
Please, let me know how I could solve the problem. Thank you Ernesto ************************************ Dr Ernesto Picardi Dept. of Cellular Biology Ponte Pietro Bucci University of Calabria 87036, Arcavacata di Rende (CS) Italy Tel.: +39 0984 492937 Fax: +39 0984 492911 E-mail: e.picardi@unical.it ************************************ From thamelry at binf.ku.dk Wed Feb 15 04:32:54 2006 From: thamelry at binf.ku.dk (Thomas Hamelryck) Date: Wed Feb 15 04:28:20 2006 Subject: [BioPython] error in installing Biopython In-Reply-To: <200602151017.44604.e.picardi@unical.it> References: <200602151017.44604.e.picardi@unical.it> Message-ID: <33054.87.72.27.226.1139995974.squirrel@www.binf.ku.dk> > cannot find -lfl collect2: ld returned 1 exit status Install flex. -Thomas From mcolosimo at mitre.org Wed Feb 15 08:56:54 2006 From: mcolosimo at mitre.org (Marc Colosimo) Date: Wed Feb 15 08:55:42 2006 Subject: [BioPython] error in installing Biopython In-Reply-To: <33054.87.72.27.226.1139995974.squirrel@www.binf.ku.dk> References: <200602151017.44604.e.picardi@unical.it> <33054.87.72.27.226.1139995974.squirrel@www.binf.ku.dk> Message-ID: <81E8D92A-BA7C-4F7C-92F7-96FB2EFD87EC@mitre.org> Thomas I had the same problem over a week ago and no one said to install flex. In the mean time two more people have had the SAME problem. In another posting (Compiling Bio.PDB.mmCIF.MMCIFlex on Windows), you said it would be a lot of work to rewrite it without flex. So, I think the solution would be to fix either setup.py to check for this (such as a script that returns -1 if the flex library isn't found - similar to autoconfig) or give an option in setup.py to not build Bio.PDB.mmCIF.MMCIFlex (or what ever depends on it). Marc On Feb 15, 2006, at 4:32 AM, Thomas Hamelryck wrote: > >> cannot find -lfl collect2: ld returned 1 exit status > > Install flex. 
> > -Thomas > From thamelry at binf.ku.dk Wed Feb 15 09:33:23 2006 From: thamelry at binf.ku.dk (Thomas Hamelryck) Date: Wed Feb 15 09:28:47 2006 Subject: [BioPython] error in installing Biopython In-Reply-To: <81E8D92A-BA7C-4F7C-92F7-96FB2EFD87EC@mitre.org> References: <200602151017.44604.e.picardi@unical.it> <33054.87.72.27.226.1139995974.squirrel@www.binf.ku.dk> <81E8D92A-BA7C-4F7C-92F7-96FB2EFD87EC@mitre.org> Message-ID: <33287.192.168.10.162.1140014003.squirrel@www.binf.ku.dk> About Flex & Biopython: > So, I think the > solution would be to fix either setup.py to check for this (such as a > script that returns -1 if the flex library isn't found - similar to > autoconfig) or give an option in setup.py to not build > Bio.PDB.mmCIF.MMCIFlex (or what ever depends on it). Well, yes, that's what I said in my previous mail about this. I don't mind if it's commented out. No problem. Best regards, -Thomas From mmokrejs at ribosome.natur.cuni.cz Wed Feb 15 09:58:59 2006 From: mmokrejs at ribosome.natur.cuni.cz (=?windows-1252?Q?Martin_MOKREJ=8A?=) Date: Wed Feb 15 10:01:28 2006 Subject: [BioPython] error in installing Biopython In-Reply-To: <33287.192.168.10.162.1140014003.squirrel@www.binf.ku.dk> References: <200602151017.44604.e.picardi@unical.it> <33054.87.72.27.226.1139995974.squirrel@www.binf.ku.dk> <81E8D92A-BA7C-4F7C-92F7-96FB2EFD87EC@mitre.org> <33287.192.168.10.162.1140014003.squirrel@www.binf.ku.dk> Message-ID: <43F341B3.6090409@ribosome.natur.cuni.cz> Hi, Thomas Hamelryck wrote: > About Flex & Biopython: > > >>So, I think the >>solution would be to fix either setup.py to check for this (such as a >>script that returns -1 if the flex library isn't found - similar to >>autoconfig) or give an option in setup.py to not build >>Bio.PDB.mmCIF.MMCIFlex (or what ever depends on it). > > > Well, yes, that's what I said in my previous mail about this. > I don't mind if it's commented out. No problem. I do intend to use mmCIF and known others do. 
Please leave it in but for people who don't have flex add some option to configure to disable this part just for them. Thanks. Martin From sbassi at gmail.com Wed Feb 15 09:24:46 2006 From: sbassi at gmail.com (Sebastian Bassi) Date: Wed Feb 15 11:58:23 2006 Subject: [BioPython] error in installing Biopython In-Reply-To: <81E8D92A-BA7C-4F7C-92F7-96FB2EFD87EC@mitre.org> References: <200602151017.44604.e.picardi@unical.it> <33054.87.72.27.226.1139995974.squirrel@www.binf.ku.dk> <81E8D92A-BA7C-4F7C-92F7-96FB2EFD87EC@mitre.org> Message-ID: On 2/15/06, Marc Colosimo wrote: > said it would be a lot of work to rewrite it without flex. So, I > think the solution would be to fix either setup.py to check for this > (such as a script that returns -1 if the flex library isn't found - And a output in plain English stating that Flex is required. -- Bioinformatics news: http://www.bioinformatica.info Lriser: http://www.linspire.com/lraiser_success.php?serial=318 From s203jay at mail.chem.itb.ac.id Mon Feb 20 22:48:45 2006 From: s203jay at mail.chem.itb.ac.id (Indrajaya) Date: Mon Feb 20 23:50:37 2006 Subject: [BioPython] biopython gui Message-ID: <1312.222.124.226.24.1140493725.squirrel@webmail.chem.itb.ac.id> Dear biopython users, Hi, I'm Indrajaya. I'd to develop a nice biopython gui but I don't know what the best toolkit to use for. Anybody guide me how do I start it? Should I use pyGTK? pyQT? wxPython? pyFox? Or else? Anybody seen another project hosted in http://ftp.bioinformatics.org/pub/GUIBlast/? Any suggestion, idea or anything else are welcome. Sincerely, Indrajaya From sbassi at gmail.com Tue Feb 21 06:24:22 2006 From: sbassi at gmail.com (Sebastian Bassi) Date: Tue Feb 21 08:12:15 2006 Subject: [BioPython] biopython gui In-Reply-To: <1312.222.124.226.24.1140493725.squirrel@webmail.chem.itb.ac.id> References: <1312.222.124.226.24.1140493725.squirrel@webmail.chem.itb.ac.id> Message-ID: On 2/21/06, Indrajaya wrote: > Hi, I'm Indrajaya. 
I'd to develop a nice biopython gui but I don't know > what the best toolkit to use for. Anybody guide me how do I start it? > Should I use pyGTK? pyQT? wxPython? pyFox? Or else? I guess you should choose the one that makes you feel better; I think that is a matter of personal preference, like using KDE or Gnome. Each one has its advantages, but at the end of the day, everything you can do with one (toolkit/wm), you can do with the other. Before choosing a gui toolkit, I think you should be sure what you want to do. I mean, biopython is a set of functions and methods, not a "command line stand alone program" where you would put a GUI on top. I am also thinking about GUIs for each function, like the module for Google Homepage I made based on the Tm function. But even if I do it for all Biopython functions, it won't be a nice solution because what is needed (IMHO) is something to link all the functions, as you do in code. There are some attempts to do that, like using a flowchart-like interface where you join functions, like a program. But I never saw it in Biopython. > Anybody seen another project hosted in > http://ftp.bioinformatics.org/pub/GUIBlast/? Yes, I started it :). It is pretty much stopped now because of lack of time :(. -- Bioinformatics news: http://www.bioinformatica.info Lriser: http://www.linspire.com/lraiser_success.php?serial=318 From ziemys at ecr6.ohio-state.edu Tue Feb 21 13:51:41 2006 From: ziemys at ecr6.ohio-state.edu (ziemys@ecr6.ohio-state.edu) Date: Tue Feb 21 16:00:23 2006 Subject: [BioPython] Bio.Cluster with 3D coordinates Message-ID: HI, Can anyone give an example and suggestions how to find cluster centers of my points in Cartesian space (x,y,z) using Bio.Cluster or Bio.kMeans ? Bio.Cluster seems to be "optimised" for gene data. However, I did not find clues how to use it with 3D coordinates. Arturas Z.
From mdehoon at c2b2.columbia.edu Tue Feb 21 16:45:01 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue Feb 21 16:47:24 2006
Subject: [BioPython] Bio.Cluster with 3D coordinates
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE83@cgcmail.cgc.cpmc.columbia.edu>

> Can anyone give axample amd suggestions how to find cluster centers
> of my points in Cartesian space (x,y,z) using Bio.Cluster or Bio.kMeans ?

>>> from Numeric import *
>>> from Bio.Cluster import kcluster, clustercentroid
>>> datapoints = array([[1.1,0.9,1.1],
...                     [2.3,3.1,2.7],
...                     [1.2,1.0,0.9],
...                     [2.2,2.9,2.6],
...                     [2.2,3.0,2.9]])
>>> clusterid, error, nfound = kcluster(datapoints)
>>> clusterid
array([0, 1, 0, 1, 1])
>>> centroid, centroid_mask = clustercentroid(datapoints, clusterid=clusterid)
>>> centroid[0]
array([ 1.15, 0.95, 1. ])
>>> centroid[1]
array([ 2.23333333, 3. , 2.73333333])
>>> centroid_mask
array([[1, 1, 1],
       [1, 1, 1]])
# Because there are no missing data.

kcluster uses the Euclidean distance by default. To find more than two clusters, use kcluster(..., nclusters=the_number_of_clusters_you_want). If you have a lot of data points, it's better to make kcluster use multiple runs by specifying kcluster(..., npass=some_big_number).

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

-----Original Message-----
From: biopython-bounces@portal.open-bio.org on behalf of ziemys@ecr6.ohio-state.edu
Sent: Tue 2/21/2006 1:51 PM
To: biopython@biopython.org
Subject: [BioPython] Bio.Cluster with 3D coordinates

HI,

Bio.Cluster seems to be "optimised" for gene data. However, I did not find clues how to use it with 3D coordinates.

Arturas Z.
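
[Editorial sketch] The kcluster call in the reply above can be pictured roughly as follows: plain k-means with Euclidean distance, restarted npass times from random starting points, keeping the run with the smallest within-cluster sum of squared distances. This uses only the Python standard library and is not Bio.Cluster's implementation. The function name kmeans, its seed argument and the order of its return values are assumptions made for illustration; only the nclusters and npass names are taken from the message. Cluster numbers are arbitrary, so only the grouping of points is meaningful.

```python
import random


def kmeans(points, nclusters=2, npass=10, seed=0):
    """Illustrative k-means (Euclidean distance); keeps the best of npass runs.

    Returns (labels, error, centers). Hypothetical helper, not a Biopython API.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(npass):
        # Start each pass from nclusters distinct data points picked at random.
        centers = [list(p) for p in rng.sample(points, nclusters)]
        labels = None
        while True:
            # Assign every point to its nearest center.
            new_labels = [
                min(range(nclusters),
                    key=lambda k: sum((a - b) ** 2
                                      for a, b in zip(p, centers[k])))
                for p in points
            ]
            if new_labels == labels:
                break  # assignments stopped changing: this pass has converged
            labels = new_labels
            # Move each center to the mean of its members.
            for k in range(nclusters):
                members = [p for p, lab in zip(points, labels) if lab == k]
                if members:  # an empty cluster keeps its previous center
                    centers[k] = [sum(col) / len(members)
                                  for col in zip(*members)]
        # Within-cluster sum of squared distances for this pass.
        error = sum(
            sum((a - b) ** 2 for a, b in zip(p, centers[lab]))
            for p, lab in zip(points, labels)
        )
        if best is None or error < best[0]:
            best = (error, labels, centers)
    return best[1], best[0], best[2]


# The five (x, y, z) points from the reply above.
points = [
    [1.1, 0.9, 1.1],
    [2.3, 3.1, 2.7],
    [1.2, 1.0, 0.9],
    [2.2, 2.9, 2.6],
    [2.2, 3.0, 2.9],
]
labels, error, centers = kmeans(points, nclusters=2, npass=5)
print(labels)
print(centers)
```

With real data, the Bio.Cluster routines named in the reply would be the natural choice; the sketch is only meant to show why a larger npass helps, since each pass starts from different random points and only the best result is kept.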
_______________________________________________ BioPython mailing list - BioPython@biopython.org http://biopython.org/mailman/listinfo/biopython From km at mrna.tn.nic.in Tue Feb 21 17:19:10 2006 From: km at mrna.tn.nic.in (km) Date: Tue Feb 21 18:00:45 2006 Subject: [BioPython] Bio.Cluster with 3D coordinates In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE83@cgcmail.cgc.cpmc.columbia.edu> References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE83@cgcmail.cgc.cpmc.columbia.edu> Message-ID: <20060221221910.GA29577@mrna.tn.nic.in> Hi all, how to reduce the dimensionality from 3-D to 2D while interpreting clusters witn Bio.Cluster ? regards, KM On Tue, Feb 21, 2006 at 04:45:01PM -0500, Michiel De Hoon wrote: > > Can anyone give axample amd suggestions how to find cluster centers > > of my points in Cartesian space (x,y,z) using Bio.Cluster or Bio.kMeans ? > > >>> from Numeric import * > >>> from Bio.Cluster import kcluster, clustercentroid > >>> datapoints = array([[1.1,0.9,1.1], > ... [2.3,3.1,2.7], > ... [1.2,1.0,0.9], > ... [2.2,2.9,2.6], > ... [2.2,3.0,2.9]]) > >>> clusterid, error, nfound = kcluster(datapoints) > >>> clusterid > array([0, 1, 0, 1, 1]) > >>> centroid, centroid_mask = clustercentroid(datapoints, > clusterid=clusterid) > >>> centroid[0] > array([ 1.15, 0.95, 1. ]) > >>> centroid[1] > array([ 2.23333333, 3. , 2.73333333]) > >>> centroid_mask > array([[1, 1, 1], > [1, 1, 1]]) > # Because there are no missing data. > > > kcluster uses the Euclidean distance by default. To find more than two > clusters, use kcluster(..., nclusters=the_number_of_clusters_you_want). If > you have a lot of data points, it's better to make kcluster use multiple runs > by specifying kcluster(..., npass=some_big_number). > > --Michiel. 
> > > > > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1150 St Nicholas Avenue > New York, NY 10032 > > > > -----Original Message----- > From: biopython-bounces@portal.open-bio.org on behalf of > ziemys@ecr6.ohio-state.edu > Sent: Tue 2/21/2006 1:51 PM > To: biopython@biopython.org > Subject: [BioPython] Bio.Cluster with 3D coordinates > > HI, > > > > Bio.Cluster seems to be "optimised" for gene data. However, I did not find > clues how to use it with 3D coordinates. > > Arturas Z. > > > _______________________________________________ > BioPython mailing list - BioPython@biopython.org From wolfgang.meyer at gmail.com Wed Feb 22 17:48:12 2006 From: wolfgang.meyer at gmail.com (Wolfgang Meyer) Date: Wed Feb 22 17:50:22 2006 Subject: [BioPython] problems concerning Bio.PDB.NACCESS Message-ID: Hi, I found some problems in Bio.PDB.NACCESS. the version of NACCESS.py I am using is Revision *1.1*from *Sat Jul 30 15:59:48 2005 UTC* The problems are in the function: ---------------- def process_asa_data(rsa_data): .... asa = line[54:62] vdw = line[62:68] ----------------- first, shouldn't the input parameter be "asa_data" rather than "rsa_data" (I suppose that's a typo resulted from copy-and paste)? second, should "asa" and "vdw" values be converted to float? regards, -- Wolfgang Meyer From s203jay at mail.chem.itb.ac.id Mon Feb 27 00:56:08 2006 From: s203jay at mail.chem.itb.ac.id (Indrajaya) Date: Mon Feb 27 01:36:18 2006 Subject: [BioPython] biopython gui In-Reply-To: References: <1312.222.124.226.24.1140493725.squirrel@webmail.chem.itb.ac.id> Message-ID: <1755.222.124.227.203.1141019768.squirrel@webmail.chem.itb.ac.id> > On 2/21/06, Indrajaya wrote: >> Hi, I'm Indrajaya. I'd to develop a nice biopython gui but I don't know >> what the best toolkit to use for. Anybody guide me how do I start it? >> Should I use pyGTK? pyQT? wxPython? pyFox? Or else? 
> > I guess you should choose the one that makes you feel better, I think > that is a matter of personal preference. Like use KDE or Gnome. Each > one has it advantages, but at the end of the day, everything you can > do with one (toolkit/wm), you can do it with the other. > Before choosing gui toolkit, I think you should be sure what do you > want to do. I mean, biopython is a set of functions and methods, not a > "command line stand alone program" where you would put a GUI on top. I > am also thinking about GUIs for each function, like the module for > Google Homepage I made based on Tm function. But even if I do it for > all Biopython function, it won't be a nice solution because what is > needed (IMHO) is something to link all the functions, as you do in > code. There are some attemps to do that, like using flowchart like > interface where you join functions, like a program. But never saw it > in Biopyhon. Ya, I believe it's not a nice solution to make gui for all function. But I think, it is cool to make gui for blast or swissprot GUIs. Just for simple thinking, to get information from genebank with some criteria (ie. who posted the sequence, with fixed sequence length, etc), it will show the result directly to user. Anyone have other idea? > >> Anybody seen another project hosted in >> http://ftp.bioinformatics.org/pub/GUIBlast/? > Can you tell me, why do you use pythoncard as a framework for guiblast? > Yes, I started :). It is pretty stoped now because lack ot time :(. 
> > > -- > Bioinformatics news: http://www.bioinformatica.info > Lriser: http://www.linspire.com/lraiser_success.php?serial=318 > > _______________________________________________ > BioPython mailing list - BioPython@biopython.org > http://biopython.org/mailman/listinfo/biopython > From biopython at maubp.freeserve.co.uk Mon Feb 27 05:43:14 2006 From: biopython at maubp.freeserve.co.uk (Peter (BioPython List)) Date: Mon Feb 27 06:16:40 2006 Subject: [BioPython] biopython gui (for Blast) In-Reply-To: <1755.222.124.227.203.1141019768.squirrel@webmail.chem.itb.ac.id> References: <1312.222.124.226.24.1140493725.squirrel@webmail.chem.itb.ac.id> <1755.222.124.227.203.1141019768.squirrel@webmail.chem.itb.ac.id> Message-ID: <4402D7C2.7040204@maubp.freeserve.co.uk> Indrajaya wrote: > Ya, I believe it's not a nice solution to make gui for all function. > But I think, it is cool to make gui for blast or swissprot GUIs. Just > for simple thinking, to get information from genebank with some > criteria (ie. who posted the sequence, with fixed sequence length, > etc), it will show the result directly to user. Anyone have other > idea? What advantages would a Blast GUI program have over the existing web interfaces? The NCBI's online blast is very easy to use already. For those people who need to automate things then using BioPython (or BioPerl or ...) with either online or standalone blast is fine - it just takes more work from the user. Are you trying to do something in between? More powerful than the existing web-blast GUI, and less complex that a scripting interface? Peter From ilya.soifer at gmail.com Mon Feb 27 10:38:39 2006 From: ilya.soifer at gmail.com (Ilya Soifer) Date: Mon Feb 27 10:33:50 2006 Subject: [BioPython] qblast fails on parsing XML results In-Reply-To: <1952757a0602270404j8c4932v@mail.gmail.com> References: <1952757a0602270404j8c4932v@mail.gmail.com> Message-ID: <1952757a0602270738h57ec36efg@mail.gmail.com> Hi, I hope that I send it to the correct list. 
When I run qblast I get >>> res1 = NCBIWWW.qblast("blastn", "nr", seq1) Traceback (most recent call last): File "", line 1, in -toplevel- res1 = NCBIWWW.qblast("blastn", "nr", seq1) File "C:\Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py", line 1130, in qblast i = results.index("Connection: close") ValueError: substring not found This happens since the results that Blast return no longer have this header # HTTP/1.1 200 OK # Date: Wed, 05 Oct 2005 02:13:33 GMT # Server: Nde # Content-Type: text/plain # Connection: close # but this one HTTP/1.0 200 OK Date: Mon, 27 Feb 2006 11:54:40 GMT Content-Type: application/xml Server: Nde Via: 1.1 proxy7 (NetCache NetApp/6.0.2) I guess it might be better to look for something like " Hi, Does anyone have code for extracting a single model from a PDB file, and then printing it as another PDB-like file? Thanks, Iddo -- Iddo Friedberg, Ph.D. Burnham Institute for Medical Research 10901 N. Torrey Pines Rd. La Jolla, CA 92037 USA Tel: +1 (858) 646 3100 x3516 Fax: +1 (858) 713 9949 http://iddo-friedberg.org http://BioFunctionPrediction.org From thamelry at binf.ku.dk Mon Feb 27 11:49:31 2006 From: thamelry at binf.ku.dk (Thomas Hamelryck) Date: Mon Feb 27 12:06:16 2006 Subject: [BioPython] extracting a single model from PDB file In-Reply-To: <44032A3F.20503@burnham.org> References: <44032A3F.20503@burnham.org> Message-ID: <33202.10.10.11.12.1141058971.squirrel@www.binf.ku.dk> On Mon, February 27, 2006 5:35 pm, Iddo Friedberg wrote: > Hi, > > > Does anyone have code for extracting a single model from a PDB file, and > then printing it as another PDB-like file? 
Something like this will work:

from Bio.PDB import PDBIO, Select

class ModelSelect(Select):
    # Accept only the model whose id matches model_id
    def __init__(self, model_id):
        self.model_id = model_id

    def accept_model(self, model):
        if model.get_id() == self.model_id:
            return 1
        else:
            return 0

# Select model 0 for output
# (structure is an already parsed Bio.PDB Structure object)
ms0 = ModelSelect(0)
io = PDBIO()
io.set_structure(structure)
io.save("out.pdb", select=ms0)

Cheers,

-Thomas

From mdehoon at c2b2.columbia.edu Mon Feb 27 13:16:47 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Mon Feb 27 13:16:09 2006
Subject: [BioPython] qblast fails on parsing XML results
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECE91@cgcmail.cgc.cpmc.columbia.edu>

There is a simpler solution to this, which is to use urllib instead of the socket library in the functions _send_to_qblast and _send_to_blasturl. If we use urllib, we get the results automatically without the HTTP header. So .... does anybody know why socket is used instead of urllib? If it's because older Python versions didn't have urllib, we can just replace socket by urllib to solve this problem. Or am I missing something?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

-----Original Message-----
From: biopython-bounces@portal.open-bio.org on behalf of Ilya Soifer
Sent: Mon 2/27/2006 10:38 AM
To: biopython@biopython.org
Subject: [BioPython] qblast fails on parsing XML results

Hi, I hope that I send it to the correct list.
When I run qblast I get >>> res1 = NCBIWWW.qblast("blastn", "nr", seq1) Traceback (most recent call last): File "", line 1, in -toplevel- res1 = NCBIWWW.qblast("blastn", "nr", seq1) File "C:\Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py", line 1130, in qblast i = results.index("Connection: close") ValueError: substring not found This happens since the results that Blast return no longer have this header # HTTP/1.1 200 OK # Date: Wed, 05 Oct 2005 02:13:33 GMT # Server: Nde # Content-Type: text/plain # Connection: close # but this one HTTP/1.0 200 OK Date: Mon, 27 Feb 2006 11:54:40 GMT Content-Type: application/xml Server: Nde Via: 1.1 proxy7 (NetCache NetApp/6.0.2) I guess it might be better to look for something like "
From s203jay at mail.chem.itb.ac.id Tue Feb 28 00:08:41 2006 From: s203jay at mail.chem.itb.ac.id (Indrajaya) Date: Tue Feb 28 01:13:37 2006 Subject: [BioPython] biopython gui (for Blast) In-Reply-To: <4402D7C2.7040204@maubp.freeserve.co.uk> References: <1312.222.124.226.24.1140493725.squirrel@webmail.chem.itb.ac.id> <1755.222.124.227.203.1141019768.squirrel@webmail.chem.itb.ac.id> <4402D7C2.7040204@maubp.freeserve.co.uk> Message-ID: <1892.222.124.227.203.1141103321.squirrel@webmail.chem.itb.ac.id> > Indrajaya wrote: > > Ya, I believe it's not a nice solution to make gui for all function. > > But I think, it is cool to make gui for blast or swissprot GUIs. Just > > for simple thinking, to get information from genebank with some > > criteria (ie. who posted the sequence, with fixed sequence length, > > etc), it will show the result directly to user. Anyone have other > > idea? > > What advantages would a Blast GUI program have over the existing web > interfaces? The NCBI's online blast is very easy to use already. > > For those people who need to automate things then using BioPython (or > BioPerl or ...) with either online or standalone blast is fine - it just > takes more work from the user. > > Are you trying to do something in between? More powerful than the > existing web-blast GUI, and less complex that a scripting interface? > > Peter The idea started when I saw http://www.bioperl.org/wiki/HOWTO:Graphics.
Not just stop at this point, I want to filter all sequences which has xxxxxx (dna code) start from sequence number x until number y without any gap. Picture at this address http://id.wikipedia.org/wiki/Gambar:Protein_alignment.jpg (a few sequence has gap, I don't want it) just for example. I think NCBI's online blast cannot filter with specific criteria, am I right? By combining PIL (Python Image Library) with a nice standalone program (GTK/wxPython) is nice. Another advantages by using customize program/toolkit, I can do anything with the database in my local network, after I mirrored the database. Does anyone work with PIL+biopython? Sincerely, -indrajaya- > > _______________________________________________ > BioPython mailing list - BioPython@biopython.org > http://biopython.org/mailman/listinfo/biopython > From maximilianh at gmail.com Mon Feb 27 13:34:53 2006 From: maximilianh at gmail.com (Maximilian Haeussler) Date: Tue Feb 28 08:22:20 2006 Subject: [BioPython] biopython gui Message-ID: <76f031ae0602271034t72497fa6x@mail.gmail.com> regarding this biopython-gui discussion: BioPerl and BioJava have gui elements. In Biojava they include genome-browser like lines, where you can display features of various kinds (main part). Other parts display proteins or transcription factor motifs. [It suffers by the way, from little documentation and is simply too slow.] In BioPerl it's a whole genome browser (GBrowser) whose gui-part is included. 
[fast, reliable]

best wishes,
Max

From lpritc at scri.sari.ac.uk Tue Feb 28 10:19:01 2006
From: lpritc at scri.sari.ac.uk (Leighton Pritchard)
Date: Tue Feb 28 11:56:56 2006
Subject: [BioPython] biopython gui (for Blast)
In-Reply-To: <1893.222.124.227.203.1141103338.squirrel@webmail.chem.itb.ac.id>
References: <1893.222.124.227.203.1141103338.squirrel@webmail.chem.itb.ac.id>
Message-ID: <1141139941.3478.242.camel@lplinuxdev>

Skipped content of type multipart/alternative
-------------- next part --------------
An embedded message was scrubbed...
From: Leighton Pritchard
Subject: Re: [BioPython] biopython gui (for Blast)
Date: Tue, 28 Feb 2006 15:19:01 +0000
Size: 2814
Url: http://portal.open-bio.org/pipermail/biopython/attachments/20060228/8b1a8eb4/attachment.eml

From omid9dr18 at hotmail.com Tue Feb 28 20:18:12 2006
From: omid9dr18 at hotmail.com (Omid Khalouei)
Date: Tue Feb 28 20:30:29 2006
Subject: [BioPython] Homology Modeling question
In-Reply-To: <76f031ae0602271034t72497fa6x@mail.gmail.com>
Message-ID:

Hello,

My question is not specifically related to Biopython. I wanted to know whether
homology modeling can be used reliably to see the effects of single amino acid
substitutions. That is, is homology modeling useful only for predicting the
structure of sequences for which there is no known structure, or can it also
be used for a more "fine-tuning" analysis, such as changing one amino acid in
a PDB structure and then performing homology modeling using that same PDB
structure as the template?

Also, is there any up-to-date forum for homology modeling? I looked it up on
Google, but the postings were from back in the 1990s.

Thanks for your help.

Omid K.