From wolfgang.meyer at gmail.com Fri Feb 2 13:03:22 2007 From: wolfgang.meyer at gmail.com (Wolfgang Meyer) Date: Fri, 2 Feb 2007 19:03:22 +0100 Subject: [BioPython] Do you mean Numpy or Numeric? Message-ID: Hi, I read the previous posts about numpy / numeric confusions. I were also stumbled by it very much and wanted some issues to be clarified. In the documentation for BioPython on web, many input/output arrays are specified as "Numpy arrays". However, I think it should be "Numeric arrays". Say in module KDTree set_coords(self, coords) Add the coordinates of the points. o coords - two dimensional Numpy array of type "f". E.g. if the points have dimensionality D and there are N points, the coords array should be NxD dimensional. If some users (e.g. me) takes this for serious and use a Numpy array as input, the user will get nothing but an error: File "KDTree.py", line 135, in set_coords if min(coords)<=-1e6 or max(coords)>=1e6: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() Why? Because the function is expecting a Numeric array, rather than Numpy array, which is specified in the documentation. So why is this the case? I know Numeric is not under development anymore, and Numpy is a "new" Numeric. But according to my experiences they are really different in many aspects so far. If one mixes them in programs, one can really have a lot headaches. So I suggest to give real instructions in documentation, instead of causing more confusions. Thanks! -- Wolfgang Meyer From biopython at maubp.freeserve.co.uk Fri Feb 2 14:00:26 2007 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 02 Feb 2007 19:00:26 +0000 Subject: [BioPython] Do you mean Numpy or Numeric? In-Reply-To: References: Message-ID: <45C38A4A.2000301@maubp.freeserve.co.uk> Wolfgang Meyer wrote: > Hi, > > I read the previous posts about numpy / numeric confusions. I were also > stumbled by it very much and wanted some issues to be clarified. 
> > In the documentation for BioPython on web, many input/output arrays > are specified as "Numpy arrays". However, I think it should be "Numeric > arrays". > > ... > > Why? Because the function is expecting a Numeric array, > rather than Numpy array, which is specified in the documentation. Its because "NumPy" means something different now, than when that was written (when it meant the Numeric library). See below. > So why is this the case? I know Numeric is not under development > anymore, and Numpy is a "new" Numeric. See: http://www.scipy.org/History_of_SciPy (1) Numeric Once upon a time there was a python array and matrix library called "Numerical Python" which was used with "import Numeric" but was frequently referred to as simply "NumPy". (2) numarray Then there was a new, incompatible, branch NumArray used with "import numarry" which was much better for large array (but not so good with small arrays). (3) numpy Now developers are working on the third iteration, which ended up being called "NumPy" and used with "import numpy". So the term "NumPy" in recent documentation means the third project, used with "import numpy". BUT in older documentation it is shorthand for the first project, "Numerical python" used with "import Numeric". > But according to my > experiences they are really different in many aspects so far. If one > mixes them in programs, one can really have a lot headaches. Mixing the two is a certainly a bad idea :) > So I suggest to give real instructions in documentation, instead > of causing more confusions. Apart from "KDTree.py" are there any other uses of the term "NumPy" which would be better as "Numeric"? BioPython will have to move from Numeric to NumPy at some point anyway... Peter From mdehoon at c2b2.columbia.edu Fri Feb 2 16:27:03 2007 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Fri, 02 Feb 2007 16:27:03 -0500 Subject: [BioPython] Do you mean Numpy or Numeric? 
In-Reply-To: <45C38A4A.2000301@maubp.freeserve.co.uk> References: <45C38A4A.2000301@maubp.freeserve.co.uk> Message-ID: <45C3ACA7.1080802@c2b2.columbia.edu> Peter wrote: > BioPython will have to move from Numeric to NumPy at some point anyway... > October 30, 2010 would be a good time to do so. Then the NumPy documentation will be free. See: http://www.tramy.us/ Just kidding. I'm fine with Biopython moving from Numeric to NumPy, except that it doesn't have priority for me, so I don't really want to spend time on it myself. If somebody else wants to push a Numeric -> NumPy transition, I'd have no problem with it. --Michiel. -- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1130 St Nicholas Avenue New York, NY 10032 From bpederse at gmail.com Fri Feb 2 16:43:57 2007 From: bpederse at gmail.com (Brent Pedersen) Date: Fri, 2 Feb 2007 13:43:57 -0800 Subject: [BioPython] Do you mean Numpy or Numeric? In-Reply-To: <45C3ACA7.1080802@c2b2.columbia.edu> References: <45C38A4A.2000301@maubp.freeserve.co.uk> <45C3ACA7.1080802@c2b2.columbia.edu> Message-ID: well, here's an offer (3 days old) by the main author on numpy to do conversions of open source packages that are currently using numeric. http://comments.gmane.org/gmane.comp.python.numeric.general/13413 -brent On 2/2/07, Michiel Jan Laurens de Hoon wrote: > Peter wrote: > > BioPython will have to move from Numeric to NumPy at some point anyway... > > > October 30, 2010 would be a good time to do so. Then the NumPy > documentation will be free. See: http://www.tramy.us/ > > Just kidding. > I'm fine with Biopython moving from Numeric to NumPy, except that it > doesn't have priority for me, so I don't really want to spend time on it > myself. If somebody else wants to push a Numeric -> NumPy transition, > I'd have no problem with it. > > --Michiel. 
> > -- > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1130 St Nicholas Avenue > New York, NY 10032 > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From wolfgang.meyer at gmail.com Fri Feb 2 18:37:14 2007 From: wolfgang.meyer at gmail.com (Wolfgang Meyer) Date: Sat, 3 Feb 2007 00:37:14 +0100 Subject: [BioPython] Do you mean Numpy or Numeric? In-Reply-To: <45C38A4A.2000301@maubp.freeserve.co.uk> References: <45C38A4A.2000301@maubp.freeserve.co.uk> Message-ID: On 2/2/07, Peter wrote: > > Wolfgang Meyer wrote: > > Hi, > > > > I read the previous posts about numpy / numeric confusions. I were also > > stumbled by it very much and wanted some issues to be clarified. > > > > In the documentation for BioPython on web, many input/output arrays > > are specified as "Numpy arrays". However, I think it should be "Numeric > > arrays". > > > > ... > > > > Why? Because the function is expecting a Numeric array, > > rather than Numpy array, which is specified in the documentation. > > Its because "NumPy" means something different now, than when that was > written (when it meant the Numeric library). See below. Thank! Good to learn this. I read the below text a long time ago but did not pay attention to the "misuse" of "numpy". > So why is this the case? I know Numeric is not under development > > anymore, and Numpy is a "new" Numeric. > > See: http://www.scipy.org/History_of_SciPy > > (1) Numeric > Once upon a time there was a python array and matrix library called > "Numerical Python" which was used with "import Numeric" but was > frequently referred to as simply "NumPy". > > (2) numarray > Then there was a new, incompatible, branch NumArray used with "import > numarry" which was much better for large array (but not so good with > small arrays). 
> > (3) numpy > Now developers are working on the third iteration, which ended up being > called "NumPy" and used with "import numpy". > > So the term "NumPy" in recent documentation means the third project, > used with "import numpy". BUT in older documentation it is shorthand > for the first project, "Numerical python" used with "import Numeric". > > > But according to my > > experiences they are really different in many aspects so far. If one > > mixes them in programs, one can really have a lot headaches. > > Mixing the two is a certainly a bad idea :) But certainly there are scenarios where both are needed. For example, a user who is in favor of both numpy and biopython will be simply pathetic :-) > So I suggest to give real instructions in documentation, instead > > of causing more confusions. > > Apart from "KDTree.py" are there any other uses of the term "NumPy" > which would be better as "Numeric"? IMHO, wherever "numpy" shows up in the current documentation should be replaced by 'numeric'. Simply image a user who just come to the filed and is unaware of the development history of numpy, (nor does he/she is aware of the 'misuse' of the term). When he/she is reading BioPython documentation, he will definitely take "NumPy" for "numpy". BioPython will have to move from Numeric to NumPy at some point anyway... Looking forward to that. Peter > > -- Wolfgang Meyer From biopython at maubp.freeserve.co.uk Sat Feb 3 12:50:13 2007 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 03 Feb 2007 17:50:13 +0000 Subject: [BioPython] Do you mean Numpy or Numeric? In-Reply-To: References: <45C38A4A.2000301@maubp.freeserve.co.uk> Message-ID: <45C4CB55.4050801@maubp.freeserve.co.uk> Peter wrote: > ... the term "NumPy" in recent documentation means the third project, > used with "import numpy". BUT in older documentation it is shorthand > for the first project, "Numerical python" used with "import Numeric". 
Wolfgang Meyer wrote: > IMHO, wherever "numpy" shows up in the current documentation should be > replaced by 'numeric'. Simply image a user who just come to the filed > and is unaware of the development history of numpy, (nor does he/she is > aware of the 'misuse' of the term). When he/she is reading BioPython > documentation, he will definitely take "NumPy" for "numpy". I agree with you :) I have updated the python comments (and the one use of "numpy" in the Bio.PDB manual) to change "NumPy" (and other case variants) to use "Numeric" instead. If I've missed anything, let me know. I have NOT made any functional changes to the code. Peter From lucks at fas.harvard.edu Sat Feb 3 13:19:15 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Sat, 3 Feb 2007 13:19:15 -0500 Subject: [BioPython] Do you mean Numpy or Numeric? Message-ID: <3ADAFA9D-6731-4823-9609-AA864E5C0869@fas.harvard.edu> I am in strong favor of migrating from numeric to numpy, especially since matplotlib is moving in that direction. It is extremely frustrating to try to use numpy in code that you write that also uses modules built off of numeric - certainly unmanageable for a new user who might not know the convoluted history of these packages. Any word on whether/how Travis Oliphant can help with this migration? Cheers, Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- From biopython at maubp.freeserve.co.uk Sun Feb 4 09:55:10 2007 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 04 Feb 2007 14:55:10 +0000 Subject: [BioPython] Bio.SeqIO and Clustal aka Clustalw files Message-ID: <45C5F3CE.9080202@maubp.freeserve.co.uk> Hello list, I've been working on new Bio.SeqIO code for reading and writing clustal alignments. 
For more details about Bio.SeqIO, see here: http://www.biopython.org/wiki/SeqIO One issue that has recently come to my attention is how to deal with clustal alignments with repeated sequence identifiers. Clustalw 1.83 will reject any file where the first 30 characters of the identifier are not unique (regardless of the file format). However, there is nothing in the clustal file format which prevents this. For example, BioEdit 5.0.7 will happily read and write clustal format alignments with repeated entries. Should Bio.SeqIO also be tolerant like this? Its not quite as concise as the current code, but I have got a rough version of the parser ready which copes with such files. Any views? Peter From mdehoon at c2b2.columbia.edu Sun Feb 4 13:50:14 2007 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Sun, 4 Feb 2007 13:50:14 -0500 Subject: [BioPython] Bio.SeqIO and Clustal aka Clustalw files References: <45C5F3CE.9080202@maubp.freeserve.co.uk> Message-ID: <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> > Clustalw 1.83 will reject any file where the first 30 characters of the > identifier are not unique (regardless of the file format). > > However, there is nothing in the clustal file format which prevents > this. For example, BioEdit 5.0.7 will happily read and write clustal > format alignments with repeated entries. > > Should Bio.SeqIO also be tolerant like this? Yes, I think so. Some users may want to write a file in the Clustal format to use it with some program other Clustal. Also, assuming that clustal gives a clear error message when the file contains longer identifiers, that should be sufficient to enable the user to fix the problem. By the way, let us know when you feel that the Bio.SeqIO code is ready to be included in the next Biopython release (code-named Bronx). --Michiel. 
Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 From lucks at fas.harvard.edu Sun Feb 4 14:24:25 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Sun, 4 Feb 2007 14:24:25 -0500 Subject: [BioPython] Bio.SeqIO and Clustal aka Clustalw files In-Reply-To: <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> References: <45C5F3CE.9080202@maubp.freeserve.co.uk> <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> Message-ID: <8B44D1DD-6AEE-475E-8ADF-18F271E99196@fas.harvard.edu> What about throwing some sort of error message if there are non-unique id's? Something like what happens when you use NCBIWWW.qblast (from Bio.Blast), where it warns you that qblast only works with certain databases. That way users unaware of this issue in Clustalw will learn about it, and Bio.SeqIO will still permit you the freedom to have non-unique id's. Maybe also make this warning message easy to turn off so it doesn't get annoying. Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- On Feb 4, 2007, at 1:50 PM, Michiel De Hoon wrote: >> Clustalw 1.83 will reject any file where the first 30 characters >> of the >> identifier are not unique (regardless of the file format). >> >> However, there is nothing in the clustal file format which prevents >> this. For example, BioEdit 5.0.7 will happily read and write clustal >> format alignments with repeated entries. >> >> Should Bio.SeqIO also be tolerant like this? > > Yes, I think so. 
Some users may want to write a file in the Clustal > format to > use it with some program other Clustal. Also, assuming that clustal > gives a > clear error message when the file contains longer identifiers, that > should be > sufficient to enable the user to fix the problem. > > By the way, let us know when you feel that the Bio.SeqIO code is > ready to be > included in the next Biopython release (code-named Bronx). > > --Michiel. > > > > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1150 St Nicholas Avenue > New York, NY 10032 > > > > > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From biopython at maubp.freeserve.co.uk Sun Feb 4 15:13:41 2007 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 04 Feb 2007 20:13:41 +0000 Subject: [BioPython] Bio.SeqIO and Clustal aka Clustalw files In-Reply-To: <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> References: <45C5F3CE.9080202@maubp.freeserve.co.uk> <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> Message-ID: <45C63E75.3030304@maubp.freeserve.co.uk> Michiel De Hoon wrote: >> Clustalw 1.83 will reject any file where the first 30 characters of the >> identifier are not unique (regardless of the file format). >> >> However, there is nothing in the clustal file format which prevents >> this. For example, BioEdit 5.0.7 will happily read and write clustal >> format alignments with repeated entries. >> >> Should Bio.SeqIO also be tolerant like this? > > Yes, I think so. Some users may want to write a file in the Clustal format to > use it with some program other Clustal. I was hoping you would agree. Done. > Also, assuming that clustal gives a clear error message when the file > contains longer identifiers, that should be sufficient to enable the > user to fix the problem. 
The clustal programs do seem to be able to read clustal files with identifiers longer than 30 characters (I tried a hand made file with identifiers 55 characters long). This is good. Regardless of the input file format, if your input sequences have long identifiers they are silently truncated to 30 characters. In addition, any colons in the identifier are silently converted into underscores on loading. Both the command line ClustalW 1.83 and the GUI tool ClustalX 1.83 then give a very explicit error message if there are non unique identifiers: ERROR: Multiple sequences found with same name, XXXX (first 30 chars are significant) (where XXXX is the repeated identifier). Peter From lucks at fas.harvard.edu Sun Feb 4 15:22:50 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Sun, 4 Feb 2007 15:22:50 -0500 Subject: [BioPython] CVS and Fink Installations Message-ID: <423AA91E-18B7-4DBB-A9EA-257C36755290@fas.harvard.edu> Hey Guys, I am using biopython 1.42 (with python 2.5) installed with fink on an OSX platform. However, I am getting more interested in biopython development these days, and would like to check out the latest code from CVS. The problem is I would also like my fink version to stay around as a stable version that I can rely on for my research if something is buggy in the latest CVS. Do you have any suggestions on how to have BOTH the fink and CVS code installed such that I can switch between the two? I appreciate your help and any suggestions on how you guys deal with unstable biopython development code and code you need to be stable for your other work. 
Cheers, Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- From kaiserl at MIT.EDU Sun Feb 4 15:35:05 2007 From: kaiserl at MIT.EDU (Liselotte Kaiser) Date: Sun, 04 Feb 2007 15:35:05 -0500 Subject: [BioPython] unsubscribe Message-ID: <20070204153505.5f11msl6t5c8wocs@webmail.mit.edu> I would like to unsubscribe but forgot my password. Liselotte ------------------------------------------- Liselotte Kaiser, Ph.D. Center for Biomedical Engineering Massachusetts Institute of Technology 500 Technology Square, NE47-307 Cambridge, MA 02139-4307 phone: 617-324-7612 email: kaiserl at mit.edu From mdehoon at c2b2.columbia.edu Sun Feb 4 18:33:23 2007 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Sun, 4 Feb 2007 18:33:23 -0500 Subject: [BioPython] CVS and Fink Installations References: <423AA91E-18B7-4DBB-A9EA-257C36755290@fas.harvard.edu> Message-ID: <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB33@mail.exch.c2b2.columbia.edu> > I am using biopython 1.42 (with python 2.5) installed with fink on an > OSX platform. However, I am getting more interested in biopython > development these days, and would like to check out the latest code > from CVS. Great! > Do you have any suggestions on how to have BOTH the fink and CVS code > installed such that I can switch between the two? Download the code from CVS, do "python setup.py build" as usual, but don't do "python setup.py install". Then, when you want to use the CVS version, set PYTHONPATH to the directory in which Biopython was built. For example, in my case, that would be export PYTHONPATH=/users/mdehoon/biopython/build/lib.linux-i686-2.5/ Then, if you do >>> from Bio import something it will import it from your code built from CVS. You can do the same trick from inside Python; see the function main in Tests/run_tests.py for an example. --Michiel. 
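The in-Python variant of the PYTHONPATH trick described above can be sketched as follows. This is only a minimal sketch for a current Python 3 interpreter, not the code in Tests/run_tests.py. The helper name prefer_build_dir and the throwaway package demo_bio_build are hypothetical stand-ins for a real Biopython build directory such as build/lib.linux-i686-2.5/.

```python
import importlib
import os
import sys
import tempfile


def prefer_build_dir(build_dir):
    """Put build_dir at the front of sys.path so imports resolve there first."""
    build_dir = os.path.abspath(build_dir)
    # Avoid duplicate entries if the helper is called more than once.
    if build_dir in sys.path:
        sys.path.remove(build_dir)
    sys.path.insert(0, build_dir)
    # Make sure freshly created directories are noticed by the import system.
    importlib.invalidate_caches()
    return build_dir


# Demonstration: a throwaway package stands in for a CVS build of Biopython.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "demo_bio_build"))
with open(os.path.join(demo_root, "demo_bio_build", "__init__.py"), "w") as handle:
    handle.write("SOURCE = 'cvs build'\n")

prefer_build_dir(demo_root)
import demo_bio_build  # resolved from demo_root because it is first on sys.path

print(demo_bio_build.SOURCE)
print(demo_bio_build.__file__)
```

The path has to be adjusted before the first "from Bio import ..." statement; once a package has been imported, Python keeps using the copy that is already loaded.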
From meesters at uni-mainz.de Mon Feb 5 03:40:21 2007 From: meesters at uni-mainz.de (Christian Meesters) Date: Mon, 5 Feb 2007 09:40:21 +0100 Subject: [BioPython] Do you mean Numpy or Numeric? In-Reply-To: <45C38A4A.2000301@maubp.freeserve.co.uk> References: <45C38A4A.2000301@maubp.freeserve.co.uk> Message-ID: <200702050940.21896.meesters@uni-mainz.de> Hi On Friday 02 February 2007 20:00, Peter wrote: > BioPython will have to move from Numeric to NumPy at some point anyway... Just FYI: I might point at a recent discussion on the numpy mailing list: http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025796.html Travis Oliphant, one of the main scipy/numpy authors, offered help migrating libraries which depend on Numeric to numpy. Biopython was actually mentioned as one of the packages which might profit from some support, but I don't know whether any of the Biopython authors jumped in or are even aware of that thread ... Cheers Christian From rcsqtc at iiqab.csic.es Mon Feb 5 07:40:57 2007 From: rcsqtc at iiqab.csic.es (Ramon Crehuet) Date: Mon, 05 Feb 2007 13:40:57 +0100 Subject: [BioPython] Sequences and alignments Message-ID: <45C725D9.9060504@iiqab.csic.es> Dear all, I have done an alignment with TM-align and now I need to compare some numerical data associated with each structure, residue by residue, according to the alignment. Imagine I have an array of the B-factor of each C_alpha and want to compare these values for equivalent residues, according to the alignment. I am quite new to the subject of alignments and don't know how to deal with this data. I've looked at the biopython documentation but I can't find any hint. Ideally I would like to be able to iterate along the alignment, but is that possible? I'm sure this raises many points on how to treat gaps, etc. but I was wondering if somebody has some experience with that. 
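One way to walk along a pairwise alignment and pair up per-residue values such as the B-factors described above is sketched here in plain Python. It is a sketch under stated assumptions, not a Biopython API: the function name paired_values, the gap characters "-" and ".", and the toy sequences and numbers are all hypothetical, and reading the two gapped strings out of the TM-align output is not covered. Columns with a gap in either sequence are simply skipped, which is only one of several possible ways to treat gaps.

```python
def paired_values(aligned_a, aligned_b, values_a, values_b, gap_chars="-."):
    """Return (column, value_a, value_b) for columns where both rows have a residue.

    aligned_a and aligned_b are the gapped sequences from the alignment.
    values_a and values_b hold one number per residue of the ungapped sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    residues_a = sum(1 for res in aligned_a if res not in gap_chars)
    residues_b = sum(1 for res in aligned_b if res not in gap_chars)
    if residues_a != len(values_a) or residues_b != len(values_b):
        raise ValueError("need exactly one value per residue")

    pairs = []
    index_a = 0  # position in the ungapped sequence A
    index_b = 0  # position in the ungapped sequence B
    for column, (res_a, res_b) in enumerate(zip(aligned_a, aligned_b)):
        in_a = res_a not in gap_chars
        in_b = res_b not in gap_chars
        if in_a and in_b:
            pairs.append((column, values_a[index_a], values_b[index_b]))
        # Advance each residue counter only when that row is not a gap.
        if in_a:
            index_a += 1
        if in_b:
            index_b += 1
    return pairs


# Toy data (made up for illustration): B-factors for the C-alpha atoms.
aligned_a = "MK-LV"
aligned_b = "MKAL-"
bfactors_a = [10.0, 12.5, 11.0, 9.0]   # M, K, L, V
bfactors_b = [20.0, 21.0, 22.0, 23.0]  # M, K, A, L

pairs = paired_values(aligned_a, aligned_b, bfactors_a, bfactors_b)
for column, b_a, b_b in pairs:
    print(column, b_a, b_b, b_a - b_b)
```

Keeping a separate residue counter for each sequence is what maps an alignment column back to the right entry of each value array.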
Cheers, Ramon From wolfgang.meyer at gmail.com Tue Feb 6 13:45:11 2007 From: wolfgang.meyer at gmail.com (Wolfgang Meyer) Date: Tue, 6 Feb 2007 19:45:11 +0100 Subject: [BioPython] StructureAlignment gap code Message-ID: Hi, Currently in this module it assume that gaps in alignments are represented by '-' (hyphen). But it is not unusual that some programs use '.' (dot) to represent gaps (e.g. Dali). And this perhaps could be solved by replacing the '-' with fasta_align._alphabet.new_letters() -- Wolfgang Meyer From lucks at fas.harvard.edu Wed Feb 7 13:37:16 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Wed, 7 Feb 2007 13:37:16 -0500 Subject: [BioPython] Biopython Citation Message-ID: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> Hey all, I am preparing a paper and I want to cite biopython since I used it extensively in the research. Is there a standard citation I can use? Cheers, Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- From mdehoon at c2b2.columbia.edu Wed Feb 7 15:12:30 2007 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Wed, 07 Feb 2007 15:12:30 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> Message-ID: <45CA32AE.6050809@c2b2.columbia.edu> Julius Lucks wrote: > I am preparing a paper and I want to cite biopython since I used it > extensively in the research. Is there a standard citation I can use? Brad Chapman and Jeffrey Chang wrote a paper for the ACM sigbio newsletter in 2000: http://portal.acm.org/citation.cfm?id=360268 I don't know of any general Biopython paper since then. --Michiel. 
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1130 St Nicholas Avenue New York, NY 10032 From lpritc at scri.ac.uk Thu Feb 8 04:10:04 2007 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 08 Feb 2007 09:10:04 +0000 Subject: [BioPython] Biopython Citation In-Reply-To: <45CA32AE.6050809@c2b2.columbia.edu> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> Message-ID: <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> Perhaps its due time for a short description of the project to be sent in to Bioinformatics or BMC Bioinformatics? L. On Wed, 2007-02-07 at 15:12 -0500, Michiel Jan Laurens de Hoon wrote: > Julius Lucks wrote: > > I am preparing a paper and I want to cite biopython since I used it > > extensively in the research. Is there a standard citation I can use? > > Brad Chapman and Jeffrey Chang wrote a paper for the ACM sigbio > newsletter in 2000: > > http://portal.acm.org/citation.cfm?id=360268 > > I don't know of any general Biopython paper since then. > > --Michiel. > -- Dr Leighton Pritchard AMRSC D131, Plant Pathology, Scottish Crop Research Institute W: http://bioinf.scri.ac.uk/lp E: lpritc at scri.ac.uk GPG: 0xE58BA41B _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ SCRI, Invergowrie, Dundee, DD2 5DA. The Scottish Crop Research Institute is a charitable company limited by guarantee. Registered in Scotland No: SC 29367. Recognised by the Inland Revenue as a Scottish Charity No: SC 006662. DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. 
If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.ac.uk quoting the name of the sender and delete the email from your system. Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any). From lpritc at scri.ac.uk Thu Feb 8 10:36:24 2007 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 08 Feb 2007 15:36:24 +0000 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> Message-ID: <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> On Thu, 2007-02-08 at 09:10 -0500, Julius Lucks wrote: > I imagine there have been substantial changes since then. Or what > about PLoS computational biology, provided we can get the page charges > deferred (which I think they are up for). I just pulled two possible targets out of thin air... PLoS Comp Biol is another good one. The discussion on who is going to write it might be more... interesting... ;) L. -- Dr Leighton Pritchard AMRSC D131, Plant Pathology, Scottish Crop Research Institute W: http://bioinf.scri.ac.uk/lp E: lpritc at scri.ac.uk GPG: 0xE58BA41B 
From sbassi at gmail.com Thu Feb 8 15:11:36 2007 From: sbassi at gmail.com (Sebastian Bassi) Date: Thu, 8 Feb 2007 21:11:36 +0100 Subject: [BioPython] Biopython Citation In-Reply-To: <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> Message-ID: On 2/8/07, Leighton Pritchard wrote: > The discussion on who is going to write it might be more... > interesting... ;) I've been thinking in the same problem. I would like to collaborate on it. I could open a "Google doc" and then send you an invite so we could both write in the same online document, and add more people later if somebody is interested. 
Best, SB -- Bioinformatics news: http://www.bioinformatica.info Lriser: http://www.linspire.com/lraiser_success.php?serial=318 From lucks at fas.harvard.edu Thu Feb 8 20:50:33 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Thu, 8 Feb 2007 20:50:33 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> Message-ID: <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> There is a nice article covering the strengths and weaknesses of BioPython, BioPerl and BioJava aimed at guiding a novice user: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&dopt=AbstractPlus&list_uids=12230038 It seemed very unbiased, and is good to read to see how BioPython is viewed within the broader community. Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- On Feb 8, 2007, at 3:11 PM, Sebastian Bassi wrote: > On 2/8/07, Leighton Pritchard wrote: >> The discussion on who is going to write it might be more... >> interesting... ;) > > I've been thinking in the same problem. I would like to collaborate on > it. I could open a "Google doc" and then send you an invite so we > could both write in the same online document, and add more people > later if somebody is interested. 
> Best, > SB > > -- > Bioinformatics news: http://www.bioinformatica.info > Lriser: http://www.linspire.com/lraiser_success.php?serial=318 > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From sbassi at gmail.com Thu Feb 8 21:20:47 2007 From: sbassi at gmail.com (Sebastian Bassi) Date: Thu, 8 Feb 2007 23:20:47 -0300 Subject: [BioPython] Biopython Citation In-Reply-To: <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> Message-ID: On 2/8/07, Julius Lucks wrote: > There is a nice article covering the strengths and weaknesses of BioPython, > BioPerl and BioJava aimed at guiding a novice user: Yes, but BioPython grew a lot from 2002. -- Bioinformatics news: http://www.bioinformatica.info Lriser: http://www.linspire.com/lraiser_success.php?serial=318 From mdehoon at c2b2.columbia.edu Thu Feb 8 22:49:53 2007 From: mdehoon at c2b2.columbia.edu (Michiel de Hoon) Date: Thu, 08 Feb 2007 22:49:53 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> Message-ID: <45CBEF61.5090101@c2b2.columbia.edu> Sebastian Bassi wrote: > On 2/8/07, Julius Lucks wrote: >> There is a nice article covering the strengths and weaknesses of BioPython, >> BioPerl and BioJava aimed at guiding a novice user: > > Yes, but BioPython grew a lot from 2002. 
> Currently, Biopython is undergoing some major improvements, for example with the Blast parser and especially with the new Bio.SeqIO. I expect that Biopython in two or three months will be a lot more transparent and user-friendly than it is now. It may be worthwhile to postpone submitting a Biopython paper until these improvements have made it into a Biopython release. Don't let that stop you from starting to write a paper, though :-). --Michiel. From lpritc at scri.ac.uk Fri Feb 9 05:25:44 2007 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Fri, 09 Feb 2007 10:25:44 +0000 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> Message-ID: <1171016744.11097.316.camel@lplinuxdev.scri.sari.ac.uk> On Thu, 2007-02-08 at 21:11 +0100, Sebastian Bassi wrote: > On 2/8/07, Leighton Pritchard wrote: > > The discussion on who is going to write it might be more... > > interesting... ;) > > I've been thinking in the same problem. I would like to collaborate on > it. I could open a "Google doc" and then send you an invite so we > could both write in the same online document, and add more people > later if somebody is interested. > Best, > SB I'd be very happy to collaborate on writing this sort of paper but, as my own contributions to the code have been minuscule, I want to make sure that the major contributors to, and developers of, the code base are OK with that, and also that they take the lion's share of the credit, regardless of how many words they put into the text. 
I don't want to be pushing myself forward at someone else's expense, taking credit for work that I didn't do, or generating friction within the community in arguments over who is, and who is not, named, and where they are in the pecking order ;) Michiel wrote: > Currently, Biopython is undergoing some major improvements, for > example > with the Blast parser and especially with the new Bio.SeqIO. I expect > that Biopython in two or three months will be a lot more transparent > and > user-friendly than it is now. It may be worthwhile to postpone > submitting a Biopython paper until these improvements have made it > into > a Biopython release. > > Don't let that stop you from starting to write a paper, though :-). With that in mind, I'd be glad to help in drawing up a framework for the paper, and writing it, with a view to submitting after the next release, possibly in May or June. I've not used Google Docs before, but if you'd like to set that up Sebastian, I'm interested to see how collaborative writing works on it. L. -- Dr Leighton Pritchard AMRSC D131, Plant Pathology, Scottish Crop Research Institute W: http://bioinf.scri.ac.uk/lp E: lpritc at scri.ac.uk GPG: 0xE58BA41B _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ SCRI, Invergowrie, Dundee, DD2 5DA. The Scottish Crop Research Institute is a charitable company limited by guarantee. Registered in Scotland No: SC 29367. Recognised by the Inland Revenue as a Scottish Charity No: SC 006662. DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. 
If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.ac.uk quoting the name of the sender and delete the email from your system. Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any). From hjm at tacgi.com Wed Feb 14 21:08:37 2007 From: hjm at tacgi.com (Harry Mangalam) Date: Wed, 14 Feb 2007 18:08:37 -0800 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> Message-ID: <200702141808.37738.hjm@tacgi.com> On Thursday 08 February 2007 18:20, Sebastian Bassi wrote: > On 2/8/07, Julius Lucks wrote: > > There is a nice article covering the strengths and weaknesses of > > BioPython, BioPerl and BioJava aimed at guiding a novice user: Thanks :) > Yes, but BioPython grew a lot from 2002. So have the other bio-projects. For example, I noted BioRuby in passing but didn't include it in the survey - it's now gone to release 1.0, some kind of a step. It certainly is time for a re-review. What would be quite nice is a longish paper, perhaps in an e-format, with a short form published on paper. If key personnel from each of the Bio-efforts contributed what they think are the strengths and weaknesses of their own approach, it would be extremely useful to a new crop of students. Some of the problem in choosing a bio, as I mentioned in the paper, is dependent on the language itself, some is the approach to the problem. Certainly, contributors to each group read each others lists and can best lay out their arguments. 
Some of the attraction to users is also dependent on how the different distributions tend to package the contributions and the frequency that they're updated. I haven't checked the biopython releases recently, but the Ubuntu distro, for example, does a very good job of packaging popular perl modules, with the result that I haven't had to resort to CPAN for a very long time. Just my (possibly mistaken) impression, but Python utilities and modules seems to be a little less well packaged, at least by Ubuntu. -- Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, UC Irvine 92697 949 824 0084(o), 949 285 4487(c) harry.mangalam at uci.edu -- Cheers, Harry Harry J Mangalam - 949 856 2847(o) 949 285 4487(c) (email for fax) hjm at tacgi.com [plain text preferred] From hlapp at gmx.net Thu Feb 15 11:55:19 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 15 Feb 2007 11:55:19 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: <200702141808.37738.hjm@tacgi.com> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> <200702141808.37738.hjm@tacgi.com> Message-ID: <05D00193-F72B-4695-A275-CB3E29BB534D@gmx.net> BTW Jason and I wrote a relatively high-level review last summer: Briefings in Bioinformatics 2006 7(3):287-296 http://dx.doi.org/10.1093/bib/bbl026 It's not a Bio* comparison paper, though. In fact, it rather tries to highlight the strengths of various toolkits and standards in the life sciences. -hilmar On Feb 14, 2007, at 9:08 PM, Harry Mangalam wrote: > On Thursday 08 February 2007 18:20, Sebastian Bassi wrote: >> On 2/8/07, Julius Lucks wrote: >>> There is a nice article covering the strengths and weaknesses of >>> BioPython, BioPerl and BioJava aimed at guiding a novice user: > > Thanks :) > >> Yes, but BioPython grew a lot from 2002. > > So have the other bio-projects. 
For example, I noted BioRuby in > passing but didn't include it in the survey - it's now gone to > release 1.0, some kind of a step. It certainly is time for a > re-review. What would be quite nice is a longish paper, perhaps in > an e-format, with a short form published on paper. If key personnel >> from each of the Bio-efforts contributed what they think are the > strengths and weaknesses of their own approach, it would be extremely > useful to a new crop of students. > > Some of the problem in choosing a bio, as I mentioned in the paper, is > dependent on the language itself, some is the approach to the > problem. Certainly, contributors to each group read each others > lists and can best lay out their arguments. > > Some of the attraction to users is also dependent on how the different > distributions tend to package the contributions and the frequency > that they're updated. I haven't checked the biopython releases > recently, but the Ubuntu distro, for example, does a very good job of > packaging popular perl modules, with the result that I haven't had to > resort to CPAN for a very long time. Just my (possibly mistaken) > impression, but Python utilities and modules seems to be a little > less well packaged, at least by Ubuntu. 
> > -- > Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, > UC Irvine 92697 949 824 0084(o), 949 285 4487(c) > harry.mangalam at uci.edu > -- > Cheers, Harry > Harry J Mangalam - 949 856 2847(o) 949 285 4487(c) (email for fax) > hjm at tacgi.com [plain text preferred] > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net : =========================================================== From mdehoon at c2b2.columbia.edu Thu Feb 15 14:54:28 2007 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Thu, 15 Feb 2007 14:54:28 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: <200702141808.37738.hjm@tacgi.com> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> <200702141808.37738.hjm@tacgi.com> Message-ID: <45D4BA74.9060102@c2b2.columbia.edu> Harry Mangalam wrote: > Some of the attraction to users is also dependent on how the different > distributions tend to package the contributions and the frequency > that they're updated. I haven't checked the biopython releases > recently, but the Ubuntu distro, for example, does a very good job of > packaging popular perl modules, with the result that I haven't had to > resort to CPAN for a very long time. Just my (possibly mistaken) > impression, but Python utilities and modules seems to be a little > less well packaged, at least by Ubuntu. > Just wondering: What's so bad about having resort to CPAN? --Michiel. 
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1130 St Nicholas Avenue New York, NY 10032 From hjm at tacgi.com Thu Feb 15 14:49:41 2007 From: hjm at tacgi.com (Harry Mangalam) Date: Thu, 15 Feb 2007 11:49:41 -0800 Subject: [BioPython] Biopython Citation In-Reply-To: <05D00193-F72B-4695-A275-CB3E29BB534D@gmx.net> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <200702141808.37738.hjm@tacgi.com> <05D00193-F72B-4695-A275-CB3E29BB534D@gmx.net> Message-ID: <200702151149.42023.hjm@tacgi.com> That's a really excellent overview. My only (non)critique is that since there's so much to cover that you couldn't possibly have covered each in the kind of detail it deserves. Maybe the proper way to do this is to wikitize the paper, allowing supporters and users to fill in the gaps. Again, great job - sorry I missed this when it came out. Harry On Thursday 15 February 2007 08:55, Hilmar Lapp wrote: > BTW Jason and I wrote a relatively high-level review last summer: > > Briefings in Bioinformatics 2006 7(3):287-296 > http://dx.doi.org/10.1093/bib/bbl026 > > It's not a Bio* comparison paper, though. In fact, it rather tries > to highlight the strengths of various toolkits and standards in the > life sciences. > > -hilmar > > On Feb 14, 2007, at 9:08 PM, Harry Mangalam wrote: > > On Thursday 08 February 2007 18:20, Sebastian Bassi wrote: > >> On 2/8/07, Julius Lucks wrote: > >>> There is a nice article covering the strengths and weaknesses > >>> of BioPython, BioPerl and BioJava aimed at guiding a novice > >>> user: > > > > Thanks :) > > > >> Yes, but BioPython grew a lot from 2002. > > > > So have the other bio-projects. For example, I noted BioRuby in > > passing but didn't include it in the survey - it's now gone to > > release 1.0, some kind of a step. It certainly is time for a > > re-review. What would be quite nice is a longish paper, perhaps > > in an e-format, with a short form published on paper. 
If key > > personnel > > > >> from each of the Bio-efforts contributed what they think are the > > > > strengths and weaknesses of their own approach, it would be > > extremely useful to a new crop of students. > > > > Some of the problem in choosing a bio, as I mentioned in the > > paper, is dependent on the language itself, some is the approach > > to the problem. Certainly, contributors to each group read each > > others lists and can best lay out their arguments. > > > > Some of the attraction to users is also dependent on how the > > different distributions tend to package the contributions and the > > frequency that they're updated. I haven't checked the biopython > > releases recently, but the Ubuntu distro, for example, does a > > very good job of packaging popular perl modules, with the result > > that I haven't had to resort to CPAN for a very long time. Just > > my (possibly mistaken) impression, but Python utilities and > > modules seems to be a little less well packaged, at least by > > Ubuntu. 
> > > > -- > > Harry Mangalam - Research Computing, NACS, E2148, Engineering > > Gateway, UC Irvine 92697 949 824 0084(o), 949 285 4487(c) > > harry.mangalam at uci.edu > > -- > > Cheers, Harry > > Harry J Mangalam - 949 856 2847(o) 949 285 4487(c) (email for > > fax) hjm at tacgi.com [plain text preferred] > > _______________________________________________ > > BioPython mailing list - BioPython at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython -- Cheers, Harry Harry J Mangalam - 949 856 2847(o) 949 285 4487(c) (email for fax) hjm at tacgi.com [plain text preferred] From hjm at tacgi.com Thu Feb 15 23:44:58 2007 From: hjm at tacgi.com (Harry Mangalam) Date: Thu, 15 Feb 2007 20:44:58 -0800 Subject: [BioPython] Biopython Citation In-Reply-To: <45D4BA74.9060102@c2b2.columbia.edu> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <200702141808.37738.hjm@tacgi.com> <45D4BA74.9060102@c2b2.columbia.edu> Message-ID: <200702152044.58911.hjm@tacgi.com> On Thursday 15 February 2007 11:54, Michiel Jan Laurens de Hoon wrote: > Harry Mangalam wrote: > > Some of the attraction to users is also dependent on how the > > different distributions tend to package the contributions and the > > frequency that they're updated. I haven't checked the biopython > > releases recently, but the Ubuntu distro, for example, does a > > very good job of packaging popular perl modules, with the result > > that I haven't had to resort to CPAN for a very long time. Just > > my (possibly mistaken) impression, but Python utilities and > > modules seems to be a little less well packaged, at least by > > Ubuntu. > > Just wondering: What's so bad about having resort to CPAN? > > --Michiel. CPAN is a wonderful installation tool rendered almost obsolete by the even more wonderful apt-get. I'd rather not have to deal with 2 (sometimes) conflicting (or at least complicating) installation paths. 
--
Cheers, Harry
Harry J Mangalam - 949 856 2847(o) 949 285 4487(c) (email for fax)
hjm at tacgi.com [plain text preferred]

From lucks at fas.harvard.edu Mon Feb 19 21:23:34 2007
From: lucks at fas.harvard.edu (Julius Lucks)
Date: Mon, 19 Feb 2007 21:23:34 -0500
Subject: [BioPython] Biopython Wiki code-snippets
Message-ID: <8580827C-5E6C-4A2A-B478-0C7790E50340@fas.harvard.edu>

Hi All,

Does anyone know which mediawiki extension the biopython wiki uses to do code-snippet highlighting via the tags? Or what is the email address of the maintainer of the biopython wiki.

I appreciate your help.

Julius

-----------------------------------------------------
http://openwetware.org/wiki/User:Lucks
-----------------------------------------------------

From marc.saric at gmx.de Tue Feb 20 06:29:32 2007
From: marc.saric at gmx.de (Marc Saric)
Date: Tue, 20 Feb 2007 12:29:32 +0100
Subject: [BioPython] Biopython Wiki code-snippets
In-Reply-To: <8580827C-5E6C-4A2A-B478-0C7790E50340@fas.harvard.edu>
References: <8580827C-5E6C-4A2A-B478-0C7790E50340@fas.harvard.edu>
Message-ID: <45DADB9C.9070703@gmx.de>

Hi Julius,

I guess you can always have a look at the Special:Version page, which should list all extensions.

My guess would be something like wfGeSHiColorExtension (at least that's what I am using for syntax highlighting).

Julius Lucks wrote:
> Hi All,
>
> Does anyone know which mediawiki extension the biopython wiki uses to
> do code-snippet highlighting via the tags? Or what is the
> email address of the maintainer of the biopython wiki.
>
> I appreciate your help.
>
> Julius

--
Bye,

Marc Saric
http://www.marcsaric.de

From throwaway at MIT.EDU Wed Feb 21 13:31:56 2007
From: throwaway at MIT.EDU (Alex Coventry)
Date: Wed, 21 Feb 2007 13:31:56 -0500
Subject: [BioPython] Problem using efetch
Message-ID:

Hi. The following query results in an error message from NCBI:

>>> print client.search('''
NP_542420
ATRTC
''', db='protein').efetch(retmode='text', rettype='fasta').read()
... ... ... ...
Error: Internal Error

I expected this query to return two sequences in fasta format. The same query seems to work without problems at .

Searching for either query separately also seems to work. E.g.:

>>> print client.search('''
ATRTC
''', db='protein').efetch(retmode='text', rettype='fasta').read()
... ... ... >gi|71620|pir||ATRTC actin beta - rat
MDDDIAALVVDNGSAMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYP
IEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVL
SLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVR
DIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTFN
SIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLS
TFQQMWISKQEYDESGPSIVHRKCF

Am I doing something wrong, here?

Alex

From lucks at fas.harvard.edu Thu Feb 22 18:41:05 2007
From: lucks at fas.harvard.edu (Julius Lucks)
Date: Thu, 22 Feb 2007 18:41:05 -0500
Subject: [BioPython] Problem using efetch
In-Reply-To:
References:
Message-ID: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu>

Hi Alex,

I am not sure if anyone has addressed your issue. Which module are you using (i.e. where did you import 'client' from)?

Julius

-----------------------------------------------------
http://openwetware.org/wiki/User:Lucks
-----------------------------------------------------

On Feb 21, 2007, at 1:31 PM, Alex Coventry wrote:

>
> Hi. The following query results in an error message from NCBI:
>
> >>> print client.search('''
> NP_542420
> ATRTC
> ''', db='protein').efetch(retmode='text', rettype='fasta').read()
> ... ... ... ...
> Error: Internal Error
>
> I expected this query to return two sequences in fasta format. The same
> query seems to work without problems at
> .
>
> Searching for either query separately also seems to work. E.g.:
>
> >>> print client.search('''
> ATRTC
> ''', db='protein').efetch(retmode='text', rettype='fasta').read()
> ... ... ... >gi|71620|pir||ATRTC actin beta - rat
> MDDDIAALVVDNGSAMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYP
> IEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVL
> SLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVR
> DIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTFN
> SIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLS
> TFQQMWISKQEYDESGPSIVHRKCF
>
> Am I doing something wrong, here?
>
> Alex
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

From throwaway at MIT.EDU Thu Feb 22 23:01:56 2007
From: throwaway at MIT.EDU (Alex Coventry)
Date: Thu, 22 Feb 2007 23:01:56 -0500
Subject: [BioPython] Problem using efetch
In-Reply-To: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu> (Julius Lucks's message of "Thu, 22 Feb 2007 18:41:05 -0500")
References: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu>
Message-ID:

Oh, thanks for pointing that out. That was careless of me. It's using Bio.EUtils, i.e.

from Bio.EUtils import HistoryClient
client = HistoryClient.HistoryClient()

This is with Biopython version 1.42.

Alex

>>>>> "Julius" == Julius Lucks writes:

    Julius> Hi Alex,
    Julius> I am not sure if anyone has addressed your issue. Which
    Julius> module are you using (i.e. where did you import 'client'
    Julius> from)?

    Julius> Julius

From lucks at fas.harvard.edu Thu Feb 22 23:47:39 2007
From: lucks at fas.harvard.edu (Julius Lucks)
Date: Thu, 22 Feb 2007 23:47:39 -0500
Subject: [BioPython] Problem using efetch
In-Reply-To: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu>
References: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu>
Message-ID: <1E7E397A-E3D6-4903-8D4F-2B5F8DAD3F22@fas.harvard.edu>

Hey Alex,

I am not sure why there was an internal error, but if you want to return 2 sequences, then the following adjustment of your query will do the trick

from Bio.EUtils import HistoryClient
client = HistoryClient.HistoryClient()
print client.search('NP_542420 OR ATRTC',db='protein').efetch(retmode='text', rettype='fasta').read()

which gives

>gi|18093102|ref|NP_542420.1| dynamin 1 [Rattus norvegicus]
MGNRGMEDLIPLVNRLQDAFSAIGQNADLDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLV
LQLVNSTTEYAEFLHCKGKKFTDFEEVRLEIEAETDRVTGTNKGISPVPINLRVYSPHVLNLTLVDLPGM
TKVPVGDQPPDIEFQIRDMLMQFVTKENCLILAVSPANSDLANSDALKIAKEVDPQGQRTIGVITKLDLM
DEGTDARDVLENKLLPLRRGYIGVVNRSQKDIDGKKDITAALAAERKFFLSHPSYRHLADRMGTPYLQKV
LNQQLTNHIRDTLPGLRNKLQSQLLSIEKEVDEYKNFRPDDPARKTKALLQMVQQFAVDFEKRIEGSGDQ
IDTYELSGGARINRIFHERFPFELVKMEFDEKELRREISYAIKNIHGIRTGLFTPDLAFEATVKKQVQKL
KEPSIKCVDMVVSELTSTIRKCSEKLQQYPRLREEMERIVTTHIREREGRTKEQVMLLIDIELAYMNTNH
EDFIGFANAQQRSNQMNKKKTSGNQDEILVIRKGWLTINNIGIMKGGSKEYWFVLTAENLSWYKDDEEKE
KKYMLSVDNLKLRDVEKGFMSSKHIFALFNTEQRNVYKDYRQLELACETQEEVDSWKASFLRAGVYPERV
GDKEKASETEENGSDSFMHSMDPQLERQVETIRNLVDSYMAIVNKTVRDLMPKTIMHLMINNTKEFIFSE
LLANLYSCGDQNTLMEESAEQAQRRDEMLRMYHALKEALSIIGDINTTTVSTPMPPPVDDSWLQVQSVPA
GRRSPTSSPTPQRRAPAVPPARPGSRGPAPGPPPAGSALGGAPPVPSRPGASPDPFGPPPQVPSRPNRAP
PGVPRITISDP
>gi|71620|pir||ATRTC actin beta - rat
MDDDIAALVVDNGSAMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYP
IEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVL
SLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVR
DIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTFN
SIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLS
TFQQMWISKQEYDESGPSIVHRKCF

The 'OR' should give 2 results in the search, which efetch should handle in return. I am not sure why the code gave the InternalError though, maybe someone else can pinpoint that.

In general, the NCBI interface for efetch will make use of whatever results were returned by a previous search when using the NCBI history server (which the HistoryClient does for you). So getting the right list of results might come down to constructing the appropriate NCBI query.

Cheers,

Julius

-----------------------------------------------------
http://openwetware.org/wiki/User:Lucks
-----------------------------------------------------

On Feb 22, 2007, at 6:41 PM, Julius Lucks wrote:

> Hi Alex,
>
> I am not sure if anyone has addressed your issue. Which module are
> you using (i.e. where did you import 'client' from)?
>
> Julius
>
> -----------------------------------------------------
> http://openwetware.org/wiki/User:Lucks
> -----------------------------------------------------
>
>
>
> On Feb 21, 2007, at 1:31 PM, Alex Coventry wrote:
>
>>
>> Hi. The following query results in an error message from NCBI:
>>
>> >>> print client.search('''
>> NP_542420
>> ATRTC
>> ''', db='protein').efetch(retmode='text', rettype='fasta').read()
>> ... ... ... ...
>> Error: Internal Error
>>
>> I expected this query to return two sequences in fasta format. The same
>> query seems to work without problems at
>> .
>>
>> Searching for either query separately also seems to work. E.g.:
>>
>> >>> print client.search('''
>> ATRTC
>> ''', db='protein').efetch(retmode='text', rettype='fasta').read()
>> ... ... ... >gi|71620|pir||ATRTC actin beta - rat
>> MDDDIAALVVDNGSAMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYP
>> IEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVL
>> SLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVR
>> DIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTFN
>> SIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLS
>> TFQQMWISKQEYDESGPSIVHRKCF
>>
>> Am I doing something wrong, here?
>>
>> Alex
>> _______________________________________________
>> BioPython mailing list - BioPython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
>
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

From km at mrna.tn.nic.in Fri Feb 23 16:14:46 2007
From: km at mrna.tn.nic.in (km)
Date: Sat, 24 Feb 2007 02:44:46 +0530
Subject: [BioPython] Bio.PDB centroid
Message-ID: <20070223211446.GA32138@mrna.tn.nic.in>

Hi all,

Is there any inbuilt function in Bio.PDB to find out the center of mass of the side chain of a residue? Any suggestions?

tia,
regards,
KM

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From sbassi at gmail.com Fri Feb 23 16:34:58 2007
From: sbassi at gmail.com (Sebastian Bassi)
Date: Fri, 23 Feb 2007 18:34:58 -0300
Subject: [BioPython] Error retrieving from GenBank
Message-ID:

Hello,

Is this a biopython bug or just a connexion problem from my side? My internet connexion is not good, but this problem sounds like biopython not interacting with eutils.

>>> from Bio import GenBank
>>> gilist=GenBank.search_for("beta-conglycinin")
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1890, in search_for
    ids.append(db_id.dbids.ids[0])
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 124, in _get_dbids
    infile = self.efetch(retmode = "text", rettype = "uilist")
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 150, in efetch
    complexity = complexity)
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/ThinClient.py", line 987, in efetch_using_dbids
    query = {"id": id_string,
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/ThinClient.py", line 644, in _get
    return self.opener.open(url)
  File "/usr/lib/python2.4/urllib2.py", line 358, in open
    response = self._open(req, data)
  File "/usr/lib/python2.4/urllib2.py", line 376, in _open
    '_open', req)
  File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.4/urllib2.py", line 1021, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.4/urllib2.py", line 996, in do_open
    raise URLError(err)
urllib2.URLError:
>>>

--
Bioinformatics news: http://www.bioinformatica.info
Lriser: http://www.linspire.com/lraiser_success.php?serial=318

From lucks at fas.harvard.edu Fri Feb 23 18:13:41 2007
From: lucks at fas.harvard.edu (Julius Lucks)
Date: Fri, 23 Feb 2007 18:13:41 -0500
Subject: [BioPython] Error retrieving from GenBank
In-Reply-To:
References:
Message-ID:

Hi Sebastian,

It looks like your error occurred within python 2.4's urllib, so it might be that. Otherwise, I am using python 2.5 and got your code to work (with a gilist of length 930). The query took a little while (maybe about 3 minutes), so if you have a bad connection, something might have timed out.

Julius

-----------------------------------------------------
http://openwetware.org/wiki/User:Lucks
-----------------------------------------------------

On Feb 23, 2007, at 4:34 PM, Sebastian Bassi wrote:

> Hello,
>
> Is this a biopython bug or just a connexion problem from my side? My
> internet connexion is not good, but this problem sounds like
> biopython not interacting with eutils.
>
> >>> from Bio import GenBank
> >>> gilist=GenBank.search_for("beta-conglycinin")
> Traceback (most recent call last):
>   File "", line 1, in ?
>   File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1890, in search_for
>     ids.append(db_id.dbids.ids[0])
>   File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 124, in _get_dbids
>     infile = self.efetch(retmode = "text", rettype = "uilist")
>   File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 150, in efetch
>     complexity = complexity)
>   File "/usr/lib/python2.4/site-packages/Bio/EUtils/ThinClient.py", line 987, in efetch_using_dbids
>     query = {"id": id_string,
>   File "/usr/lib/python2.4/site-packages/Bio/EUtils/ThinClient.py", line 644, in _get
>     return self.opener.open(url)
>   File "/usr/lib/python2.4/urllib2.py", line 358, in open
>     response = self._open(req, data)
>   File "/usr/lib/python2.4/urllib2.py", line 376, in _open
>     '_open', req)
>   File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain
>     result = func(*args)
>   File "/usr/lib/python2.4/urllib2.py", line 1021, in http_open
>     return self.do_open(httplib.HTTPConnection, req)
>   File "/usr/lib/python2.4/urllib2.py", line 996, in do_open
>     raise URLError(err)
> urllib2.URLError:
> >>>
>
>
> --
> Bioinformatics news: http://www.bioinformatica.info
> Lriser: http://www.linspire.com/lraiser_success.php?serial=318
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

From sbassi at gmail.com Fri Feb 23 21:54:58 2007
From: sbassi at gmail.com (Sebastian Bassi)
Date: Fri, 23 Feb 2007 23:54:58 -0300
Subject: [BioPython] Error retrieving from GenBank
In-Reply-To:
References:
Message-ID:

On 2/23/07, Julius Lucks wrote:
> It looks like your error occurred within python 2.4's urllib, so it might be
> that. Otherwise, I am using python 2.5 and got your code to work (with a
> gilist of length 930). The query took a little while (maybe about 3
> minutes), so if you have a bad connection, something might have timed out.

Thank you very much. You were right about the time out. I tried with a search for more terms that returned less data and it worked.

Best,
SB.

--
Bioinformatics news: http://www.bioinformatica.info
Lriser: http://www.linspire.com/lraiser_success.php?serial=318

From km at mrna.tn.nic.in Sat Feb 24 13:58:10 2007
From: km at mrna.tn.nic.in (km)
Date: Sun, 25 Feb 2007 00:28:10 +0530
Subject: [BioPython] Bio.PDB centroid
In-Reply-To: <2d7c25310702240204i388293b9l552d0d1842cea17a@mail.gmail.com>
References: <20070223211446.GA32138@mrna.tn.nic.in> <2d7c25310702240204i388293b9l552d0d1842cea17a@mail.gmail.com>
Message-ID: <20070224185810.GA14553@mrna.tn.nic.in>

On Sat, Feb 24, 2007 at 11:04:23AM +0100, Thomas Hamelryck wrote:
> what about:
>
> centroid=Vector(0,0,0)
> counter=0.0
> for atom in residue:
>     centroid=centroid+atom.get_vector()
>     counter+=1
> centroid=centroid/counter
>
> This includes all residue atoms - put in a little test to exclude main chain
> atoms if necessary.

Well, that gives only an average of the x, y, z coordinates. But I remember that it is a little more complex than that - I look at it as a problem of finding the centroid of a polygon, which takes into account the area of the triangular regions as weights in the calculation of the centroid. I do not know how to approach it in 3-dimensional coordinate space.

regards,
KM

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
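
[Editorial sketch, not part of the original thread.] The distinction raised above, between a plain coordinate average and a center of mass, can be illustrated with a small self-contained example. If atoms are treated as point masses, the center of mass is simply the mass-weighted mean of their coordinates, so no polygon or area calculation should be needed. The function name, the mass table, the backbone-name set and the serine coordinates below are all assumptions made for illustration; none of them come from Bio.PDB. With Bio.PDB, the input tuples could presumably be built from atom.get_name(), atom.element and atom.get_coord(), but the element attribute only exists in Biopython releases newer than the 1.42 discussed here, so treat that as an assumption too.

```python
# Approximate atomic masses in daltons for elements common in proteins.
# (Illustrative table; extend it for other elements as needed.)
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "S": 32.06, "SE": 78.971}

# Heavy-atom backbone names used in PDB files; every other atom is
# treated as side chain. Hydrogens are assumed absent or pre-filtered.
BACKBONE = {"N", "CA", "C", "O", "OXT"}


def side_chain_center_of_mass(atoms):
    """Return the mass-weighted mean position of the side-chain atoms.

    `atoms` is an iterable of (atom_name, element, (x, y, z)) tuples.
    Returns None when no side-chain atoms are present (e.g. glycine).
    """
    total_mass = 0.0
    weighted = [0.0, 0.0, 0.0]
    for name, element, coord in atoms:
        if name in BACKBONE:
            continue  # skip main-chain atoms
        mass = ATOMIC_MASS[element.upper()]
        total_mass += mass
        for i in range(3):
            weighted[i] += mass * coord[i]
    if total_mass == 0.0:
        return None
    return tuple(w / total_mass for w in weighted)


# Hypothetical serine residue; the coordinates are made up.
serine = [
    ("N", "N", (0.0, 0.0, 0.0)),
    ("CA", "C", (1.5, 0.0, 0.0)),
    ("C", "C", (2.0, 1.4, 0.0)),
    ("O", "O", (1.3, 2.4, 0.0)),
    ("CB", "C", (2.0, -1.0, 1.0)),
    ("OG", "O", (3.0, -1.0, 2.0)),
]
print(side_chain_center_of_mass(serine))
```

Setting every mass to the same value would reduce this to the unweighted centroid computed by the loop quoted above, restricted to side-chain atoms.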
> > -- > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1130 St Nicholas Avenue > New York, NY 10032 > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From wolfgang.meyer at gmail.com Fri Feb 2 23:37:14 2007 From: wolfgang.meyer at gmail.com (Wolfgang Meyer) Date: Sat, 3 Feb 2007 00:37:14 +0100 Subject: [BioPython] Do you mean Numpy or Numeric? In-Reply-To: <45C38A4A.2000301@maubp.freeserve.co.uk> References: <45C38A4A.2000301@maubp.freeserve.co.uk> Message-ID: On 2/2/07, Peter wrote: > > Wolfgang Meyer wrote: > > Hi, > > > > I read the previous posts about numpy / numeric confusions. I were also > > stumbled by it very much and wanted some issues to be clarified. > > > > In the documentation for BioPython on web, many input/output arrays > > are specified as "Numpy arrays". However, I think it should be "Numeric > > arrays". > > > > ... > > > > Why? Because the function is expecting a Numeric array, > > rather than Numpy array, which is specified in the documentation. > > Its because "NumPy" means something different now, than when that was > written (when it meant the Numeric library). See below. Thank! Good to learn this. I read the below text a long time ago but did not pay attention to the "misuse" of "numpy". > So why is this the case? I know Numeric is not under development > > anymore, and Numpy is a "new" Numeric. > > See: http://www.scipy.org/History_of_SciPy > > (1) Numeric > Once upon a time there was a python array and matrix library called > "Numerical Python" which was used with "import Numeric" but was > frequently referred to as simply "NumPy". > > (2) numarray > Then there was a new, incompatible, branch NumArray used with "import > numarry" which was much better for large array (but not so good with > small arrays). 
> > (3) numpy > Now developers are working on the third iteration, which ended up being > called "NumPy" and used with "import numpy". > > So the term "NumPy" in recent documentation means the third project, > used with "import numpy". BUT in older documentation it is shorthand > for the first project, "Numerical python" used with "import Numeric". > > > But according to my > > experiences they are really different in many aspects so far. If one > > mixes them in programs, one can really have a lot headaches. > > Mixing the two is a certainly a bad idea :) But certainly there are scenarios where both are needed. For example, a user who is in favor of both numpy and biopython will be simply pathetic :-) > So I suggest to give real instructions in documentation, instead > > of causing more confusions. > > Apart from "KDTree.py" are there any other uses of the term "NumPy" > which would be better as "Numeric"? IMHO, wherever "numpy" shows up in the current documentation should be replaced by 'numeric'. Simply image a user who just come to the filed and is unaware of the development history of numpy, (nor does he/she is aware of the 'misuse' of the term). When he/she is reading BioPython documentation, he will definitely take "NumPy" for "numpy". BioPython will have to move from Numeric to NumPy at some point anyway... Looking forward to that. Peter > > -- Wolfgang Meyer From biopython at maubp.freeserve.co.uk Sat Feb 3 17:50:13 2007 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 03 Feb 2007 17:50:13 +0000 Subject: [BioPython] Do you mean Numpy or Numeric? In-Reply-To: References: <45C38A4A.2000301@maubp.freeserve.co.uk> Message-ID: <45C4CB55.4050801@maubp.freeserve.co.uk> Peter wrote: > ... the term "NumPy" in recent documentation means the third project, > used with "import numpy". BUT in older documentation it is shorthand > for the first project, "Numerical python" used with "import Numeric". 
Wolfgang Meyer wrote: > IMHO, wherever "numpy" shows up in the current documentation should be > replaced by 'numeric'. Simply image a user who just come to the filed > and is unaware of the development history of numpy, (nor does he/she is > aware of the 'misuse' of the term). When he/she is reading BioPython > documentation, he will definitely take "NumPy" for "numpy". I agree with you :) I have updated the python comments (and the one use of "numpy" in the Bio.PDB manual) to change "NumPy" (and other case variants) to use "Numeric" instead. If I've missed anything, let me know. I have NOT made any functional changes to the code. Peter From lucks at fas.harvard.edu Sat Feb 3 18:19:15 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Sat, 3 Feb 2007 13:19:15 -0500 Subject: [BioPython] Do you mean Numpy or Numeric? Message-ID: <3ADAFA9D-6731-4823-9609-AA864E5C0869@fas.harvard.edu> I am in strong favor of migrating from numeric to numpy, especially since matplotlib is moving in that direction. It is extremely frustrating to try to use numpy in code that you write that also uses modules built off of numeric - certainly unmanageable for a new user who might not know the convoluted history of these packages. Any word on whether/how Travis Oliphant can help with this migration? Cheers, Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- From biopython at maubp.freeserve.co.uk Sun Feb 4 14:55:10 2007 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 04 Feb 2007 14:55:10 +0000 Subject: [BioPython] Bio.SeqIO and Clustal aka Clustalw files Message-ID: <45C5F3CE.9080202@maubp.freeserve.co.uk> Hello list, I've been working on new Bio.SeqIO code for reading and writing clustal alignments. 
For more details about Bio.SeqIO, see here:
http://www.biopython.org/wiki/SeqIO

One issue that has recently come to my attention is how to deal with
clustal alignments with repeated sequence identifiers.

Clustalw 1.83 will reject any file where the first 30 characters of the
identifier are not unique (regardless of the file format).

However, there is nothing in the clustal file format which prevents
this. For example, BioEdit 5.0.7 will happily read and write clustal
format alignments with repeated entries.

Should Bio.SeqIO also be tolerant like this? It's not quite as concise
as the current code, but I have got a rough version of the parser ready
which copes with such files. Any views?

Peter

From mdehoon at c2b2.columbia.edu Sun Feb 4 18:50:14 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Sun, 4 Feb 2007 13:50:14 -0500
Subject: [BioPython] Bio.SeqIO and Clustal aka Clustalw files
References: <45C5F3CE.9080202@maubp.freeserve.co.uk>
Message-ID: <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu>

> Clustalw 1.83 will reject any file where the first 30 characters of the
> identifier are not unique (regardless of the file format).
>
> However, there is nothing in the clustal file format which prevents
> this. For example, BioEdit 5.0.7 will happily read and write clustal
> format alignments with repeated entries.
>
> Should Bio.SeqIO also be tolerant like this?

Yes, I think so. Some users may want to write a file in the Clustal format
to use it with some program other than Clustal. Also, assuming that clustal
gives a clear error message when the file contains longer identifiers, that
should be sufficient to enable the user to fix the problem.

By the way, let us know when you feel that the Bio.SeqIO code is ready to be
included in the next Biopython release (code-named Bronx).

--Michiel.
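
As an illustration of the 30-character rule discussed in this thread, a user could check a set of identifiers before handing a file to Clustalw. The following is a rough sketch, not Bio.SeqIO code: the helper name `find_clashing_ids` and its `significant` parameter are made up for this example, and it only looks at the leading characters of each identifier.

```python
from collections import defaultdict


def find_clashing_ids(identifiers, significant=30):
    """Group identifiers that share their first `significant` characters.

    Returns a dict mapping each shared prefix to the list of identifiers
    that collide on it; identifiers with a unique prefix are left out.
    """
    groups = defaultdict(list)
    for ident in identifiers:
        groups[ident[:significant]].append(ident)
    return {prefix: ids for prefix, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    # Made-up identifiers: the two long ones differ only after character 30.
    long_prefix = "x" * 30
    ids = ["seq_A", long_prefix + "_1", long_prefix + "_2", "seq_B"]
    for prefix, clashing in find_clashing_ids(ids).items():
        print("clash on %r: %s" % (prefix, ", ".join(clashing)))
```

A parser that stays tolerant of repeated entries could still run a check like this and leave it to the caller to decide whether a clash matters.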
Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3185 bytes Desc: not available URL: From lucks at fas.harvard.edu Sun Feb 4 19:24:25 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Sun, 4 Feb 2007 14:24:25 -0500 Subject: [BioPython] Bio.SeqIO and Clustal aka Clustalw files In-Reply-To: <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> References: <45C5F3CE.9080202@maubp.freeserve.co.uk> <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> Message-ID: <8B44D1DD-6AEE-475E-8ADF-18F271E99196@fas.harvard.edu> What about throwing some sort of error message if there are non- unique id's? Something like what happens when you use NCBIWWW.qblast (from Bio.Blast), where it warns you that qblast only works with certain databases. That way users unaware of this issue in Clustalw will learn about it, and Bio.SeqIO will still permit you the freedom to have non-unique id's. Maybe also make this warning message easy to turn off so it doesn't get annoying. Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- On Feb 4, 2007, at 1:50 PM, Michiel De Hoon wrote: >> Clustalw 1.83 will reject any file where the first 30 characters >> of the >> identifier are not unique (regardless of the file format). >> >> However, there is nothing in the clustal file format which prevents >> this. For example, BioEdit 5.0.7 will happily read and write clustal >> format alignments with repeated entries. >> >> Should Bio.SeqIO also be tolerant like this? > > Yes, I think so. Some users may want to write a file in the Clustal > format to > use it with some program other Clustal. 
Also, assuming that clustal > gives a > clear error message when the file contains longer identifiers, that > should be > sufficient to enable the user to fix the problem. > > By the way, let us know when you feel that the Bio.SeqIO code is > ready to be > included in the next Biopython release (code-named Bronx). > > --Michiel. > > > > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1150 St Nicholas Avenue > New York, NY 10032 > > > > > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From biopython at maubp.freeserve.co.uk Sun Feb 4 20:13:41 2007 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 04 Feb 2007 20:13:41 +0000 Subject: [BioPython] Bio.SeqIO and Clustal aka Clustalw files In-Reply-To: <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> References: <45C5F3CE.9080202@maubp.freeserve.co.uk> <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB30@mail.exch.c2b2.columbia.edu> Message-ID: <45C63E75.3030304@maubp.freeserve.co.uk> Michiel De Hoon wrote: >> Clustalw 1.83 will reject any file where the first 30 characters of the >> identifier are not unique (regardless of the file format). >> >> However, there is nothing in the clustal file format which prevents >> this. For example, BioEdit 5.0.7 will happily read and write clustal >> format alignments with repeated entries. >> >> Should Bio.SeqIO also be tolerant like this? > > Yes, I think so. Some users may want to write a file in the Clustal format to > use it with some program other Clustal. I was hoping you would agree. Done. > Also, assuming that clustal gives a clear error message when the file > contains longer identifiers, that should be sufficient to enable the > user to fix the problem. 
The clustal programs do seem to be able to read clustal files with identifiers longer than 30 characters (I tried a hand made file with identifiers 55 characters long). This is good. Regardless of the input file format, if your input sequences have long identifiers they are silently truncated to 30 characters. In addition, any colons in the identifier are silently converted into underscores on loading. Both the command line ClustalW 1.83 and the GUI tool ClustalX 1.83 then give a very explicit error message if there are non unique identifiers: ERROR: Multiple sequences found with same name, XXXX (first 30 chars are significant) (where XXXX is the repeated identifier). Peter From lucks at fas.harvard.edu Sun Feb 4 20:22:50 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Sun, 4 Feb 2007 15:22:50 -0500 Subject: [BioPython] CVS and Fink Installations Message-ID: <423AA91E-18B7-4DBB-A9EA-257C36755290@fas.harvard.edu> Hey Guys, I am using biopython 1.42 (with python 2.5) installed with fink on an OSX platform. However, I am getting more interested in biopython development these days, and would like to check out the latest code from CVS. The problem is I would also like my fink version to stay around as a stable version that I can rely on for my research if something is buggy in the latest CVS. Do you have any suggestions on how to have BOTH the fink and CVS code installed such that I can switch between the two? I appreciate your help and any suggestions on how you guys deal with unstable biopython development code and code you need to be stable for your other work. 
Cheers,

Julius

-----------------------------------------------------
http://openwetware.org/wiki/User:Lucks
-----------------------------------------------------

From kaiserl at MIT.EDU Sun Feb 4 20:35:05 2007
From: kaiserl at MIT.EDU (Liselotte Kaiser)
Date: Sun, 04 Feb 2007 15:35:05 -0500
Subject: [BioPython] unsubscribe
Message-ID: <20070204153505.5f11msl6t5c8wocs@webmail.mit.edu>

I would like to unsubscribe but forgot my password.

Liselotte

-------------------------------------------
Liselotte Kaiser, Ph.D.
Center for Biomedical Engineering
Massachusetts Institute of Technology
500 Technology Square, NE47-307
Cambridge, MA 02139-4307
phone: 617-324-7612
email: kaiserl at mit.edu

From mdehoon at c2b2.columbia.edu Sun Feb 4 23:33:23 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Sun, 4 Feb 2007 18:33:23 -0500
Subject: [BioPython] CVS and Fink Installations
References: <423AA91E-18B7-4DBB-A9EA-257C36755290@fas.harvard.edu>
Message-ID: <2AB4D8AD00B9DE47A1E5FA2321BCA11D09AB33@mail.exch.c2b2.columbia.edu>

> I am using biopython 1.42 (with python 2.5) installed with fink on an
> OSX platform. However, I am getting more interested in biopython
> development these days, and would like to check out the latest code
> from CVS.

Great!

> Do you have any suggestions on how to have BOTH the fink and CVS code
> installed such that I can switch between the two?

Download the code from CVS, do "python setup.py build" as usual, but don't
do "python setup.py install". Then, when you want to use the CVS version,
set PYTHONPATH to the directory in which Biopython was built. For example,
in my case, that would be

export PYTHONPATH=/users/mdehoon/biopython/build/lib.linux-i686-2.5/

Then, if you do

>>> from Bio import something

it will import it from your code built from CVS. You can do the same trick
from inside Python; see the function main in Tests/run_tests.py for an
example.

--Michiel.
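
The "same trick from inside Python" mentioned above presumably amounts to putting the build directory at the front of `sys.path`. The snippet below is a hedged sketch of that idea rather than the code in Tests/run_tests.py: the helper name `use_build_dir` is invented here, and the path in the comment is only the example path from the message.

```python
import os
import sys


def use_build_dir(build_dir):
    """Put build_dir first on sys.path so imports are resolved there first.

    Call this before the first "import Bio"; modules that are already
    imported stay cached in sys.modules and are not reloaded.
    """
    build_dir = os.path.abspath(build_dir)
    if build_dir in sys.path:
        sys.path.remove(build_dir)  # avoid duplicate entries
    sys.path.insert(0, build_dir)
    return build_dir


# Example usage (path taken from the message above, adjust to your build):
#   use_build_dir("/users/mdehoon/biopython/build/lib.linux-i686-2.5")
#   import Bio
#   print(Bio.__file__)   # shows which installation was picked up
```

Printing `Bio.__file__` afterwards is a quick way to confirm whether the stable or the development copy is in use.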
From meesters at uni-mainz.de Mon Feb 5 08:40:21 2007
From: meesters at uni-mainz.de (Christian Meesters)
Date: Mon, 5 Feb 2007 09:40:21 +0100
Subject: [BioPython] Do you mean Numpy or Numeric?
In-Reply-To: <45C38A4A.2000301@maubp.freeserve.co.uk>
References: <45C38A4A.2000301@maubp.freeserve.co.uk>
Message-ID: <200702050940.21896.meesters@uni-mainz.de>

Hi

On Friday 02 February 2007 20:00, Peter wrote:
> BioPython will have to move from Numeric to NumPy at some point anyway...

Just FYI: I might point at a recent discussion on the numpy mailing list:
http://projects.scipy.org/pipermail/numpy-discussion/2007-January/025796.html

Travis Oliphant, one of the main scipy/numpy authors, offered help migrating
libraries which depend on Numeric to numpy. Biopython was actually mentioned
as one of the packages which might profit from some support, but I don't
know whether any of the Biopython authors jumped in or are even aware of
that thread ...

Cheers
Christian

From rcsqtc at iiqab.csic.es Mon Feb 5 12:40:57 2007
From: rcsqtc at iiqab.csic.es (Ramon Crehuet)
Date: Mon, 05 Feb 2007 13:40:57 +0100
Subject: [BioPython] Sequences and alignments
Message-ID: <45C725D9.9060504@iiqab.csic.es>

Dear all,

I have done an alignment with TM-align and now I need to compare some
numerical data associated with each structure, residue by residue, according
to the alignment. Imagine I have an array of the B-factor of each C_alpha
and want to compare these values for equivalent residues, according to the
alignment.

I am quite new to the subject of alignments and don't know how to deal with
this data. I've looked at the Biopython documentation but I can't find any
hint. Ideally I would like to be able to iterate along the alignment, but is
that possible? I'm sure this raises many points on how to treat gaps, etc.,
but I was wondering if somebody has some experience with that.
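
One way to iterate along a pairwise alignment and compare per-residue values is to walk the two gapped sequences column by column while keeping a separate residue counter for each. The sketch below is an assumption-laden illustration and not a Biopython API: the function name `paired_values`, the `gap_chars` parameter and the toy data are all hypothetical, and skipping every column that contains a gap is just one possible policy.

```python
def paired_values(aligned_a, aligned_b, values_a, values_b, gap_chars="-"):
    """Pair per-residue values for columns where both sequences have a residue.

    aligned_a, aligned_b: gapped sequences of equal length.
    values_a, values_b: one value per ungapped residue (e.g. B-factors).
    Returns a list of (column, value_a, value_b) tuples.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = []
    i = j = 0  # index of the next residue in each ungapped sequence
    for column, (char_a, char_b) in enumerate(zip(aligned_a, aligned_b)):
        a_is_gap = char_a in gap_chars
        b_is_gap = char_b in gap_chars
        if not a_is_gap and not b_is_gap:
            pairs.append((column, values_a[i], values_b[j]))
        if not a_is_gap:
            i += 1
        if not b_is_gap:
            j += 1
    if i != len(values_a) or j != len(values_b):
        raise ValueError("number of values does not match number of residues")
    return pairs


if __name__ == "__main__":
    # Toy alignment with made-up values, for illustration only.
    for column, b_a, b_b in paired_values("AC-DE", "A-GDE",
                                          [10.0, 11.0, 12.0, 13.0],
                                          [20.0, 21.0, 22.0, 23.0]):
        print(column, b_a, b_b, b_a - b_b)
```

Columns with a gap in either sequence are skipped here; a different gap treatment would only change the body of the loop.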
Cheers, Ramon From wolfgang.meyer at gmail.com Tue Feb 6 18:45:11 2007 From: wolfgang.meyer at gmail.com (Wolfgang Meyer) Date: Tue, 6 Feb 2007 19:45:11 +0100 Subject: [BioPython] StructureAlignment gap code Message-ID: Hi, Currently in this module it assume that gaps in alignments are represented by '-' (hyphen). But it is not unusual that some programs use '.' (dot) to represent gaps (e.g. Dali). And this perhaps could be solved by replacing the '-' with fasta_align._alphabet.new_letters() -- Wolfgang Meyer From lucks at fas.harvard.edu Wed Feb 7 18:37:16 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Wed, 7 Feb 2007 13:37:16 -0500 Subject: [BioPython] Biopython Citation Message-ID: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> Hey all, I am preparing a paper and I want to cite biopython since I used it extensively in the research. Is there a standard citation I can use? Cheers, Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- From mdehoon at c2b2.columbia.edu Wed Feb 7 20:12:30 2007 From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon) Date: Wed, 07 Feb 2007 15:12:30 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> Message-ID: <45CA32AE.6050809@c2b2.columbia.edu> Julius Lucks wrote: > I am preparing a paper and I want to cite biopython since I used it > extensively in the research. Is there a standard citation I can use? Brad Chapman and Jeffrey Chang wrote a paper for the ACM sigbio newsletter in 2000: http://portal.acm.org/citation.cfm?id=360268 I don't know of any general Biopython paper since then. --Michiel. 
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1130 St Nicholas Avenue New York, NY 10032 From lpritc at scri.ac.uk Thu Feb 8 09:10:04 2007 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 08 Feb 2007 09:10:04 +0000 Subject: [BioPython] Biopython Citation In-Reply-To: <45CA32AE.6050809@c2b2.columbia.edu> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> Message-ID: <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> Perhaps its due time for a short description of the project to be sent in to Bioinformatics or BMC Bioinformatics? L. On Wed, 2007-02-07 at 15:12 -0500, Michiel Jan Laurens de Hoon wrote: > Julius Lucks wrote: > > I am preparing a paper and I want to cite biopython since I used it > > extensively in the research. Is there a standard citation I can use? > > Brad Chapman and Jeffrey Chang wrote a paper for the ACM sigbio > newsletter in 2000: > > http://portal.acm.org/citation.cfm?id=360268 > > I don't know of any general Biopython paper since then. > > --Michiel. > -- Dr Leighton Pritchard AMRSC D131, Plant Pathology, Scottish Crop Research Institute W: http://bioinf.scri.ac.uk/lp E: lpritc at scri.ac.uk GPG: 0xE58BA41B _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ SCRI, Invergowrie, Dundee, DD2 5DA. The Scottish Crop Research Institute is a charitable company limited by guarantee. Registered in Scotland No: SC 29367. Recognised by the Inland Revenue as a Scottish Charity No: SC 006662. DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. 
If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.ac.uk quoting the name of the sender and delete the email from your system. Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any). From lpritc at scri.ac.uk Thu Feb 8 15:36:24 2007 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 08 Feb 2007 15:36:24 +0000 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> Message-ID: <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> On Thu, 2007-02-08 at 09:10 -0500, Julius Lucks wrote: > I imagine there have been substantial changes since then. Or what > about PLoS computational biology, provided we can get the page charges > deferred (which I think they are up for). I just pulled two possible targets out of thin air... PLoS Comp Biol is another good one. The discussion on who is going to write it might be more... interesting... ;) L. -- Dr Leighton Pritchard AMRSC D131, Plant Pathology, Scottish Crop Research Institute W: http://bioinf.scri.ac.uk/lp E: lpritc at scri.ac.uk GPG: 0xE58BA41B
From sbassi at gmail.com Thu Feb 8 20:11:36 2007 From: sbassi at gmail.com (Sebastian Bassi) Date: Thu, 8 Feb 2007 21:11:36 +0100 Subject: [BioPython] Biopython Citation In-Reply-To: <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> Message-ID: On 2/8/07, Leighton Pritchard wrote: > The discussion on who is going to write it might be more... > interesting... ;) I've been thinking in the same problem. I would like to collaborate on it. I could open a "Google doc" and then send you an invite so we could both write in the same online document, and add more people later if somebody is interested.
Best, SB -- Bioinformatics news: http://www.bioinformatica.info Lriser: http://www.linspire.com/lraiser_success.php?serial=318 From lucks at fas.harvard.edu Fri Feb 9 01:50:33 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Thu, 8 Feb 2007 20:50:33 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> Message-ID: <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> There is a nice article covering the strengths and weaknesses of BioPython, BioPerl and BioJava aimed at guiding a novice user: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? db=pubmed&cmd=Retrieve&dopt=AbstractPlus&list_uids=12230038 It seemed very unbiased, and is good to read to see how BioPython is viewed within the broader community. Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- On Feb 8, 2007, at 3:11 PM, Sebastian Bassi wrote: > On 2/8/07, Leighton Pritchard wrote: >> The discussion on who is going to write it might be more... >> interesting... ;) > > I've been thinking in the same problem. I would like to collaborate on > it. I could open a "Google doc" and then send you an invite so we > could both write in the same online document, and add more people > later if somebody is interested. 
> Best, > SB > > -- > Bioinformatics news: http://www.bioinformatica.info > Lriser: http://www.linspire.com/lraiser_success.php?serial=318 > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From sbassi at gmail.com Fri Feb 9 02:20:47 2007 From: sbassi at gmail.com (Sebastian Bassi) Date: Thu, 8 Feb 2007 23:20:47 -0300 Subject: [BioPython] Biopython Citation In-Reply-To: <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> Message-ID: On 2/8/07, Julius Lucks wrote: > There is a nice article covering the strengths and weaknesses of BioPython, > BioPerl and BioJava aimed at guiding a novice user: Yes, but BioPython grew a lot from 2002. -- Bioinformatics news: http://www.bioinformatica.info Lriser: http://www.linspire.com/lraiser_success.php?serial=318 From mdehoon at c2b2.columbia.edu Fri Feb 9 03:49:53 2007 From: mdehoon at c2b2.columbia.edu (Michiel de Hoon) Date: Thu, 08 Feb 2007 22:49:53 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> Message-ID: <45CBEF61.5090101@c2b2.columbia.edu> Sebastian Bassi wrote: > On 2/8/07, Julius Lucks wrote: >> There is a nice article covering the strengths and weaknesses of BioPython, >> BioPerl and BioJava aimed at guiding a novice user: > > Yes, but BioPython grew a lot from 2002. 
> Currently, Biopython is undergoing some major improvements, for example with the Blast parser and especially with the new Bio.SeqIO. I expect that Biopython in two or three months will be a lot more transparent and user-friendly than it is now. It may be worthwhile to postpone submitting a Biopython paper until these improvements have made it into a Biopython release. Don't let that stop you from starting to write a paper, though :-). --Michiel. From lpritc at scri.ac.uk Fri Feb 9 10:25:44 2007 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Fri, 09 Feb 2007 10:25:44 +0000 Subject: [BioPython] Biopython Citation In-Reply-To: References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <45CA32AE.6050809@c2b2.columbia.edu> <1170925804.11097.177.camel@lplinuxdev.scri.sari.ac.uk> <1170948984.11097.283.camel@lplinuxdev.scri.sari.ac.uk> Message-ID: <1171016744.11097.316.camel@lplinuxdev.scri.sari.ac.uk> On Thu, 2007-02-08 at 21:11 +0100, Sebastian Bassi wrote: > On 2/8/07, Leighton Pritchard wrote: > > The discussion on who is going to write it might be more... > > interesting... ;) > > I've been thinking in the same problem. I would like to collaborate on > it. I could open a "Google doc" and then send you an invite so we > could both write in the same online document, and add more people > later if somebody is interested. > Best, > SB I'd be very happy to collaborate on writing this sort of paper but, as my own contributions to the code have been minuscule, I want to make sure that the major contributors to, and developers of, the code base are OK with that, and also that they take the lion's share of the credit, regardless of how many words they put into the text. 
I don't want to be pushing myself forward at someone else's expense, taking credit for work that I didn't do, or generating friction within the community in arguments over who is, and who is not, named, and where they are in the pecking order ;)

Michiel wrote:
> Currently, Biopython is undergoing some major improvements, for example
> with the Blast parser and especially with the new Bio.SeqIO. I expect
> that Biopython in two or three months will be a lot more transparent and
> user-friendly than it is now. It may be worthwhile to postpone
> submitting a Biopython paper until these improvements have made it into
> a Biopython release.
>
> Don't let that stop you from starting to write a paper, though :-).

With that in mind, I'd be glad to help in drawing up a framework for the paper, and writing it, with a view to submitting after the next release, possibly in May or June. I've not used Google Docs before, but if you'd like to set that up, Sebastian, I'm interested to see how collaborative writing works on it.

L.

--
Dr Leighton Pritchard AMRSC
D131, Plant Pathology, Scottish Crop Research Institute
W: http://bioinf.scri.ac.uk/lp
E: lpritc at scri.ac.uk
GPG: 0xE58BA41B

From hjm at tacgi.com Thu Feb 15 02:08:37 2007
From: hjm at tacgi.com (Harry Mangalam)
Date: Wed, 14 Feb 2007 18:08:37 -0800
Subject: [BioPython] Biopython Citation
In-Reply-To:
References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu>
Message-ID: <200702141808.37738.hjm@tacgi.com>

On Thursday 08 February 2007 18:20, Sebastian Bassi wrote:
> On 2/8/07, Julius Lucks wrote:
> > There is a nice article covering the strengths and weaknesses of
> > BioPython, BioPerl and BioJava aimed at guiding a novice user:

Thanks :)

> Yes, but BioPython grew a lot from 2002.

So have the other bio-projects. For example, I noted BioRuby in passing but didn't include it in the survey - it's now gone to release 1.0, some kind of a step. It certainly is time for a re-review. What would be quite nice is a longish paper, perhaps in an e-format, with a short form published on paper. If key personnel from each of the Bio-efforts contributed what they think are the strengths and weaknesses of their own approach, it would be extremely useful to a new crop of students.

Some of the problem in choosing a bio, as I mentioned in the paper, is dependent on the language itself, some is the approach to the problem. Certainly, contributors to each group read each other's lists and can best lay out their arguments.
Some of the attraction to users is also dependent on how the different distributions tend to package the contributions and the frequency that they're updated. I haven't checked the biopython releases recently, but the Ubuntu distro, for example, does a very good job of packaging popular perl modules, with the result that I haven't had to resort to CPAN for a very long time. Just my (possibly mistaken) impression, but Python utilities and modules seems to be a little less well packaged, at least by Ubuntu. -- Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, UC Irvine 92697 949 824 0084(o), 949 285 4487(c) harry.mangalam at uci.edu -- Cheers, Harry Harry J Mangalam - 949 856 2847(o) 949 285 4487(c) (email for fax) hjm at tacgi.com [plain text preferred] From hlapp at gmx.net Thu Feb 15 16:55:19 2007 From: hlapp at gmx.net (Hilmar Lapp) Date: Thu, 15 Feb 2007 11:55:19 -0500 Subject: [BioPython] Biopython Citation In-Reply-To: <200702141808.37738.hjm@tacgi.com> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> <200702141808.37738.hjm@tacgi.com> Message-ID: <05D00193-F72B-4695-A275-CB3E29BB534D@gmx.net> BTW Jason and I wrote a relatively high-level review last summer: Briefings in Bioinformatics 2006 7(3):287-296 http://dx.doi.org/10.1093/bib/bbl026 It's not a Bio* comparison paper, though. In fact, it rather tries to highlight the strengths of various toolkits and standards in the life sciences. -hilmar On Feb 14, 2007, at 9:08 PM, Harry Mangalam wrote: > On Thursday 08 February 2007 18:20, Sebastian Bassi wrote: >> On 2/8/07, Julius Lucks wrote: >>> There is a nice article covering the strengths and weaknesses of >>> BioPython, BioPerl and BioJava aimed at guiding a novice user: > > Thanks :) > >> Yes, but BioPython grew a lot from 2002. > > So have the other bio-projects. 
For example, I noted BioRuby in > passing but didn't include it in the survey - it's now gone to > release 1.0, some kind of a step. It certainly is time for a > re-review. What would be quite nice is a longish paper, perhaps in > an e-format, with a short form published on paper. If key personnel >> from each of the Bio-efforts contributed what they think are the > strengths and weaknesses of their own approach, it would be extremely > useful to a new crop of students. > > Some of the problem in choosing a bio, as I mentioned in the paper, is > dependent on the language itself, some is the approach to the > problem. Certainly, contributors to each group read each others > lists and can best lay out their arguments. > > Some of the attraction to users is also dependent on how the different > distributions tend to package the contributions and the frequency > that they're updated. I haven't checked the biopython releases > recently, but the Ubuntu distro, for example, does a very good job of > packaging popular perl modules, with the result that I haven't had to > resort to CPAN for a very long time. Just my (possibly mistaken) > impression, but Python utilities and modules seems to be a little > less well packaged, at least by Ubuntu. 
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================

From mdehoon at c2b2.columbia.edu Thu Feb 15 19:54:28 2007
From: mdehoon at c2b2.columbia.edu (Michiel Jan Laurens de Hoon)
Date: Thu, 15 Feb 2007 14:54:28 -0500
Subject: [BioPython] Biopython Citation
In-Reply-To: <200702141808.37738.hjm@tacgi.com>
References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <7E0786A9-243E-4279-B09B-25DAB26D2259@fas.harvard.edu> <200702141808.37738.hjm@tacgi.com>
Message-ID: <45D4BA74.9060102@c2b2.columbia.edu>

Harry Mangalam wrote:
> Some of the attraction to users is also dependent on how the different
> distributions tend to package the contributions and the frequency
> that they're updated. I haven't checked the biopython releases
> recently, but the Ubuntu distro, for example, does a very good job of
> packaging popular perl modules, with the result that I haven't had to
> resort to CPAN for a very long time. Just my (possibly mistaken)
> impression, but Python utilities and modules seems to be a little
> less well packaged, at least by Ubuntu.

Just wondering: What's so bad about having to resort to CPAN?

--Michiel.
-- Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1130 St Nicholas Avenue New York, NY 10032 From hjm at tacgi.com Thu Feb 15 19:49:41 2007 From: hjm at tacgi.com (Harry Mangalam) Date: Thu, 15 Feb 2007 11:49:41 -0800 Subject: [BioPython] Biopython Citation In-Reply-To: <05D00193-F72B-4695-A275-CB3E29BB534D@gmx.net> References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <200702141808.37738.hjm@tacgi.com> <05D00193-F72B-4695-A275-CB3E29BB534D@gmx.net> Message-ID: <200702151149.42023.hjm@tacgi.com> That's a really excellent overview. My only (non)critique is that since there's so much to cover that you couldn't possibly have covered each in the kind of detail it deserves. Maybe the proper way to do this is to wikitize the paper, allowing supporters and users to fill in the gaps. Again, great job - sorry I missed this when it came out. Harry On Thursday 15 February 2007 08:55, Hilmar Lapp wrote: > BTW Jason and I wrote a relatively high-level review last summer: > > Briefings in Bioinformatics 2006 7(3):287-296 > http://dx.doi.org/10.1093/bib/bbl026 > > It's not a Bio* comparison paper, though. In fact, it rather tries > to highlight the strengths of various toolkits and standards in the > life sciences. > > -hilmar > > On Feb 14, 2007, at 9:08 PM, Harry Mangalam wrote: > > On Thursday 08 February 2007 18:20, Sebastian Bassi wrote: > >> On 2/8/07, Julius Lucks wrote: > >>> There is a nice article covering the strengths and weaknesses > >>> of BioPython, BioPerl and BioJava aimed at guiding a novice > >>> user: > > > > Thanks :) > > > >> Yes, but BioPython grew a lot from 2002. > > > > So have the other bio-projects. For example, I noted BioRuby in > > passing but didn't include it in the survey - it's now gone to > > release 1.0, some kind of a step. It certainly is time for a > > re-review. What would be quite nice is a longish paper, perhaps > > in an e-format, with a short form published on paper. 
If key > > personnel > > > >> from each of the Bio-efforts contributed what they think are the > > > > strengths and weaknesses of their own approach, it would be > > extremely useful to a new crop of students. > > > > Some of the problem in choosing a bio, as I mentioned in the > > paper, is dependent on the language itself, some is the approach > > to the problem. Certainly, contributors to each group read each > > others lists and can best lay out their arguments. > > > > Some of the attraction to users is also dependent on how the > > different distributions tend to package the contributions and the > > frequency that they're updated. I haven't checked the biopython > > releases recently, but the Ubuntu distro, for example, does a > > very good job of packaging popular perl modules, with the result > > that I haven't had to resort to CPAN for a very long time. Just > > my (possibly mistaken) impression, but Python utilities and > > modules seems to be a little less well packaged, at least by > > Ubuntu. 
--
Cheers, Harry
Harry J Mangalam - 949 856 2847(o) 949 285 4487(c) (email for fax)
hjm at tacgi.com [plain text preferred]

From hjm at tacgi.com Fri Feb 16 04:44:58 2007
From: hjm at tacgi.com (Harry Mangalam)
Date: Thu, 15 Feb 2007 20:44:58 -0800
Subject: [BioPython] Biopython Citation
In-Reply-To: <45D4BA74.9060102@c2b2.columbia.edu>
References: <389FF9B4-E549-4961-8382-6AD94A9607E6@fas.harvard.edu> <200702141808.37738.hjm@tacgi.com> <45D4BA74.9060102@c2b2.columbia.edu>
Message-ID: <200702152044.58911.hjm@tacgi.com>

On Thursday 15 February 2007 11:54, Michiel Jan Laurens de Hoon wrote:
> Harry Mangalam wrote:
> > Some of the attraction to users is also dependent on how the
> > different distributions tend to package the contributions and the
> > frequency that they're updated. I haven't checked the biopython
> > releases recently, but the Ubuntu distro, for example, does a
> > very good job of packaging popular perl modules, with the result
> > that I haven't had to resort to CPAN for a very long time. Just
> > my (possibly mistaken) impression, but Python utilities and
> > modules seems to be a little less well packaged, at least by
> > Ubuntu.
>
> Just wondering: What's so bad about having to resort to CPAN?
>
> --Michiel.

CPAN is a wonderful installation tool rendered almost obsolete by the even more wonderful apt-get. I'd rather not have to deal with 2 (sometimes) conflicting (or at least complicating) installation paths.
-- Cheers, Harry Harry J Mangalam - 949 856 2847(o) 949 285 4487(c) (email for fax) hjm at tacgi.com [plain text preferred] From lucks at fas.harvard.edu Tue Feb 20 02:23:34 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Mon, 19 Feb 2007 21:23:34 -0500 Subject: [BioPython] Biopython Wiki code-snippets Message-ID: <8580827C-5E6C-4A2A-B478-0C7790E50340@fas.harvard.edu> Hi All, Does anyone know which mediawiki extension the biopython wiki uses to do code-snippet highlighting via the tags? Or what is the email address of the maintainer of the biopython wiki. I appreciate your help. Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- From marc.saric at gmx.de Tue Feb 20 11:29:32 2007 From: marc.saric at gmx.de (Marc Saric) Date: Tue, 20 Feb 2007 12:29:32 +0100 Subject: [BioPython] Biopython Wiki code-snippets In-Reply-To: <8580827C-5E6C-4A2A-B478-0C7790E50340@fas.harvard.edu> References: <8580827C-5E6C-4A2A-B478-0C7790E50340@fas.harvard.edu> Message-ID: <45DADB9C.9070703@gmx.de> Hi Julius, I guess you can always have a look at the Special:Version page, which should list all extensions. My guess would be something like wfGeSHiColorExtension (at least that's what I am using for syntax highlighting). Julius Lucks wrote: > Hi All, > > Does anyone know which mediawiki extension the biopython wiki uses to > do code-snippet highlighting via the tags? Or what is the > email address of the maintainer of the biopython wiki. > > I appreciate your help. > > Julius -- Bye, Marc Saric http://www.marcsaric.de From throwaway at MIT.EDU Wed Feb 21 18:31:56 2007 From: throwaway at MIT.EDU (Alex Coventry) Date: Wed, 21 Feb 2007 13:31:56 -0500 Subject: [BioPython] Problem using efetch Message-ID: Hi. The following query results in an error message from NCBI: >>> print client.search(''' NP_542420 ATRTC ''', db='protein').efetch(retmode='text', rettype='fasta').read() ... ... ... 
... Error: Internal Error I expected this query to return two sequences in fasta format. The same query seems to work without problems at . Searching for either query separately also seems to work. E.g.: >>> print client.search(''' ATRTC ''', db='protein').efetch(retmode='text', rettype='fasta').read() ... ... ... >gi|71620|pir||ATRTC actin beta - rat MDDDIAALVVDNGSAMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYP IEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVL SLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVR DIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTFN SIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLS TFQQMWISKQEYDESGPSIVHRKCF Am I doing something wrong, here? Alex From lucks at fas.harvard.edu Thu Feb 22 23:41:05 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Thu, 22 Feb 2007 18:41:05 -0500 Subject: [BioPython] Problem using efetch In-Reply-To: References: Message-ID: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu> Hi Alex, I am not sure if anyone has addressed your issue. Which module are you using (i.e. where did you import 'client' from)? Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- On Feb 21, 2007, at 1:31 PM, Alex Coventry wrote: > > Hi. The following query results in an error message from NCBI: > >>>> print client.search(''' > NP_542420 > ATRTC > ''', db='protein').efetch(retmode='text', rettype='fasta').read() > ... ... ... ... > Error: Internal Error > > I expected this query to return two sequences in fasta format. The > same > query seems to work without problems at > . > > Searching for either query separately also seems to work. E.g.: > >>>> print client.search(''' > ATRTC > ''', db='protein').efetch(retmode='text', > rettype='fasta').read() > ... ... ... 
>gi|71620|pir||ATRTC actin beta - rat > MDDDIAALVVDNGSAMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYP > IEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVL > SLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVR > DIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTFN > SIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLS > TFQQMWISKQEYDESGPSIVHRKCF > > Am I doing something wrong, here? > > Alex > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From throwaway at MIT.EDU Fri Feb 23 04:01:56 2007 From: throwaway at MIT.EDU (Alex Coventry) Date: Thu, 22 Feb 2007 23:01:56 -0500 Subject: [BioPython] Problem using efetch In-Reply-To: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu> (Julius Lucks's message of "Thu, 22 Feb 2007 18:41:05 -0500") References: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu> Message-ID: Oh, thanks for pointing that out. That was careless of me. It's using Bio.EUtils, i.e. from Bio.EUtils import HistoryClient client = HistoryClient.HistoryClient() This is with Biopython version 1.42. Alex >>>>> "Julius" == Julius Lucks writes: Julius> Hi Alex, Julius> I am not sure if anyone has addressed your issue. Which Julius> module are you using (i.e. where did you import 'client' Julius> from)? 
Julius> Julius From lucks at fas.harvard.edu Fri Feb 23 04:47:39 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Thu, 22 Feb 2007 23:47:39 -0500 Subject: [BioPython] Problem using efetch In-Reply-To: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu> References: <008C3AC2-20CF-4CD2-9E84-9BD95AA426DE@fas.harvard.edu> Message-ID: <1E7E397A-E3D6-4903-8D4F-2B5F8DAD3F22@fas.harvard.edu> Hey Alex, I am not sure why there was an internal error, but if you want to return 2 sequences, then the following adjustment of your query will do the trick from Bio.EUtils import HistoryClient client = HistoryClient.HistoryClient() print client.search('NP_542420 OR ATRTC',db='protein').efetch (retmode='text', rettype='fasta').read() which gives >gi|18093102|ref|NP_542420.1| dynamin 1 [Rattus norvegicus] MGNRGMEDLIPLVNRLQDAFSAIGQNADLDLPQIAVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRPLV LQLVNSTTEYAEFLHCKGKKFTDFEEVRLEIEAETDRVTGTNKGISPVPINLRVYSPHVLNLTLVDLPGM TKVPVGDQPPDIEFQIRDMLMQFVTKENCLILAVSPANSDLANSDALKIAKEVDPQGQRTIGVITKLDLM DEGTDARDVLENKLLPLRRGYIGVVNRSQKDIDGKKDITAALAAERKFFLSHPSYRHLADRMGTPYLQKV LNQQLTNHIRDTLPGLRNKLQSQLLSIEKEVDEYKNFRPDDPARKTKALLQMVQQFAVDFEKRIEGSGDQ IDTYELSGGARINRIFHERFPFELVKMEFDEKELRREISYAIKNIHGIRTGLFTPDLAFEATVKKQVQKL KEPSIKCVDMVVSELTSTIRKCSEKLQQYPRLREEMERIVTTHIREREGRTKEQVMLLIDIELAYMNTNH EDFIGFANAQQRSNQMNKKKTSGNQDEILVIRKGWLTINNIGIMKGGSKEYWFVLTAENLSWYKDDEEKE KKYMLSVDNLKLRDVEKGFMSSKHIFALFNTEQRNVYKDYRQLELACETQEEVDSWKASFLRAGVYPERV GDKEKASETEENGSDSFMHSMDPQLERQVETIRNLVDSYMAIVNKTVRDLMPKTIMHLMINNTKEFIFSE LLANLYSCGDQNTLMEESAEQAQRRDEMLRMYHALKEALSIIGDINTTTVSTPMPPPVDDSWLQVQSVPA GRRSPTSSPTPQRRAPAVPPARPGSRGPAPGPPPAGSALGGAPPVPSRPGASPDPFGPPPQVPSRPNRAP PGVPRITISDP >gi|71620|pir||ATRTC actin beta - rat MDDDIAALVVDNGSAMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYP IEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVL SLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVR DIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTFN 
SIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLS TFQQMWISKQEYDESGPSIVHRKCF The 'OR' should give 2 results in the search, which efetch should handle in return. I am not sure why the code gave the InternalError though, maybe someone else can pinpoint that. In general, the NCBI interface for efetch will make use of whatever results were returned by a previous search when using the NCBI history server (which the HistoryClient does for you). So getting the right list of results might come down to constructing the appropriate NCBI query. Cheers, Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- On Feb 22, 2007, at 6:41 PM, Julius Lucks wrote: > Hi Alex, > > I am not sure if anyone has addressed your issue. Which module are > you using (i.e. where did you import 'client' from)? > > Julius > > ----------------------------------------------------- > http://openwetware.org/wiki/User:Lucks > ----------------------------------------------------- > > > > On Feb 21, 2007, at 1:31 PM, Alex Coventry wrote: > >> >> Hi. The following query results in an error message from NCBI: >> >>>>> print client.search(''' >> NP_542420 >> ATRTC >> ''', db='protein').efetch(retmode='text', rettype='fasta').read() >> ... ... ... ... >> Error: Internal Error >> >> I expected this query to return two sequences in fasta format. The >> same >> query seems to work without problems at >> . >> >> Searching for either query separately also seems to work. E.g.: >> >>>>> print client.search(''' >> ATRTC >> ''', db='protein').efetch(retmode='text', >> rettype='fasta').read() >> ... ... ... 
>gi|71620|pir||ATRTC actin beta - rat >> MDDDIAALVVDNGSAMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKY >> P >> IEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAV >> L >> SLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIV >> R >> DIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTF >> N >> SIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASL >> S >> TFQQMWISKQEYDESGPSIVHRKCF >> >> Am I doing something wrong, here? >> >> Alex >> _______________________________________________ >> BioPython mailing list - BioPython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython > > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From km at mrna.tn.nic.in Fri Feb 23 21:14:46 2007 From: km at mrna.tn.nic.in (km) Date: Sat, 24 Feb 2007 02:44:46 +0530 Subject: [BioPython] Bio.PDB centroid Message-ID: <20070223211446.GA32138@mrna.tn.nic.in> Hi all, Is there any inbuilt function in Bio.PDB to find out the center of mass of the side chain of a residue ? any suggestions ? tia, regards, KM -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. From sbassi at gmail.com Fri Feb 23 21:34:58 2007 From: sbassi at gmail.com (Sebastian Bassi) Date: Fri, 23 Feb 2007 18:34:58 -0300 Subject: [BioPython] Error retrieving from GenBank Message-ID: Hello, Is this biopython bug or just a connexion problem from my side? My internet connexion is not good, but this problems sounds like biopython not interacting with eutils. >>> from Bio import GenBank >>> gilist=GenBank.search_for("beta-conglycinin") Traceback (most recent call last): File "", line 1, in ? 
File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1890, in search_for ids.append(db_id.dbids.ids[0]) File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 124, in _get_dbids infile = self.efetch(retmode = "text", rettype = "uilist") File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 150, in efetch complexity = complexity) File "/usr/lib/python2.4/site-packages/Bio/EUtils/ThinClient.py", line 987, in efetch_using_dbids query = {"id": id_string, File "/usr/lib/python2.4/site-packages/Bio/EUtils/ThinClient.py", line 644, in _get return self.opener.open(url) File "/usr/lib/python2.4/urllib2.py", line 358, in open response = self._open(req, data) File "/usr/lib/python2.4/urllib2.py", line 376, in _open '_open', req) File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain result = func(*args) File "/usr/lib/python2.4/urllib2.py", line 1021, in http_open return self.do_open(httplib.HTTPConnection, req) File "/usr/lib/python2.4/urllib2.py", line 996, in do_open raise URLError(err) urllib2.URLError: >>> -- Bioinformatics news: http://www.bioinformatica.info Lriser: http://www.linspire.com/lraiser_success.php?serial=318 From lucks at fas.harvard.edu Fri Feb 23 23:13:41 2007 From: lucks at fas.harvard.edu (Julius Lucks) Date: Fri, 23 Feb 2007 18:13:41 -0500 Subject: [BioPython] Error retrieving from GenBank In-Reply-To: References: Message-ID: Hi Sebastian, It looks like your error occurred within python 2.4's urllib, so it might be that. Otherwise, I am using python 2.5 and got your code to work (with a gilist of length 930). The query took a little while (maybe about 3 minutes), so if you have a bad connection, something might have timed out. 
Julius ----------------------------------------------------- http://openwetware.org/wiki/User:Lucks ----------------------------------------------------- On Feb 23, 2007, at 4:34 PM, Sebastian Bassi wrote: > Hello, > > Is this biopython bug or just a connexion problem from my side? My > internet connexion is not good, but this problems sounds like > biopython not interacting with eutils. > > >>>> from Bio import GenBank >>>> gilist=GenBank.search_for("beta-conglycinin") > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", > line 1890, in search_for > ids.append(db_id.dbids.ids[0]) > File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", > line 124, in _get_dbids > infile = self.efetch(retmode = "text", rettype = "uilist") > File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", > line 150, in efetch > complexity = complexity) > File "/usr/lib/python2.4/site-packages/Bio/EUtils/ThinClient.py", > line 987, in efetch_using_dbids > query = {"id": id_string, > File "/usr/lib/python2.4/site-packages/Bio/EUtils/ThinClient.py", > line 644, in _get > return self.opener.open(url) > File "/usr/lib/python2.4/urllib2.py", line 358, in open > response = self._open(req, data) > File "/usr/lib/python2.4/urllib2.py", line 376, in _open > '_open', req) > File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain > result = func(*args) > File "/usr/lib/python2.4/urllib2.py", line 1021, in http_open > return self.do_open(httplib.HTTPConnection, req) > File "/usr/lib/python2.4/urllib2.py", line 996, in do_open > raise URLError(err) > urllib2.URLError: >>>> > > > -- > Bioinformatics news: http://www.bioinformatica.info > Lriser: http://www.linspire.com/lraiser_success.php?serial=318 > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From sbassi at gmail.com Sat Feb 24 
02:54:58 2007
From: sbassi at gmail.com (Sebastian Bassi)
Date: Fri, 23 Feb 2007 23:54:58 -0300
Subject: [BioPython] Error retrieving from GenBank
In-Reply-To:
References:
Message-ID:

On 2/23/07, Julius Lucks wrote:
> It looks like your error occurred within python 2.4's urllib, so it might be
> that. Otherwise, I am using python 2.5 and got your code to work (with a
> gilist of length 930). The query took a little while (maybe about 3
> minutes), so if you have a bad connection, something might have timed out.

Thank you very much. You were right about the time out. I tried a search with more terms that returned less data, and it worked.

Best,
SB.

--
Bioinformatics news: http://www.bioinformatica.info
Lriser: http://www.linspire.com/lraiser_success.php?serial=318

From km at mrna.tn.nic.in Sat Feb 24 18:58:10 2007
From: km at mrna.tn.nic.in (km)
Date: Sun, 25 Feb 2007 00:28:10 +0530
Subject: [BioPython] Bio.PDB centroid
In-Reply-To: <2d7c25310702240204i388293b9l552d0d1842cea17a@mail.gmail.com>
References: <20070223211446.GA32138@mrna.tn.nic.in> <2d7c25310702240204i388293b9l552d0d1842cea17a@mail.gmail.com>
Message-ID: <20070224185810.GA14553@mrna.tn.nic.in>

On Sat, Feb 24, 2007 at 11:04:23AM +0100, Thomas Hamelryck wrote:
> what about:
>
> centroid=Vector(0,0,0)
> counter=0.0
> for atom in residue:
>     centroid=centroid+atom.get_vector()
>     counter+=1
> centroid=centroid/counter
>
> This includes all residue atoms - put in a little test to exclude main chain
> atoms if necessary.

Well, that gives only an average of the x, y, z coordinates. But I remember that it is a little more complex than that - I look at it as a problem of finding the centroid of a polygon, which takes into account the area of the triangular regions as weights in the calculation of the centroid. I do not know how to approach it in three-dimensional coordinate space.

regards,
KM
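
For the side-chain question in this thread, one possible reading is that a center of mass (as opposed to the plain coordinate average) weights each atom's position by its atomic mass. The sketch below is only an illustration under stated assumptions: the function name, the mass table, and the example coordinates are made up for this note, and the input is a plain list of (name, element, coordinate) tuples so that it runs without Biopython. With Bio.PDB, such tuples could presumably be built from atom.get_name(), atom.element and atom.get_coord(); the element attribute is an assumption about newer Biopython releases than the one discussed here.

```python
# Approximate atomic masses (daltons) for elements common in proteins.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}

# Main-chain atom names in standard PDB nomenclature (hydrogens not handled).
BACKBONE = {"N", "CA", "C", "O", "OXT"}


def side_chain_center_of_mass(atoms):
    """Return the mass-weighted mean position of the side-chain atoms.

    `atoms` is an iterable of (name, element, (x, y, z)) tuples.
    Returns None when no side-chain atoms are present (e.g. glycine).
    """
    total_mass = 0.0
    weighted = [0.0, 0.0, 0.0]
    for name, element, coord in atoms:
        if name in BACKBONE:
            continue  # the "little test" that excludes main-chain atoms
        mass = ATOMIC_MASS[element]
        total_mass += mass
        for i in range(3):
            weighted[i] += mass * coord[i]
    if total_mass == 0.0:
        return None
    return tuple(w / total_mass for w in weighted)


# Hypothetical serine-like residue; the coordinates are invented.
serine = [
    ("N", "N", (0.0, 0.0, 0.0)),
    ("CA", "C", (1.5, 0.0, 0.0)),
    ("C", "C", (2.0, 1.4, 0.0)),
    ("O", "O", (1.2, 2.3, 0.0)),
    ("CB", "C", (2.0, -1.0, 1.0)),
    ("OG", "O", (3.0, -1.0, 2.0)),
]
com = side_chain_center_of_mass(serine)
print(com)
```

Replacing every mass by 1.0 reduces this to the plain average from the quoted suggestion, so the two approaches differ only in the weights.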