From johncumbers at gmail.com Tue Dec 1 02:10:16 2009
From: johncumbers at gmail.com (John Cumbers)
Date: Mon, 30 Nov 2009 23:10:16 -0800
Subject: [Biopython] MuscleCommandline and phyiout
Message-ID:

Hello,

I'm using the MuscleCommandline wrapper and I'm having trouble getting the
Phylip interleaved output format. For the Muscle command line I would type
"muscle -in myinputfile -phyiout myoutputfile" and this command in python:

cline = MuscleCommandline(input=output_file_name_FASTA, out=output_file_name_aligned)

But for phyiout, this doesn't work:

cline = MuscleCommandline(input=output_file_name_FASTA, phyiout=output_file_name_aligned)

returning:

ValueError: Option name phyiout was not found.

I tried to look up the possibilities here:
http://www.biopython.org/DIST/docs/api/Bio.Align.Applications._Muscle.MuscleCommandline-class.html
but couldn't find them. Any help appreciated,

cheers,

John

John Cumbers, Ph.D Candidate
NASA Ames Research Center
Mail Stop 239-20, Bldg N239 Rm 373
Moffett Field, CA 94035, USA.
cell +1 (401) 523 8190, office +1 (650) 604-1914, fax +1 (650) 604-1088

Graduate Program in Molecular Biology, Cell Biology, and Biochemistry
Brown University, Box G-W
Providence, RI, 02912, USA

From biopython at maubp.freeserve.co.uk Tue Dec 1 04:23:54 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Dec 2009 09:23:54 +0000
Subject: [Biopython] MuscleCommandline and phyiout
In-Reply-To:
References:
Message-ID: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com>

On Tue, Dec 1, 2009 at 7:10 AM, John Cumbers wrote:
> Hello,
>
> I'm using the MuscleCommandline wrapper and I'm having trouble getting the
> Phylip interleaved output format. For the Muscle command line I would type
> "muscle -in myinputfile -phyiout myoutputfile"

What version of MUSCLE do you have?
v3.7 doesn't mention this option in the command line help, nor does the
current manual:
http://www.drive5.com/muscle/muscle.html

Peter

From biopython at maubp.freeserve.co.uk Tue Dec 1 04:52:19 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Dec 2009 09:52:19 +0000
Subject: [Biopython] MuscleCommandline and phyiout
In-Reply-To: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com>
References: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com>
Message-ID: <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com>

On Tue, Dec 1, 2009 at 9:23 AM, Peter wrote:
> On Tue, Dec 1, 2009 at 7:10 AM, John Cumbers wrote:
>> Hello,
>>
>> I'm using the MuscleCommandline wrapper and I'm having trouble getting the
>> Phylip interleaved output format. For the Muscle command line I would type
>> "muscle -in myinputfile -phyiout myoutputfile"
>
> What version of MUSCLE do you have? v3.7 doesn't mention this
> option in the command line help, nor does the current manual:
> http://www.drive5.com/muscle/muscle.html

It looks like an undocumented option, much like -phyi (which I guessed)
for PHYLIP interlaced and -phys for PHYLIP sequential. These match
the documented format options (e.g. -msf, -html, -clw and -clwstrict).
i.e. You can use this:

"muscle -in myinputfile -phyi -out myoutputfile"

I think we should ask the MUSCLE author which of these
undocumented arguments are actually supported rather than
adding them all.

Peter

From bartomas at gmail.com Tue Dec 1 11:25:36 2009
From: bartomas at gmail.com (bar tomas)
Date: Tue, 1 Dec 2009 16:25:36 +0000
Subject: [Biopython] Host_organism field in SwissProt
Message-ID:

Hi,
I'm using BioPython for processing protein sequences from SwissProt database.
I'm following the excellent tutorial documentation on SwissProt querying (p.107)
I'm just wondering about the meaning of the field 'host_organism' in a
swissprot record, as I haven't yet found a record where the value of
this field is supplied.
If the record concerns the protein sequence of a bacteria, for instance, does the host_organism field contain a list of taxonomic ids of possible host organisms where the bacteria can be found? Many thanks From biopython at maubp.freeserve.co.uk Tue Dec 1 14:19:23 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Dec 2009 19:19:23 +0000 Subject: [Biopython] MuscleCommandline and phyiout In-Reply-To: <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> References: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> Message-ID: <320fb6e00912011119o62d5ef51t1f27f7a4c310a026@mail.gmail.com> On Tue, Dec 1, 2009 at 9:52 AM, Peter wrote: > > It looks like an undocumented option, much like -phyi (which I guessed) > for PHYLIP interlaced and -phys for PHYLIP sequential. These match > the documented format options (e.g. -msf, -html, -clw and -clwstrict). > i.e. You can use this: > > "muscle -in myinputfile -phyi -out myoutputfile" > > I think we should ask the MUSCLE author which of these > undocumented arguments are actually supported rather than > adding them all. Robert Edgar agreed the documentation was out of sync, and has confirmed these are safe arguments to include in our wrapper. Peter From biopython at maubp.freeserve.co.uk Tue Dec 1 14:21:35 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Dec 2009 19:21:35 +0000 Subject: [Biopython] Host_organism field in SwissProt In-Reply-To: References: Message-ID: <320fb6e00912011121o41237344v61d1570a919d54b4@mail.gmail.com> On Tue, Dec 1, 2009 at 4:25 PM, bar tomas wrote: > Hi, > I'm using BioPython for processing protein sequences from SwissProt database. > I'm following the excellent tutorial documentation on SwissProt querying (p.107) > I'm just wondering about the meaning of the field 'host_organism' in a > swissprot record, as I haven't yet found a record where the value of > this field is supplied. 
> If the record concerns the protein sequence of a bacteria, for
> instance, does the host_organism field contain a list of taxonomic ids
> of possible host organisms where the bacteria can be found?
> Many thanks

I suspect (without checking) that this field only applies to viruses
(or perhaps pathogens) and so will usually be empty.

Peter

P.S. Page numbers (and even section numbers) in the tutorial do change
from release to release - the section names are usually stable.

From biopython at maubp.freeserve.co.uk Tue Dec 1 14:34:19 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Dec 2009 19:34:19 +0000
Subject: [Biopython] Fwd: [Utilities-announce] NCBI E-Utility Policy Change
In-Reply-To: <320fb6e00912011129j68dda3b2p6df9a232f0462458@mail.gmail.com>
References: <7B6F170840CA6C4DA63EE0C8A7BB43EC09CA7387@NIHCESMLBX15.nih.gov> <320fb6e00912011129j68dda3b2p6df9a232f0462458@mail.gmail.com>
Message-ID: <320fb6e00912011134u2481644aw5dfdfe9f9a3049f0@mail.gmail.com>

Hi all,

Attention NCBI Entrez users - the NCBI really do want you to include
your email address, and it will be mandatory in future! See below...

If using Bio.Entrez, the tool parameter will by default be set to
Biopython, but the email is omitted. We already encourage the email
to be included in our documentation but given the new NCBI guidance
I'd suggest we make omitting the email issue a warning in the next
release (and an error in the subsequent release of Biopython?).

Peter

---------- Forwarded message ----------
From:
Date: Tue, Dec 1, 2009 at 6:59 PM
Subject: [Utilities-announce] NCBI E-Utility Policy Change
To: utilities-announce at ncbi.nlm.nih.gov

As part of an ongoing effort to ensure efficient access to the Entrez
Utilities (E-utilities) by all users, NCBI has decided to change the
usage policy for the E-utilities effective June 1, 2010.
Effective on June 1, 2010, all E-utility requests, either using standard URLs or SOAP, must contain non-null values for both the &tool and &email parameters. Any E-utility request made after June 1, 2010 that does not contain values for both parameters will return an error explaining that these parameters must be included in E-utility requests. The value of the &tool parameter should be a URI-safe string that is the name of the software package, script or web page producing the E-utility request. The value of the &email parameter should be a valid e-mail address for the appropriate contact person or group responsible for maintaining the tool producing the E-utility request. NCBI uses these parameters to contact users whose use of the E-utilities violates the standard usage policies described at http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements. These usage policies are designed to prevent excessive requests from a small group of users from reducing or eliminating the wider community's access to the E-utilities. NCBI will attempt to contact a user at the e-mail address provided in the &email parameter prior to blocking access to the E-utilities. NCBI realizes that this policy change will require many of our users to change their code. Based on past experience, we anticipate that most of our users should be able to make the necessary changes before the June 1, 2010 deadline. If you have any concerns about making these changes by that date, or if you have any questions about these policies, please contact eutilities at ncbi.nlm.nih.gov. Thank you for your understanding and cooperation in helping us continue to deliver a reliable and efficient web service. 
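As a minimal sketch of what the policy above asks of a client script, the following standard-library Python builds an E-utilities URL that always carries non-empty tool and email values. The helper name build_eutils_url, the example tool name and the example address are illustrative assumptions, not part of the NCBI announcement or of Biopython, and no request is actually sent. Bio.Entrez users would normally just set Entrez.email before making requests.

```python
from urllib.parse import urlencode

# Base address of the E-utilities; each utility (esearch, efetch, ...) lives beneath it.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_eutils_url(utility, tool, email, **params):
    """Return an E-utility URL, refusing to build one without tool and email.

    Hypothetical helper for illustration only; it does not contact NCBI.
    """
    if not tool or not email:
        raise ValueError("NCBI asks for non-empty tool and email parameters")
    query = {"tool": tool, "email": email}
    query.update(params)
    # urlencode takes care of making the values URI-safe.
    return EUTILS_BASE + utility + ".fcgi?" + urlencode(query)


# Example values are placeholders, not real contact details.
url = build_eutils_url(
    "efetch",
    tool="my_script",
    email="someone@example.org",
    db="nucleotide",
    id="NC_000932",
    rettype="gb",
)
print(url)
```

Raising an error locally when either value is missing mirrors the behaviour the announcement describes for the server side after the deadline.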
_______________________________________________
Utilities-announce mailing list
http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce

From cjfields at illinois.edu Tue Dec 1 14:54:48 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Tue, 1 Dec 2009 13:54:48 -0600
Subject: [Biopython] Fwd: [Utilities-announce] NCBI E-Utility Policy Change
In-Reply-To: <320fb6e00912011134u2481644aw5dfdfe9f9a3049f0@mail.gmail.com>
References: <7B6F170840CA6C4DA63EE0C8A7BB43EC09CA7387@NIHCESMLBX15.nih.gov> <320fb6e00912011129j68dda3b2p6df9a232f0462458@mail.gmail.com> <320fb6e00912011134u2481644aw5dfdfe9f9a3049f0@mail.gmail.com>
Message-ID: <1EE2400D-D3F7-49DC-82D7-001EE18F2030@illinois.edu>

I'll be following the same (exclusion of the email a warning in next
bioperl release in Jan, and an error for the spring release).

chris

On Dec 1, 2009, at 1:34 PM, Peter wrote:

> Hi all,
>
> Attention NCBI Entrez users - the NCBI really do want you to include
> your email address, and it will be mandatory in future! See below...
>
> If using Bio.Entrez, the tool parameter will by default be set to
> Biopython, but the email is omitted. We already encourage the email
> to be included in our documentation but given the new NCBI guidance
> I'd suggest we make omitting the email issue a warning in the next
> release (and an error in the subsequent release of Biopython?).
>
> Peter
>
>
> ---------- Forwarded message ----------
> From:
> Date: Tue, Dec 1, 2009 at 6:59 PM
> Subject: [Utilities-announce] NCBI E-Utility Policy Change
> To: utilities-announce at ncbi.nlm.nih.gov
>
>
> As part of an ongoing effort to ensure efficient access to the Entrez
> Utilities (E-utilities) by all users, NCBI has decided to change the
> usage policy for the E-utilities effective June 1, 2010.
Effective on > June 1, 2010, all E-utility requests, either using standard URLs or > SOAP, must contain non-null values for both the &tool and &email > parameters. Any E-utility request made after June 1, 2010 that does > not contain values for both parameters will return an error explaining > that these parameters must be included in E-utility requests. > > > > The value of the &tool parameter should be a URI-safe string that is > the name of the software package, script or web page producing the > E-utility request. > > > > The value of the &email parameter should be a valid e-mail address for > the appropriate contact person or group responsible for maintaining > the tool producing the E-utility request. > > > > NCBI uses these parameters to contact users whose use of the > E-utilities violates the standard usage policies described at > http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements. > These usage policies are designed to prevent excessive requests from a > small group of users from reducing or eliminating the wider > community's access to the E-utilities. NCBI will attempt to contact a > user at the e-mail address provided in the &email parameter prior to > blocking access to the E-utilities. > > > > NCBI realizes that this policy change will require many of our users > to change their code. Based on past experience, we anticipate that > most of our users should be able to make the necessary changes before > the June 1, 2010 deadline. If you have any concerns about making these > changes by that date, or if you have any questions about these > policies, please contact eutilities at ncbi.nlm.nih.gov. > > > > Thank you for your understanding and cooperation in helping us > continue to deliver a reliable and efficient web service. 
> > > > _______________________________________________ > Utilities-announce mailing list > http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From mike.thon at gmail.com Wed Dec 2 07:31:45 2009 From: mike.thon at gmail.com (Michael Thon) Date: Wed, 2 Dec 2009 13:31:45 +0100 Subject: [Biopython] Getting the sequence for a SeqFeature In-Reply-To: <320fb6e00911060447g779f2ac2i7739a28c3f4a4077@mail.gmail.com> References: <320fb6e00911060422u2d2742d5r7b5b1db98c991df5@mail.gmail.com> <320fb6e00911060447g779f2ac2i7739a28c3f4a4077@mail.gmail.com> Message-ID: <0EAFC700-CFDB-4D0B-882A-B9D0EF492172@gmail.com> I was wondering if you have implemented this method yet and if so, is it in a repository somewhere where I can try it? I was about to post a message on this and I searched the archives first (!) and found this thread. I have genbank genomic sequences and I need to get the transcript sequences for the CDS features. thanks Mike On Nov 6, 2009, at 1:47 PM, Peter wrote: > On Fri, Nov 6, 2009 at 12:22 PM, Peter wrote: >> Hi all, >> >> I am planing to add a new method to the SeqFeature object, but >> would like a little feedback first. This email is really just the >> background - I'll write up a few examples later to try and make >> this a bit clearer... > > OK, here is a non-trivial example - the first CDS feature in the > GenBank file NC_000932.gb (included as a Biopython unit test), > which is a three part join on the reverse strand. In this case, the > GenBank file includes the protein translation for the CDS features > so we can use it to check our results. 
>
> We can parse this GenBank file into a SeqRecord with:
>
> from Bio import SeqIO
> record = SeqIO.read(open("../biopython/Tests/GenBank/NC_000932.gb"), "gb")
>
> Let's have a look at the first CDS feature (index 2):
>
> f = record.features[2]
> print f.type, f.location, f.strand, f.location_operator
> for sub_f in f.sub_features :
>     print " - ", sub_f.location, sub_f.strand
> table = f.qualifiers.get("transl_table",[1])[0] # List of one int
> print "Table", table
>
> Giving:
>
> CDS [97998:69724] -1 join
>  -  [97998:98024] -1
>  -  [98561:98793] -1
>  -  [69610:69724] -1
> Table 11
>
> Looking at the raw GenBank file, this feature has location string:
>
> complement(join(97999..98024,98562..98793,69611..69724))
>
> i.e. To get the sequence you need to do this (note zero based
> Python counting as in the output above):
>
> print (record.seq[97998:98024] + record.seq[98561:98793] +
> record.seq[69610:69724]).reverse_complement()
>
> And then translate it using NCBI genetic code table 11,
>
> print "Manual translation:"
> print (record.seq[97998:98024] + record.seq[98561:98793] +
> record.seq[69610:69724]).reverse_complement().translate(table=11,
> cds=True)
> print "Given translation:"
> print f.qualifiers["translation"][0] # List of one string
> print "Biopython translation (with proposed code):"
> print f.extract(record.seq).translate(table, cds=True)
>
> And the output, together with the provided translation in the
> feature annotation, and the shortcut with the new code I am
> proposing to include in Biopython:
>
> Manual translation:
> MPTIKQLIRNTRQPIRNVTKSPALRGCPQRRGTCTRVYTITPKKPNSALRKVARVRLTSGFEITAYIPGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRGTLDAVGVKDRQQGRSKYGVKKPK
> Given translation:
> MPTIKQLIRNTRQPIRNVTKSPALRGCPQRRGTCTRVYTITPKKPNSALRKVARVRLTSGFEITAYIPGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRGTLDAVGVKDRQQGRSKYGVKKPK
> Biopython translation:
> MPTIKQLIRNTRQPIRNVTKSPALRGCPQRRGTCTRVYTITPKKPNSALRKVARVRLTSGFEITAYIPGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRGTLDAVGVKDRQQGRSKYGVKKPK
>
> The point of all this was with the proposed new extract method,
> you just need:
>
> feature_seq = f.extract(record.seq)
>
> instead of:
>
> feature_seq = (record.seq[97998:98024] + record.seq[98561:98793] +
> record.seq[69610:69724]).reverse_complement()
>
> which is in itself a slight simplification since you'd need to get
> those coordinates from the sub features, worry about strands, etc.
>
> Peter
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

From kellrott at gmail.com Wed Dec 2 10:12:53 2009
From: kellrott at gmail.com (Kyle Ellrott)
Date: Wed, 2 Dec 2009 07:12:53 -0800
Subject: [Biopython] Getting the sequence for a SeqFeature
In-Reply-To: <0EAFC700-CFDB-4D0B-882A-B9D0EF492172@gmail.com>
References: <320fb6e00911060422u2d2742d5r7b5b1db98c991df5@mail.gmail.com> <320fb6e00911060447g779f2ac2i7739a28c3f4a4077@mail.gmail.com> <0EAFC700-CFDB-4D0B-882A-B9D0EF492172@gmail.com>
Message-ID:

On Wed, Dec 2, 2009 at 4:31 AM, Michael Thon wrote:

> I was wondering if you have implemented this method yet and if so, is it in
> a repository somewhere where I can try it? I was about to post a message on
> this and I searched the archives first (!) and found this thread. I have
> genbank genomic sequences and I need to get the transcript sequences for the
> CDS features.
>
>
It should be in the main GIT repository at
http://github.com/biopython/biopython/
It's too new to have made an official release version yet.
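The manual slicing quoted above can be sketched without Biopython at all. The following is a toy illustration of the same idea (concatenate the parts in the order listed, then reverse-complement the result for a minus-strand feature). The function names and the short test sequence are made up for the example, and this is not the actual SeqFeature.extract code.

```python
# Translation table for complementing unambiguous DNA letters only.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_join(seq, parts, strand):
    """Join zero-based, end-exclusive (start, end) slices of seq.

    parts are taken in the order given; if strand is -1 the joined
    sequence is reverse-complemented as a whole, mirroring the manual
    example in the quoted message.
    """
    joined = "".join(seq[start:end] for start, end in parts)
    return reverse_complement(joined) if strand == -1 else joined


toy = "AAACCCGGGTTT"
print(extract_join(toy, [(0, 3), (6, 9)], +1))  # AAAGGG
print(extract_join(toy, [(0, 3), (6, 9)], -1))  # CCCTTT
```

A plain string is used here instead of a Seq object purely to keep the sketch self-contained.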

Kyle

From biopython at maubp.freeserve.co.uk Wed Dec 2 17:44:11 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 2 Dec 2009 22:44:11 +0000
Subject: [Biopython] Getting the sequence for a SeqFeature
In-Reply-To:
References: <320fb6e00911060422u2d2742d5r7b5b1db98c991df5@mail.gmail.com> <320fb6e00911060447g779f2ac2i7739a28c3f4a4077@mail.gmail.com> <0EAFC700-CFDB-4D0B-882A-B9D0EF492172@gmail.com>
Message-ID: <320fb6e00912021444g11b27f81k8a3c8675ddc25169@mail.gmail.com>

On Wed, Dec 2, 2009 at 3:12 PM, Kyle Ellrott wrote:
> On Wed, Dec 2, 2009 at 4:31 AM, Michael Thon wrote:
>
>> I was wondering if you have implemented this method yet and if so, is it in
>> a repository somewhere where I can try it? I was about to post a message on
>> this and I searched the archives first (!) and found this thread. I have
>> genbank genomic sequences and I need to get the transcript sequences for the
>> CDS features.
>>
>
> It should be in the main GIT repository at
> http://github.com/biopython/biopython/
> It's too new to have made an official release version yet.
>
> Kyle

Kyle has also been following the discussion on the dev mailing list,
where it was mentioned this was now "on the trunk". See also:
http://www.biopython.org/wiki/SourceCode

Getting people using and testing the code now would be nice,
especially if we hope to get a release out before too long.

Peter

From richard_w_g_price at academia.edu Wed Dec 2 20:21:26 2009
From: richard_w_g_price at academia.edu (Richard Price)
Date: Wed, 2 Dec 2009 17:21:26 -0800
Subject: [Biopython] New Academia.edu feature for Biopython
Message-ID:

Dear Biopython members,

I wanted to tell the list about a new feature on Academia.edu.
Academia.edu launched 12 months ago and now helps 300,000 academics
a month answer the question 'who's researching what?'
We have built a dedicated page on Academia.edu for the Biopython mailing list:

http://lists.academia.edu/See-members-of-Biopython

This page will show you fellow members already on Academia.edu. You can
see their papers, research interests, and other information.

Visit the link below, sign up with Academia.edu, and see who else from
Biopython is on Academia.edu.

http://lists.academia.edu/See-members-of-Biopython

Richard

Dr. Richard Price, post-doc, Philosophy Dept, Oxford University.
Founder of Academia.edu

From mike.thon at gmail.com Thu Dec 3 00:20:12 2009
From: mike.thon at gmail.com (Michael Thon)
Date: Thu, 3 Dec 2009 06:20:12 +0100
Subject: [Biopython] can't compile version from github
Message-ID: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com>

Don't know if this belongs on the dev mailing list or here...

I just checked out a copy of biopython from the github repo and I
tried to install it in a non-root directory to try out a new feature.

here is the command I ran on the command line:
python setup.py install --prefix=/Users/mike/biopython_dev

Here is the offending part of the compile:

gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -IBio -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/triemodule.c -o build/temp.macosx-10.6-universal-2.6/Bio/triemodule.o
In file included from Bio/triemodule.c:3:
Bio/trie.h:12: warning: function declaration isn’t a prototype

The installer did not find the numpy that I installed using easy_install
so I continued without it and without Reportlab.
(Mac OS 10.6)

Mike

From biopython at maubp.freeserve.co.uk Thu Dec 3 05:19:32 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 3 Dec 2009 10:19:32 +0000
Subject: [Biopython] can't compile version from github
In-Reply-To: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com>
References: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com>
Message-ID: <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com>

On Thu, Dec 3, 2009 at 5:20 AM, Michael Thon wrote:
> Don't know if this belongs on the dev mailing list or here...

Tricky - I might have picked the dev list, but this is fine.

> I just checked out a copy of biopython from the github repo and I
> tried to install it in a non-root directory to try out a new feature.
>
> here is the command I ran on the command line:
> python setup.py install --prefix=/Users/mike/biopython_dev
>
> Here is the offending part of the compile:
>
> gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -IBio -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/triemodule.c -o build/temp.macosx-10.6-universal-2.6/Bio/triemodule.o
> In file included from Bio/triemodule.c:3:
> Bio/trie.h:12: warning: function declaration isn’t a prototype
>
> The installer did not find the numpy that I installed using easy_install
> so I continued without it and without Reportlab. (Mac OS 10.6)
>
> Mike

Ah - Snow Leopard. Are you using the Apple provided Python with Mac OS 10.6?
Apple have as usual done odd things with Python, but people have reported
getting it to work.

Looking at the compiler flags I am puzzled about the inclusion of "-arch ppc"
since Snow Leopard is x86 only.

Could you give us the whole of the compile log? What you have shown just has
a warning - no actual error.
Also, doing it in stages would be wiser:

#First remove old build files:
python setup.py clean

#Do the compile:
python setup.py build

#Run the unit tests:
python setup.py test

#To install under your home directory I'd use:
python setup.py install --prefix=/Users/mike/

Peter

From mike.thon at gmail.com Thu Dec 3 07:17:17 2009
From: mike.thon at gmail.com (Michael Thon)
Date: Thu, 3 Dec 2009 13:17:17 +0100
Subject: [Biopython] can't compile version from github
In-Reply-To: <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com>
References: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com>
Message-ID:

On Dec 3, 2009, at 11:19 AM, Peter wrote:

> On Thu, Dec 3, 2009 at 5:20 AM, Michael Thon wrote:
>> Don't know if this belongs on the dev mailing list or here...
>
> Tricky - I might have picked the dev list, but this is fine.
>
>> I just checked out a copy of biopython from the github repo and I
>> tried to install it in a non-root directory to try out a new feature.
>>
>> here is the command I ran on the command line:
>> python setup.py install --prefix=/Users/mike/biopython_dev
>>
>> Here is the offending part of the compile:
>>
>> gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -IBio -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/triemodule.c -o build/temp.macosx-10.6-universal-2.6/Bio/triemodule.o
>> In file included from Bio/triemodule.c:3:
>> Bio/trie.h:12: warning: function declaration isn’t a prototype
>>
>> The installer did not find the numpy that I installed using easy_install
>> so I continued without it and without Reportlab. (Mac OS 10.6)
>>
>> Mike
>
> Ah - Snow Leopard. Are you using the Apple provided Python with Mac OS 10.6?
> Apple have as usual done odd things with Python, but people have
> reported getting
> it to work.
Yup, I'm using the Apple-provided python. Below is the full output of the build phase. ...so, as usual, I am off on a tangent instead of working on the stuff that actually needs to get done. I don't really need to get this installed. I just wanted to try the SeqFeature.extract method that you mentioned in a previous thread. I realized that I probably don't need to compile all the C extensions to get that to work so I opened up SeqFeature.py to see what this method looks like. I couldn't find it so I suppose this method has not made its way into the main git repo on github. In the end, I wrote one myself but it would still be good to compare its output to what you have in biopython. ...but I still have this biopython from github that won't compile and it probably should. So, if you have any ideas what might be wrong and how to fix it I can try it and report back. -Mike python setup.py build running build running build_py *** Numerical Python *** is either not installed or out of date. This package is optional, which means it is only used in a few specialized modules in Biopython. You probably don't need this if you are unsure. You can ignore this requirement, and install it later if you see ImportErrors. You can find Numerical Python at http://numpy.sourceforge.net/. Do you want to continue this installation? (Y/n) Y *** Reportlab *** is either not installed or out of date. This package is optional, which means it is only used in a few specialized modules in Biopython. You probably don't need this if you are unsure. You can ignore this requirement, and install it later if you see ImportErrors. You can find Reportlab at http://www.reportlab.org/downloads.html. Do you want to continue this installation? 
(Y/n) Y
running build_ext
building 'Bio.trie' extension
creating build/temp.macosx-10.6-universal-2.6
creating build/temp.macosx-10.6-universal-2.6/Bio
gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -IBio -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/triemodule.c -o build/temp.macosx-10.6-universal-2.6/Bio/triemodule.o
In file included from Bio/triemodule.c:3:
Bio/trie.h:12: warning: function declaration isn’t a prototype
Bio/triemodule.c:389: warning: initialization from incompatible pointer type
Bio/triemodule.c: In function ‘_write_value_to_handle’:
Bio/triemodule.c:480: error: too few arguments to function ‘PyMarshal_WriteObjectToString’
Bio/triemodule.c:482: warning: passing argument 3 of ‘PyString_AsStringAndSize’ from incompatible pointer type
In file included from Bio/triemodule.c:3:
Bio/trie.h:12: warning: function declaration isn’t a prototype
Bio/triemodule.c:389: warning: initialization from incompatible pointer type
Bio/triemodule.c: In function ‘_write_value_to_handle’:
Bio/triemodule.c:480: error: too few arguments to function ‘PyMarshal_WriteObjectToString’
Bio/triemodule.c:482: warning: passing argument 3 of ‘PyString_AsStringAndSize’ from incompatible pointer type
In file included from Bio/triemodule.c:3:
Bio/trie.h:12: warning: function declaration isn’t a prototype
Bio/triemodule.c:389: warning: initialization from incompatible pointer type
Bio/triemodule.c: In function ‘_write_value_to_handle’:
Bio/triemodule.c:480: error: too few arguments to function ‘PyMarshal_WriteObjectToString’
Bio/triemodule.c:482: warning: passing argument 3 of ‘PyString_AsStringAndSize’ from incompatible pointer type
lipo: can't open input file: /var/folders/wI/wIckOkhJHe0hxBDeAEI1VE+++TI/-Tmp-//ccemhRHR.out (No such file or directory)
error: command 'gcc-4.2' failed with exit status 1

From biopython at maubp.freeserve.co.uk Thu Dec 3 07:26:12 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 3 Dec 2009 12:26:12 +0000
Subject: [Biopython] can't compile version from github
In-Reply-To:
References: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com>
Message-ID: <320fb6e00912030426gc84e3f5w6e238638e9b40b2e@mail.gmail.com>

On Thu, Dec 3, 2009 at 12:17 PM, Michael Thon wrote:
>
> Yup, I'm using the Apple-provided python. Below is the full output of the build phase.

I'll take a look at it, but until I have a machine with Snow Leopard,
solving this will be tricky. Any other Snow Leopard users please speak up.

> ...so, as usual, I am off on a tangent instead of working on the stuff that
> actually needs to get done. I don't really need to get this installed. I just
> wanted to try the SeqFeature.extract method that you mentioned in a
> previous thread. I realized that I probably don't need to compile all the
> C extensions to get that to work so I opened up SeqFeature.py to see
> what this method looks like. I couldn't find it so I suppose this method
> has not made its way into the main git repo on github. In the end, I
> wrote one myself but it would still be good to compare its output to
> what you have in biopython.

I'm not sure why you couldn't find this in the latest code from git.
The method is in Bio/SeqFeature.py, search for "def extract":
http://github.com/biopython/biopython/blob/master/Bio/SeqFeature.py

You can probably take a working Biopython 1.52 install, and manually
update just the Bio/SeqFeature.py file if you really need to.

You could also try installing just the "pure Python" part of Biopython
by hacking setup.py to set EXTENSIONS = [], as done for Jython.
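One quick way to see which copy of a module Python is actually importing, and whether a given method exists on a class in it, is a small introspection helper. This is a generic sketch using only the standard library; the helper name describe_method is hypothetical, and the Bio.SeqFeature call at the end assumes Biopython is importable (it prints a message otherwise).

```python
import importlib


def describe_method(module_name, class_name, method_name):
    """Return (path of the imported module, whether the class has the method)."""
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    return getattr(module, "__file__", None), hasattr(cls, method_name)


try:
    path, found = describe_method("Bio.SeqFeature", "SeqFeature", "extract")
    print(path, "has extract" if found else "has no extract method")
except ImportError:
    # Covers the case where no Biopython is visible to this interpreter.
    print("Biopython is not importable from this interpreter")
```

If the path printed points at an older site-packages install rather than the fresh checkout, that would explain a method appearing to be missing.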

> ...but I still have this biopython from github that won't compile and it
> probably should. So, if you have any ideas what might be wrong and
> how to fix it I can try it and report back.

I'll try to get back to you on this shortly.

Peter

From johncumbers at gmail.com Fri Dec 4 02:50:43 2009
From: johncumbers at gmail.com (John Cumbers)
Date: Thu, 3 Dec 2009 23:50:43 -0800
Subject: [Biopython] MuscleCommandline and phyiout
In-Reply-To: <320fb6e00912011119o62d5ef51t1f27f7a4c310a026@mail.gmail.com>
References: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> <320fb6e00912011119o62d5ef51t1f27f7a4c310a026@mail.gmail.com>
Message-ID:

Many thanks Peter,
Sorry for delayed reply, I filter this list and forgot to check the folder :)
Best wishes,
John

John Cumbers, Ph.D Candidate
NASA Ames Research Center
Mail Stop 239-20, Bldg N239 Rm 373
Moffett Field, CA 94035, USA.
cell +1 (401) 523 8190, office +1 (650) 604-1914, fax +1 (650) 604-1088

Graduate Program in Molecular Biology, Cell Biology, and Biochemistry
Brown University, Box G-W
Providence, RI, 02912, USA

On Tue, Dec 1, 2009 at 11:19 AM, Peter wrote:

> On Tue, Dec 1, 2009 at 9:52 AM, Peter
> wrote:
> >
> > It looks like an undocumented option, much like -phyi (which I guessed)
> > for PHYLIP interlaced and -phys for PHYLIP sequential. These match
> > the documented format options (e.g. -msf, -html, -clw and -clwstrict).
> > i.e. You can use this:
> >
> > "muscle -in myinputfile -phyi -out myoutputfile"
> >
> > I think we should ask the MUSCLE author which of these
> > undocumented arguments are actually supported rather than
> > adding them all.
>
> Robert Edgar agreed the documentation was out of sync, and has
> confirmed these are safe arguments to include in our wrapper.
> > Peter > From biopython at maubp.freeserve.co.uk Fri Dec 4 07:32:23 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 4 Dec 2009 12:32:23 +0000 Subject: [Biopython] MuscleCommandline and phyiout In-Reply-To: References: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> <320fb6e00912011119o62d5ef51t1f27f7a4c310a026@mail.gmail.com> Message-ID: <320fb6e00912040432k628392c5n57fc79a3b68a88eb@mail.gmail.com> On Fri, Dec 4, 2009 at 7:50 AM, John Cumbers wrote: > Many thanks Peter, > Sorry for delayed reply, I filter this list and forgot to check the folder > :) > Best wishes, > John Bug filed, http://bugzilla.open-bio.org/show_bug.cgi?id=2961 From brynedal at gmail.com Fri Dec 4 14:01:49 2009 From: brynedal at gmail.com (Boel Brynedal) Date: Fri, 4 Dec 2009 14:01:49 -0500 Subject: [Biopython] Problems installing Biopython Message-ID: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> Dear List, I am trying to install Biopython from source on my Mac OS X v.10.6.1. I ran into some problems when building Biopython, thought that it might be due to the fact that I am missing xcore tools (gcc-4.0 was missing) so I installed version 2.5 of xcore. This is however the output as I try to build biopython after installing xcore: $ python setup.py install running install running build running build_py creating build a lot of creating, copying etc etc... running build_ext building 'Bio.clistfns' extension creating build/temp.macosx-10.3-fat-2.6 creating build/temp.macosx-10.3-fat-2. 
6/Bio gcc-4.0 -arch ppc -arch i386 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -O3 -I/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/clistfnsmodule.c -o build/temp.macosx-10.3-fat-2.6/Bio/clistfnsmodule.o In file included from /usr/include/architecture/i386/math.h:626, from /usr/include/math.h:28, from /Library/Frameworks/Python.framework/Versions/2.6/include/python2.6/pyport.h:235, from /Library/Frameworks/Python.framework/Versions/2.6/include/python2.6/Python.h:58, from Bio/clistfnsmodule.c:10: /usr/include/AvailabilityMacros.h:108:14: warning: #warning Building for Intel with Mac OS X Deployment Target < 10.4 is invalid. Compiling with an SDK that doesn't seem to exist: /Developer/SDKs/MacOSX10.4u.sdk Please check your Xcode installation gcc-4.0 -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -g -bundle -undefined dynamic_lookup build/temp.macosx-10.3-fat-2.6/Bio/clistfnsmodule.o -o build/lib.macosx-10.3-fat-2.6/Bio/clistfns.so ld: library not found for -lbundle1.o collect2: ld returned 1 exit status ld: library not found for -lbundle1.o collect2: ld returned 1 exit status lipo: can't open input file: /var/folders/fy/fySdohlPEBSVFU-YIflSGk+++TI/-Tmp-//cc1Cm89x.out (No such file or directory) error: command 'gcc-4.0' failed with exit status 1 I am new to using Mac, and not the most talented computer nerd, but it seems like we have two problems here: the systems seems to be looking for MacOSX10.4u.sdk, when the xcore tools I've installed contain MacOSX10.5.sdk and MacOSX10.6.sdk. Why does it look for the earlier version, and what can I do about it? #warning Building for Intel with Mac OS X Deployment Target < 10.4 is invalid - this I do not understand at all. I'm using Python 2.6.4 and I'm trying to install biopython-1.52. Any tips, comments or ideas would be greatly appreciated! 
Thank you, Boel From mike.thon at gmail.com Fri Dec 4 15:27:41 2009 From: mike.thon at gmail.com (Michael Thon) Date: Fri, 4 Dec 2009 21:27:41 +0100 Subject: [Biopython] Problems installing Biopython In-Reply-To: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> References: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> Message-ID: <6D780E44-39B7-4CFF-975B-D72722A5E5B2@gmail.com> Hi Boel - I installed biopython using easy_install on Mac OS 10.6.2 and didn't have any problems. I don't know what xcore is. Do you mean Xcode? the version I have is 3.2.1 and I downloaded it from developer.apple.com -Mike From p.j.a.cock at googlemail.com Fri Dec 4 15:52:46 2009 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 4 Dec 2009 20:52:46 +0000 Subject: [Biopython] Problems installing Biopython In-Reply-To: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> References: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> Message-ID: <320fb6e00912041252u705058c2ue06c4b88d4e34059@mail.gmail.com> On Fri, Dec 4, 2009 at 7:01 PM, Boel Brynedal wrote: > Dear List, > > I am new to using Mac, and not the most talented computer > nerd, but it seems like we have two problems here: > the systems seems to be looking for MacOSX10.4u.sdk, > when the xcore tools I've installed contain MacOSX10.5.sdk > and MacOSX10.6.sdk. Why does it look > for the earlier version, and what can I do about it? > #warning Building for Intel with Mac OS X Deployment Target < 10.4 is > invalid - this I do not understand at all. Sadly there are some general issues with Python on Snow Leopard (this isn't just a Biopython issue). Right now as far as I know none of our core developers have Snow Leopard (10.6) so it is hard to help. However, with Leopard (10.4), when installing XCode I had the option of installing the 10.3 headers too. Could you re-install XCode and check this time for an option to include the headers for 10.4 (and/or 10.3)? 
Peter From mjldehoon at yahoo.com Sat Dec 5 10:59:41 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sat, 5 Dec 2009 07:59:41 -0800 (PST) Subject: [Biopython] CompareAce parser Message-ID: <613008.1949.qm@web62407.mail.re1.yahoo.com> Hi everybody, In Bio.Motif, there is a nominal parser for CompareAce files. However, this parser has almost no functionality. Would anybody mind if we deprecate this module for the next release? Whereas we usually might declare the module obsolete before deprecating it, in this case I think we can deprecate it straightaway since currently this parser does very little. --Michiel. From bartek at rezolwenta.eu.org Sat Dec 5 19:58:41 2009 From: bartek at rezolwenta.eu.org (Bartek Wilczynski) Date: Sun, 6 Dec 2009 01:58:41 +0100 Subject: [Biopython] CompareAce parser In-Reply-To: <613008.1949.qm@web62407.mail.re1.yahoo.com> References: <613008.1949.qm@web62407.mail.re1.yahoo.com> Message-ID: <8b34ec180912051658s2acb9da1na03a8b0577fa6a8d@mail.gmail.com> Hi, I don't have anything against deprecating, even though I don't see the advantages of doing so. (The module is trivial, but so is the output of compareACE: a number giving a score between motifs. The score, however, is not trivial and I wouldn't want to reimplement it.) cheers Bartek On Sat, Dec 5, 2009 at 4:59 PM, Michiel de Hoon wrote: > Hi everybody, > > In Bio.Motif, there is a nominal parser for CompareAce files. However, this parser has almost no functionality. Would anybody mind if we deprecate this module for the next release? Whereas we usually might declare the module obsolete before deprecating it, in this case I think we can deprecate it straightaway since currently this parser does very little. > > --Michiel.
> > > > _______________________________________________ > Biopython mailing list ?- ?Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > > -- Bartek Wilczynski ================== Postdoctoral fellow EMBL, Furlong group Meyerhoffstrasse 1, 69012 Heidelberg, Germany tel: +49 6221 387 8433 From biopython at maubp.freeserve.co.uk Sun Dec 6 09:10:22 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 6 Dec 2009 14:10:22 +0000 Subject: [Biopython] CompareAce parser In-Reply-To: <8b34ec180912051658s2acb9da1na03a8b0577fa6a8d@mail.gmail.com> References: <613008.1949.qm@web62407.mail.re1.yahoo.com> <8b34ec180912051658s2acb9da1na03a8b0577fa6a8d@mail.gmail.com> Message-ID: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> On Sun, Dec 6, 2009 at 12:58 AM, Bartek Wilczynski wrote: > Hi, > > I don't have anything against deprecating, even though I don't the > advantages of doing so. (the module is trivial, but so is the output > of compareACE: a number giving a score between motifs. The score, > however is not trivial and I wouldn't want to reimplement it.) > > cheers > ?Bartek So the reason this parser is so simple and has almost no functionality is just a reflection of the simplicity of the CompareAce files? If so, I'd say leave the parser in. Peter From mjldehoon at yahoo.com Sun Dec 6 09:31:47 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sun, 6 Dec 2009 06:31:47 -0800 (PST) Subject: [Biopython] CompareAce parser In-Reply-To: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> Message-ID: <889087.95522.qm@web62406.mail.re1.yahoo.com> > So the reason this parser is so simple and has almost no > functionality is just a reflection of the simplicity of > the CompareAce files? Not exactly. CompareAce files can have different outputs, depending on the query given to CompareAce. The simplest query returns only one number. The current CompareAce parser can only parse this output. 
In other words,

>>> input = open("test.out")
>>> from Bio.Motif.Parsers import AlignAce
>>> AlignAce.CompareAceParser().parse(input)
0.92130000000000001

is equivalent to

>>> input = open("test.out")
>>> float(input.read())
0.92130000000000001

I am not against having a CompareAce parser in Biopython, but if we have such a parser it should be able to handle more output formats than just the trivial output format.

With this in mind, I think we should either extend the CompareAce parser to handle cases that cannot be trivially handled by a simple Python command, or remove it altogether. If we do keep it in Biopython, there should also be some documentation to cover it, and perhaps a unit test.

--Michiel

--- On Sun, 12/6/09, Peter wrote:

> From: Peter
> Subject: Re: [Biopython] CompareAce parser
> To: "Bartek Wilczynski"
> Cc: "Michiel de Hoon" , biopython at biopython.org
> Date: Sunday, December 6, 2009, 9:10 AM
> On Sun, Dec 6, 2009 at 12:58 AM, Bartek Wilczynski wrote:
> > Hi,
> >
> > I don't have anything against deprecating, even though I don't the
> > advantages of doing so. (the module is trivial, but so is the output
> > of compareACE: a number giving a score between motifs. The score,
> > however is not trivial and I wouldn't want to reimplement it.)
> >
> > cheers
> > Bartek
>
> So the reason this parser is so simple and has almost no functionality
> is just a reflection of the simplicity of the CompareAce files? If so, I'd
> say leave the parser in.
> > Peter > From brynedal at gmail.com Sun Dec 6 16:38:55 2009 From: brynedal at gmail.com (Boel Brynedal) Date: Sun, 6 Dec 2009 16:38:55 -0500 Subject: [Biopython] Problems installing Biopython In-Reply-To: <320fb6e00912041252u705058c2ue06c4b88d4e34059@mail.gmail.com> References: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> <320fb6e00912041252u705058c2ue06c4b88d4e34059@mail.gmail.com> Message-ID: <2167c9200912061338g164a8f61x25df698cab5c0548@mail.gmail.com> Hi Peter, I downloaded XCode again and included the 10.4 support - this seem to have fixed it. Thank you very much! Boel 2009/12/4 Peter Cock > On Fri, Dec 4, 2009 at 7:01 PM, Boel Brynedal wrote: > > Dear List, > > > > I am new to using Mac, and not the most talented computer > > nerd, but it seems like we have two problems here: > > the systems seems to be looking for MacOSX10.4u.sdk, > > when the xcore tools I've installed contain MacOSX10.5.sdk > > and MacOSX10.6.sdk. Why does it look > > for the earlier version, and what can I do about it? > > #warning Building for Intel with Mac OS X Deployment Target < 10.4 is > > invalid - this I do not understand at all. > > Sadly there are some general issues with Python on > Snow Leopard (this isn't just a Biopython issue). Right > now as far as I know none of our core developers have > Snow Leopard (10.6) so it is hard to help. > > However, with Leopard (10.4), when installing XCode I > had the option of installing the 10.3 headers too. Could > you re-install XCode and check this time for an option > to include the headers for 10.4 (and/or 10.3)? 
> > Peter > From biopython at maubp.freeserve.co.uk Sun Dec 6 16:49:28 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 6 Dec 2009 21:49:28 +0000 Subject: [Biopython] Problems installing Biopython In-Reply-To: <2167c9200912061338g164a8f61x25df698cab5c0548@mail.gmail.com> References: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> <320fb6e00912041252u705058c2ue06c4b88d4e34059@mail.gmail.com> <2167c9200912061338g164a8f61x25df698cab5c0548@mail.gmail.com> Message-ID: <320fb6e00912061349v2dbc3586g185701894e8e7c05@mail.gmail.com> > 2009/12/4 Peter Cock >> However, with Leopard (10.4), when installing XCode I >> had the option of installing the 10.3 headers too. Could >> you re-install XCode and check this time for an option >> to include the headers for 10.4 (and/or 10.3)? Minor typo - Leopard is Mac OS 10.5 of course ;) On Sun, Dec 6, 2009 at 9:38 PM, Boel Brynedal wrote: > Hi Peter, > > I downloaded XCode again and included the 10.4 support - this > seem to have fixed it. > Thank you very much! > > Boel Excellent - thanks for letting us know. Peter From biopython at maubp.freeserve.co.uk Mon Dec 7 07:14:35 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 7 Dec 2009 12:14:35 +0000 Subject: [Biopython] Deprecating old Bio.GFF module in preparation for new code? Message-ID: <320fb6e00912070414i3e4503dt311953b99d2efeb5@mail.gmail.com> Dear all, Is anyone using the "old" Bio.GFF module in Biopython? This was written by Michael Hoffman back in 2002, and allowed access to a General Feature Format (GFF) MySQL database created with BioPerl's Bio::DB::GFF. It may need updating to work with the latest BioPerl, or GFF3 files (I don't know). This old code did not include any GFF parser of its own. As those on the dev mailing list will know, Brad Chapman has been working on a GFF parser (covering GFF3, and the older GFF2 and GTF files). The obvious place to put this is under Bio.GFF.
I would therefore like to propose deprecating the current Bio.GFF code in the next release of Biopython (hopefully this month), which will allow us to replace it with Brad's new parser in the subsequent release. If anyone is using the old module, please let us know now. Thank you, Peter From biopython at maubp.freeserve.co.uk Mon Dec 7 09:02:51 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 7 Dec 2009 14:02:51 +0000 Subject: [Biopython] can't compile version from github In-Reply-To: <320fb6e00912030426gc84e3f5w6e238638e9b40b2e@mail.gmail.com> References: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com> <320fb6e00912030426gc84e3f5w6e238638e9b40b2e@mail.gmail.com> Message-ID: <320fb6e00912070602i3a881fd7nd758d71ce0d1a3f4@mail.gmail.com> On Thu, Dec 3, 2009 at 12:26 PM, Peter wrote: > On Thu, Dec 3, 2009 at 12:17 PM, Michael Thon wrote: > >> ...but I still have this biopython from github that won't compile and it >> probably should. ?So, if you have any ideas what might be wrong and >> how to fix it I can try it and report back. > > I'll try to get back to you on this shortly. > Hi Michael, As discussed on the other thread, could you try reinstalling XCode on Snow Leopard (Mac OS X 10.6), but this time tick the option to include the older headers (Tiger 10.4 SDK - not sure exactly what it is called). http://lists.open-bio.org/pipermail/biopython/2009-December/005906.html Peter From iwan.grin at googlemail.com Tue Dec 8 13:52:13 2009 From: iwan.grin at googlemail.com (Iwan Grin) Date: Tue, 8 Dec 2009 19:52:13 +0100 Subject: [Biopython] Parsing problem Message-ID: Hi all, I am having a little problem while trying to parse a GenBank (or rather GenProt) file using BioPython. I am trying to extract the position on the genome from the "coded_by" qualifier of the CDS feature of a protein. 
The "coded_by" string in this specific case looks like this:

'complement(NC_012967.1:3622110..3624728)'

Now, when I run

Bio.GFF.easy.LocationFromString('complement(NC_012967.1:3622110..3624728)')

I get

  File "/usr/lib/pymodules/python2.6/Bio/GFF/easy.py", line 419, in __init__
    list.__init__(self, [int(location_str)-1]) # zero based, nip it in the bud
ValueError: invalid literal for int() with base 10: 'NC_012967.1:3622110..3624728'

Is there another way to parse this location string or do I have to cook up some kind of custom RegExp?

Iwan

P.S.: Code snippet:

from Bio import Entrez
from Bio import SeqIO
from Bio import GFF

gi = 254163455
handle = Entrez.efetch(db="protein", id=gi, rettype="gb")
record = SeqIO.read(handle, "genbank")
handle.close()

for feature in record.features:
    if feature.type == "CDS" and "coded_by" in feature.qualifiers:
        print feature.qualifiers["coded_by"][0],
        loc = GFF.easy.LocationFromString(feature.qualifiers["coded_by"][0])
        print loc.start(), loc.end(), loc.complement

From biopython at maubp.freeserve.co.uk Tue Dec 8 17:43:29 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 8 Dec 2009 22:43:29 +0000 Subject: [Biopython] Parsing problem In-Reply-To: References: Message-ID: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com>

On Tue, Dec 8, 2009 at 6:52 PM, Iwan Grin wrote:
> Hi all,
>
> I am having a little problem while trying to parse a GenBank (or rather
> GenProt) file using BioPython. I am trying to extract the position on the
> genome from the "coded_by" qualifier of the CDS feature of a protein.
> > The "coded_by" string in this specific case looks like this: > > 'complement(NC_012967.1: > 3622110..3624728)' Oh, one of those tricky cross references to another file :( > Now, when I run > > Bio.GFF.easy.LocationFromString('complement(NC_012967.1:3622110..3624728)' ) > This is interesting timing - Bio.GFF.easy has a lot of code which duplicated the EMBL/GenBank parsing, and I'm actually suggesting we deprecate it in the next release (!). What made you use Bio.GFF in the first place? It has never been documented. That said, it does look like you found a bug in Bio.GFF.easy ... In the long term, I think Bio.GenBank would be a better place to put this functionality (and reworking the location parsing is on the todo list, partly as it is currently a speed bottleneck). Peter From biopython at maubp.freeserve.co.uk Tue Dec 8 18:53:29 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 8 Dec 2009 23:53:29 +0000 Subject: [Biopython] Parsing problem In-Reply-To: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> Message-ID: <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> On Tue, Dec 8, 2009 at 10:43 PM, Peter wrote: > On Tue, Dec 8, 2009 at 6:52 PM, Iwan Grin wrote: >> Hi all, >> >> I am having a little problem while trying to parse a GenBank (or rather >> GenProt) file using BioPython. I am trying to extract the position on the >> genome from the "coded_by" qualifier of the CDS feature of a protein. 
>> >> The "coded_by" string in this specific case looks like this: >> >> 'complement(NC_012967.1: >> 3622110..3624728)' > > Oh, one of those tricky cross references to another file :( It looks like the Bio.GFF.easy code expects that to be formatted as NC_012967.1:complement(3622110..3624728) and not as complement(NC_012967.1:3622110..3624728) Peter From iwan.grin at googlemail.com Wed Dec 9 07:33:09 2009 From: iwan.grin at googlemail.com (Iwan Grin) Date: Wed, 9 Dec 2009 13:33:09 +0100 Subject: [Biopython] Parsing problem In-Reply-To: <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> Message-ID: Hi Peter, Thank you for your reply. I am new to BioPython and stumbled upon GFF.easy while searching through the API docs. Actually, what I wanted was a way to parse that location string into a SeqFeature-like thing from which I could get start, end and strand. Unfortunately I could not find the correct parser in Bio.GenBank - any suggestions are welcome. I agree with you that Bio.GFF.easy expects the accession number before the complement. (Actually for my purpose I do not need the accession number at all.) Iwan 2009/12/9 Peter > On Tue, Dec 8, 2009 at 10:43 PM, Peter > wrote: > > On Tue, Dec 8, 2009 at 6:52 PM, Iwan Grin > wrote: > >> Hi all, > >> > >> I am having a little problem while trying to parse a GenBank (or rather > >> GenProt) file using BioPython. I am trying to extract the position on > the > >> genome from the "coded_by" qualifier of the CDS feature of a protein.
> >> > >> The "coded_by" string in this specific case looks like this: > >> > >> 'complement(NC_012967.1: > >> 3622110..3624728)' > > > > Oh, one of those tricky cross references to another file :( > > It looks like the Bio.GFF.easy code expects that to be formatted > as NC_012967.1:complement(3622110..3624728) and not as > complement(NC_012967.1:3622110..3624728) > > Peter > From biopython at maubp.freeserve.co.uk Wed Dec 9 08:25:44 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 9 Dec 2009 13:25:44 +0000 Subject: [Biopython] Parsing problem In-Reply-To: References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> Message-ID: <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> On Wed, Dec 9, 2009 at 12:33 PM, Iwan Grin wrote: > Hi Peter, Thank you for your reply. > > I am new to BioPython and stumbled upon GFF.easy while searching through the > API docs. Actually, What I wanted was a way to parse that location string > into an SeqFeature-like thing from which I could get start, end and > strand.Unfortunately I could not find the correct parser in Bio.Genbank - > any suggestions are welcome. Right now Bio.GenBank doesn't really expose the location parsing in an easy to use way like Bio.GFF.easy does. > I agree with you that Bio.GFF.easy expects the Accession number before the > complement. (Actually for my purpose I do not need the accession number at > all.) The pragmatic solution is to write your own quick parser to pull out the coordinates (if that is all you need). We'll have to look at this as part of the discussion of what to do with the old Bio.GFF (as part of planning for Brad's new GFF parsing code). 
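A minimal sketch of the kind of quick parser suggested above, assuming the string has only the simple shape seen in this thread: an optional complement(...) wrapper, an optional "ACCESSION:" prefix, then "start..end". The function name parse_simple_location is hypothetical and not part of Biopython; joins, fuzzy positions and other GenBank location syntax are deliberately not handled.

```python
import re

# Assumed shape only: [ACCESSION:]start..end, optionally inside complement(...).
_SIMPLE_RANGE = re.compile(r"^(?:([A-Za-z0-9_.]+):)?\s*(\d+)\.\.(\d+)$")


def parse_simple_location(text):
    """Return (accession, start, end, strand) for a simple location string.

    start and end are the 1-based inclusive numbers exactly as written in
    the string; strand is -1 for complement(...) and +1 otherwise.
    This is a hypothetical helper, not a Biopython function.
    """
    text = text.strip()
    strand = 1
    if text.startswith("complement(") and text.endswith(")"):
        strand = -1
        text = text[len("complement("):-1]
    match = _SIMPLE_RANGE.match(text)
    if match is None:
        raise ValueError("Unsupported location string: %r" % text)
    accession, start, end = match.groups()
    return accession, int(start), int(end), strand


print(parse_simple_location("complement(NC_012967.1:3622110..3624728)"))
```

The accession is returned as None when the string has no "ACCESSION:" prefix, so the same helper can be used on plain "start..end" locations.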
Peter

From chapmanb at 50mail.com Wed Dec 9 08:38:02 2009 From: chapmanb at 50mail.com (Brad Chapman) Date: Wed, 9 Dec 2009 08:38:02 -0500 Subject: [Biopython] Parsing problem In-Reply-To: <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> Message-ID: <20091209133802.GB79820@sobchak.mgh.harvard.edu>

Iwan and Peter;

> > I am new to BioPython and stumbled upon GFF.easy while searching through the
> > API docs. Actually, What I wanted was a way to parse that location string
> > into an SeqFeature-like thing from which I could get start, end and
> > strand.Unfortunately I could not find the correct parser in Bio.Genbank -
> > any suggestions are welcome.
>
> Right now Bio.GenBank doesn't really expose the location parsing in an
> easy to use way like Bio.GFF.easy does.

If you don't like ugly code, please avert your eyes now. This will work with the standard GenBank parsing and is definitely not future proof since it involves using private members. However, it'll work for something quick n' dirty:

from Bio.GenBank import _FeatureConsumer
from Bio.SeqFeature import SeqFeature

def gb_string_to_feature(content, use_fuzziness=True):
    """Convert a GenBank location string into a SeqFeature.
    """
    consumer = _FeatureConsumer(use_fuzziness)
    consumer._cur_feature = SeqFeature()
    consumer.location(content)
    return consumer._cur_feature

print gb_string_to_feature('complement(NC_012967.1:3622110..3624728)')

Hope this helps,
Brad

From bartek at rezolwenta.eu.org Wed Dec 9 09:43:23 2009 From: bartek at rezolwenta.eu.org (Bartek Wilczynski) Date: Wed, 9 Dec 2009 15:43:23 +0100 Subject: [Biopython] CompareAce parser In-Reply-To: <889087.95522.qm@web62406.mail.re1.yahoo.com> References: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> <889087.95522.qm@web62406.mail.re1.yahoo.com> Message-ID: <8b34ec180912090643m58867d29ha7d9cdd6e59e4bb@mail.gmail.com>

Hi Michiel,

I haven't given enough consideration to the maintenance costs of having parsers like this one in biopython. I think you are right that it's not useful in its current state, and I don't think it's worth putting effort into improving it. There are already other methods of motif comparison implemented in Bio.Motif and if I was to choose an external motif comparison software to support in biopython, I would vote for the STAMP tool from Benos lab. So, in conclusion, I think it would make sense to deprecate the CompareAce parser.

cheers
Bartek

On Sun, Dec 6, 2009 at 3:31 PM, Michiel de Hoon wrote:
>> So the reason this parser is so simple and has almost no
>> functionality is just a reflection of the simplicity of
>> the CompareAce files?
>
> Not exactly. CompareAce files can have different outputs, depending on the query given to CompareAce. The simplest query returns only one number. The current CompareAce parser can only parse this output.
In other words, > >>>> input = open("test.out") >>>> from Bio.Motif.Parsers import AlignAce >>>> AlignAce.CompareAceParser().parse(input) > 0.92130000000000001 > > is equivalent to > >>>> input = open("test.out") >>>> float(input.read()) > 0.92130000000000001 > > I am not against having a CompareAce parser in Biopython, but if we have such a parser it should be able to handle more output formats than just the trivial output format. > > With this in mind, I think we should either extend the CompareAce parser to handle cases that cannot be trivially handled by a simple Python command, or remove it altogether. If we do keep it in Biopython, there should also be some documentation to cover it, and perhaps a unit test. > > --Michiel > > --- On Sun, 12/6/09, Peter wrote: > >> From: Peter >> Subject: Re: [Biopython] CompareAce parser >> To: "Bartek Wilczynski" >> Cc: "Michiel de Hoon" , biopython at biopython.org >> Date: Sunday, December 6, 2009, 9:10 AM >> On Sun, Dec 6, 2009 at 12:58 AM, >> Bartek Wilczynski >> >> wrote: >> > Hi, >> > >> > I don't have anything against deprecating, even though >> I don't the >> > advantages of doing so. (the module is trivial, but so >> is the output >> > of compareACE: a number giving a score between motifs. >> The score, >> > however is not trivial and I wouldn't want to >> reimplement it.) >> > >> > cheers >> > ?Bartek >> >> So the reason this parser is so simple and has almost no >> functionality >> is just a reflection of the simplicity of the CompareAce >> files? If so, I'd >> say leave the parser in. 
>> >> Peter >> > > > > > _______________________________________________ > Biopython mailing list ?- ?Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > > -- Bartek Wilczynski ================== Postdoctoral fellow EMBL, Furlong group Meyerhoffstrasse 1, 69012 Heidelberg, Germany tel: +49 6221 387 8433 From iwan.grin at googlemail.com Wed Dec 9 09:51:20 2009 From: iwan.grin at googlemail.com (Iwan Grin) Date: Wed, 9 Dec 2009 15:51:20 +0100 Subject: [Biopython] Parsing problem In-Reply-To: <20091209133802.GB79820@sobchak.mgh.harvard.edu> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> <20091209133802.GB79820@sobchak.mgh.harvard.edu> Message-ID: 2009/12/9 Brad Chapman > Iwan and Peter; > > > > I am new to BioPython and stumbled upon GFF.easy while searching > through the > > > API docs. Actually, What I wanted was a way to parse that location > string > > > into an SeqFeature-like thing from which I could get start, end and > > > strand.Unfortunately I could not find the correct parser in Bio.Genbank > - > > > any suggestions are welcome. > > > > Right now Bio.GenBank doesn't really expose the location parsing in an > > easy to use way like Bio.GFF.easy does. > > If you don't like ugly code, please avert your eyes now. This will > work with the standard GenBank parsing and is definitely not future > proof since it involves using private members. However, it'll work > for something quick n' dirty: > > from Bio.GenBank import _FeatureConsumer > from Bio.SeqFeature import SeqFeature > > def gb_string_to_feature(content, use_fuzziness=True): > """Convert a GenBank location string into a SeqFeature. 
> """ > consumer = _FeatureConsumer(use_fuzziness) > consumer._cur_feature = SeqFeature() > consumer.location(content) > return consumer._cur_feature > > print gb_string_to_feature('complement(NC_012967.1:3622110..3624728)') > > Hope this helps, > Brad > Brad, Thank you very much! as much as this is a hack, it works for what I want to have. I guess for future proofness, either the parsers from Bio.GenBank should be exposed, or the coded_by qualifier should be parsed as location by default, although I am not sure how well the latter idea fits into the present data structure. Iwan From biopython at maubp.freeserve.co.uk Wed Dec 9 10:05:39 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 9 Dec 2009 15:05:39 +0000 Subject: [Biopython] CompareAce parser In-Reply-To: <8b34ec180912090643m58867d29ha7d9cdd6e59e4bb@mail.gmail.com> References: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> <889087.95522.qm@web62406.mail.re1.yahoo.com> <8b34ec180912090643m58867d29ha7d9cdd6e59e4bb@mail.gmail.com> Message-ID: <320fb6e00912090705h691dd0a3u32b6e8760570d3e1@mail.gmail.com> On Wed, Dec 9, 2009 at 2:43 PM, Bartek Wilczynski wrote: > Hi Michiel, > > I haven't given enough consideration to the maintenance costs of > having parsers like this one in biopython. I think you are right that > it's not ?useful in its current state, and I don't think it's worth > putting efforts into improving it. There are already other methods of > motif comparison implemented in bio.Motif and if I was to choose an > external motif comparison software to support in biopython, I would > vote for the STAMP tool from Benos lab. So, in conclusion, I think it > would make sense to deprecate the CompareAce parser. > > cheers > Bartek I hadn't looked to see just how simple the files and parser were ;) Do you want to go ahead and make the deprecation Bartek? 
Thanks, Peter From biopython at maubp.freeserve.co.uk Wed Dec 9 10:26:18 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 9 Dec 2009 15:26:18 +0000 Subject: [Biopython] Parsing problem In-Reply-To: References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> <20091209133802.GB79820@sobchak.mgh.harvard.edu> Message-ID: <320fb6e00912090726w95cfa8bi542f6227f84c888b@mail.gmail.com> On Wed, Dec 9, 2009 at 2:51 PM, Iwan Grin wrote: > Brad, Thank you very much! > > as much as this is a hack, it works for what I want to have. I guess for > future proofness, either the parsers from Bio.GenBank should be exposed, or > the coded_by qualifier should be parsed as location by default, although I > am not sure how well the latter idea fits into the present data structure. Brad's trick still work in Biopython 1.53 at the very least. I think we'll try and make the location parser more accessible in future, but changing the parsing of "coded_by" qualifiers would risk breaking existing user scripts. Peter From iwan.grin at googlemail.com Wed Dec 9 10:55:21 2009 From: iwan.grin at googlemail.com (Iwan Grin) Date: Wed, 9 Dec 2009 16:55:21 +0100 Subject: [Biopython] Parsing problem In-Reply-To: <320fb6e00912090726w95cfa8bi542f6227f84c888b@mail.gmail.com> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> <20091209133802.GB79820@sobchak.mgh.harvard.edu> <320fb6e00912090726w95cfa8bi542f6227f84c888b@mail.gmail.com> Message-ID: 2009/12/9 Peter > On Wed, Dec 9, 2009 at 2:51 PM, Iwan Grin > wrote: > > Brad, Thank you very much! > > > > as much as this is a hack, it works for what I want to have. 
I guess for > > future proofness, either the parsers from Bio.GenBank should be exposed, > or > > the coded_by qualifier should be parsed as location by default, although > I > > am not sure how well the latter idea fits into the present data > structure. > > Brad's trick still work in Biopython 1.53 at the very least. I think we'll > try and make the location parser more accessible in future, but > changing the parsing of "coded_by" qualifiers would risk breaking > existing user scripts. > > Peter > I would suggest to add a new "coded_by" feature and leave the qualifier as it is. This should minimize the risk of breaking stuff. On the other hand, This feature would be pretty specific for CDS in Genbank Protein files. Iwan From bartek at rezolwenta.eu.org Wed Dec 9 11:33:58 2009 From: bartek at rezolwenta.eu.org (Bartek Wilczynski) Date: Wed, 9 Dec 2009 17:33:58 +0100 Subject: [Biopython] CompareAce parser In-Reply-To: <320fb6e00912090705h691dd0a3u32b6e8760570d3e1@mail.gmail.com> References: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> <889087.95522.qm@web62406.mail.re1.yahoo.com> <8b34ec180912090643m58867d29ha7d9cdd6e59e4bb@mail.gmail.com> <320fb6e00912090705h691dd0a3u32b6e8760570d3e1@mail.gmail.com> Message-ID: <8b34ec180912090833uf9ace0x8f0b76335dbd8143@mail.gmail.com> On Wed, Dec 9, 2009 at 4:05 PM, Peter wrote: > I hadn't looked to see just how simple the files and parser were ;) > > Do you want to go ahead and make the deprecation Bartek? Yes. It's done and pushed to github now. cheers Bartek -- Bartek Wilczynski ================== Postdoctoral fellow EMBL, Furlong group Meyerhoffstrasse 1, 69012 Heidelberg, Germany tel: +49 6221 387 8433 From villahozbale at wisc.edu Fri Dec 11 13:32:53 2009 From: villahozbale at wisc.edu (ANGEL VILLAHOZ-BALETA) Date: Fri, 11 Dec 2009 12:32:53 -0600 Subject: [Biopython] A potential printing error in the Biopython Tutorial and Cookbook? 
Message-ID: <70d0d6f9673f4.4b223bf5@wiscmail.wisc.edu>

Hi to all,

I believe that there is a printing error in the Biopython Tutorial and Cookbook...

Go there:

http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc102

Then check the following source code:

>>> for record in records:
...     print "title:", record["TI"]
...     if "AU" in records:
...         print "authors:", record["AU"]
...     print "source:", record["CO"]
...     print

I believe that the if sentence would have the record
instead of the records because it would never print
such an information about the authors since the data
structure of records does not have this key but always
integers as its indices.

Let me know if I am right or not.

Thanks very much,

Angel Villahoz-Baleta
Bioinformatics Programmer
University of Wisconsin-Madison

From biopython at maubp.freeserve.co.uk Fri Dec 11 14:45:31 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 11 Dec 2009 19:45:31 +0000
Subject: [Biopython] A potential printing error in the Biopython Tutorial and Cookbook?
In-Reply-To: <70d0d6f9673f4.4b223bf5@wiscmail.wisc.edu>
References: <70d0d6f9673f4.4b223bf5@wiscmail.wisc.edu>
Message-ID: <320fb6e00912111145w68330b80l8b1091db48fac3fb@mail.gmail.com>

On Fri, Dec 11, 2009 at 6:32 PM, ANGEL VILLAHOZ-BALETA wrote:
> Hi to all,
>
> I believe that there is a printing error in the Biopython Tutorial and Cookbook...
>
> Go there:
>
> http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc102
>
> Then check the following source code:
>
>>>> for record in records:
> ...     print "title:", record["TI"]
> ...     if "AU" in records:
> ...         print "authors:", record["AU"]
> ...     print "source:", record["CO"]
> ...     print
>
> I believe that the if sentence would have the record
> instead of the records because it would never print
> such an information about the authors since the data
> structure of records does not have this key but always
> integers as its indices.

What version of Biopython do you have?
Could you show us the actual error message?

I've just been playing with the example, and for some
records certain fields are missing (you get a KeyError),
so this works better:

for record in records:
    print "title:", record.get("TI","?")
    print "author:", record.get("AU","?")
    print "source:", record.get("CO","?")
    print

Does that help?

Peter

From mjldehoon at yahoo.com Fri Dec 11 20:52:08 2009
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Fri, 11 Dec 2009 17:52:08 -0800 (PST)
Subject: [Biopython] A potential printing error in the Biopython Tutorial and Cookbook?
In-Reply-To: <320fb6e00912111145w68330b80l8b1091db48fac3fb@mail.gmail.com>
Message-ID: <138660.52957.qm@web62406.mail.re1.yahoo.com>

Dear Angel,

This was indeed a typing error in the tutorial. It is fixed now as Peter suggested. Thanks for noticing!

--Michiel.

--- On Fri, 12/11/09, Peter wrote:

> From: Peter
> Subject: Re: [Biopython] A potential printing error in the Biopython Tutorial and Cookbook?
> To: "ANGEL VILLAHOZ-BALETA"
> Cc: biopython at lists.open-bio.org
> Date: Friday, December 11, 2009, 2:45 PM
> On Fri, Dec 11, 2009 at 6:32 PM,
> ANGEL VILLAHOZ-BALETA
>
> wrote:
> > Hi to all,
> >
> > I believe that there is a printing error in the Biopython Tutorial and Cookbook...
> >
> > Go there:
> >
> > http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc102
> >
> > Then check the following source code:
> >
> >>>> for record in records:
> > ...     print "title:", record["TI"]
> > ...     if "AU" in records:
> > ...         print "authors:", record["AU"]
> > ...     print "source:", record["CO"]
> > ...     print
> >
> > I believe that the if sentence would have the record
> > instead of the records because it would never print
> > such an information about the authors since the data
> > structure of records does not have this key but always
> > integers as its indices.
>
> What version of Biopython do you have?
> Could you show us the actual error message?
> > I've just been playing with the example, and for some > records certain fields are missing (you get a KeyError), > so this works better: > > for record in records: > ? ? print "title:", record.get("TI","?") > ? ? print "author:", record.get("AU","?") > ? ? print "source:", record.get("CO","?") > ? ? print > > Does that help? > > Peter > > _______________________________________________ > Biopython mailing list? -? Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From aboulia at gmail.com Tue Dec 15 01:01:47 2009 From: aboulia at gmail.com (Kevin Lam) Date: Tue, 15 Dec 2009 14:01:47 +0800 Subject: [Biopython] rsync download of biopython problems Message-ID: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> Hi I just tried downloading biopython via rsync rsync -av code.open-bio.org::cvsbiopython . But all the files were appended with a ",v" why did that happen? (i.e. see below) Attic Bio BioSQL CONTRIB,v DEPRECATED,v Doc Experimental LICENSE,v MANIFEST.in,v Martel NEWS,v README,v Scripts setup.py,v Tests From biopython at maubp.freeserve.co.uk Tue Dec 15 05:37:55 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 10:37:55 +0000 Subject: [Biopython] rsync download of biopython problems In-Reply-To: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> References: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> Message-ID: <320fb6e00912150237i7e627b0x9788bf6b112e5@mail.gmail.com> On Tue, Dec 15, 2009 at 6:01 AM, Kevin Lam wrote: > Hi > I just tried downloading biopython via rsync > > ?rsync -av code.open-bio.org::cvsbiopython . > > But all the files were appended with a ",v" why did that happen? (i.e. 
see
> below)
> Attic  Bio  BioSQL  CONTRIB,v  DEPRECATED,v  Doc  Experimental  LICENSE,v
> MANIFEST.in,v  Martel  NEWS,v  README,v  Scripts  setup.py,v  Tests

CVS likes to add ",v" to files - if you wanted to download them
from the public CVS server (code.open-bio.org) you would have
had to use the CVS command line tool.

However, we don't use CVS anymore, we use git. See:
http://www.biopython.org/wiki/SourceCode

If you really want to use rsync you *might* be able to point it
at http://biopython.org/SRC/biopython/

Peter

P.S. Why did you try downloading with rsync from the code.open-bio.org?
Is there something confusing in our documentation we can fix? Thanks!

From aboulia at gmail.com Tue Dec 15 06:39:35 2009
From: aboulia at gmail.com (Kevin Lam)
Date: Tue, 15 Dec 2009 19:39:35 +0800
Subject: [Biopython] rsync download of biopython problems
In-Reply-To: <320fb6e00912150237i7e627b0x9788bf6b112e5@mail.gmail.com>
References: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> <320fb6e00912150237i7e627b0x9788bf6b112e5@mail.gmail.com>
Message-ID: <5b6410e0912150339y296aee41xf2277c8ff582015e@mail.gmail.com>

Hi Peter,
git works fine for me!
ok lemme explain how i got there..

from http://www.biopython.org/wiki/CVS
i followed this link http://cvs.biopython.org/
which redirected me to http://www.open-bio.org/wiki/SourceCode
i was on a fresh install system of CentOS so rsync was avail so i used that.

Cheers
Kevin

On Tue, Dec 15, 2009 at 6:37 PM, Peter wrote:

> On Tue, Dec 15, 2009 at 6:01 AM, Kevin Lam wrote:
> > Hi
> > I just tried downloading biopython via rsync
> >
> > rsync -av code.open-bio.org::cvsbiopython .
> >
> > But all the files were appended with a ",v" why did that happen? (i.e.
> see > > below) > > Attic Bio BioSQL CONTRIB,v DEPRECATED,v Doc Experimental LICENSE,v > > MANIFEST.in,v Martel NEWS,v README,v Scripts setup.py,v Tests > > CVS likes to add ",v" to files - if you wanted to download them > from the public CVS server (code.open-bio.org) you would have > had to use the CVS command line tool. > > However, we don't use CVS anymore, we use git. See: > http://www.biopython.org/wiki/SourceCode > > If you really want to use rsync you *might* be able to point it > at http://biopython.org/SRC/biopython/ > > Peter > > P.S. Why did you try downloading with rsync from the code.open-bio.org? > Is there something confusing in our documentation we can fix? Thanks! > From biopython at maubp.freeserve.co.uk Tue Dec 15 06:48:22 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 11:48:22 +0000 Subject: [Biopython] rsync download of biopython problems In-Reply-To: <5b6410e0912150339y296aee41xf2277c8ff582015e@mail.gmail.com> References: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> <320fb6e00912150237i7e627b0x9788bf6b112e5@mail.gmail.com> <5b6410e0912150339y296aee41xf2277c8ff582015e@mail.gmail.com> Message-ID: <320fb6e00912150348y7f074d94u74b3b14cc31d4c6@mail.gmail.com> On Tue, Dec 15, 2009 at 11:39 AM, Kevin Lam wrote: > Hi Peter, > git works fine for me! > ok lemme explain how i got there.. > > from http://www.biopython.org/wiki/CVS I just the "this page is obsolete" bit at the top needs to be more prominent... > i followed this link http://cvs.biopython.org/ I can ask the sys admins to redirect that to: http://www.biopython.org/wiki/SourceCode > which redirected me to http://www.open-bio.org/wiki/SourceCode Ah - that does need updating. Thanks! > i was on a fresh install system of CentOS so rsync was avail so i used that. Well it "worked" (you got a copy of the CVS repository rather than a snapshot of the code), but the CVS repository is now out of date. 
Peter

From biopython at maubp.freeserve.co.uk Tue Dec 15 12:01:38 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 15 Dec 2009 17:01:38 +0000
Subject: [Biopython] Biopython 1.53 released
Message-ID: <320fb6e00912150901k138ae04bmc5d5af9c867340ec@mail.gmail.com>

Dear Biopythoneers,

We are pleased to announce the availability of Biopython 1.53, a new stable release of the Biopython library, three months after the release of Biopython 1.52. This is our first release since migrating from CVS to git for source code control.

There have been some additions to our core objects - the Seq (and related UnknownSeq) objects gained upper and lower methods (like the string methods of the same name but alphabet aware) plus a new ungap method. The SeqFeature object now has an extract method to get the region of sequence it describes (useful for getting CDS nucleotide sequences from GenBank files). Also SeqRecord objects now support addition, giving a new SeqRecord with the combined sequence, all the SeqFeatures, and any common annotation.

SQLite support (built into Python 2.5+) was added to our BioSQL interface. This is still a little experimental as we are using a draft BioSQL SQLite schema, but this should be merged into the next BioSQL release.

Biopython now includes wrappers for the new NCBI BLAST C++ tools, which will be replacing the old NCBI "legacy" BLAST tools written in C. The plain text BLAST parser has been updated to cope as well. Nevertheless, we (and the NCBI) still recommend using the XML output for parsing.

Bio.Entrez includes the new (Jan 2010) DTD files from the NCBI for parsing MedLine/PubMed data.

The NCBI codon tables have been updated from version 3.4 to 3.9, which adds a few extra start codons, and a few new tables (Tables 16, 21, 22 and 23).

The restriction enzyme list in Bio.Restriction has been updated to the Nov 2009 release of REBASE.
The Bio.PDB parser and output code has been updated to understand the element column in ATOM and HETATM lines, and Bio.PDB.PDBList has been updated for recent changes to the PDB FTP site. Finally, support for running Biopython under Jython (using the Java Virtual Machine) has been much improved. Note that Jython does not support C code, and currently Jython does not parse DTD files (needed for the Bio.Entrez XML parser). However, most of the Biopython modules seem fine from testing Jython 2.5.0 and 2.5.1. Sources and Windows Installers are available from our downloads page. Thanks to the Biopython development team and to everyone who has reported bugs or contributed patches since our last release. --Peter, on behalf of the Biopython developers P.S. This news post is online at http://news.open-bio.org/news/2009/12/biopython-release-153/ You may wish to subscribe to our news feed. For RSS links etc, see: http://biopython.org/wiki/News Biopython news is also on twitter: http://twitter.com/biopython From cgohlke at uci.edu Tue Dec 15 12:17:53 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 15 Dec 2009 09:17:53 -0800 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 Message-ID: <4B27C4C1.3090206@uci.edu> Hello, I have built biopython 1.53 for 64-bit Python 2.6 for Windows using Visual Studio 2008. The test output is attached. The installer is at Best, Christoph Gohlke Laboratory for Fluorescence Dynamics University of California, Irvine http://www.lfd.uci.edu/~gohlke/ -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: biopython-1.53.win-amd64-py2.6-test.txt URL: From biopython at maubp.freeserve.co.uk Tue Dec 15 12:49:35 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 17:49:35 +0000 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <4B27C4C1.3090206@uci.edu> References: <4B27C4C1.3090206@uci.edu> Message-ID: <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> On Tue, Dec 15, 2009 at 5:17 PM, Christoph Gohlke wrote: > Hello, > > I have built biopython 1.53 for 64-bit Python 2.6 for Windows using Visual > Studio 2008. The test output is attached. The installer is at > Nice :) Was this with NumPy 1.4.0rc2? The fact that test_GraphicsBitmaps.py failed with a font problem is indicative of something not quite right in ReportLab and/or PIL. This is almost certainly not a Biopython problem. A couple of SCOP tested failed - could you run unix2dos (or similar) on Tests/SCOP/*.txt and Tests/SCOP/scopseq-test/*.txt and retest? That "fixes" it on win32. Also what does this do on your Windows 64bit python? >>> import sys >>> sys.platform 'win32' I've seen threads discussing if it should return "win64" or "win32", but the simplest way to check is try it and see. Thanks, Peter From cgohlke at uci.edu Tue Dec 15 13:22:23 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 15 Dec 2009 10:22:23 -0800 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> References: <4B27C4C1.3090206@uci.edu> <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> Message-ID: <4B27D3DF.4070006@uci.edu> On 12/15/2009 9:49 AM, Peter wrote: > On Tue, Dec 15, 2009 at 5:17 PM, Christoph Gohlke wrote: >> Hello, >> >> I have built biopython 1.53 for 64-bit Python 2.6 for Windows using Visual >> Studio 2008. The test output is attached. The installer is at >> > > Nice :) > > Was this with NumPy 1.4.0rc2? Yes, numpy-1.4.0rc2.dev7996, also available on the same download page. 
> > The fact that test_GraphicsBitmaps.py failed with a font problem > is indicative of something not quite right in ReportLab and/or PIL. > This is almost certainly not a Biopython problem. > OK, I will check Reportlab and PIL. I now remember seeing some font loading issues with PIL 1.1.7 in other packages even though all internal tests pass. > A couple of SCOP tested failed - could you run unix2dos (or > similar) on Tests/SCOP/*.txt and Tests/SCOP/scopseq-test/*.txt > and retest? That "fixes" it on win32. > That worked. The font error is now the only failing test. > Also what does this do on your Windows 64bit python? > >>>> import sys >>>> sys.platform > 'win32' 'win32' is correct. I use "'64 bit' in sys.version" to check for a 64 bit version at runtime. > > I've seen threads discussing if it should return "win64" or > "win32", but the simplest way to check is try it and see. > Thank you. Feel free to redistribute the installer if you think it is good enough. Christoph From cgohlke at uci.edu Tue Dec 15 13:20:07 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 15 Dec 2009 10:20:07 -0800 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> References: <4B27C4C1.3090206@uci.edu> <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> Message-ID: <4B27D357.30705@uci.edu> On 12/15/2009 9:49 AM, Peter wrote: > On Tue, Dec 15, 2009 at 5:17 PM, Christoph Gohlke wrote: >> Hello, >> >> I have built biopython 1.53 for 64-bit Python 2.6 for Windows using Visual >> Studio 2008. The test output is attached. The installer is at >> > > Nice :) > > Was this with NumPy 1.4.0rc2? > > The fact that test_GraphicsBitmaps.py failed with a font problem > is indicative of something not quite right in ReportLab and/or PIL. > This is almost certainly not a Biopython problem. 
> > A couple of SCOP tested failed - could you run unix2dos (or > similar) on Tests/SCOP/*.txt and Tests/SCOP/scopseq-test/*.txt > and retest? That "fixes" it on win32. > > Also what does this do on your Windows 64bit python? > >>>> import sys >>>> sys.platform > 'win32' > > I've seen threads discussing if it should return "win64" or > "win32", but the simplest way to check is try it and see. > > Thanks, > > Peter > > From biopython at maubp.freeserve.co.uk Tue Dec 15 13:57:09 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 18:57:09 +0000 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <4B27D3DF.4070006@uci.edu> References: <4B27C4C1.3090206@uci.edu> <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> <4B27D3DF.4070006@uci.edu> Message-ID: <320fb6e00912151057t41a6e889k1df389cbccf1a8cd@mail.gmail.com> On Tue, Dec 15, 2009 at 6:22 PM, Christoph Gohlke wrote: > >> The fact that test_GraphicsBitmaps.py failed with a font problem >> is indicative of something not quite right in ReportLab and/or PIL. >> This is almost certainly not a Biopython problem. >> > OK, I will check Reportlab and PIL. I now remember seeing some > font loading issues with PIL 1.1.7 in other packages even though > all internal tests pass. If you are happy to investigate further, that would be great. >> A couple of SCOP tested failed - could you run unix2dos (or >> similar) on Tests/SCOP/*.txt and Tests/SCOP/scopseq-test/*.txt >> and retest? That "fixes" it on win32. > > That worked. The font error is now the only failing test. Good. >> Also what does this do on your Windows 64bit python? >> >>>>> import sys >>>>> sys.platform >> >> 'win32' > > 'win32' is correct. > I use "'64 bit' in sys.version" to check for a 64 bit version at runtime. Thanks - I just wanted to be sure. > > Thank you. Feel free to redistribute the installer if you think it is > good enough. 
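A minimal sketch of the runtime check discussed above, written for Python 3 rather than the Python 2.6 used in this thread. The helper name `interpreter_bits` is hypothetical and not part of Biopython; it reads the pointer size from the standard `struct` module, which should work on any platform, whereas the `'64 bit' in sys.version` test mentioned above is assumed to hold only for Windows builds whose version banner contains that text.

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    # struct.calcsize("P") is the size in bytes of a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # sys.platform stays 'win32' on 64-bit Windows builds, so it cannot
    # be used on its own to tell a 32-bit from a 64-bit interpreter.
    print("sys.platform:", sys.platform)
    print("pointer width:", interpreter_bits(), "bits")
    # Banner-based check from the thread; assumed to be meaningful only
    # on Windows builds whose version string contains '64 bit'.
    print("'64 bit' in sys.version:", "64 bit" in sys.version)
```

The pointer-size test is used here because it does not depend on how a particular build formats its version string.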
> Given NumPy don't offer their own 64bit installers, and the possible need for Microsoft Visual C++ 2008 redistributable package, perhaps linking to your page makes most sense for now. I'll update our download page if that sounds sensible. Thank you, Peter From cgohlke at uci.edu Tue Dec 15 14:29:28 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 15 Dec 2009 11:29:28 -0800 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <320fb6e00912151057t41a6e889k1df389cbccf1a8cd@mail.gmail.com> References: <4B27C4C1.3090206@uci.edu> <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> <4B27D3DF.4070006@uci.edu> <320fb6e00912151057t41a6e889k1df389cbccf1a8cd@mail.gmail.com> Message-ID: <4B27E398.3090600@uci.edu> On 12/15/2009 10:57 AM, Peter wrote: > On Tue, Dec 15, 2009 at 6:22 PM, Christoph Gohlke wrote: >> >>> The fact that test_GraphicsBitmaps.py failed with a font problem >>> is indicative of something not quite right in ReportLab and/or PIL. >>> This is almost certainly not a Biopython problem. >>> >> OK, I will check Reportlab and PIL. I now remember seeing some >> font loading issues with PIL 1.1.7 in other packages even though >> all internal tests pass. > > If you are happy to investigate further, that would be great. > Turned out that the Times-Roman font is simply missing from the reportlab 2.3 source distribution, which I used. The missing fonts can be downloaded at and put in the reportlab/fonts directory. I also included these fonts in the updated reportlab-2.3.win-amd64-py2.6.exe installer. All tests pass now. Some tests were skipped due to missing third party packages on my computer. >> >> Thank you. Feel free to redistribute the installer if you think it is >> good enough. >> > > Given NumPy don't offer their own 64bit installers, and the > possible need for Microsoft Visual C++ 2008 redistributable > package, perhaps linking to your page makes most sense > for now. I'll update our download page if that sounds sensible. 
>

Makes sense. The VC.CRT redistributable is usually installed with Python
and I compiled with the http://bugs.python.org/issue4120 patch.

Best,

Christoph

From aboulia at gmail.com Tue Dec 15 22:52:36 2009
From: aboulia at gmail.com (Kevin Lam)
Date: Wed, 16 Dec 2009 11:52:36 +0800
Subject: [Biopython] Entrez.efetch Service unavailable!
Message-ID: <5b6410e0912151952h76787344t18968aba5ab350d8@mail.gmail.com>

Hi I have been trying to use
Entrez.efetch
to download ~1000 bacteria genomes
I read in the docs that biopython will auto take care of the delay and fetch
via the preferred site for download scripts
but I have been getting service unavailable errors at GMT +8 1100am
is this normal? Or should i edit the source to give a larger delay buffer?
So far I have only managed to get 3 fasta sequences out
Or should I bring this up to NCBI instead?

Traceback (most recent call last):
  File "../retr-fasta.py", line 15, in ?
    handle = Entrez.efetch(db="genome", id=uid, rettype="fasta")
  File "/home/k/lib/biopython-biopython-9a41381/build/lib.linux-x86_64-2.4/Bio/Entrez/__init__.py", line 105, in efetch
    return _open(cgi, variables)
  File "/home/k/lib/biopython-biopython-9a41381/build/lib.linux-x86_64-2.4/Bio/Entrez/__init__.py", line 343, in _open
    raise IOError("Service unavailable!")
IOError: Service unavailable!

Cheers
Kevin

From biopython at maubp.freeserve.co.uk Wed Dec 16 03:58:35 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 16 Dec 2009 08:58:35 +0000
Subject: [Biopython] Entrez.efetch Service unavailable!
In-Reply-To: <5b6410e0912151952h76787344t18968aba5ab350d8@mail.gmail.com>
References: <5b6410e0912151952h76787344t18968aba5ab350d8@mail.gmail.com>
Message-ID: <320fb6e00912160058i57acc020nba7c61c53a4ec64b@mail.gmail.com>

On Wed, Dec 16, 2009 at 3:52 AM, Kevin Lam wrote:
> Hi I have been trying to use
> Entrez.efetch
> to download ~1000 bacteria genomes

Why not use their FTP site? They even make bundles available, e.g.
ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/all.gbk.tar.gz ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/all.faa.tar.gz etc. Note the folder is called Bacteria for historical reasons, it is really Prokaryotes as there are plenty of Archaea in there. > I read in the docs that biopython will auto take care of the delay and fetch > via the preferred site for download scripts > but I have been getting service unavailable ?errors at GMT +8 1100am > is this normal? Or should i edit the source to give a larger delay buffer? > So far I have only managed to get 3 fasta sequences out The NCBI were planning some Entrez work about now (updating DTD files), so the downtime might be expected. I'd wait a day, and then if it is still down email them. Regards, Peter From iua1 at psu.edu Wed Dec 16 11:49:11 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 11:49:11 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups Message-ID: Hello Everyone, I applaud the move to Github, I think it was a great decision that will allow more people to contribute to the project. Yet at the same time a community is built on communication and the current mailing list feels extremely antiquated. The way of interaction is tedious: sending emails to an address, then one reads one message at a time, messages are not displayed in a threaded form where multiple messages are shown at the same time. There is no of search, etc. and it all feels like a throwback to the 90s. Personally I think a choice of mailman is a choice of deliberately of limiting access to all but the most hardcore - and for example that's why the main Python-dev uses it, it is a more of a mechanism to keep people away. Of course python has comp.lang.python and it is a nice and thriving group. The alternative such as Google groups would be far superior in attracting and building a community of developers and users as well. Is this an idea that the owners of the list would entertain? 
best regards, Istvan Albert -- Istvan Albert http://www.personal.psu.edu/iua1 From hlapp at drycafe.net Wed Dec 16 11:56:57 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 11:56:57 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> Istvan, what is your solution to Google all of a sudden deciding to take Google Groups down, or to make it a paid subscription service, or if Google goes out of business? All of these things have happened over and over before with commercial vendors. Economies do go through cycles. Would it be OK to lose the entire archive of the mailing list in such an event? BTW if you use a threaded email client (such as GMail, or in fact most modern email readers), you *will* see threaded messages. Also, the Biopython list is indexed in GMane I believe, so you can search there pretty conveniently. Finally, have you tried using Google to search the Biopython archive? It's not so bad, actually. Just my $0.02 (and I'm a big fan of Google Groups). -hilmar On Dec 16, 2009, at 11:49 AM, Istvan Albert wrote: > Hello Everyone, > > I applaud the move to Github, I think it was a great decision that > will allow more people to contribute to the project. > > Yet at the same time a community is built on communication and the > current mailing list feels extremely antiquated. The way of > interaction is tedious: sending emails to an address, then one reads > one message at a time, messages are not displayed in a threaded form > where multiple messages are shown at the same time. There is no of > search, etc. and it all feels like a throwback to the 90s. Personally > I think a choice of mailman is a choice of deliberately of limiting > access to all but the most hardcore - and for example that's why the > main Python-dev uses it, it is a more of a mechanism to keep people > away. 
Of course python has comp.lang.python and it is a nice and > thriving group. > > The alternative such as Google groups would be far superior in > attracting and building a community of developers and users as well. > Is this an idea that the owners of the list would entertain? > > best regards, > > Istvan Albert > > > -- > Istvan Albert > http://www.personal.psu.edu/iua1 > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From carlos.borroto at gmail.com Wed Dec 16 11:57:05 2009 From: carlos.borroto at gmail.com (Carlos Javier Borroto) Date: Wed, 16 Dec 2009 11:57:05 -0500 Subject: [Biopython] Is there any in silico PCR tool on biopython? Message-ID: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> Hi there, I'm looking for a way to show some information on the specificity of sets of primers I'm designing, I'll love to have a way to run an in silico PCR reaction and parse the results with biopython. Is there something to use with one of the available PCR simulation tools? regards, -- Carlos Javier Borroto Baltimore, MD Google Voice: (410) 929 4020 From biopython at maubp.freeserve.co.uk Wed Dec 16 12:07:05 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:07:05 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <320fb6e00912160907m4f201a90xb85851ef9ba43271@mail.gmail.com> On Wed, Dec 16, 2009 at 4:49 PM, Istvan Albert wrote: > Hello Everyone, > > I applaud the move to Github, I think it was a great decision that > will allow more people to contribute to the project. 
>
> Yet at the same time a community is built on communication and the
> current mailing list feels extremely antiquated. The way of
> interaction is tedious: sending emails to an address, then one reads
> one message at a time, messages are not displayed in a threaded form
> where multiple messages are shown at the same time. There is no of
> search, etc. and it all feels like a throwback to the 90s.

A lot of that is down to your email program - I find none of those
issues apply to how I use the list (in GoogleMail).

You are specifically talking about browsing the mailing list archive?
There, yes, things are a bit rudimentary, and search isn't as good
as in GoogleMail. But on the other hand it is clearly a read-only
archive.

> Personally I think a choice of mailman is a choice of deliberately of
> limiting access to all but the most hardcore - and for example that's
> why the main Python-dev uses it, it is a more of a mechanism to
> keep people away. Of course python has comp.lang.python and it
> is a nice and thriving group.
>
> The alternative such as Google groups would be far superior in
> attracting and building a community of developers and users as
> well. Is this an idea that the owners of the list would entertain?

It is something that the OBF may consider - but there are a lot
of concerns about reliance on third parties, advertising, loss of
brand control etc (see also Hilmar's email).

Peter

From biopython at maubp.freeserve.co.uk Wed Dec 16 12:08:00 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 16 Dec 2009 17:08:00 +0000
Subject: [Biopython] Is there any in silico PCR tool on biopython?
In-Reply-To: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> References: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> Message-ID: <320fb6e00912160908r152bbc49o38ea32a33cf6ff89@mail.gmail.com> On Wed, Dec 16, 2009 at 4:57 PM, Carlos Javier Borroto wrote: > Hi there, > > I'm looking for a way to show some information on the specificity of > sets of primers I'm designing, I'll love to have a way to run an in > silico PCR reaction and parse the results with biopython. > > Is there something to use with one of the available PCR simulation tools? > I know people use Biopython with primer3 (usually the EMBOSS wrapped version, eprimer3). Peter From biopython at maubp.freeserve.co.uk Wed Dec 16 12:11:40 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:11:40 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> Message-ID: <320fb6e00912160911tdedfb2ew47a0535379e828b5@mail.gmail.com> On Wed, Dec 16, 2009 at 4:56 PM, Hilmar Lapp wrote: > > Also, the Biopython list is indexed in GMane I believe, so you can > search there pretty conveniently. Good point - should we add links to these on the mailing list wiki page? 
Main mailing list: http://dir.gmane.org/gmane.comp.python.bio.general Dev mailing list: http://dir.gmane.org/gmane.comp.python.bio.general Announcement list: http://dir.gmane.org/gmane.comp.python.bio.general Peter From biopython at maubp.freeserve.co.uk Wed Dec 16 12:14:31 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:14:31 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160911tdedfb2ew47a0535379e828b5@mail.gmail.com> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <320fb6e00912160911tdedfb2ew47a0535379e828b5@mail.gmail.com> Message-ID: <320fb6e00912160914w12e8ff2eiaa42b7924226eca8@mail.gmail.com> On Wed, Dec 16, 2009 at 5:11 PM, Peter wrote: > On Wed, Dec 16, 2009 at 4:56 PM, Hilmar Lapp wrote: >> >> Also, the Biopython list is indexed in GMane I believe, so you can >> search there pretty conveniently. > > Good point - should we add links to these on the mailing list wiki page? Sorry, same link three times, should be: Main mailing list: http://dir.gmane.org/gmane.comp.python.bio.general http://news.gmane.org/gmane.comp.python.bio.general Dev mailing list: http://dir.gmane.org/gmane.comp.python.bio.devel http://news.gmane.org/gmane.comp.python.bio.devel Announcement list: http://dir.gmane.org/gmane.comp.python.bio.announce http://news.gmane.org/gmane.comp.python.bio.announce Peter From biopython at maubp.freeserve.co.uk Wed Dec 16 12:26:01 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:26:01 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> On Wed, Dec 16, 2009 at 4:49 PM, Istvan Albert wrote: > > The alternative such as Google groups would be far superior in > attracting and building a community of developers and users as well. 
> Is this an idea that the owners of the list would entertain? > As far as I can tell from the limited documentation I found on Google Groups (maybe I was looking in the wrong place?) there is no way to import our existing decade long mail archives. That would be a major downside. Also, using Google Groups would *require* all posters to have a Google Account - a potential sticking point for some. Peter From cjfields at illinois.edu Wed Dec 16 12:19:52 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 11:19:52 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160907m4f201a90xb85851ef9ba43271@mail.gmail.com> References: <320fb6e00912160907m4f201a90xb85851ef9ba43271@mail.gmail.com> Message-ID: <52405282-5654-47D8-885F-4733F9A40873@illinois.edu> On Dec 16, 2009, at 11:07 AM, Peter wrote: > On Wed, Dec 16, 2009 at 4:49 PM, Istvan Albert wrote: >> Hello Everyone, >> >> I applaud the move to Github, I think it was a great decision that >> will allow more people to contribute to the project. >> >> Yet at the same time a community is built on communication and the >> current mailing list feels extremely antiquated. The way of >> interaction is tedious: sending emails to an address, then one reads >> one message at a time, messages are not displayed in a threaded form >> where multiple messages are shown at the same time. There is no of >> search, etc. and it all feels like a throwback to the 90s. > > A lot of that is down to your email program - I find none of > those issue apply to how I use the list (in GoogleMail). > > You are specifically talking about browsing the mailing list > archive? There yes, things are a bit rudimentary, and search > isn't as good as in GoogleMail. But on the other hand it is > clearly a read only archive. 
> >> Personally I think a choice of mailman is a choice of deliberately of >> limiting access to all but the most hardcore - and for example that's >> why the main Python-dev uses it, it is a more of a mechanism to >> keep people away. Of course python has comp.lang.python and it >> is a nice and thriving group. >> >> The alternative such as Google groups would be far superior in >> attracting and building a community of developers and users as >> well. Is this an idea that the owners of the list would entertain? > > It is something that the OBF may consider - but there are a > lot of concerns about reliance on third parties, advertising, loss > of brand control etc (see also Hilmar's email). > > Peter I agree with peter and hilmar. Nabble and Gmane both archive the open-bio lists, at least for bioperl, but I would assume biopython as well. For a painful example of how bad third party mail lists can be (painful at least to me), see the gmod lists at Sourceforge. chris From iua1 at psu.edu Wed Dec 16 12:36:45 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 12:36:45 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> Message-ID: On Wed, Dec 16, 2009 at 11:56 AM, Hilmar Lapp wrote: > Groups down, or to make it a paid subscription service, or if Google goes > out of business? > Economies do go through cycles. Would it be OK to lose the entire > archive of the mailing list in such an event? Every choice has two sides. Has a positive and has negative dimension to it. I am sure one can come up with unlikely yet equally pessimistic scenarios for the existing setup as well. One thing is seems clear to me and I do not think that you are aware of it. This mailman setup is a throttle - it imposes a negative feedback on the amount of messages that it can handle. 
This system of messages cannot grow over a certain limit. Just imagine regularly getting a dozen new emails a day plus their followups, yet you are just a casual user. This would be unbearable for many people whose inboxes are already overflowing. So they either don't participate or once they get even a few of these messages they turn off email delivery at which point you are left with a rudimentary site where it is hard to contribute so it drops off their radar. I can't even imagine what it would look like to have a popular newsgroup being delivered to my mailbox. In a nutshell you are saying it already works - but that is only because you get so few messages ... and getting more becomes actually inconvenient to the point at which it has to decay again to the manageable level Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From hlapp at drycafe.net Wed Dec 16 12:40:45 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 12:40:45 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> Message-ID: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> On Dec 16, 2009, at 12:18 PM, Istvan Albert wrote: > I am sure one can come up with unlikely yet equally pessimistic > scenarios for the existing setup as well. My point was that this is not unlikely at all. It happened with some of Yahoo's services, and it happened with others rather popular ones. If you own and operate your own brand, your equipment can still go out of order. But at least it's under your own control. Do you see that difference? Would you argue that that is unimportant? > This mailman setup is a throttle - it imposes a negative feedback on > the amount of messages that it can handle. I really don't know what you mean by this. I get 200 messages a day from various lists. 
I'd be dead if I had an email client that can't thread and can't filter, but I do have one that can (and it's free). GMail can do both too, and is free. Have you tried a threading email reader? Can you explain how reading newsgroups through a threaded and filtering news reader is different and more efficient than reading emails through a threaded and filtering email reader? That all being said, if what's at issue here is to have a Google Group interface to the Biopython mailing list, then that's actually easy to achieve. Someone (ideally one of the current list or project admins/ owners) creates a (presumably identically named) Google Group, and sets it to mirror the mailman mailing list. Guys - I'm happy to help with that if you don't know how to do that. Create the group, subscribe drycafe at gmail.com, and make me an admin. I'll configure the mirroring. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From sdavis2 at mail.nih.gov Wed Dec 16 12:48:30 2009 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed, 16 Dec 2009 12:48:30 -0500 Subject: [Biopython] Is there any in silico PCR tool on biopython? In-Reply-To: <320fb6e00912160908r152bbc49o38ea32a33cf6ff89@mail.gmail.com> References: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> <320fb6e00912160908r152bbc49o38ea32a33cf6ff89@mail.gmail.com> Message-ID: <264855a00912160948q538bed21q6dc0d410d468c5c7@mail.gmail.com> On Wed, Dec 16, 2009 at 12:08 PM, Peter wrote: > On Wed, Dec 16, 2009 at 4:57 PM, Carlos Javier Borroto > wrote: >> Hi there, >> >> I'm looking for a way to show some information on the specificity of >> sets of primers I'm designing, I'll love to have a way to run an in >> silico PCR reaction and parse the results with biopython. >> >> Is there something to use with one of the available PCR simulation tools? 
>> > > I know people use Biopython with primer3 (usually the EMBOSS > wrapped version, eprimer3). Carlos, If it is something like the UCSC genome browser in-silico PCR (for mapping the putative amplimers from a set of primers), they (UCSC) have an executable of the software. I always have trouble finding their software tools, but they are very responsive to email if you have problems. I don't have an example output file, but I bet it is just tab-delimited text, so parsing is probably not too difficult. Sean From lueck at ipk-gatersleben.de Wed Dec 16 12:31:18 2009 From: lueck at ipk-gatersleben.de (lueck at ipk-gatersleben.de) Date: Wed, 16 Dec 2009 18:31:18 +0100 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160914w12e8ff2eiaa42b7924226eca8@mail.gmail.com> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <320fb6e00912160911tdedfb2ew47a0535379e828b5@mail.gmail.com> <320fb6e00912160914w12e8ff2eiaa42b7924226eca8@mail.gmail.com> Message-ID: <20091216183118.zy9yvhd0hdcs4ggs@webmail.ipk-gatersleben.de> What about a free forum e.g. smf (http://www.simplemachines.org/) on the biopython homepage? I'm using this too and I'm quite happy. Easy building, maintaining... Just an idea... Zitat von Peter : > On Wed, Dec 16, 2009 at 5:11 PM, Peter > wrote: >> On Wed, Dec 16, 2009 at 4:56 PM, Hilmar Lapp wrote: >>> >>> Also, the Biopython list is indexed in GMane I believe, so you can >>> search there pretty conveniently. >> >> Good point - should we add links to these on the mailing list wiki page? 
> > Sorry, same link three times, should be: > > Main mailing list: > http://dir.gmane.org/gmane.comp.python.bio.general > http://news.gmane.org/gmane.comp.python.bio.general > > Dev mailing list: > http://dir.gmane.org/gmane.comp.python.bio.devel > http://news.gmane.org/gmane.comp.python.bio.devel > > Announcement list: > http://dir.gmane.org/gmane.comp.python.bio.announce > http://news.gmane.org/gmane.comp.python.bio.announce > > Peter > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > > From iua1 at psu.edu Wed Dec 16 13:02:26 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:02:26 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> References: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> Message-ID: Hi Everyone, > Also, using Google Groups would *require* all posters to have > a Google Account - a potential sticking point for some. Right, but even currently one has to make an account on your site with email and password. Aren't you less comfortable signing up with various third parties than Google? A response mentioned that Gmail's email threading works just as well ... well it doesn't help anyone who has not already been subscribed to the messages to begin with. If you come to a discussion later you cannot get that. My goal is not to argue with each point, I only picked these two because both seemed to me like obvious responses yet they are only superficially addressing the issues that I brought up. But to step back a second, and maybe I wasn't specific enough. It is less about me, I can filter and thread my own email, I could download the email entire archive and search it etc.. I teach an introductory level class that uses Biopython, can I recommend my class that all go and sign up with you? 
I cannot really. Many people are just learning about computing, they will be overwhelmed with the everything interface, lack of search etc. All of you who responded - frankly I think you are too close to this issue to be able judge it correctly. It is like advising a newbie to use VI, one in a hundred will love it ninety nine will hate it, but hey who could argue that it is not super awesome? Once you do something for a bunch of years, you develop strategies and everything seems to work just fine, and everything makes sense. Get someone who is 20 and has never heard of sending emails to an address then see what they say about it... You all obviously care about the a community around biopython so all I am saying here is this: when you look around and wish that you could get a lot more people in - I think the answer is right there. Make it really easy to ask question, participate but also easy to just not participate and just be able to catch up really quickly of what is going on. I wish you all the best, Istvan Albert -- Istvan Albert http://www.personal.psu.edu/iua1 From lueck at ipk-gatersleben.de Wed Dec 16 13:04:17 2009 From: lueck at ipk-gatersleben.de (lueck at ipk-gatersleben.de) Date: Wed, 16 Dec 2009 19:04:17 +0100 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: <20091216190417.1z9yrs6mn4f4gcsw@webmail.ipk-gatersleben.de> What about a free forum e.g. smf (http://www.simplemachines.org/) on the biopython homepage? I'm using this too and I'm quite happy. Easy building, maintaining... Just an idea... Zitat von Hilmar Lapp : > > On Dec 16, 2009, at 12:18 PM, Istvan Albert wrote: > >> I am sure one can come up with unlikely yet equally pessimistic >> scenarios for the existing setup as well. > > My point was that this is not unlikely at all. 
It happened with some > of Yahoo's services, and it happened with others rather popular ones. > > If you own and operate your own brand, your equipment can still go > out of order. But at least it's under your own control. Do you see > that difference? Would you argue that that is unimportant? > >> This mailman setup is a throttle - it imposes a negative feedback on >> the amount of messages that it can handle. > > I really don't know what you mean by this. I get 200 messages a day > from various lists. I'd be dead if I had an email client that can't > thread and can't filter, but I do have one that can (and it's free). > GMail can do both too, and is free. Have you tried a threading email > reader? Can you explain how reading newsgroups through a threaded and > filtering news reader is different and more efficient than reading > emails through a threaded and filtering email reader? > > That all being said, if what's at issue here is to have a Google > Group interface to the Biopython mailing list, then that's actually > easy to achieve. Someone (ideally one of the current list or project > admins/ owners) creates a (presumably identically named) Google > Group, and sets it to mirror the mailman mailing list. > > Guys - I'm happy to help with that if you don't know how to do that. > Create the group, subscribe drycafe at gmail.com, and make me an admin. > I'll configure the mirroring. 
> > -hilmar > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > > From iua1 at psu.edu Wed Dec 16 13:13:45 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:13:45 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: On Wed, Dec 16, 2009 at 12:40 PM, Hilmar Lapp wrote: > If you own and operate your own brand, your equipment can still go out of > order. But at least it's under your own control. Do you see that difference? > Would you argue that that is unimportant? It is a valid argument. No question about that. I do have a comeback though, what percent of the past decade's archive's content do you think is still actually useful? Isn't the archive's purpose more of a historical one. And type of archiving you could do for yourself. But as for useful content that other people want to use - I am guessing the half life of any particular advice is no more than about one two two years. 
Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From hlapp at drycafe.net Wed Dec 16 13:23:18 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 13:23:18 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> On Dec 16, 2009, at 1:13 PM, Istvan Albert wrote: > what percent of the past decade's archive's content do you think is > still actually useful? Actually I find it very useful. We frequently cite past posts as references for explanations, or problems previously reported. It's one of the main differences between an archived mailing list and a simple alias for a group of people. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Dec 16 13:28:37 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 12:28:37 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> Message-ID: On Dec 16, 2009, at 12:23 PM, Hilmar Lapp wrote: > > On Dec 16, 2009, at 1:13 PM, Istvan Albert wrote: > >> what percent of the past decade's archive's content do you think is still actually useful? > > > Actually I find it very useful. We frequently cite past posts as references for explanations, or problems previously reported. It's one of the main differences between an archived mailing list and a simple alias for a group of people. > > -hilmar Agreed. 
With bioperl we generally indicate it's best to search the archives prior to asking a question, just in case the answer is already known or has been worked out. chris From iua1 at psu.edu Wed Dec 16 13:40:27 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:40:27 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> Message-ID: On Wed, Dec 16, 2009 at 1:28 PM, Chris Fields wrote: > > Agreed. With bioperl we generally indicate it's best to search the archives prior to asking a question, just in case the answer is already known or has been worked out. My fault for being insufficiently clear. I am not saying that having archives is useless. It all needs to be framed in the mindset of an unexpected event causing an archive to be lost. Is that irreparable harm? For example would having a hundred more active participants be worth the small risk of losing the archives? I am just putting the 100 as a number out there, just to get you to think. I think you all agree that at some level of extra participation the risks would be well worth it. Now I am convinced that a Google group would get more participation. But is that 10 more people, one hundred, one thousand? That I do not dare to guesstimate. 
(definitely more than 10, ;-) ) Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From cjfields at illinois.edu Wed Dec 16 13:34:42 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 12:34:42 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: <61CD9E3F-3F1A-4AC0-B863-9A24E7A373EF@illinois.edu> On Dec 16, 2009, at 11:40 AM, Hilmar Lapp wrote: > > On Dec 16, 2009, at 12:18 PM, Istvan Albert wrote: > >> I am sure one can come up with unlikely yet equally pessimistic scenarios for the existing setup as well. > > My point was that this is not unlikely at all. It happened with some of Yahoo's services, and it happened with others rather popular ones. > > If you own and operate your own brand, your equipment can still go out of order. But at least it's under your own control. Do you see that difference? Would you argue that that is unimportant? > >> This mailman setup is a throttle - it imposes a negative feedback on the amount of messages that it can handle. > > I really don't know what you mean by this. I get 200 messages a day from various lists. I'd be dead if I had an email client that can't thread and can't filter, but I do have one that can (and it's free). GMail can do both too, and is free. Have you tried a threading email reader? Can you explain how reading newsgroups through a threaded and filtering news reader is different and more efficient than reading emails through a threaded and filtering email reader? > > That all being said, if what's at issue here is to have a Google Group interface to the Biopython mailing list, then that's actually easy to achieve. 
Someone (ideally one of the current list or project admins/owners) creates a (presumably identically named) Google Group, and sets it to mirror the mailman mailing list. > > Guys - I'm happy to help with that if you don't know how to do that. Create the group, subscribe drycafe at gmail.com, and make me an admin. I'll configure the mirroring. > > -hilmar That would probably be a good idea for all the (most trafficked) open-bio groups. I'll work on it from the bioperl end. chris From tiagoantao at gmail.com Wed Dec 16 13:52:55 2009 From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=) Date: Wed, 16 Dec 2009 18:52:55 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: <6d941f120912161052m4360fe66w2fcff505fa61be3c@mail.gmail.com> On Wed, Dec 16, 2009 at 5:40 PM, Hilmar Lapp wrote: > If you own and operate your own brand, your equipment can still go out of > order. But at least it's under your own control. Do you see that difference? > Would you argue that that is unimportant? +1 . Just to say I fully subscribe to this point of view. github is _different_ by the very nature of being a distributed system. If tomorrow github.com disappears, it will be very easy to recover from it. If google turns bad, we loose a lot of history as google groups is not inherently distributed and thus neither resilient nor fail-safe. I prefer the status quo to the google groups change. From my point of view the technological autonomy provided by the OBF is a good thing. 
My ?0.02, Tiago From biopython at maubp.freeserve.co.uk Wed Dec 16 13:55:16 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 18:55:16 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> Message-ID: <320fb6e00912161055l38b8e27dpbb4c19be7c23b18b@mail.gmail.com> On Wed, Dec 16, 2009 at 6:02 PM, Istvan Albert wrote: > A response mentioned that Gmail's email threading works just as well > ... well it doesn't help anyone who has not already been subscribed > to the messages to begin with. If you come to a discussion later you > cannot get that. True - but that becomes less and less of an issue once you have signed up. > ... Get someone who is 20 and has never heard of sending > emails to an address then see what they say about it... Do 20 year olds really not know how to use email these days? I do talk to graduate students, and hadn't noticed a trend. I must be getting old(er). Maybe we need a "Dummies Guide to setting up a GoogleMail/Thunderbird/Outlook filter"? e.g. I have one to move things from the inbox to a "Biopython" folder automatically. I do take you point that making the mailing list more accessible to novices (especially university students) is a good idea - and you may be right that *mirroring* it on GoogleGroups could be a solution. I don't know enough about how that works to have an informed viewpoint, but I trust Hilmar to look into it. 
Peter From iua1 at psu.edu Wed Dec 16 13:57:39 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:57:39 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <6d941f120912161052m4360fe66w2fcff505fa61be3c@mail.gmail.com> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <6d941f120912161052m4360fe66w2fcff505fa61be3c@mail.gmail.com> Message-ID: 2009/12/16 Tiago Antão : > If google turns bad, we loose a lot of history as google groups > is not inherently distributed and thus neither resilient No, on a second thought it actually is. Sign up to the group such that is sends an email for every single message (if you wish so). Google goes under, go back to the site the way it is right now. Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From iua1 at psu.edu Wed Dec 16 13:59:11 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:59:11 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912161055l38b8e27dpbb4c19be7c23b18b@mail.gmail.com> References: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> <320fb6e00912161055l38b8e27dpbb4c19be7c23b18b@mail.gmail.com> Message-ID: On Wed, Dec 16, 2009 at 1:55 PM, Peter wrote: > Do 20 year olds really not know how to use email these days? > I do talk to graduate students, and hadn't noticed a trend. > I must be getting old(er). Maybe we need a "Dummies Not email but interacting with a listserver via emails. Ask your students what a list server is. 
-- Istvan Albert http://www.personal.psu.edu/iua1 From iua1 at psu.edu Wed Dec 16 14:14:21 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 14:14:21 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: On Wed, Dec 16, 2009 at 2:07 PM, Chris Fields wrote: > Just curious, but does anyone know whether Google groups are more or less susceptible to spamming? > The current mailman setup does keep out a vast majority of spam (I can't recall the last instance, actually). The effective way to deal with it is white listing. There is a setting that requires that the first message from a given email be approved by mods. That being said spam, popularity and ease of access all correlate. -- Istvan Albert http://www.personal.psu.edu/iua1 From cjfields at illinois.edu Wed Dec 16 14:05:46 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 13:05:46 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <6d941f120912161052m4360fe66w2fcff505fa61be3c@mail.gmail.com> Message-ID: On Dec 16, 2009, at 12:57 PM, Istvan Albert wrote: > 2009/12/16 Tiago Antão : > >> If google turns bad, we loose a lot of history as google groups >> is not inherently distributed and thus neither resilient > > No, on a second thought it actually is. > > Sign up to the group such that is sends an email for every single > message (if you wish so). Google goes under, go back to the site the > way it is right now. 
> > Istvan ...and in the meantime we lose any content only present on the google group list. Whereas if the group is a mirror of this list, then nothing is lost. chris From hlapp at drycafe.net Wed Dec 16 14:20:21 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 14:20:21 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: <307922E9-8DD9-408F-8D39-D259186B3568@drycafe.net> It's the mailing list moderator volunteers that keep the spam out, actually. It's gotten so bad though that most OBF lists are set to reject non-member posts. What Google might be better at is to reject the spam well enough that one could open up the lists again to non-member posting. -hilmar Sent from away On Dec 16, 2009, at 2:07 PM, Chris Fields wrote: > Just curious, but does anyone know whether Google groups are more or > less susceptible to spamming? The current mailman setup does keep > out a vast majority of spam (I can't recall the last instance, > actually). From cjfields at illinois.edu Wed Dec 16 14:07:39 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 13:07:39 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> Message-ID: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> On Dec 16, 2009, at 12:40 PM, Istvan Albert wrote: > On Wed, Dec 16, 2009 at 1:28 PM, Chris Fields wrote: >> >> Agreed. 
With bioperl we generally indicate it's best to search the archives prior to asking a question, just in case the answer is already known or has been worked out. > > My fault for being insufficiently clear. I am not saying that having > archives is useless. > > It all needs to be framed in the mindset of an unexpected event > causing an archive to be lost. Is that irreparable harm? For example > would having a hundred more active participants be worth the small > risk of losing the archives? Not that I think Google is in any danger of going under, or that Google Groups will cease to exist, but they have discontinued services in the past (notebook was one, and I recall others going away). > I am just putting the 100 as a number out there, just to get you to > think. I think you all agree that at some level of extra participation > the risks would be well worth it. I understand your point, but I'm not really convinced this is something that can't be accomplished by simply mirroring the group and redirecting new users to sign up on the obf forums. > Now I am convinced that a Google group would get more participation. > But is that 10 more people, one hundred, one thousand? That I do not > dare to guesstimate. > > (definitely more than 10, ;-) ) > > Istvan I think mirroring the list is the best compromise. I can't envision moving everything wholesale over to Google Groups for the reasons Hilmar has outlined. Just curious, but does anyone know whether Google groups are more or less susceptible to spamming? The current mailman setup does keep out a vast majority of spam (I can't recall the last instance, actually). 
chris From iua1 at psu.edu Wed Dec 16 14:21:05 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 14:21:05 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: On Wed, Dec 16, 2009 at 2:07 PM, Chris Fields wrote: > I understand your point, but I'm not really convinced this is something that can't be accomplished by simply mirroring the group and redirecting new users to sign up on the obf forums. Great idea! While you are at it, why not allow people to post as well? Sign up the current list so that when someone posts on Google Groups it also goes to the current biopython list. When people reply-all from biopython group it will go to both lists. Maybe it is possible to get both worlds and using them in parallel! Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From hlapp at drycafe.net Wed Dec 16 14:29:14 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 14:29:14 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: That's included in the mirroring. You can post through either interface and join at either interface, it's transparent. -hilmar Sent from away On Dec 16, 2009, at 2:21 PM, Istvan Albert wrote: > Great idea! While you are at it, why not allow people to post as well? 
From pingou at pingoured.fr Wed Dec 16 14:35:07 2009 From: pingou at pingoured.fr (Pierre-Yves) Date: Wed, 16 Dec 2009 20:35:07 +0100 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: <1260992107.5993.0.camel@localhost.localdomain> On Wed, 2009-12-16 at 14:21 -0500, Istvan Albert wrote: > > Sign up the current list so that when someone posts on Google Groups > it also goes to the current biopython list That implies that people are subscribed to both lists, and that won't always be the case. Pierre From richard_w_g_price at academia.edu Wed Dec 16 17:50:05 2009 From: richard_w_g_price at academia.edu (Richard Price) Date: Wed, 16 Dec 2009 14:50:05 -0800 Subject: [Biopython] New Academia.edu feature for Biopython In-Reply-To: References: Message-ID: Dear Biopython members, I just wanted to let you know that there are now 5 members of Biopython on Academia.edu listing their research interests such as Analytical Chemistry, Computational Biology, and Bioinformatics. They have also listed contacts, photos and papers. There are thousands of people listing the same research interests as the Biopython members on Academia.edu, so there are lots of researchers for Biopython members to discover. To see the 5 members of Biopython on Academia.edu, and their research interests and papers, follow the link below: http://lists.academia.edu/See-members-of-Biopython Richard Dr. Richard Price, post-doc, Philosophy Dept, Oxford University. Founder of Academia.edu On Wed, Dec 2, 2009 at 5:21 PM, Richard Price wrote: > Dear Biopython members, > > > I wanted to tell the list about a new feature on Academia.edu.
> Academia.edu launched 12 months ago and now helps 300,000 academics a > month answer the question 'who's researching what?' > > > We have built a dedicated page on Academia.edu for the Biopython mailing list: > > > http://lists.academia.edu/See-members-of-Biopython > > > This page will show you fellow members already on Academia.edu. You > can see their papers, research interests, and other information. > > > Visit the link below, sign up with Academia.edu, and see who else from > Biopython is on Academia.edu. > > > > http://lists.academia.edu/See-members-of-Biopython > > > Richard > > > Dr. Richard Price, post-doc, Philosophy Dept, Oxford University. > Founder of Academia.edu > From lpritc at scri.ac.uk Thu Dec 17 04:35:56 2009 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 17 Dec 2009 09:35:56 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: Message-ID: Hi, On 16/12/2009 16:49, "Istvan Albert" wrote: > [the mailing list] feels like a throwback to the 90s. [...] > Of course python has comp.lang.python and it is a nice and > thriving group. comp.lang.python is a Usenet group, and Usenet is a throwback to the 70s. ;) > I can't even imagine what it would look like to have a popular > newsgroup being delivered to my mailbox. Mailing lists are far more convenient - for me - than having to navigate to a website especially to check new messages on a particular subject. Mailing lists bring new posts and issues to my attention via a single always-on interface in a timely manner. The current mailing list also has the advantage of grabbing the attention, specifically, of many people who might be able to do something about a query. As it happens, when I used to still read Usenet groups, I would do so from my mail client, with exactly the same threaded interface as I used for mailing lists and all other email.
Biopython is never likely to be more than a niche interest, so I wouldn't expect it to ever reach the traffic of - say - alt.binaries. To be honest, the traffic doesn't even seem to approach that of numpy-discussion. And while we're talking about numpy-discussion, it illustrates one of Hilmar's and Chris' points: On 16/12/2009 16:56, "Hilmar Lapp" wrote: > what is your solution to Google all of a sudden deciding to take > Google Groups down, or to make it a paid subscription service, or if > Google goes out of business? On 16/12/2009 19:07, "Chris Fields" wrote: > Not that I think Google is in any danger of going under, or that Google Groups > will cease to exist, but they have discontinued services in the past (notebook > was one, and I recall others going away). http://groups.google.com/group/numpy-discussion/unlock?_done=/group/numpy-discussion/ That mailing list was taken down from Google Groups for 'violating terms of service' - why, I don't know: it's a mailing list for a specialist Python library. It does illustrate though, how control can be (irrevocably) lost over communication via Google Groups. Notably, the mailing list itself persists without interruption. On 16/12/2009 17:36, "Istvan Albert" wrote: > One thing seems clear to me and I do not think that you are aware > of it. This mailman setup is a throttle - it imposes a negative > feedback on the amount of messages that it can handle. > > This system of messages cannot grow over a certain limit. Just imagine > regularly getting a dozen new emails a day plus their followups, yet > you are just a casual user. > > This would be unbearable for many people whose inboxes are already > overflowing. This issue - that some people don't like to receive lots of messages at once - is already solved in mailman. There is a 'daily digest' option on the mailing list that collates the messages for a day, and sends them out as a single email for you.
As a mailing list, mailman is deliberately designed to be relatively low volume in terms of content, but to reach many readers directly; the idea is to send relatively few messages to many people - but to push those messages through to the reader. Forums, wikis and website-based discussion lists require a deliberate effort on the part of the reader either to find what they're interested in, or to visit the site regularly. Otherwise they just end up signing up to receive updates by email, much like mailman. On 16/12/2009 17:40, "Hilmar Lapp" wrote: > That all being said, if what's at issue here is to have a Google Group > interface to the Biopython mailing list, then that's actually easy to > achieve. FWIW, I think that archiving the mailing list on Google Groups is not a bad idea - so long as the current registration scheme continues to prevent the inevitable waves of spam from Google Groups users. The Biopython mailing lists appear already to have been archived - with various degrees of usable interface, and likely intermittent coverage, too - at sites such as: http://www.mailinglistarchive.com/biopython at biopython.org/index.html and http://osdir.com/ml/search.html?cx=008059810939676512379%3Af5owd_2hq3u&cof=FORID%3A10&q=%5Bbiopython%5D&sa=Search amongst others. On 16/12/2009 18:02, "Istvan Albert" wrote: > All of you who responded - frankly I think you are too close to this > issue to be able to judge it correctly. [...] Get > someone who is 20 and has never heard of sending emails to an address > then see what they say about it... If they've never heard of emailing an address, and/or can't use a mail client to filter their email, I'm not sure the immediate problem is necessarily with the mailing list... ;) I don't think that anyone here wants to restrict access to Biopython, or to prevent discussion, even inadvertently. That we've had 28 posts on this issue in about 12 hours suggests that the list can handle issues of some interest.
Sure, it would be nice to have a convenient, web-accessible and searchable archive with a pretty and robust interface (and Google Groups could give us that). But I'm not convinced that the current mailing list is a particular barrier to participation. +1 for mirroring/archiving on Google Groups: http://groups.google.com/support/bin/answer.py?hl=en&answer=46387 L. -- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405 ______________________________________________________ SCRI, Invergowrie, Dundee, DD2 5DA. The Scottish Crop Research Institute is a charitable company limited by guarantee. Registered in Scotland No: SC 29367. Recognised by the Inland Revenue as a Scottish Charity No: SC 006662. DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.ac.uk quoting the name of the sender and delete the email from your system. Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any). 
______________________________________________________ From lpritc at scri.ac.uk Thu Dec 17 05:14:54 2009 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 17 Dec 2009 10:14:54 +0000 Subject: [Biopython] Is there any in silico PCR tool on biopython? In-Reply-To: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> Message-ID: Hi Carlos, There's an interface to the EMBOSS package primersearch, which might be what you're looking for. http://embossgui.sourceforge.net/demo/manual/primersearch.html http://github.com/biopython/biopython/blob/master/Bio/Emboss/Applications.py Cheers, L. On 16/12/2009 16:57, "Carlos Javier Borroto" wrote: > Hi there, > > I'm looking for a way to show some information on the specificity of > sets of primers I'm designing, I'll love to have a way to run an in > silico PCR reaction and parse the results with biopython. > > Is there something to use with one of the available PCR simulation tools? > > regards, -- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405
From biopython at maubp.freeserve.co.uk Thu Dec 17 05:42:16 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 10:42:16 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <320fb6e00912170242p7caa551eh91c2aa7beabc00f1@mail.gmail.com> On Thu, Dec 17, 2009 at 9:35 AM, Leighton Pritchard wrote: > > On 16/12/2009 16:56, "Hilmar Lapp" wrote: > >> what is your solution to Google all of a sudden deciding to take >> Google Groups down, or to make it a paid subscription service, >> or if Google goes out of business? > > On 16/12/2009 19:07, "Chris Fields" wrote: > >> Not that I think Google is in any danger of going under, or that Google Groups >> will cease to exist, but they have discontinued services in the past (notebook >> was one, and I recall others going away). > > http://groups.google.com/group/numpy-discussion/unlock?_done=/group/numpy-discussion/ > > That mailing list was taken down from Google Groups for 'violating terms of > service' - why, I don't know: it's a mailing list for a specialist Python > library. It does illustrate though, how control can be (irrevocably) lost > over communication via Google Groups. Notably, the mailing list itself > persists without interruption. I remembered that example later - if NumPy had *switched* that would have been a major upset to their community.
As it was, it seems people who were using the Google Groups interface switched to using plain old email: http://mail.scipy.org/pipermail/numpy-discussion/2009-October/045855.html Peter From biopython at maubp.freeserve.co.uk Thu Dec 17 05:55:51 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 10:55:51 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <320fb6e00912170255y3968a6e4s3bd904071910251@mail.gmail.com> On Wed, Dec 16, 2009 at 6:55 PM, Peter wrote: > > I do take you point that making the mailing list more > accessible to novices (especially university students) is a > good idea - and you may be right that *mirroring* it on > GoogleGroups could be a solution. I don't know enough > about how that works to have an informed viewpoint, but > I trust Hilmar to look into it. > On Thu, Dec 17, 2009 at 9:35 AM, Leighton Pritchard wrote: > > +1 for mirroring/archiving on Google Groups: > http://groups.google.com/support/bin/answer.py?hl=en&answer=46387 > It looks like we can see mirroring/archiving on Google Groups in action - Hilmar and Chris have got this up and running for the main BioPerl list, http://lists.open-bio.org/pipermail/bioperl-l/2009-December/031789.html http://groups.google.com/group/bioperl-l It is a shame there isn't any obvious way to import the existing archive though. Peter From biopython at maubp.freeserve.co.uk Thu Dec 17 07:27:59 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 12:27:59 +0000 Subject: [Biopython] Entrez.efetch Service unavailable! 
In-Reply-To: <5b6410e0912160125j4f034218x263e6ab90b4afd47@mail.gmail.com> References: <5b6410e0912151952h76787344t18968aba5ab350d8@mail.gmail.com> <320fb6e00912160058i57acc020nba7c61c53a4ec64b@mail.gmail.com> <5b6410e0912160125j4f034218x263e6ab90b4afd47@mail.gmail.com> Message-ID: <320fb6e00912170427q6b31ee33x6b161323d24f3027@mail.gmail.com> On Wed, Dec 16, 2009 at 9:25 AM, Kevin Lam wrote: > Hi Peter, > Thanks for the suggestion. It was also an exercise for me since I am new to > Biopython and add to the fact that I do not need the Archaea sequences as I > am looking for pathogenic bacteria. I admit I am lazy ha! All the same, > thanks for being so helpful to a newbie to biopython. > Cheers > Kevin No problem. I've been trying Entrez EFetch on and off over the last 24 hours, and it has been unavailable for a while. It seems to be back now. Peter P.S. Mailing list CC'd From giles.weaver at googlemail.com Thu Dec 17 08:25:18 2009 From: giles.weaver at googlemail.com (Giles Weaver) Date: Thu, 17 Dec 2009 13:25:18 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: <4B2A313E.3050300@googlemail.com> I urge extreme caution with regards to using Google Groups. My experiences with Google Groups have been less than satisfactory. I've administered several small groups and they have been plagued with delivery issues for a small subset of users. I've also known members to be unable to access a group that they are a member of, and find that Google Groups has "disappeared" their profile - despite having an otherwise fully functioning Google account (Gmail etc).
I'm not a fan of mailman, but at least the open-bio administrators have full control over the lists. I've known Chris to solve issues with the bioperl mailman list within minutes. You won't get that kind of service (if any) from Google. Having looked at alternatives to Google Groups myself, the two things that have caught my attention are bbPress (a wordpress derived bulletin board) and Google Wave. Both are still under development. bbPress boards can be subscribed to via RSS (and possibly email), so users can have messages drop into their mail/news reader. Wave looks promising, but I wouldn't touch it with a barge pole until mature Google free implementations take off, and that could be some time away. Mirroring the open-bio mailman lists onto Google Groups seems to me the right way to go, but I think there should be a health warning on the list home pages! Giles On 16/12/2009 19:07, Chris Fields wrote: > On Dec 16, 2009, at 12:40 PM, Istvan Albert wrote: > > >> On Wed, Dec 16, 2009 at 1:28 PM, Chris Fields wrote: >> >>> Agreed. With bioperl we generally indicate it's best to search the archives prior to asking a question, just in case the answer is already known or has been worked out. >>> >> My fault for being insufficiently clear. I am not saying that having >> archives is useless. >> >> It all needs to be framed in the mindset of an unexpected event >> causing an archive to be lost. Is that irreparable harm? For example >> would having a hundred more active participants be worth the small >> risk of losing the archives? >> > Not that I think Google is in any danger of going under, or that Google Groups will cease to exist, but they have discontinued services in the past (notebook was one, and I recall others going away). > > >> I am just putting the 100 as a number out there, just to get you to >> think. I think you all agree that at some level of extra participation >> the risks would be well worth it. 
>> > I understand your point, but I'm not really convinced this is something that can't be accomplished by simply mirroring the group and redirecting new users to sign up on the obf forums. > > >> Now I am convinced that a Google group would get more participation. >> But is that 10 more people, one hundred, one thousand? That I do not >> dare to guesstimate. >> >> (definitely more than 10, ;-) ) >> >> Istvan >> > I think mirroring the list is the best compromise. I can't envision moving everything wholesale over to Google Groups for the reasons Hilmar has outlined. > > Just curious, but does anyone know whether Google groups are more or less susceptible to spamming? The current mailman setup does keep out a vast majority of spam (I can't recall the last instance, actually). > > chris > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From biopython at maubp.freeserve.co.uk Thu Dec 17 08:35:25 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 13:35:25 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <4B2A313E.3050300@googlemail.com> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> <4B2A313E.3050300@googlemail.com> Message-ID: <320fb6e00912170535r3a6d2a4cv37d852aff3671e6e@mail.gmail.com> On Thu, Dec 17, 2009 at 1:25 PM, Giles Weaver wrote: > I urge extreme caution with regards to using Google Groups. > My experiences with Google Groups have been less than satisfactory. > ... > > Mirroring the open-bio mailman lists onto Google Groups seems to me the > right way to go, but I think there should be a health warning on the list > home pages! Thanks for the hard earned advice. 
I absolutely agree that *moving* to GoogleGroups is a bad idea, and accept that *mirroring* may be worth a try. Given the BioPerl list is already trying this out, let's give that a week or so, and if they think it works nicely then we could do the same for Biopython. Peter From David.Lapointe at umassmed.edu Thu Dec 17 07:42:15 2009 From: David.Lapointe at umassmed.edu (Lapointe, David) Date: Thu, 17 Dec 2009 07:42:15 -0500 Subject: [Biopython] EMBOSS and Python Message-ID: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> I haven't used EMBOSS through the python module before so this hasn't been an issue but even though I have a working install of EMBOSS, all of the EMBOSS tests seem to fail. I haven't seen any instructions for this. David From biopython at maubp.freeserve.co.uk Thu Dec 17 12:04:11 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 17:04:11 +0000 Subject: [Biopython] EMBOSS and Python In-Reply-To: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> Message-ID: <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> On Thu, Dec 17, 2009 at 12:42 PM, Lapointe, David wrote: > I haven't used EMBOSS though the python module before so this hasn't > been an issue but even though I have a working install of EMBOSS, all > of the EMBOSS tests seem to fail. I haven't seen any instructions for > this. What version of Biopython do you have? And if you are talking about the Biopython unit tests, could you post the output please? What version of EMBOSS do you have? Some of the Biopython tests did flag issues in EMBOSS which are fixed in their latest release.
Thanks, Peter From iua1 at psu.edu Thu Dec 17 12:29:17 2009 From: iua1 at psu.edu (Istvan Albert) Date: Thu, 17 Dec 2009 12:29:17 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: On Thu, Dec 17, 2009 at 4:35 AM, Leighton Pritchard wrote: > Mailing lists are far more convenient - for me - than having to navigate to > a website especially to check new messages on a particular subject. Just for clarity: nobody is suggesting to take that away. You can *always* get the messages delivered via email. I think some of you get a little bit defensive because you assume that the suggestion is about messing with your system. > Biopython is never likely to be more than a niche interest, You are perfectly right here and that echoes my original sentiment! What keeps biopython a niche interest is exactly the lack of community building features. There are obstacles in getting into it, so most won't. It doesn't take much to discourage a newcomer. best, Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From dalloliogm at gmail.com Thu Dec 17 12:48:18 2009 From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio) Date: Thu, 17 Dec 2009 18:48:18 +0100 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> References: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> Message-ID: <5aa3b3570912170948k1890aa5ft3c30c760718c3cbd@mail.gmail.com> On Wed, Dec 16, 2009 at 6:26 PM, Peter wrote: > On Wed, Dec 16, 2009 at 4:49 PM, Istvan Albert wrote: >> >> The alternative such as Google groups would be far superior in >> attracting and building a community of developers and users as well. >> Is this an idea that the owners of the list would entertain? >> > > As far as I can tell from the limited documentation I found on > Google Groups (maybe I was looking in the wrong place?)
there > is no way to import our existing decade long mail archives. > That would be a major downside. It seems that you can use a google/group to archive the messages from a remote mailing list, but you can't import the older messages to the google group: - http://groups.google.com/support/bin/answer.py?hl=en&answer=46387 Maybe you can give it a try: just create a group on google and use it as a mirror for the current mailing list. This way google users will be able to use the google interface to search and read messages, but they won't be able to post messages by mail. It won't harm anyone to have a mirror of the messages in GG, except that it may be a bit confusing for new users. About other possibilities, it seems that there is not yet a way to import a whole archive to GG: - http://groups.google.com/group/Groups-Suggestions/browse_thread/thread/3eb3ed32dee6b97e I have tried google/wave (I have invitations if anyone wants) but please avoid switching to it yet, because it is still very very buggy and it will take them at least one year or so to make it useful. > Also, using Google Groups would *require* all posters to have > a Google Account - a potential sticking point for some. I am still not sure about this: it seems that you can subscribe with another address if you are invited, but you have to create a google account (not necessarily a google mail) to subscribe to it manually. It is a nice idea to switch to a fancier program for managing the list, but maybe it would require too much time to switch to google/groups.
> > Peter > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > -- Giovanni Dall'Olio, phd student Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain) My blog on bioinformatics: http://bioinfoblog.it From lpritc at scri.ac.uk Thu Dec 17 12:55:52 2009 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 17 Dec 2009 17:55:52 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: Message-ID: Hi, On 17/12/2009 17:29, "Istvan Albert" wrote: > On Thu, Dec 17, 2009 at 4:35 AM, Leighton Pritchard wrote: > >> Mailing lists are far more convenient - for me - than having to navigate to >> a website especially to check new messages on a particular subject. > > Just for clarity: nobody is suggesting to take that away. You can > *always* get the messages delivered via email. I wasn't worried that anyone would 'take it away'. Though it must be said that not all discussion systems allow email tracking. > I think some of you get a little bit defensive because you assume that the > suggestion is about messing with your system. I don't think that's a fair characterisation. So far, what I've seen is open and thoughtful discussion of a comment you've made, and general agreement that mirroring/archiving on Google Groups is a good idea. Just because someone doesn't agree with every point you make, that doesn't make them defensive. >> Biopython is never likely to be more than a niche interest, > > You are perfectly right here and that echoes my original sentiment! > What keeps biopython a niche interest is exactly the lack of community > building features. There are obstacles in getting into it, so most won't. > It doesn't take much to discourage a newcomer. That's not what I meant when I wrote that it's a niche interest. Biopython is a niche interest because a user likely fulfils three - rather unusual - criteria.
I believe that these are the big obstacles to participation - not the software used for the mailing list: - they program in Python - they have an interest (likely professional or educational - not many bioinformatics hobbyists out there) in bioinformatics - they have bothered to investigate an existing general library for bioinformatics in Python, rather than try to solve their problem from scratch ;) Add to that, that the mailing list is a forum for discussion which - along with all other online fora for discussion, including Google Groups - is inherently self-selecting for active members. Taken together, this sets its own upper limit to the user base, which in turn helps define what is a sensible system to convey information between, and to, users. The three criteria listed above do suggest a sufficient level of comfort with the problem domain, such that Biopython's presence on Google Groups is - in my opinion - unlikely to be the major draw for new contributors or users. I do worry for the future of research if signing up to a mailing list (which you can do via the website; no subscription email is required) is an insurmountable hurdle for young, apparently computer-literate scientists. If that sort of thing really is a problem, it should at least keep conference attendances down in years to come, as navigating their Byzantine registration systems must drive them to endless despair... L. -- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405
From iua1 at psu.edu Thu Dec 17 13:47:23 2009 From: iua1 at psu.edu (Istvan Albert) Date: Thu, 17 Dec 2009 13:47:23 -0500 Subject: [Biopython] some eye opening stats Message-ID: Hello Everyone, So I ran some statistics on this group (see below) that includes the entire past year. Make your own decisions based on it. Here is one of my observations: I find it saddening that I made the list at number 18! That's some niche list where one person posting ten messages in a whole year gets to be at number 18. In fact I only need three more posts to make myself top ten poster! Would you still claim this to be a good way to establish, grow and interact with a community? I said this many times before, and I'll try for this to be the last time I bring this up: I believe biopython is a niche software tool because *YOU* are limiting its reach *YOURSELVES* by making inappropriate decisions as far as accessibility and community goes. It will stay so as long as you don't recognize and act on this.
best regards, Istvan Albert ================================= Statistics from 1.12.2008 to 17.12.2009 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ***** People who have written most messages: +----+-----Author-----------------------------------+--Msg-+-Percent-+ | 1 | biopython at maubp.freeserve.co.uk (Peter) | 472 | 39.70 % | | 2 | chapmanb at 50mail.com (Brad Chapman) | 53 | 4.46 % | | 3 | p.j.a.cock at googlemail.com (Peter Cock) | 36 | 3.03 % | | 4 | lueck at ipk-gatersleben.de (=?iso-8859-1?Q? | 27 | 2.27 % | | 5 | mjldehoon at yahoo.com (Michiel de Hoon) | 23 | 1.93 % | | 6 | cjfields at illinois.edu (Chris Fields) | 22 | 1.85 % | | 7 | dalloliogm at gmail.com (Giovanni Marco Dall | 21 | 1.77 % | | 8 | winda002 at student.otago.ac.nz (David Winte | 15 | 1.26 % | | 9 | lpritc at scri.ac.uk (Leighton Pritchard) | 14 | 1.18 % | | 10 | cmckay at u.washington.edu (Cedar McKay) | 13 | 1.09 % | | 11 | italo.maia at gmail.com (Italo Maia) | 12 | 1.01 % | | 12 | kellrott at gmail.com (Kyle Ellrott) | 12 | 1.01 % | | 13 | rodrigo_faccioli at uol.com.br (Rodrigo facc | 12 | 1.01 % | | 14 | bartek at rezolwenta.eu.org (Bartek Wilczyns | 12 | 1.01 % | | 15 | anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Ro | 11 | 0.93 % | | 16 | bartomas at gmail.com (bar tomas) | 11 | 0.93 % | | 17 | pzs at dcs.gla.ac.uk (Peter Saffrey) | 11 | 0.93 % | | 18 | iua1 at psu.edu (Istvan Albert) | 10 | 0.84 % | | 19 | dejmail at gmail.com (Liam Thompson) | 10 | 0.84 % | | 20 | stran104 at chapman.edu (Matthew Strand) | 10 | 0.84 % | | 21 | ibdeno at gmail.com (Miguel Ortiz Lombardia) | 9 | 0.76 % | | 22 | pengyu.ut at gmail.com (Peng Yu) | 9 | 0.76 % | | 23 | yvan.strahm at bccs.uib.no (Yvan Strahm) | 9 | 0.76 % | | 24 | kelly.oakeson at utah.edu (Kelly F Oakeson) | 9 | 0.76 % | | 25 | carlos.borroto at gmail.com (Carlos Javier B | 9 | 0.76 % | +----+----------------------------------------------+------+---------+ | | other | 337 | 28.34 % | 
+----+----------------------------------------------+------+---------+ ***** Best authors, by total size of their messages (w/o quoting): +----+-----Author-------------------------------------------+-KBytes-+ | 1 | biopython at maubp.freeserve.co.uk (Peter) | 391.4 | | 2 | chapmanb at 50mail.com (Brad Chapman) | 48.1 | | 3 | lpritc at scri.ac.uk (Leighton Pritchard) | 39.1 | | 4 | p.j.a.cock at googlemail.com (Peter Cock) | 35.7 | | 5 | lueck at ipk-gatersleben.de (=?iso-8859-1?Q?Stefanie | 27.9 | | 6 | matzke at berkeley.edu (Nick Matzke) | 21.2 | | 7 | animesh.agrawal at anu.edu.au (Animesh Agrawal) | 21.1 | | 8 | kelly.oakeson at utah.edu (Kelly F Oakeson) | 16.7 | | 9 | pzs at dcs.gla.ac.uk (Peter Saffrey) | 15.4 | | 10 | dalloliogm at gmail.com (Giovanni Marco Dall'Olio) | 14.7 | | 11 | natassa_g_2000 at yahoo.com (natassa) | 12.7 | | 12 | hlapp at gmx.net (Hilmar Lapp) | 12.7 | | 13 | mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydG | 12.5 | | 14 | cjfields at illinois.edu (Chris Fields) | 12.0 | | 15 | bartek at rezolwenta.eu.org (Bartek Wilczynski) | 11.6 | | 16 | winda002 at student.otago.ac.nz (David Winter) | 11.3 | | 17 | cmckay at u.washington.edu (Cedar McKay) | 11.0 | | 18 | mjldehoon at yahoo.com (Michiel de Hoon) | 10.9 | | 19 | dejmail at gmail.com (Liam Thompson) | 10.6 | | 20 | rodrigo_faccioli at uol.com.br (Rodrigo faccioli) | 10.3 | | 21 | anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues? 
| 9.7 | | 22 | fufezan at uni-muenster.de (Christian Fufezan) | 9.6 | | 23 | kellrott at gmail.com (Kyle Ellrott) | 9.2 | | 24 | peter at maubp.freeserve.co.uk (Peter) | 8.9 | | 25 | ibdeno at gmail.com (Miguel Ortiz Lombardia) | 8.3 | +----+------------------------------------------------------+--------+ ***** Best authors, by average size of their message (w/o quoting): +----+-----Author--------------------------------------------+-bytes-+ | 1 | agarbino at gmail.com (Alex Garbino) | 7991 | | 2 | aduran at fhcrc.org (Duran, Alysha M) | 3893 | | 3 | n.j.loman at bham.ac.uk (Nick Loman) | 3877 | | 4 | mmueller at python-academy.de (=?ISO-8859-15?Q?Mike_M | 3816 | | 5 | animesh.agrawal at anu.edu.au (Animesh Agrawal) | 3598 | | 6 | ivan at biodec.com (Ivan Rossi) | 2947 | | 7 | thomas.e.keller at gmail.com (Thomas Keller) | 2937 | | 8 | lpritc at scri.ac.uk (Leighton Pritchard) | 2860 | | 9 | matzke at berkeley.edu (Nick Matzke) | 2717 | | 10 | jhcepas at gmail.com (Jaime Huerta Cepas) | 2705 | | 11 | fufezan at uni-muenster.de (Christian Fufezan) | 2447 | | 12 | natassa_g_2000 at yahoo.com (natassa) | 2175 | | 13 | bav853 at bham.ac.uk (Bhima A van der Molen) | 1987 | | 14 | kteague at bcgsc.ca (Kevin Teague) | 1932 | | 15 | danielchubb at gmail.com (Daniel Chubb) | 1923 | | 16 | lueck at ipk-gatersleben.de (=?utf-8?Q?Stefanie_L=C3= | 1919 | | 17 | dalloliogm at fastwebnet.it (Giovanni Marco Dall'Olio | 1915 | | 18 | kelly.oakeson at utah.edu (Kelly F Oakeson) | 1895 | | 19 | bav853 at bham.ac.uk (Bhima Auro van der Molen) | 1874 | | 20 | darnells at dnastar.com (Steve Darnell) | 1768 | | 21 | gatoygata at hotmail.com (Joaquin Abian Monux) | 1744 | | 22 | srini_iyyer_bio at yahoo.com (Srinivas Iyyer) | 1733 | | 23 | bassbabyface at yahoo.com (Ben O'Loghlin) | 1698 | | 24 | hlapp at gmx.net (Hilmar Lapp) | 1624 | | 25 | mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGl | 1603 | +----+-------------------------------------------------------+-------+ ***** Table 
showing the most successful subjects: +----+----Subject-----------------------------------+--Msg-+-Percent-+ | 1 | [Biopython] suggestion: moving to the discus | 33 | 2.78 % | | 2 | [Biopython] Problems parsing with PSIBlastPa | 32 | 2.69 % | | 3 | [BioPython] The count method of a Seq (or Mu | 23 | 1.93 % | | 4 | [Biopython] Biopython and Snow Leopard | 21 | 1.77 % | | 5 | [Biopython] Phylogenetic trees with biopytho | 16 | 1.35 % | | 6 | [Biopython] Indexing large sequence files | 15 | 1.26 % | | 7 | [Biopython] SQL Alchemy based BioSQL | 14 | 1.18 % | | 8 | [BioPython] AlignIO: Sequences of different | 13 | 1.09 % | | 9 | [Biopython] searching for a human chromosome | 13 | 1.09 % | | 10 | [Biopython] BLAST against mouse genome only | 12 | 1.01 % | | 11 | [Biopython] How to get sequences upstream of | 12 | 1.01 % | | 12 | [Biopython] Parsing large blast files | 11 | 0.93 % | | 13 | [Biopython] Additions to the SeqRecord | 11 | 0.93 % | | 14 | [Biopython] Biopython & p3d | 11 | 0.93 % | | 15 | [BioPython] Feedback from Biopython 1.50 bet | 10 | 0.84 % | | 16 | [BioPython] Adding startswith and endswith m | 10 | 0.84 % | | 17 | [Biopython] Bio.Sequencing.Ace | 10 | 0.84 % | | 18 | [Biopython] Entrez.read return value is type | 10 | 0.84 % | | 19 | [Biopython] SeqIO for fasta conversion of Il | 10 | 0.84 % | | 20 | [Biopython] Parsing problem | 9 | 0.76 % | | 21 | [Biopython] Fasta.index_file: functionality | 9 | 0.76 % | | 22 | [Biopython] Adaptor trimmer and dimers | 9 | 0.76 % | | 23 | [BioPython] Is query_length really the lengt | 8 | 0.67 % | | 24 | [BioPython] Reading Roche 454 binary SFF fil | 8 | 0.67 % | | 25 | [Biopython] Writing into a PDB file using PD | 8 | 0.67 % | +----+----------------------------------------------+------+---------+ | | other | 851 | 71.57 % | +----+----------------------------------------------+------+---------+ ***** Most used email clients: +----+----Mailer------------------------------------+--Msg-+-Percent-+ | 1 | (unknown) | 
1189 |100.00 % | +----+----------------------------------------------+------+---------+ | | other | 0 | 0.00 % | +----+----------------------------------------------+------+---------+ ***** Table of maximal quoting: +----+-----Author------------------------------------------+-Percent-+ | 1 | golubchi at stats.ox.ac.uk (Tanya Golubchik) | 94.68 % | | 2 | fredgca at hotmail.com (Frederico Arnoldi) | 92.56 % | | 3 | srikrishnamohan at gmail.com (km) | 92.28 % | | 4 | cmckay at u.washington.edu (Cedar Mckay) | 88.19 % | | 5 | harekrishna at gmail.com (Austin Davis-Richar | 87.51 % | | 6 | biopython.chen at gmail.com (chen Ku) | 86.14 % | | 7 | wgheath at gmail.com (William Heath) | 85.12 % | | 8 | jkhilmer at gmail.com (Jonathan Hilmer) | 82.50 % | | 9 | andrea at biodec.com (Andrea) | 80.82 % | | 10 | nuin at genedrift.org (Paulo Nuin) | 79.90 % | | 11 | lueck at ipk-gatersleben.de (lueck at ipk-gat | 78.81 % | | 12 | ibdeno at gmail.com (Miguel Ortiz Lombardia) | 78.57 % | | 13 | oda at georgetown.edu (Ogan ABAAN) | 75.33 % | | 14 | sean.maceach at gmail.com (Sean MacEachern) | 75.09 % | | 15 | pengyu.ut at gmail.com (Peng Yu) | 73.17 % | | 16 | fungazid at yahoo.com (Fungazid) | 71.57 % | | 17 | yvan.strahm at bccs.uib.no (Yvan Strahm) | 70.23 % | | 18 | cjfields at illinois.edu (Chris Fields) | 70.18 % | | 19 | sdavis2 at mail.nih.gov (Sean Davis) | 68.54 % | | 20 | thomas.hamelryck at gmail.com (Thomas Hamelry | 68.31 % | | 21 | mjldehoon at yahoo.com (Michiel de Hoon) | 66.49 % | | 22 | mavata at gmail.com (Manu Tamminen) | 66.42 % | | 23 | eric.talevich at gmail.com (Eric Talevich) | 65.92 % | | 24 | bsouthey at gmail.com (Bruce Southey) | 64.70 % | | 25 | biopythonlist at gmail.com (dr goettel) | 64.53 % | +----+-----------------------------------------------------+---------+ | | average | 44.57 % | +----+-----------------------------------------------------+---------+ ***** Graph showing number of messages written during hours of day: 100% 
= 146 msgs [bar chart not reproduced: one bar per hour of day, 0 to 23]
***** Graph showing number of messages written during days of month: 100% = 74 msgs [bar chart truncated in the archive: one bar per day of month, 1 to 31]
***** Graph showing number of messages written during days of week: 100% = 262 msgs [bar chart not reproduced: Thursday is the 100% bar, Wednesday reaches 90%, Friday 80%, Monday and Tuesday 70%, Saturday and Sunday 10%]
***** Maximal quoting: Author : andrea at biodec.com (Andrea) Subject : [Biopython] Problems parsing with PSIBlastParser Date : Thu, 15 Oct 2009 17:39:48 +0200 Quote ratio: 98.63% / 15890 bytes
***** Longest message: Author : chapmanb at 50mail.com (Brad Chapman) Subject : [Biopython] Skipping over blank/erroneous Entrez.esummary() Date : Wed, 7 Oct 2009 16:29:11 -0400 Size : 18503 bytes
***** Most successful subject: Subject : [Biopython] suggestion: moving to the discussion list to Google No.
of msgs: 33 Total size : 43527 bytes ***** Final summary: Total number of messages: 1189 Total number of different authors: 159 Total number of different subjects: 332 Total size of messages (w/o headers): 1992170 bytes Average size of a message: 1675 bytes ***** Generated by MailListStat v1.3, (C) 2001-2003 ***** See http://freshmeat.net/projects/mls for details... -- Istvan Albert http://www.personal.psu.edu/iua1 From mailinglist.honeypot at gmail.com Thu Dec 17 14:08:55 2009 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Thu, 17 Dec 2009 14:08:55 -0500 Subject: [Biopython] some eye opening stats In-Reply-To: References: Message-ID: <6E3B0A7F-476A-4417-A580-0FBCF4512763@gmail.com> Hi Istvan, On Dec 17, 2009, at 1:47 PM, Istvan Albert wrote: > Hello Everyone, > > So I ran some statistics on this group (see below) that includes the > entire past year. Make you own decisions based on it. > > Here is one of my observation: I find it saddening that I made the > list at number 18! That's some niche list where one person posting > ten messages in a whole year gets to be at number 18. In fact I only > need three more posts to make myself top ten poster! Would you still > claim this to be a good way to establish, grow and interact with a > community? > > I said this many times before, and I'll try for this to be the last > time I bring this up: > > I believe biopython is a niche software tool because *YOU* are > limiting its reach *YOURSELVES* by making inappropriate decisions as > far as accessibility and community goes. It will stay so as long as > you don't recognize and act on this. I haven't said much so far because (1) I'm not really actively using biopython atm, and (2) I'm largely indifferent about the choice of mailing list vs. web interface, but let's be serious here ... how can you be so confident in drawing any causality between your stats and the fact that biopython is using a mailing list? 
You're arguing that since you are at # 18 w/ only 10 posts, it must be due to discussion about this project is confined to a mailing-list instead of a more "open" and "accessible" web group and the community needs to "act now" or ignore this at its peril. Try doing the same experiment with the bioconductor mailing list, or (depending on how bold you're feeling) the R-user mailing list. Discussion on both groups is via mailing-list only (or through gmane --- same can be said with this list) and come back with that report. Now, go try your same experiment on the networkx or igraph user group. Both are hosted on google groups. With 10 posts, you'll likely be somewhere in the top 10 posters for the year. Oh, even better: igraph just set up the mirror-do-hicky-whatever so you can access their mailing list via GG sometime in July: http://groups.google.com/group/network-analysis-with-igraph/browse_thread/thread/77305d9b6bc6d35/c6a694e287936049?lnk=gst&q=google+groups#c6a694e287936049 Perhaps you'd like to see how traffic has changed on that list before and after that fact. I'm going to guess it wasn't by all that much, but that would at least be a better experiment you can use to base your hunches on. 
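For anyone who wants to repeat the comparison on another list, here is a minimal sketch of a per-author tally using only the Python standard library. It is not how MailListStat produced the tables in this thread. It assumes a raw Unix mbox file with unobfuscated From: headers (the "at" form shown in this web archive would need converting first), and the function names posts_per_author and top_posters are made up for illustration.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def posts_per_author(mbox_path):
    """Count messages per sender address in a Unix mbox file."""
    counts = Counter()
    # create=False: fail loudly instead of silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # parseaddr splits "Name <addr>" into (name, addr).
            _name, address = parseaddr(str(message.get("From", "")))
            counts[address.lower() or "(unknown)"] += 1
    finally:
        box.close()
    return counts


def top_posters(counts, n=25):
    """Return (address, messages, percent of all messages) for the n busiest senders."""
    total = sum(counts.values())
    if total == 0:
        return []
    return [(addr, num, 100.0 * num / total) for addr, num in counts.most_common(n)]
```

Running top_posters(posts_per_author("list-before.mbox")) and the same call on a later archive (both file names hypothetical) would give two tables to compare, which is closer to the before/after experiment suggested here than a single year's ranking.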
-steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From cjfields at illinois.edu Thu Dec 17 14:46:08 2009 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 17 Dec 2009 13:46:08 -0600 Subject: [Biopython] some eye opening stats In-Reply-To: <6E3B0A7F-476A-4417-A580-0FBCF4512763@gmail.com> References: <6E3B0A7F-476A-4417-A580-0FBCF4512763@gmail.com> Message-ID: <1ED2ED4E-F87E-4229-8D6A-8EF56C837D73@illinois.edu> On Dec 17, 2009, at 1:08 PM, Steve Lianoglou wrote: > Hi Istvan, > > On Dec 17, 2009, at 1:47 PM, Istvan Albert wrote: > >> Hello Everyone, >> >> So I ran some statistics on this group (see below) that includes the >> entire past year. Make you own decisions based on it. >> >> Here is one of my observation: I find it saddening that I made the >> list at number 18! That's some niche list where one person posting >> ten messages in a whole year gets to be at number 18. In fact I only >> need three more posts to make myself top ten poster! Would you still >> claim this to be a good way to establish, grow and interact with a >> community? >> >> I said this many times before, and I'll try for this to be the last >> time I bring this up: >> >> I believe biopython is a niche software tool because *YOU* are >> limiting its reach *YOURSELVES* by making inappropriate decisions as >> far as accessibility and community goes. It will stay so as long as >> you don't recognize and act on this. > > I haven't said much so far because (1) I'm not really actively using biopython atm, and (2) I'm largely indifferent about the choice of mailing list vs. web interface, but let's be serious here ... how can you be so confident in drawing any causality between your stats and the fact that biopython is using a mailing list? 
> > You're arguing that since you are at # 18 w/ only 10 posts, it must be due to discussion about this project is confined to a mailing-list instead of a more "open" and "accessible" web group and the community needs to "act now" or ignore this at its peril. > > Try doing the same experiment with the bioconductor mailing list, or (depending on how bold you're feeling) the R-user mailing list. Discussion on both groups is via mailing-list only (or through gmane --- same can be said with this list) and come back with that report. > > Now, go try your same experiment on the networkx or igraph user group. Both are hosted on google groups. With 10 posts, you'll likely be somewhere in the top 10 posters for the year. > > Oh, even better: igraph just set up the mirror-do-hicky-whatever so you can access their mailing list via GG sometime in July: > > http://groups.google.com/group/network-analysis-with-igraph/browse_thread/thread/77305d9b6bc6d35/c6a694e287936049?lnk=gst&q=google+groups#c6a694e287936049 > > Perhaps you'd like to see how traffic has changed on that list before and after that fact. I'm going to guess it wasn't by all that much, but that would at least be a better experiment you can use to base your hunches on. > > -steve > > -- > Steve Lianoglou > Graduate Student: Computational Systems Biology > | Memorial Sloan-Kettering Cancer Center > | Weill Medical College of Cornell University > Contact Info: http://cbio.mskcc.org/~lianos/contact This also doesn't factor participation via other means, such as other mail lists, IRC, etc. As an example, the Perl Moose mail list is fairly low traffic, with a few posts a week, but the IRC channel is much more active. Conversely, we in BioPerl tend to use the mail list over #bioperl (though I do use both if time permits). I think way too much time has been pushed into this topic, considering we've reached a pretty viable option, namely mirroring the list to Google Groups. That seems satisfactory to everyone. 
I fail to see the reason to press the issue (and everyone's ire) more? chris From jtomkins at ICR.org Thu Dec 17 14:43:19 2009 From: jtomkins at ICR.org (Jeff Tomkins) Date: Thu, 17 Dec 2009 13:43:19 -0600 Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> Message-ID: I installed biopython 1.52 as directed for OS X leopard. Everything imports using the python prompt in the terminal, idle, ipython, wing ide, etc. But when I run a standard python script (#!/usr/bin/python) in the shell it cannot locate Bio. What setup feature have I missed? Thanks, Jeff From villahozbale at wisc.edu Thu Dec 17 15:23:30 2009 From: villahozbale at wisc.edu (ANGEL VILLAHOZ-BALETA) Date: Thu, 17 Dec 2009 14:23:30 -0600 Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: <7065_1261079798_ZZg0M1E7a5Xoa.00_C5096CF9-B649-4D98-BAF7-DFB9A0AC74D4@icr.org> References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> <7065_1261079798_ZZg0M1E7a5Xoa.00_C5096CF9-B649-4D98-BAF7-DFB9A0AC74D4@icr.org> Message-ID: <6f80a33d6a3f0.4b2a3ee2@wiscmail.wisc.edu> Jeff, Have you added the path of the libraries of Biopython to the shell variable called as PYTHONPATH? Angel Villahoz-Baleta Bioinformatics Programmer University of Wisconsin-Madison ----- Original Message ----- From: Jeff Tomkins Date: Thursday, December 17, 2009 1:56 pm Subject: [Biopython] Bio won't import in *.py scripts To: "Biopython at lists.open-bio.org" > I installed biopython 1.52 as directed for OS X leopard. Everything > imports using the python prompt in the terminal, idle, ipython, wing > ide, etc. But when I run a standard python script (#!/usr/bin/python) > in the shell it cannot locate Bio. 
What setup feature have I missed? > > Thanks, Jeff > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From daniel at dim.fm.usp.br Thu Dec 17 14:49:28 2009 From: daniel at dim.fm.usp.br (Daniel Silvestre) Date: Thu, 17 Dec 2009 17:49:28 -0200 Subject: [Biopython] Why so few recipes in the cookbook? Message-ID: <4B2A8B48.50302@dim.fm.usp.br> Greeting everybody, Is there a reason to the existence of so few recipes in Biopython cookbook? Is there a task force to improve the documentation and related stuff? Usually I see perfect reusable and instructional recipes in blogs of biopython users. But, they simply don't get to the cookbook. Att. Daniel -- +---------------------------------------+ Daniel de A. M. M. Silvestre LIM01 - Laboratório de Informática Médica - HCFMUSP Sala 1349 - Depto. de Patologia Faculdade de Medicina Universidade de São Paulo Av. Dr. Arnaldo, 455 | e-mail: daniel at dim.fm.usp.br Cerqueira César | Tel: +55-11-3061-7381 01246-903 - São Paulo - SP | Cel: +55-11-8042-9369 BRASIL | Skype: jarretinha --------------------------------------------------------------------- Esta mensagem pode conter informacao confidencial. Se voce nao for o destinatario ou a pessoa autorizada a receber esta mensagem, nao podera usar, copiar ou divulgar as informacoes nela contidas ou tomar qualquer acao baseada nessas informacoes. Se voce recebeu esta mensagem por engano, favor avisar imediatamente o remetente, respondendo o e-mail e, em seguida, apague-o. Agradecemos sua cooperacao. This message may contain confidential information. If you are not the addressee or authorized person to receive it for the addressee, you must not use, copy, disclose or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by replying this e-mail message and delete it.
Thanks in advance for your cooperation. ---------------------------------------------------------------------- DIM Faculdade de Medicina USP ---------------------------------------------------------------------- -------------- next part -------------- A non-text attachment was scrubbed... Name: daniel.vcf Type: text/x-vcard Size: 375 bytes Desc: not available URL: From biopython at maubp.freeserve.co.uk Thu Dec 17 16:16:42 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 21:16:42 +0000 Subject: [Biopython] Why so few recipes in the cookbook? In-Reply-To: <4B2A8B48.50302@dim.fm.usp.br> References: <4B2A8B48.50302@dim.fm.usp.br> Message-ID: <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> 2009/12/17 Daniel Silvestre : > Greeting everybody, > > Is there a reason to the existence of so few recipes in Biopython > cookbook? Is there a task force to improve the documentation > and related stuff? The cookbook wiki is still quite new (6 months or so), but the idea was to encourage user participation. What would you like to write about ;) http://news.open-bio.org/news/2009/04/biopython-cookbook-wiki/ > Usually I see perfect reusable and instructional recipes in blogs of > biopython users. But, they simply don't get to the cookbook. Any specific examples? We can ask blog authors to put some of their finished recipes on the wiki (with a link back to the original). Peter P.S. Your message was delayed for moderation - probably the attachment?
From p.j.a.cock at googlemail.com Thu Dec 17 17:59:07 2009 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 17 Dec 2009 22:59:07 +0000 Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> Message-ID: <320fb6e00912171459m14462f0eq9d95f0dcdc039e8e@mail.gmail.com> Hi Jeff, On Thu, Dec 17, 2009 at 7:43 PM, Jeff Tomkins wrote: > I installed biopython 1.52 as directed for OS X leopard. > Everything imports using the python prompt in the terminal, > idle, ipython, wing ide, etc. Good :) > But when I run a standard python script (#!/usr/bin/python) > in the shell it cannot locate Bio. What setup feature have I missed? That does seem odd. You didn't call your script Bio.py did you? Could you show us the error message? Peter From mjldehoon at yahoo.com Fri Dec 18 03:35:40 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 18 Dec 2009 00:35:40 -0800 (PST) Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: <320fb6e00912171459m14462f0eq9d95f0dcdc039e8e@mail.gmail.com> Message-ID: <274100.45967.qm@web62402.mail.re1.yahoo.com> If you start python in the terminal, does it start /usr/bin/python or a different python? --- On Thu, 12/17/09, Peter Cock wrote: > From: Peter Cock > Subject: Re: [Biopython] Bio won't import in *.py scripts > To: "Jeff Tomkins" > Cc: "Biopython at lists.open-bio.org" > Date: Thursday, December 17, 2009, 5:59 PM > Hi Jeff, > > On Thu, Dec 17, 2009 at 7:43 PM, Jeff Tomkins > wrote: > > I installed biopython 1.52 as directed for OS X > leopard. > > Everything imports using the python prompt in the > terminal, > > idle, ipython, wing ide, etc. > > Good :) > > > But when I run a standard python script > (#!/usr/bin/python) > > in the shell it cannot locate Bio. What setup > feature have I missed? > > That does seem odd.
You didn't call your script Bio.py did
> you?
> Could you show us the error message?
>
> Peter
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>

From lpritc at scri.ac.uk Fri Dec 18 03:49:51 2009
From: lpritc at scri.ac.uk (Leighton Pritchard)
Date: Fri, 18 Dec 2009 08:49:51 +0000
Subject: [Biopython] Bio won't import in *.py scripts
In-Reply-To: <320fb6e00912171459m14462f0eq9d95f0dcdc039e8e@mail.gmail.com>
Message-ID:

Hi,

On 17/12/2009 22:59, "Peter Cock" wrote:

> Hi Jeff,
>
> On Thu, Dec 17, 2009 at 7:43 PM, Jeff Tomkins wrote:
>> I installed biopython 1.52 as directed for OS X leopard.
>> Everything imports using the python prompt in the terminal,
>> idle, ipython, wing ide, etc.
>
> Good :)
>
>> But when I run a standard python script (#!/usr/bin/python)
>> in the shell it cannot locate Bio. What setup feature have I missed?

It could be a $PATH issue. On my Mac, /usr/bin/python is where Apple's Python lives. I leave that installation alone, and don't replace it to avoid problems with the OS's potential use of Python (I had horrors with it back at 10.2). My 'working' version of Python is installed as /usr/local/bin/python.

lpmacpro:scripts lpritc$ /usr/bin/python
Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
lpmacpro:scripts lpritc$ which python
/usr/local/bin/python
lpmacpro:scripts lpritc$ python
Python 2.6 (trunk:66714:66715M, Oct 1 2008, 18:36:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
lpmacpro:scripts lpritc$ /usr/local/bin/python
Python 2.6 (trunk:66714:66715M, Oct 1 2008, 18:36:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

You could check to see if your setup is similar.

What this means is that, as I have /usr/local/bin ahead of /usr/bin in my $PATH (don't shout at me, everyone!), command-line invocation and module installations with 'python setup.py' use the 'working' version, so install modules under that version of Python only.

This would mean that if I had

#!/usr/bin/python

at the head of my script, it would use Apple's Python, and not see modules installed under my 'working' Python. This would give the same error you seem to describe:

lpmacpro:scripts lpritc$ python
Python 2.6 (trunk:66714:66715M, Oct 1 2008, 18:36:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import Bio
>>>
lpmacpro:scripts lpritc$ /usr/bin/python
Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import Bio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named Bio
>>>

If this is the issue, then one way to get around the problem is to use

#!/usr/bin/env python

at the top of your script so that it uses the Python you would get from the command-line. This is better in some ways, as you don't have to make guesses about where the Python executable is installed when you move the script to another machine, though it probably won't matter on Windows, and there may be security issues that arise from this shortcut under some circumstances. If you're at all worried about those, just try 'which python' at the command-line, and substitute that location for /usr/bin/python.

Cheers,

L.
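The check described in this message can also be done from inside the script itself. What follows is a minimal sketch, not part of Biopython: it assumes a current Python 3 interpreter (the Python 2.5 and 2.6 sessions shown in this thread predate importlib.util.find_spec), and the names module_visible and describe_interpreter are made up for illustration.

```python
#!/usr/bin/env python
"""Report which interpreter is running this script and whether it can find a module."""
import importlib.util
import sys


def module_visible(name):
    """Return True if this interpreter can locate the top-level module `name`."""
    # find_spec returns None when a top-level module is not on sys.path.
    return importlib.util.find_spec(name) is not None


def describe_interpreter(module_name="Bio"):
    """Collect the interpreter path, its version, and whether `module_name` is importable."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "module": module_name,
        "visible": module_visible(module_name),
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("interpreter:", info["executable"])
    print("version:    ", info["version"])
    print("can import %s: %s" % (info["module"], info["visible"]))
```

Saving this once with each shebang line (#!/usr/bin/python and #!/usr/bin/env python) and running both copies from the shell should show whether the two lines resolve to different interpreters, and which of them has Biopython installed.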
-- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405 From lpritc at scri.ac.uk Fri Dec 18 05:16:42 2009 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Fri, 18 Dec 2009 10:16:42 +0000 Subject: [Biopython] some eye opening stats In-Reply-To: Message-ID: On 17/12/2009 18:47, "Istvan Albert" wrote: > Statistics from 1.12.2008 to 17.12.2009 > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > ***** People who have written most messages: > +----+-----Author-----------------------------------+--Msg-+-Percent-+ [...]
> | 9 | lpritc at scri.ac.uk (Leighton Pritchard) | 14 | 1.18 % |
>
> ***** Best authors, by total size of their messages (w/o quoting):
> +----+-----Author-------------------------------------------+-KBytes-+
[...]
> | 3 | lpritc at scri.ac.uk (Leighton Pritchard) | 39.1 |

Hmm... Peter has suggested privately that I do have a somewhat prolix and flowery writing style - I guess he's right... ;)

L.

-- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405
______________________________________________________

From daniel at dim.fm.usp.br Fri Dec 18 07:55:15 2009
From: daniel at dim.fm.usp.br (Daniel Silvestre)
Date: Fri, 18 Dec 2009 10:55:15 -0200
Subject: [Biopython] [Fwd: Re: Why so few recipes in the cookbook?]
Message-ID: <4B2B7BB3.6090505@dim.fm.usp.br>

--
+---------------------------------------+
Daniel de A. M. M. Silvestre
LIM01 - Laboratório de Informática Médica - HCFMUSP
Sala 1349 - Depto. de Patologia
Faculdade de Medicina
Universidade de São Paulo
Av. Dr. Arnaldo, 455 | e-mail: daniel at dim.fm.usp.br
Cerqueira César | Tel: +55-11-3061-7381
01246-903 - São Paulo - SP | Cel: +55-11-8042-9369
BRASIL | Skype: jarretinha
---------------------------------------------------------------------
This message may contain confidential information. If you are not the addressee or authorized person to receive it for the addressee, you must not use, copy, disclose or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by replying this e-mail message and delete it. Thanks in advance for your cooperation.
----------------------------------------------------------------------
DIM Faculdade de Medicina USP
----------------------------------------------------------------------
-------------- next part --------------
An embedded message was scrubbed...
From: unknown sender
Subject: no subject
Date: no date
Size: 6202
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: daniel.vcf Type: text/x-vcard Size: 375 bytes Desc: not available URL: From biopython at maubp.freeserve.co.uk Fri Dec 18 07:57:30 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 18 Dec 2009 12:57:30 +0000 Subject: [Biopython] Why so few recipes in the cookbook? In-Reply-To: <4B2B6DE2.3080500@dim.fm.usp.br> References: <4B2A8B48.50302@dim.fm.usp.br> <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> <4B2B6DE2.3080500@dim.fm.usp.br> Message-ID: <320fb6e00912180457x31b3c48bl680d48d6b95fdab0@mail.gmail.com> 2009/12/18 Daniel Silvestre : > Greetings again, > > There are some blogs like Programming for Scientists and Yokofakun with > some usable code and tips. > > I do want to contribute, but there are no clear objectives stated to the > cookbook writing process. What's the gist? Just some code snippets? > Complete examples? I would personally prefer concrete examples. The name "cookbook" suggests a collection of complete recipes (rather than snippets). > My teaching experience says that complete (and real) examples are the > most wanted. For instance, even in the bioperl community only code > snippets are available. So, the first question students ask me after a > brief looking at the tutorials is smth like "Well, how do I use this in > a directory tree?". > > What about a bioinformatics recipes in (bio)python? Yes please :) Are there any examples in the main "Biopython Tutorial and Cookbook" which would in your opinion be better in the wiki? > Att, > Daniel > > P.S.:Is there a problem with my vcard? I can use a different sig. Could you try no vcard attachment at all? Peter From biopython at maubp.freeserve.co.uk Fri Dec 18 10:00:13 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 18 Dec 2009 15:00:13 +0000 Subject: [Biopython] Why so few recipes in the cookbook? 
In-Reply-To: <4B2B8CC3.3090307@dim.fm.usp.br> References: <4B2A8B48.50302@dim.fm.usp.br> <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> <4B2B6DE2.3080500@dim.fm.usp.br> <320fb6e00912180457x31b3c48bl680d48d6b95fdab0@mail.gmail.com> <4B2B8CC3.3090307@dim.fm.usp.br> Message-ID: <320fb6e00912180700w49d3be87r53b1a5201c84461b@mail.gmail.com> 2009/12/18 Daniel Silvestre : > Hi people, > > Actually, even the tutorial is a collection of snippets. I do consider > and regard the effort. But, in order to attract biologists like myself > and my colleagues we need something more pragmatic, problem > driven. Most of the tutorial is by its nature "snippets" but the Cookbook chapter examples are more self standing. I suspect you are looking for even more self contained things - complete examples with a motivating rational, sample input data, etc. > The prototipical workflow of a molecular biologist is: > > ?- Select a bunch of interesting genes in Entrez by clicking buttons and > boxes; > > ?- BLAST some sequences and save the results in separated directories, > normally one for each gene; > > ?- Struggle to extract useful statistics from the results, wich usually > end in sorting and selecting the first few results; > > ?- Apply some analytical method (phylogeny reconstruction, mutation > analysis, etc.) over the "filtered" results; > > ?- Restart the cycle until get satisfied or bored; > > By the way, in one of my classes I just taught the students (which can > be grad students and professors) to use the fields of Entrez (molecular > weight, range search, organism name, etc.) and they felt really powerful > after that. For instance, they used to retrieve sequence lists of papers > ?by hand !!! I confess I don't know or use the full power of the Entrez website, although that is in part since I can do clever stuff via their API ;) > On the other hand, the ones who dare to use biopython tipically don't > know how to glob things and other administrivia. 
So, without a real > example only biology geeks like me get to the next step. > > There is a list of good recipes to start the cookbook: > > ?- How to retrieve and organize sequences and annotations from online > databases using you own custom command line tool; We touch on some of this already, e.g. search and retrieve examples in the Bio.Entrez chapter of the tutorial. Are you looking for something more in depth? Or using other databases? > ?- How to setup/insert/retrieve a bunch of results into a local > (personal) database (SQL); Done, although not tagged as a cookbook specifically: http://www.biopython.org/wiki/BioSQL (The tutorial also points to this page) > ?- How to annotate retrieved results with your own results; Now here I'd like a little clarification about what you want to do. My guess would be something I have considered working up into a cookbook recipe, based one stuff I have already done: Taking a small genome (viral or prokaryote), doing simple gene predictions (e.g. ORF finding, pick first start codon, or maybe calling a command line tool to do it for us), then taking the predicted peptides and BLASTing them, then making a GenBank file with these predicted features and stick a summary of the BLAST results in their annotation. However, while this is a reasonable first step, there are downsides to encouraging this sort of naive approach to annotation - the example would ideally have "Further Reading" section, see for example Schnoes et al 2009. http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000605 > These are real problems faced by the common biologist. The proposed > snippets in the tutorial and the cookbook is already dealt by a lot of > web tools. It's absolutely necessary to show that biopython can increase > the power and range of a biologist everyday work, and can possibly be > automated. 
> > I have some examples to obtain statistics over genome sequences which > address complete examples (including globbing filelists, retrieving from > online databases, etc.) and can prepare them as a recipe. But, I could > use some help . . . If you start a cookbook entry on the wiki, and some outline code, I'm sure we can as a group contribute ideas and tips (particularly in the code, but maybe in the approach too). Or, if you would rather, discuss some specific ideas here on the mailing list first. Note that some of these topics would be ideal for an OBF project wide set of examples, with reference solutions in Biopython, BioPerl, BioJava, BioRuby, etc. That is however a much much bigger task. Peter From rjalves at igc.gulbenkian.pt Fri Dec 18 12:17:44 2009 From: rjalves at igc.gulbenkian.pt (Renato Alves) Date: Fri, 18 Dec 2009 17:17:44 +0000 Subject: [Biopython] SeqIO.index improvement suggestions Message-ID: <4B2BB938.5030709@igc.gulbenkian.pt> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 [I tried submitting this message to the dev mailing list, but got rejected since I'm not yet authorized to post there, so here it goes] Hi everyone, I'm working on changes to the Bio.SeqIO.index() function to make it more consistent with the .read and .parse i.e. accept a filehandle instead of a filename and also to include a way to cache the index into a file to speed up the process. The reason why we are implementing these two is because we were going to implement our own index solution until we realized this was added to 1.52. However the implementation in 1.52 has a few limitations. One limitation is that we are using a gzipped database for the sake of space and using gzip.open() to create the file-handle that would then be passed to .parse(). The same was not doable with .index(). 
This is already implemented in http://github.com/Unode/biopython/commit/6fc390151452e3ddf26a117269132125a3ffb3fe The second is that we are going to use this feature to quick search the database in a web application. Here we have the limitation that we don't have persistence across web requests, which means that we would need to recalculate the index on every web request. The details of how we plan to implement this are the following: cPickle the internal dictionary of offsets and save it on the database folder with the same name as the database + .index. The consistency check on whether the file has changed will be performed based on name and timestamp. By default .index() will search for this file, check the timestamp and use the cache if they match, otherwise they will be recalculated. The save function will be available like: >>> >>> d = SeqIO.index(...) >>> >>> d.save(filename) where filename is optional and defaults to "%s.index" % _handle.name We already have a solution like this implemented with subclasses of SeqIO._index, it's just a matter of reworking that and merge it into BioPython if you consider a good addition to the code. I would like to hear your comments and suggestions on this. Regards, Renato -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAksruTIACgkQYh11EUYTX9TymgCeL6hu3Uz//itSHx38k9KjfZJg dGUAmwVCgaI9G/19VKiUolrXogelgrPs =M+xw -----END PGP SIGNATURE----- From biopython at maubp.freeserve.co.uk Fri Dec 18 16:39:11 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 18 Dec 2009 21:39:11 +0000 Subject: [Biopython] SeqIO.index improvement suggestions In-Reply-To: <4B2BB938.5030709@igc.gulbenkian.pt> References: <4B2BB938.5030709@igc.gulbenkian.pt> Message-ID: <320fb6e00912181339o1a5c4100w6f1957fd4d78d20d@mail.gmail.com> Hi Renato, I'm cooking dinner while writing this, so it won't be as in depth as usual... 
On Fri, Dec 18, 2009 at 5:17 PM, Renato Alves wrote: > > [I tried submitting this message to the dev mailing list, but got > rejected since I'm not yet authorized to post there, so here it goes] Have you definitely subscribed to the dev list? That should be all that is required to post there, and this discussion would be better suited there. > Hi everyone, > > I'm working on changes to the Bio.SeqIO.index() function to make it more > consistent with the .read and .parse i.e. accept a filehandle instead of > a filename and also to include a way to cache the index into a file to > speed up the process. > > The reason why we are implementing these two is because we were going to > implement our own index solution until we realized this was added to 1.52. > > However the implementation in 1.52 has a few limitations. Yes, this was designed to cover basic use cases in a general way, but with the option in future to do other things - and in particular saving the index to disk was kept in mind. > One limitation is that we are using a gzipped database for the sake of > space and using gzip.open() to create the file-handle that would then be > passed to .parse(). The same was not doable with .index(). > This is already implemented in > http://github.com/Unode/biopython/commit/6fc390151452e3ddf26a117269132125a3ffb3fe That was a deliberate choice in that the index code wants to "own" the handle. If other code has access to the handle, there is a risky of different bits of code moving the handle pointer etc. But, if you are careful it could be done. There are also issues here in combination with saving the index. With a filename, the code can easily reopen the file in the same mode. With a handle, things are more tricky. You have non-file handles to consider - such as the gzip example. There is also the problem of recording the file mode (normal text, universal text, or binary - which we will need for SFF files - code already written). 
If we do change the code to allow handles, it would have to be to allow handles OR filenames to be compatible with Biopython 1.52 and 1.53 (which take just filenames). This could be handled as in Bio.SeqIO.convert(), which also allows both (which was the subject of some discussion!). > The second is that we are going to use this feature to quick search the > database in a web application. Here we have the limitation that we don't > have persistence across web requests, which means that we would need to > recalculate the index on every web request. > > The details of how we plan to implement this are the following: > > cPickle the internal dictionary of offsets and save it on the database > folder with the same name as the database + .index. The consistency > check on whether the file has changed will be performed based on name > and timestamp. By default .index() will search for this file, check the > timestamp and use the cache if they match, otherwise they will be > recalculated. The save function will be available like: > >>>> >>> d = SeqIO.index(...) >>>> >>> d.save(filename) > > where filename is optional and defaults to "%s.index" % _handle.name > > We already have a solution like this implemented with subclasses of > SeqIO._index, it's just a matter of reworking that and merge it into > BioPython if you consider a good addition to the code. > > I would like to hear your comments and suggestions on this. Yes, saving indexes is an obvious addition. I have explored using pickle via shelve, and also SQLite - there are implementations of this on my github respository, plus begun to look into the existing OBF Open Biological Database Access (OBDA) specification for cross project compatibility. Other potential benefits here are reduced memory usage if we don't keep the dictionary of offsets in RAM. 
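For illustration only, here is a rough sketch of the caching idea discussed above, written in plain Python with no Biopython calls. It is not the code on any of the branches mentioned in this thread. The names `build_offsets` and `load_or_build`, the choice to store the file size alongside the name and timestamp, and the FASTA-only parsing are all assumptions made for the example.

```python
import os
import pickle


def build_offsets(path):
    """Map each FASTA record id to the byte offset of its '>' line."""
    offsets = {}
    with open(path, "rb") as handle:
        while True:
            pos = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                parts = line[1:].split(None, 1)
                if parts:  # skip a bare '>' with no identifier
                    offsets[parts[0].decode("ascii")] = pos
    return offsets


def load_or_build(path, cache=None):
    """Return (offsets, from_cache), reusing a pickled index when it is current."""
    if cache is None:
        cache = path + ".index"  # hypothetical default, as in "%s.index"
    stat = os.stat(path)
    # Name and timestamp as proposed; the size is an extra assumed check.
    stamp = (os.path.basename(path), stat.st_mtime, stat.st_size)
    if os.path.exists(cache):
        with open(cache, "rb") as handle:
            saved_stamp, offsets = pickle.load(handle)
        if saved_stamp == stamp:
            return offsets, True
    # Cache missing or stale: recalculate and save for the next request.
    offsets = build_offsets(path)
    with open(cache, "wb") as handle:
        pickle.dump((stamp, offsets), handle, pickle.HIGHEST_PROTOCOL)
    return offsets, False
```

One caveat for the web application case: unpickling a file that other users can write to is unsafe, so the cache would need to live somewhere only the application can modify.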
http://github.com/peterjc/biopython/tree/index-shelve http://github.com/peterjc/biopython/tree/index-sqlite There is a potential complication with index sub-classes which do more specialised indexing (e.g. GenBank files, and for a more extreme case, SFF files). See: http://github.com/peterjc/biopython/tree/sff-seqio Anyway - great to see you are finding the code useful, and have some quite similar ideas for how to extend it further. Peter From biopython at maubp.freeserve.co.uk Fri Dec 18 17:26:43 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 18 Dec 2009 22:26:43 +0000 Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> <320fb6e00912171459m14462f0eq9d95f0dcdc039e8e@mail.gmail.com> Message-ID: <320fb6e00912181426i44b93b2co96a1a171a404dc5f@mail.gmail.com> On Fri, Dec 18, 2009 at 6:25 PM, Jeff Tomkins wrote: > I got some advice from Angel V. and added the following lines to my > profile and the scripts now import Bio - it looks like ?it worked and > fixed the issue. ?Thanks for getting back with me! > -jeff > > PYTHONPATH="/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/biopython-1.52-py2.5-macosx-10.3-fat.egg:${PYTHONPATH}" > export PYTHONPATH > Excellent - and well done Angel for coming up with a working solution. Peter From pedro.al at fenhi.uh.cu Fri Dec 18 17:25:29 2009 From: pedro.al at fenhi.uh.cu (Yasser Almeida =?iso-8859-1?b?SGVybuFuZGV6?=) Date: Fri, 18 Dec 2009 17:25:29 -0500 Subject: [Biopython] Superpose structures Message-ID: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> Hi all!! I want to superpose some structures on a reference one. 
The fixed selection is the backbone atoms of two residues and i want to superpose the rest of structures based in this atoms (for the same residues in the others structures, of course) Reference selection: [[, , , ], [, , , ]] The moving selection is a similar nested list, with the list of the residues backbone atoms to move in the query structures.... How can i superpose these structures based in the backbone atoms of two residues? Please help me... Thanks -- Lic. Yasser Almeida Hern?ndez Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext. 221 ---------------------------------------------------------------- Correo FENHI From p.j.a.cock at googlemail.com Fri Dec 18 17:55:44 2009 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 18 Dec 2009 22:55:44 +0000 Subject: [Biopython] Superpose structures In-Reply-To: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> References: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> Message-ID: <320fb6e00912181455g680a8dbmdfa166bd820f06ed@mail.gmail.com> 2009/12/18 Yasser Almeida Hern?ndez : > Hi all!! > > I want to superpose some structures on a reference one. > The fixed selection is the backbone atoms of two residues and i want to > superpose the rest of structures based in this atoms (for the same residues > in the others structures, of course) > > Reference selection: > [[, , , ], [, , , > ]] > > The moving selection is a similar nested list, with the list of the residues > backbone atoms to move in the query structures.... > > How can i superpose these structures based in the backbone atoms of two > residues? > > Please help me... > Thanks Does this example help? 
http://www.warwick.ac.uk/go/peter_cock/python/protein_superposition/ Peter From mjldehoon at yahoo.com Fri Dec 18 18:54:52 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 18 Dec 2009 15:54:52 -0800 (PST) Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: <1D3CBE1D-A7A9-4337-BE0A-E073C2B9A3CC@ICR.org> Message-ID: <138394.60837.qm@web62408.mail.re1.yahoo.com> Good to hear you found a solution. Just for some background information: The reason /usr/bin/python couldn't find Biopython is that you have Biopython installed with the python in /Library/Frameworks/Python.framework/Versions/2.5/bin/python. These two pythons don't know about each other, so anything you install for one python is not seen by the other python. If you want to use /usr/bin/python in your scripts, another solution would have been to install Biopython for that python using /usr/bin/python setup.py build followed by /usr/bin/python setup.py install. But usually it's better to leave the Apple-installed python in /usr/bin/python alone, and to install modules for /Library/Frameworks/Python.framework/Versions/2.5/bin/python, and use that python. --Michiel. --- On Fri, 12/18/09, Jeff Tomkins wrote: > From: Jeff Tomkins > Subject: Re: [Biopython] Bio won't import in *.py scripts > To: "Michiel de Hoon" > Date: Friday, December 18, 2009, 1:27 PM > I got some advice from Angel V. and > added the following lines to my .profile and the scripts now > import Bio - it looks like? it worked and fixed the > issue.? Thanks for getting back with me! > -jeff > > PYTHONPATH="/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/biopython-1.52-py2.5-macosx-10.3-fat.egg:${PYTHONPATH}" > export PYTHONPATH > > > On Dec 18, 2009, at 2:35 AM, Michiel de Hoon wrote: > > > If you start python in the terminal, does it start > /usr/bin/python or a different python? 
> > > > --- On Thu, 12/17/09, Peter Cock > wrote: > > > >> From: Peter Cock > >> Subject: Re: [Biopython] Bio won't import in *.py > scripts > >> To: "Jeff Tomkins" > >> Cc: "Biopython at lists.open-bio.org" > > >> Date: Thursday, December 17, 2009, 5:59 PM > >> Hi Jeff, > >> > >> On Thu, Dec 17, 2009 at 7:43 PM, Jeff Tomkins > > >> wrote: > >>> I installed biopython 1.52 as directed for OS > X > >> leopard. > >>>? Everything imports using the python > prompt in the > >> terminal, > >>> idle, ipython, wing ide, etc. > >> > >> Good :) > >> > >>> But when I run a standard python script > >> (#!/usr/bin/python) > >>> in the shell it cannot locate Bio.? What > setup > >> feature have I missed? > >> > >> That does seem odd. You didn't call your script > Bio.py did > >> you? > >> Could you show us the error message? > >> > >> Peter > >> > >> _______________________________________________ > >> Biopython mailing list? -? Biopython at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/biopython > >> > > > > > > > > From schafer at rostlab.org Fri Dec 18 18:57:52 2009 From: schafer at rostlab.org (=?ISO-8859-1?Q?Christian_Sch=E4fer?=) Date: Fri, 18 Dec 2009 18:57:52 -0500 Subject: [Biopython] Superpose structures In-Reply-To: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> References: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> Message-ID: <4B2C1700.6080704@rostlab.org> Hey, an alternative would be to use the program ProFit (http://www.bioinf.org.uk/software/profit/index.html), which does a least-square fitting based on aligned residues of one reference and one or more mobile structures. It comes with an extensive yet easy comprehensible set of commands. I'm currently using it myself within my Python code. Chris On 12/18/2009 05:25 PM, Yasser Almeida Hern?ndez wrote: > Hi all!! > > I want to superpose some structures on a reference one. 
> The fixed selection is the backbone atoms of two residues and i want to > superpose the rest of structures based in this atoms (for the same > residues in the others structures, of course) > > Reference selection: > [[, , , ], [, , CA>, ]] > > The moving selection is a similar nested list, with the list of the > residues backbone atoms to move in the query structures.... > > How can i superpose these structures based in the backbone atoms of two > residues? > > Please help me... > Thanks > > > > From bjorn_johansson at bio.uminho.pt Sat Dec 19 06:00:19 2009 From: bjorn_johansson at bio.uminho.pt (=?ISO-8859-1?Q?Bj=F6rn_Johansson?=) Date: Sat, 19 Dec 2009 11:00:19 +0000 Subject: [Biopython] question about RestrictionBatch.lambdasplit Message-ID: Hi, I am trying to get a restriction batch tha is limited to some enzymes with a certain size. I think that the lambdasplit might be used for this. I have not found any examples of the use of the restrictionbatch.lambdasplit rb = RestrictionBatch(first=[],suppliers=['F','R']) rb2= rb.lambdasplit(lambda x: x.size==6) this code does not work. Could someone give me an example on how to use this? I have tried to see docs on the lambda function in python, but I still could not solve this. grateful for any answer! /bjorn From biopython at maubp.freeserve.co.uk Sat Dec 19 06:17:33 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 19 Dec 2009 11:17:33 +0000 Subject: [Biopython] question about RestrictionBatch.lambdasplit In-Reply-To: References: Message-ID: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> 2009/12/19 Bj?rn Johansson : > Hi, > > I am trying to get a restriction batch tha is limited to some enzymes with a > certain size. > I think that the lambdasplit might be used for this. > > I have not found any examples of the use of the restrictionbatch.lambdasplit > > rb = RestrictionBatch(first=[],suppliers=['F','R']) > > rb2= rb.lambdasplit(lambda x: x.size==6) > > this code does not work. 
Could someone give me an example on how to use > this? > I have tried to see docs on the lambda function in python, but I still could > not solve this. > grateful for any answer! > /bjorn Hmm, no mention of lambdasplit in this doc: http://biopython.org/DIST/docs/cookbook/Restriction.html Also no mention in Tests/test_Restriction.py Looking at the code, you need a function (which could be defined with a python lambda but need not be) which will be given as single argument and must return a boolean (or rather, something which will be evaluated as a boolean). You code looks fine: >>> from Bio.Restriction import * >>> rb = RestrictionBatch(first=[],suppliers=['F','R']) >>> len(rb) 228 >>> len([x for x in rb if x.size==6]) 128 >>> rb2= rb.lambdasplit(lambda x: x.size==6) >>> len(rb2) 0 Either we have both misunderstood the point of this function, or there is a bug in lambdasplit. Please file a bug report: http://bugzilla.open-bio.org/enter_bug.cgi?product=Biopython Thanks, Peter From sohm at inaf.cnrs-gif.fr Sat Dec 19 12:53:35 2009 From: sohm at inaf.cnrs-gif.fr (=?ISO-8859-1?Q?Fr=E9d=E9ric_Sohm?=) Date: Sat, 19 Dec 2009 18:53:35 +0100 Subject: [Biopython] question about RestrictionBatch.lambdasplit In-Reply-To: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> References: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> Message-ID: <4B2D131F.1080302@inaf.cnrs-gif.fr> Hi Bj?rn, Peter, The code is working for me. I can't reproduce the bug. 
>>> from Bio.Restriction import * >>> rb = RestrictionBatch(first=[], suppliers=['F','R']) >>> len(rb) 228 >>> len([x for x in rb if x.size == 6]) 128 >>> rb2 = rb.lambdasplit(lambda x : x.size == 6) >>> len(rb2) 128 >>> len([x for x in rb if len(x) == 6]) 128 >>> rb3 = rb.lambdasplit(lambda x : len(x) == 6) >>> len(rb3) 128 >>> rb2 == rb3 True >>> EcoRI in rb True >>> EcoRI in rb2 and EcoRI in rb3 True >>> EcoRI.size == len(EcoRI) == 6 True >>> I am a bit puzzled there, Peter's code should work (and is effectively working on my machine) ... My setup is : Debian Lenny python 2.5.2 or python 2.4.6 (both tested) Biopython : 1.45 What about yours ? Best regards Fred Peter wrote: > 2009/12/19 Bj?rn Johansson : >> Hi, >> >> I am trying to get a restriction batch tha is limited to some enzymes with a >> certain size. >> I think that the lambdasplit might be used for this. >> >> I have not found any examples of the use of the restrictionbatch.lambdasplit >> >> rb = RestrictionBatch(first=[],suppliers=['F','R']) >> >> rb2= rb.lambdasplit(lambda x: x.size==6) >> >> this code does not work. Could someone give me an example on how to use >> this? >> I have tried to see docs on the lambda function in python, but I still could >> not solve this. >> grateful for any answer! >> /bjorn > > Hmm, no mention of lambdasplit in this doc: > http://biopython.org/DIST/docs/cookbook/Restriction.html > > Also no mention in Tests/test_Restriction.py > > Looking at the code, you need a function (which could > be defined with a python lambda but need not be) > which will be given as single argument and must > return a boolean (or rather, something which will be > evaluated as a boolean). 
You code looks fine: > >>>> from Bio.Restriction import * >>>> rb = RestrictionBatch(first=[],suppliers=['F','R']) >>>> len(rb) > 228 >>>> len([x for x in rb if x.size==6]) > 128 >>>> rb2= rb.lambdasplit(lambda x: x.size==6) >>>> len(rb2) > 0 > > Either we have both misunderstood the point of > this function, or there is a bug in lambdasplit. > > Please file a bug report: > http://bugzilla.open-bio.org/enter_bug.cgi?product=Biopython > > Thanks, > > Peter > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > -- Fr?d?ric Sohm GIS AMAGEN CNRS INRA Equipe INRA U1126 "Morphogen?se du syst?me nerveux des Chord?s" UPR 2197 DEPSN, CNRS Institut de Neurobiologie A. Fessard 1 Avenue de la Terrasse 91 198 GIF-SUR -YVETTE FRANCE Phone: 33 1 69 82 34 12 Fax: 33 1 69 82 41 67 email: sohm at inaf.cnrs-gif.fr From bjorn_johansson at bio.uminho.pt Sun Dec 20 01:58:43 2009 From: bjorn_johansson at bio.uminho.pt (=?ISO-8859-1?Q?Bj=F6rn_Johansson?=) Date: Sun, 20 Dec 2009 06:58:43 +0000 Subject: [Biopython] question about RestrictionBatch.lambdasplit In-Reply-To: <4B2D131F.1080302@inaf.cnrs-gif.fr> References: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> <4B2D131F.1080302@inaf.cnrs-gif.fr> Message-ID: Hi, I copied the output from running Frederics example on my machine below: bjorn at bjorn-laptop:~/wikidpad/user_extensions/SeqTools$ python Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) [GCC 4.4.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> from Bio.Restriction import * >>> rb = RestrictionBatch(first=[], suppliers=['F','R']) >>> len(rb) 228 >>> len([x for x in rb if x.size == 6]) 128 >>> rb2 = rb.lambdasplit(lambda x : x.size == 6) >>> len(rb2) 0 >>> len([x for x in rb if len(x) == 6]) 128 >>> len([x for x in rb if len(x) == 6]) 128 >>> rb3 = rb.lambdasplit(lambda x : len(x) == 6) >>> len(rb3) 0 >>> rb2 == rb3 True >>> EcoRI in rb True >>> EcoRI in rb2 and EcoRI in rb3 False >>> EcoRI.size == len(EcoRI) == 6 True >>> The lambdasplit does not seem to be working for me. I use python 2.6.4 on ubuntu karmic How can I print the biooython version? thansk for your help! /bjorn 2009/12/19 Fr?d?ric Sohm > Hi Bj?rn, Peter, > > The code is working for me. > I can't reproduce the bug. > > > > >>> from Bio.Restriction import * > >>> rb = RestrictionBatch(first=[], suppliers=['F','R']) > >>> len(rb) > 228 > >>> len([x for x in rb if x.size == 6]) > 128 > >>> rb2 = rb.lambdasplit(lambda x : x.size == 6) > >>> len(rb2) > 128 > >>> len([x for x in rb if len(x) == 6]) > 128 > >>> rb3 = rb.lambdasplit(lambda x : len(x) == 6) > >>> len(rb3) > 128 > >>> rb2 == rb3 > True > >>> EcoRI in rb > True > >>> EcoRI in rb2 and EcoRI in rb3 > True > >>> EcoRI.size == len(EcoRI) == 6 > True > >>> > > > I am a bit puzzled there, Peter's code should work (and is effectively > working on my machine) ... > > My setup is : > > Debian Lenny > python 2.5.2 or python 2.4.6 (both tested) > Biopython : 1.45 > > What about yours ? > > > Best regards > > Fred > > > > Peter wrote: > >> 2009/12/19 Bj?rn Johansson : >> >>> Hi, >>> >>> I am trying to get a restriction batch tha is limited to some enzymes >>> with a >>> certain size. >>> I think that the lambdasplit might be used for this. >>> >>> I have not found any examples of the use of the >>> restrictionbatch.lambdasplit >>> >>> rb = RestrictionBatch(first=[],suppliers=['F','R']) >>> >>> rb2= rb.lambdasplit(lambda x: x.size==6) >>> >>> this code does not work. 
Could someone give me an example on how to use >>> this? >>> I have tried to see docs on the lambda function in python, but I still >>> could >>> not solve this. >>> grateful for any answer! >>> /bjorn >>> >> >> Hmm, no mention of lambdasplit in this doc: >> http://biopython.org/DIST/docs/cookbook/Restriction.html >> >> Also no mention in Tests/test_Restriction.py >> >> Looking at the code, you need a function (which could >> be defined with a python lambda but need not be) >> which will be given as single argument and must >> return a boolean (or rather, something which will be >> evaluated as a boolean). You code looks fine: >> >> from Bio.Restriction import * >>>>> rb = RestrictionBatch(first=[],suppliers=['F','R']) >>>>> len(rb) >>>>> >>>> 228 >> >>> len([x for x in rb if x.size==6]) >>>>> >>>> 128 >> >>> rb2= rb.lambdasplit(lambda x: x.size==6) >>>>> len(rb2) >>>>> >>>> 0 >> >> Either we have both misunderstood the point of >> this function, or there is a bug in lambdasplit. >> >> Please file a bug report: >> http://bugzilla.open-bio.org/enter_bug.cgi?product=Biopython >> >> Thanks, >> >> Peter >> >> _______________________________________________ >> Biopython mailing list - Biopython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython >> >> > -- > Fr?d?ric Sohm > GIS AMAGEN CNRS INRA > Equipe INRA U1126 "Morphogen?se du syst?me nerveux des Chord?s" > UPR 2197 DEPSN, CNRS > Institut de Neurobiologie A. Fessard > 1 Avenue de la Terrasse > 91 198 GIF-SUR -YVETTE > FRANCE > Phone: 33 1 69 82 34 12 > Fax: 33 1 69 82 41 67 > email: sohm at inaf.cnrs-gif.fr > -- ______O_________oO________oO______o_______oO__ Bj?rn Johansson Assistant Professor Departament of Biology University of Minho Campus de Gualtar 4710-057 Braga PORTUGAL http://www.bio.uminho.pt http://sites.google.com/site/bjornhome Work (direct) +351-253 601517 Private mob. 
+351-967 147 704 Dept of Biology (secretariate) +351-253 60 4310 Dept of Biology (fax) +351-253 678980 From biopython at maubp.freeserve.co.uk Sun Dec 20 13:01:56 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 20 Dec 2009 18:01:56 +0000 Subject: [Biopython] question about RestrictionBatch.lambdasplit In-Reply-To: References: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> <4B2D131F.1080302@inaf.cnrs-gif.fr> Message-ID: <320fb6e00912201001j24acc005uf7ad6d88f25bbf3d@mail.gmail.com> 2009/12/20 Bj?rn Johansson : > Hi, > I copied the output from running Frederics example on my machine below: > > bjorn at bjorn-laptop:~/wikidpad/user_extensions/SeqTools$ python > Python 2.6.4 (r264:75706, Dec ?7 2009, 18:45:15) > [GCC 4.4.1] on linux2 > ... > The lambdasplit does not seem to be working for me. I use python > 2.6.4 on ubuntu karmic I was using the same version of Python (also on Linux Ubuntu Karmic). This looks like it could be a Python 2.6 specific problem. > How can I print the biooython version? > thansk for your help! > /bjorn Its in the FAQ in the Tutorial, at the Python prompt just do: import Bio print Bio.__version__ Peter From biopython at maubp.freeserve.co.uk Mon Dec 21 07:03:34 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 21 Dec 2009 12:03:34 +0000 Subject: [Biopython] Fwd: Why so few recipes in the cookbook? 
In-Reply-To: <320fb6e00912181442r60348fcwf15776a0451bc6a1@mail.gmail.com> References: <4B2A8B48.50302@dim.fm.usp.br> <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> <4B2B6DE2.3080500@dim.fm.usp.br> <320fb6e00912180457x31b3c48bl680d48d6b95fdab0@mail.gmail.com> <4B2B8CC3.3090307@dim.fm.usp.br> <320fb6e00912180700w49d3be87r53b1a5201c84461b@mail.gmail.com> <4B2BAE35.2070404@dim.fm.usp.br> <320fb6e00912181442r60348fcwf15776a0451bc6a1@mail.gmail.com> Message-ID: <320fb6e00912210403q5dd4c0d7xf06c9a850ecde9db@mail.gmail.com> I just checked with Daniel to make sure he was happy for me to forward this back to the mailing list. Peter ---------- Forwarded message ---------- From: Peter Date: Fri, Dec 18, 2009 at 10:42 PM Subject: Re: [Biopython] Why so few recipes in the cookbook? To: Daniel Silvestre Hi Daniel, Do you mind if I send this to the list too? 2009/12/18 Daniel Silvestre : >> >> I confess I don't know or use the full power of the Entrez website, >> although that is in part since I can do clever stuff via their API ;) > > This is exactly what we want to do when get to the Entrez interface. > But, the information "How to submit complex query" is hidden (and > scattered) under many layers of web pages. > > The ability to do such things in a more customized way is the dream of > all life science guy. This is partly down to the NCBI's Entrez documentation - a lots of the examples in the Biopython tutorial took some serious exploration to get working, including trawling the net for other Entrez users (in other languages). I hope that we've managed to make things clearer. > While this tutorial is enough to CS-oriented guys, it's a really big > step to grasp such information for people from other communities. > That's why I'm always a little confused about the idea behind bio > projects. If the idea is programming of scientists, the approach is > way too CS. 
You are probably right in that the Bio* projects do cater more to a programming scientist than a wet biologist - not that there aren't people that can and do both. You have to be able to program to take full advantage of any of the Bio* kits. However, there are a number of front ends, webpages, etc which use them internally. >> Now here I'd like a little clarification about what you want to do. >> >> My guess would be something I have considered working up >> into a cookbook recipe, based one stuff I have already done: >> Taking a small genome (viral or prokaryote), doing simple >> gene predictions (e.g. ORF finding, pick first start codon, >> or maybe calling a command line tool to do it for us), then >> taking the predicted peptides and BLASTing them, then >> making a GenBank file with these predicted features and >> stick a summary of the BLAST results in their annotation. >> >> However, while this is a reasonable first step, there are >> downsides to encouraging this sort of naive approach to >> annotation - the example would ideally have "Further >> Reading" section, see for example Schnoes et al 2009. >> http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000605 >> > > That's exactly my point. Without a complete recipe with a > specific motivation and a clear stated problem behinf it, > people will continue with this kind of behavior. We agree here. > And I don't see why one need to start simple. The first time I've > entered in a molbio lab was to carry on a old fashioned gene cloning > procedure. This is a simple procedure. How do this compares to the > simple examples we see on bio project tutorials? I see the tutorial as a teaching aid, and many of the cookbook examples also. For someone learning to program, an overly complicated example is intimidating. This is not to say we can't have some complex cookbook entries too. i.e. I thank for learning to program, you need to start simple, and build up the complexity gradually. 
>>> These are real problems faced by the common biologist. The proposed >>> snippets in the tutorial and the cookbook is already dealt by a lot of >>> web tools. It's absolutely necessary to show that biopython can increase >>> the power and range of a biologist everyday work, and can possibly be >>> automated. >>> >>> I have some examples to obtain statistics over genome sequences which >>> address complete examples (including globbing filelists, retrieving from >>> online databases, etc.) and can prepare them as a recipe. But, I could >>> use some help . . . >> >> If you start a cookbook entry on the wiki, and some outline >> code, I'm sure we can as a group contribute ideas and tips >> (particularly in the code, but maybe in the approach too). Or, >> if you would rather, discuss some specific ideas here on the >> mailing list first. >> >> Note that some of these topics would be ideal for an OBF >> project wide set of examples, with reference solutions in >> Biopython, BioPerl, BioJava, BioRuby, etc. That is however >> a much much bigger task. > > I think that there is no need to worry about big things right now. By > the very nature of programming, people will mirror ideas from one > another. I've tried a similar approach in the bioperl community. But, > for the pragmatic life scientist, perl is over expressive while python > has a much higher first encounter acceptance rate (I'm not sure why, > tough...). > > My idea is not a master blaster cookbook, just to assemble simple ideas > that work for the everyday user, be this guy a CS or a life scientist. > > How do this sound to you? Wonderful :) (And I would agree with you that Python is probably easier to teach to beginners than Perl) Peter From chapmanb at 50mail.com Mon Dec 21 08:11:48 2009 From: chapmanb at 50mail.com (Brad Chapman) Date: Mon, 21 Dec 2009 08:11:48 -0500 Subject: [Biopython] Why so few recipes in the cookbook? 
In-Reply-To: <320fb6e00912210403q5dd4c0d7xf06c9a850ecde9db@mail.gmail.com> References: <4B2A8B48.50302@dim.fm.usp.br> <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> <4B2B6DE2.3080500@dim.fm.usp.br> <320fb6e00912180457x31b3c48bl680d48d6b95fdab0@mail.gmail.com> <4B2B8CC3.3090307@dim.fm.usp.br> <320fb6e00912180700w49d3be87r53b1a5201c84461b@mail.gmail.com> <4B2BAE35.2070404@dim.fm.usp.br> <320fb6e00912181442r60348fcwf15776a0451bc6a1@mail.gmail.com> <320fb6e00912210403q5dd4c0d7xf06c9a850ecde9db@mail.gmail.com> Message-ID: <20091221131148.GB21580@sobchak.mgh.harvard.edu> Peter and Daniel; Really interesting discussion. Documentation is an area that can always use more work to appeal to a wider audience. Daniel: > > While this tutorial is enough to CS-oriented guys, it's a really big > > step to grasp such information for people from other communities. > > That's why I'm always a little confused about the idea behind bio > > projects. If the idea is programming of scientists, the approach is > > way too CS. This stresses why we actively encourage contributions from biologists as well. Many of the contributors to Biopython tend more towards the programming/bioinformatics side, since that experience helps in building up and appreciating a re-usable toolkit. When those same people write documentation, it is going to be naturally biased towards the sort of work they do. I'd definitely encourage you, and anyone else who might be interested, to build up examples that are more intuitive to those coming at the work from a different starting point. This is exactly the idea behind starting up the cookbook on the wiki; it's all freely editable, so dig right in. 
Brad From biopython at maubp.freeserve.co.uk Tue Dec 22 11:22:24 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 22 Dec 2009 16:22:24 +0000 Subject: [Biopython] EMBOSS and Python In-Reply-To: <5ECA525B88314B48870E4AC72E3B9AF2045A70FE@EDUNIVMAIL05.ad.umassmed.edu> References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> <5ECA525B88314B48870E4AC72E3B9AF2045A70FE@EDUNIVMAIL05.ad.umassmed.edu> Message-ID: <320fb6e00912220822m7e1c81c5h113a642f1336b328@mail.gmail.com> Hi David, I cc'd the mailing list again. On Tue, Dec 22, 2009 at 4:02 PM, Lapointe, David wrote: > > Hi Peter, > > I have current version for both EMBOSS (6.1.0) and BioPython (1.53). Do you have the original unpatched EMBOSS 6.1.0, or the latest patched version, currently EMBOSS 6.1.0 patch 3? See: ftp://emboss.open-bio.org/pub/EMBOSS/fixes/patches/README.patch > I looked at the > code for the unit tests (asis) and the problem might be there, as I could run the > test fine by hand. Could you clarify what you meant by "I could run the test fine by hand"? > Shouldn't there be a '-' in front of asequence? > > ? ?def test_water_file(self): > ? ? ? ?"""water with the asis trick, output to a file.""" > ? ? ? ?#Setup, try a mixture of keyword arguments and later additions: > ? ? ? ?cline = WaterCommandline(cmd=exes["water"], > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? gapopen="10", gapextend="0.5") > ? ? ? ?#Try using both human readable names, and the literal ones: > ? ? ? ?cline.set_parameter("asequence", "asis:ACCCGGGCGCGGT") > ? ? ? ?cline.set_parameter("-bsequence", "asis:ACCCGAGCGCGGT") > > David Good question, but no. The test is confirming the set_parameter method supports both these ways of setting the parameters. This is now a semi- obsolete method - the preferred way would be to use bsequence in the constructor arguments, or the bsequence property. 
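The behaviour described above (a parameter can be set under either its readable name or its literal command-line switch) can be illustrated with a toy class. This is only a sketch: `ToyCommandline` is a hypothetical stand-in written for this note, not Biopython's wrapper code, and the `-name=value` output format is an assumption chosen for readability.

```python
class ToyCommandline:
    """Hypothetical stand-in for a command line wrapper (not Biopython code)."""

    def __init__(self, cmd, **kwargs):
        self.cmd = cmd
        self._params = {}
        # Constructor keywords go through the same code path as set_parameter.
        for name, value in kwargs.items():
            self.set_parameter(name, value)

    def set_parameter(self, name, value):
        # Accept both "asequence" and "-asequence" by dropping leading dashes.
        self._params[name.lstrip("-")] = value

    def __str__(self):
        parts = [self.cmd]
        for name, value in self._params.items():
            parts.append("-%s=%s" % (name, value))
        return " ".join(parts)


cline = ToyCommandline("water", gapopen="10", gapextend="0.5")
cline.set_parameter("asequence", "asis:ACCCGGGCGCGGT")   # readable name
cline.set_parameter("-bsequence", "asis:ACCCGAGCGCGGT")  # literal switch
print(cline)
```

Both spellings end up under the same key, which is the property the unit test quoted above is checking for the real wrapper.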
Interestingly the test output also indicates issues calling dnal (which
has nothing to do with EMBOSS), yet at least one other command line tool
test seems to be running OK (Clustalw).

Peter

From silvio.tschapke at googlemail.com Wed Dec 23 05:57:58 2009
From: silvio.tschapke at googlemail.com (Silvio Tschapke)
Date: Wed, 23 Dec 2009 11:57:58 +0100
Subject: [Biopython] cannot find elink_090910.dtd
Message-ID:

Hi all,

I am using Ubuntu 9.10, Python 2.6 and Biopython 1.53.
While running these two lines of code

pmid = "14630660"
results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid))

I get the error message posted at the bottom.
I searched the proposed websites and the web for "elink_090910.dtd"
without success. I only found the files for elink_020511.dtd and
something for 2010, but nothing related to elink_090910.dtd.
Could you please help me solve this problem?

Cheers and merry Christmas,
Silvio

Traceback (most recent call last):
  File "/home/silvio/programming/python/first steps/biopythonTutorial.py", line 17, in
    results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid))
  File "/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/__init__.py", line 258, in read
    record = handler.read(handle)
  File "/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/Parser.py", line 108, in read
    self.parser.ParseFile(handle)
  File "/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/Parser.py", line 377, in externalEntityRefHandler
    raise RuntimeError(message)
RuntimeError: Unable to load DTD file eLink_090910.dtd.
From biopython at maubp.freeserve.co.uk Wed Dec 23 07:10:17 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 23 Dec 2009 12:10:17 +0000 Subject: [Biopython] cannot find elink_090910.dtd In-Reply-To: References: Message-ID: <320fb6e00912230410hf895e3sa999b0489e0b9e5e@mail.gmail.com> On Wed, Dec 23, 2009 at 10:57 AM, Silvio Tschapke wrote: > Hi all, > > I am using ubuntu 9.10, python 2.6 and biopython 1.53 > But while running these two lines of code > > pmid = "14630660" > results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid) > > I get the following error message posted at the bottom. > So I searched at the proposed websites and in the web for > "elink_090910.dtd" without success. > I only found the files for elink_020511.dtd and something for 2010. But > nothing related to elink_090910.dtd. > Could you please help me how I can solve this problem? > > Cheers and merry christmas, > Silvio Hi Silvio, Yep, I don't have that file either, it looks like we missed it :( The NCBI website don't make it easy to find (as far as I could tell, none of the DTD pages list this file). However, the XML tells us where to try: http://eutils.ncbi.nlm.nih.gov/corehtml/query/DTD/eLink_090910.dtd I've added this to our repository, and it will be in the next release. You'll need to download that DTD file. How did you install Biopython? It looks like the file would need to go inside the egg - so it might be easier to (re)install from source with this extra file in the source code subdirectory Bio/Entrez/DTDs. Or start by grabbing the latest Biopython code from git. Does that make sense? However, even with the DTD file, I'm getting "Error 111 (Connection refused)" for your example. Maybe the NCBI are doing some maintenance work at the moment? 
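The manual fix described above (putting the downloaded DTD into the Bio/Entrez/DTDs subdirectory) might be scripted roughly as follows. This is a sketch under assumptions: `install_dtd` is a hypothetical helper name, and it assumes the installed package (or egg) is a plain directory on disk rather than a zipped egg.

```python
import os
import shutil


def install_dtd(dtd_file, entrez_dir):
    """Copy a downloaded DTD into <entrez_dir>/DTDs and return the new path.

    entrez_dir is the directory holding the Bio.Entrez package, i.e. the
    folder that already contains a DTDs subdirectory.
    """
    target_dir = os.path.join(entrez_dir, "DTDs")
    if not os.path.isdir(target_dir):
        raise FileNotFoundError("no DTDs directory under %r" % entrez_dir)
    return shutil.copy(dtd_file, target_dir)


# Possible usage with Biopython installed (left as a comment so the
# sketch stays importable without it):
#
#   import Bio.Entrez
#   install_dtd("eLink_090910.dtd", os.path.dirname(Bio.Entrez.__file__))
```

Writing into a system-wide site-packages directory normally needs the same privileges that were used for the install.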
Merry Christmas, Peter From konrad.koehler at mac.com Wed Dec 23 06:22:32 2009 From: konrad.koehler at mac.com (Konrad Koehler) Date: Wed, 23 Dec 2009 12:22:32 +0100 Subject: [Biopython] cannot find elink_090910.dtd In-Reply-To: References: Message-ID: <11374559688483741089292458484266867741-Webmail@me.com> Hi Silvio, This appears to be a temporary glitch with the Entrez database and not biopython. For example, the following link should display the abstract, but current does not: http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=retrieve&db=pubmed&list_uids=14630660&dopt=Abstract A lot of other things in Entrez appear to be broken. For example internal links to RefSeq records from EntrezGene do not currently work. For example, the RefSeq link from here: http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gene&Cmd=ShowDetailView&TermToSearch=2645 to here: http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gene&Cmd=ShowDetailView&TermToSearch=2645 doesn't work either. Hopefully the problems in the Entrez database will be sorted out shortly. Cheers, Konrad On Wednesday, December 23, 2009, at 11:57AM, "Silvio Tschapke" wrote: >Hi all, > >I am using ubuntu 9.10, python 2.6 and biopython 1.53 >But while running these two lines of code > >pmid = "14630660" >results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid) > >I get the following error message posted at the bottom. >So I searched at the proposed websites and in the web for >"elink_090910.dtd" without success. >I only found the files for elink_020511.dtd and something for 2010. But >nothing related to elink_090910.dtd. >Could you please help me how I can solve this problem? 
> >Cheers and merry christmas, >Silvio > > >Traceback (most recent call last): > File "/home/silvio/programming/python/first steps/biopythonTutorial.py", >line 17, in > results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid)) > File >"/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/__init__.py", >line 258, in read > record = handler.read(handle) > File >"/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/Parser.py", >line 108, in read > self.parser.ParseFile(handle) > File >"/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/Parser.py", >line 377, in externalEntityRefHandler > raise RuntimeError(message) >RuntimeError: Unable to load DTD file eLink_090910.dtd. >_______________________________________________ >Biopython mailing list - Biopython at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/biopython > > From biopython at maubp.freeserve.co.uk Wed Dec 23 07:28:37 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 23 Dec 2009 12:28:37 +0000 Subject: [Biopython] cannot find elink_090910.dtd In-Reply-To: <11374559688483741089292458484266867741-Webmail@me.com> References: <11374559688483741089292458484266867741-Webmail@me.com> Message-ID: <320fb6e00912230428y3788591at9848206afa2bb8e0@mail.gmail.com> On Wed, Dec 23, 2009 at 11:22 AM, Konrad Koehler wrote: > Hi ?Silvio, > > This appears to be a temporary glitch with the Entrez database and not biopython. > Your URLs seem to be working now :) On Wed, Dec 23, 2009 at 12:10 PM, Peter wrote: > > However, even with the DTD file, I'm getting "Error 111 > (Connection refused)" for your example. Maybe the NCBI are > doing some maintenance work at the moment? This is also working now. 
It looks like there were two issues: a temporary glitch at the NCBI,
and Biopython missing eLink_090910.dtd.

Peter

From biopython at maubp.freeserve.co.uk Wed Dec 23 09:03:04 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Dec 2009 14:03:04 +0000
Subject: [Biopython] cannot find elink_090910.dtd
In-Reply-To:
References: <11374559688483741089292458484266867741-Webmail@me.com> <320fb6e00912230428y3788591at9848206afa2bb8e0@mail.gmail.com>
Message-ID: <320fb6e00912230603o23b93c21v36890dc53617b597@mail.gmail.com>

On Wed, Dec 23, 2009 at 1:29 PM, Silvio Tschapke wrote:
>
> I have copied the DTD file in
> /usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/DTDs/
> now and it seems to work. Great! I have installed Biopython with
>
> python setup.py build
> python setup.py test
> sudo python setup.py install

OK - good :)

> But will I have the same problems when I use eSearch, or eQuery and so
> on? Because all of the DTD will not be up to date. So far I only
> copied this DTD for eLink you posted.

I *hope* that this was the only DTD file we were missing. If
not, please do let us know so we can fix this for other users.

In the short term, you would again need to download and install
any other missing DTD files in the same way. You can generally look
at the start of the XML file to see where the DTD can be found, e.g.

>>> from Bio import Entrez
>>> Entrez.email = "your.name.here at example.com"
>>> print Entrez.elink(dbfrom="pubmed", id="12345678").read(300)
...
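Reading the DTD location off the start of the XML, as suggested above, can also be done programmatically. The sketch below uses only the standard library; `find_dtd` is a hypothetical helper, and the sample header is illustrative (its public identifier is made up, only the URL is the one quoted in this thread).

```python
import re

# Matches <!DOCTYPE root PUBLIC "public-id" "system-url" and captures
# the root element name and the system URL.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>[\w.:-]+)\s+PUBLIC\s+"[^"]*"\s+"(?P<url>[^"]+)"'
)


def find_dtd(xml_head):
    """Return (root element, DTD URL, DTD file name), or None if no DOCTYPE."""
    match = DOCTYPE_RE.search(xml_head)
    if match is None:
        return None
    url = match.group("url")
    return match.group("root"), url, url.rsplit("/", 1)[-1]


# Illustrative header only; the public identifier is a placeholder.
sample = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE eLinkResult PUBLIC "-//EXAMPLE//DTD eLinkResult//EN" '
    '"http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eLink_090910.dtd">\n'
    "<eLinkResult>"
)
print(find_dtd(sample))
```

The file name in the last position is what would have to exist locally for the parser to succeed; the first few hundred bytes of a response are normally enough to find it.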
Thanks, Peter From bjorn_johansson at bio.uminho.pt Wed Dec 23 09:42:37 2009 From: bjorn_johansson at bio.uminho.pt (=?ISO-8859-1?Q?Bj=F6rn_Johansson?=) Date: Wed, 23 Dec 2009 14:42:37 +0000 Subject: [Biopython] question about RestrictionBatch.lambdasplit In-Reply-To: <320fb6e00912201001j24acc005uf7ad6d88f25bbf3d@mail.gmail.com> References: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> <4B2D131F.1080302@inaf.cnrs-gif.fr> <320fb6e00912201001j24acc005uf7ad6d88f25bbf3d@mail.gmail.com> Message-ID: Hi, my Biopython version seems to be 1.52 bjorn at bjorn-laptop:~$ python Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) [GCC 4.4.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import Bio >>> Bio.__version__ '1.52' This works really well to get only five cutters and above (rb is a restriction batch): rb = [x for x in rb if len(x) > 4] which was what I wanted initially. Thanks for all help! Happy hollidays! /bjorn 2009/12/20 Peter > 2009/12/20 Bj?rn Johansson : > > Hi, > > I copied the output from running Frederics example on my machine below: > > > > bjorn at bjorn-laptop:~/wikidpad/user_extensions/SeqTools$ python > > Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) > > [GCC 4.4.1] on linux2 > > ... > > The lambdasplit does not seem to be working for me. I use python > > 2.6.4 on ubuntu karmic > > I was using the same version of Python (also on Linux Ubuntu Karmic). > This looks like it could be a Python 2.6 specific problem. > > > How can I print the biooython version? > > thansk for your help! > > /bjorn > > Its in the FAQ in the Tutorial, at the Python prompt just do: > > import Bio > print Bio.__version__ > > Peter > -- ______O_________oO________oO______o_______oO__ Bj?rn Johansson Assistant Professor Departament of Biology University of Minho Campus de Gualtar 4710-057 Braga PORTUGAL http://www.bio.uminho.pt http://sites.google.com/site/bjornhome Work (direct) +351-253 601517 Private mob. 
+351-967 147 704 Dept of Biology (secretariate) +351-253 60 4310 Dept of Biology (fax) +351-253 678980 From cjfields at illinois.edu Wed Dec 23 10:08:31 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 23 Dec 2009 09:08:31 -0600 Subject: [Biopython] cannot find elink_090910.dtd In-Reply-To: <320fb6e00912230603o23b93c21v36890dc53617b597@mail.gmail.com> References: <11374559688483741089292458484266867741-Webmail@me.com> <320fb6e00912230428y3788591at9848206afa2bb8e0@mail.gmail.com> <320fb6e00912230603o23b93c21v36890dc53617b597@mail.gmail.com> Message-ID: <9E777B2B-51F8-407F-B035-B8E34B1BB73F@illinois.edu> On Dec 23, 2009, at 8:03 AM, Peter wrote: > On Wed, Dec 23, 2009 at 1:29 PM, Silvio Tschapke > wrote: >> >> I have copied the DTD file in >> /usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/DTDs/ >> now and it seems to work. Great! I have installed Biopython with >> >> python setup.py build >> python setup.py test >> sudo python setup.py install > > OK - good :) > >> But will I have the same problems when I use eSearch, or eQuery and so >> on? Because all of the DTD will not be up to date. So far I only >> copied this DTD for eLink you posted. > > I *hope* that this was the only DTD file we were missing. If > not, please do let us know so we can fix this for other users. > > In the short term, you would again need to download and fetch > other missing DTD files in the same way. You can generally look > at the start of the XML file to see where the DTD can be found, e.g. > >>>> from Bio import Entrez >>>> Entrez.email = "your.name.here at example.com" >>>> print Entrez.elink(dbfrom="pubmed", id="12345678").read(300) > > 2009//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eLink_090910.dtd"> > > ... > > Thanks, > > Peter Just a quick question: is there any particular reason you need the DTDs? The BioPerl eutils interface doesn't use them at all, primarily b/c they aren't required on our end. 
chris

From biopython at maubp.freeserve.co.uk Wed Dec 23 10:14:06 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Dec 2009 15:14:06 +0000
Subject: [Biopython] cannot find elink_090910.dtd
In-Reply-To: <9E777B2B-51F8-407F-B035-B8E34B1BB73F@illinois.edu>
References: <11374559688483741089292458484266867741-Webmail@me.com> <320fb6e00912230428y3788591at9848206afa2bb8e0@mail.gmail.com> <320fb6e00912230603o23b93c21v36890dc53617b597@mail.gmail.com> <9E777B2B-51F8-407F-B035-B8E34B1BB73F@illinois.edu>
Message-ID: <320fb6e00912230714y5893143dhd262a6732536c87@mail.gmail.com>

On Wed, Dec 23, 2009 at 3:08 PM, Chris Fields wrote:
>
> Just a quick question: is there any particular reason you need the DTDs?
> The BioPerl eutils interface doesn't use them at all, primarily b/c they
> aren't required on our end.

Our parser uses the DTDs (local copies) to know the expected
data structure, which gets turned into Python lists, dicts,
strings etc using that information.

I don't know too much about the implementation details, but
this doesn't work on Jython (Python under Java) since they
haven't implemented the DTD parsing support we expect.

Peter

From mjldehoon at yahoo.com Thu Dec 24 10:11:48 2009
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 24 Dec 2009 07:11:48 -0800 (PST)
Subject: [Biopython] cannot find elink_090910.dtd
In-Reply-To: <9E777B2B-51F8-407F-B035-B8E34B1BB73F@illinois.edu>
Message-ID: <439227.21985.qm@web62403.mail.re1.yahoo.com>

--- On Wed, 12/23/09, Chris Fields wrote:
> Just a quick question: is there any particular reason you
> need the DTDs? The BioPerl eutils interface doesn't
> use them at all, primarily b/c they aren't required on our
> end.
>
The DTDs are needed to figure out the data structure of the XML file. In
other words, what is a list, what is a dictionary, what is plain data,
etcetera. How does the BioPerl eutils interface know how to store the
information in the XML file?

--Michiel.
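To make the "what is a list, what is a dictionary, what is plain data" point concrete, here is a rough sketch of how element declarations in a DTD can drive that decision. It uses the standard library's expat bindings on a tiny made-up document with an inline DTD; it is not Biopython's parser, and `describe` plus the `Result`/`Source`/`Link` element names are hypothetical.

```python
from xml.parsers import expat

# Quantifiers meaning "may occur more than once" (the + and * suffixes).
REPEATING = (expat.model.XML_CQUANT_PLUS, expat.model.XML_CQUANT_REP)

# Made-up document with an inline DTD, for illustration only.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    "<!DOCTYPE Result [\n"
    "  <!ELEMENT Result (Source, Link+)>\n"
    "  <!ELEMENT Source (#PCDATA)>\n"
    "  <!ELEMENT Link (#PCDATA)>\n"
    "]>\n"
    "<Result><Source>pubmed</Source><Link>1</Link><Link>2</Link></Result>"
)


def describe(xml_text):
    """Return (kinds, repeated) derived from the element declarations.

    kinds maps element name -> "text" or "container";
    repeated is the set of elements that may occur several times in a
    parent, i.e. candidates for being collected into a list.
    """
    kinds = {}
    repeated = set()

    def walk(model, inherited=False):
        # expat content model: (type, quantifier, name, children)
        ctype, quant, name, children = model
        repeats = inherited or quant in REPEATING
        if ctype == expat.model.XML_CTYPE_NAME and repeats:
            repeated.add(name)
        for child in children:
            walk(child, repeats)

    def element_decl(name, model):
        if model[0] == expat.model.XML_CTYPE_MIXED and not model[3]:
            kinds[name] = "text"        # declared as (#PCDATA) only
        else:
            kinds[name] = "container"   # has child elements
        walk(model)

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = element_decl
    parser.Parse(xml_text, True)
    return kinds, repeated


kinds, repeated = describe(SAMPLE)
print(kinds)
print(repeated)
```

With this information a parser can decide up front that `Link` values should be collected into a list while `Source` is a single string, without guessing from how many times an element happens to appear in one particular response.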
From cjfields at illinois.edu Thu Dec 24 13:54:14 2009
From: cjfields at illinois.edu (Chris Fields)
Date: Thu, 24 Dec 2009 12:54:14 -0600
Subject: [Biopython] cannot find elink_090910.dtd
In-Reply-To: <439227.21985.qm@web62403.mail.re1.yahoo.com>
References: <439227.21985.qm@web62403.mail.re1.yahoo.com>
Message-ID: <5569562A-372E-414C-889A-9E75B344BA21@illinois.edu>

On Dec 24, 2009, at 9:11 AM, Michiel de Hoon wrote:

> --- On Wed, 12/23/09, Chris Fields wrote:
>> Just a quick question: is there any particular reason you
>> need the DTDs? The BioPerl eutils interface doesn't
>> use them at all, primarily b/c they aren't required on our
>> end.
>>
> The DTDs are needed to figure out the data structure of the XML file. In
> other words, what is a list, what is a dictionary, what is plain data,
> etcetera. How does the BioPerl eutils interface know how to store the
> information in the XML file?
>
> --Michiel.

We have classes designed to hold the information generically: docsums
have docsum items, elink has linkouts, einfo has field/link information,
and so on. This has worked fairly well with eutils changes over the last
four years without directly relying on the DTDs in a release.

chris

From pedro.al at fenhi.uh.cu Thu Dec 24 16:05:41 2009
From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández)
Date: Thu, 24 Dec 2009 16:05:41 -0500
Subject: [Biopython] Superpose structures...
Message-ID: <20091224160541.4d9f8d4so4g4gcss@correo.fenhi.uh.cu>

Hi all...

I've superposed two structures. Now I want to compute the RMSD between
2 residues after the superposition (with the transformed coordinates of
the moving structure). How can I do that...?

Thanks

--
Lic. Yasser Almeida Hernández
Center of Molecular Inmunology (CIM)
Nanobiology Group
P.O.Box 16040, Havana, Cuba
Phone: (537) 271-7933, ext.
221

----------------------------------------------------------------
Correo FENHI

From pedro.al at fenhi.uh.cu Thu Dec 24 16:37:12 2009
From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández)
Date: Thu, 24 Dec 2009 16:37:12 -0500
Subject: [Biopython] Superpose structures...
Message-ID: <20091224163712.mg5fgthu04gcgcgk@correo.fenhi.uh.cu>

I know I asked that question already, but this time it is quite different.
I have already superposed the structures (Profit isn't so suitable for my
project), but now I want to compute the RMSD between 2 residues in the
superposed position. When I do that, the RMSD that I get is the same as
if I superposed the structures according to that residue, and what I
really want is the "deviation of 2 residues when their structures are
superposed according to their binding site residues", and I want to know
how to do that in Biopython...

Thanks

--
Lic. Yasser Almeida Hernández
Center of Molecular Inmunology (CIM)
Nanobiology Group
P.O.Box 16040, Havana, Cuba
Phone: (537) 271-7933, ext. 221

----------------------------------------------------------------
Correo FENHI

From schafer at rostlab.org Thu Dec 24 17:11:10 2009
From: schafer at rostlab.org (Christian Schäfer)
Date: Thu, 24 Dec 2009 17:11:10 -0500
Subject: [Biopython] Superpose structures...
In-Reply-To: <20091224163712.mg5fgthu04gcgcgk@correo.fenhi.uh.cu>
References: <20091224163712.mg5fgthu04gcgcgk@correo.fenhi.uh.cu>
Message-ID: <4B33E6FE.3090709@rostlab.org>

So, what you want to do is to superpose two structures by minimizing the
RMSD between aligned residues and after that calculate the RMSD
between two residues? Is that right?

If so, ProFit is able to do that. Superimposition is done by the ZONE
command (which then returns the overall RMSD over the aligned regions
after superimposition); and calculating the RMSD between specific
residues (after superimposition) is done via the RZONE command.
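For the question as restated above (fit on one set of atoms, then measure the deviation of a different residue without fitting again), the second step is just the RMSD formula applied to coordinates that are already in the superposed frame. A minimal standard-library sketch, assuming the moving structure's coordinates have already been transformed by whatever tool did the fit; the `rmsd` helper and the coordinate values are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) tuples, with NO fitting step applied."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Atoms of the residue of interest in the fixed structure ...
fixed_residue = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
# ... and the matching atoms of the moving structure AFTER the
# binding-site superposition has been applied to them.
moved_residue = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

print(rmsd(fixed_residue, moved_residue))
```

The important design point is that no second superposition is run on the residue itself: re-fitting on those atoms would minimise their deviation and hide exactly the displacement being asked about. The atom lists must pair up corresponding atoms in the same order.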
You can even choose which kind of atoms to consider for RMSD calculation
with the ATOMS and RATOMS commands.

I'm not sure if something that specific could be done with plain
BioPython. But again, you can always write a wrapper for the ProFit part
in Python.

Chris

On 12/24/2009 04:37 PM, Yasser Almeida Hernández wrote:
> I know I asked that question already, but this time it is quite different.
> I have already superposed the structures (Profit isn't so suitable for my
> project), but now I want to compute the RMSD between 2 residues in the
> superposed position. When I do that, the RMSD that I get is the same
> as if I superposed the structures according to that residue, and what I
> really want is the "deviation of 2 residues when their structures are
> superposed according to their binding site residues", and I want to know
> how to do that in Biopython...
>
> Thanks
>

From bioinformaticsing at gmail.com Sat Dec 26 09:37:31 2009
From: bioinformaticsing at gmail.com (ning luwen)
Date: Sat, 26 Dec 2009 22:37:31 +0800
Subject: [Biopython] need help! how to retrieve full text from Pubmed central ?
Message-ID: <90247fbe0912260637n7553bdf7wbce10a627c0a124c@mail.gmail.com>

Dear everyone,

I need to download full text from PubMed Central. After reading the
Entrez manual, it seems that Entrez (not the web interface) doesn't give
a way to download .pdf full text files. Is this true?

--
regards,
ningluwen

From bioinformaticsing at gmail.com Sat Dec 26 09:54:23 2009
From: bioinformaticsing at gmail.com (ning luwen)
Date: Sat, 26 Dec 2009 22:54:23 +0800
Subject: [Biopython] need help! how to retrieve full text from Pubmed central ?
In-Reply-To: <90247fbe0912260637n7553bdf7wbce10a627c0a124c@mail.gmail.com>
References: <90247fbe0912260637n7553bdf7wbce10a627c0a124c@mail.gmail.com>
Message-ID: <90247fbe0912260654scd2b0ceyb37d54f36a3531fa@mail.gmail.com>

More about the problem.
From http://eutils.ncbi.nlm.nih.gov/corehtml/query/static/efetchlit_help.html, I can learn:

  PubMed Central contains a number of articles classified as "open access"
  for which you may download the full text as XML. For the remaining
  articles in PMC you may download only the abstracts as XML.

But when I try:

    handle = Entrez.efetch(db='pmc', id=idlist, rettype='full', retmode='xml')
    record = Entrez.read(handle)

I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/dist-packages/Bio/Entrez/__init__.py", line 258, in read
    record = handler.read(handle)
  File "/usr/local/lib/python2.6/dist-packages/Bio/Entrez/Parser.py", line 114, in read
    raise CorruptedXMLError
Bio.Entrez.Parser.CorruptedXMLError: Failed to parse the XML data. Please make sure that the input data are in XML format, and that the data are not corrupted.

The Biopython version is 1.53 (running on Python 2.6) and my system is Ubuntu 9.10.

On Sat, Dec 26, 2009 at 10:37 PM, ning luwen wrote:
> Dear everyone,
>    I need to download full text from Pubmed central. After see the
> Entrez manual, maybe Entrez(not the web interface) doesn't give a way
> to download .pdf full text file, is this true?
>
> --
> regards,
> ningluwen
>

--
regards,
luwening, bioinformatics center in uestc: www.bioinformaticsinuestc.cz.cc

From pedro.al at fenhi.uh.cu Mon Dec 28 09:06:38 2009
From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández)
Date: Mon, 28 Dec 2009 09:06:38 -0500
Subject: [Biopython] Superpose structures... DONE
Message-ID: <20091228090638.y05tos1p8g0gk08c@correo.fenhi.uh.cu>

The problem with calculating the RMSD after the superposition is solved. This is done with the class SVDSuperimposer (Bio > SVDSuperimposer > SVDSuperimposer.py). This class has the method get_init_rms(), which computes the RMSD of the structures after the superposition...

Now I have another question. Is it possible in Biopython to read gzipped PDB files (.pdb.gz)?

Thanks

--
Lic.
Yasser Almeida Hern?ndez Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext. 221 ---------------------------------------------------------------- Correo FENHI From mjldehoon at yahoo.com Mon Dec 28 12:27:23 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Mon, 28 Dec 2009 09:27:23 -0800 (PST) Subject: [Biopython] Superpose structures... DONE In-Reply-To: <20091228090638.y05tos1p8g0gk08c@correo.fenhi.uh.cu> Message-ID: <363342.11278.qm@web62406.mail.re1.yahoo.com> --- On Mon, 12/28/09, Yasser Almeida Hern?ndez wrote: > Now i have another question. It is possible in Biopython > read gziped pdb files (.pdb.gz)? I am not a Bio.PDB user, but from its documentation it looks like it uses the file name to open a PDB file instead of a handle. Thomas, how do you feel about modifying Bio.PDB so it uses a file handle instead of a file name? Then Bio.PDB can parse gzipped and bzipped files. --Michiel. From pedro.al at fenhi.uh.cu Tue Dec 29 09:18:38 2009 From: pedro.al at fenhi.uh.cu (Yasser Almeida =?iso-8859-1?b?SGVybuFuZGV6?=) Date: Tue, 29 Dec 2009 09:18:38 -0500 Subject: [Biopython] Remove hydrogens... Message-ID: <20091229091838.fnyk66sayos8swww@correo.fenhi.uh.cu> Hi all... How can i remove hydrogens atoms from the structures objects? Thanks -- Lic. Yasser Almeida Hern?ndez Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext. 221 ---------------------------------------------------------------- Correo FENHI From pengyu.ut at gmail.com Tue Dec 29 11:08:09 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Tue, 29 Dec 2009 10:08:09 -0600 Subject: [Biopython] Comparison between bioperl and biopython? Message-ID: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> May I ask somebody who are versitile in both bioperl and biopython comment on the pros and cons of bioperl and biopython? 
I'm sending this email to both bioperl and biopython mailing lists. But I hope that it will not result in any contention. I assume that the functionality between bioperl or biopython is the same, i.e., tasks can be done in bioperl can be done biopython and vice versa, as both libraries have been out there over 10 years. Please correct me if my understanding is not true. Given that a task that can be done with either bioperl or biopython, I, in particularly, want to know how long it will take to write the code for the task in bioperl and biopython, with the same readability requirement (see below) and the assumption that users have the same fluency in perl and python. python is claimed to be good for maintainability. But perl is criticized for there-are-many-ways-for-a-given-task. Since there are multiple ways in perl, let us assume that we always use perl in a readable way. From jason at bioperl.org Tue Dec 29 11:49:20 2009 From: jason at bioperl.org (Jason Stajich) Date: Tue, 29 Dec 2009 08:49:20 -0800 Subject: [Biopython] [Bioperl-l] Comparison between bioperl and biopython? In-Reply-To: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> Message-ID: <2B85EF86-8A84-491B-8C33-7EC16CCB8CBC@bioperl.org> Are you asking for the purposes of choosing a toolkit for your work or just curious about the advantages/disadvantages of language choice? -jason On Dec 29, 2009, at 8:08 AM, Peng Yu wrote: > May I ask somebody who are versitile in both bioperl and biopython > comment on the pros and cons of bioperl and biopython? I'm sending > this email to both bioperl and biopython mailing lists. But I hope > that it will not result in any contention. > > I assume that the functionality between bioperl or biopython is the > same, i.e., tasks can be done in bioperl can be done biopython and > vice versa, as both libraries have been out there over 10 years. 
> Please correct me if my understanding is not true. > > Given that a task that can be done with either bioperl or biopython, > I, in particularly, want to know how long it will take to write the > code for the task in bioperl and biopython, with the same readability > requirement (see below) and the assumption that users have the same > fluency in perl and python. > > python is claimed to be good for maintainability. But perl is > criticized for there-are-many-ways-for-a-given-task. Since there are > multiple ways in perl, let us assume that we always use perl in a > readable way. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason.stajich at gmail.com jason at bioperl.org http://fungalgenomes.org/ From sdavis2 at mail.nih.gov Tue Dec 29 12:03:40 2009 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Tue, 29 Dec 2009 12:03:40 -0500 Subject: [Biopython] Comparison between bioperl and biopython? In-Reply-To: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> Message-ID: <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu wrote: > May I ask somebody who are versitile in both bioperl and biopython > comment on the pros and cons of bioperl and biopython? I'm sending > this email to both bioperl and biopython mailing lists. But I hope > that it will not result in any contention. > > I assume that the functionality between bioperl or biopython is the > same, i.e., tasks can be done in bioperl can be done biopython and > vice versa, as both libraries have been out there over 10 years. > Please correct me if my understanding is not true. The two projects have similar goals, but saying that the functionality is the same would be an extreme oversimplification. 
You will need to define what you want to do and then check to see what the two projects have to offer. This will, in general, require perusing the websites for both projects as well as the relevant documentation. > Given that a task that can be done with either bioperl or biopython, > I, in particularly, want to know how long it will take to write the > code for the task in bioperl and biopython, with the same readability > requirement (see below) and the assumption that users have the same > fluency in perl and python. Again, you will want to define the task(s) to be accomplished and then weigh the pros and cons of each project combined with local expertise. If you don't know what you want to do, then you can certainly read some examples on the websites and see which project strikes you as a "winner" for you. > python is claimed to be good for maintainability. But perl is > criticized for there-are-many-ways-for-a-given-task. Since there are > multiple ways in perl, let us assume that we always use perl in a > readable way. These two statements are generalizations that provide little insight into the strengths or weaknesses of the languages. In other words, one can write good or bad code in both languages. Hope that helps. Sean From eric.talevich at gmail.com Tue Dec 29 13:37:43 2009 From: eric.talevich at gmail.com (Eric Talevich) Date: Tue, 29 Dec 2009 10:37:43 -0800 Subject: [Biopython] Superpose structures... DONE Message-ID: <3f6baf360912291037o5313b9f0s2acdc481c9989ce1@mail.gmail.com> On Mon, 28 Dec 2009, Michiel de Hoon wrote: > > I am not a Bio.PDB user, but from its documentation it looks like it uses > the file name to open a PDB file instead of a handle. Thomas, how do you > feel about modifying Bio.PDB so it uses a file handle instead of a file > name? Then Bio.PDB can parse gzipped and bzipped files. > > --Michiel. 
> > I guess PDB requires a file name because it wants full control over the file handle -- the handle is passed between PDBParser and parse_pdb_header, for instance. But control still isn't as crucial as in SeqIO.index (for example), so I don't think using a handle directly would lead to catastrophe in general. In addition, do you think a StructIO module would be worthwhile? Benefits: - Accept either a file name or file handle - Wouldn't necessarily need to specify the structure object's name as a separate argument (as PDBParser requires) - No need to instantiate a Parser object before parsing - PDB, PDBXML and mmCIF parsing would be called the same way Drawbacks: - Integrating parse_pdb_headers would become more important/tricky - Thin wrappers still require effort, and I'm currently tied up with TreeIO -- I'd get to it some months from now Cheers, Eric From pengyu.ut at gmail.com Tue Dec 29 13:58:59 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Tue, 29 Dec 2009 12:58:59 -0600 Subject: [Biopython] [Bioperl-l] Comparison between bioperl and biopython? In-Reply-To: <2B85EF86-8A84-491B-8C33-7EC16CCB8CBC@bioperl.org> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <2B85EF86-8A84-491B-8C33-7EC16CCB8CBC@bioperl.org> Message-ID: <366c6f340912291058t6c601e57re0c35e69fe81e09d@mail.gmail.com> To choose a toolkit for my work. On Tue, Dec 29, 2009 at 10:49 AM, Jason Stajich wrote: > Are you asking for the purposes of choosing a toolkit for your work or just > curious about the advantages/disadvantages of language choice? > > -jason > On Dec 29, 2009, at 8:08 AM, Peng Yu wrote: > >> May I ask somebody who are versitile in both bioperl and biopython >> comment on the pros and cons of bioperl and biopython? I'm sending >> this email to both bioperl and biopython mailing lists. But I hope >> that it will not result in any contention. 
>> >> I assume that the functionality between bioperl or biopython is the >> same, i.e., tasks can be done in bioperl can be done biopython and >> vice versa, as both libraries have been out there over 10 years. >> Please correct me if my understanding is not true. >> >> Given that a task that can be done with either bioperl or biopython, >> I, in particularly, want to know how long it will take to write the >> code for the task in bioperl and biopython, with the same readability >> requirement (see below) and the assumption that users have the same >> fluency in perl and python. >> >> python is claimed to be good for maintainability. But perl is >> criticized for there-are-many-ways-for-a-given-task. Since there are >> multiple ways in perl, let us assume that we always use perl in a >> readable way. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason.stajich at gmail.com > jason at bioperl.org > http://fungalgenomes.org/ > > From pengyu.ut at gmail.com Tue Dec 29 14:15:14 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Tue, 29 Dec 2009 13:15:14 -0600 Subject: [Biopython] Comparison between bioperl and biopython? In-Reply-To: <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> Message-ID: <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> On Tue, Dec 29, 2009 at 11:03 AM, Sean Davis wrote: > On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu wrote: >> May I ask somebody who are versitile in both bioperl and biopython >> comment on the pros and cons of bioperl and biopython? I'm sending >> this email to both bioperl and biopython mailing lists. But I hope >> that it will not result in any contention. 
>> >> I assume that the functionality between bioperl or biopython is the >> same, i.e., tasks can be done in bioperl can be done biopython and >> vice versa, as both libraries have been out there over 10 years. >> Please correct me if my understanding is not true. > > The two projects have similar goals, but saying that the functionality > is the same would be an extreme oversimplification. ?You will need to > define what you want to do and then check to see what the two projects > have to offer. ?This will, in general, require perusing the websites > for both projects as well as the relevant documentation. According to your experience, are there some tasks that are easier with one than with another? >> Given that a task that can be done with either bioperl or biopython, >> I, in particularly, want to know how long it will take to write the >> code for the task in bioperl and biopython, with the same readability >> requirement (see below) and the assumption that users have the same >> fluency in perl and python. > > Again, you will want to define the task(s) to be accomplished and then > weigh the pros and cons of each project combined with local expertise. > ?If you don't know what you want to do, then you can certainly read > some examples on the websites and see which project strikes you as a > "winner" for you. > >> python is claimed to be good for maintainability. But perl is >> criticized for there-are-many-ways-for-a-given-task. Since there are >> multiple ways in perl, let us assume that we always use perl in a >> readable way. > > These two statements are generalizations that provide little insight > into the strengths or weaknesses of the languages. ?In other words, > one can write good or bad code in both languages. > > Hope that helps. > > Sean > From jkhilmer at gmail.com Tue Dec 29 14:55:18 2009 From: jkhilmer at gmail.com (Jonathan Hilmer) Date: Tue, 29 Dec 2009 12:55:18 -0700 Subject: [Biopython] Comparison between bioperl and biopython? 
In-Reply-To: <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> Message-ID: <81277ce10912291155x6dde10ewe2055b9692d077c1@mail.gmail.com> Personally, I think that the differences between Python and Perl (although substantial) are not large enough to make the language itself the deciding factor. Instead, consider the larger community of software. I haven't yet found a situation in which Python cannot be applied: it can be used with R (statistics); lower-level code C or fortran; visualization software such as PyMol, Chimera, Blender, VTK; plotting with matplotlib; and scipy/numpy or sage, which provide innumerable benefits for computation, data-processing, etc. Although I don't claim to have a great deal of experience with Perl, I haven't seen the same integration with that language: I'm assuming it can be used with R and VTK (not sure about C or fortran?). For this reason, unless your work is highly targeted and you have no use programming language integration with other software, I would recommend Python. For perl experts, I would truly appreciate any corrections you could offer to these observations of mine, since I wouldn't mind using perl if it offers benefits either in general or for specific applications. Jonathan On Tue, Dec 29, 2009 at 12:15 PM, Peng Yu wrote: > On Tue, Dec 29, 2009 at 11:03 AM, Sean Davis wrote: >> On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu wrote: >>> May I ask somebody who are versitile in both bioperl and biopython >>> comment on the pros and cons of bioperl and biopython? I'm sending >>> this email to both bioperl and biopython mailing lists. But I hope >>> that it will not result in any contention. 
>>> >>> I assume that the functionality between bioperl or biopython is the >>> same, i.e., tasks can be done in bioperl can be done biopython and >>> vice versa, as both libraries have been out there over 10 years. >>> Please correct me if my understanding is not true. >> >> The two projects have similar goals, but saying that the functionality >> is the same would be an extreme oversimplification. ?You will need to >> define what you want to do and then check to see what the two projects >> have to offer. ?This will, in general, require perusing the websites >> for both projects as well as the relevant documentation. > > According to your experience, are there some tasks that are easier > with one than with another? > >>> Given that a task that can be done with either bioperl or biopython, >>> I, in particularly, want to know how long it will take to write the >>> code for the task in bioperl and biopython, with the same readability >>> requirement (see below) and the assumption that users have the same >>> fluency in perl and python. >> >> Again, you will want to define the task(s) to be accomplished and then >> weigh the pros and cons of each project combined with local expertise. >> ?If you don't know what you want to do, then you can certainly read >> some examples on the websites and see which project strikes you as a >> "winner" for you. >> >>> python is claimed to be good for maintainability. But perl is >>> criticized for there-are-many-ways-for-a-given-task. Since there are >>> multiple ways in perl, let us assume that we always use perl in a >>> readable way. >> >> These two statements are generalizations that provide little insight >> into the strengths or weaknesses of the languages. ?In other words, >> one can write good or bad code in both languages. >> >> Hope that helps. 
>> >> Sean >> > > _______________________________________________ > Biopython mailing list ?- ?Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From wgheath at gmail.com Tue Dec 29 15:16:39 2009 From: wgheath at gmail.com (William Heath) Date: Tue, 29 Dec 2009 12:16:39 -0800 Subject: [Biopython] Comparison between bioperl and biopython? In-Reply-To: <81277ce10912291155x6dde10ewe2055b9692d077c1@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> <81277ce10912291155x6dde10ewe2055b9692d077c1@mail.gmail.com> Message-ID: The biggest reason to go with python is the ease of use. Biologists are not programmers and the learning curve for python is much smaller than that of perl. I like perl but choose python because of this issue. Perl 6 does address some of these issues however but this has not been fully implemented as of yet. -Tim P.S. I love, love, love cpan though which is only for perl right now :( On Tue, Dec 29, 2009 at 11:55 AM, Jonathan Hilmer wrote: > Personally, I think that the differences between Python and Perl > (although substantial) are not large enough to make the language > itself the deciding factor. > > Instead, consider the larger community of software. I haven't yet > found a situation in which Python cannot be applied: it can be used > with R (statistics); lower-level code C or fortran; visualization > software such as PyMol, Chimera, Blender, VTK; plotting with > matplotlib; and scipy/numpy or sage, which provide innumerable > benefits for computation, data-processing, etc. > > Although I don't claim to have a great deal of experience with Perl, I > haven't seen the same integration with that language: I'm assuming it > can be used with R and VTK (not sure about C or fortran?). 
For this > reason, unless your work is highly targeted and you have no use > programming language integration with other software, I would > recommend Python. > > For perl experts, I would truly appreciate any corrections you could > offer to these observations of mine, since I wouldn't mind using perl > if it offers benefits either in general or for specific applications. > > > Jonathan > > On Tue, Dec 29, 2009 at 12:15 PM, Peng Yu wrote: > > On Tue, Dec 29, 2009 at 11:03 AM, Sean Davis > wrote: > >> On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu wrote: > >>> May I ask somebody who are versitile in both bioperl and biopython > >>> comment on the pros and cons of bioperl and biopython? I'm sending > >>> this email to both bioperl and biopython mailing lists. But I hope > >>> that it will not result in any contention. > >>> > >>> I assume that the functionality between bioperl or biopython is the > >>> same, i.e., tasks can be done in bioperl can be done biopython and > >>> vice versa, as both libraries have been out there over 10 years. > >>> Please correct me if my understanding is not true. > >> > >> The two projects have similar goals, but saying that the functionality > >> is the same would be an extreme oversimplification. You will need to > >> define what you want to do and then check to see what the two projects > >> have to offer. This will, in general, require perusing the websites > >> for both projects as well as the relevant documentation. > > > > According to your experience, are there some tasks that are easier > > with one than with another? > > > >>> Given that a task that can be done with either bioperl or biopython, > >>> I, in particularly, want to know how long it will take to write the > >>> code for the task in bioperl and biopython, with the same readability > >>> requirement (see below) and the assumption that users have the same > >>> fluency in perl and python. 
> >> > >> Again, you will want to define the task(s) to be accomplished and then > >> weigh the pros and cons of each project combined with local expertise. > >> If you don't know what you want to do, then you can certainly read > >> some examples on the websites and see which project strikes you as a > >> "winner" for you. > >> > >>> python is claimed to be good for maintainability. But perl is > >>> criticized for there-are-many-ways-for-a-given-task. Since there are > >>> multiple ways in perl, let us assume that we always use perl in a > >>> readable way. > >> > >> These two statements are generalizations that provide little insight > >> into the strengths or weaknesses of the languages. In other words, > >> one can write good or bad code in both languages. > >> > >> Hope that helps. > >> > >> Sean > >> > > > > _______________________________________________ > > Biopython mailing list - Biopython at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython > > > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From jason at bioperl.org Tue Dec 29 16:57:49 2009 From: jason at bioperl.org (Jason Stajich) Date: Tue, 29 Dec 2009 13:57:49 -0800 Subject: [Biopython] [Bioperl-l] Comparison between bioperl and biopython? In-Reply-To: <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> Message-ID: <02851B8A-E74E-453E-9725-6FA8F3995F82@bioperl.org> On Dec 29, 2009, at 11:15 AM, Peng Yu wrote: > On Tue, Dec 29, 2009 at 11:03 AM, Sean Davis > wrote: >> On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu >> wrote: >>> May I ask somebody who are versitile in both bioperl and biopython >>> comment on the pros and cons of bioperl and biopython? 
I'm sending >>> this email to both bioperl and biopython mailing lists. But I hope >>> that it will not result in any contention. >>> >>> I assume that the functionality between bioperl or biopython is the >>> same, i.e., tasks can be done in bioperl can be done biopython and >>> vice versa, as both libraries have been out there over 10 years. >>> Please correct me if my understanding is not true. >> >> The two projects have similar goals, but saying that the >> functionality >> is the same would be an extreme oversimplification. You will need to >> define what you want to do and then check to see what the two >> projects >> have to offer. This will, in general, require perusing the websites >> for both projects as well as the relevant documentation. > > According to your experience, are there some tasks that are easier > with one than with another? As you have still failed to give much insight into the 'tasks' it is hard to give you a better answer. If there is a module or set of routines already written then yes one might be easier than the other. Otherwise it just depends on your strengths in the programming language. We discussed the strengths of the different toolkits briefly on the podcast last month. http://twit.tv/floss96 I echo Sean. Use whichever language you are a better programmer in. BioPerl is more mature in some facets than is BioPython, but BioPython has some components that are more heavily developed and supported than BioPerl (structures being one of those and interfacing that to pyMol would be a strength). I personally think the Gbrowse, Bio-Graphics, and Bio::DB::GFF/Bio::DB::SeqFeature::Store interface to Sequence databases and Features is a critical aspect of mining genomic data and features and use these heavily in my work, making BioPerl easy and powerful for my tasks. That and sequence and alignment parsing and reformatting. 
But there are comparable tools written in python with and without BioPython that you can also use so mainly it is about building up an expertise in a toolkit and going forward. The BioPerl faithful will probably say it is more useful toolkit to us, but we are of course a biased sample. Both projects can benefit from more users and developers contributing code and documentation so I would just jump in and give it a try if you are unsure which will be easier for you. > >>> Given that a task that can be done with either bioperl or biopython, >>> I, in particularly, want to know how long it will take to write the >>> code for the task in bioperl and biopython, with the same >>> readability >>> requirement (see below) and the assumption that users have the same >>> fluency in perl and python. >> >> Again, you will want to define the task(s) to be accomplished and >> then >> weigh the pros and cons of each project combined with local >> expertise. >> If you don't know what you want to do, then you can certainly read >> some examples on the websites and see which project strikes you as a >> "winner" for you. >> >>> python is claimed to be good for maintainability. But perl is >>> criticized for there-are-many-ways-for-a-given-task. Since there are >>> multiple ways in perl, let us assume that we always use perl in a >>> readable way. >> >> These two statements are generalizations that provide little insight >> into the strengths or weaknesses of the languages. In other words, >> one can write good or bad code in both languages. >> >> Hope that helps. 
>> >> Sean >> > > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason.stajich at gmail.com jason at bioperl.org http://fungalgenomes.org/ From mitlox at op.pl Tue Dec 29 21:42:07 2009 From: mitlox at op.pl (xyz) Date: Wed, 30 Dec 2009 12:42:07 +1000 Subject: [Biopython] fastq-solexa index In-Reply-To: <320fb6e00911260248w1f6a29b1ucc0bfecec897c67b@mail.gmail.com> References: <4B0DD08B.6070607@op.pl> <320fb6e00911260248w1f6a29b1ucc0bfecec897c67b@mail.gmail.com> Message-ID: <4B3ABDFF.8030809@op.pl> Peter wrote: > In Bio.SeqIO we give each file format a name, in this case "fastq-solexa" > means the old Solexa FASTQ files (also used by Illumina up to and > including pipeline 1.2) which use Solexa scores with an ASCII offset > of 64 (not PHRED scores). The table on the SeqIO wiki page tries to > summarise this. See also: http://en.wikipedia.org/wiki/FASTQ_format > > The "index" column on that table on the SeqIO wiki page indicates if > each file format can be used with the Bio.SeqIO.index(...) function > included in Biopython 1.52 onwards. See: > http://news.open-bio.org/news/2009/09/biopython-seqio-index/ > > There are also examples in the main Tutorial, > http://biopython.org/DIST/docs/tutorial/Tutorial.html > http://biopython.org/DIST/docs/tutorial/Tutorial.pdf > > And in the Bio.SeqIO module's built in help, online here: > http://biopython.org/DIST/docs/api/Bio.SeqIO-module.html > > >From within Python: > > >>>> from Bio import SeqIO >>>> help(SeqIO) >>>> > ... > >>>> help(SeqIO.index) >>>> > ... > > Peter > > Thank you. 
From mitlox at op.pl Tue Dec 29 21:51:45 2009 From: mitlox at op.pl (xyz) Date: Wed, 30 Dec 2009 12:51:45 +1000 Subject: [Biopython] Strand Message-ID: <4B3AC041.2070008@op.pl> Hello, I downloaded data from Phytozome Biomart: >AC159145_38|MtChr2|AC159145_38|Mtruncatula|17915949|17918990|-1 ATTTCCTCCAGACTTGTTAAAGAAGTTGAGTACAGATTGTATTGTCATGCAAAATCATCA ATATGGCATATCCCCAGTAAAACTCCTGGGAAATCAAAAGCTATCGAGTTTTTTCGAGAT CTTGACAACTTCCAACGATCAAGATGATAAGGTTTATGTCTCTACAGTACGTTCACGTAA CTATCCCGTGACTGGCTTCCAATGGCATCCTGAGAAAAATGCCTTCGAATGGGGCTCACC AAGCATTCCACACACAGAGGATGCCATTCGAACAACTCAGTATGCTGCAAACTATTTGGT CAGTGAAGCGAGGAAGTCCTTAAACAGACCAGTTGCTCAGGAATTGTTAGACAATCTCAT ATACAATTACAGACCCACTTATTGTGGGTATGCAGGTTGTCCACCGCCTAATCCGAACCT CTACTACCAGCCGGTCATTGGAATTCTCAGCCACCCCGGCGATGGCACTTCAGGCCGCCA CAGTAATGCTACGGGCGCTTCCTTCATTCACGCCTCTTATGTGAAATTCGTGGAGGCTGC TGGCGCTAGAGTAGTTCCTCTCATTTACAACGAACCGGAGGAGAAGATTCTCAAGGTATC AGAAAAGGCCAAAGCTTGA The above data is from -1 strand, but how could I convert it +1 strand? Thank you in advance. Best regards, From p.j.a.cock at googlemail.com Wed Dec 30 07:13:42 2009 From: p.j.a.cock at googlemail.com (Peter) Date: Wed, 30 Dec 2009 12:13:42 +0000 Subject: [Biopython] programming errors In-Reply-To: <4788465e0912252205l1ac3f26ds98739898ff83dbd9@mail.gmail.com> References: <4788465e0912252205l1ac3f26ds98739898ff83dbd9@mail.gmail.com> Message-ID: <389810B4-4CAB-4375-8964-5ABDA9749FFB@googlemail.com> Hi Rocky, To send a query to the mailing list please use the Biopython at ... address, not the Biopython-owner at ... address. You need to sign up to the mailing list first. Thanks, Peter On 26 Dec 2009, at 06:05, Rocky Parida wrote: > Hi > My name is Rocky. I am student at the Grand Valley State University. > I am > doing MS in Bioinformatics. I am currently using python 2.6.I was > trying to > follow a documentation( > http://www.inb.mu-luebeck.de/biosoft/biopython/tut/Tutorial002.html#toc10 > )in > > order to connect to a biological databases. 
I am facing some troubles > regarding importing data from NCBI. I am attaching my snipped word > doc with > this email. > Can you please suggest me on how to perform statistical data > analysis using > python. I am very much interested to learn but i am facing some > troubles > following the documentation. Is there a step by step documentation > that has > all the information regarding what to write, what to download and > how to do > statistics using python codes. If you please refer those sites and > books in > you reply. I have very little back ground in programming. So, please > keep > that in you consideration as well. > Thanking you > Rocky From chapmanb at 50mail.com Wed Dec 30 07:59:42 2009 From: chapmanb at 50mail.com (Brad Chapman) Date: Wed, 30 Dec 2009 07:59:42 -0500 Subject: [Biopython] Strand In-Reply-To: <4B3AC041.2070008@op.pl> References: <4B3AC041.2070008@op.pl> Message-ID: <20091230125942.GB39741@sobchak.mgh.harvard.edu> Hello; > I downloaded data from Phytozome Biomart: > > >AC159145_38|MtChr2|AC159145_38|Mtruncatula|17915949|17918990|-1 > ATTTCCTCCAGACTTGTTAAAGAAGTTGAGTACAGATTGTATTGTCATGCAAAATCATCA > ATATGGCATATCCCCAGTAAAACTCCTGGGAAATCAAAAGCTATCGAGTTTTTTCGAGAT > CTTGACAACTTCCAACGATCAAGATGATAAGGTTTATGTCTCTACAGTACGTTCACGTAA > CTATCCCGTGACTGGCTTCCAATGGCATCCTGAGAAAAATGCCTTCGAATGGGGCTCACC > AAGCATTCCACACACAGAGGATGCCATTCGAACAACTCAGTATGCTGCAAACTATTTGGT > CAGTGAAGCGAGGAAGTCCTTAAACAGACCAGTTGCTCAGGAATTGTTAGACAATCTCAT > ATACAATTACAGACCCACTTATTGTGGGTATGCAGGTTGTCCACCGCCTAATCCGAACCT > CTACTACCAGCCGGTCATTGGAATTCTCAGCCACCCCGGCGATGGCACTTCAGGCCGCCA > CAGTAATGCTACGGGCGCTTCCTTCATTCACGCCTCTTATGTGAAATTCGTGGAGGCTGC > TGGCGCTAGAGTAGTTCCTCTCATTTACAACGAACCGGAGGAGAAGATTCTCAAGGTATC > AGAAAAGGCCAAAGCTTGA > > The above data is from -1 strand, but how could I convert it +1 strand? If you got this from BioMart and retrieved something like cDNA sequence or transcript, this is probably already reverse complemented for you. 
In this case it looks like a coding sequence starting at base 47 and proceeding to the stop codon at the end. To answer your question, please see the Tutorial documentation, specifically Chapter 5 Sequence Input/Output: http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc44 and section 3.7 Nucleotide sequences and (reverse) complements: http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc23 This should lead you to:

    from Bio import SeqIO

    in_file = "your_data.fa"
    # Read the single FASTA record from the file
    with open(in_file) as in_handle:
        rec = SeqIO.read(in_handle, "fasta")
    # Reverse complement the sequence to get the other strand
    rc = rec.seq.reverse_complement()

Hope this helps, Brad From pedro.al at fenhi.uh.cu Wed Dec 30 15:05:55 2009 From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández) Date: Wed, 30 Dec 2009 15:05:55 -0500 Subject: [Biopython] Save custom structure... Message-ID: <20091230150555.ojbth0gp34g088os@correo.fenhi.uh.cu> Hi all... I've extracted a residue and an atom as two separated objects... How can I save them as a single structure .pdb file? Thanks -- Lic. Yasser Almeida Hernández Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext. 221 ---------------------------------------------------------------- Correo FENHI From johncumbers at gmail.com Tue Dec 1 07:10:16 2009 From: johncumbers at gmail.com (John Cumbers) Date: Mon, 30 Nov 2009 23:10:16 -0800 Subject: [Biopython] MuscleCommandline and phyiout Message-ID: Hello, I'm using the MuscleCommandline wrapper and I'm having trouble getting the Phylip interleaved output format. For the Muscle command line I would type "muscle -in myinputfile -phyiout myoutputfile" and this command in python: cline = MuscleCommandline (input=output_file_name_FASTA, out=output_file_name_aligned) But for phyiout, this doesn't work: cline = MuscleCommandline (input=output_file_name_FASTA, phyiout=output_file_name_aligned) returning: ValueError: Option name phyiout was not found.
I tried to lookup the possibilities here: http://www.biopython.org/DIST/docs/api/Bio.Align.Applications._Muscle.MuscleCommandline-class.html but couldn't find them, any help appreciated, cheers, John John Cumbers, Ph.D Candidate NASA Ames Research Center Mail Stop 239-20, Bldg N239 Rm 373 Moffett Field, CA 94035, USA. cell +1 (401) 523 8190, office +1 (650) 604-1914, fax +1 (650) 604-1088 Graduate Program in Molecular Biology, Cell Biology, and Biochemistry Brown University, Box G-W Providence, RI, 02912, USA From biopython at maubp.freeserve.co.uk Tue Dec 1 09:23:54 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Dec 2009 09:23:54 +0000 Subject: [Biopython] MuscleCommandline and phyiout In-Reply-To: References: Message-ID: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> On Tue, Dec 1, 2009 at 7:10 AM, John Cumbers wrote: > Hello, > > I'm using the MuscleCommandline wrapper and I'm having trouble getting the > Phylip interleaved output format. ?For the Muscle command line I would type > "muscle -in myinputfile -phyiout myoutputfile" What version of MUSCLE do you have? v3.7 doesn't mention this option in the command line help, nor does the current manual: http://www.drive5.com/muscle/muscle.html Peter From biopython at maubp.freeserve.co.uk Tue Dec 1 09:52:19 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Dec 2009 09:52:19 +0000 Subject: [Biopython] MuscleCommandline and phyiout In-Reply-To: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> References: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> Message-ID: <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> On Tue, Dec 1, 2009 at 9:23 AM, Peter wrote: > On Tue, Dec 1, 2009 at 7:10 AM, John Cumbers wrote: >> Hello, >> >> I'm using the MuscleCommandline wrapper and I'm having trouble getting the >> Phylip interleaved output format. 
?For the Muscle command line I would type >> "muscle -in myinputfile -phyiout myoutputfile" > > What version of MUSCLE do you have? v3.7 doesn't mention this > option in the command line help, nor does the current manual: > http://www.drive5.com/muscle/muscle.html It looks like an undocumented option, much like -phyi (which I guessed) for PHYLIP interlaced and -phys for PHYLIP sequential. These match the documented format options (e.g. -msf, -html, -clw and -clwstrict). i.e. You can use this: "muscle -in myinputfile -phyi -out myoutputfile" I think we should ask the MUSCLE author which of these undocumented arguments are actually supported rather than adding them all. Peter From bartomas at gmail.com Tue Dec 1 16:25:36 2009 From: bartomas at gmail.com (bar tomas) Date: Tue, 1 Dec 2009 16:25:36 +0000 Subject: [Biopython] Host_organism field in SwissProt Message-ID: Hi, I'm using BioPython for processing protein sequences from SwissProt database. I'm following the excellent tutorial documentation on SwissProt querying (p.107) I'm just wondering about the meaning of the field 'host_organism' in a swissprot record, as I haven't yet found a record where the value of this field is supplied. If the record concerns the protein sequence of a bacteria, for instance, does the host_organism field contain a list of taxonomic ids of possible host organisms where the bacteria can be found? 
Many thanks From biopython at maubp.freeserve.co.uk Tue Dec 1 19:19:23 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Dec 2009 19:19:23 +0000 Subject: [Biopython] MuscleCommandline and phyiout In-Reply-To: <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> References: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> Message-ID: <320fb6e00912011119o62d5ef51t1f27f7a4c310a026@mail.gmail.com> On Tue, Dec 1, 2009 at 9:52 AM, Peter wrote: > > It looks like an undocumented option, much like -phyi (which I guessed) > for PHYLIP interlaced and -phys for PHYLIP sequential. These match > the documented format options (e.g. -msf, -html, -clw and -clwstrict). > i.e. You can use this: > > "muscle -in myinputfile -phyi -out myoutputfile" > > I think we should ask the MUSCLE author which of these > undocumented arguments are actually supported rather than > adding them all. Robert Edgar agreed the documentation was out of sync, and has confirmed these are safe arguments to include in our wrapper. Peter From biopython at maubp.freeserve.co.uk Tue Dec 1 19:21:35 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Dec 2009 19:21:35 +0000 Subject: [Biopython] Host_organism field in SwissProt In-Reply-To: References: Message-ID: <320fb6e00912011121o41237344v61d1570a919d54b4@mail.gmail.com> On Tue, Dec 1, 2009 at 4:25 PM, bar tomas wrote: > Hi, > I'm using BioPython for processing protein sequences from SwissProt database. > I'm following the excellent tutorial documentation on SwissProt querying (p.107) > I'm just wondering about the meaning of the field 'host_organism' in a > swissprot record, as I haven't yet found a record where the value of > this field is supplied. > If the record concerns the protein sequence of a bacteria, for > instance, does the host_organism field contain a list of taxonomic ids > of possible host organisms where the bacteria can be found? 
> Many thanks I suspect (without checking) that this field only applies to viruses (or perhaps pathogens) and so will usually be empty. Peter P.S. Page numbers (and even section numbers) in the tutorial do change from release to release - the section names are usually stable. From biopython at maubp.freeserve.co.uk Tue Dec 1 19:34:19 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Dec 2009 19:34:19 +0000 Subject: [Biopython] Fwd: [Utilities-announce] NCBI E-Utility Policy Change In-Reply-To: <320fb6e00912011129j68dda3b2p6df9a232f0462458@mail.gmail.com> References: <7B6F170840CA6C4DA63EE0C8A7BB43EC09CA7387@NIHCESMLBX15.nih.gov> <320fb6e00912011129j68dda3b2p6df9a232f0462458@mail.gmail.com> Message-ID: <320fb6e00912011134u2481644aw5dfdfe9f9a3049f0@mail.gmail.com> Hi all, Attention NCBI Entrez users - the NCBI really do want you to include your email address, and it will be mandatory in future! See below... If using Bio.Entrez, the tool parameter will by default be set to Biopython, but the email is omitted. We already encourage the email to be included in our documentation but given the new NCBI guidance I'd suggest we make omitting the email issue a warning in the next release (and an error in the subsequent release of Biopython?). Peter ---------- Forwarded message ---------- From: ? Date: Tue, Dec 1, 2009 at 6:59 PM Subject: [Utilities-announce] NCBI E-Utility Policy Change To: utilities-announce at ncbi.nlm.nih.gov As part of an ongoing effort to ensure efficient access to the Entrez Utilities (E-utilities) by all users, NCBI has decided to change the usage policy for the E-utilities effective June 1, 2010. Effective on June 1, 2010, all E-utility requests, either using standard URLs or SOAP, must contain non-null values for both the &tool and &email parameters. 
Any E-utility request made after June 1, 2010 that does not contain values for both parameters will return an error explaining that these parameters must be included in E-utility requests. The value of the &tool parameter should be a URI-safe string that is the name of the software package, script or web page producing the E-utility request. The value of the &email parameter should be a valid e-mail address for the appropriate contact person or group responsible for maintaining the tool producing the E-utility request. NCBI uses these parameters to contact users whose use of the E-utilities violates the standard usage policies described at http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements. These usage policies are designed to prevent excessive requests from a small group of users from reducing or eliminating the wider community's access to the E-utilities. NCBI will attempt to contact a user at the e-mail address provided in the &email parameter prior to blocking access to the E-utilities. NCBI realizes that this policy change will require many of our users to change their code. Based on past experience, we anticipate that most of our users should be able to make the necessary changes before the June 1, 2010 deadline. If you have any concerns about making these changes by that date, or if you have any questions about these policies, please contact eutilities at ncbi.nlm.nih.gov. Thank you for your understanding and cooperation in helping us continue to deliver a reliable and efficient web service. 
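As a rough illustration of the requirement above, the sketch below builds an E-utility request URL that always carries both the tool and email parameters. It uses only the Python standard library; the base URL, the helper name build_eutils_url and the example values are assumptions made for illustration and are not part of the NCBI announcement.

```python
from urllib.parse import urlencode

# Assumed base URL for the E-utilities; check the current NCBI documentation.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_eutils_url(utility, tool, email, **params):
    """Return an E-utility URL, refusing to omit the tool or email values."""
    if not tool or not email:
        raise ValueError("both tool and email must be non-empty")
    query = dict(params)
    query["tool"] = tool
    query["email"] = email
    # Sort the parameters so the resulting URL is reproducible.
    return EUTILS_BASE + utility + ".fcgi?" + urlencode(sorted(query.items()))


# Hypothetical script name and contact address.
url = build_eutils_url("efetch", tool="my_script", email="someone@example.org",
                       db="nucleotide", id="NC_000932", rettype="gb")
print(url)
```

Bio.Entrez users would normally not build URLs by hand; setting Entrez.email once before making any requests should have the same effect.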
_______________________________________________ Utilities-announce mailing list http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce -------------- next part -------------- _______________________________________________ Utilities-announce mailing list http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce From cjfields at illinois.edu Tue Dec 1 19:54:48 2009 From: cjfields at illinois.edu (Chris Fields) Date: Tue, 1 Dec 2009 13:54:48 -0600 Subject: [Biopython] Fwd: [Utilities-announce] NCBI E-Utility Policy Change In-Reply-To: <320fb6e00912011134u2481644aw5dfdfe9f9a3049f0@mail.gmail.com> References: <7B6F170840CA6C4DA63EE0C8A7BB43EC09CA7387@NIHCESMLBX15.nih.gov> <320fb6e00912011129j68dda3b2p6df9a232f0462458@mail.gmail.com> <320fb6e00912011134u2481644aw5dfdfe9f9a3049f0@mail.gmail.com> Message-ID: <1EE2400D-D3F7-49DC-82D7-001EE18F2030@illinois.edu> I'll be following the same (exclusion of the email a warning in next bioperl release in Jan, and an error for the spring release). chris On Dec 1, 2009, at 1:34 PM, Peter wrote: > Hi all, > > Attention NCBI Entrez users - the NCBI really do want you to include > your email address, and it will be mandatory in future! See below... > > If using Bio.Entrez, the tool parameter will by default be set to > Biopython, but the email is omitted. We already encourage the email > to be included in our documentation but given the new NCBI guidance > I'd suggest we make omitting the email issue a warning in the next > release (and an error in the subsequent release of Biopython?). > > Peter > > > ---------- Forwarded message ---------- > From: > Date: Tue, Dec 1, 2009 at 6:59 PM > Subject: [Utilities-announce] NCBI E-Utility Policy Change > To: utilities-announce at ncbi.nlm.nih.gov > > > As part of an ongoing effort to ensure efficient access to the Entrez > Utilities (E-utilities) by all users, NCBI has decided to change the > usage policy for the E-utilities effective June 1, 2010. 
Effective on > June 1, 2010, all E-utility requests, either using standard URLs or > SOAP, must contain non-null values for both the &tool and &email > parameters. Any E-utility request made after June 1, 2010 that does > not contain values for both parameters will return an error explaining > that these parameters must be included in E-utility requests. > > > > The value of the &tool parameter should be a URI-safe string that is > the name of the software package, script or web page producing the > E-utility request. > > > > The value of the &email parameter should be a valid e-mail address for > the appropriate contact person or group responsible for maintaining > the tool producing the E-utility request. > > > > NCBI uses these parameters to contact users whose use of the > E-utilities violates the standard usage policies described at > http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html#UserSystemRequirements. > These usage policies are designed to prevent excessive requests from a > small group of users from reducing or eliminating the wider > community's access to the E-utilities. NCBI will attempt to contact a > user at the e-mail address provided in the &email parameter prior to > blocking access to the E-utilities. > > > > NCBI realizes that this policy change will require many of our users > to change their code. Based on past experience, we anticipate that > most of our users should be able to make the necessary changes before > the June 1, 2010 deadline. If you have any concerns about making these > changes by that date, or if you have any questions about these > policies, please contact eutilities at ncbi.nlm.nih.gov. > > > > Thank you for your understanding and cooperation in helping us > continue to deliver a reliable and efficient web service. 
> > > > _______________________________________________ > Utilities-announce mailing list > http://www.ncbi.nlm.nih.gov/mailman/listinfo/utilities-announce > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From mike.thon at gmail.com Wed Dec 2 12:31:45 2009 From: mike.thon at gmail.com (Michael Thon) Date: Wed, 2 Dec 2009 13:31:45 +0100 Subject: [Biopython] Getting the sequence for a SeqFeature In-Reply-To: <320fb6e00911060447g779f2ac2i7739a28c3f4a4077@mail.gmail.com> References: <320fb6e00911060422u2d2742d5r7b5b1db98c991df5@mail.gmail.com> <320fb6e00911060447g779f2ac2i7739a28c3f4a4077@mail.gmail.com> Message-ID: <0EAFC700-CFDB-4D0B-882A-B9D0EF492172@gmail.com> I was wondering if you have implemented this method yet and if so, is it in a repository somewhere where I can try it? I was about to post a message on this and I searched the archives first (!) and found this thread. I have genbank genomic sequences and I need to get the transcript sequences for the CDS features. thanks Mike On Nov 6, 2009, at 1:47 PM, Peter wrote: > On Fri, Nov 6, 2009 at 12:22 PM, Peter wrote: >> Hi all, >> >> I am planing to add a new method to the SeqFeature object, but >> would like a little feedback first. This email is really just the >> background - I'll write up a few examples later to try and make >> this a bit clearer... > > OK, here is a non-trivial example - the first CDS feature in the > GenBank file NC_000932.gb (included as a Biopython unit test), > which is a three part join on the reverse strand. In this case, the > GenBank file includes the protein translation for the CDS features > so we can use it to check our results. 
> > We can parse this GenBank file into a SeqRecord with: > > from Bio import SeqIO > record = SeqIO.read(open("../biopython/Tests/GenBank/NC_000932.gb"), "gb") > > Let's have a look at the first CDS feature (index 2): > > f = record.features[2] > print f.type, f.location, f.strand, f.location_operator > for sub_f in f.sub_features : > print " - ", sub_f.location, sub_f.strand > table = f.qualifiers.get("transl_table",[1])[0] # List of one int > print "Table", table > > Giving: > > CDS [97998:69724] -1 join > - [97998:98024] -1 > - [98561:98793] -1 > - [69610:69724] -1 > Table 11 > > Looking at the raw GenBank file, this feature has location string: > > complement(join(97999..98024,98562..98793,69611..69724)) > > i.e. To get the sequence you need to do this (note zero based > Python counting as in the output above): > > print (record.seq[97998:98024] + record.seq[98561:98793] + > record.seq[69610:69724]).reverse_complement() > > And then translate it using NCBI genetic code table 11, > > print "Manual translation:" > print (record.seq[97998:98024] + record.seq[98561:98793] + > record.seq[69610:69724]).reverse_complement().translate(table=11, > cds=True) > print "Given translation:" > print f.qualifiers["translation"][0] # List of one string > print "Biopython translation (with proposed code):" > print f.extract(record.seq).translate(table, cds=True) > > And the output, together with the provided translation in the > feature annotation, and the shortcut with the new code I am > proposing to include in Biopython: > > Manual translation: > MPTIKQLIRNTRQPIRNVTKSPALRGCPQRRGTCTRVYTITPKKPNSALRKVARVRLTSGFEITAYIPGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRGTLDAVGVKDRQQGRSKYGVKKPK > Given translation: > MPTIKQLIRNTRQPIRNVTKSPALRGCPQRRGTCTRVYTITPKKPNSALRKVARVRLTSGFEITAYIPGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRGTLDAVGVKDRQQGRSKYGVKKPK > Biopython translation: > MPTIKQLIRNTRQPIRNVTKSPALRGCPQRRGTCTRVYTITPKKPNSALRKVARVRLTSGFEITAYIPGIGHNLQEHSVVLVRGGRVKDLPGVRYHIVRGTLDAVGVKDRQQGRSKYGVKKPK > > The point 
of all this was with the proposed new extract method, > you just need: > > feature_seq = f.extract(record.seq) > > instead of: > > feature_seq = (record.seq[97998:98024] + record.seq[98561:98793] + > record.seq[69610:69724]).reverse_complement() > > which is in itself a slight simplification since you'd need to get > those coordinates from the sub features, worry about strands, etc. > > Peter > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From kellrott at gmail.com Wed Dec 2 15:12:53 2009 From: kellrott at gmail.com (Kyle Ellrott) Date: Wed, 2 Dec 2009 07:12:53 -0800 Subject: [Biopython] Getting the sequence for a SeqFeature In-Reply-To: <0EAFC700-CFDB-4D0B-882A-B9D0EF492172@gmail.com> References: <320fb6e00911060422u2d2742d5r7b5b1db98c991df5@mail.gmail.com> <320fb6e00911060447g779f2ac2i7739a28c3f4a4077@mail.gmail.com> <0EAFC700-CFDB-4D0B-882A-B9D0EF492172@gmail.com> Message-ID: On Wed, Dec 2, 2009 at 4:31 AM, Michael Thon wrote: > I was wondering if you have implemented this method yet and if so, is it in > a repository somewhere where I can try it? I was about to post a message on > this and I searched the archives first (!) and found this thread. I have > genbank genomic sequences and I need to get the transcript sequences for the > CDS features. > > It should be in the main GIT repository at http://github.com/biopython/biopython/ It's too new to have made an official release version yet.
Kyle From biopython at maubp.freeserve.co.uk Wed Dec 2 22:44:11 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 2 Dec 2009 22:44:11 +0000 Subject: [Biopython] Getting the sequence for a SeqFeature In-Reply-To: References: <320fb6e00911060422u2d2742d5r7b5b1db98c991df5@mail.gmail.com> <320fb6e00911060447g779f2ac2i7739a28c3f4a4077@mail.gmail.com> <0EAFC700-CFDB-4D0B-882A-B9D0EF492172@gmail.com> Message-ID: <320fb6e00912021444g11b27f81k8a3c8675ddc25169@mail.gmail.com> On Wed, Dec 2, 2009 at 3:12 PM, Kyle Ellrott wrote: > On Wed, Dec 2, 2009 at 4:31 AM, Michael Thon wrote: > >> I was wondering if you have implemented this method yet and if so, is it in >> a repository somewhere where I can try it? I was about to post a message on >> this and I searched the archives first (!) and found this thread. I have >> genbank genomic sequences and I need to get the transcript sequences for the >> CDS features. >> > > It should be in the main GIT repository at > http://github.com/biopython/biopython/ > It's too new to have made an official release version yet. > > Kyle Kyle has also been following the discussion on the dev mailing list, where it was mentioned this was now "on the trunk". See also: http://www.biopython.org/wiki/SourceCode Getting people using and testing the code now would be nice, especially if we hope to get a release out before too long. Peter From richard_w_g_price at academia.edu Thu Dec 3 01:21:26 2009 From: richard_w_g_price at academia.edu (Richard Price) Date: Wed, 2 Dec 2009 17:21:26 -0800 Subject: [Biopython] New Academia.edu feature for Biopython Message-ID: Dear Biopython members, I wanted to tell the list about a new feature on Academia.edu. Academia.edu launched 12 months ago and now helps 300,000 academics a month answer the question 'who's researching what?'
We have built a dedicated page on Academia.edu for the Biopython mailing list: http://lists.academia.edu/See-members-of-Biopython This page will show you fellow members already on Academia.edu. You can see their papers, research interests, and other information. Visit the link below, sign up with Academia.edu, and see who else from Biopython is on Academia.edu. http://lists.academia.edu/See-members-of-Biopython Richard Dr. Richard Price, post-doc, Philosophy Dept, Oxford University. Founder of Academia.edu From mike.thon at gmail.com Thu Dec 3 05:20:12 2009 From: mike.thon at gmail.com (Michael Thon) Date: Thu, 3 Dec 2009 06:20:12 +0100 Subject: [Biopython] can't compile version from github Message-ID: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> Don't know if this belongs on the dev mailing list or here... I just checked out a copy of biopython from the github repo and I tried to install it in an non-root directory to try out a new feature. here is the command I ran on the command line: python setup.py install --prefix=/Users/mike/biopython_dev Here is the offending part of the compile: gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -IBio -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/triemodule.c -o build/temp.macosx-10.6-universal-2.6/Bio/triemodule.o In file included from Bio/triemodule.c:3: Bio/trie.h:12: warning: function declaration isn't a prototype The installer did not find the numpy that I installed using easy_install so I continued without it and without Reportlab.
(Mac OS 10.6) Mike From biopython at maubp.freeserve.co.uk Thu Dec 3 10:19:32 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 3 Dec 2009 10:19:32 +0000 Subject: [Biopython] can't compile version from github In-Reply-To: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> References: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> Message-ID: <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com> On Thu, Dec 3, 2009 at 5:20 AM, Michael Thon wrote: > Don't know if this belongs on the dev mailing list or here... Tricky - I might have picked the dev list, but this is fine. > I just checked out a copy of biopython from the github repo and I > tried to install it in an non-root directory to try out a new feature. > > here is the command I ran on the command line: > python setup.py install --prefix=/Users/mike/biopython_dev > > Here is the offending part of the compile: > > gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -IBio -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/triemodule.c -o build/temp.macosx-10.6-universal-2.6/Bio/triemodule.o > In file included from Bio/triemodule.c:3: > Bio/trie.h:12: warning: function declaration isn't a prototype > > The installer did not find the numpy that I installed using easy_install > so I continued without it and without Reportlab. (Mac OS 10.6) > > Mike Ah - Snow Leopard. Are you using the Apple provided Python with Mac OS 10.6? Apple have as usual done odd things with Python, but people have reported getting it to work. Looking at the compiler flags I am puzzled about the inclusion of "-arch ppc" since Snow Leopard is x86 only. Could you give us the whole of the compile log? What you have shown just has a warning - no actual error.
Also, doing it in stages would be wiser:

#First remove old build files:
python setup.py clean
#Do the compile:
python setup.py build
#Run the unit tests:
python setup.py test
#To install under your home directory I'd use:
python setup.py install --prefix=/Users/mike/

Peter From mike.thon at gmail.com Thu Dec 3 12:17:17 2009 From: mike.thon at gmail.com (Michael Thon) Date: Thu, 3 Dec 2009 13:17:17 +0100 Subject: [Biopython] can't compile version from github In-Reply-To: <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com> References: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com> Message-ID: On Dec 3, 2009, at 11:19 AM, Peter wrote: > On Thu, Dec 3, 2009 at 5:20 AM, Michael Thon wrote: >> Don't know if this belongs on the dev mailing list or here... > > Tricky - I might have picked the dev list, but this is fine. > >> I just checked out a copy of biopython from the github repo and I >> tried to install it in an non-root directory to try out a new feature. >> >> here is the command I ran on the command line: >> python setup.py install --prefix=/Users/mike/biopython_dev >> >> Here is the offending part of the compile: >> >> gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -IBio -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/triemodule.c -o build/temp.macosx-10.6-universal-2.6/Bio/triemodule.o >> In file included from Bio/triemodule.c:3: >> Bio/trie.h:12: warning: function declaration isn't a prototype >> >> The installer did not find the numpy that I installed using easy_install >> so I continued without it and without Reportlab. (Mac OS 10.6) >> >> Mike > > Ah - Snow Leopard. Are you using the Apple provided Python with Mac OS 10.6? > Apple have as usual done odd things with Python, but people have > reported getting > it to work.
Yup, I'm using the Apple-provided python. Below is the full output of the build phase. ...so, as usual, I am off on a tangent instead of working on the stuff that actually needs to get done. I don't really need to get this installed. I just wanted to try the SeqFeature.extract method that you mentioned in a previous thread. I realized that I probably don't need to compile all the C extensions to get that to work so I opened up SeqFeature.py to see what this method looks like. I couldn't find it so I suppose this method has not made its way into the main git repo on github. In the end, I wrote one myself but it would still be good to compare its output to what you have in biopython. ...but I still have this biopython from github that won't compile and it probably should. So, if you have any ideas what might be wrong and how to fix it I can try it and report back. -Mike python setup.py build running build running build_py *** Numerical Python *** is either not installed or out of date. This package is optional, which means it is only used in a few specialized modules in Biopython. You probably don't need this if you are unsure. You can ignore this requirement, and install it later if you see ImportErrors. You can find Numerical Python at http://numpy.sourceforge.net/. Do you want to continue this installation? (Y/n) Y *** Reportlab *** is either not installed or out of date. This package is optional, which means it is only used in a few specialized modules in Biopython. You probably don't need this if you are unsure. You can ignore this requirement, and install it later if you see ImportErrors. You can find Reportlab at http://www.reportlab.org/downloads.html. Do you want to continue this installation? 
(Y/n) Y running build_ext building 'Bio.trie' extension creating build/temp.macosx-10.6-universal-2.6 creating build/temp.macosx-10.6-universal-2.6/Bio gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -IBio -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/triemodule.c -o build/temp.macosx-10.6-universal-2.6/Bio/triemodule.o In file included from Bio/triemodule.c:3: Bio/trie.h:12: warning: function declaration isn't a prototype Bio/triemodule.c:389: warning: initialization from incompatible pointer type Bio/triemodule.c: In function '_write_value_to_handle': Bio/triemodule.c:480: error: too few arguments to function 'PyMarshal_WriteObjectToString' Bio/triemodule.c:482: warning: passing argument 3 of 'PyString_AsStringAndSize' from incompatible pointer type In file included from Bio/triemodule.c:3: Bio/trie.h:12: warning: function declaration isn't a prototype Bio/triemodule.c:389: warning: initialization from incompatible pointer type Bio/triemodule.c: In function '_write_value_to_handle': Bio/triemodule.c:480: error: too few arguments to function 'PyMarshal_WriteObjectToString' Bio/triemodule.c:482: warning: passing argument 3 of 'PyString_AsStringAndSize' from incompatible pointer type In file included from Bio/triemodule.c:3: Bio/trie.h:12: warning: function declaration isn't a prototype Bio/triemodule.c:389: warning: initialization from incompatible pointer type Bio/triemodule.c: In function '_write_value_to_handle': Bio/triemodule.c:480: error: too few arguments to function 'PyMarshal_WriteObjectToString' Bio/triemodule.c:482: warning: passing argument 3 of 'PyString_AsStringAndSize'
from incompatible pointer type
lipo: can't open input file: /var/folders/wI/wIckOkhJHe0hxBDeAEI1VE+++TI/-Tmp-//ccemhRHR.out (No such file or directory)
error: command 'gcc-4.2' failed with exit status 1

From biopython at maubp.freeserve.co.uk Thu Dec 3 12:26:12 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 3 Dec 2009 12:26:12 +0000
Subject: [Biopython] can't compile version from github
In-Reply-To:
References: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com>
Message-ID: <320fb6e00912030426gc84e3f5w6e238638e9b40b2e@mail.gmail.com>

On Thu, Dec 3, 2009 at 12:17 PM, Michael Thon wrote:
>
> Yup, I'm using the Apple-provided python. Below is the full output of the build phase.

I'll take a look at it, but until I have a machine with Snow Leopard, solving this will be tricky. Any other Snow Leopard users please speak up.

> ...so, as usual, I am off on a tangent instead of working on the stuff that
> actually needs to get done. I don't really need to get this installed. I just
> wanted to try the SeqFeature.extract method that you mentioned in a
> previous thread. I realized that I probably don't need to compile all the
> C extensions to get that to work so I opened up SeqFeature.py to see
> what this method looks like. I couldn't find it so I suppose this method
> has not made its way into the main git repo on github. In the end, I
> wrote one myself but it would still be good to compare its output to
> what you have in biopython.

I'm not sure why you couldn't find this in the latest code from git. The method is in Bio/SeqFeature.py, search for "def extract":

http://github.com/biopython/biopython/blob/master/Bio/SeqFeature.py

You can probably take a working Biopython 1.52 install, and manually update just the Bio/SeqFeature.py file if you really need to. You could also try installing just the "pure Python" part of Biopython by hacking setup.py to set EXTENSIONS = [], as done for Jython.

> ...but I still have this biopython from github that won't compile and it > probably should. ?So, if you have any ideas what might be wrong and > how to fix it I can try it and report back. I'll try to get back to you on this shortly. Peter From johncumbers at gmail.com Fri Dec 4 07:50:43 2009 From: johncumbers at gmail.com (John Cumbers) Date: Thu, 3 Dec 2009 23:50:43 -0800 Subject: [Biopython] MuscleCommandline and phyiout In-Reply-To: <320fb6e00912011119o62d5ef51t1f27f7a4c310a026@mail.gmail.com> References: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> <320fb6e00912011119o62d5ef51t1f27f7a4c310a026@mail.gmail.com> Message-ID: Many thanks Peter, Sorry for delayed reply, I filter this list and forgot to check the folder :) Best wishes, John John Cumbers, Ph.D Candidate NASA Ames Research Center Mail Stop 239-20, Bldg N239 Rm 373 Moffett Field, CA 94035, USA. cell +1 (401) 523 8190, office +1 (650) 604-1914, fax +1 (650) 604-1088 Graduate Program in Molecular Biology, Cell Biology, and Biochemistry Brown University, Box G-W Providence, RI, 02912, USA On Tue, Dec 1, 2009 at 11:19 AM, Peter wrote: > On Tue, Dec 1, 2009 at 9:52 AM, Peter > wrote: > > > > It looks like an undocumented option, much like -phyi (which I guessed) > > for PHYLIP interlaced and -phys for PHYLIP sequential. These match > > the documented format options (e.g. -msf, -html, -clw and -clwstrict). > > i.e. You can use this: > > > > "muscle -in myinputfile -phyi -out myoutputfile" > > > > I think we should ask the MUSCLE author which of these > > undocumented arguments are actually supported rather than > > adding them all. > > Robert Edgar agreed the documentation was out of sync, and has > confirmed these are safe arguments to include in our wrapper. 
> > Peter > From biopython at maubp.freeserve.co.uk Fri Dec 4 12:32:23 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 4 Dec 2009 12:32:23 +0000 Subject: [Biopython] MuscleCommandline and phyiout In-Reply-To: References: <320fb6e00912010123j984fcc2s34499263295164dc@mail.gmail.com> <320fb6e00912010152r6e2f50e4v90091a98524481ed@mail.gmail.com> <320fb6e00912011119o62d5ef51t1f27f7a4c310a026@mail.gmail.com> Message-ID: <320fb6e00912040432k628392c5n57fc79a3b68a88eb@mail.gmail.com> On Fri, Dec 4, 2009 at 7:50 AM, John Cumbers wrote: > Many thanks Peter, > Sorry for delayed reply, I filter this list and forgot to check the folder > :) > Best wishes, > John Bug filed, http://bugzilla.open-bio.org/show_bug.cgi?id=2961 From brynedal at gmail.com Fri Dec 4 19:01:49 2009 From: brynedal at gmail.com (Boel Brynedal) Date: Fri, 4 Dec 2009 14:01:49 -0500 Subject: [Biopython] Problems installing Biopython Message-ID: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> Dear List, I am trying to install Biopython from source on my Mac OS X v.10.6.1. I ran into some problems when building Biopython, thought that it might be due to the fact that I am missing xcore tools (gcc-4.0 was missing) so I installed version 2.5 of xcore. This is however the output as I try to build biopython after installing xcore: $ python setup.py install running install running build running build_py creating build a lot of creating, copying etc etc... running build_ext building 'Bio.clistfns' extension creating build/temp.macosx-10.3-fat-2.6 creating build/temp.macosx-10.3-fat-2. 
6/Bio gcc-4.0 -arch ppc -arch i386 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -O3 -I/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c Bio/clistfnsmodule.c -o build/temp.macosx-10.3-fat-2.6/Bio/clistfnsmodule.o In file included from /usr/include/architecture/i386/math.h:626, from /usr/include/math.h:28, from /Library/Frameworks/Python.framework/Versions/2.6/include/python2.6/pyport.h:235, from /Library/Frameworks/Python.framework/Versions/2.6/include/python2.6/Python.h:58, from Bio/clistfnsmodule.c:10: /usr/include/AvailabilityMacros.h:108:14: warning: #warning Building for Intel with Mac OS X Deployment Target < 10.4 is invalid. Compiling with an SDK that doesn't seem to exist: /Developer/SDKs/MacOSX10.4u.sdk Please check your Xcode installation gcc-4.0 -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -g -bundle -undefined dynamic_lookup build/temp.macosx-10.3-fat-2.6/Bio/clistfnsmodule.o -o build/lib.macosx-10.3-fat-2.6/Bio/clistfns.so ld: library not found for -lbundle1.o collect2: ld returned 1 exit status ld: library not found for -lbundle1.o collect2: ld returned 1 exit status lipo: can't open input file: /var/folders/fy/fySdohlPEBSVFU-YIflSGk+++TI/-Tmp-//cc1Cm89x.out (No such file or directory) error: command 'gcc-4.0' failed with exit status 1 I am new to using Mac, and not the most talented computer nerd, but it seems like we have two problems here: the systems seems to be looking for MacOSX10.4u.sdk, when the xcore tools I've installed contain MacOSX10.5.sdk and MacOSX10.6.sdk. Why does it look for the earlier version, and what can I do about it? #warning Building for Intel with Mac OS X Deployment Target < 10.4 is invalid - this I do not understand at all. I'm using Python 2.6.4 and I'm trying to install biopython-1.52. Any tips, comments or ideas would be greatly appreciated! 
Thank you, Boel From mike.thon at gmail.com Fri Dec 4 20:27:41 2009 From: mike.thon at gmail.com (Michael Thon) Date: Fri, 4 Dec 2009 21:27:41 +0100 Subject: [Biopython] Problems installing Biopython In-Reply-To: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> References: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> Message-ID: <6D780E44-39B7-4CFF-975B-D72722A5E5B2@gmail.com> Hi Boel - I installed biopython using easy_install on Mac OS 10.6.2 and didn't have any problems. I don't know what xcore is. Do you mean Xcode? the version I have is 3.2.1 and I downloaded it from developer.apple.com -Mike From p.j.a.cock at googlemail.com Fri Dec 4 20:52:46 2009 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 4 Dec 2009 20:52:46 +0000 Subject: [Biopython] Problems installing Biopython In-Reply-To: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> References: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> Message-ID: <320fb6e00912041252u705058c2ue06c4b88d4e34059@mail.gmail.com> On Fri, Dec 4, 2009 at 7:01 PM, Boel Brynedal wrote: > Dear List, > > I am new to using Mac, and not the most talented computer > nerd, but it seems like we have two problems here: > the systems seems to be looking for MacOSX10.4u.sdk, > when the xcore tools I've installed contain MacOSX10.5.sdk > and MacOSX10.6.sdk. Why does it look > for the earlier version, and what can I do about it? > #warning Building for Intel with Mac OS X Deployment Target < 10.4 is > invalid - this I do not understand at all. Sadly there are some general issues with Python on Snow Leopard (this isn't just a Biopython issue). Right now as far as I know none of our core developers have Snow Leopard (10.6) so it is hard to help. However, with Leopard (10.4), when installing XCode I had the option of installing the 10.3 headers too. Could you re-install XCode and check this time for an option to include the headers for 10.4 (and/or 10.3)? 
Peter From mjldehoon at yahoo.com Sat Dec 5 15:59:41 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sat, 5 Dec 2009 07:59:41 -0800 (PST) Subject: [Biopython] CompareAce parser Message-ID: <613008.1949.qm@web62407.mail.re1.yahoo.com> Hi everybody, In Bio.Motif, there is a nominal parser for CompareAce files. However, this parser has almost no functionality. Would anybody mind if we deprecate this module for the next release? Whereas we usually might declare the module obsolete before deprecating it, in this case I think we can deprecate it straightaway since currently this parser does very little. --Michiel. From bartek at rezolwenta.eu.org Sun Dec 6 00:58:41 2009 From: bartek at rezolwenta.eu.org (Bartek Wilczynski) Date: Sun, 6 Dec 2009 01:58:41 +0100 Subject: [Biopython] CompareAce parser In-Reply-To: <613008.1949.qm@web62407.mail.re1.yahoo.com> References: <613008.1949.qm@web62407.mail.re1.yahoo.com> Message-ID: <8b34ec180912051658s2acb9da1na03a8b0577fa6a8d@mail.gmail.com> Hi, I don't have anything against deprecating, even though I don't the advantages of doing so. (the module is trivial, but so is the output of compareACE: a number giving a score between motifs. The score, however is not trivial and I wouldn't want to reimplement it.) cheers Bartek On Sat, Dec 5, 2009 at 4:59 PM, Michiel de Hoon wrote: > Hi everybody, > > In Bio.Motif, there is a nominal parser for CompareAce files. However, this parser has almost no functionality. Would anybody mind if we deprecate this module for the next release? Whereas we usually might declare the module obsolete before deprecating it, in this case I think we can deprecate it straightaway since currently this parser does very little. > > --Michiel. 
> > > > _______________________________________________ > Biopython mailing list ?- ?Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > > -- Bartek Wilczynski ================== Postdoctoral fellow EMBL, Furlong group Meyerhoffstrasse 1, 69012 Heidelberg, Germany tel: +49 6221 387 8433 From biopython at maubp.freeserve.co.uk Sun Dec 6 14:10:22 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 6 Dec 2009 14:10:22 +0000 Subject: [Biopython] CompareAce parser In-Reply-To: <8b34ec180912051658s2acb9da1na03a8b0577fa6a8d@mail.gmail.com> References: <613008.1949.qm@web62407.mail.re1.yahoo.com> <8b34ec180912051658s2acb9da1na03a8b0577fa6a8d@mail.gmail.com> Message-ID: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> On Sun, Dec 6, 2009 at 12:58 AM, Bartek Wilczynski wrote: > Hi, > > I don't have anything against deprecating, even though I don't the > advantages of doing so. (the module is trivial, but so is the output > of compareACE: a number giving a score between motifs. The score, > however is not trivial and I wouldn't want to reimplement it.) > > cheers > ?Bartek So the reason this parser is so simple and has almost no functionality is just a reflection of the simplicity of the CompareAce files? If so, I'd say leave the parser in. Peter From mjldehoon at yahoo.com Sun Dec 6 14:31:47 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sun, 6 Dec 2009 06:31:47 -0800 (PST) Subject: [Biopython] CompareAce parser In-Reply-To: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> Message-ID: <889087.95522.qm@web62406.mail.re1.yahoo.com> > So the reason this parser is so simple and has almost no > functionality is just a reflection of the simplicity of > the CompareAce files? Not exactly. CompareAce files can have different outputs, depending on the query given to CompareAce. The simplest query returns only one number. The current CompareAce parser can only parse this output. 
In other words,

>>> input = open("test.out")
>>> from Bio.Motif.Parsers import AlignAce
>>> AlignAce.CompareAceParser().parse(input)
0.92130000000000001

is equivalent to

>>> input = open("test.out")
>>> float(input.read())
0.92130000000000001

I am not against having a CompareAce parser in Biopython, but if we have such a parser it should be able to handle more output formats than just the trivial output format.

With this in mind, I think we should either extend the CompareAce parser to handle cases that cannot be trivially handled by a simple Python command, or remove it altogether. If we do keep it in Biopython, there should also be some documentation to cover it, and perhaps a unit test.

--Michiel

--- On Sun, 12/6/09, Peter wrote:

> From: Peter
> Subject: Re: [Biopython] CompareAce parser
> To: "Bartek Wilczynski"
> Cc: "Michiel de Hoon" , biopython at biopython.org
> Date: Sunday, December 6, 2009, 9:10 AM
> On Sun, Dec 6, 2009 at 12:58 AM, Bartek Wilczynski wrote:
> > Hi,
> >
> > I don't have anything against deprecating, even though I don't the
> > advantages of doing so. (the module is trivial, but so is the output
> > of compareACE: a number giving a score between motifs. The score,
> > however is not trivial and I wouldn't want to reimplement it.)
> >
> > cheers
> > Bartek
>
> So the reason this parser is so simple and has almost no functionality
> is just a reflection of the simplicity of the CompareAce files? If so, I'd
> say leave the parser in.
> > Peter > From brynedal at gmail.com Sun Dec 6 21:38:55 2009 From: brynedal at gmail.com (Boel Brynedal) Date: Sun, 6 Dec 2009 16:38:55 -0500 Subject: [Biopython] Problems installing Biopython In-Reply-To: <320fb6e00912041252u705058c2ue06c4b88d4e34059@mail.gmail.com> References: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> <320fb6e00912041252u705058c2ue06c4b88d4e34059@mail.gmail.com> Message-ID: <2167c9200912061338g164a8f61x25df698cab5c0548@mail.gmail.com> Hi Peter, I downloaded XCode again and included the 10.4 support - this seem to have fixed it. Thank you very much! Boel 2009/12/4 Peter Cock > On Fri, Dec 4, 2009 at 7:01 PM, Boel Brynedal wrote: > > Dear List, > > > > I am new to using Mac, and not the most talented computer > > nerd, but it seems like we have two problems here: > > the systems seems to be looking for MacOSX10.4u.sdk, > > when the xcore tools I've installed contain MacOSX10.5.sdk > > and MacOSX10.6.sdk. Why does it look > > for the earlier version, and what can I do about it? > > #warning Building for Intel with Mac OS X Deployment Target < 10.4 is > > invalid - this I do not understand at all. > > Sadly there are some general issues with Python on > Snow Leopard (this isn't just a Biopython issue). Right > now as far as I know none of our core developers have > Snow Leopard (10.6) so it is hard to help. > > However, with Leopard (10.4), when installing XCode I > had the option of installing the 10.3 headers too. Could > you re-install XCode and check this time for an option > to include the headers for 10.4 (and/or 10.3)? 
> > Peter > From biopython at maubp.freeserve.co.uk Sun Dec 6 21:49:28 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 6 Dec 2009 21:49:28 +0000 Subject: [Biopython] Problems installing Biopython In-Reply-To: <2167c9200912061338g164a8f61x25df698cab5c0548@mail.gmail.com> References: <2167c9200912041101m5df4d144n8cdff208e8dc3b1c@mail.gmail.com> <320fb6e00912041252u705058c2ue06c4b88d4e34059@mail.gmail.com> <2167c9200912061338g164a8f61x25df698cab5c0548@mail.gmail.com> Message-ID: <320fb6e00912061349v2dbc3586g185701894e8e7c05@mail.gmail.com> > 2009/12/4 Peter Cock >> However, with Leopard (10.4), when installing XCode I >> had the option of installing the 10.3 headers too. Could >> you re-install XCode and check this time for an option >> to include the headers for 10.4 (and/or 10.3)? Minor typo - Leopard is Mac OS 10.5 of course ;) On Sun, Dec 6, 2009 at 9:38 PM, Boel Brynedal wrote: > Hi Peter, > > I downloaded XCode again and included the 10.4 support - this > seem to have fixed it. > Thank you very much! > > Boel Excellent - thanks for letting us know. Peter From biopython at maubp.freeserve.co.uk Mon Dec 7 12:14:35 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 7 Dec 2009 12:14:35 +0000 Subject: [Biopython] Deprecating old Bio.GFF module in preparation for new code? Message-ID: <320fb6e00912070414i3e4503dt311953b99d2efeb5@mail.gmail.com> Dear all, Is anyone using the "old" Bio.GFF module in Biopython? This was written by Michael Hoffman back 2002, and allowed access to a General Feature Format (GFF) MySQL database created with BioPerl's Bio::DB:GFF. It may need updating to work with the latest BioPerl, or GFF3 files (I don't know). This old code did not include any GFF parser of its own. As those on the dev mailing list will know, Brad Chapman has been working on a GFF parser (covering GFF3, and the older GFF2 and GTF files). The obvious place to put this is under Bio.GFF. 
I would therefore like to propose deprecating the current Bio.GFF code in the next release of Biopython (hopefully this month), which will allow us to replace it with Brad's new parser in the subsequent release. If anyone is using the old module, please let us know now. Thank you, Peter From biopython at maubp.freeserve.co.uk Mon Dec 7 14:02:51 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 7 Dec 2009 14:02:51 +0000 Subject: [Biopython] can't compile version from github In-Reply-To: <320fb6e00912030426gc84e3f5w6e238638e9b40b2e@mail.gmail.com> References: <909D70B5-2F42-46A8-9F57-407CA362E4EB@gmail.com> <320fb6e00912030219j42948b7fue5a3fd8610d66641@mail.gmail.com> <320fb6e00912030426gc84e3f5w6e238638e9b40b2e@mail.gmail.com> Message-ID: <320fb6e00912070602i3a881fd7nd758d71ce0d1a3f4@mail.gmail.com> On Thu, Dec 3, 2009 at 12:26 PM, Peter wrote: > On Thu, Dec 3, 2009 at 12:17 PM, Michael Thon wrote: > >> ...but I still have this biopython from github that won't compile and it >> probably should. ?So, if you have any ideas what might be wrong and >> how to fix it I can try it and report back. > > I'll try to get back to you on this shortly. > Hi Michael, As discussed on the other thread, could you try reinstalling XCode on Snow Leopard (Mac OS X 10.6), but this time tick the option to include the older headers (Tiger 10.4 SDK - not sure exactly what it is called). http://lists.open-bio.org/pipermail/biopython/2009-December/005906.html Peter From iwan.grin at googlemail.com Tue Dec 8 18:52:13 2009 From: iwan.grin at googlemail.com (Iwan Grin) Date: Tue, 8 Dec 2009 19:52:13 +0100 Subject: [Biopython] Parsing problem Message-ID: Hi all, I am having a little problem while trying to parse a GenBank (or rather GenProt) file using BioPython. I am trying to extract the position on the genome from the "coded_by" qualifier of the CDS feature of a protein. 
The "coded_by" string in this specific case looks like this:

'complement(NC_012967.1:3622110..3624728)'

Now, when I run

Bio.GFF.easy.LocationFromString('complement(NC_012967.1:3622110..3624728)' )

I get

File "/usr/lib/pymodules/python2.6/Bio/GFF/easy.py", line 419, in __init__
    list.__init__(self, [int(location_str)-1]) # zero based, nip it in the bud
ValueError: invalid literal for int() with base 10: 'NC_012967.1:3622110..3624728'

Is there another way to parse this location string or do I have to cook up some kind of custom RegExp?

Iwan

P.S.: Code snippet:

from Bio import Entrez
from Bio import SeqIO
from Bio import GFF

gi = 254163455
handle = Entrez.efetch(db="protein", id=gi, rettype="gb")
record= SeqIO.read(handle,"genbank")
handle.close()
for feature in record.features:
    if(feature.type=="CDS" and feature.qualifiers.has_key("coded_by")):
        print feature.qualifiers["coded_by"][0],
        loc=GFF.easy.LocationFromString(feature.qualifiers["coded_by"][0])
        print loc.start(),loc.end(), loc.complement

From biopython at maubp.freeserve.co.uk Tue Dec 8 22:43:29 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 8 Dec 2009 22:43:29 +0000
Subject: [Biopython] Parsing problem
In-Reply-To:
References:
Message-ID: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com>

On Tue, Dec 8, 2009 at 6:52 PM, Iwan Grin wrote:
> Hi all,
>
> I am having a little problem while trying to parse a GenBank (or rather
> GenProt) file using BioPython. I am trying to extract the position on the
> genome from the "coded_by" qualifier of the CDS feature of a protein.
> > The "coded_by" string in this specific case looks like this: > > 'complement(NC_012967.1: > 3622110..3624728)' Oh, one of those tricky cross references to another file :( > Now, when I run > > Bio.GFF.easy.LocationFromString('complement(NC_012967.1:3622110..3624728)' ) > This is interesting timing - Bio.GFF.easy has a lot of code which duplicated the EMBL/GenBank parsing, and I'm actually suggesting we deprecate it in the next release (!). What made you use Bio.GFF in the first place? It has never been documented. That said, it does look like you found a bug in Bio.GFF.easy ... In the long term, I think Bio.GenBank would be a better place to put this functionality (and reworking the location parsing is on the todo list, partly as it is currently a speed bottleneck). Peter From biopython at maubp.freeserve.co.uk Tue Dec 8 23:53:29 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 8 Dec 2009 23:53:29 +0000 Subject: [Biopython] Parsing problem In-Reply-To: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> Message-ID: <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> On Tue, Dec 8, 2009 at 10:43 PM, Peter wrote: > On Tue, Dec 8, 2009 at 6:52 PM, Iwan Grin wrote: >> Hi all, >> >> I am having a little problem while trying to parse a GenBank (or rather >> GenProt) file using BioPython. I am trying to extract the position on the >> genome from the "coded_by" qualifier of the CDS feature of a protein. 
>> >> The "coded_by" string in this specific case looks like this: >> >> 'complement(NC_012967.1: >> 3622110..3624728)' > > Oh, one of those tricky cross references to another file :( It looks like the Bio.GFF.easy code expects that to be formatted as NC_012967.1:complement(3622110..3624728) and not as complement(NC_012967.1:3622110..3624728) Peter From iwan.grin at googlemail.com Wed Dec 9 12:33:09 2009 From: iwan.grin at googlemail.com (Iwan Grin) Date: Wed, 9 Dec 2009 13:33:09 +0100 Subject: [Biopython] Parsing problem In-Reply-To: <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> Message-ID: Hi Peter, Thank you for your reply. I am new to BioPython and stumbled upon GFF.easy while searching through the API docs. Actually, What I wanted was a way to parse that location string into an SeqFeature-like thing from which I could get start, end and strand.Unfortunately I could not find the correct parser in Bio.Genbank - any suggestions are welcome. I agree with you that Bio.GFF.easy expects the Accession number before the complement. (Actually for my purpose I do not need the accession number at all.) Iwan 2009/12/9 Peter > On Tue, Dec 8, 2009 at 10:43 PM, Peter > wrote: > > On Tue, Dec 8, 2009 at 6:52 PM, Iwan Grin > wrote: > >> Hi all, > >> > >> I am having a little problem while trying to parse a GenBank (or rather > >> GenProt) file using BioPython. I am trying to extract the position on > the > >> genome from the "coded_by" qualifier of the CDS feature of a protein. 
> >> > >> The "coded_by" string in this specific case looks like this: > >> > >> 'complement(NC_012967.1: > >> 3622110..3624728)' > > > > Oh, one of those tricky cross references to another file :( > > It looks like the Bio.GFF.easy code expects that to be formatted > as NC_012967.1:complement(3622110..3624728) and not as > complement(NC_012967.1:3622110..3624728) > > Peter > From biopython at maubp.freeserve.co.uk Wed Dec 9 13:25:44 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 9 Dec 2009 13:25:44 +0000 Subject: [Biopython] Parsing problem In-Reply-To: References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> Message-ID: <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> On Wed, Dec 9, 2009 at 12:33 PM, Iwan Grin wrote: > Hi Peter, Thank you for your reply. > > I am new to BioPython and stumbled upon GFF.easy while searching through the > API docs. Actually, What I wanted was a way to parse that location string > into an SeqFeature-like thing from which I could get start, end and > strand.Unfortunately I could not find the correct parser in Bio.Genbank - > any suggestions are welcome. Right now Bio.GenBank doesn't really expose the location parsing in an easy to use way like Bio.GFF.easy does. > I agree with you that Bio.GFF.easy expects the Accession number before the > complement. (Actually for my purpose I do not need the accession number at > all.) The pragmatic solution is to write your own quick parser to pull out the coordinates (if that is all you need). We'll have to look at this as part of the discussion of what to do with the old Bio.GFF (as part of planning for Brad's new GFF parsing code). 
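A minimal sketch of the kind of quick parser suggested above, assuming the "coded_by" value is a single plain range, optionally prefixed with an accession and optionally wrapped in complement(...). The name parse_coded_by is hypothetical and not part of Biopython; join(...) locations and fuzzy positions such as <1..200 are deliberately rejected rather than guessed at. Coordinates are returned as they appear in the GenBank file (1-based, inclusive).

```python
import re

# Optional "accession:" prefix, then "start..end" as written in the GenBank file.
_SIMPLE_RANGE = re.compile(
    r"^(?:(?P<acc>[A-Za-z0-9_.]+):)?(?P<start>\d+)\.\.(?P<end>\d+)$"
)


def parse_coded_by(text):
    """Return (accession, start, end, strand) for a simple coded_by string.

    Hypothetical helper, not a Biopython function. strand is -1 when the
    range is wrapped in complement(...), otherwise +1. accession is None
    when the string has no "accession:" prefix.
    """
    # Remove any whitespace introduced by line wrapping in the flat file.
    text = "".join(text.split())
    strand = 1
    prefix = "complement("
    if text.startswith(prefix) and text.endswith(")"):
        strand = -1
        text = text[len(prefix):-1]
    match = _SIMPLE_RANGE.match(text)
    if match is None:
        raise ValueError("Unsupported location string: %r" % text)
    return (
        match.group("acc"),
        int(match.group("start")),
        int(match.group("end")),
        strand,
    )


print(parse_coded_by("complement(NC_012967.1:3622110..3624728)"))
```

For the string from this thread the helper returns ('NC_012967.1', 3622110, 3624728, -1). Anything more complicated raises ValueError, which seems safer than silently returning wrong coordinates.
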
Peter

From chapmanb at 50mail.com Wed Dec 9 13:38:02 2009
From: chapmanb at 50mail.com (Brad Chapman)
Date: Wed, 9 Dec 2009 08:38:02 -0500
Subject: [Biopython] Parsing problem
In-Reply-To: <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com>
References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com>
Message-ID: <20091209133802.GB79820@sobchak.mgh.harvard.edu>

Iwan and Peter;

> > I am new to BioPython and stumbled upon GFF.easy while searching through the
> > API docs. Actually, What I wanted was a way to parse that location string
> > into an SeqFeature-like thing from which I could get start, end and
> > strand.Unfortunately I could not find the correct parser in Bio.Genbank -
> > any suggestions are welcome.
>
> Right now Bio.GenBank doesn't really expose the location parsing in an
> easy to use way like Bio.GFF.easy does.

If you don't like ugly code, please avert your eyes now. This will work with the standard GenBank parsing and is definitely not future proof since it involves using private members. However, it'll work for something quick n' dirty:

from Bio.GenBank import _FeatureConsumer
from Bio.SeqFeature import SeqFeature

def gb_string_to_feature(content, use_fuzziness=True):
    """Convert a GenBank location string into a SeqFeature.
    """
    consumer = _FeatureConsumer(use_fuzziness)
    consumer._cur_feature = SeqFeature()
    consumer.location(content)
    return consumer._cur_feature

print gb_string_to_feature('complement(NC_012967.1:3622110..3624728)')

Hope this helps,
Brad

From bartek at rezolwenta.eu.org Wed Dec 9 14:43:23 2009
From: bartek at rezolwenta.eu.org (Bartek Wilczynski)
Date: Wed, 9 Dec 2009 15:43:23 +0100
Subject: [Biopython] CompareAce parser
In-Reply-To: <889087.95522.qm@web62406.mail.re1.yahoo.com>
References: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> <889087.95522.qm@web62406.mail.re1.yahoo.com>
Message-ID: <8b34ec180912090643m58867d29ha7d9cdd6e59e4bb@mail.gmail.com>

Hi Michiel,

I haven't given enough consideration to the maintenance costs of having parsers like this one in biopython. I think you are right that it's not useful in its current state, and I don't think it's worth putting efforts into improving it. There are already other methods of motif comparison implemented in bio.Motif and if I was to choose an external motif comparison software to support in biopython, I would vote for the STAMP tool from Benos lab. So, in conclusion, I think it would make sense to deprecate the CompareAce parser.

cheers
Bartek

On Sun, Dec 6, 2009 at 3:31 PM, Michiel de Hoon wrote:
>> So the reason this parser is so simple and has almost no
>> functionality is just a reflection of the simplicity of
>> the CompareAce files?
>
> Not exactly. CompareAce files can have different outputs, depending on the query given to CompareAce. The simplest query returns only one number. The current CompareAce parser can only parse this output.
In other words, > >>>> input = open("test.out") >>>> from Bio.Motif.Parsers import AlignAce >>>> AlignAce.CompareAceParser().parse(input) > 0.92130000000000001 > > is equivalent to > >>>> input = open("test.out") >>>> float(input.read()) > 0.92130000000000001 > > I am not against having a CompareAce parser in Biopython, but if we have such a parser it should be able to handle more output formats than just the trivial output format. > > With this in mind, I think we should either extend the CompareAce parser to handle cases that cannot be trivially handled by a simple Python command, or remove it altogether. If we do keep it in Biopython, there should also be some documentation to cover it, and perhaps a unit test. > > --Michiel > > --- On Sun, 12/6/09, Peter wrote: > >> From: Peter >> Subject: Re: [Biopython] CompareAce parser >> To: "Bartek Wilczynski" >> Cc: "Michiel de Hoon" , biopython at biopython.org >> Date: Sunday, December 6, 2009, 9:10 AM >> On Sun, Dec 6, 2009 at 12:58 AM, >> Bartek Wilczynski >> >> wrote: >> > Hi, >> > >> > I don't have anything against deprecating, even though >> I don't the >> > advantages of doing so. (the module is trivial, but so >> is the output >> > of compareACE: a number giving a score between motifs. >> The score, >> > however is not trivial and I wouldn't want to >> reimplement it.) >> > >> > cheers >> > ?Bartek >> >> So the reason this parser is so simple and has almost no >> functionality >> is just a reflection of the simplicity of the CompareAce >> files? If so, I'd >> say leave the parser in. 
>> >> Peter >> > > > > > _______________________________________________ > Biopython mailing list ?- ?Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > > -- Bartek Wilczynski ================== Postdoctoral fellow EMBL, Furlong group Meyerhoffstrasse 1, 69012 Heidelberg, Germany tel: +49 6221 387 8433 From iwan.grin at googlemail.com Wed Dec 9 14:51:20 2009 From: iwan.grin at googlemail.com (Iwan Grin) Date: Wed, 9 Dec 2009 15:51:20 +0100 Subject: [Biopython] Parsing problem In-Reply-To: <20091209133802.GB79820@sobchak.mgh.harvard.edu> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> <20091209133802.GB79820@sobchak.mgh.harvard.edu> Message-ID: 2009/12/9 Brad Chapman > Iwan and Peter; > > > > I am new to BioPython and stumbled upon GFF.easy while searching > through the > > > API docs. Actually, What I wanted was a way to parse that location > string > > > into an SeqFeature-like thing from which I could get start, end and > > > strand.Unfortunately I could not find the correct parser in Bio.Genbank > - > > > any suggestions are welcome. > > > > Right now Bio.GenBank doesn't really expose the location parsing in an > > easy to use way like Bio.GFF.easy does. > > If you don't like ugly code, please avert your eyes now. This will > work with the standard GenBank parsing and is definitely not future > proof since it involves using private members. However, it'll work > for something quick n' dirty: > > from Bio.GenBank import _FeatureConsumer > from Bio.SeqFeature import SeqFeature > > def gb_string_to_feature(content, use_fuzziness=True): > """Convert a GenBank location string into a SeqFeature. 
> """ > consumer = _FeatureConsumer(use_fuzziness) > consumer._cur_feature = SeqFeature() > consumer.location(content) > return consumer._cur_feature > > print gb_string_to_feature('complement(NC_012967.1:3622110..3624728)') > > Hope this helps, > Brad > Brad, Thank you very much! As much as this is a hack, it works for what I want to have. I guess for future proofness, either the parsers from Bio.GenBank should be exposed, or the coded_by qualifier should be parsed as location by default, although I am not sure how well the latter idea fits into the present data structure. Iwan From biopython at maubp.freeserve.co.uk Wed Dec 9 15:05:39 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 9 Dec 2009 15:05:39 +0000 Subject: [Biopython] CompareAce parser In-Reply-To: <8b34ec180912090643m58867d29ha7d9cdd6e59e4bb@mail.gmail.com> References: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> <889087.95522.qm@web62406.mail.re1.yahoo.com> <8b34ec180912090643m58867d29ha7d9cdd6e59e4bb@mail.gmail.com> Message-ID: <320fb6e00912090705h691dd0a3u32b6e8760570d3e1@mail.gmail.com> On Wed, Dec 9, 2009 at 2:43 PM, Bartek Wilczynski wrote: > Hi Michiel, > > I haven't given enough consideration to the maintenance costs of > having parsers like this one in biopython. I think you are right that > it's not useful in its current state, and I don't think it's worth > putting efforts into improving it. There are already other methods of > motif comparison implemented in Bio.Motif and if I was to choose an > external motif comparison software to support in biopython, I would > vote for the STAMP tool from Benos lab. So, in conclusion, I think it > would make sense to deprecate the CompareAce parser. > > cheers > Bartek I hadn't looked to see just how simple the files and parser were ;) Do you want to go ahead and make the deprecation Bartek?
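[Editorial aside, not part of Peter's message.] A minimal sketch of what deprecating a parser this trivial could look like, assuming a plain DeprecationWarning raised through the standard warnings module. The function name read_compareace_score is hypothetical and is not a Biopython API; the sketch relies only on the point made earlier in this thread that the CompareAce output is a single number.

```python
import io
import warnings


def read_compareace_score(handle):
    """Hypothetical stand-in for a deprecated single-score parser."""
    # Tell callers the function is going away and what to use instead.
    warnings.warn(
        "read_compareace_score is deprecated; use float(handle.read()) instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    # float() ignores surrounding whitespace, including a trailing newline.
    return float(handle.read())


# Capture the warning so the demonstration does not depend on warning filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    score = read_compareace_score(io.StringIO("0.9213\n"))

print(score)
print([w.category.__name__ for w in caught])
```

The in-memory handle stands in for the "test.out" file used in the thread; the real deprecation in Biopython may well have been done differently.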
Thanks, Peter From biopython at maubp.freeserve.co.uk Wed Dec 9 15:26:18 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 9 Dec 2009 15:26:18 +0000 Subject: [Biopython] Parsing problem In-Reply-To: References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> <20091209133802.GB79820@sobchak.mgh.harvard.edu> Message-ID: <320fb6e00912090726w95cfa8bi542f6227f84c888b@mail.gmail.com> On Wed, Dec 9, 2009 at 2:51 PM, Iwan Grin wrote: > Brad, Thank you very much! > > as much as this is a hack, it works for what I want to have. I guess for > future proofness, either the parsers from Bio.GenBank should be exposed, or > the coded_by qualifier should be parsed as location by default, although I > am not sure how well the latter idea fits into the present data structure. Brad's trick still work in Biopython 1.53 at the very least. I think we'll try and make the location parser more accessible in future, but changing the parsing of "coded_by" qualifiers would risk breaking existing user scripts. Peter From iwan.grin at googlemail.com Wed Dec 9 15:55:21 2009 From: iwan.grin at googlemail.com (Iwan Grin) Date: Wed, 9 Dec 2009 16:55:21 +0100 Subject: [Biopython] Parsing problem In-Reply-To: <320fb6e00912090726w95cfa8bi542f6227f84c888b@mail.gmail.com> References: <320fb6e00912081443u478afa02pc290c19ae14e21cb@mail.gmail.com> <320fb6e00912081553h5091715dpb0c345bf4f8c3dfb@mail.gmail.com> <320fb6e00912090525j399c28e8w15e6fdea61b14133@mail.gmail.com> <20091209133802.GB79820@sobchak.mgh.harvard.edu> <320fb6e00912090726w95cfa8bi542f6227f84c888b@mail.gmail.com> Message-ID: 2009/12/9 Peter > On Wed, Dec 9, 2009 at 2:51 PM, Iwan Grin > wrote: > > Brad, Thank you very much! > > > > as much as this is a hack, it works for what I want to have. 
I guess for > > future proofness, either the parsers from Bio.GenBank should be exposed, > or > > the coded_by qualifier should be parsed as location by default, although > I > > am not sure how well the latter idea fits into the present data > structure. > > Brad's trick still work in Biopython 1.53 at the very least. I think we'll > try and make the location parser more accessible in future, but > changing the parsing of "coded_by" qualifiers would risk breaking > existing user scripts. > > Peter > I would suggest to add a new "coded_by" feature and leave the qualifier as it is. This should minimize the risk of breaking stuff. On the other hand, This feature would be pretty specific for CDS in Genbank Protein files. Iwan From bartek at rezolwenta.eu.org Wed Dec 9 16:33:58 2009 From: bartek at rezolwenta.eu.org (Bartek Wilczynski) Date: Wed, 9 Dec 2009 17:33:58 +0100 Subject: [Biopython] CompareAce parser In-Reply-To: <320fb6e00912090705h691dd0a3u32b6e8760570d3e1@mail.gmail.com> References: <320fb6e00912060610w34ea50b3hc828f7b47909f135@mail.gmail.com> <889087.95522.qm@web62406.mail.re1.yahoo.com> <8b34ec180912090643m58867d29ha7d9cdd6e59e4bb@mail.gmail.com> <320fb6e00912090705h691dd0a3u32b6e8760570d3e1@mail.gmail.com> Message-ID: <8b34ec180912090833uf9ace0x8f0b76335dbd8143@mail.gmail.com> On Wed, Dec 9, 2009 at 4:05 PM, Peter wrote: > I hadn't looked to see just how simple the files and parser were ;) > > Do you want to go ahead and make the deprecation Bartek? Yes. It's done and pushed to github now. cheers Bartek -- Bartek Wilczynski ================== Postdoctoral fellow EMBL, Furlong group Meyerhoffstrasse 1, 69012 Heidelberg, Germany tel: +49 6221 387 8433 From villahozbale at wisc.edu Fri Dec 11 18:32:53 2009 From: villahozbale at wisc.edu (ANGEL VILLAHOZ-BALETA) Date: Fri, 11 Dec 2009 12:32:53 -0600 Subject: [Biopython] A potential printing error in the Biopython Tutorial and Cookbook? 
Message-ID: <70d0d6f9673f4.4b223bf5@wiscmail.wisc.edu> Hi to all, I believe that there is a printing error in the Biopython Tutorial and Cookbook... Go there: http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc102 Then check the following source code: >>> for record in records: ... print "title:", record["TI"] ... if "AU" in records: ... print "authors:", record["AU"] ... print "source:", record["CO"] ... print I believe that the if sentence would have the record instead of the records because it would never print such an information about the authors since the data structure of records does not have this key but always integers as its indices. Let me know if I am right or not. Thanks very much, Angel Villahoz-Baleta Bioinformatics Programmer University of Wisconsin-Madison From biopython at maubp.freeserve.co.uk Fri Dec 11 19:45:31 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 11 Dec 2009 19:45:31 +0000 Subject: [Biopython] A potential printing error in the Biopython Tutorial and Cookbook? In-Reply-To: <70d0d6f9673f4.4b223bf5@wiscmail.wisc.edu> References: <70d0d6f9673f4.4b223bf5@wiscmail.wisc.edu> Message-ID: <320fb6e00912111145w68330b80l8b1091db48fac3fb@mail.gmail.com> On Fri, Dec 11, 2009 at 6:32 PM, ANGEL VILLAHOZ-BALETA wrote: > Hi to all, > > I believe that there is a printing error in the Biopython Tutorial and Cookbook... > > Go there: > > http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc102 > > Then check the following source code: > >>>> for record in records: > ...     print "title:", record["TI"] > ...     if "AU" in records: > ...         print "authors:", record["AU"] > ...     print "source:", record["CO"] > ...     print > > I believe that the if sentence would have the record > instead of the records because it would never print > such an information about the authors since the data > structure of records does not have this key but always > integers as its indices. What version of Biopython do you have?
Could you show us the actual error message? I've just been playing with the example, and for some records certain fields are missing (you get a KeyError), so this works better: for record in records: print "title:", record.get("TI","?") print "author:", record.get("AU","?") print "source:", record.get("CO","?") print Does that help? Peter From mjldehoon at yahoo.com Sat Dec 12 01:52:08 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 11 Dec 2009 17:52:08 -0800 (PST) Subject: [Biopython] A potential printing error in the Biopython Tutorial and Cookbook? In-Reply-To: <320fb6e00912111145w68330b80l8b1091db48fac3fb@mail.gmail.com> Message-ID: <138660.52957.qm@web62406.mail.re1.yahoo.com> Dear Angel, This was indeed a typing error in the tutorial. It is fixed now as Peter suggested. Thanks for noticing! --Michiel. --- On Fri, 12/11/09, Peter wrote: > From: Peter > Subject: Re: [Biopython] A potential printing error in the Biopython Tutorial and Cookbook? > To: "ANGEL VILLAHOZ-BALETA" > Cc: biopython at lists.open-bio.org > Date: Friday, December 11, 2009, 2:45 PM > On Fri, Dec 11, 2009 at 6:32 PM, > ANGEL VILLAHOZ-BALETA > > wrote: > > Hi to all, > > > > I believe that there is a printing error in the > Biopython Tutorial and Cookbook... > > > > Go there: > > > > http://www.biopython.org/DIST/docs/tutorial/Tutorial.html#htoc102 > > > > Then check the following source code: > > > >>>> for record in records: > > ...     print "title:", record["TI"] > > ...     if "AU" in records: > > ...         print "authors:", record["AU"] > > ...     print "source:", record["CO"] > > ...     print > > > > I believe that the if sentence would have the record > > instead of the records because it would never print > > such an information about the authors since the data > > structure of records does not have this key but > always > > integers as its indices. > > What version of Biopython do you have? > Could you show us the actual error message?
> > I've just been playing with the example, and for some > records certain fields are missing (you get a KeyError), > so this works better: > > for record in records: >     print "title:", record.get("TI","?") >     print "author:", record.get("AU","?") >     print "source:", record.get("CO","?") >     print > > Does that help? > > Peter > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From aboulia at gmail.com Tue Dec 15 06:01:47 2009 From: aboulia at gmail.com (Kevin Lam) Date: Tue, 15 Dec 2009 14:01:47 +0800 Subject: [Biopython] rsync download of biopython problems Message-ID: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> Hi I just tried downloading biopython via rsync rsync -av code.open-bio.org::cvsbiopython . But all the files were appended with a ",v" why did that happen? (i.e. see below) Attic Bio BioSQL CONTRIB,v DEPRECATED,v Doc Experimental LICENSE,v MANIFEST.in,v Martel NEWS,v README,v Scripts setup.py,v Tests From biopython at maubp.freeserve.co.uk Tue Dec 15 10:37:55 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 10:37:55 +0000 Subject: [Biopython] rsync download of biopython problems In-Reply-To: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> References: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> Message-ID: <320fb6e00912150237i7e627b0x9788bf6b112e5@mail.gmail.com> On Tue, Dec 15, 2009 at 6:01 AM, Kevin Lam wrote: > Hi > I just tried downloading biopython via rsync > > rsync -av code.open-bio.org::cvsbiopython . > > But all the files were appended with a ",v" why did that happen? (i.e.
see > below) > Attic Bio BioSQL CONTRIB,v DEPRECATED,v Doc Experimental LICENSE,v > MANIFEST.in,v Martel NEWS,v README,v Scripts setup.py,v Tests CVS likes to add ",v" to files - if you wanted to download them from the public CVS server (code.open-bio.org) you would have had to use the CVS command line tool. However, we don't use CVS anymore, we use git. See: http://www.biopython.org/wiki/SourceCode If you really want to use rsync you *might* be able to point it at http://biopython.org/SRC/biopython/ Peter P.S. Why did you try downloading with rsync from the code.open-bio.org? Is there something confusing in our documentation we can fix? Thanks! From aboulia at gmail.com Tue Dec 15 11:39:35 2009 From: aboulia at gmail.com (Kevin Lam) Date: Tue, 15 Dec 2009 19:39:35 +0800 Subject: [Biopython] rsync download of biopython problems In-Reply-To: <320fb6e00912150237i7e627b0x9788bf6b112e5@mail.gmail.com> References: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> <320fb6e00912150237i7e627b0x9788bf6b112e5@mail.gmail.com> Message-ID: <5b6410e0912150339y296aee41xf2277c8ff582015e@mail.gmail.com> Hi Peter, git works fine for me! ok lemme explain how i got there.. from http://www.biopython.org/wiki/CVS i followed this link http://cvs.biopython.org/ which redirected me to http://www.open-bio.org/wiki/SourceCode i was on a fresh install system of CentOS so rsync was avail so i used that. Cheers Kevin On Tue, Dec 15, 2009 at 6:37 PM, Peter wrote: > On Tue, Dec 15, 2009 at 6:01 AM, Kevin Lam wrote: > > Hi > > I just tried downloading biopython via rsync > > > > rsync -av code.open-bio.org::cvsbiopython . > > > > But all the files were appended with a ",v" why did that happen? (i.e.
> see > > below) > > Attic Bio BioSQL CONTRIB,v DEPRECATED,v Doc Experimental LICENSE,v > > MANIFEST.in,v Martel NEWS,v README,v Scripts setup.py,v Tests > > CVS likes to add ",v" to files - if you wanted to download them > from the public CVS server (code.open-bio.org) you would have > had to use the CVS command line tool. > > However, we don't use CVS anymore, we use git. See: > http://www.biopython.org/wiki/SourceCode > > If you really want to use rsync you *might* be able to point it > at http://biopython.org/SRC/biopython/ > > Peter > > P.S. Why did you try downloading with rsync from the code.open-bio.org? > Is there something confusing in our documentation we can fix? Thanks! > From biopython at maubp.freeserve.co.uk Tue Dec 15 11:48:22 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 11:48:22 +0000 Subject: [Biopython] rsync download of biopython problems In-Reply-To: <5b6410e0912150339y296aee41xf2277c8ff582015e@mail.gmail.com> References: <5b6410e0912142201q1bae57b2ybb34dd453afd6204@mail.gmail.com> <320fb6e00912150237i7e627b0x9788bf6b112e5@mail.gmail.com> <5b6410e0912150339y296aee41xf2277c8ff582015e@mail.gmail.com> Message-ID: <320fb6e00912150348y7f074d94u74b3b14cc31d4c6@mail.gmail.com> On Tue, Dec 15, 2009 at 11:39 AM, Kevin Lam wrote: > Hi Peter, > git works fine for me! > ok lemme explain how i got there.. > > from http://www.biopython.org/wiki/CVS I just the "this page is obsolete" bit at the top needs to be more prominent... > i followed this link http://cvs.biopython.org/ I can ask the sys admins to redirect that to: http://www.biopython.org/wiki/SourceCode > which redirected me to http://www.open-bio.org/wiki/SourceCode Ah - that does need updating. Thanks! > i was on a fresh install system of CentOS so rsync was avail so i used that. Well it "worked" (you got a copy of the CVS repository rather than a snapshot of the code), but the CVS repository is now out of date. 
Peter From biopython at maubp.freeserve.co.uk Tue Dec 15 17:01:38 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 17:01:38 +0000 Subject: [Biopython] Biopython 1.53 released Message-ID: <320fb6e00912150901k138ae04bmc5d5af9c867340ec@mail.gmail.com> Dear Biopythoneers, We are pleased to announce the availability of Biopython 1.53, a new stable release of the Biopython library, three months after the release of Biopython 1.52. This is our first release since migrating from CVS to git for source code control. There have been some additions to our core objects: the Seq (and related UnknownSeq) objects gained upper and lower methods (like the string methods of the same name but alphabet aware) plus a new ungap method. The SeqFeature object now has an extract method to get the region of sequence it describes (useful for getting CDS nucleotide sequences from GenBank files). Also SeqRecord objects now support addition, giving a new SeqRecord with the combined sequence, all the SeqFeatures, and any common annotation. SQLite support (built into Python 2.5+) was added to our BioSQL interface. This is still a little experimental as we are using a draft BioSQL SQLite schema, but this should be merged into the next BioSQL release. Biopython now includes wrappers for the new NCBI BLAST C++ tools, which will be replacing the old NCBI "legacy" BLAST tools written in C. The plain text BLAST parser has been updated to cope as well. Nevertheless, we (and the NCBI) still recommend using the XML output for parsing. Bio.Entrez includes the new (Jan 2010) DTD files from the NCBI for parsing MedLine/PubMed data. The NCBI codon tables have been updated from version 3.4 to 3.9, which adds a few extra start codons, and a few new tables (Tables 16, 21, 22 and 23). The restriction enzyme list in Bio.Restriction has been updated to the Nov 2009 release of REBASE.
The Bio.PDB parser and output code has been updated to understand the element column in ATOM and HETATM lines, and Bio.PDB.PDBList has been updated for recent changes to the PDB FTP site. Finally, support for running Biopython under Jython (using the Java Virtual Machine) has been much improved. Note that Jython does not support C code, and currently Jython does not parse DTD files (needed for the Bio.Entrez XML parser). However, most of the Biopython modules seem fine from testing Jython 2.5.0 and 2.5.1. Sources and Windows Installers are available from our downloads page. Thanks to the Biopython development team and to everyone who has reported bugs or contributed patches since our last release. --Peter, on behalf of the Biopython developers P.S. This news post is online at http://news.open-bio.org/news/2009/12/biopython-release-153/ You may wish to subscribe to our news feed. For RSS links etc, see: http://biopython.org/wiki/News Biopython news is also on twitter: http://twitter.com/biopython From cgohlke at uci.edu Tue Dec 15 17:17:53 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 15 Dec 2009 09:17:53 -0800 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 Message-ID: <4B27C4C1.3090206@uci.edu> Hello, I have built biopython 1.53 for 64-bit Python 2.6 for Windows using Visual Studio 2008. The test output is attached. The installer is at Best, Christoph Gohlke Laboratory for Fluorescence Dynamics University of California, Irvine http://www.lfd.uci.edu/~gohlke/ -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: biopython-1.53.win-amd64-py2.6-test.txt URL: From biopython at maubp.freeserve.co.uk Tue Dec 15 17:49:35 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 17:49:35 +0000 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <4B27C4C1.3090206@uci.edu> References: <4B27C4C1.3090206@uci.edu> Message-ID: <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> On Tue, Dec 15, 2009 at 5:17 PM, Christoph Gohlke wrote: > Hello, > > I have built biopython 1.53 for 64-bit Python 2.6 for Windows using Visual > Studio 2008. The test output is attached. The installer is at > Nice :) Was this with NumPy 1.4.0rc2? The fact that test_GraphicsBitmaps.py failed with a font problem is indicative of something not quite right in ReportLab and/or PIL. This is almost certainly not a Biopython problem. A couple of SCOP tested failed - could you run unix2dos (or similar) on Tests/SCOP/*.txt and Tests/SCOP/scopseq-test/*.txt and retest? That "fixes" it on win32. Also what does this do on your Windows 64bit python? >>> import sys >>> sys.platform 'win32' I've seen threads discussing if it should return "win64" or "win32", but the simplest way to check is try it and see. Thanks, Peter From cgohlke at uci.edu Tue Dec 15 18:22:23 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 15 Dec 2009 10:22:23 -0800 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> References: <4B27C4C1.3090206@uci.edu> <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> Message-ID: <4B27D3DF.4070006@uci.edu> On 12/15/2009 9:49 AM, Peter wrote: > On Tue, Dec 15, 2009 at 5:17 PM, Christoph Gohlke wrote: >> Hello, >> >> I have built biopython 1.53 for 64-bit Python 2.6 for Windows using Visual >> Studio 2008. The test output is attached. The installer is at >> > > Nice :) > > Was this with NumPy 1.4.0rc2? Yes, numpy-1.4.0rc2.dev7996, also available on the same download page. 
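[Editorial aside, not part of Christoph's message.] On the sys.platform question raised above: a small sketch, using only the standard library, of two common ways to tell a 64-bit Python from a 32-bit one at runtime. It checks the pointer size rather than the platform string, so it does not depend on what sys.platform reports. The helper name python_bits is made up for illustration.

```python
import struct
import sys


def python_bits():
    """Return the pointer size of the running interpreter in bits."""
    # struct.calcsize("P") is the size in bytes of a C void pointer.
    return struct.calcsize("P") * 8


bits = python_bits()
print(sys.platform, bits)

# sys.maxsize (Python 2.6 and later) gives the same answer:
# it exceeds 2**32 only on a 64-bit build.
is_64bit = sys.maxsize > 2**32
print(is_64bit)
```

Either check works independently of the operating system, which is why it is often preferred over inspecting version or platform strings.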
> > The fact that test_GraphicsBitmaps.py failed with a font problem > is indicative of something not quite right in ReportLab and/or PIL. > This is almost certainly not a Biopython problem. > OK, I will check Reportlab and PIL. I now remember seeing some font loading issues with PIL 1.1.7 in other packages even though all internal tests pass. > A couple of SCOP tested failed - could you run unix2dos (or > similar) on Tests/SCOP/*.txt and Tests/SCOP/scopseq-test/*.txt > and retest? That "fixes" it on win32. > That worked. The font error is now the only failing test. > Also what does this do on your Windows 64bit python? > >>>> import sys >>>> sys.platform > 'win32' 'win32' is correct. I use "'64 bit' in sys.version" to check for a 64 bit version at runtime. > > I've seen threads discussing if it should return "win64" or > "win32", but the simplest way to check is try it and see. > Thank you. Feel free to redistribute the installer if you think it is good enough. Christoph From cgohlke at uci.edu Tue Dec 15 18:20:07 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 15 Dec 2009 10:20:07 -0800 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> References: <4B27C4C1.3090206@uci.edu> <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> Message-ID: <4B27D357.30705@uci.edu> On 12/15/2009 9:49 AM, Peter wrote: > On Tue, Dec 15, 2009 at 5:17 PM, Christoph Gohlke wrote: >> Hello, >> >> I have built biopython 1.53 for 64-bit Python 2.6 for Windows using Visual >> Studio 2008. The test output is attached. The installer is at >> > > Nice :) > > Was this with NumPy 1.4.0rc2? > > The fact that test_GraphicsBitmaps.py failed with a font problem > is indicative of something not quite right in ReportLab and/or PIL. > This is almost certainly not a Biopython problem. 
> > A couple of SCOP tested failed - could you run unix2dos (or > similar) on Tests/SCOP/*.txt and Tests/SCOP/scopseq-test/*.txt > and retest? That "fixes" it on win32. > > Also what does this do on your Windows 64bit python? > >>>> import sys >>>> sys.platform > 'win32' > > I've seen threads discussing if it should return "win64" or > "win32", but the simplest way to check is try it and see. > > Thanks, > > Peter > > From biopython at maubp.freeserve.co.uk Tue Dec 15 18:57:09 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 15 Dec 2009 18:57:09 +0000 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <4B27D3DF.4070006@uci.edu> References: <4B27C4C1.3090206@uci.edu> <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> <4B27D3DF.4070006@uci.edu> Message-ID: <320fb6e00912151057t41a6e889k1df389cbccf1a8cd@mail.gmail.com> On Tue, Dec 15, 2009 at 6:22 PM, Christoph Gohlke wrote: > >> The fact that test_GraphicsBitmaps.py failed with a font problem >> is indicative of something not quite right in ReportLab and/or PIL. >> This is almost certainly not a Biopython problem. >> > OK, I will check Reportlab and PIL. I now remember seeing some > font loading issues with PIL 1.1.7 in other packages even though > all internal tests pass. If you are happy to investigate further, that would be great. >> A couple of SCOP tested failed - could you run unix2dos (or >> similar) on Tests/SCOP/*.txt and Tests/SCOP/scopseq-test/*.txt >> and retest? That "fixes" it on win32. > > That worked. The font error is now the only failing test. Good. >> Also what does this do on your Windows 64bit python? >> >>>>> import sys >>>>> sys.platform >> >> 'win32' > > 'win32' is correct. > I use "'64 bit' in sys.version" to check for a 64 bit version at runtime. Thanks - I just wanted to be sure. > > Thank you. Feel free to redistribute the installer if you think it is > good enough. 
> Given NumPy don't offer their own 64bit installers, and the possible need for Microsoft Visual C++ 2008 redistributable package, perhaps linking to your page makes most sense for now. I'll update our download page if that sounds sensible. Thank you, Peter From cgohlke at uci.edu Tue Dec 15 19:29:28 2009 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 15 Dec 2009 11:29:28 -0800 Subject: [Biopython] biopython-1.53.win-amd64-py2.6 In-Reply-To: <320fb6e00912151057t41a6e889k1df389cbccf1a8cd@mail.gmail.com> References: <4B27C4C1.3090206@uci.edu> <320fb6e00912150949x58594a50sf8d2e60b107f26cf@mail.gmail.com> <4B27D3DF.4070006@uci.edu> <320fb6e00912151057t41a6e889k1df389cbccf1a8cd@mail.gmail.com> Message-ID: <4B27E398.3090600@uci.edu> On 12/15/2009 10:57 AM, Peter wrote: > On Tue, Dec 15, 2009 at 6:22 PM, Christoph Gohlke wrote: >> >>> The fact that test_GraphicsBitmaps.py failed with a font problem >>> is indicative of something not quite right in ReportLab and/or PIL. >>> This is almost certainly not a Biopython problem. >>> >> OK, I will check Reportlab and PIL. I now remember seeing some >> font loading issues with PIL 1.1.7 in other packages even though >> all internal tests pass. > > If you are happy to investigate further, that would be great. > Turned out that the Times-Roman font is simply missing from the reportlab 2.3 source distribution, which I used. The missing fonts can be downloaded at and put in the reportlab/fonts directory. I also included these fonts in the updated reportlab-2.3.win-amd64-py2.6.exe installer. All tests pass now. Some tests were skipped due to missing third party packages on my computer. >> >> Thank you. Feel free to redistribute the installer if you think it is >> good enough. >> > > Given NumPy don't offer their own 64bit installers, and the > possible need for Microsoft Visual C++ 2008 redistributable > package, perhaps linking to your page makes most sense > for now. I'll update our download page if that sounds sensible. 
> Makes sense. The VC.CRT redistributable is usually installed with Python and I compiled with the http://bugs.python.org/issue4120 patch. Best, Christoph From aboulia at gmail.com Wed Dec 16 03:52:36 2009 From: aboulia at gmail.com (Kevin Lam) Date: Wed, 16 Dec 2009 11:52:36 +0800 Subject: [Biopython] Entrez.efetch Service unavailable! Message-ID: <5b6410e0912151952h76787344t18968aba5ab350d8@mail.gmail.com> Hi I have been trying to use Entrez.efetch to download ~1000 bacteria genomes I read in the docs that biopython will auto take care of the delay and fetch via the preferred site for download scripts but I have been getting service unavailable errors at GMT +8 1100am is this normal? Or should i edit the source to give a larger delay buffer? So far I have only managed to get 3 fasta sequences out Or should I bring this up to NCBI instead? Traceback (most recent call last): File "../retr-fasta.py", line 15, in ? handle = Entrez.efetch(db="genome", id=uid, rettype="fasta") File "/home/k/lib/biopython-biopython-9a41381/build/lib.linux-x86_64-2.4/Bio/Entrez/__init__.py", line 105, in efetch return _open(cgi, variables) File "/home/k/lib/biopython-biopython-9a41381/build/lib.linux-x86_64-2.4/Bio/Entrez/__init__.py", line 343, in _open raise IOError("Service unavailable!") IOError: Service unavailable! Cheers Kevin From biopython at maubp.freeserve.co.uk Wed Dec 16 08:58:35 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 08:58:35 +0000 Subject: [Biopython] Entrez.efetch Service unavailable! In-Reply-To: <5b6410e0912151952h76787344t18968aba5ab350d8@mail.gmail.com> References: <5b6410e0912151952h76787344t18968aba5ab350d8@mail.gmail.com> Message-ID: <320fb6e00912160058i57acc020nba7c61c53a4ec64b@mail.gmail.com> On Wed, Dec 16, 2009 at 3:52 AM, Kevin Lam wrote: > Hi I have been trying to use > Entrez.efetch > to download ~1000 bacteria genomes Why not use their FTP site? They even make bundles available, e.g. 
ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/all.gbk.tar.gz ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/all.faa.tar.gz etc. Note the folder is called Bacteria for historical reasons, it is really Prokaryotes as there are plenty of Archaea in there. > I read in the docs that biopython will auto take care of the delay and fetch > via the preferred site for download scripts > but I have been getting service unavailable errors at GMT +8 1100am > is this normal? Or should i edit the source to give a larger delay buffer? > So far I have only managed to get 3 fasta sequences out The NCBI were planning some Entrez work about now (updating DTD files), so the downtime might be expected. I'd wait a day, and then if it is still down email them. Regards, Peter From iua1 at psu.edu Wed Dec 16 16:49:11 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 11:49:11 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups Message-ID: Hello Everyone, I applaud the move to Github, I think it was a great decision that will allow more people to contribute to the project. Yet at the same time a community is built on communication and the current mailing list feels extremely antiquated. The way of interaction is tedious: sending emails to an address, then one reads one message at a time, messages are not displayed in a threaded form where multiple messages are shown at the same time. There is no search, etc. and it all feels like a throwback to the 90s. Personally I think a choice of mailman is a choice of deliberately limiting access to all but the most hardcore - and for example that's why the main Python-dev uses it, it is more of a mechanism to keep people away. Of course python has comp.lang.python and it is a nice and thriving group. The alternative such as Google groups would be far superior in attracting and building a community of developers and users as well. Is this an idea that the owners of the list would entertain?
best regards, Istvan Albert -- Istvan Albert http://www.personal.psu.edu/iua1 From hlapp at drycafe.net Wed Dec 16 16:56:57 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 11:56:57 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> Istvan, what is your solution to Google all of a sudden deciding to take Google Groups down, or to make it a paid subscription service, or if Google goes out of business? All of these things have happened over and over before with commercial vendors. Economies do go through cycles. Would it be OK to lose the entire archive of the mailing list in such an event? BTW if you use a threaded email client (such as GMail, or in fact most modern email readers), you *will* see threaded messages. Also, the Biopython list is indexed in GMane I believe, so you can search there pretty conveniently. Finally, have you tried using Google to search the Biopython archive? It's not so bad, actually. Just my $0.02 (and I'm a big fan of Google Groups). -hilmar On Dec 16, 2009, at 11:49 AM, Istvan Albert wrote: > Hello Everyone, > > I applaud the move to Github, I think it was a great decision that > will allow more people to contribute to the project. > > Yet at the same time a community is built on communication and the > current mailing list feels extremely antiquated. The way of > interaction is tedious: sending emails to an address, then one reads > one message at a time, messages are not displayed in a threaded form > where multiple messages are shown at the same time. There is no of > search, etc. and it all feels like a throwback to the 90s. Personally > I think a choice of mailman is a choice of deliberately of limiting > access to all but the most hardcore - and for example that's why the > main Python-dev uses it, it is a more of a mechanism to keep people > away. 
Of course python has comp.lang.python and it is a nice and > thriving group. > > The alternative such as Google groups would be far superior in > attracting and building a community of developers and users as well. > Is this an idea that the owners of the list would entertain? > > best regards, > > Istvan Albert > > > -- > Istvan Albert > http://www.personal.psu.edu/iua1 > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From carlos.borroto at gmail.com Wed Dec 16 16:57:05 2009 From: carlos.borroto at gmail.com (Carlos Javier Borroto) Date: Wed, 16 Dec 2009 11:57:05 -0500 Subject: [Biopython] Is there any in silico PCR tool on biopython? Message-ID: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> Hi there, I'm looking for a way to show some information on the specificity of sets of primers I'm designing, I'll love to have a way to run an in silico PCR reaction and parse the results with biopython. Is there something to use with one of the available PCR simulation tools? regards, -- Carlos Javier Borroto Baltimore, MD Google Voice: (410) 929 4020 From biopython at maubp.freeserve.co.uk Wed Dec 16 17:07:05 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:07:05 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <320fb6e00912160907m4f201a90xb85851ef9ba43271@mail.gmail.com> On Wed, Dec 16, 2009 at 4:49 PM, Istvan Albert wrote: > Hello Everyone, > > I applaud the move to Github, I think it was a great decision that > will allow more people to contribute to the project. 
> > Yet at the same time a community is built on communication and the > current mailing list feels extremely antiquated. The way of > interaction is tedious: sending emails to an address, then one reads > one message at a time, messages are not displayed in a threaded form > where multiple messages are shown at the same time. There is no of > search, etc. and it all feels like a throwback to the 90s. A lot of that is down to your email program - I find none of those issue apply to how I use the list (in GoogleMail). You are specifically talking about browsing the mailing list archive? There yes, things are a bit rudimentary, and search isn't as good as in GoogleMail. But on the other hand it is clearly a read only archive. > Personally I think a choice of mailman is a choice of deliberately of > limiting access to all but the most hardcore - and for example that's > why the main Python-dev uses it, it is a more of a mechanism to > keep people away. Of course python has comp.lang.python and it > is a nice and thriving group. > > The alternative such as Google groups would be far superior in > attracting and building a community of developers and users as > well. Is this an idea that the owners of the list would entertain? It is something that the OBF may consider - but there are a lot of concerns about reliance on third parties, advertising, loss of brand control etc (see also Hilmar's email). Peter From biopython at maubp.freeserve.co.uk Wed Dec 16 17:08:00 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:08:00 +0000 Subject: [Biopython] Is there any in silico PCR tool on biopython? 
In-Reply-To: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> References: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> Message-ID: <320fb6e00912160908r152bbc49o38ea32a33cf6ff89@mail.gmail.com> On Wed, Dec 16, 2009 at 4:57 PM, Carlos Javier Borroto wrote: > Hi there, > > I'm looking for a way to show some information on the specificity of > sets of primers I'm designing, I'll love to have a way to run an in > silico PCR reaction and parse the results with biopython. > > Is there something to use with one of the available PCR simulation tools? > I know people use Biopython with primer3 (usually the EMBOSS wrapped version, eprimer3). Peter From biopython at maubp.freeserve.co.uk Wed Dec 16 17:11:40 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:11:40 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> Message-ID: <320fb6e00912160911tdedfb2ew47a0535379e828b5@mail.gmail.com> On Wed, Dec 16, 2009 at 4:56 PM, Hilmar Lapp wrote: > > Also, the Biopython list is indexed in GMane I believe, so you can > search there pretty conveniently. Good point - should we add links to these on the mailing list wiki page? 
Main mailing list: http://dir.gmane.org/gmane.comp.python.bio.general Dev mailing list: http://dir.gmane.org/gmane.comp.python.bio.general Announcement list: http://dir.gmane.org/gmane.comp.python.bio.general Peter From biopython at maubp.freeserve.co.uk Wed Dec 16 17:14:31 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:14:31 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160911tdedfb2ew47a0535379e828b5@mail.gmail.com> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <320fb6e00912160911tdedfb2ew47a0535379e828b5@mail.gmail.com> Message-ID: <320fb6e00912160914w12e8ff2eiaa42b7924226eca8@mail.gmail.com> On Wed, Dec 16, 2009 at 5:11 PM, Peter wrote: > On Wed, Dec 16, 2009 at 4:56 PM, Hilmar Lapp wrote: >> >> Also, the Biopython list is indexed in GMane I believe, so you can >> search there pretty conveniently. > > Good point - should we add links to these on the mailing list wiki page? Sorry, same link three times, should be: Main mailing list: http://dir.gmane.org/gmane.comp.python.bio.general http://news.gmane.org/gmane.comp.python.bio.general Dev mailing list: http://dir.gmane.org/gmane.comp.python.bio.devel http://news.gmane.org/gmane.comp.python.bio.devel Announcement list: http://dir.gmane.org/gmane.comp.python.bio.announce http://news.gmane.org/gmane.comp.python.bio.announce Peter From biopython at maubp.freeserve.co.uk Wed Dec 16 17:26:01 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 17:26:01 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> On Wed, Dec 16, 2009 at 4:49 PM, Istvan Albert wrote: > > The alternative such as Google groups would be far superior in > attracting and building a community of developers and users as well. 
> Is this an idea that the owners of the list would entertain? > As far as I can tell from the limited documentation I found on Google Groups (maybe I was looking in the wrong place?) there is no way to import our existing decade long mail archives. That would be a major downside. Also, using Google Groups would *require* all posters to have a Google Account - a potential sticking point for some. Peter From cjfields at illinois.edu Wed Dec 16 17:19:52 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 11:19:52 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160907m4f201a90xb85851ef9ba43271@mail.gmail.com> References: <320fb6e00912160907m4f201a90xb85851ef9ba43271@mail.gmail.com> Message-ID: <52405282-5654-47D8-885F-4733F9A40873@illinois.edu> On Dec 16, 2009, at 11:07 AM, Peter wrote: > On Wed, Dec 16, 2009 at 4:49 PM, Istvan Albert wrote: >> Hello Everyone, >> >> I applaud the move to Github, I think it was a great decision that >> will allow more people to contribute to the project. >> >> Yet at the same time a community is built on communication and the >> current mailing list feels extremely antiquated. The way of >> interaction is tedious: sending emails to an address, then one reads >> one message at a time, messages are not displayed in a threaded form >> where multiple messages are shown at the same time. There is no of >> search, etc. and it all feels like a throwback to the 90s. > > A lot of that is down to your email program - I find none of > those issue apply to how I use the list (in GoogleMail). > > You are specifically talking about browsing the mailing list > archive? There yes, things are a bit rudimentary, and search > isn't as good as in GoogleMail. But on the other hand it is > clearly a read only archive. 
> >> Personally I think a choice of mailman is a choice of deliberately of >> limiting access to all but the most hardcore - and for example that's >> why the main Python-dev uses it, it is a more of a mechanism to >> keep people away. Of course python has comp.lang.python and it >> is a nice and thriving group. >> >> The alternative such as Google groups would be far superior in >> attracting and building a community of developers and users as >> well. Is this an idea that the owners of the list would entertain? > > It is something that the OBF may consider - but there are a > lot of concerns about reliance on third parties, advertising, loss > of brand control etc (see also Hilmar's email). > > Peter I agree with peter and hilmar. Nabble and Gmane both archive the open-bio lists, at least for bioperl, but I would assume biopython as well. For a painful example of how bad third party mail lists can be (painful at least to me), see the gmod lists at Sourceforge. chris From iua1 at psu.edu Wed Dec 16 17:36:45 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 12:36:45 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> Message-ID: On Wed, Dec 16, 2009 at 11:56 AM, Hilmar Lapp wrote: > Groups down, or to make it a paid subscription service, or if Google goes > out of business? > Economies do go through cycles. Would it be OK to lose the entire > archive of the mailing list in such an event? Every choice has two sides. Has a positive and has negative dimension to it. I am sure one can come up with unlikely yet equally pessimistic scenarios for the existing setup as well. One thing is seems clear to me and I do not think that you are aware of it. This mailman setup is a throttle - it imposes a negative feedback on the amount of messages that it can handle. 
This system of messages cannot grow over a certain limit. Just imagine regularly getting a dozen new emails a day plus their followups, yet you are just a casual user. This would be unbearable for many people whose inboxes are already overflowing. So they either don't participate or once they get even a few of these messages they turn off email delivery at which point you are left with a rudimentary site where it is hard to contribute so it drops off their radar. I can't even imagine what it would look like to have a popular newsgroup being delivered to my mailbox. In a nutshell you are saying it already works - but that is only because you get so few messages ... and getting more becomes actually inconvenient to the point at which it has to decay again to the manageable level Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From hlapp at drycafe.net Wed Dec 16 17:40:45 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 12:40:45 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> Message-ID: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> On Dec 16, 2009, at 12:18 PM, Istvan Albert wrote: > I am sure one can come up with unlikely yet equally pessimistic > scenarios for the existing setup as well. My point was that this is not unlikely at all. It happened with some of Yahoo's services, and it happened with others rather popular ones. If you own and operate your own brand, your equipment can still go out of order. But at least it's under your own control. Do you see that difference? Would you argue that that is unimportant? > This mailman setup is a throttle - it imposes a negative feedback on > the amount of messages that it can handle. I really don't know what you mean by this. I get 200 messages a day from various lists. 
I'd be dead if I had an email client that can't thread and can't filter, but I do have one that can (and it's free). GMail can do both too, and is free. Have you tried a threading email reader? Can you explain how reading newsgroups through a threaded and filtering news reader is different and more efficient than reading emails through a threaded and filtering email reader? That all being said, if what's at issue here is to have a Google Group interface to the Biopython mailing list, then that's actually easy to achieve. Someone (ideally one of the current list or project admins/ owners) creates a (presumably identically named) Google Group, and sets it to mirror the mailman mailing list. Guys - I'm happy to help with that if you don't know how to do that. Create the group, subscribe drycafe at gmail.com, and make me an admin. I'll configure the mirroring. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From sdavis2 at mail.nih.gov Wed Dec 16 17:48:30 2009 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Wed, 16 Dec 2009 12:48:30 -0500 Subject: [Biopython] Is there any in silico PCR tool on biopython? In-Reply-To: <320fb6e00912160908r152bbc49o38ea32a33cf6ff89@mail.gmail.com> References: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> <320fb6e00912160908r152bbc49o38ea32a33cf6ff89@mail.gmail.com> Message-ID: <264855a00912160948q538bed21q6dc0d410d468c5c7@mail.gmail.com> On Wed, Dec 16, 2009 at 12:08 PM, Peter wrote: > On Wed, Dec 16, 2009 at 4:57 PM, Carlos Javier Borroto > wrote: >> Hi there, >> >> I'm looking for a way to show some information on the specificity of >> sets of primers I'm designing, I'll love to have a way to run an in >> silico PCR reaction and parse the results with biopython. >> >> Is there something to use with one of the available PCR simulation tools? 
>> > > I know people use Biopython with primer3 (usually the EMBOSS > wrapped version, eprimer3). Carlos, If it is something like the UCSC genome browser in-silico PCR (for mapping the putative amplimers from a set of primers), they (UCSC) have an executable of the software. I always have trouble finding their software tools, but they are very responsive to email if you have problems. I don't have an example output file, but I bet it is just tab-delimited text, so parsing is probably not too difficult. Sean From lueck at ipk-gatersleben.de Wed Dec 16 17:31:18 2009 From: lueck at ipk-gatersleben.de (lueck at ipk-gatersleben.de) Date: Wed, 16 Dec 2009 18:31:18 +0100 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160914w12e8ff2eiaa42b7924226eca8@mail.gmail.com> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <320fb6e00912160911tdedfb2ew47a0535379e828b5@mail.gmail.com> <320fb6e00912160914w12e8ff2eiaa42b7924226eca8@mail.gmail.com> Message-ID: <20091216183118.zy9yvhd0hdcs4ggs@webmail.ipk-gatersleben.de> What about a free forum e.g. smf (http://www.simplemachines.org/) on the biopython homepage? I'm using this too and I'm quite happy. Easy building, maintaining... Just an idea... Zitat von Peter : > On Wed, Dec 16, 2009 at 5:11 PM, Peter > wrote: >> On Wed, Dec 16, 2009 at 4:56 PM, Hilmar Lapp wrote: >>> >>> Also, the Biopython list is indexed in GMane I believe, so you can >>> search there pretty conveniently. >> >> Good point - should we add links to these on the mailing list wiki page? 
> > Sorry, same link three times, should be: > > Main mailing list: > http://dir.gmane.org/gmane.comp.python.bio.general > http://news.gmane.org/gmane.comp.python.bio.general > > Dev mailing list: > http://dir.gmane.org/gmane.comp.python.bio.devel > http://news.gmane.org/gmane.comp.python.bio.devel > > Announcement list: > http://dir.gmane.org/gmane.comp.python.bio.announce > http://news.gmane.org/gmane.comp.python.bio.announce > > Peter > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > > From iua1 at psu.edu Wed Dec 16 18:02:26 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:02:26 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> References: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> Message-ID: Hi Everyone, > Also, using Google Groups would *require* all posters to have > a Google Account - a potential sticking point for some. Right, but even currently one has to make an account on your site with email and password. Aren't you less comfortable signing up with various third parties than Google? A response mentioned that Gmail's email threading works just as well ... well it doesn't help anyone who has not already been subscribed to the messages to begin with. If you come to a discussion later you cannot get that. My goal is not to argue with each point, I only picked these two because both seemed to me like obvious responses yet they are only superficially addressing the issues that I brought up. But to step back a second, and maybe I wasn't specific enough. It is less about me, I can filter and thread my own email, I could download the email entire archive and search it etc.. I teach an introductory level class that uses Biopython, can I recommend my class that all go and sign up with you? 
I cannot really. Many people are just learning about computing, they will be overwhelmed with the everything interface, lack of search etc. All of you who responded - frankly I think you are too close to this issue to be able judge it correctly. It is like advising a newbie to use VI, one in a hundred will love it ninety nine will hate it, but hey who could argue that it is not super awesome? Once you do something for a bunch of years, you develop strategies and everything seems to work just fine, and everything makes sense. Get someone who is 20 and has never heard of sending emails to an address then see what they say about it... You all obviously care about the a community around biopython so all I am saying here is this: when you look around and wish that you could get a lot more people in - I think the answer is right there. Make it really easy to ask question, participate but also easy to just not participate and just be able to catch up really quickly of what is going on. I wish you all the best, Istvan Albert -- Istvan Albert http://www.personal.psu.edu/iua1 From lueck at ipk-gatersleben.de Wed Dec 16 18:04:17 2009 From: lueck at ipk-gatersleben.de (lueck at ipk-gatersleben.de) Date: Wed, 16 Dec 2009 19:04:17 +0100 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: <20091216190417.1z9yrs6mn4f4gcsw@webmail.ipk-gatersleben.de> What about a free forum e.g. smf (http://www.simplemachines.org/) on the biopython homepage? I'm using this too and I'm quite happy. Easy building, maintaining... Just an idea... Zitat von Hilmar Lapp : > > On Dec 16, 2009, at 12:18 PM, Istvan Albert wrote: > >> I am sure one can come up with unlikely yet equally pessimistic >> scenarios for the existing setup as well. > > My point was that this is not unlikely at all. 
It happened with some > of Yahoo's services, and it happened with others rather popular ones. > > If you own and operate your own brand, your equipment can still go > out of order. But at least it's under your own control. Do you see > that difference? Would you argue that that is unimportant? > >> This mailman setup is a throttle - it imposes a negative feedback on >> the amount of messages that it can handle. > > I really don't know what you mean by this. I get 200 messages a day > from various lists. I'd be dead if I had an email client that can't > thread and can't filter, but I do have one that can (and it's free). > GMail can do both too, and is free. Have you tried a threading email > reader? Can you explain how reading newsgroups through a threaded and > filtering news reader is different and more efficient than reading > emails through a threaded and filtering email reader? > > That all being said, if what's at issue here is to have a Google > Group interface to the Biopython mailing list, then that's actually > easy to achieve. Someone (ideally one of the current list or project > admins/ owners) creates a (presumably identically named) Google > Group, and sets it to mirror the mailman mailing list. > > Guys - I'm happy to help with that if you don't know how to do that. > Create the group, subscribe drycafe at gmail.com, and make me an admin. > I'll configure the mirroring. 
> > -hilmar > > -- > =========================================================== > : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : > =========================================================== > > > > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > > From iua1 at psu.edu Wed Dec 16 18:13:45 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:13:45 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: On Wed, Dec 16, 2009 at 12:40 PM, Hilmar Lapp wrote: > If you own and operate your own brand, your equipment can still go out of > order. But at least it's under your own control. Do you see that difference? > Would you argue that that is unimportant? It is a valid argument. No question about that. I do have a comeback though, what percent of the past decade's archive's content do you think is still actually useful? Isn't the archive's purpose more of a historical one. And type of archiving you could do for yourself. But as for useful content that other people want to use - I am guessing the half life of any particular advice is no more than about one two two years. 
Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From hlapp at drycafe.net Wed Dec 16 18:23:18 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 13:23:18 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> On Dec 16, 2009, at 1:13 PM, Istvan Albert wrote: > what percent of the past decade's archive's content do you think is > still actually useful? Actually I find it very useful. We frequently cite past posts as references for explanations, or problems previously reported. It's one of the main differences between an archived mailing list and a simple alias for a group of people. -hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- hlapp at drycafe dot net : =========================================================== From cjfields at illinois.edu Wed Dec 16 18:28:37 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 12:28:37 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> Message-ID: On Dec 16, 2009, at 12:23 PM, Hilmar Lapp wrote: > > On Dec 16, 2009, at 1:13 PM, Istvan Albert wrote: > >> what percent of the past decade's archive's content do you think is still actually useful? > > > Actually I find it very useful. We frequently cite past posts as references for explanations, or problems previously reported. It's one of the main differences between an archived mailing list and a simple alias for a group of people. > > -hilmar Agreed. 
With bioperl we generally indicate it's best to search the archives prior to asking a question, just in case the answer is already known or has been worked out. chris From iua1 at psu.edu Wed Dec 16 18:40:27 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:40:27 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> Message-ID: On Wed, Dec 16, 2009 at 1:28 PM, Chris Fields wrote: > > Agreed. With bioperl we generally indicate it's best to search the archives prior to asking a question, just in case the answer is already known or has been worked out. My fault for being insufficiently clear. I am not saying that having archives is useless. It all needs to be framed in the mindset of an unexpected event causing an archive to be lost. Is that irreparable harm? For example would having a hundred more active participants be worth the small risk of losing the archives? I am just putting the 100 as a number out there, just to get you to think. I think you all agree that at some level of extra participation the risks would be well worth it. Now I am convinced that a Google group would get more participation. But is that 10 more people, one hundred, one thousand? That I do not dare to guesstimate. 
(definitely more than 10, ;-) ) Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From cjfields at illinois.edu Wed Dec 16 18:34:42 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 12:34:42 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: <61CD9E3F-3F1A-4AC0-B863-9A24E7A373EF@illinois.edu> On Dec 16, 2009, at 11:40 AM, Hilmar Lapp wrote: > > On Dec 16, 2009, at 12:18 PM, Istvan Albert wrote: > >> I am sure one can come up with unlikely yet equally pessimistic scenarios for the existing setup as well. > > My point was that this is not unlikely at all. It happened with some of Yahoo's services, and it happened with others rather popular ones. > > If you own and operate your own brand, your equipment can still go out of order. But at least it's under your own control. Do you see that difference? Would you argue that that is unimportant? > >> This mailman setup is a throttle - it imposes a negative feedback on the amount of messages that it can handle. > > I really don't know what you mean by this. I get 200 messages a day from various lists. I'd be dead if I had an email client that can't thread and can't filter, but I do have one that can (and it's free). GMail can do both too, and is free. Have you tried a threading email reader? Can you explain how reading newsgroups through a threaded and filtering news reader is different and more efficient than reading emails through a threaded and filtering email reader? > > That all being said, if what's at issue here is to have a Google Group interface to the Biopython mailing list, then that's actually easy to achieve. 
Someone (ideally one of the current list or project admins/owners) creates a (presumably identically named) Google Group, and sets it to mirror the mailman mailing list. > > Guys - I'm happy to help with that if you don't know how to do that. Create the group, subscribe drycafe at gmail.com, and make me an admin. I'll configure the mirroring. > > -hilmar That would probably be a good idea for all the (most trafficked) open-bio groups. I'll work on it from the bioperl end. chris From tiagoantao at gmail.com Wed Dec 16 18:52:55 2009 From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=) Date: Wed, 16 Dec 2009 18:52:55 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> Message-ID: <6d941f120912161052m4360fe66w2fcff505fa61be3c@mail.gmail.com> On Wed, Dec 16, 2009 at 5:40 PM, Hilmar Lapp wrote: > If you own and operate your own brand, your equipment can still go out of > order. But at least it's under your own control. Do you see that difference? > Would you argue that that is unimportant? +1 . Just to say I fully subscribe to this point of view. github is _different_ by the very nature of being a distributed system. If tomorrow github.com disappears, it will be very easy to recover from it. If google turns bad, we loose a lot of history as google groups is not inherently distributed and thus neither resilient nor fail-safe. I prefer the status quo to the google groups change. From my point of view the technological autonomy provided by the OBF is a good thing. 
My ?0.02, Tiago From biopython at maubp.freeserve.co.uk Wed Dec 16 18:55:16 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 16 Dec 2009 18:55:16 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> Message-ID: <320fb6e00912161055l38b8e27dpbb4c19be7c23b18b@mail.gmail.com> On Wed, Dec 16, 2009 at 6:02 PM, Istvan Albert wrote: > A response mentioned that Gmail's email threading works just as well > ... well it doesn't help anyone who has not already been subscribed > to the messages to begin with. If you come to a discussion later you > cannot get that. True - but that becomes less and less of an issue once you have signed up. > ... Get someone who is 20 and has never heard of sending > emails to an address then see what they say about it... Do 20 year olds really not know how to use email these days? I do talk to graduate students, and hadn't noticed a trend. I must be getting old(er). Maybe we need a "Dummies Guide to setting up a GoogleMail/Thunderbird/Outlook filter"? e.g. I have one to move things from the inbox to a "Biopython" folder automatically. I do take you point that making the mailing list more accessible to novices (especially university students) is a good idea - and you may be right that *mirroring* it on GoogleGroups could be a solution. I don't know enough about how that works to have an informed viewpoint, but I trust Hilmar to look into it. 
Peter From iua1 at psu.edu Wed Dec 16 18:57:39 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:57:39 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <6d941f120912161052m4360fe66w2fcff505fa61be3c@mail.gmail.com> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <6d941f120912161052m4360fe66w2fcff505fa61be3c@mail.gmail.com> Message-ID: 2009/12/16 Tiago Antão : > If google turns bad, we loose a lot of history as google groups > is not inherently distributed and thus neither resilient No, on a second thought it actually is. Sign up to the group such that is sends an email for every single message (if you wish so). Google goes under, go back to the site the way it is right now. Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From iua1 at psu.edu Wed Dec 16 18:59:11 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 13:59:11 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912161055l38b8e27dpbb4c19be7c23b18b@mail.gmail.com> References: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> <320fb6e00912161055l38b8e27dpbb4c19be7c23b18b@mail.gmail.com> Message-ID: On Wed, Dec 16, 2009 at 1:55 PM, Peter wrote: > Do 20 year olds really not know how to use email these days? > I do talk to graduate students, and hadn't noticed a trend. > I must be getting old(er). Maybe we need a "Dummies Not email but interacting with a listserver via emails. Ask your students what a list server is. 
-- Istvan Albert http://www.personal.psu.edu/iua1 From iua1 at psu.edu Wed Dec 16 19:14:21 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 14:14:21 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: On Wed, Dec 16, 2009 at 2:07 PM, Chris Fields wrote: > Just curious, but does anyone know whether Google groups are more or less susceptible to spamming? > The current mailman setup does keep out a vast majority of spam (I can't recall the last instance, actually). The effective way to deal with it is white listing. There is a setting that requires that the first message from a given email be approved by mods. That being said spam, popularity and ease of access all correlate. -- Istvan Albert http://www.personal.psu.edu/iua1 From cjfields at illinois.edu Wed Dec 16 19:05:46 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 13:05:46 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <6d941f120912161052m4360fe66w2fcff505fa61be3c@mail.gmail.com> Message-ID: On Dec 16, 2009, at 12:57 PM, Istvan Albert wrote: > 2009/12/16 Tiago Ant?o : > >> If google turns bad, we loose a lot of history as google groups >> is not inherently distributed and thus neither resilient > > No, on a second thought it actually is. > > Sign up to the group such that is sends an email for every single > message (if you wish so). Google goes under, go back to the site the > way it is right now. 
> > Istvan ...and in the meantime we lose any content only present on the google group list. Whereas if the group is a mirror of this list, then nothing is lost. chris From hlapp at drycafe.net Wed Dec 16 19:20:21 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 14:20:21 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: <307922E9-8DD9-408F-8D39-D259186B3568@drycafe.net> It's the mailing list moderator volunteers that keep the spam out, actually. It's gotten so bad though that most OBF lists are set to reject non-member posts. What Google might be better at is to reject the spam well enough that one could open up the lists again to non-member posting. -hilmar Sent from away On Dec 16, 2009, at 2:07 PM, Chris Fields wrote: > Just curious, but does anyone know whether Google groups are more or > less susceptible to spamming? The current mailman setup does keep > out a vast majority of spam (I can't recall the last instance, > actually). From cjfields at illinois.edu Wed Dec 16 19:07:39 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 16 Dec 2009 13:07:39 -0600 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> Message-ID: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> On Dec 16, 2009, at 12:40 PM, Istvan Albert wrote: > On Wed, Dec 16, 2009 at 1:28 PM, Chris Fields wrote: >> >> Agreed. 
With bioperl we generally indicate it's best to search the archives prior to asking a question, just in case the answer is already known or has been worked out. > > My fault for being insufficiently clear. I am not saying that having > archives is useless. > > It all needs to be framed in the mindset of an unexpected event > causing an archive to be lost. Is that irreparable harm? For example > would having a hundred more active participants be worth the small > risk of losing the archives? Not that I think Google is in any danger of going under, or that Google Groups will cease to exist, but they have discontinued services in the past (notebook was one, and I recall others going away). > I am just putting the 100 as a number out there, just to get you to > think. I think you all agree that at some level of extra participation > the risks would be well worth it. I understand your point, but I'm not really convinced this is something that can't be accomplished by simply mirroring the group and redirecting new users to sign up on the obf forums. > Now I am convinced that a Google group would get more participation. > But is that 10 more people, one hundred, one thousand? That I do not > dare to guesstimate. > > (definitely more than 10, ;-) ) > > Istvan I think mirroring the list is the best compromise. I can't envision moving everything wholesale over to Google Groups for the reasons Hilmar has outlined. Just curious, but does anyone know whether Google groups are more or less susceptible to spamming? The current mailman setup does keep out a vast majority of spam (I can't recall the last instance, actually). 
chris From iua1 at psu.edu Wed Dec 16 19:21:05 2009 From: iua1 at psu.edu (Istvan Albert) Date: Wed, 16 Dec 2009 14:21:05 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: On Wed, Dec 16, 2009 at 2:07 PM, Chris Fields wrote: > I understand your point, but I'm not really convinced this is something that can't be accomplished by simply mirroring the group and redirecting new users to sign up on the obf forums. Great idea! While you are at it, why not allow people to post as well? Sign up the current list so that when someone posts on Google Groups it also goes to the current biopython list. When people reply-all from biopython group it will go to both lists. Maybe it is possible to get both worlds and using them in parallel! Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From hlapp at drycafe.net Wed Dec 16 19:29:14 2009 From: hlapp at drycafe.net (Hilmar Lapp) Date: Wed, 16 Dec 2009 14:29:14 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: That's included in the mirroring. You can post through either interface and join at either interface, it's transparent. -hilmar Sent from away On Dec 16, 2009, at 2:21 PM, Istvan Albert wrote: > Great idea! While you are at it, why not allow people to post as well? 
From pingou at pingoured.fr Wed Dec 16 19:35:07 2009 From: pingou at pingoured.fr (Pierre-Yves) Date: Wed, 16 Dec 2009 20:35:07 +0100 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: <1260992107.5993.0.camel@localhost.localdomain> On Wed, 2009-12-16 at 14:21 -0500, Istvan Albert wrote: > > Sign up the current list so that when someone posts on Google Groups > it also goes to the current biopython list That implies that people are subscribed to both lists, and that won't always be the case. Pierre From richard_w_g_price at academia.edu Wed Dec 16 22:50:05 2009 From: richard_w_g_price at academia.edu (Richard Price) Date: Wed, 16 Dec 2009 14:50:05 -0800 Subject: [Biopython] New Academia.edu feature for Biopython In-Reply-To: References: Message-ID: Dear Biopython members, I just wanted to let you know that there are now 5 members of Biopython on Academia.edu listing their research interests such as Analytical Chemistry, Computational Biology, and Bioinformatics. They have also listed contacts, photos and papers. There are thousands of people listing the same research interests as the Biopython members on Academia.edu, so there are lots of researchers for Biopython members to discover. To see the 5 members of Biopython on Academia.edu, and their research interests and papers, follow the link below: http://lists.academia.edu/See-members-of-Biopython Richard Dr. Richard Price, post-doc, Philosophy Dept, Oxford University. Founder of Academia.edu On Wed, Dec 2, 2009 at 5:21 PM, Richard Price wrote: > Dear Biopython members, > > > I wanted to tell the list about a new feature on Academia.edu.
> Academia.edu launched 12 months ago and now helps 300,000 academics a > month answer the question 'who's researching what?' > > > We have built a dedicated page on Academia.edu for the Biopython mailing list: > > > http://lists.academia.edu/See-members-of-Biopython > > > This page will show you fellow members already on Academia.edu. ?You > can see their papers, research interests, and other information. > > > Visit the link below, sign up with Academia.edu, and see who else from > Biopython is on Academia.edu. > > > > http://lists.academia.edu/See-members-of-Biopython > > > Richard > > > Dr. Richard Price, post-doc, Philosophy Dept, Oxford University. > Founder of Academia.edu > From lpritc at scri.ac.uk Thu Dec 17 09:35:56 2009 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 17 Dec 2009 09:35:56 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: Message-ID: Hi, On 16/12/2009 16:49, "Istvan Albert" wrote: > [the mailing list] feels like a throwback to the 90s. [...] > Of course python has comp.lang.python and it is a nice and > thriving group. comp.lang.python is a Usenet group, and Usenet is a throwback to the 70s. ;) > I can't even imagine what it would look like to have a popular > newsgroup being delivered to my mailbox. Mailing lists are far more convenient - for me - than having to navigate to a website especially to check new messages on a particular subject. Mailing lists bring new posts and issues to my attention via a single always-on interface in a timely manner. The current mailing list also has the advantage of grabbing the attention, specifically, of many people who might be able to do something about a query. As it happens, when I used to still read Usenet groups, I would do so from my mail client, with exactly the same threaded interface as I used for mailing lists and all other email. 
Biopython is never likely to be more than a niche interest, so I wouldn't expect it to ever reach the traffic of - say - alt.binaries. To be honest, the traffic doesn't even seem to approach that of numpy-discussion. And while we're talking about numpy-discussion, it illustrates one of Hilmar's and Chris' points: On 16/12/2009 16:56, "Hilmar Lapp" wrote: > what is your solution to Google all of a sudden deciding to take > Google Groups down, or to make it a paid subscription service, or if > Google goes out of business? On 16/12/2009 19:07, "Chris Fields" wrote: > Not that I think Google is in any danger of going under, or that Google Groups > will cease to exist, but they have discontinued services in the past (notebook > was one, and I recall others going away). http://groups.google.com/group/numpy-discussion/unlock?_done=/group/numpy-di scussion/ That mailing list was taken down from Google Groups for 'violating terms of service' - why, I don't know: it's a mailing list for a specialist Python library. It does illustrate though, how control can be (irrevocably) lost over communication via Google Groups. Notably, the mailing list itself persists without interruption. On 16/12/2009 17:36, "Istvan Albert" wrote: > One thing is seems clear to me and I do not think that you are aware > of it. This mailman setup is a throttle - it imposes a negative > feedback on the amount of messages that it can handle. > > This system of messages cannot grow over a certain limit. Just imagine > regularly getting a dozen new emails a day plus their followups, yet > you are just a casual user. > > This would be unbearable for many people whose inboxes are already > overflowing. This issue - that some people don't like to receive lots of messages at once - is already solved in mailman. There is a 'daily digest' option on the mailing list that collates the messages for a day, and sends them out as a single email for you. 
As a mailing list, mailman is deliberately designed to be relatively low volume in terms of content, but to reach many readers directly; the idea is to send relatively few messages to many people - but to push those messages through to the reader. Forums, wikis and website-based discussion lists require a deliberate effort on the part of the reader either to find what they're interested in, or to visit the site regularly. Otherwise they just end up signing up to receive updates by email, much like mailman. On 16/12/2009 17:40, "Hilmar Lapp" wrote: > That all being said, if what's at issue here is to have a Google Group > interface to the Biopython mailing list, then that's actually easy to > achieve. FWIW, I think that archiving the mailing list on Google Groups is not a bad idea - so long as the current registration scheme continues to prevent the inevitable waves of spam from Google Groups users. The Biopython mailing lists appear already to have been archived - with various degrees of usable interface, and likely intermittent coverage, too - at sites such as: http://www.mailinglistarchive.com/biopython at biopython.org/index.html and http://osdir.com/ml/search.html?cx=008059810939676512379%3Af5owd_2hq3u&cof=F ORID%3A10&q=%5Bbiopython%5D&sa=Search amongst others. On 16/12/2009 18:02, "Istvan Albert" wrote: > All of you who responded - frankly I think you are too close to this > issue to be able judge it correctly. [...] Get > someone who is 20 and has never heard of sending emails to an address > then see what they say about it... If they've never heard of emailing an address, and/or can't use a mail client to filter their email, I'm not sure the immediate problem is necessarily with the mailing list... ;) I don't think that anyone here wants to restrict access to Biopython, or to prevent discussion, even inadvertently. That we've had 28 posts on this issue in about 12 hours suggests that the list can handle issues of some interest. 
Sure, it would be nice to have a convenient, web-accessible and searchable archive with a pretty and robust interface (and Google Groups could give us that). But I'm not convinced that the current mailing list is a particular barrier to participation. +1 for mirroring/archiving on Google Groups: http://groups.google.com/support/bin/answer.py?hl=en&answer=46387 L. -- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405 ______________________________________________________ SCRI, Invergowrie, Dundee, DD2 5DA. The Scottish Crop Research Institute is a charitable company limited by guarantee. Registered in Scotland No: SC 29367. Recognised by the Inland Revenue as a Scottish Charity No: SC 006662. DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.ac.uk quoting the name of the sender and delete the email from your system. Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any). 
______________________________________________________ From lpritc at scri.ac.uk Thu Dec 17 10:14:54 2009 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 17 Dec 2009 10:14:54 +0000 Subject: [Biopython] Is there any in silico PCR tool on biopython? In-Reply-To: <65d4b7fc0912160857l30796cb4p33b8ff2c8dfac693@mail.gmail.com> Message-ID: Hi Carlos, There's an interface to the EMBOSS package primersearch, which might be what you're looking for. http://embossgui.sourceforge.net/demo/manual/primersearch.html http://github.com/biopython/biopython/blob/master/Bio/Emboss/Applications.py Cheers, L. On 16/12/2009 16:57, "Carlos Javier Borroto" wrote: > Hi there, > > I'm looking for a way to show some information on the specificity of > sets of primers I'm designing, I'll love to have a way to run an in > silico PCR reaction and parse the results with biopython. > > Is there something to use with one of the available PCR simulation tools? > > regards, -- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405
From biopython at maubp.freeserve.co.uk Thu Dec 17 10:42:16 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 10:42:16 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <320fb6e00912170242p7caa551eh91c2aa7beabc00f1@mail.gmail.com> On Thu, Dec 17, 2009 at 9:35 AM, Leighton Pritchard wrote: > > On 16/12/2009 16:56, "Hilmar Lapp" wrote: > >> what is your solution to Google all of a sudden deciding to take >> Google Groups down, or to make it a paid subscription service, >> or if Google goes out of business? > > On 16/12/2009 19:07, "Chris Fields" wrote: > >> Not that I think Google is in any danger of going under, or that Google Groups >> will cease to exist, but they have discontinued services in the past (notebook >> was one, and I recall others going away). > > http://groups.google.com/group/numpy-discussion/unlock?_done=/group/numpy-discussion/ > > That mailing list was taken down from Google Groups for 'violating terms of > service' - why, I don't know: it's a mailing list for a specialist Python > library. It does illustrate though, how control can be (irrevocably) lost > over communication via Google Groups. Notably, the mailing list itself > persists without interruption. I remembered that example later - if NumPy had *switched* that would have been a major upset to their community.
As it was, it seems people who were using the Google Groups interface switched to using plain old email: http://mail.scipy.org/pipermail/numpy-discussion/2009-October/045855.html Peter From biopython at maubp.freeserve.co.uk Thu Dec 17 10:55:51 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 10:55:51 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: <320fb6e00912170255y3968a6e4s3bd904071910251@mail.gmail.com> On Wed, Dec 16, 2009 at 6:55 PM, Peter wrote: > > I do take you point that making the mailing list more > accessible to novices (especially university students) is a > good idea - and you may be right that *mirroring* it on > GoogleGroups could be a solution. I don't know enough > about how that works to have an informed viewpoint, but > I trust Hilmar to look into it. > On Thu, Dec 17, 2009 at 9:35 AM, Leighton Pritchard wrote: > > +1 for mirroring/archiving on Google Groups: > http://groups.google.com/support/bin/answer.py?hl=en&answer=46387 > It looks like we can see mirroring/archiving on Google Groups in action - Hilmar and Chris have got this up and running for the main BioPerl list, http://lists.open-bio.org/pipermail/bioperl-l/2009-December/031789.html http://groups.google.com/group/bioperl-l It is a shame there isn't any obvious way to import the existing archive though. Peter From biopython at maubp.freeserve.co.uk Thu Dec 17 12:27:59 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 12:27:59 +0000 Subject: [Biopython] Entrez.efetch Service unavailable! 
In-Reply-To: <5b6410e0912160125j4f034218x263e6ab90b4afd47@mail.gmail.com> References: <5b6410e0912151952h76787344t18968aba5ab350d8@mail.gmail.com> <320fb6e00912160058i57acc020nba7c61c53a4ec64b@mail.gmail.com> <5b6410e0912160125j4f034218x263e6ab90b4afd47@mail.gmail.com> Message-ID: <320fb6e00912170427q6b31ee33x6b161323d24f3027@mail.gmail.com> On Wed, Dec 16, 2009 at 9:25 AM, Kevin Lam wrote: > Hi Peter, > Thanks for the suggestion. It was also an exercise for me since I am new to > Biopython and add to the fact that I do not need the Archaea sequences as I > am looking for pathogenic bacteria. I admit I am lazy ha! All the same, > thanks for being so helpful to a newbie to biopython. > Cheers > Kevin No problem. I've been trying Entrez EFetch on and off over the last 24 hours, and it has been unavailable for a while. It seems to be back now. Peter P.S. Mailing list CC'd From giles.weaver at googlemail.com Thu Dec 17 13:25:18 2009 From: giles.weaver at googlemail.com (Giles Weaver) Date: Thu, 17 Dec 2009 13:25:18 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> Message-ID: <4B2A313E.3050300@googlemail.com> I urge extreme caution with regards to using Google Groups. My experiences with Google Groups have been less than satisfactory. I've administered several small groups and they have been plagued with delivery issues for a small subset of users. I've also known members to be unable to access a group that they are a member of, and find that Google Groups has "disappeared" their profile - despite having an otherwise fully functioning Google account (Gmail etc).
I'm not a fan of mailman, but at least the open-bio administrators have full control over the lists. I've known Chris to solve issues with the bioperl mailman list within minutes. You won't get that kind of service (if any) from Google. Having looked at alternatives to Google Groups myself, the two things that have caught my attention are bbPress (a wordpress derived bulletin board) and Google Wave. Both are still under development. bbPress boards can be subscribed to via RSS (and possibly email), so users can have messages drop into their mail/news reader. Wave looks promising, but I wouldn't touch it with a barge pole until mature Google free implementations take off, and that could be some time away. Mirroring the open-bio mailman lists onto Google Groups seems to me the right way to go, but I think there should be a health warning on the list home pages! Giles On 16/12/2009 19:07, Chris Fields wrote: > On Dec 16, 2009, at 12:40 PM, Istvan Albert wrote: > > >> On Wed, Dec 16, 2009 at 1:28 PM, Chris Fields wrote: >> >>> Agreed. With bioperl we generally indicate it's best to search the archives prior to asking a question, just in case the answer is already known or has been worked out. >>> >> My fault for being insufficiently clear. I am not saying that having >> archives is useless. >> >> It all needs to be framed in the mindset of an unexpected event >> causing an archive to be lost. Is that irreparable harm? For example >> would having a hundred more active participants be worth the small >> risk of losing the archives? >> > Not that I think Google is in any danger of going under, or that Google Groups will cease to exist, but they have discontinued services in the past (notebook was one, and I recall others going away). > > >> I am just putting the 100 as a number out there, just to get you to >> think. I think you all agree that at some level of extra participation >> the risks would be well worth it. 
>> > I understand your point, but I'm not really convinced this is something that can't be accomplished by simply mirroring the group and redirecting new users to sign up on the obf forums. > > >> Now I am convinced that a Google group would get more participation. >> But is that 10 more people, one hundred, one thousand? That I do not >> dare to guesstimate. >> >> (definitely more than 10, ;-) ) >> >> Istvan >> > I think mirroring the list is the best compromise. I can't envision moving everything wholesale over to Google Groups for the reasons Hilmar has outlined. > > Just curious, but does anyone know whether Google groups are more or less susceptible to spamming? The current mailman setup does keep out a vast majority of spam (I can't recall the last instance, actually). > > chris > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From biopython at maubp.freeserve.co.uk Thu Dec 17 13:35:25 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 13:35:25 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <4B2A313E.3050300@googlemail.com> References: <227F9F99-39F4-4AF1-AEE0-03313D14049F@drycafe.net> <0B35BE99-F267-4A0E-8E86-63EF079167B1@drycafe.net> <9B112844-2D52-41BF-8D25-83BDD64D2560@drycafe.net> <085CF361-FDAA-414B-BD7A-E249B6E9FC66@illinois.edu> <4B2A313E.3050300@googlemail.com> Message-ID: <320fb6e00912170535r3a6d2a4cv37d852aff3671e6e@mail.gmail.com> On Thu, Dec 17, 2009 at 1:25 PM, Giles Weaver wrote: > I urge extreme caution with regards to using Google Groups. > My experiences with Google Groups have been less than satisfactory. > ... > > Mirroring the open-bio mailman lists onto Google Groups seems to me the > right way to go, but I think there should be a health warning on the list > home pages! Thanks for the hard earned advice. 
I absolutely agree that *moving* to GoogleGroups is a bad idea, and accept that *mirroring* may be worth a try. Given the BioPerl list is already trying this out, let's give that a week or so, and if they think it works nicely then we could do the same for Biopython. Peter From David.Lapointe at umassmed.edu Thu Dec 17 12:42:15 2009 From: David.Lapointe at umassmed.edu (Lapointe, David) Date: Thu, 17 Dec 2009 07:42:15 -0500 Subject: [Biopython] EMBOSS and Python Message-ID: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> I haven't used EMBOSS through the Python module before so this hasn't been an issue but even though I have a working install of EMBOSS, all of the EMBOSS tests seem to fail. I haven't seen any instructions for this. David From biopython at maubp.freeserve.co.uk Thu Dec 17 17:04:11 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 17:04:11 +0000 Subject: [Biopython] EMBOSS and Python In-Reply-To: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> Message-ID: <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> On Thu, Dec 17, 2009 at 12:42 PM, Lapointe, David wrote: > I haven't used EMBOSS through the Python module before so this hasn't > been an issue but even though I have a working install of EMBOSS, all > of the EMBOSS tests seem to fail. I haven't seen any instructions for > this. What version of Biopython do you have? And if you are talking about the Biopython unit tests, could you post the output please? What version of EMBOSS do you have? Some of the Biopython tests did flag issues in EMBOSS which are fixed in their latest release.
Thanks, Peter From iua1 at psu.edu Thu Dec 17 17:29:17 2009 From: iua1 at psu.edu (Istvan Albert) Date: Thu, 17 Dec 2009 12:29:17 -0500 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: References: Message-ID: On Thu, Dec 17, 2009 at 4:35 AM, Leighton Pritchard wrote: > Mailing lists are far more convenient - for me - than having to navigate to > a website especially to check new messages on a particular subject. Just for clarity: nobody is suggesting to take that away. You can *always* get the messages delivered via email. I think some of you get a little bit defensive because you assume that the suggestion is about messing with your system. > Biopython is never likely to be more than a niche interest, You are perfectly right here and that echoes my original sentiment! What keeps biopython a niche interest is exactly the lack of community building features. There obstacles in getting into it, so most won't. It doesn't take much to discourage a newcomer. best, Istvan -- Istvan Albert http://www.personal.psu.edu/iua1 From dalloliogm at gmail.com Thu Dec 17 17:48:18 2009 From: dalloliogm at gmail.com (Giovanni Marco Dall'Olio) Date: Thu, 17 Dec 2009 18:48:18 +0100 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> References: <320fb6e00912160926i2d17312br3f6ca730aa42daa1@mail.gmail.com> Message-ID: <5aa3b3570912170948k1890aa5ft3c30c760718c3cbd@mail.gmail.com> On Wed, Dec 16, 2009 at 6:26 PM, Peter wrote: > On Wed, Dec 16, 2009 at 4:49 PM, Istvan Albert wrote: >> >> The alternative such as Google groups would be far superior in >> attracting and building a community of developers and users as well. >> Is this an idea that ?the owners of the list would entertain? >> > > As far as I can tell from the limited documentation I found on > Google Groups (maybe I was looking in the wrong place?) 
there > is no way to import our existing decade long mail archives. > That would be a major downside. It seems that you can use a google/group to archive the messages from a remote mailing list, but you can't import the older messages to the google group: - http://groups.google.com/support/bin/answer.py?hl=en&answer=46387 Maybe you can give it a try: just create a group on google and use it as a mirror for the current mailing list. This way google users will be able to use the google interface to search and read messages, but they won't be able to post messages by mail. It won't harm anyone to have a mirror of the messages in GG, except that it may be a bit confusing for new users. About other possibilities, it seems that there is not yet a way to import a whole archive to GG: - http://groups.google.com/group/Groups-Suggestions/browse_thread/thread/3eb3ed32dee6b97e I have tried google/wave (I have invitations if anyone wants) but please, avoid switching to it yet because it is still very very buggy and it will take them at least one year or so to make it useful. > Also, using Google Groups would *require* all posters to have > a Google Account - a potential sticking point for some. I am still not sure about this: it seems that you can subscribe with another address if you are invited, but you have to create a google account (not necessarily a google mail) to subscribe to it manually. It is a nice idea to switch to a fancier program for managing the list, but maybe it would require too much time to switch to google/groups.
> > Peter > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > -- Giovanni Dall'Olio, phd student Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain) My blog on bioinformatics: http://bioinfoblog.it From lpritc at scri.ac.uk Thu Dec 17 17:55:52 2009 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Thu, 17 Dec 2009 17:55:52 +0000 Subject: [Biopython] suggestion: moving to the discussion list to Google groups In-Reply-To: Message-ID: Hi, On 17/12/2009 17:29, "Istvan Albert" wrote: > On Thu, Dec 17, 2009 at 4:35 AM, Leighton Pritchard wrote: > >> Mailing lists are far more convenient - for me - than having to navigate to >> a website especially to check new messages on a particular subject. > > Just for clarity: nobody is suggesting to take that away. You can > *always* get the messages delivered via email. I wasn't worried that anyone would 'take it away'. Though it must be said that not all discussion systems allow email tracking. > I think some of you get a little bit defensive because you assume that the > suggestion is about messing with your system. I don't think that's a fair characterisation. So far, what I've seen is open and thoughtful discussion of a comment you've made, and general agreement that mirroring/archiving on Google Groups is a good idea. Just because someone doesn't agree with every point you make, that doesn't make them defensive. >> Biopython is never likely to be more than a niche interest, > > You are perfectly right here and that echoes my original sentiment! > What keeps biopython a niche interest is exactly the lack of community > building features. There are obstacles in getting into it, so most won't. > It doesn't take much to discourage a newcomer. That's not what I meant when I wrote that it's a niche interest. Biopython is a niche interest because a user likely fulfils three - rather unusual - criteria.
I believe that these are the big obstacles to participation - not the software used for the mailing list: - they program in Python - they have an interest (likely professional or educational - not many bioinformatics hobbyists out there) in bioinformatics - they have bothered to investigate an existing general library for bioinformatics in Python, rather than try to solve their problem from scratch ;) Add to that, that the mailing list is a forum for discussion which - along with all other online fora for discussion, including Google Groups - is inherently self-selecting for active members. Taken together, this sets its own upper limit to the user base, which in turn helps define what is a sensible system to convey information between, and to, users. The three criteria listed above do suggest a sufficient level of comfort with the problem domain, such that Biopython's presence on Google Groups is - in my opinion - unlikely to be the major draw for new contributors or users. I do worry for the future of research if signing up to a mailing list (which you can do via the website; no subscription email is required) is an insurmountable hurdle for young, apparently computer-literate scientists. If that sort of thing really is a problem, it should at least keep conference attendances down in years to come, as navigating their Byzantine registration systems must drive them to endless despair... L. -- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405 ______________________________________________________ SCRI, Invergowrie, Dundee, DD2 5DA. The Scottish Crop Research Institute is a charitable company limited by guarantee. Registered in Scotland No: SC 29367. Recognised by the Inland Revenue as a Scottish Charity No: SC 006662. 
DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.ac.uk quoting the name of the sender and delete the email from your system. Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any). ______________________________________________________ From iua1 at psu.edu Thu Dec 17 18:47:23 2009 From: iua1 at psu.edu (Istvan Albert) Date: Thu, 17 Dec 2009 13:47:23 -0500 Subject: [Biopython] some eye opening stats Message-ID: Hello Everyone, So I ran some statistics on this group (see below) that includes the entire past year. Make you own decisions based on it. Here is one of my observation: I find it saddening that I made the list at number 18! That's some niche list where one person posting ten messages in a whole year gets to be at number 18. In fact I only need three more posts to make myself top ten poster! Would you still claim this to be a good way to establish, grow and interact with a community? I said this many times before, and I'll try for this to be the last time I bring this up: I believe biopython is a niche software tool because *YOU* are limiting its reach *YOURSELVES* by making inappropriate decisions as far as accessibility and community goes. It will stay so as long as you don't recognize and act on this. 
best regards, Istvan Albert

=================================
Statistics from 1.12.2008 to 17.12.2009
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

***** People who have written most messages:

+----+-----Author-----------------------------------+--Msg-+-Percent-+
|  1 | biopython at maubp.freeserve.co.uk (Peter) | 472 | 39.70 % |
|  2 | chapmanb at 50mail.com (Brad Chapman) | 53 | 4.46 % |
|  3 | p.j.a.cock at googlemail.com (Peter Cock) | 36 | 3.03 % |
|  4 | lueck at ipk-gatersleben.de (=?iso-8859-1?Q? | 27 | 2.27 % |
|  5 | mjldehoon at yahoo.com (Michiel de Hoon) | 23 | 1.93 % |
|  6 | cjfields at illinois.edu (Chris Fields) | 22 | 1.85 % |
|  7 | dalloliogm at gmail.com (Giovanni Marco Dall | 21 | 1.77 % |
|  8 | winda002 at student.otago.ac.nz (David Winte | 15 | 1.26 % |
|  9 | lpritc at scri.ac.uk (Leighton Pritchard) | 14 | 1.18 % |
| 10 | cmckay at u.washington.edu (Cedar McKay) | 13 | 1.09 % |
| 11 | italo.maia at gmail.com (Italo Maia) | 12 | 1.01 % |
| 12 | kellrott at gmail.com (Kyle Ellrott) | 12 | 1.01 % |
| 13 | rodrigo_faccioli at uol.com.br (Rodrigo facc | 12 | 1.01 % |
| 14 | bartek at rezolwenta.eu.org (Bartek Wilczyns | 12 | 1.01 % |
| 15 | anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Ro | 11 | 0.93 % |
| 16 | bartomas at gmail.com (bar tomas) | 11 | 0.93 % |
| 17 | pzs at dcs.gla.ac.uk (Peter Saffrey) | 11 | 0.93 % |
| 18 | iua1 at psu.edu (Istvan Albert) | 10 | 0.84 % |
| 19 | dejmail at gmail.com (Liam Thompson) | 10 | 0.84 % |
| 20 | stran104 at chapman.edu (Matthew Strand) | 10 | 0.84 % |
| 21 | ibdeno at gmail.com (Miguel Ortiz Lombardia) | 9 | 0.76 % |
| 22 | pengyu.ut at gmail.com (Peng Yu) | 9 | 0.76 % |
| 23 | yvan.strahm at bccs.uib.no (Yvan Strahm) | 9 | 0.76 % |
| 24 | kelly.oakeson at utah.edu (Kelly F Oakeson) | 9 | 0.76 % |
| 25 | carlos.borroto at gmail.com (Carlos Javier B | 9 | 0.76 % |
+----+----------------------------------------------+------+---------+
|    | other | 337 | 28.34 % |
+----+----------------------------------------------+------+---------+

***** Best authors, by total size of their messages (w/o quoting):

+----+-----Author-------------------------------------------+-KBytes-+
|  1 | biopython at maubp.freeserve.co.uk (Peter) | 391.4 |
|  2 | chapmanb at 50mail.com (Brad Chapman) | 48.1 |
|  3 | lpritc at scri.ac.uk (Leighton Pritchard) | 39.1 |
|  4 | p.j.a.cock at googlemail.com (Peter Cock) | 35.7 |
|  5 | lueck at ipk-gatersleben.de (=?iso-8859-1?Q?Stefanie | 27.9 |
|  6 | matzke at berkeley.edu (Nick Matzke) | 21.2 |
|  7 | animesh.agrawal at anu.edu.au (Animesh Agrawal) | 21.1 |
|  8 | kelly.oakeson at utah.edu (Kelly F Oakeson) | 16.7 |
|  9 | pzs at dcs.gla.ac.uk (Peter Saffrey) | 15.4 |
| 10 | dalloliogm at gmail.com (Giovanni Marco Dall'Olio) | 14.7 |
| 11 | natassa_g_2000 at yahoo.com (natassa) | 12.7 |
| 12 | hlapp at gmx.net (Hilmar Lapp) | 12.7 |
| 13 | mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydG | 12.5 |
| 14 | cjfields at illinois.edu (Chris Fields) | 12.0 |
| 15 | bartek at rezolwenta.eu.org (Bartek Wilczynski) | 11.6 |
| 16 | winda002 at student.otago.ac.nz (David Winter) | 11.3 |
| 17 | cmckay at u.washington.edu (Cedar McKay) | 11.0 |
| 18 | mjldehoon at yahoo.com (Michiel de Hoon) | 10.9 |
| 19 | dejmail at gmail.com (Liam Thompson) | 10.6 |
| 20 | rodrigo_faccioli at uol.com.br (Rodrigo faccioli) | 10.3 |
| 21 | anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues? | 9.7 |
| 22 | fufezan at uni-muenster.de (Christian Fufezan) | 9.6 |
| 23 | kellrott at gmail.com (Kyle Ellrott) | 9.2 |
| 24 | peter at maubp.freeserve.co.uk (Peter) | 8.9 |
| 25 | ibdeno at gmail.com (Miguel Ortiz Lombardia) | 8.3 |
+----+------------------------------------------------------+--------+

***** Best authors, by average size of their message (w/o quoting):

+----+-----Author--------------------------------------------+-bytes-+
|  1 | agarbino at gmail.com (Alex Garbino) | 7991 |
|  2 | aduran at fhcrc.org (Duran, Alysha M) | 3893 |
|  3 | n.j.loman at bham.ac.uk (Nick Loman) | 3877 |
|  4 | mmueller at python-academy.de (=?ISO-8859-15?Q?Mike_M | 3816 |
|  5 | animesh.agrawal at anu.edu.au (Animesh Agrawal) | 3598 |
|  6 | ivan at biodec.com (Ivan Rossi) | 2947 |
|  7 | thomas.e.keller at gmail.com (Thomas Keller) | 2937 |
|  8 | lpritc at scri.ac.uk (Leighton Pritchard) | 2860 |
|  9 | matzke at berkeley.edu (Nick Matzke) | 2717 |
| 10 | jhcepas at gmail.com (Jaime Huerta Cepas) | 2705 |
| 11 | fufezan at uni-muenster.de (Christian Fufezan) | 2447 |
| 12 | natassa_g_2000 at yahoo.com (natassa) | 2175 |
| 13 | bav853 at bham.ac.uk (Bhima A van der Molen) | 1987 |
| 14 | kteague at bcgsc.ca (Kevin Teague) | 1932 |
| 15 | danielchubb at gmail.com (Daniel Chubb) | 1923 |
| 16 | lueck at ipk-gatersleben.de (=?utf-8?Q?Stefanie_L=C3= | 1919 |
| 17 | dalloliogm at fastwebnet.it (Giovanni Marco Dall'Olio | 1915 |
| 18 | kelly.oakeson at utah.edu (Kelly F Oakeson) | 1895 |
| 19 | bav853 at bham.ac.uk (Bhima Auro van der Molen) | 1874 |
| 20 | darnells at dnastar.com (Steve Darnell) | 1768 |
| 21 | gatoygata at hotmail.com (Joaquin Abian Monux) | 1744 |
| 22 | srini_iyyer_bio at yahoo.com (Srinivas Iyyer) | 1733 |
| 23 | bassbabyface at yahoo.com (Ben O'Loghlin) | 1698 |
| 24 | hlapp at gmx.net (Hilmar Lapp) | 1624 |
| 25 | mmokrejs at ribosome.natur.cuni.cz (=?UTF-8?B?TWFydGl | 1603 |
+----+-------------------------------------------------------+-------+

***** Table showing the most successful subjects:

+----+----Subject-----------------------------------+--Msg-+-Percent-+
|  1 | [Biopython] suggestion: moving to the discus | 33 | 2.78 % |
|  2 | [Biopython] Problems parsing with PSIBlastPa | 32 | 2.69 % |
|  3 | [BioPython] The count method of a Seq (or Mu | 23 | 1.93 % |
|  4 | [Biopython] Biopython and Snow Leopard | 21 | 1.77 % |
|  5 | [Biopython] Phylogenetic trees with biopytho | 16 | 1.35 % |
|  6 | [Biopython] Indexing large sequence files | 15 | 1.26 % |
|  7 | [Biopython] SQL Alchemy based BioSQL | 14 | 1.18 % |
|  8 | [BioPython] AlignIO: Sequences of different | 13 | 1.09 % |
|  9 | [Biopython] searching for a human chromosome | 13 | 1.09 % |
| 10 | [Biopython] BLAST against mouse genome only | 12 | 1.01 % |
| 11 | [Biopython] How to get sequences upstream of | 12 | 1.01 % |
| 12 | [Biopython] Parsing large blast files | 11 | 0.93 % |
| 13 | [Biopython] Additions to the SeqRecord | 11 | 0.93 % |
| 14 | [Biopython] Biopython & p3d | 11 | 0.93 % |
| 15 | [BioPython] Feedback from Biopython 1.50 bet | 10 | 0.84 % |
| 16 | [BioPython] Adding startswith and endswith m | 10 | 0.84 % |
| 17 | [Biopython] Bio.Sequencing.Ace | 10 | 0.84 % |
| 18 | [Biopython] Entrez.read return value is type | 10 | 0.84 % |
| 19 | [Biopython] SeqIO for fasta conversion of Il | 10 | 0.84 % |
| 20 | [Biopython] Parsing problem | 9 | 0.76 % |
| 21 | [Biopython] Fasta.index_file: functionality | 9 | 0.76 % |
| 22 | [Biopython] Adaptor trimmer and dimers | 9 | 0.76 % |
| 23 | [BioPython] Is query_length really the lengt | 8 | 0.67 % |
| 24 | [BioPython] Reading Roche 454 binary SFF fil | 8 | 0.67 % |
| 25 | [Biopython] Writing into a PDB file using PD | 8 | 0.67 % |
+----+----------------------------------------------+------+---------+
|    | other | 851 | 71.57 % |
+----+----------------------------------------------+------+---------+

***** Most used email clients:

+----+----Mailer------------------------------------+--Msg-+-Percent-+
|  1 | (unknown) | 1189 |100.00 % |
+----+----------------------------------------------+------+---------+
|    | other | 0 | 0.00 % |
+----+----------------------------------------------+------+---------+

***** Table of maximal quoting:

+----+-----Author------------------------------------------+-Percent-+
|  1 | golubchi at stats.ox.ac.uk (Tanya Golubchik) | 94.68 % |
|  2 | fredgca at hotmail.com (Frederico Arnoldi) | 92.56 % |
|  3 | srikrishnamohan at gmail.com (km) | 92.28 % |
|  4 | cmckay at u.washington.edu (Cedar Mckay) | 88.19 % |
|  5 | harekrishna at gmail.com (Austin Davis-Richar | 87.51 % |
|  6 | biopython.chen at gmail.com (chen Ku) | 86.14 % |
|  7 | wgheath at gmail.com (William Heath) | 85.12 % |
|  8 | jkhilmer at gmail.com (Jonathan Hilmer) | 82.50 % |
|  9 | andrea at biodec.com (Andrea) | 80.82 % |
| 10 | nuin at genedrift.org (Paulo Nuin) | 79.90 % |
| 11 | lueck at ipk-gatersleben.de (lueck at ipk-gat | 78.81 % |
| 12 | ibdeno at gmail.com (Miguel Ortiz Lombardia) | 78.57 % |
| 13 | oda at georgetown.edu (Ogan ABAAN) | 75.33 % |
| 14 | sean.maceach at gmail.com (Sean MacEachern) | 75.09 % |
| 15 | pengyu.ut at gmail.com (Peng Yu) | 73.17 % |
| 16 | fungazid at yahoo.com (Fungazid) | 71.57 % |
| 17 | yvan.strahm at bccs.uib.no (Yvan Strahm) | 70.23 % |
| 18 | cjfields at illinois.edu (Chris Fields) | 70.18 % |
| 19 | sdavis2 at mail.nih.gov (Sean Davis) | 68.54 % |
| 20 | thomas.hamelryck at gmail.com (Thomas Hamelry | 68.31 % |
| 21 | mjldehoon at yahoo.com (Michiel de Hoon) | 66.49 % |
| 22 | mavata at gmail.com (Manu Tamminen) | 66.42 % |
| 23 | eric.talevich at gmail.com (Eric Talevich) | 65.92 % |
| 24 | bsouthey at gmail.com (Bruce Southey) | 64.70 % |
| 25 | biopythonlist at gmail.com (dr goettel) | 64.53 % |
+----+-----------------------------------------------------+---------+
|    | average | 44.57 % |
+----+-----------------------------------------------------+---------+

***** Graph showing number of messages written during hours of day:

100% ---------------------#--------------------------- - 146
 90% ---------------------#--------------------------- msgs
 80% ---------------------#---------------------------
 70% ---------------------#-#-------------------------
 60% ---------------------#-#-#-#-#---#---------------
 50% ---------------------#-#-#-#-#-#-#-#-------------
 40% ---------------------#-#-#-#-#-#-#-#-#-----------
 30% -----------------#-#-#-#-#-#-#-#-#-#-#-#---------
 20% -----------------#-#-#-#-#-#-#-#-#-#-#-#---------
 10% ---------------#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-
      * * * * * * * * * * * * * * * * * * * * * * * *
hour 0 5 11 17 23

***** Graph showing number of messages written during days of month:

100% -----------------------------#--------------------------------- - 74
 90% -----------------------------#-#------------------------------- msgs
 80% -----------------#-----------#-#-------------------------------
 70% -#---------------#---------#-#-#---#---------------------------
 60% -#-------#-#-----#---------#-#-#-#-#-#-#-----------------#-----
 50% -#-------#-#-----#-------#-#-#-#-#-#-#-#-----#-#---------#-----
 40% -#-#-----#-#-----#-----#-#-#-#-#-#-#-#-#-#---#-#---#---#-#-----
 30% -#-#-#-#-#-#---#-#-#---#-#-#-#-#-#-#-#-#-#---#-#-#-#-#-#-#-----
 20% -#-#-#-#-#-#-#-#-#-#---#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-
 10% -#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-
      * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
day 1 6 12 18 24 31

***** Graph showing number of messages written during days of week:

100% -------------#--------------- - 262
 90% ---------#---#--------------- msgs
 80% ---------#---#---#-----------
 70% -#---#---#---#---#-----------
 60% -#---#---#---#---#-----------
 50% -#---#---#---#---#-----------
 40% -#---#---#---#---#-----------
 30% -#---#---#---#---#-----------
 20% -#---#---#---#---#-----------
 10% -#---#---#---#---#---#---#---
      *   *   *   *   *   *   *
     Mon Tue Wed Thu Fri Sat Sun

***** Maximal quoting:
Author : andrea at biodec.com (Andrea)
Subject : [Biopython] Problems parsing with PSIBlastParser
Date : Thu, 15 Oct 2009 17:39:48 +0200
Quote ratio: 98.63% / 15890 bytes

***** Longest message:
Author : chapmanb at 50mail.com (Brad Chapman)
Subject : [Biopython] Skipping over blank/erroneous Entrez.esummary()
Date : Wed, 7 Oct 2009 16:29:11 -0400
Size : 18503 bytes

***** Most successful subject:
Subject : [Biopython] suggestion: moving to the discussion list to Google
No. of msgs: 33
Total size : 43527 bytes

***** Final summary:
Total number of messages: 1189
Total number of different authors: 159
Total number of different subjects: 332
Total size of messages (w/o headers): 1992170 bytes
Average size of a message: 1675 bytes

***** Generated by MailListStat v1.3, (C) 2001-2003
***** See http://freshmeat.net/projects/mls for details...

-- Istvan Albert http://www.personal.psu.edu/iua1
From mailinglist.honeypot at gmail.com Thu Dec 17 19:08:55 2009 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Thu, 17 Dec 2009 14:08:55 -0500 Subject: [Biopython] some eye opening stats In-Reply-To: References: Message-ID: <6E3B0A7F-476A-4417-A580-0FBCF4512763@gmail.com> Hi Istvan, On Dec 17, 2009, at 1:47 PM, Istvan Albert wrote: > Hello Everyone, > > So I ran some statistics on this group (see below) that includes the > entire past year. Make you own decisions based on it. > > Here is one of my observation: I find it saddening that I made the > list at number 18! That's some niche list where one person posting > ten messages in a whole year gets to be at number 18. In fact I only > need three more posts to make myself top ten poster! Would you still > claim this to be a good way to establish, grow and interact with a > community? > > I said this many times before, and I'll try for this to be the last > time I bring this up: > > I believe biopython is a niche software tool because *YOU* are > limiting its reach *YOURSELVES* by making inappropriate decisions as > far as accessibility and community goes.
It will stay so as long as > you don't recognize and act on this. I haven't said much so far because (1) I'm not really actively using biopython atm, and (2) I'm largely indifferent about the choice of mailing list vs. web interface, but let's be serious here ... how can you be so confident in drawing any causality between your stats and the fact that biopython is using a mailing list? You're arguing that since you are at # 18 w/ only 10 posts, it must be due to discussion about this project is confined to a mailing-list instead of a more "open" and "accessible" web group and the community needs to "act now" or ignore this at its peril. Try doing the same experiment with the bioconductor mailing list, or (depending on how bold you're feeling) the R-user mailing list. Discussion on both groups is via mailing-list only (or through gmane --- same can be said with this list) and come back with that report. Now, go try your same experiment on the networkx or igraph user group. Both are hosted on google groups. With 10 posts, you'll likely be somewhere in the top 10 posters for the year. Oh, even better: igraph just set up the mirror-do-hicky-whatever so you can access their mailing list via GG sometime in July: http://groups.google.com/group/network-analysis-with-igraph/browse_thread/thread/77305d9b6bc6d35/c6a694e287936049?lnk=gst&q=google+groups#c6a694e287936049 Perhaps you'd like to see how traffic has changed on that list before and after that fact. I'm going to guess it wasn't by all that much, but that would at least be a better experiment you can use to base your hunches on. 
-steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From cjfields at illinois.edu Thu Dec 17 19:46:08 2009 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 17 Dec 2009 13:46:08 -0600 Subject: [Biopython] some eye opening stats In-Reply-To: <6E3B0A7F-476A-4417-A580-0FBCF4512763@gmail.com> References: <6E3B0A7F-476A-4417-A580-0FBCF4512763@gmail.com> Message-ID: <1ED2ED4E-F87E-4229-8D6A-8EF56C837D73@illinois.edu> On Dec 17, 2009, at 1:08 PM, Steve Lianoglou wrote: > Hi Istvan, > > On Dec 17, 2009, at 1:47 PM, Istvan Albert wrote: > >> Hello Everyone, >> >> So I ran some statistics on this group (see below) that includes the >> entire past year. Make you own decisions based on it. >> >> Here is one of my observation: I find it saddening that I made the >> list at number 18! That's some niche list where one person posting >> ten messages in a whole year gets to be at number 18. In fact I only >> need three more posts to make myself top ten poster! Would you still >> claim this to be a good way to establish, grow and interact with a >> community? >> >> I said this many times before, and I'll try for this to be the last >> time I bring this up: >> >> I believe biopython is a niche software tool because *YOU* are >> limiting its reach *YOURSELVES* by making inappropriate decisions as >> far as accessibility and community goes. It will stay so as long as >> you don't recognize and act on this. > > I haven't said much so far because (1) I'm not really actively using biopython atm, and (2) I'm largely indifferent about the choice of mailing list vs. web interface, but let's be serious here ... how can you be so confident in drawing any causality between your stats and the fact that biopython is using a mailing list? 
> > You're arguing that since you are at # 18 w/ only 10 posts, it must be due to discussion about this project is confined to a mailing-list instead of a more "open" and "accessible" web group and the community needs to "act now" or ignore this at its peril. > > Try doing the same experiment with the bioconductor mailing list, or (depending on how bold you're feeling) the R-user mailing list. Discussion on both groups is via mailing-list only (or through gmane --- same can be said with this list) and come back with that report. > > Now, go try your same experiment on the networkx or igraph user group. Both are hosted on google groups. With 10 posts, you'll likely be somewhere in the top 10 posters for the year. > > Oh, even better: igraph just set up the mirror-do-hicky-whatever so you can access their mailing list via GG sometime in July: > > http://groups.google.com/group/network-analysis-with-igraph/browse_thread/thread/77305d9b6bc6d35/c6a694e287936049?lnk=gst&q=google+groups#c6a694e287936049 > > Perhaps you'd like to see how traffic has changed on that list before and after that fact. I'm going to guess it wasn't by all that much, but that would at least be a better experiment you can use to base your hunches on. > > -steve > > -- > Steve Lianoglou > Graduate Student: Computational Systems Biology > | Memorial Sloan-Kettering Cancer Center > | Weill Medical College of Cornell University > Contact Info: http://cbio.mskcc.org/~lianos/contact This also doesn't factor participation via other means, such as other mail lists, IRC, etc. As an example, the Perl Moose mail list is fairly low traffic, with a few posts a week, but the IRC channel is much more active. Conversely, we in BioPerl tend to use the mail list over #bioperl (though I do use both if time permits). I think way too much time has been pushed into this topic, considering we've reached a pretty viable option, namely mirroring the list to Google Groups. That seems satisfactory to everyone. 
I fail to see the reason to press the issue (and everyone's ire) more? chris From jtomkins at ICR.org Thu Dec 17 19:43:19 2009 From: jtomkins at ICR.org (Jeff Tomkins) Date: Thu, 17 Dec 2009 13:43:19 -0600 Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> Message-ID: I installed biopython 1.52 as directed for OS X leopard. Everything imports using the python prompt in the terminal, idle, ipython, wing ide, etc. But when I run a standard python script (#!/usr/bin/python) in the shell it cannot locate Bio. What setup feature have I missed? Thanks, Jeff From villahozbale at wisc.edu Thu Dec 17 20:23:30 2009 From: villahozbale at wisc.edu (ANGEL VILLAHOZ-BALETA) Date: Thu, 17 Dec 2009 14:23:30 -0600 Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: <7065_1261079798_ZZg0M1E7a5Xoa.00_C5096CF9-B649-4D98-BAF7-DFB9A0AC74D4@icr.org> References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> <7065_1261079798_ZZg0M1E7a5Xoa.00_C5096CF9-B649-4D98-BAF7-DFB9A0AC74D4@icr.org> Message-ID: <6f80a33d6a3f0.4b2a3ee2@wiscmail.wisc.edu> Jeff, Have you added the path to the Biopython libraries to the PYTHONPATH shell variable? Angel Villahoz-Baleta Bioinformatics Programmer University of Wisconsin-Madison ----- Original Message ----- From: Jeff Tomkins Date: Thursday, December 17, 2009 1:56 pm Subject: [Biopython] Bio won't import in *.py scripts To: "Biopython at lists.open-bio.org" > I installed biopython 1.52 as directed for OS X leopard. Everything > imports using the python prompt in the terminal, idle, ipython, wing > ide, etc. But when I run a standard python script (#!/usr/bin/python) > in the shell it cannot locate Bio.
What setup feature have I missed? > > Thanks, Jeff > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From daniel at dim.fm.usp.br Thu Dec 17 19:49:28 2009 From: daniel at dim.fm.usp.br (Daniel Silvestre) Date: Thu, 17 Dec 2009 17:49:28 -0200 Subject: [Biopython] Why so few recipes in the cookbook? Message-ID: <4B2A8B48.50302@dim.fm.usp.br> Greetings everybody, Is there a reason to the existence of so few recipes in Biopython cookbook? Is there a task force to improve the documentation and related stuff? Usually I see perfect reusable and instructional recipes in blogs of biopython users. But, they simply don't get to the cookbook. Att. Daniel -- +---------------------------------------+ Daniel de A. M. M. Silvestre LIM01 - Laboratório de Informática Médica - HCFMUSP Sala 1349 - Depto. de Patologia Faculdade de Medicina Universidade de São Paulo Av. Dr. Arnaldo, 455 | e-mail: daniel at dim.fm.usp.br Cerqueira César | Tel: +55-11-3061-7381 01246-903 - São Paulo - SP | Cel: +55-11-8042-9369 BRASIL | Skype: jarretinha --------------------------------------------------------------------- This message may contain confidential information. If you are not the addressee or authorized person to receive it for the addressee, you must not use, copy, disclose or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by replying this e-mail message and delete it.
Thanks in advance for your cooperation. ---------------------------------------------------------------------- DIM Faculdade de Medicina USP ---------------------------------------------------------------------- -------------- next part -------------- A non-text attachment was scrubbed... Name: daniel.vcf Type: text/x-vcard Size: 375 bytes Desc: not available URL: From biopython at maubp.freeserve.co.uk Thu Dec 17 21:16:42 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 17 Dec 2009 21:16:42 +0000 Subject: [Biopython] Why so few recipes in the cookbook? In-Reply-To: <4B2A8B48.50302@dim.fm.usp.br> References: <4B2A8B48.50302@dim.fm.usp.br> Message-ID: <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> 2009/12/17 Daniel Silvestre : > Greetings everybody, > > Is there a reason to the existence of so few recipes in Biopython > cookbook? Is there a task force to improve the documentation > and related stuff? The cookbook wiki is still quite new (6 months or so), but the idea was to encourage user participation. What would you like to write about ;) http://news.open-bio.org/news/2009/04/biopython-cookbook-wiki/ > Usually I see perfect reusable and instructional recipes in blogs of > biopython users. But, they simply don't get to the cookbook. Any specific examples? We can ask blog authors to put some of their finished recipes on the wiki (with a link back to the original). Peter P.S. Your message was delayed for moderation - probably the attachment?
From p.j.a.cock at googlemail.com Thu Dec 17 22:59:07 2009 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 17 Dec 2009 22:59:07 +0000 Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> Message-ID: <320fb6e00912171459m14462f0eq9d95f0dcdc039e8e@mail.gmail.com> Hi Jeff, On Thu, Dec 17, 2009 at 7:43 PM, Jeff Tomkins wrote: > I installed biopython 1.52 as directed for OS X leopard. > ?Everything imports using the python prompt in the terminal, > idle, ipython, wing ide, etc. Good :) > But when I run a standard python script (#!/usr/bin/python) > in the shell it cannot locate Bio. ?What setup feature have I missed? That does seem odd. You didn't call your script Bio.py did you? Could you show us the error message? Peter From mjldehoon at yahoo.com Fri Dec 18 08:35:40 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 18 Dec 2009 00:35:40 -0800 (PST) Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: <320fb6e00912171459m14462f0eq9d95f0dcdc039e8e@mail.gmail.com> Message-ID: <274100.45967.qm@web62402.mail.re1.yahoo.com> If you start python in the terminal, does it start /usr/bin/python or a different python? --- On Thu, 12/17/09, Peter Cock wrote: > From: Peter Cock > Subject: Re: [Biopython] Bio won't import in *.py scripts > To: "Jeff Tomkins" > Cc: "Biopython at lists.open-bio.org" > Date: Thursday, December 17, 2009, 5:59 PM > Hi Jeff, > > On Thu, Dec 17, 2009 at 7:43 PM, Jeff Tomkins > wrote: > > I installed biopython 1.52 as directed for OS X > leopard. > > ?Everything imports using the python prompt in the > terminal, > > idle, ipython, wing ide, etc. > > Good :) > > > But when I run a standard python script > (#!/usr/bin/python) > > in the shell it cannot locate Bio. ?What setup > feature have I missed? > > That does seem odd. 
> You didn't call your script Bio.py did you?
> Could you show us the error message?
>
> Peter
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>

From lpritc at scri.ac.uk Fri Dec 18 08:49:51 2009
From: lpritc at scri.ac.uk (Leighton Pritchard)
Date: Fri, 18 Dec 2009 08:49:51 +0000
Subject: [Biopython] Bio won't import in *.py scripts
In-Reply-To: <320fb6e00912171459m14462f0eq9d95f0dcdc039e8e@mail.gmail.com>
Message-ID:

Hi,

On 17/12/2009 22:59, "Peter Cock" wrote:

> Hi Jeff,
>
> On Thu, Dec 17, 2009 at 7:43 PM, Jeff Tomkins wrote:
>> I installed biopython 1.52 as directed for OS X leopard.
>> Everything imports using the python prompt in the terminal,
>> idle, ipython, wing ide, etc.
>
> Good :)
>
>> But when I run a standard python script (#!/usr/bin/python)
>> in the shell it cannot locate Bio. What setup feature have I missed?

It could be a $PATH issue. On my Mac, /usr/bin/python is where Apple's Python lives. I leave that installation alone, and don't replace it to avoid problems with the OS's potential use of Python (I had horrors with it back at 10.2). My 'working' version of Python is installed as /usr/local/bin/python.

lpmacpro:scripts lpritc$ /usr/bin/python
Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
lpmacpro:scripts lpritc$ which python
/usr/local/bin/python
lpmacpro:scripts lpritc$ python
Python 2.6 (trunk:66714:66715M, Oct 1 2008, 18:36:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
lpmacpro:scripts lpritc$ /usr/local/bin/python
Python 2.6 (trunk:66714:66715M, Oct 1 2008, 18:36:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

You could check to see if your setup is similar.

What this means is that, as I have /usr/local/bin ahead of /usr/bin in my $PATH (don't shout at me, everyone!), command-line invocation and module installations with 'python setup.py' use the 'working' version, so install modules under that version of Python only. This would mean that if I had

#!/usr/bin/python

at the head of my script, it would use Apple's Python, and not see modules installed under my 'working' Python. This would give the same error you seem to describe:

lpmacpro:scripts lpritc$ python
Python 2.6 (trunk:66714:66715M, Oct 1 2008, 18:36:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import Bio
>>>
lpmacpro:scripts lpritc$ /usr/bin/python
Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import Bio
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named Bio
>>>

If this is the issue, then one way to get around the problem is to use

#!/usr/bin/env python

at the top of your script so that it uses the Python you would get from the command-line. This is better in some ways, as you don't have to make guesses about where the Python executable is installed when you move the script to another machine, though it probably won't matter on Windows, and there may be security issues that arise from this shortcut under some circumstances. If you're at all worried about those, just try 'which python' at the command-line, and substitute that location for /usr/bin/python.

Cheers,

L.
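
The interpreter mismatch described above can also be checked from inside a script. What follows is a minimal sketch, not something posted in the thread: it assumes a modern Python 3 interpreter (the sessions above are Python 2.5/2.6), and interpreter_report is a hypothetical helper name. It only reports which executable is running and whether that interpreter can locate a given module.

```python
#!/usr/bin/env python3
import importlib.util
import sys


def interpreter_report(module_name="Bio"):
    """Describe the running interpreter and whether it can see a module.

    Hypothetical helper for illustration; pass a top-level module name.
    """
    spec = importlib.util.find_spec(module_name)  # None if not importable
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "module": module_name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    report = interpreter_report("Bio")
    print("Running under:", report["executable"])
    print("Python version:", report["version"])
    if report["found"]:
        print("Bio found at:", report["location"])
    else:
        print("Bio is not installed for this interpreter")
```

Running the same file once with the hard-coded shebang path and once via 'python script.py' should show whether the two invocations resolve to different executables.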
-- Dr Leighton Pritchard MRSC D131, Plant Pathology Programme, SCRI Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405 ______________________________________________________ SCRI, Invergowrie, Dundee, DD2 5DA. The Scottish Crop Research Institute is a charitable company limited by guarantee. Registered in Scotland No: SC 29367. Recognised by the Inland Revenue as a Scottish Charity No: SC 006662. DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.ac.uk quoting the name of the sender and delete the email from your system. Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any). ______________________________________________________ From lpritc at scri.ac.uk Fri Dec 18 10:16:42 2009 From: lpritc at scri.ac.uk (Leighton Pritchard) Date: Fri, 18 Dec 2009 10:16:42 +0000 Subject: [Biopython] some eye opening stats In-Reply-To: Message-ID: On 17/12/2009 18:47, "Istvan Albert" wrote: > Statistics from 1.12.2008 to 17.12.2009 > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > ***** People who have written most messages: > +----+-----Author-----------------------------------+--Msg-+-Percent-+ [...] 
> | 9 | lpritc at scri.ac.uk (Leighton Pritchard) | 14 | 1.18 % |
>
> ***** Best authors, by total size of their messages (w/o quoting):
> +----+-----Author-------------------------------------------+-KBytes-+
[...]
> | 3 | lpritc at scri.ac.uk (Leighton Pritchard) | 39.1 |

Hmm... Peter has suggested privately that I do have a somewhat prolix and flowery writing style - I guess he's right... ;)

L.

--
Dr Leighton Pritchard MRSC
D131, Plant Pathology Programme, SCRI
Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA
e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard
gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405

From daniel at dim.fm.usp.br Fri Dec 18 12:55:15 2009
From: daniel at dim.fm.usp.br (Daniel Silvestre)
Date: Fri, 18 Dec 2009 10:55:15 -0200
Subject: [Biopython] [Fwd: Re: Why so few recipes in the cookbook?]
Message-ID: <4B2B7BB3.6090505@dim.fm.usp.br>

--
+---------------------------------------+
Daniel de A. M. M. Silvestre
LIM01 - Laboratório de Informática Médica - HCFMUSP
Sala 1349 - Depto. de Patologia
Faculdade de Medicina
Universidade de São Paulo
Av. Dr. Arnaldo, 455 | e-mail: daniel at dim.fm.usp.br
Cerqueira César | Tel: +55-11-3061-7381
01246-903 - São Paulo - SP | Cel: +55-11-8042-9369
BRASIL | Skype: jarretinha

-------------- next part --------------
An embedded message was scrubbed...
From: unknown sender
Subject: no subject
Date: no date
Size: 6202
URL:

From biopython at maubp.freeserve.co.uk Fri Dec 18 12:57:30 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 18 Dec 2009 12:57:30 +0000
Subject: [Biopython] Why so few recipes in the cookbook?
In-Reply-To: <4B2B6DE2.3080500@dim.fm.usp.br>
References: <4B2A8B48.50302@dim.fm.usp.br> <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> <4B2B6DE2.3080500@dim.fm.usp.br>
Message-ID: <320fb6e00912180457x31b3c48bl680d48d6b95fdab0@mail.gmail.com>

2009/12/18 Daniel Silvestre :
> Greetings again,
>
> There are some blogs like Programming for Scientists and Yokofakun with
> some usable code and tips.
>
> I do want to contribute, but there are no clear objectives stated to the
> cookbook writing process. What's the gist? Just some code snippets?
> Complete examples?

I would personally prefer concrete examples. The name "cookbook" suggests a collection of complete recipes (rather than snippets).

> My teaching experience says that complete (and real) examples are the
> most wanted. For instance, even in the bioperl community only code
> snippets are available. So, the first question students ask me after a
> brief looking at the tutorials is smth like "Well, how do I use this in
> a directory tree?".
>
> What about a bioinformatics recipes in (bio)python?

Yes please :)

Are there any examples in the main "Biopython Tutorial and Cookbook" which would in your opinion be better in the wiki?

> Att,
> Daniel
>
> P.S.:Is there a problem with my vcard? I can use a different sig.

Could you try no vcard attachment at all?

Peter

From biopython at maubp.freeserve.co.uk Fri Dec 18 15:00:13 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 18 Dec 2009 15:00:13 +0000
Subject: [Biopython] Why so few recipes in the cookbook?
In-Reply-To: <4B2B8CC3.3090307@dim.fm.usp.br> References: <4B2A8B48.50302@dim.fm.usp.br> <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> <4B2B6DE2.3080500@dim.fm.usp.br> <320fb6e00912180457x31b3c48bl680d48d6b95fdab0@mail.gmail.com> <4B2B8CC3.3090307@dim.fm.usp.br> Message-ID: <320fb6e00912180700w49d3be87r53b1a5201c84461b@mail.gmail.com> 2009/12/18 Daniel Silvestre : > Hi people, > > Actually, even the tutorial is a collection of snippets. I do consider > and regard the effort. But, in order to attract biologists like myself > and my colleagues we need something more pragmatic, problem > driven. Most of the tutorial is by its nature "snippets" but the Cookbook chapter examples are more self standing. I suspect you are looking for even more self contained things - complete examples with a motivating rational, sample input data, etc. > The prototipical workflow of a molecular biologist is: > > ?- Select a bunch of interesting genes in Entrez by clicking buttons and > boxes; > > ?- BLAST some sequences and save the results in separated directories, > normally one for each gene; > > ?- Struggle to extract useful statistics from the results, wich usually > end in sorting and selecting the first few results; > > ?- Apply some analytical method (phylogeny reconstruction, mutation > analysis, etc.) over the "filtered" results; > > ?- Restart the cycle until get satisfied or bored; > > By the way, in one of my classes I just taught the students (which can > be grad students and professors) to use the fields of Entrez (molecular > weight, range search, organism name, etc.) and they felt really powerful > after that. For instance, they used to retrieve sequence lists of papers > ?by hand !!! I confess I don't know or use the full power of the Entrez website, although that is in part since I can do clever stuff via their API ;) > On the other hand, the ones who dare to use biopython tipically don't > know how to glob things and other administrivia. 
So, without a real > example only biology geeks like me get to the next step. > > There is a list of good recipes to start the cookbook: > > ?- How to retrieve and organize sequences and annotations from online > databases using you own custom command line tool; We touch on some of this already, e.g. search and retrieve examples in the Bio.Entrez chapter of the tutorial. Are you looking for something more in depth? Or using other databases? > ?- How to setup/insert/retrieve a bunch of results into a local > (personal) database (SQL); Done, although not tagged as a cookbook specifically: http://www.biopython.org/wiki/BioSQL (The tutorial also points to this page) > ?- How to annotate retrieved results with your own results; Now here I'd like a little clarification about what you want to do. My guess would be something I have considered working up into a cookbook recipe, based one stuff I have already done: Taking a small genome (viral or prokaryote), doing simple gene predictions (e.g. ORF finding, pick first start codon, or maybe calling a command line tool to do it for us), then taking the predicted peptides and BLASTing them, then making a GenBank file with these predicted features and stick a summary of the BLAST results in their annotation. However, while this is a reasonable first step, there are downsides to encouraging this sort of naive approach to annotation - the example would ideally have "Further Reading" section, see for example Schnoes et al 2009. http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000605 > These are real problems faced by the common biologist. The proposed > snippets in the tutorial and the cookbook is already dealt by a lot of > web tools. It's absolutely necessary to show that biopython can increase > the power and range of a biologist everyday work, and can possibly be > automated. 
> > I have some examples to obtain statistics over genome sequences which > address complete examples (including globbing filelists, retrieving from > online databases, etc.) and can prepare them as a recipe. But, I could > use some help . . . If you start a cookbook entry on the wiki, and some outline code, I'm sure we can as a group contribute ideas and tips (particularly in the code, but maybe in the approach too). Or, if you would rather, discuss some specific ideas here on the mailing list first. Note that some of these topics would be ideal for an OBF project wide set of examples, with reference solutions in Biopython, BioPerl, BioJava, BioRuby, etc. That is however a much much bigger task. Peter From rjalves at igc.gulbenkian.pt Fri Dec 18 17:17:44 2009 From: rjalves at igc.gulbenkian.pt (Renato Alves) Date: Fri, 18 Dec 2009 17:17:44 +0000 Subject: [Biopython] SeqIO.index improvement suggestions Message-ID: <4B2BB938.5030709@igc.gulbenkian.pt> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 [I tried submitting this message to the dev mailing list, but got rejected since I'm not yet authorized to post there, so here it goes] Hi everyone, I'm working on changes to the Bio.SeqIO.index() function to make it more consistent with the .read and .parse i.e. accept a filehandle instead of a filename and also to include a way to cache the index into a file to speed up the process. The reason why we are implementing these two is because we were going to implement our own index solution until we realized this was added to 1.52. However the implementation in 1.52 has a few limitations. One limitation is that we are using a gzipped database for the sake of space and using gzip.open() to create the file-handle that would then be passed to .parse(). The same was not doable with .index(). 
This is already implemented in http://github.com/Unode/biopython/commit/6fc390151452e3ddf26a117269132125a3ffb3fe The second is that we are going to use this feature to quick search the database in a web application. Here we have the limitation that we don't have persistence across web requests, which means that we would need to recalculate the index on every web request. The details of how we plan to implement this are the following: cPickle the internal dictionary of offsets and save it on the database folder with the same name as the database + .index. The consistency check on whether the file has changed will be performed based on name and timestamp. By default .index() will search for this file, check the timestamp and use the cache if they match, otherwise they will be recalculated. The save function will be available like: >>> >>> d = SeqIO.index(...) >>> >>> d.save(filename) where filename is optional and defaults to "%s.index" % _handle.name We already have a solution like this implemented with subclasses of SeqIO._index, it's just a matter of reworking that and merge it into BioPython if you consider a good addition to the code. I would like to hear your comments and suggestions on this. Regards, Renato -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAksruTIACgkQYh11EUYTX9TymgCeL6hu3Uz//itSHx38k9KjfZJg dGUAmwVCgaI9G/19VKiUolrXogelgrPs =M+xw -----END PGP SIGNATURE----- From biopython at maubp.freeserve.co.uk Fri Dec 18 21:39:11 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 18 Dec 2009 21:39:11 +0000 Subject: [Biopython] SeqIO.index improvement suggestions In-Reply-To: <4B2BB938.5030709@igc.gulbenkian.pt> References: <4B2BB938.5030709@igc.gulbenkian.pt> Message-ID: <320fb6e00912181339o1a5c4100w6f1957fd4d78d20d@mail.gmail.com> Hi Renato, I'm cooking dinner while writing this, so it won't be as in depth as usual... 

On Fri, Dec 18, 2009 at 5:17 PM, Renato Alves wrote:
>
> [I tried submitting this message to the dev mailing list, but got
> rejected since I'm not yet authorized to post there, so here it goes]

Have you definitely subscribed to the dev list? That should be all that is required to post there, and this discussion would be better suited there.

> Hi everyone,
>
> I'm working on changes to the Bio.SeqIO.index() function to make it more
> consistent with the .read and .parse i.e. accept a filehandle instead of
> a filename and also to include a way to cache the index into a file to
> speed up the process.
>
> The reason why we are implementing these two is because we were going to
> implement our own index solution until we realized this was added to 1.52.
>
> However the implementation in 1.52 has a few limitations.

Yes, this was designed to cover basic use cases in a general way, but with the option in future to do other things - and in particular saving the index to disk was kept in mind.

> One limitation is that we are using a gzipped database for the sake of
> space and using gzip.open() to create the file-handle that would then be
> passed to .parse(). The same was not doable with .index().
> This is already implemented in
> http://github.com/Unode/biopython/commit/6fc390151452e3ddf26a117269132125a3ffb3fe

That was a deliberate choice in that the index code wants to "own" the handle. If other code has access to the handle, there is a risk of different bits of code moving the handle pointer etc. But, if you are careful it could be done.

There are also issues here in combination with saving the index. With a filename, the code can easily reopen the file in the same mode. With a handle, things are more tricky. You have non-file handles to consider - such as the gzip example. There is also the problem of recording the file mode (normal text, universal text, or binary - which we will need for SFF files - code already written).
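
The filename-versus-handle trade-off can be sketched with the standard library alone. This is an illustration under stated assumptions, not Biopython code: open_source and first_line are hypothetical names, and Python 3 is assumed. The helper opens (and later closes) a file only when it is given a filename; a handle supplied by the caller is used as-is and left open, since the caller still owns it.

```python
import contextlib
import os


@contextlib.contextmanager
def open_source(source, mode="r"):
    """Yield a handle for either a filename or an already open handle.

    Hypothetical helper for illustration only.
    """
    if isinstance(source, (str, bytes, os.PathLike)):
        # We opened it, so we know the mode and are responsible for closing.
        handle = open(source, mode)
        try:
            yield handle
        finally:
            handle.close()
    else:
        # The caller owns this handle (it may be a gzip or in-memory handle),
        # so its mode is whatever the caller chose and we must not close it.
        yield source


def first_line(source):
    """Return the first line of a text source, without the newline."""
    with open_source(source) as handle:
        return handle.readline().rstrip("\n")
```

With this shape, a gzipped database could be passed as gzip.open(path, "rt"), provided it is opened in text mode. The sketch also shows the problem raised above: when given a handle, the helper cannot reopen the file later or know which mode was used.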

If we do change the code to allow handles, it would have to be to allow handles OR filenames to be compatible with Biopython 1.52 and 1.53 (which take just filenames). This could be handled as in Bio.SeqIO.convert(), which also allows both (which was the subject of some discussion!).

> The second is that we are going to use this feature to quick search the
> database in a web application. Here we have the limitation that we don't
> have persistence across web requests, which means that we would need to
> recalculate the index on every web request.
>
> The details of how we plan to implement this are the following:
>
> cPickle the internal dictionary of offsets and save it on the database
> folder with the same name as the database + .index. The consistency
> check on whether the file has changed will be performed based on name
> and timestamp. By default .index() will search for this file, check the
> timestamp and use the cache if they match, otherwise they will be
> recalculated. The save function will be available like:
>
> >>> d = SeqIO.index(...)
> >>> d.save(filename)
>
> where filename is optional and defaults to "%s.index" % _handle.name
>
> We already have a solution like this implemented with subclasses of
> SeqIO._index, it's just a matter of reworking that and merge it into
> BioPython if you consider a good addition to the code.
>
> I would like to hear your comments and suggestions on this.

Yes, saving indexes is an obvious addition. I have explored using pickle via shelve, and also SQLite - there are implementations of this on my github repository, and I have begun to look into the existing OBF Open Biological Database Access (OBDA) specification for cross project compatibility. Other potential benefits here are reduced memory usage if we don't keep the dictionary of offsets in RAM.
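
The caching scheme quoted above (pickle the dictionary of offsets next to the database, and reuse it only if the name and timestamp still match) can be sketched with the standard library. This is not the Bio.SeqIO.index() implementation nor the code in either github branch; build_offsets and load_or_build_index are hypothetical names, Python 3 is assumed, and only plain FASTA is handled.

```python
import os
import pickle


def build_offsets(path):
    """Map each FASTA record id to the byte offset of its '>' line."""
    offsets = {}
    with open(path, "rb") as handle:  # binary mode keeps offsets exact
        while True:
            position = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                fields = line[1:].split(None, 1)
                if fields:
                    offsets[fields[0].decode("ascii")] = position
    return offsets


def load_or_build_index(path, index_path=None):
    """Return (offsets, from_cache), rebuilding if the cache is stale.

    The cache defaults to path + ".index" and is trusted only if the
    stored file name and modification time both still match.
    """
    index_path = index_path or path + ".index"
    name = os.path.basename(path)
    stamp = os.path.getmtime(path)
    if os.path.exists(index_path):
        # Note: only unpickle index files you wrote yourself.
        with open(index_path, "rb") as handle:
            cached = pickle.load(handle)
        if cached.get("name") == name and cached.get("mtime") == stamp:
            return cached["offsets"], True
    offsets = build_offsets(path)
    with open(index_path, "wb") as handle:
        pickle.dump({"name": name, "mtime": stamp, "offsets": offsets}, handle)
    return offsets, False
```

A timestamp check is cheap but coarse; a web application that rewrites its database within the filesystem's timestamp resolution would need a stronger check, such as file size or a checksum.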
http://github.com/peterjc/biopython/tree/index-shelve http://github.com/peterjc/biopython/tree/index-sqlite There is a potential complication with index sub-classes which do more specialised indexing (e.g. GenBank files, and for a more extreme case, SFF files). See: http://github.com/peterjc/biopython/tree/sff-seqio Anyway - great to see you are finding the code useful, and have some quite similar ideas for how to extend it further. Peter From biopython at maubp.freeserve.co.uk Fri Dec 18 22:26:43 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 18 Dec 2009 22:26:43 +0000 Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> <320fb6e00912171459m14462f0eq9d95f0dcdc039e8e@mail.gmail.com> Message-ID: <320fb6e00912181426i44b93b2co96a1a171a404dc5f@mail.gmail.com> On Fri, Dec 18, 2009 at 6:25 PM, Jeff Tomkins wrote: > I got some advice from Angel V. and added the following lines to my > profile and the scripts now import Bio - it looks like ?it worked and > fixed the issue. ?Thanks for getting back with me! > -jeff > > PYTHONPATH="/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/biopython-1.52-py2.5-macosx-10.3-fat.egg:${PYTHONPATH}" > export PYTHONPATH > Excellent - and well done Angel for coming up with a working solution. Peter From pedro.al at fenhi.uh.cu Fri Dec 18 22:25:29 2009 From: pedro.al at fenhi.uh.cu (Yasser Almeida =?iso-8859-1?b?SGVybuFuZGV6?=) Date: Fri, 18 Dec 2009 17:25:29 -0500 Subject: [Biopython] Superpose structures Message-ID: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> Hi all!! I want to superpose some structures on a reference one. 
The fixed selection is the backbone atoms of two residues and i want to superpose the rest of structures based in this atoms (for the same residues in the others structures, of course) Reference selection: [[, , , ], [, , , ]] The moving selection is a similar nested list, with the list of the residues backbone atoms to move in the query structures.... How can i superpose these structures based in the backbone atoms of two residues? Please help me... Thanks -- Lic. Yasser Almeida Hern?ndez Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext. 221 ---------------------------------------------------------------- Correo FENHI From p.j.a.cock at googlemail.com Fri Dec 18 22:55:44 2009 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 18 Dec 2009 22:55:44 +0000 Subject: [Biopython] Superpose structures In-Reply-To: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> References: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> Message-ID: <320fb6e00912181455g680a8dbmdfa166bd820f06ed@mail.gmail.com> 2009/12/18 Yasser Almeida Hern?ndez : > Hi all!! > > I want to superpose some structures on a reference one. > The fixed selection is the backbone atoms of two residues and i want to > superpose the rest of structures based in this atoms (for the same residues > in the others structures, of course) > > Reference selection: > [[, , , ], [, , , > ]] > > The moving selection is a similar nested list, with the list of the residues > backbone atoms to move in the query structures.... > > How can i superpose these structures based in the backbone atoms of two > residues? > > Please help me... > Thanks Does this example help? 
http://www.warwick.ac.uk/go/peter_cock/python/protein_superposition/ Peter From mjldehoon at yahoo.com Fri Dec 18 23:54:52 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 18 Dec 2009 15:54:52 -0800 (PST) Subject: [Biopython] Bio won't import in *.py scripts In-Reply-To: <1D3CBE1D-A7A9-4337-BE0A-E073C2B9A3CC@ICR.org> Message-ID: <138394.60837.qm@web62408.mail.re1.yahoo.com> Good to hear you found a solution. Just for some background information: The reason /usr/bin/python couldn't find Biopython is that you have Biopython installed with the python in /Library/Frameworks/Python.framework/Versions/2.5/bin/python. These two pythons don't know about each other, so anything you install for one python is not seen by the other python. If you want to use /usr/bin/python in your scripts, another solution would have been to install Biopython for that python using /usr/bin/python setup.py build followed by /usr/bin/python setup.py install. But usually it's better to leave the Apple-installed python in /usr/bin/python alone, and to install modules for /Library/Frameworks/Python.framework/Versions/2.5/bin/python, and use that python. --Michiel. --- On Fri, 12/18/09, Jeff Tomkins wrote: > From: Jeff Tomkins > Subject: Re: [Biopython] Bio won't import in *.py scripts > To: "Michiel de Hoon" > Date: Friday, December 18, 2009, 1:27 PM > I got some advice from Angel V. and > added the following lines to my .profile and the scripts now > import Bio - it looks like? it worked and fixed the > issue.? Thanks for getting back with me! > -jeff > > PYTHONPATH="/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/biopython-1.52-py2.5-macosx-10.3-fat.egg:${PYTHONPATH}" > export PYTHONPATH > > > On Dec 18, 2009, at 2:35 AM, Michiel de Hoon wrote: > > > If you start python in the terminal, does it start > /usr/bin/python or a different python? 
> > > > --- On Thu, 12/17/09, Peter Cock > wrote: > > > >> From: Peter Cock > >> Subject: Re: [Biopython] Bio won't import in *.py > scripts > >> To: "Jeff Tomkins" > >> Cc: "Biopython at lists.open-bio.org" > > >> Date: Thursday, December 17, 2009, 5:59 PM > >> Hi Jeff, > >> > >> On Thu, Dec 17, 2009 at 7:43 PM, Jeff Tomkins > > >> wrote: > >>> I installed biopython 1.52 as directed for OS > X > >> leopard. > >>>? Everything imports using the python > prompt in the > >> terminal, > >>> idle, ipython, wing ide, etc. > >> > >> Good :) > >> > >>> But when I run a standard python script > >> (#!/usr/bin/python) > >>> in the shell it cannot locate Bio.? What > setup > >> feature have I missed? > >> > >> That does seem odd. You didn't call your script > Bio.py did > >> you? > >> Could you show us the error message? > >> > >> Peter > >> > >> _______________________________________________ > >> Biopython mailing list? -? Biopython at lists.open-bio.org > >> http://lists.open-bio.org/mailman/listinfo/biopython > >> > > > > > > > > From schafer at rostlab.org Fri Dec 18 23:57:52 2009 From: schafer at rostlab.org (=?ISO-8859-1?Q?Christian_Sch=E4fer?=) Date: Fri, 18 Dec 2009 18:57:52 -0500 Subject: [Biopython] Superpose structures In-Reply-To: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> References: <20091218172529.jfg6oapzgoccsocs@correo.fenhi.uh.cu> Message-ID: <4B2C1700.6080704@rostlab.org> Hey, an alternative would be to use the program ProFit (http://www.bioinf.org.uk/software/profit/index.html), which does a least-square fitting based on aligned residues of one reference and one or more mobile structures. It comes with an extensive yet easy comprehensible set of commands. I'm currently using it myself within my Python code. Chris On 12/18/2009 05:25 PM, Yasser Almeida Hern?ndez wrote: > Hi all!! > > I want to superpose some structures on a reference one. 
> The fixed selection is the backbone atoms of two residues and i want to > superpose the rest of structures based in this atoms (for the same > residues in the others structures, of course) > > Reference selection: > [[, , , ], [, , CA>, ]] > > The moving selection is a similar nested list, with the list of the > residues backbone atoms to move in the query structures.... > > How can i superpose these structures based in the backbone atoms of two > residues? > > Please help me... > Thanks > > > > From bjorn_johansson at bio.uminho.pt Sat Dec 19 11:00:19 2009 From: bjorn_johansson at bio.uminho.pt (=?ISO-8859-1?Q?Bj=F6rn_Johansson?=) Date: Sat, 19 Dec 2009 11:00:19 +0000 Subject: [Biopython] question about RestrictionBatch.lambdasplit Message-ID: Hi, I am trying to get a restriction batch tha is limited to some enzymes with a certain size. I think that the lambdasplit might be used for this. I have not found any examples of the use of the restrictionbatch.lambdasplit rb = RestrictionBatch(first=[],suppliers=['F','R']) rb2= rb.lambdasplit(lambda x: x.size==6) this code does not work. Could someone give me an example on how to use this? I have tried to see docs on the lambda function in python, but I still could not solve this. grateful for any answer! /bjorn From biopython at maubp.freeserve.co.uk Sat Dec 19 11:17:33 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 19 Dec 2009 11:17:33 +0000 Subject: [Biopython] question about RestrictionBatch.lambdasplit In-Reply-To: References: Message-ID: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> 2009/12/19 Bj?rn Johansson : > Hi, > > I am trying to get a restriction batch tha is limited to some enzymes with a > certain size. > I think that the lambdasplit might be used for this. > > I have not found any examples of the use of the restrictionbatch.lambdasplit > > rb = RestrictionBatch(first=[],suppliers=['F','R']) > > rb2= rb.lambdasplit(lambda x: x.size==6) > > this code does not work. 
> Could someone give me an example on how to use this?
> I have tried to see docs on the lambda function in python, but I still could
> not solve this.
> grateful for any answer!
> /bjorn

Hmm, no mention of lambdasplit in this doc:
http://biopython.org/DIST/docs/cookbook/Restriction.html

Also no mention in Tests/test_Restriction.py

Looking at the code, you need a function (which could be defined with a python lambda but need not be) which will be given a single argument and must return a boolean (or rather, something which will be evaluated as a boolean). Your code looks fine:

>>> from Bio.Restriction import *
>>> rb = RestrictionBatch(first=[],suppliers=['F','R'])
>>> len(rb)
228
>>> len([x for x in rb if x.size==6])
128
>>> rb2= rb.lambdasplit(lambda x: x.size==6)
>>> len(rb2)
0

Either we have both misunderstood the point of this function, or there is a bug in lambdasplit.

Please file a bug report:
http://bugzilla.open-bio.org/enter_bug.cgi?product=Biopython

Thanks,

Peter

From sohm at inaf.cnrs-gif.fr Sat Dec 19 17:53:35 2009
From: sohm at inaf.cnrs-gif.fr (Frédéric Sohm)
Date: Sat, 19 Dec 2009 18:53:35 +0100
Subject: [Biopython] question about RestrictionBatch.lambdasplit
In-Reply-To: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com>
References: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com>
Message-ID: <4B2D131F.1080302@inaf.cnrs-gif.fr>

Hi Björn, Peter,

The code is working for me. I can't reproduce the bug.

>>> from Bio.Restriction import *
>>> rb = RestrictionBatch(first=[], suppliers=['F','R'])
>>> len(rb)
228
>>> len([x for x in rb if x.size == 6])
128
>>> rb2 = rb.lambdasplit(lambda x : x.size == 6)
>>> len(rb2)
128
>>> len([x for x in rb if len(x) == 6])
128
>>> rb3 = rb.lambdasplit(lambda x : len(x) == 6)
>>> len(rb3)
128
>>> rb2 == rb3
True
>>> EcoRI in rb
True
>>> EcoRI in rb2 and EcoRI in rb3
True
>>> EcoRI.size == len(EcoRI) == 6
True
>>>

I am a bit puzzled there, Peter's code should work (and is effectively working on my machine) ...

My setup is :
Debian Lenny
python 2.5.2 or python 2.4.6 (both tested)
Biopython : 1.45

What about yours ?

Best regards

Fred

Peter wrote:
> 2009/12/19 Björn Johansson :
>> Hi,
>>
>> I am trying to get a restriction batch that is limited to some enzymes with a
>> certain size.
>> I think that the lambdasplit might be used for this.
>>
>> I have not found any examples of the use of the restrictionbatch.lambdasplit
>>
>> rb = RestrictionBatch(first=[],suppliers=['F','R'])
>>
>> rb2= rb.lambdasplit(lambda x: x.size==6)
>>
>> this code does not work. Could someone give me an example on how to use
>> this?
>> I have tried to see docs on the lambda function in python, but I still could
>> not solve this.
>> grateful for any answer!
>> /bjorn
>
> Hmm, no mention of lambdasplit in this doc:
> http://biopython.org/DIST/docs/cookbook/Restriction.html
>
> Also no mention in Tests/test_Restriction.py
>
> Looking at the code, you need a function (which could
> be defined with a python lambda but need not be)
> which will be given a single argument and must
> return a boolean (or rather, something which will be
> evaluated as a boolean).
Your code looks fine:
>
>>>> from Bio.Restriction import *
>>>> rb = RestrictionBatch(first=[],suppliers=['F','R'])
>>>> len(rb)
> 228
>>>> len([x for x in rb if x.size==6])
> 128
>>>> rb2= rb.lambdasplit(lambda x: x.size==6)
>>>> len(rb2)
> 0
>
> Either we have both misunderstood the point of
> this function, or there is a bug in lambdasplit.
>
> Please file a bug report:
> http://bugzilla.open-bio.org/enter_bug.cgi?product=Biopython
>
> Thanks,
>
> Peter
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>

--
Frédéric Sohm
GIS AMAGEN CNRS INRA
Equipe INRA U1126 "Morphogenèse du système nerveux des Chordés"
UPR 2197 DEPSN, CNRS
Institut de Neurobiologie A. Fessard
1 Avenue de la Terrasse
91 198 GIF-SUR -YVETTE
FRANCE
Phone: 33 1 69 82 34 12
Fax: 33 1 69 82 41 67
email: sohm at inaf.cnrs-gif.fr

From bjorn_johansson at bio.uminho.pt Sun Dec 20 06:58:43 2009
From: bjorn_johansson at bio.uminho.pt (=?ISO-8859-1?Q?Bj=F6rn_Johansson?=)
Date: Sun, 20 Dec 2009 06:58:43 +0000
Subject: [Biopython] question about RestrictionBatch.lambdasplit
In-Reply-To: <4B2D131F.1080302@inaf.cnrs-gif.fr>
References: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> <4B2D131F.1080302@inaf.cnrs-gif.fr>
Message-ID:

Hi,
I copied the output from running Frédéric's example on my machine below:

bjorn at bjorn-laptop:~/wikidpad/user_extensions/SeqTools$ python
Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio.Restriction import * >>> rb = RestrictionBatch(first=[], suppliers=['F','R']) >>> len(rb) 228 >>> len([x for x in rb if x.size == 6]) 128 >>> rb2 = rb.lambdasplit(lambda x : x.size == 6) >>> len(rb2) 0 >>> len([x for x in rb if len(x) == 6]) 128 >>> len([x for x in rb if len(x) == 6]) 128 >>> rb3 = rb.lambdasplit(lambda x : len(x) == 6) >>> len(rb3) 0 >>> rb2 == rb3 True >>> EcoRI in rb True >>> EcoRI in rb2 and EcoRI in rb3 False >>> EcoRI.size == len(EcoRI) == 6 True >>> The lambdasplit does not seem to be working for me. I use python 2.6.4 on ubuntu karmic How can I print the biooython version? thansk for your help! /bjorn 2009/12/19 Fr?d?ric Sohm > Hi Bj?rn, Peter, > > The code is working for me. > I can't reproduce the bug. > > > > >>> from Bio.Restriction import * > >>> rb = RestrictionBatch(first=[], suppliers=['F','R']) > >>> len(rb) > 228 > >>> len([x for x in rb if x.size == 6]) > 128 > >>> rb2 = rb.lambdasplit(lambda x : x.size == 6) > >>> len(rb2) > 128 > >>> len([x for x in rb if len(x) == 6]) > 128 > >>> rb3 = rb.lambdasplit(lambda x : len(x) == 6) > >>> len(rb3) > 128 > >>> rb2 == rb3 > True > >>> EcoRI in rb > True > >>> EcoRI in rb2 and EcoRI in rb3 > True > >>> EcoRI.size == len(EcoRI) == 6 > True > >>> > > > I am a bit puzzled there, Peter's code should work (and is effectively > working on my machine) ... > > My setup is : > > Debian Lenny > python 2.5.2 or python 2.4.6 (both tested) > Biopython : 1.45 > > What about yours ? > > > Best regards > > Fred > > > > Peter wrote: > >> 2009/12/19 Bj?rn Johansson : >> >>> Hi, >>> >>> I am trying to get a restriction batch tha is limited to some enzymes >>> with a >>> certain size. >>> I think that the lambdasplit might be used for this. >>> >>> I have not found any examples of the use of the >>> restrictionbatch.lambdasplit >>> >>> rb = RestrictionBatch(first=[],suppliers=['F','R']) >>> >>> rb2= rb.lambdasplit(lambda x: x.size==6) >>> >>> this code does not work. 
Could someone give me an example on how to use >>> this? >>> I have tried to see docs on the lambda function in python, but I still >>> could >>> not solve this. >>> grateful for any answer! >>> /bjorn >>> >> >> Hmm, no mention of lambdasplit in this doc: >> http://biopython.org/DIST/docs/cookbook/Restriction.html >> >> Also no mention in Tests/test_Restriction.py >> >> Looking at the code, you need a function (which could >> be defined with a python lambda but need not be) >> which will be given as single argument and must >> return a boolean (or rather, something which will be >> evaluated as a boolean). You code looks fine: >> >> from Bio.Restriction import * >>>>> rb = RestrictionBatch(first=[],suppliers=['F','R']) >>>>> len(rb) >>>>> >>>> 228 >> >>> len([x for x in rb if x.size==6]) >>>>> >>>> 128 >> >>> rb2= rb.lambdasplit(lambda x: x.size==6) >>>>> len(rb2) >>>>> >>>> 0 >> >> Either we have both misunderstood the point of >> this function, or there is a bug in lambdasplit. >> >> Please file a bug report: >> http://bugzilla.open-bio.org/enter_bug.cgi?product=Biopython >> >> Thanks, >> >> Peter >> >> _______________________________________________ >> Biopython mailing list - Biopython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython >> >> > -- > Fr?d?ric Sohm > GIS AMAGEN CNRS INRA > Equipe INRA U1126 "Morphogen?se du syst?me nerveux des Chord?s" > UPR 2197 DEPSN, CNRS > Institut de Neurobiologie A. Fessard > 1 Avenue de la Terrasse > 91 198 GIF-SUR -YVETTE > FRANCE > Phone: 33 1 69 82 34 12 > Fax: 33 1 69 82 41 67 > email: sohm at inaf.cnrs-gif.fr > -- ______O_________oO________oO______o_______oO__ Bj?rn Johansson Assistant Professor Departament of Biology University of Minho Campus de Gualtar 4710-057 Braga PORTUGAL http://www.bio.uminho.pt http://sites.google.com/site/bjornhome Work (direct) +351-253 601517 Private mob. 
+351-967 147 704
Dept of Biology (secretariate) +351-253 60 4310
Dept of Biology (fax) +351-253 678980

From biopython at maubp.freeserve.co.uk Sun Dec 20 18:01:56 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Sun, 20 Dec 2009 18:01:56 +0000
Subject: [Biopython] question about RestrictionBatch.lambdasplit
In-Reply-To:
References: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> <4B2D131F.1080302@inaf.cnrs-gif.fr>
Message-ID: <320fb6e00912201001j24acc005uf7ad6d88f25bbf3d@mail.gmail.com>

2009/12/20 Björn Johansson :
> Hi,
> I copied the output from running Frédéric's example on my machine below:
>
> bjorn at bjorn-laptop:~/wikidpad/user_extensions/SeqTools$ python
> Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
> [GCC 4.4.1] on linux2
> ...
> The lambdasplit does not seem to be working for me. I use python
> 2.6.4 on ubuntu karmic

I was using the same version of Python (also on Linux Ubuntu Karmic).
This looks like it could be a Python 2.6 specific problem.

> How can I print the Biopython version?
> thanks for your help!
> /bjorn

It's in the FAQ in the Tutorial, at the Python prompt just do:

import Bio
print Bio.__version__

Peter

From biopython at maubp.freeserve.co.uk Mon Dec 21 12:03:34 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 21 Dec 2009 12:03:34 +0000
Subject: [Biopython] Fwd: Why so few recipes in the cookbook?
In-Reply-To: <320fb6e00912181442r60348fcwf15776a0451bc6a1@mail.gmail.com> References: <4B2A8B48.50302@dim.fm.usp.br> <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> <4B2B6DE2.3080500@dim.fm.usp.br> <320fb6e00912180457x31b3c48bl680d48d6b95fdab0@mail.gmail.com> <4B2B8CC3.3090307@dim.fm.usp.br> <320fb6e00912180700w49d3be87r53b1a5201c84461b@mail.gmail.com> <4B2BAE35.2070404@dim.fm.usp.br> <320fb6e00912181442r60348fcwf15776a0451bc6a1@mail.gmail.com> Message-ID: <320fb6e00912210403q5dd4c0d7xf06c9a850ecde9db@mail.gmail.com> I just checked with Daniel to make sure he was happy for me to forward this back to the mailing list. Peter ---------- Forwarded message ---------- From: Peter Date: Fri, Dec 18, 2009 at 10:42 PM Subject: Re: [Biopython] Why so few recipes in the cookbook? To: Daniel Silvestre Hi Daniel, Do you mind if I send this to the list too? 2009/12/18 Daniel Silvestre : >> >> I confess I don't know or use the full power of the Entrez website, >> although that is in part since I can do clever stuff via their API ;) > > This is exactly what we want to do when get to the Entrez interface. > But, the information "How to submit complex query" is hidden (and > scattered) under many layers of web pages. > > The ability to do such things in a more customized way is the dream of > all life science guy. This is partly down to the NCBI's Entrez documentation - a lots of the examples in the Biopython tutorial took some serious exploration to get working, including trawling the net for other Entrez users (in other languages). I hope that we've managed to make things clearer. > While this tutorial is enough to CS-oriented guys, it's a really big > step to grasp such information for people from other communities. > That's why I'm always a little confused about the idea behind bio > projects. If the idea is programming of scientists, the approach is > way too CS. 
You are probably right in that the Bio* projects do cater more
to a programming scientist than a wet biologist - not that
there aren't people that can and do both. You have to be able
to program to take full advantage of any of the Bio* kits.
However, there are a number of front ends, webpages, etc
which use them internally.

>> Now here I'd like a little clarification about what you want to do.
>>
>> My guess would be something I have considered working up
>> into a cookbook recipe, based on stuff I have already done:
>> Taking a small genome (viral or prokaryote), doing simple
>> gene predictions (e.g. ORF finding, pick first start codon,
>> or maybe calling a command line tool to do it for us), then
>> taking the predicted peptides and BLASTing them, then
>> making a GenBank file with these predicted features and
>> stick a summary of the BLAST results in their annotation.
>>
>> However, while this is a reasonable first step, there are
>> downsides to encouraging this sort of naive approach to
>> annotation - the example would ideally have a "Further
>> Reading" section, see for example Schnoes et al 2009.
>> http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000605
>>
>
> That's exactly my point. Without a complete recipe with a
> specific motivation and a clearly stated problem behind it,
> people will continue with this kind of behavior.

We agree here.

> And I don't see why one needs to start simple. The first time I
> entered a molbio lab was to carry out an old fashioned gene cloning
> procedure. This is a simple procedure. How does this compare to the
> simple examples we see on bio project tutorials?

I see the tutorial as a teaching aid, and many of the cookbook
examples also. For someone learning to program, an overly
complicated example is intimidating. This is not to say we
can't have some complex cookbook entries too.

i.e. I think that for learning to program, you need to start simple,
and build up the complexity gradually.
>>> These are real problems faced by the common biologist. The proposed >>> snippets in the tutorial and the cookbook is already dealt by a lot of >>> web tools. It's absolutely necessary to show that biopython can increase >>> the power and range of a biologist everyday work, and can possibly be >>> automated. >>> >>> I have some examples to obtain statistics over genome sequences which >>> address complete examples (including globbing filelists, retrieving from >>> online databases, etc.) and can prepare them as a recipe. But, I could >>> use some help . . . >> >> If you start a cookbook entry on the wiki, and some outline >> code, I'm sure we can as a group contribute ideas and tips >> (particularly in the code, but maybe in the approach too). Or, >> if you would rather, discuss some specific ideas here on the >> mailing list first. >> >> Note that some of these topics would be ideal for an OBF >> project wide set of examples, with reference solutions in >> Biopython, BioPerl, BioJava, BioRuby, etc. That is however >> a much much bigger task. > > I think that there is no need to worry about big things right now. By > the very nature of programming, people will mirror ideas from one > another. I've tried a similar approach in the bioperl community. But, > for the pragmatic life scientist, perl is over expressive while python > has a much higher first encounter acceptance rate (I'm not sure why, > tough...). > > My idea is not a master blaster cookbook, just to assemble simple ideas > that work for the everyday user, be this guy a CS or a life scientist. > > How do this sound to you? Wonderful :) (And I would agree with you that Python is probably easier to teach to beginners than Perl) Peter From chapmanb at 50mail.com Mon Dec 21 13:11:48 2009 From: chapmanb at 50mail.com (Brad Chapman) Date: Mon, 21 Dec 2009 08:11:48 -0500 Subject: [Biopython] Why so few recipes in the cookbook? 
In-Reply-To: <320fb6e00912210403q5dd4c0d7xf06c9a850ecde9db@mail.gmail.com> References: <4B2A8B48.50302@dim.fm.usp.br> <320fb6e00912171316y5e514052sabaf2a0104a558ac@mail.gmail.com> <4B2B6DE2.3080500@dim.fm.usp.br> <320fb6e00912180457x31b3c48bl680d48d6b95fdab0@mail.gmail.com> <4B2B8CC3.3090307@dim.fm.usp.br> <320fb6e00912180700w49d3be87r53b1a5201c84461b@mail.gmail.com> <4B2BAE35.2070404@dim.fm.usp.br> <320fb6e00912181442r60348fcwf15776a0451bc6a1@mail.gmail.com> <320fb6e00912210403q5dd4c0d7xf06c9a850ecde9db@mail.gmail.com> Message-ID: <20091221131148.GB21580@sobchak.mgh.harvard.edu> Peter and Daniel; Really interesting discussion. Documentation is an area that can always use more work to appeal to a wider audience. Daniel: > > While this tutorial is enough to CS-oriented guys, it's a really big > > step to grasp such information for people from other communities. > > That's why I'm always a little confused about the idea behind bio > > projects. If the idea is programming of scientists, the approach is > > way too CS. This stresses why we actively encourage contributions from biologists as well. Many of the contributors to Biopython tend more towards the programming/bioinformatics side, since that experience helps in building up and appreciating a re-usable toolkit. When those same people write documentation, it is going to be naturally biased towards the sort of work they do. I'd definitely encourage you, and anyone else who might be interested, to build up examples that are more intuitive to those coming at the work from a different starting point. This is exactly the idea behind starting up the cookbook on the wiki; it's all freely editable, so dig right in. 
Brad

From biopython at maubp.freeserve.co.uk Tue Dec 22 16:22:24 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 22 Dec 2009 16:22:24 +0000
Subject: [Biopython] EMBOSS and Python
In-Reply-To: <5ECA525B88314B48870E4AC72E3B9AF2045A70FE@EDUNIVMAIL05.ad.umassmed.edu>
References: <5ECA525B88314B48870E4AC72E3B9AF2045A70EC@EDUNIVMAIL05.ad.umassmed.edu> <320fb6e00912170904x52b8a894s9a0f76ed1c26512c@mail.gmail.com> <5ECA525B88314B48870E4AC72E3B9AF2045A70FE@EDUNIVMAIL05.ad.umassmed.edu>
Message-ID: <320fb6e00912220822m7e1c81c5h113a642f1336b328@mail.gmail.com>

Hi David,

I cc'd the mailing list again.

On Tue, Dec 22, 2009 at 4:02 PM, Lapointe, David wrote:
>
> Hi Peter,
>
> I have current version for both EMBOSS (6.1.0) and BioPython (1.53).

Do you have the original unpatched EMBOSS 6.1.0, or the latest
patched version, currently EMBOSS 6.1.0 patch 3? See:
ftp://emboss.open-bio.org/pub/EMBOSS/fixes/patches/README.patch

> I looked at the
> code for the unit tests (asis) and the problem might be there, as I could run the
> test fine by hand.

Could you clarify what you meant by "I could run the test fine by hand"?

> Shouldn't there be a '-' in front of asequence?
>
>     def test_water_file(self):
>         """water with the asis trick, output to a file."""
>         #Setup, try a mixture of keyword arguments and later additions:
>         cline = WaterCommandline(cmd=exes["water"],
>                                  gapopen="10", gapextend="0.5")
>         #Try using both human readable names, and the literal ones:
>         cline.set_parameter("asequence", "asis:ACCCGGGCGCGGT")
>         cline.set_parameter("-bsequence", "asis:ACCCGAGCGCGGT")
>
> David

Good question, but no. The test is confirming the set_parameter method
supports both these ways of setting the parameters. This is now a
semi-obsolete method - the preferred way would be to use bsequence in
the constructor arguments, or the bsequence property.
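
To illustrate the point about the two spellings, here is a minimal stand-alone sketch. It is not Biopython's actual implementation: the class ToyCommandline, its attributes and the "-name=value" output format are all made up for this example. It only shows how a wrapper's set_parameter could accept both the human readable name ("asequence") and the literal switch ("-bsequence"), and how the same values could be passed as constructor keyword arguments instead.

```python
class ToyCommandline(object):
    """Hypothetical stand-in for a command line wrapper (not Bio.Emboss)."""

    def __init__(self, cmd, allowed, **kwargs):
        self.cmd = cmd
        self.allowed = set(allowed)  # option names known to this toy wrapper
        self.values = {}
        # Keyword arguments are routed through the same setter.
        for name, value in kwargs.items():
            self.set_parameter(name, value)

    def set_parameter(self, name, value):
        # Accept both "asequence" and "-asequence" by dropping leading dashes.
        key = name.lstrip("-")
        if key not in self.allowed:
            raise ValueError("Option name %s was not found." % name)
        self.values[key] = value

    def __str__(self):
        # Sorted so the generated command line is reproducible.
        parts = [self.cmd]
        for key in sorted(self.values):
            parts.append("-%s=%s" % (key, self.values[key]))
        return " ".join(parts)


cline = ToyCommandline("water",
                       ["asequence", "bsequence", "gapopen", "gapextend"],
                       gapopen="10", gapextend="0.5")
cline.set_parameter("asequence", "asis:ACCCGGGCGCGGT")
cline.set_parameter("-bsequence", "asis:ACCCGAGCGCGGT")
print(cline)
```

Both calls end up storing the value under the dash-free name, which is why the test can mix the two styles; an unknown name raises ValueError in this sketch.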
Interestingly the test output also indicates issues calling dnal (which is
nothing to do with EMBOSS), yet at least one other command line tool
test seems to be running OK (Clustalw).

Peter

From silvio.tschapke at googlemail.com Wed Dec 23 10:57:58 2009
From: silvio.tschapke at googlemail.com (Silvio Tschapke)
Date: Wed, 23 Dec 2009 11:57:58 +0100
Subject: [Biopython] cannot find elink_090910.dtd
Message-ID:

Hi all,

I am using ubuntu 9.10, python 2.6 and biopython 1.53
But while running these two lines of code

pmid = "14630660"
results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid))

I get the following error message posted at the bottom.
So I searched at the proposed websites and in the web for
"elink_090910.dtd" without success.
I only found the files for elink_020511.dtd and something for 2010. But
nothing related to elink_090910.dtd.
Could you please help me how I can solve this problem?

Cheers and merry christmas,
Silvio


Traceback (most recent call last):
  File "/home/silvio/programming/python/first steps/biopythonTutorial.py", line 17, in <module>
    results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid))
  File "/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/__init__.py", line 258, in read
    record = handler.read(handle)
  File "/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/Parser.py", line 108, in read
    self.parser.ParseFile(handle)
  File "/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/Parser.py", line 377, in externalEntityRefHandler
    raise RuntimeError(message)
RuntimeError: Unable to load DTD file eLink_090910.dtd.

From biopython at maubp.freeserve.co.uk Wed Dec 23 12:10:17 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Dec 2009 12:10:17 +0000
Subject: [Biopython] cannot find elink_090910.dtd
In-Reply-To:
References:
Message-ID: <320fb6e00912230410hf895e3sa999b0489e0b9e5e@mail.gmail.com>

On Wed, Dec 23, 2009 at 10:57 AM, Silvio Tschapke wrote:
> Hi all,
>
> I am using ubuntu 9.10, python 2.6 and biopython 1.53
> But while running these two lines of code
>
> pmid = "14630660"
> results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid))
>
> I get the following error message posted at the bottom.
> So I searched at the proposed websites and in the web for
> "elink_090910.dtd" without success.
> I only found the files for elink_020511.dtd and something for 2010. But
> nothing related to elink_090910.dtd.
> Could you please help me how I can solve this problem?
>
> Cheers and merry christmas,
> Silvio

Hi Silvio,

Yep, I don't have that file either, it looks like we missed it :(

The NCBI website doesn't make it easy to find (as far as I could tell,
none of the DTD pages list this file). However, the XML tells us
where to try:
http://eutils.ncbi.nlm.nih.gov/corehtml/query/DTD/eLink_090910.dtd

I've added this to our repository, and it will be in the next release.

You'll need to download that DTD file. How did you install
Biopython? It looks like the file would need to go inside the
egg - so it might be easier to (re)install from source with this
extra file in the source code subdirectory Bio/Entrez/DTDs.
Or start by grabbing the latest Biopython code from git.
Does that make sense?

However, even with the DTD file, I'm getting "Error 111
(Connection refused)" for your example. Maybe the NCBI are
doing some maintenance work at the moment?
Merry Christmas, Peter From konrad.koehler at mac.com Wed Dec 23 11:22:32 2009 From: konrad.koehler at mac.com (Konrad Koehler) Date: Wed, 23 Dec 2009 12:22:32 +0100 Subject: [Biopython] cannot find elink_090910.dtd In-Reply-To: References: Message-ID: <11374559688483741089292458484266867741-Webmail@me.com> Hi Silvio, This appears to be a temporary glitch with the Entrez database and not biopython. For example, the following link should display the abstract, but current does not: http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=retrieve&db=pubmed&list_uids=14630660&dopt=Abstract A lot of other things in Entrez appear to be broken. For example internal links to RefSeq records from EntrezGene do not currently work. For example, the RefSeq link from here: http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gene&Cmd=ShowDetailView&TermToSearch=2645 to here: http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gene&Cmd=ShowDetailView&TermToSearch=2645 doesn't work either. Hopefully the problems in the Entrez database will be sorted out shortly. Cheers, Konrad On Wednesday, December 23, 2009, at 11:57AM, "Silvio Tschapke" wrote: >Hi all, > >I am using ubuntu 9.10, python 2.6 and biopython 1.53 >But while running these two lines of code > >pmid = "14630660" >results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid) > >I get the following error message posted at the bottom. >So I searched at the proposed websites and in the web for >"elink_090910.dtd" without success. >I only found the files for elink_020511.dtd and something for 2010. But >nothing related to elink_090910.dtd. >Could you please help me how I can solve this problem? 
> >Cheers and merry christmas, >Silvio > > >Traceback (most recent call last): > File "/home/silvio/programming/python/first steps/biopythonTutorial.py", >line 17, in > results = Entrez.read(Entrez.elink(dbfrom="pubmed", id=pmid)) > File >"/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/__init__.py", >line 258, in read > record = handler.read(handle) > File >"/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/Parser.py", >line 108, in read > self.parser.ParseFile(handle) > File >"/usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/Parser.py", >line 377, in externalEntityRefHandler > raise RuntimeError(message) >RuntimeError: Unable to load DTD file eLink_090910.dtd. >_______________________________________________ >Biopython mailing list - Biopython at lists.open-bio.org >http://lists.open-bio.org/mailman/listinfo/biopython > > From biopython at maubp.freeserve.co.uk Wed Dec 23 12:28:37 2009 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 23 Dec 2009 12:28:37 +0000 Subject: [Biopython] cannot find elink_090910.dtd In-Reply-To: <11374559688483741089292458484266867741-Webmail@me.com> References: <11374559688483741089292458484266867741-Webmail@me.com> Message-ID: <320fb6e00912230428y3788591at9848206afa2bb8e0@mail.gmail.com> On Wed, Dec 23, 2009 at 11:22 AM, Konrad Koehler wrote: > Hi ?Silvio, > > This appears to be a temporary glitch with the Entrez database and not biopython. > Your URLs seem to be working now :) On Wed, Dec 23, 2009 at 12:10 PM, Peter wrote: > > However, even with the DTD file, I'm getting "Error 111 > (Connection refused)" for your example. Maybe the NCBI are > doing some maintenance work at the moment? This is also working now. 
It looks like both a temporary glitch at the NCBI, and the
problem with Biopython missing eLink_090910.dtd

Peter

From biopython at maubp.freeserve.co.uk Wed Dec 23 14:03:04 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Dec 2009 14:03:04 +0000
Subject: [Biopython] cannot find elink_090910.dtd
In-Reply-To:
References: <11374559688483741089292458484266867741-Webmail@me.com> <320fb6e00912230428y3788591at9848206afa2bb8e0@mail.gmail.com>
Message-ID: <320fb6e00912230603o23b93c21v36890dc53617b597@mail.gmail.com>

On Wed, Dec 23, 2009 at 1:29 PM, Silvio Tschapke wrote:
>
> I have copied the DTD file in
> /usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/DTDs/
> now and it seems to work. Great! I have installed Biopython with
>
> python setup.py build
> python setup.py test
> sudo python setup.py install

OK - good :)

> But will I have the same problems when I use eSearch, or eQuery and so
> on? Because all of the DTD will not be up to date. So far I only
> copied this DTD for eLink you posted.

I *hope* that this was the only DTD file we were missing. If
not, please do let us know so we can fix this for other users.

In the short term, you would again need to download and install
other missing DTD files in the same way. You can generally look
at the start of the XML file to see where the DTD can be found, e.g.

>>> from Bio import Entrez
>>> Entrez.email = "your.name.here@example.com"
>>> print Entrez.elink(dbfrom="pubmed", id="12345678").read(300)
...
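
The tip above about looking at the start of the XML can be sketched with the standard library alone. This is only an illustration under stated assumptions: the helper name dtd_url_from_xml is made up here, and the sample header is a shortened stand-in rather than real NCBI output (its public identifier is a placeholder, and the root element name eLinkResult is assumed). The idea is simply to pull the DTD URL out of the DOCTYPE declaration, so you know which file to download and copy into Bio/Entrez/DTDs.

```python
import re

# Matches <!DOCTYPE root PUBLIC "public-id" "system-url">
# or      <!DOCTYPE root SYSTEM "system-url">
_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+\S+\s+(?:PUBLIC\s+"[^"]*"\s+|SYSTEM\s+)"([^"]+)"'
)


def dtd_url_from_xml(text):
    """Return the DTD URL named in an XML header, or None if there is none."""
    match = _DOCTYPE.search(text)
    if match:
        return match.group(1)
    return None


# Shortened stand-in for the first few hundred characters of a reply;
# the public identifier below is a placeholder, not the real one.
sample = ('<?xml version="1.0"?>\n'
          '<!DOCTYPE eLinkResult PUBLIC "-//EXAMPLE//DTD placeholder//EN" '
          '"http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eLink_090910.dtd">\n'
          '<eLinkResult>')

url = dtd_url_from_xml(sample)
filename = url.rsplit("/", 1)[-1]  # the file name expected in the DTDs folder
print(url)
print(filename)
```

In practice you would pass in the text returned by the .read(300) call shown above instead of the sample string.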
Thanks, Peter From bjorn_johansson at bio.uminho.pt Wed Dec 23 14:42:37 2009 From: bjorn_johansson at bio.uminho.pt (=?ISO-8859-1?Q?Bj=F6rn_Johansson?=) Date: Wed, 23 Dec 2009 14:42:37 +0000 Subject: [Biopython] question about RestrictionBatch.lambdasplit In-Reply-To: <320fb6e00912201001j24acc005uf7ad6d88f25bbf3d@mail.gmail.com> References: <320fb6e00912190317t60c9fe7ai5c7e388849c72b4c@mail.gmail.com> <4B2D131F.1080302@inaf.cnrs-gif.fr> <320fb6e00912201001j24acc005uf7ad6d88f25bbf3d@mail.gmail.com> Message-ID: Hi, my Biopython version seems to be 1.52 bjorn at bjorn-laptop:~$ python Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) [GCC 4.4.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import Bio >>> Bio.__version__ '1.52' This works really well to get only five cutters and above (rb is a restriction batch): rb = [x for x in rb if len(x) > 4] which was what I wanted initially. Thanks for all help! Happy hollidays! /bjorn 2009/12/20 Peter > 2009/12/20 Bj?rn Johansson : > > Hi, > > I copied the output from running Frederics example on my machine below: > > > > bjorn at bjorn-laptop:~/wikidpad/user_extensions/SeqTools$ python > > Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) > > [GCC 4.4.1] on linux2 > > ... > > The lambdasplit does not seem to be working for me. I use python > > 2.6.4 on ubuntu karmic > > I was using the same version of Python (also on Linux Ubuntu Karmic). > This looks like it could be a Python 2.6 specific problem. > > > How can I print the biooython version? > > thansk for your help! > > /bjorn > > Its in the FAQ in the Tutorial, at the Python prompt just do: > > import Bio > print Bio.__version__ > > Peter > -- ______O_________oO________oO______o_______oO__ Bj?rn Johansson Assistant Professor Departament of Biology University of Minho Campus de Gualtar 4710-057 Braga PORTUGAL http://www.bio.uminho.pt http://sites.google.com/site/bjornhome Work (direct) +351-253 601517 Private mob. 
+351-967 147 704 Dept of Biology (secretariate) +351-253 60 4310 Dept of Biology (fax) +351-253 678980 From cjfields at illinois.edu Wed Dec 23 15:08:31 2009 From: cjfields at illinois.edu (Chris Fields) Date: Wed, 23 Dec 2009 09:08:31 -0600 Subject: [Biopython] cannot find elink_090910.dtd In-Reply-To: <320fb6e00912230603o23b93c21v36890dc53617b597@mail.gmail.com> References: <11374559688483741089292458484266867741-Webmail@me.com> <320fb6e00912230428y3788591at9848206afa2bb8e0@mail.gmail.com> <320fb6e00912230603o23b93c21v36890dc53617b597@mail.gmail.com> Message-ID: <9E777B2B-51F8-407F-B035-B8E34B1BB73F@illinois.edu> On Dec 23, 2009, at 8:03 AM, Peter wrote: > On Wed, Dec 23, 2009 at 1:29 PM, Silvio Tschapke > wrote: >> >> I have copied the DTD file in >> /usr/local/lib/python2.6/dist-packages/biopython-1.53-py2.6-linux-i686.egg/Bio/Entrez/DTDs/ >> now and it seems to work. Great! I have installed Biopython with >> >> python setup.py build >> python setup.py test >> sudo python setup.py install > > OK - good :) > >> But will I have the same problems when I use eSearch, or eQuery and so >> on? Because all of the DTD will not be up to date. So far I only >> copied this DTD for eLink you posted. > > I *hope* that this was the only DTD file we were missing. If > not, please do let us know so we can fix this for other users. > > In the short term, you would again need to download and fetch > other missing DTD files in the same way. You can generally look > at the start of the XML file to see where the DTD can be found, e.g. > >>>> from Bio import Entrez >>>> Entrez.email = "your.name.here at example.com" >>>> print Entrez.elink(dbfrom="pubmed", id="12345678").read(300) > > 2009//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/DTD/eLink_090910.dtd"> > > ... > > Thanks, > > Peter Just a quick question: is there any particular reason you need the DTDs? The BioPerl eutils interface doesn't use them at all, primarily b/c they aren't required on our end. 
chris

From biopython at maubp.freeserve.co.uk Wed Dec 23 15:14:06 2009
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 23 Dec 2009 15:14:06 +0000
Subject: [Biopython] cannot find elink_090910.dtd
In-Reply-To: <9E777B2B-51F8-407F-B035-B8E34B1BB73F@illinois.edu>
References: <11374559688483741089292458484266867741-Webmail@me.com> <320fb6e00912230428y3788591at9848206afa2bb8e0@mail.gmail.com> <320fb6e00912230603o23b93c21v36890dc53617b597@mail.gmail.com> <9E777B2B-51F8-407F-B035-B8E34B1BB73F@illinois.edu>
Message-ID: <320fb6e00912230714y5893143dhd262a6732536c87@mail.gmail.com>

On Wed, Dec 23, 2009 at 3:08 PM, Chris Fields wrote:
>
> Just a quick question: is there any particular reason you need the DTDs?
> The BioPerl eutils interface doesn't use them at all, primarily b/c they
> aren't required on our end.

Our parser uses the DTDs (local copies) to know the expected
data structure, which gets turned into Python lists, dicts, strings
etc using that information. I don't know too much about the
implementation details, but this doesn't work on Jython (Python
under Java) since they haven't implemented the DTD parsing
support we expect.

Peter

From mjldehoon at yahoo.com Thu Dec 24 15:11:48 2009
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 24 Dec 2009 07:11:48 -0800 (PST)
Subject: [Biopython] cannot find elink_090910.dtd
In-Reply-To: <9E777B2B-51F8-407F-B035-B8E34B1BB73F@illinois.edu>
Message-ID: <439227.21985.qm@web62403.mail.re1.yahoo.com>

--- On Wed, 12/23/09, Chris Fields wrote:
> Just a quick question: is there any particular reason you
> need the DTDs?  The BioPerl eutils interface doesn't
> use them at all, primarily b/c they aren't required on our
> end.
>
The DTDs are needed to figure out the data structure of the XML file. In other words, what is a list, what is a dictionary, what is plain data, etcetera. How does the BioPerl eutils interface know how to store the information in the XML file?

--Michiel.
From cjfields at illinois.edu Thu Dec 24 18:54:14 2009 From: cjfields at illinois.edu (Chris Fields) Date: Thu, 24 Dec 2009 12:54:14 -0600 Subject: [Biopython] cannot find elink_090910.dtd In-Reply-To: <439227.21985.qm@web62403.mail.re1.yahoo.com> References: <439227.21985.qm@web62403.mail.re1.yahoo.com> Message-ID: <5569562A-372E-414C-889A-9E75B344BA21@illinois.edu> On Dec 24, 2009, at 9:11 AM, Michiel de Hoon wrote: > --- On Wed, 12/23/09, Chris Fields wrote: >> Just a quick question: is there any particular reason you >> need the DTDs? The BioPerl eutils interface doesn't >> use them at all, primarily b/c they aren't required on our >> end. >> > The DTDs are needed to figure out the data structure of the XML file. In other words, what is a list, what is a dictionary, what is plain data, etcetera. How does the BioPerl eutils interface know how to store the information in the XML file? > > --Michiel. We have classes designed to hold the information generically; docsums has docsum items, elink has linkouts, einfo has field/link information, and so on. This has worked fairly well with eutils changes over the last four years w/o directly relying on the DTDs in a release. chris From pedro.al at fenhi.uh.cu Thu Dec 24 21:05:41 2009 From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández) Date: Thu, 24 Dec 2009 16:05:41 -0500 Subject: [Biopython] Superpose structures... Message-ID: <20091224160541.4d9f8d4so4g4gcss@correo.fenhi.uh.cu> Hi all... I've superposed two structures. Now I want to compute the RMSD between 2 residues after the superposition (with the transformed coordinates of the moving structure). How can I do that? Thanks -- Lic. Yasser Almeida Hernández Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext.
221 ---------------------------------------------------------------- Correo FENHI From pedro.al at fenhi.uh.cu Thu Dec 24 21:37:12 2009 From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández) Date: Thu, 24 Dec 2009 16:37:12 -0500 Subject: [Biopython] Superpose structures... Message-ID: <20091224163712.mg5fgthu04gcgcgk@correo.fenhi.uh.cu> I know I asked this question already, but this time it is a bit different. I have already superposed the structures (ProFit isn't well suited to my project), but now I want to compute the RMSD between 2 residues in the superposed position. When I do that, the RMSD I get is the same as if I had superposed the structures on that residue alone, and what I really want is the "deviation of 2 residues when their structures are superposed according to their binding-site residues", and I want to know how to do that in Biopython... Thanks -- Lic. Yasser Almeida Hernández Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext. 221 ---------------------------------------------------------------- Correo FENHI From schafer at rostlab.org Thu Dec 24 22:11:10 2009 From: schafer at rostlab.org (Christian Schäfer) Date: Thu, 24 Dec 2009 17:11:10 -0500 Subject: [Biopython] Superpose structures... In-Reply-To: <20091224163712.mg5fgthu04gcgcgk@correo.fenhi.uh.cu> References: <20091224163712.mg5fgthu04gcgcgk@correo.fenhi.uh.cu> Message-ID: <4B33E6FE.3090709@rostlab.org> So, what you want to do is to superpose two structures by minimizing the RMSD between aligned residues and then calculate the RMSD between two residues? Is that right? If so, ProFit is able to do that. Superimposition is done by the ZONE command (which then returns the overall RMSD over the aligned regions after superimposition); and calculating the RMSD between specific residues (after superimposition) is done via the RZONE command.
You can even choose which kind of atoms to consider for RMSD calculation with the ATOMS and RATOMS commands. I'm not sure if something that specific could be done with plain BioPython. But again, you can always write a wrapper for the ProFit part in Python. Chris On 12/24/2009 04:37 PM, Yasser Almeida Hernández wrote: > I know i made that question already but this time is a quite different. > I already superpose the structures (Profit isn't so suitable for my > project), but now i want to compute the RMSD betweeen 2 residues in the > superposed position. When i do that, the RMSD that i get is the same > that if i superpose the structures according to that residue, and what i > really want is the "deviation of 2 residues when its structures are > superposed according to its binding sites residues", and i want to know > how to do that in Biopython... > > Thanks > From bioinformaticsing at gmail.com Sat Dec 26 14:37:31 2009 From: bioinformaticsing at gmail.com (ning luwen) Date: Sat, 26 Dec 2009 22:37:31 +0800 Subject: [Biopython] need help! how to retrieve full text from Pubmed central ? Message-ID: <90247fbe0912260637n7553bdf7wbce10a627c0a124c@mail.gmail.com> Dear everyone, I need to download full text from PubMed Central. After reading the Entrez manual, it seems that Entrez (as opposed to the web interface) doesn't offer a way to download the full text as a .pdf file; is this true? -- regards, ningluwen From bioinformaticsing at gmail.com Sat Dec 26 14:54:23 2009 From: bioinformaticsing at gmail.com (ning luwen) Date: Sat, 26 Dec 2009 22:54:23 +0800 Subject: [Biopython] need help! how to retrieve full text from Pubmed central ? In-Reply-To: <90247fbe0912260637n7553bdf7wbce10a627c0a124c@mail.gmail.com> References: <90247fbe0912260637n7553bdf7wbce10a627c0a124c@mail.gmail.com> Message-ID: <90247fbe0912260654scd2b0ceyb37d54f36a3531fa@mail.gmail.com> More about the problem.
From http://eutils.ncbi.nlm.nih.gov/corehtml/query/static/efetchlit_help.html, I learned: PubMed Central contains a number of articles classified as "open access" for which you may download the full text as XML. For the remaining articles in PMC you may download only the abstracts as XML. But when I try

    handle = Entrez.efetch(db='pmc', id=idlist, rettype='full', retmode='xml')
    record = Entrez.read(handle)

I get the following error:

    Traceback (most recent call last):
      File "", line 1, in
      File "/usr/local/lib/python2.6/dist-packages/Bio/Entrez/__init__.py", line 258, in read
        record = handler.read(handle)
      File "/usr/local/lib/python2.6/dist-packages/Bio/Entrez/Parser.py", line 114, in read
        raise CorruptedXMLError
    Bio.Entrez.Parser.CorruptedXMLError: Failed to parse the XML data. Please make sure that the input data are in XML format, and that the data are not corrupted.

The Biopython version is 1.53 and my system is Ubuntu 9.10. On Sat, Dec 26, 2009 at 10:37 PM, ning luwen wrote: > Dear everyone, > I need to download full text from Pubmed central. After see the > Entrez manual, maybe Entrez(not the web interface) doesn't give a way > to download .pdf full text file, is this true? > > > > > -- > regards, > ningluwen > -- regards, luwening,bioinformatics center in uestc: www.bioinformaticsinuestc.cz.cc From pedro.al at fenhi.uh.cu Mon Dec 28 14:06:38 2009 From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández) Date: Mon, 28 Dec 2009 09:06:38 -0500 Subject: [Biopython] Superpose structures... DONE Message-ID: <20091228090638.y05tos1p8g0gk08c@correo.fenhi.uh.cu> The problem with calculating the RMSD after the superposition is solved. This is done with the class SVDSuperimposer (Bio > SVDSuperimposer > SVDSuperimposer.py). This class has the method get_init_rms(), which computes the RMSD of the coordinates as they are passed in (without fitting them again), so with the already transformed coordinates it gives the RMSD after the superposition... Now I have another question. Is it possible in Biopython to read gzipped PDB files (.pdb.gz)? Thanks -- Lic.
Yasser Almeida Hernández Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext. 221 ---------------------------------------------------------------- Correo FENHI From mjldehoon at yahoo.com Mon Dec 28 17:27:23 2009 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Mon, 28 Dec 2009 09:27:23 -0800 (PST) Subject: [Biopython] Superpose structures... DONE In-Reply-To: <20091228090638.y05tos1p8g0gk08c@correo.fenhi.uh.cu> Message-ID: <363342.11278.qm@web62406.mail.re1.yahoo.com> --- On Mon, 12/28/09, Yasser Almeida Hernández wrote: > Now i have another question. It is possible in Biopython > read gziped pdb files (.pdb.gz)? I am not a Bio.PDB user, but from its documentation it looks like it uses the file name to open a PDB file instead of a handle. Thomas, how do you feel about modifying Bio.PDB so it uses a file handle instead of a file name? Then Bio.PDB can parse gzipped and bzipped files. --Michiel. From pedro.al at fenhi.uh.cu Tue Dec 29 14:18:38 2009 From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández) Date: Tue, 29 Dec 2009 09:18:38 -0500 Subject: [Biopython] Remove hydrogens... Message-ID: <20091229091838.fnyk66sayos8swww@correo.fenhi.uh.cu> Hi all... How can I remove hydrogen atoms from structure objects? Thanks -- Lic. Yasser Almeida Hernández Center of Molecular Inmunology (CIM) Nanobiology Group P.O.Box 16040, Havana, Cuba Phone: (537) 271-7933, ext. 221 ---------------------------------------------------------------- Correo FENHI From pengyu.ut at gmail.com Tue Dec 29 16:08:09 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Tue, 29 Dec 2009 10:08:09 -0600 Subject: [Biopython] Comparison between bioperl and biopython? Message-ID: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> May I ask somebody who is versatile in both bioperl and biopython to comment on the pros and cons of bioperl and biopython?
I'm sending this email to both the bioperl and biopython mailing lists, but I hope that it will not result in any contention. I assume that the functionality of bioperl and biopython is the same, i.e., tasks that can be done in bioperl can be done in biopython and vice versa, as both libraries have been out there for over 10 years. Please correct me if my understanding is not true. Given a task that can be done with either bioperl or biopython, I, in particular, want to know how long it will take to write the code for the task in each, with the same readability requirement (see below) and the assumption that users have the same fluency in perl and python. Python is claimed to be good for maintainability, while perl is criticized for there-are-many-ways-for-a-given-task. Since there are multiple ways in perl, let us assume that we always use perl in a readable way. From jason at bioperl.org Tue Dec 29 16:49:20 2009 From: jason at bioperl.org (Jason Stajich) Date: Tue, 29 Dec 2009 08:49:20 -0800 Subject: [Biopython] [Bioperl-l] Comparison between bioperl and biopython? In-Reply-To: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> Message-ID: <2B85EF86-8A84-491B-8C33-7EC16CCB8CBC@bioperl.org> Are you asking for the purposes of choosing a toolkit for your work or just curious about the advantages/disadvantages of language choice? -jason On Dec 29, 2009, at 8:08 AM, Peng Yu wrote: > May I ask somebody who are versitile in both bioperl and biopython > comment on the pros and cons of bioperl and biopython? I'm sending > this email to both bioperl and biopython mailing lists. But I hope > that it will not result in any contention. > > I assume that the functionality between bioperl or biopython is the > same, i.e., tasks can be done in bioperl can be done biopython and > vice versa, as both libraries have been out there over 10 years.
> Please correct me if my understanding is not true. > > Given that a task that can be done with either bioperl or biopython, > I, in particularly, want to know how long it will take to write the > code for the task in bioperl and biopython, with the same readability > requirement (see below) and the assumption that users have the same > fluency in perl and python. > > python is claimed to be good for maintainability. But perl is > criticized for there-are-many-ways-for-a-given-task. Since there are > multiple ways in perl, let us assume that we always use perl in a > readable way. > _______________________________________________ > Bioperl-l mailing list > Bioperl-l at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/bioperl-l -- Jason Stajich jason.stajich at gmail.com jason at bioperl.org http://fungalgenomes.org/ From sdavis2 at mail.nih.gov Tue Dec 29 17:03:40 2009 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Tue, 29 Dec 2009 12:03:40 -0500 Subject: [Biopython] Comparison between bioperl and biopython? In-Reply-To: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> Message-ID: <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu wrote: > May I ask somebody who are versitile in both bioperl and biopython > comment on the pros and cons of bioperl and biopython? I'm sending > this email to both bioperl and biopython mailing lists. But I hope > that it will not result in any contention. > > I assume that the functionality between bioperl or biopython is the > same, i.e., tasks can be done in bioperl can be done biopython and > vice versa, as both libraries have been out there over 10 years. > Please correct me if my understanding is not true. The two projects have similar goals, but saying that the functionality is the same would be an extreme oversimplification. 
You will need to define what you want to do and then check to see what the two projects have to offer. This will, in general, require perusing the websites for both projects as well as the relevant documentation. > Given that a task that can be done with either bioperl or biopython, > I, in particularly, want to know how long it will take to write the > code for the task in bioperl and biopython, with the same readability > requirement (see below) and the assumption that users have the same > fluency in perl and python. Again, you will want to define the task(s) to be accomplished and then weigh the pros and cons of each project combined with local expertise. If you don't know what you want to do, then you can certainly read some examples on the websites and see which project strikes you as a "winner" for you. > python is claimed to be good for maintainability. But perl is > criticized for there-are-many-ways-for-a-given-task. Since there are > multiple ways in perl, let us assume that we always use perl in a > readable way. These two statements are generalizations that provide little insight into the strengths or weaknesses of the languages. In other words, one can write good or bad code in both languages. Hope that helps. Sean From eric.talevich at gmail.com Tue Dec 29 18:37:43 2009 From: eric.talevich at gmail.com (Eric Talevich) Date: Tue, 29 Dec 2009 10:37:43 -0800 Subject: [Biopython] Superpose structures... DONE Message-ID: <3f6baf360912291037o5313b9f0s2acdc481c9989ce1@mail.gmail.com> On Mon, 28 Dec 2009, Michiel de Hoon wrote: > > I am not a Bio.PDB user, but from its documentation it looks like it uses > the file name to open a PDB file instead of a handle. Thomas, how do you > feel about modifying Bio.PDB so it uses a file handle instead of a file > name? Then Bio.PDB can parse gzipped and bzipped files. > > --Michiel. 
> > I guess PDB requires a file name because it wants full control over the file handle -- the handle is passed between PDBParser and parse_pdb_header, for instance. But control still isn't as crucial as in SeqIO.index (for example), so I don't think using a handle directly would lead to catastrophe in general. In addition, do you think a StructIO module would be worthwhile? Benefits: - Accept either a file name or file handle - Wouldn't necessarily need to specify the structure object's name as a separate argument (as PDBParser requires) - No need to instantiate a Parser object before parsing - PDB, PDBXML and mmCIF parsing would be called the same way Drawbacks: - Integrating parse_pdb_headers would become more important/tricky - Thin wrappers still require effort, and I'm currently tied up with TreeIO -- I'd get to it some months from now Cheers, Eric From pengyu.ut at gmail.com Tue Dec 29 18:58:59 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Tue, 29 Dec 2009 12:58:59 -0600 Subject: [Biopython] [Bioperl-l] Comparison between bioperl and biopython? In-Reply-To: <2B85EF86-8A84-491B-8C33-7EC16CCB8CBC@bioperl.org> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <2B85EF86-8A84-491B-8C33-7EC16CCB8CBC@bioperl.org> Message-ID: <366c6f340912291058t6c601e57re0c35e69fe81e09d@mail.gmail.com> To choose a toolkit for my work. On Tue, Dec 29, 2009 at 10:49 AM, Jason Stajich wrote: > Are you asking for the purposes of choosing a toolkit for your work or just > curious about the advantages/disadvantages of language choice? > > -jason > On Dec 29, 2009, at 8:08 AM, Peng Yu wrote: > >> May I ask somebody who are versitile in both bioperl and biopython >> comment on the pros and cons of bioperl and biopython? I'm sending >> this email to both bioperl and biopython mailing lists. But I hope >> that it will not result in any contention. 
>> >> I assume that the functionality between bioperl or biopython is the >> same, i.e., tasks can be done in bioperl can be done biopython and >> vice versa, as both libraries have been out there over 10 years. >> Please correct me if my understanding is not true. >> >> Given that a task that can be done with either bioperl or biopython, >> I, in particularly, want to know how long it will take to write the >> code for the task in bioperl and biopython, with the same readability >> requirement (see below) and the assumption that users have the same >> fluency in perl and python. >> >> python is claimed to be good for maintainability. But perl is >> criticized for there-are-many-ways-for-a-given-task. Since there are >> multiple ways in perl, let us assume that we always use perl in a >> readable way. >> _______________________________________________ >> Bioperl-l mailing list >> Bioperl-l at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/bioperl-l > > -- > Jason Stajich > jason.stajich at gmail.com > jason at bioperl.org > http://fungalgenomes.org/ > > From pengyu.ut at gmail.com Tue Dec 29 19:15:14 2009 From: pengyu.ut at gmail.com (Peng Yu) Date: Tue, 29 Dec 2009 13:15:14 -0600 Subject: [Biopython] Comparison between bioperl and biopython? In-Reply-To: <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> Message-ID: <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> On Tue, Dec 29, 2009 at 11:03 AM, Sean Davis wrote: > On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu wrote: >> May I ask somebody who are versitile in both bioperl and biopython >> comment on the pros and cons of bioperl and biopython? I'm sending >> this email to both bioperl and biopython mailing lists. But I hope >> that it will not result in any contention. 
>> >> I assume that the functionality between bioperl or biopython is the >> same, i.e., tasks can be done in bioperl can be done biopython and >> vice versa, as both libraries have been out there over 10 years. >> Please correct me if my understanding is not true. > > The two projects have similar goals, but saying that the functionality > is the same would be an extreme oversimplification. You will need to > define what you want to do and then check to see what the two projects > have to offer. This will, in general, require perusing the websites > for both projects as well as the relevant documentation. In your experience, are there some tasks that are easier with one than with the other? >> Given that a task that can be done with either bioperl or biopython, >> I, in particularly, want to know how long it will take to write the >> code for the task in bioperl and biopython, with the same readability >> requirement (see below) and the assumption that users have the same >> fluency in perl and python. > > Again, you will want to define the task(s) to be accomplished and then > weigh the pros and cons of each project combined with local expertise. > If you don't know what you want to do, then you can certainly read > some examples on the websites and see which project strikes you as a > "winner" for you. > >> python is claimed to be good for maintainability. But perl is >> criticized for there-are-many-ways-for-a-given-task. Since there are >> multiple ways in perl, let us assume that we always use perl in a >> readable way. > > These two statements are generalizations that provide little insight > into the strengths or weaknesses of the languages. In other words, > one can write good or bad code in both languages. > > Hope that helps. > > Sean > From jkhilmer at gmail.com Tue Dec 29 19:55:18 2009 From: jkhilmer at gmail.com (Jonathan Hilmer) Date: Tue, 29 Dec 2009 12:55:18 -0700 Subject: [Biopython] Comparison between bioperl and biopython?
In-Reply-To: <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> Message-ID: <81277ce10912291155x6dde10ewe2055b9692d077c1@mail.gmail.com> Personally, I think that the differences between Python and Perl (although substantial) are not large enough to make the language itself the deciding factor. Instead, consider the larger community of software. I haven't yet found a situation in which Python cannot be applied: it can be used with R (statistics); lower-level code C or fortran; visualization software such as PyMol, Chimera, Blender, VTK; plotting with matplotlib; and scipy/numpy or sage, which provide innumerable benefits for computation, data-processing, etc. Although I don't claim to have a great deal of experience with Perl, I haven't seen the same integration with that language: I'm assuming it can be used with R and VTK (not sure about C or fortran?). For this reason, unless your work is highly targeted and you have no use programming language integration with other software, I would recommend Python. For perl experts, I would truly appreciate any corrections you could offer to these observations of mine, since I wouldn't mind using perl if it offers benefits either in general or for specific applications. Jonathan On Tue, Dec 29, 2009 at 12:15 PM, Peng Yu wrote: > On Tue, Dec 29, 2009 at 11:03 AM, Sean Davis wrote: >> On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu wrote: >>> May I ask somebody who are versitile in both bioperl and biopython >>> comment on the pros and cons of bioperl and biopython? I'm sending >>> this email to both bioperl and biopython mailing lists. But I hope >>> that it will not result in any contention. 
>>> >>> I assume that the functionality between bioperl or biopython is the >>> same, i.e., tasks can be done in bioperl can be done biopython and >>> vice versa, as both libraries have been out there over 10 years. >>> Please correct me if my understanding is not true. >> >> The two projects have similar goals, but saying that the functionality >> is the same would be an extreme oversimplification. You will need to >> define what you want to do and then check to see what the two projects >> have to offer. This will, in general, require perusing the websites >> for both projects as well as the relevant documentation. > > According to your experience, are there some tasks that are easier > with one than with another? > >>> Given that a task that can be done with either bioperl or biopython, >>> I, in particularly, want to know how long it will take to write the >>> code for the task in bioperl and biopython, with the same readability >>> requirement (see below) and the assumption that users have the same >>> fluency in perl and python. >> >> Again, you will want to define the task(s) to be accomplished and then >> weigh the pros and cons of each project combined with local expertise. >> If you don't know what you want to do, then you can certainly read >> some examples on the websites and see which project strikes you as a >> "winner" for you. >> >>> python is claimed to be good for maintainability. But perl is >>> criticized for there-are-many-ways-for-a-given-task. Since there are >>> multiple ways in perl, let us assume that we always use perl in a >>> readable way. >> >> These two statements are generalizations that provide little insight >> into the strengths or weaknesses of the languages. In other words, >> one can write good or bad code in both languages. >> >> Hope that helps.
>> >> Sean >> > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From wgheath at gmail.com Tue Dec 29 20:16:39 2009 From: wgheath at gmail.com (William Heath) Date: Tue, 29 Dec 2009 12:16:39 -0800 Subject: [Biopython] Comparison between bioperl and biopython? In-Reply-To: <81277ce10912291155x6dde10ewe2055b9692d077c1@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> <81277ce10912291155x6dde10ewe2055b9692d077c1@mail.gmail.com> Message-ID: The biggest reason to go with python is the ease of use. Biologists are not programmers and the learning curve for python is much smaller than that of perl. I like perl but choose python because of this issue. Perl 6 does address some of these issues; however, it has not been fully implemented yet. -Tim P.S. I love, love, love cpan though, which is only for perl right now :( On Tue, Dec 29, 2009 at 11:55 AM, Jonathan Hilmer wrote: > Personally, I think that the differences between Python and Perl > (although substantial) are not large enough to make the language > itself the deciding factor. > > Instead, consider the larger community of software. I haven't yet > found a situation in which Python cannot be applied: it can be used > with R (statistics); lower-level code C or fortran; visualization > software such as PyMol, Chimera, Blender, VTK; plotting with > matplotlib; and scipy/numpy or sage, which provide innumerable > benefits for computation, data-processing, etc. > > Although I don't claim to have a great deal of experience with Perl, I > haven't seen the same integration with that language: I'm assuming it > can be used with R and VTK (not sure about C or fortran?).
For this > reason, unless your work is highly targeted and you have no use > programming language integration with other software, I would > recommend Python. > > For perl experts, I would truly appreciate any corrections you could > offer to these observations of mine, since I wouldn't mind using perl > if it offers benefits either in general or for specific applications. > > > Jonathan > > On Tue, Dec 29, 2009 at 12:15 PM, Peng Yu wrote: > > On Tue, Dec 29, 2009 at 11:03 AM, Sean Davis > wrote: > >> On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu wrote: > >>> May I ask somebody who are versitile in both bioperl and biopython > >>> comment on the pros and cons of bioperl and biopython? I'm sending > >>> this email to both bioperl and biopython mailing lists. But I hope > >>> that it will not result in any contention. > >>> > >>> I assume that the functionality between bioperl or biopython is the > >>> same, i.e., tasks can be done in bioperl can be done biopython and > >>> vice versa, as both libraries have been out there over 10 years. > >>> Please correct me if my understanding is not true. > >> > >> The two projects have similar goals, but saying that the functionality > >> is the same would be an extreme oversimplification. You will need to > >> define what you want to do and then check to see what the two projects > >> have to offer. This will, in general, require perusing the websites > >> for both projects as well as the relevant documentation. > > > > According to your experience, are there some tasks that are easier > > with one than with another? > > > >>> Given that a task that can be done with either bioperl or biopython, > >>> I, in particularly, want to know how long it will take to write the > >>> code for the task in bioperl and biopython, with the same readability > >>> requirement (see below) and the assumption that users have the same > >>> fluency in perl and python. 
> >> > >> Again, you will want to define the task(s) to be accomplished and then > >> weigh the pros and cons of each project combined with local expertise. > >> If you don't know what you want to do, then you can certainly read > >> some examples on the websites and see which project strikes you as a > >> "winner" for you. > >> > >>> python is claimed to be good for maintainability. But perl is > >>> criticized for there-are-many-ways-for-a-given-task. Since there are > >>> multiple ways in perl, let us assume that we always use perl in a > >>> readable way. > >> > >> These two statements are generalizations that provide little insight > >> into the strengths or weaknesses of the languages. In other words, > >> one can write good or bad code in both languages. > >> > >> Hope that helps. > >> > >> Sean > >> > > > > _______________________________________________ > > Biopython mailing list - Biopython at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython > > > > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From jason at bioperl.org Tue Dec 29 21:57:49 2009 From: jason at bioperl.org (Jason Stajich) Date: Tue, 29 Dec 2009 13:57:49 -0800 Subject: [Biopython] [Bioperl-l] Comparison between bioperl and biopython? In-Reply-To: <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> References: <366c6f340912290808q6edea4d8ncb59a270f9d11f1a@mail.gmail.com> <264855a00912290903m213d7cc4l607e8fa0bad55571@mail.gmail.com> <366c6f340912291115o58ba0b82kce74e18fecd833c8@mail.gmail.com> Message-ID: <02851B8A-E74E-453E-9725-6FA8F3995F82@bioperl.org> On Dec 29, 2009, at 11:15 AM, Peng Yu wrote: > On Tue, Dec 29, 2009 at 11:03 AM, Sean Davis > wrote: >> On Tue, Dec 29, 2009 at 11:08 AM, Peng Yu >> wrote: >>> May I ask somebody who are versitile in both bioperl and biopython >>> comment on the pros and cons of bioperl and biopython? 
I'm sending >>> this email to both bioperl and biopython mailing lists. But I hope >>> that it will not result in any contention. >>> >>> I assume that the functionality between bioperl or biopython is the >>> same, i.e., tasks can be done in bioperl can be done biopython and >>> vice versa, as both libraries have been out there over 10 years. >>> Please correct me if my understanding is not true. >> >> The two projects have similar goals, but saying that the >> functionality >> is the same would be an extreme oversimplification. You will need to >> define what you want to do and then check to see what the two >> projects >> have to offer. This will, in general, require perusing the websites >> for both projects as well as the relevant documentation. > > According to your experience, are there some tasks that are easier > with one than with another? As you have still failed to give much insight into the 'tasks' it is hard to give you a better answer. If there is a module or set of routines already written then yes one might be easier than the other. Otherwise it just depends on your strengths in the programming language. We discussed the strengths of the different toolkits briefly on the podcast last month. http://twit.tv/floss96 I echo Sean. Use whichever language you are a better programmer in. BioPerl is more mature in some facets than is BioPython, but BioPython has some components that are more heavily developed and supported than BioPerl (structures being one of those and interfacing that to pyMol would be a strength). I personally think the Gbrowse, Bio-Graphics, and Bio::DB::GFF/Bio::DB::SeqFeature::Store interface to Sequence databases and Features is a critical aspect of mining genomic data and features and use these heavily in my work, making BioPerl easy and powerful for my tasks. That and sequence and alignment parsing and reformatting. 
But there are comparable tools written in python, with and without
BioPython, that you can also use, so mainly it is about building up
expertise in a toolkit and going forward.

The BioPerl faithful will probably say it is the more useful toolkit to
us, but we are of course a biased sample. Both projects can benefit from
more users and developers contributing code and documentation, so I would
just jump in and give it a try if you are unsure which will be easier
for you.

>
>>> Given a task that can be done with either bioperl or biopython,
>>> I, in particular, want to know how long it will take to write the
>>> code for the task in bioperl and biopython, with the same readability
>>> requirement (see below) and the assumption that users have the same
>>> fluency in perl and python.
>>
>> Again, you will want to define the task(s) to be accomplished and then
>> weigh the pros and cons of each project combined with local expertise.
>> If you don't know what you want to do, then you can certainly read
>> some examples on the websites and see which project strikes you as a
>> "winner" for you.
>>
>>> python is claimed to be good for maintainability. But perl is
>>> criticized for there-are-many-ways-for-a-given-task. Since there are
>>> multiple ways in perl, let us assume that we always use perl in a
>>> readable way.
>>
>> These two statements are generalizations that provide little insight
>> into the strengths or weaknesses of the languages. In other words,
>> one can write good or bad code in both languages.
>>
>> Hope that helps.
>>
>> Sean
>>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l

--
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org
http://fungalgenomes.org/

From mitlox at op.pl Wed Dec 30 02:42:07 2009
From: mitlox at op.pl (xyz)
Date: Wed, 30 Dec 2009 12:42:07 +1000
Subject: [Biopython] fastq-solexa index
In-Reply-To: <320fb6e00911260248w1f6a29b1ucc0bfecec897c67b@mail.gmail.com>
References: <4B0DD08B.6070607@op.pl>
 <320fb6e00911260248w1f6a29b1ucc0bfecec897c67b@mail.gmail.com>
Message-ID: <4B3ABDFF.8030809@op.pl>

Peter wrote:
> In Bio.SeqIO we give each file format a name, in this case "fastq-solexa"
> means the old Solexa FASTQ files (also used by Illumina up to and
> including pipeline 1.2) which use Solexa scores with an ASCII offset
> of 64 (not PHRED scores). The table on the SeqIO wiki page tries to
> summarise this. See also: http://en.wikipedia.org/wiki/FASTQ_format
>
> The "index" column on that table on the SeqIO wiki page indicates if
> each file format can be used with the Bio.SeqIO.index(...) function
> included in Biopython 1.52 onwards. See:
> http://news.open-bio.org/news/2009/09/biopython-seqio-index/
>
> There are also examples in the main Tutorial,
> http://biopython.org/DIST/docs/tutorial/Tutorial.html
> http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
>
> And in the Bio.SeqIO module's built in help, online here:
> http://biopython.org/DIST/docs/api/Bio.SeqIO-module.html
>
> From within Python:
>
> >>> from Bio import SeqIO
> >>> help(SeqIO)
> ...
> >>> help(SeqIO.index)
> ...
>
> Peter
>

Thank you.
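
For readers of this thread, the remark about Solexa scores with an ASCII
offset of 64 can be illustrated with a small sketch. It uses only the
Python standard library, and the function names are made up for this
example (they are not Biopython's API). It assumes the usual relation
between the two scales, PHRED = 10 * log10(10^(Solexa/10) + 1). When you
parse a file as "fastq-solexa", Bio.SeqIO decodes the quality string for
you, so this is only meant to show what the offset and the score type mean.

```python
import math

# Assumption for this sketch: old Solexa FASTQ files encode each quality
# score as a single character with an ASCII offset of 64.
SOLEXA_OFFSET = 64


def solexa_char_to_score(char):
    """Decode one quality character into its Solexa score (hypothetical helper)."""
    return ord(char) - SOLEXA_OFFSET


def solexa_to_phred(q_solexa):
    """Convert a Solexa score to the equivalent PHRED score as a float."""
    return 10 * math.log10(10 ** (q_solexa / 10.0) + 1)


def solexa_string_to_phred(quality):
    """Convert a Solexa quality string to a list of rounded PHRED scores."""
    return [round(solexa_to_phred(solexa_char_to_score(c))) for c in quality]


if __name__ == "__main__":
    # ';' is Solexa -5, '@' is Solexa 0, 'J' is Solexa 10, 'h' is Solexa 40
    print(solexa_string_to_phred(";@Jh"))
```

The two scales agree closely for high scores and differ mostly below
about 10, which is why the distinction matters mainly for low-quality
bases.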

From mitlox at op.pl Wed Dec 30 02:51:45 2009
From: mitlox at op.pl (xyz)
Date: Wed, 30 Dec 2009 12:51:45 +1000
Subject: [Biopython] Strand
Message-ID: <4B3AC041.2070008@op.pl>

Hello,
I downloaded data from Phytozome Biomart:

>AC159145_38|MtChr2|AC159145_38|Mtruncatula|17915949|17918990|-1
ATTTCCTCCAGACTTGTTAAAGAAGTTGAGTACAGATTGTATTGTCATGCAAAATCATCA
ATATGGCATATCCCCAGTAAAACTCCTGGGAAATCAAAAGCTATCGAGTTTTTTCGAGAT
CTTGACAACTTCCAACGATCAAGATGATAAGGTTTATGTCTCTACAGTACGTTCACGTAA
CTATCCCGTGACTGGCTTCCAATGGCATCCTGAGAAAAATGCCTTCGAATGGGGCTCACC
AAGCATTCCACACACAGAGGATGCCATTCGAACAACTCAGTATGCTGCAAACTATTTGGT
CAGTGAAGCGAGGAAGTCCTTAAACAGACCAGTTGCTCAGGAATTGTTAGACAATCTCAT
ATACAATTACAGACCCACTTATTGTGGGTATGCAGGTTGTCCACCGCCTAATCCGAACCT
CTACTACCAGCCGGTCATTGGAATTCTCAGCCACCCCGGCGATGGCACTTCAGGCCGCCA
CAGTAATGCTACGGGCGCTTCCTTCATTCACGCCTCTTATGTGAAATTCGTGGAGGCTGC
TGGCGCTAGAGTAGTTCCTCTCATTTACAACGAACCGGAGGAGAAGATTCTCAAGGTATC
AGAAAAGGCCAAAGCTTGA

The above data is from the -1 strand, but how could I convert it to the
+1 strand?

Thank you in advance.

Best regards,

From p.j.a.cock at googlemail.com Wed Dec 30 12:13:42 2009
From: p.j.a.cock at googlemail.com (Peter)
Date: Wed, 30 Dec 2009 12:13:42 +0000
Subject: [Biopython] programming errors
In-Reply-To: <4788465e0912252205l1ac3f26ds98739898ff83dbd9@mail.gmail.com>
References: <4788465e0912252205l1ac3f26ds98739898ff83dbd9@mail.gmail.com>
Message-ID: <389810B4-4CAB-4375-8964-5ABDA9749FFB@googlemail.com>

Hi Rocky,

To send a query to the mailing list please use the Biopython at ...
address, not the Biopython-owner at ... address. You need to sign up
to the mailing list first.

Thanks,

Peter

On 26 Dec 2009, at 06:05, Rocky Parida wrote:

> Hi
> My name is Rocky. I am a student at the Grand Valley State University.
> I am doing an MS in Bioinformatics. I am currently using python 2.6.
> I was trying to follow the documentation
> (http://www.inb.mu-luebeck.de/biosoft/biopython/tut/Tutorial002.html#toc10)
> in order to connect to biological databases.
> I am facing some trouble importing data from NCBI. I am attaching my
> snipped Word doc with this email.
> Can you please advise me on how to perform statistical data analysis
> using python? I am very much interested to learn, but I am having
> trouble following the documentation. Is there step-by-step
> documentation that has all the information regarding what to write,
> what to download and how to do statistics using python code? Please
> refer to those sites and books in your reply. I have very little
> background in programming, so please keep that in consideration as well.
> Thanking you
> Rocky

From chapmanb at 50mail.com Wed Dec 30 12:59:42 2009
From: chapmanb at 50mail.com (Brad Chapman)
Date: Wed, 30 Dec 2009 07:59:42 -0500
Subject: [Biopython] Strand
In-Reply-To: <4B3AC041.2070008@op.pl>
References: <4B3AC041.2070008@op.pl>
Message-ID: <20091230125942.GB39741@sobchak.mgh.harvard.edu>

Hello;

> I downloaded data from Phytozome Biomart:
>
> >AC159145_38|MtChr2|AC159145_38|Mtruncatula|17915949|17918990|-1
> ATTTCCTCCAGACTTGTTAAAGAAGTTGAGTACAGATTGTATTGTCATGCAAAATCATCA
> ATATGGCATATCCCCAGTAAAACTCCTGGGAAATCAAAAGCTATCGAGTTTTTTCGAGAT
> CTTGACAACTTCCAACGATCAAGATGATAAGGTTTATGTCTCTACAGTACGTTCACGTAA
> CTATCCCGTGACTGGCTTCCAATGGCATCCTGAGAAAAATGCCTTCGAATGGGGCTCACC
> AAGCATTCCACACACAGAGGATGCCATTCGAACAACTCAGTATGCTGCAAACTATTTGGT
> CAGTGAAGCGAGGAAGTCCTTAAACAGACCAGTTGCTCAGGAATTGTTAGACAATCTCAT
> ATACAATTACAGACCCACTTATTGTGGGTATGCAGGTTGTCCACCGCCTAATCCGAACCT
> CTACTACCAGCCGGTCATTGGAATTCTCAGCCACCCCGGCGATGGCACTTCAGGCCGCCA
> CAGTAATGCTACGGGCGCTTCCTTCATTCACGCCTCTTATGTGAAATTCGTGGAGGCTGC
> TGGCGCTAGAGTAGTTCCTCTCATTTACAACGAACCGGAGGAGAAGATTCTCAAGGTATC
> AGAAAAGGCCAAAGCTTGA
>
> The above data is from -1 strand, but how could I convert it +1 strand?

If you got this from BioMart and retrieved something like cDNA
sequence or transcript, this is probably already reverse complemented
for you.
In this case it looks like a coding sequence starting at base 47 and
proceeding to the stop codon at the end.

To answer your question, please see the Tutorial documentation,
specifically Chapter 5 Sequence Input/Output:

http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc44

and section 3.7 Nucleotide sequences and (reverse) complements:

http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc23

This should lead you to:

    from Bio import SeqIO

    in_file = "your_data.fa"
    with open(in_file) as in_handle:
        rec = SeqIO.read(in_handle, "fasta")
    # reverse complement of the sequence held in the record
    rc = rec.seq.reverse_complement()

Hope this helps,
Brad

From pedro.al at fenhi.uh.cu Wed Dec 30 20:05:55 2009
From: pedro.al at fenhi.uh.cu (Yasser Almeida Hernández)
Date: Wed, 30 Dec 2009 15:05:55 -0500
Subject: [Biopython] Save custom structure...
Message-ID: <20091230150555.ojbth0gp34g088os@correo.fenhi.uh.cu>

Hi all...

I've extracted a residue and an atom as two separate objects... How
can I save them as a single structure .pdb file?

Thanks

--
Lic. Yasser Almeida Hernández
Center of Molecular Inmunology (CIM)
Nanobiology Group
P.O.Box 16040, Havana, Cuba
Phone: (537) 271-7933, ext. 221

----------------------------------------------------------------
Correo FENHI