From chapmanb at uga.edu Wed Jun 2 11:28:22 2004
From: chapmanb at uga.edu (Brad Chapman)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] Restriction analysis package.
In-Reply-To: <40B5CF0C.5020107@users.sourceforge.net>
References: <40A74A13.5040503@users.sourceforge.net>
    <20040516182321.GA53985@misterbd.agtec.uga.edu>
    <40B5CF0C.5020107@users.sourceforge.net>
Message-ID: <20040602152821.GE47885@misterbd.agtec.uga.edu>

Hi Fred;

[Proposal for directory structure for restriction package]
> >Bio/Restriction/__init__.py --> The current Restriction.py
> >Bio/Restriction/Restriction_Dictionary.py --> the dictionary
> >Bio/Restriction/_Update/ --> The Update, RanaConfig and
> >RestrictionCompiler code to do the updates and regenerate the
> >dictionary.
>
> yes and no. I agree with the organisation of the code and
> I would effectively update the dictionary in Biopython but I think it is
> important for the end user to be able to update the dictionary on their
> machine
> without downloading the full distribution, so this is also a public
> functionality.

Okay, not a problem. I think having the Update directory being
non-public is okay -- mostly as long as it is a separate directory
so that regular usage versus updating is clear.

[updating the restriction dictionary]
> The first point is who we want to do the update :

I think you are making things more complicated than I meant. All I
was proposing was two things:

1. We (meaning the Biopython community) keep the
Restriction_Dictionary.py in Biopython up to date. Thus people
either following CVS or downloading new releases will get the
updates when they update Biopython. To me this covers 99 percent of
use cases, since most of the updates are to brand new enzymes not in
wide use.

2. For people who need an update faster than it gets into a release
(or don't want to reinstall Biopython for some reason, or whatever),
they can run the ranacompiler.py script.
I don't think we need to be fancy about where the Database and
Updates directories go -- why can't they just be created in the
directory where the user runs the script from? Then the
Restriction_Dictionary.py is created in that directory and the user
can take charge of moving it to site-packages/Bio/Restriction on
their own.

This leaves all of the worrying about standard directories and
permissions to the users, instead of us. Less work. Yay.

> I have had some time when I was away to test a bit further the
> Restriction package.
> I have a class to add which allow analysis (i.e where you can specify
> things as only blunt, or enzymes which cut twice...).

Great. If the above sounds good or at least we are on the same page
(I think so...) then I'd be happy to work on getting this into
Biopython. If you create another release with the functionality I'd
be happy to work on getting it in.

Thanks again for your work on this!
Brad

From fsms at users.sourceforge.net Sat Jun 5 07:22:27 2004
From: fsms at users.sourceforge.net (fsms@users.sourceforge.net)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] Restriction analysis package.
Message-ID: <40C1ACF3.3050200@users.sourceforge.net>

Hi,

The modifications on the packages have been done. The file is released at
http://sourceforge.net/projects/rana. The name of the release is
ranaBiopython-0.2.

I had some problems when trying to use the SeqUtils.antiparallel function.
Here is an example.
The version I use is pulled out of CVS.

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import IUPAC
>>> s = Seq('acgt', IUPAC.IUPACAmbiguousDNA)
>>> from Bio.SeqUtils import antiparallel
>>> b = antiparallel(s)

Traceback (most recent call last):
  File "", line 1, in -toplevel-
    b = antiparallel(s)
  File "/home/bssfs/cvsroot/biopython/build/lib.linux-i686-2.3/Bio/SeqUtils/__init__.py", line 42, in antiparallel
    s = complement(seq)
  File "/home/bssfs/cvsroot/biopython/build/lib.linux-i686-2.3/Bio/SeqUtils/__init__.py", line 32, in complement
    return seq.translate(_ttable)
AttributeError: Seq instance has no attribute 'translate'
>>>

So I replaced that with a function of my own. But if somebody can tell me how it
is supposed to be used (there is no docstring on the function either)....

I added a new class Analysis. It is messy, but it allows some formatting of the
results obtained when doing a search() with a RestrictionBatch.

There is also the problem of the documentation. I could write something to put
in the cookbook if you explain to me how or direct me to the howto.

Bye

Fred

From hoffman at ebi.ac.uk Sat Jun 5 07:19:20 2004
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] Restriction analysis package.
In-Reply-To: <40C1ACF3.3050200@users.sourceforge.net>
References: <40C1ACF3.3050200@users.sourceforge.net>
Message-ID:

On Sat, 5 Jun 2004, fsms@users.sourceforge.net wrote:

> >>> s = Seq('acgt', IUPAC.IUPACAmbiguousDNA)

You should be using 'ACGT' in caps.

HTH,
--
Michael Hoffman
European Bioinformatics Institute

From fsms at users.sourceforge.net Mon Jun 7 04:59:39 2004
From: fsms at users.sourceforge.net (fsms@users.sourceforge.net)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] Restriction analysis package.
In-Reply-To:
References: <40C1ACF3.3050200@users.sourceforge.net>
Message-ID: <40C42E7B.9010907@users.sourceforge.net>

Michael Hoffman wrote:

>On Sat, 5 Jun 2004, fsms@users.sourceforge.net wrote:
>
>> >>> s = Seq('acgt', IUPAC.IUPACAmbiguousDNA)
>
>You should be using 'ACGT' in caps.
>
>HTH,
>

Hi,

Ok, sorry my mistake. antiparallel is fed with a string not a Seq object.
This should be in the docstring of the function, particularly as the name of
the argument (seq) is confusing.

Thanks for the tip.

Fred

From hoffman at ebi.ac.uk Mon Jun 7 04:54:28 2004
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] Restriction analysis package.
In-Reply-To: <40C42E7B.9010907@users.sourceforge.net>
References: <40C1ACF3.3050200@users.sourceforge.net>
    <40C42E7B.9010907@users.sourceforge.net>
Message-ID:

On Mon, 7 Jun 2004, fsms@users.sourceforge.net wrote:

> Ok, sorry my mistake.
> antiparallel is fed with a string not a Seq object.

A Seq object works just fine. It just always returns a string:

Python 2.3.3 (#1, Mar 31 2004, 11:17:07)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import IUPAC
>>> s = Seq("ACGTAAAAAA", IUPAC.IUPACAmbiguousDNA)
>>> from Bio.SeqUtils import antiparallel
>>> b = antiparallel(s)
>>> b
'TTTTTTACGT'
>>> type(b)
--
Michael Hoffman
European Bioinformatics Institute

From fsms at users.sourceforge.net Mon Jun 7 05:37:32 2004
From: fsms at users.sourceforge.net (fsms@users.sourceforge.net)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] Restriction analysis package.
In-Reply-To:
References: <40C1ACF3.3050200@users.sourceforge.net>
    <40C42E7B.9010907@users.sourceforge.net>
Message-ID: <40C4375C.3090408@users.sourceforge.net>

Michael Hoffman wrote:

>On Mon, 7 Jun 2004, fsms@users.sourceforge.net wrote:
>
>>Ok, sorry my mistake. antiparallel is fed with a string
>>not a Seq object.
>
>A Seq object works just fine. It just always returns a string:
>
>Python 2.3.3 (#1, Mar 31 2004, 11:17:07)
>[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
>Type "help", "copyright", "credits" or "license" for more information.
>
>>>>from Bio.Seq import Seq
>>>>from Bio.Alphabet import IUPAC
>>>>s = Seq("ACGTAAAAAA", IUPAC.IUPACAmbiguousDNA)
>>>>from Bio.SeqUtils import antiparallel
>>>>b = antiparallel(s)
>>>>b
>'TTTTTTACGT'
>
>>>>type(b)
>

Hi,

It does not on mine. It seems the new way of doing it restricts it to strings.
Have a look at SeqUtils in CVS.

SeqUtils from CVS.

version 1.6 :

def complement(seq):
    " returns the complementary sequence (NOT antiparallel) "
    return ''.join([IUPACData.ambiguous_dna_complement[x] for x in seq])

works on any iterable.
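
A minimal sketch of why the two complement() variants discussed in this thread can behave differently on strings and on sequence objects. It is written in current Python 3 rather than the Python 2.3 used in the thread, and it does not touch Biopython: the names COMPLEMENT, TTABLE, ToySeq, complement_lookup and complement_translate are hypothetical, and the complement table is a hand-written subset, not IUPACData.ambiguous_dna_complement.

```python
# Sketch only: a hand-written subset of a DNA complement table,
# not Biopython's IUPACData.ambiguous_dna_complement.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

# Python 3 way to build a translation table from a dict of single characters.
TTABLE = str.maketrans(COMPLEMENT)


class ToySeq:
    """Hypothetical stand-in for a sequence object: iterable, but not a str."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return iter(self.data)

    def tostring(self):
        return self.data


def complement_lookup(seq):
    """Per-character dictionary lookup; accepts any iterable of letters."""
    return "".join(COMPLEMENT[base] for base in seq)


def complement_translate(seq):
    """Table-based translation; needs an object that has a translate() method."""
    return seq.translate(TTABLE)


print(complement_lookup("ACGT"))
print(complement_lookup(ToySeq("ACGT")))
print(complement_translate("ACGT"))
try:
    complement_translate(ToySeq("ACGT"))
except AttributeError as exc:
    # ToySeq has no translate() method, mirroring the AttributeError above.
    print("translate-based version failed:", exc)
```

The lookup version only needs iteration, while the translate version relies on a method of the string type, so the wrapper object has to be unwrapped first, for example with complement_translate(ToySeq("ACGT").tostring()).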
version 1.7 :

def complement(seq):
    '''returns the complementary sequence (NOT antiparallel)
    much faster on long sequences than the previous loop based one.
    provided by Michael Palmer, University of Waterloo
    '''
    return seq.translate(_ttable)

works only on strings, and version 1.9 states this explicitly.

Maybe it would be a good idea to modify Seq and MutableSeq to provide
a translate method then. Something like:

def translate(self, table):
    return self.tostring().translate(table)

Fred

From mdehoon at ims.u-tokyo.ac.jp Mon Jun 14 03:14:37 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] forward_complement, reverse_complement
Message-ID: <40CD505D.7040202@ims.u-tokyo.ac.jp>

Bio/GFF/easy.py contains the functions forward_complement and
reverse_complement, which return the forward and reverse complement of a
sequence object. I had been looking for such functions in Biopython for a while,
but I assumed that they were not available as I didn't find them in Bio/Seq.py.
I'd like to propose moving those two functions there. Note that Bio.SeqUtils
contains similar functions that work on strings but not on sequence objects. Any
thoughts?

--Michiel.

--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon

From hoffman at ebi.ac.uk Mon Jun 14 05:21:40 2004
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] forward_complement, reverse_complement
In-Reply-To: <40CD505D.7040202@ims.u-tokyo.ac.jp>
References: <40CD505D.7040202@ims.u-tokyo.ac.jp>
Message-ID:

> Bio/GFF/easy.py contains the functions forward_complement and
> reverse_complement, which return the forward and reverse complement of a
> sequence object.
> I had been looking for such functions in Biopython for a while,
> but I assumed that they were not available as I didn't find them in Bio/Seq.py.
> I'd like to propose to move those two functions there. Note that Bio.SeqUtils
> contains similar functions that work on strings but not on sequence objects. Any
> thoughts?

I wrote those when Bio.GFF was not part of Biopython and they are
really only there to support Bio.GFF.

It would probably be better to change the Bio.SeqUtils functions to
work on sequence objects. I imagine the Bio.SeqUtils functions are
much faster since much of the work gets passed to the native function
str.translate().
--
Michael Hoffman
European Bioinformatics Institute

From j.a.casbon at qmul.ac.uk Mon Jun 14 10:20:39 2004
From: j.a.casbon at qmul.ac.uk (James Casbon)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] Bio.Blast.Record.MultipleAlignment bugfix
Message-ID: <200406141520.39153.j.a.casbon@qmul.ac.uk>

The to_generic method relied on uniqueness of sequence names. These need not
be unique, so I have rewritten the method to not use them.

Please see attached diff.

James
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Record.py.diff
Type: text/x-diff
Size: 1456 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20040614/93c14d09/Record.py.bin

From bugzilla-daemon at portal.open-bio.org Tue Jun 22 12:00:18 2004
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon@portal.open-bio.org)
Date: Sat Mar 5 14:43:34 2005
Subject: [Biopython-dev] [Bug 1654] New: SyntaxError in NCBIWWW.BlastParser
Message-ID: <200406221600.i5MG0IKZ031738@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=1654

           Summary: SyntaxError in NCBIWWW.BlastParser
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev@biopython.org
        ReportedBy: idoerg@burnham.org

>>> file = open("results")
>>> from Bio.Blast import NCBIWWW
>>> parser = NCBIWWW.BlastParser()
>>> parser.parse(file)
Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 47, in parse
    self._scanner.feed(handle, self._consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 99, in feed
    self._scan_rounds(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 242, in _scan_rounds
    self._scan_alignments(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 325, in _scan_alignments
    self._scan_pairwise_alignments(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 348, in _scan_pairwise_alignments
    self._scan_one_pairwise_alignment(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 379, in _scan_one_pairwise_alignment
    self._scan_alignment_header(uhandle, consumer)
  File "/usr/lib/python2.2/site-packages/Bio/Blast/NCBIWWW.py", line 417, in _scan_alignment_header
    raise SyntaxError, "I missed the Length in an alignment header"
SyntaxError: I missed the Length in an alignment header

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

From srwalt2 at uky.edu Thu Jun 24 17:55:09 2004
From: srwalt2 at uky.edu (Steven Walter)
Date: Sat Mar 5 14:43:35 2005
Subject: [Biopython-dev] [PATCH] Use of wrong variable in Bio/Blast/NCBIXML.py
Message-ID: <40DB4DBD.5030505@uky.edu>

The attached patch fixes a bug I found in the latest version of
biopython. Without the patch, python will crash because the variable it
references is not defined. "handle" is one of the parameters to the
function, and isn't used anywhere else, so this is almost certainly the
correct fix. Additionally, It Works For Me(R).

Thanks for this great piece of software!

Steven Walter
Summer Bioinformatics Program
University of Kentucky
--
"Time is an abstract concept created by carbon-based life forms to
monitor their on-going decay." -Thunderclese

Heartless capitalism has saved more people from poverty than any
progressive program of social equality ever has.

GnuPG Fingerprint: 889A 5BED F01D 61BC 930F A915 DB55 2585 0010 A205
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ncbixml.diff
Type: text/x-patch
Size: 322 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython-dev/attachments/20040624/b22bec4e/ncbixml.bin

From mdehoon at ims.u-tokyo.ac.jp Sun Jun 27 23:38:14 2004
From: mdehoon at ims.u-tokyo.ac.jp (Michiel Jan Laurens de Hoon)
Date: Sat Mar 5 14:43:35 2005
Subject: [Biopython-dev] forward_complement, reverse_complement
In-Reply-To:
References: <40CD505D.7040202@ims.u-tokyo.ac.jp>
Message-ID: <40DF92A6.7080008@ims.u-tokyo.ac.jp>

I did some timings on the complement, reverse, and reverse_complement in
Bio.SeqUtils, Bio.GFF.easy, and Bio.Seq. It turned out that reverse_complement
and forward_complement in Bio.GFF.easy are faster than their counterparts in
Bio.SeqUtils.
However, using the map function gives even faster results:

def reverse_complement(self):
    from Bio.Data.IUPACData import ambiguous_dna_complement
    self.data = map(lambda c: ambiguous_dna_complement[c], self.data)
    self.data.reverse()
    self.data = array.array('c', self.data)

Here, I implemented reverse_complement as a member function of MutableSeq. My
feeling is that that is the best place for this function, as it also has a
member function "reverse". SeqUtils mainly contains functions that analyze
sequences, but don't modify them.

The timing results are below. Note that the functions in Bio.SeqUtils can handle
both strings and Seq objects, with the Seq objects being slower, while
Bio.GFF.easy and Bio.Seq handle Seq objects only.

Can I go ahead and update CVS to add complement and reverse_complement to
Bio.Seq? I'll clean up Bio.GFF.easy and Bio.SeqUtils accordingly.

--Michiel.

Timings (in seconds)
=====================

                  Bio.GFF.easy         Bio.SeqUtils            Using map
                  reverse_complement   antiparallel            reverse_complement
Sequence length   Seq object           Seq object   string     Seq object
          1 000   0.002                0.004        0.002      0.002
         10 000   0.017                0.045        0.023      0.012
        100 000   0.166                0.444        0.225      0.117
      1 000 000   1.651                4.347        2.234      1.135
     10 000 000   18.187               45.137       24.179     11.697
    100 000 000   192.243              457.680      242.258    116.170

                  Bio.GFF.easy         Bio.SeqUtils            Using map
                  forward_complement   complement              complement
Sequence length   Seq object           Seq object   string     Seq object
          1 000   0.002                0.005        0.002      0.001
         10 000   0.016                0.042        0.020      0.012
        100 000   0.165                0.435        0.192      0.119
      1 000 000   1.638                4.283        1.912      1.166
     10 000 000   17.993               45.085       20.937     11.572
    100 000 000   193.528              443.024      209.573    116.916

                  Bio.SeqUtils            Bio.Seq
                  reverse                 reverse
Sequence length   Seq object   string     Seq object
          1 000   0.003        0.001      0.000
         10 000   0.023        0.003      0.001
        100 000   0.226        0.022      0.010
      1 000 000   2.232        0.227      0.107
     10 000 000   22.592       2.319      1.057
    100 000 000   225.447      23.094     10.559

Michael Hoffman wrote:
>>Bio/GFF/easy.py contains the functions forward_complement and
>>reverse_complement, which return the
>>forward and reverse complement of a
>>sequence object. I had been looking for such functions in Biopython for a while,
>>but I assumed that they were not available as I didn't find them in Bio/Seq.py.
>>I'd like to propose to move those two functions there. Note that Bio.SeqUtils
>>contains similar functions that work on strings but not on sequence objects. Any
>>thoughts?
>
>
> I wrote those when Bio.GFF was not part of Biopython and they are
> really only there to support Bio.GFF.
>
> It would probably be better to change the Bio.SeqUtils functions to
> work on sequence objects. I imagine the Bio.SeqUtils functions are
> much faster since much of the work gets passed to the native function
> str.translate().

--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon

From hoffman at ebi.ac.uk Mon Jun 28 05:30:38 2004
From: hoffman at ebi.ac.uk (Michael Hoffman)
Date: Sat Mar 5 14:43:35 2005
Subject: [Biopython-dev] forward_complement, reverse_complement
In-Reply-To: <40DF92A6.7080008@ims.u-tokyo.ac.jp>
References: <40CD505D.7040202@ims.u-tokyo.ac.jp>
    <40DF92A6.7080008@ims.u-tokyo.ac.jp>
Message-ID:

I have no objections to your proposal from a Bio.GFF point of view.
--
Michael Hoffman
European Bioinformatics Institute

From marc-roettig at web.de Mon Jun 28 16:46:10 2004
From: marc-roettig at web.de (Marc Röttig)
Date: Sat Mar 5 14:43:35 2005
Subject: [Biopython-dev] Survey: "Motivation of Free/Open Source Software (F/OSS) Developers"
Message-ID: <40E08392.8000903@web.de>

Survey: "Motivation of Free/Open Source Software (F/OSS) Developers"

We (Marc Röttig and Carl-Daniel Hailfinger) are currently working on a
survey on the motivation of open source developers as part of a
"Computer Science and Society" project at the CS department of the
University of Tübingen.

We invite every developer in the Free / Open Source Software community
to help us with our survey by filling out a little web form to give us
some hints on possible motivation-motifs of F/OSS-developers.

You can find the survey-form at
http://foss.ta-altensteig.de/

Privacy statement:
We do not want to collect personal information about you without your
agreement. That's why we do not ask for your name and give you the
ability to leave personal information unspecified (age and profession).
We (the authors) will not make the raw survey data available to anyone
except members of our faculty for verification of proper scientific
procedures in extracting the results.

Thank you in advance for your support.

You can reach our CS faculty at http://informatik.uni-tuebingen.de/