From bugzilla-daemon at portal.open-bio.org Tue Feb 5 08:36:16 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 08:36:16 -0500 Subject: [Biopython-dev] [Bug 2443] New: Specifying the alphabet in Bio.SeqIO.parse() Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2443 Summary: Specifying the alphabet in Bio.SeqIO.parse() Product: Biopython Version: 1.44 Platform: All OS/Version: All Status: NEW Severity: enhancement Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk Currently when reading sequences using Bio.SeqIO, unless the alphabet can be determined from the file format, all the records have a generic alphabet. This can be a handicap if later on you want to work with "strict" functions which check for a particular alphabet (e.g. a gapped alphabet when working with alignments), or perhaps the Bio.Translate module. For an example of this, see Dalloliogm's question on the SeqIO wiki talk page, http://biopython.org/wiki/Talk:SeqIO Currently the user may need to use a tedious work around to override the alphabet of each sequence, e.g. from Bio import SeqIO from Bio.Alphabet import generic_dna records = list(SeqIO.parse(open("data.txt"), "fasta")) for record in records : record.seq.alphabet = generic_dna record_dict = SeqIO.to_dict(records) Instead, I want to add an optional argument to the parse() and read() functions, allowing this example to be shortened: from Bio import SeqIO from Bio.Alphabet import generic_dna record_dict = SeqIO.to_dict(SeqIO.parse(open("data.txt"), "fasta", generic_dna)) Suggested patch to follow... -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
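A rough sketch of the behaviour proposed in the report above, for readers who want to see the idea in isolation. It deliberately does not import Biopython: the `Seq` and `SeqRecord` classes, the toy `parse_fasta` function and its `alphabet` keyword are simplified stand-ins invented for this illustration, not the real Bio.SeqIO API and not the patch attached to the bug, and the alphabet is represented as a plain string. The only point is that an optional argument on the parser can replace the per-record override loop.

```python
from io import StringIO


class Seq:
    """Hypothetical stand-in for a sequence object carrying an alphabet."""

    def __init__(self, data, alphabet="generic"):
        self.data = data
        self.alphabet = alphabet


class SeqRecord:
    """Hypothetical stand-in for a record: a sequence plus an identifier."""

    def __init__(self, seq, id):
        self.seq = seq
        self.id = id


def _make_record(name, chunks, alphabet):
    seq = Seq("".join(chunks))
    if alphabet is not None:
        # Override the generic default, as the manual loop in the report does.
        seq.alphabet = alphabet
    return SeqRecord(seq, name)


def parse_fasta(handle, alphabet=None):
    """Yield records from a FASTA handle (toy parser, an assumption of this sketch).

    When alphabet is given, it is applied to every record that is yielded.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield _make_record(name, chunks, alphabet)
            # The identifier is the first word of the header line.
            name, chunks = (line[1:].split() or [""])[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield _make_record(name, chunks, alphabet)


handle = StringIO(">seq1 first\nACGT\nACGT\n>seq2\nGGCC\n")
record_dict = {rec.id: rec for rec in parse_fasta(handle, alphabet="generic_dna")}
```

Because the override happens inside the generator, the caller can build the dictionary in one pass without first collecting the records into a list.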
From bugzilla-daemon at portal.open-bio.org Tue Feb 5 08:37:58 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 08:37:58 -0500 Subject: [Biopython-dev] [Bug 2443] Specifying the alphabet in Bio.SeqIO.parse() In-Reply-To: Message-ID: <200802051337.m15DbwAX026189@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2443 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-05 08:37 EST ------- Created an attachment (id=853) --> (http://bugzilla.open-bio.org/attachment.cgi?id=853&action=view) Path to Bio/SeqIO/__init__.py One possible implementation which will use a format specific parser's optional alphabet argument if defined, and if not simply override the alphabet of the returned records. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 15:05:55 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 15:05:55 -0500 Subject: [Biopython-dev] [Bug 2446] New: Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2446 Summary: Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. Product: Biopython Version: Not Applicable Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: dthomp325 at gmail.com When parsing an ace file that contains CT tags with comments such as those added by Polyphred 6.11, Bio.Sequencing.Ace.ACEParser appears to get stuck in an infinite loop until it dies with a memory usage exception. 
example CT tag with comment: CT{ Contig36 polyPhredRank1 polyPhred 3608 3608 080205:125543 COMMENT{ 99 C} } Parsing works correctly for the exact same ace file minus the COMMENTs. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 18:38:15 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 18:38:15 -0500 Subject: [Biopython-dev] [Bug 2446] Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. In-Reply-To: Message-ID: <200802052338.m15NcFFP001008@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2446 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-05 18:38 EST ------- Could you supply an example input file [which we could use for a unit test] and associated snippet of python code to load it, which shows the problem? Thanks. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From chris.lasher at gmail.com Tue Feb 5 22:27:19 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Tue, 5 Feb 2008 22:27:19 -0500 Subject: [Biopython-dev] Biopython to begin transition to Subversion Message-ID: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> Hello all Biopythonistas, In the next upcoming weeks, Biopython will begin and complete its transition from CVS to Subversion (SVN) as its revision control system. This transition will likely not affect end users of Biopython except that to get the development version, a checkout with a Subversion client, rather than a CVS client, will be necessary. 
For developers, we will need to determine a suitable range of dates (a week) during which we will "freeze" the CVS repository for its transition to SVN. From the freeze and thereon, commits to the CVS repository will no longer be possible. Instead, commits not placed in during the freeze will need to take place in the Subversion repository once we have it running. This week, we hope to have a "dry run" of the Subversion repository available for the developers to poke around and make sure the transition will include everything necessary. Following that, we'll have the freeze and complete the transition. If you have any questions, I'll be checking posts to the list, or you may feel free contact me directly. Best, Chris From bugzilla-daemon at portal.open-bio.org Wed Feb 6 11:25:20 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 6 Feb 2008 11:25:20 -0500 Subject: [Biopython-dev] [Bug 2446] Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. In-Reply-To: Message-ID: <200802061625.m16GPKN2020679@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2446 ------- Comment #2 from dthomp325 at gmail.com 2008-02-06 11:25 EST ------- I tried to attach the file that causes the error, but it looks like it's too big. I get this error from Bugzilla: DBD::mysql::st execute failed: Got a packet bigger than 'max_allowed_packet' bytes [for Statement "INSERT INTO attach_data (id, thedata) VALUES (861, ?)" with ParamValues: 0='AS 2 1710 Would it be possible for me to e-mail the file directly to you? (In reply to comment #1) > Could you supply an example input file [which we could use for a unit test] and > associated snippet of python code to load it, which shows the problem? Thanks. > -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From biopython at maubp.freeserve.co.uk Wed Feb 6 11:27:49 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 6 Feb 2008 16:27:49 +0000 Subject: [Biopython-dev] [BioPython] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> References: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> Message-ID: <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> > In the next upcoming weeks, Biopython will begin and complete its > transition from CVS to Subversion (SVN) as its revision control > system. I gather that BioPerl and BioJava and BioSQL have all transitioned fine, so its our turn now. Michiel - do you think we should try and do another release before the CVS freeze and migration? We've had a lots little changes, plus Tiago's PopGen work and my own efforts with BioSQL. There are still a few open issues, but I think a release soon would be reasonable (depending on your time commitments of course). > If you have any questions, I'll be checking posts to the list, or you > may feel free contact me directly. Will the existing developer accounts simply work on the new SVN repository? Is there any issue with Unix/Windows newlines under SVN? I recall reading somewhere that like CVS, SVN can be setup to handle this transparently for text files. I may be worrying over nothing, but given that we have developers using both Linux, Windows and MacOS this seems worth checking. Peter From bugzilla-daemon at portal.open-bio.org Wed Feb 6 11:28:15 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 6 Feb 2008 11:28:15 -0500 Subject: [Biopython-dev] [Bug 2446] Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. 
In-Reply-To: Message-ID: <200802061628.m16GSFwR020863@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2446 ------- Comment #3 from dthomp325 at gmail.com 2008-02-06 11:28 EST ------- The python code is simply: from Bio.Sequencing import Ace ace_parser = Ace.ACEParser() ace_file = ace_parser.parse(open((in_file), 'r')) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From tiagoantao at gmail.com Wed Feb 6 12:05:33 2008 From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=) Date: Wed, 6 Feb 2008 17:05:33 +0000 Subject: [Biopython-dev] [BioPython] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> References: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> Message-ID: <6d941f120802060905h3bc09488tbd7ea3c85bce5914@mail.gmail.com> Hi, On Feb 6, 2008 4:27 PM, Peter wrote: > Michiel - do you think we should try and do another release before the > CVS freeze and migration? We've had a lots little changes, plus > Tiago's PopGen work and my own efforts with BioSQL. There are still a > few open issues, but I think a release soon would be reasonable > (depending on your time commitments of course). Just FYI: As I noticed that the SVN move would be happening sooner or later, I decided to put everything into a stable state and stop at that point. Hopefully all that there is PopGen related is stable and ready to move (code, test, doc). As soon as we move to SVN I will get back into committing (now the really interesting stuff will start: statistics and maybe HapMap). 
Tiago From mjldehoon at yahoo.com Wed Feb 6 20:10:06 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 6 Feb 2008 17:10:06 -0800 (PST) Subject: [Biopython-dev] [BioPython] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> Message-ID: <617104.88204.qm@web62413.mail.re1.yahoo.com> Peter wrote: > Michiel - do you think we should try and do another release before the > CVS freeze and migration? We've had a lots little changes, plus > Tiago's PopGen work and my own efforts with BioSQL. There are still a > few open issues, but I think a release soon would be reasonable > (depending on your time commitments of course). I think that the Subversion/CVS issue is separate from our release schedule, so I don't think that the transition to Subversion by itself should be a reason for a release. However, we can probably make a release soon after the transition. I would like to finalize my work on Bio.WWW before making a release, but hopefully that won't be too complicated. --Michiel
From bugzilla-daemon at portal.open-bio.org Thu Feb 7 03:08:29 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 7 Feb 2008 03:08:29 -0500 Subject: [Biopython-dev] [Bug 2447] New: EUtils cannot parse PubMed XML for ACS journals Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2447 Summary: EUtils cannot parse PubMed XML for ACS journals Product: Biopython Version: 1.44 Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: baoilleach at gmail.com Here's the code to reproduce the bug: from Bio import EUtils from Bio.EUtils import DBIdsClient PMID = "17238260" result = DBIdsClient.from_dbids(EUtils.DBIds("pubmed", PMID)) print result.efetch().read() summary = result.summary() The error is: Traceback (most recent call last): File "bug.py", line 8, in ? summary = result.summary() File "/home/user/Tools/biopython-1.44/Bio/EUtils/DBIdsClient.py", line 105, in summary return parse.parse_summary_xml(self.esummary("xml")) File "/home/user/Tools/biopython-1.44/Bio/EUtils/parse.py", line 416, in parse_summary_xml d = convert_summary_Items(docsum.find_elements("Item")) File "/home/user/Tools/biopython-1.44/Bio/EUtils/parse.py", line 394, in convert_summary_Items d[name] = summary_type_parser_table[item.Type](item) File "/home/user/Tools/biopython-1.44/Bio/EUtils/parse.py", line 321, in convert_summary_Date return convert_summary_Date_string(x.tostring()) File "/home/user/Tools/biopython-1.44/Bio/EUtils/parse.py", line 351, in convert_summary_Date_string raise TypeError("Unknown date format: %s" % (s,)) TypeError: Unknown date format: -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From peter at maubp.freeserve.co.uk Thu Feb 7 04:36:34 2008 From: peter at maubp.freeserve.co.uk (Peter) Date: Thu, 7 Feb 2008 09:36:34 +0000 Subject: [Biopython-dev] [BioPython] Biopython to begin transition to Subversion In-Reply-To: <617104.88204.qm@web62413.mail.re1.yahoo.com> References: <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> <617104.88204.qm@web62413.mail.re1.yahoo.com> Message-ID: <320fb6e00802070136r7984d523rcc3c683d8f897431@mail.gmail.com> On Feb 7, 2008 1:10 AM, Michiel de Hoon wrote: > I think that the Subversion/CVS issue is separate from our release schedule, > so I don't think that the transition to Subversion by itself should be a reason > for a release. However, we can probably make a release soon after the > transition. I would like to finalize my work on Bio.WWW before making a > release, but hopefully that won't be too complicated. > > --Michiel You're right the CVS/SVN migration isn't directly linked - but its a nice excuse to get a release out ;) I'd forgotten you still had the Bio.WWW module to sort out, sorry. Peter From bugzilla-daemon at portal.open-bio.org Thu Feb 7 04:37:15 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 7 Feb 2008 04:37:15 -0500 Subject: [Biopython-dev] [Bug 2447] EUtils cannot parse PubMed XML for ACS journals In-Reply-To: Message-ID: <200802070937.m179bFQr029851@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2447 ------- Comment #1 from baoilleach at gmail.com 2008-02-07 04:37 EST ------- Caused by the absence of an EPubDate. (Noel) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
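A small illustration for Bug 2447. The traceback in the report ends with an empty string after "Unknown date format:", and the comment above attributes this to a missing EPubDate. One possible way to tolerate that case, sketched here without Biopython, is to treat an empty date field as "not present" instead of raising. The function name, the accepted layouts ("YYYY", "YYYY Mon", "YYYY Mon DD"), the choice of ValueError and the `None` return value are all assumptions made for this sketch; it is not the code in Bio/EUtils/parse.py.

```python
import datetime

# English month abbreviations mapped to month numbers (assumed input style).
_MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}


def parse_summary_date(text):
    """Return a datetime.date, or None when the field is empty.

    Hypothetical helper: a missing month or day defaults to 1.
    """
    text = text.strip()
    if not text:
        # An absent date (e.g. no EPubDate) is reported as None, not an error.
        return None
    parts = text.split()
    try:
        if len(parts) > 3:
            raise ValueError("too many fields")
        year = int(parts[0])
        month = _MONTHS[parts[1]] if len(parts) > 1 else 1
        day = int(parts[2]) if len(parts) > 2 else 1
        return datetime.date(year, month, day)
    except (KeyError, ValueError):
        raise ValueError("Unknown date format: %r" % (text,)) from None
```

Callers would then need to check for `None` before using the date, which is the trade-off of this approach compared with raising on every unparsable value.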
From jblanca at btc.upv.es Thu Feb 7 04:58:10 2008 From: jblanca at btc.upv.es (Jose Blanca) Date: Thu, 7 Feb 2008 10:58:10 +0100 Subject: [Biopython-dev] [BioPython] Alignment add_sequence In-Reply-To: <320fb6e00802070133n67a549b5k8868a025f423dc82@mail.gmail.com> References: <200802061706.08830.jblanca@btc.upv.es> <200802070925.28882.jblanca@btc.upv.es> <320fb6e00802070133n67a549b5k8868a025f423dc82@mail.gmail.com> Message-ID: <200802071058.10148.jblanca@btc.upv.es> On Thursday 07 February 2008 10:33:49 Peter wrote: > On Feb 7, 2008 8:25 AM, Jose Blanca wrote: > > Hi: > > I think I can't use Bio.SeqIO.to_alignment() because the > > sequences have different lengths and start at different > > positions. It's and EST alignmet not a clustal-like one. > > I have also looked at your proposal in bug 1944 and I really > > like it, specially the clever __getitem__ method. But I can't > > use it because the different lengths of the sequences. > > I'm going to add an add_seqRecord method. Now, thanks to you I > > understand why this is not a good solution. But, at least, it > > will do for this time. > > The whole idea behind the current alignment class is that all the > sequences are the same length (often with gaps). I don't think this > fits with your intended usage - unless you pad each record with > leading gap characters (according to its start) and then pad the end > until they are all the same length. You could write a function to > take a list of SeqRecords and pad them like this (note the example > will be easier to read in a mono-spaced font): I could do this, but I don't like the idea. An initial pad is not the same as a gap. The whole point of the program I'm working on is to look for SNPs and indels and this implementation would confuse the indel search. I have looked at your proposal for the new Alignment implementation and the more I look at it, the more I like the idea of subclassing from list. 
Maybe the only problem is that it shouldn't be a list of seqRecords. A sequence in an alignment is a seqRecord located at a given position. Maybe the Alignment class could take that into account internally. In that case I don't know how to create a simple API that could deal with the case of start=0 and with the more complex case of start <> 0. A possible solution could be to accept seqRecords and tuples like (seqRecord, start) in the constructor.
>
> e.g.
>
> CONSENSUS:    AGGCCTGAGGCCCCTTTT, start 0
> EST1     : CGCAGGCCCGAGGCC, start -3
> EST2     :     GGCCTGAGGCCCCTT, start 1
> EST3     :        CTGAGGCCACTTTTTCGC, start 4
>
> In this case we want to add (start+3) gaps to each line, where -3 = min(starts). This becomes:
>
> ---AGGCCTGAGGCCCCTTTT, start 0
> CGCAGGCCCGAGGCC, start -3
> ----GGCCTGAGGCCCCTT, start 1
> -------CTGAGGCCACTTTTTCGC, start 4
>
> Then work out the maximum length, and pad all the sequences with trailing gaps:
>
> ---AGGCCTGAGGCCCCTTTT----
> CGCAGGCCCGAGGCC----------
> ----GGCCTGAGGCCCCTT------
> -------CTGAGGCCACTTTTTCGC
>
> A little bit of work, but now all the sequences are the same length and the Biopython alignment class will be happy.
>
> As far as I know, there is nothing for this built into Biopython at the moment. Could you tell us what your input file looks like (e.g. link to the file format?)

The alignment is originally done by cap3, but the data is in a MySQL database. I'm using EST2uni (http://bioinf.comav.upv.es/est2uni/). I have fetched the information from the database and I have set up the seqRecord objects and now I'm trying to create the Alignment object.

> Peter

Thanks, -- Jose M.
Blanca Postigo Instituto Universitario de Conservacion y Mejora de la Agrodiversidad Valenciana (COMAV) Universidad Politecnica de Valencia (UPV) Edificio CPI (Ciudad Politecnica de la Innovacion), 8E 46022 Valencia (SPAIN) Tlf.:+34-96-3877000 (ext 88473) From mjldehoon at yahoo.com Fri Feb 8 11:06:11 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 8 Feb 2008 08:06:11 -0800 (PST) Subject: [Biopython-dev] Bio.WWW.NCBI proposal Message-ID: <714644.12951.qm@web62409.mail.re1.yahoo.com> Hi everybody, Currently, there are two ways in Biopython to get access to NCBI's Entrez databases (Bio.WWW.NCBI and Bio.EUtils). Bio.PubMed builds on Bio.WWW.NCBI, and Bio.GenBank builds on Bio.EUtils. Clearly, having two modules for the same thing is not optimal. From looking at these two modules, I think that Bio.WWW.NCBI is more suitable as Biopython's module to interact with NCBI. It is much smaller and very straightforward, and therefore much easier to maintain, and it has some documentation (though not quite enough). Bio.EUtils is quite large, and is difficult to maintain since none of the current active developers are familiar with it. Bio.WWW.NCBI has two problems though: It is not quite up to date (some functions are missing, and other functions are for databases that were deprecated a while ago), and it is the only remaining module inside Bio.WWW. Concretely, I'd like to propose the following: 1) Move Bio.WWW.NCBI to Bio.Entrez (actually, copy and deprecate Bio.WWW.NCBI). 2) Make it Biopython's general module for interacting with NCBI Entrez by adding any missing functions from the list at http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html (this will be very straightforward; EInfo, ESummary, EGQuery, and ESpell are currently missing), and removing any obsolete functions. 3) Update the tutorial accordingly. 4) Use Bio.Entrez in Bio.GenBank.NCBIDictionary to fix bug #2393.
At that point, I think we have an error-free Biopython again (alas only in the sense that no errors or warnings appear when running the test suite), so we'd be ready for a new release. I don't want to deprecate Bio.EUtils right now, since it also contains some functionality other than database access (e.g. parsing the database output from NCBI; we can deal with those issues after the next release). Any comments or objections? --Michiel From chris.lasher at gmail.com Fri Feb 8 14:55:38 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Fri, 8 Feb 2008 14:55:38 -0500 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> References: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> Message-ID: <128a885f0802081155o99df22bv2e6dc5ca6f64525@mail.gmail.com> On Tue, Feb 5, 2008 at 10:27 PM, Chris Lasher wrote: > For developers, we will need to determine a suitable range of dates (a > week) during which we will "freeze" the CVS repository for its > transition to SVN. From the freeze and thereon, commits to the CVS > repository will no longer be possible. Instead, commits not placed in > during the freeze will need to take place in the Subversion repository > once we have it running. This week, we hope to have a "dry run" of the > Subversion repository available for the developers to poke around and > make sure the transition will include everything necessary. Following > that, we'll have the freeze and complete the transition. Hi all, The prototype SVN repository is now available.
You can check it out with: svn co svn+ssh://dev.open-bio.org/home/hartzell/biopython-prototype Chris From mjldehoon at yahoo.com Sat Feb 9 00:44:56 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 8 Feb 2008 21:44:56 -0800 (PST) Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802081155o99df22bv2e6dc5ca6f64525@mail.gmail.com> Message-ID: <632331.45313.qm@web62415.mail.re1.yahoo.com> Hi Chris, When I executed the svn command, I get subdirectories branches, tags, and trunk. Branches is almost empty, tags contains all previous Biopython releases, and trunk is Biopython leading up to the next release. Shouldn't we see trunk only (same as with CVS)? The second issue is that the svn command exits with an error message: svn: Can't copy 'biopython-prototype/biopython/tags/biopython-100a4/Tests/MetaTool/.svn/tmp/text-base/meta9.out.svn-base' to 'biopython-prototype/biopython/tags/biopython-100a4/Tests/MetaTool/.svn/tmp/meta9.out.tmp.tmp': No such file or directory Thanks! --Michiel. Hi all, The prototype SVN repository is now available. You can check it out with: svn co svn+ssh://dev.open-bio.org/home/hartzell/biopython-prototype Chris --------------------------------- Looking for last minute shopping deals? Find them fast with Yahoo! Search. From jflatow at northwestern.edu Sat Feb 9 15:12:49 2008 From: jflatow at northwestern.edu (Jared Flatow) Date: Sat, 9 Feb 2008 14:12:49 -0600 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <632331.45313.qm@web62415.mail.re1.yahoo.com> References: <632331.45313.qm@web62415.mail.re1.yahoo.com> Message-ID: Is there any read-only access yet for biopython users without login credentials? I'm very excited about this change, I have been waiting to update until the switch was made. On Feb 8, 2008, at 11:44 PM, Michiel de Hoon wrote: > When I executed the svn command, I get subdirectories branches, > tags, and trunk. 
Branches is almost empty, tags contains all > previous Biopython releases, and trunk is Biopython leading up to > the next release. Shouldn't we see trunk only (same as with CVS)? You may have already figured this out but with svn you can check out only the trunk with: svn co svn+ssh://dev.open-bio.org/home/hartzell/biopython-prototype/ trunk [name you want to checkout into] jared
From bugzilla-daemon at portal.open-bio.org Sun Feb 10 15:29:37 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 10 Feb 2008 15:29:37 -0500 Subject: [Biopython-dev] [Bug 2448] New: Bio.EUtils can't handle accented author names Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2448 Summary: Bio.EUtils can't handle accented author names Product: Biopython Version: 1.44 Platform: PC OS/Version: Windows XP Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: baoilleach at gmail.com The following code exhibits the bug: from Bio import EUtils from Bio.EUtils import DBIdsClient pmids = ["17299727", "17118524"] client = DBIdsClient.DBIdsClient() for pmid in pmids: paper = client.search(pmid) print paper.efetch().read() summary = paper.summary() data = summary.dataitems authors = ", ".join(data['AuthorList'].allvalues()) p = {'title': data['Title'], 'journal': data['Source'], 'volume': data['Volume'], 'authors': authors, 'pages': data['Pages']} try: p['year'] = data['PubDate'].year except: p['year'] = "----" if hasattr(data, "DOI"): p['doi'] = data['DOI'] print i, p['authors'] , p['title'], p['journal'], p['year'], p['volume'], p['pages'] The result is: Traceback (most recent call last): File "pmids.py", line 11, in summary = paper.summary() File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1. 44\Bio\EUtils\DBIdsClient.py", line 105, in summary return parse.parse_summary_xml(self.esummary("xml")) File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1.
44\Bio\EUtils\parse.py", line 412, in parse_summary_xml pom = xml_parser.parse_using_dtd(infile) File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1. 44\Bio\EUtils\parse.py", line 48, in parse_using_dtd parser.parse(file) File "C:\Program Files\Python25\lib\xml\sax\expatreader.py", line 107, in pars e xmlreader.IncrementalParser.parse(self, source) File "C:\Program Files\Python25\lib\xml\sax\xmlreader.py", line 123, in parse self.feed(buffer) File "C:\Program Files\Python25\lib\xml\sax\expatreader.py", line 207, in feed self._parser.Parse(data, isFinal) File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1. 44\Bio\EUtils\POM.py", line 774, in characters self.stack[-1].append(Text(text)) UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 4: ordinal not in range(128) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From dalke at dalkescientific.com Sun Feb 10 16:50:13 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Sun, 10 Feb 2008 22:50:13 +0100 Subject: [Biopython-dev] [Bug 2448] New: Bio.EUtils can't handle accented author names In-Reply-To: References: Message-ID: On Feb 10, 2008, at 9:29 PM, bugzilla-daemon at portal.open-bio.org wrote: > Summary: Bio.EUtils can't handle accented author names ... > self.stack[-1].append(Text(text)) > UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in > position 4: > ordinal not in range(128) The EUtils code is old. It uses a DTD to XML parser that I found, what, 6 years ago? This problem is because the code uses class IndentedText(str): def __init__(self, data=""): self.data = unescape(unicode(data)) self._level = 0 self._parent = None That derivation from str is suspicious. I don't think it's needed, but I haven't reviewed the code well enough. 
Getting rid of the 'str' *might* fix it. Otherwise what's going on is the __new__ is seeing the byte string using non-ASCII values and it doesn't know what to do. So another solution might be to change that base class to "unicode" and do the right decode calls. Note that the current parser doesn't handle &# notation. Some years back I started work on a EUtils2. It used the then-quite-new ElementTree library. Here's what I had http://www.dalkescientific.com/writings/diary/archive/2005/09/30/using_eutils.html If anyone wants the code, http://dalkescientific.com/EUtils-2.0a1.tar.gz I don't plan on doing anything more with it until I have a pressing need. Like someone wanting to pay me for it :) This old mail might also be useful for someone working on non-ASCII queries that are sent to NCBI. > The following is the MEDLINE character table for the XML. > > http://www.nlm.nih.gov/databases/dtd/medline_character_database.utf8 > > Diana Airozo > NCBI Contractor > dalke at dalkescientific.com wrote (Tue, Sep 7 2004 15:20:14): > > >> Hi Diana, >> >> Thank you for your reply. For a clarification on the >> non-ASCII query question >> >> >>>> Also, how do I do non-ASCII queries? For example, suppose I want >>>> to search for papers from "Göteborg Universitet" or "La Universidad >>>> de España". >>>> >> >> >> >>> You would search using Goteborg. >>> >> >> I want to automate this so that a user query for Göteborg >> gets converted into "Goteborg." I would prefer to use the >> same algorithm for doing this that your indexer uses. I >> looked online for unicode -> ASCII conversion table that >> strips the accents and other diacriticals and expands >> characters like ß into ss and æ into ae. I found >> several, but I would prefer to use the same table your >> indexer has so that queries are more likely to work. >> >> (Well, actually I would like your search code to perform >> the same input normalization that your indexer does, but >> I'll use this as a workaround.)
>> >> Is the conversion table you use available? >> Andrew dalke at dalkescientific.com From biopython at maubp.freeserve.co.uk Mon Feb 11 12:55:03 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 11 Feb 2008 17:55:03 +0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: References: <632331.45313.qm@web62415.mail.re1.yahoo.com> Message-ID: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> > You may have already figured this out but with svn you can check out > only the trunk with: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/biopython-prototype/ > trunk [name you want to checkout into] I got there in the end after some password hickups (I've email Chris about this) with: svn co svn+ssh://username at dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk Once that was done, I was able to build Biopython and run the unit tests fine on my Linux machine. I haven't tried anything further (e.g. running a "svn diff" or committing a small change). As to Jared's question, I would expect there to be guest read-only access on the official SVN repository, like there is now with CVS. I don't know if there is a guest account setup on the prototype migrated SVN. Chris? Peter From mjldehoon at yahoo.com Tue Feb 12 19:58:33 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Tue, 12 Feb 2008 16:58:33 -0800 (PST) Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> Message-ID: <473614.54149.qm@web62415.mail.re1.yahoo.com> Peter wrote: svn co svn+ssh://username at dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk This command worked for me, though for some reason my password is always refused the first time but accepted the second time. Do we have a wiki page about CVS to SVN transition? We should add this command (and other useful commands) to that page. --Michiel. 
From biopython at maubp.freeserve.co.uk Wed Feb 13 04:19:27 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 13 Feb 2008 09:19:27 +0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <473614.54149.qm@web62415.mail.re1.yahoo.com> References: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> <473614.54149.qm@web62415.mail.re1.yahoo.com> Message-ID: <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> On Feb 13, 2008 12:58 AM, Michiel de Hoon wrote: > Peter wrote: > svn co svn+ssh://username at dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk > > This command worked for me, though for some reason my password > is always refused the first time but accepted the second time. How strange - for me it asked for my password three times (this was the issue I had emailed Chris about directly; also establishing that yes, the same accounts and passwords were being used as in CVS). I hope they can sort this out... >Do we have a wiki page about CVS to SVN transition? Not yet - but there is some useful information on http://www.open-bio.org/wiki/SourceCode which we might base this on. > We should add this command (and other useful commands) to that page. Note that the SVN command above is just for the prototype repository, for the real thing the URL will be a little different.
Peter From chris.lasher at gmail.com Thu Feb 14 10:42:40 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Thu, 14 Feb 2008 10:42:40 -0500 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> References: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> <473614.54149.qm@web62415.mail.re1.yahoo.com> <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> Message-ID: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> On Wed, Feb 13, 2008 at 4:19 AM, Peter wrote: > On Feb 13, 2008 12:58 AM, Michiel de Hoon wrote: > > Peter wrote: > > svn co svn+ssh://username at dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk > > > > This command worked for me, though for some reason my password > > is always refused the first time but accepted the second time. > > How strange - for me it asked for my password three times (this was > the issue I had emailed Chris about directly; also establishing that > yes, the same accounts and passwords were being used as in CVS). I > hope they can sort this out... Hmm. Even for SSH with keys I am prompted twice. How strange. I'll contact the OBF guys. Chris From mjldehoon at yahoo.com Fri Feb 15 01:40:24 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 14 Feb 2008 22:40:24 -0800 (PST) Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> Message-ID: <658418.5192.qm@web62414.mail.re1.yahoo.com> Just to confirm the status of the transition to Subversion: Can we still commit changes to CVS? --Michiel.
From chris.lasher at gmail.com Fri Feb 15 02:21:54 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Fri, 15 Feb 2008 02:21:54 -0500 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <658418.5192.qm@web62414.mail.re1.yahoo.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> Message-ID: <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> On Fri, Feb 15, 2008 at 1:40 AM, Michiel de Hoon wrote: > Just to confirm the status of the transition to Subversion: > Can we still commit changes to CVS? Good question. Yes, you should be able to commit to the CVS repository. It has not been frozen yet, AFAIK. Chris From marcin.cieslik at gmail.com Sun Feb 17 02:58:58 2008 From: marcin.cieslik at gmail.com (Marcin Cieślik) Date: Sun, 17 Feb 2008 02:58:58 -0500 Subject: [Biopython-dev] KDTree to numpy In-Reply-To: <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> Message-ID: <47B7E942.5060501@gmail.com> Hi, I was able to make KDtrees work with numpy. There were no real problems apart from my complete lack of C++/SWIG skills. It seems to me that everything is working fine. The includes in KDTree.i look like this: %{ #include "KDTree.h" #include %} Is this correct? Changes in .py files were really small.
I compiled/tested it on Linux: make clean rm *.o rm *.so make gcc -c KDTree.cpp KDTree.swig.cpp -I/usr/include/python2.5 g++ -shared KDTree.swig.o KDTree.o -o _CKDTree.so You can get it here: http://149.156.87.35/~marcin_cieslik/KDTree2.tar.gz Marcin From biopython at maubp.freeserve.co.uk Sun Feb 17 06:05:51 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 17 Feb 2008 11:05:51 +0000 Subject: [Biopython-dev] KDTree to numpy In-Reply-To: <47B7E942.5060501@gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> <47B7E942.5060501@gmail.com> Message-ID: <320fb6e00802170305y5ac6fa3ejb0473f3240fa5122@mail.gmail.com> On Feb 17, 2008 7:58 AM, Marcin Cieślik wrote: > Hi, > > I was able to make KDtrees work with numpy. There were no real problems > apart from my complete lack of C++/SWIG skills. It seems to me that > everything is working fine. > > The includes in KDTree.i look like this: > > %{ > #include "KDTree.h" > #include > > %} > Is this correct? Changes in .py files were really small. This looks very similar to the change in the patch on bug 2255 except Ed used a relative path.
http://bugzilla.open-bio.org/show_bug.cgi?id=2251 Peter From bugzilla-daemon at portal.open-bio.org Wed Feb 20 14:41:38 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 20 Feb 2008 14:41:38 -0500 Subject: [Biopython-dev] [Bug 2454] New: Iterators can't use file-like objects Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2454 Summary: Iterators can't use file-like objects Product: Biopython Version: 1.43 Platform: All OS/Version: All Status: NEW Severity: minor Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: cracka80 at gmail.com I've noticed, when using BioPython, that if I hold my data in a StringIO object then several iterators cannot parse the data, due to type checking. Rather than check to see if the object has particular methods/attributes, they check its class, which will only validate if the object is an actual file instance, and invalidates perfectly valid file-like objects like StringIOs and URLs. I've so far only experienced it in version 1.43, but this is the latest version available from the Ubuntu package repository, so there may be many users experiencing the same problem. The affected iterators/functions are: * Gobase.Iterator * SwissProt.SProt.Iterator * Medline.Iterator * Prosite.Iterator * Prosite.Prodoc.Iterator * Rebase.Iterator * SCOP.Hie.Iterator * SCOP.Cla.Iterator * SCOP.Dom.Iterator * SCOP.Des.Iterator * SCOP.Raf.Iterator * Sequencing.Phd.Iterator * Sequencing.Ace.Iterator * SwissProt.KeyWList.extract_keywords A possible solution might be to implement a function which checks if an object is file-like; i.e. has all the necessary attributes and functions to be used as a file, and replace the type-checking with the application of this function to the object. 
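The suggestion above (test for the needed methods rather than for the class) could look roughly like the following. This is only a sketch: `is_filelike`, its default method list, and the toy `Iterator` class are hypothetical names made up for illustration, not existing Biopython code, and the snippet targets a current Python 3 interpreter rather than the Python 2.x of the time.

```python
import io


def is_filelike(handle, methods=("read", "readline")):
    """Return True if handle has the callable methods a line-based parser needs.

    Hypothetical helper: the name and the default method list are
    assumptions, not an existing Biopython function.
    """
    return all(callable(getattr(handle, name, None)) for name in methods)


class Iterator:
    """Toy iterator that accepts any file-like handle instead of checking its class."""

    def __init__(self, handle):
        # Duck-typing check: no isinstance() test against the built-in file type.
        if not is_filelike(handle):
            raise TypeError("handle must provide read() and readline()")
        self.handle = handle

    def __iter__(self):
        return self

    def __next__(self):
        line = self.handle.readline()
        if not line:
            raise StopIteration
        return line.rstrip("\n")


# A StringIO handle is accepted just like a real file would be.
records = list(Iterator(io.StringIO("ID   first\nID   second\n")))
```

A plain string has neither `read` nor `readline`, so it is still rejected, which keeps the original intent of the type check.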
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From mjldehoon at yahoo.com Thu Feb 21 09:00:43 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 21 Feb 2008 06:00:43 -0800 (PST) Subject: [Biopython-dev] Bio.Entrez / next Biopython release. Message-ID: <643737.72757.qm@web62415.mail.re1.yahoo.com> Hi everybody, As discussed previously, I created a module Bio.Entrez for interacting with NCBI's Entrez databases (GenBank, PubMed, and many others). This is essentially Bio.WWW.NCBI renamed to Bio.Entrez; Bio.WWW.NCBI still exists in the same location but is deprecated. In the process, I updated this module to include all of NCBI Entrez Programming Utilities, and deprecated those that have been superseded at NCBI. The code is now in CVS as Bio/Entrez.py. In hindsight, it would probably have been a better idea to use Bio/Entrez/__init__.py in case we want to expand Bio.Entrez, but anyway this can be rectified before creating the next Biopython release. I also wrote some documentation for Bio.Entrez. You can have a preview at http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html; Chapter 6 describes Bio.Entrez, and gives a good overview of the current status of this module. The module Bio.Entrez was created in response to Bug #2393: http://bugzilla.open-bio.org/show_bug.cgi?id=2393 Using Bio.Entrez, we can fix this bug easily, and then create a new release. This is one thing to consider though: Like Bio.WWW.NCBI, Bio.Entrez provides access to NCBI's Entrez databases but does not provide parsers for the output generated by NCBI (note: some file formats generated by NCBI Entrez' sequence databases can be parsed by Bio.SeqIO). Our options are then: 1) Keep Bio.Entrez as a module only to access NCBI Entrez, but not to parse the results. 2) Add parsers to Bio.Entrez. 
3) Make a new Biopython release now, and potentially add parsers later. Suggestions, preferences, comments, anybody? --Michiel. From bugzilla-daemon at portal.open-bio.org Thu Feb 21 11:31:00 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 21 Feb 2008 11:31:00 -0500 Subject: [Biopython-dev] [Bug 2454] Iterators can't use file-like objects In-Reply-To: Message-ID: <200802211631.m1LGV0on024444@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2454 ------- Comment #1 from cracka80 at gmail.com 2008-02-21 11:31 EST ------- Created an attachment (id=867) --> (http://bugzilla.open-bio.org/attachment.cgi?id=867&action=view) Python source file containing a test for object file-like-ness This just has a test to see if an object has the necessary methods and attributes to be considered file-like. It is based on code taken from the 'filelike' package, which can be found in the Python package index. It's a possible idea for a test to see if the handles, as passed to many iterators, can be considered file-like. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sat Feb 23 09:33:18 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 23 Feb 2008 09:33:18 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200802231433.m1NEXI4g031887@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 mdehoon at ims.u-tokyo.ac.jp changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #803 is|0 |1 obsolete| | -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. From bugzilla-daemon at portal.open-bio.org Sat Feb 23 09:35:36 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 23 Feb 2008 09:35:36 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200802231435.m1NEZaxv031974@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 ------- Comment #10 from mdehoon at ims.u-tokyo.ac.jp 2008-02-23 09:35 EST ------- Created an attachment (id=870) --> (http://bugzilla.open-bio.org/attachment.cgi?id=870&action=view) Patch for Bio/GenBank/__init__.py Attached is a new patch, making use of the new module Bio.Entrez. Does this look better? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. 
From sbassi at gmail.com Sat Feb 23 14:07:47 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Sat, 23 Feb 2008 17:07:47 -0200 Subject: [Biopython-dev] [BioPython] BioSQL documentation for Biopython In-Reply-To: <87D56EF5-9BC8-4401-8A09-2BB3104BE1CE@gmx.net> References: <895066B4-1004-4FF9-A135-7A0FEEEF8DF2@gmx.net> <16F88C8D-70DE-4020-B97D-B3D43AF530AF@gmx.net> <219760B8-1D60-4549-9151-9C1ECB46FE4B@gmx.net> <87D56EF5-9BC8-4401-8A09-2BB3104BE1CE@gmx.net> Message-ID: On Sat, Feb 23, 2008 at 3:58 PM, Hilmar Lapp wrote: > You mean from SVN, probably? I don't know but it seems to me that > problem is in some (Bio)Python code? Yes, the problem was that I was using 1.44 biopython without the new BioSQL code from Peter. Biopython repository is still in CVS, not SVN (at least biopython is not listed here: http://code.open-bio.org/svnweb/index.cgi/) Now with the new code, I could reproduce the tutorial, up to here: >>> from BioSQL import BioSeqDatabase >>> server=BioSeqDatabase.open_database(driver = "MySQLdb", user = "X",passwd="X", host = "localhost", db = "bioseqdb") >>> db = server.new_database("cold") >>> from Bio import GenBank >>> parser = GenBank.FeatureParser() >>> iterator = GenBank.Iterator(open("cor6_6.gb"), parser) >>> db.load(iterator) 6 But when I look into the mysql, there is no new record!. The "6" is supposed to be the number of records loaded into the database. But my database is empty (it has the schema, but w/o data). > That would be a question for the Biopython folks (I actually don't > use Biopython). I am copying this into biopython and biopython-dev mailing list. 
-- Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Tutorial libre de Python: http://tinyurl.com/2az5d5 From sbassi at gmail.com Sat Feb 23 14:50:50 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Sat, 23 Feb 2008 17:50:50 -0200 Subject: [Biopython-dev] [BioPython] BioSQL documentation for Biopython In-Reply-To: References: <895066B4-1004-4FF9-A135-7A0FEEEF8DF2@gmx.net> <16F88C8D-70DE-4020-B97D-B3D43AF530AF@gmx.net> <219760B8-1D60-4549-9151-9C1ECB46FE4B@gmx.net> <87D56EF5-9BC8-4401-8A09-2BB3104BE1CE@gmx.net> Message-ID: On Sat, Feb 23, 2008 at 5:20 PM, Hilmar Lapp wrote: > I.e., there is no error from the db.load() command, just no data? Yes, there was no error, the only response was "6". > Does the Biopython binding enable or disable auto-commit? If the > latter (which would be the Right Thing(tm) to do), you will have to Yes, when working with MySQLdb, it does not auto-commit. You have to do DB_HANDLE.commit(). There is no commit method in db: >>> dir(db) ['__doc__', '__getitem__', '__init__', '__module__', '__repr__', 'adaptor', 'dbid', 'get_PrimarySeq_stream', 'get_Seq_by_acc', 'get_Seq_by_id', 'get_Seq_by_primary_id', 'get_Seq_by_ver', 'get_Seqs_by_acc', 'get_all_primary_ids', 'items', 'keys', 'load', 'lookup', 'name', 'values'] > BioSQL uses InnoDB on MySQL, and hence will be transactional unless > you make the language's db driver to auto-commit. I am looking at the DatabaseLoader class (in loader.py) but I don't see any commit statement, anyway, I don't understand this class, so I may be missing something. 
-- Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Tutorial libre de Python: http://tinyurl.com/2az5d5 From sbassi at gmail.com Sat Feb 23 15:58:03 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Sat, 23 Feb 2008 18:58:03 -0200 Subject: [Biopython-dev] [BioPython] BioSQL documentation for Biopython In-Reply-To: References: <895066B4-1004-4FF9-A135-7A0FEEEF8DF2@gmx.net> <16F88C8D-70DE-4020-B97D-B3D43AF530AF@gmx.net> <219760B8-1D60-4549-9151-9C1ECB46FE4B@gmx.net> <87D56EF5-9BC8-4401-8A09-2BB3104BE1CE@gmx.net> Message-ID: On Sat, Feb 23, 2008 at 5:50 PM, Sebastian Bassi wrote: > > BioSQL uses InnoDB on MySQL, and hence will be transactional unless > > you make the language's db driver to auto-commit. > I am looking at the DatabaseLoader class (in loader.py) but I don't > see any commit statement, anyway, I don't understand this class, so I > may be missing something. I've just found the answer. Here is what was missing: server.adaptor.commit() I found it here: http://www.biopython.org/wiki/BioSQL So the document IMHO should be changed, for example: ">>> db.load(iterator) 6 And the GenBank file is loaded into the database. Notice that the load function returns the number of records loaded (6 in this case). This is useful for sanity checking to make sure that you didn't try to load a massive file and end up with a result like 3." To: ">>> db.load(iterator) 6 >>> server.adaptor.commit() And the GenBank file is loaded into the database. Notice that the load function returns the number of records loaded (6 in this case). This is useful for sanity checking to make sure that you didn't try to load a massive file and end up with a result like 3." A link to http://www.biopython.org/wiki/BioSQL could be added. 
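The "6 records loaded, but the table is empty" symptom is ordinary transactional behaviour and can be reproduced without MySQL. The sketch below is an analogy only: it uses the standard-library sqlite3 module and a one-column toy table that borrows the name bioentry, not the real BioSQL schema or the BioSQL/MySQLdb code path. Rows inserted on one connection stay invisible to a second connection until commit() is called, which is the role server.adaptor.commit() plays above.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    """Count rows in the toy table, fully consuming the cursor so no read lock lingers."""
    cur = conn.execute("SELECT COUNT(*) FROM bioentry")
    rows = cur.fetchall()
    cur.close()
    return rows[0][0]


# Toy stand-in for the database; the real BioSQL schema is far larger.
db_path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE bioentry (accession TEXT)")
writer.commit()

# A second connection plays the part of the separate mysql client session.
reader = sqlite3.connect(db_path)

# "Load" six records; sqlite3 opens a transaction implicitly before the INSERTs.
writer.executemany(
    "INSERT INTO bioentry VALUES (?)", [("rec%d" % i,) for i in range(6)]
)
loaded = count_rows(writer)              # the loading session sees its own rows
seen_before_commit = count_rows(reader)  # the other session sees nothing yet

writer.commit()                          # analogous to server.adaptor.commit()
seen_after_commit = count_rows(reader)

writer.close()
reader.close()
```

Here the loading connection reports six rows throughout, while the second connection only sees them once the transaction has been committed.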
-- Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Tutorial libre de Python: http://tinyurl.com/2az5d5 From biopython at maubp.freeserve.co.uk Sun Feb 24 12:09:10 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 24 Feb 2008 17:09:10 +0000 Subject: [Biopython-dev] Bio.Entrez / next Biopython release. In-Reply-To: <643737.72757.qm@web62415.mail.re1.yahoo.com> References: <643737.72757.qm@web62415.mail.re1.yahoo.com> Message-ID: <320fb6e00802240909q1ad5a1e4r74fe2ffc48496093@mail.gmail.com> Michiel de Hoon wrote: > Hi everybody, > > As discussed previously, I created a module Bio.Entrez for interacting > with NCBI's Entrez databases ... The code is now in CVS as Bio/Entrez.py. > In hindsight, it would probably have been a better idea to use > Bio/Entrez/__init__.py in case we want to expand Bio.Entrez, but > anyway this can be rectified before creating the next Biopython release. I agree with you on the use of Bio/Entrez/__init__.py > I also wrote some documentation for Bio.Entrez. You can have > a preview at http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html; > Chapter 6 describes Bio.Entrez, and gives a good overview of the current > status of this module. Great. > The module Bio.Entrez was created in response to Bug #2393: > http://bugzilla.open-bio.org/show_bug.cgi?id=2393 > Using Bio.Entrez, we can fix this bug easily, and then create a new release. Good. > This is one thing to consider though: > Like Bio.WWW.NCBI, Bio.Entrez provides access to NCBI's Entrez > databases but does not provide parsers for the output generated by NCBI > (note: some file formats generated by NCBI Entrez' sequence databases > can be parsed by Bio.SeqIO). Our options are then: > 1) Keep Bio.Entrez as a module only to access NCBI Entrez, but not to parse the results. > 2) Add parsers to Bio.Entrez. > 3) Make a new Biopython release now, and potentially add parsers later. 
> > Suggestions, preferences, comments, anybody? I would go with option (3), and make a new release soon. If we write any Entrez specific parsers, they could live in Bio.Entrez Peter From bugzilla-daemon at portal.open-bio.org Sun Feb 24 12:10:21 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 24 Feb 2008 12:10:21 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200802241710.m1OHALgL029795@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 ------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-24 12:10 EST ------- That new patch on comment 10 looks fine, and should resolve this bug. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. From bugzilla-daemon at portal.open-bio.org Sun Feb 24 12:31:40 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 24 Feb 2008 12:31:40 -0500 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200802241731.m1OHVe0C031271@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-24 12:31 EST ------- Further to my comment 7, > And in the other direction, Doc/Images/BlastRecord.png, > PSIBlastRecord.png and smcra.png appear to be checked > in as text: They work fine on Linux, but when checked > out on Windows the images are corrupt. Add to this Doc/images/bottle.png - it would be great if someone else on Windows could try this in case its a problem with my setup. 
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 03:38:20 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 03:38:20 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200802250838.m1P8cKLa009702@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 mdehoon at ims.u-tokyo.ac.jp changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #12 from mdehoon at ims.u-tokyo.ac.jp 2008-02-25 03:38 EST ------- Fixed in CVS, using the patch from Comment #10. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. From tiagoantao at gmail.com Mon Feb 25 05:46:04 2008 From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=) Date: Mon, 25 Feb 2008 10:46:04 +0000 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: <200802241731.m1OHVe0C031271@portal.open-bio.org> References: <200802241731.m1OHVe0C031271@portal.open-bio.org> Message-ID: <6d941f120802250246q1e95c28bs54ff88bd0578071f@mail.gmail.com> Hi, I will delete and recommit this as binary... 
On Sun, Feb 24, 2008 at 5:31 PM, wrote: > http://bugzilla.open-bio.org/show_bug.cgi?id=2363 > > > > > > ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-24 12:31 EST ------- > Further to my comment 7, > > > And in the other direction, Doc/Images/BlastRecord.png, > > PSIBlastRecord.png and smcra.png appear to be checked > > in as text: They work fine on Linux, but when checked > > out on Windows the images are corrupt. > > Add to this Doc/images/bottle.png - it would be great if someone else on > Windows could try this in case its a problem with my setup. > > > -- > Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are the assignee for the bug, or are watching the assignee. > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > -- http://www.tiago.org/ps From bugzilla-daemon at portal.open-bio.org Mon Feb 25 07:43:05 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 07:43:05 -0500 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200802251243.m1PCh5iW026139@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 ------- Comment #5 from tiagoantao at gmail.com 2008-02-25 07:43 EST ------- I've corrected (hopefully) bottle.png. Can someone on Windows test please? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Mon Feb 25 14:22:09 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 14:22:09 -0500 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200802251922.m1PJM9df030144@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 ------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 14:22 EST ------- Using this seemed to work, tagging the images as binary: cvs admin -kb *.png cvs update -A *.png All of Doc/images/*.png now seem to work fine on Windows. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 15:36:31 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:36:31 -0500 Subject: [Biopython-dev] [Bug 2454] Iterators can't use file-like objects In-Reply-To: Message-ID: <200802252036.m1PKaVPb002261@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2454 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:36 EST ------- I agree with you that this is a problem - its something that gets checked explicitly for the parsers used in Bio.SeqIO by the unit test which uses a StringIO handle. Perhaps Bio/File.py would be a good place to put an "is this a handle" function, or maybe just replace these existing checks with relevant attribute checks. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Feb 25 15:38:19 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:38:19 -0500 Subject: [Biopython-dev] [Bug 1816] Error when importing GenBank file into BioSQL database In-Reply-To: Message-ID: <200802252038.m1PKcJMw002396@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1816 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:38 EST ------- As per comment 11, marking this as fixed since the original problem is resolved. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 15:40:51 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:40:51 -0500 Subject: [Biopython-dev] [Bug 2375] Coalescent support through Simcoal2 In-Reply-To: Message-ID: <200802252040.m1PKepCa002614@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2375 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #21 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:40 EST ------- The non-binary PNG file in cvs (bug 2363) now works on Windows, so this looks fine to me. Marking as fixed. Good job Tiago. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Feb 25 15:46:57 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:46:57 -0500 Subject: [Biopython-dev] [Bug 2425] Fasta ID parsing error In-Reply-To: Message-ID: <200802252046.m1PKkvdP002996@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2425 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:46 EST ------- I haven't got round to addressing this issue yet - currently the BioSQL with SeqIO unit test (which I added relatively recently) deliberately avoids using any FASTA files because of this problem. We may want to try and do something intelligent with the version field if present in the annotation dictionary, which should be more robust than simply checking the record.id format. I assume in your example you expected "region1.fasta.screen.Contig1" to be used as the record key in BioSQL? There is a 40 character limit on this field, which should be fine for most FASTA identifiers. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 15:52:02 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:52:02 -0500 Subject: [Biopython-dev] [Bug 2381] translate and transcibe methods for the Seq object (in Bio.Seq) In-Reply-To: Message-ID: <200802252052.m1PKq2e3003313@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2381 ------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:52 EST ------- There are some functions in Bio/Utils.py which could also be deprecated after adding translate and transcribe functionality to the Seq object. 
In fact, we might consider deprecating all of Bio/Utils.py and moving anything worthwhile into Bio/SeqUtils -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 16:00:19 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 16:00:19 -0500 Subject: [Biopython-dev] [Bug 2448] Bio.EUtils can't handle accented author names In-Reply-To: Message-ID: <200802252100.m1PL0JiB003906@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2448 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 16:00 EST ------- This was Andrew Dalke's reply on the Biopython-dev mailing list, 10 Feb 2008, which I'm adding to Bugzilla for future reference: On Feb 10, 2008, at 9:29 PM, bugzilla-daemon at portal.open-bio.org wrote: > Summary: Bio.EUtils can't handle accented author names ... > self.stack[-1].append(Text(text)) > UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in > position 4: > ordinal not in range(128) The EUtils code is old. It uses a DTD to XML parser that I found, what, 6 years ago? This problem is because the code uses class IndentedText(str): def __init__(self, data=""): self.data = unescape(unicode(data)) self._level = 0 self._parent = None That derivation from str is suspicious. I don't think it's needed, but I haven't reviewed the code well enough. Getting rid of the 'str' *might* fix it. Otherwise what's going on is the __new__ is seeing the byte string using non-ASCII values and it doesn't know what to do. So another solution might be to change that base class to "unicode" and do the right decode calls. Note that the current parser doesn't handle &# notation. Some years back I started work on an EUtils2. It used the then-quite-new ElementTree library.
Here's what I had http://www.dalkescientific.com/writings/diary/archive/2005/09/30/using_eutils.html If anyone wants the code, http://dalkescientific.com/EUtils-2.0a1.tar.gz I don't plan on doing anything more with it until I have a pressing need. Like someone wanting to pay me for it :) This old mail might also be useful for someone working on non-ASCII queries that are sent to NCBI. > The following is the MEDLINE character table for the XML. > > http://www.nlm.nih.gov/databases/dtd/medline_character_database.utf8 > > Diana Airozo > NCBI Contractor > dalke at dalkescientific.com wrote (Tue, Sep 7 2004 15:20:14): > > >> Hi Diana, >> >> Thank you for your reply. For a clarification on the >> non-ASCII query question >> >> >>>> Also, how do I do non-ASCII queries? For example, suppose I want >>>> to search for papers from "Göteborg Universitet" or "La Universidad >>>> de España". >>>> >> >> >> >>> You would search using Goteborg. >>> >> >> I want to automate this so that a user query for Göteborg >> gets converted into "Goteborg." I would prefer to use the >> same algorithm for doing this that your indexer uses. I >> looked online for unicode -> ASCII conversion table that >> strips the accents and other diacriticals and expands >> characters like ß into ss and æ into ae. I found >> several, but I would prefer to use the same table your >> indexer has so that queries are more likely to work. >> >> (Well, actually I would like your search code to perform >> the same input normalization that your indexer does, but >> I'll use this as a workaround.) >> >> Is the conversion table you use available? >> -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
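For the accent-stripping workaround discussed in the quoted mail, a minimal sketch might look like the following. It uses Python's standard unicodedata module to decompose characters and drop the combining marks, plus a small hand-written table for letters such as ß and æ that have no decomposition. The table and the function name fold_to_ascii are illustrative assumptions. This is not the conversion table used by NCBI's indexer, which the thread never obtained, and it is written for Python 3 rather than the Python 2 of the original code.

```python
import unicodedata

# Hypothetical expansion table for characters that NFKD does not decompose.
# It is NOT the table used by NCBI's indexer.
EXPANSIONS = {
    "ß": "ss",
    "æ": "ae",
    "Æ": "AE",
    "ø": "o",
    "Ø": "O",
}


def fold_to_ascii(text):
    """Return an ASCII-only approximation of text."""
    # Expand the special letters first, since they have no decomposition.
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in text)
    # Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", expanded)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII is dropped rather than guessed at.
    return stripped.encode("ascii", "ignore").decode("ascii")


print(fold_to_ascii("Göteborg Universitet"))
print(fold_to_ascii("La Universidad de España"))
```

Whether the folded query matches what the Entrez indexer stores would still need checking against the MEDLINE character table linked above.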
From bugzilla-daemon at portal.open-bio.org Mon Feb 25 18:24:47 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 18:24:47 -0500 Subject: [Biopython-dev] [Bug 2454] Iterators can't use file-like objects In-Reply-To: Message-ID: <200802252324.m1PNOlCw013424@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2454 cracka80 at gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED ------- Comment #3 from cracka80 at gmail.com 2008-02-25 18:24 EST ------- (In reply to comment #2) > I agree with you that this is a problem - it's something that gets checked > explicitly for the parsers used in Bio.SeqIO by the unit test which uses a > StringIO handle. > > Perhaps Bio/File.py would be a good place to put an "is this a handle" > function, or maybe just replace these existing checks with relevant attribute > checks. > What I'll do is add a function, change the checks in the relevant Iterators and upload a CVS diff (because I don't have write access). Of course I'll try it out to make sure it's not buggy. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From gregorio at umh.es Fri Feb 29 07:32:12 2008 From: gregorio at umh.es (Gregorio Fernandez) Date: Fri, 29 Feb 2008 13:32:12 +0100 Subject: [Biopython-dev] deprecation? Message-ID: <47C7FB4C.40607@umh.es> Dear Sir, I had this message in one of my scripts. Can I have this feature available? C:\Python25\lib\site-packages\Bio\config\DBRegistry.py:149: DeprecationWarning: Concurrent behavior has been deprecated, as this functionality needs Bio.MultiProc, which itself has been deprecated.
If you need the concurrent behavior, please let the Biopython developers know by sending an email to biopython-dev at biopython.org to avoid permanent removal of this feature. DeprecationWarning) Thanks Gregorio -- Gregorio J. Fernandez Ballester Instituto de Biología Molecular y Celular Universidad Miguel Hernández Edificio Torregaitán. Avda. de la Universidad, s/n. 03202 Elche (Alicante) E-mail: gregorio at umh.es Telf: 966 65 84 41 Fax: 966 65 87 58 From bugzilla-daemon at portal.open-bio.org Fri Feb 29 22:04:14 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 29 Feb 2008 22:04:14 -0500 Subject: [Biopython-dev] [Bug 2464] New: from Bio import db doesn't work? Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2464 Summary: from Bio import db doesn't work? Product: Biopython Version: 1.44 Platform: PC OS/Version: Windows XP Status: NEW Severity: blocker Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: patrikd at gmail.com Just trying to run an example straight out of the BioPython cookbook: ncbi_dict = GenBank.NCBIDictionary("nucleotide", "genbank") Traceback (most recent call last): File "", line 1, in ncbi_dict = GenBank.NCBIDictionary("nucleotide", "genbank") File "C:\Program Files\Python25\lib\site-packages\Bio\GenBank\__init__.py", line 1283, in __init__ from Bio import db ImportError: cannot import name db -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
From krewink at inb.uni-luebeck.de Wed Feb 13 05:06:47 2008 From: krewink at inb.uni-luebeck.de (Albert Krewinkel) Date: Wed, 13 Feb 2008 10:06:47 -0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> References: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> <473614.54149.qm@web62415.mail.re1.yahoo.com> <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> Message-ID: <20080213100047.GA18695@inb.uni-luebeck.de> Hi, On Wed, Feb 13, 2008 at 09:19:27AM +0000, Peter wrote: > On Feb 13, 2008 12:58 AM, Michiel de Hoon wrote: > > Peter wrote: > > svn co svn+ssh://username at dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk > > > > This command worked for me, though for some reason my password > > is always refused the first time but accepted the second time. > > How strange - for me it asked for my password three times (this was > the issue I had emailed Chris about directly; also establishing that > yes, the same accounts and passwords were being used as in CVS). I > hope they can sort this out... That's about normal and just the way svn works. I don't know the details, but AFAIK svn connects multiple times to the repo: Version Checking for changes, downloading data, etc. -- every operation needs a separate authentication. Quite stupid, in a way. You might want to generate a public key and add it to the ~/.ssh/authorized_keys file on the server - you won't be asked for a password any more. Cheers, Albert -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: From bugzilla-daemon at portal.open-bio.org Tue Feb 5 13:36:16 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 08:36:16 -0500 Subject: [Biopython-dev] [Bug 2443] New: Specifying the alphabet in Bio.SeqIO.parse() Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2443 Summary: Specifying the alphabet in Bio.SeqIO.parse() Product: Biopython Version: 1.44 Platform: All OS/Version: All Status: NEW Severity: enhancement Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk Currently when reading sequences using Bio.SeqIO, unless the alphabet can be determined from the file format, all the records have a generic alphabet. This can be a handicap if later on you want to work with "strict" functions which check for a particular alphabet (e.g. a gapped alphabet when working with alignments), or perhaps the Bio.Translate module. For an example of this, see Dalloliogm's question on the SeqIO wiki talk page, http://biopython.org/wiki/Talk:SeqIO Currently the user may need to use a tedious work around to override the alphabet of each sequence, e.g. from Bio import SeqIO from Bio.Alphabet import generic_dna records = list(SeqIO.parse(open("data.txt"), "fasta")) for record in records : record.seq.alphabet = generic_dna record_dict = SeqIO.to_dict(records) Instead, I want to add an optional argument to the parse() and read() functions, allowing this example to be shortened: from Bio import SeqIO from Bio.Alphabet import generic_dna record_dict = SeqIO.to_dict(SeqIO.parse(open("data.txt"), "fasta", generic_dna)) Suggested patch to follow... -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
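The behaviour requested in this report can be sketched without touching Biopython itself. The following is a rough, self-contained illustration only: FakeSeq, FakeRecord and parse_with_alphabet are hypothetical stand-ins, not Biopython classes or functions, and the string "generic_dna" stands in for an alphabet object. It shows the idea of a wrapper that overrides the alphabet on each record as it is yielded, so the caller no longer needs the explicit loop from the workaround.

```python
class FakeSeq:
    """Hypothetical stand-in for a sequence object with an alphabet attribute."""

    def __init__(self, data, alphabet="generic"):
        self.data = data
        self.alphabet = alphabet


class FakeRecord:
    """Hypothetical stand-in for a sequence record."""

    def __init__(self, id, seq):
        self.id = id
        self.seq = seq


def parse_with_alphabet(records, alphabet=None):
    """Yield records, overriding each sequence's alphabet when one is given."""
    for record in records:
        if alphabet is not None:
            record.seq.alphabet = alphabet
        yield record


raw = [FakeRecord("a", FakeSeq("ACGT")), FakeRecord("b", FakeSeq("GGCC"))]
record_dict = {rec.id: rec for rec in parse_with_alphabet(raw, "generic_dna")}
print(record_dict["a"].seq.alphabet)
```

Because the wrapper is a generator, records are still produced one at a time, so the override does not force the whole file into memory.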
From bugzilla-daemon at portal.open-bio.org Tue Feb 5 13:37:58 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 08:37:58 -0500 Subject: [Biopython-dev] [Bug 2443] Specifying the alphabet in Bio.SeqIO.parse() In-Reply-To: Message-ID: <200802051337.m15DbwAX026189@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2443 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-05 08:37 EST ------- Created an attachment (id=853) --> (http://bugzilla.open-bio.org/attachment.cgi?id=853&action=view) Path to Bio/SeqIO/__init__.py One possible implementation which will use a format specific parser's optional alphabet argument if defined, and if not simply override the alphabet of the returned records. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 20:05:55 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 15:05:55 -0500 Subject: [Biopython-dev] [Bug 2446] New: Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2446 Summary: Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. Product: Biopython Version: Not Applicable Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: dthomp325 at gmail.com When parsing an ace file that contains CT tags with comments such as those added by Polyphred 6.11, Bio.Sequencing.Ace.ACEParser appears to get stuck in an infinite loop until it dies with a memory usage exception. 
example CT tag with comment: CT{ Contig36 polyPhredRank1 polyPhred 3608 3608 080205:125543 COMMENT{ 99 C} } Parsing works correctly for the exact same ace file minus the COMMENTs. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Tue Feb 5 23:38:15 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Tue, 5 Feb 2008 18:38:15 -0500 Subject: [Biopython-dev] [Bug 2446] Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. In-Reply-To: Message-ID: <200802052338.m15NcFFP001008@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2446 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-05 18:38 EST ------- Could you supply an example input file [which we could use for a unit test] and associated snippet of python code to load it, which shows the problem? Thanks. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From chris.lasher at gmail.com Wed Feb 6 03:27:19 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Tue, 5 Feb 2008 22:27:19 -0500 Subject: [Biopython-dev] Biopython to begin transition to Subversion Message-ID: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> Hello all Biopythonistas, In the next upcoming weeks, Biopython will begin and complete its transition from CVS to Subversion (SVN) as its revision control system. This transition will likely not affect end users of Biopython except that to get the development version, a checkout with a Subversion client, rather than a CVS client, will be necessary. 
For developers, we will need to determine a suitable range of dates (a week) during which we will "freeze" the CVS repository for its transition to SVN. From the freeze and thereon, commits to the CVS repository will no longer be possible. Instead, commits not placed in during the freeze will need to take place in the Subversion repository once we have it running. This week, we hope to have a "dry run" of the Subversion repository available for the developers to poke around and make sure the transition will include everything necessary. Following that, we'll have the freeze and complete the transition. If you have any questions, I'll be checking posts to the list, or you may feel free contact me directly. Best, Chris From bugzilla-daemon at portal.open-bio.org Wed Feb 6 16:25:20 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 6 Feb 2008 11:25:20 -0500 Subject: [Biopython-dev] [Bug 2446] Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail. In-Reply-To: Message-ID: <200802061625.m16GPKN2020679@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2446 ------- Comment #2 from dthomp325 at gmail.com 2008-02-06 11:25 EST ------- I tried to attach the file that causes the error, but it looks like it's too big. I get this error from Bugzilla: DBD::mysql::st execute failed: Got a packet bigger than 'max_allowed_packet' bytes [for Statement "INSERT INTO attach_data (id, thedata) VALUES (861, ?)" with ParamValues: 0='AS 2 1710 Would it be possible for me to e-mail the file directly to you? (In reply to comment #1) > Could you supply an example input file [which we could use for a unit test] and > associated snippet of python code to load it, which shows the problem? Thanks. > -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From biopython at maubp.freeserve.co.uk Wed Feb 6 16:27:49 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 6 Feb 2008 16:27:49 +0000 Subject: [Biopython-dev] [BioPython] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> References: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> Message-ID: <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> > In the next upcoming weeks, Biopython will begin and complete its > transition from CVS to Subversion (SVN) as its revision control > system. I gather that BioPerl and BioJava and BioSQL have all transitioned fine, so it's our turn now. Michiel - do you think we should try and do another release before the CVS freeze and migration? We've had lots of little changes, plus Tiago's PopGen work and my own efforts with BioSQL. There are still a few open issues, but I think a release soon would be reasonable (depending on your time commitments of course). > If you have any questions, I'll be checking posts to the list, or you > may feel free contact me directly. Will the existing developer accounts simply work on the new SVN repository? Is there any issue with Unix/Windows newlines under SVN? I recall reading somewhere that like CVS, SVN can be set up to handle this transparently for text files. I may be worrying over nothing, but given that we have developers using Linux, Windows and MacOS this seems worth checking. Peter From bugzilla-daemon at portal.open-bio.org Wed Feb 6 16:28:15 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 6 Feb 2008 11:28:15 -0500 Subject: [Biopython-dev] [Bug 2446] Comments in CT tags cause Bio.Sequencing.Ace.ACEParser to fail.
In-Reply-To: Message-ID: <200802061628.m16GSFwR020863@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2446 ------- Comment #3 from dthomp325 at gmail.com 2008-02-06 11:28 EST ------- The python code is simply: from Bio.Sequencing import Ace ace_parser = Ace.ACEParser() ace_file = ace_parser.parse(open((in_file), 'r')) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From tiagoantao at gmail.com Wed Feb 6 17:05:33 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Wed, 6 Feb 2008 17:05:33 +0000 Subject: [Biopython-dev] [BioPython] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> References: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> Message-ID: <6d941f120802060905h3bc09488tbd7ea3c85bce5914@mail.gmail.com> Hi, On Feb 6, 2008 4:27 PM, Peter wrote: > Michiel - do you think we should try and do another release before the > CVS freeze and migration? We've had lots of little changes, plus > Tiago's PopGen work and my own efforts with BioSQL. There are still a > few open issues, but I think a release soon would be reasonable > (depending on your time commitments of course). Just FYI: As I noticed that the SVN move would be happening sooner or later, I decided to put everything into a stable state and stop at that point. Hopefully all that there is PopGen related is stable and ready to move (code, test, doc). As soon as we move to SVN I will get back into committing (now the really interesting stuff will start: statistics and maybe HapMap).
Tiago From mjldehoon at yahoo.com Thu Feb 7 01:10:06 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 6 Feb 2008 17:10:06 -0800 (PST) Subject: [Biopython-dev] [BioPython] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> Message-ID: <617104.88204.qm@web62413.mail.re1.yahoo.com> Peter wrote: Michiel - do you think we should try and do another release before the CVS freeze and migration? We've had lots of little changes, plus Tiago's PopGen work and my own efforts with BioSQL. There are still a few open issues, but I think a release soon would be reasonable (depending on your time commitments of course). I think that the Subversion/CVS issue is separate from our release schedule, so I don't think that the transition to Subversion by itself should be a reason for a release. However, we can probably make a release soon after the transition. I would like to finalize my work on Bio.WWW before making a release, but hopefully that won't be too complicated. --Michiel
From bugzilla-daemon at portal.open-bio.org Thu Feb 7 08:08:29 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 7 Feb 2008 03:08:29 -0500 Subject: [Biopython-dev] [Bug 2447] New: EUtils cannot parse PubMed XML for ACS journals Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2447 Summary: EUtils cannot parse PubMed XML for ACS journals Product: Biopython Version: 1.44 Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: baoilleach at gmail.com Here's the code to reproduce the bug: from Bio import EUtils from Bio.EUtils import DBIdsClient PMID = "17238260" result = DBIdsClient.from_dbids(EUtils.DBIds("pubmed", PMID)) print result.efetch().read() summary = result.summary() The error is: Traceback (most recent call last): File "bug.py", line 8, in ? summary = result.summary() File "/home/user/Tools/biopython-1.44/Bio/EUtils/DBIdsClient.py", line 105, in summary return parse.parse_summary_xml(self.esummary("xml")) File "/home/user/Tools/biopython-1.44/Bio/EUtils/parse.py", line 416, in parse_summary_xml d = convert_summary_Items(docsum.find_elements("Item")) File "/home/user/Tools/biopython-1.44/Bio/EUtils/parse.py", line 394, in convert_summary_Items d[name] = summary_type_parser_table[item.Type](item) File "/home/user/Tools/biopython-1.44/Bio/EUtils/parse.py", line 321, in convert_summary_Date return convert_summary_Date_string(x.tostring()) File "/home/user/Tools/biopython-1.44/Bio/EUtils/parse.py", line 351, in convert_summary_Date_string raise TypeError("Unknown date format: %s" % (s,)) TypeError: Unknown date format: -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From peter at maubp.freeserve.co.uk Thu Feb 7 09:36:34 2008 From: peter at maubp.freeserve.co.uk (Peter) Date: Thu, 7 Feb 2008 09:36:34 +0000 Subject: [Biopython-dev] [BioPython] Biopython to begin transition to Subversion In-Reply-To: <617104.88204.qm@web62413.mail.re1.yahoo.com> References: <320fb6e00802060827p37c0aeabk55fa378a4cb35abf@mail.gmail.com> <617104.88204.qm@web62413.mail.re1.yahoo.com> Message-ID: <320fb6e00802070136r7984d523rcc3c683d8f897431@mail.gmail.com> On Feb 7, 2008 1:10 AM, Michiel de Hoon wrote: > I think that the Subversion/CVS issue is separate from our release schedule, > so I don't think that the transition to Subversion by itself should be a reason > for a release. However, we can probably make a release soon after the > transition. I would like to finalize my work on Bio.WWW before making a > release, but hopefully that won't be too complicated. > > --Michiel You're right the CVS/SVN migration isn't directly linked - but its a nice excuse to get a release out ;) I'd forgotten you still had the Bio.WWW module to sort out, sorry. Peter From bugzilla-daemon at portal.open-bio.org Thu Feb 7 09:37:15 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 7 Feb 2008 04:37:15 -0500 Subject: [Biopython-dev] [Bug 2447] EUtils cannot parse PubMed XML for ACS journals In-Reply-To: Message-ID: <200802070937.m179bFQr029851@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2447 ------- Comment #1 from baoilleach at gmail.com 2008-02-07 04:37 EST ------- Caused by the absence of an EPubDate. (Noel) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
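Given the diagnosis above (the summary has no EPubDate, so the date string reaching the parser is empty), one possible guard is to treat an empty string as a missing date before attempting to parse it. The sketch below is an illustration under stated assumptions, not the Bio.EUtils code: the function name parse_optional_date and the "YYYY/MM/DD" format are hypothetical choices made for the example.

```python
from datetime import datetime


def parse_optional_date(text, fmt="%Y/%m/%d"):
    """Return a date for a non-empty string, or None when the field is absent.

    The default format is an assumption made for this example only.
    """
    text = text.strip()
    if not text:
        # An absent date field arrives as an empty string; report it as missing
        # instead of raising an "unknown date format" error.
        return None
    return datetime.strptime(text, fmt).date()


print(parse_optional_date(""))
print(parse_optional_date("2007/01/19"))
```

Callers would then need to accept None for the date field, which is a design choice the thread does not settle.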
From jblanca at btc.upv.es Thu Feb 7 09:58:10 2008 From: jblanca at btc.upv.es (Jose Blanca) Date: Thu, 7 Feb 2008 10:58:10 +0100 Subject: [Biopython-dev] [BioPython] Alignment add_sequence In-Reply-To: <320fb6e00802070133n67a549b5k8868a025f423dc82@mail.gmail.com> References: <200802061706.08830.jblanca@btc.upv.es> <200802070925.28882.jblanca@btc.upv.es> <320fb6e00802070133n67a549b5k8868a025f423dc82@mail.gmail.com> Message-ID: <200802071058.10148.jblanca@btc.upv.es> On Thursday 07 February 2008 10:33:49 Peter wrote: > On Feb 7, 2008 8:25 AM, Jose Blanca wrote: > > Hi: > > I think I can't use Bio.SeqIO.to_alignment() because the > > sequences have different lengths and start at different > > positions. It's an EST alignment, not a clustal-like one. > > I have also looked at your proposal in bug 1944 and I really > > like it, especially the clever __getitem__ method. But I can't > > use it because of the different lengths of the sequences. > > I'm going to add an add_seqRecord method. Now, thanks to you I > > understand why this is not a good solution. But, at least, it > > will do for this time. > > The whole idea behind the current alignment class is that all the > sequences are the same length (often with gaps). I don't think this > fits with your intended usage - unless you pad each record with > leading gap characters (according to its start) and then pad the end > until they are all the same length. You could write a function to > take a list of SeqRecords and pad them like this (note the example > will be easier to read in a mono-spaced font): I could do this, but I don't like the idea. An initial pad is not the same as a gap. The whole point of the program I'm working on is to look for SNPs and indels and this implementation would confuse the indel search. I have looked at your proposal for the new Alignment implementation and the more I look at it, the more I like the idea of subclassing from list.
Maybe the only problem is that it shouldn't be a list of seqRecords. A sequence in an alignment is a seqRecord located at a given position. Maybe the Alignment class could take that into account internally. In that case I don't know how to create a simple API that could deal with the case of start=0 and with the more complex case of start != 0. A possible solution could be to accept seqRecords and tuples like (seqRecord, start) in the constructor. > > e.g. > > CONSENSUS: AGGCCTGAGGCCCCTTTT, start 0 > EST1 : CGCAGGCCCGAGGCC, start -3 > EST2 : GGCCTGAGGCCCCTT, start 1 > EST3 : CTGAGGCCACTTTTTCGC, start 4 > > In this case we want to add (start+3) gaps to each line, where -3 = > min(starts). This becomes: > > ---AGGCCTGAGGCCCCTTTT, start 0 > CGCAGGCCCGAGGCC, start -3 > ----GGCCTGAGGCCCCTT, start 1 > -------CTGAGGCCACTTTTTCGC, start 4 > > Then work out the maximum length, and pad all the sequences with trailing > gaps: > > ---AGGCCTGAGGCCCCTTTT---- > CGCAGGCCCGAGGCC---------- > ----GGCCTGAGGCCCCTT------ > -------CTGAGGCCACTTTTTCGC > > A little bit of work, but now all the sequences are the same length > and the Biopython alignment class will be happy. > > As far as I know, there is nothing for this built into Biopython at > the moment. Could you tell us what your input file looks like (e.g. > link to the file format?) The alignment is originally done by cap3, but the data is in a MySQL database. I'm using EST2uni (http://bioinf.comav.upv.es/est2uni/). I have fetched the information from the database and I have set up the seqRecord objects and now I'm trying to create the Alignment object. > > Peter Thanks, -- Jose M.
Blanca Postigo Instituto Universitario de Conservacion y Mejora de la Agrodiversidad Valenciana (COMAV) Universidad Politecnica de Valencia (UPV) Edificio CPI (Ciudad Politecnica de la Innovacion), 8E 46022 Valencia (SPAIN) Tlf.:+34-96-3877000 (ext 88473) From mjldehoon at yahoo.com Fri Feb 8 16:06:11 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 8 Feb 2008 08:06:11 -0800 (PST) Subject: [Biopython-dev] Bio.WWW.NCBI proposal Message-ID: <714644.12951.qm@web62409.mail.re1.yahoo.com> Hi everybody, Currently, there are two ways in Biopython to get access to NCBI's Entrez databases (Bio.WWW.NCBI and Bio.EUtils). Bio.PubMed builds on Bio.WWW.NCBI, and Bio.GenBank builds on Bio.EUtils. Clearly, having two modules for the same thing is not optimal. From looking at these two modules, I think that Bio.WWW.NCBI is more suitable as Biopython's module to interact with NCBI. It is much smaller and very straightforward, and therefore much easier to maintain, and it has some documentation (though not quite enough). Bio.EUtils is quite large, and is difficult to maintain since none of the current active developers are familiar with it. Bio.WWW.NCBI has two problems though: It is not quite up to date (some functions are missing, and other functions are for databases that were deprecated a while ago), and it is the only remaining module inside Bio.WWW. Concretely, I'd like to propose the following: 1) Move Bio.WWW.NCBI to Bio.Entrez (actually, copy and deprecate Bio.WWW.NCBI). 2) Make it Biopython's general module for interacting with NCBI Entrez by adding any missing functions from the list at http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html (this will be very straightforward; EInfo, ESummary, EGQuery, and ESpell are currently missing), and removing any obsolete functions. 3) Update the tutorial accordingly. 4) Use Bio.Entrez in Bio.GenBank.NCBIDictionary to fix bug #2393.
At that point, I think we have an error-free Biopython again (alas only in the sense that no errors or warnings appear when running the test suite), so we'd be ready for a new release. I don't want to deprecate Bio.EUtils right now, since it also contains some functionality other than database access (e.g. parsing the database output from NCBI; we can discuss those issues after the next release). Any comments or objections? --Michiel --------------------------------- Looking for last minute shopping deals? Find them fast with Yahoo! Search. From chris.lasher at gmail.com Fri Feb 8 19:55:38 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Fri, 8 Feb 2008 14:55:38 -0500 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> References: <128a885f0802051927g1d773a51l5b0e7b914e347ffd@mail.gmail.com> Message-ID: <128a885f0802081155o99df22bv2e6dc5ca6f64525@mail.gmail.com> On Tue, Feb 5, 2008 at 10:27 PM, Chris Lasher wrote: > For developers, we will need to determine a suitable range of dates (a > week) during which we will "freeze" the CVS repository for its > transition to SVN. From the freeze and thereon, commits to the CVS > repository will no longer be possible. Instead, commits not placed in > during the freeze will need to take place in the Subversion repository > once we have it running. This week, we hope to have a "dry run" of the > Subversion repository available for the developers to poke around and > make sure the transition will include everything necessary. Following > that, we'll have the freeze and complete the transition. Hi all, The prototype SVN repository is now available.
You can check it out with: svn co svn+ssh://dev.open-bio.org/home/hartzell/biopython-prototype Chris From mjldehoon at yahoo.com Sat Feb 9 05:44:56 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 8 Feb 2008 21:44:56 -0800 (PST) Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802081155o99df22bv2e6dc5ca6f64525@mail.gmail.com> Message-ID: <632331.45313.qm@web62415.mail.re1.yahoo.com> Hi Chris, When I executed the svn command, I get subdirectories branches, tags, and trunk. Branches is almost empty, tags contains all previous Biopython releases, and trunk is Biopython leading up to the next release. Shouldn't we see trunk only (same as with CVS)? The second issue is that the svn command exits with an error message: svn: Can't copy 'biopython-prototype/biopython/tags/biopython-100a4/Tests/MetaTool/.svn/tmp/text-base/meta9.out.svn-base' to 'biopython-prototype/biopython/tags/biopython-100a4/Tests/MetaTool/.svn/tmp/meta9.out.tmp.tmp': No such file or directory Thanks! --Michiel. Hi all, The prototype SVN repository is now available. You can check it out with: svn co svn+ssh://dev.open-bio.org/home/hartzell/biopython-prototype Chris From jflatow at northwestern.edu Sat Feb 9 20:12:49 2008 From: jflatow at northwestern.edu (Jared Flatow) Date: Sat, 9 Feb 2008 14:12:49 -0600 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <632331.45313.qm@web62415.mail.re1.yahoo.com> References: <632331.45313.qm@web62415.mail.re1.yahoo.com> Message-ID: Is there any read-only access yet for biopython users without login credentials? I'm very excited about this change, I have been waiting to update until the switch was made. On Feb 8, 2008, at 11:44 PM, Michiel de Hoon wrote: > When I executed the svn command, I get subdirectories branches, > tags, and trunk.
Branches is almost empty, tags contains all > previous Biopython releases, and trunk is Biopython leading up to > the next release. Shouldn't we see trunk only (same as with CVS)? You may have already figured this out but with svn you can check out only the trunk with: svn co svn+ssh://dev.open-bio.org/home/hartzell/biopython-prototype/ trunk [name you want to checkout into] jared
From bugzilla-daemon at portal.open-bio.org Sun Feb 10 20:29:37 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 10 Feb 2008 15:29:37 -0500 Subject: [Biopython-dev] [Bug 2448] New: Bio.EUtils can't handle accented author names Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2448 Summary: Bio.EUtils can't handle accented author names Product: Biopython Version: 1.44 Platform: PC OS/Version: Windows XP Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: baoilleach at gmail.com The following code exhibits the bug: from Bio import EUtils from Bio.EUtils import DBIdsClient pmids = ["17299727", "17118524"] client = DBIdsClient.DBIdsClient() for pmid in pmids: paper = client.search(pmid) print paper.efetch().read() summary = paper.summary() data = summary.dataitems authors = ", ".join(data['AuthorList'].allvalues()) p = {'title': data['Title'], 'journal': data['Source'], 'volume': data['Volume'], 'authors': authors, 'pages': data['Pages']} try: p['year'] = data['PubDate'].year except: p['year'] = "----" if hasattr(data, "DOI"): p['doi'] = data['DOI'] print i, p['authors'] , p['title'], p['journal'], p['year'], p['volume'], p['pages'] The result is: Traceback (most recent call last): File "pmids.py", line 11, in summary = paper.summary() File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1. 44\Bio\EUtils\DBIdsClient.py", line 105, in summary return parse.parse_summary_xml(self.esummary("xml")) File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1.
44\Bio\EUtils\parse.py", line 412, in parse_summary_xml pom = xml_parser.parse_using_dtd(infile) File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1. 44\Bio\EUtils\parse.py", line 48, in parse_using_dtd parser.parse(file) File "C:\Program Files\Python25\lib\xml\sax\expatreader.py", line 107, in pars e xmlreader.IncrementalParser.parse(self, source) File "C:\Program Files\Python25\lib\xml\sax\xmlreader.py", line 123, in parse self.feed(buffer) File "C:\Program Files\Python25\lib\xml\sax\expatreader.py", line 207, in feed self._parser.Parse(data, isFinal) File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1. 44\Bio\EUtils\POM.py", line 774, in characters self.stack[-1].append(Text(text)) UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 4: ordinal not in range(128) -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From dalke at dalkescientific.com Sun Feb 10 21:50:13 2008 From: dalke at dalkescientific.com (Andrew Dalke) Date: Sun, 10 Feb 2008 22:50:13 +0100 Subject: [Biopython-dev] [Bug 2448] New: Bio.EUtils can't handle accented author names In-Reply-To: References: Message-ID: On Feb 10, 2008, at 9:29 PM, bugzilla-daemon at portal.open-bio.org wrote: > Summary: Bio.EUtils can't handle accented author names ... > self.stack[-1].append(Text(text)) > UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in > position 4: > ordinal not in range(128) The EUtils code is old. It uses a DTD to XML parser that I found, what, 6 years ago? This problem is because the code uses class IndentedText(str): def __init__(self, data=""): self.data = unescape(unicode(data)) self._level = 0 self._parent = None That derivation from str is suspicious. I don't think it's needed, but I haven't reviewed the code well enough. 
Getting rid of the 'str' *might* fix it. Otherwise what's going on is the __new__ is seeing the byte string using non-ASCII values and it doesn't know what to do. So another solution might be to change that base class to "unicode" and do the right decode calls. Note that the current parser doesn't handle &# notation. Some years back I started work on a EUtils2. It used the then-quite-new ElementTree library. Here's what I had http://www.dalkescientific.com/writings/diary/archive/2005/09/30/using_eutils.html If anyone wants the code, http://dalkescientific.com/EUtils-2.0a1.tar.gz I don't plan on doing anything more with it until I have a pressing need. Like someone wanting to pay me for it :) This old mail might also be useful for someone working on non-ASCII queries that are sent to NCBI. > The following is the MEDLINE character table for the XML. > > http://www.nlm.nih.gov/databases/dtd/medline_character_database.utf8 > > Diana Airozo > NCBI Contractor > dalke at dalkescientific.com wrote (Tue, Sep 7 2004 15:20:14): > > >> Hi Diana, >> >> Thank you for your reply. For a clarification on the >> non-ASCII query question >> >> >>>> Also, how do I do non-ASCII queries? For example, suppose I want >>>> to search for papers from "Göteborg Universitet" or "La Universidad >>>> de España". >>>> >> >> >> >>> You would search using Goteborg. >>> >> >> I want to automate this so that a user query for Göteborg >> gets converted into "Goteborg." I would prefer to use the >> same algorithm for doing this that your indexer uses. I >> looked online for unicode -> ASCII conversion table that >> strips the accents and other diacriticals and expands >> characters like ß into ss and ? into ae. I found >> several, but I would prefer to use the same table your >> indexer has so that queries are more likely to work. >> >> (Well, actually I would like your search code to perform >> the same input normalization that your indexer does, but >> I'll use this as a workaround.)
>> >> Is the conversion table you use available? >> Andrew dalke at dalkescientific.com From biopython at maubp.freeserve.co.uk Mon Feb 11 17:55:03 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 11 Feb 2008 17:55:03 +0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: References: <632331.45313.qm@web62415.mail.re1.yahoo.com> Message-ID: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> > You may have already figured this out but with svn you can check out > only the trunk with: > > svn co svn+ssh://dev.open-bio.org/home/hartzell/biopython-prototype/ > trunk [name you want to checkout into] I got there in the end after some password hiccups (I've emailed Chris about this) with: svn co svn+ssh://username@dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk Once that was done, I was able to build Biopython and run the unit tests fine on my Linux machine. I haven't tried anything further (e.g. running a "svn diff" or committing a small change). As to Jared's question, I would expect there to be guest read-only access on the official SVN repository, like there is now with CVS. I don't know if there is a guest account set up on the prototype migrated SVN. Chris? Peter From mjldehoon at yahoo.com Wed Feb 13 00:58:33 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Tue, 12 Feb 2008 16:58:33 -0800 (PST) Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> Message-ID: <473614.54149.qm@web62415.mail.re1.yahoo.com> Peter wrote: svn co svn+ssh://username@dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk This command worked for me, though for some reason my password is always refused the first time but accepted the second time. Do we have a wiki page about CVS to SVN transition? We should add this command (and other useful commands) to that page. --Michiel.
From biopython at maubp.freeserve.co.uk Wed Feb 13 09:19:27 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 13 Feb 2008 09:19:27 +0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <473614.54149.qm@web62415.mail.re1.yahoo.com> References: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> <473614.54149.qm@web62415.mail.re1.yahoo.com> Message-ID: <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> On Feb 13, 2008 12:58 AM, Michiel de Hoon wrote: > Peter wrote: > svn co svn+ssh://username@dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk > > This command worked for me, though for some reason my password > is always refused the first time but accepted the second time. How strange - for me it asked for my password three times (this was the issue I had emailed Chris about directly; also establishing that yes, the same accounts and passwords were being used as in CVS). I hope they can sort this out... >Do we have a wiki page about CVS to SVN transition? Not yet - but there is some useful information on http://www.open-bio.org/wiki/SourceCode which we might base this on. > We should add this command (and other useful commands) to that page. Note that the SVN command above is just for the prototype repository, for the real thing the URL will be a little different.
Peter From chris.lasher at gmail.com Thu Feb 14 15:42:40 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Thu, 14 Feb 2008 10:42:40 -0500 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> References: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> <473614.54149.qm@web62415.mail.re1.yahoo.com> <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> Message-ID: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> On Wed, Feb 13, 2008 at 4:19 AM, Peter wrote: > On Feb 13, 2008 12:58 AM, Michiel de Hoon wrote: > > Peter wrote: > > svn co svn+ssh://username@dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk > > > > This command worked for me, though for some reason my password > > is always refused the first time but accepted the second time. > > How strange - for me it asked for my password three times (this was > the issue I had emailed Chris about directly; also establishing that > yes, the same accounts and passwords were being used as in CVS). I > hope they can sort this out... Hmm. Even for SSH with keys I am prompted twice. How strange. I'll contact the OBF guys. Chris From mjldehoon at yahoo.com Fri Feb 15 06:40:24 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 14 Feb 2008 22:40:24 -0800 (PST) Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> Message-ID: <658418.5192.qm@web62414.mail.re1.yahoo.com> Just to confirm the status of the transition to Subversion: Can we still commit changes to CVS? --Michiel. --------------------------------- Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now.
From chris.lasher at gmail.com Fri Feb 15 07:21:54 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Fri, 15 Feb 2008 02:21:54 -0500 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <658418.5192.qm@web62414.mail.re1.yahoo.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> Message-ID: <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> On Fri, Feb 15, 2008 at 1:40 AM, Michiel de Hoon wrote: > Just to confirm the status of the transition to Subversion: > Can we still commit changes to CVS? Good question. Yes, you should be able to commit to the CVS repository. It has not been frozen yet, AFAIK. Chris From marcin.cieslik at gmail.com Sun Feb 17 07:58:58 2008 From: marcin.cieslik at gmail.com (Marcin Cieślik) Date: Sun, 17 Feb 2008 02:58:58 -0500 Subject: [Biopython-dev] KDTree to numpy In-Reply-To: <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> Message-ID: <47B7E942.5060501@gmail.com> Hi, I was able to make KDtrees work with numpy. There were no real problems apart from my complete lack of c++/swig skills. It seems to me that everything is working fine. The includes in KDTree.i look like this: %{ #include "KDTree.h" #include %} Is this correct? Changes in .py files were really small.
I compiled/tested it on Linux: make clean rm *.o rm *.so make gcc -c KDTree.cpp KDTree.swig.cpp -I/usr/include/python2.5 g++ -shared KDTree.swig.o KDTree.o -o _CKDTree.so You can get it here: http://149.156.87.35/~marcin_cieslik/KDTree2.tar.gz Marcin From biopython at maubp.freeserve.co.uk Sun Feb 17 11:05:51 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 17 Feb 2008 11:05:51 +0000 Subject: [Biopython-dev] KDTree to numpy In-Reply-To: <47B7E942.5060501@gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> <47B7E942.5060501@gmail.com> Message-ID: <320fb6e00802170305y5ac6fa3ejb0473f3240fa5122@mail.gmail.com> On Feb 17, 2008 7:58 AM, Marcin Cieślik wrote: > Hi, > > I was able to make KDtrees work with numpy. There were no real problems > apart from my complete lack of c++/swig skills. It seems to me that > everything is working fine. > > The includes in KDTree.i look like this: > > %{ > #include "KDTree.h" > #include > > %} > Is this correct? Changes in .py files were really small. This looks very similar to the change in the patch on bug 2255 except Ed used a relative path.
http://bugzilla.open-bio.org/show_bug.cgi?id=2251 Peter From bugzilla-daemon at portal.open-bio.org Wed Feb 20 19:41:38 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 20 Feb 2008 14:41:38 -0500 Subject: [Biopython-dev] [Bug 2454] New: Iterators can't use file-like objects Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2454 Summary: Iterators can't use file-like objects Product: Biopython Version: 1.43 Platform: All OS/Version: All Status: NEW Severity: minor Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: cracka80 at gmail.com I've noticed, when using BioPython, that if I hold my data in a StringIO object then several iterators cannot parse the data, due to type checking. Rather than check to see if the object has particular methods/attributes, they check its class, which will only validate if the object is an actual file instance, and invalidates perfectly valid file-like objects like StringIOs and URLs. I've so far only experienced it in version 1.43, but this is the latest version available from the Ubuntu package repository, so there may be many users experiencing the same problem. The affected iterators/functions are: * Gobase.Iterator * SwissProt.SProt.Iterator * Medline.Iterator * Prosite.Iterator * Prosite.Prodoc.Iterator * Rebase.Iterator * SCOP.Hie.Iterator * SCOP.Cla.Iterator * SCOP.Dom.Iterator * SCOP.Des.Iterator * SCOP.Raf.Iterator * Sequencing.Phd.Iterator * Sequencing.Ace.Iterator * SwissProt.KeyWList.extract_keywords A possible solution might be to implement a function which checks if an object is file-like; i.e. has all the necessary attributes and functions to be used as a file, and replace the type-checking with the application of this function to the object. 
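The suggestion in this report (test for the methods a handle provides rather than for its class) can be sketched as below. This is only an illustration written for current Python 3: the name is_file_like and the default list of required methods are assumptions, not the attachment later posted to the bug and not anything in Biopython.

```python
import io

# Assumption: a line-oriented iterator only needs these two methods;
# an individual parser might require more (or fewer).
REQUIRED_METHODS = ("read", "readline")


def is_file_like(obj, methods=REQUIRED_METHODS):
    """Return True if obj has a callable attribute for every name in methods."""
    return all(callable(getattr(obj, name, None)) for name in methods)


handle = io.StringIO("ID   TEST\n//\n")
handle_ok = is_file_like(handle)          # an in-memory handle passes
string_ok = is_file_like("not a handle")  # a plain string does not
```

An iterator could call such a check in place of a type comparison, or skip the check and let a missing method raise AttributeError at the point of use.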
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From mjldehoon at yahoo.com Thu Feb 21 14:00:43 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 21 Feb 2008 06:00:43 -0800 (PST) Subject: [Biopython-dev] Bio.Entrez / next Biopython release. Message-ID: <643737.72757.qm@web62415.mail.re1.yahoo.com> Hi everybody, As discussed previously, I created a module Bio.Entrez for interacting with NCBI's Entrez databases (GenBank, PubMed, and many others). This is essentially Bio.WWW.NCBI renamed to Bio.Entrez; Bio.WWW.NCBI still exists in the same location but is deprecated. In the process, I updated this module to include all of NCBI Entrez Programming Utilities, and deprecated those that have been superseded at NCBI. The code is now in CVS as Bio/Entrez.py. In hindsight, it would probably have been a better idea to use Bio/Entrez/__init__.py in case we want to expand Bio.Entrez, but anyway this can be rectified before creating the next Biopython release. I also wrote some documentation for Bio.Entrez. You can have a preview at http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html; Chapter 6 describes Bio.Entrez, and gives a good overview of the current status of this module. The module Bio.Entrez was created in response to Bug #2393: http://bugzilla.open-bio.org/show_bug.cgi?id=2393 Using Bio.Entrez, we can fix this bug easily, and then create a new release. This is one thing to consider though: Like Bio.WWW.NCBI, Bio.Entrez provides access to NCBI's Entrez databases but does not provide parsers for the output generated by NCBI (note: some file formats generated by NCBI Entrez' sequence databases can be parsed by Bio.SeqIO). Our options are then: 1) Keep Bio.Entrez as a module only to access NCBI Entrez, but not to parse the results. 2) Add parsers to Bio.Entrez. 
3) Make a new Biopython release now, and potentially add parsers later. Suggestions, preferences, comments, anybody? --Michiel. From bugzilla-daemon at portal.open-bio.org Thu Feb 21 16:31:00 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 21 Feb 2008 11:31:00 -0500 Subject: [Biopython-dev] [Bug 2454] Iterators can't use file-like objects In-Reply-To: Message-ID: <200802211631.m1LGV0on024444@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2454 ------- Comment #1 from cracka80 at gmail.com 2008-02-21 11:31 EST ------- Created an attachment (id=867) --> (http://bugzilla.open-bio.org/attachment.cgi?id=867&action=view) Python source file containing a test for object file-like-ness This just has a test to see if an object has the necessary methods and attributes to be considered file-like. It is based on code taken from the 'filelike' package, which can be found in the Python package index. It's a possible idea for a test to see if the handles, as passed to many iterators, can be considered file-like. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sat Feb 23 14:33:18 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 23 Feb 2008 09:33:18 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200802231433.m1NEXI4g031887@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 mdehoon at ims.u-tokyo.ac.jp changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #803 is|0 |1 obsolete| | -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. From bugzilla-daemon at portal.open-bio.org Sat Feb 23 14:35:36 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 23 Feb 2008 09:35:36 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200802231435.m1NEZaxv031974@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 ------- Comment #10 from mdehoon at ims.u-tokyo.ac.jp 2008-02-23 09:35 EST ------- Created an attachment (id=870) --> (http://bugzilla.open-bio.org/attachment.cgi?id=870&action=view) Patch for Bio/GenBank/__init__.py Attached is a new patch, making use of the new module Bio.Entrez. Does this look better? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. 
From sbassi at gmail.com Sat Feb 23 19:07:47 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Sat, 23 Feb 2008 17:07:47 -0200 Subject: [Biopython-dev] [BioPython] BioSQL documentation for Biopython In-Reply-To: <87D56EF5-9BC8-4401-8A09-2BB3104BE1CE@gmx.net> References: <895066B4-1004-4FF9-A135-7A0FEEEF8DF2@gmx.net> <16F88C8D-70DE-4020-B97D-B3D43AF530AF@gmx.net> <219760B8-1D60-4549-9151-9C1ECB46FE4B@gmx.net> <87D56EF5-9BC8-4401-8A09-2BB3104BE1CE@gmx.net> Message-ID: On Sat, Feb 23, 2008 at 3:58 PM, Hilmar Lapp wrote: > You mean from SVN, probably? I don't know but it seems to me that > problem is in some (Bio)Python code? Yes, the problem was that I was using 1.44 biopython without the new BioSQL code from Peter. Biopython repository is still in CVS, not SVN (at least biopython is not listed here: http://code.open-bio.org/svnweb/index.cgi/) Now with the new code, I could reproduce the tutorial, up to here: >>> from BioSQL import BioSeqDatabase >>> server=BioSeqDatabase.open_database(driver = "MySQLdb", user = "X",passwd="X", host = "localhost", db = "bioseqdb") >>> db = server.new_database("cold") >>> from Bio import GenBank >>> parser = GenBank.FeatureParser() >>> iterator = GenBank.Iterator(open("cor6_6.gb"), parser) >>> db.load(iterator) 6 But when I look into the mysql, there is no new record!. The "6" is supposed to be the number of records loaded into the database. But my database is empty (it has the schema, but w/o data). > That would be a question for the Biopython folks (I actually don't > use Biopython). I am copying this into biopython and biopython-dev mailing list. 
-- Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Tutorial libre de Python: http://tinyurl.com/2az5d5 From sbassi at gmail.com Sat Feb 23 19:50:50 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Sat, 23 Feb 2008 17:50:50 -0200 Subject: [Biopython-dev] [BioPython] BioSQL documentation for Biopython In-Reply-To: References: <895066B4-1004-4FF9-A135-7A0FEEEF8DF2@gmx.net> <16F88C8D-70DE-4020-B97D-B3D43AF530AF@gmx.net> <219760B8-1D60-4549-9151-9C1ECB46FE4B@gmx.net> <87D56EF5-9BC8-4401-8A09-2BB3104BE1CE@gmx.net> Message-ID: On Sat, Feb 23, 2008 at 5:20 PM, Hilmar Lapp wrote: > I.e., there is no error from the db.load() command, just no data? Yes, there was no error, the only response was "6". > Does the Biopython binding enable or disable auto-commit? If the > latter (which would be the Right Thing(tm) to do), you will have to Yes, when working with MySQLdb, it does not auto-commit. You have to do DB_HANDLE.commit(). There is no commit method in db: >>> dir(db) ['__doc__', '__getitem__', '__init__', '__module__', '__repr__', 'adaptor', 'dbid', 'get_PrimarySeq_stream', 'get_Seq_by_acc', 'get_Seq_by_id', 'get_Seq_by_primary_id', 'get_Seq_by_ver', 'get_Seqs_by_acc', 'get_all_primary_ids', 'items', 'keys', 'load', 'lookup', 'name', 'values'] > BioSQL uses InnoDB on MySQL, and hence will be transactional unless > you make the language's db driver to auto-commit. I am looking at the DatabaseLoader class (in loader.py) but I don't see any commit statement, anyway, I don't understand this class, so I may be missing something. 
-- Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Tutorial libre de Python: http://tinyurl.com/2az5d5 From sbassi at gmail.com Sat Feb 23 20:58:03 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Sat, 23 Feb 2008 18:58:03 -0200 Subject: [Biopython-dev] [BioPython] BioSQL documentation for Biopython In-Reply-To: References: <895066B4-1004-4FF9-A135-7A0FEEEF8DF2@gmx.net> <16F88C8D-70DE-4020-B97D-B3D43AF530AF@gmx.net> <219760B8-1D60-4549-9151-9C1ECB46FE4B@gmx.net> <87D56EF5-9BC8-4401-8A09-2BB3104BE1CE@gmx.net> Message-ID: On Sat, Feb 23, 2008 at 5:50 PM, Sebastian Bassi wrote: > > BioSQL uses InnoDB on MySQL, and hence will be transactional unless > > you make the language's db driver to auto-commit. > I am looking at the DatabaseLoader class (in loader.py) but I don't > see any commit statement, anyway, I don't understand this class, so I > may be missing something. I've just found the answer. Here is what was missing: server.adaptor.commit() I found it here: http://www.biopython.org/wiki/BioSQL So the document IMHO should be changed, for example: ">>> db.load(iterator) 6 And the GenBank file is loaded into the database. Notice that the load function returns the number of records loaded (6 in this case). This is useful for sanity checking to make sure that you didn't try to load a massive file and end up with a result like 3." To: ">>> db.load(iterator) 6 >>> server.adaptor.commit() And the GenBank file is loaded into the database. Notice that the load function returns the number of records loaded (6 in this case). This is useful for sanity checking to make sure that you didn't try to load a massive file and end up with a result like 3." A link to http://www.biopython.org/wiki/BioSQL could be added. 
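The behaviour described in this thread (a load that reports six records while the database still looks empty until commit is called) is ordinary transactional behaviour and can be reproduced without BioSQL. The sketch below uses Python's standard sqlite3 module as a stand-in for MySQL/InnoDB; the table name demo_records and the record names are made up for the illustration, and nothing here is BioSQL or Biopython code.

```python
import os
import sqlite3
import tempfile


def count_rows(path):
    """Open a separate connection and count the rows it can see."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM demo_records").fetchone()[0]
    finally:
        conn.close()


def load_then_commit():
    """Insert six rows; return the counts seen before and after commit()."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        writer = sqlite3.connect(path)
        writer.execute("CREATE TABLE demo_records (name TEXT)")
        writer.commit()
        rows = [("rec%d" % i,) for i in range(1, 7)]
        # The inserts run inside a transaction that stays open.
        writer.executemany("INSERT INTO demo_records VALUES (?)", rows)
        before = count_rows(path)  # another connection sees nothing yet
        writer.commit()            # analogous to server.adaptor.commit()
        after = count_rows(path)
        writer.close()
        return before, after
    finally:
        os.remove(path)


before, after = load_then_commit()
```

Here before is 0 and after is 6: the rows exist only inside the writer's open transaction until commit() makes them visible to other connections, which matches what was seen when inspecting MySQL from a separate client.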
-- Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Tutorial libre de Python: http://tinyurl.com/2az5d5 From biopython at maubp.freeserve.co.uk Sun Feb 24 17:09:10 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sun, 24 Feb 2008 17:09:10 +0000 Subject: [Biopython-dev] Bio.Entrez / next Biopython release. In-Reply-To: <643737.72757.qm@web62415.mail.re1.yahoo.com> References: <643737.72757.qm@web62415.mail.re1.yahoo.com> Message-ID: <320fb6e00802240909q1ad5a1e4r74fe2ffc48496093@mail.gmail.com> Michiel de Hoon wrote: > Hi everybody, > > As discussed previously, I created a module Bio.Entrez for interacting > with NCBI's Entrez databases ... The code is now in CVS as Bio/Entrez.py. > In hindsight, it would probably have been a better idea to use > Bio/Entrez/__init__.py in case we want to expand Bio.Entrez, but > anyway this can be rectified before creating the next Biopython release. I agree with you on the use of Bio/Entrez/__init__.py > I also wrote some documentation for Bio.Entrez. You can have > a preview at http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html; > Chapter 6 describes Bio.Entrez, and gives a good overview of the current > status of this module. Great. > The module Bio.Entrez was created in response to Bug #2393: > http://bugzilla.open-bio.org/show_bug.cgi?id=2393 > Using Bio.Entrez, we can fix this bug easily, and then create a new release. Good. > This is one thing to consider though: > Like Bio.WWW.NCBI, Bio.Entrez provides access to NCBI's Entrez > databases but does not provide parsers for the output generated by NCBI > (note: some file formats generated by NCBI Entrez' sequence databases > can be parsed by Bio.SeqIO). Our options are then: > 1) Keep Bio.Entrez as a module only to access NCBI Entrez, but not to parse the results. > 2) Add parsers to Bio.Entrez. > 3) Make a new Biopython release now, and potentially add parsers later. 
> > Suggestions, preferences, comments, anybody? I would go with option (3), and make a new release soon. If we write any Entrez specific parsers, they could live in Bio.Entrez Peter From bugzilla-daemon at portal.open-bio.org Sun Feb 24 17:10:21 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 24 Feb 2008 12:10:21 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200802241710.m1OHALgL029795@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 ------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-24 12:10 EST ------- That new patch on comment 10 looks fine, and should resolve this bug. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. From bugzilla-daemon at portal.open-bio.org Sun Feb 24 17:31:40 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 24 Feb 2008 12:31:40 -0500 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200802241731.m1OHVe0C031271@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-24 12:31 EST ------- Further to my comment 7, > And in the other direction, Doc/Images/BlastRecord.png, > PSIBlastRecord.png and smcra.png appear to be checked > in as text: They work fine on Linux, but when checked > out on Windows the images are corrupt. Add to this Doc/images/bottle.png - it would be great if someone else on Windows could try this in case its a problem with my setup. 
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 08:38:20 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 03:38:20 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200802250838.m1P8cKLa009702@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 mdehoon at ims.u-tokyo.ac.jp changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #12 from mdehoon at ims.u-tokyo.ac.jp 2008-02-25 03:38 EST ------- Fixed in CVS, using the patch from Comment #10. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. From tiagoantao at gmail.com Mon Feb 25 10:46:04 2008 From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=) Date: Mon, 25 Feb 2008 10:46:04 +0000 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: <200802241731.m1OHVe0C031271@portal.open-bio.org> References: <200802241731.m1OHVe0C031271@portal.open-bio.org> Message-ID: <6d941f120802250246q1e95c28bs54ff88bd0578071f@mail.gmail.com> Hi, I will delete and recommit this as binary... 
On Sun, Feb 24, 2008 at 5:31 PM, wrote: > http://bugzilla.open-bio.org/show_bug.cgi?id=2363 > > > > > > ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-24 12:31 EST ------- > Further to my comment 7, > > > And in the other direction, Doc/Images/BlastRecord.png, > > PSIBlastRecord.png and smcra.png appear to be checked > > in as text: They work fine on Linux, but when checked > > out on Windows the images are corrupt. > > Add to this Doc/images/bottle.png - it would be great if someone else on > Windows could try this in case its a problem with my setup. > > > -- > Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are the assignee for the bug, or are watching the assignee. > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > -- http://www.tiago.org/ps From bugzilla-daemon at portal.open-bio.org Mon Feb 25 12:43:05 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 07:43:05 -0500 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200802251243.m1PCh5iW026139@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 ------- Comment #5 from tiagoantao at gmail.com 2008-02-25 07:43 EST ------- I've corrected (hopefully) bottle.png. Can someone one Windows test please? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Feb 25 19:22:09 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 14:22:09 -0500 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200802251922.m1PJM9df030144@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 ------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 14:22 EST ------- Using this seemed to work, tagging the images as binary: cvs admin -kb *.png cvs update -A *.png All of Doc/images/*.png now seem to work fine on Windows. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 20:36:31 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:36:31 -0500 Subject: [Biopython-dev] [Bug 2454] Iterators can't use file-like objects In-Reply-To: Message-ID: <200802252036.m1PKaVPb002261@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2454 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:36 EST ------- I agree with you that this is a problem - its something that gets checked explicitly for the parsers used in Bio.SeqIO by the unit test which uses a StringIO handle. Perhaps Bio/File.py would be a good place to put an "is this a handle" function, or maybe just replace these existing checks with relevant attribute checks. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
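A rough sketch of the "is this a handle" idea from comment #2, written as an attribute check rather than a type check. The function name is_handle and the choice of required methods are assumptions made for illustration; nothing like this is confirmed to exist in Bio/File.py. It is written for current Python, where StringIO lives in the io module.

```python
from io import StringIO


def is_handle(obj, methods=("read", "readline")):
    """Return True if obj provides the callable methods a parser would need.

    This is duck typing: any file-like object (real file, StringIO,
    network response) passes, while a plain filename string does not.
    """
    return all(callable(getattr(obj, name, None)) for name in methods)


# Example: an in-memory handle is accepted, a filename string is rejected.
fasta_text = StringIO(">seq1\nACGT\n")
accepted = is_handle(fasta_text)
rejected = is_handle("data.txt")
```

Checking for the methods actually used, instead of testing for a specific file type, is what lets iterators accept StringIO and other file-like objects.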
From bugzilla-daemon at portal.open-bio.org Mon Feb 25 20:38:19 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:38:19 -0500 Subject: [Biopython-dev] [Bug 1816] Error when importing GenBank file into BioSQL database In-Reply-To: Message-ID: <200802252038.m1PKcJMw002396@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1816 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|ASSIGNED |RESOLVED Resolution| |FIXED ------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:38 EST ------- As per comment 11, marking this as fixed since the original problem is resolved. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 20:40:51 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:40:51 -0500 Subject: [Biopython-dev] [Bug 2375] Coalescent support through Simcoal2 In-Reply-To: Message-ID: <200802252040.m1PKepCa002614@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2375 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #21 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:40 EST ------- The non-binary PNG file in cvs (bug 2363) now works on Windows, so this looks fine to me. Marking as fixed. Good job Tiago. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Mon Feb 25 20:46:57 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:46:57 -0500 Subject: [Biopython-dev] [Bug 2425] Fasta ID parsing error In-Reply-To: Message-ID: <200802252046.m1PKkvdP002996@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2425 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:46 EST ------- I haven't got round to addressing this issue yet - currently the BioSQL with SeqIO unit test (which I added relatively recently) deliberately avoids using any FASTA files because of this problem. We may want to try and do something intelligent with the version field if present in the annotation dictionary, which should be more robust than simply checking the record.id format. I assume in your example you expected "region1.fasta.screen.Contig1" to be used as the record key in BioSQL? There is a 40 character limit on this field, which should be fine for most FASTA identifiers. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 20:52:02 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 15:52:02 -0500 Subject: [Biopython-dev] [Bug 2381] translate and transcibe methods for the Seq object (in Bio.Seq) In-Reply-To: Message-ID: <200802252052.m1PKq2e3003313@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2381 ------- Comment #12 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 15:52 EST ------- There are some functions in Bio/Utils.py which could also be deprecated after adding translate and transcribe functionality to the Seq object. 
In fact, we might consider deprecating all of Bio/Utils.py and moving anything worthwhile into Bio/SeqUtils -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Feb 25 21:00:19 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 16:00:19 -0500 Subject: [Biopython-dev] [Bug 2448] Bio.EUtils can't handle accented author names In-Reply-To: Message-ID: <200802252100.m1PL0JiB003906@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2448 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-02-25 16:00 EST ------- This was Andrew Dalke's reply on the Biopython-dev mailing list, 10 Feb 2008, which I'm adding to Bugzilla for future reference: On Feb 10, 2008, at 9:29 PM, bugzilla-daemon at portal.open-bio.org wrote: > Summary: Bio.EUtils can't handle accented author names ... > self.stack[-1].append(Text(text)) > UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in > position 4: > ordinal not in range(128) The EUtils code is old. It uses a DTD to XML parser that I found, what, 6 years ago? This problem is because the code uses class IndentedText(str): def __init__(self, data=""): self.data = unescape(unicode(data)) self._level = 0 self._parent = None That derivation from str is suspicious. I don't think it's needed, but I haven't reviewed the code well enough. Getting rid of the 'str' *might* fix it. Otherwise what's going on is the __new__ is seeing the byte string using non-ASCII values and it doesn't know what to do. So another solution might be to change that base class to "unicode" and do the right decode calls. Note that the current parser doesn't handle &# notation. Some years back I started work on a EUtils2. It used the then-quite- new ElementTree library. 
Here's what I had http://www.dalkescientific.com/writings/diary/archive/2005/09/30/using_eutils.html If anyone wants the code, http://dalkescientific.com/EUtils-2.0a1.tar.gz I don't plan on doing anything more with it until I have a pressing need. Like someone wanting to pay me for it :) This old mail might also be useful for someone working on non-ASCII queries that are sent to NCBI. > The following is the MEDLINE character table for the XML. > > http://www.nlm.nih.gov/databases/dtd/medline_character_database.utf8 > > Diana Airozo > NCBI Contractor > dalke at dalkescientific.com wrote (Tue, Sep 7 2004 15:20:14): > > >> Hi Diana, >> >> Thank you for your reply. For a clarification on the >> non-ASCII query question >> >> >>>> Also, how do I do non-ASCII queries? For example, suppose I want >>>> to search for papers from "Göteborg Universitet" or "La Universidad >>>> de España". >>>> >> >> >> >>> You would search using Goteborg. >>> >> >> I want to automate this so that a user query for Göteborg >> gets converted into "Goteborg." I would prefer to use the >> same algorithm for doing this that your indexer uses. I >> looked online for unicode -> ASCII conversion table that >> strips the accents and other diacriticals and expands >> characters like ß into ss and æ into ae. I found >> several, but I would prefer to use the same table your >> indexer has so that queries are more likely to work. >> >> (Well, actually I would like your search code to perform >> the same input normalization that your indexer does, but >> I'll use this as a workaround.) >> >> Is the conversion table you use available? >> -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee.
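A sketch of the kind of Unicode to ASCII folding described in the quoted mail, for turning a user query into an unaccented form before sending it to NCBI. This is not NCBI's conversion table: the SPECIAL mapping and the function name to_ascii are hypothetical, and the approach is plain Unicode decomposition from the standard-library unicodedata module, so the results may differ from what the MEDLINE indexer does.

```python
import unicodedata

# Hypothetical expansions for letters that have no accent-free base form.
SPECIAL = {
    "\u00df": "ss",  # sharp s
    "\u00e6": "ae",  # ae ligature
    "\u00c6": "AE",
    "\u00f8": "o",   # o with stroke
    "\u00d8": "O",
}


def to_ascii(text):
    """Strip accents and expand a few special letters; drop anything else non-ASCII."""
    expanded = "".join(SPECIAL.get(ch, ch) for ch in text)
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", expanded)
    # Encoding with errors="ignore" discards the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

With this folding, a query for the Swedish spelling of Gothenburg comes out as "Goteborg", matching the search term NCBI suggested in the reply.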
From bugzilla-daemon at portal.open-bio.org Mon Feb 25 23:24:47 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 25 Feb 2008 18:24:47 -0500 Subject: [Biopython-dev] [Bug 2454] Iterators can't use file-like objects In-Reply-To: Message-ID: <200802252324.m1PNOlCw013424@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2454 cracka80 at gmail.com changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |ASSIGNED ------- Comment #3 from cracka80 at gmail.com 2008-02-25 18:24 EST ------- (In reply to comment #2) > I agree with you that this is a problem - its something that gets checked > explicitly for the parsers used in Bio.SeqIO by the unit test which uses a > StringIO handle. > > Perhaps Bio/File.py would be a good place to put an "is this a handle" > function, or maybe just replace these existing checks with relevant attribute > checks. > What I'll do is add a function, change the checks in the relevant Iterators and upload a CVS diff (because I don't have write access). Of course I'll try it out to make sure it's not buggy. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From gregorio at umh.es Fri Feb 29 12:32:12 2008 From: gregorio at umh.es (Gregorio Fernandez) Date: Fri, 29 Feb 2008 13:32:12 +0100 Subject: [Biopython-dev] deprecation? Message-ID: <47C7FB4C.40607@umh.es> Dear Sir, I had this message in one of my scripts. Can I have this feature available? C:\Python25\lib\site-packages\Bio\config\DBRegistry.py:149: DeprecationWarning: Concurrent behavior has been deprecated, as this functionality needs Bio.MultiProc, which itself has been deprecated.
If you need the concurrent behavior, please let the Biopython developers know by sending an email to biopython-dev at biopython.org to avoid permanent removal of this feature. DeprecationWarning) Thanks Gregorio -- Gregorio J. Fernandez Ballester Instituto de Biología Molecular y Celular Universidad Miguel Hernández Edificio Torregaitán. Avda. de la Universidad, s/n. 03202 Elche (Alicante) E-mail: gregorio at umh.es Telf: 966 65 84 41 Fax: 966 65 87 58 From krewink at inb.uni-luebeck.de Wed Feb 13 10:06:47 2008 From: krewink at inb.uni-luebeck.de (Albert Krewinkel) Date: Wed, 13 Feb 2008 10:06:47 -0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> References: <320fb6e00802110955s57cba8c4p3e0a9fc9f9bff7e7@mail.gmail.com> <473614.54149.qm@web62415.mail.re1.yahoo.com> <320fb6e00802130119t6c95bd28u22b94ecfebfd3ad9@mail.gmail.com> Message-ID: <20080213100047.GA18695@inb.uni-luebeck.de> Hi, On Wed, Feb 13, 2008 at 09:19:27AM +0000, Peter wrote: > On Feb 13, 2008 12:58 AM, Michiel de Hoon wrote: > > Peter wrote: > > svn co svn+ssh://username at dev.open-bio.org/home/hartzell/biopython-prototype/biopython/trunk > > > > This command worked for me, though for some reason my password > > is always refused the first time but accepted the second time. > > How strange - for me it asked for my password three times (this was > the issue I had emailed Chris about directly; also establishing that > yes, the same accounts and passwords were being used as in CVS). I > hope they can sort this out... That's about normal and just the way svn works. I don't know the details, but AFAIK svn connects multiple times to the repo: Version Checking for changes, downloading data, etc. -- every operation needs a separate authentication. Quite stupid, in a way.
You might want to generate a public key and add it to the ~/.ssh/authorized_keys file on the server - you won't be asked for a password any more. Cheers, Albert