From bugzilla-daemon at portal.open-bio.org Sat Mar 1 03:54:19 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 1 Mar 2008 03:54:19 -0500 Subject: [Biopython-dev] [Bug 2464] from Bio import db doesn't work? In-Reply-To: Message-ID: <200803010854.m218sJFT023721@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2464 mdehoon at ims.u-tokyo.ac.jp changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |DUPLICATE ------- Comment #1 from mdehoon at ims.u-tokyo.ac.jp 2008-03-01 03:54 EST ------- Duplicate of Bug #2393, which was fixed in CVS. *** This bug has been marked as a duplicate of bug 2393 *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sat Mar 1 03:54:23 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 1 Mar 2008 03:54:23 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200803010854.m218sNPD023746@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 mdehoon at ims.u-tokyo.ac.jp changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |patrikd at gmail.com ------- Comment #13 from mdehoon at ims.u-tokyo.ac.jp 2008-03-01 03:54 EST ------- *** Bug 2464 has been marked as a duplicate of this bug. *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. 
From mjldehoon at yahoo.com Sat Mar 1 03:52:16 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sat, 1 Mar 2008 00:52:16 -0800 (PST)
Subject: [Biopython-dev] deprecation?
In-Reply-To: <47C7FB4C.40607@umh.es>
Message-ID: <504389.32803.qm@web62405.mail.re1.yahoo.com>

Dear Gregorio,

Thanks for letting us know. Could you show us what exactly you are trying to do in your script? This function was deprecated because there were several functions in Biopython doing nearly the same thing, and we're trying to converge on one function. So the best thing would probably be to avoid using Bio\config\DBRegistry.py.

--Michiel.

Gregorio Fernandez wrote:
Dear Sir,
I had this message in one of my scripts. Can I have this feature available?

C:\Python25\lib\site-packages\Bio\config\DBRegistry.py:149: DeprecationWarning: Concurrent behavior has been deprecated, as this functionality needs Bio.MultiProc, which itself has been deprecated. If you need the concurrent behavior, please let the Biopython developers know by sending an email to biopython-dev at biopython.org to avoid permanent removal of this feature.
  DeprecationWarning)

Thanks
Gregorio

--
Gregorio J. Fernandez Ballester
Instituto de Biología Molecular y Celular
Universidad Miguel Hernández
Edificio Torregaitán. Avda. de la Universidad, s/n.
03202 Elche (Alicante)
E-mail: gregorio at umh.es
Telf: 966 65 84 41 Fax: 966 65 87 58

_______________________________________________
Biopython-dev mailing list
Biopython-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython-dev
From bugzilla-daemon at portal.open-bio.org Mon Mar 3 16:53:59 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 3 Mar 2008 16:53:59 -0500
Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes assert to fail when it should pass
In-Reply-To:
Message-ID: <200803032153.m23LrxP4023475@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2437

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-03 16:53 EST -------
Defining __eq__ and __ne__ methods for the Alphabet class would probably work, but we would also have to do this for the AlphabetEncoder "decorator" class. I'm a little wary of this...

    def __ne__(self, other) :
        """Check if this alphabet object != another alphabet"""
        return not self == other

    def __eq__(self, other) :
        """Check if this alphabet object == another alphabet"""
        #TODO - what exactly do we want to check here?
        if id(self) == id(other) :
            return True
        if not isinstance(other, Alphabet) \
        and not isinstance(other, AlphabetEncoder):
            raise ValueError("Comparing an alphabet to a non-alphabet")
        if self.__class__ != other.__class__ :
            return False
        if self.size != other.size :
            return False
        if self.letters != other.letters :
            return False
        if dir(self) != dir(other) :
            return False
        for attr in ["gap_char", "stop_symbol"] :
            if hasattr(self, attr) != hasattr(other, attr) :
                return False
            #Use the getattr built-in; objects have no usable __getattr__ by default
            if hasattr(self, attr) and hasattr(other, attr) \
            and getattr(self, attr) != getattr(other, attr) :
                return False
        #Close enough?
        return True

Relaxing the assertion in Bio.Translate would be much safer in terms of any potential side effects.
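The comparison proposed in comment #2 could perhaps be condensed by comparing one tuple of the relevant attributes. The following is only a minimal sketch in current Python 3: the Alphabet, DNAAlphabet and RNAAlphabet classes are hypothetical stand-ins written for this illustration, not Biopython's own classes, and returning NotImplemented for non-alphabets is a swapped-in design choice instead of the ValueError suggested above.

```python
class Alphabet:
    """Hypothetical stand-in for an alphabet class (not Biopython's own)."""

    letters = None
    size = None

    def _key(self):
        # Everything that is assumed to make two alphabets "the same".
        return (
            type(self),
            self.size,
            self.letters,
            getattr(self, "gap_char", None),
            getattr(self, "stop_symbol", None),
        )

    def __eq__(self, other):
        if not isinstance(other, Alphabet):
            # Let Python fall back to its default instead of raising.
            return NotImplemented
        return self._key() == other._key()

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Defining __eq__ removes the default hash, so restore one.
        return hash(self._key())


class DNAAlphabet(Alphabet):
    letters = "GATC"
    size = 1


class RNAAlphabet(Alphabet):
    letters = "GAUC"
    size = 1
```

With these definitions two separate DNAAlphabet instances compare equal, while an instance that has been given a gap_char no longer equals a plain one. Whether that is the desired meaning of alphabet equality is exactly the open question raised in the comment.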
From mjldehoon at yahoo.com Thu Mar 6 10:24:47 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 6 Mar 2008 07:24:47 -0800 (PST)
Subject: [Biopython-dev] New Biopython release
Message-ID: <48921.43822.qm@web62403.mail.re1.yahoo.com>

Hi everybody,

Let's make a new release (1.45). I'm thinking of Friday 21st, which gives us about two weeks. The current Biopython release (1.44) has a nasty bug that causes an error with one of the Bio.GenBank examples in the tutorial. This bug has since been fixed in CVS.

If you have any code that is ready to be submitted to CVS, now would be a good time to do so. If your code is not yet ready for prime time, please don't submit it to CVS until after the release to avoid any last-minute problems.

Biopython 1.44 had a large number of deprecations, but I feel it is too soon to remove them from the release completely. Bio.Blast.blast and Bio.Blast.blasturl have been deprecated for several releases now, so if there are no objections I think we should remove them. Bio.Kabat has been deprecated since release 1.43. Since it has few (if any) users, I think we should remove it too.

Also, please have a look at the Biopython bugs that are still open to see if there's anything we can do about them.

--Michiel.
From bugzilla-daemon at portal.open-bio.org Sat Mar 8 15:25:06 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 8 Mar 2008 15:25:06 -0500 Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes assert to fail when it should pass In-Reply-To: Message-ID: <200803082025.m28KP661006291@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2437 ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-08 15:25 EST ------- Changing Bio/Translate.py line 14+ and 36+ from this: assert seq.alphabet == self.table.nucleotide_alphabet, \ ... to this: #Allow different instances of the same class to be used: assert seq.alphabet.__class__ == \ self.table.nucleotide_alphabet.__class__, \ ... seems to resolve the original bug report. I'd like to check this doesn't affect any of the unit tests under Linux - Windows looks OK. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Mar 10 06:12:13 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 10 Mar 2008 06:12:13 -0400 Subject: [Biopython-dev] [Bug 1999] new frame translation method In-Reply-To: Message-ID: <200803101012.m2AACD7k003033@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1999 ------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp 2008-03-10 06:12 EST ------- Since SeqUtils.frameTranslations and SeqUtils.six_frame_translations are so similar, I think we should keep only one of these functions. Preferably named "six_frame_translations", for backward compatibility. Also, I think we should not require the seqO argument to be a Seq object. 
If this function is to replace the existing SeqUtils.six_frame_translations, we should make sure to keep all the existing functionality of that function. I believe the GC content calculation is currently missing in SeqUtils.frameTranslations.

From biopython at maubp.freeserve.co.uk Tue Mar 11 20:37:11 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 12 Mar 2008 00:37:11 +0000
Subject: [Biopython-dev] Biopython to begin transition to Subversion
In-Reply-To: <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
Message-ID: <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>

Hi Chris,

I haven't heard anything about the CVS to SVN move recently. Did anyone resolve the multiple password prompt niggle?

On another point, is the test SVN repository intended to be writable (for those of us with developer access)? I really should try running some things like "svn diff" and committing sample changes to get a feel for how it compares to CVS.

Peter

From sbassi at gmail.com Wed Mar 12 10:52:51 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Wed, 12 Mar 2008 11:52:51 -0300
Subject: [Biopython-dev] BLAST XML to HTML
Message-ID:

Is there a Biopython module to convert a BLAST XML output to HTML? Like this one from BioJava:
http://www.biojava.org/wiki/BioJava:CookBook:Blast:XML
If there is no such module, could this be included in Biopython if I provide the code?
-- Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Tutorial libre de Python: http://tinyurl.com/2az5d5 From peter at maubp.freeserve.co.uk Wed Mar 12 14:12:35 2008 From: peter at maubp.freeserve.co.uk (Peter) Date: Wed, 12 Mar 2008 18:12:35 +0000 Subject: [Biopython-dev] BLAST XML to HTML In-Reply-To: References: Message-ID: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com> On Wed, Mar 12, 2008 at 2:52 PM, Sebastian Bassi wrote: > Is there a Biopython module to convert a VLAST XML output to HTML? > Like this one from BioJAVA: > http://www.biojava.org/wiki/BioJava:CookBook:Blast:XML > If there is no such a module, could this be included into Biopython if > I provide the code? Is your idea to convert from the XML output of the NCBI BLAST tools into HTML very closely resembling the NCBI's HTML output (perhaps for another program to read as input). Or do you just want to produce a nice HTML page for a person to read (perhaps resembling the NCBI page in appearance, but not using the same HTML layout)? How would your code work -direct from the XML file, or from the results of the existing Biopython BLAST parsers? Peter From sbassi at gmail.com Wed Mar 12 14:21:18 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Wed, 12 Mar 2008 15:21:18 -0300 Subject: [Biopython-dev] BLAST XML to HTML In-Reply-To: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com> References: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com> Message-ID: On Wed, Mar 12, 2008 at 3:12 PM, Peter wrote: > Is your idea to convert from the XML output of the NCBI BLAST tools > into HTML very closely resembling the NCBI's HTML output (perhaps for > another program to read as input). Or do you just want to produce a > nice HTML page for a person to read (perhaps resembling the NCBI page > in appearance, but not using the same HTML layout)? 
The idea is because I always run the BLAST as XML since I parse them with biopython, but people at lab want to check the HTML version (or I want to "publish" the result in a public DB accessible via html) and that makes me re-run the BLAST just for them to see the output. Sometimes the BLAST are resource demanding (like a 2 week run) and I would like to avoid re-running the BLAST when I really want is a format change. > How would your code work -direct from the XML file, or from the > results of the existing Biopython BLAST parsers? >From the XML output. -- Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Tutorial libre de Python: http://tinyurl.com/2az5d5 From mjldehoon at yahoo.com Wed Mar 12 17:50:42 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 12 Mar 2008 14:50:42 -0700 (PDT) Subject: [Biopython-dev] BLAST XML to HTML In-Reply-To: Message-ID: <845186.11502.qm@web62406.mail.re1.yahoo.com> > > How would your code work -direct from the XML file, or from the > > results of the existing Biopython BLAST parsers? >From the XML output. One option is to use Cascading Style Sheets (CSS) to display the XML file. That way, you don't have to create a new HTML file. Also, we should check with NCBI if they have a tool for such purposes. --Michiel. __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From sbassi at gmail.com Wed Mar 12 18:25:35 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Wed, 12 Mar 2008 19:25:35 -0300 Subject: [Biopython-dev] BLAST XML to HTML In-Reply-To: <845186.11502.qm@web62406.mail.re1.yahoo.com> References: <845186.11502.qm@web62406.mail.re1.yahoo.com> Message-ID: On Wed, Mar 12, 2008 at 6:50 PM, Michiel de Hoon wrote: > One option is to use Cascading Style Sheets (CSS) to display the XML file. > That way, you don't have to create a new HTML file. 
Also, we should check
> with NCBI if they have a tool for such purposes.

They must have something because the new online NCBI BLAST has an option called "reformat BLAST results". This option can reformat from XML to HTML without re-running the BLAST, but it works server-side.

--
Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Tutorial libre de Python: http://tinyurl.com/2az5d5

From bugzilla-daemon at portal.open-bio.org Thu Mar 13 06:07:44 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 06:07:44 -0400
Subject: [Biopython-dev] [Bug 2468] New: Tutorial needs a fix: Bio.WWW.NCBI
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2468

           Summary: Tutorial needs a fix: Bio.WWW.NCBI
           Product: Biopython
           Version: 1.44
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Documentation
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: mmokrejs at ribosome.natur.cuni.cz

I am trying to follow the recipe at http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc14 which contains the following split into several chunks (I don't like this style personally, but that's not the issue here):

    #!/usr/bin/python
    from Bio.WWW import NCBI
    search_command = 'Search'
    search_database = 'Taxonomy'
    return_format = 'FASTA'
    search_term = 'Cypripedioideae'
    my_browser = 'lynx'

    result_handle = NCBI.query(search_command, search_database,
                               term = search_term, doptcmdl = return_format)

    import os
    result_file_name = os.path.join(os.getcwd(), "results.html")
    result_file = open(result_file_name, "w")
    result_file.write(result_handle.read())
    result_file.close()

    if my_browser == "lynx":
        os.system("lynx -force_html " + result_file_name)
    elif my_browser == "netscape":
        os.system("netscape file:" + result_file_name)

I end up with a lynx browser opened with the Entrez search page pre-filled with 'Cypripedioideae' as the query string. Unfortunately, I have to click on the condensed results to get the taxonomy listing under the word 'Cypripedioideae'. The line I am talking about is close to the end of the output:

    [ ] 1: Cypripedioideae, subfamily, monocots Links

BTW, the other links from the page do not work because they point to http://localhost/....

/usr/lib/python2.5/site-packages/Bio/WWW/NCBI.py:34: DeprecationWarning: Bio.WWW.NCBI is deprecated. The functions in Bio.WWW.NCBI are now available from Bio.Entrez.
  DeprecationWarning)

The section needs updating. I am somewhat surprised I cannot access NCBI Taxonomy easily. I will probably have to browse the source code and forget the Tutorial and Cookbook. ;)
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 06:55:55 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 06:55:55 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803131055.m2DAttv2027003@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2468

------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-13 06:55 EST -------
The tutorial in CVS is already updated to use Bio.Entrez.query instead of Bio.WWW.NCBI.query, reflecting the deprecation made in CVS. I think you are using the Biopython 1.44 tutorial (from the weblink) with the CVS Biopython code. So at least part of your problem is already fixed.

From bugzilla-daemon at portal.open-bio.org Thu Mar 13 07:27:14 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 07:27:14 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803131127.m2DBREdM028784@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2468

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-13 07:27 EST -------
Reading the Bio.Entrez documentation, the query function always returns HTML. You could also use the esearch function, which returns XML, followed by the efetch function, which seems to support a range of options depending on the datatype.
For example, using the taxonomy db:

    #This gets an XML file from the following URL,
    #http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Cypripedioideae
    from Bio import Entrez
    result_handle = Entrez.esearch("taxonomy", term="Cypripedioideae")
    print(result_handle.read())

You could then parse the XML file to extract the matching ID(s), perhaps with a regular expression. In this case, there is only one match, 158330.

    #http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy
    #&id=158330&report=brief&retmode=text
    from Bio import Entrez
    result_handle = Entrez.efetch("taxonomy", id="158330",
                                  report="docsum", retmode="text")
    print(result_handle.read())
    #Given ID 158330, returns "Cypripedioideae, subfamily, monocots"

I agree that this section of the tutorial could be more useful. Do you think the above code would be more helpful?

From biopython at maubp.freeserve.co.uk Thu Mar 13 08:14:09 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 13 Mar 2008 12:14:09 +0000
Subject: [Biopython-dev] Bio.Entrez and the depreciated pm* functions
Message-ID: <320fb6e00803130514s4aa31c4fo494b9f1594ef0b89@mail.gmail.com>

Hi Michiel (et al),

I have a query regarding the transition from Bio.WWW.NCBI to Bio.Entrez.

I notice you've marked several of the functions in Bio.Entrez with deprecation warnings as the NCBI has retired the associated APIs, i.e. pmfetch, pmqty and pmneighbor (while the whole of Bio.WWW.NCBI is deprecated).

Anyone using Bio.WWW.NCBI.pmfetch, Bio.WWW.NCBI.pmqty and Bio.WWW.NCBI.pmneighbor will get a deprecation warning from Bio.WWW.NCBI, then if they try to switch to Bio.Entrez.pmfetch, Bio.Entrez.pmqty and Bio.Entrez.pmneighbor they still get warnings.
Do you think we can just remove pmfetch, pmqty and pmneighbor from Bio.Entrez so that it starts out "clean", and adjust the warning from Bio.WWW.NCBI as follows: import warnings warnings.warn("Bio.WWW.NCBI is deprecated. The functions are now available from Bio.Entrez, except for the pm* functions which the NCBI have retired.", DeprecationWarning) What do you think? Peter From mjldehoon at yahoo.com Thu Mar 13 08:48:29 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 13 Mar 2008 05:48:29 -0700 (PDT) Subject: [Biopython-dev] Bio.Entrez and the depreciated pm* functions In-Reply-To: <320fb6e00803130514s4aa31c4fo494b9f1594ef0b89@mail.gmail.com> Message-ID: <84386.61626.qm@web62415.mail.re1.yahoo.com> That is fine with me. Maybe I was being too conservative. I'll make those changes. --Michiel. Peter wrote: Hi Michiel (et al), I have a query regarding the transition from Bio.WWW.NCBI to Bio.Entrez I notice you've marked several of the functions in Bio.Entrez with depreciation warnings as the NCBI has retired the associated APIs. i.e. pmfetch, pmqty and pmneighbor (while the whole of Bio.WWW.NCBI is deprecated). Anyone using Bio.WWW.NCBI.pmfetch, Bio.WWW.NCBI.pmqty and Bio.WWW.NCBI.pmneighbor will get a deprecation warning from Bio.WWW.NCBI, then if try switch to Bio.Entrez.pmfetch, Bio.Entrez.pmqty and Bio.Entrez.pmneighbor they still get warnings. Do you think we can just remove pmfetch, pmqty and pmneighbor from Bio.Entrez so that it starts out "clean", and adjust the warning from Bio.WWW.NCBI as follows: import warnings warnings.warn("Bio.WWW.NCBI is deprecated. The functions are now available from Bio.Entrez, except for the pm* functions which the NCBI have retired.", DeprecationWarning) What do you think? 
Peter

From bugzilla-daemon at portal.open-bio.org Fri Mar 14 11:53:34 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Mar 2008 11:53:34 -0400
Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes assert to fail when it should pass
In-Reply-To:
Message-ID: <200803141553.m2EFrY6p001573@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2437

biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-14 11:53 EST -------
Fixed in CVS, Bio/Translate.py revision 1.3, as described in comment 3.

This fixes the original report, making sequence translation simpler to use - see also Bug 2381 - translate and transcribe methods for the Seq object (in Bio.Seq)

This change does NOT address the larger issue of how to decide if two alphabets are equal or not.
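To illustrate why the original assertion failed and what the relaxed check in comment 3 changes, here is a small sketch. It assumes an alphabet class that defines no __eq__ of its own, so that == falls back to object identity; the IUPACUnambiguousDNA class and the same_alphabet_class helper are hypothetical names made up for this example, not code from Bio/Translate.py.

```python
class IUPACUnambiguousDNA:
    """Hypothetical stand-in alphabet with no __eq__ of its own."""

    letters = "GATC"


def same_alphabet_class(first, second):
    """Return True if both alphabets are instances of the same class."""
    return first.__class__ == second.__class__


a = IUPACUnambiguousDNA()
b = IUPACUnambiguousDNA()

# Without __eq__, two separate instances are never equal ...
instances_equal = (a == b)
# ... but comparing their classes still treats them as compatible.
classes_match = same_alphabet_class(a, b)
```

Here instances_equal is False and classes_match is True, which matches the behaviour described in the bug report: different instances of the same alphabet class should be accepted. As comment 4 notes, this says nothing about alphabets of different classes that might still be considered equivalent.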
From bugzilla-daemon at portal.open-bio.org Fri Mar 14 12:19:40 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Mar 2008 12:19:40 -0400
Subject: [Biopython-dev] [Bug 2447] EUtils cannot parse PubMed XML for ACS journals
In-Reply-To:
Message-ID: <200803141619.m2EGJeIC003283@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2447

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-14 12:19 EST -------
Created an attachment (id=878)
 --> (http://bugzilla.open-bio.org/attachment.cgi?id=878&action=view)
Patch to Bio/EUtils/parse.py

I'm not sure if this is the best way to fix this, but it does appear to solve the reported problem. Can you give this a try, Noel?

From bugzilla-daemon at portal.open-bio.org Sat Mar 15 16:36:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 16:36:06 -0400
Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS?
In-Reply-To:
Message-ID: <200803152036.m2FKa6xR029284@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2363

biopython-bugzilla at maubp.freeserve.co.uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

------- Comment #7 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-15 16:36 EST -------
I've just done a clean checkout and build on Windows, and run the test suite, and built the tutorial as PDF.
I didn't run into any text/binary issues, so this seems to be fixed now :)

From bugzilla-daemon at portal.open-bio.org Sat Mar 15 16:49:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 16:49:06 -0400
Subject: [Biopython-dev] [Bug 2469] New: requires_wise.py fails on Windows (test suite)
Message-ID:

http://bugzilla.open-bio.org/show_bug.cgi?id=2469

           Summary: requires_wise.py fails on Windows (test suite)
           Product: Biopython
           Version: 1.44
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk

On my Windows XP machine, I don't have wise installed, so the dnal command doesn't work:

    C:\TEMP\>dnal
    'dnal' is not recognized as an internal or external command,
    operable program or batch file.

When running the unit test suite, test_Wise.py SHOULD fail with a missing external dependency error - instead it tries to run and fails with an assertion error.

The problem is that requires_wise.py fails... which seems to be an issue with the commands.getoutput() function not working on Windows; it's Unix-only according to:
http://www.python.org/doc/current/lib/module-commands.html

Annoyingly, the commands module is present on Windows (or at least Python 2.3) but simply doesn't work due to calling this:

    os.popen('{ ' + cmd + '; } 2>&1', 'r')

As a result,

    >>> commands.getoutput("xyz")
    "'{' is not recognized as an internal or external command,\noperable program or batch file."

Assuming wise/dnal actually works on Windows, we need to use something other than commands.getoutput("dnal") to check for it.
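One portable way to check for an external program without going through a Unix shell might look like the following. This is only a sketch in current Python 3 (where the commands module no longer exists), not the fix that went into requires_wise.py; find_tool and tool_output are hypothetical helper names, while shutil.which and subprocess.run are standard library calls.

```python
import shutil
import subprocess
import sys


def find_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


def tool_output(args):
    """Run a command without a shell; return stdout and stderr as one string."""
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return completed.stdout


# A dependency check for the test suite could then be as simple as this:
dnal_path = find_tool("dnal")
wise_available = dnal_path is not None

# Example of capturing output from a program that is known to exist.
python_banner = tool_output([sys.executable, "--version"])
```

Because no shell is involved, the '{ cmd; } 2>&1' construct that breaks on Windows is avoided, and a missing program shows up as None from find_tool instead of as an error message mixed into the captured output.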
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 19:40:54 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 19:40:54 -0400
Subject: [Biopython-dev] [Bug 2469] requires_wise.py fails on Windows (test suite)
In-Reply-To:
Message-ID: <200803152340.m2FNesQl005388@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2469

------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-15 19:40 EST -------
You can download wise2 from ftp://ftp.ebi.ac.uk/pub/software/unix/wise2/ and compile it under Windows XP using cygwin, but its own tests fail - I'm not sure why. Carrying on regardless, test_Wise.py still doesn't work for me :(

P.S. Cornell University have packaged wise2 for Windows (found via Google, I haven't tried this):
http://www.tc.cornell.edu/WBA/

From bugzilla-daemon at portal.open-bio.org Sun Mar 16 17:03:59 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 16 Mar 2008 17:03:59 -0400
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200803162103.m2GL3x4u021735@portal.open-bio.org>

http://bugzilla.open-bio.org/show_bug.cgi?id=2422

------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-16 17:03 EST -------
In the old code, when the species wasn't already recorded in the taxon/taxon_name tables, we would add it and its parent lineage entries.
See also http://lists.open-bio.org/pipermail/biosql-l/2008-March/001196.html There are a few problems in the old code, exposed in the unit tests, but I think I have this working again now. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sun Mar 16 17:25:02 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 16 Mar 2008 17:25:02 -0400 Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id In-Reply-To: Message-ID: <200803162125.m2GLP2Oq023867@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2422 ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-16 17:25 EST ------- Created an attachment (id=879) --> (http://bugzilla.open-bio.org/attachment.cgi?id=879&action=view) patch to BioSQL/Loader.py Possible patch - the two BioSQL unit tests pass with this. I have not had a chance to try this in combination with a taxonomy table pre-populated by load_ncbi_taxonomy.pl -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From mjldehoon at yahoo.com Sun Mar 16 22:43:55 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sun, 16 Mar 2008 19:43:55 -0700 (PDT) Subject: [Biopython-dev] [BioSQL-l] Loading sequences with novel NCBI taxon id In-Reply-To: <002201c88705$70780840$6400a8c0@Gecko> Message-ID: <121379.45887.qm@web62403.mail.re1.yahoo.com> > Thank you for your mail recommending the usage of NCBI.WWW. > I have modified my class/script accordingly to your suggestion > without problem. Once 1.45 is out, I will change for NCBI.Entrez > as you informed me. 
Just to avoid any confusion: In Biopython 1.45, the module will be "Bio.Entrez", not "Bio.NCBI.Entrez".

> In any case, I do not pretend having a fantastic piece of code, but it gets
> the job done. If you find this interesting, I would be pleased to contribute
> to BioPython.

Bio.Entrez will need some parsers to parse the XML results, although that probably won't happen before the 1.45 release. I think your script could be very useful when writing those parsers. Could you open a bug report on Bugzilla and upload your script there? Beware: to upload a script to Bugzilla, you need to create a bug report first, and then upload the script as a separate step.

Thanks!

--Michiel.

Eric Gibert wrote:
Dear Peter,

Regarding the update of the BioSQL tables taxon and taxon_name, I have created a class "TaxonUpdate" (how original!) which does two things:

1) as a class itself, it will fetch the taxon's information from NCBI as XML based on the taxon_id passed to the constructor, parse the returned XML answer to get the genus, class, order, family (10 levels) and update that in the taxon table. If taxon_name needs an update/insert, it does that too.

2) run as an independent script (__main__), it will look for all species in the taxon table for which the genus (parent) does not have a ncbi_taxon_id (i.e. is NULL, as this is the current result after adding a new sequence in BioSQL). For all those incomplete records found, it will perform the update as in (1).

After the addition of a new sequence in a BioSQL database, a simple call of this code (passing the taxon_id) will do the updating job.

Dear Michiel,

Thank you for your mail recommending the usage of NCBI.WWW. I have modified my class/script according to your suggestion without problem. Once 1.45 is out, I will change to NCBI.Entrez as you informed me.

In any case, I do not pretend having a fantastic piece of code, but it gets the job done. If you find this interesting, I would be pleased to contribute to BioPython.
Eric -----Original Message----- From: biosql-l-bounces at lists.open-bio.org [mailto:biosql-l-bounces at lists.open-bio.org] On Behalf Of Peter Sent: Thursday, March 13, 2008 11:06 PM To: BioSQL Subject: [BioSQL-l] Loading sequences with novel NCBI taxon id Dear list, One of the unresolved issues with Biopython's BioSQL interface is dealing with the NCBI taxon ID when loading sequences into the database. As I understand it, ideally before loading any sequences, the user will have loaded in the entire NCBI taxonomy using the load_ncbi_taxonomy.pl script, as I described here: http://biopython.org/wiki/BioSQL#NCBI_Taxonomy When a new sequence is added to the database with a known taxon id, there is no problem. But what happens if it's a recently sequenced organism which isn't defined yet in the BioSQL taxonomy tables? Could/should the user re-run load_ncbi_taxonomy.pl, and then load in their new sequence? Right now in Biopython, due to what appears to have been intended as a short term hack, we simply don't record the taxon id at all (!), and I would like to fix this (bug 2422). http://bugzilla.open-bio.org/show_bug.cgi?id=2422 How do BioPerl et al deal with this issue? Do they try and update the taxonomy tables using the available information in the new record's annotation (i.e. the new taxon id and the species name)? Do they look up the NCBI taxonomy definition via the internet? Do they throw an error and halt? Thanks, Peter (Biopython) _______________________________________________ BioSQL-l mailing list BioSQL-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biosql-l 
From bugzilla-daemon at portal.open-bio.org Mon Mar 17 07:47:46 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Mar 2008 07:47:46 -0400 Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI In-Reply-To: Message-ID: <200803171147.m2HBlksw008865@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2468 ------- Comment #3 from mmokrejs at ribosome.natur.cuni.cz 2008-03-17 07:47 EST ------- Hi Peter, yes this would be more helpful. Unfortunately I did the one-time job with parsing the HTML output and re-running wget to fetch the final HTML page, stripped HTML formatting and was done. I will upload my two crappy scripts. They work but should be re-written to utilize the XML outputs you have mentioned. The second URL from your last comment should have different values for some parameters to yield another XML page: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=41073&report=sgml&mode=xml That returns me: 41073 cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria; Coelomata; Protostomia; Panarthropoda; Arthropoda; Mandibulata; Pancrustacea; Hexapoda; Insecta; Dicondylia; Pterygota; Neoptera; Endopterygota; Coleoptera; Adephaga Maybe I will find the time to rewrite them for the purpose of tutorial to use the XMLs. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
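For anyone rewriting such scripts against the XML output, here is a minimal sketch of pulling the taxon id and the Lineage line out of an efetch taxonomy reply using only the Python standard library. The element names (TaxaSet, Taxon, TaxId, Lineage) and the trimmed sample document are assumptions for illustration, not a copy of the real reply, and extract_lineage is a hypothetical helper rather than anything in Biopython.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for an efetch taxonomy reply.
# The element names are assumptions; a real reply carries many more fields.
SAMPLE_XML = """<TaxaSet>
  <Taxon>
    <TaxId>41073</TaxId>
    <Lineage>cellular organisms; Eukaryota; Metazoa; Arthropoda; Insecta; Coleoptera; Adephaga</Lineage>
  </Taxon>
</TaxaSet>"""


def extract_lineage(xml_text):
    """Return (taxon_id, list of lineage names) for the first Taxon element."""
    root = ET.fromstring(xml_text)
    taxon = root.find("Taxon")
    if taxon is None:
        raise ValueError("no <Taxon> element found")
    tax_id = int(taxon.findtext("TaxId"))
    lineage_text = taxon.findtext("Lineage", default="")
    # The lineage is a single "; "-separated string, root of the tree first.
    lineage = [name.strip() for name in lineage_text.split(";") if name.strip()]
    return tax_id, lineage


if __name__ == "__main__":
    tax_id, lineage = extract_lineage(SAMPLE_XML)
    print(tax_id, " > ".join(lineage))
```

Parsing the XML this way avoids the HTML scraping step entirely; the network fetch itself is left out so the sketch runs offline.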
From bugzilla-daemon at portal.open-bio.org Mon Mar 17 09:18:35 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Mar 2008 09:18:35 -0400 Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI In-Reply-To: Message-ID: <200803171318.m2HDIZYX014608@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2468 ------- Comment #4 from mmokrejs at ribosome.natur.cuni.cz 2008-03-17 09:18 EST ------- Created an attachment (id=880) --> (http://bugzilla.open-bio.org/attachment.cgi?id=880&action=view) taxfetch.py This program/module can fetch for the user the Lineage line. The query() function uses the deprecated biopython API while the efetch uses the other. Queries get cached in a local file taxonomycache.db for speed. Users can call either of the two functions from external python code. Feel free to use the code in Tutorial or even bundle in any form into the package. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From tiagoantao at gmail.com Mon Mar 17 18:30:38 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 17 Mar 2008 22:30:38 +0000 Subject: [Biopython-dev] Bio.PopGen status Message-ID: <6d941f120803171530g36504759g4b3cf835065e17b8@mail.gmail.com> Hi, This is a short email regarding Bio.PopGen status. 1. All the code on the repository should be stable. 2. The Biopython version that is scheduled for release soon will have support for coalescent population genetics' simulations. 3. A small number of test cases are included. 4. Documentation was produced and is available on the Tutorial. I believe that it is satisfactory (tell me if you disagree). 5. Bio.PopGen is still not "version 1" in the sense that the fundamental statistics code is missing. 
This was a conscious strategy to start with selection detection and coalescent simulation in order to begin with arguably less important stuff so that newbie errors (in the sense that I was a newbie developer to biopython) would have less impact. 6. Statistics is my next task and hopefully will coincide with the biopython release after this one. This will be, at least for me, "version 1" of Bio.PopGen. 7. In the code, there is, since the original Bio.PopGen, code that is able to execute external simulators in parallel (thus taking advantage of multi-core architectures for computationally intensive simulations). This is, unfortunately, not documented. I will document this (maybe in a separate document from the tutorial) in the future. I don't think this is priority 1. But others might be interested in using this code for computationally intensive tasks using external programs. In case you want to know more details about this, please say so. From a biopython release perspective, Bio.PopGen with new coalescent simulation features is fully ready. Please go ahead and release whenever is most convenient. 
-- http://www.tiago.org/ps From bugzilla-daemon at portal.open-bio.org Thu Mar 20 06:23:35 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 20 Mar 2008 06:23:35 -0400 Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id In-Reply-To: Message-ID: <200803201023.m2KANZun010097@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2422 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-20 06:23 EST ------- Patch checked in as BioSQL/Loader.py revision 1.28 Unit tests passed on both Windows XP and Linux (using MySQL) Note that once we have added "provisional" entries to the taxon/taxon_name table based on the record annotation, load_ncbi_taxonomy.pl should be able to tidy things up using the NCBI taxonomy. At least it should once BioSQL bug 2470 is fixed. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From biopython at maubp.freeserve.co.uk Thu Mar 20 16:14:50 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 20 Mar 2008 20:14:50 +0000 Subject: [Biopython-dev] Old Biopython code for EBI Bibliographics services In-Reply-To: <320fb6e00803170055n18457967n27d1b07eaa6cb522@mail.gmail.com> References: <320fb6e00803140944v13f241b9icc0e911643f234cd@mail.gmail.com> <47DE0C22.9040202@netsys.co.za> <320fb6e00803170049g79960e14u8c1417fcdc99a0d5@mail.gmail.com> <320fb6e00803170055n18457967n27d1b07eaa6cb522@mail.gmail.com> Message-ID: <320fb6e00803201314j53b47a35x33de02cb685d2c14@mail.gmail.com> I posted the following email on the mail discussion mailing list, and haven't seen any replies. 
Should we mark Bio.biblio as deprecated now (before the imminent release)? Peter On Mon, Mar 17, 2008 at 7:55 AM, Peter wrote: > Dear list, > > We have an old module Bio/biblio.py written by Tiaan Wessels back in > 2002 (during a South African hackathon). This is code to use some EBI > Bibliographics services, but currently no longer works. At the very > least, the EBI have changed the URLs for their SOAP services. I got > in touch with the author by email, and he no longer uses the code and > thought we could remove it. > > Does anyone on the list still use Bio/biblio.py? > > Would anyone like to take a more in depth look at the code, and the > current EBI web API, and see if there is anything in Bio.biblio worth > keeping? > > If not, I'm proposing we mark this as deprecated for the next release > of Biopython. > > Thanks, > > Peter > From mjldehoon at yahoo.com Thu Mar 20 22:08:56 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 20 Mar 2008 19:08:56 -0700 (PDT) Subject: [Biopython-dev] Old Biopython code for EBI Bibliographics services In-Reply-To: <320fb6e00803201314j53b47a35x33de02cb685d2c14@mail.gmail.com> Message-ID: <242823.17441.qm@web62402.mail.re1.yahoo.com> > Should we mark Bio.biblio as deprecated now (before the imminent release)? Yes. It's just a deprecation; the code will still be usable. The deprecation warning should contain a notice to contact us in case somebody is still using this code. If not, it's better to deprecate it and remove it in some future release. Keeping Biopython clean is important. --Michiel. Peter wrote: I posted the following email on the mail discussion mailing list, and haven't seen any replies. Should we mark Bio.biblio as deprecated now (before the imminent release)? Peter On Mon, Mar 17, 2008 at 7:55 AM, Peter wrote: > Dear list, > > We have an old module Bio/biblio.py written by Tiaan Wessels back in > 2002 (during a South African hackathon). 
This is code to use some EBI > Bibliographics services, but currently no longer works. At the very > least, the EBI have changed the URLs for their SOAP services. I got > in touch with the author by email, and he no longer uses the code and > thought we could remove it. > > Does anyone on the list still use Bio/biblio.py? > > Would anyone like to take a more in depth look at the code, and the > current EBI web API, and see if there is anything in Bio.biblio worth > keeping? > > If not, I'm proposing we mark this as deprecated for the next release > of Biopython. > > Thanks, > > Peter > _______________________________________________ Biopython-dev mailing list Biopython-dev at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biopython-dev --------------------------------- Never miss a thing. Make Yahoo your homepage. From mjldehoon at yahoo.com Fri Mar 21 07:57:02 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 21 Mar 2008 04:57:02 -0700 (PDT) Subject: [Biopython-dev] CVS freeze for release Message-ID: <621627.53345.qm@web62408.mail.re1.yahoo.com> Hi everybody, I'll start making release 1.45 from now. Please don't touch CVS until after the release is out. Thanks! --Michiel. --------------------------------- Never miss a thing. Make Yahoo your homepage. From biopython at maubp.freeserve.co.uk Fri Mar 21 08:51:01 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 21 Mar 2008 12:51:01 +0000 Subject: [Biopython-dev] CVS freeze for release In-Reply-To: <621627.53345.qm@web62408.mail.re1.yahoo.com> References: <621627.53345.qm@web62408.mail.re1.yahoo.com> Message-ID: <320fb6e00803210551o4644fc34meaadf9e521b087fe@mail.gmail.com> Michiel de Hoon wrote: > Hi everybody, > > I'll start making release 1.45 from now. Please don't touch CVS until after > the release is out. Thanks! Good news :) I did check in some comment changes to BioSQL this morning, and the Bio.biblio deprecation, but that was a few hours ago. 
Peter From meanerelk at gmail.com Fri Mar 21 14:28:24 2008 From: meanerelk at gmail.com (Kemal) Date: Fri, 21 Mar 2008 14:28:24 -0400 Subject: [Biopython-dev] mentor for google summer of code Message-ID: I am a university student interested in adding phyloXML support to BioPython for the Google Summer of Code. Would any developers be willing to mentor this project? I have been discussing it with Hilmar Lapp, who is mentoring similar projects for the Phyloinformatics Summer of Code project at the National Evolutionary Synthesis Center. Their page is at: https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Code_2008 A mentor would be responsible for monitoring the project's progress over the summer and for evaluating the work at the end. Google's guidelines estimate that this would take about 5 hours/week per student. There is more information at: http://code.google.com/opensource/gsoc/2008/faqs.html If anyone is interested, I would love to discuss the details of the proposal. Thank you, Kemal Eren From biopython at maubp.freeserve.co.uk Fri Mar 21 15:04:06 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 21 Mar 2008 19:04:06 +0000 Subject: [Biopython-dev] mentor for google summer of code In-Reply-To: References: Message-ID: <320fb6e00803211204q2d0f3696pf5baaf44122a0869@mail.gmail.com> Hi Kemal, On Fri, Mar 21, 2008 at 6:28 PM, Kemal wrote: > I am a university student interested in adding phyloXML support to BioPython > for the Google Summer of Code. Would any developers be willing to mentor > this project? I have been discussing it with Hilmar Lapp, who is mentoring > similar projects for the Phyloinformatics Summer of Code project at the > National Evolutionary Synthesis Center. Their page is at: > > https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Code_2008 I see there are similar projects already planned for phyloXML with BioPerl and BioRuby. For Biopython I guess building on Frank Kauff and Cymon J. 
Cox's Bio.Nexus module would be the most logical option. Have you had a chance to look at any of the Biopython code? > A mentor would would be responsible for monitoring the project's progress > over the summer, and to evaluate the work at the end. Google's guidelines > estimate that this would take about 5 hours/week per student. There is more > information at: > > http://code.google.com/opensource/gsoc/2008/faqs.html > > If anyone is interested, I would love to discuss the details of the > proposal. It would be worth trying to contact Frank and Cymon directly - and seeing if they would be interested. Peter (one of the current Biopython developers) From chris.lasher at gmail.com Fri Mar 21 17:11:40 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Fri, 21 Mar 2008 17:11:40 -0400 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com> Message-ID: <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com> On Tue, Mar 11, 2008 at 8:37 PM, Peter wrote: > Hi Chris, > > I haven't heard anything about the CVS to SVN move recently. Did > anyone resolve the multiple password prompt niggle? That's still unresolved. The workaround is to place an SSH key on dev.open-bio.org. If you do, you'll notice even then it still makes two attempts to log in. /shrugs > On another point, is the test SVN repository intended to be writable > (for those of us with developer access)? I really should try running > some things like "svn diff" and committing sample changes to get a > feel for how it compares to CVN. I have tried committing to it and gotten a "Permission denied" error. It must only be set as read-only for group permissions. 
Really sorry for my delay on getting to this email. Now that Biopython has had another release, should we really push hard to switch to SVN? Chris From biopython-dev at maubp.freeserve.co.uk Fri Mar 21 17:37:45 2008 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Fri, 21 Mar 2008 21:37:45 +0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com> <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com> Message-ID: <320fb6e00803211437i79c1454m52ba4172728032c8@mail.gmail.com> On Fri, Mar 21, 2008 at 9:11 PM, Chris Lasher wrote: > On Tue, Mar 11, 2008 at 8:37 PM, Peter wrote: > > Hi Chris, > > > > I haven't heard anything about the CVS to SVN move recently. Did > > anyone resolve the multiple password prompt niggle? > > That's still unresolved. The workaround is to place an SSH key on > dev.open-bio.org. If you do, you'll notice even then it still makes > two attempts to log in. /shrugs If someone's documented this from the previous Bio* migrations, then I guess we'll live with it. > > On another point, is the test SVN repository intended to be writable > > (for those of us with developer access)? I really should try running > > some things like "svn diff" and committing sample changes to get a > > feel for how it compares to CVN. > > I have tried committing to it and gotten a "Permission denied" error. > It must only be set as read-only for group permissions. I never did get round to trying myself... > Really sorry for my delay on getting to this email. Now that Biopython > has had another release, should we really push hard to switch to SVN? 
Well, Michiel declared a CVS freeze this morning and is preparing Biopython 1.45 as we speak. Once the release is out sounds like a good time for the SVN move to me. Peter From peter.bulychev at gmail.com Fri Mar 21 19:50:23 2008 From: peter.bulychev at gmail.com (Peter Bulychev) Date: Sat, 22 Mar 2008 02:50:23 +0300 Subject: [Biopython-dev] results of applying Clone Digger to the sources of BioPython project Message-ID: Hello. Clone Digger project is aimed to find software clones (duplicate code) in Python and Java programs. I have applied it to the source of BioPython and discovered several clone candidates. There are a lot of false positives caused by similar code in nlmmedline_*_format.py files, but maybe other clone candidates will be interesting for you. The results can be seen here: http://clonedigger.sourceforge.net/examples.html -- Best regards, Peter Bulychev. From sbassi at gmail.com Fri Mar 21 23:49:52 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Sat, 22 Mar 2008 00:49:52 -0300 Subject: [Biopython-dev] CVS freeze for release In-Reply-To: <621627.53345.qm@web62408.mail.re1.yahoo.com> References: <621627.53345.qm@web62408.mail.re1.yahoo.com> Message-ID: On Fri, Mar 21, 2008 at 8:57 AM, Michiel de Hoon wrote: > Hi everybody, > I'll start making release 1.45 from now. Please don't touch CVS until after the release is out. Thanks! I have a proposal that could be implemented in the next version (1.46?). Change the output of EZRetrieve.retrieve_single. It currently returns a FASTA formatted sequence. I think it should return a SeqRecord object (if you want this SeqRecord object to be printed or stored as FASTA, just use formatIO). Here are the proposed changes: http://www.pastecode.com.ar/f3baff314 I can file this as an enhancement in the bug tracker if you agree. Best, SB. 
From mjldehoon at yahoo.com Sat Mar 22 07:02:38 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sat, 22 Mar 2008 04:02:38 -0700 (PDT) Subject: [Biopython-dev] Biopython release 1.45 Message-ID: <901773.64728.qm@web62408.mail.re1.yahoo.com> We are pleased to announce the release of Biopython 1.45. This release includes numerous code improvements and fixes, including in Bio.Seq, Bio.SeqIO, Bio.Entrez, Bio.PopGen, Bio.SwissProt, Bio.Cluster, Bio.SCOP, Bio.InterPro, Bio.GenBank, Bio.ExPASy, BioSQL, and the Biopython documentation. Too many to list them all here! Source distributions and Windows installers are available from the Biopython website at http://biopython.org. My thanks to all code contributors who made this new release possible. --Michiel on behalf of the Biopython developers. From biopython-dev at maubp.freeserve.co.uk Sat Mar 22 07:18:42 2008 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Sat, 22 Mar 2008 11:18:42 +0000 Subject: [Biopython-dev] EZRetrieve Message-ID: <320fb6e00803220418n348d1953v9846af9d04abc04c@mail.gmail.com> > I have a proposal that could be implemented in the next version (1.46?). > Change the output of EZRetrieve.retrieve_single. It currently returns > a FASTA formatted sequence. I think it should return a SeqRecord object > (if you want this SeqRecord object to be printed or stored as FASTA, > just use formatIO). > Here are the proposed changes: http://www.pastecode.com.ar/f3baff314 > I can file this as an enhancement in the bug tracker if you agree. So there is currently one function, retrieve_single, which can return a handle but by default extracts and returns a FASTA record as a string. 
It does this by calling the parse_single function, which reads in the handle, parses the HTML file, and extracts just the FASTA style text, throwing away the other annotation data (like the chromosome or range requested). Here is an example URL constructed by hand, http://siriusb.umdnj.edu:18080/EZRetrieve/single_r_run.jsp?org=0&AccType=0&input=BC014651&from=-200&to=200 Parsing HTML is nasty - especially if the site updates the formatting every so often. I suppose just looking for the FASTA sequence is fairly reliable. I can see the case for an EZRetrieve HTML to SeqRecord parser, but I would be tempted to try and parse more of the annotation. How many people do you think are using the retrieve_single function? It would be very annoying for them if its behaviour suddenly changed. Maybe we can add a new parse function, and call it from retrieve_single if the optional argument parse=2? Peter From biopython at maubp.freeserve.co.uk Sat Mar 22 07:35:39 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 22 Mar 2008 11:35:39 +0000 Subject: [Biopython-dev] results of applying Clone Digger to the sources of BioPython project In-Reply-To: References: Message-ID: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com> > Hello. > > Clone Digger project is aimed to find software clones (duplicate code) in > Python and Java programs. > > I have applied it to the source of BioPython and discovered several clone > candidates. > > There are a lot of false positives caused by similar code in > nlmmedline_*_format.py files, but maybe other clone candidates will be > interesting for you. > > The results can be seen here: > http://clonedigger.sourceforge.net/examples.html Interesting. Does your tool know to ignore deprecated modules? e.g. when we have essentially copied a file from one location to another, and deprecated the original. 
Some of these are from scanner/consumer parsers where there are two alternative consumers turning the data into different object representations. Other things like providing dictionary like objects seem to be reusing a lot of "boiler plate" code, and could probably be rationalised into a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both sharing apparently identical catalyse() methods may fall into the same class. Peter From peter.bulychev at gmail.com Sat Mar 22 17:31:30 2008 From: peter.bulychev at gmail.com (Peter Bulychev) Date: Sun, 23 Mar 2008 00:31:30 +0300 Subject: [Biopython-dev] results of applying Clone Digger to the sources of BioPython project In-Reply-To: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com> References: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com> Message-ID: Hello. No, unfortunately Clone Digger can not ignore deprecated modules. In order to obtain betters results automatically generated code and tests should be removed from the searched source tree by hands. Other things like providing dictionary like objects seem to be reusing > a lot of "boiler plate" code, and could probably be rationalised into > a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and > Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py > > Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both > sharing apparently identical catalyse() methods may fall into the same > class. > > I think this is the main purpose of Clone Digger: to find clone candidates and to help to create recommendations for refactoring. 2008/3/22, Peter : > > > Hello. > > > > Clone Digger project is aimed to find software clones (duplicate code) > in > > Python and Java programs. > > > > I have applied it to the source of BioPython and discovered several > clone > > candidates. 
> > > > There are a lot of false positives caused by similar code in > > nlmmedline_*_format.py files, but maybe other clone candidates will be > > interesting for you. > > > > The results can be seen here: > > http://clonedigger.sourceforge.net/examples.html > > > Interesting. Does your tool know to ignore deprecated modules? e.g. > when we have essentially copied a file from one location to another, a > deprecated the original. > > Some of these are from scanner/consumer parsers where there are two > alternative consumers turning the data into different object > representations. > > Other things like providing dictionary like objects seem to be reusing > a lot of "boiler plate" code, and could probably be rationalised into > a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and > Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py > > Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both > sharing apparently identical catalyse() methods may fall into the same > class. > > > Peter > -- Best regards, Peter Bulychev. From sbassi at gmail.com Tue Mar 25 16:46:48 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Tue, 25 Mar 2008 17:46:48 -0300 Subject: [Biopython-dev] Can't login into wiki Message-ID: Hello, I press the link to login into the wiki (http://biopython.org/w/index.php?title=Special:Userlogin&returnto=Biopython) but I am redirected to the same page without a login prompt. I found that this URL is dead (404): http://biopython.org/DIST/docs/api/public/trees.html (and it is linked from http://biopython.org/wiki/Getting_Started , last link). 
-- Molecular biology course for programmers: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Free Python tutorial: http://tinyurl.com/2az5d5 From biopython at maubp.freeserve.co.uk Tue Mar 25 16:55:23 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 25 Mar 2008 20:55:23 +0000 Subject: [Biopython-dev] Can't login into wiki In-Reply-To: References: Message-ID: <320fb6e00803251355i7c0102d3l90e5d4680282922b@mail.gmail.com> On Tue, Mar 25, 2008 at 8:46 PM, Sebastian Bassi wrote: > Hello, > > I press the link to login into the wiki > (http://biopython.org/w/index.php?title=Special:Userlogin&returnto=Biopython) > but I am redirected to the same page without a login prompt. It's not just you - the wiki is being a bit odd for me too right now, empty PHP pages etc. Maybe it needs rebooting again... which I think happens automatically every so often. If it doesn't clear up I'll email the OBF guys tomorrow. > I found that this URL is dead (404): > http://biopython.org/DIST/docs/api/public/trees.html > (and it is linked from http://biopython.org/wiki/Getting_Started , last link). It should probably be http://biopython.org/DIST/docs/api/ (the link documentation page is fine). Peter From mjldehoon at yahoo.com Tue Mar 25 19:55:20 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Tue, 25 Mar 2008 16:55:20 -0700 (PDT) Subject: [Biopython-dev] Can't login into wiki In-Reply-To: <320fb6e00803251355i7c0102d3l90e5d4680282922b@mail.gmail.com> Message-ID: <912788.63526.qm@web62415.mail.re1.yahoo.com> Peter wrote: > I found that this URL is dead (404): > http://biopython.org/DIST/docs/api/public/trees.html > (and it is linked from http://biopython.org/wiki/Getting_Started , last link). It should probably be http://biopython.org/DIST/docs/api/ (the link documentation page is fine). I fixed this link now. --Michiel 
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 08:24:09 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 08:24:09 -0400 Subject: [Biopython-dev] [Bug 2475] New: BioSQL.Loader should reuse existing taxon entries in lineage Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2475 Summary: BioSQL.Loader should reuse existing taxon entries in lineage Product: Biopython Version: Not Applicable Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: BioSQL AssignedTo: biopython-dev at biopython.org ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk Based on a report on the mailing list by Eric Gibert, http://lists.open-bio.org/pipermail/biopython/2008-March/004137.html http://lists.open-bio.org/pipermail/biopython/2008-March/004147.html The _get_taxon_id() function will add new entries to the taxon and taxon_name tables when a species isn't already defined. It will also generate entries for the lineage (for which we don't know the NCBI taxon names). At this point it *should* be re-using any existing entries for elements of the lineage. Note - this is complicated due to the re-use of the same latin names in different classes. It might be easier/safer just not to write the lineage at all? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 08:34:40 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 08:34:40 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803261234.m2QCYekn009310@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 08:34 EST ------- See also Bug 2422 and this thread on the BioSQL mailing list: http://lists.open-bio.org/pipermail/biosql-l/2008-March/001196.html In particular Hilmar Lapp from BioSQL wrote in reply to trying to reuse existing taxon table entries based on string matching to the scientific name field in the taxon_name table, which I said sounded a little unreliable: > It's pretty unreliable actually. There is not only synonymy > but also rampant homonymy in taxonomic names. There are > plenty of examples for the same scientific name in use for a > plant and for some animal, for example. So in order to be > unambiguous you will need to know (and check) the kingdom. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 08:44:01 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 08:44:01 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803261244.m2QCi15R009864@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 08:44 EST ------- Created an attachment (id=883) --> (http://bugzilla.open-bio.org/attachment.cgi?id=883&action=view) Patch to BioSQL/Loader.py to not record the lineage for new species This patch takes the simple route out - when loading a sequence into the database with a new species (not already in the taxon tables), we ONLY add the new species to the taxon and taxon_name tables. This DOES NOT attempt to record the whole lineage, adding or reusing existing taxon entries. Both the test_BioSQL and test_BioSQL_SeqIO unit tests still pass with this. I prefer this solution as it avoids any ambiguous heuristics in matching existing taxon names based on string comparisons. This does mean Biopython won't match BioPerl in this regard, as I understand that BioPerl currently tries to record the full lineage. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 14:51:18 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 14:51:18 -0400 Subject: [Biopython-dev] [Bug 2477] New: SeqIO.parse does not handle embl files Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2477 Summary: SeqIO.parse does not handle embl files Product: Biopython Version: Not Applicable Platform: Macintosh OS/Version: Mac OS Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: p.foster at nhm.ac.uk This is in 1.45, but I did not see it in 1.43. (1.45 is not a Bugzilla option at the moment ...) If fh is a handle to an embl format file, then SeqIO.parse(fh, 'embl') dies. It worked (not perfectly) in 1.43. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Mar 26 15:21:41 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 15:21:41 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803261921.m2QJLfbe007389@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Version|Not Applicable |1.45 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 15:21 EST ------- I've fixed the Bugzilla version field - thanks for the reminder. Could you give more information please? e.g. a specific EMBL file, and the error you are seeing. Thanks, Peter. 
-- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Mar 27 03:59:54 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 03:59:54 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803270759.m2R7xsXA006767@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 ------- Comment #2 from p.foster at nhm.ac.uk 2008-03-27 03:59 EST ------- Created an attachment (id=888) --> (http://bugzilla.open-bio.org/attachment.cgi?id=888&action=view) test case It is a multi-bug. There is a bug that prevents 1.45 from reading embl files, and there is another bug, visible in 1.43 (at least) where it at least parses embl files, but imperfectly. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Mar 27 06:49:33 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 06:49:33 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803271049.m2RAnXpj015624@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 06:49 EST ------- Thanks for the clarification. I can reproduce the problem here. It looks like they may have tweaked their file format slightly. 
Biopython will be ignoring the apparently new PA line, which isn't described here: http://www.ebi.ac.uk/webin-align/fflink2.html You can also fetch the first problem record from their webpage, choose "Save", "ASCII text/table", "complete entries" http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-e+[EMBLCDS:AAA03323]+-newId As a minor point, personally I find the following style simpler:

    from Bio import SeqIO
    fName = 'twoEmblRecords.embl'
    f = open(fName)
    s = SeqIO.parse(f, 'embl')
    for rec in s:
        print(rec.description)
        print(rec.annotations['taxonomy'])
    f.close()

(you may of course have good reason for using the .next() method explicitly) I'll take a look at this bug now... -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Mar 27 07:37:16 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 07:37:16 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803271137.m2RBbGvg018455@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 07:37 EST ------- As you said, this is a multi-part bug! To try this out, you will need to update files Bio/GenBank/Scanner.py and __init__.py which are now in CVS. If you are not familiar with CVS, the easier method would be to download the two files from here: http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/?cvsroot=biopython#dirlist Note there is an hour or so time delay before it will show my changes. You can see where the files should be put from the stack trace. Please let me know how you get on (by posting on this bug). 
Missing AC lines ================ All our EMBL test cases tested included an AC line, and Biopython 1.45 was failing because of the missing AC line in your example, which was used to set the SeqRecord's id property. I have updated CVS to fall back on the ID line. Multiple DE lines ================= Already fixed as of Biopython 1.44 Multiple OC lines ================= Updated Biopython CVS to cope with multi-line taxonomy lineage PA lines (parent accessions) ============================ You didn't report this, but we currently are ignoring the PA lines. Quoting ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt PA line - contains the accession.version of the "parent" EMBL entry (entry where the CDS is annotated) e.g. a whole contig, not just this one CDS/gene. We could record this in the SeqRecord's annotations dictionary as a list of strings under key 'parent-accessions'. What do you think? Peter -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Thu Mar 27 11:50:36 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 11:50:36 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803271550.m2RFoacs002027@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 ------- Comment #5 from p.foster at nhm.ac.uk 2008-03-27 11:50 EST ------- I got those two files, and they seem to have fixed everything. Thanks muchly. The suggestion of de-ignoring the PA line sounds fine (although I have no use for it at the moment). -Peter F. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
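The AC-to-ID fallback and the proposed 'parent-accessions' annotation from comment #4 can be sketched roughly as below. This is only an illustration under stated assumptions: summarize_embl_header is a hypothetical helper, not the code in Bio/GenBank/Scanner.py, and the PA handling was a suggestion in this thread rather than something Biopython was confirmed to do.

```python
def summarize_embl_header(lines):
    """Return (record_id, annotations) for a list of EMBL header lines.

    Hypothetical helper for illustration only; it assumes the usual EMBL
    layout of a two-letter line code followed by data from column 6.
    """
    entry_name = None
    accessions = []
    parents = []
    for line in lines:
        code, data = line[:2], line[5:].strip()
        if code == "ID" and entry_name is None and data:
            # First field of the ID line, before the first ';'
            entry_name = data.split(";")[0].split()[0]
        elif code == "AC":
            # AC lines hold one or more ';'-separated accessions
            accessions.extend(a.strip() for a in data.split(";") if a.strip())
        elif code == "PA":
            # Parent accession.version of the entry this CDS came from
            parents.append(data.rstrip(";"))
    # Prefer the first accession; fall back on the ID line if there is none
    record_id = accessions[0] if accessions else entry_name
    annotations = {}
    if parents:
        annotations["parent-accessions"] = parents
    return record_id, annotations
```

With no AC line in the input, the identifier comes from the ID line, which is the situation the AAA03323 record exposed.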
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 12:22:13 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 12:22:13 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803271622.m2RGMDWV003784@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 12:22 EST ------- OK, marking as fixed. I also included AAA03323 as a unit test, as we were lacking an example without an AC line. I'll leave the PA line issue alone for the time being; it would be wise to check if there are any parallels in GenBank or SwissProt/UniProt before doing anything so that they are all handled consistently. Thanks for your report Peter. Peter -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sat Mar 29 22:53:41 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 29 Mar 2008 22:53:41 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803300253.m2U2rfLl002179@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #3 from ericgibert at yahoo.fr 2008-03-29 22:53 EST ------- I would like to propose the following solution: 1) add an extra optional parameter to load(): fetchNCBItaxonomy = False --> so no impact on existing code. 
If the user calls the load function with True then: 2) after the species is inserted into the taxon/taxon_name tables, the XML data is fetched from NCBI's taxonomy database 3) the XML data is used to update the taxon/taxon_name tables, respecting the uniqueness of the records. I already have part of the code; I just need to change it so that if a taxon already exists, the new taxon points to the existing one. Comments? Eric -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sun Mar 30 07:41:25 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 30 Mar 2008 07:41:25 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803301141.m2UBfPMC001648@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-30 07:41 EST ------- I quite like the idea of fetching the new taxon information from the NCBI as needed to record an accurate lineage. However, what happens if: (a) The network is down? Raise an exception maybe? (b) The NCBI doesn't have this Taxon ID (i.e. it's invalid, or so new that their database is out of date)? Raise an exception? Eric, could you attach your taxonomy XML code to this bug? We'd probably want to start by adding taxonomy XML parsing to Bio.Entrez (which I assume you are using to fetch the XML data). What about sequences where we don't have a taxon ID, but we do have a species name? (which may happen with a sequence which wasn't read from a GenBank file). -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
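One possible shape for the opt-in lineage fetch discussed in comments #3 and #4 is sketched below. Everything here is an assumption made for illustration: load_taxon, TaxonLookupError and the fetch_lineage callback are hypothetical names, a plain dictionary stands in for the taxon/taxon_name tables, and the callback stands in for whatever actually queries the NCBI.

```python
class TaxonLookupError(Exception):
    """Hypothetical error for a failed or empty taxonomy lookup."""


def load_taxon(taxon_id, known_taxa, fetch_lineage=None):
    """Record taxon_id in known_taxa and, if asked, its lineage too.

    known_taxa maps taxon id -> name (None when the name is unknown).
    fetch_lineage, if given, is a function taking a taxon id and
    returning a list of (taxon_id, name) pairs.
    """
    known_taxa.setdefault(taxon_id, None)
    if fetch_lineage is None:
        # Default: behave as before, with no network access at all
        return
    try:
        lineage = fetch_lineage(taxon_id)
    except IOError as err:
        # Case (a): the network (or the remote service) is down
        raise TaxonLookupError("taxonomy lookup failed: %s" % err)
    if not lineage:
        # Case (b): the remote database does not know this taxon id
        raise TaxonLookupError("taxon %r not found" % (taxon_id,))
    for tid, name in lineage:
        # Reuse an existing named entry rather than overwriting it
        if known_taxa.get(tid) is None:
            known_taxa[tid] = name
```

Keeping the fetch behind an optional argument means existing callers see no change, and both failure cases surface as one exception type that the caller can catch.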
From mjldehoon at yahoo.com Sun Mar 30 10:49:41 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sun, 30 Mar 2008 07:49:41 -0700 (PDT) Subject: [Biopython-dev] Bio.Entrez XML parsing Message-ID: <864047.785.qm@web62410.mail.re1.yahoo.com> > Eric, could you attach your taxonomy XML code to this bug? > We'd probably want to start by adding taxonomy XML parsing > to Bio.Entrez (which I assume you are using to fetch the XML data). I've done some thinking about XML parsers for Bio.Entrez. I propose to add a function read() to Bio.Entrez, which returns a record suitable for the type of XML file we're trying to read (as determined by the corresponding DTD file). Now, the various XML types can be very different from each other, and I think the actual parsing should be done by a specialized submodule of Bio.Entrez. For example, one Bio.Entrez.EInfo, one Bio.Entrez.ESummary, and so on. For Bio.Entrez.EFetch, there seem to be many different XMLs, so we'd probably have a number of submodules for it (one of them for the taxonomy XML). The first tag received by the read() function in Bio.Entrez tells it which type of XML it is receiving (have a look at the XML files shown in chapter 6 of the tutorial for some examples), and can then decide which of the submodules of Bio.Entrez should be used for the actual parsing. Otherwise, the read() function in Bio.Entrez does very little; the actual work is done by the submodules. If the read() function encounters an XML type for which no parser is yet available, it can raise a NotImplementedError exception. Comments, anybody? --Michiel 
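A minimal sketch of the dispatch idea in the proposal above: read() inspects the top-level tag and hands the document to a type-specific parser, raising NotImplementedError when none is known. The two parser functions are hypothetical stand-ins for the proposed submodules, the root tag names eInfoResult and TaxaSet are assumed from typical E-utilities output, and for simplicity the whole document is parsed before dispatching rather than deciding on the first tag as it arrives.

```python
import io
import xml.etree.ElementTree as ET


def _parse_einfo(root):
    # Hypothetical stand-in for a Bio.Entrez.EInfo submodule
    return {"type": "einfo",
            "databases": [e.text for e in root.findall("DbList/DbName")]}


def _parse_taxonomy(root):
    # Hypothetical stand-in for a taxonomy parser under Bio.Entrez.EFetch
    return {"type": "taxonomy",
            "taxids": [e.text for e in root.findall("Taxon/TaxId")]}


# Root tag -> parser; a plain dictionary is just one way to hold the mapping
_PARSERS = {"eInfoResult": _parse_einfo, "TaxaSet": _parse_taxonomy}


def read(handle):
    """Parse an XML handle with whichever parser matches its root tag."""
    root = ET.parse(handle).getroot()
    parser = _PARSERS.get(root.tag)
    if parser is None:
        raise NotImplementedError("no parser for XML type %r" % root.tag)
    return parser(root)


# Usage: a tiny hand-written document, not real NCBI output
handle = io.BytesIO(b"<TaxaSet><Taxon><TaxId>9606</TaxId></Taxon></TaxaSet>")
print(read(handle))
```

As in the proposal, read() itself does very little; each parser owns the details of its XML type.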
From sdavis2 at mail.nih.gov Sun Mar 30 20:51:07 2008 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Sun, 30 Mar 2008 20:51:07 -0400 Subject: [Biopython-dev] Bio.Entrez XML parsing In-Reply-To: <864047.785.qm@web62410.mail.re1.yahoo.com> References: <864047.785.qm@web62410.mail.re1.yahoo.com> Message-ID: <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com> On Sun, Mar 30, 2008 at 10:49 AM, Michiel de Hoon wrote: > > > Eric, could you attach your taxonomy XML code to this bug? > > We'd probably want to start by adding taxonomy XML parsing > > to Bio.Entrez (which I assume you are using to fetch the XML data). > > I've done some thinking about XML parsers for Bio.Entrez. > > I propose to add a function read() to Bio.Entrez, which returns a record suitable for the type of XML file we're trying to read (as determined by the corresponding DTD file). > > Now, the various XML types can be very different from each other, and I think the actual parsing should be done by a specialized submodule of Bio.Entrez. For example, one Bio.Entrez.EInfo, one Bio.Entrez.ESummary, and so on. For Bio.Entrez.EFetch, there seem to be many different XMLs, so we'd probably have a number of submodules for it (one of them for the taxonomy XML). > > The first tag received by the read() function in Bio.Entrez tells it which type of XML it is receiving (have a look at the XML files shown in chapter 6 of the tutorial for some examples), and can then decide which of the submodules of Bio.Entrez should be used for the actual parsing. Otherwise, the read() function in Bio.Entrez does very little; the actual work is done by the submodules. > > If the read() function encounters an XML type for which no parser is yet available, it can raise a NotImplementedError exception. > > Comments, anybody? This makes sense. However, it seems that there needs to be a way to "register" a parser with read() so that users can extend their local installation with a specialized parser. 
In other words, it seems that a way to dynamically register a parser with read() would be helpful. Or am I missing something? Sean From biopython at maubp.freeserve.co.uk Mon Mar 31 07:25:05 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 31 Mar 2008 12:25:05 +0100 Subject: [Biopython-dev] Bio.Entrez XML parsing In-Reply-To: <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com> References: <864047.785.qm@web62410.mail.re1.yahoo.com> <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com> Message-ID: <320fb6e00803310425u478fc938w2ff426c4eae32d99@mail.gmail.com> On Mon, Mar 31, 2008 at 1:51 AM, Sean Davis wrote: > This makes sense. However, it seems that there needs to be a way to > "register" a parser with read() so that users can extend their local > installation with a specialized parser. In other words, it seems that > a way to dynamically register a parser with read() would be helpful. > Or am I missing something? I like Michiel's plan. The mapping could be as simple as a (private) dictionary in Bio.Entrez, mapping formats to parser objects/functions - as done in Bio.SeqIO - which lets the user add new parsers or override the built-in ones should they so desire. Peter From tiagoantao at gmail.com Mon Mar 31 10:54:38 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 31 Mar 2008 15:54:38 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN Message-ID: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> Hi, I would like to start working on the statistical part (actually the most important part) of Bio.PopGen and on the HapMap part. My problem is with the CVS to SVN conversion. I cannot understand if I can go forward and where (i.e. on the SVN or the CVS repository)? 
In any case, I can wait with committing, so there is no rush, but eventually I will have to commit somewhere ;) Tiago From bugzilla-daemon at portal.open-bio.org Mon Mar 31 11:22:20 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 31 Mar 2008 11:22:20 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803311522.m2VFMKvU003831@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #5 from ericgibert at yahoo.fr 2008-03-31 11:22 EST ------- I attached the XML parser. Note that I did not dig too far in raising errors. This is not yet the full solution for the taxon/taxon_name tables of BioSQL but the first step. Please comment on my programming style and if you want me to raise errors. Note that Bio.Entrez already raises some errors. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Mon Mar 31 11:24:06 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 31 Mar 2008 11:24:06 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803311524.m2VFO6wc004008@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #6 from ericgibert at yahoo.fr 2008-03-31 11:24 EST ------- Created an attachment (id=890) --> (http://bugzilla.open-bio.org/attachment.cgi?id=890&action=view) Parse a Taxonomy record from NCBI -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From biopython at maubp.freeserve.co.uk Mon Mar 31 11:45:07 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 31 Mar 2008 16:45:07 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN In-Reply-To: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> Message-ID: <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> On Mon, Mar 31, 2008 at 3:54 PM, Tiago Antão wrote: > Hi, > > I would like to start working on the statistical part (actually the > most important part) of Bio.PopGen and on the HapMap part. > > My problem is with the CVS to SVN conversion. I cannot understand if I > can go forward and where (i.e. on the SVN or the CVS repository)? > > In any case, I can wait with committing, so there is no rush, but > eventually I will have to commit somewhere ;) In the short term, we are still using CVS. I've only been making relatively small changes as I anticipate the move to SVN will happen shortly... Are there any objections to doing it in the next fortnight? Chris - could you find out when would suit the OBF guys? Maybe come up with two suggested time slots in the next month? Peter From tiagoantao at gmail.com Mon Mar 31 14:32:06 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 31 Mar 2008 19:32:06 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN In-Reply-To: <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> Message-ID: <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com> When on SVN I would like to consider branching for PopGen. AFAIK branching on svn costs very little (only when you make changes does SVN copy the content from the original branch). 
This would have the big advantage that I could make my changes freely without impact on Michiel's release cycle (or breaking the SVN head for some reason). Whenever I get something stable I just merge back. There are good reasons NOT to branch, so this might not be a good idea... But considering that I am the only person that changes PopGen I don't think merging will be an issue at all... Any comments? On Mon, Mar 31, 2008 at 4:45 PM, Peter wrote: > > On Mon, Mar 31, 2008 at 3:54 PM, Tiago Antão wrote: > > Hi, > > > > I would like to start working on the statistical part (actually the > > most important part) of Bio.PopGen and on the HapMap part. > > > > My problem is with the CVS to SVN conversion. I cannot understand if I > > can go forward and where (i.e. on the SVN or the CVS repository)? > > > > In any case, I can wait with committing, so there is no rush, but > > eventually I will have to commit somewhere ;) > > In the short term, we are still using CVS. I've only been making > relatively small changes as I anticipate the move to SVN will happen > shortly... > > Are there any objections to doing it in the next fortnight? Chris - > could you find out when would suit the OBF guys? Maybe come up with > two suggested time slots in the next month? > > Peter > -- http://www.tiago.org From biopython at maubp.freeserve.co.uk Mon Mar 31 15:04:35 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 31 Mar 2008 20:04:35 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN In-Reply-To: <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com> References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com> Message-ID: <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com> On Mon, Mar 31, 2008 at 7:32 PM, Tiago Antão wrote: > When on SVN I would like to consider branching for PopGen. 
AFAIK > branching on svn costs very little (only when you make changes does > SVN copy the content from the original branch). > > This would have the big advantage that I could make my changes freely > without impact on Michiel's release cycle (or breaking the SVN head > for some reason). Whenever I get something stable I just merge back. > > There are good reasons NOT to branch, so this might not be a good > idea... But considering that I am the only person that changes PopGen > I don't think merging will be an issue at all... Any comments? I had been wondering about taking advantage of SVN to explore my Bio.AlignIO plans and/or improvements to the alignment object. I think I will need to read up on SVN and how it handles merges and branches before I try this. There is a lot to be said for having a single stable trunk - it certainly makes things simpler for any new developers to get to grips with things. Peter From tiagoantao at gmail.com Mon Mar 31 15:08:46 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 31 Mar 2008 20:08:46 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN In-Reply-To: <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com> References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com> <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com> Message-ID: <6d941f120803311208k6b6c9d1ah58c7808e0fbd0e2c@mail.gmail.com> On Mon, Mar 31, 2008 at 8:04 PM, Peter wrote: > There is a lot to be said for having a single stable trunk - it > certainly makes things simpler for any new developers to get to grips > with things. It is one of those issues where there is no clear answer. Maybe a case-by-case analysis? I think having 5 gazillion branches would not be a good idea ever, but in the Biopython case many modules are somewhat self-contained, making merging an easier exercise. 
Tiago From tiagoantao at gmail.com Mon Mar 31 18:13:11 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 31 Mar 2008 23:13:11 +0100 Subject: [Biopython-dev] Genbank dbSNP support Message-ID: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com> Hi, Any plans for dbSNP support? http://www.ncbi.nlm.nih.gov/SNP/index.html I think I would volunteer to implement this. A simple solution would be to add both databases and return types. Michiel (I suppose this is code that you are actively maintaining, or is it Peter?), can I send you a diff? I have done this once already for genome - http://portal.open-bio.org/pipermail/biopython/2007-January/003347.html dbSNP can return different types ( http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html#rettypeparam ) so a few parsers would be needed for complete support. But that can be done later... -- http://www.tiago.org From biopython at maubp.freeserve.co.uk Mon Mar 31 19:01:10 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Apr 2008 00:01:10 +0100 Subject: [Biopython-dev] Genbank dbSNP support In-Reply-To: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com> References: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com> Message-ID: <320fb6e00803311601x573c104cx1beb7035a14ef03c@mail.gmail.com> On Mon, Mar 31, 2008 at 11:13 PM, Tiago Antão wrote: > Hi, > > Any plans for dbSNP support? > http://www.ncbi.nlm.nih.gov/SNP/index.html > > I think I would volunteer to implement this. A simple solution would > be to add both databases and return types. Michiel (I suppose this is > code that you are actively maintaining, or is it Peter?), can I send > you a diff? I have done this once already for genome - > http://portal.open-bio.org/pipermail/biopython/2007-January/003347.html I think Michiel has been dealing with this sort of stuff (NCBIDictionary and Bio.Entrez). I would file an enhancement bug, and attach your patch to it. 
> dbSNP can return different types ( > http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html#rettypeparam > ) so a few parsers would be needed for complete support. But that can > be done later... We should already be able to parse their Fasta, GenBank or GenPept output. The lists of IDs should also be trivial. I haven't looked at the other formats. Peter From bugzilla-daemon at portal.open-bio.org Mon Mar 31 19:23:46 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 31 Mar 2008 19:23:46 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803312323.m2VNNku4026068@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #7 from ericgibert at yahoo.fr 2008-03-31 19:23 EST ------- Created an attachment (id=891) --> (http://bugzilla.open-bio.org/attachment.cgi?id=891&action=view) refactoring and search by name Please discard previous attachment. This newer version includes a static method returning a list of Taxonomy based on a scientific name. It is then possible to test the len of the return list: 0 for no match, 1 for a unique taxon, more if ambiguity. Ambiguity can be cleared using the get_taxon_by_rank("order") for example. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
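The length test described in comment #7 (0 for no match, 1 for a unique taxon, more when ambiguous) might be used along these lines. The attachment itself is not reproduced in this archive, so Taxon and resolve below are toy, hypothetical stand-ins; only the method name get_taxon_by_rank comes from the comment, and its real signature and return value are assumed here. The taxon ids are placeholders, not NCBI identifiers.

```python
class Taxon(object):
    """Toy stand-in for the Taxonomy objects in the attachment."""

    def __init__(self, taxon_id, name, lineage):
        self.taxon_id = taxon_id
        self.name = name
        self.lineage = lineage  # dict mapping rank -> name at that rank

    def get_taxon_by_rank(self, rank):
        # Assumed behaviour: the lineage name at this rank, or None
        return self.lineage.get(rank)


def resolve(candidates, rank=None, expected=None):
    """Return the single matching taxon, or raise LookupError."""
    if rank is not None:
        # Narrow homonyms down by what they have at the given rank
        candidates = [t for t in candidates
                      if t.get_taxon_by_rank(rank) == expected]
    if not candidates:
        raise LookupError("no matching taxon")
    if len(candidates) > 1:
        raise LookupError("ambiguous: %d taxa share this name"
                          % len(candidates))
    return candidates[0]


# Morus names both a plant genus (mulberries) and a bird genus (gannets)
plant = Taxon(1, "Morus", {"kingdom": "Viridiplantae"})
bird = Taxon(2, "Morus", {"kingdom": "Metazoa"})
print(resolve([plant, bird], "kingdom", "Metazoa").taxon_id)
```

Filtering on "order" would work the same way; kingdom is used here because it is the check suggested earlier in this bug for separating plant and animal homonyms.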
From bugzilla-daemon at portal.open-bio.org Mon Mar 31 20:57:35 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 31 Mar 2008 20:57:35 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200804010057.m310vZqG029753@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ericgibert at yahoo.fr changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #890 is|0 |1 obsolete| | -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sat Mar 1 03:04:14 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 29 Feb 2008 22:04:14 -0500 Subject: [Biopython-dev] [Bug 2464] New: from Bio import db doesn't work? Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2464 Summary: from Bio import db doesn't work? Product: Biopython Version: 1.44 Platform: PC OS/Version: Windows XP Status: NEW Severity: blocker Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: patrikd at gmail.com Just trying to run an example straight out of the BioPython cookbook: ncbi_dict = GenBank.NCBIDictionary("nucleotide", "genbank") Traceback (most recent call last): File "", line 1, in ncbi_dict = GenBank.NCBIDictionary("nucleotide", "genbank") File "C:\Program Files\Python25\lib\site-packages\Bio\GenBank\__init__.py", line 1283, in __init__ from Bio import db ImportError: cannot import name db -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Sat Mar 1 08:54:19 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 1 Mar 2008 03:54:19 -0500 Subject: [Biopython-dev] [Bug 2464] from Bio import db doesn't work? In-Reply-To: Message-ID: <200803010854.m218sJFT023721@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2464 mdehoon at ims.u-tokyo.ac.jp changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |DUPLICATE ------- Comment #1 from mdehoon at ims.u-tokyo.ac.jp 2008-03-01 03:54 EST ------- Duplicate of Bug #2393, which was fixed in CVS. *** This bug has been marked as a duplicate of bug 2393 *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sat Mar 1 08:54:23 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 1 Mar 2008 03:54:23 -0500 Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with release 1.44 In-Reply-To: Message-ID: <200803010854.m218sNPD023746@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2393 mdehoon at ims.u-tokyo.ac.jp changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |patrikd at gmail.com ------- Comment #13 from mdehoon at ims.u-tokyo.ac.jp 2008-03-01 03:54 EST ------- *** Bug 2464 has been marked as a duplicate of this bug. *** -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. 
From mjldehoon at yahoo.com Sat Mar 1 08:52:16 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sat, 1 Mar 2008 00:52:16 -0800 (PST) Subject: [Biopython-dev] deprecation? In-Reply-To: <47C7FB4C.40607@umh.es> Message-ID: <504389.32803.qm@web62405.mail.re1.yahoo.com> Dear Gregorio, Thanks for letting us know. Could you show us what exactly you are trying to do in your script? This function was deprecated because there were several functions in Biopython doing nearly the same thing, and we're trying to converge on one function. So probably, the best thing would be to avoid using Bio\config\DBRegistry.py. --Michiel. Gregorio Fernandez wrote: Dear Sir, I had this message in one of my scripts. Can I have this feature available?

    C:\Python25\lib\site-packages\Bio\config\DBRegistry.py:149: DeprecationWarning: Concurrent behavior has been deprecated, as this functionality needs Bio.MultiProc, which itself has been deprecated. If you need the concurrent behavior, please let the Biopython developers know by sending an email to biopython-dev at biopython.org to avoid permanent removal of this feature.
      DeprecationWarning)

Thanks Gregorio -- Gregorio J. Fernandez Ballester Instituto de Biología Molecular y Celular Universidad Miguel Hernández Edificio Torregaitán. Avda. de la Universidad, s/n. 03202 Elche (Alicante) E-mail: gregorio at umh.es Telf: 966 65 84 41 Fax: 966 65 87 58
From bugzilla-daemon at portal.open-bio.org Mon Mar 3 21:53:59 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 3 Mar 2008 16:53:59 -0500 Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes assert to fail when it should pass In-Reply-To: Message-ID: <200803032153.m23LrxP4023475@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2437 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-03 16:53 EST ------- Defining __eq__ and __ne__ methods for the Alphabet class would probably work, but we would also have to do this for the AlphabetEncoder "decorator" class. I'm a little wary of this...

    def __ne__(self, other):
        """Check if this alphabet object != another alphabet"""
        return not self == other

    def __eq__(self, other):
        """Check if this alphabet object == another alphabet"""
        #TODO - what exactly do we want to check here?
        if id(self) == id(other):
            return True
        if not isinstance(other, Alphabet) \
        and not isinstance(other, AlphabetEncoder):
            raise ValueError("Comparing an alphabet to a non-alphabet")
        if self.__class__ != other.__class__:
            return False
        if self.size != other.size:
            return False
        if self.letters != other.letters:
            return False
        if dir(self) != dir(other):
            return False
        for attr in ["gap_char", "stop_symbol"]:
            if hasattr(self, attr) != hasattr(other, attr):
                return False
            #getattr() replaces the mistyped other.__getattr_(attr) call
            if hasattr(self, attr) and hasattr(other, attr) \
            and getattr(self, attr) != getattr(other, attr):
                return False
        #Close enough?
        return True

Relaxing the assertion in Bio.Translate would be much safer in terms of any potential side effects.
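The comparison proposed in comment #2 above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Alphabet, DNAAlphabet and RNAAlphabet classes here are simplified stand-ins, not the real Bio.Alphabet code, and returning NotImplemented for non-alphabets (rather than raising ValueError as in the comment) is a swapped-in design choice.

```python
class Alphabet(object):
    """Simplified stand-in for an alphabet class (hypothetical, not Bio.Alphabet)."""
    letters = None
    size = None

    def __eq__(self, other):
        if self is other:
            return True
        if not isinstance(other, Alphabet):
            # Let Python try the other operand instead of raising an error.
            return NotImplemented
        return (type(self) is type(other)
                and self.letters == other.letters
                and self.size == other.size)

    def __ne__(self, other):
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Keep hashing consistent with __eq__ (Python 3 drops the inherited
        # __hash__ when __eq__ is defined).
        return hash((type(self), self.letters, self.size))


class DNAAlphabet(Alphabet):
    letters = "GATC"
    size = 1


class RNAAlphabet(Alphabet):
    letters = "GAUC"
    size = 1


same = DNAAlphabet() == DNAAlphabet()        # two distinct instances, same class
different = DNAAlphabet() != RNAAlphabet()   # different classes and letters
```

Returning NotImplemented means that comparing an alphabet with, say, a plain string simply evaluates to unequal instead of raising, which may or may not be the behaviour wanted here.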
From mjldehoon at yahoo.com Thu Mar 6 15:24:47 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 6 Mar 2008 07:24:47 -0800 (PST) Subject: [Biopython-dev] New Biopython release Message-ID: <48921.43822.qm@web62403.mail.re1.yahoo.com> Hi everybody, Let's make a new release (1.45). I'm thinking of Friday 21st, which gives us about two weeks. The current Biopython release (1.44) has a nasty bug that causes an error with one of the Bio.GenBank examples in the tutorial. This bug has since been fixed in CVS. If you have any code that is ready to be submitted to CVS, now would be a good time to do so. If your code is not yet ready for prime time, please don't submit it to CVS until after the release to avoid any last-minute problems. Biopython 1.44 had a large number of deprecations, but I feel it is too soon to remove them from the release completely. Bio.Blast.blast and Bio.Blast.blasturl have been deprecated for several releases now, so if there are no objections I think we should remove them. Bio.Kabat has been deprecated since release 1.43. Since it has few (if any) users, I think we should remove it too. Also, please have a look at the Biopython bugs that are still open to see if there's anything we can do about them. --Michiel.
From bugzilla-daemon at portal.open-bio.org Sat Mar 8 20:25:06 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 8 Mar 2008 15:25:06 -0500 Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes assert to fail when it should pass In-Reply-To: Message-ID: <200803082025.m28KP661006291@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2437 ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-08 15:25 EST ------- Changing Bio/Translate.py line 14+ and 36+ from this:

    assert seq.alphabet == self.table.nucleotide_alphabet, \
        ...

to this:

    #Allow different instances of the same class to be used:
    assert seq.alphabet.__class__ == \
        self.table.nucleotide_alphabet.__class__, \
        ...

seems to resolve the original bug report. I'd like to check this doesn't affect any of the unit tests under Linux - Windows looks OK.
From bugzilla-daemon at portal.open-bio.org Mon Mar 10 10:12:13 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 10 Mar 2008 06:12:13 -0400 Subject: [Biopython-dev] [Bug 1999] new frame translation method In-Reply-To: Message-ID: <200803101012.m2AACD7k003033@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=1999 ------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp 2008-03-10 06:12 EST ------- Since SeqUtils.frameTranslations and SeqUtils.six_frame_translations are so similar, I think we should keep only one of these functions. Preferably named "six_frame_translations", for backward compatibility. Also, I think we should not require the seqO argument to be a Seq object.
If this function is to replace the existing SeqUtils.six_frame_translations, we should make sure to keep all the existing functionality of that function. I believe the GC content calculation is currently missing in SeqUtils.frameTranslations.
From biopython at maubp.freeserve.co.uk Wed Mar 12 00:37:11 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Wed, 12 Mar 2008 00:37:11 +0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> Message-ID: <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com> Hi Chris, I haven't heard anything about the CVS to SVN move recently. Did anyone resolve the multiple password prompt niggle? On another point, is the test SVN repository intended to be writable (for those of us with developer access)? I really should try running some things like "svn diff" and committing sample changes to get a feel for how it compares to CVS. Peter
From sbassi at gmail.com Wed Mar 12 14:52:51 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Wed, 12 Mar 2008 11:52:51 -0300 Subject: [Biopython-dev] BLAST XML to HTML Message-ID: Is there a Biopython module to convert BLAST XML output to HTML? Like this one from BioJAVA: http://www.biojava.org/wiki/BioJava:CookBook:Blast:XML If there is no such module, could this be included in Biopython if I provide the code?
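The kind of conversion asked about here could be sketched with the Python standard library alone. This is only a rough illustration, not an existing Biopython module: the function name blast_xml_to_html and the table layout are hypothetical, the sample document is hand-written and heavily trimmed, and the element names (Hit, Hit_def, Hit_accession, Hit_hsps, Hsp, Hsp_bit-score, Hsp_evalue) are assumed to match the NCBI BLAST XML output being discussed.

```python
import html
import xml.etree.ElementTree as ET

# Hand-written sample, heavily trimmed; real BLAST XML carries many more fields.
SAMPLE_BLAST_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>made-up protein &lt;example&gt;</Hit_def>
          <Hit_accession>ABC12345</Hit_accession>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>52.4</Hsp_bit-score>
              <Hsp_evalue>3e-07</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def blast_xml_to_html(xml_text):
    """Return a small HTML page with one table row per hit (first HSP only)."""
    root = ET.fromstring(xml_text)
    rows = []
    for hit in root.iter("Hit"):
        hsp = hit.find("Hit_hsps/Hsp")  # first HSP listed for this hit
        if hsp is None:
            continue
        cells = (
            hit.findtext("Hit_accession", default=""),
            hit.findtext("Hit_def", default=""),
            hsp.findtext("Hsp_bit-score", default=""),
            hsp.findtext("Hsp_evalue", default=""),
        )
        # Escape every value so text from the XML cannot break the HTML.
        row = "".join("<td>%s</td>" % html.escape(cell) for cell in cells)
        rows.append("<tr>%s</tr>" % row)
    header = ("<tr><th>Accession</th><th>Description</th>"
              "<th>Bit score</th><th>E-value</th></tr>")
    return "<html><body><table>\n%s\n%s\n</table></body></html>" % (
        header, "\n".join(rows))


html_page = blast_xml_to_html(SAMPLE_BLAST_XML)
```

A page like this would be for people to read; it makes no attempt to reproduce the layout of the NCBI's own HTML output.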
-- Molecular biology course for programmers: http://tinyurl.com/2vv8w6 Bioinformatics news: http://www.bioinformatica.info Free Python tutorial: http://tinyurl.com/2az5d5
From peter at maubp.freeserve.co.uk Wed Mar 12 18:12:35 2008 From: peter at maubp.freeserve.co.uk (Peter) Date: Wed, 12 Mar 2008 18:12:35 +0000 Subject: [Biopython-dev] BLAST XML to HTML In-Reply-To: References: Message-ID: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com> On Wed, Mar 12, 2008 at 2:52 PM, Sebastian Bassi wrote: > Is there a Biopython module to convert BLAST XML output to HTML? > Like this one from BioJAVA: > http://www.biojava.org/wiki/BioJava:CookBook:Blast:XML > If there is no such module, could this be included in Biopython if > I provide the code? Is your idea to convert from the XML output of the NCBI BLAST tools into HTML very closely resembling the NCBI's HTML output (perhaps for another program to read as input)? Or do you just want to produce a nice HTML page for a person to read (perhaps resembling the NCBI page in appearance, but not using the same HTML layout)? How would your code work - directly from the XML file, or from the results of the existing Biopython BLAST parsers? Peter
From sbassi at gmail.com Wed Mar 12 18:21:18 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Wed, 12 Mar 2008 15:21:18 -0300 Subject: [Biopython-dev] BLAST XML to HTML In-Reply-To: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com> References: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com> Message-ID: On Wed, Mar 12, 2008 at 3:12 PM, Peter wrote: > Is your idea to convert from the XML output of the NCBI BLAST tools > into HTML very closely resembling the NCBI's HTML output (perhaps for > another program to read as input)? Or do you just want to produce a > nice HTML page for a person to read (perhaps resembling the NCBI page > in appearance, but not using the same HTML layout)?
The reason is that I always run BLAST with XML output, since I parse the results with Biopython, but people in the lab want to check the HTML version (or I want to "publish" the result in a public DB accessible via HTML), and that makes me re-run the BLAST just for them to see the output. Sometimes the BLAST runs are resource-demanding (like a 2-week run), and I would like to avoid re-running the BLAST when all I really want is a format change. > How would your code work -direct from the XML file, or from the > results of the existing Biopython BLAST parsers? From the XML output.
From mjldehoon at yahoo.com Wed Mar 12 21:50:42 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 12 Mar 2008 14:50:42 -0700 (PDT) Subject: [Biopython-dev] BLAST XML to HTML In-Reply-To: Message-ID: <845186.11502.qm@web62406.mail.re1.yahoo.com> > > How would your code work -direct from the XML file, or from the > > results of the existing Biopython BLAST parsers? > From the XML output. One option is to use Cascading Style Sheets (CSS) to display the XML file. That way, you don't have to create a new HTML file. Also, we should check with NCBI if they have a tool for such purposes. --Michiel.
From sbassi at gmail.com Wed Mar 12 22:25:35 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Wed, 12 Mar 2008 19:25:35 -0300 Subject: [Biopython-dev] BLAST XML to HTML In-Reply-To: <845186.11502.qm@web62406.mail.re1.yahoo.com> References: <845186.11502.qm@web62406.mail.re1.yahoo.com> Message-ID: On Wed, Mar 12, 2008 at 6:50 PM, Michiel de Hoon wrote: > One option is to use Cascading Style Sheets (CSS) to display the XML file. > That way, you don't have to create a new HTML file.
Also, we should check > with NCBI if they have a tool for such purposes. They must have something, because the new online NCBI BLAST has an option called "reformat BLAST results". This option can reformat from XML to HTML without re-running the BLAST, but it works server-side.
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 10:07:44 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Mar 2008 06:07:44 -0400 Subject: [Biopython-dev] [Bug 2468] New: Tutorial needs a fix: Bio.WWW.NCBI Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2468 Summary: Tutorial needs a fix: Bio.WWW.NCBI Product: Biopython Version: 1.44 Platform: PC OS/Version: Linux Status: NEW Severity: normal Priority: P2 Component: Documentation AssignedTo: biopython-dev at biopython.org ReportedBy: mmokrejs at ribosome.natur.cuni.cz I am trying to follow the recipe at http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc14 which contains the following split into several chunks (I don't like this style personally, but that's not the issue here):

    #! /usr/bin/python
    from Bio.WWW import NCBI

    search_command = 'Search'
    search_database = 'Taxonomy'
    return_format = 'FASTA'
    search_term = 'Cypripedioideae'
    my_browser = 'lynx'

    result_handle = NCBI.query(search_command, search_database,
                               term = search_term, doptcmdl = return_format)

    import os
    result_file_name = os.path.join(os.getcwd(), "results.html")
    result_file = open(result_file_name, "w")
    result_file.write(result_handle.read())
    result_file.close()

    if my_browser == "lynx":
        os.system("lynx -force_html " + result_file_name)
    elif my_browser == "netscape":
        os.system("netscape file:" + result_file_name)

I end up with a lynx browser opened with the Entrez search page pre-filled with 'Cypripedioideae' as the query string. Unfortunately, I have to click on the condensed results to get the taxonomy listing under the word 'Cypripedioideae'. The line I am talking about is close to the end of the output: [ ] 1: Cypripedioideae, subfamily, monocots Links BTW, the other links from the page do not work because they point to http://localhost/....

    /usr/lib/python2.5/site-packages/Bio/WWW/NCBI.py:34: DeprecationWarning: Bio.WWW.NCBI is deprecated. The functions in Bio.WWW.NCBI are now available from Bio.Entrez.
      DeprecationWarning)

The section needs updating. I am somewhat surprised I cannot access NCBI Taxonomy easily. Probably I will have to browse the source code and forget the Tutorial and Cookbook. ;)
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 10:55:55 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Mar 2008 06:55:55 -0400 Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI In-Reply-To: Message-ID: <200803131055.m2DAttv2027003@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2468 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-13 06:55 EST ------- The tutorial in CVS has already been updated to use Bio.Entrez.query instead of Bio.WWW.NCBI.query, reflecting the deprecation made in CVS. I think you are using the Biopython 1.44 tutorial (from the weblink) with the CVS Biopython code. So at least part of your problem is already fixed.
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 11:27:14 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 13 Mar 2008 07:27:14 -0400 Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI In-Reply-To: Message-ID: <200803131127.m2DBREdM028784@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2468 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-13 07:27 EST ------- Reading the Bio.Entrez documentation, the query function always returns HTML. You could also use the esearch function, which returns XML, followed by the efetch function, which seems to support a range of options depending on the datatype.
For example, using the taxonomy db:

    #This gets an XML file from the following URL,
    #http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Cypripedioideae
    from Bio import Entrez
    result_handle = Entrez.esearch("taxonomy", term="Cypripedioideae")
    print(result_handle.read())

You could then parse the XML file to extract the matching ID(s), perhaps with a regular expression. In this case, there is only one match, 158330.

    #http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy
    #&id=158330&report=docsum&retmode=text
    from Bio import Entrez
    result_handle = Entrez.efetch("taxonomy", id="158330",
                                  report="docsum", retmode="text")
    print(result_handle.read())
    #Given ID 158330, returns "Cypripedioideae, subfamily, monocots"

I agree that this section of the tutorial could be more useful. Do you think the above code would be more helpful?
From biopython at maubp.freeserve.co.uk Thu Mar 13 12:14:09 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 13 Mar 2008 12:14:09 +0000 Subject: [Biopython-dev] Bio.Entrez and the depreciated pm* functions Message-ID: <320fb6e00803130514s4aa31c4fo494b9f1594ef0b89@mail.gmail.com> Hi Michiel (et al), I have a query regarding the transition from Bio.WWW.NCBI to Bio.Entrez. I notice you've marked several of the functions in Bio.Entrez with deprecation warnings as the NCBI has retired the associated APIs, i.e. pmfetch, pmqty and pmneighbor (while the whole of Bio.WWW.NCBI is deprecated). Anyone using Bio.WWW.NCBI.pmfetch, Bio.WWW.NCBI.pmqty and Bio.WWW.NCBI.pmneighbor will get a deprecation warning from Bio.WWW.NCBI, then if they try to switch to Bio.Entrez.pmfetch, Bio.Entrez.pmqty and Bio.Entrez.pmneighbor they still get warnings.
Do you think we can just remove pmfetch, pmqty and pmneighbor from Bio.Entrez so that it starts out "clean", and adjust the warning from Bio.WWW.NCBI as follows:

    import warnings
    warnings.warn("Bio.WWW.NCBI is deprecated. The functions are now available from Bio.Entrez, except for the pm* functions which the NCBI have retired.", DeprecationWarning)

What do you think? Peter
From mjldehoon at yahoo.com Thu Mar 13 12:48:29 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 13 Mar 2008 05:48:29 -0700 (PDT) Subject: [Biopython-dev] Bio.Entrez and the depreciated pm* functions In-Reply-To: <320fb6e00803130514s4aa31c4fo494b9f1594ef0b89@mail.gmail.com> Message-ID: <84386.61626.qm@web62415.mail.re1.yahoo.com> That is fine with me. Maybe I was being too conservative. I'll make those changes. --Michiel. Peter wrote: Hi Michiel (et al), I have a query regarding the transition from Bio.WWW.NCBI to Bio.Entrez. I notice you've marked several of the functions in Bio.Entrez with deprecation warnings as the NCBI has retired the associated APIs, i.e. pmfetch, pmqty and pmneighbor (while the whole of Bio.WWW.NCBI is deprecated). Anyone using Bio.WWW.NCBI.pmfetch, Bio.WWW.NCBI.pmqty and Bio.WWW.NCBI.pmneighbor will get a deprecation warning from Bio.WWW.NCBI, then if they try to switch to Bio.Entrez.pmfetch, Bio.Entrez.pmqty and Bio.Entrez.pmneighbor they still get warnings. Do you think we can just remove pmfetch, pmqty and pmneighbor from Bio.Entrez so that it starts out "clean", and adjust the warning from Bio.WWW.NCBI as follows: import warnings warnings.warn("Bio.WWW.NCBI is deprecated. The functions are now available from Bio.Entrez, except for the pm* functions which the NCBI have retired.", DeprecationWarning) What do you think? Peter
From bugzilla-daemon at portal.open-bio.org Fri Mar 14 15:53:34 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Mar 2008 11:53:34 -0400 Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes assert to fail when it should pass In-Reply-To: Message-ID: <200803141553.m2EFrY6p001573@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2437 biopython-bugzilla at maubp.freeserve.co.uk changed:

               What    |Removed                     |Added
    ----------------------------------------------------------------------------
                 Status|NEW                         |RESOLVED
             Resolution|                            |FIXED

------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-14 11:53 EST ------- Fixed in CVS, Bio/Translate.py revision 1.3, as described in comment 3. This fixes the original report, making sequence translation simpler to use - see also Bug 2381 - translate and transcribe methods for the Seq object (in Bio.Seq). This change does NOT address the larger issue of how to decide if two alphabets are equal or not.
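The effect of the relaxed assertion described for Bug 2437 can be illustrated with a toy class. This is a minimal sketch under stated assumptions: NucleotideAlphabet is a hypothetical stand-in with no custom __eq__, not the real alphabet class, and the assertion message is made up.

```python
class NucleotideAlphabet(object):
    """Hypothetical stand-in for an alphabet class without a custom __eq__."""
    letters = "GATC"


seq_alphabet = NucleotideAlphabet()
table_alphabet = NucleotideAlphabet()

# Without __eq__, == falls back to identity, so two separate instances
# of the same class compare as unequal.
instances_equal = (seq_alphabet == table_alphabet)

# Comparing the classes instead accepts any instance of the same class.
classes_equal = (seq_alphabet.__class__ == table_alphabet.__class__)

assert classes_equal, "cannot translate from the given alphabet"
```

The class comparison is deliberately coarse: it says nothing about whether two alphabets of different classes should count as equivalent, which is the larger issue left open above.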
From bugzilla-daemon at portal.open-bio.org Fri Mar 14 16:19:40 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Fri, 14 Mar 2008 12:19:40 -0400 Subject: [Biopython-dev] [Bug 2447] EUtils cannot parse PubMed XML for ACS journals In-Reply-To: Message-ID: <200803141619.m2EGJeIC003283@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2447 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-14 12:19 EST ------- Created an attachment (id=878) --> (http://bugzilla.open-bio.org/attachment.cgi?id=878&action=view) Patch to Bio/EUtils/parse.py I'm not sure if this is the best way to fix this, but it does appear to solve the reported problem. Can you give this a try, Noel?
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 20:36:06 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 15 Mar 2008 16:36:06 -0400 Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain text in CVS? In-Reply-To: Message-ID: <200803152036.m2FKa6xR029284@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2363 biopython-bugzilla at maubp.freeserve.co.uk changed:

               What    |Removed                     |Added
    ----------------------------------------------------------------------------
                 Status|NEW                         |RESOLVED
             Resolution|                            |FIXED

------- Comment #7 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-15 16:36 EST ------- I've just done a clean checkout and build on Windows, and run the test suite, and built the tutorial as PDF.
I didn't run into any text/binary issues, so this seems to be fixed now :)
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 20:49:06 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 15 Mar 2008 16:49:06 -0400 Subject: [Biopython-dev] [Bug 2469] New: requires_wise.py fails on Windows (test suite) Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2469 Summary: requires_wise.py fails on Windows (test suite) Product: Biopython Version: 1.44 Platform: PC OS/Version: Windows XP Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk On my Windows XP machine, I don't have wise installed, so the dnal command doesn't work:

    C:\TEMP\>dnal
    'dnal' is not recognized as an internal or external command,
    operable program or batch file.

When running the unit test suite, test_Wise.py SHOULD fail with a missing external dependency error - instead it tries to run and fails with an assertion error. The problem is that requires_wise.py fails... which seems to be an issue with the commands.getoutput() function not working on Windows; it's Unix-only according to: http://www.python.org/doc/current/lib/module-commands.html Annoyingly, the commands module is present on Windows (or at least Python 2.3) but simply doesn't work due to calling this:

    os.popen('{ ' + cmd + '; } 2>&1', 'r')

As a result,

    >>> commands.getoutput("xyz")
    "'{' is not recognized as an internal or external command,\noperable program or batch file."

Assuming wise/dnal actually works on Windows, we need to use something other than commands.getoutput("dnal") to check for it.
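A portable way of checking for an external tool, without relying on a Unix shell, might look like the sketch below. It is an illustration only, not the fix applied to requires_wise.py: it assumes a current Python 3 (shutil.which and subprocess.run did not exist in the Python versions discussed here), and the helper names tool_available and run_and_capture are hypothetical.

```python
import shutil
import subprocess
import sys


def tool_available(name):
    """Return True if an executable called `name` is found on the PATH."""
    return shutil.which(name) is not None


def run_and_capture(args):
    """Run a command without a shell; return combined stdout/stderr as text."""
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into stdout, like "2>&1"
        universal_newlines=True,    # decode the output to text
    )
    return completed.stdout


# A test could be skipped as a missing dependency when the tool is absent.
have_dnal = tool_available("dnal")

# Demonstrate capturing output using the running Python interpreter itself.
greeting = run_and_capture([sys.executable, "-c", "print('hello')"])
```

Because no shell is involved, nothing here depends on the '{ cmd; } 2>&1' syntax that trips up the commands module on Windows.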
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 23:40:54 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 15 Mar 2008 19:40:54 -0400 Subject: [Biopython-dev] [Bug 2469] requires_wise.py fails on Windows (test suite) In-Reply-To: Message-ID: <200803152340.m2FNesQl005388@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2469 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-15 19:40 EST ------- You can download wise2 from ftp://ftp.ebi.ac.uk/pub/software/unix/wise2/ and compile it under Windows XP using cygwin, but its own tests fail - I'm not sure why. Carrying on regardless, test_Wise.py still doesn't work for me :( P.S. Cornell University have packaged wise2 for Windows (found via Google, I haven't tried this): http://www.tc.cornell.edu/WBA/
From bugzilla-daemon at portal.open-bio.org Sun Mar 16 21:03:59 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 16 Mar 2008 17:03:59 -0400 Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id In-Reply-To: Message-ID: <200803162103.m2GL3x4u021735@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2422 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-16 17:03 EST ------- In the old code, when the species wasn't already recorded in the taxon/taxon_name tables, we would add it and its parent lineage entries.
See also http://lists.open-bio.org/pipermail/biosql-l/2008-March/001196.html There are a few problems in the old code, exposed in the unit tests, but I think I have this working again now. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Sun Mar 16 21:25:02 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 16 Mar 2008 17:25:02 -0400 Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id In-Reply-To: Message-ID: <200803162125.m2GLP2Oq023867@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2422 ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-16 17:25 EST ------- Created an attachment (id=879) --> (http://bugzilla.open-bio.org/attachment.cgi?id=879&action=view) patch to BioSQL/Loader.py Possible patch - the two BioSQL unit tests pass with this. I have not had a chance to try this in combination with a taxonomy table pre-populated by load_ncbi_taxonomy.pl -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From mjldehoon at yahoo.com Mon Mar 17 02:43:55 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sun, 16 Mar 2008 19:43:55 -0700 (PDT) Subject: [Biopython-dev] [BioSQL-l] Loading sequences with novel NCBI taxon id In-Reply-To: <002201c88705$70780840$6400a8c0@Gecko> Message-ID: <121379.45887.qm@web62403.mail.re1.yahoo.com> > Thank you for your mail recommending the usage of NCBI.WWW. > I have modified my class/script accordingly to your suggestion > without problem. Once 1.45 is out, I will change for NCBI.Entrez > as you informed me. 
Just to avoid any confusion: In Biopython 1.45, the module will be "Bio.Entrez", not "Bio.NCBI.Entrez". > In any case, I do not pretend having a fantastic piece of code, but it gets > the job done. If you find this interesting, I would be pleased to contribute > to BioPython. Bio.Entrez will need some parsers to parse the XML results, although that probably won't happen before the 1.45 release. I think your script could be very useful when writing those parsers. Could you open a bug report on Bugzilla and upload your script there? Beware, to upload a script to Bugzilla, you need to create a bug report first, and then as a separate step upload the script. Thanks! --Michiel. Eric Gibert wrote: Dear Peter, Regarding the update of the BioSQL tables taxon and taxon_name, I have created a class "TaxonUpdate" (how original!) which does two things: 1) as a class itself, it will fetch the taxon's information from NCBI as XML based on the taxon_id passed to the constructor, parse the returned XML answer to get the genus, class, order, family (10 levels) and update that in the taxon table. If taxon_name needs an update/insert, it does that too. 2) run as an independent script (__main__), it will look for all species in the taxon table for which the genus (parent) does not have an ncbi_taxon_id (i.e. is NULL, as this is the current result after adding a new sequence in BioSQL). For all the incomplete records found, it will perform the update as in (1). After the addition of a new sequence in a BioSQL database, a simple call of this code (passing the taxon_id) will do the updating job. Dear Michiel, Thank you for your mail recommending the usage of NCBI.WWW. I have modified my class/script according to your suggestion without problem. Once 1.45 is out, I will change to NCBI.Entrez as you informed me. In any case, I do not claim to have a fantastic piece of code, but it gets the job done. If you find this interesting, I would be pleased to contribute to BioPython.
Eric -----Original Message----- From: biosql-l-bounces at lists.open-bio.org [mailto:biosql-l-bounces at lists.open-bio.org] On Behalf Of Peter Sent: Thursday, March 13, 2008 11:06 PM To: BioSQL Subject: [BioSQL-l] Loading sequences with novel NCBI taxon id Dear list, One of the unresolved issues with Biopython's BioSQL interface is dealing with the NCBI taxon ID when loading sequences into the database. As I understand it, ideally before loading any sequences, the user will have loaded in the entire NCBI taxonomy using the load_ncbi_taxonomy.pl script, as I described here: http://biopython.org/wiki/BioSQL#NCBI_Taxonomy When a new sequence is added to the database with a known taxon id, there is no problem. But what happens if it's a recently sequenced organism which isn't defined yet in the BioSQL taxonomy tables? Could/should the user re-run load_ncbi_taxonomy.pl, and then load in their new sequence? Right now in Biopython, due to what appears to have been intended as a short-term hack, we simply don't record the taxon id at all (!), and I would like to fix this (bug 2422). http://bugzilla.open-bio.org/show_bug.cgi?id=2422 How do BioPerl et al deal with this issue? Do they try to update the taxonomy tables using the available information in the new record's annotation (i.e. the new taxon id and the species name)? Do they look up the NCBI taxonomy definition via the internet? Do they throw an error and halt? Thanks, Peter (Biopython) _______________________________________________ BioSQL-l mailing list BioSQL-l at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biosql-l
From bugzilla-daemon at portal.open-bio.org Mon Mar 17 11:47:46 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Mar 2008 07:47:46 -0400 Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI In-Reply-To: Message-ID: <200803171147.m2HBlksw008865@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2468 ------- Comment #3 from mmokrejs at ribosome.natur.cuni.cz 2008-03-17 07:47 EST ------- Hi Peter, yes this would be more helpful. Unfortunately I did the one-time job with parsing the HTML output and re-running wget to fetch the final HTML page, stripped HTML formatting and was done. I will upload my two crappy scripts. They work but should be re-written to utilize the XML outputs you have mentioned. The second URL from your last comment should have different values for some parameters to yield another XML page: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=41073&report=sgml&mode=xml That returns me: 41073 cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; Bilateria; Coelomata; Protostomia; Panarthropoda; Arthropoda; Mandibulata; Pancrustacea; Hexapoda; Insecta; Dicondylia; Pterygota; Neoptera; Endopterygota; Coleoptera; Adephaga Maybe I will find the time to rewrite them for the purpose of tutorial to use the XMLs. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
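The lineage quoted in the comment above could be pulled out of the efetch XML directly instead of stripping HTML. A minimal sketch, assuming the taxonomy reply wraps each record in a Taxon element with TaxId and Lineage children; the sample document is hand-written for illustration (it is not a verbatim NCBI response), and extract_lineage is a hypothetical helper name, not part of Biopython:

```python
import xml.etree.ElementTree as ET

# Lineage string as quoted in the Bugzilla comment for taxon 41073.
LINEAGE = (
    "cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa; Eumetazoa; "
    "Bilateria; Coelomata; Protostomia; Panarthropoda; Arthropoda; Mandibulata; "
    "Pancrustacea; Hexapoda; Insecta; Dicondylia; Pterygota; Neoptera; "
    "Endopterygota; Coleoptera; Adephaga"
)

# Hand-written sample shaped like an efetch taxonomy reply (an assumption).
SAMPLE_XML = (
    "<TaxaSet><Taxon>"
    "<TaxId>41073</TaxId>"
    "<Lineage>" + LINEAGE + "</Lineage>"
    "</Taxon></TaxaSet>"
)


def extract_lineage(xml_text):
    """Map each record's TaxId to its lineage as a list of names."""
    root = ET.fromstring(xml_text)
    result = {}
    # Only direct Taxon children of the root element are considered.
    for taxon in root.findall("Taxon"):
        taxid = taxon.findtext("TaxId")
        lineage = taxon.findtext("Lineage", default="")
        result[taxid] = [name.strip() for name in lineage.split(";") if name.strip()]
    return result


if __name__ == "__main__":
    for taxid, names in extract_lineage(SAMPLE_XML).items():
        print(taxid, " > ".join(names))
```

Working from a saved reply like this keeps the parsing step testable without network access; the fetch itself is left out on purpose.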
From bugzilla-daemon at portal.open-bio.org Mon Mar 17 13:18:35 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 17 Mar 2008 09:18:35 -0400 Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI In-Reply-To: Message-ID: <200803171318.m2HDIZYX014608@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2468 ------- Comment #4 from mmokrejs at ribosome.natur.cuni.cz 2008-03-17 09:18 EST ------- Created an attachment (id=880) --> (http://bugzilla.open-bio.org/attachment.cgi?id=880&action=view) taxfetch.py This program/module can fetch for the user the Lineage line. The query() function uses the deprecated biopython API while the efetch uses the other. Queries get cached in a local file taxonomycache.db for speed. Users can call either of the two functions from external python code. Feel free to use the code in Tutorial or even bundle in any form into the package. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From tiagoantao at gmail.com Mon Mar 17 22:30:38 2008 From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=) Date: Mon, 17 Mar 2008 22:30:38 +0000 Subject: [Biopython-dev] Bio.PopGen status Message-ID: <6d941f120803171530g36504759g4b3cf835065e17b8@mail.gmail.com> Hi, This is a short email regarding Bio.PopGen status. 1. All the code on the repository should be stable. 2. The Biopython version that is schedule for release soon will have support for coalescent population genetics' simulations 3. A short number of test code cases are included. 4. Documentation was produced and is available on the Tutorial. I believe that it is satisfactory (tell me if you disagree). 5. Bio.PopGen is still not "version 1" in the sense that the fundamental statistics code is missing. 
Starting with selection detection and coalescent simulation was a conscious strategy: beginning with arguably less important features meant that newbie errors (I was a newbie developer to Biopython) would have less impact.

6. Statistics is my next task and hopefully will coincide with the Biopython release after this one. This will be, at least for me, "version 1" of Bio.PopGen.

7. Since the original Bio.PopGen, the code has included support for executing external simulators in parallel (thus taking advantage of multi-core architectures for computationally intensive simulations). This is, unfortunately, not documented. I will document this (maybe in a separate document from the tutorial) in the future. I don't think this is priority 1, but others might be interested in using this code for computationally intensive tasks using external programs. In case you want to know more details about this, please say so.

From a Biopython release perspective, Bio.PopGen with the new coalescent simulation features is fully ready. Please go ahead and release whenever is most convenient.
-- http://www.tiago.org/ps From bugzilla-daemon at portal.open-bio.org Thu Mar 20 10:23:35 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 20 Mar 2008 06:23:35 -0400 Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id In-Reply-To: Message-ID: <200803201023.m2KANZun010097@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2422 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-20 06:23 EST ------- Patch checked in as BioSQL/Loader.py revision 1.28 Unit tests passed on both Windows XP and Linux (using MySQL) Note that once we have added "provisional" entries to the taxon/taxon_name table based on the record annotation, load_ncbi_taxonomy.pl should be able to tidy things up using the NCBI taxonomy. At least it should once BioSQL bug 2470 is fixed. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From biopython at maubp.freeserve.co.uk Thu Mar 20 20:14:50 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 20 Mar 2008 20:14:50 +0000 Subject: [Biopython-dev] Old Biopython code for EBI Bibliographics services In-Reply-To: <320fb6e00803170055n18457967n27d1b07eaa6cb522@mail.gmail.com> References: <320fb6e00803140944v13f241b9icc0e911643f234cd@mail.gmail.com> <47DE0C22.9040202@netsys.co.za> <320fb6e00803170049g79960e14u8c1417fcdc99a0d5@mail.gmail.com> <320fb6e00803170055n18457967n27d1b07eaa6cb522@mail.gmail.com> Message-ID: <320fb6e00803201314j53b47a35x33de02cb685d2c14@mail.gmail.com> I posted the following email on the mail discussion mailing list, and haven't seen any replies. 
Should we mark Bio.biblio as deprecated now (before the imminent release)? Peter On Mon, Mar 17, 2008 at 7:55 AM, Peter wrote: > Dear list, > > We have an old module Bio/biblio.py written by Tiaan Wessels back in > 2002 (during a South African hackathon). This is code to use some EBI > Bibliographics services, but currently no longer works. At the very > least, the EBI have changed the URLs for their SOAP services. I got > in touch with the author by email, and he no longer uses the code and > thought we could remove it. > > Does anyone on the list still use Bio/biblio.py? > > Would anyone like to take a more in depth look at the code, and the > current EBI web API, and see if there is anything in Bio.biblio worth > keeping? > > If not, I'm proposing we mark this as deprecated for the next release > of Biopython. > > Thanks, > > Peter > From mjldehoon at yahoo.com Fri Mar 21 02:08:56 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Thu, 20 Mar 2008 19:08:56 -0700 (PDT) Subject: [Biopython-dev] Old Biopython code for EBI Bibliographics services In-Reply-To: <320fb6e00803201314j53b47a35x33de02cb685d2c14@mail.gmail.com> Message-ID: <242823.17441.qm@web62402.mail.re1.yahoo.com> > Should we mark Bio.biblio as deprecated now (before the imminent release)? Yes. It's just a deprecation; the code will still be usable. The deprecation warning should contain a notice to contact us in case somebody is still using this code. If not, it's better to deprecate it and remove it in some future release. Keeping Biopython clean is important. --Michiel. Peter wrote: I posted the following email on the mail discussion mailing list, and haven't seen any replies. Should we mark Bio.biblio as deprecated now (before the imminent release)? Peter On Mon, Mar 17, 2008 at 7:55 AM, Peter wrote: > Dear list, > > We have an old module Bio/biblio.py written by Tiaan Wessels back in > 2002 (during a South African hackathon). 
This is code to use some EBI > Bibliographics services, but currently no longer works. At the very > least, the EBI have changed the URLs for their SOAP services. I got > in touch with the author by email, and he no longer uses the code and > thought we could remove it. > > Does anyone on the list still use Bio/biblio.py? > > Would anyone like to take a more in depth look at the code, and the > current EBI web API, and see if there is anything in Bio.biblio worth > keeping? > > If not, I'm proposing we mark this as deprecated for the next release > of Biopython. > > Thanks, > > Peter > _______________________________________________ Biopython-dev mailing list Biopython-dev at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biopython-dev --------------------------------- Never miss a thing. Make Yahoo your homepage. From mjldehoon at yahoo.com Fri Mar 21 11:57:02 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Fri, 21 Mar 2008 04:57:02 -0700 (PDT) Subject: [Biopython-dev] CVS freeze for release Message-ID: <621627.53345.qm@web62408.mail.re1.yahoo.com> Hi everybody, I'll start making release 1.45 from now. Please don't touch CVS until after the release is out. Thanks! --Michiel. --------------------------------- Never miss a thing. Make Yahoo your homepage. From biopython at maubp.freeserve.co.uk Fri Mar 21 12:51:01 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Fri, 21 Mar 2008 12:51:01 +0000 Subject: [Biopython-dev] CVS freeze for release In-Reply-To: <621627.53345.qm@web62408.mail.re1.yahoo.com> References: <621627.53345.qm@web62408.mail.re1.yahoo.com> Message-ID: <320fb6e00803210551o4644fc34meaadf9e521b087fe@mail.gmail.com> Michiel de Hoon wrote: > Hi everybody, > > I'll start making release 1.45 from now. Please don't touch CVS until after > the release is out. Thanks! Good news :) I did check in some comment changes to BioSQL this morning, and the Bio.biblio deprecation, but that was a few hours ago. 
Peter

From meanerelk at gmail.com Fri Mar 21 18:28:24 2008
From: meanerelk at gmail.com (Kemal)
Date: Fri, 21 Mar 2008 14:28:24 -0400
Subject: [Biopython-dev] mentor for google summer of code
Message-ID:

I am a university student interested in adding phyloXML support to BioPython for the Google Summer of Code. Would any developers be willing to mentor this project? I have been discussing it with Hilmar Lapp, who is mentoring similar projects for the Phyloinformatics Summer of Code project at the National Evolutionary Synthesis Center. Their page is at:

https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Code_2008

A mentor would be responsible for monitoring the project's progress over the summer and evaluating the work at the end. Google's guidelines estimate that this would take about 5 hours/week per student. There is more information at:

http://code.google.com/opensource/gsoc/2008/faqs.html

If anyone is interested, I would love to discuss the details of the proposal.

Thank you,
Kemal Eren

From biopython at maubp.freeserve.co.uk Fri Mar 21 19:04:06 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 21 Mar 2008 19:04:06 +0000
Subject: [Biopython-dev] mentor for google summer of code
In-Reply-To:
References:
Message-ID: <320fb6e00803211204q2d0f3696pf5baaf44122a0869@mail.gmail.com>

Hi Kemal,

On Fri, Mar 21, 2008 at 6:28 PM, Kemal wrote:
> I am a university student interested in adding phyloXML support to BioPython
> for the Google Summer of Code. Would any developers be willing to mentor
> this project? I have been discussing it with Hilmar Lapp, who is mentoring
> similar projects for the Phyloinformatics Summer of Code project at the
> National Evolutionary Synthesis Center. Their page is at:
>
> https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Code_2008

I see there are similar projects already planned for phyloXML with BioPerl and BioRuby. For Biopython I guess building on Frank Kauff and Cymon J.
Cox's Bio.Nexus module would be the most logical option. Have you had a chance to look at any of the Biopython code? > A mentor would would be responsible for monitoring the project's progress > over the summer, and to evaluate the work at the end. Google's guidelines > estimate that this would take about 5 hours/week per student. There is more > information at: > > http://code.google.com/opensource/gsoc/2008/faqs.html > > If anyone is interested, I would love to discuss the details of the > proposal. It would be worth trying to contact Frank and Cymon directly - and seeing if they would be interested. Peter (one of the current Biopython developers) From chris.lasher at gmail.com Fri Mar 21 21:11:40 2008 From: chris.lasher at gmail.com (Chris Lasher) Date: Fri, 21 Mar 2008 17:11:40 -0400 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com> Message-ID: <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com> On Tue, Mar 11, 2008 at 8:37 PM, Peter wrote: > Hi Chris, > > I haven't heard anything about the CVS to SVN move recently. Did > anyone resolve the multiple password prompt niggle? That's still unresolved. The workaround is to place an SSH key on dev.open-bio.org. If you do, you'll notice even then it still makes two attempts to log in. /shrugs > On another point, is the test SVN repository intended to be writable > (for those of us with developer access)? I really should try running > some things like "svn diff" and committing sample changes to get a > feel for how it compares to CVN. I have tried committing to it and gotten a "Permission denied" error. It must only be set as read-only for group permissions. 
Really sorry for my delay on getting to this email. Now that Biopython has had another release, should we really push hard to switch to SVN? Chris From biopython-dev at maubp.freeserve.co.uk Fri Mar 21 21:37:45 2008 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Fri, 21 Mar 2008 21:37:45 +0000 Subject: [Biopython-dev] Biopython to begin transition to Subversion In-Reply-To: <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com> References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com> <658418.5192.qm@web62414.mail.re1.yahoo.com> <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com> <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com> <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com> Message-ID: <320fb6e00803211437i79c1454m52ba4172728032c8@mail.gmail.com> On Fri, Mar 21, 2008 at 9:11 PM, Chris Lasher wrote: > On Tue, Mar 11, 2008 at 8:37 PM, Peter wrote: > > Hi Chris, > > > > I haven't heard anything about the CVS to SVN move recently. Did > > anyone resolve the multiple password prompt niggle? > > That's still unresolved. The workaround is to place an SSH key on > dev.open-bio.org. If you do, you'll notice even then it still makes > two attempts to log in. /shrugs If someone's documented this from the previous Bio* migrations, then I guess we'll live with it. > > On another point, is the test SVN repository intended to be writable > > (for those of us with developer access)? I really should try running > > some things like "svn diff" and committing sample changes to get a > > feel for how it compares to CVN. > > I have tried committing to it and gotten a "Permission denied" error. > It must only be set as read-only for group permissions. I never did get round to trying myself... > Really sorry for my delay on getting to this email. Now that Biopython > has had another release, should we really push hard to switch to SVN? 
Well, Michiel declared a CVS freeze this morning and is preparing Biopython 1.45 as we speak. Once the release is out does sound like a good time for the SVN move to me. Peter From peter.bulychev at gmail.com Fri Mar 21 23:50:23 2008 From: peter.bulychev at gmail.com (Peter Bulychev) Date: Sat, 22 Mar 2008 02:50:23 +0300 Subject: [Biopython-dev] results of applying Clone Digger to the sources of BioPython project Message-ID: Hello. Clone Digger project is aimed to find software clones (duplicate code) in Python and Java programs. I have applied it to the source of BioPython and discovered several clone candidates. There are a lot of false positives caused by similar code in nlmmedline_*_format.py files, but maybe other clone candidates will be interesting for you. The results can be seen here: http://clonedigger.sourceforge.net/examples.html -- Best regards, Peter Bulychev. From sbassi at gmail.com Sat Mar 22 03:49:52 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Sat, 22 Mar 2008 00:49:52 -0300 Subject: [Biopython-dev] CVS freeze for release In-Reply-To: <621627.53345.qm@web62408.mail.re1.yahoo.com> References: <621627.53345.qm@web62408.mail.re1.yahoo.com> Message-ID: On Fri, Mar 21, 2008 at 8:57 AM, Michiel de Hoon wrote: > Hi everybody, > I'll start making release 1.45 from now. Please don't touch CVS until after the release is out. Thanks! I have a proposal, so it could be implemented in the next version (1.46?). Change the output of EZRetrieve.retrieve_single. It currently returns a FASTA formated sequence. I think it should return a SeqRecord object (if you want this SeqRecord object to be printed or stored as FASTA, just use formatIO). Here are the proposed changes: http://www.pastecode.com.ar/f3baff314 I can fill this as an enhancement in the bugtrack if you agree. Best, SB. 
From mjldehoon at yahoo.com Sat Mar 22 11:02:38 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sat, 22 Mar 2008 04:02:38 -0700 (PDT) Subject: [Biopython-dev] Biopython release 1.45 Message-ID: <901773.64728.qm@web62408.mail.re1.yahoo.com> We are pleased to announce the release of Biopython 1.45. This release includes numerous code improvements and fixes, including in Bio.Seq, Bio.SeqIO, Bio.Entrez, Bio.PopGen, Bio.SwissProt, Bio.Cluster, Bio.SCOP, Bio.InterPro, Bio.GenBank, Bio.ExPASy, BioSQL, and the Biopython documentation. Too many to list them all here! Source distributions and Windows installers are available from the Biopython website at http://biopython.org. My thanks to all code contributers who made this new release possible. --Michiel on behalf of the Biopython developers. --------------------------------- Looking for last minute shopping deals? Find them fast with Yahoo! Search. From biopython-dev at maubp.freeserve.co.uk Sat Mar 22 11:18:42 2008 From: biopython-dev at maubp.freeserve.co.uk (Peter) Date: Sat, 22 Mar 2008 11:18:42 +0000 Subject: [Biopython-dev] EZRetrieve Message-ID: <320fb6e00803220418n348d1953v9846af9d04abc04c@mail.gmail.com> > I have a proposal, so it could be implemented in the next version (1.46?). > Change the output of EZRetrieve.retrieve_single. It currently returns > a FASTA formated sequence. I think it should return a SeqRecord object > (if you want this SeqRecord object to be printed or stored as FASTA, > just use formatIO). > Here are the proposed changes: http://www.pastecode.com.ar/f3baff314 > I can fill this as an enhancement in the bugtrack if you agree. So there is currently one function, retrieve_single, which can returns a handle but by default extracts and returns a FASTA record as a string. 
It does this by calling the parse_single function, which reads in the handle, parses the HTML file, and extracts just the FASTA style text, throwing away the other annotation data (like the chromosome or range requested). Here is an example URL constructed by hand:

http://siriusb.umdnj.edu:18080/EZRetrieve/single_r_run.jsp?org=0&AccType=0&input=BC014651&from=-200&to=200

Parsing HTML is nasty - especially if the site updates the formatting every so often. I suppose just looking for the FASTA sequence is fairly reliable. I can see the case for an EZRetrieve HTML to SeqRecord parser, but I would be tempted to try and parse more of the annotation.

How many people do you think are using the retrieve_single function? It would be very annoying for them if its behaviour suddenly changed. Maybe we can add a new parse function, and call it from retrieve_single if the optional argument parse=2?

Peter

From biopython at maubp.freeserve.co.uk Sat Mar 22 11:35:39 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Sat, 22 Mar 2008 11:35:39 +0000
Subject: [Biopython-dev] results of applying Clone Digger to the sources of BioPython project
In-Reply-To:
References:
Message-ID: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com>

> Hello.
>
> Clone Digger project is aimed to find software clones (duplicate code) in
> Python and Java programs.
>
> I have applied it to the source of BioPython and discovered several clone
> candidates.
>
> There are a lot of false positives caused by similar code in
> nlmmedline_*_format.py files, but maybe other clone candidates will be
> interesting for you.
>
> The results can be seen here:
> http://clonedigger.sourceforge.net/examples.html

Interesting. Does your tool know to ignore deprecated modules? e.g. when we have essentially copied a file from one location to another, and deprecated the original.
Some of these are from scanner/consumer parsers where there are two alternative consumers turning the data into different object representations. Other things like providing dictionary like objects seem to be reusing a lot of "boiler plate" code, and could probably be rationalised into a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both sharing apparently identical catalyse() methods may fall into the same class. Peter From peter.bulychev at gmail.com Sat Mar 22 21:31:30 2008 From: peter.bulychev at gmail.com (Peter Bulychev) Date: Sun, 23 Mar 2008 00:31:30 +0300 Subject: [Biopython-dev] results of applying Clone Digger to the sources of BioPython project In-Reply-To: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com> References: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com> Message-ID: Hello. No, unfortunately Clone Digger can not ignore deprecated modules. In order to obtain betters results automatically generated code and tests should be removed from the searched source tree by hands. Other things like providing dictionary like objects seem to be reusing > a lot of "boiler plate" code, and could probably be rationalised into > a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and > Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py > > Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both > sharing apparently identical catalyse() methods may fall into the same > class. > > I think this is the main purpose of Clone Digger: to find clone candidates and to help to create recommendations for refactoring. 2008/3/22, Peter : > > > Hello. > > > > Clone Digger project is aimed to find software clones (duplicate code) > in > > Python and Java programs. > > > > I have applied it to the source of BioPython and discovered several > clone > > candidates. 
> > > > There are a lot of false positives caused by similar code in > > nlmmedline_*_format.py files, but maybe other clone candidates will be > > interesting for you. > > > > The results can be seen here: > > http://clonedigger.sourceforge.net/examples.html > > > Interesting. Does your tool know to ignore deprecated modules? e.g. > when we have essentially copied a file from one location to another, a > deprecated the original. > > Some of these are from scanner/consumer parsers where there are two > alternative consumers turning the data into different object > representations. > > Other things like providing dictionary like objects seem to be reusing > a lot of "boiler plate" code, and could probably be rationalised into > a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and > Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py > > Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both > sharing apparently identical catalyse() methods may fall into the same > class. > > > Peter > -- Best regards, Peter Bulychev. From sbassi at gmail.com Tue Mar 25 20:46:48 2008 From: sbassi at gmail.com (Sebastian Bassi) Date: Tue, 25 Mar 2008 17:46:48 -0300 Subject: [Biopython-dev] Can't login into wiki Message-ID: Hello, I press the link to login into the wiki (http://biopython.org/w/index.php?title=Special:Userlogin&returnto=Biopython) but I am redirected to the same page without a login prompt. I found that this URL is dead (404): http://biopython.org/DIST/docs/api/public/trees.html (and it is linked from http://biopython.org/wiki/Getting_Started , last link). 
--
Molecular biology course for programmers: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Free Python tutorial: http://tinyurl.com/2az5d5

From biopython at maubp.freeserve.co.uk Tue Mar 25 20:55:23 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 25 Mar 2008 20:55:23 +0000
Subject: [Biopython-dev] Can't login into wiki
In-Reply-To:
References:
Message-ID: <320fb6e00803251355i7c0102d3l90e5d4680282922b@mail.gmail.com>

On Tue, Mar 25, 2008 at 8:46 PM, Sebastian Bassi wrote:
> Hello,
>
> I press the link to login into the wiki
> (http://biopython.org/w/index.php?title=Special:Userlogin&returnto=Biopython)
> but I am redirected to the same page without a login prompt.

It's not just you - the wiki is being a bit odd for me too right now, empty PHP pages etc. Maybe it needs rebooting again... which I think happens automatically every so often. If it doesn't clear up I'll email the OBF guys tomorrow.

> I found that this URL is dead (404):
> http://biopython.org/DIST/docs/api/public/trees.html
> (and it is linked from http://biopython.org/wiki/Getting_Started , last link).

It should probably be http://biopython.org/DIST/docs/api/ (the link on the documentation page is fine).

Peter

From mjldehoon at yahoo.com Tue Mar 25 23:55:20 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Tue, 25 Mar 2008 16:55:20 -0700 (PDT)
Subject: [Biopython-dev] Can't login into wiki
In-Reply-To: <320fb6e00803251355i7c0102d3l90e5d4680282922b@mail.gmail.com>
Message-ID: <912788.63526.qm@web62415.mail.re1.yahoo.com>

Peter wrote:
> I found that this URL is dead (404):
> http://biopython.org/DIST/docs/api/public/trees.html
> (and it is linked from http://biopython.org/wiki/Getting_Started , last link).

It should probably be http://biopython.org/DIST/docs/api/ (the link on the documentation page is fine).

I fixed this link now.

--Michiel
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 12:24:09 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 08:24:09 -0400 Subject: [Biopython-dev] [Bug 2475] New: BioSQL.Loader should reuse existing taxon entries in lineage Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2475 Summary: BioSQL.Loader should reuse existing taxon entries in lineage Product: Biopython Version: Not Applicable Platform: All OS/Version: All Status: NEW Severity: normal Priority: P2 Component: BioSQL AssignedTo: biopython-dev at biopython.org ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk Based on a report on the mailing list by Eric Gibert, http://lists.open-bio.org/pipermail/biopython/2008-March/004137.html http://lists.open-bio.org/pipermail/biopython/2008-March/004147.html The _get_taxon_id() function will add new entries to the taxon and taxon_name tables when a species isn't already defined. It will also generate entries for the lineage (for which we don't know the NCBI taxon names). At this point it *should* be re-using any existing entries for elements of the lineage. Note - this is complicated due to the re-use of the same latin names in different classes. It might be easier/safer just not to write the lineage at all? -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 12:34:40 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 08:34:40 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803261234.m2QCYekn009310@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 08:34 EST ------- See also Bug 2422 and this thread on the BioSQL mailing list: http://lists.open-bio.org/pipermail/biosql-l/2008-March/001196.html In particular Hilmar Lapp from BioSQL wrote in reply to trying to reuse existing taxon table entries based on string matching to the scientific name field in the taxon_name table, which I said sounded a little unreliable: > It's pretty unreliable actually. There is not only synonymy > but also rampant homonymy in taxonomic names. There are > plenty of examples for the same scientific name in use for a > plant and for some animal, for example. So in order to be > unambiguous you will need to know (and check) the kingdom. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
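Hilmar's point about homonymy can be made concrete with a small sketch. It assumes a hypothetical in-memory list of (taxon_id, scientific name, kingdom) rows rather than the real BioSQL tables, and the ids are made up. Morus is used because the same genus name is applied to both plants (mulberries) and birds (gannets). find_taxon is an illustrative helper, not Biopython code:

```python
# Hypothetical rows: (internal taxon_id, scientific name, kingdom).
TAXA = [
    (1, "Morus", "Viridiplantae"),  # mulberries
    (2, "Morus", "Metazoa"),        # gannets
    (3, "Coleoptera", "Metazoa"),
]


def find_taxon(name, kingdom=None):
    """Return the taxon_id matching name, or None if there is no match.

    Raises LookupError when the name alone matches more than one entry,
    instead of silently picking one of them.
    """
    matches = [
        taxon_id
        for taxon_id, taxon_name, taxon_kingdom in TAXA
        if taxon_name == name and (kingdom is None or taxon_kingdom == kingdom)
    ]
    if len(matches) > 1:
        raise LookupError("ambiguous name %r: taxon ids %r" % (name, matches))
    return matches[0] if matches else None


if __name__ == "__main__":
    print(find_taxon("Coleoptera"))        # unique name, no kingdom needed
    print(find_taxon("Morus", "Metazoa"))  # kingdom resolves the homonym
    try:
        find_taxon("Morus")
    except LookupError as err:
        print(err)
```

Refusing to guess on an ambiguous name is one possible policy; the thread does not say which behaviour a loader should adopt.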
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 12:44:01 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 08:44:01 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803261244.m2QCi15R009864@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 08:44 EST ------- Created an attachment (id=883) --> (http://bugzilla.open-bio.org/attachment.cgi?id=883&action=view) Patch to BioSQL/Loader.py to not record the lineage for new species This patch takes the simple route out - when loading a sequence into the database with a new species (not already in the taxon tables), we ONLY add the new species to the taxon and taxon_name tables. This DOES NOT attempt to record the whole lineage, adding or reusing existing taxon entries. Both the test_BioSQL and test_BioSQL_SeqIO unit tests still pass with this. I prefer this solution as it avoids any ambiguous heuristics in matching existing taxon names based on string comparions. This does mean Biopython won't match BioPerl is this regard, as I understand that BioPerl currently tries to record the full lineage. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. 
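The "species only" behaviour described in the patch comment above might look roughly like the following. This is a sketch under stated assumptions: it uses a much simplified version of the BioSQL taxon and taxon_name tables in an in-memory SQLite database (the real schema has more columns and constraints), and make_db and get_or_add_species are hypothetical helpers, not the code in BioSQL/Loader.py:

```python
import sqlite3


def make_db():
    """Create a simplified, illustrative subset of the BioSQL taxon tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE taxon ("
        " taxon_id INTEGER PRIMARY KEY,"
        " ncbi_taxon_id INTEGER UNIQUE,"
        " parent_taxon_id INTEGER,"
        " node_rank TEXT)"
    )
    conn.execute(
        "CREATE TABLE taxon_name ("
        " taxon_id INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " name_class TEXT NOT NULL)"
    )
    return conn


def get_or_add_species(conn, ncbi_taxon_id, scientific_name):
    """Return the taxon_id for a species, adding a provisional entry if new.

    Only the species itself is recorded; no lineage rows are created or
    matched, so parent_taxon_id stays NULL.
    """
    row = conn.execute(
        "SELECT taxon_id FROM taxon WHERE ncbi_taxon_id = ?", (ncbi_taxon_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    cursor = conn.execute(
        "INSERT INTO taxon (ncbi_taxon_id, node_rank) VALUES (?, 'species')",
        (ncbi_taxon_id,),
    )
    taxon_id = cursor.lastrowid
    conn.execute(
        "INSERT INTO taxon_name (taxon_id, name, name_class)"
        " VALUES (?, ?, 'scientific name')",
        (taxon_id, scientific_name),
    )
    return taxon_id


if __name__ == "__main__":
    db = make_db()
    first = get_or_add_species(db, 9606, "Homo sapiens")
    second = get_or_add_species(db, 9606, "Homo sapiens")
    print(first == second)
```

Keying the lookup on the NCBI taxon id rather than on the name sidesteps the string-matching ambiguity raised in the previous comment, at the cost of leaving the lineage to be filled in later.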
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 18:51:18 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 14:51:18 -0400 Subject: [Biopython-dev] [Bug 2477] New: SeqIO.parse does not handle embl files Message-ID: http://bugzilla.open-bio.org/show_bug.cgi?id=2477 Summary: SeqIO.parse does not handle embl files Product: Biopython Version: Not Applicable Platform: Macintosh OS/Version: Mac OS Status: NEW Severity: normal Priority: P2 Component: Main Distribution AssignedTo: biopython-dev at biopython.org ReportedBy: p.foster at nhm.ac.uk This is in 1.45, but I did not see it in 1.43. (1.45 is not a Bugzilla option at the moment ...) If fh is a handle to an embl format file, then SeqIO.parse(fh, 'embl') dies. It worked (not perfectly) in 1.43. -- Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug, or are watching the assignee. From bugzilla-daemon at portal.open-bio.org Wed Mar 26 19:21:41 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Wed, 26 Mar 2008 15:21:41 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803261921.m2QJLfbe007389@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Version|Not Applicable |1.45 ------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 15:21 EST ------- I've fixed the Bugzilla version field - thanks for the reminder. Could you give more information please? e.g. a specific EMBL file, and the error you are seeing. Thanks, Peter. 
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 07:59:54 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 03:59:54 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803270759.m2R7xsXA006767@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 ------- Comment #2 from p.foster at nhm.ac.uk 2008-03-27 03:59 EST ------- Created an attachment (id=888) --> (http://bugzilla.open-bio.org/attachment.cgi?id=888&action=view) test case It is a multi-bug. There is a bug that prevents 1.45 from reading embl files, and there is another bug, visible in 1.43 (at least) where it at least parses embl files, but imperfectly. From bugzilla-daemon at portal.open-bio.org Thu Mar 27 10:49:33 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 06:49:33 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803271049.m2RAnXpj015624@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 ------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 06:49 EST ------- Thanks for the clarification. I can reproduce the problem here. It looks like they may have tweaked their file format slightly. 
Biopython will be ignoring the apparently new PA line, which isn't described here: http://www.ebi.ac.uk/webin-align/fflink2.html You can also fetch the first problem record from their webpage, choose "Save", "ASCII text/table", "complete entries" http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-e+[EMBLCDS:AAA03323]+-newId As a minor point, personally I find the following style simpler:

    from Bio import SeqIO
    fName = 'twoEmblRecords.embl'
    f = open(fName)
    # SeqIO.parse returns an iterator of SeqRecord objects
    s = SeqIO.parse(f, 'embl')
    for rec in s:
        print(rec.description)
        print(rec.annotations['taxonomy'])
    f.close()

(you may of course have good reason for using the .next() method explicitly) I'll take a look at this bug now... From bugzilla-daemon at portal.open-bio.org Thu Mar 27 11:37:16 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 07:37:16 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803271137.m2RBbGvg018455@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 07:37 EST ------- As you said, this is a multi-part bug! To try this out, you will need to update the files Bio/GenBank/Scanner.py and __init__.py, which are now in CVS. If you are not familiar with CVS, the easier method would be to download the two files from here: http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/?cvsroot=biopython#dirlist Note there is a delay of an hour or so before it will show my changes. You can see where the files should be put from the stack trace. Please let me know how you get on (by posting on this bug). 

Missing AC lines
================
All the EMBL test cases we had included an AC line, which is used to set the SeqRecord's id property. Biopython 1.45 was failing because your example has no AC line. I have updated CVS to fall back on the ID line.

Multiple DE lines
=================
Already fixed as of Biopython 1.44

Multiple OC lines
=================
Updated Biopython CVS to cope with multi-line taxonomy lineage

PA lines (parent accessions)
============================
You didn't report this, but we are currently ignoring the PA lines. Quoting ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt PA line - contains the accession.version of the "parent" EMBL entry (entry where the CDS is annotated) e.g. a whole contig, not just this one CDS/gene. We could record this in the SeqRecord's annotations dictionary as a list of strings under key 'parent-accessions'. What do you think? Peter

From bugzilla-daemon at portal.open-bio.org Thu Mar 27 15:50:36 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 11:50:36 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803271550.m2RFoacs002027@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 ------- Comment #5 from p.foster at nhm.ac.uk 2008-03-27 11:50 EST ------- I got those two files, and they seem to have fixed everything. Thanks muchly. The suggestion of de-ignoring the PA line sounds fine (although I have no use for it at the moment). -Peter F. 
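The ID-line fallback and the proposed 'parent-accessions' annotation from comment #4 can be sketched roughly as follows. This is not the code in Bio/GenBank/Scanner.py: the helper name extract_ids and the sample lines are made up for illustration, and only the two-letter line codes (ID, AC, PA) and the annotation key come from the thread.

```python
def extract_ids(lines):
    """Pick a record id and collect PA lines from EMBL-style lines.

    Hypothetical helper for illustration only, not Biopython's parser.
    """
    accession = None      # first accession from an AC line, if any
    id_fallback = None    # first token of the ID line
    parents = []          # values of all PA lines
    for line in lines:
        code = line[:2]
        data = line[2:].strip()
        if not data:
            continue  # e.g. the "//" record terminator
        if code == "AC" and accession is None:
            accession = data.split(";")[0].strip()
        elif code == "ID" and id_fallback is None:
            id_fallback = data.split(";")[0].split()[0]
        elif code == "PA":
            parents.append(data.rstrip(";").strip())
    return {
        # Prefer the AC line; fall back on the ID line when it is missing.
        "id": accession if accession is not None else id_fallback,
        "annotations": {"parent-accessions": parents},
    }


# Made-up record without an AC line (identifiers are not real accessions).
sample = [
    "ID   XYZ00001; SV 1; linear; genomic DNA; STD; ENV; 12 BP.",
    "PA   XY000001.1",
    "DE   made-up description",
    "//",
]
record = extract_ids(sample)
print(record["id"])
print(record["annotations"]["parent-accessions"])
```

Keeping the PA values as a list of strings mirrors the suggestion in comment #4; whether GenBank or SwissProt/UniProt have a parallel field is left open in the thread.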
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 16:22:13 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Thu, 27 Mar 2008 12:22:13 -0400 Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files In-Reply-To: Message-ID: <200803271622.m2RGMDWV003784@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2477 biopython-bugzilla at maubp.freeserve.co.uk changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED ------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 12:22 EST ------- OK, marking as fixed. I also included AAA03323 as a unit test, as we were lacking an example without an AC line. I'll leave the PA line issue alone for the time being; it would be wise to check if there are any parallels in GenBank or SwissProt/UniProt before doing anything so that they are all handled consistently. Thanks for your report Peter. Peter From bugzilla-daemon at portal.open-bio.org Sun Mar 30 02:53:41 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sat, 29 Mar 2008 22:53:41 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803300253.m2U2rfLl002179@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #3 from ericgibert at yahoo.fr 2008-03-29 22:53 EST ------- I would like to propose the following solution: 1) add an extra optional parameter to load(): fetchNCBItaxonomy = False --> so no impact on existing code. 
If the user calls load() with True, then: 2) after the species is inserted into the taxon/taxon_name tables, the XML data is fetched from NCBI's taxonomy database; 3) the XML data is used to update the taxon/taxon_name tables, respecting the uniqueness of the records. I already have part of the code; I just need to change it so that if a taxon already exists, the new taxon points to the existing one. Comments? Eric From bugzilla-daemon at portal.open-bio.org Sun Mar 30 11:41:25 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Sun, 30 Mar 2008 07:41:25 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803301141.m2UBfPMC001648@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-30 07:41 EST ------- I quite like the idea of fetching the new taxon information from the NCBI as needed to record an accurate lineage. However, what happens if: (a) The network is down? Raise an exception maybe? (b) The NCBI doesn't have this Taxon ID (i.e. it's invalid, or so new that their database is out of date)? Raise an exception? Eric, could you attach your taxonomy XML code to this bug? We'd probably want to start by adding taxonomy XML parsing to Bio.Entrez (which I assume you are using to fetch the XML data). What about sequences where we don't have a taxon ID, but we do have a species name? (which may happen with a sequence which wasn't read from a GenBank file). 
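The "new taxon points to the already existing one" step from comment #3 might look something like the sketch below. It is only an illustration under stated assumptions: the two tables are a cut-down stand-in for the BioSQL taxon and taxon_name tables (only a few columns), the function names are hypothetical, the lineages are abbreviated, and nothing is fetched from the NCBI, so the network and missing-ID questions from comment #4 are not addressed.

```python
import sqlite3


def make_db():
    """Create an in-memory database with simplified taxon tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE taxon ("
        " taxon_id INTEGER PRIMARY KEY,"
        " ncbi_taxon_id INTEGER UNIQUE,"
        " parent_taxon_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE taxon_name (taxon_id INTEGER, name TEXT, name_class TEXT)"
    )
    return conn


def get_or_create_taxon(conn, ncbi_taxon_id, name, parent_taxon_id):
    """Return the taxon_id for an NCBI taxon ID, inserting it only if new."""
    row = conn.execute(
        "SELECT taxon_id FROM taxon WHERE ncbi_taxon_id = ?", (ncbi_taxon_id,)
    ).fetchone()
    if row is not None:
        return row[0]  # reuse the existing entry
    cur = conn.execute(
        "INSERT INTO taxon (ncbi_taxon_id, parent_taxon_id) VALUES (?, ?)",
        (ncbi_taxon_id, parent_taxon_id),
    )
    taxon_id = cur.lastrowid
    conn.execute(
        "INSERT INTO taxon_name (taxon_id, name, name_class)"
        " VALUES (?, ?, 'scientific name')",
        (taxon_id, name),
    )
    return taxon_id


def record_lineage(conn, lineage):
    """Store a root-to-species list of (ncbi_taxon_id, name) pairs."""
    parent = None
    for ncbi_taxon_id, name in lineage:
        parent = get_or_create_taxon(conn, ncbi_taxon_id, name, parent)
    return parent  # taxon_id of the species itself


conn = make_db()
# Abbreviated lineages; both share the "Bacteria" node.
record_lineage(conn, [(2, "Bacteria"), (561, "Escherichia"),
                      (562, "Escherichia coli")])
record_lineage(conn, [(2, "Bacteria"), (570, "Klebsiella"),
                      (573, "Klebsiella pneumoniae")])
taxon_count = conn.execute("SELECT COUNT(*) FROM taxon").fetchone()[0]
print(taxon_count)
```

Matching on the NCBI taxon ID rather than on names sidesteps the string-comparison heuristics that comment #2 wanted to avoid, but it does nothing for sequences that only carry a species name.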
From mjldehoon at yahoo.com Sun Mar 30 14:49:41 2008 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Sun, 30 Mar 2008 07:49:41 -0700 (PDT) Subject: [Biopython-dev] Bio.Entrez XML parsing Message-ID: <864047.785.qm@web62410.mail.re1.yahoo.com> > Eric, could you attach your taxonomy XML code to this bug? > We'd probably want to start by adding taxonomy XML parsing > to Bio.Entrez (which I assume you are using to fetch the XML data). I've done some thinking about XML parsers for Bio.Entrez. I propose to add a function read() to Bio.Entrez, which returns a record suitable for the type of XML file we're trying to read (as determined by the corresponding DTD file). Now, the various XML types can be very different from each other, and I think the actual parsing should be done by a specialized submodule of Bio.Entrez. For example, one Bio.Entrez.EInfo, one Bio.Entrez.ESummary, and so on. For Bio.Entrez.EFetch, there seem to be many different XMLs, so we'd probably have a number of submodules for it (one of them for the taxonomy XML). The first tag received by the read() function in Bio.Entrez tells it which type of XML it is receiving (have a look at the XML files shown in chapter 6 of the tutorial for some examples), and can then decide which of the submodules of Bio.Entrez should be used for the actual parsing. Otherwise, the read() function in Bio.Entrez does very little; the actual work is done by the submodules. If the read() function encounters an XML type for which no parser is yet available, it can raise a NotImplementedError exception. Comments, anybody? --Michiel 
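A minimal sketch of the dispatch idea in the message above, assuming the XML type can be recognised from the root element name. This is not the Bio.Entrez implementation: the module-level dictionary _parsers, the helper _read_einfo and the shape of the returned record are all hypothetical, and for brevity the whole document is parsed before dispatching rather than reacting to the first tag as it arrives.

```python
import io
import xml.etree.ElementTree as ET


def _read_einfo(root):
    """Hypothetical stand-in for a specialized EInfo submodule."""
    return {"DbList": [node.text for node in root.iter("DbName")]}


# Maps the root tag of an XML document to the function that parses it.
_parsers = {"eInfoResult": _read_einfo}


def read(handle):
    """Parse an XML handle with whichever parser matches its root tag."""
    root = ET.parse(handle).getroot()
    parser = _parsers.get(root.tag)
    if parser is None:
        raise NotImplementedError("no parser available for <%s>" % root.tag)
    return parser(root)


einfo_xml = (
    "<eInfoResult><DbList>"
    "<DbName>pubmed</DbName><DbName>taxonomy</DbName>"
    "</DbList></eInfoResult>"
)
record = read(io.StringIO(einfo_xml))
print(record["DbList"])
```

With this layout read() itself stays small, and supporting another XML type means adding one entry to the lookup table.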
From sdavis2 at mail.nih.gov Mon Mar 31 00:51:07 2008 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Sun, 30 Mar 2008 20:51:07 -0400 Subject: [Biopython-dev] Bio.Entrez XML parsing In-Reply-To: <864047.785.qm@web62410.mail.re1.yahoo.com> References: <864047.785.qm@web62410.mail.re1.yahoo.com> Message-ID: <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com> On Sun, Mar 30, 2008 at 10:49 AM, Michiel de Hoon wrote: > > > Eric, could you attach your taxonomy XML code to this bug? > > We'd probably want to start by adding taxonomy XML parsing > > to Bio.Entrez (which I assume you are using to fetch the XML data). > > I've done some thinking about XML parsers for Bio.Entrez. > > I propose to add a function read() to Bio.Entrez, which returns a record suitable for the type of XML file we're trying to read (as determined by the corresponding DTD file). > > Now, the various XML types can be very different from each other, and I think the actual parsing should be done by a specialized submodule of Bio.Entrez. For example, one Bio.Entrez.EInfo, one Bio.Entrez.ESummary, and so on. For Bio.Entrez.EFetch, there seem to be many different XMLs, so we'd probably have a number of submodules for it (one of them for the taxonomy XML). > > The first tag received by the read() function in Bio.Entrez tells it which type of XML it is receiving (have a look at the XML files shown in chapter 6 of the tutorial for some examples), and can then decide which of the submodules of Bio.Entrez should be used for the actual parsing. Otherwise, the read() function in Bio.Entrez does very little; the actual work is done by the submodules. > > If the read() function encounters an XML type for which no parser is yet available, it can raise a NotImplementedError exception. > > Comments, anybody? This makes sense. However, it seems that there needs to be a way to "register" a parser with read() so that users can extend their local installation with a specialized parser. 
In other words, it seems that a way to dynamically register a parser with read() would be helpful. Or am I missing something? Sean From biopython at maubp.freeserve.co.uk Mon Mar 31 11:25:05 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 31 Mar 2008 12:25:05 +0100 Subject: [Biopython-dev] Bio.Entrez XML parsing In-Reply-To: <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com> References: <864047.785.qm@web62410.mail.re1.yahoo.com> <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com> Message-ID: <320fb6e00803310425u478fc938w2ff426c4eae32d99@mail.gmail.com> On Mon, Mar 31, 2008 at 1:51 AM, Sean Davis wrote: > This makes sense. However, it seems that there needs to be a way to > "register" a parser with read() so that users can extend their local > installation with a specialized parser. In other words, it seems that > a way to dynamically register a parser with read() would be helpful. > Or am I missing something? I like Michiel's plan. The mapping could be as simple as a (private) dictionary in Bio.Entrez, mapping formats to parser objects/functions - as done in Bio.SeqIO - which lets the user add new parsers or override the built-in ones should they so desire. Peter From tiagoantao at gmail.com Mon Mar 31 14:54:38 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 31 Mar 2008 15:54:38 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN Message-ID: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> Hi, I would like to start working on the statistical part (actually the most important part) of Bio.PopGen and on the HapMap part. My problem is with the CVS to SVN conversion. I cannot understand if I can go forward and where (i.e. on the SVN or the CVS repository)? 
In any case, I can wait with committing, so there is no rush, but eventually I will have to commit somewhere ;) Tiago From bugzilla-daemon at portal.open-bio.org Mon Mar 31 15:22:20 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 31 Mar 2008 11:22:20 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803311522.m2VFMKvU003831@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #5 from ericgibert at yahoo.fr 2008-03-31 11:22 EST ------- I attached the XML parser. Note that I did not dig too far into raising errors. This is not yet the full solution for the taxon/taxon_name tables of BioSQL, but it is the first step. Please comment on my programming style and on whether you want me to raise errors. Note that Bio.Entrez already raises some errors. From bugzilla-daemon at portal.open-bio.org Mon Mar 31 15:24:06 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 31 Mar 2008 11:24:06 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803311524.m2VFO6wc004008@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #6 from ericgibert at yahoo.fr 2008-03-31 11:24 EST ------- Created an attachment (id=890) --> (http://bugzilla.open-bio.org/attachment.cgi?id=890&action=view) Parse a Taxonomy record from NCBI 
From biopython at maubp.freeserve.co.uk Mon Mar 31 15:45:07 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 31 Mar 2008 16:45:07 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN In-Reply-To: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> Message-ID: <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> On Mon, Mar 31, 2008 at 3:54 PM, Tiago Antão wrote: > Hi, > > I would like to start working on the statistical part (actually the > most important part) of Bio.PopGen and on the HapMap part. > > My problem is with the CVS to SVN conversion. I cannot understand if I > can go forward and where (i.e. on the SVN or the CVS repository)? > > In any case, I can wait with committing, so there is no rush, but > eventually I will have to commit somewhere ;) In the short term, we are still using CVS. I've only been making relatively small changes as I anticipate the move to SVN will happen shortly... Are there any objections to doing it in the next fortnight? Chris - could you find out when would suit the OBF guys? Maybe come up with two suggested time slots in the next month? Peter From tiagoantao at gmail.com Mon Mar 31 18:32:06 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 31 Mar 2008 19:32:06 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN In-Reply-To: <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> Message-ID: <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com> When on SVN I would like to consider branching for PopGen. AFAIK branching on SVN costs very little (only when you make changes does SVN copy the content from the original branch). 
This would have the big advantage that I could make my changes freely without impact on Michiel's release cycle (or breaking the SVN head for some reason). Whenever I get something stable I just merge back. There are good reasons NOT to branch, so this might not be a good idea... But considering that I am the only person that changes PopGen I don't think merging will be an issue at all... Any comments? On Mon, Mar 31, 2008 at 4:45 PM, Peter wrote: > > On Mon, Mar 31, 2008 at 3:54 PM, Tiago Antão wrote: > > Hi, > > > > I would like to start working on the statistical part (actually the > > most important part) of Bio.PopGen and on the HapMap part. > > > > My problem is with the CVS to SVN conversion. I cannot understand if I > > can go forward and where (i.e. on the SVN or the CVS repository)? > > > > In any case, I can wait with committing, so there is no rush, but > > eventually I will have to commit somewhere ;) > > In the short term, we are still using CVS. I've only been making > relatively small changes as I anticipate the move to SVN will happen > shortly... > > Are there any objections to doing it in the next fortnight? Chris - > could you find out when would suit the OBF guys? Maybe come up with > two suggested time slots in the next month? > > Peter > -- http://www.tiago.org From biopython at maubp.freeserve.co.uk Mon Mar 31 19:04:35 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Mon, 31 Mar 2008 20:04:35 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN In-Reply-To: <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com> References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com> Message-ID: <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com> On Mon, Mar 31, 2008 at 7:32 PM, Tiago Antão wrote: > When on SVN I would like to consider branching for PopGen. 
AFAIK > branching on SVN costs very little (only when you make changes does > SVN copy the content from the original branch). > > This would have the big advantage that I could make my changes freely > without impact on Michiel's release cycle (or breaking the SVN head > for some reason). Whenever I get something stable I just merge back. > > There are good reasons NOT to branch, so this might not be a good > idea... But considering that I am the only person that changes PopGen > I don't think merging will be an issue at all... Any comments? I had been wondering about taking advantage of SVN to explore my Bio.AlignIO plans and/or improvements to the alignment object. I think I will need to read up on SVN and how it handles merges and branches before I try this. There is a lot to be said for having a single stable trunk - it certainly makes things simpler for any new developers to get to grips with things. Peter From tiagoantao at gmail.com Mon Mar 31 19:08:46 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 31 Mar 2008 20:08:46 +0100 Subject: [Biopython-dev] Bio.PopGen and CVS/SVN In-Reply-To: <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com> References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com> <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com> <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com> <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com> Message-ID: <6d941f120803311208k6b6c9d1ah58c7808e0fbd0e2c@mail.gmail.com> On Mon, Mar 31, 2008 at 8:04 PM, Peter wrote: > There is a lot to be said for having a single stable trunk - it > certainly makes things simpler for any new developers to get to grips > with things. It is one of those issues where there is no clear answer. Maybe a case by case analysis? I think having 5 gazillion branches would not be a good idea ever, but in the Biopython case many modules are somewhat self contained, making merging an easier exercise. 
Tiago From tiagoantao at gmail.com Mon Mar 31 22:13:11 2008 From: tiagoantao at gmail.com (Tiago Antão) Date: Mon, 31 Mar 2008 23:13:11 +0100 Subject: [Biopython-dev] Genbank dbSNP support Message-ID: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com> Hi, Any plans for dbSNP support? http://www.ncbi.nlm.nih.gov/SNP/index.html I think I would volunteer to implement this. A simple solution would be to add both databases and return types. Michiel (I suppose this is code that you are actively maintaining, or is it Peter?), can I send you a diff? I have done this once already for genome - http://portal.open-bio.org/pipermail/biopython/2007-January/003347.html dbSNP can return different types ( http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html#rettypeparam ) so a few parsers would be needed for complete support. But that can be done later... -- http://www.tiago.org From biopython at maubp.freeserve.co.uk Mon Mar 31 23:01:10 2008 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 1 Apr 2008 00:01:10 +0100 Subject: [Biopython-dev] Genbank dbSNP support In-Reply-To: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com> References: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com> Message-ID: <320fb6e00803311601x573c104cx1beb7035a14ef03c@mail.gmail.com> On Mon, Mar 31, 2008 at 11:13 PM, Tiago Antão wrote: > Hi, > > Any plans for dbSNP support? > http://www.ncbi.nlm.nih.gov/SNP/index.html > > I think I would volunteer to implement this. A simple solution would > be to add both databases and return types. Michiel (I suppose this is > code that you are actively maintaining, or is it Peter?), can I send > you a diff? I have done this once already for genome - > http://portal.open-bio.org/pipermail/biopython/2007-January/003347.html I think Michiel has been dealing with this sort of stuff (NCBIDictionary and Bio.Entrez). I would file an enhancement bug, and attach your patch to it. 
> dbSNP can return different types ( > http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html#rettypeparam > ) so a few parsers would be needed for complete support. But that can > be done later... We should already be able to parse their Fasta, GenBank or GenPept output. The lists of IDs should also be trivial. I haven't looked at the other formats. Peter From bugzilla-daemon at portal.open-bio.org Mon Mar 31 23:23:46 2008 From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org) Date: Mon, 31 Mar 2008 19:23:46 -0400 Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing taxon entries in lineage In-Reply-To: Message-ID: <200803312323.m2VNNku4026068@portal.open-bio.org> http://bugzilla.open-bio.org/show_bug.cgi?id=2475 ------- Comment #7 from ericgibert at yahoo.fr 2008-03-31 19:23 EST ------- Created an attachment (id=891) --> (http://bugzilla.open-bio.org/attachment.cgi?id=891&action=view) refactoring and search by name Please discard the previous attachment. This newer version includes a static method returning a list of Taxonomy objects based on a scientific name. It is then possible to test the length of the returned list: 0 for no match, 1 for a unique taxon, more if there is ambiguity. Ambiguity can be resolved using get_taxon_by_rank("order"), for example.