From bugzilla-daemon at portal.open-bio.org Sat Mar 1 03:54:19 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 1 Mar 2008 03:54:19 -0500
Subject: [Biopython-dev] [Bug 2464] from Bio import db doesn't work?
In-Reply-To:
Message-ID: <200803010854.m218sJFT023721@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2464
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |DUPLICATE
------- Comment #1 from mdehoon at ims.u-tokyo.ac.jp 2008-03-01 03:54 EST -------
Duplicate of Bug #2393, which was fixed in CVS.
*** This bug has been marked as a duplicate of bug 2393 ***
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sat Mar 1 03:54:23 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 1 Mar 2008 03:54:23 -0500
Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with
release 1.44
In-Reply-To:
Message-ID: <200803010854.m218sNPD023746@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2393
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |patrikd at gmail.com
------- Comment #13 from mdehoon at ims.u-tokyo.ac.jp 2008-03-01 03:54 EST -------
*** Bug 2464 has been marked as a duplicate of this bug. ***
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
From mjldehoon at yahoo.com Sat Mar 1 03:52:16 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sat, 1 Mar 2008 00:52:16 -0800 (PST)
Subject: [Biopython-dev] deprecation?
In-Reply-To: <47C7FB4C.40607@umh.es>
Message-ID: <504389.32803.qm@web62405.mail.re1.yahoo.com>
Dear Gregorio,
Thanks for letting us know.
Could you show us what exactly you are trying to do in your script?
This function was deprecated because there were several functions in Biopython doing nearly the same thing, and we're trying to converge on one function.
So probably, the best thing would be to avoid using Bio\config\DBRegistry.py.
--Michiel.
Gregorio Fernandez wrote: Dear Sir,
I got this message in one of my scripts. Can I have this feature
available?
C:\Python25\lib\site-packages\Bio\config\DBRegistry.py:149: DeprecationWarning:
Concurrent behavior has been deprecated, as this functionality needs
Bio.MultiProc, which itself has been deprecated. If you need the concurrent
behavior, please let the Biopython developers know by sending an email to
biopython-dev at biopython.org to avoid permanent removal of this feature.
DeprecationWarning)
Thanks
Gregorio
--
Gregorio J. Fernandez Ballester
Instituto de Biología Molecular y Celular
Universidad Miguel Hernández
Edificio Torregaitán.
Avda. de la Universidad, s/n. 03202
Elche (Alicante)
E-mail: gregorio at umh.es
Telf: 966 65 84 41
Fax: 966 65 87 58
_______________________________________________
Biopython-dev mailing list
Biopython-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython-dev
From bugzilla-daemon at portal.open-bio.org Mon Mar 3 16:53:59 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 3 Mar 2008 16:53:59 -0500
Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes
assert to fail when it should pass
In-Reply-To:
Message-ID: <200803032153.m23LrxP4023475@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2437
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-03 16:53 EST -------
Defining __eq__ and __ne__ methods for the Alphabet class would probably work,
but we would also have to do this for the AlphabetEncoder "decorator" class.
I'm a little wary of this...
def __ne__(self, other):
    """Check if this alphabet object != another alphabet"""
    return not self == other
def __eq__(self, other):
    """Check if this alphabet object == another alphabet"""
    #TODO - what exactly do we want to check here?
    if id(self) == id(other):
        return True
    if not isinstance(other, Alphabet) \
    and not isinstance(other, AlphabetEncoder):
        raise ValueError("Comparing an alphabet to a non-alphabet")
    if self.__class__ != other.__class__:
        return False
    if self.size != other.size:
        return False
    if self.letters != other.letters:
        return False
    if dir(self) != dir(other):
        return False
    for attr in ["gap_char", "stop_symbol"]:
        if hasattr(self, attr) != hasattr(other, attr):
            return False
        if hasattr(self, attr) and hasattr(other, attr) \
        and getattr(self, attr) != getattr(other, attr):
            return False
    #Close enough?
    return True
Relaxing the assertion in Bio.Translate would be much safer in terms of any
potential side effects.
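For illustration only, here is a minimal self-contained sketch of value-based alphabet comparison. The `ToyAlphabet` classes are hypothetical stand-ins and not the real Bio.Alphabet classes. Unlike the draft above, the sketch returns NotImplemented for non-alphabets rather than raising, which is an assumption on my part and not something settled in this bug.

```python
class ToyAlphabet(object):
    """Hypothetical stand-in for an alphabet class (not Bio.Alphabet)."""
    letters = None
    size = None

    def __eq__(self, other):
        # Identical objects are trivially equal.
        if self is other:
            return True
        # Let Python fall back to its default handling for unrelated types.
        if not isinstance(other, ToyAlphabet):
            return NotImplemented
        # Different classes are treated as different alphabets.
        if type(self) is not type(other):
            return False
        return (self.letters, self.size) == (other.letters, other.size)

    def __ne__(self, other):
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Defining __eq__ removes the default hash on Python 3, so restore one
        # that agrees with the equality test above.
        return hash((type(self), self.letters, self.size))


class ToyDNAAlphabet(ToyAlphabet):
    letters = "GATC"
    size = 1


class ToyRNAAlphabet(ToyAlphabet):
    letters = "GAUC"
    size = 1
```

With this sketch two separate `ToyDNAAlphabet()` instances compare equal, while a DNA and an RNA instance do not. Whether the real classes should behave this way is exactly the open question.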
From mjldehoon at yahoo.com Thu Mar 6 10:24:47 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 6 Mar 2008 07:24:47 -0800 (PST)
Subject: [Biopython-dev] New Biopython release
Message-ID: <48921.43822.qm@web62403.mail.re1.yahoo.com>
Hi everybody,
Let's make a new release (1.45). I'm thinking of Friday 21st, which gives us about two weeks. The current Biopython release (1.44) has a nasty bug that causes an error with one of the Bio.GenBank examples in the tutorial. This bug has since been fixed in CVS.
If you have any code that is ready to be submitted to CVS, now would be a good time to do so. If your code is not yet ready for prime time, please don't submit it to CVS until after the release to avoid any last-minute problems.
Biopython 1.44 had a large number of deprecations, but I feel it is too soon to remove them from the release completely. Bio.Blast.blast and Bio.Blast.blasturl have been deprecated for several releases now, so if there are no objections I think we should remove them. Bio.Kabat has been deprecated since release 1.43. Since it has few (if any) users, I think we should remove it too.
Also, please have a look at the Biopython bugs that are still open to see if there's anything we can do about them.
--Michiel.
From bugzilla-daemon at portal.open-bio.org Sat Mar 8 15:25:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 8 Mar 2008 15:25:06 -0500
Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes
assert to fail when it should pass
In-Reply-To:
Message-ID: <200803082025.m28KP661006291@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2437
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-08 15:25 EST -------
Changing Bio/Translate.py line 14+ and 36+ from this:
assert seq.alphabet == self.table.nucleotide_alphabet, \
...
to this:
#Allow different instances of the same class to be used:
assert seq.alphabet.__class__ == \
self.table.nucleotide_alphabet.__class__, \
...
seems to resolve the original bug report. I'd like to check this doesn't
affect any of the unit tests under Linux - Windows looks OK.
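A small sketch of why the relaxed assertion helps, assuming an alphabet class that defines no `__eq__` of its own. `ToyDNA`, `check_strict` and `check_relaxed` are hypothetical names used only for this illustration.

```python
class ToyDNA(object):
    """Hypothetical alphabet class with no __eq__ defined."""


def check_strict(seq_alphabet, table_alphabet):
    # Default == compares identity, so separate instances never match.
    return seq_alphabet == table_alphabet


def check_relaxed(seq_alphabet, table_alphabet):
    # Comparing the classes accepts any instance of the same class.
    return seq_alphabet.__class__ == table_alphabet.__class__


a, b = ToyDNA(), ToyDNA()
print(check_strict(a, b))   # False
print(check_relaxed(a, b))  # True
```

The relaxed check still rejects an instance of a different class, so it stays close to the intent of the original assertion.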
From bugzilla-daemon at portal.open-bio.org Mon Mar 10 06:12:13 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 10 Mar 2008 06:12:13 -0400
Subject: [Biopython-dev] [Bug 1999] new frame translation method
In-Reply-To:
Message-ID: <200803101012.m2AACD7k003033@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1999
------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp 2008-03-10 06:12 EST -------
Since SeqUtils.frameTranslations and SeqUtils.six_frame_translations are so
similar, I think we should keep only one of these functions. Preferably named
"six_frame_translations", for backward compatibility.
Also, I think we should not require the seqO argument to be a Seq object.
If this function is to replace the existing SeqUtils.six_frame_translations, we
should make sure to keep all the existing functionality of that function. I
believe the GC content calculation is currently missing in
SeqUtils.frameTranslations.
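To make the two requirements concrete (accept a plain string, keep the GC content), here is a rough sketch. It is not the SeqUtils code: the function names are hypothetical, it only handles unambiguous A/C/G/T, and it returns the six codon-aligned nucleotide frames without translating them.

```python
def reverse_complement(dna):
    """Reverse complement of an unambiguous DNA string."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(dna))


def gc_fraction(dna):
    """Fraction of G and C letters; 0.0 for an empty sequence."""
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / float(len(dna))


def six_frames(seq):
    """Return codon-aligned nucleotide strings for frames +1..+3 and -1..-3.

    `seq` may be a plain string or any object whose str() is the sequence.
    """
    dna = str(seq).upper()
    frames = {}
    for sign, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            # Trim the tail so each frame holds only whole codons.
            usable = len(strand) - (len(strand) - offset) % 3
            frames["%s%d" % (sign, offset + 1)] = strand[offset:usable]
    return frames
```

Calling `str()` on the argument is one simple way to avoid requiring a Seq object; each frame could then be passed to whatever translation function is kept.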
From biopython at maubp.freeserve.co.uk Tue Mar 11 20:37:11 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 12 Mar 2008 00:37:11 +0000
Subject: [Biopython-dev] Biopython to begin transition to Subversion
In-Reply-To: <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com>
<658418.5192.qm@web62414.mail.re1.yahoo.com>
<128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
Message-ID: <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>
Hi Chris,
I haven't heard anything about the CVS to SVN move recently. Did
anyone resolve the multiple password prompt niggle?
On another point, is the test SVN repository intended to be writable
(for those of us with developer access)? I really should try running
some things like "svn diff" and committing sample changes to get a
feel for how it compares to CVS.
Peter
From sbassi at gmail.com Wed Mar 12 10:52:51 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Wed, 12 Mar 2008 11:52:51 -0300
Subject: [Biopython-dev] BLAST XML to HTML
Message-ID:
Is there a Biopython module to convert BLAST XML output to HTML?
Like this one from BioJava:
http://www.biojava.org/wiki/BioJava:CookBook:Blast:XML
If there is no such module, could this be included in Biopython if
I provide the code?
--
Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Tutorial libre de Python: http://tinyurl.com/2az5d5
From peter at maubp.freeserve.co.uk Wed Mar 12 14:12:35 2008
From: peter at maubp.freeserve.co.uk (Peter)
Date: Wed, 12 Mar 2008 18:12:35 +0000
Subject: [Biopython-dev] BLAST XML to HTML
In-Reply-To:
References:
Message-ID: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com>
On Wed, Mar 12, 2008 at 2:52 PM, Sebastian Bassi wrote:
> Is there a Biopython module to convert BLAST XML output to HTML?
> Like this one from BioJava:
> http://www.biojava.org/wiki/BioJava:CookBook:Blast:XML
> If there is no such module, could this be included in Biopython if
> I provide the code?
Is your idea to convert from the XML output of the NCBI BLAST tools
into HTML very closely resembling the NCBI's HTML output (perhaps for
another program to read as input)? Or do you just want to produce a
nice HTML page for a person to read (perhaps resembling the NCBI page
in appearance, but not using the same HTML layout)?
How would your code work: directly from the XML file, or from the
results of the existing Biopython BLAST parsers?
Peter
From sbassi at gmail.com Wed Mar 12 14:21:18 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Wed, 12 Mar 2008 15:21:18 -0300
Subject: [Biopython-dev] BLAST XML to HTML
In-Reply-To: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com>
References:
<320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com>
Message-ID:
On Wed, Mar 12, 2008 at 3:12 PM, Peter wrote:
> Is your idea to convert from the XML output of the NCBI BLAST tools
> into HTML very closely resembling the NCBI's HTML output (perhaps for
> another program to read as input)? Or do you just want to produce a
> nice HTML page for a person to read (perhaps resembling the NCBI page
> in appearance, but not using the same HTML layout)?
The idea comes from the fact that I always run BLAST with XML output, since I
parse it with Biopython, but people at the lab want to check the HTML version
(or I want to "publish" the result in a public DB accessible via HTML) and
that makes me re-run the BLAST just for them to see the output.
Sometimes the BLAST runs are resource demanding (like a 2 week run) and I
would like to avoid re-running the BLAST when all I really want is a
format change.
> How would your code work -direct from the XML file, or from the
> results of the existing Biopython BLAST parsers?
From the XML output.
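As a rough idea of what working directly from the XML could look like, here is a sketch using only the standard library. The sample document is hand-written (not real BLAST output) and uses a handful of element names from the NCBI BLAST XML format; `blast_xml_to_html` is a hypothetical name, and the output is a plain table, not the NCBI HTML layout.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Tiny hand-written sample, not real BLAST output.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>example|1</Hit_id>
          <Hit_def>made-up hit &lt;one&gt;</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>52.8</Hsp_bit-score>
              <Hsp_evalue>1e-10</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def blast_xml_to_html(xml_text):
    """Build a small HTML table with one row per hit (first HSP only)."""
    root = ET.fromstring(xml_text)
    rows = []
    for hit in root.iter("Hit"):
        hsp = hit.find("Hit_hsps/Hsp")
        cells = [
            hit.findtext("Hit_id", ""),
            hit.findtext("Hit_def", ""),
            hsp.findtext("Hsp_bit-score", "") if hsp is not None else "",
            hsp.findtext("Hsp_evalue", "") if hsp is not None else "",
        ]
        # Escape &, < and > so hit descriptions cannot break the markup.
        rows.append("<tr>%s</tr>" % "".join(
            "<td>%s</td>" % escape(cell) for cell in cells))
    header = ("<tr><th>ID</th><th>Description</th>"
              "<th>Bits</th><th>E-value</th></tr>")
    return "<table>\n%s\n%s\n</table>" % (header, "\n".join(rows))
```

For a two week BLAST run the XML file would be large, so a streaming parser might be a better fit than loading the whole tree as done here.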
From mjldehoon at yahoo.com Wed Mar 12 17:50:42 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Wed, 12 Mar 2008 14:50:42 -0700 (PDT)
Subject: [Biopython-dev] BLAST XML to HTML
In-Reply-To:
Message-ID: <845186.11502.qm@web62406.mail.re1.yahoo.com>
> > How would your code work -direct from the XML file, or from the
> > results of the existing Biopython BLAST parsers?
> From the XML output.
One option is to use Cascading Style Sheets (CSS) to display the XML file. That way, you don't have to create a new HTML file. Also, we should check with NCBI if they have a tool for such purposes.
--Michiel.
From sbassi at gmail.com Wed Mar 12 18:25:35 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Wed, 12 Mar 2008 19:25:35 -0300
Subject: [Biopython-dev] BLAST XML to HTML
In-Reply-To: <845186.11502.qm@web62406.mail.re1.yahoo.com>
References:
<845186.11502.qm@web62406.mail.re1.yahoo.com>
Message-ID:
On Wed, Mar 12, 2008 at 6:50 PM, Michiel de Hoon wrote:
> One option is to use Cascading Style Sheets (CSS) to display the XML file.
> That way, you don't have to create a new HTML file. Also, we should check
> with NCBI if they have a tool for such purposes.
They must have something because the new online NCBI BLAST has an
option called "reformat BLAST results". This option can reformat from
XML to HTML without re-running the BLAST, but it only works
server-side.
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 06:07:44 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 06:07:44 -0400
Subject: [Biopython-dev] [Bug 2468] New: Tutorial needs a fix: Bio.WWW.NCBI
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
Summary: Tutorial needs a fix: Bio.WWW.NCBI
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Documentation
AssignedTo: biopython-dev at biopython.org
ReportedBy: mmokrejs at ribosome.natur.cuni.cz
I am trying to follow the recipe at
http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc14 which contains the
following split into several chunks (I don't like this style personally, but
that's not the issue here):
#! /usr/bin/python
from Bio.WWW import NCBI
search_command = 'Search'
search_database = 'Taxonomy'
return_format = 'FASTA'
search_term = 'Cypripedioideae'
my_browser = 'lynx'
result_handle = NCBI.query(search_command, search_database, term = search_term,
                           doptcmdl = return_format)
import os
result_file_name = os.path.join(os.getcwd(), "results.html")
result_file = open(result_file_name, "w")
result_file.write(result_handle.read())
result_file.close()
if my_browser == "lynx":
    os.system("lynx -force_html " + result_file_name)
elif my_browser == "netscape":
    os.system("netscape file:" + result_file_name)
I end up with a lynx browser opened with the Entrez search page pre-filled with
'Cypripedioideae' as the query string. Unfortunately, I have to click on the
condensed results to get the taxonomy listing under the word 'Cypripedioideae'.
The line I am talking about is close to the end of the output:
[ ] 1: Cypripedioideae, subfamily, monocots Links
BTW, the other links from the page do not work because they point to
http://localhost/....
/usr/lib/python2.5/site-packages/Bio/WWW/NCBI.py:34: DeprecationWarning:
Bio.WWW.NCBI is deprecated. The functions in Bio.WWW.NCBI are now available
from Bio.Entrez.
DeprecationWarning)
The section needs updating. I am somewhat surprised I cannot access NCBI
Taxonomy easily. Probably I will have to browse the source code and forget the
Tutorial and Cookbook. ;)
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 06:55:55 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 06:55:55 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803131055.m2DAttv2027003@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-13 06:55 EST -------
The tutorial in CVS is already updated to use Bio.Entrez.query instead of
Bio.WWW.NCBI.query, reflecting the deprecation made in CVS.
I think you are using the Biopython 1.44 tutorial (from the weblink) with the
CVS Biopython code.
So at least part of your problem is already fixed.
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 07:27:14 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 07:27:14 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803131127.m2DBREdM028784@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-13 07:27 EST -------
Reading the Bio.Entrez documentation, the query function always returns HTML.
You could also use the esearch function which returns XML, followed by the
efetch function which seems to support a range of options depending on the
datatype. For example, using the taxonomy db:
#This gets an XML file from the following URL,
#http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Cypripedioideae
from Bio import Entrez
result_handle = Entrez.esearch("taxonomy", term="Cypripedioideae")
print result_handle.read()
You could then parse the XML file to extract the matching ID(s), perhaps with a
regular expression. In this case, there is only one match, 158330.
#http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy
#&id=158330&report=brief&retmode=text
from Bio import Entrez
result_handle = Entrez.efetch("taxonomy", id="158330",
                              report="docsum", retmode="text")
print result_handle.read()
#Given ID 158330, returns "Cypripedioideae, subfamily, monocots"
I agree that this section of the tutorial could be more useful. Do you think
the above code would be more helpful?
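For the step of pulling the matching ID(s) out of the esearch reply, an XML parser may be more robust than a regular expression. The sketch below works on a hand-written sample in the general shape of an esearch reply (a real reply carries more fields), so it runs without network access; `extract_ids` is a hypothetical helper, not part of Bio.Entrez.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the general shape of an esearch reply.
SAMPLE_ESEARCH = """<eSearchResult>
  <Count>1</Count>
  <RetMax>1</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>158330</Id>
  </IdList>
</eSearchResult>"""


def extract_ids(esearch_xml):
    """Return the text of every <Id> element inside <IdList>."""
    root = ET.fromstring(esearch_xml)
    return [node.text for node in root.findall("IdList/Id")]


print(extract_ids(SAMPLE_ESEARCH))
```

In practice the string passed in would be the result of `result_handle.read()` from the esearch call, and each returned ID could be handed to efetch.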
From biopython at maubp.freeserve.co.uk Thu Mar 13 08:14:09 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 13 Mar 2008 12:14:09 +0000
Subject: [Biopython-dev] Bio.Entrez and the depreciated pm* functions
Message-ID: <320fb6e00803130514s4aa31c4fo494b9f1594ef0b89@mail.gmail.com>
Hi Michiel (et al),
I have a query regarding the transition from Bio.WWW.NCBI to Bio.Entrez.
I notice you've marked several of the functions in Bio.Entrez with
deprecation warnings as the NCBI has retired the associated APIs,
i.e. pmfetch, pmqty and pmneighbor (while the whole of Bio.WWW.NCBI
is deprecated).
Anyone using Bio.WWW.NCBI.pmfetch, Bio.WWW.NCBI.pmqty and
Bio.WWW.NCBI.pmneighbor will get a deprecation warning from
Bio.WWW.NCBI, then if they try to switch to Bio.Entrez.pmfetch,
Bio.Entrez.pmqty and Bio.Entrez.pmneighbor they still get warnings.
Do you think we can just remove pmfetch, pmqty and pmneighbor from
Bio.Entrez so that it starts out "clean", and adjust the warning from
Bio.WWW.NCBI as follows:
import warnings
warnings.warn("Bio.WWW.NCBI is deprecated. The functions are now "
              "available from Bio.Entrez, except for the pm* functions "
              "which the NCBI have retired.", DeprecationWarning)
What do you think?
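To show what a caller actually sees, here is a small standalone sketch of a deprecated function forwarding to its replacement. `old_query` and `new_query` are hypothetical names, not Biopython functions; only the standard `warnings` module is used.

```python
import warnings


def new_query(term):
    """Hypothetical replacement function."""
    return "results for %s" % term


def old_query(term):
    """Hypothetical deprecated function that forwards to its replacement."""
    warnings.warn(
        "old_query is deprecated; use new_query instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this line
    )
    return new_query(term)


# Record warnings instead of printing them, to confirm what callers see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_query("Cypripedioideae")

print(result)
print([str(w.message) for w in caught])
```

A caller who switches to the replacement gets no warning at all, which is the "clean" start suggested above.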
Peter
From mjldehoon at yahoo.com Thu Mar 13 08:48:29 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 13 Mar 2008 05:48:29 -0700 (PDT)
Subject: [Biopython-dev] Bio.Entrez and the depreciated pm* functions
In-Reply-To: <320fb6e00803130514s4aa31c4fo494b9f1594ef0b89@mail.gmail.com>
Message-ID: <84386.61626.qm@web62415.mail.re1.yahoo.com>
That is fine with me. Maybe I was being too conservative.
I'll make those changes.
--Michiel.
Peter wrote: Hi Michiel (et al),
I have a query regarding the transition from Bio.WWW.NCBI to Bio.Entrez.
I notice you've marked several of the functions in Bio.Entrez with
deprecation warnings as the NCBI has retired the associated APIs,
i.e. pmfetch, pmqty and pmneighbor (while the whole of Bio.WWW.NCBI
is deprecated).
Anyone using Bio.WWW.NCBI.pmfetch, Bio.WWW.NCBI.pmqty and
Bio.WWW.NCBI.pmneighbor will get a deprecation warning from
Bio.WWW.NCBI, then if they try to switch to Bio.Entrez.pmfetch,
Bio.Entrez.pmqty and Bio.Entrez.pmneighbor they still get warnings.
Do you think we can just remove pmfetch, pmqty and pmneighbor from
Bio.Entrez so that it starts out "clean", and adjust the warning from
Bio.WWW.NCBI as follows:
import warnings
warnings.warn("Bio.WWW.NCBI is deprecated. The functions are now "
              "available from Bio.Entrez, except for the pm* functions "
              "which the NCBI have retired.", DeprecationWarning)
What do you think?
Peter
From bugzilla-daemon at portal.open-bio.org Fri Mar 14 11:53:34 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Mar 2008 11:53:34 -0400
Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes
assert to fail when it should pass
In-Reply-To:
Message-ID: <200803141553.m2EFrY6p001573@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2437
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-14 11:53 EST -------
Fixed in CVS, Bio/Translate.py revision 1.3, as described in comment 3.
This fixes the original report, making sequence translation simpler to use -
see also Bug 2381 - translate and transcribe methods for the Seq object (in
Bio.Seq)
This change does NOT address the larger issue of how to decide if two alphabets
are equal or not.
From bugzilla-daemon at portal.open-bio.org Fri Mar 14 12:19:40 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Mar 2008 12:19:40 -0400
Subject: [Biopython-dev] [Bug 2447] EUtils cannot parse PubMed XML for ACS
journals
In-Reply-To:
Message-ID: <200803141619.m2EGJeIC003283@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2447
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-14 12:19 EST -------
Created an attachment (id=878)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=878&action=view)
Patch to Bio/EUtils/parse.py
I'm not sure if this is the best way to fix this, but it does appear to solve
the reported problem. Can you give this a try, Noel?
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 16:36:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 16:36:06 -0400
Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain
text in CVS?
In-Reply-To:
Message-ID: <200803152036.m2FKa6xR029284@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2363
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #7 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-15 16:36 EST -------
I've just done a clean checkout and build on Windows, and run the test suite,
and built the tutorial as PDF. I didn't run into any text/binary issues, so
this seems to be fixed now :)
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 16:49:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 16:49:06 -0400
Subject: [Biopython-dev] [Bug 2469] New: requires_wise.py fails on Windows
(test suite)
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2469
Summary: requires_wise.py fails on Windows (test suite)
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
On my Windows XP machine, I don't have wise installed, so the dnal command
doesn't work:
C:\TEMP\>dnal
'dnal' is not recognized as an internal or external command,
operable program or batch file.
When running the unit test suite, test_Wise.py SHOULD fail with a missing
external dependency error - instead it tries to run and fails with an
assertion error. The problem is that requires_wise.py fails... which seems to
be an issue with the commands.getoutput() function not working on Windows;
it's Unix only according to:
http://www.python.org/doc/current/lib/module-commands.html
Annoyingly, the commands module is present on Windows (or at least Python 2.3)
but simply doesn't work due to calling this:
os.popen('{ ' + cmd + '; } 2>&1', 'r')
As a result,
>>> commands.getoutput("xyz")
"'{' is not recognized as an internal or external command,\noperable program or batch file."
Assuming wise/dnal actually works on Windows, we need to use something other
than commands.getoutput("dnal") to check for it.
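One portable way to check for an external program, sketched with standard library calls only. It assumes Python 3.3 or later for `shutil.which`, and it is not what requires_wise.py currently does; `command_available` and `get_output` are hypothetical helper names.

```python
import shutil
import subprocess
import sys


def command_available(name):
    """Return True if `name` resolves to an executable on the PATH."""
    return shutil.which(name) is not None


def get_output(args):
    """Run a command given as an argument list; return stdout+stderr text."""
    process = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
        universal_newlines=True,
    )
    output, _ = process.communicate()
    return output


if command_available("dnal"):
    print(get_output(["dnal"]))
else:
    print("dnal not found; the Wise tests should be skipped")
```

Passing the command as a list avoids the shell entirely, so the Unix-only `{ cmd; } 2>&1` wrapper is not needed.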
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 19:40:54 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 19:40:54 -0400
Subject: [Biopython-dev] [Bug 2469] requires_wise.py fails on Windows (test
suite)
In-Reply-To:
Message-ID: <200803152340.m2FNesQl005388@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2469
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-15 19:40 EST -------
You can download wise2 from ftp://ftp.ebi.ac.uk/pub/software/unix/wise2/ and
compile it under Windows XP using cygwin, but its own tests fail - I'm not sure
why. Carrying on regardless, then test_Wise.py still doesn't work for me :(
P.S. Cornell University have packaged wise2 for Windows (found via Google, I
haven't tried this): http://www.tc.cornell.edu/WBA/
From bugzilla-daemon at portal.open-bio.org Sun Mar 16 17:03:59 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 16 Mar 2008 17:03:59 -0400
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200803162103.m2GL3x4u021735@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-16 17:03 EST -------
In the old code, when the species wasn't already recorded in the
taxon/taxon_name tables, we would add it and its parent lineage entries.
See also http://lists.open-bio.org/pipermail/biosql-l/2008-March/001196.html
There are a few problems in the old code, exposed in the unit tests, but I
think I have this working again now.
From bugzilla-daemon at portal.open-bio.org Sun Mar 16 17:25:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 16 Mar 2008 17:25:02 -0400
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200803162125.m2GLP2Oq023867@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-16 17:25 EST -------
Created an attachment (id=879)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=879&action=view)
patch to BioSQL/Loader.py
Possible patch - the two BioSQL unit tests pass with this. I have not had a
chance to try this in combination with a taxonomy table pre-populated by
load_ncbi_taxonomy.pl
From mjldehoon at yahoo.com Sun Mar 16 22:43:55 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sun, 16 Mar 2008 19:43:55 -0700 (PDT)
Subject: [Biopython-dev] [BioSQL-l] Loading sequences with novel NCBI
taxon id
In-Reply-To: <002201c88705$70780840$6400a8c0@Gecko>
Message-ID: <121379.45887.qm@web62403.mail.re1.yahoo.com>
> Thank you for your mail recommending the usage of NCBI.WWW.
> I have modified my class/script accordingly to your suggestion
> without problem. Once 1.45 is out, I will change for NCBI.Entrez
> as you informed me.
Just to avoid any confusion: In Biopython 1.45, the module will be "Bio.Entrez", not "Bio.NCBI.Entrez".
> In any case, I do not pretend having a fantastic piece of code, but it gets
> the job done. If you find this interesting, I would be pleased to contribute
> to BioPython.
Bio.Entrez will need some parsers to parse the XML results, although that probably won't happen before the 1.45 release. I think your script could be very useful when writing those parsers. Could you open a bug report on Bugzilla and upload your script there? Beware, to upload a script to Bugzilla, you need to create a bug report first, and then as a separate step upload the script.
Thanks!
--Michiel..
Eric Gibert wrote: Dear Peter,
Regarding the update of the BioSQL tables taxon and taxon_name, I have
created a class "TaxonUpdate" (how original!) which does two things:
1) as a class itself, it will fetch from NCBI the taxon's information as XML
based on the taxon_id passed to the constructor, parse the returned XML
answer to get the genus, class, order, family (10 levels) and update that in
taxon table. If taxon_name needs update/insert, it does it too.
2) run as an independent script __main__, it will look for all species in
taxon table for which the genus (parent) does not have a ncbi_taxon_id (i.e.
is NULL as this is the current result after adding a new sequence in
BioSQL). For all those incomplete found records, it will perform the update
as (1)
After the addition of a new sequence in a BioSQL database, a simple call of
this code (passing the taxon_id) will do the updating job.
Dear Michiel,
Thank you for your mail recommending the usage of NCBI.WWW. I have modified
my class/script accordingly to your suggestion without problem. Once 1.45 is
out, I will change for NCBI.Entrez as you informed me.
In any case, I do not pretend having a fantastic piece of code, but it gets
the job done. If you find this interesting, I would be pleased to contribute
to BioPython.
Eric
-----Original Message-----
From: biosql-l-bounces at lists.open-bio.org
[mailto:biosql-l-bounces at lists.open-bio.org] On Behalf Of Peter
Sent: Thursday, March 13, 2008 11:06 PM
To: BioSQL
Subject: [BioSQL-l] Loading sequences with novel NCBI taxon id
Dear list,
One of the unresolved issues with Biopython's BioSQL interface is
dealing with the NCBI taxon ID when loading sequences into the
database.
As I understand it, ideally before loading any sequences, the user
will have loaded in the entire NCBI taxonomy using the
load_ncbi_taxonomy.pl script, as I described here:
http://biopython.org/wiki/BioSQL#NCBI_Taxonomy
When a new sequence is added to the database with a known taxon id,
there is no problem. But what happens if it's a recently sequenced organism
which isn't defined yet in the BioSQL taxonomy tables? Could/should
the user re-run load_ncbi_taxonomy.pl, and then load in their new
sequence?
Right now in Biopython, due to what appears to have been intended as a
short term hack, we simply don't record the taxon id at all (!), and I
would like to fix this (bug 2422).
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
How do BioPerl et al deal with this issue? Do they try and update the
taxonomy tables using the available information in the new record's
annotation (i.e. the new taxon id and the species name)? Do they
lookup the NCBI taxonomy definition via the internet? Do they throw
an error and halt?
Thanks,
Peter
(Biopython)
_______________________________________________
BioSQL-l mailing list
BioSQL-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biosql-l
From bugzilla-daemon at portal.open-bio.org Mon Mar 17 07:47:46 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 17 Mar 2008 07:47:46 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803171147.m2HBlksw008865@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
------- Comment #3 from mmokrejs at ribosome.natur.cuni.cz 2008-03-17 07:47 EST -------
Hi Peter,
yes this would be more helpful. Unfortunately I did the one-time job with
parsing the HTML output and re-running wget to fetch the final HTML page,
stripped HTML formatting and was done. I will upload my two crappy scripts.
They work but should be re-written to utilize the XML outputs you have
mentioned.
The second URL from your last comment should have different values for some
parameters to yield another XML page:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=41073&report=sgml&mode=xml
That returns me:
41073
cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa;
Eumetazoa; Bilateria; Coelomata; Protostomia; Panarthropoda; Arthropoda;
Mandibulata; Pancrustacea; Hexapoda; Insecta; Dicondylia; Pterygota; Neoptera;
Endopterygota; Coleoptera; Adephaga
Maybe I will find the time to rewrite them for the purpose of tutorial to use
the XMLs.
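
A minimal sketch of pulling the lineage out of such an XML reply, rather than scraping HTML. The sample document below is hand-written and heavily abbreviated; the element names (TaxaSet, Taxon, TaxId, Lineage) are assumed to follow the usual shape of an efetch taxonomy reply, and parse_lineage is a hypothetical helper, not part of Biopython.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated sample; a real reply carries many more fields.
SAMPLE_XML = """<TaxaSet>
  <Taxon>
    <TaxId>41073</TaxId>
    <Lineage>cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa</Lineage>
  </Taxon>
</TaxaSet>"""


def parse_lineage(xml_text):
    """Return (taxon_id, list of lineage names) for the first Taxon element."""
    root = ET.fromstring(xml_text)
    taxon = root.find("Taxon")
    if taxon is None:
        raise ValueError("no Taxon element found")
    taxon_id = int(taxon.findtext("TaxId"))
    lineage_text = taxon.findtext("Lineage", default="")
    # The lineage is a single "; "-separated string, so split it into names.
    lineage = [name.strip() for name in lineage_text.split(";") if name.strip()]
    return taxon_id, lineage


taxon_id, lineage = parse_lineage(SAMPLE_XML)
print(taxon_id, lineage)
```

Splitting the Lineage string keeps the names in root-to-leaf order, which is the order needed when walking down a taxonomy table.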
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Mon Mar 17 09:18:35 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 17 Mar 2008 09:18:35 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803171318.m2HDIZYX014608@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
------- Comment #4 from mmokrejs at ribosome.natur.cuni.cz 2008-03-17 09:18 EST -------
Created an attachment (id=880)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=880&action=view)
taxfetch.py
This program/module can fetch for the user the Lineage line. The query()
function uses the deprecated biopython API while the efetch uses the other.
Queries get cached in a local file taxonomycache.db for speed.
Users can call either of the two functions from external python code. Feel free
to use the code in Tutorial or even bundle in any form into the package.
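
The caching idea described above can be sketched roughly as follows. This is not the attached taxfetch.py; it assumes a shelve-style on-disk cache, and cached_lookup and its fetch argument are hypothetical names chosen for illustration.

```python
import shelve


def cached_lookup(cache_path, taxon_id, fetch):
    """Return the lineage for taxon_id, calling fetch() only on a cache miss.

    cache_path is the base name of the on-disk cache file and fetch is any
    callable that takes a taxon id and returns the lineage string.
    """
    key = str(taxon_id)  # shelve keys must be strings
    with shelve.open(cache_path) as cache:
        if key not in cache:
            cache[key] = fetch(taxon_id)
        return cache[key]
```

Passing the fetch function in as an argument means the same cache can sit in front of either the old query code or an efetch-based lookup.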
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From tiagoantao at gmail.com Mon Mar 17 18:30:38 2008
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 17 Mar 2008 22:30:38 +0000
Subject: [Biopython-dev] Bio.PopGen status
Message-ID: <6d941f120803171530g36504759g4b3cf835065e17b8@mail.gmail.com>
Hi,
This is a short email regarding Bio.PopGen status.
1. All the code on the repository should be stable.
2. The Biopython version that is scheduled for release soon will have
support for coalescent population genetics' simulations
3. A short number of test code cases are included.
4. Documentation was produced and is available on the Tutorial. I
believe that it is satisfactory (tell me if you disagree).
5. Bio.PopGen is still not "version 1" in the sense that the
fundamental statistics code is missing. This was a conscious strategy
to start with selection detection and coalescent simulation in order
to begin with arguably less important stuff so that newbie errors (in
the sense that I was a newbie developer to biopython) would have less
impact.
6. Statistics is my next task and hopefully will coincide with the
biopython release after this one. This will be, at least, for me,
"version 1" of Bio.PopGen
7. In the code, there is, since the original Bio.PopGen, code that is
able to execute external simulators in parallel (thus taking advantage
of multi core architectures for computationally intensive
simulations). This is, unfortunately, not documented. I will document
this (maybe in a separate document from the tutorial) in the future. I
don't think this is priority 1. But others might be interested in
using this code for computationally intensive tasks using external
programs. In case you want to know more details about this, please say
so.
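
As a rough illustration of point 7, running several external programs at once can be sketched with the standard library alone. This is not the Bio.PopGen code and does not use its API; run_external and run_in_parallel are hypothetical names, and the Python interpreter itself stands in for an external simulator.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_external(args):
    """Run one external command and return its standard output as text."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


def run_in_parallel(commands, max_workers=2):
    """Run the commands concurrently; results come back in input order."""
    # Threads are enough here: the heavy work happens in the child processes,
    # so each one can be scheduled on a separate core by the operating system.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_external, commands))


# Stand-in "simulations": each child process just prints n squared.
commands = [[sys.executable, "-c", "print(%d * %d)" % (n, n)] for n in range(4)]
print(run_in_parallel(commands))
```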
From a biopython release perspective, Bio.PopGen with new coalescent
simulation features is fully ready. Please go ahead and release
whenever is more convenient.
--
http://www.tiago.org/ps
From bugzilla-daemon at portal.open-bio.org Thu Mar 20 06:23:35 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 20 Mar 2008 06:23:35 -0400
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200803201023.m2KANZun010097@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-20 06:23 EST -------
Patch checked in as BioSQL/Loader.py revision 1.28
Unit tests passed on both Windows XP and Linux (using MySQL)
Note that once we have added "provisional" entries to the taxon/taxon_name
table based on the record annotation, load_ncbi_taxonomy.pl should be able to
tidy things up using the NCBI taxonomy. At least it should once BioSQL bug
2470 is fixed.
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From biopython at maubp.freeserve.co.uk Thu Mar 20 16:14:50 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 20 Mar 2008 20:14:50 +0000
Subject: [Biopython-dev] Old Biopython code for EBI Bibliographics
services
In-Reply-To: <320fb6e00803170055n18457967n27d1b07eaa6cb522@mail.gmail.com>
References: <320fb6e00803140944v13f241b9icc0e911643f234cd@mail.gmail.com>
<47DE0C22.9040202@netsys.co.za>
<320fb6e00803170049g79960e14u8c1417fcdc99a0d5@mail.gmail.com>
<320fb6e00803170055n18457967n27d1b07eaa6cb522@mail.gmail.com>
Message-ID: <320fb6e00803201314j53b47a35x33de02cb685d2c14@mail.gmail.com>
I posted the following email on the main discussion mailing list, and
haven't seen any replies.
Should we mark Bio.biblio as deprecated now (before the imminent release)?
Peter
On Mon, Mar 17, 2008 at 7:55 AM, Peter wrote:
> Dear list,
>
> We have an old module Bio/biblio.py written by Tiaan Wessels back in
> 2002 (during a South African hackathon). This is code to use some EBI
> Bibliographics services, but currently no longer works. At the very
> least, the EBI have changed the URLs for their SOAP services. I got
> in touch with the author by email, and he no longer uses the code and
> thought we could remove it.
>
> Does anyone on the list still use Bio/biblio.py?
>
> Would anyone like to take a more in depth look at the code, and the
> current EBI web API, and see if there is anything in Bio.biblio worth
> keeping?
>
> If not, I'm proposing we mark this as deprecated for the next release
> of Biopython.
>
> Thanks,
>
> Peter
>
From mjldehoon at yahoo.com Thu Mar 20 22:08:56 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 20 Mar 2008 19:08:56 -0700 (PDT)
Subject: [Biopython-dev] Old Biopython code for EBI Bibliographics
services
In-Reply-To: <320fb6e00803201314j53b47a35x33de02cb685d2c14@mail.gmail.com>
Message-ID: <242823.17441.qm@web62402.mail.re1.yahoo.com>
> Should we mark Bio.biblio as deprecated now (before the imminent release)?
Yes. It's just a deprecation; the code will still be usable. The deprecation warning should contain a notice to contact us in case somebody is still using this code. If not, it's better to deprecate it and remove it in some future release. Keeping Biopython clean is important.
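
A minimal sketch of such a warning using Python's standard warnings module. The wording of the message and the helper name warn_deprecated are illustrative assumptions, not the text that was actually checked in to Bio.biblio.

```python
import warnings


def warn_deprecated(module_name):
    """Emit a deprecation warning that asks remaining users to get in touch."""
    warnings.warn(
        "%s is deprecated and may be removed in a future release. "
        "If you still use it, please contact the Biopython developers "
        "on the mailing list." % module_name,
        DeprecationWarning,
        stacklevel=2,  # report the caller's line, not this helper
    )


# A deprecated module would typically call this once at import time.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_deprecated("Bio.biblio")
print(caught[0].message)
```

The code keeps working after the warning is issued, which matches the point above that a deprecation leaves the module usable.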
--Michiel.
Peter wrote: I posted the following email on the main discussion mailing list, and
haven't seen any replies.
Should we mark Bio.biblio as deprecated now (before the imminent release)?
Peter
On Mon, Mar 17, 2008 at 7:55 AM, Peter wrote:
> Dear list,
>
> We have an old module Bio/biblio.py written by Tiaan Wessels back in
> 2002 (during a South African hackathon). This is code to use some EBI
> Bibliographics services, but currently no longer works. At the very
> least, the EBI have changed the URLs for their SOAP services. I got
> in touch with the author by email, and he no longer uses the code and
> thought we could remove it.
>
> Does anyone on the list still use Bio/biblio.py?
>
> Would anyone like to take a more in depth look at the code, and the
> current EBI web API, and see if there is anything in Bio.biblio worth
> keeping?
>
> If not, I'm proposing we mark this as deprecated for the next release
> of Biopython.
>
> Thanks,
>
> Peter
>
_______________________________________________
Biopython-dev mailing list
Biopython-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython-dev
From mjldehoon at yahoo.com Fri Mar 21 07:57:02 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Fri, 21 Mar 2008 04:57:02 -0700 (PDT)
Subject: [Biopython-dev] CVS freeze for release
Message-ID: <621627.53345.qm@web62408.mail.re1.yahoo.com>
Hi everybody,
I'll start making release 1.45 from now. Please don't touch CVS until after the release is out. Thanks!
--Michiel.
From biopython at maubp.freeserve.co.uk Fri Mar 21 08:51:01 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 21 Mar 2008 12:51:01 +0000
Subject: [Biopython-dev] CVS freeze for release
In-Reply-To: <621627.53345.qm@web62408.mail.re1.yahoo.com>
References: <621627.53345.qm@web62408.mail.re1.yahoo.com>
Message-ID: <320fb6e00803210551o4644fc34meaadf9e521b087fe@mail.gmail.com>
Michiel de Hoon wrote:
> Hi everybody,
>
> I'll start making release 1.45 from now. Please don't touch CVS until after
> the release is out. Thanks!
Good news :)
I did check in some comment changes to BioSQL this morning, and the
Bio.biblio deprecation, but that was a few hours ago.
Peter
From meanerelk at gmail.com Fri Mar 21 14:28:24 2008
From: meanerelk at gmail.com (Kemal)
Date: Fri, 21 Mar 2008 14:28:24 -0400
Subject: [Biopython-dev] mentor for google summer of code
Message-ID:
I am a university student interested in adding phyloXML support to BioPython
for the Google Summer of Code. Would any developers be willing to mentor
this project? I have been discussing it with Hilmar Lapp, who is mentoring
similar projects for the Phyloinformatics Summer of Code project at the
National Evolutionary Synthesis Center. Their page is at:
https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Code_2008
A mentor would be responsible for monitoring the project's progress
over the summer, and to evaluate the work at the end. Google's guidelines
estimate that this would take about 5 hours/week per student. There is more
information at:
http://code.google.com/opensource/gsoc/2008/faqs.html
If anyone is interested, I would love to discuss the details of the
proposal.
Thank you,
Kemal Eren
From biopython at maubp.freeserve.co.uk Fri Mar 21 15:04:06 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 21 Mar 2008 19:04:06 +0000
Subject: [Biopython-dev] mentor for google summer of code
In-Reply-To:
References:
Message-ID: <320fb6e00803211204q2d0f3696pf5baaf44122a0869@mail.gmail.com>
Hi Kemal,
On Fri, Mar 21, 2008 at 6:28 PM, Kemal wrote:
> I am a university student interested in adding phyloXML support to BioPython
> for the Google Summer of Code. Would any developers be willing to mentor
> this project? I have been discussing it with Hilmar Lapp, who is mentoring
> similar projects for the Phyloinformatics Summer of Code project at the
> National Evolutionary Synthesis Center. Their page is at:
>
> https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Code_2008
I see there are similar projects already planned for phyloXML with BioPerl
and BioRuby. For Biopython I guess building on Frank Kauff and Cymon J.
Cox's Bio.Nexus module would be the most logical option. Have you had
a chance to look at any of the Biopython code?
> A mentor would be responsible for monitoring the project's progress
> over the summer, and to evaluate the work at the end. Google's guidelines
> estimate that this would take about 5 hours/week per student. There is more
> information at:
>
> http://code.google.com/opensource/gsoc/2008/faqs.html
>
> If anyone is interested, I would love to discuss the details of the
> proposal.
It would be worth trying to contact Frank and Cymon directly - and
seeing if they would be interested.
Peter
(one of the current Biopython developers)
From chris.lasher at gmail.com Fri Mar 21 17:11:40 2008
From: chris.lasher at gmail.com (Chris Lasher)
Date: Fri, 21 Mar 2008 17:11:40 -0400
Subject: [Biopython-dev] Biopython to begin transition to Subversion
In-Reply-To: <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>
References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com>
<658418.5192.qm@web62414.mail.re1.yahoo.com>
<128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
<320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>
Message-ID: <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com>
On Tue, Mar 11, 2008 at 8:37 PM, Peter wrote:
> Hi Chris,
>
> I haven't heard anything about the CVS to SVN move recently. Did
> anyone resolve the multiple password prompt niggle?
That's still unresolved. The workaround is to place an SSH key on
dev.open-bio.org. If you do, you'll notice even then it still makes
two attempts to log in. /shrugs
> On another point, is the test SVN repository intended to be writable
> (for those of us with developer access)? I really should try running
> some things like "svn diff" and committing sample changes to get a
> feel for how it compares to CVS.
I have tried committing to it and gotten a "Permission denied" error.
It must only be set as read-only for group permissions.
Really sorry for my delay on getting to this email. Now that Biopython
has had another release, should we really push hard to switch to SVN?
Chris
From biopython-dev at maubp.freeserve.co.uk Fri Mar 21 17:37:45 2008
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Fri, 21 Mar 2008 21:37:45 +0000
Subject: [Biopython-dev] Biopython to begin transition to Subversion
In-Reply-To: <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com>
References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com>
<658418.5192.qm@web62414.mail.re1.yahoo.com>
<128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
<320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>
<128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com>
Message-ID: <320fb6e00803211437i79c1454m52ba4172728032c8@mail.gmail.com>
On Fri, Mar 21, 2008 at 9:11 PM, Chris Lasher wrote:
> On Tue, Mar 11, 2008 at 8:37 PM, Peter wrote:
> > Hi Chris,
> >
> > I haven't heard anything about the CVS to SVN move recently. Did
> > anyone resolve the multiple password prompt niggle?
>
> That's still unresolved. The workaround is to place an SSH key on
> dev.open-bio.org. If you do, you'll notice even then it still makes
> two attempts to log in. /shrugs
If someone's documented this from the previous Bio* migrations, then I
guess we'll live with it.
> > On another point, is the test SVN repository intended to be writable
> > (for those of us with developer access)? I really should try running
> > some things like "svn diff" and committing sample changes to get a
> > feel for how it compares to CVS.
>
> I have tried committing to it and gotten a "Permission denied" error.
> It must only be set as read-only for group permissions.
I never did get round to trying myself...
> Really sorry for my delay on getting to this email. Now that Biopython
> has had another release, should we really push hard to switch to SVN?
Well, Michiel declared a CVS freeze this morning and is preparing
Biopython 1.45 as we speak. Once the release is out does sound like a
good time for the SVN move to me.
Peter
From peter.bulychev at gmail.com Fri Mar 21 19:50:23 2008
From: peter.bulychev at gmail.com (Peter Bulychev)
Date: Sat, 22 Mar 2008 02:50:23 +0300
Subject: [Biopython-dev] results of applying Clone Digger to the sources of
BioPython project
Message-ID:
Hello.
Clone Digger project is aimed to find software clones (duplicate code) in
Python and Java programs.
I have applied it to the source of BioPython and discovered several clone
candidates.
There are a lot of false positives caused by similar code in
nlmmedline_*_format.py files, but maybe other clone candidates will be
interesting for you.
The results can be seen here:
http://clonedigger.sourceforge.net/examples.html
--
Best regards,
Peter Bulychev.
From sbassi at gmail.com Fri Mar 21 23:49:52 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Sat, 22 Mar 2008 00:49:52 -0300
Subject: [Biopython-dev] CVS freeze for release
In-Reply-To: <621627.53345.qm@web62408.mail.re1.yahoo.com>
References: <621627.53345.qm@web62408.mail.re1.yahoo.com>
Message-ID:
On Fri, Mar 21, 2008 at 8:57 AM, Michiel de Hoon wrote:
> Hi everybody,
> I'll start making release 1.45 from now. Please don't touch CVS until after the release is out. Thanks!
I have a proposal, so it could be implemented in the next version (1.46?).
Change the output of EZRetrieve.retrieve_single. It currently returns
a FASTA formatted sequence. I think it should return a SeqRecord object
(if you want this SeqRecord object to be printed or stored as FASTA,
just use formatIO).
Here are the proposed changes: http://www.pastecode.com.ar/f3baff314
I can file this as an enhancement in the bug tracker if you agree.
Best,
SB.
From mjldehoon at yahoo.com Sat Mar 22 07:02:38 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sat, 22 Mar 2008 04:02:38 -0700 (PDT)
Subject: [Biopython-dev] Biopython release 1.45
Message-ID: <901773.64728.qm@web62408.mail.re1.yahoo.com>
We are pleased to announce the release of Biopython 1.45.
This release includes numerous code improvements and fixes, including in Bio.Seq, Bio.SeqIO, Bio.Entrez, Bio.PopGen, Bio.SwissProt, Bio.Cluster, Bio.SCOP, Bio.InterPro, Bio.GenBank, Bio.ExPASy, BioSQL, and the Biopython documentation. Too many to list them all here!
Source distributions and Windows installers are available from the Biopython website at http://biopython.org. My thanks to all code contributors who made this new release possible.
--Michiel on behalf of the Biopython developers.
From biopython-dev at maubp.freeserve.co.uk Sat Mar 22 07:18:42 2008
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Sat, 22 Mar 2008 11:18:42 +0000
Subject: [Biopython-dev] EZRetrieve
Message-ID: <320fb6e00803220418n348d1953v9846af9d04abc04c@mail.gmail.com>
> I have a proposal, so it could be implemented in the next version (1.46?).
> Change the output of EZRetrieve.retrieve_single. It currently returns
> a FASTA formatted sequence. I think it should return a SeqRecord object
> (if you want this SeqRecord object to be printed or stored as FASTA,
> just use formatIO).
> Here are the proposed changes: http://www.pastecode.com.ar/f3baff314
> I can file this as an enhancement in the bug tracker if you agree.
So there is currently one function, retrieve_single, which can return
a handle but by default extracts and returns a FASTA record as a
string. It does this by calling the parse_single function which
reads in the handle, parses the HTML file, and extracts just the FASTA
style text, throwing away the other annotation data (like the
chromosome or range requested).
Here is an example URL constructed by hand,
http://siriusb.umdnj.edu:18080/EZRetrieve/single_r_run.jsp?org=0&AccType=0&input=BC014651&from=-200&to=200
Parsing HTML is nasty - especially if the site updates the formatting
every so often. I suppose just looking for the FASTA sequence is
fairly reliable. I can see the case for an EzRetrieve HTML to
SeqRecord parser, but I would be tempted to try and parse more of the
annotation.
How many people do you think are using the retrieve_single function?
It would be very annoying for them if its behaviour suddenly changed.
Maybe we can add a new parse function, and call it from
retrieve_single if the optional argument parse=2?
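
One way to picture the backwards-compatible option is sketched below. This is not the Bio.EZRetrieve code: process_reply, extract_fasta, fasta_to_record and SimpleRecord are hypothetical names, the sample HTML is invented, and a plain named tuple stands in for a SeqRecord so the sketch needs only the standard library.

```python
import html
import re
from collections import namedtuple

# Stand-in for a SeqRecord; the real object would carry more annotation.
SimpleRecord = namedtuple("SimpleRecord", ["id", "description", "seq"])

# Invented sample page; the real server's HTML layout is not assumed here.
SAMPLE_HTML = (
    "<html><body><pre>&gt;BC014651 from=-200 to=200\n"
    "ACGTAC\nGGTTAA\n</pre></body></html>"
)


def extract_fasta(html_text):
    """Pull the FASTA text out of the first <pre> block of the page."""
    match = re.search(r"<pre>(.*?)</pre>", html_text, re.DOTALL)
    if match is None:
        raise ValueError("no FASTA block found")
    return html.unescape(match.group(1)).strip() + "\n"


def fasta_to_record(fasta_text):
    """Turn a single FASTA entry into a SimpleRecord."""
    lines = fasta_text.strip().splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    parts = lines[0][1:].split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return SimpleRecord(identifier, description, "".join(s.strip() for s in lines[1:]))


def process_reply(html_text, parse=1):
    """parse=0: raw page, parse=1: FASTA string (old default), parse=2: record."""
    if parse == 0:
        return html_text
    fasta = extract_fasta(html_text)
    if parse == 1:
        return fasta
    if parse == 2:
        return fasta_to_record(fasta)
    raise ValueError("unknown parse option: %r" % (parse,))


print(process_reply(SAMPLE_HTML, parse=2))
```

Because the default stays at parse=1, existing callers keep getting a FASTA string and only code that asks for parse=2 sees the new object.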
Peter
From biopython at maubp.freeserve.co.uk Sat Mar 22 07:35:39 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Sat, 22 Mar 2008 11:35:39 +0000
Subject: [Biopython-dev] results of applying Clone Digger to the sources
of BioPython project
In-Reply-To:
References:
Message-ID: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com>
> Hello.
>
> Clone Digger project is aimed to find software clones (duplicate code) in
> Python and Java programs.
>
> I have applied it to the source of BioPython and discovered several clone
> candidates.
>
> There are a lot of false positives caused by similar code in
> nlmmedline_*_format.py files, but maybe other clone candidates will be
> interesting for you.
>
> The results can be seen here:
> http://clonedigger.sourceforge.net/examples.html
Interesting. Does your tool know to ignore deprecated modules? e.g.
when we have essentially copied a file from one location to another, and
deprecated the original.
Some of these are from scanner/consumer parsers where there are two
alternative consumers turning the data into different object
representations.
Other things like providing dictionary like objects seem to be reusing
a lot of "boiler plate" code, and could probably be rationalised into
a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and
Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py
Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both
sharing apparently identical catalyse() methods may fall into the same
class.
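
A rough sketch of the base class idea for the dictionary-like objects. The class names are hypothetical and none of this is taken from the Biopython modules listed above; it only shows the shared access code living in one place while subclasses supply the lookup.

```python
class RecordDictionary:
    """Shared dictionary-style access; subclasses only define _fetch()."""

    def __init__(self):
        self._cache = {}

    def _fetch(self, key):
        """Return the record for key, or raise KeyError if it is unknown."""
        raise NotImplementedError

    def __getitem__(self, key):
        if key not in self._cache:
            self._cache[key] = self._fetch(key)
        return self._cache[key]

    def __contains__(self, key):
        try:
            self[key]
        except KeyError:
            return False
        return True


class InMemoryDictionary(RecordDictionary):
    """Toy subclass backed by a plain dict instead of a remote database."""

    def __init__(self, records):
        super().__init__()
        self._records = dict(records)

    def _fetch(self, key):
        return self._records[key]


class UpperCaseDictionary(InMemoryDictionary):
    """Second subclass: same access code, different record processing."""

    def _fetch(self, key):
        return super()._fetch(key).upper()


demo = UpperCaseDictionary({"seq1": "acgt"})
print(demo["seq1"], "seq2" in demo)
```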
Peter
From peter.bulychev at gmail.com Sat Mar 22 17:31:30 2008
From: peter.bulychev at gmail.com (Peter Bulychev)
Date: Sun, 23 Mar 2008 00:31:30 +0300
Subject: [Biopython-dev] results of applying Clone Digger to the sources
of BioPython project
In-Reply-To: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com>
References:
<320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com>
Message-ID:
Hello.
No, unfortunately Clone Digger cannot ignore deprecated modules.
To obtain better results, automatically generated code and tests
should be removed from the searched source tree by hand.
> Other things like providing dictionary like objects seem to be reusing
> a lot of "boiler plate" code, and could probably be rationalised into
> a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and
> Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py
>
> Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both
> sharing apparently identical catalyse() methods may fall into the same
> class.
>
I think this is the main purpose of Clone Digger: to find clone candidates
and to help to create recommendations for refactoring.
2008/3/22, Peter :
>
> > Hello.
> >
> > Clone Digger project is aimed to find software clones (duplicate code) in
> > Python and Java programs.
> >
> > I have applied it to the source of BioPython and discovered several clone
> > candidates.
> >
> > There are a lot of false positives caused by similar code in
> > nlmmedline_*_format.py files, but maybe other clone candidates will be
> > interesting for you.
> >
> > The results can be seen here:
> > http://clonedigger.sourceforge.net/examples.html
>
>
> Interesting. Does your tool know to ignore deprecated modules? e.g.
> when we have essentially copied a file from one location to another, and
> deprecated the original.
>
> Some of these are from scanner/consumer parsers where there are two
> alternative consumers turning the data into different object
> representations.
>
> Other things like providing dictionary like objects seem to be reusing
> a lot of "boiler plate" code, and could probably be rationalised into
> a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and
> Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py
>
> Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both
> sharing apparently identical catalyse() methods may fall into the same
> class.
>
>
> Peter
>
--
Best regards,
Peter Bulychev.
From sbassi at gmail.com Tue Mar 25 16:46:48 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Tue, 25 Mar 2008 17:46:48 -0300
Subject: [Biopython-dev] Can't login into wiki
Message-ID:
Hello,
I press the link to login into the wiki
(http://biopython.org/w/index.php?title=Special:Userlogin&returnto=Biopython)
but I am redirected to the same page without a login prompt.
I found that this URL is dead (404):
http://biopython.org/DIST/docs/api/public/trees.html
(and it is linked from http://biopython.org/wiki/Getting_Started , last link).
--
Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Tutorial libre de Python: http://tinyurl.com/2az5d5
From biopython at maubp.freeserve.co.uk Tue Mar 25 16:55:23 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 25 Mar 2008 20:55:23 +0000
Subject: [Biopython-dev] Can't login into wiki
In-Reply-To:
References:
Message-ID: <320fb6e00803251355i7c0102d3l90e5d4680282922b@mail.gmail.com>
On Tue, Mar 25, 2008 at 8:46 PM, Sebastian Bassi wrote:
> Hello,
>
> I press the link to login into the wiki
> (http://biopython.org/w/index.php?title=Special:Userlogin&returnto=Biopython)
> but I am redirected to the same page without a login prompt.
It's not just you - the wiki is being a bit odd for me right now too,
empty PHP pages etc. Maybe it needs rebooting again... which I think
happens automatically every so often. If it doesn't clear up I'll
email the OBF guys tomorrow.
> I found that this URL is dead (404):
> http://biopython.org/DIST/docs/api/public/trees.html
> (and it is linked from http://biopython.org/wiki/Getting_Started , last link).
It should probably be http://biopython.org/DIST/docs/api/ (the link
documentation page is fine).
Peter
From mjldehoon at yahoo.com Tue Mar 25 19:55:20 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Tue, 25 Mar 2008 16:55:20 -0700 (PDT)
Subject: [Biopython-dev] Can't login into wiki
In-Reply-To: <320fb6e00803251355i7c0102d3l90e5d4680282922b@mail.gmail.com>
Message-ID: <912788.63526.qm@web62415.mail.re1.yahoo.com>
Peter wrote:
> > I found that this URL is dead (404):
> > http://biopython.org/DIST/docs/api/public/trees.html
> > (and it is linked from http://biopython.org/wiki/Getting_Started , last link).
> It should probably be http://biopython.org/DIST/docs/api/ (the link
> documentation page is fine).
I fixed this link now.
--Michiel
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 08:24:09 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 08:24:09 -0400
Subject: [Biopython-dev] [Bug 2475] New: BioSQL.Loader should reuse existing
taxon entries in lineage
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
Summary: BioSQL.Loader should reuse existing taxon entries in
lineage
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
Based on a report on the mailing list by Eric Gibert,
http://lists.open-bio.org/pipermail/biopython/2008-March/004137.html
http://lists.open-bio.org/pipermail/biopython/2008-March/004147.html
The _get_taxon_id() function will add new entries to the taxon and taxon_name
tables when a species isn't already defined. It will also generate entries for
the lineage (for which we don't know the NCBI taxon names). At this point it
*should* be re-using any existing entries for elements of the lineage.
Note - this is complicated due to the re-use of the same Latin names in
different classes. It might be easier/safer just not to write the lineage at
all?
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 08:34:40 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 08:34:40 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803261234.m2QCYekn009310@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 08:34 EST -------
See also Bug 2422 and this thread on the BioSQL mailing list:
http://lists.open-bio.org/pipermail/biosql-l/2008-March/001196.html
In particular Hilmar Lapp from BioSQL wrote in reply to trying to reuse
existing taxon table entries based on string matching to the scientific name
field in the taxon_name table, which I said sounded a little unreliable:
> It's pretty unreliable actually. There is not only synonymy
> but also rampant homonymy in taxonomic names. There are
> plenty of examples for the same scientific name in use for a
> plant and for some animal, for example. So in order to be
> unambiguous you will need to know (and check) the kingdom.
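The homonymy problem described in the quote can be illustrated with a minimal sketch. It assumes a hypothetical in-memory list standing in for the taxon and taxon_name tables; the ids and the find_taxa() helper are invented for illustration and are not part of BioSQL or Biopython. The genus name Morus is used for both mulberries (plants) and gannets (birds), so a name-only lookup returns two candidates.

```python
# Hypothetical stand-in for the taxon/taxon_name tables (ids are made up).
TAXA = [
    {"taxon_id": 1, "name": "Morus", "kingdom": "Viridiplantae"},  # mulberries
    {"taxon_id": 2, "name": "Morus", "kingdom": "Metazoa"},        # gannets
    {"taxon_id": 3, "name": "Homo", "kingdom": "Metazoa"},
]


def find_taxa(name, kingdom=None):
    """Return ids of all taxa with this scientific name.

    If kingdom is given, only taxa in that kingdom are returned.
    """
    return [
        taxon["taxon_id"]
        for taxon in TAXA
        if taxon["name"] == name
        and (kingdom is None or taxon["kingdom"] == kingdom)
    ]
```

Matching on the name alone gives an ambiguous answer for Morus, while adding the kingdom narrows it to a single entry.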
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 08:44:01 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 08:44:01 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803261244.m2QCi15R009864@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 08:44 EST -------
Created an attachment (id=883)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=883&action=view)
Patch to BioSQL/Loader.py to not record the lineage for new species
This patch takes the simple route out - when loading a sequence into the
database with a new species (not already in the taxon tables), we ONLY add the
new species to the taxon and taxon_name tables. This DOES NOT attempt to
record the whole lineage, adding or reusing existing taxon entries.
Both the test_BioSQL and test_BioSQL_SeqIO unit tests still pass with this.
I prefer this solution as it avoids any ambiguous heuristics in matching
existing taxon names based on string comparisons. This does mean Biopython
won't match BioPerl in this regard, as I understand that BioPerl currently
tries to record the full lineage.
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 14:51:18 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 14:51:18 -0400
Subject: [Biopython-dev] [Bug 2477] New: SeqIO.parse does not handle embl
files
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
Summary: SeqIO.parse does not handle embl files
Product: Biopython
Version: Not Applicable
Platform: Macintosh
OS/Version: Mac OS
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: p.foster at nhm.ac.uk
This is in 1.45, but I did not see it in 1.43.
(1.45 is not a Bugzilla option at the moment ...)
If fh is a handle to an embl format file, then
SeqIO.parse(fh, 'embl')
dies. It worked (not perfectly) in 1.43.
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 15:21:41 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 15:21:41 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803261921.m2QJLfbe007389@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Version|Not Applicable |1.45
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 15:21 EST -------
I've fixed the Bugzilla version field - thanks for the reminder.
Could you give more information please? e.g. a specific EMBL file, and the
error you are seeing.
Thanks, Peter.
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 03:59:54 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 03:59:54 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803270759.m2R7xsXA006767@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
------- Comment #2 from p.foster at nhm.ac.uk 2008-03-27 03:59 EST -------
Created an attachment (id=888)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=888&action=view)
test case
It is a multi-bug. There is a bug that prevents 1.45 from reading embl files,
and there is another bug, visible in 1.43 (at least) where it at least parses
embl files, but imperfectly.
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 06:49:33 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 06:49:33 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803271049.m2RAnXpj015624@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 06:49 EST -------
Thanks for the clarification. I can reproduce the problem here.
It looks like they may have tweaked their file format slightly. Biopython will
be ignoring the apparently new PA line, which isn't described here:
http://www.ebi.ac.uk/webin-align/fflink2.html
You can also fetch the first problem record from their webpage, choose "Save",
"ASCII text/table", "complete entries"
http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-e+[EMBLCDS:AAA03323]+-newId
As a minor point, personally I find the following style simpler:
from Bio import SeqIO
fName = 'twoEmblRecords.embl'
f = file(fName)
s = SeqIO.parse(f, 'embl')
for rec in s :
    print rec.description
    print rec.annotations['taxonomy']
f.close()
(you may of course have good reason for using the .next() method explicitly)
I'll take a look at this bug now...
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 07:37:16 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 07:37:16 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803271137.m2RBbGvg018455@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 07:37 EST -------
As you said, this is a multi-part bug!
To try this out, you will need to update files Bio/GenBank/Scanner.py and
__init__.py which are now in CVS. If you are not familiar with CVS, the easier
method would be to download the two files from here:
http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/?cvsroot=biopython#dirlist
Note there is an hour or so time delay before it will show my changes. You can
see where the files should be put from the stack trace.
Please let me know how you get on (by posting on this bug).
Missing AC lines
================
All our existing EMBL test cases included an AC line, and Biopython 1.45 was
failing because of the missing AC line in your example, which was used to set
the SeqRecord's id property. I have updated CVS to fall back on the ID line.
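The fallback described above can be sketched as follows. This is not the code committed to Bio/GenBank/Scanner.py; choose_record_id() is a hypothetical helper, and it assumes EMBL header lines with a two-letter code followed by three spaces.

```python
def choose_record_id(header_lines):
    """Return the first accession from the AC line, else the name on the ID line."""
    id_line_name = None
    for line in header_lines:
        if line.startswith("AC   "):
            # The AC line lists one or more accessions separated by semicolons.
            return line[5:].split(";")[0].strip()
        if line.startswith("ID   ") and id_line_name is None:
            # The entry name is the first token on the ID line.
            id_line_name = line[5:].replace(";", " ").split()[0]
    if id_line_name is None:
        raise ValueError("no AC or ID line found")
    return id_line_name
```

With a made-up header such as "ID   EXAMPLE1; SV 1; linear; ..." and no AC line, the sketch falls back to the entry name from the ID line.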
Multiple DE lines
=================
Already fixed as of Biopython 1.44
Multiple OC lines
=================
Updated Biopython CVS to cope with multi-line taxonomy lineage
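As an illustration only (this is not the code in CVS), a lineage spread over several OC lines can be joined before splitting on semicolons. parse_taxonomy() is a hypothetical helper name.

```python
def parse_taxonomy(lines):
    """Collect the lineage from one or more OC lines into a list of names."""
    # Join the payload of every OC line, then split the result on semicolons.
    text = " ".join(line[5:].strip() for line in lines if line.startswith("OC   "))
    return [name.strip() for name in text.rstrip(".").split(";") if name.strip()]
```

Joining first means an entry that wraps across two OC lines yields one list rather than two partial ones.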
PA lines (parent accessions)
============================
You didn't report this, but we currently are ignoring the PA lines.
Quoting ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt
PA line - contains the accession.version of the "parent" EMBL entry
(entry where the CDS is annotated)
e.g. a whole contig, not just this one CDS/gene. We could record this in the
SeqRecord's annotations dictionary as a list of strings under key
'parent-accessions'. What do you think?
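A rough sketch of that idea, using a plain dictionary in place of the SeqRecord's annotations; collect_parent_accessions() is a hypothetical helper and the PA values in any usage are made up.

```python
def collect_parent_accessions(lines, annotations):
    """Append the payload of each PA line to annotations['parent-accessions']."""
    for line in lines:
        if line.startswith("PA   "):
            # Create the list on first use, then append the accession.version.
            annotations.setdefault("parent-accessions", []).append(line[5:].strip())
    return annotations
```

Records without PA lines are left without the key, so existing code that never looks for it is unaffected.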
Peter
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 11:50:36 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 11:50:36 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803271550.m2RFoacs002027@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
------- Comment #5 from p.foster at nhm.ac.uk 2008-03-27 11:50 EST -------
I got those two files, and they seem to have fixed everything. Thanks muchly.
The suggestion of de-ignoring the PA line sounds fine (although I have no use
for it at the moment).
-Peter F.
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 12:22:13 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 12:22:13 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803271622.m2RGMDWV003784@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 12:22 EST -------
OK, marking as fixed. I also included AAA03323 as a unit test, as we were
lacking an example without an AC line.
I'll leave the PA line issue alone for the time being; it would be wise to
check if there are any parallels in GenBank or SwissProt/UniProt before doing
anything so that they are all handled consistently.
Thanks for your report Peter.
Peter
From bugzilla-daemon at portal.open-bio.org Sat Mar 29 22:53:41 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 29 Mar 2008 22:53:41 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803300253.m2U2rfLl002179@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #3 from ericgibert at yahoo.fr 2008-03-29 22:53 EST -------
I would like to propose the following solution:
1) add an extra optional parameter to load(): fetchNCBItaxonomy = False --> so
no impact on existing code. If the users call the load function with True then:
2) after the species is inserted into the taxon/taxon_name tables, the XML data
from NCBI's taxonomy database is fetched
3) the XML data is used to update the taxon/taxon_name tables, respecting the
uniqueness of the records
I already have part of the code; I just need to change it so that if a taxon
already exists then the new taxon points to this already existing one.
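The proposal can be sketched with a toy loader. Everything here is an assumption for illustration: SketchLoader is not BioSQL.Loader, the taxa live in a dictionary rather than SQL tables, and fetch_lineage is a hypothetical callable standing in for the NCBI lookup.

```python
class SketchLoader:
    """Toy stand-in for a BioSQL loader; stores taxa in a dict instead of SQL."""

    def __init__(self, fetch_lineage):
        # fetch_lineage: hypothetical callable mapping a taxon id to a list of
        # (taxon_id, name) pairs for its ancestors.
        self.taxa = {}
        self.inserted = 0
        self._fetch_lineage = fetch_lineage

    def _add_taxon(self, taxon_id, name):
        # Reuse an existing entry instead of inserting a duplicate.
        if taxon_id not in self.taxa:
            self.taxa[taxon_id] = name
            self.inserted += 1

    def load(self, taxon_id, name, fetch_ncbi_taxonomy=False):
        # The default of False keeps the behaviour of existing callers unchanged.
        self._add_taxon(taxon_id, name)
        if fetch_ncbi_taxonomy:
            for parent_id, parent_name in self._fetch_lineage(taxon_id):
                self._add_taxon(parent_id, parent_name)
```

Because entries are keyed by taxon id rather than by name, a second species sharing the same ancestors adds only its own entry.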
Comments?
Eric
From bugzilla-daemon at portal.open-bio.org Sun Mar 30 07:41:25 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 30 Mar 2008 07:41:25 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803301141.m2UBfPMC001648@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-30 07:41 EST -------
I quite like the idea of fetching the new taxon information from the NCBI as
needed to record an accurate lineage. However, what happens if:
(a) The network is down? Raise an exception maybe?
(b) The NCBI doesn't have this Taxon ID (i.e. it's invalid or so new their
database is out of date)? Raise an exception?
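One possible answer to both questions is to turn each failure into an explicit exception. The sketch below is only an assumption about how that might look: TaxonomyLookupError and fetch_lineage_or_raise() are hypothetical names, and fetch is a caller-supplied function standing in for the real network call.

```python
class TaxonomyLookupError(Exception):
    """Raised when lineage information cannot be obtained (hypothetical)."""


def fetch_lineage_or_raise(taxon_id, fetch):
    """Call fetch(taxon_id) and raise a clear error for both failure modes."""
    try:
        lineage = fetch(taxon_id)
    except OSError as err:
        # Case (a): network problems; urllib's URLError is an OSError subclass.
        raise TaxonomyLookupError("could not reach the taxonomy service") from err
    if not lineage:
        # Case (b): the service answered but does not know this taxon id.
        raise TaxonomyLookupError("taxon id %s not found" % taxon_id)
    return lineage
```

A caller that prefers to continue without a lineage can catch the single exception type and fall back to recording the species alone.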
Eric, could you attach your taxonomy XML code to this bug? We'd probably want
to start by adding taxonomy XML parsing to Bio.Entrez (which I assume you are
using to fetch the XML data).
What about sequences where we don't have a taxon ID, but we do have a species
name? (which may happen with a sequence which wasn't read from a GenBank
file).
From mjldehoon at yahoo.com Sun Mar 30 10:49:41 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sun, 30 Mar 2008 07:49:41 -0700 (PDT)
Subject: [Biopython-dev] Bio.Entrez XML parsing
Message-ID: <864047.785.qm@web62410.mail.re1.yahoo.com>
> Eric, could you attach your taxonomy XML code to this bug?
> We'd probably want to start by adding taxonomy XML parsing
> to Bio.Entrez (which I assume you are using to fetch the XML data).
I've done some thinking about XML parsers for Bio.Entrez.
I propose to add a function read() to Bio.Entrez, which returns a record suitable for the type of XML file we're trying to read (as determined by the corresponding DTD file).
Now, the various XML types can be very different from each other, and I think the actual parsing should be done by a specialized submodule of Bio.Entrez. For example, one Bio.Entrez.EInfo, one Bio.Entrez.ESummary, and so on. For Bio.Entrez.EFetch, there seem to be many different XMLs, so we'd probably have a number of submodules for it (one of them for the taxonomy XML).
The first tag received by the read() function in Bio.Entrez tells it which type of XML it is receiving (have a look at the XML files shown in chapter 6 of the tutorial for some examples), and can then decide which of the submodules of Bio.Entrez should be used for the actual parsing. Otherwise, the read() function in Bio.Entrez does very little; the actual work is done by the submodules.
If the read() function encounters an XML type for which no parser is yet available, it can raise a NotImplementedError exception.
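A minimal sketch of this dispatch, under stated assumptions: the two parser functions below stand in for the proposed submodules, they extract only a couple of fields, and the whole thing uses the standard library's ElementTree rather than whatever parser Bio.Entrez would end up using.

```python
import io
import xml.etree.ElementTree as ET


def parse_einfo(root):
    # Stand-in for a Bio.Entrez.EInfo submodule: list the database names.
    return {"databases": [element.text for element in root.iter("DbName")]}


def parse_taxaset(root):
    # Stand-in for a taxonomy EFetch submodule: id and name of each taxon.
    return [
        {"id": taxon.findtext("TaxId"), "name": taxon.findtext("ScientificName")}
        for taxon in root.findall("Taxon")
    ]


# Map the first (root) tag of the XML to the function that parses it.
_PARSERS = {"eInfoResult": parse_einfo, "TaxaSet": parse_taxaset}


def read(handle):
    """Parse XML from a handle, choosing the parser from the root tag."""
    root = ET.fromstring(handle.read())
    parser = _PARSERS.get(root.tag)
    if parser is None:
        raise NotImplementedError("no parser available for <%s>" % root.tag)
    return parser(root)


record = read(io.StringIO("<eInfoResult><DbList><DbName>pubmed</DbName></DbList></eInfoResult>"))
```

Here read() itself does very little beyond looking at the root tag, which matches the division of labour described above.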
Comments, anybody?
--Michiel
From sdavis2 at mail.nih.gov Sun Mar 30 20:51:07 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Sun, 30 Mar 2008 20:51:07 -0400
Subject: [Biopython-dev] Bio.Entrez XML parsing
In-Reply-To: <864047.785.qm@web62410.mail.re1.yahoo.com>
References: <864047.785.qm@web62410.mail.re1.yahoo.com>
Message-ID: <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com>
On Sun, Mar 30, 2008 at 10:49 AM, Michiel de Hoon wrote:
>
> > Eric, could you attach your taxonomy XML code to this bug?
> > We'd probably want to start by adding taxonomy XML parsing
> > to Bio.Entrez (which I assume you are using to fetch the XML data).
>
> I've done some thinking about XML parsers for Bio.Entrez.
>
> I propose to add a function read() to Bio.Entrez, which returns a record suitable for the type of XML file we're trying to read (as determined by the corresponding DTD file).
>
> Now, the various XML types can be very different from each other, and I think the actual parsing should be done by a specialized submodule of Bio.Entrez. For example, one Bio.Entrez.EInfo, one Bio.Entrez.ESummary, and so on. For Bio.Entrez.EFetch, there seem to be many different XMLs, so we'd probably have a number of submodules for it (one of them for the taxonomy XML).
>
> The first tag received by the read() function in Bio.Entrez tells it which type of XML it is receiving (have a look at the XML files shown in chapter 6 of the tutorial for some examples), and can then decide which of the submodules of Bio.Entrez should be used for the actual parsing. Otherwise, the read() function in Bio.Entrez does very little; the actual work is done by the submodules.
>
> If the read() function encounters an XML type for which no parser is yet available, it can raise a NotImplementedError exception.
>
> Comments, anybody?
This makes sense. However, it seems that there needs to be a way to
"register" a parser with read() so that users can extend their local
installation with a specialized parser. In other words, it seems that
a way to dynamically register a parser with read() would be helpful.
Or am I missing something?
Sean
From biopython at maubp.freeserve.co.uk Mon Mar 31 07:25:05 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 31 Mar 2008 12:25:05 +0100
Subject: [Biopython-dev] Bio.Entrez XML parsing
In-Reply-To: <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com>
References: <864047.785.qm@web62410.mail.re1.yahoo.com>
<264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com>
Message-ID: <320fb6e00803310425u478fc938w2ff426c4eae32d99@mail.gmail.com>
On Mon, Mar 31, 2008 at 1:51 AM, Sean Davis wrote:
> This makes sense. However, it seems that there needs to be a way to
> "register" a parser with read() so that users can extend their local
> installation with a specialized parser. In other words, it seems that
> a way to dynamically register a parser with read() would be helpful.
> Or am I missing something?
I like Michiel's plan. The mapping could be as simple as a (private)
dictionary in Bio.Entrez, mapping formats to parser objects/functions
- as done in Bio.SeqIO - which lets the user add new parsers or
override the built in ones should they so desire.
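The dictionary-based registration could look roughly like this. It is a sketch under assumptions, not the Bio.Entrez design: register_parser() and the module-level _parsers dictionary are hypothetical names, and the parser used in any example is a placeholder.

```python
import xml.etree.ElementTree as ET

# Private mapping from root tag to parser function.
_parsers = {}


def register_parser(root_tag, parser):
    """Add a parser for root_tag, replacing any parser already registered."""
    _parsers[root_tag] = parser


def read(xml_text):
    """Parse xml_text with whichever parser is registered for its root tag."""
    root = ET.fromstring(xml_text)
    parser = _parsers.get(root.tag)
    if parser is None:
        raise NotImplementedError("no parser registered for <%s>" % root.tag)
    return parser(root)
```

A user could then call register_parser() with their own function to support a new XML type locally, or to override a built-in parser, without editing the library.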
Peter
From tiagoantao at gmail.com Mon Mar 31 10:54:38 2008
From: tiagoantao at gmail.com (Tiago Antão)
Date: Mon, 31 Mar 2008 15:54:38 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
Message-ID: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
Hi,
I would like to start working on the statistical part (actually the
most important part) of Bio.PopGen and on the HapMap part.
My problem is with the CVS to SVN conversion. I cannot understand if I
can go forward and where (i.e. on the SVN or the CVS repository)?
In any case, I can wait with committing, so there is no rush, but
eventually I will have to commit somewhere ;)
Tiago
From bugzilla-daemon at portal.open-bio.org Mon Mar 31 11:22:20 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 31 Mar 2008 11:22:20 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803311522.m2VFMKvU003831@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #5 from ericgibert at yahoo.fr 2008-03-31 11:22 EST -------
I attached the XML parser. Note that I did not dig too far in raising errors.
This is not yet the full solution for the taxon/taxon_name tables of BioSQL but
the first step.
Please comment on my programming style and if you want me to raise errors. Note
that Bio.Entrez already raises some errors.
From bugzilla-daemon at portal.open-bio.org Mon Mar 31 11:24:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 31 Mar 2008 11:24:06 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803311524.m2VFO6wc004008@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #6 from ericgibert at yahoo.fr 2008-03-31 11:24 EST -------
Created an attachment (id=890)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=890&action=view)
Parse a Taxonomy record from NCBI
From biopython at maubp.freeserve.co.uk Mon Mar 31 11:45:07 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 31 Mar 2008 16:45:07 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
In-Reply-To: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
Message-ID: <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
On Mon, Mar 31, 2008 at 3:54 PM, Tiago Antão wrote:
> Hi,
>
> I would like to start working on the statistical part (actually the
> most important part) of Bio.PopGen and on the HapMap part.
>
> My problem is with the CVS to SVN conversion. I cannot understand if I
> can go forward and where (i.e. on the SVN or the CVS repository)?
>
> In any case, I can wait with committing, so there is no rush, but
> eventually I will have to commit somewhere ;)
In the short term, we are still using CVS. I've only been making
relatively small changes as I anticipate the move to SVN will happen
shortly...
Are there any objections to doing it in the next fortnight? Chris -
could you find out when would suit the OBF guys? Maybe come up with
two suggested time slots in the next month?
Peter
From tiagoantao at gmail.com Mon Mar 31 14:32:06 2008
From: tiagoantao at gmail.com (Tiago Antão)
Date: Mon, 31 Mar 2008 19:32:06 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
In-Reply-To: <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
<320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
Message-ID: <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com>
When on SVN I would like to consider branching for PopGen. AFAIK
branching on svn costs very little (only when you make changes does
SVN copy the content from the original branch).
This would have the big advantage that I could make my changes freely
without impact on Michiel's release cycle (or breaking the SVN head
for some reason). Whenever I get something stable I just merge back.
There are good reasons NOT to branch, so this might not be a good
idea... But considering that I am the only person that changes PopGen
I don't think merging will be an issue at all... Any comments?
On Mon, Mar 31, 2008 at 4:45 PM, Peter wrote:
>
> On Mon, Mar 31, 2008 at 3:54 PM, Tiago Antão wrote:
> > Hi,
> >
> > I would like to start working on the statistical part (actually the
> > most important part) of Bio.PopGen and on the HapMap part.
> >
> > My problem is with the CVS to SVN conversion. I cannot understand if I
> > can go forward and where (i.e. on the SVN or the CVS repository)?
> >
> > In any case, I can wait with committing, so there is no rush, but
> > eventually I will have to commit somewhere ;)
>
> In the short term, we are still using CVS. I've only been making
> relatively small changes as I anticipate the move to SVN will happen
> shortly...
>
> Are there any objections to doing it in the next fortnight? Chris -
> could you find out when would suit the OBF guys? Maybe come up with
> two suggested time slots in the next month?
>
> Peter
>
--
http://www.tiago.org
From biopython at maubp.freeserve.co.uk Mon Mar 31 15:04:35 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 31 Mar 2008 20:04:35 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
In-Reply-To: <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com>
References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
<320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
<6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com>
Message-ID: <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com>
On Mon, Mar 31, 2008 at 7:32 PM, Tiago Antão wrote:
> When on SVN I would like to consider branching for PopGen. AFAIK
> branching on svn costs very little (only when you make changes does
> SVN copy the content from the original branch).
>
> This would have the big advantage that I could make my changes freely
> without impact on Michiel's release cycle (or breaking the SVN head
> for some reason). Whenever I get something stable I just merge back.
>
> There are good reasons NOT to branch, so this might not be a good
> idea... But considering that I am the only person that changes PopGen
> I don't think merging will be an issue at all... Any comments?
I had been wondering about taking advantage of SVN to explore my
Bio.AlignIO plans and/or improvements to the alignment object. I
think I will need to read up on SVN and how it handles merges and
branches before I try this.
There is a lot to be said for having a single stable trunk - it
certainly makes things simpler for any new developers to get to grips
with things.
Peter
From tiagoantao at gmail.com Mon Mar 31 15:08:46 2008
From: tiagoantao at gmail.com (Tiago Antão)
Date: Mon, 31 Mar 2008 20:08:46 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
In-Reply-To: <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com>
References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
<320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
<6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com>
<320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com>
Message-ID: <6d941f120803311208k6b6c9d1ah58c7808e0fbd0e2c@mail.gmail.com>
On Mon, Mar 31, 2008 at 8:04 PM, Peter wrote:
> There is a lot to be said for having a single stable trunk - it
> certainly makes things simpler for any new developers to get to grips
> with things.
It is one of those issues where there is no clear answer. Maybe a case
by case analysis? I think having 5 gazillion branches would not be a
good idea ever, but in the Biopython case many modules are somewhat
self contained, making merging an easier exercise.
Tiago
From tiagoantao at gmail.com Mon Mar 31 18:13:11 2008
From: tiagoantao at gmail.com (Tiago Antão)
Date: Mon, 31 Mar 2008 23:13:11 +0100
Subject: [Biopython-dev] Genbank dbSNP support
Message-ID: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com>
Hi,
Any plans for dbSNP support?
http://www.ncbi.nlm.nih.gov/SNP/index.html
I think I would volunteer to implement this. A simple solution would
be to add both databases and return types. Michiel (I suppose this is
code that you are actively maintaining, or is it Peter?), can I send
you a diff? I have done this once already for genome -
http://portal.open-bio.org/pipermail/biopython/2007-January/003347.html
dbSNP can return different types (
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html#rettypeparam
) so a few parsers would be needed for complete support. But that can
be done later...
--
http://www.tiago.org
From biopython at maubp.freeserve.co.uk Mon Mar 31 19:01:10 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Apr 2008 00:01:10 +0100
Subject: [Biopython-dev] Genbank dbSNP support
In-Reply-To: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com>
References: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com>
Message-ID: <320fb6e00803311601x573c104cx1beb7035a14ef03c@mail.gmail.com>
On Mon, Mar 31, 2008 at 11:13 PM, Tiago Antão wrote:
> Hi,
>
> Any plans for dbSNP support?
> http://www.ncbi.nlm.nih.gov/SNP/index.html
>
> I think I would volunteer to implement this. A simple solution would
> be to add both databases and return types. Michiel (I suppose this is
> code that you are actively maintaining, or is it Peter?), can I send
> you a diff? I have done this once already for genome -
> http://portal.open-bio.org/pipermail/biopython/2007-January/003347.html
I think Michiel has been dealing with this sort of stuff
(NCBIDictionary and Bio.Entrez). I would file an enhancement bug, and
attach your patch to it.
> dbSNP can return different types (
> http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html#rettypeparam
> ) so a few parsers would be needed for complete support. But that can
> be done later...
We should already be able to parse their Fasta, GenBank or GenPept
output. The lists of IDs should also be trivial. I haven't looked at
the other formats.
Peter
From bugzilla-daemon at portal.open-bio.org Mon Mar 31 19:23:46 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 31 Mar 2008 19:23:46 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803312323.m2VNNku4026068@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #7 from ericgibert at yahoo.fr 2008-03-31 19:23 EST -------
Created an attachment (id=891)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=891&action=view)
refactoring and search by name
Please discard previous attachment. This newer version includes a static method
returning a list of Taxonomy based on a scientific name.
It is then possible to test the len of the return list:
0 for no match, 1 for a unique taxon, more if ambiguity.
Ambiguity can be cleared using the get_taxon_by_rank("order") for example.
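The attachment itself is not reproduced in this archive, so the following is only a guess at the shape of such an interface. The Taxon class and find_by_name() are hypothetical; get_taxon_by_rank() borrows its name from the comment above, but its behaviour here is an assumption. The lineage data is typed in by hand rather than fetched from NCBI.

```python
class Taxon:
    """Hypothetical taxonomy record: a scientific name plus rank -> name lineage."""

    def __init__(self, name, lineage):
        self.name = name
        self.lineage = lineage

    def get_taxon_by_rank(self, rank):
        # Return the name of the ancestor at this rank, or None if not recorded.
        return self.lineage.get(rank)


def find_by_name(taxa, name):
    """Return every taxon whose scientific name matches exactly."""
    return [taxon for taxon in taxa if taxon.name == name]


KNOWN_TAXA = [
    Taxon("Morus", {"order": "Rosales"}),     # mulberries
    Taxon("Morus", {"order": "Suliformes"}),  # gannets
    Taxon("Homo", {"order": "Primates"}),
]
```

A caller can test len(find_by_name(KNOWN_TAXA, "Morus")), see two matches, and then keep only the one whose get_taxon_by_rank("order") gives the expected order.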
From bugzilla-daemon at portal.open-bio.org Mon Mar 31 20:57:35 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 31 Mar 2008 20:57:35 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200804010057.m310vZqG029753@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
ericgibert at yahoo.fr changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #890 is|0 |1
obsolete| |
From bugzilla-daemon at portal.open-bio.org Sat Mar 1 03:04:14 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 29 Feb 2008 22:04:14 -0500
Subject: [Biopython-dev] [Bug 2464] New: from Bio import db doesn't work?
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2464
Summary: from Bio import db doesn't work?
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: blocker
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: patrikd at gmail.com
Just trying to run an example straight out of the BioPython cookbook:
ncbi_dict = GenBank.NCBIDictionary("nucleotide", "genbank")
Traceback (most recent call last):
File "", line 1, in
ncbi_dict = GenBank.NCBIDictionary("nucleotide", "genbank")
File "C:\Program Files\Python25\lib\site-packages\Bio\GenBank\__init__.py",
line 1283, in __init__
from Bio import db
ImportError: cannot import name db
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sat Mar 1 08:54:19 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 1 Mar 2008 03:54:19 -0500
Subject: [Biopython-dev] [Bug 2464] from Bio import db doesn't work?
In-Reply-To:
Message-ID: <200803010854.m218sJFT023721@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2464
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |DUPLICATE
------- Comment #1 from mdehoon at ims.u-tokyo.ac.jp 2008-03-01 03:54 EST -------
Duplicate of Bug #2393, which was fixed in CVS.
*** This bug has been marked as a duplicate of bug 2393 ***
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sat Mar 1 08:54:23 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 1 Mar 2008 03:54:23 -0500
Subject: [Biopython-dev] [Bug 2393] Bio.GenBank.NCBIDictionary fails with
release 1.44
In-Reply-To:
Message-ID: <200803010854.m218sNPD023746@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2393
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |patrikd at gmail.com
------- Comment #13 from mdehoon at ims.u-tokyo.ac.jp 2008-03-01 03:54 EST -------
*** Bug 2464 has been marked as a duplicate of this bug. ***
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
From mjldehoon at yahoo.com Sat Mar 1 08:52:16 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sat, 1 Mar 2008 00:52:16 -0800 (PST)
Subject: [Biopython-dev] deprecation?
In-Reply-To: <47C7FB4C.40607@umh.es>
Message-ID: <504389.32803.qm@web62405.mail.re1.yahoo.com>
Dear Gregorio,
Thanks for letting us know.
Could you show us what exactly you are trying to do in your script?
This function was deprecated because there were several functions in Biopython doing nearly the same thing, and we're trying to converge on one function.
So probably, the best thing would be to avoid using Bio\config\DBRegistry.py.
--Michiel.
Gregorio Fernandez wrote: Dear Sir,
I had this message in one of my scripts. Can I have this feature
available?
C:\Python25\lib\site-packages\Bio\config\DBRegistry.py:149:
DeprecationWarning:
Concurrent behavior has been deprecated, as this functionality needs
Bio.MultiProc, which itself has been deprecated. If you need the concurrent
behavior, please let the Biopython developers know by sending an email to
biopython-dev at biopython.org to avoid permanent removal of this feature.
DeprecationWarning)
Thanks
Gregorio
--
Gregorio J. Fernandez Ballester
Instituto de Biología Molecular y Celular
Universidad Miguel Hernández
Edificio Torregaitán.
Avda. de la Universidad, s/n. 03202
Elche (Alicante)
E-mail: gregorio at umh.es
Telf: 966 65 84 41
Fax: 966 65 87 58
_______________________________________________
Biopython-dev mailing list
Biopython-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython-dev
From bugzilla-daemon at portal.open-bio.org Mon Mar 3 21:53:59 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 3 Mar 2008 16:53:59 -0500
Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes
assert to fail when it should pass
In-Reply-To:
Message-ID: <200803032153.m23LrxP4023475@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2437
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-03 16:53 EST -------
Defining __eq__ and __ne__ methods for the Alphabet class would probably work,
but we would also have to do this for the AlphabetEncoder "decorator" class.
I'm a little wary of this...
def __ne__(self, other) :
    """Check if this alphabet object != another alphabet"""
    return not self == other
def __eq__(self, other) :
    """Check if this alphabet object == another alphabet"""
    #TODO - what exactly do we want to check here?
    if id(self) == id(other) :
        return True
    if not isinstance(other, Alphabet) \
    and not isinstance(other, AlphabetEncoder):
        raise ValueError("Comparing an alphabet to a non-alphabet")
    if self.__class__ != other.__class__ :
        return False
    if self.size != other.size :
        return False
    if self.letters != other.letters :
        return False
    if dir(self) != dir(other) :
        return False
    for attr in ["gap_char", "stop_symbol"] :
        if hasattr(self, attr) != hasattr(other, attr) :
            return False
        if hasattr(self, attr) and hasattr(other, attr) \
        and getattr(self, attr) != getattr(other, attr) :
            return False
    #Close enough?
    return True
Relaxing the assertion in Bio.Translate would be much safer in terms of any
potential side effects.
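A minimal sketch of what such an equality check could look like in current Python 3, kept deliberately small. The classes below (Alphabet, DNAAlphabet, ProteinAlphabet) are hypothetical stand-ins written for illustration, not Biopython's actual classes, and comparing only class, size and letters is an assumption rather than a settled definition of alphabet equality. One side effect worth keeping in mind: in Python 3, a class that defines __eq__ without __hash__ becomes unhashable, so the sketch defines both.

```python
class Alphabet:
    """Hypothetical stand-in for an alphabet class (illustration only)."""

    letters = None
    size = None

    def _key(self):
        # Assumption: class, size and letters are enough to call two alphabets equal.
        return (type(self), self.size, self.letters)

    def __eq__(self, other):
        if not isinstance(other, Alphabet):
            # Let Python fall back to its default handling for foreign types.
            return NotImplemented
        return self._key() == other._key()

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Keep hashing consistent with __eq__ so instances can be dict keys.
        return hash(self._key())


class DNAAlphabet(Alphabet):
    letters = "GATC"
    size = 1


class ProteinAlphabet(Alphabet):
    letters = "ACDEFGHIKLMNPQRSTVWY"
    size = 1


same = DNAAlphabet() == DNAAlphabet()          # two distinct instances
different = DNAAlphabet() == ProteinAlphabet()  # different classes
```

Returning NotImplemented instead of raising ValueError for a non-alphabet is a design choice in this sketch: it lets comparisons such as `alphabet == 3` evaluate to False rather than fail.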
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From mjldehoon at yahoo.com Thu Mar 6 15:24:47 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 6 Mar 2008 07:24:47 -0800 (PST)
Subject: [Biopython-dev] New Biopython release
Message-ID: <48921.43822.qm@web62403.mail.re1.yahoo.com>
Hi everybody,
Let's make a new release (1.45). I'm thinking of Friday 21st, which gives us about two weeks. The current Biopython release (1.44) has a nasty bug that causes an error with one of the Bio.GenBank examples in the tutorial. This bug has since been fixed in CVS.
If you have any code that is ready to be submitted to CVS, now would be a good time to do so. If your code is not yet ready for prime time, please don't submit it to CVS until after the release to avoid any last-minute problems.
Biopython 1.44 had a large number of deprecations, but I feel it is too soon to remove them from the release completely. Bio.Blast.blast and Bio.Blast.blasturl have been deprecated for several releases now, so if there are no objections I think we should remove them. Bio.Kabat has been deprecated since release 1.43. Since it has few (if any) users, I think we should remove it too.
Also, please have a look at the Biopython bugs that are still open to see if there's anything we can do about them.
--Michiel.
From bugzilla-daemon at portal.open-bio.org Sat Mar 8 20:25:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 8 Mar 2008 15:25:06 -0500
Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes
assert to fail when it should pass
In-Reply-To:
Message-ID: <200803082025.m28KP661006291@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2437
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-08 15:25 EST -------
Changing Bio/Translate.py line 14+ and 36+ from this:
assert seq.alphabet == self.table.nucleotide_alphabet, \
...
to this:
#Allow different instances of the same class to be used:
assert seq.alphabet.__class__ == \
self.table.nucleotide_alphabet.__class__, \
...
seems to resolve the original bug report. I'd like to check this doesn't
affect any of the unit tests under Linux - Windows looks OK.
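Why the relaxed assertion helps can be seen in a small self-contained sketch. The class name below is a hypothetical stand-in, not the real Biopython alphabet class; the sketch relies only on the fact that Python objects without an __eq__ method compare by identity.

```python
class UnambiguousDNA:
    """Hypothetical stand-in for a nucleotide alphabet class."""
    letters = "GATC"


table_alphabet = UnambiguousDNA()  # instance held by the codon table
seq_alphabet = UnambiguousDNA()    # a second instance attached to a sequence

# Default equality is identity, so two separate instances are "different".
instance_check = seq_alphabet == table_alphabet

# Comparing the classes instead accepts any instance of the same class.
class_check = seq_alphabet.__class__ == table_alphabet.__class__
```

Here instance_check is False while class_check is True, which matches the behaviour described in the original bug report.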
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Mon Mar 10 10:12:13 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 10 Mar 2008 06:12:13 -0400
Subject: [Biopython-dev] [Bug 1999] new frame translation method
In-Reply-To:
Message-ID: <200803101012.m2AACD7k003033@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1999
------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp 2008-03-10 06:12 EST -------
Since SeqUtils.frameTranslations and SeqUtils.six_frame_translations are so
similar, I think we should keep only one of these functions. Preferably named
"six_frame_translations", for backward compatibility.
Also, I think we should not require the seqO argument to be a Seq object.
If this function is to replace the existing SeqUtils.six_frame_translations, we
should make sure to keep all the existing functionality of that function. I
believe the GC content calculation is currently missing in
SeqUtils.frameTranslations.
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From biopython at maubp.freeserve.co.uk Wed Mar 12 00:37:11 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 12 Mar 2008 00:37:11 +0000
Subject: [Biopython-dev] Biopython to begin transition to Subversion
In-Reply-To: <128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com>
<658418.5192.qm@web62414.mail.re1.yahoo.com>
<128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
Message-ID: <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>
Hi Chris,
I haven't heard anything about the CVS to SVN move recently. Did
anyone resolve the multiple password prompt niggle?
On another point, is the test SVN repository intended to be writable
(for those of us with developer access)? I really should try running
some things like "svn diff" and committing sample changes to get a
feel for how it compares to CVS.
Peter
From sbassi at gmail.com Wed Mar 12 14:52:51 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Wed, 12 Mar 2008 11:52:51 -0300
Subject: [Biopython-dev] BLAST XML to HTML
Message-ID:
Is there a Biopython module to convert BLAST XML output to HTML?
Like this one from BioJava:
http://www.biojava.org/wiki/BioJava:CookBook:Blast:XML
If there is no such module, could this be included in Biopython if
I provide the code?
--
Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Tutorial libre de Python: http://tinyurl.com/2az5d5
From peter at maubp.freeserve.co.uk Wed Mar 12 18:12:35 2008
From: peter at maubp.freeserve.co.uk (Peter)
Date: Wed, 12 Mar 2008 18:12:35 +0000
Subject: [Biopython-dev] BLAST XML to HTML
In-Reply-To:
References:
Message-ID: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com>
On Wed, Mar 12, 2008 at 2:52 PM, Sebastian Bassi wrote:
> Is there a Biopython module to convert BLAST XML output to HTML?
> Like this one from BioJava:
> http://www.biojava.org/wiki/BioJava:CookBook:Blast:XML
> If there is no such module, could this be included in Biopython if
> I provide the code?
Is your idea to convert the XML output of the NCBI BLAST tools
into HTML very closely resembling the NCBI's HTML output (perhaps for
another program to read as input)? Or do you just want to produce a
nice HTML page for a person to read (perhaps resembling the NCBI page
in appearance, but not using the same HTML layout)?
How would your code work - directly from the XML file, or from the
results of the existing Biopython BLAST parsers?
Peter
From sbassi at gmail.com Wed Mar 12 18:21:18 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Wed, 12 Mar 2008 15:21:18 -0300
Subject: [Biopython-dev] BLAST XML to HTML
In-Reply-To: <320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com>
References:
<320fb6e00803121112g5ce34a07y517a8c1087a031c3@mail.gmail.com>
Message-ID:
On Wed, Mar 12, 2008 at 3:12 PM, Peter wrote:
> Is your idea to convert the XML output of the NCBI BLAST tools
> into HTML very closely resembling the NCBI's HTML output (perhaps for
> another program to read as input)? Or do you just want to produce a
> nice HTML page for a person to read (perhaps resembling the NCBI page
> in appearance, but not using the same HTML layout)?
The reason is that I always run BLAST with XML output since I parse the results
with Biopython, but people at the lab want to check the HTML version (or I
want to "publish" the result in a public DB accessible via HTML) and
that makes me re-run the BLAST just for them to see the output.
Sometimes the BLAST runs are resource demanding (like a 2 week run) and I
would like to avoid re-running the BLAST when all I really want is a
format change.
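A format change of this kind can be sketched directly from the XML with the Python standard library. This is only an illustration under stated assumptions: it uses Python 3, the sample document is made up (only the element names such as Hit_id, Hit_def, Hsp_bit-score and Hsp_evalue follow the NCBI BLAST XML layout), the function name blast_xml_to_html is hypothetical, and the output is a plain table for a person to read, not a reproduction of NCBI's HTML.

```python
import html
import xml.etree.ElementTree as ET

# Made-up sample data; a real BLAST XML file contains many more fields.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>gi|1|example</Hit_id>
          <Hit_def>Made-up hit &lt;one&gt;</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>100.5</Hsp_bit-score>
              <Hsp_evalue>1e-20</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def blast_xml_to_html(xml_text):
    """Return an HTML table with one row per <Hit> in BLAST XML text."""
    root = ET.fromstring(xml_text)
    rows = []
    for hit in root.iter("Hit"):
        hsp = hit.find("Hit_hsps/Hsp")  # first HSP only, to keep the sketch short
        cells = [
            hit.findtext("Hit_id", default=""),
            hit.findtext("Hit_def", default=""),
            hsp.findtext("Hsp_bit-score", default="") if hsp is not None else "",
            hsp.findtext("Hsp_evalue", default="") if hsp is not None else "",
        ]
        # Escape every cell so characters like < and & stay valid HTML.
        row = "".join("<td>%s</td>" % html.escape(cell) for cell in cells)
        rows.append("<tr>%s</tr>" % row)
    header = ("<tr><th>ID</th><th>Description</th>"
              "<th>Bit score</th><th>E-value</th></tr>")
    return "<table>\n%s\n%s\n</table>" % (header, "\n".join(rows))


page = blast_xml_to_html(SAMPLE_XML)
```

Working from the XML file rather than from parser objects keeps the sketch independent of any particular Biopython release.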
> How would your code work - directly from the XML file, or from the
> results of the existing Biopython BLAST parsers?
From the XML output.
--
Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Tutorial libre de Python: http://tinyurl.com/2az5d5
From mjldehoon at yahoo.com Wed Mar 12 21:50:42 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Wed, 12 Mar 2008 14:50:42 -0700 (PDT)
Subject: [Biopython-dev] BLAST XML to HTML
In-Reply-To:
Message-ID: <845186.11502.qm@web62406.mail.re1.yahoo.com>
> > How would your code work - directly from the XML file, or from the
> > results of the existing Biopython BLAST parsers?
> From the XML output.
One option is to use Cascading Style Sheets (CSS) to display the XML file. That way, you don't have to create a new HTML file. Also, we should check with NCBI if they have a tool for such purposes.
--Michiel.
From sbassi at gmail.com Wed Mar 12 22:25:35 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Wed, 12 Mar 2008 19:25:35 -0300
Subject: [Biopython-dev] BLAST XML to HTML
In-Reply-To: <845186.11502.qm@web62406.mail.re1.yahoo.com>
References:
<845186.11502.qm@web62406.mail.re1.yahoo.com>
Message-ID:
On Wed, Mar 12, 2008 at 6:50 PM, Michiel de Hoon wrote:
> One option is to use Cascading Style Sheets (CSS) to display the XML file.
> That way, you don't have to create a new HTML file. Also, we should check
> with NCBI if they have a tool for such purposes.
They must have something because the new online NCBI BLAST has an
option called "reformat BLAST results". This option can reformat from
XML to HTML without re-running the BLAST, but it runs
server-side.
--
Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Tutorial libre de Python: http://tinyurl.com/2az5d5
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 10:07:44 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 06:07:44 -0400
Subject: [Biopython-dev] [Bug 2468] New: Tutorial needs a fix: Bio.WWW.NCBI
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
Summary: Tutorial needs a fix: Bio.WWW.NCBI
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Documentation
AssignedTo: biopython-dev at biopython.org
ReportedBy: mmokrejs at ribosome.natur.cuni.cz
I am trying to follow the recipe at
http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc14 which contains the
following split into several chunks (I don't like this style personally, but
that's not the issue here):
#! /usr/bin/python
from Bio.WWW import NCBI
search_command = 'Search'
search_database = 'Taxonomy'
return_format = 'FASTA'
search_term = 'Cypripedioideae'
my_browser = 'lynx'
result_handle = NCBI.query(search_command, search_database, term = search_term,
                           doptcmdl = return_format)
import os
result_file_name = os.path.join(os.getcwd(), "results.html")
result_file = open(result_file_name, "w")
result_file.write(result_handle.read())
result_file.close()
if my_browser == "lynx":
    os.system("lynx -force_html " + result_file_name)
elif my_browser == "netscape":
    os.system("netscape file:" + result_file_name)
I end up with a lynx browser opened with the Entrez search page pre-filled with
'Cypripedioideae' as the query string. Unfortunately, I have to click on the
condensed results to get the taxonomy listing under the word 'Cypripedioideae'.
The line I am talking about is close to the end of the output:
[ ] 1: Cypripedioideae, subfamily, monocots Links
BTW, the other links from the page do not work because they point to
http://localhost/....
/usr/lib/python2.5/site-packages/Bio/WWW/NCBI.py:34: DeprecationWarning:
Bio.WWW.NCBI is deprecated. The functions in Bio.WWW.NCBI are now available
from Bio.Entrez.
DeprecationWarning)
The section needs updating. I am somewhat surprised I cannot access NCBI
Taxonomy easily. Probably I will have to browse the source code and forget the
Tutorial and Cookbook. ;)
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 10:55:55 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 06:55:55 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803131055.m2DAttv2027003@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-13 06:55 EST -------
The tutorial in CVS is already updated to use Bio.Entrez.query instead of
Bio.WWW.NCBI.query, reflecting the deprecation made in CVS.
I think you are using the Biopython 1.44 tutorial (from the weblink) with the
CVS Biopython code.
So at least part of your problem is already fixed.
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Thu Mar 13 11:27:14 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Mar 2008 07:27:14 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803131127.m2DBREdM028784@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-13 07:27 EST -------
Reading the Bio.Entrez documentation, the query function always returns HTML.
You could also use the esearch function which returns XML, followed by the
efetch function which seems to support a range of options depending on the
datatype. For example, using the taxonomy db:
#This gets an XML file from the following URL,
#http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=taxonomy&term=Cypripedioideae
from Bio import Entrez
result_handle = Entrez.esearch("taxonomy", term="Cypripedioideae")
print result_handle.read()
You could then parse the XML file to extract the matching ID(s), perhaps with a
regular expression. In this case, there is only one match, 158330.
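As an alternative to a regular expression, the ID extraction step can be sketched with an XML parser from the standard library. This is an illustration in Python 3 under stated assumptions: the sample document below is hand-written to mimic an eSearch reply (an eSearchResult root with an IdList of Id elements) and is not fetched from NCBI, and extract_ids is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of an eSearch reply; not live NCBI data.
SAMPLE_ESEARCH_XML = """<eSearchResult>
  <Count>1</Count>
  <RetMax>1</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>158330</Id>
  </IdList>
</eSearchResult>"""


def extract_ids(xml_text):
    """Return the list of ID strings found under IdList/Id."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./IdList/Id")]


ids = extract_ids(SAMPLE_ESEARCH_XML)
```

The resulting list can be checked for length before passing an ID on to efetch, since a search term may match zero or several entries.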
#http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy
#&id=158330&report=brief&retmode=text
from Bio import Entrez
result_handle = Entrez.efetch("taxonomy", id="158330", \
                              report="docsum", retmode="text")
print result_handle.read()
#Given ID 158330, returns "Cypripedioideae, subfamily, monocots"
I agree that this section of the tutorial could be more useful. Do you think
the above code would be more helpful?
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From biopython at maubp.freeserve.co.uk Thu Mar 13 12:14:09 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 13 Mar 2008 12:14:09 +0000
Subject: [Biopython-dev] Bio.Entrez and the depreciated pm* functions
Message-ID: <320fb6e00803130514s4aa31c4fo494b9f1594ef0b89@mail.gmail.com>
Hi Michiel (et al),
I have a query regarding the transition from Bio.WWW.NCBI to Bio.Entrez
I notice you've marked several of the functions in Bio.Entrez with
deprecation warnings as the NCBI has retired the associated APIs,
i.e. pmfetch, pmqty and pmneighbor (while the whole of Bio.WWW.NCBI
is deprecated).
Anyone using Bio.WWW.NCBI.pmfetch, Bio.WWW.NCBI.pmqty and
Bio.WWW.NCBI.pmneighbor will get a deprecation warning from
Bio.WWW.NCBI, then if they try to switch to Bio.Entrez.pmfetch,
Bio.Entrez.pmqty and Bio.Entrez.pmneighbor they still get warnings.
Do you think we can just remove pmfetch, pmqty and pmneighbor from
Bio.Entrez so that it starts out "clean", and adjust the warning from
Bio.WWW.NCBI as follows:
import warnings
warnings.warn("Bio.WWW.NCBI is deprecated. The functions are now "
              "available from Bio.Entrez, except for the pm* functions "
              "which the NCBI have retired.", DeprecationWarning)
What do you think?
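The effect of the proposed message can be checked with a short, self-contained Python 3 sketch. The function deprecated_ncbi_import below is a hypothetical stand-in for the module-level warning in Bio.WWW.NCBI; only the warnings calls are standard library API.

```python
import warnings


def deprecated_ncbi_import():
    """Hypothetical stand-in for the warning issued when Bio.WWW.NCBI loads."""
    warnings.warn(
        "Bio.WWW.NCBI is deprecated. The functions are now available from "
        "Bio.Entrez, except for the pm* functions which the NCBI have retired.",
        DeprecationWarning,
        stacklevel=2,  # report the caller's line, not this helper
    )


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    deprecated_ncbi_import()

messages = [str(item.message) for item in caught]
categories = [item.category for item in caught]
```

With the pm* functions removed from Bio.Entrez, a user who follows this message and switches modules would see no further warning.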
Peter
From mjldehoon at yahoo.com Thu Mar 13 12:48:29 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 13 Mar 2008 05:48:29 -0700 (PDT)
Subject: [Biopython-dev] Bio.Entrez and the depreciated pm* functions
In-Reply-To: <320fb6e00803130514s4aa31c4fo494b9f1594ef0b89@mail.gmail.com>
Message-ID: <84386.61626.qm@web62415.mail.re1.yahoo.com>
That is fine with me. Maybe I was being too conservative.
I'll make those changes.
--Michiel.
Peter wrote: Hi Michiel (et al),
I have a query regarding the transition from Bio.WWW.NCBI to Bio.Entrez
I notice you've marked several of the functions in Bio.Entrez with
deprecation warnings as the NCBI has retired the associated APIs,
i.e. pmfetch, pmqty and pmneighbor (while the whole of Bio.WWW.NCBI
is deprecated).
Anyone using Bio.WWW.NCBI.pmfetch, Bio.WWW.NCBI.pmqty and
Bio.WWW.NCBI.pmneighbor will get a deprecation warning from
Bio.WWW.NCBI, then if they try to switch to Bio.Entrez.pmfetch,
Bio.Entrez.pmqty and Bio.Entrez.pmneighbor they still get warnings.
Do you think we can just remove pmfetch, pmqty and pmneighbor from
Bio.Entrez so that it starts out "clean", and adjust the warning from
Bio.WWW.NCBI as follows:
import warnings
warnings.warn("Bio.WWW.NCBI is deprecated. The functions are now "
              "available from Bio.Entrez, except for the pm* functions "
              "which the NCBI have retired.", DeprecationWarning)
What do you think?
Peter
From bugzilla-daemon at portal.open-bio.org Fri Mar 14 15:53:34 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Mar 2008 11:53:34 -0400
Subject: [Biopython-dev] [Bug 2437] comparing alphabet references causes
assert to fail when it should pass
In-Reply-To:
Message-ID: <200803141553.m2EFrY6p001573@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2437
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-14 11:53 EST -------
Fixed in CVS, Bio/Translate.py revision 1.3, as described in comment 3.
This fixes the original report, making sequence translation simpler to use -
see also Bug 2381 - translate and transcribe methods for the Seq object (in
Bio.Seq)
This change does NOT address the larger issue of how to decide if two alphabets
are equal or not.
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Fri Mar 14 16:19:40 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Mar 2008 12:19:40 -0400
Subject: [Biopython-dev] [Bug 2447] EUtils cannot parse PubMed XML for ACS
journals
In-Reply-To:
Message-ID: <200803141619.m2EGJeIC003283@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2447
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-14 12:19 EST -------
Created an attachment (id=878)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=878&action=view)
Patch to Bio/EUtils/parse.py
I'm not sure if this is the best way to fix this, but it does appear to solve
the reported problem. Can you give this a try Noel?
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 20:36:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 16:36:06 -0400
Subject: [Biopython-dev] [Bug 2363] Some python files not stored as plain
text in CVS?
In-Reply-To:
Message-ID: <200803152036.m2FKa6xR029284@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2363
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #7 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-15 16:36 EST -------
I've just done a clean checkout and build on Windows, and run the test suite,
and built the tutorial as PDF. I didn't run into any text/binary issues, so
this seems to be fixed now :)
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 20:49:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 16:49:06 -0400
Subject: [Biopython-dev] [Bug 2469] New: requires_wise.py fails on Windows
(test suite)
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2469
Summary: requires_wise.py fails on Windows (test suite)
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
On my Windows XP machine, I don't have wise installed, so the dnal command
doesn't work:
C:\TEMP\>dnal
'dnal' is not recognized as an internal or external command,
operable program or batch file.
When running the unit test suite, test_Wise.py SHOULD fail with a missing
external dependency error - instead it tries to run and fails with an assertion
error. The problem is that requires_wise.py fails... which seems to be an issue
with the commands.getoutput() function not working on Windows; it's Unix-only
according to:
http://www.python.org/doc/current/lib/module-commands.html
Annoyingly, the commands module is present on Windows (or at least Python 2.3)
but simply doesn't work due to calling this:
os.popen('{ ' + cmd + '; } 2>&1', 'r')
As a result,
>>> commands.getoutput("xyz")
"'{' is not recognized as an internal or external command,\noperable program or
batch file."
Assuming wise/dnal actually works on Windows, we need to use something other
than commands.getoutput("dnal") to check for it.
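One portable way to perform such a check can be sketched as follows. Note the assumptions: this uses shutil.which and subprocess.run from the Python 3 standard library, which were not available in the Python 2.3 to 2.5 releases discussed here, and the helper names is_program_available and get_output are hypothetical.

```python
import shutil
import subprocess
import sys


def is_program_available(name):
    """Return True if an executable called name is found on the PATH."""
    return shutil.which(name) is not None


def get_output(args):
    """Run a command given as an argument list; return stdout and stderr text.

    No shell is involved, so the Unix-only '{ cmd; } 2>&1' wrapper is avoided.
    """
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return completed.stdout


# Check for a tool before trying to run it, for example "dnal".
dnal_found = is_program_available("dnal")

# Demonstration with a command that exists wherever this script runs.
demo_output = get_output([sys.executable, "-c", "print('ok')"])
```

A test requirement script could raise its missing-dependency error when the availability check returns False, instead of inspecting shell error text.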
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sat Mar 15 23:40:54 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Mar 2008 19:40:54 -0400
Subject: [Biopython-dev] [Bug 2469] requires_wise.py fails on Windows (test
suite)
In-Reply-To:
Message-ID: <200803152340.m2FNesQl005388@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2469
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-15 19:40 EST -------
You can download wise2 from ftp://ftp.ebi.ac.uk/pub/software/unix/wise2/ and
compile it under Windows XP using cygwin, but its own tests fail - I'm not sure
why. Carrying on regardless, then test_Wise.py still doesn't work for me :(
P.S. Cornell University have packaged wise2 for Windows (found via Google, I
haven't tried this): http://www.tc.cornell.edu/WBA/
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sun Mar 16 21:03:59 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 16 Mar 2008 17:03:59 -0400
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200803162103.m2GL3x4u021735@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-16 17:03 EST -------
In the old code, when the species wasn't already recorded in the
taxon/taxon_name tables, we would add it and its parent lineage entries.
See also http://lists.open-bio.org/pipermail/biosql-l/2008-March/001196.html
There are a few problems in the old code, exposed in the unit tests, but I
think I have this working again now.
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Sun Mar 16 21:25:02 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 16 Mar 2008 17:25:02 -0400
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200803162125.m2GLP2Oq023867@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-16 17:25 EST -------
Created an attachment (id=879)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=879&action=view)
patch to BioSQL/Loader.py
Possible patch - the two BioSQL unit tests pass with this. I have not had a
chance to try this in combination with a taxonomy table pre-populated by
load_ncbi_taxonomy.pl
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From mjldehoon at yahoo.com Mon Mar 17 02:43:55 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sun, 16 Mar 2008 19:43:55 -0700 (PDT)
Subject: [Biopython-dev] [BioSQL-l] Loading sequences with novel NCBI
taxon id
In-Reply-To: <002201c88705$70780840$6400a8c0@Gecko>
Message-ID: <121379.45887.qm@web62403.mail.re1.yahoo.com>
> Thank you for your mail recommending the usage of NCBI.WWW.
> I have modified my class/script accordingly to your suggestion
> without problem. Once 1.45 is out, I will change for NCBI.Entrez
> as you informed me.
Just to avoid any confusion: In Biopython 1.45, the module will be "Bio.Entrez", not "Bio.NCBI.Entrez".
> In any case, I do not pretend having a fantastic piece of code, but it gets
> the job done. If you find this interesting, I would be pleased to contribute
> to BioPython.
Bio.Entrez will need some parsers to parse the XML results, although that probably won't happen before the 1.45 release. I think your script could be very useful when writing those parsers. Could you open a bug report on Bugzilla and upload your script there? Beware, to upload a script to Bugzilla, you need to create a bug report first, and then as a separate step upload the script.
Thanks!
--Michiel..
Eric Gibert wrote: Dear Peter,
Regarding the update of the BioSQL tables taxon and taxon_name, I have
created a class "TaxonUpdate" (how original!) which does two things:
1) as a class itself, it will fetch from NCBI the taxon's information as XML
based on the taxon_id passed to the constructor, parse the returned XML
answer to get the genus, class, order, family (10 levels) and update that in
taxon table. If taxon_name needs update/insert, it does it too.
2) run as an independent script __main__, it will look for all species in
taxon table for which the genus (parent) does not have a ncbi_taxon_id (i.e.
is NULL as this is the current result after adding a new sequence in
BioSQL). For all the incomplete records found, it will perform the update
as in (1).
After the addition of a new sequence in a BioSQL database, a simple call of
this code (passing the taxon_id) will do the updating job.
Dear Michiel,
Thank you for your mail recommending the usage of NCBI.WWW. I have modified
my class/script accordingly to your suggestion without problem. Once 1.45 is
out, I will change for NCBI.Entrez as you informed me.
In any case, I do not pretend having a fantastic piece of code, but it gets
the job done. If you find this interesting, I would be pleased to contribute
to BioPython.
Eric
-----Original Message-----
From: biosql-l-bounces at lists.open-bio.org
[mailto:biosql-l-bounces at lists.open-bio.org] On Behalf Of Peter
Sent: Thursday, March 13, 2008 11:06 PM
To: BioSQL
Subject: [BioSQL-l] Loading sequences with novel NCBI taxon id
Dear list,
One of the unresolved issues with Biopython's BioSQL interface is
dealing with the NCBI taxon ID when loading sequences into the
database.
As I understand it, ideally before loading any sequences, the user
will have loaded in the entire NCBI taxonomy using the
load_ncbi_taxonomy.pl script, as I described here:
http://biopython.org/wiki/BioSQL#NCBI_Taxonomy
When a new sequence is added to the database with a known taxon id,
there is no problem. But what happens if it's a recently sequenced organism
which isn't defined yet in the BioSQL taxonomy tables? Could/should
the user re-run load_ncbi_taxonomy.pl, and then load in their new
sequence?
Right now in Biopython, due to what appears to have been intended as a
short term hack, we simply don't record the taxon id at all (!), and I
would like to fix this (bug 2422).
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
How do BioPerl et al deal with this issue? Do they try and update the
taxonomy tables using the available information in the new record's
annotation (i.e. the new taxon id and the species name)? Do they
lookup the NCBI taxonomy definition via the internet? Do they throw
an error and halt?
Thanks,
Peter
(Biopython)
_______________________________________________
BioSQL-l mailing list
BioSQL-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biosql-l
From bugzilla-daemon at portal.open-bio.org Mon Mar 17 11:47:46 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 17 Mar 2008 07:47:46 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803171147.m2HBlksw008865@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
------- Comment #3 from mmokrejs at ribosome.natur.cuni.cz 2008-03-17 07:47 EST -------
Hi Peter,
yes this would be more helpful. Unfortunately I did the one-time job with
parsing the HTML output and re-running wget to fetch the final HTML page,
stripped HTML formatting and was done. I will upload my two crappy scripts.
They work but should be re-written to utilize the XML outputs you have
mentioned.
The second URL from your last comment should have different values for some
parameters to yield another XML page:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=taxonomy&id=41073&report=sgml&mode=xml
That returns me:
41073
cellular organisms; Eukaryota; Fungi/Metazoa group; Metazoa;
Eumetazoa; Bilateria; Coelomata; Protostomia; Panarthropoda; Arthropoda;
Mandibulata; Pancrustacea; Hexapoda; Insecta; Dicondylia; Pterygota; Neoptera;
Endopterygota; Coleoptera; Adephaga
Maybe I will find the time to rewrite them for the purpose of tutorial to use
the XMLs.
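As a rough sketch of what an XML-based rewrite could look like, the snippet below builds the efetch URL quoted above and pulls the lineage out of a reply. The XML layout (a TaxaSet element wrapping Taxon, TaxId and Lineage) and the abridged sample reply are assumptions made for illustration, as are the helper names build_efetch_url and parse_lineage; nothing here is taken from the attached scripts.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

EFETCH_URL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(taxon_id):
    """Build the taxonomy efetch URL for one NCBI taxon id."""
    params = [
        ("db", "taxonomy"),
        ("id", str(taxon_id)),
        ("report", "sgml"),
        ("mode", "xml"),
    ]
    return EFETCH_URL + "?" + urlencode(params)


def parse_lineage(xml_text):
    """Return (taxon id, list of lineage names) from a taxonomy XML reply.

    Assumes the reply looks like <TaxaSet><Taxon>...</Taxon></TaxaSet>.
    """
    root = ET.fromstring(xml_text)
    taxon = root.find("Taxon")
    tax_id = taxon.findtext("TaxId").strip()
    lineage = taxon.findtext("Lineage")
    names = [name.strip() for name in lineage.split(";") if name.strip()]
    return tax_id, names


# Hand-written, abridged sample reply (assumed layout, not a real download).
SAMPLE_XML = """<TaxaSet><Taxon>
<TaxId>41073</TaxId>
<Lineage>cellular organisms; Eukaryota; Metazoa; Insecta; Coleoptera; Adephaga</Lineage>
</Taxon></TaxaSet>"""

if __name__ == "__main__":
    print(build_efetch_url(41073))
    print(parse_lineage(SAMPLE_XML))
```

Parsing is kept apart from fetching so the parser can be tried on a saved reply without touching the network.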
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Mon Mar 17 13:18:35 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 17 Mar 2008 09:18:35 -0400
Subject: [Biopython-dev] [Bug 2468] Tutorial needs a fix: Bio.WWW.NCBI
In-Reply-To:
Message-ID: <200803171318.m2HDIZYX014608@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2468
------- Comment #4 from mmokrejs at ribosome.natur.cuni.cz 2008-03-17 09:18 EST -------
Created an attachment (id=880)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=880&action=view)
taxfetch.py
This program/module can fetch for the user the Lineage line. The query()
function uses the deprecated biopython API while the efetch uses the other.
Queries get cached in a local file taxonomycache.db for speed.
Users can call either of the two functions from external python code. Feel free
to use the code in Tutorial or even bundle in any form into the package.
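For readers without the attachment, the caching idea can be sketched roughly as follows. This is not the code in taxfetch.py: the function name cached_lineage, the use of the standard shelve module, and the fetch callback are all assumptions chosen to keep the example small.

```python
import shelve


def cached_lineage(taxon_id, fetch, cache_path="taxonomycache.db"):
    """Return the lineage for taxon_id, calling fetch() only on a cache miss.

    fetch is any callable taking a taxon id and returning the lineage
    string (hypothetical; in practice it would query NCBI).
    """
    key = str(taxon_id)  # shelve keys must be strings
    with shelve.open(cache_path) as cache:
        if key not in cache:
            cache[key] = fetch(taxon_id)
        return cache[key]
```

The second call for the same id is served from the local file, so the remote server is only contacted once per taxon.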
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From tiagoantao at gmail.com Mon Mar 17 22:30:38 2008
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 17 Mar 2008 22:30:38 +0000
Subject: [Biopython-dev] Bio.PopGen status
Message-ID: <6d941f120803171530g36504759g4b3cf835065e17b8@mail.gmail.com>
Hi,
This is a short email regarding Bio.PopGen status.
1. All the code on the repository should be stable.
2. The Biopython version that is scheduled for release soon will have
support for coalescent population genetics simulations.
3. A small number of test cases are included.
4. Documentation was produced and is available on the Tutorial. I
believe that it is satisfactory (tell me if you disagree).
5. Bio.PopGen is still not "version 1" in the sense that the
fundamental statistics code is missing. This was a conscious strategy
to start with selection detection and coalescent simulation in order
to begin with arguably less important stuff so that newbie errors (in
the sense that I was a newbie developer to biopython) would have less
impact.
6. Statistics is my next task and hopefully will coincide with the
biopython release after this one. This will be, at least, for me,
"version 1" of Bio.PopGen
7. In the code, there is, since the original Bio.PopGen, code that is
able to execute external simulators in parallel (thus taking advantage
of multi core architectures for computationally intensive
simulations). This is, unfortunately, not documented. I will document
this (maybe in a separate document from the tutorial) in the future. I
don't think this is priority 1. But others might be interested in
using this code for computationally intensive tasks using external
programs. In case you want to know more details about this, please say
so.
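To give a feel for point 7, here is a generic sketch of running several external programs at once. It is not the Bio.PopGen implementation: the helper names are made up, and the Python interpreter itself stands in for an external simulator so that the example runs anywhere.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_command(args):
    """Run one external program and return its standard output as text."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


def run_in_parallel(commands, max_workers=4):
    """Run several external commands concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_command, commands))


# Stand-in "simulators": each child process just prints n squared.
commands = [
    [sys.executable, "-c", "print(%d * %d)" % (n, n)] for n in range(1, 4)
]

if __name__ == "__main__":
    for output in run_in_parallel(commands):
        print(output.strip())
```

Threads are enough here because the heavy work happens in the child processes, which the operating system can schedule on separate cores.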
From a biopython release perspective, Bio.PopGen with new coalescent
simulation features is fully ready. Please go ahead and release
whenever is more convenient.
--
http://www.tiago.org/ps
From bugzilla-daemon at portal.open-bio.org Thu Mar 20 10:23:35 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 20 Mar 2008 06:23:35 -0400
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200803201023.m2KANZun010097@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-20 06:23 EST -------
Patch checked in as BioSQL/Loader.py revision 1.28
Unit tests passed on both Windows XP and Linux (using MySQL)
Note that once we have added "provisional" entries to the taxon/taxon_name
table based on the record annotation, load_ncbi_taxonomy.pl should be able to
tidy things up using the NCBI taxonomy. At least it should once BioSQL bug
2470 is fixed.
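The "reuse if known, otherwise add a provisional entry" behaviour can be sketched as below. This is not the code in BioSQL/Loader.py: the two tables are cut down to a few columns, SQLite is used only so the example is self-contained, and get_or_create_taxon is a hypothetical name.

```python
import sqlite3

# Heavily simplified stand-in for the BioSQL taxon and taxon_name tables.
SCHEMA = """
CREATE TABLE taxon (
    taxon_id INTEGER PRIMARY KEY,
    ncbi_taxon_id INTEGER UNIQUE,
    parent_taxon_id INTEGER
);
CREATE TABLE taxon_name (
    taxon_id INTEGER NOT NULL,
    name TEXT NOT NULL,
    name_class TEXT NOT NULL
);
"""


def get_or_create_taxon(conn, ncbi_taxon_id, scientific_name):
    """Return the internal taxon_id, adding a provisional row if needed."""
    row = conn.execute(
        "SELECT taxon_id FROM taxon WHERE ncbi_taxon_id = ?",
        (ncbi_taxon_id,),
    ).fetchone()
    if row is not None:
        return row[0]  # already known, reuse it
    cursor = conn.execute(
        "INSERT INTO taxon (ncbi_taxon_id) VALUES (?)", (ncbi_taxon_id,)
    )
    taxon_id = cursor.lastrowid
    conn.execute(
        "INSERT INTO taxon_name (taxon_id, name, name_class) "
        "VALUES (?, ?, 'scientific name')",
        (taxon_id, scientific_name),
    )
    return taxon_id
```

Keying the lookup on the NCBI taxon id is what lets a later full taxonomy load match the provisional row instead of duplicating it.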
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From biopython at maubp.freeserve.co.uk Thu Mar 20 20:14:50 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 20 Mar 2008 20:14:50 +0000
Subject: [Biopython-dev] Old Biopython code for EBI Bibliographics
services
In-Reply-To: <320fb6e00803170055n18457967n27d1b07eaa6cb522@mail.gmail.com>
References: <320fb6e00803140944v13f241b9icc0e911643f234cd@mail.gmail.com>
<47DE0C22.9040202@netsys.co.za>
<320fb6e00803170049g79960e14u8c1417fcdc99a0d5@mail.gmail.com>
<320fb6e00803170055n18457967n27d1b07eaa6cb522@mail.gmail.com>
Message-ID: <320fb6e00803201314j53b47a35x33de02cb685d2c14@mail.gmail.com>
I posted the following email on the main discussion mailing list, and
haven't seen any replies.
Should we mark Bio.biblio as deprecated now (before the imminent release)?
Peter
On Mon, Mar 17, 2008 at 7:55 AM, Peter wrote:
> Dear list,
>
> We have an old module Bio/biblio.py written by Tiaan Wessels back in
> 2002 (during a South African hackathon). This is code to use some EBI
> Bibliographics services, but currently no longer works. At the very
> least, the EBI have changed the URLs for their SOAP services. I got
> in touch with the author by email, and he no longer uses the code and
> thought we could remove it.
>
> Does anyone on the list still use Bio/biblio.py?
>
> Would anyone like to take a more in depth look at the code, and the
> current EBI web API, and see if there is anything in Bio.biblio worth
> keeping?
>
> If not, I'm proposing we mark this as deprecated for the next release
> of Biopython.
>
> Thanks,
>
> Peter
>
From mjldehoon at yahoo.com Fri Mar 21 02:08:56 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 20 Mar 2008 19:08:56 -0700 (PDT)
Subject: [Biopython-dev] Old Biopython code for EBI Bibliographics
services
In-Reply-To: <320fb6e00803201314j53b47a35x33de02cb685d2c14@mail.gmail.com>
Message-ID: <242823.17441.qm@web62402.mail.re1.yahoo.com>
> Should we mark Bio.biblio as deprecated now (before the imminent release)?
Yes. It's just a deprecation; the code will still be usable. The deprecation warning should contain a notice to contact us in case somebody is still using this code. If not, it's better to deprecate it and remove it in some future release. Keeping Biopython clean is important.
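A minimal sketch of such a warning, using the standard warnings module. The helper name warn_deprecated and the exact wording are illustrative only, not what was checked in.

```python
import warnings


def warn_deprecated(module_name):
    """Emit a DeprecationWarning asking remaining users to get in touch."""
    warnings.warn(
        "%s is deprecated and will be removed in a future release. "
        "If you still use it, please contact the Biopython developers "
        "on the mailing list." % module_name,
        DeprecationWarning,
        stacklevel=2,  # point at the caller, not at this helper
    )


if __name__ == "__main__":
    warnings.simplefilter("always")
    warn_deprecated("Bio.biblio")
```

The code keeps working after the warning, which matches the point that a deprecation alone does not break anyone.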
--Michiel.
Peter wrote: I posted the following email on the main discussion mailing list, and
haven't seen any replies.
Should we mark Bio.biblio as deprecated now (before the imminent release)?
Peter
On Mon, Mar 17, 2008 at 7:55 AM, Peter wrote:
> Dear list,
>
> We have an old module Bio/biblio.py written by Tiaan Wessels back in
> 2002 (during a South African hackathon). This is code to use some EBI
> Bibliographics services, but currently no longer works. At the very
> least, the EBI have changed the URLs for their SOAP services. I got
> in touch with the author by email, and he no longer uses the code and
> thought we could remove it.
>
> Does anyone on the list still use Bio/biblio.py?
>
> Would anyone like to take a more in depth look at the code, and the
> current EBI web API, and see if there is anything in Bio.biblio worth
> keeping?
>
> If not, I'm proposing we mark this as deprecated for the next release
> of Biopython.
>
> Thanks,
>
> Peter
>
_______________________________________________
Biopython-dev mailing list
Biopython-dev at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython-dev
From mjldehoon at yahoo.com Fri Mar 21 11:57:02 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Fri, 21 Mar 2008 04:57:02 -0700 (PDT)
Subject: [Biopython-dev] CVS freeze for release
Message-ID: <621627.53345.qm@web62408.mail.re1.yahoo.com>
Hi everybody,
I'll start making release 1.45 from now. Please don't touch CVS until after the release is out. Thanks!
--Michiel.
From biopython at maubp.freeserve.co.uk Fri Mar 21 12:51:01 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 21 Mar 2008 12:51:01 +0000
Subject: [Biopython-dev] CVS freeze for release
In-Reply-To: <621627.53345.qm@web62408.mail.re1.yahoo.com>
References: <621627.53345.qm@web62408.mail.re1.yahoo.com>
Message-ID: <320fb6e00803210551o4644fc34meaadf9e521b087fe@mail.gmail.com>
Michiel de Hoon wrote:
> Hi everybody,
>
> I'll start making release 1.45 from now. Please don't touch CVS until after
> the release is out. Thanks!
Good news :)
I did check in some comment changes to BioSQL this morning, and the
Bio.biblio deprecation, but that was a few hours ago.
Peter
From meanerelk at gmail.com Fri Mar 21 18:28:24 2008
From: meanerelk at gmail.com (Kemal)
Date: Fri, 21 Mar 2008 14:28:24 -0400
Subject: [Biopython-dev] mentor for google summer of code
Message-ID:
I am a university student interested in adding phyloXML support to BioPython
for the Google Summer of Code. Would any developers be willing to mentor
this project? I have been discussing it with Hilmar Lapp, who is mentoring
similar projects for the Phyloinformatics Summer of Code project at the
National Evolutionary Synthesis Center. Their page is at:
https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Code_2008
A mentor would be responsible for monitoring the project's progress
over the summer, and for evaluating the work at the end. Google's guidelines
estimate that this would take about 5 hours/week per student. There is more
information at:
http://code.google.com/opensource/gsoc/2008/faqs.html
If anyone is interested, I would love to discuss the details of the
proposal.
Thank you,
Kemal Eren
From biopython at maubp.freeserve.co.uk Fri Mar 21 19:04:06 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 21 Mar 2008 19:04:06 +0000
Subject: [Biopython-dev] mentor for google summer of code
In-Reply-To:
References:
Message-ID: <320fb6e00803211204q2d0f3696pf5baaf44122a0869@mail.gmail.com>
Hi Kemal,
On Fri, Mar 21, 2008 at 6:28 PM, Kemal wrote:
> I am a university student interested in adding phyloXML support to BioPython
> for the Google Summer of Code. Would any developers be willing to mentor
> this project? I have been discussing it with Hilmar Lapp, who is mentoring
> similar projects for the Phyloinformatics Summer of Code project at the
> National Evolutionary Synthesis Center. Their page is at:
>
> https://www.nescent.org/wg_phyloinformatics/Phyloinformatics_Summer_of_Code_2008
I see there are similar projects already planned for phyloXML with BioPerl
and BioRuby. For Biopython I guess building on Frank Kauff and Cymon J.
Cox's Bio.Nexus module would be the most logical option. Have you had
a chance to look at any of the Biopython code?
> A mentor would be responsible for monitoring the project's progress
> over the summer, and for evaluating the work at the end. Google's guidelines
> estimate that this would take about 5 hours/week per student. There is more
> information at:
>
> http://code.google.com/opensource/gsoc/2008/faqs.html
>
> If anyone is interested, I would love to discuss the details of the
> proposal.
It would be worth trying to contact Frank and Cymon directly - and
seeing if they would be interested.
Peter
(one of the current Biopython developers)
From chris.lasher at gmail.com Fri Mar 21 21:11:40 2008
From: chris.lasher at gmail.com (Chris Lasher)
Date: Fri, 21 Mar 2008 17:11:40 -0400
Subject: [Biopython-dev] Biopython to begin transition to Subversion
In-Reply-To: <320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>
References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com>
<658418.5192.qm@web62414.mail.re1.yahoo.com>
<128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
<320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>
Message-ID: <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com>
On Tue, Mar 11, 2008 at 8:37 PM, Peter wrote:
> Hi Chris,
>
> I haven't heard anything about the CVS to SVN move recently. Did
> anyone resolve the multiple password prompt niggle?
That's still unresolved. The workaround is to place an SSH key on
dev.open-bio.org. If you do, you'll notice even then it still makes
two attempts to log in. /shrugs
> On another point, is the test SVN repository intended to be writable
> (for those of us with developer access)? I really should try running
> some things like "svn diff" and committing sample changes to get a
> feel for how it compares to CVS.
I have tried committing to it and gotten a "Permission denied" error.
It must only be set as read-only for group permissions.
Really sorry for my delay on getting to this email. Now that Biopython
has had another release, should we really push hard to switch to SVN?
Chris
From biopython-dev at maubp.freeserve.co.uk Fri Mar 21 21:37:45 2008
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Fri, 21 Mar 2008 21:37:45 +0000
Subject: [Biopython-dev] Biopython to begin transition to Subversion
In-Reply-To: <128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com>
References: <128a885f0802140742o1b8910d8j35325dfc3c5379e8@mail.gmail.com>
<658418.5192.qm@web62414.mail.re1.yahoo.com>
<128a885f0802142321h2fcc6013vc073bbcdf391002f@mail.gmail.com>
<320fb6e00803111737q5de7faah2fcbab84ec013bc3@mail.gmail.com>
<128a885f0803211411p15ee043dka48b3f79b65fb4b7@mail.gmail.com>
Message-ID: <320fb6e00803211437i79c1454m52ba4172728032c8@mail.gmail.com>
On Fri, Mar 21, 2008 at 9:11 PM, Chris Lasher wrote:
> On Tue, Mar 11, 2008 at 8:37 PM, Peter wrote:
> > Hi Chris,
> >
> > I haven't heard anything about the CVS to SVN move recently. Did
> > anyone resolve the multiple password prompt niggle?
>
> That's still unresolved. The workaround is to place an SSH key on
> dev.open-bio.org. If you do, you'll notice even then it still makes
> two attempts to log in. /shrugs
If someone's documented this from the previous Bio* migrations, then I
guess we'll live with it.
> > On another point, is the test SVN repository intended to be writable
> > (for those of us with developer access)? I really should try running
> > some things like "svn diff" and committing sample changes to get a
> > feel for how it compares to CVS.
>
> I have tried committing to it and gotten a "Permission denied" error.
> It must only be set as read-only for group permissions.
I never did get round to trying myself...
> Really sorry for my delay on getting to this email. Now that Biopython
> has had another release, should we really push hard to switch to SVN?
Well, Michiel declared a CVS freeze this morning and is preparing
Biopython 1.45 as we speak. Once the release is out does sound like a
good time for the SVN move to me.
Peter
From peter.bulychev at gmail.com Fri Mar 21 23:50:23 2008
From: peter.bulychev at gmail.com (Peter Bulychev)
Date: Sat, 22 Mar 2008 02:50:23 +0300
Subject: [Biopython-dev] results of applying Clone Digger to the sources of
BioPython project
Message-ID:
Hello.
Clone Digger project is aimed to find software clones (duplicate code) in
Python and Java programs.
I have applied it to the source of BioPython and discovered several clone
candidates.
There are a lot of false positives caused by similar code in
nlmmedline_*_format.py files, but maybe other clone candidates will be
interesting for you.
The results can be seen here:
http://clonedigger.sourceforge.net/examples.html
--
Best regards,
Peter Bulychev.
From sbassi at gmail.com Sat Mar 22 03:49:52 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Sat, 22 Mar 2008 00:49:52 -0300
Subject: [Biopython-dev] CVS freeze for release
In-Reply-To: <621627.53345.qm@web62408.mail.re1.yahoo.com>
References: <621627.53345.qm@web62408.mail.re1.yahoo.com>
Message-ID:
On Fri, Mar 21, 2008 at 8:57 AM, Michiel de Hoon wrote:
> Hi everybody,
> I'll start making release 1.45 from now. Please don't touch CVS until after the release is out. Thanks!
I have a proposal, so it could be implemented in the next version (1.46?).
Change the output of EZRetrieve.retrieve_single. It currently returns
a FASTA formatted sequence. I think it should return a SeqRecord object
(if you want this SeqRecord object to be printed or stored as FASTA,
just use formatIO).
Here are the proposed changes: http://www.pastecode.com.ar/f3baff314
I can file this as an enhancement in the bug tracker if you agree.
Best,
SB.
From mjldehoon at yahoo.com Sat Mar 22 11:02:38 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sat, 22 Mar 2008 04:02:38 -0700 (PDT)
Subject: [Biopython-dev] Biopython release 1.45
Message-ID: <901773.64728.qm@web62408.mail.re1.yahoo.com>
We are pleased to announce the release of Biopython 1.45.
This release includes numerous code improvements and fixes, including in Bio.Seq, Bio.SeqIO, Bio.Entrez, Bio.PopGen, Bio.SwissProt, Bio.Cluster, Bio.SCOP, Bio.InterPro, Bio.GenBank, Bio.ExPASy, BioSQL, and the Biopython documentation. Too many to list them all here!
Source distributions and Windows installers are available from the Biopython website at http://biopython.org. My thanks to all code contributors who made this new release possible.
--Michiel on behalf of the Biopython developers.
From biopython-dev at maubp.freeserve.co.uk Sat Mar 22 11:18:42 2008
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Sat, 22 Mar 2008 11:18:42 +0000
Subject: [Biopython-dev] EZRetrieve
Message-ID: <320fb6e00803220418n348d1953v9846af9d04abc04c@mail.gmail.com>
> I have a proposal, so it could be implemented in the next version (1.46?).
> Change the output of EZRetrieve.retrieve_single. It currently returns
> a FASTA formatted sequence. I think it should return a SeqRecord object
> (if you want this SeqRecord object to be printed or stored as FASTA,
> just use formatIO).
> Here are the proposed changes: http://www.pastecode.com.ar/f3baff314
> I can file this as an enhancement in the bug tracker if you agree.
So there is currently one function, retrieve_single, which can return
a handle but by default extracts and returns a FASTA record as a
string. It does this by calling the parse_single function which
reads in the handle, parses the HTML file, and extracts just the FASTA
style text, throwing away the other annotation data (like the
chromosome or range requested).
Here is an example URL constructed by hand,
http://siriusb.umdnj.edu:18080/EZRetrieve/single_r_run.jsp?org=0&AccType=0&input=BC014651&from=-200&to=200
Parsing HTML is nasty - especially if the site updates the formatting
every so often. I suppose just looking for the FASTA sequence is
fairly reliable. I can see the case for an EzRetrieve HTML to
SeqRecord parser, but I would be tempted to try and parse more of the
annotation.
How many people do you think are using the retrieve_single function?
It would be very annoying for them if its behaviour suddenly changed.
Maybe we can add a new parse function, and call it from
retrieve_single if the optional argument parse=2?
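Something along these lines, as a sketch only. It assumes parse=0 means "give me the raw handle" and parse=1 is the current default of returning the FASTA text; convert_result, parse_fasta_text and SimpleRecord are hypothetical names, and SimpleRecord is just a dependency-free stand-in for a real SeqRecord.

```python
from io import StringIO
from typing import NamedTuple


class SimpleRecord(NamedTuple):
    """Stand-in for a SeqRecord, to keep the sketch dependency-free."""
    id: str
    description: str
    seq: str


def parse_fasta_text(fasta_text):
    """Turn one FASTA record held in a string into a SimpleRecord."""
    lines = [line.strip() for line in fasta_text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA record starting with '>'")
    header = lines[0][1:].strip()
    record_id = header.split(None, 1)[0] if header else ""
    return SimpleRecord(record_id, header, "".join(lines[1:]))


def convert_result(fasta_text, parse=1):
    """Return the result in the form selected by the optional parse flag."""
    if parse == 0:
        return StringIO(fasta_text)  # handle-like object, unparsed
    if parse == 1:
        return fasta_text  # existing default: plain FASTA string
    if parse == 2:
        return parse_fasta_text(fasta_text)  # new opt-in behaviour
    raise ValueError("parse must be 0, 1 or 2")
```

Because the default stays at parse=1, existing callers see no change, and only code that asks for parse=2 gets the record object.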
Peter
From biopython at maubp.freeserve.co.uk Sat Mar 22 11:35:39 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Sat, 22 Mar 2008 11:35:39 +0000
Subject: [Biopython-dev] results of applying Clone Digger to the sources
of BioPython project
In-Reply-To:
References:
Message-ID: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com>
> Hello.
>
> Clone Digger project is aimed to find software clones (duplicate code) in
> Python and Java programs.
>
> I have applied it to the source of BioPython and discovered several clone
> candidates.
>
> There are a lot of false positives caused by similar code in
> nlmmedline_*_format.py files, but maybe other clone candidates will be
> interesting for you.
>
> The results can be seen here:
> http://clonedigger.sourceforge.net/examples.html
Interesting. Does your tool know to ignore deprecated modules? e.g.
when we have essentially copied a file from one location to another, and
deprecated the original.
Some of these are from scanner/consumer parsers where there are two
alternative consumers turning the data into different object
representations.
Other things like providing dictionary like objects seem to be reusing
a lot of "boiler plate" code, and could probably be rationalised into
a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and
Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py
Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both
sharing apparently identical catalyse() methods may fall into the same
class.
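To illustrate the kind of rationalisation meant here, a toy sketch: the shared dictionary plumbing lives in one base class and each subclass supplies only its own parse step. The class names are invented for the example and do not correspond to the Biopython modules listed above.

```python
class RecordDictionary:
    """Base class holding the shared dictionary-style plumbing."""

    def __init__(self, raw_records):
        self._raw = dict(raw_records)

    def __len__(self):
        return len(self._raw)

    def __contains__(self, key):
        return key in self._raw

    def keys(self):
        return list(self._raw)

    def __getitem__(self, key):
        # Look up the raw text, then let the subclass decide how to parse it.
        return self.parse(self._raw[key])

    def parse(self, text):
        raise NotImplementedError("subclasses must implement parse()")


class UpperCaseDictionary(RecordDictionary):
    """Toy subclass: 'parses' a record by upper-casing it."""

    def parse(self, text):
        return text.upper()


class LengthDictionary(RecordDictionary):
    """Toy subclass: 'parses' a record by measuring its length."""

    def parse(self, text):
        return len(text)
```

The same pattern would cover two classes sharing an identical method: move the method to the common parent and delete both copies.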
Peter
From peter.bulychev at gmail.com Sat Mar 22 21:31:30 2008
From: peter.bulychev at gmail.com (Peter Bulychev)
Date: Sun, 23 Mar 2008 00:31:30 +0300
Subject: [Biopython-dev] results of applying Clone Digger to the sources
of BioPython project
In-Reply-To: <320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com>
References:
<320fb6e00803220435s2e018a36l7802c164f393ff35@mail.gmail.com>
Message-ID:
Hello.
No, unfortunately Clone Digger can not ignore deprecated modules.
In order to obtain better results, automatically generated code and tests
should be removed from the searched source tree by hand.
> Other things like providing dictionary like objects seem to be reusing
> a lot of "boiler plate" code, and could probably be rationalised into
> a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and
> Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py
>
> Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both
> sharing apparently identical catalyse() methods may fall into the same
> class.
I think this is the main purpose of Clone Digger: to find clone candidates
and to help to create recommendations for refactoring.
2008/3/22, Peter :
>
> > Hello.
> >
> > Clone Digger project is aimed to find software clones (duplicate code) in
> > Python and Java programs.
> >
> > I have applied it to the source of BioPython and discovered several clone
> > candidates.
> >
> > There are a lot of false positives caused by similar code in
> > nlmmedline_*_format.py files, but maybe other clone candidates will be
> > interesting for you.
> >
> > The results can be seen here:
> > http://clonedigger.sourceforge.net/examples.html
>
>
> Interesting. Does your tool know to ignore deprecated modules? e.g.
> when we have essentially copied a file from one location to another, and
> deprecated the original.
>
> Some of these are from scanner/consumer parsers where there are two
> alternative consumers turning the data into different object
> representations.
>
> Other things like providing dictionary like objects seem to be reusing
> a lot of "boiler plate" code, and could probably be rationalised into
> a base class and subclasses. e.g. in Bio/SwissProt/SProt.py and
> Bio/PubMed.py and Bio/GenBank/__init__.py and Bio/Prosite/__init__.py
>
> Other things like the Blunt(AbstractCut) and Ov3(AbstractCut) both
> sharing apparently identical catalyse() methods may fall into the same
> class.
>
>
> Peter
>
--
Best regards,
Peter Bulychev.
From sbassi at gmail.com Tue Mar 25 20:46:48 2008
From: sbassi at gmail.com (Sebastian Bassi)
Date: Tue, 25 Mar 2008 17:46:48 -0300
Subject: [Biopython-dev] Can't login into wiki
Message-ID:
Hello,
I press the link to login into the wiki
(http://biopython.org/w/index.php?title=Special:Userlogin&returnto=Biopython)
but I am redirected to the same page without a login prompt.
I found that this URL is dead (404):
http://biopython.org/DIST/docs/api/public/trees.html
(and it is linked from http://biopython.org/wiki/Getting_Started , last link).
--
Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Tutorial libre de Python: http://tinyurl.com/2az5d5
From biopython at maubp.freeserve.co.uk Tue Mar 25 20:55:23 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 25 Mar 2008 20:55:23 +0000
Subject: [Biopython-dev] Can't login into wiki
In-Reply-To:
References:
Message-ID: <320fb6e00803251355i7c0102d3l90e5d4680282922b@mail.gmail.com>
On Tue, Mar 25, 2008 at 8:46 PM, Sebastian Bassi wrote:
> Hello,
>
> I press the link to login into the wiki
> (http://biopython.org/w/index.php?title=Special:Userlogin&returnto=Biopython)
> but I am redirected to the same page without a login prompt.
It's not just you - the wiki is being a bit odd for me too right now,
empty PHP pages etc. Maybe it needs rebooting again... which I think
happens automatically every so often. If it doesn't clear up I'll
email the OBF guys tomorrow.
> I found that this URL is dead (404):
> http://biopython.org/DIST/docs/api/public/trees.html
> (and it is linked from http://biopython.org/wiki/Getting_Started , last link).
It should probably be http://biopython.org/DIST/docs/api/ (the link
documentation page is fine).
Peter
From mjldehoon at yahoo.com Tue Mar 25 23:55:20 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Tue, 25 Mar 2008 16:55:20 -0700 (PDT)
Subject: [Biopython-dev] Can't login into wiki
In-Reply-To: <320fb6e00803251355i7c0102d3l90e5d4680282922b@mail.gmail.com>
Message-ID: <912788.63526.qm@web62415.mail.re1.yahoo.com>
Peter wrote:
> > I found that this URL is dead (404):
> > http://biopython.org/DIST/docs/api/public/trees.html
> > (and it is linked from http://biopython.org/wiki/Getting_Started , last link).
> It should probably be http://biopython.org/DIST/docs/api/ (the link
> documentation page is fine).
I fixed this link now.
--Michiel
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 12:24:09 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 08:24:09 -0400
Subject: [Biopython-dev] [Bug 2475] New: BioSQL.Loader should reuse existing
taxon entries in lineage
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
Summary: BioSQL.Loader should reuse existing taxon entries in
lineage
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
Based on a report on the mailing list by Eric Gibert,
http://lists.open-bio.org/pipermail/biopython/2008-March/004137.html
http://lists.open-bio.org/pipermail/biopython/2008-March/004147.html
The _get_taxon_id() function will add new entries to the taxon and taxon_name
tables when a species isn't already defined. It will also generate entries for
the lineage (for which we don't know the NCBI taxon names). At this point it
*should* be re-using any existing entries for elements of the lineage.
Note - this is complicated due to the re-use of the same latin names in
different classes. It might be easier/safer just not to write the lineage at
all?
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 12:34:40 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 08:34:40 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803261234.m2QCYekn009310@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 08:34 EST -------
See also Bug 2422 and this thread on the BioSQL mailing list:
http://lists.open-bio.org/pipermail/biosql-l/2008-March/001196.html
In particular Hilmar Lapp from BioSQL wrote in reply to trying to reuse
existing taxon table entries based on string matching to the scientific name
field in the taxon_name table, which I said sounded a little unreliable:
> It's pretty unreliable actually. There is not only synonymy
> but also rampant homonymy in taxonomic names. There are
> plenty of examples for the same scientific name in use for a
> plant and for some animal, for example. So in order to be
> unambiguous you will need to know (and check) the kingdom.
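To make the homonymy problem concrete, here is a minimal sketch using an in-memory SQLite database. The two tables are a cut-down stand-in for the BioSQL taxon and taxon_name tables (only the columns needed here), the taxon IDs are made up, and the function names are hypothetical. Checking the direct parent stands in for checking the kingdom; real code would have to walk up the lineage.

```python
import sqlite3

# Assumption: simplified stand-in for the BioSQL taxon/taxon_name tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY, parent_taxon_id INTEGER);
    CREATE TABLE taxon_name (taxon_id INTEGER, name TEXT, name_class TEXT);
    """
)

# Illustrative rows with made-up IDs: "Morus" names both a plant genus
# (mulberries) and an animal genus (gannets).
rows = [
    (1, None, "Viridiplantae"),
    (2, None, "Metazoa"),
    (3, 1, "Morus"),
    (4, 2, "Morus"),
]
for taxon_id, parent_id, name in rows:
    conn.execute("INSERT INTO taxon VALUES (?, ?)", (taxon_id, parent_id))
    conn.execute(
        "INSERT INTO taxon_name VALUES (?, ?, 'scientific name')",
        (taxon_id, name),
    )


def find_by_name(name):
    """Match on the scientific name only; may return several taxa."""
    cur = conn.execute(
        "SELECT taxon_id FROM taxon_name "
        "WHERE name = ? AND name_class = 'scientific name' "
        "ORDER BY taxon_id",
        (name,),
    )
    return [row[0] for row in cur]


def find_by_name_and_parent(name, parent_name):
    """Match on the name and on the name of the direct parent taxon."""
    cur = conn.execute(
        "SELECT child.taxon_id "
        "FROM taxon AS child "
        "JOIN taxon_name AS child_name "
        "  ON child_name.taxon_id = child.taxon_id "
        "JOIN taxon_name AS parent_name "
        "  ON parent_name.taxon_id = child.parent_taxon_id "
        "WHERE child_name.name = ? AND parent_name.name = ? "
        "ORDER BY child.taxon_id",
        (name, parent_name),
    )
    return [row[0] for row in cur]


print(find_by_name("Morus"))
print(find_by_name_and_parent("Morus", "Metazoa"))
```

The name-only lookup returns two candidates, so it cannot be used on its own to decide which existing taxon entry to reuse.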
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 12:44:01 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 08:44:01 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803261244.m2QCi15R009864@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 08:44 EST -------
Created an attachment (id=883)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=883&action=view)
Patch to BioSQL/Loader.py to not record the lineage for new species
This patch takes the simple route out - when loading a sequence into the
database with a new species (not already in the taxon tables), we ONLY add the
new species to the taxon and taxon_name tables. This DOES NOT attempt to
record the whole lineage, adding or reusing existing taxon entries.
Both the test_BioSQL and test_BioSQL_SeqIO unit tests still pass with this.
I prefer this solution as it avoids any ambiguous heuristics in matching
existing taxon names based on string comparisons. This does mean Biopython
won't match BioPerl in this regard, as I understand that BioPerl currently
tries to record the full lineage.
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 18:51:18 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 14:51:18 -0400
Subject: [Biopython-dev] [Bug 2477] New: SeqIO.parse does not handle embl
files
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
Summary: SeqIO.parse does not handle embl files
Product: Biopython
Version: Not Applicable
Platform: Macintosh
OS/Version: Mac OS
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: p.foster at nhm.ac.uk
This is in 1.45, but I did not see it in 1.43.
(1.45 is not a Bugzilla option at the moment ...)
If fh is a handle to an embl format file, then
SeqIO.parse(fh, 'embl')
dies. It worked (not perfectly) in 1.43.
From bugzilla-daemon at portal.open-bio.org Wed Mar 26 19:21:41 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 26 Mar 2008 15:21:41 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803261921.m2QJLfbe007389@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Version|Not Applicable |1.45
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-26 15:21 EST -------
I've fixed the Bugzilla version field - thanks for the reminder.
Could you give more information please? e.g. a specific EMBL file, and the
error you are seeing.
Thanks, Peter.
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 07:59:54 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 03:59:54 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803270759.m2R7xsXA006767@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
------- Comment #2 from p.foster at nhm.ac.uk 2008-03-27 03:59 EST -------
Created an attachment (id=888)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=888&action=view)
test case
It is a multi-bug. There is a bug that prevents 1.45 from reading embl files,
and there is another bug, visible in 1.43 (at least) where it at least parses
embl files, but imperfectly.
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 10:49:33 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 06:49:33 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803271049.m2RAnXpj015624@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 06:49 EST -------
Thanks for the clarification. I can reproduce the problem here.
It looks like they may have tweaked their file format slightly. Biopython will
be ignoring the apparently new PA line, which isn't described here:
http://www.ebi.ac.uk/webin-align/fflink2.html
You can also fetch the first problem record from their webpage, choose "Save",
"ASCII text/table", "complete entries"
http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-e+[EMBLCDS:AAA03323]+-newId
As a minor point, personally I find the following style simpler:
from Bio import SeqIO
fName = 'twoEmblRecords.embl'
f = file(fName)
s = SeqIO.parse(f, 'embl')
for rec in s :
    print rec.description
    print rec.annotations['taxonomy']
f.close()
(you may of course have good reason for using the .next() method explicitly)
I'll take a look at this bug now...
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 11:37:16 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 07:37:16 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803271137.m2RBbGvg018455@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 07:37 EST -------
As you said, this is a multi-part bug!
To try this out, you will need to update files Bio/GenBank/Scanner.py and
__init__.py which are now in CVS. If you are not familiar with CVS, the easier
method would be to download the two files from here:
http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/?cvsroot=biopython#dirlist
Note there is an hour or so time delay before it will show my changes. You can
see where the files should be put from the stack trace.
Please let me know how you get on (by posting on this bug).
Missing AC lines
================
All our EMBL test cases included an AC line, and Biopython 1.45 was
failing because of the missing AC line in your example, which was used to set
the SeqRecord's id property. I have updated CVS to fall back on the ID line.
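The fallback can be sketched roughly as follows. This is not the code in Bio/GenBank/Scanner.py; the function name and the two example header lines are made up for illustration, and it assumes the usual EMBL layout of a two-letter line code followed by data from column six.

```python
def choose_record_id(header_lines):
    """Return the first AC accession, falling back on the ID line name."""
    id_name = None
    for line in header_lines:
        code, data = line[:2], line[5:].strip()
        if code == "AC" and data:
            # An AC line may list several accessions; the first is primary.
            return data.split(";")[0].strip()
        if code == "ID" and id_name is None and data:
            # The entry name is the first token on the ID line.
            id_name = data.split(";")[0].split()[0]
    if id_name is None:
        raise ValueError("Record has neither an AC line nor an ID line")
    return id_name


# Made-up header lines, not taken from a real record.
with_ac = [
    "ID   EXAMPLE1; SV 1; linear; mRNA; STD; PLN; 100 BP.",
    "AC   X00001; X00002;",
]
without_ac = ["ID   EXAMPLE1; SV 1; linear; mRNA; STD; PLN; 100 BP."]

print(choose_record_id(with_ac))
print(choose_record_id(without_ac))
```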
Multiple DE lines
=================
Already fixed as of Biopython 1.44
Multiple OC lines
=================
Updated Biopython CVS to cope with multi-line taxonomy lineage
PA lines (parent accessions)
============================
You didn't report this, but we currently are ignoring the PA lines.
Quoting ftp://ftp.ebi.ac.uk/pub/databases/embl/cds/README.txt
PA line - contains the accession.version of the "parent" EMBL entry
(entry where the CDS is annotated)
e.g. a whole contig, not just this one CDS/gene. We could record this in the
SeqRecord's annotations dictionary as a list of strings under key
'parent-accessions'. What do you think?
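As a rough illustration of the last two points, the sketch below merges multi-line OC lines into one lineage list and collects PA lines under the proposed 'parent-accessions' key. The function name and the sample lines are made up, and this is not the actual Scanner.py code.

```python
def collect_annotations(header_lines):
    """Build a small annotations dictionary from OC and PA header lines."""
    taxonomy = []
    parent_accessions = []
    for line in header_lines:
        code, data = line[:2], line[5:].strip()
        if code == "OC":
            # The lineage is split over several OC lines; entries are
            # separated by ';' and the last line ends with '.'.
            for name in data.rstrip(".").split(";"):
                if name.strip():
                    taxonomy.append(name.strip())
        elif code == "PA" and data:
            parent_accessions.append(data)
    annotations = {"taxonomy": taxonomy}
    if parent_accessions:
        annotations["parent-accessions"] = parent_accessions
    return annotations


# Made-up header lines for illustration only.
example = [
    "PA   L12345.1",
    "OC   Eukaryota; Viridiplantae; Streptophyta;",
    "OC   Trifolium.",
]
print(collect_annotations(example))
```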
Peter
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 15:50:36 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 11:50:36 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803271550.m2RFoacs002027@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
------- Comment #5 from p.foster at nhm.ac.uk 2008-03-27 11:50 EST -------
I got those two files, and they seem to have fixed everything. Thanks muchly.
The suggestion of de-ignoring the PA line sounds fine (although I have no use
for it at the moment).
-Peter F.
From bugzilla-daemon at portal.open-bio.org Thu Mar 27 16:22:13 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 27 Mar 2008 12:22:13 -0400
Subject: [Biopython-dev] [Bug 2477] SeqIO.parse does not handle embl files
In-Reply-To:
Message-ID: <200803271622.m2RGMDWV003784@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2477
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #6 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-27 12:22 EST -------
OK, marking as fixed. I also included AAA03323 as a unit test, as we were
lacking an example without an AC line.
I'll leave the PA line issue alone for the time being; it would be wise to
check if there are any parallels in GenBank or SwissProt/UniProt before doing
anything so that they are all handled consistently.
Thanks for your report Peter.
Peter
From bugzilla-daemon at portal.open-bio.org Sun Mar 30 02:53:41 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 29 Mar 2008 22:53:41 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803300253.m2U2rfLl002179@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #3 from ericgibert at yahoo.fr 2008-03-29 22:53 EST -------
I would like to propose the following solution:
1) add an extra optional parameter to load(): fetchNCBItaxonomy = False --> so
no impact on existing code. If the users call the load function with True then:
2) after the species is inserted into the taxon/taxon_name tables, the XML data
from NCBI's taxonomy database is fetched
3) the XML data is used to update the taxon/taxon_name tables, respecting the
uniqueness of the records
I already have part of the code; I just need to change it so that if a taxon
already exists, the new taxon points to the existing one.
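A minimal sketch of how such an optional flag could behave, with a dictionary standing in for the taxon tables and a stub standing in for the NCBI lookup. The class, the stub and all IDs and names are hypothetical; the real load() takes a record rather than an ID and a name.

```python
class LoaderSketch:
    """Toy loader: self.taxa maps taxon ID -> (name, parent taxon ID)."""

    def __init__(self, fetch_lineage):
        self.taxa = {}
        # fetch_lineage(taxon_id) returns [(taxon_id, name), ...] ordered
        # from the root down to the direct parent of the species.
        self._fetch_lineage = fetch_lineage

    def load(self, species_id, species_name, fetchNCBItaxonomy=False):
        # Step 1: always record the species itself.
        self.taxa.setdefault(species_id, (species_name, None))
        if not fetchNCBItaxonomy:
            return  # default behaviour is unchanged
        # Steps 2 and 3: fetch the lineage and add only the missing taxa.
        parent_id = None
        for taxon_id, name in self._fetch_lineage(species_id):
            self.taxa.setdefault(taxon_id, (name, parent_id))
            parent_id = taxon_id
        # Point the species at its (new or existing) parent entry.
        self.taxa[species_id] = (species_name, parent_id)


# Made-up lineages; two species sharing the same parents.
LINEAGES = {
    10: [(1, "root group"), (2, "genus A")],
    11: [(1, "root group"), (2, "genus A")],
}


def fake_fetch(taxon_id):
    return LINEAGES[taxon_id]


loader = LoaderSketch(fake_fetch)
loader.load(10, "species one", fetchNCBItaxonomy=True)
loader.load(11, "species two", fetchNCBItaxonomy=True)
print(sorted(loader.taxa))
```

Keying the entries on the taxon ID rather than on the name is what keeps the shared lineage entries unique here.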
Comments?
Eric
From bugzilla-daemon at portal.open-bio.org Sun Mar 30 11:41:25 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 30 Mar 2008 07:41:25 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803301141.m2UBfPMC001648@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2008-03-30 07:41 EST -------
I quite like the idea of fetching the new taxon information from the NCBI as
needed to record an accurate lineage. However, what happens if:
(a) The network is down? Raise an exception maybe?
(b) The NCBI doesn't have this Taxon ID (i.e. it's invalid or so new their
database is out of date)? Raise an exception?
Eric, could you attach your taxonomy XML code to this bug? We'd probably want
to start by adding taxonomy XML parsing to Bio.Entrez (which I assume you are
using to fetch the XML data).
What about sequences where we don't have a taxon ID, but we do have a species
name? (which may happen with a sequence which wasn't read from a GenBank
file).
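One possible shape for cases (a) and (b), as a sketch only: wrap the lookup and translate both failures into distinct exceptions. The exception classes, the wrapper and the two stub fetchers are hypothetical names, not existing Biopython API.

```python
class TaxonomyLookupError(Exception):
    """Base class for failures while looking up a lineage."""


class TaxonomyUnavailableError(TaxonomyLookupError):
    """The taxonomy service could not be reached (case a)."""


class UnknownTaxonError(TaxonomyLookupError):
    """The service answered but does not know this taxon ID (case b)."""


def fetch_lineage_or_raise(taxon_id, fetch):
    """Call fetch(taxon_id) and turn failures into explicit exceptions."""
    try:
        lineage = fetch(taxon_id)
    except OSError as err:
        # Network errors from the standard library derive from OSError.
        raise TaxonomyUnavailableError(
            "Could not reach the taxonomy service"
        ) from err
    if not lineage:
        raise UnknownTaxonError("No taxonomy record for ID %r" % taxon_id)
    return lineage


# Stub fetchers that simulate the two failure modes.
def offline_fetch(taxon_id):
    raise OSError("network unreachable")


def empty_fetch(taxon_id):
    return []


for fetch in (offline_fetch, empty_fetch):
    try:
        fetch_lineage_or_raise(12345, fetch)
    except TaxonomyLookupError as err:
        print(type(err).__name__, "-", err)
```

A caller that prefers to load the sequence anyway can catch the shared base class and fall back to recording the species alone.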
From mjldehoon at yahoo.com Sun Mar 30 14:49:41 2008
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Sun, 30 Mar 2008 07:49:41 -0700 (PDT)
Subject: [Biopython-dev] Bio.Entrez XML parsing
Message-ID: <864047.785.qm@web62410.mail.re1.yahoo.com>
> Eric, could you attach your taxonomy XML code to this bug?
> We'd probably want to start by adding taxonomy XML parsing
> to Bio.Entrez (which I assume you are using to fetch the XML data).
I've done some thinking about XML parsers for Bio.Entrez.
I propose to add a function read() to Bio.Entrez, which returns a record suitable for the type of XML file we're trying to read (as determined by the corresponding DTD file).
Now, the various XML types can be very different from each other, and I think the actual parsing should be done by a specialized submodule of Bio.Entrez. For example, one Bio.Entrez.EInfo, one Bio.Entrez.ESummary, and so on. For Bio.Entrez.EFetch, there seem to be many different XMLs, so we'd probably have a number of submodules for it (one of them for the taxonomy XML).
The first tag received by the read() function in Bio.Entrez tells it which type of XML it is receiving (have a look at the XML files shown in chapter 6 of the tutorial for some examples), and can then decide which of the submodules of Bio.Entrez should be used for the actual parsing. Otherwise, the read() function in Bio.Entrez does very little; the actual work is done by the submodules.
If the read() function encounters an XML type for which no parser is yet available, it can raise a NotImplementedError exception.
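To illustrate, a minimal sketch of such a dispatching read() using only the standard library. The two parser functions and the returned record shapes are placeholders rather than a proposed API, and for brevity the sketch parses the whole document before looking at the root tag instead of deciding on the first tag as it arrives.

```python
import io
import xml.etree.ElementTree as ET


def parse_einfo(root):
    # Placeholder parser: collect the database names from an EInfo result.
    return {"databases": [node.text for node in root.iter("DbName")]}


def parse_taxonomy(root):
    # Placeholder parser: collect the scientific name of each taxon.
    return [taxon.findtext("ScientificName") for taxon in root.findall("Taxon")]


# Root tag -> function doing the actual parsing (in the proposal these
# would live in separate submodules).
_PARSERS = {
    "eInfoResult": parse_einfo,
    "TaxaSet": parse_taxonomy,
}


def read(handle):
    """Parse XML from handle with the parser matching its root tag."""
    root = ET.parse(handle).getroot()
    try:
        parser = _PARSERS[root.tag]
    except KeyError:
        raise NotImplementedError(
            "No parser available for XML type %r" % root.tag
        ) from None
    return parser(root)


xml = "<eInfoResult><DbList><DbName>pubmed</DbName></DbList></eInfoResult>"
print(read(io.StringIO(xml)))
```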
Comments, anybody?
--Michiel
From sdavis2 at mail.nih.gov Mon Mar 31 00:51:07 2008
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Sun, 30 Mar 2008 20:51:07 -0400
Subject: [Biopython-dev] Bio.Entrez XML parsing
In-Reply-To: <864047.785.qm@web62410.mail.re1.yahoo.com>
References: <864047.785.qm@web62410.mail.re1.yahoo.com>
Message-ID: <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com>
On Sun, Mar 30, 2008 at 10:49 AM, Michiel de Hoon wrote:
>
> > Eric, could you attach your taxonomy XML code to this bug?
> > We'd probably want to start by adding taxonomy XML parsing
> > to Bio.Entrez (which I assume you are using to fetch the XML data).
>
> I've done some thinking about XML parsers for Bio.Entrez.
>
> I propose to add a function read() to Bio.Entrez, which returns a record suitable for the type of XML file we're trying to read (as determined by the corresponding DTD file).
>
> Now, the various XML types can be very different from each other, and I think the actual parsing should be done by a specialized submodule of Bio.Entrez. For example, one Bio.Entrez.EInfo, one Bio.Entrez.ESummary, and so on. For Bio.Entrez.EFetch, there seem to be many different XMLs, so we'd probably have a number of submodules for it (one of them for the taxonomy XML).
>
> The first tag received by the read() function in Bio.Entrez tells it which type of XML it is receiving (have a look at the XML files shown in chapter 6 of the tutorial for some examples), and can then decide which of the submodules of Bio.Entrez should be used for the actual parsing. Otherwise, the read() function in Bio.Entrez does very little; the actual work is done by the submodules.
>
> If the read() function encounters an XML type for which no parser is yet available, it can raise a NotImplementedError exception.
>
> Comments, anybody?
This makes sense. However, it seems that there needs to be a way to
"register" a parser with read() so that users can extend their local
installation with a specialized parser. In other words, it seems that
a way to dynamically register a parser with read() would be helpful.
Or am I missing something?
Sean
From biopython at maubp.freeserve.co.uk Mon Mar 31 11:25:05 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 31 Mar 2008 12:25:05 +0100
Subject: [Biopython-dev] Bio.Entrez XML parsing
In-Reply-To: <264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com>
References: <864047.785.qm@web62410.mail.re1.yahoo.com>
<264855a00803301751h270ee34dg86325eb1af298369@mail.gmail.com>
Message-ID: <320fb6e00803310425u478fc938w2ff426c4eae32d99@mail.gmail.com>
On Mon, Mar 31, 2008 at 1:51 AM, Sean Davis wrote:
> This makes sense. However, it seems that there needs to be a way to
> "register" a parser with read() so that users can extend their local
> installation with a specialized parser. In other words, it seems that
> a way to dynamically register a parser with read() would be helpful.
> Or am I missing something?
I like Michiel's plan. The mapping could be as simple as a (private)
dictionary in Bio.Entrez, mapping formats to parser objects/functions
- as done in Bio.SeqIO - which lets the user add new parsers or
override the built in ones should they so desire.
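Something along these lines, as a sketch only. The dictionary name, register_parser() and the toy parsers are made up for illustration and are not existing Bio.Entrez or Bio.SeqIO code; the point is only that a plain dictionary already allows both adding and overriding.

```python
def _builtin_einfo_parser(text):
    # Stand-in for a parser shipped with the library.
    return ("builtin", text)


# XML type -> parser function. Being an ordinary dictionary, users can
# add entries or replace the built-in ones.
_parsers = {"eInfoResult": _builtin_einfo_parser}


def register_parser(xml_type, parser):
    """Add a parser for xml_type, replacing any existing one."""
    _parsers[xml_type] = parser


def read(xml_type, text):
    """Hand the text to whichever parser is registered for xml_type."""
    try:
        parser = _parsers[xml_type]
    except KeyError:
        raise NotImplementedError("No parser for %r" % xml_type) from None
    return parser(text)


def my_snp_parser(text):
    # A user-supplied parser for a type the library does not know about.
    return ("user", text.upper())


register_parser("ExchangeSet", my_snp_parser)
print(read("eInfoResult", "abc"))
print(read("ExchangeSet", "abc"))
```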
Peter
From tiagoantao at gmail.com Mon Mar 31 14:54:38 2008
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 31 Mar 2008 15:54:38 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
Message-ID: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
Hi,
I would like to start working on the statistical part (actually the
most important part) of Bio.PopGen and on the HapMap part.
My problem is with the CVS to SVN conversion. I cannot understand if I
can go forward and where (i.e. on the SVN or the CVS repository)?
In any case, I can wait with committing, so there is no rush, but
eventually I will have to commit somewhere ;)
Tiago
From bugzilla-daemon at portal.open-bio.org Mon Mar 31 15:22:20 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 31 Mar 2008 11:22:20 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803311522.m2VFMKvU003831@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #5 from ericgibert at yahoo.fr 2008-03-31 11:22 EST -------
I attached the XML parser. Note that I did not dig too far in raising errors.
This is not yet the full solution for the taxon/taxon_name tables of BioSQL but
the first step.
Please comment on my programming style and if you want me to raise errors. Note
that Bio.Entrez already raises some errors.
From bugzilla-daemon at portal.open-bio.org Mon Mar 31 15:24:06 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 31 Mar 2008 11:24:06 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803311524.m2VFO6wc004008@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #6 from ericgibert at yahoo.fr 2008-03-31 11:24 EST -------
Created an attachment (id=890)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=890&action=view)
Parse a Taxonomy record from NCBI
From biopython at maubp.freeserve.co.uk Mon Mar 31 15:45:07 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 31 Mar 2008 16:45:07 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
In-Reply-To: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
Message-ID: <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
On Mon, Mar 31, 2008 at 3:54 PM, Tiago Antão wrote:
> Hi,
>
> I would like to start working on the statistical part (actually the
> most important part) of Bio.PopGen and on the HapMap part.
>
> My problem is with the CVS to SVN conversion. I cannot understand if I
> can go forward and where (ie on the SVN or the CSV repository)?
>
> I any case, I can wait with commiting, so there is no rush, but
> eventually I will have to commit somewhere ;)
In the short term, we are still using CVS. I've only been making
relatively small changes as I anticipate the move to SVN will happen
shortly...
Are there any objections to doing it in the next fortnight? Chris -
could you find out when would suit the OBF guys? Maybe come up with
two suggested time slots in the next month?
Peter
From tiagoantao at gmail.com Mon Mar 31 18:32:06 2008
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 31 Mar 2008 19:32:06 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
In-Reply-To: <320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
<320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
Message-ID: <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com>
When on SVN I would like to consider branching for PopGen. AFAIK
branching on svn costs very little (only when you make changes does
SVN copy the content from the original branch).
This would have the big advantage that I could make my changes freely
without impact on Michiel's release cycle (or breaking the SVN head
for some reason). Whenever I get something stable I just merge back.
There are good reasons NOT to branch, so this might not be a good
idea... But considering that I am the only person that changes PopGen
I don't think merging will be an issue at all... Any comments?
On Mon, Mar 31, 2008 at 4:45 PM, Peter wrote:
>
> On Mon, Mar 31, 2008 at 3:54 PM, Tiago Antão wrote:
> > Hi,
> >
> > I would like to start working on the statistical part (actually the
> > most important part) of Bio.PopGen and on the HapMap part.
> >
> > My problem is with the CVS to SVN conversion. I cannot understand if I
> > can go forward and where (ie on the SVN or the CSV repository)?
> >
> > I any case, I can wait with commiting, so there is no rush, but
> > eventually I will have to commit somewhere ;)
>
> In the short term, we are still using CVS. I've only been making
> relatively small changes as I anticipate the move to SVN will happen
> shortly...
>
> Are there any objections to doing it in the next fortnight? Chris -
> could you find out when would suit the OBF guys? Maybe come up with
> two suggested time slots in the next month?
>
> Peter
>
--
http://www.tiago.org
From biopython at maubp.freeserve.co.uk Mon Mar 31 19:04:35 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Mon, 31 Mar 2008 20:04:35 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
In-Reply-To: <6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com>
References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
<320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
<6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com>
Message-ID: <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com>
On Mon, Mar 31, 2008 at 7:32 PM, Tiago Antão wrote:
> When on SVN I would like to consider branching for PopGen. AFAIK
> branching on svn costs very little (only when you make changes does
> SVN copies the content from the original branch).
>
> This would have the big advantage that I could make my changes freely
> without impact on Michiel's release cycle (or breaking the SVN head
> for some reason). Whenever I get something stable I just merge back.
>
> There are good reasons NOT to branch, so this might not be a good
> idea... But considering that I am the only person that changes PopGen
> I don't thing merging will be an issue at all... Any comments?
I had been wondering about taking advantage of SVN to explore my
Bio.AlignIO plans and/or improvements to the alignment object. I
think I will need to read up on SVN and how it handles merges and
branches before I try this.
There is a lot to be said for having a single stable trunk - it
certainly makes things simpler for any new developers to get to grips
with things.
Peter
From tiagoantao at gmail.com Mon Mar 31 19:08:46 2008
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 31 Mar 2008 20:08:46 +0100
Subject: [Biopython-dev] Bio.PopGen and CVS/SVN
In-Reply-To: <320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com>
References: <6d941f120803310754v71a4afd4s37073b1f54a01c74@mail.gmail.com>
<320fb6e00803310845wd5ca8d3led77e8e578e86f7c@mail.gmail.com>
<6d941f120803311132o4ddb0f2eq4d9087472b43ace9@mail.gmail.com>
<320fb6e00803311204k14ebdbdan1e9cea3842af64e8@mail.gmail.com>
Message-ID: <6d941f120803311208k6b6c9d1ah58c7808e0fbd0e2c@mail.gmail.com>
On Mon, Mar 31, 2008 at 8:04 PM, Peter wrote:
> There is a lot to be said for having a single stable trunk - it
> certainly makes things simpler for any new developers to get to grips
> with things.
It is one of those issues where there is no clear answer. Maybe a case
by case analysis? I think having 5 gazillion branches would not be a
good idea ever, but in the Biopython case many modules are somewhat
self contained, making merging an easier exercise.
Tiago
From tiagoantao at gmail.com Mon Mar 31 22:13:11 2008
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 31 Mar 2008 23:13:11 +0100
Subject: [Biopython-dev] Genbank dbSNP support
Message-ID: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com>
Hi,
Any plans for dbSNP support?
http://www.ncbi.nlm.nih.gov/SNP/index.html
I think I would volunteer to implement this. A simple solution would
be to add both databases and return types. Michiel (I suppose this is
code that you are actively maintaining, or is it Peter?), can I send
you a diff? I have done this once already for genome -
http://portal.open-bio.org/pipermail/biopython/2007-January/003347.html
dbSNP can return different types (
http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html#rettypeparam
) so a few parsers would be needed for complete support. But that can
be done later...
--
http://www.tiago.org
From biopython at maubp.freeserve.co.uk Mon Mar 31 23:01:10 2008
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 1 Apr 2008 00:01:10 +0100
Subject: [Biopython-dev] Genbank dbSNP support
In-Reply-To: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com>
References: <6d941f120803311513k43139fbi97683597c15f03a2@mail.gmail.com>
Message-ID: <320fb6e00803311601x573c104cx1beb7035a14ef03c@mail.gmail.com>
On Mon, Mar 31, 2008 at 11:13 PM, Tiago Antão wrote:
> Hi,
>
> Any plans for dbSNP support?
> http://www.ncbi.nlm.nih.gov/SNP/index.html
>
> I think I would volunteer to implement this. A simple solution would
> be to add both databases and return types. Michiel (I suppose this is
> code that you are actively maintaining, or it is Peter?), can I send
> you a diff? I have done this once already for genome -
> http://portal.open-bio.org/pipermail/biopython/2007-January/003347.html
I think Michiel has been dealing with this sort of stuff
(NCBIDictionary and Bio.Entrez). I would file an enhancement bug, and
attach your patch to it.
> dbSNP can return different types (
> http://eutils.ncbi.nlm.nih.gov/entrez/query/static/efetchseq_help.html#rettypeparam
> ) so a few parsers would be needed for complete support. But that can
> be done later...
We should already be able to parse their Fasta, GenBank or GenPept
output. The lists of IDs should also be trivial. I haven't looked at
the other formats.
Peter
From bugzilla-daemon at portal.open-bio.org Mon Mar 31 23:23:46 2008
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 31 Mar 2008 19:23:46 -0400
Subject: [Biopython-dev] [Bug 2475] BioSQL.Loader should reuse existing
taxon entries in lineage
In-Reply-To:
Message-ID: <200803312323.m2VNNku4026068@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2475
------- Comment #7 from ericgibert at yahoo.fr 2008-03-31 19:23 EST -------
Created an attachment (id=891)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=891&action=view)
refactoring and search by name
Please discard previous attachment. This newer version includes a static method
returning a list of Taxonomy based on a scientific name.
It is then possible to test the length of the returned list:
0 for no match, 1 for a unique taxon, more if ambiguity.
Ambiguity can be cleared using the get_taxon_by_rank("order") for example.
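The intended usage might look roughly like this. This is a sketch and not the attached code: only the name get_taxon_by_rank comes from the description above, while the class layout, find_by_name() and the abbreviated lineages are assumptions made for illustration.

```python
class Taxonomy:
    """Toy record holding a lineage as (rank, name) pairs, root first."""

    def __init__(self, lineage):
        self.lineage = lineage
        self.scientific_name = lineage[-1][1]

    def get_taxon_by_rank(self, rank):
        """Return the name at the given rank, or None if it is absent."""
        for taxon_rank, name in self.lineage:
            if taxon_rank == rank:
                return name
        return None

    @staticmethod
    def find_by_name(name, known):
        """Return every Taxonomy in known with this scientific name."""
        return [taxon for taxon in known if taxon.scientific_name == name]


# Abbreviated, illustrative lineages for two genera sharing one name.
KNOWN = [
    Taxonomy([("order", "Rosales"), ("family", "Moraceae"), ("genus", "Morus")]),
    Taxonomy([("order", "Suliformes"), ("family", "Sulidae"), ("genus", "Morus")]),
]

matches = Taxonomy.find_by_name("Morus", KNOWN)
if len(matches) > 1:
    # Ambiguous name: narrow the candidates down using the order.
    matches = [m for m in matches if m.get_taxon_by_rank("order") == "Rosales"]
print(len(matches), matches[0].get_taxon_by_rank("family"))
```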