From bugzilla-daemon at portal.open-bio.org Sat Dec 1 13:25:56 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 1 Dec 2007 13:25:56 -0500
Subject: [Biopython-dev] [Bug 2414] New: run_tests,
py fails with a single test on a test suite
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2414
Summary: run_tests,py fails with a single test on a test suite
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: trivial
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: tiagoantao at gmail.com
When a test python file is composed of a single test, PyUnit dumps the
following log:
Ran 1 test in xxxs
run_tests.py (current CVS HEAD), line 284, only searches for the plural form:
Ran yy tests in xxxs
Mini patch (not tested, but trivial):
if expected_line[:3] == "Ran" and \
   string.find(expected_line, " tests in ") >= 5:
becomes, e.g.,
if expected_line[:3] == "Ran" and \
   (string.find(expected_line, " tests in ") >= 5 or
    string.find(expected_line, " test in ") >= 5):
I actually have, for now, a single case with one test, as I split my test cases
into those depending on external binaries and those not depending on them
(creating a test scenario with a single test to try to run an external
application).
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From tiagoantao at gmail.com Mon Dec 3 16:41:06 2007
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 3 Dec 2007 21:41:06 +0000
Subject: [Biopython-dev] [Bug 2414] New: run_tests,
py fails with a single test on a test suite
In-Reply-To:
References:
Message-ID: <6d941f120712031341p1af6ca55oa04b787f8e0937@mail.gmail.com>
Hi,
Could I please ask you (I suppose Peter or Michiel) to advise on this?
I have my code for coalescent simulation ready, but I am not
committing it because one of my test files has only a single test (to see
if it can run the coalescent simulator; all the other tests do not
depend on having the simulator, so they are in a different test
case).
I can either add a dummy test just to have 2 tests (a hack), or
run_tests.py can be fixed.
Thanks
Tiago
PS - Apologies in advance if I take too much time to respond, I will
be traveling for the next 3 days.
--
http://www.tiago.org/ps
From bugzilla-daemon at portal.open-bio.org Mon Dec 3 17:04:05 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 3 Dec 2007 17:04:05 -0500
Subject: [Biopython-dev] [Bug 2414] run_tests,
py fails with a single test on a test suite
In-Reply-To:
Message-ID: <200712032204.lB3M45tn000935@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2414
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-03 17:04 EST -------
Are you talking about test_PopGen_FDist.py? I don't have fdist installed, so I
haven't found this problem yet...
In any case, your fix looks fine, although arguably a regular expression (with an
optional "s" in "tests") would be more elegant.
I am happy for you to make this change in run_tests.py.
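The regular-expression variant suggested above could look roughly like the following. This is only a sketch, not the code in run_tests.py: the names RAN_TESTS_RE and is_summary_line are made up for illustration, and it is written for a current Python rather than the interpreter of the time.

```python
import re

# Hypothetical pattern: "Ran", a test count, then "test" with an optional "s".
RAN_TESTS_RE = re.compile(r"^Ran \d+ tests? in ")


def is_summary_line(line):
    """Return True if line looks like PyUnit's "Ran N test(s) in ..." summary."""
    return RAN_TESTS_RE.match(line) is not None
```

Compared with the two string.find() calls, a single pattern covers both the singular and the plural form and also requires a number after "Ran".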
From mdehoon at c2b2.columbia.edu Tue Dec 4 02:10:40 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue, 4 Dec 2007 02:10:40 -0500
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt / Bio.SeqIO
Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
Hi everybody,
I am still looking at the different code in Biopython to access SwissProt.
With Bio.SwissProt, we can access the SwissProt database as follows:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"]
# record is now a string containing the SwissProt record O23719
Another option is to pull out a Bio.SwissProt.SProt.Record object:
>>> from Bio.SwissProt import SProt
>>> s_parser = SProt.RecordParser()
>>> dictionary = SProt.ExPASyDictionary(parser=s_parser)
>>> record = dictionary["O23719"]
# record is now a Bio.SwissProt.SProt.Record object containing record O23719
A third option is to pull out a SeqRecord by using SeqIO:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"]
>>> from Bio import SeqIO
>>> import StringIO
>>> record = SeqIO.parse(StringIO.StringIO(record), "swiss").next()
# record is now a Bio.SeqRecord.SeqRecord object containing record O23719
Compare this to how we would read a Fasta file:
>>> from Bio import SeqIO
>>> input = open("mydata.fa")
>>> record = SeqIO.parse(input, "fasta").next()
For consistency with Bio.SeqIO, it would make sense if ExPASyDictionary
returned handles instead of parsed objects. Then these examples would look like:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"].read()
# record is now a string containing the SwissProt record O23719
To pull out a Bio.SwissProt.SProt.Record object:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> handle = dictionary["O23719"]
>>> record = SProt.parse(handle)
# record is now a Bio.SwissProt.SProt.Record object containing record O23719
To pull out a SeqRecord by using SeqIO:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> handle = dictionary["O23719"]
>>> from Bio import SeqIO
>>> record = SeqIO.parse(handle, "swiss").next()
# record is now a Bio.SeqRecord.SeqRecord object containing record O23719
*If* we decide that ExPASyDictionary should return handles, *then* actually
we don't really need an ExPASyDictionary, as its behavior is then largely the
same as Bio.WWW.ExPASy.get_sprot_raw. So in short, in my opinion
Bio.SwissProt.SProt.ExPASyDictionary does not add much beyond what
Bio.WWW.ExPASy.get_sprot_raw already offers.
Any comments?
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From biopython-dev at maubp.freeserve.co.uk Tue Dec 4 05:26:52 2007
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Tue, 4 Dec 2007 10:26:52 +0000
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
In-Reply-To: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
Message-ID: <320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
> For consistency with Bio.SeqIO, it would make sense if ExPASyDictionary would
> return handles instead of parsed objects.
I agree that it would in general be simpler if our online APIs
returned handles by default. This also applies to the Bio.GenBank
methods. Of course, we should preserve existing functionality if
possible.
Another alternative is to return SeqRecords by default (via Bio.SeqIO)
but this wouldn't generalise to non-sequence files like ProSite etc.
One idea I had been thinking about was adding a new function
Bio.SeqIO.fetch(...) or Bio.SeqIO.online_fetch(...) which would act as
a proxy to all our supported online sequence databases, and either
return a handle to the requested record(s), or perhaps return
SeqRecord(s).
One API model would be that outlined for the (possibly defunct?) Open
Biological Database Access (OBDA) scheme, which covers both BioSQL
access and online fetching (biofetch):
http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/obda-specs/biofetch/biofetch.txt?cvsroot=obf-common
But first I should probably finish working on BioSQL ;)
> *If* we decide that ExPASyDictionary should return handles, *then* actually
> we don't really need an ExPASyDictionary, as its behavior is then largely the
> same as Bio.WWW.ExPASy.get_sprot_raw. So in short, in my opinion
> Bio.SwissProt.SProt.ExPASyDictionary does not add much beyond what
> Bio.WWW.ExPASy.get_sprot_raw already offers.
Can ExPASyDictionary return anything that get_sprot_raw can't?
Otherwise from the user's point of view it's just a coding style issue
(dictionary versus function).
Peter
From bugzilla-daemon at portal.open-bio.org Tue Dec 4 05:41:25 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 4 Dec 2007 05:41:25 -0500
Subject: [Biopython-dev] [Bug 2414] run_tests,
py fails with a single test on a test suite
In-Reply-To:
Message-ID: <200712041041.lB4AfPTN008806@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2414
tiagoantao at gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
------- Comment #2 from tiagoantao at gmail.com 2007-12-04 05:41 EST -------
> Are you talking about test_PopGen_FDist.py? I don't have fdist installed, so I
> haven't found this problem yet...
No, it is my new SimCoal code.
> In any case, your fix looks fine, although arguably a regular expression (with an
> optional "s" in "tests") would be more elegant.
>
> I am happy for you to make this change in run_tests.py
OK, I will do this with a regex. I cannot promise when, though, as I am
traveling until Saturday (but it will be done before next Monday).
From bugzilla-daemon at portal.open-bio.org Tue Dec 4 14:43:17 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 4 Dec 2007 14:43:17 -0500
Subject: [Biopython-dev] [Bug 2412] NCBIXML. fails parsing with blast 2.2.15
in special cases (Karlin-Altschul)
In-Reply-To:
Message-ID: <200712041943.lB4JhHkE012059@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2412
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-04 14:43 EST -------
The fact that your example gives an empty XML file is essentially due to some
problem with Blast. I agree that the Biopython error message you quoted is
very unhelpful in this situation.
Are you using Biopython 1.43 (as suggested by the stack trace in the error
report), or Biopython 1.44 as reported in the bug details?
What does this do on your setup?
from StringIO import StringIO
from Bio.Blast import NCBIXML
handle = StringIO("")
for record in NCBIXML.parse(handle):
    print record
If you are using Biopython 1.44 or later you should get a helpful error
message, "ValueError: Your XML file was empty". You can catch this, and inspect
the contents of the error handle if you want to deal with this in your
application.
i.e. I think this bug has already been fixed in Biopython 1.44.
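Catching the error described above might be sketched as follows. To keep the example runnable without Biopython or Blast, parse_records is a hypothetical stand-in for a generator-style parser such as NCBIXML.parse; it only imitates the empty-input check and is not the real parser. The helper safe_parse is likewise made up for illustration.

```python
from io import StringIO


def parse_records(handle):
    """Hypothetical stand-in for a generator-style parser such as NCBIXML.parse."""
    text = handle.read()
    if not text.strip():
        # Mirrors the message quoted above for an empty XML file.
        raise ValueError("Your XML file was empty")
    for chunk in text.split("\n\n"):
        yield chunk


def safe_parse(handle):
    """Return (records, error_message); error_message is None on success."""
    try:
        # The try block has to cover the iteration itself: a generator
        # raises when it is consumed, not when it is created.
        return list(parse_records(handle)), None
    except ValueError as err:
        return [], str(err)


records, error = safe_parse(StringIO(""))
```

In a real application the except branch is the place to look at the Blast error handle, as suggested above.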
From bugzilla-daemon at portal.open-bio.org Tue Dec 4 15:25:45 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 4 Dec 2007 15:25:45 -0500
Subject: [Biopython-dev] [Bug 2396] BioSQL loader does not store sequence
level annotations dict
In-Reply-To:
Message-ID: <200712042025.lB4KPj2D016252@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2396
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-04 15:25 EST -------
I think I have fixed this now in CVS.
One related wrinkle is that if you had this:
record.annotations["example1"] == "string"
record.annotations["example2"] == ["alpha"]
record.annotations["example3"] == ["alpha", "beta"]
after loading and retrieving from BioSQL you have this:
record.annotations["example1"] == ["string"]
record.annotations["example2"] == ["alpha"]
record.annotations["example3"] == ["alpha", "beta"]
i.e. Everything becomes a list of strings.
It is difficult to see how to deal with this elegantly given the current BioSQL
schema. One option is to treat single entries as either a list or a string
depending on the rank field in the database... I should probably take this up
with the BioSQL mailing list to see how/if this issue affects BioPerl/BioJava.
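Until that is settled, code comparing annotations before and after a BioSQL round trip could normalise both sides to lists first. A minimal sketch, assuming annotations are plain dictionaries of strings or lists of strings; as_list and normalise_annotations are hypothetical helpers, not part of Biopython or BioSQL.

```python
def as_list(value):
    """Wrap a single value in a list; copy lists and tuples into a new list."""
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def normalise_annotations(annotations):
    """Return a copy of the annotations dict in which every value is a list."""
    return {key: as_list(value) for key, value in annotations.items()}


before = {
    "example1": "string",
    "example2": ["alpha"],
    "example3": ["alpha", "beta"],
}
after = normalise_annotations(before)
```

This sidesteps the string-versus-list ambiguity for comparisons only; it does not recover which form the original record used.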
From mdehoon at c2b2.columbia.edu Tue Dec 4 20:13:01 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue, 4 Dec 2007 20:13:01 -0500
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
> One idea I had been thinking about was adding a new function
> Bio.SeqIO.fetch(...) or Bio.SeqIO.online_fetch(...) which would act as
> a proxy to all our supported online sequence databases, and either
> return a handle to the requested record(s), or perhaps return
> SeqRecord(s).
I believe that Bio.db has such functionality, but I don't think it is used
much.
Anyway, we currently have too many functions in Biopython for accessing
databases rather than too few, so I think we should not add any new ones.
> > *If* we decide that ExPASyDictionary should return handles, *then* actually
> > we don't really need an ExPASyDictionary, as its behavior is then largely the
> > same as Bio.WWW.ExPASy.get_sprot_raw. So in short, in my opinion
> > Bio.SwissProt.SProt.ExPASyDictionary does not add much beyond what
> > Bio.WWW.ExPASy.get_sprot_raw already offers.
>
> Can ExPASyDictionary return anything that get_sprot_raw can't?
> Otherwise from the user's point of view it's just a coding style issue
> (dictionary versus function).
ExPASyDictionary is just a wrapper around get_sprot_raw, so get_sprot_raw can
return any record that ExPASyDictionary can return.
There are two differences between the two:
1) ExPASyDictionary behaves as a dictionary, get_sprot_raw as a function. As
you write, this is just a coding style issue.
2) When creating an ExPASyDictionary, users can pass a parser to parse the
records before returning them. This is in essence only a coding style issue.
In particular, do we want:
>>> from Bio.SwissProt import SProt
>>> sprot_parser = SProt.RecordParser()
>>> dictionary = SProt.ExPASyDictionary(parser = sprot_parser)
>>> record = dictionary["O12345"]
or
>>> from Bio.SwissProt import SProt
>>> from Bio import ExPASy
>>> handle = ExPASy.get_sprot_raw("O12345")
>>> record = SProt.parse(handle)
For SeqRecords, in the ExPASyDictionary approach we'd use a different parser,
in the get_sprot_raw approach we call SeqIO.parse instead of SProt.parse.
For plain-text output, in the ExPASyDictionary approach we pass no parser,
and in the get_sprot_raw approach we call read() on the handle directly.
To get a handle, in the ExPASyDictionary approach we can use StringIO to
convert the text output to a handle; in the get_sprot_raw approach we don't
need to do anything.
In my opinion, both 1) and 2) are just coding style issues. Maintaining both
ExPASyDictionary and get_sprot_raw is a burden for the developers, and causes
confusion for users. So I suggest we focus on one of these, and deprecate the
other.
The ExPASy.get_sprot_raw approach seems closer to how Bio.SeqIO is organized,
and therefore has my preference.
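The relationship between the two styles can be sketched without network access. Everything below is a stand-in: FAKE_RECORDS, fetch_raw, FetchDictionary and first_line are hypothetical names, and the parser here is a plain callable rather than a Biopython parser object. It only illustrates that the dictionary is a thin layer over a function that returns a handle.

```python
from io import StringIO

# Hypothetical in-memory "database" so that the example needs no network.
FAKE_RECORDS = {"O23719": "ID   TEST_RECORD\n//\n"}


def fetch_raw(accession):
    """Stand-in for a get_sprot_raw-style function: return a handle."""
    return StringIO(FAKE_RECORDS[accession])


class FetchDictionary:
    """Dictionary-style wrapper around a fetch function and an optional parser."""

    def __init__(self, fetch, parser=None):
        self._fetch = fetch
        self._parser = parser

    def __getitem__(self, accession):
        handle = self._fetch(accession)
        if self._parser is None:
            # No parser given: return the raw text of the record.
            return handle.read()
        return self._parser(handle)


def first_line(handle):
    """Toy parser: return the first line of the record without its newline."""
    return handle.readline().rstrip("\n")


raw_text = FetchDictionary(fetch_raw)["O23719"]
parsed = FetchDictionary(fetch_raw, parser=first_line)["O23719"]
```

With the function style, the same results come from fetch_raw("O23719").read() and first_line(fetch_raw("O23719")), which is why the choice looks like a coding style issue.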
Two more issues:
1) I am not sure why the SwissProt code is kept in a separate SProt submodule
of Bio.SwissProt. Currently, Bio/SwissProt/__init__.py is empty, so we can
save ourselves some typing by keeping all the SwissProt code there instead of
in SProt.py.
2) A SwissProt.parse function currently doesn't exist. Right now it is a
three-step process:
>>> s_parser = SProt.RecordParser()
>>> s_iterator = SProt.Iterator(handle, s_parser)
>>> record = s_iterator.next()
A SwissProt.parse function would just contain these three steps, or
perhaps only the first two.
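A convenience function of this kind might be shaped as below. This is a sketch under assumptions, not the SProt code: iter_raw_records and parse are hypothetical, the record parser is any callable taking the raw text, and the only format detail relied on is that each SwissProt record ends with a "//" line.

```python
from io import StringIO


def iter_raw_records(handle):
    """Yield each record as one string; a record ends at a '//' line."""
    lines = []
    for line in handle:
        lines.append(line)
        if line.rstrip() == "//":
            yield "".join(lines)
            lines = []


def parse(handle, record_parser=None):
    """Iterate over records in handle, optionally parsing each one."""
    for raw in iter_raw_records(handle):
        if record_parser is None:
            yield raw
        else:
            yield record_parser(raw)


handle = StringIO("ID   A\n//\nID   B\n//\n")
names = list(parse(handle, record_parser=lambda raw: raw.split()[1]))
```

Returning an iterator corresponds to wrapping only the first two steps; a caller who wants a single record takes the first item.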
--Michiel.
From biopython-dev at maubp.freeserve.co.uk Wed Dec 5 05:03:34 2007
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Wed, 5 Dec 2007 10:03:34 +0000
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
In-Reply-To: <6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
<6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
Message-ID: <320fb6e00712050203p17aa38b0q15d2edd65542021d@mail.gmail.com>
On 12/5/07, Michiel De Hoon wrote:
> > One idea I had been thinking about was adding a new function
> > Bio.SeqIO.fetch(...) or Bio.SeqIO.online_fetch(...) which would act as
> > a proxy to all our supported online sequence databases, and either
> > return a handle to the requested record(s), or perhaps return
> > SeqRecord(s).
>
> I believe that Bio.db has such a functionality, but I don't think it is used
> much. Anyway, we currently have too many functions in Biopython to
> access databases rather than too few. So I think we should not add any
> new ones.
Certainly before taking my suggestion seriously we should try and take
stock of where we stand at the moment with respect to online
databases.
> > Can ExPASyDictionary return anything that get_sprot_raw can't?
> > Otherwise from the user's point of view it's just a coding style issue
> > (dictionary versus function).
>
> ExPASyDictionary is just a wrapper around get_sprot_raw, so get_sprot_raw can
> return any record that ExPASyDictionary can return.
> There are two differences between the two:
> 1) ExPASyDictionary behaves as a dictionary, get_sprot_raw as a function. As
> you write, this is just a coding style issue.
> 2) When creating an ExPASyDictionary, users can pass a parser to parse the
> records before returning them. This is in essence only a coding style issue.
> In particular, do we want:
> >>> from Bio.SwissProt import SProt
> >>> sprot_parser = SProt.RecordParser()
> >>> dictionary = SProt.ExPASyDictionary(parser = sprot_parser)
> >>> record = dictionary["O12345"]
> or
> >>> from Bio.SwissProt import SProt
> >>> from Bio import ExPASy
> >>> handle = ExPASy.get_sprot_raw("O12345")
> >>> record = SProt.parse(handle)
Or do we want to encourage Bio.SeqIO (which happens to call
Bio.SwissProt.SProt internally)?
>>> from Bio import SeqIO
>>> from Bio import ExPASy
>>> handle = ExPASy.get_sprot_raw("O12345")
>>> record = SeqIO.parse(handle, "swiss")
This is the style I prefer (and is very similar to the related
examples I added to the tutorial). It separates fetching the data (as
a handle) and parsing it (via SeqIO).
> For SeqRecords, in the ExPASyDictionary approach we'd use a different parser,
> in the get_sprot_raw approach we call SeqIO.parse instead of SProt.parse.
> For plain-text output, in the ExPASyDictionary approach we pass no parser,
> and in the get_sprot_raw approach we call read() on the handle directly.
> To get a handle, in the ExPASyDictionary approach we can use StringIO to
> convert the text output to a handle; in the get_sprot_raw approach we don't
> need to do anything.
>
> In my opinion, both 1) and 2) are just coding style issues. Maintaining both
> ExPASyDictionary and get_sprot_raw is a burden for the developers, and causes
> confusion for users. So I suggest we focus on one of these, and deprecate the
> other.
As ExPASyDictionary just wraps get_sprot_raw with a parser
object, the additional overhead is minimal. The dictionary metaphor
is quite nice - even if you don't actually gain much functionality.
However, setting up the dictionary as it is now (requiring an
"old-fashioned" parser object) is fairly fiddly and confusing.
> The ExPASy.get_sprot_raw approach seems closer to how Bio.SeqIO is
> organized, and therefore has my preference.
I agree: if we wanted to deprecate one, I would keep
get_sprot_raw and drop ExPASyDictionary. However, we should try to
have a coherent API for the other online tools as well.
> Two more issues:
> 1) I am not sure why the SwissProt code is kept in a separate SProt submodule
> of Bio.SwissProt. Currently, Bio/SwissProt/__init__.py is empty, so we can
> save ourselves some typing by keeping all the SwissProt code there instead of
> in SProt.py.
Or just encourage using it via Bio.SeqIO (then we can move things
later if wanted).
> 2) A SwissProt.parse function currently doesn't exist. Right now it is a
> three-step process:
> >>> s_parser = SProt.RecordParser()
> >>> s_iterator = SProt.Iterator(handle, s_parser)
> >>> record = s_iterator.next()
> A SwissProt.parse function would just contain these three steps, or
> perhaps only the first two.
Bio.SeqIO.parse() is very close, though.
Peter
From mdehoon at c2b2.columbia.edu Wed Dec 5 05:29:38 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Wed, 5 Dec 2007 05:29:38 -0500
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu><320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com><6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
<320fb6e00712050203p17aa38b0q15d2edd65542021d@mail.gmail.com>
Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B671@mail2.exch.c2b2.columbia.edu>
> Or do we want to encourage Bio.SeqIO (which happens to call
> Bio.SwissProt.SProt internally)?
>
> >>> from Bio import SeqIO
> >>> from Bio import ExPASy
> >>> handle = ExPASy.get_sprot_raw("O12345")
> >>> record = SeqIO.parse(handle, "swiss")
>
> This is the style I prefer (and is very similar to the related
> examples I added to the tutorial). It separates fetching the data (as
> a handle) and parsing it (via SeqIO).
SeqIO.parse returns a SeqRecord; a SwissProt.parse returns a
SwissProt.SProt.Record.
Does the SeqRecord contain the same information as a SwissProt.SProt.Record?
Or is some information lost?
If they contain the same information, then I am in favor of encouraging
Bio.SeqIO.
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From biopython at maubp.freeserve.co.uk Wed Dec 5 06:55:45 2007
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Wed, 05 Dec 2007 11:55:45 +0000
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
In-Reply-To: <6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
<6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
Message-ID: <475691C1.3020705@maubp.freeserve.co.uk>
On 12/5/07, Michiel De Hoon wrote:
> > One idea I had been thinking about was adding a new function
> > Bio.SeqIO.fetch(...) or Bio.SeqIO.online_fetch(...) which would act as
> > a proxy to all our supported online sequence databases, and either
> > return a handle to the requested record(s), or perhaps return
> > SeqRecord(s).
>
> I believe that Bio.db has such a functionality, but I don't think it is used
> much. Anyway, we currently have too many functions in Biopython to
> access databases rather than too few. So I think we should not add any
> new ones.
Certainly before taking my suggestion seriously we should try and take
stock of where we stand at the moment with respect to online
databases.
> > Can ExPASyDictionary return anything that get_sprot_raw can't?
> > Otherwise from the user's point of view its just a coding style issue
> > (dictionary versus function).
>
> ExPASyDictionary is just a wrapper around get_sprot_raw, so get_sprot_raw can
> return any record that ExPASyDictionary can return.
> There are two differences between the two:
> 1) ExPASyDictionary behaves as a dictionary, get_sprot_raw as a function. As
> you write, this is just a coding style issue.
> 2) When creating a ExPASyDictionary, users can pass a parser to parse the
> records before returning them. This is in essence only a coding style issue.
> In particular, do we want:
> >>> from Bio.SwissProt import SProt
> >>> sprot_parser = SProt.RecordParser()
> >>> dictionary = SProt.ExPASyDictionary(parser = sprot_parser)
> >>> record = dictionary["O12345"]
> or
> >>> from Bio.SwissProt import SProt
> >>> from Bio import ExPASy
> >>> handle = ExPASy.get_sprot_raw("O12345")
> >>> record = SProt.parse(handle)
Or do we want to encourage Bio.SeqIO (which happens to call
Bio.SwissProt.SProt internally)?
>>> from Bio import SeqIO
>>> from Bio import ExPASy
>>> handle = ExPASy.get_sprot_raw("O12345")
>>> record = SeqIO.parse(handle, "swiss")
This is the style I prefer (and is very similar to the related
examples I added to the tutorial). It separates fetching the data (as
a handle) and parsing it (via SeqIO).
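The separation of fetching and parsing can be sketched without any network access. This is only an illustration under stated assumptions: fetch_raw and parse_records are hypothetical names, fetch_raw returns canned text wrapped in a StringIO instead of contacting ExPASy, and the code targets a current Python interpreter rather than the Python 2 used elsewhere in this thread. The parser relies only on the handle being iterable line by line.

```python
import io


def fetch_raw(accession):
    """Hypothetical stand-in for an online fetch: returns a handle."""
    # Canned text, not a real database entry.
    text = "ID   %s\nSQ   SEQUENCE\n     MKV\n//\n" % accession
    return io.StringIO(text)


def parse_records(handle):
    """Yield one list of lines per record; a '//' line ends a record.

    Lines after the last '//' are ignored in this toy version.
    """
    lines = []
    for line in handle:
        line = line.rstrip("\n")
        if line == "//":
            yield lines
            lines = []
        else:
            lines.append(line)


# Fetching and parsing are two independent steps joined by the handle.
handle = fetch_raw("O12345")
records = list(parse_records(handle))
```

Because parse_records never asks where the handle came from, the same call would work on an open file or any other line-iterable object.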
> For SeqRecords, in the ExPASyDictionary approach we'd use a different parser,
> in the get_sprot_raw approach we call SeqIO.parse instead of SProt.parse.
> For plain-text output, in the ExPASyDictionary approach we pass no parser,
> and in the get_sprot_raw approach we call read() on the handle directly.
> To get a handle, in the ExPASyDictionary approach we can use StringIO to
> convert the text output to a handle; in the get_sprot_raw approach we don't
> need to do anything.
>
> In my opinion, both 1) and 2) are just coding style issues. Maintaining both
> ExPASyDictionary and get_sprot_raw is a burden for the developers, and causes
> confusion for users. So I suggest we focus on one of these, and deprecate the
> other.
As ExPASyDictionary just wraps get_sprot_raw with a parser
object, the additional overhead is minimal. The dictionary metaphor
is quite nice - even if you don't actually gain much functionality.
However, setting up the dictionary as it is now (requiring an "old
fashioned" parser object) is fairly fiddly/confusing.
> The ExPASy.get_sprot_raw approach seems closer to how Bio.SeqIO is
> organized, and therefore has my preference.
I would agree: if you wanted to deprecate one, I would keep
get_sprot_raw and drop ExPASyDictionary. However, we should try to
have a coherent API for the other online tools as well.
> Two more issues:
> 1) I am not sure why the SwissProt code is kept in a separate SProt submodule
> of Bio.SwissProt. Currently, Bio/SwissProt/__init__.py is empty, so we can
> save ourselves some typing by keeping all the SwissProt code there instead of
> in SProt.py.
Or just encourage using it via Bio.SeqIO (then we can move things
later if wanted).
> 2) A SwissProt.parse function currently doesn't exist. Right now it is a
> three-step process:
> >>> s_parser = SProt.RecordParser()
> >>> s_iterator = SProt.Iterator(handle, s_parser)
> >>> record = s_iterator.next()
> A SwissProt.parse function would just contain these three steps, or
> perhaps only the first two.
The Bio.SeqIO.parse() is very close though.
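For illustration, a convenience function that folds the parser and iterator set-up into one call might look roughly like this. The RecordParser and Iterator classes below are toy stand-ins written for this sketch, not the Bio.SwissProt.SProt classes, and the record layout (a '//' terminator, an ID line first) is a simplifying assumption. The code targets a current Python interpreter.

```python
import io


class RecordParser:
    """Toy parser: turns the lines of one record into a dict."""

    def parse(self, lines):
        return {"id": lines[0].split()[1], "n_lines": len(lines)}


class Iterator:
    """Toy iterator: splits a handle on '//' lines and parses each record."""

    def __init__(self, handle, parser):
        self._handle = handle
        self._parser = parser

    def __iter__(self):
        lines = []
        for line in self._handle:
            if line.strip() == "//":
                yield self._parser.parse(lines)
                lines = []
            else:
                lines.append(line.rstrip("\n"))


def parse(handle):
    """Convenience wrapper: build the parser and the iterator in one call."""
    return iter(Iterator(handle, RecordParser()))


handle = io.StringIO("ID   A1\n//\nID   B2\nXX\n//\n")
ids = [record["id"] for record in parse(handle)]
```

The caller then only deals with a handle and the records that come out of it; the parser object becomes an implementation detail.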
Peter
From mdehoon at c2b2.columbia.edu Fri Dec 7 05:11:33 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 7 Dec 2007 05:11:33 -0500
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt
/Bio.SeqIO
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu> <320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com> <6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
<475691C1.3020705@maubp.freeserve.co.uk>
Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B673@mail2.exch.c2b2.columbia.edu>
Hi everybody,
To summarize, I rewrote the chapter on SwissProt/Prosite/Prodoc/ExPASy and
put it here:
http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html#htoc51
(chapter 6 in the tutorial)
This is merely a proposal on how this should work; none of this is in CVS
yet.
Please let us know if you have any objections.
If there are no objections, I can upload the new code to CVS. That would
conclude my work on Bio.WWW.ExPASy; the final (and biggest) part of my work
on Bio.WWW will be to look at the various Biopython modules to interact with
NCBI (Genbank, EUtils).
Two comments:
1) In this proposal, I am using SwissProt.parse instead of SeqIO.parse since
the latter does not (yet) store all information contained in a SwissProt
file. I'd be happy though to move to SeqIO.parse for SwissProt also once it
does.
2) It may be nice to have a SwissProt.read and SeqIO.read to read and return
exactly one record from the handle, in addition to parse() to create an
iterator to read multiple records.
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From biopython at maubp.freeserve.co.uk Fri Dec 7 05:46:32 2007
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 7 Dec 2007 10:46:32 +0000
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt
/Bio.SeqIO
In-Reply-To: <6243BAA9F5E0D24DA41B27997D1FD14402B673@mail2.exch.c2b2.columbia.edu>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
<6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
<475691C1.3020705@maubp.freeserve.co.uk>
<6243BAA9F5E0D24DA41B27997D1FD14402B673@mail2.exch.c2b2.columbia.edu>
Message-ID: <320fb6e00712070246g53e8096ew156f4502791bce9b@mail.gmail.com>
> To summarize, I rewrote the chapter on SwissProt/Prosite/Prodoc/ExPASy and
> put it here:
>
> http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html#htoc51
> (chapter 6 in the tutorial)
>
> This is merely a proposal on how this should work; none of this is in CVS
> yet. Please let us know if you have any objections.
I would add a note saying doing it this way gives
Bio.SwissProt.SProt.Record objects,
while you could alternatively get SeqRecord objects as described in
the SeqIO chapter
(use a reference).
> If there are no objections, I can upload the new code to CVS. That would
> conclude my work on Bio.WWW.ExPASy; the final (and biggest) part of my work
> on Bio.WWW will be to look at the various Biopython modules to interact with
> NCBI (Genbank, EUtils).
That will be "fun"!
> Two comments:
> 1) In this proposal, I am using SwissProt.parse instead of SeqIO.parse since
> the latter does not (yet) store all information contained in a SwissProt
> file. I'd be happy though to move to SeqIO.parse for SwissProt also once it
> does.
> 2) It may be nice to have a SwissProt.read and SeqIO.read to read and return
> exactly one record from the handle, in addition to parse() to create an
> iterator to read multiple records.
I'd suggested a Bio.SeqIO function, with a name like parse1() or
parse_sole() etc which
would return a single SeqRecord - and raise an error if the handle
didn't contain one
and only one record. We could call this function read() if you prefer.
Peter
From mdehoon at c2b2.columbia.edu Fri Dec 7 22:18:09 2007
From: mdehoon at c2b2.columbia.edu (Michiel de Hoon)
Date: Sat, 08 Dec 2007 12:18:09 +0900
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt
/Bio.SeqIO
In-Reply-To: <320fb6e00712070246g53e8096ew156f4502791bce9b@mail.gmail.com>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
<6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
<475691C1.3020705@maubp.freeserve.co.uk>
<6243BAA9F5E0D24DA41B27997D1FD14402B673@mail2.exch.c2b2.columbia.edu>
<320fb6e00712070246g53e8096ew156f4502791bce9b@mail.gmail.com>
Message-ID: <475A0CF1.1080802@c2b2.columbia.edu>
Peter wrote:
> I would add a note saying doing it this way gives
> Bio.SwissProt.SProt.Record objects,
> while you could alternatively get SeqRecord objects as described in
> the SeqIO chapter
> (use a reference).
OK I will add that.
>
> I'd suggested a Bio.SeqIO function, with a name like parse1() or
> parse_sole() etc which
> would return a single SeqRecord - and raise an error if the handle
> didn't contain one
> and only one record. We could call this function read() if you prefer.
>
I'd prefer read() instead of parse1(), parse_sole() etc. for the
following reasons:
1) Having two names that are clearly different emphasizes the fact that
they return different things (parse() returns an iterator, read() a record).
2) Some modules deal with data that always consist of one record (for
example, gene expression data in case of Bio.Cluster). Such modules can
have a read() function but not a parse(). It would feel strange if a
module has a parse1() function but not a parse().
--Michiel.
From bugzilla-daemon at portal.open-bio.org Sat Dec 8 08:09:00 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 8 Dec 2007 08:09:00 -0500
Subject: [Biopython-dev] [Bug 2417] New: Bio.SeqIO single SeqRecord
read/parse function
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
Summary: Bio.SeqIO single SeqRecord read/parse function
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
Most sequence file formats can contain a single record, and in this situation
having to use an iterator returned by Bio.SeqIO.parse() can be clumsy.
For example, dealing with GenBank files for bacterial genomes or chromosomes.
Or, from the tutorial as of Biopython 1.44,
from Bio.WWW import ExPASy
from Bio import SeqIO
seq_record = SeqIO.parse(ExPASy.get_sprot_raw("O23729"), "swiss").next()
print seq_record.id
print seq_record.seq
print len(seq_record.seq)
Using the iterator.next() method as above works fine; however, it will silently
ignore any unexpected subsequent records if present. Checking that your file has
only one record would require an additional check to confirm that a second .next()
call fails, or another such workaround.
I am proposing a new function for use with a handle containing one and only one
record. This would raise an error if the handle contained no records, or if it
contained more than one record. It would be defined in Bio/SeqIO/__init__.py
as a simple wrapper for Bio.SeqIO.parse()
Note - My proposed "read single record" function would NOT work for cases where
the handle contains multiple records and you only want the first one (because I
would raise an exception). I would regard this as a corner case, and catering
to this risks silently ignoring unexpected second and subsequent records in
other use cases. In such situations using Bio.SeqIO.parse(...).next() is
advised.
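A minimal sketch of the proposed behaviour, under stated assumptions: the parse function here is a toy FASTA-like iterator written for this example (not Bio.SeqIO.parse), ValueError is an assumed choice of exception since the proposal only says "raise an error", and the code targets a current Python interpreter.

```python
import io


def parse(handle):
    """Toy stand-in for a record iterator: one record per '>' header."""
    record = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if record is not None:
                yield record
            record = {"id": line[1:], "seq": ""}
        elif record is not None:
            record["seq"] += line
    if record is not None:
        yield record


def read(handle):
    """Return the sole record in the handle, or raise ValueError."""
    iterator = parse(handle)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No records found in handle") from None
    try:
        next(iterator)
    except StopIteration:
        # Exactly one record: the second next() call failed as it should.
        return first
    raise ValueError("More than one record found in handle")


single = read(io.StringIO(">a\nACGT\n"))
```

With two records in the handle, read() raises instead of returning the first, whereas next(parse(handle)) would return the first record and say nothing about the second.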
I had previously suggested "parse1", "parse_sole", "parse_only" - none of which
are very appealing. On the dev mailing list today, Michiel has proposed
"read":
Michiel de Hoon wrote:
>
> Peter wrote:
> > I'd suggested a Bio.SeqIO function, with a name like parse1() or
> > parse_sole() etc which would return a single SeqRecord - and raise
> > an error if the handle didn't contain one and only one record. We
> > could call this function read() if you prefer.
> >
> I'd prefer read() instead of parse1(), parse_sole() etc. for the
> following reasons:
>
> 1) Having two names that are clearly different emphasizes the fact that
> they return different things (parse() returns an iterator, read() a record).
>
> 2) Some modules deal with data that always consist of one record (for
> example, gene expression data in case of Bio.Cluster). Such modules can
> have a read() function but not a parse(). It would feel strange if a
> module has a parse1() function but not a parse().
I plan to add this functionality to Bio/SeqIO/__init__.py as a "read" function,
and update the tutorial accordingly shortly.
From p.j.a.cock at googlemail.com Sat Dec 8 08:10:33 2007
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 8 Dec 2007 13:10:33 +0000
Subject: [Biopython-dev] Bio.SeqIO function to read a single record
Message-ID: <320fb6e00712080510k3d4e5148gb0ec332a0d745452@mail.gmail.com>
Michiel de Hoon wrote:
> >
> > I'd suggested a Bio.SeqIO function, with a name like parse1() or
> > parse_sole() etc which would return a single SeqRecord - and raise
> > an error if the handle didn't contain one and only one record. We
> > could call this function read() if you prefer.
> >
> I'd prefer read() instead of parse1(), parse_sole() etc. for the
> following reasons:
>
> 1) Having two names that are clearly different emphasizes the fact that
> they return different things (parse() returns an iterator, read() a record).
>
> 2) Some modules deal with data that always consist of one record (for
> example, gene expression data in case of Bio.Cluster). Such modules can
> have a read() function but not a parse(). It would feel strange if a
> module has a parse1() function but not a parse().
OK. I've filed an enhancement bug, which I'll mention on the main mailing list,
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
Unless there is some negative feedback, I'll add that functionality shortly.
Peter
From bugzilla-daemon at portal.open-bio.org Sun Dec 9 11:24:19 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Dec 2007 11:24:19 -0500
Subject: [Biopython-dev] [Bug 2417] Bio.SeqIO single SeqRecord read/parse
function
In-Reply-To:
Message-ID: <200712091624.lB9GOJCe025680@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-09 11:24 EST -------
Updated Bio/SeqIO/__init__.py to include the new "read" function in CVS
revision 1.21
I'll do the documentation and unit tests next, before marking this as fixed.
[It's not yet too late to change the name from "read" if anyone can come up with
a nice clear alternative, or a strong argument against this choice]
From bugzilla-daemon at portal.open-bio.org Sun Dec 9 13:50:06 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Dec 2007 13:50:06 -0500
Subject: [Biopython-dev] [Bug 2417] Bio.SeqIO single SeqRecord read/parse
function
In-Reply-To:
Message-ID: <200712091850.lB9Io6tj013469@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
biopython-bugzilla at maubp.freeserve.co.uk changed:
What       |Removed  |Added
----------------------------------------------------------------------------
Status     |NEW      |RESOLVED
Resolution |         |FIXED
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-09 13:50 EST -------
I've updated the tutorial, wiki and unit test.
From bugzilla-daemon at portal.open-bio.org Sun Dec 9 14:03:28 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Dec 2007 14:03:28 -0500
Subject: [Biopython-dev] [Bug 2412] NCBIXML. fails parsing with blast 2.2.15
in special cases (Karlin-Altschul)
In-Reply-To:
Message-ID: <200712091903.lB9J3SkM014338@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2412
biopython-bugzilla at maubp.freeserve.co.uk changed:
What       |Removed  |Added
----------------------------------------------------------------------------
Status     |NEW      |RESOLVED
Resolution |         |WORKSFORME
------- Comment #5 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-09 14:03 EST -------
As per my comment 4, I think that in Biopython 1.44 we look for the special
case of an empty XML output file and raise a ValueError. On Biopython 1.43 the
error was very unhelpful.
I'm marking this as "works for me".
Bjoern, please reopen this bug if there is still a problem using Biopython 1.44
Thanks, Peter.
From bugzilla-daemon at portal.open-bio.org Sun Dec 9 20:18:50 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Dec 2007 20:18:50 -0500
Subject: [Biopython-dev] [Bug 2418] New: SyntaxError should be ValueError
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
Summary: SyntaxError should be ValueError
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: mdehoon at ims.u-tokyo.ac.jp
Biopython now has SyntaxErrors all over the place. Most if not all of these
should be ValueErrors. SyntaxErrors are appropriate if there is a syntax
problem in the code itself, not (as it's used in Biopython) if there is a
syntax problem in an input data file.
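As a small illustration of the distinction (a sketch only; parse_fasta_title is a hypothetical helper, not a Biopython function, and the code targets a current Python interpreter): a parser that meets malformed input reports a problem with the data it was given, which is what ValueError is for.

```python
def parse_fasta_title(line):
    """Return the title of a FASTA header line, without the leading '>'."""
    if not line.startswith(">"):
        # The input data is malformed, not the program's own source code,
        # so ValueError is raised rather than SyntaxError.
        raise ValueError("Expected a line starting with '>', got %r" % line)
    return line[1:].strip()


try:
    parse_fasta_title("ACGT")
except ValueError as err:
    message = str(err)
```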
From bugzilla-daemon at portal.open-bio.org Mon Dec 10 05:01:49 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 10 Dec 2007 05:01:49 -0500
Subject: [Biopython-dev] [Bug 2418] SyntaxError should be ValueError
In-Reply-To:
Message-ID: <200712101001.lBAA1nxL011529@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-10 05:01 EST -------
That would be my fault.
Should we introduce a Biopython "FormatSyntaxError" exception (as a subclass of
ValueError defined in Bio/__init__.py), or just switch these to ValueError
exceptions instead?
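For reference, the first option could be sketched as follows. This is an illustration of the idea only: FormatSyntaxError is the name floated above and is defined here locally, check_terminator is a hypothetical helper, and the code targets a current Python interpreter. Since the class derives from ValueError, callers that catch ValueError would not need to change.

```python
class FormatSyntaxError(ValueError):
    """Hypothetical exception for malformed input files."""


def check_terminator(line):
    """Raise FormatSyntaxError unless the line is a '//' record terminator."""
    if line.strip() != "//":
        raise FormatSyntaxError("Record should end with '//', got %r" % line)


try:
    check_terminator("XX")
except ValueError as err:
    # The subclass is caught by a plain ValueError handler.
    caught = type(err).__name__
```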
From bugzilla-daemon at portal.open-bio.org Mon Dec 10 07:13:16 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 10 Dec 2007 07:13:16 -0500
Subject: [Biopython-dev] [Bug 2418] SyntaxError should be ValueError
In-Reply-To:
Message-ID: <200712101213.lBACDGLG022397@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp 2007-12-10 07:13 EST -------
> Should we introduce a Biopython "FormatSyntaxError" exception (as a subclass of
> ValueError defined in Bio/__init__.py), or just switch these to ValueError
> exceptions instead?
I would stick to ValueError. The error message should be clear enough for the
user to understand what the problem is.
From bugzilla-daemon at portal.open-bio.org Tue Dec 11 06:44:33 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Dec 2007 06:44:33 -0500
Subject: [Biopython-dev] [Bug 2418] SyntaxError should be ValueError
In-Reply-To:
Message-ID: <200712111144.lBBBiXrZ014612@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-11 06:44 EST -------
I've just fixed the Bio.SeqIO, Bio.GenBank, Bio.SwissProt and Bio.SCOP cases
and their test cases.
I see you've found and fixed a whole lot more - it's clearly not just me that used
the SyntaxError exception in this way.
We should probably also change Bio.Medline, Bio.Prosite and Bio.Blast.
I think the cases in Bio.config are a little different...
From bugzilla-daemon at portal.open-bio.org Tue Dec 11 21:54:47 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Dec 2007 21:54:47 -0500
Subject: [Biopython-dev] [Bug 2418] SyntaxError should be ValueError
In-Reply-To:
Message-ID: <200712120254.lBC2slIL022573@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
mdehoon at ims.u-tokyo.ac.jp changed:
What       |Removed  |Added
----------------------------------------------------------------------------
Status     |NEW      |RESOLVED
Resolution |         |FIXED
------- Comment #4 from mdehoon at ims.u-tokyo.ac.jp 2007-12-11 21:54 EST -------
I have replaced the SyntaxErrors by ValueErrors where appropriate. The
remaining SyntaxErrors, as far as I can tell, are being used correctly. Closing
this bug.
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 10:07:12 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 10:07:12 -0500
Subject: [Biopython-dev] [Bug 2419] New: SeqUtils __init__.py missing
complement function (v1.43 and v1.44)
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2419
Summary: SeqUtils __init__.py missing complement function (v1.43
and v1.44)
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: justin.t.riley at gmail.com
This issue exists in both 1.43 and 1.44. You won't notice this bug on an
import of SeqUtils. However, when you try to use the six_frame_translations
function like so:
from Bio import SeqUtils
SeqUtils.six_frame_translations('GTCA....AAT')
you get:
NameError: global name 'complement' is not defined
at line 285 (for version 1.43 anyhow)
At first I searched all the Biopython modules for a "def complement" string and
found one in Seq but it was for the complement of an actual Seq object.
Looking around the web I found:
from Bio.Data import IUPACData

def complement(seq):
    """Return the complementary sequence (NOT antiparallel)."""
    return ''.join([IUPACData.ambiguous_dna_complement[x] for x in seq])
Pasting the above in Bio/SeqUtils/__init__.py solved the issue for me. Thanks.
~jtriley
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 15:33:43 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 15:33:43 -0500
Subject: [Biopython-dev] [Bug 2417] Bio.SeqIO single SeqRecord read/parse
function
In-Reply-To:
Message-ID: <200712122033.lBCKXhxd020792@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
mmokrejs at ribosome.natur.cuni.cz changed:
What       |Removed  |Added
----------------------------------------------------------------------------
CC         |         |mmokrejs at ribosome.natur.cuni.cz
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 16:48:03 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 16:48:03 -0500
Subject: [Biopython-dev] [Bug 2390] Error importing Swiss Prot in BioSQL
In-Reply-To:
Message-ID: <200712122148.lBCLm3iH025664@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2390
------- Comment #19 from Biosql at hotmail.com 2007-12-12 16:48 EST -------
Hi Peter,
I know it's been a very long time (more than a month), but I had this huge exam
to prepare.
Anyway, I've tried the latest version and everything is working fine.
Many, many thanks to you!
Since none of the Swiss Prot cross-references are uploaded to the BioSQL DB, I've
tried to parse the flat file with the RecordParser method from SProt instead of
the SequenceParser or the SeqIO parser, but I'm getting an error.
I've seen in the bug list that you seem to be working on this issue.
Am I right? If not, is there a way to upload the Swiss Prot cross-references?
Again, thank you!
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 17:01:47 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 17:01:47 -0500
Subject: [Biopython-dev] [Bug 2390] Error importing Swiss Prot in BioSQL
In-Reply-To:
Message-ID: <200712122201.lBCM1lGR026457@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2390
biopython-bugzilla at maubp.freeserve.co.uk changed:
What       |Removed  |Added
----------------------------------------------------------------------------
Status     |RESOLVED |REOPENED
Resolution |FIXED    |
------- Comment #20 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-12 17:01 EST -------
Hi Jonathan,
I'm glad we've fixed the error for you. Could you be a little more precise
about what isn't working with getting Swiss Prot cross-references into BioSQL?
e.g. Pick a specific SwissProt record, and quote the lines from the file
containing the cross-references.
That should be enough for me to try and track down what's going on.
By the way - if you want to work with BioSQL, you have to use SeqRecord objects
(e.g. from the Bio.SeqIO parser), and not the Bio.SwissProt.SProt.Record
objects. This probably explains the error you mentioned using the RecordParser
parser instead.
Peter
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 17:17:36 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 17:17:36 -0500
Subject: [Biopython-dev] [Bug 2390] Error importing Swiss Prot in BioSQL
In-Reply-To:
Message-ID: <200712122217.lBCMHaBK027220@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2390
------- Comment #21 from Biosql at hotmail.com 2007-12-12 17:17 EST -------
(In reply to comment #20)
> Hi Jonathan,
>
> I'm glad we've fixed the error for you. Could you be a little more precise
> about what isn't working with getting Swiss Prot cross-references into BioSQL?
>
> e.g. Pick a specific SwissProt record, and quote the lines from the file
> containing the cross-references.
>
> That should be enough for me to try and track down what's going on.
>
> By the way - if you want to work with BioSQL, you have to use SeqRecord objects
> (e.g. from the Bio.SeqIO parser), and not the Bio.SwissProt.SProt.Record
> objects. This probably explains the error you mentioned using the RecordParser
> parser instead.
>
> Peter
>
Sorry for the lack of information.
Here's an example: http://ca.expasy.org/uniprot/Q9CQD1.txt
The sequences, ID lines, AC lines and comments (CC lines) are all being loaded
into the database, but the DR lines (which I consider the most interesting
cross-references), the PubMed references (R_ lines) and the taxon of the
protein are not.
I assume the FT lines can't be loaded either?
If they can, that would be awesome!
To be clear, this loading pattern is not specific to this protein (Rab5a); it
applies to all the Swiss-Prot proteins.
Do you need anything else?
Jonathan
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 19:42:28 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 19:42:28 -0500
Subject: [Biopython-dev] [Bug 2419] SeqUtils __init__.py missing complement
function (v1.43 and v1.44)
In-Reply-To:
Message-ID: <200712130042.lBD0gSdm001952@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2419
------- Comment #1 from mdehoon at ims.u-tokyo.ac.jp 2007-12-12 19:42 EST -------
The "complement" and similar functions were removed from Bio.SeqUtils in
Biopython 1.43 because similar functionality existed in several places in
Biopython. Apparently, we missed this call to complement in the
six_frame_translations function. I would like to avoid adding this function
back to SeqUtils. Instead, we can use the reverse_complement function in
Bio.Seq, and take its reverse.
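A minimal sketch of that idea in plain Python 3, with no Biopython dependency. The reverse_complement helper below is a hypothetical stand-in for the function in Bio.Seq and only handles unambiguous DNA letters; reversing its output gives back the plain complement.

```python
# Stand-in for Bio.Seq's reverse_complement (assumption: unambiguous DNA only).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA string."""
    return sequence.translate(_COMPLEMENT)[::-1]


def complement(sequence):
    """Return the complement by reversing the reverse complement."""
    return reverse_complement(sequence)[::-1]
```

With these definitions, complement("ATGC") gives "TACG", so a caller such as six_frame_translations would not need a separate complement function.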
Could you double-check if the revised version of Bio.SeqUtils.__init__.py works
for you? You can pick it up from here:
http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio/SeqUtils/__init__.py?rev=1.14&cvsroot=biopython&content-type=text/plain
From bugzilla-daemon at portal.open-bio.org Thu Dec 13 11:09:27 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Dec 2007 11:09:27 -0500
Subject: [Biopython-dev] [Bug 2419] SeqUtils __init__.py missing complement
function (v1.43 and v1.44)
In-Reply-To:
Message-ID: <200712131609.lBDG9R7u027690@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2419
------- Comment #2 from justin.t.riley at gmail.com 2007-12-13 11:09 EST -------
(In reply to comment #1)
> The "complement" and similar functions were removed from Bio.SeqUtils in
> Biopython 1.43 because similar functionality existed in several places in
> Biopython. Apparently, we missed this call to complement in the
> six_frame_translations function. I would like to avoid adding this function
> back to SeqUtils. Instead, we can use the reverse_complement function in
> Bio.Seq, and take its reverse.
>
> Could you double-check if the revised version of Bio.SeqUtils.__init__.py works
> for you? You can pick it up from here:
>
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio/SeqUtils/__init__.py?rev=1.14&cvsroot=biopython&content-type=text/plain
>
Michiel, I figured the "solution" I mentioned wasn't ideal, but hey, it
worked :D
The revised __init__.py you linked to works great for me. Thanks for getting
back to me so quickly with a proper fix.
I'm thinking of submitting a patch to Gentoo Linux for this in their Biopython
ebuild until your next release.
Thanks again! ~Justin
From bugzilla-daemon at portal.open-bio.org Thu Dec 13 19:01:54 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Dec 2007 19:01:54 -0500
Subject: [Biopython-dev] [Bug 2419] SeqUtils __init__.py missing complement
function (v1.43 and v1.44)
In-Reply-To:
Message-ID: <200712140001.lBE01sIR023423@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2419
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #3 from mdehoon at ims.u-tokyo.ac.jp 2007-12-13 19:01 EST -------
OK, thanks. Closing this bug.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 10:17:21 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 10:17:21 -0500
Subject: [Biopython-dev] [Bug 2390] Error importing Swiss Prot in BioSQL
In-Reply-To:
Message-ID: <200712141517.lBEFHLcj018666@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2390
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution| |FIXED
------- Comment #22 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 10:17 EST -------
Thanks for the details. Those fields are not being recorded in the SeqRecord
object, so there is no way for BioSQL to put them into the database. This is
bug 2235, which is on my mental to-do list.
Additionally, even if the parser did record the Taxon in the SeqRecord, BioSQL
currently doesn't record this in the database. That seems to have been a short
term fix for Bug 1921 which we should probably revisit.
Note I'm re-marking THIS bug as fixed. Peter.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 12:56:11 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 12:56:11 -0500
Subject: [Biopython-dev] [Bug 2421] New: BioSQL should store and retrieve a
SeqRecord's dbxrefs
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2421
Summary: BioSQL should store and retrieve a SeqRecord's dbxrefs
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
Looking over the code, BioSQL doesn't seem to even try and store database cross
references in a SeqRecord's dbxrefs list. It will however store other cross
references, e.g. in references and in features.
See also:
Bug 2390 comment 21 - Error importing Swiss Prot in BioSQL
It was pointed out that SwissProt DR lines don't get into the database.
The first problem was they didn't even make it to the SeqRecord...
Bug 2235 - SeqRecord from Bio.SwissProt.SProt lacks annotation information
The latest parser in CVS will now load DR lines into the dbxrefs list.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 13:08:01 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:08:01 -0500
Subject: [Biopython-dev] [Bug 2422] New: BioSQL shouldn't just ignore the
taxon_id
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
Summary: BioSQL shouldn't just ignore the taxon_id
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
In Bug 1921 biopython/BioSQL/Loader.py was changed to ignore the taxon_id, in
order to avoid a foreign key constraint when the taxon id was not already
defined (e.g. from loading an up to date NCBI taxonomy).
We should see how BioPerl and BioJava handle this situation...
One crude option (which would still be an improvement on the current situation)
is to check if the taxon_id is defined, and if it is, then store the record
with this included, and if not, issue a warning and store the sequence but
omitting the taxon id.
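The crude option could look roughly like the following Python 3 sketch. It is not the Loader.py code: choose_taxon_id and known_taxon_ids are hypothetical names, and the set stands in for a lookup against the taxon table in the database.

```python
import warnings


def choose_taxon_id(record_taxon_id, known_taxon_ids):
    """Return the taxon id to store, or None if it has to be omitted.

    known_taxon_ids is a stand-in for a query against the taxon table.
    """
    if record_taxon_id is None:
        return None
    if record_taxon_id in known_taxon_ids:
        return record_taxon_id
    # Unknown taxon: warn, then store the sequence without the taxon id
    # so the foreign key constraint is never violated.
    warnings.warn(
        "Taxon id %r is not in the taxon table; storing the record without it"
        % (record_taxon_id,)
    )
    return None
```

The caller would write the returned value into the record's taxon column, so a missing taxon degrades to a warning instead of a failed load.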
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 13:09:33 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:09:33 -0500
Subject: [Biopython-dev] [Bug 1921] BioSeqDatabase.load() method fails
In-Reply-To:
Message-ID: <200712141809.lBEI9Xl9001415@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1921
------- Comment #10 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 13:09 EST -------
In resolving this issue (bug 1921), Biopython's BioSQL is simply ignoring the
taxon_id, so it is never recorded in the database. I've just filed a new bug
on this: Bug 2422 - BioSQL shouldn't just ignore the taxon_id
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 13:21:40 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:21:40 -0500
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200712141821.lBEILelL002298@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 13:21 EST -------
Some of Marc Colosimo's changes proposed on Bug 1816 may be relevant here, in
particular his patch "Various fixes and possible improvements" (attachment
594).
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 13:34:42 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:34:42 -0500
Subject: [Biopython-dev] [Bug 1816] Error when importing GenBank file into
BioSQL database
In-Reply-To:
Message-ID: <200712141834.lBEIYgsN004015@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1816
------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 13:34 EST -------
I'd like to close this bug as the original problem seems to be fixed: Using
CVS, I can load and retrieve AY243312 into BioSQL using the GenBank file
downloaded from here:
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=29692106
Regarding the taxon id, I've filed a separate bug:
Bug 2422 - BioSQL shouldn't just ignore the taxon_id
One of Marc's changes in the patch was caching term and ontology IDs. Does
this make a big difference? If so, could you file a new bug just for that
enhancement and rescue those specific changes from the old patch?
Similarly for the last_id method - could you file a new bug explaining what
problem it's solving?
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 13:36:34 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:36:34 -0500
Subject: [Biopython-dev] [Bug 2414] run_tests.py fails with a single test on
a test suite
In-Reply-To:
Message-ID: <200712141836.lBEIaYKo004243@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2414
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
Summary|run_tests,py fails with a single test on a test suite |run_tests.py fails with a single test on a test suite
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 13:36 EST -------
Tiago made this change in biopython/Tests/run_tests.py revision 1.12, marking
this bug as fixed.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 17:40:39 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 17:40:39 -0500
Subject: [Biopython-dev] [Bug 2421] BioSQL should store and retrieve a
SeqRecord's dbxrefs
In-Reply-To:
Message-ID: <200712142240.lBEMedjA021336@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2421
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 17:40 EST -------
This seems to be working in CVS now...
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 18:08:55 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 18:08:55 -0500
Subject: [Biopython-dev] [Bug 2410] DBSeq & DBSeqRecord should subclass Seq
& SeqRecord
In-Reply-To:
Message-ID: <200712142308.lBEN8tWc023431@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2410
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 18:08 EST -------
Fixed in biopython/BioSQL/BioSeq.py revision 1.20
The BioSQL unit tests still pass.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 18:37:55 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 18:37:55 -0500
Subject: [Biopython-dev] [Bug 2421] BioSQL should store and retrieve a
SeqRecord's dbxrefs
In-Reply-To:
Message-ID: <200712142337.lBENbtiR025242@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2421
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 18:37 EST -------
Fixed in CVS, and test_BioSQL_SeqIO.py updated to verify this explicitly.
From bugzilla-daemon at portal.open-bio.org Sat Dec 15 08:47:48 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Dec 2007 08:47:48 -0500
Subject: [Biopython-dev] [Bug 2381] translate and transcibe methods for the
Seq object (in Bio.Seq)
In-Reply-To:
Message-ID: <200712151347.lBFDlmh9019619@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2381
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #795 is obsolete|0 |1
------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-15 08:47 EST -------
Created an attachment (id=836)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=836&action=view)
Patch to Bio/Seq.py
[Note this does not update the test suite or the documentation, which would be
needed if this is committed]
Adds new methods to the MutableSeq object:
- transcribe (in place)
- back_transcribe (in place)
Adds new methods to the Seq object:
- transcribe
- back_transcribe
- translate (like the python string method)
- translate_all (Biological translation)
- translate_to_stop (Biological translation up to and excluding first stop
codon)
- translate_cds (Biological translation with an initial start codon as M, up to
and excluding the first stop codon)
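To make the difference between the three biological translation methods concrete, here is a rough Python 3 sketch written as plain functions on upper-case DNA strings. It is not the attached patch: the function bodies, the use of the standard genetic code, the choice of TTG/CTG/ATG as start codons, and the ValueError for a bad first codon are all assumptions made for illustration.

```python
# Standard genetic code, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
START_CODONS = {"TTG", "CTG", "ATG"}  # assumed start codons (standard table)


def translate_all(dna):
    """Translate every complete codon, writing stop codons as '*'."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def translate_to_stop(dna):
    """Translate up to, and excluding, the first stop codon."""
    return translate_all(dna).split("*", 1)[0]


def translate_cds(dna):
    """Translate a CDS: the start codon becomes M, stop at the first stop codon."""
    if dna[:3] not in START_CODONS:
        raise ValueError("First codon %r is not a start codon" % dna[:3])
    return "M" + translate_to_stop(dna[3:])
```

For the input "TTGGCCTAAGGG", translate_all gives "LA*G", translate_to_stop gives "LA" and translate_cds gives "MA", which is the distinction the proposed method names try to capture.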
I think this would be enough to deprecate Bio.Translate and Bio.Transcribe
(after the next release).
Comments welcome - for example are these method names sensible?
Also, should the MutableSeq methods all act "in situ"? What about translation
methods for MutableSeq objects?
From bugzilla-daemon at portal.open-bio.org Fri Dec 28 11:18:54 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 28 Dec 2007 11:18:54 -0500
Subject: [Biopython-dev] [Bug 2425] New: Fasta ID parsing error
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2425
Summary: Fasta ID parsing error
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: dtomso at athenixcorp.com
Loader.py will give an error as follows when presented with an unusual FASTA
header line:
>region1.fasta.screen.Contig1
ACAGGATAGGCGGGAGCCATTGAAACCGGAGCGCTAGCTTCGGTGGAGGC
GCTGGTGGGATACCGCCCTGACTGTATTGAAATTCTAACCTACGGGTCTT
Traceback (most recent call last):
  File "biosql_driver.py", line 28, in
    db.load(SeqIO.parse(sfile, 'fasta'))
  File "/home/dtomso/repository/biopython/build/lib.linux-i686-2.5/BioSQL/BioSeqDatabase.py", line 412, in load
    db_loader.load_seqrecord(cur_record)
  File "/usr/lib/python2.5/site-packages/BioSQL/Loader.py", line 30, in load_seqrecord
    bioentry_id = self._load_bioentry_table(record)
  File "/usr/lib/python2.5/site-packages/BioSQL/Loader.py", line 214, in _load_bioentry_table
    accession, version = record.id.split('.')
ValueError: too many values to unpack
It appears to split the record ID on every '.', assuming that there is exactly
one and that it separates the accession from a version number. However, this
only works for NCBI-style identifiers. Files that deviate from this (e.g. those
produced by phrap, which produced the file above) cause this error.
I bolted on an inelegant fix by having the code check for multiple '.'
characters, in which case the version defaults to zero. Other solutions may be
preferable.
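One possible alternative, sketched in Python 3 and not taken from Loader.py: split only on the last '.', and treat the suffix as a version only when it is purely numeric. The name split_accession_version and the default version of zero are assumptions; this differs from the fix described above in that an ID with several dots can still carry a version.

```python
def split_accession_version(record_id, default_version=0):
    """Split 'ACCESSION.VERSION' without failing on other kinds of ID.

    Only an all-digit suffix after the last '.' counts as a version;
    any other ID is kept whole and gets the default version.
    """
    accession, dot, suffix = record_id.rpartition(".")
    if dot and accession and suffix.isdecimal():
        return accession, int(suffix)
    return record_id, default_version
```

With this helper, "AY243312.1" splits into ("AY243312", 1), while "region1.fasta.screen.Contig1" is returned unchanged with version 0 instead of raising ValueError.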
From tiagoantao at gmail.com Mon Dec 3 21:41:06 2007
From: tiagoantao at gmail.com (=?ISO-8859-1?Q?Tiago_Ant=E3o?=)
Date: Mon, 3 Dec 2007 21:41:06 +0000
Subject: [Biopython-dev] [Bug 2414] New: run_tests,
py fails with a single test on a test suite
In-Reply-To:
References:
Message-ID: <6d941f120712031341p1af6ca55oa04b787f8e0937@mail.gmail.com>
Hi,
Could I please ask you (I suppose Peter or Michiel) to advise on this?
I have my code for coalescent simulation ready, but I am not
committing because one of my test files has only a single test (to see
if it can run the coalescent simulator, all other tests are
non-dependent on having the simulator, so are on a different test
case).
I can either put a dummy test just to have 2 tests (hack around), or
run_test can be sorted out.
Thanks
Tiago
PS - Apologies in advance if I take too much time to respond, I will
be traveling for the next 3 days.
On Dec 1, 2007 6:25 PM, wrote:
> http://bugzilla.open-bio.org/show_bug.cgi?id=2414
>
> Summary: run_tests,py fails with a single test on a test suite
> Product: Biopython
> Version: Not Applicable
> Platform: All
> OS/Version: All
> Status: NEW
> Severity: trivial
> Priority: P2
> Component: Main Distribution
> AssignedTo: biopython-dev at biopython.org
> ReportedBy: tiagoantao at gmail.com
>
>
> When a test python file is composed of a single test, PyUnit dumps the
> following log:
> Ran 1 test in xxxs
> run_test.py on (current CVS HEAD) line 284 is only searching for the plural
> Run yy tests in xxxs
> Mini patch (not tested, but trivial)
> if expected_line[:3] == "Ran" and \
> string.find(expected_line, " tests in ") >= 5:
> becomes, eg,
> if expected_line[:3] == "Ran" and \
> (string.find(expected_line, " tests in ") >= 5 or
> string.find(expected_line, " test in ") >= 5):
>
> I actually have, for now, a single case with one test, as I split my test cases
> in depending on external binaries and not depending on external binaries
> (creating a test scenario with a single test to try to run an external
> application)
>
>
> --
> Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are the assignee for the bug, or are watching the assignee.
> _______________________________________________
> Biopython-dev mailing list
> Biopython-dev at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython-dev
>
--
http://www.tiago.org/ps
From bugzilla-daemon at portal.open-bio.org Mon Dec 3 22:04:05 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 3 Dec 2007 17:04:05 -0500
Subject: [Biopython-dev] [Bug 2414] run_tests,
py fails with a single test on a test suite
In-Reply-To:
Message-ID: <200712032204.lB3M45tn000935@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2414
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-03 17:04 EST -------
Are you talking about test_PopGen_FDist.py? I don't have fdist installed, so I
haven't found this problem yet...
In any case, your fix looks fine, although arguably a regular expression (with an
optional "s" in "tests") would be more elegant.
I am happy for you to make this change in run_tests.py
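The regular expression variant might look something like this Python 3 sketch. The names RAN_TESTS and is_summary_line are hypothetical, and this is not the code that went into run_tests.py.

```python
import re

# Matches PyUnit's summary line in both singular and plural form,
# e.g. "Ran 1 test in 0.001s" and "Ran 12 tests in 3.402s".
RAN_TESTS = re.compile(r"^Ran \d+ tests? in ")


def is_summary_line(line):
    """Return True if the line is a PyUnit 'Ran N test(s) in ...' summary."""
    return RAN_TESTS.match(line) is not None
```

Requiring digits between "Ran" and "test" is slightly stricter than the two string.find calls, since it no longer accepts arbitrary text in that position.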
From mdehoon at c2b2.columbia.edu Tue Dec 4 07:10:40 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue, 4 Dec 2007 02:10:40 -0500
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt / Bio.SeqIO
Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
Hi everybody,
I am still looking at the different code in Biopython to access SwissProt.
With Bio.SwissProt, we can access the SwissProt database as follows:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"]
# record is now a string containing the SwissProt record O23719
Another option is to pull out a Bio.SwissProt.SProt.Record object:
>>> from Bio.SwissProt import SProt
>>> s_parser = SProt.RecordParser()
>>> dictionary = SProt.ExPASyDictionary(parser=s_parser)
>>> record = dictionary["O23719"]
# record is now a Bio.SwissProt.SProt.Record object containing record O23719
A third option is to pull out a SeqRecord by using SeqIO:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"]
>>> from Bio import SeqIO
>>> import StringIO
>>> record = SeqIO.parse(StringIO.StringIO(record), "swiss").next()
# record is now a Bio.SeqRecord.SeqRecord object containing record O23719
Compare this to how we would read a Fasta file:
>>> from Bio import SeqIO
>>> input = open("mydata.fa")
>>> record = SeqIO.parse(input, "fasta").next()
For consistency with Bio.SeqIO, it would make sense for ExPASyDictionary to
return handles instead of parsed objects. Then these examples look like:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> record = dictionary["O23719"].read()
# record is now a string containing the SwissProt record O23719
To pull out a Bio.SwissProt.SProt.Record object:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> handle = dictionary["O23719"]
>>> record = SProt.parse(handle)
# record is now a Bio.SwissProt.SProt.Record object containing record O23719
To pull out a SeqRecord by using SeqIO:
>>> from Bio.SwissProt import SProt
>>> dictionary = SProt.ExPASyDictionary()
>>> handle = dictionary["O23719"]
>>> from Bio import SeqIO
>>> record = SeqIO.parse(handle, "swiss").next()
# record is now a Bio.SeqRecord.SeqRecord object containing record O23719
*If* we decide that ExPASyDictionary should return handles, *then* actually
we don't really need an ExPASyDictionary, as its behavior is then largely the
same as Bio.WWW.ExPASy.get_sprot_raw. So in short, in my opinion
Bio.SwissProt.SProt.ExPASyDictionary does not add much beyond what
Bio.WWW.ExPASy.get_sprot_raw already offers.
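The point can be illustrated with a small self-contained Python 3 sketch. Nothing here is Biopython code: get_raw is a hypothetical stand-in for a function like get_sprot_raw that returns a handle (it does no network access), and HandleDictionary is a hypothetical handle-returning dictionary.

```python
import io


def get_raw(accession):
    """Hypothetical fetch function returning a handle (no network access)."""
    return io.StringIO("ID   %s\n//\n" % accession)


class HandleDictionary:
    """Dictionary-style wrapper that only forwards lookups to get_raw."""

    def __getitem__(self, accession):
        return get_raw(accession)


# Both routes give the caller the same handle contents.
text_via_dict = HandleDictionary()["O23719"].read()
text_via_function = get_raw("O23719").read()
```

Once the dictionary returns handles, the only difference left for the caller is square brackets versus a function call.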
Any comments?
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From biopython-dev at maubp.freeserve.co.uk Tue Dec 4 10:26:52 2007
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Tue, 4 Dec 2007 10:26:52 +0000
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
In-Reply-To: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
Message-ID: <320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
> For consistency with Bio.SeqIO, it would make sense for ExPASyDictionary to
> return handles instead of parsed objects.
I agree that it would in general be simpler if our online APIs
returned handles by default. This also applies to the Bio.GenBank
methods. Of course, we should preserve existing functionality if
possible.
Another alternative is to return SeqRecords by default (via Bio.SeqIO)
but this wouldn't generalise to non-sequence files like ProSite etc.
One idea I had been thinking about was adding a new function
Bio.SeqIO.fetch(...) or Bio.SeqIO.online_fetch(...) which would act as
a proxy to all our supported online sequence databases, and either
return a handle to the requested record(s), or perhaps return
SeqRecord(s).
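As a rough Python 3 illustration of such a proxy, assuming it returns handles: the fetch function, the FETCHERS registry and the two fake back ends below are all hypothetical, and the back ends return canned text instead of contacting any server.

```python
import io


def _fake_swissprot(identifier):
    """Stand-in for an online SwissProt lookup (returns canned text)."""
    return io.StringIO("ID   %s\n//\n" % identifier)


def _fake_genbank(identifier):
    """Stand-in for an online GenBank lookup (returns canned text)."""
    return io.StringIO("LOCUS       %s\n//\n" % identifier)


# Hypothetical registry mapping a database name to its fetch function.
FETCHERS = {"swissprot": _fake_swissprot, "genbank": _fake_genbank}


def fetch(database, identifier):
    """Return a handle to the requested record from the named database."""
    try:
        fetcher = FETCHERS[database]
    except KeyError:
        raise ValueError("Unsupported database: %r" % database) from None
    return fetcher(identifier)
```

Because the result is a handle, the caller could pass it straight to a parser, or call read() to get the raw text.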
One API model would be that outlined for the (possibly defunct?) Open
Biological Database Access (OBDA) scheme, which covers both BioSQL
access and online fetching (biofetch):
http://cvs.open-bio.org/cgi-bin/viewcvs/viewcvs.cgi/obda-specs/biofetch/biofetch.txt?cvsroot=obf-common
But first I should probably finish working on BioSQL ;)
> *If* we decide that ExPASyDictionary should return handles, *then* actually
> we don't really need an ExPASyDictionary, as its behavior is then largely the
> same as Bio.WWW.ExPASy.get_sprot_raw. So in short, in my opinion
> Bio.SwissProt.SProt.ExPASyDictionary does not add much beyond what
> Bio.WWW.ExPASy.get_sprot_raw already offers.
Can ExPASyDictionary return anything that get_sprot_raw can't?
Otherwise, from the user's point of view, it's just a coding style issue
(dictionary versus function).
Peter
From bugzilla-daemon at portal.open-bio.org Tue Dec 4 10:41:25 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 4 Dec 2007 05:41:25 -0500
Subject: [Biopython-dev] [Bug 2414] run_tests,
py fails with a single test on a test suite
In-Reply-To:
Message-ID: <200712041041.lB4AfPTN008806@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2414
tiagoantao at gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
------- Comment #2 from tiagoantao at gmail.com 2007-12-04 05:41 EST -------
> Are you talking about test_PopGen_FDist.py? I don't have fdist installed, so I
> haven't found this problem yet...
No, it is my new SimCoal code.
> In any case, your fix looks fine, although arguably a regular expression (with an
> optional "s" in "tests") would be more elegant.
>
> I am happy for you to make this change in run_tests.py
OK, I will do this with a regex. I cannot promise when, though, as I am
traveling until Saturday (but it will be before next Monday).
From bugzilla-daemon at portal.open-bio.org Tue Dec 4 19:43:17 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 4 Dec 2007 14:43:17 -0500
Subject: [Biopython-dev] [Bug 2412] NCBIXML. fails parsing with blast 2.2.15
in special cases (Karlin-Altschul)
In-Reply-To:
Message-ID: <200712041943.lB4JhHkE012059@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2412
------- Comment #4 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-04 14:43 EST -------
The fact that your example gives an empty XML file is essentially due to some
problem with Blast. I agree that the Biopython error message you quoted is
very unhelpful in this situation.
Are you using Biopython 1.43 (as suggested by the stack trace in the error
report), or Biopython 1.44 as reported in the bug details?
What does this do on your setup?
from StringIO import StringIO
from Bio.Blast import NCBIXML
handle = StringIO("")
for record in NCBIXML.parse(handle) :
    print record
If you are using Biopython 1.44 or later you should get a helpful error
message, "ValueError: Your XML file was empty". You can catch this, and inspect
the contents of the error handle if you want to deal with this in your
application.
i.e. I think this bug has already been fixed in Biopython 1.44
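The catch-and-continue pattern mentioned above might look roughly like this. To keep the example runnable without Biopython or BLAST, parse_records is a hypothetical stand-in for NCBIXML.parse that only reproduces the empty-input check; safe_parse is likewise a made-up helper name.

```python
from io import StringIO


def parse_records(handle):
    """Hypothetical stand-in for NCBIXML.parse: rejects an empty handle."""
    text = handle.read()
    if not text.strip():
        raise ValueError("Your XML file was empty")
    # A real parser would yield one record per query; this stand-in just
    # yields the raw text so the sketch stays self-contained.
    yield text


def safe_parse(handle):
    """Return the parsed records, or an empty list if the input was empty."""
    try:
        # The generator only raises once iteration starts, so list() must
        # sit inside the try block.
        return list(parse_records(handle))
    except ValueError as err:
        print("Skipping BLAST output: %s" % err)
        return []
```

In a real application the except branch would be the place to look at the BLAST error handle.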
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From bugzilla-daemon at portal.open-bio.org Tue Dec 4 20:25:45 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 4 Dec 2007 15:25:45 -0500
Subject: [Biopython-dev] [Bug 2396] BioSQL loader does not store sequence
level annotations dict
In-Reply-To:
Message-ID: <200712042025.lB4KPj2D016252@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2396
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-04 15:25 EST -------
I think I have fixed this now in CVS.
One related wrinkle is that if you had this:
record.annotations["example1"] == "string"
record.annotations["example2"] == ["alpha"]
record.annotations["example3"] == ["alpha", "beta"]
after loading and retrieving from BioSQL you have this:
record.annotations["example1"] == ["string"]
record.annotations["example2"] == ["alpha"]
record.annotations["example3"] == ["alpha", "beta"]
i.e. Everything becomes a list of strings.
It is difficult to see how to deal with this elegantly given the current BioSQL
schema. One option is to treat single entries as either a list or a string
depending on the rank field in the database... I should probably take this up
with the BioSQL mailing list to see how/if this issue affects BioPerl/BioJava.
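To illustrate the wrinkle, the round trip behaves roughly like the helper below. This is a stand-alone sketch of the observed effect, not the BioSQL loader code; the name as_loaded_from_biosql is made up for the example.

```python
def as_loaded_from_biosql(annotations):
    """Mimic the round trip described above: every value comes back as a list."""
    loaded = {}
    for key, value in annotations.items():
        if isinstance(value, list):
            loaded[key] = list(value)   # lists survive unchanged
        else:
            loaded[key] = [value]       # a bare string becomes a one-item list
    return loaded
```

Code comparing annotations before and after a round trip would therefore need to treat "string" and ["string"] as equivalent.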
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From mdehoon at c2b2.columbia.edu Wed Dec 5 01:13:01 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue, 4 Dec 2007 20:13:01 -0500
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
> One idea I had been thinking about was adding a new function
> Bio.SeqIO.fetch(...) or Bio.SeqIO.online_fetch(...) which would act as
> a proxy to all our supported online sequence databases, and either
> return a handle to the requested record(s), or perhaps return
> SeqRecord(s).
I believe that Bio.db has such a functionality, but I don't think it is used
much.
Anyway, we currently have too many functions in Biopython to access databases
rather than too few.
So I think we should not add any new ones.
> > *If* we decide that ExPASyDictionary should return handles, *then* actually
> > we don't really need an ExPASyDictionary, as its behavior is then largely the
> > same as Bio.WWW.ExPASy.get_sprot_raw. So in short, in my opinion
> > Bio.SwissProt.SProt.ExPASyDictionary does not add much beyond what
> > Bio.WWW.ExPASy.get_sprot_raw already offers.
>
> Can ExPASyDictionary return anything that get_sprot_raw can't?
> Otherwise from the user's point of view it's just a coding style issue
> (dictionary versus function).
ExPASyDictionary is just a wrapper around get_sprot_raw, so get_sprot_raw can
return any record that ExPASyDictionary can return.
There are two differences between the two:
1) ExPASyDictionary behaves as a dictionary, get_sprot_raw as a function. As
you write, this is just a coding style issue.
2) When creating an ExPASyDictionary, users can pass a parser to parse the
records before returning them. This is in essence only a coding style issue.
In particular, do we want:
>>> from Bio.SwissProt import SProt
>>> sprot_parser = SProt.RecordParser()
>>> dictionary = SProt.ExPASyDictionary(parser = sprot_parser)
>>> record = dictionary["O12345"]
or
>>> from Bio.SwissProt import SProt
>>> from Bio import ExPASy
>>> handle = ExPASy.get_sprot_raw("O12345")
>>> record = SProt.parse(handle)
For SeqRecords, in the ExPASyDictionary approach we'd use a different parser,
in the get_sprot_raw approach we call SeqIO.parse instead of SProt.parse.
For plain-text output, in the ExPASyDictionary approach we pass no parser,
and in the get_sprot_raw approach we call read() on the handle directly.
To get a handle, in the ExPASyDictionary approach we can use StringIO to
convert the text output to a handle; in the get_sprot_raw approach we don't
need to do anything.
In my opinion, both 1) and 2) are just coding style issues. Maintaining both
ExPASyDictionary and get_sprot_raw is a burden for the developers, and causes
confusion for users. So I suggest we focus on one of these, and deprecate the
other.
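To make the comparison concrete, here is a rough sketch of the wrapper relationship. Everything in it is illustrative: fetch_raw is a hypothetical stand-in for get_sprot_raw that does no network access, FetchDictionary is a simplified imitation of the dictionary idea, and the parser is taken to be a plain callable rather than a parser object.

```python
from io import StringIO


def fetch_raw(accession):
    """Hypothetical stand-in for get_sprot_raw: returns a handle, no network."""
    return StringIO("ID   %s\n//\n" % accession)


class FetchDictionary:
    """Dictionary-style wrapper around fetch_raw with an optional parser."""

    def __init__(self, parser=None):
        self.parser = parser

    def __getitem__(self, accession):
        handle = fetch_raw(accession)
        if self.parser is None:
            return handle.read()       # plain text when no parser is given
        return self.parser(handle)     # otherwise the parser gets the handle
```

The class adds nothing that a caller could not do by applying the parser to the handle directly, which is the sense in which the difference is one of coding style.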
The ExPASy.get_sprot_raw approach seems closer to how Bio.SeqIO is organized,
and therefore has my preference.
Two more issues:
1) I am not sure why the SwissProt code is kept in a separate SProt submodule
of Bio.SwissProt. Currently, Bio/SwissProt/__init__.py is empty, so we can
save ourselves some typing by keeping all the SwissProt code there instead of
in SProt.py.
2) A SwissProt.parse function currently doesn't exist. Right now it is a
three-step process:
>>> s_parser = SProt.RecordParser()
>>> s_iterator = SProt.Iterator(handle, s_parser)
>>> record = s_iterator.next()
A SwissProt.parse function would just contain these three steps, or
perhaps only the first two.
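A minimal sketch of such a convenience function, bundling the first two steps. RecordParser and Iterator below are simplified stand-ins written for this example (they are not the SProt classes), so that the snippet runs without Biopython; only the shape of parse() is the point.

```python
from io import StringIO


class RecordParser:
    """Stand-in for a record parser: parses the text of one record."""

    def parse(self, handle):
        return handle.read().split()


class Iterator:
    """Stand-in for a record iterator: one parsed record per '//' block."""

    def __init__(self, handle, parser):
        self.handle = handle
        self.parser = parser

    def __iter__(self):
        block = []
        for line in self.handle:
            block.append(line)
            if line.startswith("//"):   # "//" terminates a record
                yield self.parser.parse(StringIO("".join(block)))
                block = []


def parse(handle):
    """Build the parser and return an iterator over the records."""
    return iter(Iterator(handle, RecordParser()))
```

Callers would then write "for record in parse(handle):" without having to know about the parser object at all.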
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From biopython-dev at maubp.freeserve.co.uk Wed Dec 5 10:03:34 2007
From: biopython-dev at maubp.freeserve.co.uk (Peter)
Date: Wed, 5 Dec 2007 10:03:34 +0000
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
In-Reply-To: <6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
<6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
Message-ID: <320fb6e00712050203p17aa38b0q15d2edd65542021d@mail.gmail.com>
On 12/5/07, Michiel De Hoon wrote:
> > One idea I had been thinking about was adding a new function
> > Bio.SeqIO.fetch(...) or Bio.SeqIO.online_fetch(...) which would act as
> > a proxy to all our supported online sequence databases, and either
> > return a handle to the requested record(s), or perhaps return
> > SeqRecord(s).
>
> I believe that Bio.db has such a functionality, but I don't think it is used
> much. Anyway, we currently have too many functions in Biopython to
> access databases rather than too few. So I think we should not add any
> new ones.
Certainly before taking my suggestion seriously we should try and take
stock of where we stand at the moment with respect to online
databases.
> > Can ExPASyDictionary return anything that get_sprot_raw can't?
> > Otherwise from the user's point of view it's just a coding style issue
> > (dictionary versus function).
>
> ExPASyDictionary is just a wrapper around get_sprot_raw, so get_sprot_raw can
> return any record that ExPASyDictionary can return.
> There are two differences between the two:
> 1) ExPASyDictionary behaves as a dictionary, get_sprot_raw as a function. As
> you write, this is just a coding style issue.
> 2) When creating an ExPASyDictionary, users can pass a parser to parse the
> records before returning them. This is in essence only a coding style issue.
> In particular, do we want:
> >>> from Bio.SwissProt import SProt
> >>> sprot_parser = SProt.RecordParser()
> >>> dictionary = SProt.ExPASyDictionary(parser = sprot_parser)
> >>> record = dictionary["O12345"]
> or
> >>> from Bio.SwissProt import SProt
> >>> from Bio import ExPASy
> >>> handle = ExPASy.get_sprot_raw("O12345")
> >>> record = SProt.parse(handle)
Or do we want to encourage Bio.SeqIO (which happens to call
Bio.SwissProt.SProt internally)?
>>> from Bio import SeqIO
>>> from Bio import ExPASy
>>> handle = ExPASy.get_sprot_raw("O12345")
>>> record = SeqIO.parse(handle, "swiss")
This is the style I prefer (and is very similar to the related
examples I added to the tutorial). It separates fetching the data (as
a handle) and parsing it (via SeqIO).
> For SeqRecords, in the ExPASyDictionary approach we'd use a different parser,
> in the get_sprot_raw approach we call SeqIO.parse instead of SProt.parse.
> For plain-text output, in the ExPASyDictionary approach we pass no parser,
> and in the get_sprot_raw approach we call read() on the handle directly.
> To get a handle, in the ExPASyDictionary approach we can use StringIO to
> convert the text output to a handle; in the get_sprot_raw approach we don't
> need to do anything.
>
> In my opinion, both 1) and 2) are just coding style issues. Maintaining both
> ExPASyDictionary and get_sprot_raw is a burden for the developers, and causes
> confusion for users. So I suggest we focus on one of these, and deprecate the
> other.
As ExPASyDictionary just wraps get_sprot_raw with a parser
object, the additional overhead is minimal. The dictionary metaphor
is quite nice - even if you don't actually gain much functionality.
However, setting up the dictionary as it is now (requiring an "old
fashioned" parser object) is fairly fiddly/confusing.
> The ExPASy.get_sprot_raw approach seems closer to how Bio.SeqIO is
> organized, and therefore has my preference.
I would agree: if you wanted to deprecate one, I would keep
get_sprot_raw and drop ExPASyDictionary. However we should try and
have a coherent API for the other online tools as well.
> Two more issues:
> 1) I am not sure why the SwissProt code is kept in a separate SProt submodule
> of Bio.SwissProt. Currently, Bio/SwissProt/__init__.py is empty, so we can
> save ourselves some typing by keeping all the SwissProt code there instead of
> in SProt.py.
Or just encourage using it via Bio.SeqIO (then we can move things
later if wanted).
> 2) A SwissProt.parse function currently doesn't exist. Right now it is a
> three-step process:
> >>> s_parser = SProt.RecordParser()
> >>> s_iterator = SProt.Iterator(handle, s_parser)
> >>> record = s_iterator.next()
> A SwissProt.parse function would just contain these three steps, or
> perhaps only the first two.
The Bio.SeqIO.parse() is very close though.
Peter
From mdehoon at c2b2.columbia.edu Wed Dec 5 10:29:38 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Wed, 5 Dec 2007 05:29:38 -0500
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt /
Bio.SeqIO
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu><320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com><6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
<320fb6e00712050203p17aa38b0q15d2edd65542021d@mail.gmail.com>
Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B671@mail2.exch.c2b2.columbia.edu>
> Or do we want to encourage Bio.SeqIO (which happens to call
> Bio.SwissProt.SProt internally)?
>
> >>> from Bio import SeqIO
> >>> from Bio import ExPASy
> >>> handle = ExPASy.get_sprot_raw("O12345")
> >>> record = SeqIO.parse(handle, "swiss")
>
> This is the style I prefer (and is very similar to the related
> examples I added to the tutorial). It separates fetching the data (as
> a handle) and parsing it (via SeqIO).
SeqIO.parse returns a SeqRecord; a SwissProt.parse returns a
SwissProt.SProt.Record.
Does the SeqRecord contain the same information as a SwissProt.SProt.Record?
Or is some information lost?
If they contain the same information, then I am in favor of encouraging
Bio.SeqIO.
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From mdehoon at c2b2.columbia.edu Fri Dec 7 10:11:33 2007
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 7 Dec 2007 05:11:33 -0500
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt
/Bio.SeqIO
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu> <320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com> <6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
<475691C1.3020705@maubp.freeserve.co.uk>
Message-ID: <6243BAA9F5E0D24DA41B27997D1FD14402B673@mail2.exch.c2b2.columbia.edu>
Hi everybody,
To summarize, I rewrote the chapter on SwissProt/Prosite/Prodoc/ExPASy and
put it here:
http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html#htoc51
(chapter 6 in the tutorial)
This is merely a proposal on how this should work; none of this is in CVS
yet.
Please let us know if you have any objections.
If there are no objections, I can upload the new code to CVS. That would
conclude my work on Bio.WWW.ExPASy; the final (and biggest) part of my work
on Bio.WWW will be to look at the various Biopython modules to interact with
NCBI (Genbank, EUtils).
Two comments:
1) In this proposal, I am using SwissProt.parse instead of SeqIO.parse since
the latter does not (yet) store all information contained in a SwissProt
file. I'd be happy though to move to SeqIO.parse for SwissProt also once it
does.
2) It may be nice to have a SwissProt.read and SeqIO.read to read and return
exactly one record from the handle, in addition to parse() to create an
iterator to read multiple records.
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From biopython at maubp.freeserve.co.uk Fri Dec 7 10:46:32 2007
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 7 Dec 2007 10:46:32 +0000
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt
/Bio.SeqIO
In-Reply-To: <6243BAA9F5E0D24DA41B27997D1FD14402B673@mail2.exch.c2b2.columbia.edu>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
<6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
<475691C1.3020705@maubp.freeserve.co.uk>
<6243BAA9F5E0D24DA41B27997D1FD14402B673@mail2.exch.c2b2.columbia.edu>
Message-ID: <320fb6e00712070246g53e8096ew156f4502791bce9b@mail.gmail.com>
> To summarize, I rewrote the chapter on SwissProt/Prosite/Prodoc/ExPASy and
> put it here:
>
> http://biopython.org/DIST/docs/tutorial/Tutorial-proposal.html#htoc51
> (chapter 6 in the tutorial)
>
> This is merely a proposal on how this should work; none of this is in CVS
> yet. Please let us know if you have any objections.
I would add a note saying doing it this way gives
Bio.SwissProt.SProt.Record objects, while you could alternatively get
SeqRecord objects as described in the SeqIO chapter (use a reference).
> If there are no objections, I can upload the new code to CVS. That would
> conclude my work on Bio.WWW.ExPASy; the final (and biggest) part of my work
> on Bio.WWW will be to look at the various Biopython modules to interact with
> NCBI (Genbank, EUtils).
That will be "fun"!
> Two comments:
> 1) In this proposal, I am using SwissProt.parse instead of SeqIO.parse since
> the latter does not (yet) store all information contained in a SwissProt
> file. I'd be happy though to move to SeqIO.parse for SwissProt also once it
> does.
> 2) It may be nice to have a SwissProt.read and SeqIO.read to read and return
> exactly one record from the handle, in addition to parse() to create an
> iterator to read multiple records.
I'd suggested a Bio.SeqIO function, with a name like parse1() or
parse_sole() etc which would return a single SeqRecord - and raise an
error if the handle didn't contain one and only one record. We could
call this function read() if you prefer.
Peter
From mdehoon at c2b2.columbia.edu Sat Dec 8 03:18:09 2007
From: mdehoon at c2b2.columbia.edu (Michiel de Hoon)
Date: Sat, 08 Dec 2007 12:18:09 +0900
Subject: [Biopython-dev] Accessing ExPASy through Bio.SwissProt
/Bio.SeqIO
In-Reply-To: <320fb6e00712070246g53e8096ew156f4502791bce9b@mail.gmail.com>
References: <6243BAA9F5E0D24DA41B27997D1FD14402B66F@mail2.exch.c2b2.columbia.edu>
<320fb6e00712040226o7ecda7e2g9fb124b3a52de026@mail.gmail.com>
<6243BAA9F5E0D24DA41B27997D1FD14402B670@mail2.exch.c2b2.columbia.edu>
<475691C1.3020705@maubp.freeserve.co.uk>
<6243BAA9F5E0D24DA41B27997D1FD14402B673@mail2.exch.c2b2.columbia.edu>
<320fb6e00712070246g53e8096ew156f4502791bce9b@mail.gmail.com>
Message-ID: <475A0CF1.1080802@c2b2.columbia.edu>
Peter wrote:
> I would add a note saying doing it this way gives
> Bio.SwissProt.SProt.Record objects,
> while you could alternatively get SeqRecord objects as described in
> the SeqIO chapter
> (use a reference).
OK I will add that.
>
> I'd suggested a Bio.SeqIO function, with a name like parse1() or
> parse_sole() etc. which would return a single SeqRecord - and raise an
> error if the handle didn't contain one and only one record. We could
> call this function read() if you prefer.
>
I'd prefer read() instead of parse1(), parse_sole() etc. for the
following reasons:
1) Having two names that are clearly different emphasizes the fact that
they return different things (parse() returns an iterator, read() a record).
2) Some modules deal with data that always consist of one record (for
example, gene expression data in case of Bio.Cluster). Such modules can
have a read() function but not a parse(). It would feel strange if a
module has a parse1() function but not a parse().
--Michiel.
From bugzilla-daemon at portal.open-bio.org Sat Dec 8 13:09:00 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 8 Dec 2007 08:09:00 -0500
Subject: [Biopython-dev] [Bug 2417] New: Bio.SeqIO single SeqRecord
read/parse function
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
Summary: Bio.SeqIO single SeqRecord read/parse function
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
Most sequence file formats can contain a single record, and in this situation
having to use an iterator returned by Bio.SeqIO.parse() can be clumsy.
For example, dealing with GenBank files for bacterial genomes or chromosomes.
Or, from the tutorial as of Biopython 1.44,
from Bio.WWW import ExPASy
from Bio import SeqIO
seq_record = SeqIO.parse(ExPASy.get_sprot_raw("O23729"), "swiss").next()
print seq_record.id
print seq_record.seq
print len(seq_record.seq)
Using the iterator.next() method as above works fine, but it will silently
ignore any unexpected subsequent records if present. Checking that your file
has only one record would require an additional check to confirm that a second
.next() call fails, or another such workaround.
I am proposing a new function for use with a handle containing one and only one
record. This would raise an error if the handle contained no records, or if it
contained more than one record. It would be defined in Bio/SeqIO/__init__.py
as a simple wrapper for Bio.SeqIO.parse()
Note - My proposed "read single record" function would NOT work for cases where
the handle contains multiple records and you only want the first one (because I
would raise an exception). I would regard this as a corner case, and catering
to this risks silently ignoring unexpected second and subsequent records in
other use cases. In such situations using Bio.SeqIO.parse(...).next() is
advised.
I had previously suggested "parse1", "parse_sole", "parse_only" - none of which
are very appealing. On the dev mailing list today, Michiel has proposed
"read":
Michiel de Hoon wrote:
>
> Peter wrote:
> > I'd suggested a Bio.SeqIO function, with a name like parse1() or
> > parse_sole() etc which would return a single SeqRecord - and raise
> > an error if the handle didn't contain one and only one record. We
> > could call this function read() if you prefer.
> >
> I'd prefer read() instead of parse1(), parse_sole() etc. for the
> following reasons:
>
> 1) Having two names that are clearly different emphasizes the fact that
> they return different things (parse() returns an iterator, read() a record).
>
> 2) Some modules deal with data that always consist of one record (for
> example, gene expression data in case of Bio.Cluster). Such modules can
> have a read() function but not a parse(). It would feel strange if a
> module has a parse1() function but not a parse().
I plan to add this functionality to Bio/SeqIO/__init__.py as a "read" function,
and update the tutorial accordingly shortly.
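
As a rough illustration of the behaviour proposed above (not the code that
went into Bio/SeqIO/__init__.py), a wrapper of this kind might look like the
sketch below. The name read_single and the use of a plain iterator in place of
the one returned by Bio.SeqIO.parse() are assumptions made so that the example
is self-contained.

```python
def read_single(records):
    """Return the only item from an iterator, or raise ValueError.

    `records` stands in for the iterator returned by Bio.SeqIO.parse();
    the function name is hypothetical.
    """
    iterator = iter(records)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No records found in handle")
    try:
        next(iterator)
    except StopIteration:
        # Exactly one record was present, which is the only accepted case.
        return first
    raise ValueError("More than one record found in handle")
```

With this shape, a caller who wants only the first of several records would
still take the first item from the parse() iterator directly, as the note
above advises.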
--
Configure bugmail: http://bugzilla.open-bio.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
From p.j.a.cock at googlemail.com Sat Dec 8 13:10:33 2007
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 8 Dec 2007 13:10:33 +0000
Subject: [Biopython-dev] Bio.SeqIO function to read a single record
Message-ID: <320fb6e00712080510k3d4e5148gb0ec332a0d745452@mail.gmail.com>
Michiel de Hoon wrote:
> >
> > I'd suggested a Bio.SeqIO function, with a name like parse1() or
> > parse_sole() etc which would return a single SeqRecord - and raise
> > an error if the handle didn't contain one and only one record. We
> > could call this function read() if you prefer.
> >
> I'd prefer read() instead of parse1(), parse_sole() etc. for the
> following reasons:
>
> 1) Having two names that are clearly different emphasizes the fact that
> they return different things (parse() returns an iterator, read() a record).
>
> 2) Some modules deal with data that always consist of one record (for
> example, gene expression data in case of Bio.Cluster). Such modules can
> have a read() function but not a parse(). It would feel strange if a
> module has a parse1() function but not a parse().
OK. I've filed an enhancement bug, which I'll mention on the main mailing list,
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
Unless there is some negative feedback, I'll add that functionality shortly.
Peter
From bugzilla-daemon at portal.open-bio.org Sun Dec 9 16:24:19 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Dec 2007 11:24:19 -0500
Subject: [Biopython-dev] [Bug 2417] Bio.SeqIO single SeqRecord read/parse
function
In-Reply-To:
Message-ID: <200712091624.lB9GOJCe025680@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-09 11:24 EST -------
Updated Bio/SeqIO/__init__.py to include the new "read" function in CVS
revision 1.21.
I'll do the documentation and unit tests next, before marking this as fixed.
[It's not yet too late to change the name from "read" if anyone can come up with
a nice clear alternative, or a strong argument against this choice]
From bugzilla-daemon at portal.open-bio.org Sun Dec 9 18:50:06 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Dec 2007 13:50:06 -0500
Subject: [Biopython-dev] [Bug 2417] Bio.SeqIO single SeqRecord read/parse
function
In-Reply-To:
Message-ID: <200712091850.lB9Io6tj013469@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-09 13:50 EST -------
I've updated the tutorial, wiki and unit test.
From bugzilla-daemon at portal.open-bio.org Sun Dec 9 19:03:28 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Dec 2007 14:03:28 -0500
Subject: [Biopython-dev] [Bug 2412] NCBIXML. fails parsing with blast 2.2.15
in special cases (Karlin-Altschul)
In-Reply-To:
Message-ID: <200712091903.lB9J3SkM014338@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2412
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |WORKSFORME
------- Comment #5 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-09 14:03 EST -------
As per my comment 4, I think that in Biopython 1.44 we look for the special
case of an empty XML output file and raise a ValueError. On Biopython 1.43 the
error was very unhelpful.
I'm marking this as "works for me".
Bjoern, please reopen this bug if there is still a problem using Biopython 1.44
Thanks, Peter.
From bugzilla-daemon at portal.open-bio.org Mon Dec 10 01:18:50 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sun, 9 Dec 2007 20:18:50 -0500
Subject: [Biopython-dev] [Bug 2418] New: SyntaxError should be ValueError
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
Summary: SyntaxError should be ValueError
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: mdehoon at ims.u-tokyo.ac.jp
Biopython now has SyntaxErrors all over the place. Most if not all of these
should be ValueErrors. SyntaxErrors are appropriate if there is a syntax
problem in the code itself, not (as it's used in Biopython) if there is a
syntax problem in an input data file.
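
To illustrate the distinction, here is a minimal sketch of a parser that
reports malformed input data with ValueError. The function parse_fasta_titles
and its simplified FASTA handling are hypothetical and are not taken from
Biopython.

```python
def parse_fasta_titles(lines):
    """Yield the title of each record in a simplified FASTA-like input.

    Hypothetical helper for illustration only. A problem in the *data*
    is reported as ValueError; SyntaxError is left for Python source code.
    """
    seen_title = False
    for number, line in enumerate(lines, 1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            seen_title = True
            yield line[1:].strip()
        elif not seen_title:
            # Sequence data before any '>' title line: the file is malformed.
            raise ValueError(
                "Line %i: expected a '>' title line, found %r" % (number, line)
            )
```

Callers can then catch ValueError around the parsing loop without also
swallowing genuine syntax errors in their own code.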
From bugzilla-daemon at portal.open-bio.org Mon Dec 10 10:01:49 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 10 Dec 2007 05:01:49 -0500
Subject: [Biopython-dev] [Bug 2418] SyntaxError should be ValueError
In-Reply-To:
Message-ID: <200712101001.lBAA1nxL011529@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-10 05:01 EST -------
That would be my fault.
Should we introduce a Biopython "FormatSyntaxError" exception (as a subclass of
ValueError defined in Bio/__init__.py), or just switch these to ValueError
exceptions instead?
From bugzilla-daemon at portal.open-bio.org Mon Dec 10 12:13:16 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Mon, 10 Dec 2007 07:13:16 -0500
Subject: [Biopython-dev] [Bug 2418] SyntaxError should be ValueError
In-Reply-To:
Message-ID: <200712101213.lBACDGLG022397@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
------- Comment #2 from mdehoon at ims.u-tokyo.ac.jp 2007-12-10 07:13 EST -------
> Should we introduce a Biopython "FormatSyntaxError" exception (as a subclass of
> ValueError defined in Bio/__init__.py), or just switch these to ValueError
> exceptions instead?
I would stick to ValueError. The error message should be clear enough for the
user to understand what the problem is.
From bugzilla-daemon at portal.open-bio.org Tue Dec 11 11:44:33 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Dec 2007 06:44:33 -0500
Subject: [Biopython-dev] [Bug 2418] SyntaxError should be ValueError
In-Reply-To:
Message-ID: <200712111144.lBBBiXrZ014612@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-11 06:44 EST -------
I've just fixed the Bio.SeqIO, Bio.GenBank, Bio.SwissProt and Bio.SCOP cases
and their test cases.
I see you've found and fixed a whole lot more - it's clearly not just me that
used the SyntaxError exception in this way.
We should probably also change Bio.Medline, Bio.Prosite and Bio.Blast.
I think the cases in Bio.config are a little different...
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 02:54:47 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Tue, 11 Dec 2007 21:54:47 -0500
Subject: [Biopython-dev] [Bug 2418] SyntaxError should be ValueError
In-Reply-To:
Message-ID: <200712120254.lBC2slIL022573@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2418
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #4 from mdehoon at ims.u-tokyo.ac.jp 2007-12-11 21:54 EST -------
I have replaced the SyntaxErrors by ValueErrors where appropriate. The
remaining SyntaxErrors, as far as I can tell, are being used correctly. Closing
this bug.
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 15:07:12 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 10:07:12 -0500
Subject: [Biopython-dev] [Bug 2419] New: SeqUtils __init__.py missing
complement function (v1.43 and v1.44)
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2419
Summary: SeqUtils __init__.py missing complement function (v1.43
and v1.44)
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: justin.t.riley at gmail.com
This issue exists in both 1.43 and 1.44. You won't notice this bug on an
import of SeqUtils. However, when you try to use the six_frame_translations
function like so:
from Bio import SeqUtils
SeqUtils.six_frame_translations('GTCA....AAT')
you get:
NameError: global name 'complement' is not defined
at line 285 (for version 1.43 anyhow)
At first I searched all the Biopython modules for a "def complement" string and
found one in Seq but it was for the complement of an actual Seq object.
Looking around the web I found:
from Bio.Data import IUPACData

def complement(seq):
    """Return the complementary sequence (NOT antiparallel)."""
    return ''.join([IUPACData.ambiguous_dna_complement[x] for x in seq])
Pasting the above in Bio/SeqUtils/__init__.py solved the issue for me. Thanks.
~jtriley
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 20:33:43 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 15:33:43 -0500
Subject: [Biopython-dev] [Bug 2417] Bio.SeqIO single SeqRecord read/parse
function
In-Reply-To:
Message-ID: <200712122033.lBCKXhxd020792@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2417
mmokrejs at ribosome.natur.cuni.cz changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mmokrejs at ribosome.natur.cuni
| |.cz
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 21:48:03 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 16:48:03 -0500
Subject: [Biopython-dev] [Bug 2390] Error importing Swiss Prot in BioSQL
In-Reply-To:
Message-ID: <200712122148.lBCLm3iH025664@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2390
------- Comment #19 from Biosql at hotmail.com 2007-12-12 16:48 EST -------
Hi Peter,
I know it's been a very long time (more than a month), but I had this huge exam
to prepare.
Anyway, I've tried the latest version and everything is working fine.
Many, many thanks to you!
Since none of the Swiss Prot cross-references are uploaded into the BioSQL DB,
I've tried to parse the flat file with the RecordParser method from SProt
instead of the SequenceParser or the SeqIO parser, but I'm getting an error.
I've seen in the bug list that you seem to be working on this issue.
Am I right? If not, is there a way to upload the Swiss Prot cross-references?
Again, thank you!
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 22:01:47 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 17:01:47 -0500
Subject: [Biopython-dev] [Bug 2390] Error importing Swiss Prot in BioSQL
In-Reply-To:
Message-ID: <200712122201.lBCM1lGR026457@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2390
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|FIXED |
------- Comment #20 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-12 17:01 EST -------
Hi Jonathan,
I'm glad we've fixed the error for you. Could you be a little more precise
about what isn't working with getting Swiss Prot cross-references into BioSQL?
e.g. Pick a specific SwissProt record, and quote the lines from the file
containing the cross-references.
That should be enough for me to try and track down what's going on.
By the way - if you want to work with BioSQL, you have to use SeqRecord objects
(e.g. from the Bio.SeqIO parser), and not the Bio.SwissProt.SProt.Record
objects. This probably explains the error you mentioned using the RecordParser
parser instead.
Peter
From bugzilla-daemon at portal.open-bio.org Wed Dec 12 22:17:36 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 17:17:36 -0500
Subject: [Biopython-dev] [Bug 2390] Error importing Swiss Prot in BioSQL
In-Reply-To:
Message-ID: <200712122217.lBCMHaBK027220@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2390
------- Comment #21 from Biosql at hotmail.com 2007-12-12 17:17 EST -------
(In reply to comment #20)
> Hi Jonathan,
>
> I'm glad we've fixed the error for you. Could you be a little more precise
> about what isn't working with getting Swiss Prot cross-references into BioSQL?
>
> e.g. Pick a specific SwissProt record, and quote the lines from the file
> containing the cross-references.
>
> That should be enough for me to try and track down what's going on.
>
> By the way - if you want to work with BioSQL, you have to use SeqRecord objects
> (e.g. from the Bio.SeqIO parser), and not the Bio.SwissProt.SProt.Record
> objects. This probably explains the error you mentioned using the RecordParser
> parser instead.
>
> Peter
>
Sorry for the lack of information.
Here's an example: http://ca.expasy.org/uniprot/Q9CQD1.txt
All the sequences, ID line, AC lines and comments (CC lines) are being uploaded
into the database, but not the DR lines (which I consider the most interesting
cross-references), the PubMed references (R_ lines) or the taxon of the
protein.
I don't suppose the FT lines can be uploaded too, can they?
If so, it would be awesome!
Just to be clear, this uploading pattern is not specific to this
protein (Rab5a); it applies to all the Swiss Prot proteins.
Do you need anything else?
Jonathan
From bugzilla-daemon at portal.open-bio.org Thu Dec 13 00:42:28 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Wed, 12 Dec 2007 19:42:28 -0500
Subject: [Biopython-dev] [Bug 2419] SeqUtils __init__.py missing complement
function (v1.43 and v1.44)
In-Reply-To:
Message-ID: <200712130042.lBD0gSdm001952@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2419
------- Comment #1 from mdehoon at ims.u-tokyo.ac.jp 2007-12-12 19:42 EST -------
The "complement" and similar functions were removed from Bio.SeqUtils in
Biopython 1.43 because similar functionality existed in several places in
Biopython. Apparently, we missed this call to complement in the
six_frame_translations function. I would like to avoid adding this function
back to SeqUtils. Instead, we can use the reverse_complement function in
Bio.Seq, and take its reverse.
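
The idea can be sketched on plain strings as follows. The small
reverse_complement helper here is a stand-in written for this example
(unambiguous DNA only), not the Bio.Seq implementation.

```python
# Stand-in complement table for unambiguous DNA (an assumption for this sketch).
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Stand-in for a reverse_complement function, for plain strings."""
    return "".join(_COMPLEMENT[base] for base in reversed(sequence.upper()))


def complement(sequence):
    """Complement obtained by reversing the reverse complement."""
    return reverse_complement(sequence)[::-1]


print(complement("GTCA"))  # prints CAGT
```

Reversing the reverse complement undoes the change of direction and leaves
only the base substitution, so no separate complement function is needed.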
Could you double-check if the revised version of Bio.SeqUtils.__init__.py works
for you? You can pick it up from here:
http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio/SeqUtils/__init__.py?rev=1.14&cvsroot=biopython&content-type=text/plain
From bugzilla-daemon at portal.open-bio.org Thu Dec 13 16:09:27 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Dec 2007 11:09:27 -0500
Subject: [Biopython-dev] [Bug 2419] SeqUtils __init__.py missing complement
function (v1.43 and v1.44)
In-Reply-To:
Message-ID: <200712131609.lBDG9R7u027690@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2419
------- Comment #2 from justin.t.riley at gmail.com 2007-12-13 11:09 EST -------
(In reply to comment #1)
> The "complement" and similar functions were removed from Bio.SeqUtils in
> Biopython 1.43 because similar functionality existed in several places in
> Biopython. Apparently, we missed this call to complement in the
> six_frame_translations function. I would like to avoid adding this function
> back to SeqUtils. Instead, we can use the reverse_complement function in
> Bio.Seq, and take its reverse.
>
> Could you double-check if the revised version of Bio.SeqUtils.__init__.py works
> for you? You can pick it up from here:
>
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio/SeqUtils/__init__.py?rev=1.14&cvsroot=biopython&content-type=text/plain
>
Michiel, I figured the "solution" I mentioned wasn't ideal, but hey, it
worked :D
The revised __init__.py you linked to works great for me. Thanks for getting
back to me so quickly with a proper fix.
I'm thinking of submitting a patch to Gentoo Linux for this in their Biopython
ebuild until your next release.
Thanks again! ~Justin
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 00:01:54 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Thu, 13 Dec 2007 19:01:54 -0500
Subject: [Biopython-dev] [Bug 2419] SeqUtils __init__.py missing complement
function (v1.43 and v1.44)
In-Reply-To:
Message-ID: <200712140001.lBE01sIR023423@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2419
mdehoon at ims.u-tokyo.ac.jp changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #3 from mdehoon at ims.u-tokyo.ac.jp 2007-12-13 19:01 EST -------
OK, thanks. Closing this bug.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 15:17:21 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 10:17:21 -0500
Subject: [Biopython-dev] [Bug 2390] Error importing Swiss Prot in BioSQL
In-Reply-To:
Message-ID: <200712141517.lBEFHLcj018666@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2390
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|REOPENED |RESOLVED
Resolution| |FIXED
------- Comment #22 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 10:17 EST -------
Thanks for the details. Those fields are not being recorded in the SeqRecord
object, so there is no way for BioSQL to put them into the database. This is
bug 2235, which is on my mental to do list.
Additionally, even if the parser did record the Taxon in the SeqRecord, BioSQL
currently doesn't record this in the database. That seems to have been a short
term fix for Bug 1921 which we should probably revisit.
Note I'm re-marking THIS bug as fixed. Peter.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 17:56:11 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 12:56:11 -0500
Subject: [Biopython-dev] [Bug 2421] New: BioSQL should store and retrieve a
SeqRecord's dbxrefs
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2421
Summary: BioSQL should store and retrieve a SeqRecord's dbxrefs
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
Looking over the code, BioSQL doesn't seem to even try and store database cross
references in a SeqRecord's dbxrefs list. It will however store other cross
references, e.g. in references and in features.
See also:
Bug 2390 comment 21 - Error importing Swiss Prot in BioSQL
It was pointed out that SwissProt DR lines don't get into the database.
The first problem was they didn't even make it to the SeqRecord...
Bug 2235 - SeqRecord from Bio.SwissProt.SProt lacks annotation information
The latest parser in CVS will now load DR lines into the dbxrefs list.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 18:08:01 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:08:01 -0500
Subject: [Biopython-dev] [Bug 2422] New: BioSQL shouldn't just ignore the
taxon_id
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
Summary: BioSQL shouldn't just ignore the taxon_id
Product: Biopython
Version: Not Applicable
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk
In Bug 1921 biopython/BioSQL/Loader.py was changed to ignore the taxon_id, in
order to avoid violating a foreign key constraint when the taxon id was not
already defined (e.g. from loading an up to date NCBI taxonomy).
We should see how BioPerl and BioJava handle this situation...
One crude option (which would still be an improvement on the current situation)
is to check if the taxon_id is defined, and if it is, then store the record
with this included, and if not, issue a warning and store the sequence but
omitting the taxon id.
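
The crude option could look roughly like this. The function name, the
known_taxon_ids set standing in for a lookup against the taxon table, and the
returned value are all assumptions for illustration; this is not the
BioSQL/Loader.py code.

```python
import warnings


def taxon_id_to_store(taxon_id, known_taxon_ids):
    """Return the taxon id to store with a record, or None to omit it.

    Hypothetical helper: `known_taxon_ids` stands in for a query against
    the database's taxon table.
    """
    if taxon_id is None:
        return None  # the record carries no taxon id at all
    if taxon_id in known_taxon_ids:
        return taxon_id  # safe to store: the foreign key will resolve
    warnings.warn(
        "Taxon id %r is not in the database; storing record without it"
        % (taxon_id,)
    )
    return None
```

The sequence is stored in either case; only the taxon link is dropped, and
the warning tells the user that it happened.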
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 18:09:33 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:09:33 -0500
Subject: [Biopython-dev] [Bug 1921] BioSeqDatabase.load() method fails
In-Reply-To:
Message-ID: <200712141809.lBEI9Xl9001415@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1921
------- Comment #10 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 13:09 EST -------
In resolving this issue (bug 1921), Biopython's BioSQL is simply ignoring the
taxon_id, so it is never recorded in the database. I've just filed a new bug
on this: Bug 2422 - BioSQL shouldn't just ignore the taxon_id
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 18:21:40 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:21:40 -0500
Subject: [Biopython-dev] [Bug 2422] BioSQL shouldn't just ignore the taxon_id
In-Reply-To:
Message-ID: <200712141821.lBEILelL002298@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2422
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 13:21 EST -------
Some of Marc Colosimo's changes proposed on Bug 1816 may be relevant here, in
particular his patch "Various fixes and possible improvements" (attachment
594).
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 18:34:42 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:34:42 -0500
Subject: [Biopython-dev] [Bug 1816] Error when importing GenBank file into
BioSQL database
In-Reply-To:
Message-ID: <200712141834.lBEIYgsN004015@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=1816
------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 13:34 EST -------
I'd like to close this bug as the original problem seems to be fixed: Using
CVS, I can load and retrieve AY243312 into BioSQL using the GenBank file
downloaded from here:
http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=29692106
Regarding the taxon id, I've filed a separate bug:
Bug 2422 - BioSQL shouldn't just ignore the taxon_id
One of Marc's changes in the patch was caching term and ontology IDs. Does
this make a big difference? If so, could you file a new bug just for that
enhancement and rescue those specific changes from the old patch?
Similarly for the last_id method - could you file a new bug explaining what
problem it's solving?
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 18:36:34 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 13:36:34 -0500
Subject: [Biopython-dev] [Bug 2414] run_tests.py fails with a single test on
a test suite
In-Reply-To:
Message-ID: <200712141836.lBEIaYKo004243@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2414
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
Summary|run_tests,py fails with a |run_tests.py fails with a
|single test on a test suite |single test on a test suite
------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 13:36 EST -------
Tiago made this change in biopython/Tests/run_tests.py revision 1.12, marking
this bug as fixed.
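
For reference, the check discussed in this bug can be sketched with one
regular expression that accepts both the singular and the plural PyUnit
summary line. This is only a minimal illustration, not the code committed in
revision 1.12; the helper name is_unittest_summary is hypothetical.

```python
import re

# Matches the PyUnit summary line in both forms:
#   "Ran 1 test in 0.003s"  and  "Ran 12 tests in 0.150s"
_SUMMARY_RE = re.compile(r"^Ran \d+ tests? in ")


def is_unittest_summary(line):
    """Return True if the line looks like 'Ran N test(s) in ...'."""
    return _SUMMARY_RE.match(line) is not None


print(is_unittest_summary("Ran 1 test in 0.003s"))           # singular form
print(is_unittest_summary("Ran 12 tests in 0.150s"))         # plural form
print(is_unittest_summary("Running tests in verbose mode"))  # unrelated line
```

Anchoring the pattern on "Ran", a number, and an optional "s" avoids
maintaining two separate substring searches.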
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 22:40:39 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 17:40:39 -0500
Subject: [Biopython-dev] [Bug 2421] BioSQL should store and retrieve a
SeqRecord's dbxrefs
In-Reply-To:
Message-ID: <200712142240.lBEMedjA021336@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2421
------- Comment #1 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 17:40 EST -------
This seems to be working in CVS now...
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 23:08:55 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 18:08:55 -0500
Subject: [Biopython-dev] [Bug 2410] DBSeq & DBSeqRecord should subclass Seq
& SeqRecord
In-Reply-To:
Message-ID: <200712142308.lBEN8tWc023431@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2410
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 18:08 EST -------
Fixed in biopython/BioSQL/BioSeq.py revision 1.20
The BioSQL unit tests still pass.
From bugzilla-daemon at portal.open-bio.org Fri Dec 14 23:37:55 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 14 Dec 2007 18:37:55 -0500
Subject: [Biopython-dev] [Bug 2421] BioSQL should store and retrieve a
SeqRecord's dbxrefs
In-Reply-To:
Message-ID: <200712142337.lBENbtiR025242@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2421
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
------- Comment #2 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-14 18:37 EST -------
Fixed in CVS, and test_BioSQL_SeqIO.py updated to verify this explicitly.
From bugzilla-daemon at portal.open-bio.org Sat Dec 15 13:47:48 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Sat, 15 Dec 2007 08:47:48 -0500
Subject: [Biopython-dev] [Bug 2381] translate and transcibe methods for the
Seq object (in Bio.Seq)
In-Reply-To:
Message-ID: <200712151347.lBFDlmh9019619@portal.open-bio.org>
http://bugzilla.open-bio.org/show_bug.cgi?id=2381
biopython-bugzilla at maubp.freeserve.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #795 is|0 |1
obsolete| |
------- Comment #11 from biopython-bugzilla at maubp.freeserve.co.uk 2007-12-15 08:47 EST -------
Created an attachment (id=836)
--> (http://bugzilla.open-bio.org/attachment.cgi?id=836&action=view)
Patch to Bio/Seq.py
[Note this does not update the test suite or the documentation, which would be
needed if this is committed]
Adds new methods to the MutableSeq object:
- transcribe (in place)
- back_transcribe (in place)
Adds new methods to the Seq object:
- transcribe
- back_transcribe
- translate (like the python string method)
- translate_all (Biological translation)
- translate_to_stop (Biological translation up to and excluding first stop
  codon)
- translate_cds (Biological translation with an initial start codon as M, up to
  and excluding the first stop codon)
I think this would be enough to deprecate Bio.Translate and Bio.Transcribe
(after the next release).
Comments welcome - for example are these method names sensible?
Also, should the MutableSeq methods all act "in situ"? What about translation
methods for MutableSeq objects?
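
To make the intended behaviour of the proposed names concrete, here is a rough
sketch written as plain functions on strings rather than as Seq methods. It is
not the attached patch: the function bodies, the CODON_TABLE and START_CODONS
names, and the choice to raise ValueError for a missing start codon are all
assumptions made for illustration, using the standard genetic code only.

```python
from itertools import product

# Standard genetic code (NCBI table 1), listed in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}
# Assumption: start codons of the standard table only.
START_CODONS = {"ATG", "TTG", "CTG"}


def transcribe(dna):
    """DNA -> RNA: replace T with U."""
    return dna.replace("T", "U").replace("t", "u")


def back_transcribe(rna):
    """RNA -> DNA: replace U with T."""
    return rna.replace("U", "T").replace("u", "t")


def translate_all(dna):
    """Translate every complete codon; stop codons appear as '*'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def translate_to_stop(dna):
    """Translate up to, and excluding, the first stop codon."""
    return translate_all(dna).split("*", 1)[0]


def translate_cds(dna):
    """Like translate_to_stop, but the initial start codon is read as M."""
    if dna[:3].upper() not in START_CODONS:
        raise ValueError("sequence does not begin with a start codon")
    return "M" + translate_to_stop(dna[3:])


seq = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
print(translate_all(seq))          # MAIVMGR*KGAR*
print(translate_to_stop(seq))      # MAIVMGR
print(translate_cds("TTGGCCTAA"))  # MA
```

Ambiguous bases such as N are not handled here and would raise a KeyError.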
From bugzilla-daemon at portal.open-bio.org Fri Dec 28 16:18:54 2007
From: bugzilla-daemon at portal.open-bio.org (bugzilla-daemon at portal.open-bio.org)
Date: Fri, 28 Dec 2007 11:18:54 -0500
Subject: [Biopython-dev] [Bug 2425] New: Fasta ID parsing error
Message-ID:
http://bugzilla.open-bio.org/show_bug.cgi?id=2425
Summary: Fasta ID parsing error
Product: Biopython
Version: 1.44
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: dtomso at athenixcorp.com
Loader.py will give an error as follows when presented with an unusual FASTA
header line:
>region1.fasta.screen.Contig1
ACAGGATAGGCGGGAGCCATTGAAACCGGAGCGCTAGCTTCGGTGGAGGC
GCTGGTGGGATACCGCCCTGACTGTATTGAAATTCTAACCTACGGGTCTT
Traceback (most recent call last):
  File "biosql_driver.py", line 28, in <module>
    db.load(SeqIO.parse(sfile, 'fasta'))
  File "/home/dtomso/repository/biopython/build/lib.linux-i686-2.5/BioSQL/BioSeqDatabase.py", line 412, in load
    db_loader.load_seqrecord(cur_record)
  File "/usr/lib/python2.5/site-packages/BioSQL/Loader.py", line 30, in load_seqrecord
    bioentry_id = self._load_bioentry_table(record)
  File "/usr/lib/python2.5/site-packages/BioSQL/Loader.py", line 214, in _load_bioentry_table
    accession, version = record.id.split('.')
ValueError: too many values to unpack
It appears to be looking for any '.' in the record ID, assuming that it marks
a version number, and splitting on it to obtain that number. However, this only
works on NCBI-style IDs of the form accession.version. IDs that deviate from
this (e.g. those produced by phrap, which produced the file above) cause this
error.
I bolted on an inelegant fix by having the code check for multiple '.'
characters, in which case the version defaults to zero. Other solutions may be
preferable.
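
One possible alternative, sketched below, is to split only on the last '.' and
to treat the suffix as a version only when it is purely numeric. This is not
the fix described above (which keys on the number of '.' characters) and not
code from Loader.py; the helper name split_accession_version and the default
version of 0 are assumptions for illustration.

```python
def split_accession_version(record_id, default_version=0):
    """Split 'accession.version' on the last '.', if the suffix is numeric.

    IDs without a numeric suffix are returned whole with default_version.
    """
    accession, sep, suffix = record_id.rpartition(".")
    if sep and accession and suffix.isdecimal():
        return accession, int(suffix)
    return record_id, default_version


print(split_accession_version("AY243312.1"))
# ('AY243312', 1)
print(split_accession_version("region1.fasta.screen.Contig1"))
# ('region1.fasta.screen.Contig1', 0)
```

Note that an ID such as "contig.fasta.2" would be read as version 2 under this
rule, which may or may not be what is wanted for non-NCBI files.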