[Biopython-dev] [Bug 2838] New: If a SeqRecord containing Genbank information is read from BioSQL, it cannot be written to another BioSQL database
bugzilla-daemon at portal.open-bio.org
Fri May 22 21:16:07 UTC 2009
http://bugzilla.open-bio.org/show_bug.cgi?id=2838
Summary: If a SeqRecord containing Genbank information is read
from BioSQL, it cannot be written to another BioSQL
database
Product: Biopython
Version: 1.49
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: BioSQL
AssignedTo: biopython-dev at biopython.org
ReportedBy: david.wyllie at ndm.ox.ac.uk
I've been trying to annotate some microbial sequences; some are from GenBank.
So the proposed series of events was:
1) get sequences from GenBank
2) store in BioSQL database called One
3) recover them from BioSQL
4) annotate the recovered SeqRecords [this works, but isn't necessary for this
problem to be reproduced - here, I'm making no changes at all to the SeqRecord]
5) store the annotated SeqRecords in a different BioSQL database called Two.
The problem is that Step 5 fails when the record was originally obtained from
GenBank and has then been recovered from BioSQL.
The traceback (below) indicates a problem with the BioSQL loader in
_load_bioentry_date.
Here is the screen output, including the traceback.
The program (attached) first loads a record from GenBank, writes it to One,
and recovers it from One. At this point the record has changed, in particular
in the way date fields are represented.
The record loaded via Entrez has a /date annotation which is not a list:
/date=26-MAY-2005
while the reloaded version has two date fields, both of which are lists:
/dates=['26-MAY-2005']
/date=['26-MAY-2005']
Whether this is relevant I'm not sure.
The subsequent write of the recovered version to Two fails.
As a control, I've checked that the original version can be written to Two
successfully.
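The difference in the date annotation described above can be sketched in plain Python. This is only an illustration under stated assumptions: the two dictionaries stand in for `SeqRecord.annotations` and copy the values shown in the screen output, and the helper names `as_single_string` and `normalise_date` are hypothetical, not part of Biopython. Whether unwrapping the list actually avoids the failure in Step 5 is not established by this report.

```python
def as_single_string(value):
    """Return a string unchanged; unwrap a list or tuple holding exactly one item."""
    if isinstance(value, (list, tuple)):
        if len(value) != 1:
            raise ValueError("expected exactly one value, got %d" % len(value))
        return value[0]
    return value


def normalise_date(annotations):
    """Return a copy of the annotations with 'date' reduced to a plain string."""
    fixed = dict(annotations)  # shallow copy, so the caller's dict is not modified
    if "date" in fixed:
        fixed["date"] = as_single_string(fixed["date"])
    return fixed


# Stand-ins for record.annotations, using the values from the screen output.
entrez_annotations = {"date": "26-MAY-2005"}
biosql_annotations = {"date": ["26-MAY-2005"], "dates": ["26-MAY-2005"]}

print(normalise_date(entrez_annotations)["date"])
print(normalise_date(biosql_annotations)["date"])
```

After normalisation both records carry the date in the same form as the original Entrez load, which is the version that could be written to Two.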
I'm a novice with Python and Biopython so please accept my apologies if there
is something obvious and very stupid responsible for this.
---------------------------------------------------------------------------
dwyllie at dwyllie:~/programs/Project/src$ python dbtestcase.py
OK, going to recover record 28804743 from genbank....
Record loaded looks like this:
ID: AB098727.1
Name: AB098727
Description: Ceratodon purpureus chloroplast rps11, petD genes for ribosomal
protein S11, cytochromoe b/f complex subunit IV, partial cds.
Number of features: 5
/sequence_version=1
/source=chloroplast Ceratodon purpureus
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta',
'Bryophyta', 'Moss Superclass V', 'Bryopsida', 'Dicranidae', 'Dicranales',
'Ditrichaceae', 'Ceratodon']
/keywords=['']
/references=[<Bio.SeqFeature.Reference instance at 0x2190b90>,
<Bio.SeqFeature.Reference instance at 0x219a5a8>, <Bio.SeqFeature.Reference
instance at 0x219a5f0>, <Bio.SeqFeature.Reference instance at 0x219a6c8>]
/accessions=['AB098727']
/data_file_division=PLN
/date=26-MAY-2005
/organism=Ceratodon purpureus
/gi=28804743
Seq('AATTCGATTTTTTGTTCGTGATGTAACTCCTATGCCTCATAATGGGTGTAGACC...ATA',
IUPACAmbiguousDNA())
========================================================================
Load from Entrez completed, records= 1
Here is the loaded record:
========================================================================
ID: AB098727.1
Name: AB098727
Description: Ceratodon purpureus chloroplast rps11, petD genes for ribosomal
protein S11, cytochromoe b/f complex subunit IV, partial cds.
Number of features: 5
/sequence_version=1
/source=chloroplast Ceratodon purpureus
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta',
'Bryophyta', 'Moss Superclass V', 'Bryopsida', 'Dicranidae', 'Dicranales',
'Ditrichaceae', 'Ceratodon']
/keywords=['']
/references=[<Bio.SeqFeature.Reference instance at 0x2190b90>,
<Bio.SeqFeature.Reference instance at 0x219a5a8>, <Bio.SeqFeature.Reference
instance at 0x219a5f0>, <Bio.SeqFeature.Reference instance at 0x219a6c8>]
/accessions=['AB098727']
/data_file_division=PLN
/date=26-MAY-2005
/organism=Ceratodon purpureus
/gi=28804743
Seq('AATTCGATTTTTTGTTCGTGATGTAACTCCTATGCCTCATAATGGGTGTAGACC...ATA',
IUPACAmbiguousDNA())
========================================================================
Now loading these records into a BioSQL database One.
/var/lib/python-support/python2.6/MySQLdb/__init__.py:34: DeprecationWarning:
the sets module is deprecated
from sets import ImmutableSet
Creating a new database One
========================================================================
Load from database One completed, records= 1
========================================================================
Here is the record recovered from database One:
ID: AB098727.1
Name: AB098727
Description: Ceratodon purpureus chloroplast rps11, petD genes for ribosomal
protein S11, cytochromoe b/f complex subunit IV, partial cds.
Number of features: 5
/dates=['26-MAY-2005']
/ncbi_taxid=3225
/date=['26-MAY-2005']
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Bryopsida',
'Dicranidae', 'Dicranales', 'Ditrichaceae', 'Ceratodon', 'Ceratodon purpureus']
/source=['chloroplast Ceratodon purpureus']
/references=[<Bio.SeqFeature.Reference instance at 0x235d9e0>,
<Bio.SeqFeature.Reference instance at 0x235db90>, <Bio.SeqFeature.Reference
instance at 0x235dcf8>, <Bio.SeqFeature.Reference instance at 0x235de60>]
/gi=28804743
/data_file_division=PLN
/keywords=['']
/organism=Ceratodon purpureus
/sequence_version=['1']
/accessions=['AB098727']
DBSeq('AATTCGATTTTTTGTTCGTGATGTAACTCCTATGCCTCATAATGGGTGTAGACC...ATA',
DNAAlphabet())
========================================================================
Creating a new database Two
Traceback (most recent call last):
File "dbtestcase.py", line 206, in <module>
from dbtestcase import AuthDetails
File "/home/dwyllie/programs/CheckleyProject/src/dbtestcase.py", line 225, in
<module>
DemonstrateProblem(problemgi,ad)
File "/home/dwyllie/programs/CheckleyProject/src/dbtestcase.py", line 199, in
DemonstrateProblem
db2.load(listtoload)
File "/var/lib/python-support/python2.6/BioSQL/BioSeqDatabase.py", line 430,
in load
db_loader.load_seqrecord(cur_record)
File "/var/lib/python-support/python2.6/BioSQL/Loader.py", line 50, in
load_seqrecord
self._load_bioentry_date(record, bioentry_id)
File "/var/lib/python-support/python2.6/BioSQL/Loader.py", line 577, in
_load_bioentry_date
self.adaptor.execute(sql, (bioentry_id, date_id, date))
File "/var/lib/python-support/python2.6/BioSQL/BioSeqDatabase.py", line 289,
in execute
self.cursor.execute(sql, args or ())
File "/var/lib/python-support/python2.6/MySQLdb/cursors.py", line 166, in
execute
self.errorhandler(self, exc, value)
File "/var/lib/python-support/python2.6/MySQLdb/connections.py", line 35, in
defaulterrorhandler
raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: (1064, "You have an error in your SQL
syntax; check the manual that corresponds to your MySQL server version for the
right syntax to use near '), 1)' at line 1")
dwyllie at dwyllie:~/programs/Project/src$
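One possible reading of the traceback, not confirmed by this report, is that the list-valued date reached the database driver as a query parameter in `_load_bioentry_date`. The sketch below is an analogy only: it uses the standard-library `sqlite3` module rather than MySQLdb, and a hypothetical one-table schema rather than the BioSQL schema, to show that a string binds as a parameter while a list holding the same string does not. The function name `try_insert` and the table `qualifier` are assumptions made for illustration.

```python
import sqlite3


def try_insert(value):
    """Try to store value as a bound SQL parameter; report what happened."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, loosely modelled on a (bioentry, term, value) row.
        conn.execute(
            "CREATE TABLE qualifier (bioentry_id INTEGER, term_id INTEGER, value TEXT)"
        )
        try:
            conn.execute("INSERT INTO qualifier VALUES (?, ?, ?)", (1, 2, value))
        except sqlite3.Error as exc:
            # sqlite3 refuses parameter types it cannot bind, such as a list.
            return "rejected: " + type(exc).__name__
        return "stored"
    finally:
        conn.close()


print(try_insert("26-MAY-2005"))
print(try_insert(["26-MAY-2005"]))
```

The exact exception class and message differ between drivers and versions, so this does not reproduce the MySQL 1064 error above; it only shows the general hazard of passing a list where a single string is expected.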