[Biopython-dev] [Bug 2840] New: When a record has been loaded from BioSQL, trying to save it to another database fails with loader db_loader.load_seqrecord fails in _load_reference

bugzilla-daemon at portal.open-bio.org
Mon May 25 18:21:26 UTC 2009


http://bugzilla.open-bio.org/show_bug.cgi?id=2840

           Summary: When a record has been loaded from BioSQL, trying to
                    save it to another database fails with loader
                    db_loader.load_seqrecord fails in _load_reference
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: BioSQL
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: david.wyllie at ndm.ox.ac.uk


Hi

I have been trying to load SeqRecords from BioSQL, annotate them, and then
write them to a different BioSQL database.  Loading the record into the second
database fails.  Annotation is not the cause: the failure occurs even when no
annotation is performed.

This issue is different from #2838, which has been addressed (thank you).

The sequence of events is:
1) eFetch a SeqRecord from GenBank (succeeds)
2) write it to a BioSQL database (succeeds)
3) recover it from that BioSQL database (succeeds)
4) write it to a second BioSQL database (fails, although no modifications
   have been made).

The current problem seems to be related to references.  It occurs in
_load_reference, which is called from load_seqrecord in BioSQL/Loader.py.
The error is:

_load_reference
    start = 1 + int(str(reference.location[0].start))
ValueError: invalid literal for int() with base 10: 'None'
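
The message suggests that reference.location[0].start is None (or at least
prints as 'None') once the record has been through BioSQL.  A minimal sketch of
that reading follows, written without Biopython so it can be run anywhere.
FakePosition and one_based_start are hypothetical names used only for
illustration, and the guard shown is just one possible way to handle an unset
position, not the fix adopted in BioSQL.

```python
# Stand-in for a position object; this is NOT a Biopython class.
class FakePosition:
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return str(self.value)


def one_based_start(start):
    """Return the 1-based start, or None when the position is unset.

    Hypothetical guard: the unguarded expression in the traceback is
    1 + int(str(start)), which fails when start is None.
    """
    if start is None:
        return None
    return 1 + int(str(start))


# The failing expression from the traceback, assuming start is None.
try:
    1 + int(str(None))
except ValueError as exc:
    message = str(exc)  # same wording as the ValueError reported above

print(message)
print(one_based_start(FakePosition(4)))
print(one_based_start(None))
```

If this reading is right, the references of the recovered record could be
checked for an unset start before calling load.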

Testing was done on Ubuntu 9 x64 with Python 2.6 and python-dev (both Debian
packages), Biopython from CVS as of 24.5.09, and the test case program
dbtestcase.py, which is attached to the now-fixed bug #2838.

To run dbtestcase.py, the MySQL details will have to be altered on the line
beginning
ad=AuthDetails(...
but otherwise I think it should run as is.

Traceback and program output from dbtestcase.py follow.
dwyllie at dwyllie:~/programs/CheckleyProject/src$ python dbtestcase.py
OK, going to recover record 28804743  from genbank....
Record loaded looks like this:
ID: AB098727.1
Name: AB098727
Description: Ceratodon purpureus chloroplast rps11, petD genes for ribosomal
protein S11, cytochromoe b/f complex subunit IV, partial cds.
Number of features: 5
/sequence_version=1
/source=chloroplast Ceratodon purpureus
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta',
'Bryophyta', 'Moss Superclass V', 'Bryopsida', 'Dicranidae', 'Dicranales',
'Ditrichaceae', 'Ceratodon']
/keywords=['']
/references=[<Bio.SeqFeature.Reference object at 0x2524a10>,
<Bio.SeqFeature.Reference object at 0x2524a90>, <Bio.SeqFeature.Reference
object at 0x2524b50>, <Bio.SeqFeature.Reference object at 0x2524bd0>]
/accessions=['AB098727']
/data_file_division=PLN
/date=26-MAY-2005
/organism=Ceratodon purpureus
/gi=28804743
Seq('AATTCGATTTTTTGTTCGTGATGTAACTCCTATGCCTCATAATGGGTGTAGACC...ATA',
IUPACAmbiguousDNA())
========================================================================
Load from Entrez completed, records= 1
Here is the loaded record:
========================================================================
ID: AB098727.1
Name: AB098727
Description: Ceratodon purpureus chloroplast rps11, petD genes for ribosomal
protein S11, cytochromoe b/f complex subunit IV, partial cds.
Number of features: 5
/sequence_version=1
/source=chloroplast Ceratodon purpureus
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta',
'Bryophyta', 'Moss Superclass V', 'Bryopsida', 'Dicranidae', 'Dicranales',
'Ditrichaceae', 'Ceratodon']
/keywords=['']
/references=[<Bio.SeqFeature.Reference object at 0x2524a10>,
<Bio.SeqFeature.Reference object at 0x2524a90>, <Bio.SeqFeature.Reference
object at 0x2524b50>, <Bio.SeqFeature.Reference object at 0x2524bd0>]
/accessions=['AB098727']
/data_file_division=PLN
/date=26-MAY-2005
/organism=Ceratodon purpureus
/gi=28804743
Seq('AATTCGATTTTTTGTTCGTGATGTAACTCCTATGCCTCATAATGGGTGTAGACC...ATA',
IUPACAmbiguousDNA())
========================================================================
Now loading these records into a BioSQL database One.
/var/lib/python-support/python2.6/MySQLdb/__init__.py:34: DeprecationWarning:
the sets module is deprecated
  from sets import ImmutableSet
Creating a new database  One
========================================================================
Load from database One completed, records= 1
========================================================================
Here is the record recovered from database One:
ID: AB098727.1
Name: AB098727
Description: Ceratodon purpureus chloroplast rps11, petD genes for ribosomal
protein S11, cytochromoe b/f complex subunit IV, partial cds.
Number of features: 5
/dates=['26-MAY-2005']
/ncbi_taxid=3225
/date=['26-MAY-2005']
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Bryopsida',
'Dicranidae', 'Dicranales', 'Ditrichaceae', 'Ceratodon', 'Ceratodon purpureus']
/source=['chloroplast Ceratodon purpureus']
/references=[<Bio.SeqFeature.Reference object at 0x269e710>,
<Bio.SeqFeature.Reference object at 0x269e810>, <Bio.SeqFeature.Reference
object at 0x269e910>, <Bio.SeqFeature.Reference object at 0x269ea10>]
/gi=28804743
/data_file_division=PLN
/keywords=['']
/organism=Ceratodon purpureus
/sequence_version=['1']
/accessions=['AB098727']
DBSeq('AATTCGATTTTTTGTTCGTGATGTAACTCCTATGCCTCATAATGGGTGTAGACC...ATA',
DNAAlphabet())
========================================================================
Creating a new database  Two
Traceback (most recent call last):
  File "dbtestcase.py", line 165, in <module>
    from dbtestcase import AuthDetails
  File "/home/dwyllie/programs/CheckleyProject/src/dbtestcase.py", line 182, in
<module>
    DemonstrateProblem(problemgi,ad)
  File "/home/dwyllie/programs/CheckleyProject/src/dbtestcase.py", line 158, in
DemonstrateProblem
    db2.load(listtoload)
  File "/usr/local/lib/python2.6/dist-packages/BioSQL/BioSeqDatabase.py", line
442, in load
    db_loader.load_seqrecord(cur_record)
  File "/usr/local/lib/python2.6/dist-packages/BioSQL/Loader.py", line 57, in
load_seqrecord
    self._load_reference(reference, rank, bioentry_id)
  File "/usr/local/lib/python2.6/dist-packages/BioSQL/Loader.py", line 733, in
_load_reference
    start = 1 + int(str(reference.location[0].start))
ValueError: invalid literal for int() with base 10: 'None'
dwyllie at dwyllie:~/programs/CheckleyProject/src$




