[Biopython-dev] [Bug 2838] If a SeqRecord containing Genbank information is read from BioSQL, it cannot be written to another BioSQL database

Sat May 23 01:16:54 UTC 2009

http://bugzilla.open-bio.org/show_bug.cgi?id=2838

------- Comment #3 from david.wyllie at ndm.ox.ac.uk  2009-05-22 21:16 EST -------
Thank you!

Unfortunately I'm not sure it's fixed, or maybe there is another problem:

I uninstalled the Biopython package using the Synaptic package manager
(previously I was using 1.49), then downloaded the source from a CVS checkout.

Thanks for your message
http://osdir.com/ml/python.bio.general/2008-07/msg00035.html
I can confirm that the default Ubuntu 9.0 install lacks the python-dev package,
which provides the necessary Python.h headers.

After python-dev is installed, the build is OK and the tests pass:
running test
test_Ace ... ok
test_AlignIO ... ok
test_BioSQL ... /var/lib/python-support/python2.6/MySQLdb/__init__.py:34:
DeprecationWarning: the sets module is deprecated
  from sets import ImmutableSet
/home/dwyllie/biopython/build/lib.linux-x86_64-2.6/BioSQL/BioSeqDatabase.py:144:
Warning: 'TYPE=storage_engine' is deprecated; use 'ENGINE=storage_engine'
instead
  self.adaptor.cursor.execute(sql_line)
ok
test_BioSQL_SeqIO ... ok
test_CAPS ... ok
test_Clustalw ... ok
..

and install is OK too.  This is all new to me but it seems to work OK.

I have checked the source code and I think your patch is correctly in
place:

    def _load_bioentry_date(self, record, bioentry_id):
        """Add the effective date of the entry into the database.

        record - a SeqRecord object with an annotated date
        bioentry_id - corresponding database identifier
        """
        # dates are GenBank style, like:
        # 14-SEP-2000
        date = record.annotations.get("date",
                                      strftime("%d-%b-%Y", gmtime()).upper())
        if isinstance(date, list) : date = date[0]
        annotation_tags_id = self._get_ontology_id("Annotation Tags")
        date_id = self._get_term_id("date_changed", annotation_tags_id)
        sql = r"INSERT INTO bioentry_qualifier_value" \
              r" (bioentry_id, term_id, value, rank)" \
              r" VALUES (%s, %s, %s, 1)" 
        self.adaptor.execute(sql, (bioentry_id, date_id, date))
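
As I read it, the relevant part of the patch is the isinstance check: a date
annotation that comes back from BioSQL as a list is reduced to its first
element before being written out again. A minimal standalone sketch of just
that step, assuming nothing beyond the lines quoted above (pick_date is a
hypothetical name for illustration, not a Biopython function):

```python
from time import gmtime, strftime


def pick_date(annotations):
    """Return a single GenBank-style date string from an annotations dict.

    Hypothetical helper mirroring the quoted patch; not part of Biopython.
    """
    # Fall back to today's date, GenBank style, e.g. 14-SEP-2000.
    date = annotations.get("date", strftime("%d-%b-%Y", gmtime()).upper())
    # A record read back from BioSQL may hold the date as a list.
    if isinstance(date, list):
        date = date[0]
    return date


from_genbank = pick_date({"date": "26-MAY-2005"})
from_biosql = pick_date({"date": ["26-MAY-2005"]})
```

Both calls give the same string, which is what the INSERT statement needs.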

Now when I re-run dbtestcase.py (attached previously) I get a different error
message.

dwyllie at dwyllie:~/programs/CheckleyProject/src$ python dbtestcase.py
OK, going to recover record 28804743  from genbank....
Record loaded looks like this:
ID: AB098727.1
Name: AB098727
Description: Ceratodon purpureus chloroplast rps11, petD genes for ribosomal
protein S11, cytochromoe b/f complex subunit IV, partial cds.
Number of features: 5
/sequence_version=1
/source=chloroplast Ceratodon purpureus
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta',
'Bryophyta', 'Moss Superclass V', 'Bryopsida', 'Dicranidae', 'Dicranales',
'Ditrichaceae', 'Ceratodon']
/keywords=['']
/references=[<Bio.SeqFeature.Reference object at 0x26e7a10>,
<Bio.SeqFeature.Reference object at 0x26e7a90>, <Bio.SeqFeature.Reference
object at 0x26e7b50>, <Bio.SeqFeature.Reference object at 0x26e7bd0>]
/accessions=['AB098727']
/data_file_division=PLN
/date=26-MAY-2005
/organism=Ceratodon purpureus
/gi=28804743
Seq('AATTCGATTTTTTGTTCGTGATGTAACTCCTATGCCTCATAATGGGTGTAGACC...ATA',
IUPACAmbiguousDNA())
========================================================================
Load from Entrez completed, records= 1
Here is the loaded record:
========================================================================
ID: AB098727.1
Name: AB098727
Description: Ceratodon purpureus chloroplast rps11, petD genes for ribosomal
protein S11, cytochromoe b/f complex subunit IV, partial cds.
Number of features: 5
/sequence_version=1
/source=chloroplast Ceratodon purpureus
/taxonomy=['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta',
'Bryophyta', 'Moss Superclass V', 'Bryopsida', 'Dicranidae', 'Dicranales',
'Ditrichaceae', 'Ceratodon']
/keywords=['']
/references=[<Bio.SeqFeature.Reference object at 0x26e7a10>,
<Bio.SeqFeature.Reference object at 0x26e7a90>, <Bio.SeqFeature.Reference
object at 0x26e7b50>, <Bio.SeqFeature.Reference object at 0x26e7bd0>]
/accessions=['AB098727']
/data_file_division=PLN
/date=26-MAY-2005
/organism=Ceratodon purpureus
/gi=28804743
Seq('AATTCGATTTTTTGTTCGTGATGTAACTCCTATGCCTCATAATGGGTGTAGACC...ATA',
IUPACAmbiguousDNA())
========================================================================
Now loading these records into a BioSQL database One.
/var/lib/python-support/python2.6/MySQLdb/__init__.py:34: DeprecationWarning:
the sets module is deprecated
  from sets import ImmutableSet
Creating a new database  One
========================================================================
Load from database One completed, records= 1
========================================================================
Here is the record recovered from database One:
Traceback (most recent call last):
  File "dbtestcase.py", line 165, in <module>
    from dbtestcase import AuthDetails
  File "/home/dwyllie/programs/CheckleyProject/src/dbtestcase.py", line 182, in
<module>
    DemonstrateProblem(problemgi,ad)
  File "/home/dwyllie/programs/CheckleyProject/src/dbtestcase.py", line 138, in
DemonstrateProblem
    print recordrecovered
  File "/usr/local/lib/python2.6/dist-packages/Bio/SeqRecord.py", line 489, in
__str__
    if self.letter_annotations :
  File "/usr/local/lib/python2.6/dist-packages/Bio/SeqRecord.py", line 165, in
<lambda>
    fget=lambda self : self._per_letter_annotations,
AttributeError: 'DBSeqRecord' object has no attribute '_per_letter_annotations'
dwyllie at dwyllie:~/programs/CheckleyProject/src$ 

Have I failed to install something?
Unfortunately, I wasn't running off CVS before your change.
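
For what it is worth, the traceback looks like what you get when a subclass
never runs the base class __init__ that creates a private attribute, and a
property on the base class then tries to read it. This is only my guess at the
cause; I have not checked how DBSeqRecord is actually built. A minimal sketch
with made-up stand-in classes (Record and DBRecord are hypothetical, not the
Biopython classes):

```python
class Record(object):
    """Stand-in for a base record class (hypothetical, not Bio.SeqRecord)."""

    def __init__(self):
        self._per_letter_annotations = {}

    # Property that reads a private attribute created only in __init__.
    letter_annotations = property(
        fget=lambda self: self._per_letter_annotations)

    def __str__(self):
        if self.letter_annotations:
            return "Record with per-letter annotations"
        return "Record"


class DBRecord(Record):
    """Stand-in for a database-backed subclass that skips Record.__init__."""

    def __init__(self, primary_id):
        # Assumption for illustration: the base-class __init__ is never called.
        self.primary_id = primary_id


def describe(record):
    """Return str(record), or the AttributeError message if that fails."""
    try:
        return str(record)
    except AttributeError as err:
        return "AttributeError: %s" % err


plain = describe(Record())
from_db = describe(DBRecord(1))
```

Here describe(Record()) works, while describe(DBRecord(1)) reports the missing
_per_letter_annotations attribute, much like the error above.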

Best wishes
d
