[Biopython-dev] [Bug 2833] Features insertion on previous bioentry_id

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Thu May 21 17:05:12 UTC 2009


http://bugzilla.open-bio.org/show_bug.cgi?id=2833





------- Comment #10 from biopython-bugzilla at maubp.freeserve.co.uk  2009-05-21 13:05 EST -------
Well, some progress :)

(In reply to comment #9)
> These are the results of the test; they are the same on python2.4 and python2.5:
> Make sure can't import records with same ID (in one go). ... FAIL
> Make sure can't import records with same ID (in steps). ... FAIL
> Make sure can't import records with same ID (in steps with commit). ... FAIL
> Make sure can't import a single record twice (in one go). ... FAIL
> Make sure can't import a single record twice (in steps). ... FAIL
> Make sure can't import a single record twice (in steps with commit). ... FAIL
> Make sure all records are correctly loaded. ... ok
> Make sure can't reimport existing records. ... FAIL
> Indepth check that SeqFeatures are transmitted through the db. ... ok
> Load SeqRecord objects into a BioSQL database. ... ok
> Get a list of all items in the database. ... ok
> Test retrieval of items using various ids. ... ok
> Check can add DBSeq objects together. ... ok
> Check can turn a DBSeq object into a Seq or MutableSeq. ... ok
> Make sure Seqs from BioSQL implement the right interface. ... ok
> Check SeqFeatures of a sequence. ... ok
> Make sure SeqRecords from BioSQL implement the right interface. ... ok
> Check that slices of sequences are retrieved properly. ... ok
> 
> ======================================================================
> FAIL: Make sure can't import records with same ID (in one go).
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "test_BioSQL.py", line 397, in test_duplicate_id_load
>     err.__class__.__name__ + "\n" + str(err))
> AssertionError: Exception
> Should have failed!
> ...

The error formatting wasn't quite what I had intended; that is now fixed in CVS.
More importantly, most of these tests show duplicates being recorded without
any error (on PostgreSQL). This is bad.
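
A duplicate-load check of this kind can be sketched roughly as below. This is
only an illustration, not the code in test_BioSQL.py: load_records and
DuplicateError are hypothetical stand-ins for the real loader and for the
database driver's IntegrityError, the record ID is made up, and the
context-manager form of assertRaises needs a newer Python than the 2.4/2.5
used in the report.

```python
import unittest


class DuplicateError(Exception):
    """Hypothetical stand-in for the database driver's IntegrityError."""


def load_records(store, record_ids):
    """Hypothetical loader: add IDs to `store`, rejecting duplicates."""
    for rec_id in record_ids:
        if rec_id in store:
            raise DuplicateError("duplicate ID: %s" % rec_id)
        store.add(rec_id)
    return len(record_ids)


class DuplicateLoadTest(unittest.TestCase):
    def test_duplicate_id_load(self):
        """Make sure can't import records with same ID (in one go)."""
        store = set()
        # assertRaises fails the test by itself when nothing is raised, so
        # there is no generic "Should have failed!" exception for a broad
        # except clause to catch and misreport.
        with self.assertRaises(DuplicateError):
            load_records(store, ["dup_id", "dup_id"])


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DuplicateLoadTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Naming the expected exception class keeps a missing error (the PostgreSQL
symptom here) distinct from an unexpected one such as OperationalError.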

> ======================================================================
> FAIL: Make sure can't reimport existing records.
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "test_BioSQL.py", line 463, in test_reload
>     err.__class__.__name__ + "\n" + str(err))
> AssertionError: OperationalError
> currval of sequence "bioentry_pk_seq" is not yet defined in this session

Interestingly, the final test gives us an OperationalError about the bioentry
table's primary key sequence (presumably from our last_id method, which would
call the SQL statement "select currval('bioentry_pk_seq')"). PostgreSQL only
defines currval for a sequence once nextval has been called on it in the same
session, so this is a clue about what is going wrong.

http://www.postgresql.org/docs/8.3/static/functions-sequence.html
http://www.postgresql.org/docs/8.3/static/sql-createsequence.html
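
The session-local behaviour of currval described in those pages can be
modelled with a small toy in Python. This is not PostgreSQL or Biopython code;
the Sequence and Session classes and the SequenceNotDefined exception are
hypothetical names used only to show why asking for currval before any nextval
in the same session fails.

```python
class SequenceNotDefined(Exception):
    """Raised when currval is requested before nextval in a session."""


class Sequence:
    """Toy model of a database sequence shared by all sessions."""

    def __init__(self, name):
        self.name = name
        self.last_value = 0

    def advance(self):
        self.last_value += 1
        return self.last_value


class Session:
    """Toy model of one connection; remembers its own last nextval result."""

    def __init__(self):
        self._current = {}

    def nextval(self, seq):
        # Advancing the shared sequence also records the value per session.
        value = seq.advance()
        self._current[seq.name] = value
        return value

    def currval(self, seq):
        # Only values obtained by *this* session's nextval calls are visible.
        if seq.name not in self._current:
            raise SequenceNotDefined(
                'currval of sequence "%s" is not yet defined in this session'
                % seq.name)
        return self._current[seq.name]


def demo():
    """Return what two sessions see after only the first calls nextval."""
    seq = Sequence("bioentry_pk_seq")
    first, second = Session(), Session()
    first.nextval(seq)
    try:
        second_view = second.currval(seq)
    except SequenceNotDefined as err:
        second_view = str(err)
    return first.currval(seq), second_view
```

In this model, an INSERT that used the column default would correspond to a
nextval call, so a currval error suggests that no such insert happened on the
connection that asked.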

See also:
http://code.open-bio.org/svnweb/index.cgi/biosql/view/biosql-schema/trunk/sql/biosqldb-pg.sql

CREATE SEQUENCE bioentry_pk_seq;
CREATE TABLE bioentry ( 
         bioentry_id INTEGER DEFAULT nextval ( 'bioentry_pk_seq' ) NOT NULL , 
         biodatabase_id INTEGER NOT NULL , 
         taxon_id INTEGER , 
         name VARCHAR ( 40 ) NOT NULL , 
         accession VARCHAR ( 128 ) NOT NULL , 
         identifier VARCHAR ( 40 ) , 
         division VARCHAR ( 6 ) , 
         description TEXT , 
         version INTEGER NOT NULL , 
         PRIMARY KEY ( bioentry_id ) , 
         UNIQUE ( accession , biodatabase_id , version ) , 
-- CONFIG: uncomment one (and only one) of the two lines below. The
-- first puts a uniqueness constraint on the identifier column alone;
-- the other one puts a uniqueness constraint on identifier only
-- within a namespace.
--       UNIQUE ( identifier ) 
         UNIQUE ( identifier , biodatabase_id ) 
) ; 

CREATE INDEX bioentry_name ON bioentry ( name ); 
CREATE INDEX bioentry_db ON bioentry ( biodatabase_id ); 
CREATE INDEX bioentry_tax ON bioentry ( taxon_id );
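
With the UNIQUE constraints above in place, a second insert of the same
record should be rejected by the database itself. A minimal sketch of that
behaviour follows, using SQLite purely as a stand-in because it ships with
Python; it is not the PostgreSQL or MySQL setup under discussion, the table is
a trimmed-down copy of the schema, and the row values are made up.

```python
import sqlite3

# Trimmed copy of the bioentry table: same UNIQUE constraints, fewer columns.
# SQLite fills the INTEGER PRIMARY KEY itself, so no sequence is involved.
SCHEMA = """
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    biodatabase_id INTEGER NOT NULL,
    name VARCHAR(40) NOT NULL,
    accession VARCHAR(128) NOT NULL,
    identifier VARCHAR(40),
    version INTEGER NOT NULL,
    UNIQUE (accession, biodatabase_id, version),
    UNIQUE (identifier, biodatabase_id)
)
"""

INSERT = (
    "INSERT INTO bioentry "
    "(biodatabase_id, name, accession, identifier, version) "
    "VALUES (?, ?, ?, ?, ?)"
)


def insert_twice(row):
    """Insert the same row twice; return the name of the error raised, if any."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.execute(INSERT, row)
        try:
            conn.execute(INSERT, row)
        except sqlite3.IntegrityError as err:
            return err.__class__.__name__
        return None
    finally:
        conn.close()
```

Calling insert_twice((1, "example", "ACC0001", "id0001", 1)) returns
"IntegrityError" under SQLite, which is the kind of failure the duplicate
tests expect to see.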


I'm a little surprised that all the other duplicate record tests behave
differently from test_reload. I have updated test_BioSQL.py to perform all
these new duplicate tests on a clean database, which I probably should have
done in the first place (CVS revision 1.35).

[All these tests are passing on MySQL. Trying the example by hand triggers an
IntegrityError.]

Peter

