[BioSQL-l] value field in seqfeature_qualifier_value too small
Hilmar Lapp
hlapp at gmx.net
Wed Jun 7 17:15:49 UTC 2006
You may make this change in your installation, and I should maybe
document this possibility in the schema.
However, there are some serious issues with doing so. First off, you
will no longer be able to search by simple query (you'd have to use
DBMS_LOB functions), and you can't index the column any more.
Furthermore, all language bindings will need special code to use the
stream or other LOB API for getting/setting the value. There's
additional issues that LOB columns raise compared to VARCHAR types.
It's worth noting that this applies so far only to Oracle (the other
RDBMSs do not impose the length restriction), and it's in reality
only the value of the protein translation feature tag that exceeds
the limit on occasion.
I personally strip the protein translation tag from all loaded
Genbank and RefSeq records because the value may be computed easily
on the fly, and the respective sequences are usually there too unless
you don't load Genpept or the NP_/XP_* part of RefSeq - but in this
case I don't understand why you would want the tag value but not the
full sequence.
Not the easy answer you may have hoped for, and I'd welcome any
suggestions for how the problem could be solved w/o using CLOBs.
-hilmar
On Jun 7, 2006, at 11:45 AM, David Scott wrote:
> in seqfeature_qualifier_value:
>
> SEQFEATURE_ID NOT NULL NUMBER(38)
> TERM_ID NOT NULL NUMBER(38)
> RANK NOT NULL NUMBER(3)
> VALUE VARCHAR2(4000)
>
> the value field is restricted to 4000 bytes - 4000 bytes too small to
> carry current genbank protein translations - e.g. AB016240.1 has a
> protein translation of 4133 characters. since 4000 bytes is a string
> length restriction for some jdbc drivers - to increase the field
> size,
> the field type should also be changed to clob.
>
> _______________________________________________
> BioSQL-l mailing list
> BioSQL-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biosql-l
>
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
===========================================================
More information about the BioSQL-l
mailing list