[Bioperl-l] One more load_seqdatabase.pl question

gang wu gwu at molbio.mgh.harvard.edu
Thu Nov 30 22:08:08 UTC 2006


Thanks, Hilmar. Do you mean that the NVL() clause will cause 
load_seqdatabase.pl to fail when it updates records?

I also have a problem with updating. It seems load_seqdatabase.pl only 
tries to insert rather than update. I used one of the test GenBank files 
that come with bioperl-db. Please take a look at the attached output.

Thanks.

Gang

=========================================
 >perl load_seqdatabase.pl -lookup -host elegans -driver Oracle -dbname 
sparc -dbuser biosqldb-sgowner -dbpass PASS -format genbank -namespace 
test /root/.cpan/build/bioperl-db-1.5.2-RC3/scripts/biosql/data/AP000868.gb
Loading 
/root/.cpan/build/bioperl-db-1.5.2-RC3/scripts/biosql/data/AP000868.gb ...

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::CommentAdaptor (driver) failed, values 
were ("This sequence was reannotated via the Ensembl system. Please 
visit the Ensembl web site, http://www.ensembl.org/ for more 
information. ","1") FKs (389109)
ORA-00001: unique constraint (BIOSQLDB_SGOWNER.XAK1COMMENT) violated 
(DBD ERROR: OCIStmtExecute)
---------------------------------------------------

-------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::CommentAdaptor (driver) failed, values 
were ("The /gene indicates a unique id for a gene, /cds a unique id for 
a translation and a /exon a unique id for an exon. These ids are 
maintained wherever possible between versions. For more information on 
how to interpret the feature table, please visit 
http://www.ensembl.org/Docs/embl.html. ","2") FKs (389109)
ORA-00001: unique constraint (BIOSQLDB_SGOWNER.XAK1COMMENT) violated 
(DBD ERROR: OCIStmtExecute)
---------------------------------------------------
...
...
==========================================================
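
The ORA-00001 warnings above are what a plain INSERT produces when a row
with the same unique key already exists. As a rough illustration only (this
is not how bioperl-db implements it), the Python sketch below uses an
in-memory SQLite table to contrast a blind insert with a lookup-then-update.
It assumes that XAK1COMMENT is a unique key on (bioentry_id, rank), as in the
standard BioSQL comment table; the function names are made up for this
example.

```python
import sqlite3

# Assumed layout, modelled on the BioSQL comment table:
# one comment per (bioentry_id, rank) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE comment (
           comment_id   INTEGER PRIMARY KEY,
           bioentry_id  INTEGER NOT NULL,
           comment_text TEXT    NOT NULL,
           rank         INTEGER NOT NULL,
           UNIQUE (bioentry_id, rank))"""
)


def blind_insert(conn, bioentry_id, rank, text):
    """Always INSERT; fails if (bioentry_id, rank) is already present."""
    conn.execute(
        "INSERT INTO comment (bioentry_id, comment_text, rank) VALUES (?, ?, ?)",
        (bioentry_id, text, rank),
    )


def store_comment(conn, bioentry_id, rank, text):
    """Look the row up by its unique key first, then INSERT or UPDATE."""
    row = conn.execute(
        "SELECT comment_id FROM comment WHERE bioentry_id = ? AND rank = ?",
        (bioentry_id, rank),
    ).fetchone()
    if row is None:
        blind_insert(conn, bioentry_id, rank, text)
        return "inserted"
    conn.execute(
        "UPDATE comment SET comment_text = ? WHERE comment_id = ?",
        (text, row[0]),
    )
    return "updated"


# First load: the row does not exist yet, so it is inserted.
first = store_comment(conn, 389109, 1, "original text")

# Second load with a blind insert: the unique key is violated,
# which is the SQLite analogue of ORA-00001.
try:
    blind_insert(conn, 389109, 1, "revised text")
    blind_failed = False
except sqlite3.IntegrityError:
    blind_failed = True

# Second load with lookup-then-update: the existing row is changed in place.
second = store_comment(conn, 389109, 1, "revised text")
```

The point of the sketch is only that an update path has to find the existing
child rows by their unique key before writing; whether load_seqdatabase.pl
does so for comments under -lookup is exactly the open question here.
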
Hilmar Lapp wrote:
> These are the protein translations stored in the feature table as  
> tags of features, right?
>
> You can change the type of the column (although there may be some  
> issues when you update the column because the NVL() clause won't work  
> if I recall that correctly), but doing so will deprive you of any  
> 'normal' searches against that column. (You can still use functions  
> from the DBMS_LOB package, but they will be much slower and are  
> completely non-standard.)
>
> It is up to you whether that is too big of a price to pay for having  
> some redundant protein translations (translating the feature's DNA  
> sequence should give you the same) in the database. I always trimmed  
> those feature tags off (using a custom SeqProcessor). An alternative  
> is to convert these feature tags into actual bioentries (i.e.,  
> Bio::Seq objects; again, a custom SeqProcessor will allow you to do  
> that).
>
> 	-hilmar
>
> On Nov 28, 2006, at 4:13 PM, gang wu wrote:
>
>   
>> Hi everyone,
>>
>> I'm using load_seqdatabase.pl to upload some GenBank genome sequences to
>> my Oracle BioSQL database. I saw some errors (see the attached warning
>> message) related to seqfeature_qualifier_value
>> (SG_SEQFEATURE_QUALIFIER_ASSOC.VALUE column), which has a VARCHAR2 data
>> type with a maximum of 4000 bytes. Has anybody reported this issue before?
>> Should I just change the column to a type that can store more data,
>> such as LONG or CLOB?
>>
>> Thanks.
>>
>> Gang
>>
>>
>> Log information:
>> ============================================
>>
>> load_seqdatabase.pl -host elegans -driver Oracle -dbname sparc -dbuser
>> biosqldb-sgowner -dbpass PASS -format genbank -namespace genbank
>> /genomeseq/arabidopsis//NC_003070.gbk
>>
>>
>> Loading /genomeseq/arabidopsis//NC_003070.gbk ...
>>
>>
>> -------------------- WARNING ---------------------
>> MSG: SimpleValueAdaptor::add_assoc: unexpected failure of statement
>> execution: ORA-01461: can bind a LONG value only for insert into a LONG
>> column (DBD ERROR: error possibly near <*> indicator at char 12 in
>> 'INSERT INTO <*>seqfeature_qualifier_value (fea_oid, trm_oid, value,
>> rank) VALUES (:p1, :p2, :p3, :p4)')
>>     name: INSERT ASSOC [2]
>> Bio::SeqFeature::Generic;Bio::Annotation::SimpleValue
>>     values: FK[Bio::SeqFeature::Generic]:14898,
>> FK[Bio::Annotation::SimpleValue]:800,
>> value:"MVAVTGEVLHLLRRYLGEYVHGLSTEALRISVWKGDVVLKDLKLKAEALNSLKLPVAVKSGFV 
>> GTITLKVPWKSLGKEPVIVLIDRVFVLAYPAPDDRTLKFFTLVGTEFAYTNYIPGGRQGKASRNQASADR 
>> GTSYFWLMELHGYEAETATLEARAKSKLGSPPQGNSWLGSIIATIIGNLKVSISNVHIRYEDSTRDSSEI 
>> LASFFSYFNNICSSNPGHPFAAGITLAKLAAVTMDEEGNETFDTSGALDKLRKSLQLERLALYHDSNSFP 
>> WEIEKQWDNITPEEWIEMFEDGIKEQTEHKIKSKWALNRHYLLSPINGSLKYHRLGNQERNNPEIPFERA 
>> SVILNDVNVTITEEQYHDWIKLVEVVSRYKTYIEISHLRPMVPVSEAPRLWWRFAAQASLQQKRLWYTRY 
>> IQLYANFLQQSSDVNYPEMREIEKDLDSKVILLWRLLAHAKVESVKSKEAAEQRKLKKGGWFSFNWRTEA 
>> EDDPEVDSVAGGSKLMEERLTKDEWKAINKLLSHQPDEEMNLYSGKDMQNMTHFLVTVSIGQGAARIVDI 
>> NQTEVLCGRFEQLDVTTKFRHRSTQCDVSLRFYGLSAPEGSLAQSVSSERKTNALMASFVNAPIGENIDW 
>> RLSATISPCHATIWTESYDRVLEFVKRSNAVSPTVALETAAVLQMKLEEVTRRAQEQLQIVLEEQSRFAL 
>> DIDIDAPKVRIPLRASGSSKCSSHFLLDFGNFTLTTMDTRSEEQRQNLYSRFCISGRDIAAFFTDCGSDN 
>> QGCSLVMEDFTNQPILSPILEKADNVYSLIDRCGMAVIVDQIKVPHPSYPSTRISIQVPNIGVHFSPTRY 
>> MRIMQLFDILYGAMKTYSQAPVDHMPDGIQPWSPTDLASDARILVWKGIGNSVATWQSCRLVLSGLYLYT 
>> FESEKSLDYQRYLCMAGRQVFEVPPANIGGSPYCLAVGVRGTDLKKALESSSTWIIEFQGEEKAAWLRGL 
>> VQATYQASA
>> PLSGDVLGQTSDGDGDFHEPQTRNMKAADLVITGALVETKLYLYGKIKNECDEQVEEVLLLKVLASGGKV 
>> HLISSESGLTVRTKLHSLKIKDELQQQQSGSAQYLAYSVLKNEDIQESLGTCDSFDKEMPVGHADDEDAY 
>> TDALPEFLSPTEPGTPDMDMIQCSMMMDSDEHVGLEDTEGGFHEKDTSQGKSLCDEVFYEVQGGEFSDFV 
>> SVVFLTRSSSSHDYNGIDTQMSIRMSKLEFFCSRPTVVALIGFGFDLSTASYIENDKDANTLVPEKSDSE 
>> KETNDESGRIEGLLGYGKDRVVFYLNMNVDNVTVFLNKEDGSQLAMFVQERFVLDIKVHPSSLSVEGTLG 
>> NFKLCDKSLDSGNCWSWLCDIRDPGVESLIKFKFSSYSAGDDDYEGYDYSLSGKLSAVRIVFLYRFVQEV 
>> TAYFMGLATPHSEEVIKLVDKVGGFEWLIQKDEMDGATAVKLDLSLDTPIIVVPRDSLSKDYIQLDLGQL 
>> EVSNEISWHGCPEKDATAVRVDVLHAKILGLNMSVGINGSIGKPMIREGQGLDIFVRRSLRDVFKKVPTL 
>> SVEVKIDFLHAVMSDKEYDIIVSCTSMNLFEEPKLPPDFRGSSSGPKAKMRLLADKVNLNSQMIMSRTVT 
>> ILAVDINYALLELRNSVNEESSLAHVAVRASEPNSSISWMTSLSETDLYVSVPKVSVLDIRPNTKPEMRL 
>> MLGSSVDASKQASSESLPFSLNKGSFKRANSRAVLDFDAPCSTMLLMDYRWRASSQSCVLRVQQPRILAV 
>> PDFLLAVGEFFVPALRAITGRDETLDPTNDPITRSRGIVLSEPLYKQTEDVVHLSPRRQLVADSLGIDEY 
>> TYDGCGKVISLSEQGEKDLNVGRLEPIIIVGHGKKLRFVNVKIKNGSLLSKCIYLSNDSSCLFSPEDGVD 
>> ISMLENASSNPENVLSNAHKSSDVSDTCQYDSKSGQSFTFEAQVVSPEFTFFDGTKSSLDDSSAVEKLLR 
>> VKLDFNFM
>> YASKEKDIWVRALLKNLVVETGSGLIILDPVDISGGYTSVKEKTNMSLTSTDIYMHLSLSALSLLLNLQS
>> QVTGALQSGNAIPLASCTNFDRIWVSPKENGPRNNLTIWRPQAPSNYVILGDCVTSRAIPPTQAVMAVSN 
>> TYGRVRKPIGFNRIGLFSVIQGLEGDNVQHSHNSNECSLWMPVAPVGYTAMGCVANIGSEQPPDHIVYCL 
>> SIWRADNVLGAFYAHTSTAAPSKKYSPGLSHCLLWNPLQSKTSSSSDPSSTSGSRSEQSSDQTGNSSGWD 
>> ILRSISKATSYHVSTPNFERIWWDKGGDLRRPVSIWRPVPRPGFAILGDSITEGLEPPALGILFKADDSE 
>> IAAKPVQFNKVAHIVGKGFDEVFCWFPVAPPGYVSLGCVLSKFDEAPHVDSFCCPRIDLVNQANIYEASV 
>> TRSSSSKSSQLWSIWKVDNQACTFLARSDLKRPPSRMAFAVGESVKPKTQENVNAEIKLRCFSLTLLDGL 
>> HGMMTPLFDTTVTNIKLATHGRPEAMNAVLISSIAASTFNPQLEAWEPLLEPFDGIFKLETYDTALNQSS 
>> KPGKRLRIAATNILNINVSAANLETLGDAVVSWRRQLELEERAAKMKEESAASRESGDLSAFSALDEDDF 
>> QTIVVENKLGRDIYLKKLEENSDVVVKLCHDENTSVWVPPPRFSNRLNVADSSREARNYMTVQILEAKGL 
>> HIIDDGNSHSFFCTLRLVVDSQGAEPQKLFPQSARTKCVKPSTTIVNDLMECTSKWNELFIFEIPRKGVA 
>> RLEVEVTNLAAKAGKGEVVGSLSFPVGHGESTLRKVASVRMLHQSSDAENISSYTLQRKNAEDKHDNGCL 
>> LISTSYFEKTTIPNTLRNMESKDFVDGDTGFWIGVRPDDSWHSIRSLLPLCIAPKSLQNDFIAMEVSMRN 
>> GRKHATFRCLATVVNDSDVNLEISISSDQNVSSGVSNHNAVIASRSSYVLPWGCLSKDNEQCLHIRPKVE 
>> NSHHSYAWGYCIAVSSGCGKDQPFVDQGLLTRQNTIKQSSRASTFFLRLNQLEKKDMLFCCQPSTGSKPL 
>> WLSVGADAS
>> VLHTDLNTPVYDWKISISSPLKLENRLPCPVKFTVWEKTKEGTYLERQHGVVSSRKSAHVYSADIQRPVY 
>> LTLAVHGGWALEKDPIPVLDISSNDSVSSFWFVHQQSKRRLRVSIERDVGETGAAPKTIRFFVPYWITND 
>> SYLPLSYRVVEIEPSENVEAGSPCLTRASKSFKKNPVFSMERRHQKKNVRVLESIEDTSPMPSMLSPQES 
>> AGRSGVVLFPSQKDSYVSPRIGIAVAARDSDSYSPGISLLELEKKERIDVKAFCKDASYYMLSAVLNMTS 
>> DRTKVIHLQPHTLFINRVGVSICLQQCDCQTEEWINPSDPPKLFGWQSSTRLELLKLRVKGYRWSTPFSV 
>> FSEGTMRVPVPKEDGTDQLQLRVQVRSGTKNSRYEVIFRPNSISGPYRIENRSMFLPIRYRQVEGVSESW 
>> QFLPPNAAASFYWENLGRRHLFELLVDGNDPSNSEKFDIDKIGDYPPRSESGPTRPIRVTILKEDKKNIV 
>> RISDWMPAIEPTSSISRRLPASSLSELSGNESQQSHLLASEDSEFHVIVELAELGISVIDHAPEEILYMS 
>> VQNLFVAYSTGLGSGLSRFKLRMQGIQVDNQLPLAPMPVLFRPQRTGDKADYILKFSVTLQSNAGLDLRV 
>> YPYIDFQGRENTAFLINIHEPIIWRIHEMIQQANLSRLSDPNSTAVSVDPFIQIGVLNFSEVRFRVSMAM 
>> SPSQRPRGVLGFWSSLMTALGNTENMPVRISERFHENISMRQSTMINNAIRNVKKDLLGQPLQLLSGVDI 
>> LGNASSALGHMSQGIAALSMDKKFIQSRQRQENKGVEDFGDIIREGGGALAKGLFRGVTGILTKPLEGAK 
>> SSGVEGFVSGFGKGIIGAAAQPVSGVLDLLSKTTEGANAMRMKIAAAITSDEQLLRRRLPRAVGADSLLR 
>> PYNDYRAQGQVILQLAESGSFLGQVDLFKVRGKFALTDAYESHFILPKGKVLMITHRRVILLQQPSNIMG 
>> QRKFIPAK
>> DACSIQWDILWNDLVTMELSDGKKDPPNSPPSRLILYLKAKPHDPKEQFRVVKCIPNSKQAFDVYSAIDQ
>> AINLYGQNALKGMVKNKVTRPYSPISESSWAEGASQQMPASVTPSSTFGTSPTTSSS",
>> rank:"1"
>> --------------------------------------------------
>>
>>
>> =============================================
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>     
>
>   
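
The trimming approach Hilmar describes in the quoted reply (dropping the
redundant protein translation tags before loading) can be sketched in a
language-neutral way. The Python below is only an illustration under stated
assumptions: features are modelled as plain dictionaries rather than BioPerl
objects, the tag name "translation" and the 4000-byte limit are taken from
this thread, and both function names are hypothetical. In bioperl-db itself
this logic would live in a custom SeqProcessor, which is not shown here.

```python
MAX_VARCHAR2_BYTES = 4000  # limit of the VALUE column mentioned in the thread


def oversized_qualifiers(features, limit=MAX_VARCHAR2_BYTES):
    """Return (feature type, tag, byte length) for every value over the limit."""
    offenders = []
    for feature in features:
        for tag, values in feature["qualifiers"].items():
            for value in values:
                size = len(value.encode("utf-8"))
                if size > limit:
                    offenders.append((feature["type"], tag, size))
    return offenders


def strip_tags(features, tags=("translation",)):
    """Return copies of the features without the named qualifier tags."""
    stripped = []
    for feature in features:
        kept = {
            tag: list(values)
            for tag, values in feature["qualifiers"].items()
            if tag not in tags
        }
        stripped.append({"type": feature["type"], "qualifiers": kept})
    return stripped


# Made-up example data: one CDS with a translation longer than 4000 bytes.
features = [
    {
        "type": "CDS",
        "qualifiers": {
            "gene": ["example_gene"],
            "translation": ["M" * 5000],
        },
    }
]

before = oversized_qualifiers(features)
after = oversized_qualifiers(strip_tags(features))
```

Checking for oversized values first shows which tags would trip the column
limit; stripping by tag name then mirrors the "trim those feature tags off"
option, at the cost of losing the stored translations.
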



