[Bioperl-l] load_seqdatabase error with a specific locus from genbank

Johann PELLET johann.pellet at inserm.fr
Wed Apr 8 15:29:29 UTC 2009


Hi all,

I confirm that LOCUS S67862S3 now loads correctly since Chris's update.
Thanks again.

However, I still get warning messages with other entries, such as:

#########################################################################################################################
--------------------- WARNING ---------------------
MSG: The supplied lineage does not start near 'Hantaanvirus CGRn93MP8' (I was supplied 'Hantaan virus | Hantavirus | Bunyaviridae')
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: The supplied lineage does not start near 'Hantaanvirus CGRn93P8' (I was supplied 'Hantaan virus | Hantavirus | Bunyaviridae')
---------------------------------------------------
#########################################################################################################################


but the entries are still inserted into the biosql database:


#########################################################################################################################
biosql=# select * from bioentry where description like 'Hantaanvirus CGRn93P8%';
 bioentry_id | biodatabase_id | taxon_id |   name   | accession | identifier | division |                              description                              | version
-------------+----------------+----------+----------+-----------+------------+----------+-----------------------------------------------------------------------+---------
      156282 |             84 |   395824 | EF990932 | EF990932  | 156144486  | VRL      | Hantaanvirus CGRn93P8 RNA-dependent RNA polymerase gene, partial cds. |       1
      156288 |             84 |   395824 | EF990918 | EF990918  | 154623008  | VRL      | Hantaanvirus CGRn93P8 segment M, complete sequence.                   |       1
      156294 |             84 |   395824 | EF990904 | EF990904  | 154622980  | VRL      | Hantaanvirus CGRn93P8 segment S, complete sequence.                   |       1
(3 rows)
#########################################################################################################################



and finally, EU608407 and EU608559 caused a crash:



#########################################################################################################################
--------------------- WARNING ---------------------
MSG: The supplied lineage does not start near 'Fowl adenovirus 8' (I was supplied 'Fowl adenovirus E | Aviadenovirus | Adenoviridae')
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: Unexpected error in feature table for  Skipping feature, attempting to recover
---------------------------------------------------
#######...14 times ...############

--------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::ReferenceAdaptor (driver) failed, values were ("Bonhoeffer,S., Chappey,C., Parkin,N.T., Whitcomb,LOCUS       EU608407       1212 bp    DNA     linear   VRL 20-APR-2008","","","CRC-D35248959C54B9F2","1","1212","") FKs (<NULL>)
ERROR:  null value in column "location" violates not-null constraint

---------------------------------------------------
Could not store EU608559:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: create: object (Bio::Annotation::Reference) failed to insert or to be found by unique key
STACK: Error::throw
STACK: Bio::Root::Root::throw /Library/Perl/5.8.8/Bio/Root/Root.pm:368
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /Library/Perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:219
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /Library/Perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
STACK: Bio::DB::Persistent::PersistentObject::store /Library/Perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:284
STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children /Library/Perl/5.8.8/Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:230
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /Library/Perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /Library/Perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
STACK: Bio::DB::Persistent::PersistentObject::store /Library/Perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:284
STACK: Bio::DB::BioSQL::SeqAdaptor::store_children /Library/Perl/5.8.8/Bio/DB/BioSQL/SeqAdaptor.pm:237
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /Library/Perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /Library/Perl/5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
STACK: Bio::DB::Persistent::PersistentObject::store /Library/Perl/5.8.8/Bio/DB/Persistent/PersistentObject.pm:284
STACK: load_seqdatabase.pl:630
-----------------------------------------------------------

  at load_seqdatabase.pl line 643
#########################################################################################################################
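
Editorial note: in the failed insert above, the authors value runs straight into a LOCUS line ("Whitcomb,LOCUS       EU608407 ..."). One possible reading, which is an assumption and not something confirmed in this thread, is that a record in the input file ends without its "//" terminator, so the next record's LOCUS line was read as part of the previous record. A minimal Python sketch for checking a GenBank flat file for that condition follows; the function name find_unterminated_records is made up for illustration and is not part of BioPerl.

```python
def find_unterminated_records(lines):
    """Return LOCUS names of records that are not closed by a '//' line
    before the next LOCUS line (or before the end of the input).

    `lines` is any iterable of text lines, e.g. an open file handle.
    This is a diagnostic sketch, not a GenBank parser.
    """
    unterminated = []
    current = None  # name of the record we are currently inside, if any
    for line in lines:
        if line.startswith("LOCUS"):
            # A new record starts while the previous one is still open.
            if current is not None:
                unterminated.append(current)
            fields = line.split()
            current = fields[1] if len(fields) > 1 else "<unnamed>"
        elif line.rstrip() == "//":
            # Proper end-of-record marker.
            current = None
    if current is not None:
        # Input ended in the middle of a record.
        unterminated.append(current)
    return unterminated
```

To use it, pass an open file handle for the flat file being loaded, for example `find_unterminated_records(open(path))` with `path` pointing at the input given to load_seqdatabase.pl. An empty list means every record is terminated, in which case the cause lies elsewhere.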



If I check the biosql database to see whether any part of these records
was inserted:


#########################################################################################################################
select * from reference where title='Evidence for positive epistasis in HIV-1';
 reference_id | dbxref_id |               location               |                  title                   |                                  authors                                   |         crc
--------------+-----------+--------------------------------------+------------------------------------------+----------------------------------------------------------------------------+----------------------
        16443 |      4179 | Science 306 (5701), 1547-1550 (2004) | Evidence for positive epistasis in HIV-1 | Bonhoeffer,S., Chappey,C., Parkin,N.T., Whitcomb,J.M. and Petropoulos,C.J. | CRC-19E7AA4FB7A5D4AF
(1 row)

select * from dbxref where dbxref_id=4179;
  dbxref_id | dbname | accession | version
-----------+--------+-----------+---------
       4179 | PUBMED | 15567861  |       0

select * from bioentry where accession=15567861;
 bioentry_id | biodatabase_id | taxon_id | name | accession | identifier | division | description | version
-------------+----------------+----------+------+-----------+------------+----------+-------------+---------
(0 rows)
#########################################################################################################################



I don't have records with name='EU608407' or 'EU608559' in the  
bioentry table.


Thanks for your help

Johann


-- --

Johann Pellet

On 7 Apr 2009, at 19:56, Hilmar Lapp wrote:

> Awesome, thanks Chris! $beer_owed++;
>
> 	-hilmar
>
> On Apr 7, 2009, at 1:32 AM, Chris Fields wrote:
>
>> Fixed in svn now and have added this as a test case (passes all  
>> tests in bioperl-live).  For some reason this wasn't catching some  
>> more complex combinations of operators, mainly those with mixes of  
>> order/join.
>>
>> chris
>>
>> On Apr 6, 2009, at 10:59 PM, Chris Fields wrote:
>>
>>> On Apr 6, 2009, at 8:05 PM, Torsten Seemann wrote:
>>>
>>>>> The full record is here: http://www.ncbi.nlm.nih.gov/nuccore/544772
>>>>
>>>> gene            order(S67862.1:72..75,join(S67863.1:1..788,1..19))
>>>>
>>>>> Does anyone see why the location parser should have a problem  
>>>>> with the first
>>>>> gene feature? It's nested, and has remote location components,  
>>>>> but at first
>>>>> sight nothing jumps out at me as extraordinary. Has someone  
>>>>> recently changed
>>>>> the location parsing code? If no-one has an immediate idea what  
>>>>> could be at
>>>>> work here, this needs investigating.
>>>
>>> The location parsing code was refactored about 3-4 years ago w/o
>>> problems.  This'll be the first one to crop up.  I'll try taking a  
>>> look at it.
>>>
>>>> I'm not sure if Bioperl handles the order() operator?
>>>>
>>>> For those unfamiliar with the order() operator:
>>>>
>>>> http://www.ncbi.nlm.nih.gov/collab/FT/#3.5.2
>>>>
>>>> order(location,location, ... location)
>>>> The elements can be found in the specified order (5' to 3'  
>>>> direction),
>>>> but nothing is implied about the reasonableness about joining them.
>>>>
>>>>
>>>> --Torsten Seemann
>>>> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
>>>> University, AUSTRALIA
>>>
>>> It's interesting that the version from eutils differs
>>> significantly in the feature table when retrieving 'gb' or
>>> 'gbwithparts'; the latter resolves the location (see below).
>>> Regardless, we'll need to make sure this is parseable.
>>>
>>> ....
>>>
>>> FEATURES             Location/Qualifiers
>>>   source          1..77
>>>                   /organism="Ovine respiratory syncytial virus"
>>>                   /mol_type="genomic RNA"
>>>                   /db_xref="taxon:28869"
>>>   gene            order(S67862.1:72..75,join(S67863.1:1..788,1..19))
>>>                   /gene="G"
>>>   gene            55..>77
>>>                   /gene="fusion glycoprotein F"
>>>
>>>
>>>
>>> chris
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> -- 
> ===========================================================
> : Hilmar Lapp  -:-  Durham, NC  -:-  hlapp at gmx dot net :
> ===========================================================
>
>
>
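
Editorial note: the location quoted in the thread, order(S67862.1:72..75,join(S67863.1:1..788,1..19)), nests a join() inside an order(). The sketch below shows one way such a string can be broken into a tree with a small recursive-descent routine. It is an illustration in Python only, not BioPerl's location parser and not Chris's fix; the names parse_location and OPERATORS are made up here, and leaf locations (including remote ones such as S67862.1:72..75) are kept as raw strings and are not interpreted.

```python
OPERATORS = ("join", "order", "complement")


def _parse(s, pos):
    """Parse one location starting at s[pos]; return (node, next_pos)."""
    for op in OPERATORS:
        if s.startswith(op + "(", pos):
            pos += len(op) + 1  # skip past "op("
            parts = []
            while True:
                child, pos = _parse(s, pos)
                parts.append(child)
                if pos >= len(s):
                    raise ValueError("unbalanced parentheses in %r" % s)
                if s[pos] == ",":
                    pos += 1  # another operand follows
                elif s[pos] == ")":
                    return (op, parts), pos + 1
                else:
                    raise ValueError("unexpected %r at %d" % (s[pos], pos))
    # Not an operator: read a leaf up to the next ',' or ')'.
    end = pos
    while end < len(s) and s[end] not in ",)":
        end += 1
    if end == pos:
        raise ValueError("empty location at position %d" % pos)
    return s[pos:end], end


def parse_location(text):
    """Return a leaf string, or a nested (operator, [operands]) tuple."""
    s = text.strip()
    node, pos = _parse(s, 0)
    if pos != len(s):
        raise ValueError("trailing text after location: %r" % s[pos:])
    return node
```

Because each operand is parsed by the same routine, operators can be mixed and nested to any depth, which is the kind of order/join combination discussed above.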




