[Bioperl-l] load_seqdatabase error with a specific locus from genbank
Johann PELLET
johann.pellet at inserm.fr
Wed Apr 8 15:29:29 UTC 2009
Hi all,
I confirm that LOCUS S67862S3 now loads correctly since Chris's update.
Thanks again.
However, I still get warning messages with other entries, such as:
#########################################################################################################################
--------------------- WARNING ---------------------
MSG: The supplied lineage does not start near 'Hantaanvirus
CGRn93MP8' (I was supplied 'Hantaan virus | Hantavirus | Bunyaviridae')
---------------------------------------------------
--------------------- WARNING ---------------------
MSG: The supplied lineage does not start near 'Hantaanvirus
CGRn93P8' (I was supplied 'Hantaan virus | Hantavirus | Bunyaviridae')
---------------------------------------------------
#########################################################################################################################
but entries are inserted in the biosql database:
#########################################################################################################################
biosql=# select * from bioentry where description like 'Hantaanvirus CGRn93P8%';
 bioentry_id | biodatabase_id | taxon_id | name     | accession | identifier | division | description                                                           | version
-------------+----------------+----------+----------+-----------+------------+----------+-----------------------------------------------------------------------+---------
      156282 |             84 |   395824 | EF990932 | EF990932  | 156144486  | VRL      | Hantaanvirus CGRn93P8 RNA-dependent RNA polymerase gene, partial cds. |       1
      156288 |             84 |   395824 | EF990918 | EF990918  | 154623008  | VRL      | Hantaanvirus CGRn93P8 segment M, complete sequence.                   |       1
      156294 |             84 |   395824 | EF990904 | EF990904  | 154622980  | VRL      | Hantaanvirus CGRn93P8 segment S, complete sequence.                   |       1
(3 rows)
#########################################################################################################################
Finally, EU608407 and EU608559 cause a crash:
#########################################################################################################################
--------------------- WARNING ---------------------
MSG: The supplied lineage does not start near 'Fowl adenovirus 8' (I
was supplied 'Fowl adenovirus E | Aviadenovirus | Adenoviridae')
---------------------------------------------------
--------------------- WARNING ---------------------
MSG: Unexpected error in feature table for Skipping feature,
attempting to recover
---------------------------------------------------
#######...14 times ...############
--------------------- WARNING ---------------------
MSG: insert in Bio::DB::BioSQL::ReferenceAdaptor (driver) failed,
values were ("Bonhoeffer,S., Chappey,C., Parkin,N.T.,
Whitcomb,LOCUS EU608407
1212 bp DNA linear VRL 20-APR-2008","","","CRC-
D35248959C54B9F2","1","1212","") FKs (<NULL>)
ERROR: null value in column "location" violates not-null constraint
---------------------------------------------------
Could not store EU608559:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: create: object (Bio::Annotation::Reference) failed to insert or
to be found by unique key
STACK: Error::throw
STACK: Bio::Root::Root::throw /Library/Perl/5.8.8/Bio/Root/Root.pm:368
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /Library/Perl/
5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:219
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /Library/Perl/
5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
STACK: Bio::DB::Persistent::PersistentObject::store /Library/Perl/
5.8.8/Bio/DB/Persistent/PersistentObject.pm:284
STACK: Bio::DB::BioSQL::AnnotationCollectionAdaptor::store_children /
Library/Perl/5.8.8/Bio/DB/BioSQL/AnnotationCollectionAdaptor.pm:230
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /Library/Perl/
5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /Library/Perl/
5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
STACK: Bio::DB::Persistent::PersistentObject::store /Library/Perl/
5.8.8/Bio/DB/Persistent/PersistentObject.pm:284
STACK: Bio::DB::BioSQL::SeqAdaptor::store_children /Library/Perl/5.8.8/
Bio/DB/BioSQL/SeqAdaptor.pm:237
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::create /Library/Perl/
5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:227
STACK: Bio::DB::BioSQL::BasePersistenceAdaptor::store /Library/Perl/
5.8.8/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:264
STACK: Bio::DB::Persistent::PersistentObject::store /Library/Perl/
5.8.8/Bio/DB/Persistent/PersistentObject.pm:284
STACK: load_seqdatabase.pl:630
-----------------------------------------------------------
at load_seqdatabase.pl line 643
#########################################################################################################################
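One detail in the failed insert above may be worth checking: the authors value ends with "Whitcomb," followed directly by the LOCUS line of EU608407, which suggests (this is only a guess from the log) that the EU608559 record is cut off in the input file and runs straight into the next record. A minimal sketch for spotting such records in a GenBank flat file, written in Python purely for illustration; the function name and the sample records REC_A and REC_B are hypothetical and not taken from the real file:

```python
def find_unterminated_records(lines):
    """Return LOCUS names of records that are not closed by '//'
    before the next LOCUS line (or before the end of the input)."""
    unterminated = []
    current = None  # name of the record currently open, if any
    for line in lines:
        if line.startswith("LOCUS"):
            if current is not None:
                # A new record started while the previous one was still open.
                unterminated.append(current)
            parts = line.split()
            current = parts[1] if len(parts) > 1 else "<unnamed>"
        elif line.strip() == "//":
            current = None  # record properly terminated
    if current is not None:
        unterminated.append(current)
    return unterminated


# Hypothetical two-record sample: REC_A is truncated inside its AUTHORS line.
sample = [
    "LOCUS       REC_A     100 bp    DNA     linear   VRL 01-JAN-2000",
    "  AUTHORS   Author,A., Author,B.,",
    "LOCUS       REC_B     100 bp    DNA     linear   VRL 01-JAN-2000",
    "ORIGIN",
    "//",
]
print(find_unterminated_records(sample))  # ['REC_A']
```

If the real input file reports EU608559 here, the problem would be in the file itself rather than in the loader.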
When I check whether any part of these records was inserted into the
biosql database:
#########################################################################################################################
select * from reference where title='Evidence for positive epistasis in HIV-1';
 reference_id | dbxref_id | location                             | title                                    | authors                                                                    | crc
--------------+-----------+--------------------------------------+------------------------------------------+----------------------------------------------------------------------------+----------------------
        16443 |      4179 | Science 306 (5701), 1547-1550 (2004) | Evidence for positive epistasis in HIV-1 | Bonhoeffer,S., Chappey,C., Parkin,N.T., Whitcomb,J.M. and Petropoulos,C.J. | CRC-19E7AA4FB7A5D4AF
(1 row)
select * from dbxref where dbxref_id=4179;
dbxref_id | dbname | accession | version
-----------+--------+-----------+---------
4179 | PUBMED | 15567861 | 0
select * from bioentry where accession=15567861;
 bioentry_id | biodatabase_id | taxon_id | name | accession | identifier | division | description | version
-------------+----------------+----------+------+-----------+------------+----------+-------------+---------
(0 rows)
#########################################################################################################################
I don't have records with name='EU608407' or 'EU608559' in the
bioentry table.
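For anyone who wants to repeat this check over a longer list of accessions, here is a minimal sketch in Python. It uses an in-memory SQLite table as a stand-in for the real PostgreSQL bioentry table, with only the three columns the query needs; the helper name missing_accessions is hypothetical. Accessions are bound as strings, since the accession column holds text.

```python
import sqlite3


def missing_accessions(conn, accessions):
    """Return the entries of `accessions` that have no row in bioentry."""
    if not accessions:
        return []
    placeholders = ",".join("?" for _ in accessions)
    rows = conn.execute(
        f"SELECT accession FROM bioentry WHERE accession IN ({placeholders})",
        list(accessions),
    ).fetchall()
    found = {row[0] for row in rows}
    # Keep the caller's order so the report is easy to compare with the input.
    return [acc for acc in accessions if acc not in found]


# Stand-in table (assumption: only a subset of the BioSQL bioentry columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY, name TEXT, accession TEXT)"
)
conn.executemany(
    "INSERT INTO bioentry (bioentry_id, name, accession) VALUES (?, ?, ?)",
    [
        (156282, "EF990932", "EF990932"),
        (156288, "EF990918", "EF990918"),
        (156294, "EF990904", "EF990904"),
    ],
)
print(missing_accessions(conn, ["EF990932", "EU608407", "EU608559"]))
# ['EU608407', 'EU608559']
```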
Thanks for your help
Johann
-- --
Johann Pellet
On Apr 7, 2009, at 19:56, Hilmar Lapp wrote:
> Awesome, thanks Chris! $beer_owed++;
>
> -hilmar
>
> On Apr 7, 2009, at 1:32 AM, Chris Fields wrote:
>
>> Fixed in svn now and have added this as a test case (passes all
>> tests in bioperl-live). For some reason this wasn't catching some
>> more complex combinations of operators, mainly those with mixes of
>> order/join.
>>
>> chris
>>
>> On Apr 6, 2009, at 10:59 PM, Chris Fields wrote:
>>
>>> On Apr 6, 2009, at 8:05 PM, Torsten Seemann wrote:
>>>
>>>>> The full record is here: http://www.ncbi.nlm.nih.gov/nuccore/
>>>>> 544772
>>>>
>>>> gene order(S67862.1:72..75,join(S67863.1:1..788,1..19))
>>>>
>>>>> Does anyone see why the location parser should have a problem
>>>>> with the first
>>>>> gene feature? It's nested, and has remote location components,
>>>>> but at first
>>>>> sight nothing jumps out at me as extraordinary. Has someone
>>>>> recently changed
>>>>> the location parsing code? If no-one has an immediate idea what
>>>>> could be at
>>>>> work here, this needs investigating.
>>>
>>> The location parsing code was refactored about 3-4 years ago w/o
>>> problems. This'll be the first one to crop up. I'll try taking a
>>> look at it.
>>>
>>>> I'm not sure if Bioperl handles the order() operator?
>>>>
>>>> For those unfamiliar with the order() operator:
>>>>
>>>> http://www.ncbi.nlm.nih.gov/collab/FT/#3.5.2
>>>>
>>>> order(location,location, ... location)
>>>> The elements can be found in the specified order (5' to 3'
>>>> direction),
>>>> but nothing is implied about the reasonableness about joining them.
>>>>
>>>>
>>>> --Torsten Seemann
>>>> --Victorian Bioinformatics Consortium, Dept. Microbiology, Monash
>>>> University, AUSTRALIA
>>>
>>> It's interesting that the version from eutils differs
>>> significantly in the feature table when retrieving 'gb' or
>>> 'gbwithparts', the latter resolves the location (see below).
>>> Regardless we'll need to make sure this is parseable.
>>>
>>> ....
>>>
>>> FEATURES Location/Qualifiers
>>> source 1..77
>>> /organism="Ovine respiratory syncytial virus"
>>> /mol_type="genomic RNA"
>>> /db_xref="taxon:28869"
>>> gene order(S67862.1:72..75,join(S67863.1:1..788,1..19))
>>> /gene="G"
>>> gene 55..>77
>>> /gene="fusion glycoprotein F"
>>>
>>>
>>>
>>> chris
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> --
> ===========================================================
> : Hilmar Lapp -:- Durham, NC -:- hlapp at gmx dot net :
> ===========================================================
>
>
>
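To make the nesting in the location quoted above easier to see, order(S67862.1:72..75,join(S67863.1:1..788,1..19)), here is a small recursive-descent sketch in Python. It is only an illustration of how operators such as order() and join() nest; it is not BioPerl's location parser, it does not validate operator names or ranges, and the function names are hypothetical.

```python
def parse_location(text):
    """Parse a nested location string into (operator, [children]) tuples.
    Leaf locations (e.g. 'S67862.1:72..75') are kept as plain strings."""
    text = text.strip()
    node, pos = _parse(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected trailing text at position {pos}")
    return node


def _parse(s, pos):
    start = pos
    # Read up to the next structural character.
    while pos < len(s) and s[pos] not in "(),":
        pos += 1
    token = s[start:pos]
    if pos < len(s) and s[pos] == "(":
        # token is an operator; collect its comma-separated arguments.
        children = []
        pos += 1  # skip "("
        while True:
            child, pos = _parse(s, pos)
            children.append(child)
            if pos >= len(s):
                raise ValueError("unbalanced parentheses")
            if s[pos] == ",":
                pos += 1
            elif s[pos] == ")":
                return (token, children), pos + 1
    return token, pos


tree = parse_location("order(S67862.1:72..75,join(S67863.1:1..788,1..19))")
print(tree)
# ('order', ['S67862.1:72..75', ('join', ['S67863.1:1..788', '1..19'])])
```

The outer order() holds one remote leaf and one join(), and the join() in turn mixes a remote component with a local one, which is the combination of operators discussed in the thread.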