[Bioperl-l] loading yeast data failing...

Angshu Kar angshu96 at gmail.com
Tue Jan 3 20:41:21 EST 2006


Hi Hilmar,

On what basis should I parse? I found the following three (arbitrarily
chosen) entries in the bioentry table. Each of them was stored unchanged in
all three of the name, identifier, and accession fields, and the version
field is 0 in every row!


gi|51013395|gb|AAT92991.1|
gi|732941|emb|CAA54130.1|
gi|6321883|ref|NP_011959.1|

So, here for record 1: gi|51013395 is the identifier, AAT92991 is the
accession number, 1 is the version. Am I right? And then what is the name?
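
One way to split such a display ID can be sketched as follows. This is only an illustration of the parsing logic in Python (the SeqProcessor itself would be written in Perl). The function name parse_ncbi_display_id and the returned keys are hypothetical, and the sketch assumes that the bare GI number goes into identifier and that the name falls back to the accession when the trailing token is empty.

```python
def parse_ncbi_display_id(display_id):
    """Split a 'gi|<gi>|<db>|<accession.version>|<locus>' string into fields.

    Returns None when the string does not follow that layout, so the
    caller can leave such records untouched.
    """
    tokens = display_id.split("|")
    if len(tokens) < 4 or tokens[0] != "gi":
        return None
    gi, db, acc_ver = tokens[1], tokens[2], tokens[3]
    # The fifth token (locus name) is often empty, e.g. 'gi|...|gb|AAT92991.1|'
    locus = tokens[4] if len(tokens) > 4 else ""
    # 'AAT92991.1' -> accession 'AAT92991', version '1'
    accession, _, version = acc_ver.partition(".")
    return {
        "identifier": gi,            # assumption: bare GI number
        "namespace": db,             # e.g. 'gb', 'emb', 'ref'
        "accession": accession,
        "version": int(version) if version.isdigit() else 0,
        "name": locus or accession,  # assumption: fall back to accession
    }


record = parse_ncbi_display_id("gi|51013395|gb|AAT92991.1|")
print(record)
```

Under these assumptions every resulting field for the three entries above is far shorter than the original concatenated string.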

I also found the following entry stored in the same three fields of the
same table:

AT1G08520.1

I don't understand this. I used the TAIR6 dataset. How should I parse this data?
Could you please advise on how to resolve this?
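
For an ID like AT1G08520.1, a minimal sketch of one possible split is shown below, again in Python purely to illustrate the logic. The function name split_versioned_id is hypothetical, and treating the numeric suffix as the version is an assumption; whether that suffix belongs in the version field for TAIR data is a separate decision.

```python
import re

# An ID made of letters, digits, or underscores, followed by '.<digits>'
_SUFFIXED_ID = re.compile(r"^(?P<accession>[A-Za-z0-9_]+)\.(?P<version>\d+)$")


def split_versioned_id(display_id):
    """Return (accession, version) for IDs shaped like 'AT1G08520.1'.

    IDs that do not match are returned unchanged with version 0.
    """
    match = _SUFFIXED_ID.match(display_id)
    if match is None:
        return display_id, 0
    return match.group("accession"), int(match.group("version"))


print(split_versioned_id("AT1G08520.1"))
```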

Thanks,
Angshu



On 1/3/06, Hilmar Lapp <hlapp at gmx.net> wrote:
> You could do that, but first, that puts you out of sync with the
> official schema, and second, if you look at the value, what's causing
> the problem isn't really an accession number anyway but rather a
> concatenation of identifiers, accession numbers, and namespace
> acronyms. Since you're already using a custom SeqProcessor, why
> don't you just add a line or two of code that parses the display_id
> value into the accession and identifier? (For instance, the accession
> is the token between the two '|' characters following the token 'gb'.)
>
>    -hilmar
>
> On 1/3/06, Angshu Kar <angshu96 at gmail.com> wrote:
> > Hi,
> >
> > Could you please help me resolve the following error?
> >
> > I run:
> >
> > ./load_seqdatabase.pl --dbname=USBA --dbuser=postgres --format=fasta
> > --driver=Pg --pipeline="SeqProcessor::Accession" yeast_nrpep.fasta
> >
> > The error:
> >
> > Loading yeast_nrpep.fasta ...
> >
> > -------------------- WARNING ---------------------
> > MSG: insert in Bio::DB::BioSQL::SeqAdaptor (driver) failed, values were
> > ("gi|4261605|gb|AAD13905.1|S58126_11111111111111","gi|4261605|gb|AAD13905.1|S58126_11111111111111","gi|4261605|gb|AAD13905.1|S58126_11111111111111","Unknown [Saccharomyces cerevisiae]","0","") FKs (19,<NULL>)
> > ERROR:  value too long for type character varying(40)
> > ---------------------------------------------------
> > Could not store gi|4261605|gb|AAD13905.1|S58126_11111111111111:
> > ------------- EXCEPTION  -------------
> > MSG: error while executing statement in
> > Bio::DB::BioSQL::SeqAdaptor::find_by_unique_key: ERROR:  current transaction
> > is aborted, commands ignored until end of transaction block
> > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key
> > /home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:951
> > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key
> > /home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:855
> > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
> > /home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:205
> > STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
> > /home/akar/local/perl//Bio/DB/BioSQL/BasePersistenceAdaptor.pm:254
> > STACK Bio::DB::Persistent::PersistentObject::store
> > /home/akar/local/perl//Bio/DB/Persistent/PersistentObject.pm:272
> > STACK (eval) ./load_seqdatabase.pl:621
> > STACK toplevel ./load_seqdatabase.pl:604
> >
> > --------------------------------------
> >
> >  at ./load_seqdatabase.pl line 634
> >
> > Should I change the field lengths for accession, name and identifier to some
> > value >40 in the bioentry table? What should I change it to?
> >
> > Thanks,
> > Angshu
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
>
>
> --
> ----------------------------------------------------------
> : Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
> ----------------------------------------------------------
>


