[Bioperl-l] load_seqdatabase.pl does not like fasta format

Hilmar Lapp hlapp at gmx.net
Sat Jun 12 02:58:31 EDT 2004


First off, note that if you don't specify a namespace for your
sequences, they will all go into the default namespace ("bioperl"). If
your GenBank load and your fasta file contain some of the same
sequences and you want both copies in the database, you will need to
specify a different namespace for each of the two uploads.

What are you printing out to get the NM accession number? The first
line of your stack trace ("Could not store unknown:") basically means
that the accession number of your fasta sequence was 'unknown'. The
triple (accession, version, namespace) is constrained to be unique and
is used as the lookup key, and fasta doesn't provide version numbers,
so your sequences will all be considered identical if the accession is
'unknown' for all of them. I.e., after the first one is inserted, the
second one and all others will fail to insert. The lookup by that key
finds the first sequence, which is presumably why the length check
complains about 1433 versus 2742.
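
To illustrate the constraint, here is a toy sketch in Python using an
in-memory SQLite table. The table name, column names, and the use of 0
as a stand-in for a missing version are assumptions made up for this
illustration; this is not the actual BioSQL schema, only the uniqueness
rule on (accession, version, namespace) is taken from the description
above. The real loader also looks up the existing record by key before
inserting, so the symptom you see differs from a plain database error.

```python
import sqlite3

# Toy stand-in for the sequence table; NOT the real BioSQL schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE toy_entry (
        accession TEXT    NOT NULL,
        version   INTEGER NOT NULL DEFAULT 0,  -- assumption: 0 means "no version"
        namespace TEXT    NOT NULL,
        length    INTEGER NOT NULL,
        UNIQUE (accession, version, namespace)
    )
    """
)


def try_insert(accession, namespace, length, version=0):
    """Return True if the row was stored, False if the unique key rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO toy_entry (accession, version, namespace, length) "
                "VALUES (?, ?, ?, ?)",
                (accession, version, namespace, length),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("unknown", "bioperl", 2742),    # first record: accepted
    try_insert("unknown", "bioperl", 1433),    # same key again: rejected
    try_insert("unknown", "ucsc", 1433),       # different namespace: accepted
    try_insert("NM_000597", "bioperl", 1433),  # distinct accession: accepted
]
print(results)  # [True, False, True, True]
```

The second insert fails because its key is identical to the first one,
while a different namespace or a real accession makes the key distinct.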

Do you have proper identifiers in the fasta file(s)?
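
One quick way to check is to scan the header lines yourself. The
following is a minimal Python sketch; the function names and the sample
records are hypothetical, and it treats the first whitespace-delimited
token after '>' as the identifier. It only inspects the header lines
and does not tell you what the parser ends up assigning as accession.

```python
from collections import Counter


def fasta_ids(lines):
    """Yield the first token of each '>' header line ('' if the header is empty)."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            yield fields[0] if fields else ""


def check_ids(lines):
    """Count records, empty identifiers, and identifiers that occur more than once."""
    ids = list(fasta_ids(lines))
    counts = Counter(ids)
    missing = counts.pop("", 0)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    return {"records": len(ids), "missing": missing, "duplicates": duplicates}


# Made-up sample data, for illustration only.
sample = [
    ">NM_000367 example description\n",
    "ACGT\n",
    ">\n",
    "GGCC\n",
    ">NM_000597\n",
    "TTAA\n",
    ">NM_000367 repeated id\n",
    "ACGT\n",
]
report = check_ids(sample)
print(report)  # {'records': 4, 'missing': 1, 'duplicates': ['NM_000367']}
```

For a real file, pass an open file handle instead of the sample list,
e.g. check_ids(open("refMrna.fa")).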

	-hilmar

On Friday, June 11, 2004, at 08:09  PM, Andy Hammer wrote:

> I used load_seqdatabase.pl just fine to load over
> 20,000 genbank sequences into a biosql database.  Then
> I tried to load a fasta file into a new biosql
> database and got the following:
>
> postgres at westwater:/var/local/ucsc$
> ./load_seqdatabase.pl -dbname ucsc -dbuser postgres
> -format fasta refMrna.fa
> Loading refMrna.fa ...
> Processing NM_000367 at length 2742
> Processing NM_000597 at length 1433
> Could not store unknown:
> ------------- EXCEPTION  -------------
> MSG: You're trying to lie about the length: is 1433
> but you say 2742
> STACK Bio::PrimarySeq::length
> /usr/local/share/perl/5.6.1/Bio/PrimarySeq.pm:419
> STACK Bio::DB::Persistent::PersistentObject::AUTOLOAD
> /usr/local/share/perl/5.6.1/Bio/DB/Persistent/PersistentObject.pm:541
> STACK Bio::Seq::length
> /usr/local/share/perl/5.6.1/Bio/Seq.pm:612
> STACK Bio::DB::Persistent::PersistentObject::AUTOLOAD
> /usr/local/share/perl/5.6.1/Bio/DB/Persistent/PersistentObject.pm:541
> STACK
> Bio::DB::BioSQL::BiosequenceAdaptor::populate_from_row
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BiosequenceAdaptor.pm:251
> STACK
> Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1300
> STACK
> Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:977
> STACK
> Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:856
> STACK
> Bio::DB::BioSQL::PrimarySeqAdaptor::attach_children
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/PrimarySeqAdaptor.pm:284
> STACK Bio::DB::BioSQL::SeqAdaptor::attach_children
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/SeqAdaptor.pm:279
> STACK
> Bio::DB::BioSQL::BasePersistenceAdaptor::_build_object
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:1331
> STACK
> Bio::DB::BioSQL::BasePersistenceAdaptor::_find_by_unique_key
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:977
> STACK
> Bio::DB::BioSQL::BasePersistenceAdaptor::find_by_unique_key
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:856
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::create
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:204
> STACK Bio::DB::BioSQL::BasePersistenceAdaptor::store
> /usr/local/share/perl/5.6.1/Bio/DB/BioSQL/BasePersistenceAdaptor.pm:253
> STACK Bio::DB::Persistent::PersistentObject::store
> /usr/local/share/perl/5.6.1/Bio/DB/Persistent/PersistentObject.pm:270
> STACK (eval) ./load_seqdatabase.pl:521
> STACK toplevel ./load_seqdatabase.pl:504
>
>
> I added the Processing at length lines to see what was
> going on.  Only the first entry actually makes it into
> the db.  It seems to keep the last sequence in memory
> for some reason.  I also tried destroying the $seq at
> the end of the loop with a $seq->DESTROY; command but
> got the same results.
>
> Any ideas on this?
> Thanks.
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------



