[BioSQL-l] RE: Multiple accession numbers?
Hilmar Lapp
hlapp at gnf.org
Wed May 7 08:46:29 EDT 2003
They *should* be in bioentry_qualifier_value -- if not it sounds like a bug.
Your suggestion makes me hope that you haven't looked there yet ...
If you retrieve a sequence you also should get them back. Unfortunately the secondary accessions thing is not tested for yet, because in bioperl 1.2.x they aren't stored in (and taken from) the annotation bag.
So, for both loading and retrieval, are you running the latest bioperl-db and the bioperl main trunk (*not* 1.2.1)?
-hilmar
-----Original Message-----
From: Elia Stupka [mailto:elia at tll.org.sg]
Sent: Wed 5/7/2003 3:27 AM
To: Hilmar Lapp
Cc: biosql-l at open-bio.org; Juguang Xiao
Subject: Multiple accession numbers?
Hi Hilmar,
we are trying to build a simple web sequence retrieval system on top of
BioSQL now that we have most public sequences loaded well, and I bumped
into a problem, maybe you've seen this before... some swissprot records
have many accession numbers, which are correctly parsed by SeqIO into
an array of accession numbers which eventually end up in a
Bio::Seq::RichSeq object as Bio::Annotation::SimpleValue objects with
tag "secondary_accession" and value the accession number.
However I can't seem to find them stored in BioSQL. The
secondary_accession key is stored in term correctly, with its term_id,
and that term_id is found in the seqfeature table as the "type_term_id"
always associated with the term_id for the "gene_name" key, but no
value to be found, not sure if I am looking in the wrong place, but
anyway the result is that so far I can't retrieve sequences by their
secondary_accession numbers...
Regardless of the solution to that, don't you think that secondary
accessions should get some sort of preferential treatment? In other
words perhaps be associated directly to the bioentry via
bioentry_qualifier_value? One would want to quickly search by all
accession numbers, without having to issue a slow select statement over
feature-related tables, right?
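
To make the idea concrete, here is a minimal sketch of the kind of lookup meant above, assuming the secondary accessions do end up in bioentry_qualifier_value keyed by the "secondary_accession" term. The table definitions are a cut-down, hypothetical subset of the schema (only the columns the query needs), the function names are made up, the accession values are placeholders, and SQLite is used purely so the example is self-contained. It is not what bioperl-db itself issues.

```python
import sqlite3

# Simplified, hypothetical subset of the tables discussed in this thread.
# The real BioSQL schema has more columns and constraints.
SCHEMA = """
CREATE TABLE bioentry (
    bioentry_id INTEGER PRIMARY KEY,
    accession   TEXT NOT NULL
);
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE bioentry_qualifier_value (
    bioentry_id INTEGER NOT NULL REFERENCES bioentry(bioentry_id),
    term_id     INTEGER NOT NULL REFERENCES term(term_id),
    value       TEXT
);
-- Index so the accession lookup does not scan the whole table.
CREATE INDEX bqv_term_value ON bioentry_qualifier_value (term_id, value);
"""


def find_by_any_accession(conn, accession):
    """Return sorted bioentry_ids whose primary or secondary accession matches."""
    rows = conn.execute(
        """
        SELECT bioentry_id FROM bioentry WHERE accession = ?
        UNION
        SELECT bqv.bioentry_id
        FROM bioentry_qualifier_value AS bqv
        JOIN term AS t ON t.term_id = bqv.term_id
        WHERE t.name = 'secondary_accession' AND bqv.value = ?
        """,
        (accession, accession),
    ).fetchall()
    return sorted(row[0] for row in rows)


def demo_connection():
    """Build an in-memory database with one entry and two placeholder secondary accessions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO bioentry VALUES (1, 'ACC0001')")
    conn.execute("INSERT INTO term VALUES (10, 'secondary_accession')")
    conn.executemany(
        "INSERT INTO bioentry_qualifier_value VALUES (1, 10, ?)",
        [("ACC0002",), ("ACC0003",)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    for acc in ("ACC0001", "ACC0002", "NOPE"):
        print(acc, find_by_any_accession(conn, acc))
```

The point of the UNION is that one query covers both the primary accession on bioentry and any secondary ones, without touching the feature-related tables at all.
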
Let me know what you think...
Elia
---
Bioinformatics Program Manager
Temasek Life Sciences Laboratory
1, Research Link
Singapore 117604
Tel. +65 6874 4945
Fax. +65 6872 7007