[Biojava-l] Issues with BioSqlRichSequenceDB.java class
Deepak Sheoran
sheoran143 at gmail.com
Thu Feb 11 09:00:15 UTC 2010
Hi,
The BioSQLRichSequenceDB class has methods to retrieve records from a
local instance of the BioSQL schema. When you pass in the accession
number of a record it usually returns the record, but in some cases
(for example the record with accession M97762) it gives the following
error:
Hibernate: select sequence0_.bioentry_id as bioentry1_9_,
sequence0_1_.name as name9_, sequence0_1_.identifier as identifier9_,
sequence0_1_.accession as accession9_, sequence0_1_.description as
descript5_9_, sequence0_1_.version as version9_, sequence0_1_.division
as division9_, sequence0_1_.taxon_id as taxon8_9_,
sequence0_1_.biodatabase_id as biodatab9_9_, sequence0_.version as
version13_, sequence0_.length as length13_, sequence0_.alphabet as
alphabet13_, sequence0_.seq as seq13_ from biosequence sequence0_ inner
join bioentry sequence0_1_ on
sequence0_.bioentry_id=sequence0_1_.bioentry_id where sequence0_1_.name=?
Exception in thread "main" java.lang.RuntimeException: Error while trying to load by id: M97762
    at org.biojavax.bio.db.biosql.BioSQLRichSequenceDB.getRichSequence(BioSQLRichSequenceDB.java:212)
    at com.orionbiosciences.orionGenBankLib.genBankDb.GenBankDb.GenBankDbToFileDownLoader(GenBankDb.java:355)
    at trashtesting.Main.main(Main.java:39)
Caused by: org.biojava.bio.seq.db.IllegalIDException: Id not found: M97762
    at org.biojavax.bio.db.biosql.BioSQLRichSequenceDB.getRichSequence(BioSQLRichSequenceDB.java:206)
    ... 2 more
Java Result: 1
The only way to find this record in my database is to search by its
LOCUS name, which is "BTVNS1TUBA", instead of by its accession number.
The Javadoc for the BioSQLRichSequenceDB class says the id should be the
GenBank id, and I can't understand what that means. When I investigated
the matter, I found that the error is in the following method:
    public RichSequenceDB getRichSequences(Set ids, RichSequenceDB db)
            throws BioException, IllegalIDException {
        if (db == null) db = new HashRichSequenceDB();
        try {
            for (Iterator i = ids.iterator(); i.hasNext(); ) {
                String id = (String) i.next();
                // Build the query object.
                // ERROR (current code): the query matches on name, which
                // holds the LOCUS from the GenBank record:
                //     String queryText = "from Sequence where name = ?";
                // SOLUTION (proposed): match on accession instead. The name
                // has no unique constraint, so it is not a good key for
                // finding unique records, and people usually refer to a
                // GenBank record by accession number rather than by LOCUS.
                String queryText = "from Sequence where accession = ?";
                Object query = this.createQuery.invoke(this.session,
                        new Object[]{queryText});
                // Set the parameters
                query = this.setParameter.invoke(query,
                        new Object[]{new Integer(0), id});
                // Get the results
                List result = (List) this.list.invoke(query, (Object[]) null);
                // If the result is empty, throw an exception
                if (result.size() == 0)
                    throw new IllegalIDException("Id not found: " + id);
                // Add the results to the results db.
                for (Iterator j = result.iterator(); j.hasNext(); )
                    db.addRichSequence((RichSequence) j.next());
            }
        } catch (Exception e) {
            // Throw the exception with our nice message
            throw new RuntimeException(
                    "Error while trying to load by ids: " + ids, e);
        }
        return db;
    }
Even NCBI says: "It is better to search for the actual accession number
rather than the locus name, because the accessions are stable and locus
names can change."
REF: http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html#LocusNameB
So my suggestion is to change the query in this method so that it looks
up the accession instead of the name.
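
To make the difference concrete, here is a small self-contained sketch
that does not touch BioSQL or Hibernate at all. BioEntryRow and
AccessionLookupSketch are hypothetical names I made up for illustration;
the only real data is the LOCUS/accession pair from my record. It just
shows that filtering on name with an accession as the key finds nothing,
while filtering on accession finds the record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one row of the bioentry table (not a BioJava class).
class BioEntryRow {
    final String name;       // GenBank LOCUS name
    final String accession;  // GenBank accession number

    BioEntryRow(String name, String accession) {
        this.name = name;
        this.accession = accession;
    }
}

class AccessionLookupSketch {

    // Mimics the current behaviour: "from Sequence where name = ?"
    static List<BioEntryRow> findByName(List<BioEntryRow> rows, String id) {
        List<BioEntryRow> hits = new ArrayList<BioEntryRow>();
        for (BioEntryRow row : rows) {
            if (row.name.equals(id)) hits.add(row);
        }
        return hits;
    }

    // Mimics the proposed behaviour: "from Sequence where accession = ?"
    static List<BioEntryRow> findByAccession(List<BioEntryRow> rows, String id) {
        List<BioEntryRow> hits = new ArrayList<BioEntryRow>();
        for (BioEntryRow row : rows) {
            if (row.accession.equals(id)) hits.add(row);
        }
        return hits;
    }

    // The one record discussed in this mail: LOCUS BTVNS1TUBA, accession M97762.
    static List<BioEntryRow> sampleRows() {
        List<BioEntryRow> rows = new ArrayList<BioEntryRow>();
        rows.add(new BioEntryRow("BTVNS1TUBA", "M97762"));
        return rows;
    }

    public static void main(String[] args) {
        List<BioEntryRow> rows = sampleRows();
        // Prints 0: the accession never matches the name column.
        System.out.println("by name:      " + findByName(rows, "M97762").size());
        // Prints 1: the accession matches the accession column.
        System.out.println("by accession: " + findByAccession(rows, "M97762").size());
    }
}
```

An empty result from the name lookup is the case where the real method
throws IllegalIDException("Id not found: ...").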
Also, if you try to download the record from NCBI using the Java
interface, it works with the accession M97762 (as the genbank_id), but
when you try to fetch it using the LOCUS name you get a bad section
exception around the reference. I don't know why.
Deepak Sheoran