[BioSQL-l] genbank, references, and crc's
Bryan Cardillo
dillo at pcbi.upenn.edu
Mon Apr 9 16:05:03 UTC 2007
This is probably more of a BioPerl issue, but since it was
previously discussed here, I'll continue the discussion on
this list. I've just run into the same issues mentioned in
these threads while loading some RefSeq sequences.
http://lists.open-bio.org/pipermail/biosql-l/2006-July/001024.html
http://lists.open-bio.org/pipermail/biosql-l/2006-August/001048.html
I believe the bioperl-db patch below solves these issues.
The crux of the problem is that the _crc64 code uses the
authors, title, and location to determine a unique key.
However, the get_unique_key_query method only checks authors
before deferring to a CRC lookup, so a reference without
authors never gets a CRC-based query. The fix causes the CRC
key to be used if any of authors, title, or location is
specified.
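
To illustrate the mismatch, here is a minimal standalone sketch. It
does not use bioperl-db: the reference is a plain hash, the sample
values are made up, and fake_key, old_queries, and new_queries are
hypothetical helpers standing in for _crc64 and for the old and
patched conditions in get_unique_key_query.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a reference object: a plain hash, not the
# real BioPerl reference class. The values are made up for illustration.
my $ref = {
    authors  => undef,
    title    => 'Direct Submission',
    location => 'Submitted (01-JAN-2007)',
};

# Stand-in for _crc64: joins the same three fields the key is derived
# from. No actual CRC is computed here.
sub fake_key {
    my ($r) = @_;
    return join('|', map { defined $r->{$_} ? $r->{$_} : '' }
                     qw(authors title location));
}

# Old condition: the key-based query is added only if authors is set.
sub old_queries {
    my ($r) = @_;
    my @ukqueries;
    push(@ukqueries, { 'doc_id' => fake_key($r) }) if $r->{authors};
    return @ukqueries;
}

# Patched condition: any of authors, title, or location is enough.
sub new_queries {
    my ($r) = @_;
    my @ukqueries;
    push(@ukqueries, { 'doc_id' => fake_key($r) })
        if $r->{authors} || $r->{title} || $r->{location};
    return @ukqueries;
}

printf "old: %d query(ies), new: %d query(ies)\n",
    scalar(old_queries($ref)), scalar(new_queries($ref));
```

For a reference with a title and location but no authors, the old
condition yields no key-based query while the patched one yields one,
which is the case that shows up with the RefSeq entries.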
Cheers,
Bryan Cardillo
Penn Bioinformatics Core
University of Pennsylvania
ReferenceAdaptor.pm | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)
Index: ./Bio/DB/BioSQL/ReferenceAdaptor.pm
===================================================================
RCS file: /home/repository/bioperl/bioperl-db/Bio/DB/BioSQL/ReferenceAdaptor.pm,v
retrieving revision 1.24
diff -u -r1.24 ReferenceAdaptor.pm
--- ./Bio/DB/BioSQL/ReferenceAdaptor.pm 4 Jul 2006 22:23:12 -0000 1.24
+++ ./Bio/DB/BioSQL/ReferenceAdaptor.pm 9 Apr 2007 15:38:35 -0000
@@ -426,7 +426,7 @@
             });
         }
     }
-    if($obj->authors()) {
+    if($obj->authors() || $obj->title() || $obj->location()) {
         push(@ukqueries, {
             'doc_id' => $self->_crc64($obj),
         });