[Bioperl-l] Download sequence annotations without sequence ??

SG sgoegel at gmail.com
Fri Jul 1 18:27:44 EDT 2005


I have scripts and modules set up to, for a given blast report, go through and 
download sequences (when not available locally) for certain subjects (hits) 
and extract information such as db_xref fields, geneontology annotations, 
taxon ID, and features.
The one thing I am not using is the actual DNA or amino acid sequence itself.
For large sequences such as genomic DNA, which can be several megabases in 
size or more, it is impractical to download the entire sequence, which I do 
not need.

My question is, does Bioperl currently have a way to download only the 
annotations/features associated with a sequence (in GenBank format, for 
example), but not the sequence itself? If NCBI does not currently offer a way 
to do that, it would be enough to close the connection to the server as soon 
as the ORIGIN line is reached.
Of course, that would limit each query to a single sequence, which is perfectly 
fine under the circumstances.
For pipelined downloads (the default), the $/ input record separator would have 
to be modified accordingly. I have done this, but I want to make sure it is not 
already a standard feature of some part of Bioperl. Also, if Bioperl does not 
currently do this, is there interest in a patch to add this functionality 
(assuming I get around to making one)?
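
To make the idea concrete, here is a minimal sketch of the "stop at ORIGIN" approach. It is written in Python rather than Perl only so that it is self-contained, and it is not Bioperl code. The function name read_annotations and the sample record (including the accession TEST0001) are made up for illustration, and an in-memory stream stands in for the server connection.

```python
import io


def read_annotations(stream):
    """Return the GenBank flat-file text that precedes the ORIGIN line.

    `stream` is any iterable of text lines (hypothetical stand-in for the
    body of a server response). Reading stops at the first line that starts
    with ORIGIN, so the sequence data after it is never consumed.
    """
    kept = []
    for line in stream:
        # In the GenBank flat-file format the ORIGIN keyword begins in
        # column 1 and introduces the sequence block.
        if line.startswith("ORIGIN"):
            break
        kept.append(line)
    return "".join(kept)


# Invented record for illustration only; not a real GenBank entry.
SAMPLE = """\
LOCUS       TEST0001                  24 bp    DNA     linear   UNA 01-JUL-2005
DEFINITION  Made-up record for illustration.
FEATURES             Location/Qualifiers
     source          1..24
                     /db_xref="taxon:9606"
ORIGIN
        1 acgtacgtac gtacgtacgt acgt
//
"""

# The context manager closes the stream as soon as the annotations are read.
with io.StringIO(SAMPLE) as stream:
    annotations = read_annotations(stream)

print(annotations, end="")
```

With a real network stream, closing the connection right after the loop exits should stop most of the remaining transfer, although whatever the client has already buffered will still have been downloaded. As noted above, this handles only one record per request.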

SG

