[Bioperl-l] Bio::DB::Fasta::get_Seq_by_id()

Jason Stajich jason at bioperl.org
Mon Oct 30 18:08:15 UTC 2006


Bio::PrimarySeq makes sense because Fasta databases only provide  
sequences without features.  But you are actually getting a  
Bio::PrimarySeq::Fasta object which is a proxy object since the  
module won't pull a whole sequence into memory unless seq() is  
requested.

The real problem is why primary_id is giving you something useless
(the stringified object reference).

What do you want it to be - the GI number?  You'll need to set it
explicitly, because DB::Fasta has no concept of GI numbers encoded in
the header line.
AFAIK you also cannot set the primary_id to a value of your liking,
because this is a proxy object.  The best bet is to create a Bio::Seq
object out of one of these and set the primary_id and display_id to
values that you can compute from the display_id.

At least that has been my strategy when using this - maybe someone
wants to code something new into the object itself.
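
A rough sketch of that strategy, in case it helps.  The helper names
gi_from_display_id and seq_from_proxy are made up for illustration, and
the regex assumes NCBI-style IDs of the form "gi|12345|gb|AB000001.1|";
adjust it to whatever your headers actually look like.  I am also
assuming the proxy object answers description() in your BioPerl
version.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the GI number from an NCBI-style ID such as
# "gi|12345|gb|AB000001.1|". Returns undef when there is no "gi|" prefix.
sub gi_from_display_id {
    my ($display_id) = @_;
    return unless defined $display_id;
    return $display_id =~ /^gi\|(\d+)/ ? $1 : undef;
}

# Hypothetical helper: copy a Bio::PrimarySeq::Fasta proxy into a real
# Bio::Seq so that primary_id can be set to something meaningful.
sub seq_from_proxy {
    my ($proxy) = @_;
    require Bio::Seq;    # loaded on demand; needs BioPerl installed

    my $display_id = $proxy->display_id;
    my $gi         = gi_from_display_id($display_id);

    return Bio::Seq->new(
        -seq        => $proxy->seq,    # the sequence is read from disk here
        -display_id => $display_id,
        # Fall back to the display_id when no GI can be parsed out.
        -primary_id => defined $gi ? $gi : $display_id,
        -desc       => $proxy->description,    # assumed to exist on the proxy
    );
}

# Usage sketch (the path and ID below are placeholders):
#   require Bio::DB::Fasta;
#   my $db    = Bio::DB::Fasta->new('/path/to/cache.fa');
#   my $proxy = $db->get_Seq_by_id('gi|12345|gb|AB000001.1|');
#   my $seq   = $proxy ? seq_from_proxy($proxy) : undef;
```

With something like this, both the GenBank path and the cache path hand
back a Bio::Seq, so the rest of the script does not need to care where
the sequence came from.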

-jason
On Oct 30, 2006, at 12:51 AM, Nathan S. Haigh wrote:

> In my script I retrieve sequences from GenBank in FASTA format by GI
> numbers and optionally store the sequence in a cache using
> Bio::DB::Fasta. On subsequent runs of the script, the cache is first
> checked for the GI and returns the sequence if it is found or the
> sequence is obtained from GenBank as above.
>
> I would have thought that Bio::DB::Fasta::get_Seq_by_id() would have
> returned a Bio::Seq object but rather it returns a Bio::PrimarySeq
> object which is defined within the Bio::DB::Fasta file. This is
> annoying, since $seq_obj in my script would be either a Bio::Seq if it
> was obtained from GenBank or a Bio::PrimarySeq if obtained from the
> cache and calling primary_id() on it doesn't do the expected thing
> with Bio::PrimarySeq:
> ID:          Bio::PrimarySeq::Fasta=HASH(0x89b4508)
>
> Is there a reason why Bio::DB::Fasta doesn't return a Bio::Seq object?
>
> Nath

--
Jason Stajich, PhD
Miller Research Fellow
University of California
Dept of Plant and Microbial Biology
321 Koshland Hall #3102
Berkeley, CA 94720-3102
lab: 510.642.8441
http://pmb.berkeley.edu/~taylor/people/js.html