[Bioperl-l] Retrieving sequence from local blast database

Jason Stajich jason@cgt.mc.duke.edu
Fri, 9 Aug 2002 00:11:56 -0400 (EDT)


On Tue, 30 Jul 2002 AUnderwood@PHLS.org.uk wrote:

> Hi all,
>
> Does anyone know if there is a method within bioperl to retrieve the sequence
> of an entry from a local blast database formatted with the formatdb command?
> I have retrieved a hit with the standalone blast method and want to retrieve
> the entire sequence of the hit. If I use $hit->next_hsp->hit_string, the
> sequence retrieved is often not the entire sequence of the hit. Is it
> possible to use the accession number retrieved from the hit to retrieve the
> sequence from the local database, or otherwise retrieve the entire hit sequence?
>
Bioperl can't read NCBI-formatted index files (yet), and NCBI has been known
to change that format between versions of blast, so we're probably
not going to support it directly.

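Instead, if you have the NCBI-provided utility 'fastacmd', you can do:

use Bio::SeqIO;
# the trailing '|' makes open() read from the command's output
open(FAS, "fastacmd -s 'accessionnumber1,acc2,acc3' -d /path/to/db |")
    or die "cannot run fastacmd: $!";
my $seqio = new Bio::SeqIO(-format => 'fasta',
			   -fh     => \*FAS);

(make sure you formatted your db with 'formatdb -o T -i db ...', so that
the sequence IDs are parsed and indexed for lookup)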
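
A minimal sketch of how the fastacmd approach might be wrapped up and read
to the end, assuming fastacmd is on your PATH and bioperl is installed. The
helper names (fastacmd_args, fetch_seqs) are made up for illustration; only
the -s and -d flags come from the command above. It uses the list form of a
piped open, so the accession list never passes through the shell:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the fastacmd argument list (hypothetical helper).
sub fastacmd_args {
    my ($db, @accessions) = @_;
    return ('fastacmd', '-s', join(',', @accessions), '-d', $db);
}

# Run fastacmd and return the records it prints as sequence objects
# (hypothetical helper).
sub fetch_seqs {
    my ($db, @accessions) = @_;
    require Bio::SeqIO;    # loaded lazily; needs bioperl
    open(my $fh, '-|', fastacmd_args($db, @accessions))
        or die "cannot run fastacmd: $!";
    my $seqio = Bio::SeqIO->new(-format => 'fasta', -fh => $fh);
    my @seqs;
    while (my $seq = $seqio->next_seq) {
        push @seqs, $seq;
    }
    close($fh) or warn "fastacmd exited with status $?\n";
    return @seqs;
}

# Usage: perl fetch.pl /path/to/db acc1 acc2 ...
if (@ARGV >= 2) {
    my ($db, @accessions) = @ARGV;
    for my $seq (fetch_seqs($db, @accessions)) {
        printf "%s\t%d\n", $seq->display_id, $seq->length;
    }
}
```

Each object returned holds the full database entry, not just the aligned
region you get from hit_string.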

Alternatively, you can index the db yourself with
Bio::Index::Fasta or Bio::DB::Fasta (two different implementations that
basically do the same thing; DB::Fasta does a bit more)
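
A rough sketch of the Bio::DB::Fasta route, assuming you still have the
original FASTA file that was given to formatdb and that bioperl is
installed. The helper names (accession_from_header, fetch_full_hit) are
hypothetical, and the header parsing assumes NCBI-style pipe-delimited
IDs such as 'gi|12345|gb|AF123456.1|'; adjust it so the index key matches
whatever identifier you take from your hit:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a FASTA header line into the key used in the index
# (hypothetical helper). For 'gi|12345|gb|AF123456.1| desc' this
# returns the fourth pipe-delimited field; otherwise the first word.
sub accession_from_header {
    my ($header) = @_;
    my ($first) = $header =~ /^>?\s*(\S+)/;
    return unless defined $first;
    my @fields = split /\|/, $first;
    return @fields >= 4 ? $fields[3] : $first;
}

# Look up one entry by ID and return it as a sequence object,
# or undef if the ID is not in the index (hypothetical helper).
sub fetch_full_hit {
    my ($fasta_file, $id) = @_;
    require Bio::DB::Fasta;    # loaded lazily; needs bioperl
    # The index is built the first time the file is opened.
    my $db = Bio::DB::Fasta->new($fasta_file,
                                 -makeid => \&accession_from_header);
    return $db->get_Seq_by_id($id);
}

# Usage: perl lookup.pl /path/to/db.fasta AF123456.1
if (@ARGV == 2) {
    my ($fasta_file, $id) = @ARGV;
    my $seq = fetch_full_hit($fasta_file, $id)
        or die "no entry for '$id' in $fasta_file\n";
    printf ">%s length=%d\n%s\n", $id, $seq->length, $seq->seq;
}
```

Without the -makeid option the key is simply the first word of each
header line, which may be enough if your headers are plain IDs.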

-jason
> Many thanks
>
> Dr Anthony Underwood
> Bioinformatics Unit
> Central Public Health Laboratory
> 61 Colindale Avenue
> London
> NW9 5HT
> t:    0208 2004400 ext. 3618
> f:    0208 3583138
> e: aunderwood@phls.org.uk

-- 
Jason Stajich
Duke University
jason at cgt.mc.duke.edu