[Bioperl-l] Indexing nr database

Dave Messina David.Messina at sbc.su.se
Tue Sep 7 09:23:42 UTC 2010


Hi Ross,

What do you need the index for?

If it's random retrieval of sequences using an accession or GI, you'd be better off using NCBI's own database indexing and retrieval tools. They're far faster than BioPerl.

They're distributed with Blast+ and available here:

	ftp://ftp.ncbi.nlm.nih.gov//blast/executables/LATEST

Specifically, I'm talking about 'makeblastdb' and 'blastdbcmd'.
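
As a rough sketch of how those two tools might be driven from a script, the Python below builds the command lines for indexing a FASTA file and then pulling out one sequence by accession or GI. The helper names (makeblastdb_cmd, blastdbcmd_cmd, run) are made up for illustration, and it assumes the BLAST+ binaries are on your PATH.

```python
import shutil
import subprocess


def makeblastdb_cmd(fasta_path, db_name, dbtype="prot"):
    """Build the argv for indexing a FASTA file as a BLAST database."""
    return [
        "makeblastdb",
        "-in", fasta_path,
        "-dbtype", dbtype,   # "prot" for protein data such as nr, "nucl" for nucleotide
        "-parse_seqids",     # keep sequence IDs so entries can be looked up later
        "-out", db_name,
    ]


def blastdbcmd_cmd(db_name, entry):
    """Build the argv for fetching one sequence by accession or GI."""
    return [
        "blastdbcmd",
        "-db", db_name,
        "-entry", entry,
        "-outfmt", "%f",     # %f asks for FASTA-formatted output
    ]


def run(cmd):
    """Run a BLAST+ command and return its stdout (hypothetical helper)."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(cmd[0] + " not found on PATH; is BLAST+ installed?")
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

With BLAST+ installed, calling run(makeblastdb_cmd("my_proteins.fasta", "my_proteins")) once and then run(blastdbcmd_cmd("my_proteins", some_accession)) should return that record as FASTA text. The file name and database name here are placeholders.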



I'm not sure what you mean by "4g" nr, but there's an already-indexed version of nr available here:

		ftp://ftp.ncbi.nih.gov//blast/db

You can use that directly with the BLAST+ database tools.
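
If you have a whole list of identifiers to pull from that pre-indexed nr, blastdbcmd can read them from a file instead of taking one at a time. Here is a minimal sketch in Python, assuming the nr volumes have been downloaded and unpacked where BLAST+ can find them (for example via the BLASTDB environment variable). The helper names and file names are hypothetical.

```python
def write_id_list(ids, path):
    """Write one accession or GI per line, the layout -entry_batch reads."""
    with open(path, "w") as handle:
        for seq_id in ids:
            handle.write(str(seq_id) + "\n")
    return path


def batch_fetch_cmd(db_name, id_file, out_fasta):
    """Build the argv for fetching many sequences in a single blastdbcmd call."""
    return [
        "blastdbcmd",
        "-db", db_name,            # e.g. "nr" for the pre-indexed download
        "-dbtype", "prot",         # nr is a protein database
        "-entry_batch", id_file,   # file of identifiers, one per line
        "-outfmt", "%f",           # FASTA-formatted output
        "-out", out_fasta,
    ]
```

Passing the resulting list to subprocess.run(..., check=True) would write the matching records to the output file. One call per batch avoids starting a new process for every sequence.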


Also, take a look at the cookbook at the end of the Blast+ user manual (available in the same download directory as Blast+ itself). There are some nice examples there showing off the flexibility of this latest version of the software.



Dave




