[Bioperl-l] Taxa Id from blast report
Smithies, Russell
Russell.Smithies at agresearch.co.nz
Tue Apr 23 18:32:25 EDT 2013
It works OK if I do it with NCBI's pre-formatted databases, e.g.:
illustrious$ blastx -query gold_small.fa -db /bifo/infernal/active/blastdata/mirror/nr -max_target_seqs 1 -outfmt "6 staxids sscinames sskingdoms"
411903 Collinsella aerofaciens ATCC 25986 Bacteria
411903 Collinsella aerofaciens ATCC 25986 Bacteria
39947 Oryza sativa Japonica Group Eukaryota
39947 Oryza sativa Japonica Group Eukaryota
39947 Oryza sativa Japonica Group Eukaryota
498761 Heliobacterium modesticaldum Ice1 Bacteria
391296 Streptococcus suis 98HAH33 Bacteria
391296 Streptococcus suis 98HAH33 Bacteria
Perhaps it's something to do with your database formatting or sequence IDs?
--Russell
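For anyone post-processing this kind of output, here is a minimal Python sketch that tallies hits per super kingdom. It assumes the three columns requested above (staxids, sscinames, sskingdoms) are tab-separated, as BLAST+ tabular output normally is; the function names and the sample lines are illustrative only, not part of BLAST+ or BioPerl.

```python
from collections import Counter


def parse_taxonomy_line(line):
    """Split one tab-separated line into (taxid, sciname, kingdom)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        raise ValueError(
            f"expected 3 tab-separated fields, got {len(fields)}: {line!r}"
        )
    return tuple(fields)


def summarize(lines):
    """Count hits per super kingdom; 'N/A' marks unresolved taxonomy."""
    counts = Counter()
    for line in lines:
        if line.strip():  # skip blank lines
            _, _, kingdom = parse_taxonomy_line(line)
            counts[kingdom] += 1
    return counts


# Sample lines modelled on the output in this thread (tabs assumed).
example = [
    "411903\tCollinsella aerofaciens ATCC 25986\tBacteria",
    "39947\tOryza sativa Japonica Group\tEukaryota",
    "246200\tN/A\tN/A",
]
print(summarize(example))
```

A non-zero count under the N/A key is a quick way to see how many hits came back without taxonomy names.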
From: shalu sharma [mailto:sharmashalu.bio at gmail.com]
Sent: Wednesday, 24 April 2013 5:14 a.m.
To: Jason Stajich
Cc: Smithies, Russell; Fields, Christopher J; Peter Cock; shalabh sharma; bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Taxa Id from blast report
Hi Jason,
Thanks a lot for your suggestion. I tried that too, but I am still not getting the super kingdom; actually, I don't know how to put the super kingdom into the database.
For example:
This is how I formatted my RefSeq microbial database:
makeblastdb -dbtype prot -in microbial_protein_mask.fasta -out refMicro -taxid_map GItaxa.txt -parse_seqids
(where GItaxa.txt is a file with lines of the form <GI> <TaxonomyId><newline>), so there is no super kingdom in it.
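Since the map file is described as one `<GI> <TaxonomyId>` pair per line, a small sanity check before running makeblastdb can rule out formatting problems. The sketch below is only an illustration under that assumption: `check_taxid_map` is a made-up helper, the GI values in the sample are invented, and blank lines are reported as problems too.

```python
def check_taxid_map(lines):
    """Return (line_number, text) pairs that are not '<id> <numeric taxid>'."""
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        # A valid line has exactly two fields and a purely numeric taxonomy ID.
        if len(fields) != 2 or not fields[1].isdigit():
            problems.append((number, line.rstrip("\n")))
    return problems


# Hypothetical sample: the first line is well formed, the other two are not.
sample = ["12345 246200\n", "67890\n", "13579 notanumber\n"]
print(check_taxid_map(sample))
```

To check a real file, pass an open file handle, e.g. `check_taxid_map(open("GItaxa.txt"))`.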
So when i run this blast command:
blastx -query test.fas -db refMicro -max_target_seqs 1 -outfmt "6 staxids sscinames sskingdoms"
246200 N/A N/A
246200 N/A N/A
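One thing worth ruling out when staxids is filled in but sscinames and sskingdoms come back as N/A: BLAST+ resolves names and super kingdoms from NCBI's taxonomy name database (the files taxdb.btd and taxdb.bti, distributed as taxdb.tar.gz), not from the -taxid_map file. The Python sketch below checks whether those two files are present in the directories listed in the BLASTDB environment variable. It is a sketch under assumptions: the helper names are made up, and BLAST+ may also look in other places (for example the current working directory), which this does not cover.

```python
import os

# File names shipped in NCBI's taxdb.tar.gz archive.
TAXDB_FILES = ("taxdb.btd", "taxdb.bti")


def missing_taxdb_files(filenames):
    """Return the taxdb files absent from an iterable of file names."""
    present = set(filenames)
    return [name for name in TAXDB_FILES if name not in present]


def check_blastdb_dirs(blastdb=None):
    """Map each existing directory in BLASTDB to its missing taxdb files."""
    if blastdb is None:
        blastdb = os.environ.get("BLASTDB", "")
    report = {}
    for directory in blastdb.split(os.pathsep):
        if directory and os.path.isdir(directory):
            report[directory] = missing_taxdb_files(os.listdir(directory))
    return report


if __name__ == "__main__":
    for directory, missing in check_blastdb_dirs().items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{directory}: {status}")
```

An empty report means BLASTDB is unset or points at no existing directory, which is itself worth checking.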
I would really appreciate your help.
Thanks
Shalu
On Fri, Apr 19, 2013 at 3:38 PM, Jason Stajich <jason.stajich at gmail.com> wrote:
Did you provide -parse_seqids in the header?
Peter dealt with related things here:
http://blastedbio.blogspot.com/2012/10/my-ids-not-good-enough-for-ncbi-blast.html
Jason
On Apr 19, 2013, at 1:05 PM, shalu sharma <sharmashalu.bio at gmail.com> wrote:
Hi,
Thanks, everyone, for your input.
@Peter:
I got really excited when I saw that you can even get the super kingdom, but
when I tried to test it I got only the taxa IDs and not the super kingdom. Do
you have any idea what's going wrong?
my command:
blastx -query test.fas -db /db/ncbiblast/refseq/latest/refseq_protein -max_target_seqs 1 -outfmt "6 staxids sskingdoms"
output:
246200 N/A
246200 N/A
Thanks
Shalu
On Thu, Apr 18, 2013 at 3:52 PM, Smithies, Russell <Russell.Smithies at agresearch.co.nz> wrote:
I agree they have finally listened and added features requested by users.
For years I've been suggesting they make a compressed output format available
from eutils or GenBank, but I have made no headway ;-(
What's so hard about gzip'ping the output? I'm sure it would go a long way
toward solving all the problems we get with truncated replies from queries!!
--Russell
-----Original Message-----
From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Fields, Christopher J
Sent: Friday, 19 April 2013 6:26 a.m.
To: Peter Cock
Cc: bioperl-l at lists.open-bio.org; shalu sharma; shalabh sharma
Subject: Re: [Bioperl-l] Taxa Id from blast report
On Apr 18, 2013, at 11:48 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
On Thu, Apr 18, 2013 at 5:32 PM, shalabh sharma <shalabh.sharma7 at gmail.com> wrote:
Hey Peter,
Thanks a lot, I really appreciate it. I have wanted these things
implemented in BLAST for a long time.
Thanks
Shalabh
Me too. You can get the descriptions from the plain text BLAST or XML
output already of course, but they're not so nice to work with.
Peter
NCBI has been much more receptive of user input over the last several
years, much more so than in the past. I understand the reasoning for
dropping BLAST support (though there were definitely needless bumps in that
process).
chris
_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================
Jason Stajich
jason.stajich at gmail.com
jason at bioperl.org