[Bioperl-l] blast output -> blast -m8 output

Amir Karger akarger at CGR.Harvard.edu
Wed Jan 11 16:13:41 EST 2006


> From: Jason Stajich [mailto:jason.stajich at duke.edu] 
> 
> The existing search2table script in scripts/searchio does this for  
> you - I don't think there is a writer plugin but there could be.

Ah, nice. But:
-------------------
>perl bioperl-1.5.0-RC1/scripts/searchio/search2table.PLS seqs.blp > zzz
>more zzz
Bacteriophage_1[M19348]  ref|NP_037061.1|  40.32  62   27   4   28  89   1050  1107  6e-05  46.6
Bacteriophage_1[M19348]  ref|XP_193814.5|  48.89  45   16   6   57  95   320   364   0.001  42.7
Bacteriophage_1[M19348]  ref|XP_912463.1|  48.89  45   16   6   57  95   866   910   0.001  42.7
Bacteriophage_1[M19348]  ref|XP_619329.2|  48.89  45   16   6   57  95   676   720   0.001  42.7
C.elegans_1_[Z49071]     ref|XP_917828.1|  29.61  412  183  48  40  410  52    456   6e-43  173
C.elegans_1_[Z49071]     gb|AAI10184.1|    31.99  347  147  23  40  373  53    389   6e-42  169
>more seqs.m8
Bacteriophage_1[M19348]  gi|6978677|ref|NP_037061.1|   40.32  62   33   1   28  89   1050  1107  6e-05  46.6
Bacteriophage_1[M19348]  gi|82958039|ref|XP_193814.5|  48.89  45   17   1   57  95   320   364   0.001  42.7
Bacteriophage_1[M19348]  gi|82958037|ref|XP_912463.1|  48.89  45   17   1   57  95   866   910   0.001  42.7
Bacteriophage_1[M19348]  gi|82957449|ref|XP_619329.2|  48.89  45   17   1   57  95   676   720   0.001  42.7
C.elegans_1_[Z49071]     gi|82802536|ref|XP_917828.1|  29.61  412  242  9   40  410  52    456   6e-43  173
C.elegans_1_[Z49071]     gi|82571607|gb|AAI10184.1|    31.99  347  213  11  40  373  53    389   6e-42  169
-----------------

I know we can't get around the ID problem, since standard blast output and
blast -m8 output report different IDs. But columns 5 and 6 (mismatches, gap
openings) are consistently different. Is search2table not trying to mimic -m8
exactly, or is this a bug?
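
One way to narrow down the column 5/6 discrepancy is to pair up the rows from
the two outputs and test a guess against them. The Python sketch below is only
that: it assumes (this is an inference from the six rows quoted above, not
something taken from the search2table or BLAST source) that search2table's
column 6 counts total gap characters rather than gap openings, and that the -m8
mismatch count equals alignment length minus identities minus those gap
characters. Identities are recovered by rounding percent identity times length.
The names strip_gi, parse and compare are hypothetical helpers made up for this
illustration.

```python
# Rows copied from the two outputs quoted in this message.
SEARCH2TABLE = """\
Bacteriophage_1[M19348] ref|NP_037061.1| 40.32 62 27 4 28 89 1050 1107 6e-05 46.6
Bacteriophage_1[M19348] ref|XP_193814.5| 48.89 45 16 6 57 95 320 364 0.001 42.7
Bacteriophage_1[M19348] ref|XP_912463.1| 48.89 45 16 6 57 95 866 910 0.001 42.7
Bacteriophage_1[M19348] ref|XP_619329.2| 48.89 45 16 6 57 95 676 720 0.001 42.7
C.elegans_1_[Z49071] ref|XP_917828.1| 29.61 412 183 48 40 410 52 456 6e-43 173
C.elegans_1_[Z49071] gb|AAI10184.1| 31.99 347 147 23 40 373 53 389 6e-42 169
"""

M8 = """\
Bacteriophage_1[M19348] gi|6978677|ref|NP_037061.1| 40.32 62 33 1 28 89 1050 1107 6e-05 46.6
Bacteriophage_1[M19348] gi|82958039|ref|XP_193814.5| 48.89 45 17 1 57 95 320 364 0.001 42.7
Bacteriophage_1[M19348] gi|82958037|ref|XP_912463.1| 48.89 45 17 1 57 95 866 910 0.001 42.7
Bacteriophage_1[M19348] gi|82957449|ref|XP_619329.2| 48.89 45 17 1 57 95 676 720 0.001 42.7
C.elegans_1_[Z49071] gi|82802536|ref|XP_917828.1| 29.61 412 242 9 40 410 52 456 6e-43 173
C.elegans_1_[Z49071] gi|82571607|gb|AAI10184.1| 31.99 347 213 11 40 373 53 389 6e-42 169
"""


def strip_gi(hit_id):
    """Drop a leading 'gi|<number>|' so both ID styles compare equal."""
    parts = hit_id.split("|")
    if len(parts) > 2 and parts[0] == "gi" and parts[1].isdigit():
        return "|".join(parts[2:])
    return hit_id


def parse(text):
    """Map (query, normalized hit ID) to the columns of interest."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 12:
            continue  # skip anything that is not a 12-column table row
        key = (fields[0], strip_gi(fields[1]))
        rows[key] = {
            "pct": float(fields[2]),
            "length": int(fields[3]),
            "col5": int(fields[4]),
            "col6": int(fields[5]),
        }
    return rows


def compare(s2t_text, m8_text):
    """Return (hit ID, mismatches derived from search2table row, -m8 column 5)."""
    s2t, m8 = parse(s2t_text), parse(m8_text)
    results = []
    for key, row in s2t.items():
        if key not in m8:
            continue
        identities = round(row["pct"] * row["length"] / 100)
        # Assumption under test: search2table column 6 is total gap characters.
        derived = row["length"] - identities - row["col6"]
        results.append((key[1], derived, m8[key]["col5"]))
    return results


for hit, derived, reported in compare(SEARCH2TABLE, M8):
    print(hit, derived, reported, "agree" if derived == reported else "differ")
```

On these six rows the derived value agrees with the -m8 mismatch column every
time, which is at least consistent with the two tools counting gaps
differently. It does not explain how search2table arrives at its own column 5,
and six rows are too few to settle whether this is intended or a bug.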

Apologies if this is due to using bioperl 1.4 and the PLS script from
1.5.0-RC1. That's what I have on hand.

> 
> Note that if you are just using BLAST you will find that the blast2table  
> script that is included in the BLAST book (see the O'Reilly website  
> for the book and download the code examples) will also generate this  
> sort of thing for you and will be many times faster than SearchIO  
> code. 

I could steal that. But I was thinking that if NCBI changes the BLAST
format, bioperl may be updated while the dead-tree code won't.

- Amir Karger
Computational Biology Group
Bauer Center for Genomics Research
Harvard University
617-496-0626

> There is also an equivalent hmmer_to_table and
> fastam9_to_table which are very fast re-formatters that don't
> actually use SearchIO since one is just trying to get the very
> simple data out.
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12/
> 
> 

