[Biopython-dev] should we make a BLAT parser?

Brandon King kingb at caltech.edu
Thu Jul 7 20:44:37 EDT 2005


FYI, I just looked at my code and realized I wrote a BLAT parser that
loads the data into simple objects. I don't know if that might be
useful? If you're interested, I can tell you more about it.

-Brandon King
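
To make "loads the data into simple objects" concrete, here is a minimal sketch of that idea. It is not the code mentioned above, which is not shown in this thread. It assumes BLAT's tab-separated PSL output with the standard 21 columns; the names PslHit and parse_psl are hypothetical and only a subset of the columns is kept.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class PslHit:
    """One alignment row from a PSL file (subset of the 21 columns)."""
    matches: int
    mismatches: int
    strand: str
    q_name: str
    q_start: int
    q_end: int
    t_name: str
    t_start: int
    t_end: int
    block_sizes: List[int]


def parse_psl(lines: Iterable[str]) -> Iterator[PslHit]:
    """Yield a PslHit for every data row; header and blank lines are skipped."""
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Data rows have 21 tab-separated columns and begin with the match
        # count, so this test also skips the optional "psLayout" header block.
        if len(cols) != 21 or not cols[0].isdigit():
            continue
        yield PslHit(
            matches=int(cols[0]),
            mismatches=int(cols[1]),
            strand=cols[8],
            q_name=cols[9],
            q_start=int(cols[11]),
            q_end=int(cols[12]),
            t_name=cols[13],
            t_start=int(cols[15]),
            t_end=int(cols[16]),
            # blockSizes is a comma-separated list with a trailing comma.
            block_sizes=[int(x) for x in cols[18].split(",") if x],
        )
```

With an open file handle this would be used as "for hit in parse_psl(handle): ...", so the same loop shape works for one result or a million.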

Yair Benita wrote:

> Since the only differences are in the header/footer and some spaces 
> and numbers, it is essentially just like parsing a BLAST output.
> Tomorrow I will post all the changes needed. On my machine I just 
> made a copy of the NCBIStandalone and modified it to fit the BLAT 
> output but the correct way to do this is to modify the original 
> NCBIStandalone to handle all these outputs. The thing is I don't 
> fully understand how this parser works (with all those uhandles, 
> scanners, consumers, etc.), so I'd rather someone who does make the 
> changes in CVS.
>
> Yair
>
> On Jul 7, 2005, at 20:30, Brandon King wrote:
>
>> Hi Yair,
>>     I'm new to the developers list, but I do think it would be a great
>> idea to create a BLAT parser based on the NCBIStandalone module. I have
>> to do about a million BLATs soon. I have code for processing many BLAST
>> results from NCBIStandalone, but I don't have anything nearly as
>> good for BLAT. Being able to use the same analysis code for BLAST/BLAT
>> would be great (assuming the change you're talking about will return
>> result objects the same way the NCBIStandalone module does?).
>>
>> -Brandon King
>>
>> Yair Benita wrote:
>>
>>
>>> I noticed a while ago that someone asked for a BLAT parser.
>>> I just had to run a few thousand BLATs and I didn't really like the
>>> psl output format it used. It is a bit confusing in my opinion. So I
>>> used the blast-like output, and with minor changes to the
>>> NCBIStandalone module I was able to parse it with no problems.
>>>
>>> Should we introduce modifications in the NCBIStandalone file or
>>> make a new separate file for parsing BLAT output?
>>>
>>> The main changes are in the header and footer of the file. I append
>>> examples below. There were a few other minor changes.
>>>
>>> Yair
>>>
>>> ----- header blat ------
>>> BLASTN 2.2.4 [blat]
>>>
>>> Reference:  Kent, WJ. (2002) BLAT - The BLAST-like alignment tool
>>>
>>> ----- header blast ------
>>> BLASTX 2.2.6 [Apr-09-2003]
>>>
>>>
>>> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
>>> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
>>> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
>>> programs",  Nucleic Acids Res. 25:3389-3402.
>>>
>>> ----- footer blat ------
>>>  Database: localhost:4303
>>>
>>> ----- footer blast ------
>>>  Database: nr
>>>    Posted date:  Aug 11, 2004  8:59 AM
>>>  Number of letters in database: 663,053,178
>>>  Number of sequences in database:  1,971,122
>>>
>>> Lambda     K      H
>>>   0.310    0.133    0.405
>>>
>>> Gapped
>>> Lambda     K      H
>>>   0.267   0.0410    0.140
>>>
>>>
>>> Matrix: BLOSUM62
>>> Gap Penalties: Existence: 11, Extension: 1
>>> Number of Hits to DB: 111,495,368
>>> Number of Sequences: 1971122
>>> Number of extensions: 811791
>>> Number of successful extensions: 2455
>>> Number of sequences better than 1.0e-01: 0
>>> Number of HSP's better than  0.1 without gapping: 2446
>>> Number of HSP's successfully gapped in prelim test: 0
>>> Number of HSP's that attempted gapping in prelim test: 0
>>> Number of HSP's gapped (non-prelim): 2455
>>> length of database: 663,053,178
>>> effective HSP length: 2
>>> effective length of database: 659,110,934
>>> effective search space used: 15818662416
>>> frameshift window, decay const: 50,  0.1
>>> T: 12
>>> A: 40
>>> X1: 16 ( 7.2 bits)
>>> X2: 38 (14.6 bits)
>>> X3: 64 (24.7 bits)
>>> S1: 42 (21.7 bits)
>>>
>>>
>>
>>
>>
>
>
>
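
The header difference quoted above suggests that one parser could tell the two outputs apart from the first line alone. A minimal sketch, assuming the first line always has the form "PROGRAM VERSION [TAG]" as in the two examples in this thread; classify_header is a hypothetical helper and is not part of NCBIStandalone.

```python
import re

# Matches lines such as "BLASTN 2.2.4 [blat]" or "BLASTX 2.2.6 [Apr-09-2003]".
_HEADER = re.compile(r"^(T?BLAST[NPX])\s+(\S+)\s+\[([^\]]+)\]")


def classify_header(first_line):
    """Return (program, version, source), where source is 'blat' or 'blast'."""
    match = _HEADER.match(first_line.strip())
    if match is None:
        raise ValueError("not a BLAST-like header: %r" % first_line)
    program, version, tag = match.groups()
    # BLAT puts the literal tag "blat" where NCBI BLAST prints a build date.
    source = "blat" if tag.lower() == "blat" else "blast"
    return program, version, source
```

A caller could use the returned source to decide whether to expect the long statistics footer (BLAST) or only the "Database:" line (BLAT), which are the two footers shown above.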


