[BioPython] Regarding Blast and its output parsing

Sameet Mehta sameet at nccs.res.in
Tue Feb 22 14:28:41 EST 2005


Dear all,
Can the entire qblast API be used with Biopython? I just downloaded and
installed the latest version; before that I had installed the latest
CVS. But there doesn't seem to be a way to blast a particular genome
using Biopython. Is there a way to do this?

regards
Sameet
biopython-request at portal.open-bio.org wrote:

>
>Today's Topics:
>
>   1. Re: [Biopython-dev] [Fwd: Question to code] (Iddo Friedberg)
>   2. Bio.Blast.NCBIWWW.blast returns before search is complete
>      (Noah Hoffman)
>   3. Re: Bio.Blast.NCBIWWW.blast returns before search is complete
>      (Iddo Friedberg)
>   4. biopython + python 2.4? (Michael George Lerner)
>   5. Is the bug still there?? (Eirik Sønneland)
>   6. Re: Is the bug still there?? (Eirik Sønneland)
>   7. Re: Is the bug still there?? (Iddo Friedberg)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Thu, 17 Feb 2005 09:36:14 -0800
>From: Iddo Friedberg <idoerg at burnham.org>
>Subject: [BioPython] Re: [Biopython-dev] [Fwd: Question to code]
>To: Eirik Sønneland <eirik.sonneland at student.umb.no>,
>	biopython at biopython.org
>Message-ID: <4214D60E.40507 at burnham.org>
>Content-Type: text/plain; charset=UTF-8; format=flowed
>
>Eirik,
>
>Yes, you got that because you are probably using the 1.30 release 
>version, which does not include qblast. Normally I would recommend 
>checking out a new version from CVS. But these days the CVS is in 
>constant flux because we're racing to make the release deadline. Best 
>wait until Friday. (Saturday in Norway).
>
>In most other cases, you can download a tarball from:
>
>http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/biopython.tar.gz?tarball=1&cvsroot=biopython
>
>This you can open with winzip, but I am not sure about how to go about 
>installation from source in Windows.
>
>Thank you for your patience. And for drawing our attention to that bug.
>
>
>Best,
>
>Iddo
>
>
>
>
>
>
>Eirik Sønneland wrote:
>
>  
>
>>Iddo!
>>
>>Not too familiar with the use of CVS... is there an easy way to do this in
>>Windows? I tried your new code and got:
>>
>>Traceback (most recent call last):
>>  File "C:\Python24\MyWorkspace.py", line 23, in -toplevel-
>>    b_results = NCBIWWW.qblast('blastn', 'nr', f_record).read()
>>AttributeError: 'module' object has no attribute 'qblast'
>>
>>Yes, I can wait until Friday, but since I'm using Biopython for my
>>master's, it might be good to use CVS to always have the latest version?
>>
>>Thanks,
>>Eirik
>>
>>Iddo Friedberg wrote:
>>
>>    
>>
>>>OK, I fixed that.
>>>
>>>Eirik, you can check out a copy from CVS, or wait for the Friday release.
>>>
>>>
>>>
>>>./I
>>>
>>>
>>>Frank Kauff wrote:
>>>
>>>      
>>>
>>>>Eirik,
>>>>
>>>>See whether the code works with
>>>>
>>>>b_results = NCBIWWW.qblast('blastn', 'nr', f_record).read()
>>>>
>>>>instead of
>>>> 
>>>>
>>>>        
>>>>
>>>>>b_results = NCBIWWW.blast('blastn', 'nr', f_record).read()
>>>>>
>>>>>  
>>>>>          
>>>>>
>>>>I think that issue appeared a couple of months ago on the Biopython
>>>>list, essentially saying that qblast is the interface NCBI wants
>>>>people to use for BLAST scripts. After that, the qblast method was
>>>>added to NCBIWWW, and it's what I'm using in my BLAST scripts.
>>>>
>>>>Hope this helps,
>>>>Frank
>>>>
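For readers who hit the same AttributeError, here is a minimal sketch of checking for qblast before calling it, so that an older install fails with a clear message. The helper name require_qblast and the two stand-in objects are hypothetical; they are used only so the example runs without Biopython installed. With a real install one would pass Bio.Blast.NCBIWWW instead of a stand-in.

```python
from types import SimpleNamespace


def require_qblast(module):
    """Return module.qblast, or raise a readable error if it is missing."""
    qblast = getattr(module, "qblast", None)
    if not callable(qblast):
        raise RuntimeError(
            "This NCBIWWW module has no qblast function; "
            "a newer Biopython release or CVS checkout is probably needed."
        )
    return qblast


# Hypothetical stand-ins: one mimics a module that only offers blast,
# the other mimics a module that also offers qblast.
old_style = SimpleNamespace(blast=lambda program, database, query: None)
new_style = SimpleNamespace(qblast=lambda program, database, query: "submitted")

# The newer stand-in exposes qblast, so the call goes through.
qblast = require_qblast(new_style)
result = qblast("blastn", "nr", "ACGT")

# The older stand-in does not, so a descriptive error is raised instead
# of the bare AttributeError shown in the thread.
try:
    require_qblast(old_style)
    message = ""
except RuntimeError as err:
    message = str(err)
```

The check uses getattr with a default rather than catching AttributeError around the search call itself, so an AttributeError raised inside the search is not mistaken for a missing function.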
>>>>On Wed, 2005-02-16 at 14:02 -0800, Iddo Friedberg wrote:
>>>> 
>>>>
>>>>        
>>>>
>>>>>OK, this is a real bug. NCBIWWW seems to be broken.
>>>>>
>>>>>I'm having a looksee, but I'd like someone more versed in this than 
>>>>>me to do so.
>>>>>
>>>>>Thanks,
>>>>>
>>>>>Iddo
>>>>>
>>>>>
>>>>>
>>>>>-------- Original Message --------
>>>>>Subject:     Question to code
>>>>>Date:     Wed, 16 Feb 2005 15:19:57 +0100
>>>>>From:     Eirik Sønneland <eirik.sonneland at student.umb.no>
>>>>>To:     idoerg
>>>>>
>>>>>
>>>>>
>>>>>Dear Friedberg,
>>>>>
>>>>>I've been having problems parsing my output from NCBI using the
>>>>>example code given in the Biopython Cookbook. Therefore I tried to
>>>>>follow your code described in "Genome Informatics 14 (2003)". Still, I
>>>>>get an error message connected to the parsing. Could you please
>>>>>give me a hint as to what is wrong? Is this a bug? Code and output
>>>>>follow:
>>>>>
>>>>>Code:
>>>>>
>>>>>from Bio.Blast import NCBIWWW
>>>>>from Bio import Fasta
>>>>>
>>>>>file_for_blast = open('Fastaformat.txt', 'r')
>>>>>f_iterator = Fasta.Iterator(file_for_blast)
>>>>>f_record = f_iterator.next()
>>>>>
>>>>>b_results = NCBIWWW.blast('blastn', 'nr', f_record).read()
>>>>>
>>>>>b_record = NCBIWWW.BlastParser().parse_str(b_results)
>>>>>
>>>>>Output (I have cut the beginning and pasted only the last part of
>>>>>the output):
>>>>>
>>>>>
>>>>>Score = 40.1 bits (20), Expect = 6.3
>>>>>Identities = 20/20 (100%)
>>>>>Strand = Plus / Minus
>>>>>
>>>>>Query: 29     ctgcagctcgggctcctgcc 48
>>>>>              ||||||||||||||||||||
>>>>>Sbjct: 150928 ctgcagctcgggctcctgcc 150909
>>>>></PRE>
>>>>>
>>>>>
>>>>><form>
>>>>>
>>>>><PRE>
>>>>>Lambda     K      H
>>>>>   1.37    0.711     1.31
>>>>>
>>>>>Gapped
>>>>>Lambda     K      H
>>>>>   1.37    0.711     1.31
>>>>>
>>>>>Matrix: blastn matrix:1 -3
>>>>>Gap Penalties: Existence: 5, Extension: 2
>>>>>Number of Sequences: 2894376
>>>>>Number of Hits to DB: 6,089,259
>>>>>Number of extensions: 328661
>>>>>Number of successful extensions: 6259
>>>>>Number of sequences better than 10.0: 2
>>>>>Number of HSP's better than 10.0 without gapping: 2
>>>>>Number of HSP's gapped: 6259
>>>>>Number of HSP's successfully gapped: 2
>>>>>Number of extra gapped extensions for HSPs above 10.0: 6255
>>>>>Length of query: 600
>>>>>Length of database: 13,294,103,689
>>>>>Length adjustment: 22
>>>>>Effective length of query: 578
>>>>>Effective length of database: 13,230,427,417
>>>>>Effective search space: 7647187047026
>>>>>Effective search space used: 7647187047026
>>>>>A: 0
>>>>>X1: 11 (21.8 bits)
>>>>>X2: 15 (30.0 bits)
>>>>>X3: 25 (50.0 bits)
>>>>>S1: 14 (25.0 bits)
>>>>>S2: 20 (40.1 bits)
>>>>>
>>>>>
>>>>></form>
>>>>>
>>>>>Traceback (most recent call last):
>>>>> File "C:\Python24\MyWorkspace.py", line 50, in -toplevel-
>>>>>   b_record = NCBIWWW.BlastParser().parse_str(b_results)
>>>>> File "C:\Python24\Lib\site-packages\Bio\ParserSupport.py", line 52, in parse_str
>>>>>   return self.parse(File.StringHandle(string))
>>>>> File "C:\Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py", line 47, in parse
>>>>>   self._scanner.feed(handle, self._consumer)
>>>>> File "C:\Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py", line 99, in feed
>>>>>   self._scan_rounds(uhandle, consumer)
>>>>> File "C:\Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py", line 242, in _scan_rounds
>>>>>   self._scan_alignments(uhandle, consumer)
>>>>> File "C:\Python24\Lib\site-packages\Bio\Blast\NCBIWWW.py", line 322, in _scan_alignments
>>>>>   raise SyntaxError, "Cannot resolve location at line:\n%s" % line1
>>>>>SyntaxError: Cannot resolve location at line:
>>>>></form>
>>>>>
>>>>>          
>>>>>
>>>>>Thanks!
>>>>>
>>>>>Regards,
>>>>>Eirik
>>>>>
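The traceback above stops on a line containing only </form>, which suggests the text parser was handed HTML markup it did not expect. As a rough diagnostic (not a fix, and not part of Biopython), the sketch below lists the lines of raw output that consist solely of an HTML tag. The function name find_markup_lines and the regular expression are assumptions made for illustration; the sample text is copied from the output quoted above.

```python
import re

# Matches a line that contains nothing but a single HTML-style tag,
# such as "<form>", "</form>" or "<PRE>".
TAG_ONLY_LINE = re.compile(r"^\s*</?[A-Za-z][^>]*>\s*$")


def find_markup_lines(text):
    """Return (line_number, tag) pairs for lines that are only a tag."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if TAG_ONLY_LINE.match(line)
    ]


# Excerpt of the output quoted in the message above.
sample = """Sbjct: 150928 ctgcagctcgggctcctgcc 150909
</PRE>

<form>

<PRE>
Lambda     K      H
   1.37    0.711     1.31
</form>
"""

markup = find_markup_lines(sample)
for number, tag in markup:
    print(number, tag)
```

Running this over the full saved output would show whether the stray tags appear only in the trailing statistics block or elsewhere as well, which may help when reporting the problem.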
>>>>>
>>>>>  
>>>>>          
>>>>>
>>>      
>>>
>
>
>  
>


