[BioPython] import Standalone problems

Jacob Joseph jmjoseph at andrew.cmu.edu
Thu Jul 20 18:08:15 UTC 2006


Hi.  I suspect you are not using my updated Record.py.  You'll notice
that, at least for the moment, I have changed _blast.gap_penalties to a
list, which allows assignment per item without worrying about the order
of entries within the XML file.  There are other ways this could be
accomplished while still using a tuple.
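
The TypeError quoted below is what Python raises when code assigns to one
element of a tuple, because tuples are immutable. Here is a minimal sketch
of both options, assuming a hypothetical stand-in class (FakeBlastRecord
and the helper functions are invented for illustration and are not the
real Bio.Blast.Record or NCBIXML code):

```python
class FakeBlastRecord:
    """Hypothetical stand-in for a BLAST record; not the real Bio.Blast.Record."""

    def __init__(self, as_list):
        # (gap open, gap extend); None means "not parsed yet".
        self.gap_penalties = [None, None] if as_list else (None, None)


def set_gap_open_in_place(record, value):
    # Works only when gap_penalties is a list; a tuple raises TypeError.
    record.gap_penalties[0] = int(value)


def set_gap_open_rebuild(record, value):
    # Tuple-friendly alternative: build a new tuple, keeping the other slot.
    record.gap_penalties = (int(value), record.gap_penalties[1])


def set_gap_extend_rebuild(record, value):
    # Same idea for the second slot.
    record.gap_penalties = (record.gap_penalties[0], int(value))


# Option 1: a list accepts per-item assignment.
list_record = FakeBlastRecord(as_list=True)
set_gap_open_in_place(list_record, "11")

# The same assignment against a tuple fails.
tuple_record = FakeBlastRecord(as_list=False)
try:
    set_gap_open_in_place(tuple_record, "11")
    tuple_failed = False
except TypeError:
    tuple_failed = True

# Option 2: keep the tuple and rebuild it; arrival order does not matter.
set_gap_extend_rebuild(tuple_record, "1")
set_gap_open_rebuild(tuple_record, "11")
```

Either approach lets the two values arrive in any order; the list version
is simply the smaller change to the element handlers.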

-Jacob

Rohini Damle wrote:
> Hi,
> When I tried your NCBIXML.py code instead of the original one, I got the
> following error message:
> 
> File "C:\Python24\lib\site-packages\Bio\Blast\NCBIXML.py", line 210,
> in _end_Parameters_gap_open
>    self._blast.gap_penalties[0] = int(self._value)
> TypeError: object does not support item assignment
> 
> In the original version there is no "[0]" after
> self._blast.gap_penalties.
> 
> What might be causing this error?
> -Rohini
> 
> On 7/19/06, Jacob Joseph <jmjoseph at andrew.cmu.edu> wrote:
>> I do not believe the current version of the parser will work with
>> multiple queries using recent versions of BLAST, regardless of the output
>> format.  I do know that the XML output of blastall 2.2.13 works with the
>> parser corrections previously attached.  I have attached a further
>> updated NCBIXML.py, fixing the performance issues in parse() that I
>> mentioned.
>>
>> -Jacob
>>
>> Rohini Damle wrote:
>> > Hi,
>> > Can someone suggest which version of BLAST Biopython's (text or XML)
>> > parser works with?  I will download that BLAST version locally so that
>> > I can use Biopython's parser.
>> > Thanks,
>> > Rohini
>> >
>> > On 7/18/06, Jacob Joseph <jacob at jjoseph.org> wrote:
>> >> Hi.
>> >> I encountered similar difficulties over the past few days myself and
>> >> have made some improvements to the XML parser.  That is, it now
>> >> functions with blastall, but I have made no effort to parse the output
>> >> of the other BLAST programs.  I do not expect I have done any harm to
>> >> other parsing, however.
>> >>
>> >> Attached are Record.py, NCBIStandalone.py, and NCBIXML.py.  I have not
>> >> yet spent significant time cleaning up my changes.  Without getting
>> >> into specific modifications, I have made an effort to make the
>> >> variables in Record and NCBIXML consistent, focusing primarily on what
>> >> I needed this week.
>> >>
>> >> One portion I am not settled on is the reinitialization of Record.Blast
>> >> at every call to iterator.next() and, by extension, BlastParser.parse().
>> >> See NCBIXML.py, line 114.  Without re-initializing this class, we run
>> >> the risk of retaining portions of a Record from previously parsed
>> >> queries.  This causes bug 1970, mentioned below.  Unfortunately,
>> >> this re-initialization exacts a significant performance penalty of at
>> >> least a factor of 10 by some rough measures.  I would appreciate any
>> >> suggestions for improvement here.
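
The trade-off described above can be illustrated with a toy sketch.
ToyRecord, ReusingParser and FreshParser are hypothetical names invented
for this illustration; they are not the classes in Record.py or
NCBIXML.py. The sketch only shows why reusing one record object across
queries accumulates results (the P1, then P1+P2 pattern reported further
down the thread). It says nothing about the real performance cost.

```python
class ToyRecord:
    """Hypothetical, much-simplified record; the real Record.Blast holds far more."""

    def __init__(self):
        self.alignments = []


class ReusingParser:
    """Keeps one record for its whole lifetime, so results pile up."""

    def __init__(self):
        self._record = ToyRecord()

    def parse(self, hits):
        # Appends to the same list on every call.
        self._record.alignments.extend(hits)
        return self._record


class FreshParser:
    """Starts from a new record on every parse() call."""

    def parse(self, hits):
        record = ToyRecord()
        record.alignments.extend(hits)
        return record


# Three queries with one hit each (made-up data).
queries = [["P1 hit"], ["P2 hit"], ["P3 hit"]]

reusing = ReusingParser()
reused_counts = [len(reusing.parse(hits).alignments) for hits in queries]

fresh = FreshParser()
fresh_counts = [len(fresh.parse(hits).alignments) for hits in queries]

print(reused_counts)  # grows with every query
print(fresh_counts)   # stays at one hit per query
```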
>> >>
>> >> I do apologize for not being more specific about my changes.  When I
>> >> get a chance (next week?), I will package them up as a proper patch and
>> >> file a bug.  Perhaps what I have done so far will be of use until then.
>> >>
>> >> FYI, I have done all of my testing with BLAST 2.2.13.  2.2.14 seems
>> >> not to have separate <?xml> blocks within its output, requiring a
>> >> different method of iteration.
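
One possible way to iterate over a single XML document is sketched below.
It does not use the Biopython parser. It assumes that each query appears
as an <Iteration> element, following the element names of the NCBI BLAST
XML layout. The SAMPLE document is a hand-written, heavily trimmed
stand-in, not real BLAST output, and iter_queries is a hypothetical helper
name.

```python
import io
import xml.etree.ElementTree as ET

# Hand-written stand-in for a single-document, multi-query BLAST XML file.
SAMPLE = b"""<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>P1</Iteration_query-def>
      <Iteration_hits>
        <Hit><Hit_def>hitA</Hit_def></Hit>
      </Iteration_hits>
    </Iteration>
    <Iteration>
      <Iteration_query-def>P2</Iteration_query-def>
      <Iteration_hits>
        <Hit><Hit_def>hitB</Hit_def></Hit>
        <Hit><Hit_def>hitC</Hit_def></Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def iter_queries(handle):
    """Yield (query name, list of hit names) for each <Iteration> element."""
    for _event, elem in ET.iterparse(handle, events=("end",)):
        if elem.tag == "Iteration":
            query = elem.findtext("Iteration_query-def")
            hits = [hit.findtext("Hit_def") for hit in elem.iter("Hit")]
            yield query, hits
            # Drop the finished subtree so data from one query is not kept
            # around while the next one is read.
            elem.clear()


results = list(iter_queries(io.BytesIO(SAMPLE)))
for query, hits in results:
    print(query, hits)
```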
>> >>
>> >> -Jacob
>> >>
>> >> Peter wrote:
>> >> > Rohini Damle wrote:
>> >> >> Hi,
>> >> >> I have an XML file with 4 BLAST records (for proteins P1, P2, P3, P4).
>> >> >> I am trying to extract alignment information for each of them.
>> >> >> So I wrote the following code:
>> >> >>
>> >> >>  E_VALUE_THRESH = 20
>> >> >>
>> >> >>  for b_record in b_iterator:
>> >> >>      for alignment in b_record.alignments:
>> >> >>          for hsp in alignment.hsps:
>> >> >>              # report only HSPs below the e-value cutoff
>> >> >>              if hsp.expect < E_VALUE_THRESH:
>> >> >>                  print '****Alignment****'
>> >> >>                  print 'sequence:', alignment.title.split()[0]
>> >> >>
>> >> >> With this code, I am getting information for P1,
>> >> >> then information for P1 + P2,
>> >> >> then for P1 + P2 + P3,
>> >> >> and finally for P1 + P2 + P3 + P4.
>> >> >> Why is this so?
>> >> >> Is there something wrong with the looping?
>> >> >
>> >> > I'm aware of something funny with the XML parsing, Bug 1970, which
>> >> > might well be the same issue:
>> >> >
>> >> > http://bugzilla.open-bio.org/show_bug.cgi?id=1970
>> >> >
>> >> > I confess I haven't looked into exactly what is going wrong here - too
>> >> > many other demands on my time to learn about XML and how BioPython
>> >> > parses it.
>> >> >
>> >> > Does the workaround on the bug report help?  Depending on which
>> >> > version of standalone BLAST you have installed, you might have better
>> >> > luck with plain text output - the trouble is this is a moving target
>> >> > and the NCBI keeps tweaking it.
>> >> >
>> >> > Peter
>> >> >
>> >> > _______________________________________________
>> >> > BioPython mailing list  -  BioPython at lists.open-bio.org
>> >> > http://lists.open-bio.org/mailman/listinfo/biopython


