[Biopython-dev] [Biopython] Google Summer of Code Project: SearchIO in Biopython

Peter Cock p.j.a.cock at googlemail.com
Mon Apr 30 10:57:27 UTC 2012

On Mon, Apr 30, 2012 at 11:08 AM, Wibowo Arindrarto
<w.arindrarto at gmail.com> wrote:
> I'm thinking of using the Search object as the object returned by
> SearchIO.parse or SearchIO.read. That way, we can store attributes
> common to the different search queries in it. For example:
> >>> search = SearchIO.parse('blast_result.xml', 'blast-xml')
> >>> search.format
> 'blast-xml'
> >>> search.algorithm
> 'blastx'
> >>> search.version
> '2.2.26+'
> >>> search.database
> 'refseq_protein'
> >>> search.results
> <generator object results at ....>
> And iteration over the results would be done like this (for example):
> >>> for result in search.results:
> ...     print result.query, len(result)
> Additionally, we can also define __iter__ and next for Search so we can
> just do the following:
> >>> for result in search:
> ...     print result.query, len(result)
> What do you think?

I think you'll get in a mess with multiple iterators all sharing the
same handle and competing over using it - but maybe I'm not
grasping what you have in mind.
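
To illustrate the concern, here is a minimal sketch using an in-memory
handle. The iter_lines helper and the one-line-per-query input are made
up for the example and are not part of any SearchIO code; the point is
only that two iterators built on one handle each consume what the other
would have seen.

```python
import io


def iter_lines(handle):
    # Hypothetical stand-in for a results iterator that reads lazily
    # from a handle it does not own.
    for line in handle:
        yield line.strip()


# One shared handle, two independent-looking iterators.
handle = io.StringIO("query1\nquery2\nquery3\nquery4\n")
first = iter_lines(handle)
second = iter_lines(handle)

# Alternate between the two iterators; each call advances the same handle.
seen_by_first = [next(first)]
seen_by_second = [next(second)]
seen_by_first.append(next(first))
seen_by_second.append(next(second))

print(seen_by_first)   # ['query1', 'query3']
print(seen_by_second)  # ['query2', 'query4']
```

Neither iterator sees every record, which is the kind of mess I would
expect if Search.results and Search.__iter__ both read from the file.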

Initially keep it simple: The primary public API would be

for result in Bio.SearchIO.parse(...):
    print result.query, len(result)

where each iteration gives a complete result set for one query.
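
A rough sketch of that shape, assuming a toy tab-separated input
(query ID, hit ID) rather than real BLAST output. The Result class and
this parse function are hypothetical and only show the idea of a single
generator owning the handle and yielding one complete result per query.

```python
import io
from itertools import groupby


class Result(object):
    """Hypothetical container holding all hits for one query."""

    def __init__(self, query, hits):
        self.query = query
        self.hits = hits

    def __len__(self):
        return len(self.hits)


def parse(handle):
    # Assumed toy format: one "query<TAB>hit" pair per line, with the
    # lines for each query stored contiguously.
    rows = (line.rstrip("\n").split("\t") for line in handle if line.strip())
    for query, group in groupby(rows, key=lambda row: row[0]):
        # Consume the whole group before yielding, so each Result is complete.
        yield Result(query, [row[1] for row in group])


handle = io.StringIO("q1\thitA\nq1\thitB\nq2\thitC\n")
summary = [(result.query, len(result)) for result in parse(handle)]
print(summary)  # [('q1', 2), ('q2', 1)]
```

Only the one generator ever touches the handle, so there is nothing for
it to compete with.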


P.S. With SearchIO subject to name space discussions ;)
