[Biopython-dev] GSoC SearchIO project

Wibowo Arindrarto w.arindrarto at gmail.com
Tue Aug 21 12:01:21 EDT 2012


On Tue, Aug 14, 2012 at 9:49 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Tue, Apr 10, 2012 at 1:58 AM, Brad Chapman wrote:
>> Michiel;
>>> Hi Eric, Peter,
>>>
>>> > How about Bio.Search, for now?
>>>
>>> I would prefer Bio.Pairwise or Bio.Align.Pairwise, since that tells
>>> users something about what the module is for. Bio.Search could be
>>> anything (search PubMed? search the Entrez databases? search Google?
>>> anyway Bio.Search does not suggest that this module is about pairwise
>>> alignments). But Peter previously mentioned that he doesn't like
>>> Bio.Pairwise; can we convince you?
>>
>> I agree with Peter on this one. The module is primarily about searching
>> a sequence database with an input via multiple methods, not about
>> pairwise alignment of two sequences which is what Bio.Align.Pairwise
>> suggests to me.
>>
>> Brad
>
> One potential problem with Bio.Search (on top of the concerns raised
> here about vagueness), which Bow and I were just talking about during
> our weekly GSoC video call, is the existence of Bio/Search.py,
> which is obsolete and long overdue for removal. I have just
> deprecated it (something I forgot to do before the last release):
> https://github.com/biopython/biopython/commit/5a275ccd1df3def40df1eef517af755d373dadd8
>
> We'd earlier talked about using Bio.Search as the namespace. I was
> worried about the potential existence on a user's machine of both
> Bio/Search.py (the old obsolete code) and Bio/Search/__init__.py
> (aka SearchIO, the new module) and which would take precedence
> when doing: from Bio import Search
>
> Given how Python module installations work, that seems highly
> likely to occur. The good news is that the package would take
> priority - see http://www.python.org/doc/essays/packages.html
>
>>>>> What If I Have a Module and a Package With The Same Name?
>>>>>
>>>>> You may have a directory (on sys.path) which has both a module
>>>>> spam.py and a subdirectory spam that contains an __init__.py
>>>>> (without the __init__.py, a directory is not recognized as a package).
>>>>> In this case, the subdirectory has precedence, and importing spam
>>>>> will ignore the spam.py file, loading the package spam instead. If
>>>>> you want the module spam.py to have precedence, it must be
>>>>> placed in a directory that comes earlier in sys.path.
>
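
The precedence rule quoted above is easy to check locally. Below is a
minimal sketch, not anything from the Biopython code base: the names
spam, KIND and which_spam_wins are purely illustrative. It creates a
module and a package with the same name in one temporary directory and
reports which one the import picks up.

```python
import importlib
import os
import sys
import tempfile


def which_spam_wins():
    """Create spam.py and spam/__init__.py side by side, then import spam.

    Returns the KIND attribute of whichever one Python actually loaded.
    All names here are hypothetical and only serve the demonstration.
    """
    with tempfile.TemporaryDirectory() as tmp:
        # The plain module: <tmp>/spam.py
        with open(os.path.join(tmp, "spam.py"), "w") as handle:
            handle.write("KIND = 'module'\n")

        # The package: <tmp>/spam/__init__.py
        os.mkdir(os.path.join(tmp, "spam"))
        with open(os.path.join(tmp, "spam", "__init__.py"), "w") as handle:
            handle.write("KIND = 'package'\n")

        # Make sure our temporary directory is searched first and that
        # no previously imported 'spam' is reused from the module cache.
        sys.path.insert(0, tmp)
        sys.modules.pop("spam", None)
        importlib.invalidate_caches()
        try:
            spam = importlib.import_module("spam")
            return spam.KIND
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("spam", None)


print(which_spam_wins())  # prints "package"
```

By analogy, a stale Bio/Search.py left behind next to a new
Bio/Search/__init__.py should simply be ignored by the import.
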
> So there is no technical reason to avoid Bio.Search as an
> option for the Bio.SearchIO namespace. We could then
> have Bio.Search.Applications for command line wrappers,
> consistent with Bio.Phylo.Applications, Bio.Motif.Applications
> and Bio.Align.Applications.
>
> Of course, Bio.Search is still perhaps too broad a name... but
> on balance perhaps it is still better than Bio.SearchIO?
>
> Regards,
>
> Peter

Hi everyone,

If I may add my two cents, for now I am in favor of putting the module
under Bio.Search. It is not the best name out there (it does sound a
bit vague), but it's the one that seems to be the most intuitive (until
a better alternative comes up). There were some other alternatives
that Peter and I have discussed, but they seem less appealing to us.
You're free to add your thoughts on these of course :) :

- Bio.SeqSearch. This sounds ok, but when you consider we have
Bio.Seq, Bio.SeqRecord, Bio.SeqFeature, and Bio.SeqUtils, it becomes
quite confusing quickly.

- Bio.PSearch ('p' for pairwise). This one seemed the least intuitive
of the three options, so I'm not so big on this.

For now, I'm still writing everything (code, docstrings, tutorial)
using SearchIO. I suppose it would be better if we could agree on a
more suitable name, though.

On another note, I'm also in favor of using the Bio.Phylo module
skeleton for Bio.SearchIO / Bio.Search. We may then group all sequence
search-related application wrappers under Applications (I actually
prefer 'app' for better PEP8 compliance, but that's another
discussion) and perhaps even refactor our remote search calls (e.g.
the 'qblast' module) under Bio.Search as well.

cheers,
Bow

