[Biopython-dev] support for database of BOLDSYSTEMS?
Travis Wrightsman
twrig002 at ucr.edu
Wed Dec 10 16:51:09 UTC 2014
It might be best to contact the general list as well to see if anyone has used BOLD before. I visited the website for a few minutes today; it appears to be a data aggregator that offers taxonomic metadata.
-Travis
> On Dec 10, 2014, at 6:31 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> Dear Biopythoneers,
>
> For those of you not following GitHub's pull request notifications,
> recent Biopython contributor Carlos Peña has submitted his code
> for the BOLD (Barcode of Life Data) System for possible inclusion
> in Biopython (email included below), see:
> https://github.com/biopython/biopython/pull/438
>
> I'm hoping someone on the list has used BOLD before, see
> http://www.boldsystems.org/ - and could give some feedback
> please?
>
> Or should we ask on the main mailing list?
>
> Thanks,
>
> Peter
>
> ---------- Forwarded message ----------
> From: Carlos Peña <notifications at github.com>
> Date: Wed, Dec 3, 2014 at 2:48 PM
> Subject: [biopython] Proposal of new Biopython module: bold (#438)
> To: biopython/biopython <biopython at noreply.github.com>
>
>
> As I mentioned in an email to the dev list some time ago, I have been
> working on a module to perform calls to the BOLD database via their API.
> The BOLD database contains more than 1 million public DNA barcode
> sequences (part of the COI gene). One of the most interesting services
> is the possibility of sending the barcode sequence and retrieving the
> taxon identification and more metadata from the BOLD servers.
>
> I just migrated the code to Biopython from a temporary GitHub
> repository. The documentation at
> https://bold.readthedocs.org/en/latest/usage.html covers all the
> API methods provided by BOLD.
>
> This module includes unit tests with 99% coverage. The tests and
> docstrings have been run under Python 2.6, 2.7, 3.3, 3.4 and PyPy.
>
> I completed all the work that I could think of, hence the pull
> request. I am open to feedback on this.
>
> ________________________________
>
> You can merge this Pull Request by running
>
> git pull https://github.com/carlosp420/biopython patch-30
>
> Or view, comment on, or merge it at:
>
> https://github.com/biopython/biopython/pull/438
>
> Commit Summary
>
> copy code in Biopython
> added Experimental Warning
> added tests
>
> File Changes
>
> A Bio/bold/__init__.py (33)
> A Bio/bold/api.py (684)
> A Bio/bold/utils.py (32)
> A Tests/test_bold_api.py (261)
> A Tests/test_bold_utils.py (40)
> M setup.py (1)
>
> Patch Links:
>
> https://github.com/biopython/biopython/pull/438.patch
> https://github.com/biopython/biopython/pull/438.diff
>
>> On Wed, Nov 5, 2014 at 10:45 AM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
>> Hi Carlos,
>>
>> I've not done anything with Twisted or other asynchronous mechanisms
>> for accessing online resources - services like the NCBI discourage
>> submitting multiple requests in parallel anyway.
>>
>> One idea might be to leave that to the library's user, and focus on the
>> lower level API (building the URLs, parsing the returned values, etc)?
>>
>> Peter
>>
>>
>>> On Tue, Nov 4, 2014 at 8:31 PM, Carlos Peña <mycalesis at gmail.com> wrote:
>>> Hi all,
>>>
>>>
>>> I have written an interface to the BOLD database of DNA barcodes. It accepts
>>> FASTA files, sends them to BOLD and gets the specimen identifications to the
>>> species level:
>>>
>>> https://github.com/carlosp420/bold_retriever
>>>
>>> I was wondering whether it could be included in Biopython? So far the
>>> package is a bunch of scripts and I want to make it more robust.
>>> The working version is not very efficient, as the running time grows
>>> quadratically (n squared).
>>>
>>> However, I was able to use asynchronous calls (using Twisted) to make it
>>> faster. The script then took about n seconds for n sequences.
>>> But I don't fully understand Twisted and the package is unstable.
>>>
>>> So, I wanted to ask if this little project of mine has any hope of getting
>>> into Biopython. If that is the case I would need some pointers on using
>>> proper classes for the code and fixing the code so that it can be
>>> integrated. I guess I would need to drop Twisted and use a standard
>>> Python library for multithreading instead.
>>>
>>> I want to improve the package anyway, and make it more robust and faster.
>>> So I wanted to ask before giving Twisted another chance.
>>>
>>> Any comments would be appreciated,
>>>
>>>
>>> carlos
>>>
>>>
>>> Dr. Carlos Peña
>>> Laboratory of Genetics
>>> Department of Biology
>>> University of Turku
>>> 20014 Turku
>>> FINLAND
>>>
>>>
>>> _______________________________________________
>>> Biopython-dev mailing list
>>> Biopython-dev at mailman.open-bio.org
>>> http://mailman.open-bio.org/mailman/listinfo/biopython-dev
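
To illustrate the suggestion in the quoted Nov 5 message (focus on the lower-level API of building URLs and parsing the returned values, and leave concurrency to the library's user), here is a minimal sketch using only the Python 3 standard library. It is not the code from pull request #438. The endpoint URL, the database name, the XML element names and all function names are assumptions made for illustration and should be checked against the current BOLD API documentation.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumed endpoint for sequence identification; verify against BOLD's docs.
BOLD_ID_URL = "http://www.boldsystems.org/index.php/Ids_xml"


def build_id_url(sequence, db="COX1_SPECIES_PUBLIC"):
    """Build the query URL for one barcode sequence (db name is assumed)."""
    return BOLD_ID_URL + "?" + urlencode({"db": db, "sequence": sequence})


def parse_matches(xml_text):
    """Turn an XML reply into a list of dicts, one per <match> element.

    The element names used here (match, ID, taxonomicidentification,
    similarity) are assumptions about the reply layout.
    """
    root = ET.fromstring(xml_text)
    matches = []
    for match in root.iter("match"):
        similarity = match.findtext("similarity")
        matches.append({
            "id": match.findtext("ID"),
            "taxon": match.findtext("taxonomicidentification"),
            # Missing or empty similarity values become None.
            "similarity": float(similarity) if similarity else None,
        })
    return matches


def identify_all(sequences, fetch, max_workers=4):
    """Identify several sequences using a small thread pool.

    fetch is a caller-supplied function that maps a URL to the reply
    text, so networking and rate limiting stay in the user's hands.
    Results are returned in the same order as the input sequences.
    """
    def identify_one(sequence):
        return parse_matches(fetch(build_id_url(sequence)))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(identify_one, sequences))
```

A caller could pass something like `lambda url: urllib.request.urlopen(url).read().decode()` as `fetch`. Whether BOLD tolerates parallel requests is not stated in this thread, so `max_workers` should be kept small, or set to 1, unless the service's terms say otherwise.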