[Biopython] Generative AI policy for contributions to Biopython
Peter Cock
p.j.a.cock at googlemail.com
Mon Jun 1 02:09:39 EDT 2026
This is a real issue - I added that tag to an old issue and within a
week it had two likely AI generated PRs:
https://github.com/biopython/biopython/issues/2116
Peter
On Fri, Apr 24, 2026 at 7:26 PM Peter Cock <p.j.a.cock at googlemail.com> wrote:
>
> I just posted this on Mastodon (neither I, Biopython, nor the OBF use
> X/Twitter anymore):
>
> https://fediscience.org/@pjacock/116143426085553346
>
> And that reminded me of an earlier remark I made:
>
> > Seems the recent #GenerativeAI #slop pull requests that I've looked at for
> > #Biopython have preferentially targeted the "Good First Issues". We really
> > wanted those to be onboarding ramps for new #OpenSource contributors -
> > and not for padding anyone's GitHub profile or whatever the motivation here is.
> >
> > So I think any formal policy will want to say explicitly #NoAI on those issues
> > at the very least.
>
> https://fediscience.org/@pjacock/116161804363016972
>
> Peter
>
> On Fri, Apr 24, 2026 at 11:16 AM Peter Cock <p.j.a.cock at googlemail.com> wrote:
> >
> > Dear Biopythoneers,
> >
> > We need to set out a generative AI policy for contributions to Biopython.
> >
> > There are now multiple recent PRs submitted by new contributors which
> > are openly using AI tools, more that I suspect are, and now even AI assisted
> > PRs from past contributors (where CV padding or other external metrics
> > are unlikely to be driving this). These are generally more work to review
> > than human written PRs, and that is a growing issue.
> >
> > I blogged about my views late last year - ending in the line "Right now, I
> > still lean very much to saying no any PR using generative AI".
> >
> > https://blastedbio.blogspot.com/2025/11/thoughts-on-generative-ai-contributions.html
> >
> > Things will change (both tool capabilities, but also the social and legal
> > interpretations) but that post still describes my views today - note I did
> > not touch on the topic of communications there (see below).
> >
> > Recently Linux adopted what has been described as a balanced stance
> > treating it as a tool with very clear expectations that usage MUST be declared
> > and that the human submitter is responsible for (quoting these four points):
> >
> > * Reviewing all AI-generated code
> > * Ensuring compliance with licensing requirements
> > * Adding their own Signed-off-by tag to certify the DCO
> > * Taking full responsibility for the contribution
> >
> > https://docs.kernel.org/process/coding-assistants.html
> >
> > That is pragmatic but ignores the legal and ethical minefield. We don't
> > have a Developer Certificate of Origin (DCO), but I think the other
> > points are a bare minimum for any Biopython policy.
> >
> > Most of my personal open source projects have only had a very small
> > number of contributors, and I am comfortable with outright rejecting
> > generative AI. I know some of the past/current Biopython contributors
> > are more willing to embrace this technology though - so I doubt support
> > for a simple ban would be unanimous.
> >
> > Speaking for a moment as the current Open Bioinformatics Foundation
> > president, the board has discussed this and agreed not to try to micro
> > manage the member projects. For reference, BioPerl have started
> > https://github.com/bioperl/bioperl-live/issues/407 which has some
> > excellent points and examples to consider.
> >
> > In particular, this is not just a code or documentation changes issue - but
> > also about the communication around any proposed change: the nature
> > of the commit messages, pull request description, and discussion. This
> > ties into the maintainers' burden - many of our recent AI generated PRs
> > have fairly short code changes but the verbose text is exhausting to read
> > and unhelpful. It has sometimes felt like I have been talking to an AI agent
> > rather than a human - I actually liked the feeling of mentoring a new
> > contributor and guiding them through minor hurdles to getting their
> > change accepted, but you lose that with an AI agent in between you.
> >
> > I therefore very much like this line from the current Codeberg policy:
> >
> > > All communication, that includes: commit messages, pull request
> > > messages, documentation, code comments and issues (and
> > > comments on issues/pull requests), that is intended to be read
> > > by people to understand your thoughts and work must not have
> > > been generated with AI. We exclude machine translation and
> > > tooling that helps with grammar and spelling check.
> >
> > https://codeberg.org/comaps/Governance/src/branch/main/AI_USAGE.md
> >
> > Would anyone like to speak in defence of accepting AI (assisted) PRs,
> > and suggest an existing policy you would be happy we adopt or base
> > ours on?
> >
> > Or should I start drafting a more draconian but likely much shorter one -
> > a few lines like this in the CONTRIBUTING file and/or PR template: No
> > generative AI to be used in any Biopython contributions, with the exception
> > of machine translation to/from English (where you might consider including
> > your original language text as well).
> >
> > Thank you,
> >
> > Peter