[Biopython] biopython module for variant descriptions?

David Merberg merbergd at gmail.com
Tue Oct 31 15:31:35 EDT 2023


Hello biopython world,

At my last job, I wrote some Python code to categorize and describe
sequence changes of many types. I used Biopython to handle sequences and
some basic functions like I/O and translation, but I did not find a module
for reading variants/mutants and applying them to sequences.

Some cases are trivial, but some are not. For example, a small deletion in
the nucleotide sequence may have no effect on the amino acid encoded by
the affected codon, but will still change the downstream amino acids.
Protein changes caused by in-frame deletions or insertions of 3, 6, 9, ...
nucleotides can also be tricky to calculate.
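
To make those two cases concrete, here is a minimal sketch in plain Python.
It is only an illustration, not an existing Biopython API: the helper names
`translate` and `delete` and the toy coding sequence are made up for this
example, and the codon table is the standard genetic code written out by
hand.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def delete(cds, start, end):
    """Remove 1-based, inclusive positions start..end (hypothetical helper)."""
    return cds[:start - 1] + cds[end:]


# Toy coding sequence (an assumption for illustration): ATG GCT AAA GGT TAA
cds = "ATGGCTAAAGGTTAA"

# Deleting the last base of codon 2 shifts the reading frame.
frameshift = delete(cds, 6, 6)

# Deleting 3 bases that straddle codons 3 and 4 keeps the frame
# but fuses the remnants of both codons into a new one.
in_frame = delete(cds, 9, 11)

print(translate(cds))         # MAKG*
print(translate(frameshift))  # MAKV
print(translate(in_frame))    # MAN*
```

In the frameshift case the affected codon still encodes Ala, and by
coincidence so does the next one (Lys), so the first visible protein change
is two residues downstream of the deletion. In the in-frame case, Lys and Gly
are replaced by a single Asn rather than one residue simply disappearing.
Note that this `translate` does not stop at stop codons.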

My question is whether there is a Biopython module to read variants in a
standard format (see, for example, http://varnomen.hgvs.org/). Along with
the variant objects, there could be a set of methods to apply them to
sequences and operate on the mutated results. Does the community think
that this would be useful if it does not already exist?
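
As a rough idea of what I mean by variant objects with methods, here is a
small sketch. Everything in it is hypothetical (the `SimpleVariant` class,
`parse_variant`, and `apply_variant` are names I made up), and it handles
only a tiny subset of HGVS-style coding descriptions: simple substitutions
(c.4G>A), deletions (c.6del, c.9_11del), and insertions (c.3_4insAAA). A
real module would need to cover far more of the nomenclature.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleVariant:
    """Hypothetical container for one simple coding-sequence variant."""
    kind: str        # "sub", "del", or "ins"
    start: int       # 1-based position in the coding sequence
    end: int         # 1-based, inclusive; equals start for single positions
    ref: str = ""    # reference base (substitutions only)
    alt: str = ""    # new base(s) (substitutions and insertions)


# Only a small, assumed subset of the HGVS syntax is recognized here.
_PATTERNS = [
    ("sub", re.compile(r"^c\.(?P<start>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$")),
    ("del", re.compile(r"^c\.(?P<start>\d+)(?:_(?P<end>\d+))?del$")),
    ("ins", re.compile(r"^c\.(?P<start>\d+)_(?P<end>\d+)ins(?P<alt>[ACGT]+)$")),
]


def parse_variant(text):
    """Parse a description such as 'c.9_11del' into a SimpleVariant."""
    for kind, pattern in _PATTERNS:
        match = pattern.match(text)
        if match:
            groups = match.groupdict()
            start = int(groups["start"])
            end = int(groups.get("end") or start)
            return SimpleVariant(kind, start, end,
                                 groups.get("ref") or "",
                                 groups.get("alt") or "")
    raise ValueError(f"unsupported variant description: {text!r}")


def apply_variant(seq, variant):
    """Return a new sequence string with the variant applied."""
    if variant.kind == "sub":
        if seq[variant.start - 1] != variant.ref:
            raise ValueError("reference base does not match the sequence")
        return seq[:variant.start - 1] + variant.alt + seq[variant.start:]
    if variant.kind == "del":
        return seq[:variant.start - 1] + seq[variant.end:]
    if variant.kind == "ins":
        if variant.end != variant.start + 1:
            raise ValueError("insertion must be between adjacent positions")
        return seq[:variant.start] + variant.alt + seq[variant.start:]
    raise ValueError(f"unknown variant kind: {variant.kind!r}")


cds = "ATGGCTAAAGGTTAA"  # toy sequence, for illustration only
for description in ("c.4G>A", "c.9_11del", "c.3_4insAAA"):
    print(description, apply_variant(cds, parse_variant(description)))
```

The mutated string could then be handed to whatever translation routine is
in use to work out the protein-level consequence.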

I implemented many functions for these sorts of operations, but I realized
soon afterwards that there are probably better ways to do much of it. I
always wanted to redo the work, but never had time. Now I have time, but am
not at that job. If it would be useful to the community, I may be able to
take it on as a contribution to biopython.

A caveat is that I don’t have experience contributing to multi-developer
projects. I try to write clean, well-documented code and I’m familiar with
the basics of git. So, it’s OK if you’d prefer that I start with something
smaller (like unit tests or documentation). Just let me know.

Dave Merberg