[Biopython-dev] GSoC python variant update 10

Lenna Peterson arklenna at gmail.com
Mon Aug 13 05:00:41 UTC 2012

Link: http://arklenna.tumblr.com/post/29317968106/


Following extensive discussion
on the dev list of the pros and cons of configuration classes/modules,
I have refactored my [coordinate
mapper](https://gist.github.com/3172753) to keep configuration as
isolated as possible.

All mapping functions use 0-based coordinates internally. Conversion
to and from 1-based coordinates is handled by custom MapPosition
objects. (They are currently separate from the Seq* positions but
could probably subclass ExactPosition.) The MapPosition objects have
to_dialect and from_dialect methods that automatically handle
conversion between bases and other formatting details.
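
As a rough illustration of the idea (not the code in the gist), a
position object can store a 0-based value and convert only at the
edges. The class name `SketchPosition` and the dispatch through
`getattr` are assumptions made for this sketch; the only fact relied
on is that HGVS genomic coordinates are 1-based.

```python
class SketchPosition(object):
    """Hypothetical stand-in for a MapPosition; stores a 0-based index."""

    def __init__(self, pos):
        self.pos = int(pos)  # internal value is always 0-based

    @classmethod
    def from_hgvs(cls, value):
        # HGVS genomic coordinates are 1-based, so shift down by one.
        return cls(int(value) - 1)

    def to_hgvs(self):
        # Shift back up by one when leaving the 0-based internal form.
        return str(self.pos + 1)

    @classmethod
    def from_dialect(cls, value, dialect=None):
        # No dialect means the value is already 0-based.
        if dialect is None:
            return cls(value)
        # Assumed convention: a dialect "X" maps to a from_x classmethod.
        return getattr(cls, "from_" + dialect.lower())(value)

    def to_dialect(self, dialect=None):
        if dialect is None:
            return str(self.pos)
        # Assumed convention: a dialect "X" maps to a to_x method.
        return getattr(self, "to_" + dialect.lower())()


p = SketchPosition.from_dialect("100", dialect="HGVS")
print(p.pos)                 # 0-based internal value
print(p.to_dialect("HGVS"))  # back in the 1-based dialect
```

With this layout, mapping code only ever sees `pos`, and adding a new
dialect means adding one `from_*`/`to_*` pair.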

There are two different ways a user can convert a coordinate from HGVS:

    # ... assuming cm is an instance of CoordinateMapper
    # Manually construct position from HGVS
    CDS_coord = CDSPosition.from_hgvs("6+1")
    genomic_coord = cm.c2g(CDS_coord)
    print(genomic_coord.to_hgvs())

    # Pass dialect argument to mapping function
    genomic_coord = cm.c2g("6+1", dialect="HGVS")
    print(genomic_coord.to_hgvs())
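
One way both call styles could end up on the same code path is for
the mapping function to normalize its argument first. The sketch
below is a guess at that pattern, not the gist's implementation:
`SketchCDSPosition` and `as_position` are hypothetical names, the
parser ignores 3' UTR (`*`) coordinates, and storing 5' UTR positions
as unshifted negative numbers is an assumption of this sketch. It
relies on the HGVS rule that coding numbering has no position 0.

```python
import re


class SketchCDSPosition(object):
    """Hypothetical CDS position: 0-based index plus an intron offset."""

    _HGVS_RE = re.compile(r"^(-?\d+)([+-]\d+)?$")

    def __init__(self, pos, offset=0):
        self.pos = pos        # 0-based for positions inside the CDS
        self.offset = offset  # signed distance into a flanking intron

    @classmethod
    def from_hgvs(cls, text):
        match = cls._HGVS_RE.match(text)
        if match is None:
            raise ValueError("not a CDS coordinate: %r" % (text,))
        base = int(match.group(1))
        if base == 0:
            raise ValueError("HGVS coding numbering has no position 0")
        offset = int(match.group(2) or 0)
        # HGVS counts ..., -2, -1, 1, 2, ...; only positive values shift.
        return cls(base - 1 if base > 0 else base, offset)

    def to_hgvs(self):
        base = self.pos + 1 if self.pos >= 0 else self.pos
        if self.offset:
            return "%d%+d" % (base, self.offset)
        return str(base)


def as_position(value, dialect=None):
    """Accept a position object, or a string plus a dialect name."""
    if isinstance(value, SketchCDSPosition):
        return value
    if dialect is not None and dialect.upper() == "HGVS":
        return SketchCDSPosition.from_hgvs(value)
    raise ValueError("a string coordinate needs a known dialect")


a = as_position(SketchCDSPosition.from_hgvs("6+1"))
b = as_position("6+1", dialect="HGVS")
print(b.to_hgvs())
```

Both `a` and `b` hold the same internal values, so whatever mapping
follows does not need to know which style the caller used.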

Furthermore, the inheritance hierarchy is designed to allow a user to
set a default string representation:

    # Set MapPositions to print as HGVS by default
    def use_hgvs(self):
        return str(self.to_hgvs())
    MapPosition.__str__ = use_hgvs
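
The reason a single assignment is enough is ordinary Python attribute
lookup: subclasses that do not define their own `__str__` pick up
whatever the base class currently has. A toy demonstration follows;
the `Toy*` classes are stand-ins invented for this example, not the
classes in the gist.

```python
class ToyMapPosition(object):
    def __init__(self, pos):
        self.pos = pos  # 0-based

    def to_hgvs(self):
        return self.pos + 1  # 1-based, as in the sketch's HGVS rule

    def __str__(self):
        return str(self.pos)


class ToyCDSPosition(ToyMapPosition):
    pass


class ToyGenomePosition(ToyMapPosition):
    pass


def use_hgvs(self):
    return str(self.to_hgvs())


before = str(ToyCDSPosition(5))      # uses the original __str__
ToyMapPosition.__str__ = use_hgvs    # patch the base class once
after = str(ToyCDSPosition(5))       # subclass now prints as HGVS
other = str(ToyGenomePosition(9))    # so does every other subclass
print(before, after, other)
```

A subclass that overrides `__str__` itself would not be affected by
the patch, which is worth keeping in mind if the hierarchy grows.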

The [version](https://gist.github.com/3172753/577b7c383e057b78cdcee64be33f18117a46faaf)
as of this writing passes its tests using 0-based coordinates. I have
not yet implemented tests for `from_hgvs` or `to_hgvs`, but that's
next on my list. I'm also hoping to have time for strand and
mixed-strand handling.


