[Biopython-dev] Brad's GFF parser in a Biopython repository

Peter Cock p.j.a.cock at googlemail.com
Wed Aug 24 02:33:21 UTC 2011


On Tue, Aug 23, 2011 at 8:31 PM, Brad Chapman <chapmanb at 50mail.com> wrote:
> Peter;
> Awesome, thanks for doing this. I didn't even realize there was a
> git solution that could transfer histories across repositories like
> this; how did you do it?

Well, it wasn't an off-the-shelf solution; it was a hack.

See https://gist.github.com/1167169
and https://github.com/gitpython-developers/GitPython

I used the GitPython library (import git) to query the source
repository, basically doing "git log -- gff/BCBio gff/Tests"
to find only the commits of interest. For each commit I ran
"git show XXX" to extract the diff, edited the diff to change
the paths, made a system call to patch to apply it to the
destination repository, then ran git add and git commit.
Note that for git commit you can specify the message via
a file (-F), so I could preserve the original long message,
and you can also preserve the authored date (--date) and
the author (--author).
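
As a rough illustration of the steps above (not the actual script, which
is in the gist), here is a minimal Python sketch using plain subprocess
calls. The function names, the destination prefix, the message_file
argument, and the choice to restrict "git show" to the paths of interest
are all assumptions made for this example.

```python
import subprocess


def rewrite_paths(patch_text, old_prefix, new_prefix):
    """Point the file headers of a git diff at a new directory prefix.

    Only 'diff --git', '---' and '+++' header lines are touched. Rename
    headers, and the rare content line that happens to look like a
    header, are not handled in this sketch.
    """
    out = []
    for line in patch_text.splitlines(keepends=True):
        if line.startswith(("diff --git ", "--- a/", "+++ b/")):
            for side in (" a/", " b/"):
                line = line.replace(side + old_prefix, side + new_prefix)
        out.append(line)
    return "".join(out)


def commit_command(message_file, author, authored_date):
    """Build a 'git commit' call that keeps the message, author and date."""
    return ["git", "commit", "-F", message_file,
            "--author=" + author, "--date=" + authored_date]


def git(repo_dir, *args):
    """Run a git command inside repo_dir and return its standard output."""
    result = subprocess.run(["git", *args], cwd=repo_dir, check=True,
                            capture_output=True, text=True)
    return result.stdout


def transfer(src, dest, paths, old_prefix, new_prefix, message_file):
    """Replay the commits touching `paths` from src onto dest.

    message_file is a scratch file; keep it outside dest so that
    'git add' does not pick it up.
    """
    # Oldest first, so the patches apply in the order they were written.
    shas = git(src, "log", "--reverse", "--format=%H", "--", *paths).split()
    for sha in shas:
        # Assumption: limit the diff to the paths being transferred.
        patch = git(src, "show", sha, "--", *paths)
        patch = rewrite_paths(patch, old_prefix, new_prefix)
        with open(message_file, "w") as handle:
            handle.write(git(src, "log", "-1", "--format=%B", sha))
        author = git(src, "log", "-1", "--format=%an <%ae>", sha).strip()
        date = git(src, "log", "-1", "--format=%aD", sha).strip()
        # patch skips the commit header text that precedes the diff.
        subprocess.run(["patch", "-p1"], cwd=dest, input=patch,
                       text=True, check=True)
        git(dest, "add", "-A")
        subprocess.run(commit_command(message_file, author, date),
                       cwd=dest, check=True)
```

A call might look like transfer("bcbb", "biopython", ["gff/BCBio",
"gff/Tests"], "gff/", "", "/tmp/msg.txt"), where the repository
directories and the prefix mapping are placeholders, not the ones
actually used.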

There were several steps where I couldn't work out
how you were meant to do something via the git
wrapper's API (e.g. get a diff as a patch), but it also
lets you easily call git commands directly, which was
easier for me.
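
For reference, calling git commands directly through GitPython looks
roughly like the sketch below: attributes of repo.git map onto git
subcommands and return their output as text. The helper names are
made up for this example, and the import is done lazily only so the
helpers can be tried without GitPython installed.

```python
def open_repo(path):
    """Open an existing repository with GitPython."""
    import git  # GitPython; imported lazily, see the note above
    return git.Repo(path)


def commits_touching(repo, paths):
    """Return the hashes of commits touching any of paths, oldest first."""
    # Equivalent to: git log --reverse --format=%H -- <paths>
    return repo.git.log("--reverse", "--format=%H", "--", *paths).split()


def patch_for(repo, sha):
    """Return the output of 'git show <sha>' (commit header plus diff)."""
    return repo.git.show(sha)
```

With a real checkout this would be used as
commits_touching(open_repo("/path/to/source"), ["gff/BCBio", "gff/Tests"]),
where the repository path is a placeholder.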

Bit hacky but seemed to get the job done.

> Everything looks great on a first pass. Do you think some of the
> scripts would also be useful to include in the script directory?
> They handle some of the common cases people have asked about;
> 'access_gff_index.py' uses bx-python so might be excluded, but the
> others are Biopython specific.
>
> Thanks again,
> Brad

Good point - that could be mapped to the Biopython
scripts folder. I'll take a look.

Peter


