[Biopython-dev] results of applying Clone Digger to the sources of BioPython project
Peter
biopython at maubp.freeserve.co.uk
Sat Mar 22 07:35:39 EDT 2008
> Hello.
>
> Clone Digger project is aimed to find software clones (duplicate code) in
> Python and Java programs.
>
> I have applied it to the source of BioPython and discovered several clone
> candidates.
>
> There are a lot of false positives caused by similar code in
> nlmmedline_*_format.py files, but maybe other clone candidates will be
> interesting for you.
>
> The results can be seen here:
> http://clonedigger.sourceforge.net/examples.html
Interesting. Does your tool know to ignore deprecated modules? For
example, in some cases we have essentially copied a file from one
location to another and deprecated the original.
Some of these are from scanner/consumer parsers where there are two
alternative consumers turning the data into different object
representations.
Other things like providing dictionary-like objects seem to be reusing
a lot of boilerplate code, and could probably be rationalised into
a base class and subclasses, e.g. in Bio/SwissProt/SProt.py,
Bio/PubMed.py, Bio/GenBank/__init__.py and Bio/Prosite/__init__.py.
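As a rough sketch of that kind of refactoring, the shared dictionary
behaviour could live in one base class, with each subclass supplying only
its own parsing step. The class names and the _parse hook below are
hypothetical and are not taken from the Biopython sources:

```python
class DictionaryBase:
    """Hypothetical base class holding the shared dictionary-like code."""

    def __init__(self, raw_records):
        # Map of identifier -> raw record text (an assumed storage layout).
        self._raw = dict(raw_records)

    def __len__(self):
        return len(self._raw)

    def __contains__(self, key):
        return key in self._raw

    def keys(self):
        return list(self._raw)

    def __getitem__(self, key):
        # Shared lookup; only the parsing differs between subclasses.
        return self._parse(self._raw[key])

    def _parse(self, text):
        raise NotImplementedError("subclasses must implement _parse")


class SwissProtLikeDictionary(DictionaryBase):
    def _parse(self, text):
        # Placeholder parser standing in for a real SwissProt parser.
        return {"format": "swissprot", "text": text}


class GenBankLikeDictionary(DictionaryBase):
    def _parse(self, text):
        # Placeholder parser standing in for a real GenBank parser.
        return {"format": "genbank", "text": text}
```

Each subclass then carries only the code that genuinely differs, so a
clone detector would no longer see the repeated lookup methods.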
Other cases, such as Blunt(AbstractCut) and Ov3(AbstractCut) sharing
apparently identical catalyse() methods, may fall into the same
category.
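If the two methods really are identical, one option would be to define
the method once on the common parent. The classes below are simplified
stand-ins that only borrow the names; the body of catalyse(), its
cut_positions argument and the overhang attribute are assumptions made
for illustration and do not reproduce the real Bio.Restriction code:

```python
class AbstractCut:
    """Simplified stand-in for the shared parent class."""

    @classmethod
    def catalyse(cls, sequence, cut_positions):
        # Placeholder logic: split the sequence at the given positions.
        fragments = []
        start = 0
        for pos in sorted(cut_positions):
            fragments.append(sequence[start:pos])
            start = pos
        fragments.append(sequence[start:])
        return tuple(fragments)


class Blunt(AbstractCut):
    # Hypothetical attribute; no catalyse() override is needed here.
    overhang = "blunt"


class Ov3(AbstractCut):
    # Hypothetical attribute; catalyse() is inherited from AbstractCut.
    overhang = "3'"
```

This only works if the duplicated methods do not rely on
subclass-specific details in different ways, which would need checking
against the actual code.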
Peter