[BioPython] Distance Matrix Parsers

Chris Lasher chris.lasher at gmail.com
Tue Jun 27 21:34:37 UTC 2006


Hi Peter,

Would you be up for licensing your code under the BioPython license?
If not, I shouldn't look at it, as I've started coding my own module
for the project. From your description, your module sounds very good.
=-)

Chris

On 6/25/06, Peter <biopython at maubp.freeserve.co.uk> wrote:
> [Off topic, but recently has anyone else had valid messages bounced due
> to a "suspicious header"?]
>
> Hello List,
>
> I recently wanted to load a "PHYLIP distance matrix file" created by
> clustalw for my own research...
>
> As discussed earlier, clustalw bends the official PHYLIP specification
> by not truncating long names to 10 characters.  For my dataset I need
> the long names to avoid ambiguity.
>
> The attached code implements a fairly simple distance matrix class and
> associated code to read (parse) and write PHYLIP style distance matrices.
>
> There are options to control strict 10 character name truncation, and
> the separator character(s) when writing files.
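
A minimal sketch of what such reading and writing could look like, not the attached code. The function names, the `strict_names` and `separator` parameters, and the four-decimal output format are all assumptions made for illustration. It assumes one matrix row per line (no wrapped continuation lines) and, in non-strict mode, names that contain no whitespace.

```python
def parse_phylip_distances(lines, strict_names=False):
    """Return (names, rows) from the lines of a PHYLIP-style distance matrix.

    Hypothetical helper: with strict_names=True the name is taken from the
    first 10 columns; otherwise it is the first whitespace-delimited token,
    which tolerates the long names that clustalw writes.
    """
    content = [line.rstrip("\n") for line in lines if line.strip()]
    count = int(content[0].split()[0])  # first line holds the number of taxa
    names, rows = [], []
    for line in content[1:]:
        if strict_names:
            name, rest = line[:10].strip(), line[10:]
        else:
            parts = line.split(None, 1)
            name = parts[0]
            rest = parts[1] if len(parts) > 1 else ""
        names.append(name)
        rows.append([float(value) for value in rest.split()])
    if len(names) != count:
        raise ValueError("expected %d rows, found %d" % (count, len(names)))
    return names, rows


def format_phylip_distances(names, rows, strict_names=False, separator=" "):
    """Return the matrix as text, optionally truncating names to 10 characters."""
    out = [str(len(names))]
    for name, row in zip(names, rows):
        # Strict mode pads or truncates the name to exactly 10 columns.
        label = name[:10].ljust(10) if strict_names else name
        out.append(separator.join([label] + ["%.4f" % d for d in row]))
    return "\n".join(out) + "\n"
```

Because rows are stored as whatever numbers follow the name, the same sketch accepts either a square or a lower triangular file.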
>
> Internally, I store the distances as a list of lists (of different
> lengths) to mimic a lower triangular matrix.
>
> For example, this matrix:
>
> [[0.0, 0.1, 0.2],
>    [0.1, 0.0, 0.5],
>    [0.2, 0.5, 0.0]]
>
> Is stored as this:
>
> [[], [0.1], [0.2, 0.5]]
>
> This may not be the best way to do this in terms of speed and memory usage.
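
The storage scheme described above can be sketched as follows. The class name `LowerTriangle` and its interface are hypothetical, chosen only to illustrate the list-of-lists idea. It assumes a symmetric matrix with a zero diagonal.

```python
class LowerTriangle:
    """Store only the entries below the diagonal of a symmetric matrix."""

    def __init__(self, full):
        # Row i keeps its first i entries, so row 0 is empty.
        self.rows = [list(row[:i]) for i, row in enumerate(full)]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, key):
        i, j = key
        n = len(self.rows)
        if not (0 <= i < n and 0 <= j < n):
            raise IndexError("index out of range")
        if i == j:
            return 0.0  # the diagonal is implied, not stored
        if i < j:
            i, j = j, i  # symmetry: look up the mirrored entry
        return self.rows[i][j]


matrix = LowerTriangle([[0.0, 0.1, 0.2],
                        [0.1, 0.0, 0.5],
                        [0.2, 0.5, 0.0]])
print(matrix.rows)
print(matrix[0, 2], matrix[2, 0])
```

This stores n(n-1)/2 values instead of n*n, at the cost of an index swap on every lookup.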
>
> There are some simple test cases included, but I have not pushed the code
> very far and there may be problems.  Anyway - in case anyone is
> interested either in the short term, or for ideas for how BioPython
> could support these files - here it is.
>
> I'm sure someone more familiar with arrays (Numeric and NumPy) would be
> able to make the class act more like an array - but the basics are there.
>
> As far as I could see, neither Numeric nor NumPy has a specific
> symmetric matrix / symmetric array class, which would be ideal.
>
> Members of the list are welcome to use the code, but please contact me
> before re-distributing it to anyone else.
>
> Peter
>
>
>
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>
>
>


