[BioPython] Distance Matrix Parsers

Marc Colosimo mcolosimo at mitre.org
Fri Jun 9 18:41:29 UTC 2006


Chris,

I likewise didn't know about the Bio::Matrix::PhylipDist module.
Personally, I would opt for a Matrix object (since Python is an
OO language) and store the data internally as a nested list. That way
you have the best of both worlds. The next question is the object
hierarchy. Here I would opt for a top-level Matrix class (or module)
and then subclass that under Phylo. So, something like this:

Bio.Matrix
Bio.Phylo.Matrix
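
To make the idea concrete, here is a minimal sketch of a generic matrix object backed by a nested list, with a phylogenetics subclass on top. The class and method names (Matrix, DistanceMatrix, distance, to_list) are hypothetical and only illustrate the proposed layout; none of this is existing BioPython code.

```python
class Matrix:
    """Generic matrix (what a hypothetical Bio.Matrix might provide)."""

    def __init__(self, rows):
        # Data is stored internally as a nested list, copied from the input.
        self._rows = [list(row) for row in rows]

    def __getitem__(self, index):
        # m[i, j] returns the cell at row i, column j.
        i, j = index
        return self._rows[i][j]

    def to_list(self):
        """Return a copy of the data as a plain nested list."""
        return [list(row) for row in self._rows]


class DistanceMatrix(Matrix):
    """Square, labelled matrix (a hypothetical Bio.Phylo.Matrix class)."""

    def __init__(self, names, rows):
        super().__init__(rows)
        self.names = list(names)
        n = len(self.names)
        if len(self._rows) != n or any(len(row) != n for row in self._rows):
            raise ValueError("expected a square matrix with one row per name")

    def distance(self, a, b):
        """Look up the distance between two taxa by name."""
        return self[self.names.index(a), self.names.index(b)]


dm = DistanceMatrix(
    ["A", "B", "C"],
    [[0.0, 0.1, 0.4],
     [0.1, 0.0, 0.3],
     [0.4, 0.3, 0.0]],
)
print(dm.distance("A", "C"))
```

Because to_list() hands back an ordinary nested list, a user who prefers a NumPy array or some other matrix type can convert it without the class needing to know about those libraries.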

And maybe things like the following (a layout that isn't used or
followed much here in BioPython):

Bio.Phylo.IO
Bio.Phylo.Parsers.PhylipDist
Bio.Phylo.Parsers.Newick
Bio.Phylo.Parsers.Nexus

And/or have
Bio.Phylo.Matrix.IO that uses the PhylipDist parser.
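
For a sense of scale, a PhylipDist-style parser could be as small as the sketch below. It assumes the square PHYLIP layout (a taxon count on the first line, then one name followed by that many distances per taxon) and names without embedded spaces; strict PHYLIP files use fixed-width names and may be lower-triangular, which this does not handle. The function name parse_phylip_dist is hypothetical, not an existing BioPython call.

```python
def parse_phylip_dist(text):
    """Parse a square PHYLIP distance matrix into (names, nested list).

    Works on whitespace-separated tokens, so rows that wrap across
    several lines are handled as long as names contain no spaces.
    """
    tokens = text.split()
    if not tokens:
        raise ValueError("empty input")
    n = int(tokens[0])
    # One count token, then n blocks of (name + n distances).
    if len(tokens) != 1 + n * (n + 1):
        raise ValueError("token count does not match a square matrix")
    names, rows = [], []
    pos = 1
    for _ in range(n):
        names.append(tokens[pos])
        rows.append([float(t) for t in tokens[pos + 1:pos + 1 + n]])
        pos += n + 1
    return names, rows


sample = """\
    3
Alpha      0.0000  0.1000  0.4000
Beta       0.1000  0.0000  0.3000
Gamma      0.4000  0.3000  0.0000
"""
names, rows = parse_phylip_dist(sample)
print(names)
print(rows[0])
```

Returning plain names plus a nested list keeps the parser independent of whatever matrix object is eventually chosen; a Bio.Phylo.Matrix.IO layer could wrap the result in that object.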

The next big question is what should Bio.Phylo.IO return? For
inspiration, we might want to look at Mesquite
<http://mesquiteproject.org/mesquite/mesquite.html>.

Marc

On Jun 9, 2006, at 11:59 AM, Chris Lasher wrote:

> Hi Marc,
>
> Thanks for the reply. I had not seen the Bio::Phylo package before.
> Thanks for pointing that out. That seems to be a really useful
> library, though it's not exactly what I was thinking about when I
> originally posted. I was thinking more along the lines of the
> Bio::Matrix modules
> (http://bio.perl.org/wiki/Special:Search?search=matrix&go=Go).
>
> I don't think writing parsers for these formats will be that
> difficult. I am unsure, however, about what type of data structure the
> matrix should be. The simplest solution is a nested list. Perhaps this
> is the proper solution, as the user can then convert this over to a
> NumPy multi-dimensional array, say, or some matrix object. I dunno.
> Thoughts, comments, suggestions?
>
> Chris
>
> On 6/9/06, Marc Colosimo <mcolosimo at mitre.org> wrote:
>> Hi Chris,
>>
>> I don't think there is a parser for those. I have in the past thought
>> about writing them up. I was looking over the structure of BioPython
>> to see where it would best fit [I'll save my rant on this for another
>> time, maybe later today]. In the mean time, the folks at BioPerl have
>> Bio-Phylo CPAN module <http://search.cpan.org/~rvosa/Bio-Phylo/>,
>> which looks nice, but it does NOT have what you are looking for.
>> However, I am planning on following that.
>>
>> Marc
>>
>> On Jun 8, 2006, at 5:32 PM, Chris Lasher wrote:
>>
>>> Hi all,
>>>   Are there any modules in BioPython to parse distance matrices? My
>>> poking around the BioPython modules and Google searching does not turn
>>> up any signs indicating there are distance matrix parsers, currently.
>>> Two particularly useful parsers would be a parser for the output of
>>> DNADIST/PROTDIST/RESTDIST from PHYLIP
>>> (http://evolution.genetics.washington.edu/phylip.html), and a parser
>>> for the MEGA (http://www.megasoftware.net/mega.html) distance matrix
>>> format. If not, would there be any interest in creating parsers for
>>> these matrices, other than my own? I think parsers for distance
>>> matrices could be very useful to the community.
>>>
>>> Chris
>>> _______________________________________________
>>> BioPython mailing list  -  BioPython at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/biopython



