[Bioperl-l] extending the PHYLIP format

Albert Vilella avilella at gmail.com
Wed May 28 10:31:07 UTC 2008


Hi Heikki,

About a year ago, some code was added to deal with the "IDs longer than 10
characters" problem
(https://www.nescent.org/wg_phyloinformatics/Phylohackathon_1/BioPerl_Targets).


Basically: (1) map the long IDs to 10-character numeric IDs, (2) run the
program that has the ID limitation, (3) revert the IDs to the originals in
the output. The POD explains how to do it.
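
To make the three steps concrete, here is a minimal sketch of the round trip.
It is written in Python only for illustration and is not the BioPerl code; the
function names shorten_ids and restore_ids are hypothetical, and it assumes
that the placeholders are zero-padded counters and that no other 10-digit
number appears in the program output.

```python
import re


def shorten_ids(ids, width=10):
    """Step 1: map each long ID to a fixed-width numeric placeholder."""
    if len(ids) != len(set(ids)):
        raise ValueError("IDs must be unique")
    if len(str(len(ids))) > width:
        raise ValueError("too many IDs for the placeholder width")
    # Zero-padded counters: 0000000001, 0000000002, ...
    to_short = {long_id: str(i).zfill(width)
                for i, long_id in enumerate(ids, start=1)}
    to_long = {short: long_id for long_id, short in to_short.items()}
    return to_short, to_long


def restore_ids(text, to_long):
    """Step 3: put the original IDs back into the program output."""
    if not to_long:
        return text
    width = len(next(iter(to_long)))
    # Match exactly `width` digits that are not part of a longer number
    # (assumption: this is enough to avoid hitting branch lengths).
    pattern = re.compile(r"(?<![\d.])\d{%d}(?!\d)" % width)
    return pattern.sub(lambda m: to_long.get(m.group(0), m.group(0)), text)
```

Step 2 (running the external program on the file with the short IDs) sits
between the two calls. For example, a tree string returned by the program as
"(0000000001:0.1,0000000002:0.2);" would be passed through restore_ids to get
the long names back.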

So I would say that the solution is at least "partially" there :-)

    Albert.

On Wed, May 28, 2008 at 9:23 AM, Heikki Lehvaslaiho <heikki at sanbi.ac.za>
wrote:

>
> I just learned that a number of phylogenetics packages (PAUP, PHYML and
> MrBayes at least) now allow IDs longer than 10 characters in PHYLIP format.
> The documentation is scarce, but the rules seem to be:
>
> 1. There can be spaces before the ID.
> 2. The ID can be up to 50 characters long.
> 3. The ID can contain any characters. If you use spaces within the ID, you
> have to put the whole ID in single quotes ('). Single quotes can be used
> for all IDs and are removed during parsing.
> 4. It is customary to have two spaces between the ID and the sequence.
>
> This custom seems to have come into PHYLIP format from Nexus.
> Note that this allows sequences in a file to start at different columns.
>
> Can anyone shed more light on the matter?
>
>
> I need to get this into BioPerl, as the names in the HIV sequences that I
> work with are very long and cannot be sensibly truncated.
>
> What would be the best way to do this?
> 1. Add more options to the already heavily
>   hacked Bio::AlignIO::phylip.pm
> 2. Create a Bio::AlignIO::phyliplong.pm
>
> Do those ugly hacks for supporting fixed-length long IDs really, really
> belong in the vanilla phylip.pm file?
>
> Opinions?
>
>        -Heikki
>
> --
> ______ _/      _/_____________________________________________________
>      _/      _/
>     _/  _/  _/  Heikki Lehvaslaiho    heikki at_sanbi _ac _za
>    _/_/_/_/_/  Senior Scientist    skype: heikki_lehvaslaiho
>   _/  _/  _/  SANBI, South African National Bioinformatics Institute
>  _/  _/  _/  University of Western Cape, South Africa
>     _/      Phone: +27 21 959 2096   FAX: +27 21 959 2512
> ___ _/_/_/_/_/________________________________________________________
>
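
For reference, the four relaxed rules listed in the quoted message (optional
leading spaces, IDs up to 50 characters, optional single quotes, whitespace
between ID and sequence) can be sketched as a small line parser. This is only
an illustration in Python, not Bio::AlignIO::phylip code; the name
parse_relaxed_line is hypothetical, and it assumes that a quoted ID contains
no embedded single quote and that every line passed in carries an ID (no
ID-less interleaved continuation lines).

```python
def parse_relaxed_line(line, max_id_len=50):
    """Split one alignment line into (id, sequence) under the relaxed rules."""
    text = line.strip()  # rule 1: spaces before the ID are allowed
    if text.startswith("'"):
        # rule 3: a quoted ID may contain spaces; the quotes are dropped
        end = text.find("'", 1)
        if end == -1:
            raise ValueError("unterminated quoted ID")
        seq_id, rest = text[1:end], text[end + 1:]
    else:
        # rule 4: an unquoted ID ends at the first run of whitespace
        parts = text.split(None, 1)
        if len(parts) < 2:
            raise ValueError("line has no sequence data")
        seq_id, rest = parts
    if not seq_id or len(seq_id) > max_id_len:
        raise ValueError("ID must be 1 to %d characters" % max_id_len)  # rule 2
    sequence = "".join(rest.split())  # drop spacing inside the sequence
    if not sequence:
        raise ValueError("line has no sequence data")
    return seq_id, sequence
```

Because the ID is delimited by whitespace or quotes rather than by column, the
sequence may start at a different column on each line, as noted in the quoted
message.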


