[Bioperl-l] problem with alignment test data & parsers

Heikki Lehvaslaiho heikki@ebi.ac.uk
Thu, 24 May 2001 14:42:50 +0100


Peter,

I innocently added code to Bio::LocatableSeq::end to test whether the
value given is consistent with start() and seq(). Almost all of the
test data sets in t/data turned out to be wrong.
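
To illustrate the kind of check meant here, the following is a minimal
sketch in Python, not the code committed to Bio::LocatableSeq. It assumes
that end should equal start plus the number of non-gap residues minus one,
and that '-' and '.' are the gap characters. The function names are
hypothetical.

```python
# Hypothetical sketch: the names and the gap alphabet are assumptions,
# not taken from Bio::LocatableSeq.
GAP_CHARS = "-."


def expected_end(start, seq, gap_chars=GAP_CHARS):
    """Return the end coordinate implied by start and the gapped sequence."""
    # Count only real residues; gap characters do not consume coordinates.
    residues = sum(1 for ch in seq if ch not in gap_chars)
    return start + residues - 1


def check_end(start, end, seq):
    """Return True if the stated end agrees with start and seq."""
    return end == expected_end(start, seq)
```

A sequence truncated to its first line would fail such a check, because
the stated end would be larger than the residues actually read can cover.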

The problem seems to have started with the Stockholm format file.
The parser appears to read only the first line of each sequence and
ignore the rest. Apparently these parse results have then been
written into other test files in various formats. Not being familiar
with the Stockholm format, I can see two options:

1. The alignment file is not correct and the extra newline
   characters should be removed.
2. The regexp doing the parsing needs to do quite clever lookahead
   tricks to see when the next sequence starts.
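
A possible alternative to lookahead in option 2 is to accumulate sequence
text per name across lines. The sketch below is in Python rather than
Perl and is not the bioperl parser. It assumes an interleaved layout in
which every continuation line repeats the sequence name, which may not
match the test file in question. The function name read_interleaved is
hypothetical.

```python
def read_interleaved(lines):
    """Collect sequence chunks per name from 'name chunk' lines.

    Assumption: each sequence line has exactly two whitespace-separated
    fields, and a name seen again continues the same sequence.
    """
    seqs = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, '#' markup lines, and the '//' terminator.
        if not line.strip() or line.startswith("#") or line.startswith("//"):
            continue
        parts = line.split()
        if len(parts) != 2:
            continue  # not a 'name chunk' line under this assumption
        name, chunk = parts
        # Append rather than overwrite, so later blocks extend the sequence.
        seqs[name] = seqs.get(name, "") + chunk
    return seqs
```

With this approach no lookahead is needed: a line that repeats a known
name extends that sequence, and a new name starts a new one.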

I've committed the changed LocatableSeq::end into bioperl-live to help
in sorting out this problem. The SimpleAlign-related tests will fail
until the problem is fixed.

	-Heikki

-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho          heikki@ebi.ac.uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________