[Biopython-dev] Bio.IntelliGenetics

Michiel de Hoon mjldehoon at yahoo.com
Wed Jul 2 15:29:31 UTC 2008


--- On Wed, 7/2/08, Peter <biopython at maubp.freeserve.co.uk> wrote:

> I found an old link I had added on the wiki page for SeqIO
> development,
> http://pbil.univ-lyon1.fr/help/formats.html
> 
> This clearly describes MASE format as having
> (optional) header
> lines as starting with two semi colons.  But are MASE and
> IntelliGenetics the same thing?

It may be that the link in Bio/IntelliGenetics/__init__.py does not actually pertain to the IntelliGenetics format. Except for this link (which, as you point out, talks about the MASE format, not the IntelliGenetics format), I have seen no description elsewhere of file-wide comments preceded by a double semi-colon in the IntelliGenetics format. Even Biopython doesn't treat these consistently: the tests for Bio.IntelliGenetics include comments with the double semi-colon, but the parser doesn't treat them differently from sequence-specific comments.

So let's do the following:
For the IntelliGenetics parser, do not look for double semi-colon comments. Only check if the first character in a line is a semi-colon, and if so, treat it as a sequence-specific comment. This is what Bio.IntelliGenetics currently does anyway.
Replace the parser class in Bio.IntelliGenetics by a generator function, and integrate it with Bio.SeqIO. Then, let's replace the IntelliGenetics tests by files that do not contain the double semi-colon comments.
Does that sound OK?
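
As a rough sketch of what such a generator function might look like (this is not the actual Bio.IntelliGenetics or Bio.SeqIO code; the name iter_ig_records and the tuple layout are made up for illustration, and it assumes every record starts with at least one semi-colon comment line, followed by a title line and then sequence lines):

```python
def iter_ig_records(handle):
    """Yield (title, comments, sequence) tuples from IntelliGenetics-style text.

    Hypothetical sketch: any line whose first character is ';' is treated as a
    sequence-specific comment. A second ';' gets no special meaning and simply
    stays part of the comment text.
    """
    comments = []
    title = None
    seq_lines = []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(";"):
            # A comment line after a title means the previous record is done.
            if title is not None:
                yield title, comments, "".join(seq_lines)
                comments, title, seq_lines = [], None, []
            comments.append(line[1:].strip())
        elif not line.strip():
            continue  # skip blank lines
        elif title is None:
            title = line.strip()  # first non-comment line is the title
        else:
            seq_lines.append(line.strip())
    # Emit the last record, if there was one.
    if title is not None:
        yield title, comments, "".join(seq_lines)
```

The sketch does not strip any terminator character at the end of the sequence, and it does not build SeqRecord objects; both would have to be handled when integrating with Bio.SeqIO.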

--Michiel.

