[Biopython-dev] Bio.IntelliGenetics
Michiel de Hoon
mjldehoon at yahoo.com
Fri Jul 4 10:24:06 UTC 2008
> I'm assuming we'd put the new IntelliGenetics to
> SeqRecord parser in Bio/SeqIO/IgIO.py (based on
> the format name of "ig" used in EMBOSS).
OK.
> Would we then also deprecate Bio.IntelliGenetics?
Yes. Otherwise, it's replicated functionality.
> Do you want to make these changes, or should I?
Either way is fine with me. If you want to include this in Bio.SeqIO, go ahead. If you prefer me to do it, please let me know.
> > Then, let's replace the IntelliGenetics tests by files that do not
> > contain the double semi-colon comments.
>
> Why not just leave the double semi-colon lines alone? The parser
> should be able to cope.
In the example files in test/IntelliGenetics, lines with a ';;' clearly have a different interpretation from the sequence-specific comments starting with ';'. I am fine with skipping the ';;' lines, but if we included them with the sequence-specific comments we would be misrepresenting the file.
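
To make the distinction concrete, here is a minimal sketch of a generator-style parser that skips ';;' lines and keeps only the ';' lines as sequence-specific comments. It is not the code in Bio.IntelliGenetics or Bio.SeqIO; the function name iter_ig_records is hypothetical, and the layout it assumes (one or more ';' comment lines, then a title line, then sequence lines ending in '1' or '2') is a simplification of the format.

```python
def iter_ig_records(lines):
    """Yield (title, comments, sequence) tuples from IntelliGenetics-style lines.

    Hypothetical helper for illustration only. Assumes each record starts
    with one or more ';' comment lines, followed by a title line and then
    sequence lines whose last line ends in '1' or '2'.
    """
    comments = []
    title = None
    seq_parts = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(";;"):
            # File-wide comment: skip it rather than attach it to a record.
            continue
        if line.startswith(";"):
            if title is not None:
                # A comment after a title marks the start of the next record.
                yield title, comments, "".join(seq_parts)
                comments, title, seq_parts = [], None, []
            comments.append(line[1:].strip())
        elif not line.strip():
            continue
        elif title is None:
            title = line.strip()
        else:
            # Drop the trailing '1' / '2' terminator if present.
            seq_parts.append(line.strip().rstrip("12"))
    if title is not None:
        yield title, comments, "".join(seq_parts)
```

Because it is a generator, it can be used as "for title, comments, seq in iter_ig_records(handle): ...", and wrapping each tuple in a SeqRecord would be the natural next step for Bio.SeqIO.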
--Michiel.
--- On Wed, 7/2/08, Peter <biopython at maubp.freeserve.co.uk> wrote:
> From: Peter <biopython at maubp.freeserve.co.uk>
> Subject: Re: [Biopython-dev] Bio.IntelliGenetics
> To: mjldehoon at yahoo.com
> Date: Wednesday, July 2, 2008, 12:11 PM
> > It may be that the link in Bio/IntelliGenetics/__init__.py actually
> > does not pertain to the IntelliGenetics format. Except for this link
> > (which as you point out actually talks about the MASE format, not the
> > IntelliGenetics format), I have seen no description elsewhere of these
> > file-wide comments preceded by a double semi-colon in the
> > IntelliGenetics format. Even Biopython doesn't treat these
> > consistently: The tests for Bio.IntelliGenetics include comments with
> > the double semi-colon, but the parser doesn't treat them differently
> > from sequence-specific comments.
>
> Maybe we should ask BioPerl if they distinguish between the
> IntelliGenetics and MASE formats?
>
> Looking back over the old mailing list, at the time they
> did think the
> two were the same:
> http://lists.open-bio.org/pipermail/biopython-dev/2001-October/000626.html
>
> > So let's do the following:
> > For the IntelliGenetics parser, do not look for double semi-colon
> > comments. Only check if the first character in a line is a
> > semi-colon, and if so, treat it as a sequence-specific comment. This
> > is what Bio.IntelliGenetics currently does anyway.
> > Replace the parser class in Bio.IntelliGenetics by a generator
> > function, and integrate it with Bio.SeqIO.
>
> I'm assuming we'd put the new IntelliGenetics to
> SeqRecord parser in
> Bio/SeqIO/IgIO.py (based on the format name of
> "ig" used in EMBOSS).
> Would we then also deprecate Bio.IntelliGenetics?
>
> Do you want to make these changes, or should I?
>
> > Then, let's replace the IntelliGenetics tests by files that do not
> > contain the double semi-colon comments.
>
> Why not just leave the double semi-colon lines alone? The parser
> should be able to cope.
>
> Peter