[Biopython-dev] New: sequtils.py
Thomas Sicheritz-Ponten
thomas at genome.cbs.dtu.dk
Tue Jul 24 09:09:04 EDT 2001
Hej All,
After the Biopython BoF meeting at ISMB01 in Copenhagen we decided to
temporarily collect sequence utilities/functions in Bio/sequtils.py.
Cessie (our new biopython member) and I started by collecting some functions
(some of them are just aliases to existing but deeply hidden functions).
Currently included:
ProteinX, makeTableX for error free translation of ambiguous DNA
complement, reverse, antiparallel and translate
nice six_frame_translations à la DNA Strider/XBBtools
GC, GC123, GC_skew, Accumulated_GC_skew
fasta_uniqids for getting unique identifiers in a FASTA file (useful when running clustalw)
quick_FASTA_reader for reading huge FASTA files (e.g. genomes)
apply_on_multi_fasta: use any function (e.g. GC) and apply it on all entries in a multiple FASTA file
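
To give a feel for what a few of these do, here is a rough pure-Python sketch. It is not the code in Bio/sequtils.py: the names gc_content, gc_skew, accumulated_gc_skew and the window parameter are made up for illustration, GC skew is assumed to be the usual (G - C) / (G + C) per window, and the reader simply skips all validation.

```python
from itertools import accumulate

# Illustrative sketch only; names and signatures are assumptions,
# not the actual Bio/sequtils.py implementation.

def gc_content(seq):
    """Return the G+C percentage of seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def gc_skew(seq, window=100):
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # A window without any G or C gets a skew of 0.0.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews

def accumulated_gc_skew(seq, window=100):
    """Return the running sum of the per-window GC skew values."""
    return list(accumulate(gc_skew(seq, window)))

# Translation table for unambiguous DNA only (assumption: no IUPAC codes).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def antiparallel(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def quick_fasta_reader(path):
    """Read a whole FASTA file at once; return a list of (title, sequence).

    No format checking at all, which is where the speed comes from.
    """
    with open(path) as handle:
        text = "\n" + handle.read()
    entries = []
    # Split on ">" at the start of a line, so ">" inside a title is harmless.
    for block in text.split("\n>")[1:]:
        title, _, body = block.partition("\n")
        entries.append((title.strip(), body.replace("\n", "")))
    return entries

def apply_on_multi_fasta(path, function, *args):
    """Apply function(sequence, *args) to every entry of a multi-FASTA file."""
    return [(title, function(seq, *args))
            for title, seq in quick_fasta_reader(path)]
```

With something like this, apply_on_multi_fasta("genome.fasta", gc_content) would give one (title, GC percentage) pair per entry; the file name here is just a placeholder.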
Questions:
1) Should we move ProteinX and makeTableX somewhere else?
2) We included a quick_FASTA_reader hack. The FASTA parser is cool and nice,
but because of all its checks it takes ages on e.g. a complete genome.
Should we create a faster alternative (compatible with the normal one)?
3) Some functions already exist in utils.py. Could we move the sequence-based functions
to sequtils.py and use utils.py for other, non-sequence-based functions?
(E.g. I'd like to put my hypergeometric distribution code there for expression data.)
4) Anyone got a hangover from yesterday's banquet?
cheers
-thomas
Sicheritz-Ponten Thomas, Ph.D CBS, Department of Biotechnology
thomas at biopython.org The Technical University of Denmark
CBS: +45 45 252489 Building 208, DK-2800 Lyngby
Fax +45 45 931585 http://www.cbs.dtu.dk/thomas
De Chelonian Mobile ... The Turtle Moves ...