From jkb at mrc-lmb.cam.ac.uk Fri May 3 09:34:19 2002
From: jkb at mrc-lmb.cam.ac.uk (James Bonfield)
Date: Fri, 3 May 2002 14:34:19 +0100
Subject: Reading frames (for Gary)
Message-ID: <20020503143419.B18499@arran.mrc-lmb.cam.ac.uk>

Hi all,

I've spoken to Rodger about how he reports open reading frames and he said
that his -2 starts at the same base as +2 - i.e. the reading frames are
worked out before complementing. (In fact the way his algorithms work avoids
the need to complement anyway, as he just looks up using a reversed table - I
think.)

Gap4 does much the same thing too in its own translation windows.

James

--
James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Fax: (+44) 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/

From jison at hgmp.mrc.ac.uk Mon May 20 10:52:47 2002
From: jison at hgmp.mrc.ac.uk (Dr J.C. Ison)
Date: Mon, 20 May 2002 15:52:47 +0100
Subject: Protein Structure
Message-ID: <3CE90DBF.C3FF17B0@hgmp.mrc.ac.uk>

Major new revision of protein structure applications - w/o full
documentation.

Gary - the documentation that can be used is the one-line summaries given at
the end of this msg. Full documentation plus user-guide to follow soon.

New applications have been cvs add'ed:
  pdbparse.c / acd
  scopseqs.c / acd
  scopnr.c / acd
  seqsearch.c / acd
  seqwords.c / acd
  seqalign.c / acd
  hetparse.c / acd
  scopreso.c / acd
  scoprep.c / acd
  profgen.c / acd
  funky.c / acd
  hmmgen.c / acd

Some applications have been cvs delete'd:
  scope.c / acd
  nrscope.c / acd
  psiblasts.c / acd
  swissparse.c / acd
  alignwrap.c / acd
  dichet.c / acd

The deleted applications have been replaced as follows:
  coordenew  --> pdbparse   (coordnew was deleted a while back)
  scope      --> scopparse
  nrscope    --> scopnr
  psiblasts  --> seqsearch
  swissparse --> seqwords
  alignwrap  --> seqalign

New versions of code have been cvs commit'ed:
  pdbparse.c / acd
  domainer.c / acd
  contacts.c / acd
  interface.c / acd
  pdbtosp.c / acd
  scopparse.c / acd
  scopreso.c / acd
  scopseqs.c / acd
  scopnr.c / acd
  scoprep.c / acd
  scopalign.c / acd
  seqsearch.c / acd
  seqwords.c / acd
  seqsort.c / acd
  seqnr.c / acd
  seqalign.c / acd
  siggen.c / acd
  sigscan.c / acd
  sigplot.c / acd
  hetparse.c / acd
  profgen.c / acd
  funky.c / acd
  hmmgen.c / acd
Plus ajxyz.c / ajxyz.h

Short summaries of the applications are as follows:

pdbparse - Parses pdb files and writes cleaned-up protein coordinate files.
domainer - Reads protein coordinate files and writes domain coordinate
           files.
contacts - Reads coordinate files and writes files of intra-chain
           residue-residue contact data.
interface- Reads coordinate files and writes files of inter-chain
           residue-residue contact data.
pdbtosp  - Converts raw swissprot:pdb equivalence file to embl-like format.
scopparse- Converts raw scop classification files to a file in embl-like
           format.
scopreso - Removes low resolution domains from a scop classification file.
scopseqs - Adds pdb and swissprot sequence records to a scop classification
           file.
scopnr   - Removes redundant domains from a scop classification file.
scoprep  - Reorders scop classification file so that the representative
           structure of each family is given first.
scopalign- Generates alignments for families in a scop classification file
           by using STAMP.
seqsearch- Generates files of hits for families in a scop classification
           file by using PSI-BLAST with seed alignments.
seqwords - Generates files of hits for scop families by searching swissprot
           with keywords.
seqsort  - Reads multiple files of hits and writes a non-ambiguous file of
           hits (scop families file) plus a validation file.
seqnr    - Removes redundant hits from a scop families file.
seqalign - Generates extended alignments for families in a scop families
           file by using CLUSTALW with seed alignments.
siggen   - Generates a sparse protein signature from an alignment and
           residue contact data.
sigscan  - Scans a signature against swissprot and writes a signature hits
           file.
sigplot  - Reads a signature hits file and validation file and generates
           gnuplot data files of signature performance.
profgen  - Generates various profiles for each alignment in a directory.
hmmgen   - Generates a hidden Markov model for each alignment in a
           directory.
hetparse - Converts raw dictionary of heterogen groups to a file in
           embl-like format.
funky    - Reads clean coordinate files and writes file of protein-heterogen
           contact data.

J.

--
Jon C. Ison, PhD
Bioinformatics Applications Group
UK MRC Human Genome Mapping Project Resource Centre
Hinxton, Cambridge, CB10 1SB, UK
E-mail : jison at hgmp.mrc.ac.uk
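
The reading-frame convention described in the "Reading frames (for Gary)"
message can be illustrated with a minimal sketch. It is one reading of that
message, not Rodger's or Gap4's actual code: the function names
(revcomp, codons_frame_first, codons_complement_first) are hypothetical, and
the sketch complements each codon explicitly rather than using the reversed
lookup table the message mentions. It contrasts fixing the frame on the
forward strand first (so -2 shares codon boundaries with +2) with the
alternative of reverse-complementing the whole sequence and then counting the
frame from its start.

```python
# Hypothetical helpers, written only to illustrate the two conventions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def codons_frame_first(seq, frame):
    """Frame is fixed on the forward strand before complementing.

    Frame -k uses the same codon boundaries as frame +k; each codon is
    then reverse-complemented and the codons are read right to left.
    """
    offset = abs(frame) - 1
    codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
    if frame < 0:
        codons = [revcomp(c) for c in reversed(codons)]
    return codons


def codons_complement_first(seq, frame):
    """Alternative convention: reverse-complement the whole sequence,
    then count the frame offset from the start of the result."""
    if frame < 0:
        seq = revcomp(seq)
    offset = abs(frame) - 1
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


if __name__ == "__main__":
    example = "ATGGCCAAAT"  # made-up 10-base sequence
    for frame in (2, -2):
        print(frame, codons_frame_first(example, frame),
              codons_complement_first(example, frame))
```

For the made-up 10-base sequence, frame -2 under the frame-first convention
covers the same forward-strand bases as +2, whereas the complement-first
convention shifts the codon boundaries whenever the sequence length is not a
multiple of three. That is why the two ways of numbering negative frames can
disagree on the same sequence.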