From henrikki.almusa at helsinki.fi Thu Dec 4 09:26:32 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Thu, 4 Dec 2003 16:26:32 +0200
Subject: tfscan output conversion
Message-ID: <200312041626.32319.henrikki.almusa@helsinki.fi>

Hello,

I'm trying to convert tfscan to write report output (patch attached). This
basically should work, but it doesn't. So one problem and one test request.

Problem. For some reason this patch seems to make it hang if used with more
than one sequence. I've used the -debug option and taken the last 75 lines
of the debug output to add as an attachment (since the whole thing is almost
1 MB). I can't figure out what causes this.

Test request. Since the binding factor information is now added to this,
I've put that into the tail of the report. However, I don't get these using
tfscan here, so I'd like someone to see what it looks like with them.

Thanks,
--
Henrikki Almusa
-------------- next part --------------
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'MOUSE$A21COL_02 '
reportWriteSeqTable subseq 6 seq 840 28..33
ajFeatGetNote 'acc'
try /note="*id HS$APOE_08 "
testing *name
try /note="*acc R00149"
testing *name
found [1] 'acc'
reportWriteSeqTable jwid 9 jmin 6 tagval 'R00149'
ajFeatGetNote 'id'
try /note="*id HS$APOE_08 "
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'HS$APOE_08 '
reportWriteSeqTable subseq 5 seq 840 374..378
ajFeatGetNote 'acc'
try /note="*id HS$ALBU_03 "
testing *name
try /note="*acc R00079"
testing *name
found [1] 'acc'
reportWriteSeqTable jwid 9 jmin 6 tagval 'R00079'
ajFeatGetNote 'id'
try /note="*id HS$ALBU_03 "
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'HS$ALBU_03 '
ajStrCut 0 0 len: 1 ibegin: 0 iend: 1
ajStrCut 0 0 len: 3 ibegin: 0 iend: 1
ajFeattableDel 80723c0
ajSeqRead: input file 'mRNA.small.twice' still there, try again
ajFeattableDel 0
seqRead: cleared
seqRead: seqin format 10 'fasta'
seqRead: one format specified
ajFileBuffNobuff mRNA.small.twice buffsize: 15
++seqRead known format 10
++seqReadFmt format 10 (fasta) 'mRNA.small.twice' feat No
ajSeqParseNcbi '>Exon_10_head_2 (copied for testing two seqs) '
trying ajSeqParseFasta
ajSeqParseFasta '>Exon_10_head_2 (copied for testing two seqs) '
result id: 'Exon_10_head_2' acc: '' desc: '(copied for testing two seqs) '
parsed id 'Exon_10_head_2' acc '' sv '' gi '' desc '(copied for testing two seqs) '
seqSetName 'Exon_10_head_2' result: 'Exon_10_head_2'
at EOF: File already read to end mRNA.small.twice
End of file - data in buffer - return ajFalse
ajFileBuffClear (0) Nobuff: Yes
first: 15 thys->Pos: 15 thys->Size: 15 thys->Nobuff: Yes
ajFileBuffClear 'mRNA.small.twice' (0 lines) Y size: 15 pos: 15 removed 15 lines add to free: 0
seqReadFmt success with format 10 (fasta)
seqQueryMatch 'Exon_10_head_2' id '' acc '' Sv '' Des '' Key '' Org ''
No accession number to test
No taxonomy to test
No keyword to test
No description to test
testing sequence 'Exon_10_head_2' type 'DNA' IsNuc No IsProt No
ajSeqTypeCheckIn type 'dna' found (DNA sequence)
Remove all gaps
ajSeqIsNuc Type ''
seqTypeGapnucS test
Convert '?XUu' to 'NNTt'
ajSeqRead: open buffer usa: 'mRNA.small.twice' returns: Yes
++keep restored 0..0 (N) 'fasta' 10
ajSeqRead: thys->Db '', seqin->Db ''
ajSeqRead: thys->Name 'Exon_10_head_2'
ajSeqRead: thys->Entryname 'Exon_10_head_2', seqin->Entryname ''
ajSeqRead: thys->Name 'Exon_10_head_2'
ajSeqSetRange (len: 840 0..0 old 0..0) result: (len: 840 0..0)
ajSeqallNext success
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tfscan_report.patch
Type: text/x-diff
Size: 5759 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/emboss-dev/attachments/20031204/bda0a97e/attachment.bin

From henrikki.almusa at helsinki.fi Fri Dec 5 05:08:52 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Fri, 5 Dec 2003 12:08:52 +0200
Subject: Report format
Message-ID: <200312051208.52153.henrikki.almusa@helsinki.fi>

Hello,

Small question about report format. Since one can add tag in style
'type:value=text_in_file'. Is there some way to give spaces in "text_in_file"
or give sort of "%-xS" type of syntax to make sure at least an x-sized area is
used?

Thanks,
--
Henrikki Almusa

From pmr at ebi.ac.uk Fri Dec 5 06:20:27 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 05 Dec 2003 11:20:27 +0000
Subject: Report format
In-Reply-To: <200312051208.52153.henrikki.almusa@helsinki.fi>
References: <200312051208.52153.henrikki.almusa@helsinki.fi>
Message-ID: <3FD069FB.7010600@ebi.ac.uk>

Henrikki Almusa wrote:
> Hello,
>
> Small question about report format. Since one can add tag in style
> 'type:value=text_in_file'. Is there some way to give spaces in "text_in_file"
> or give sort of "%-xS" type of syntax to make sure at least an x-sized area is
> used?

Good idea!

The code has widths for each column already.

We need a syntax to give:

1. minimum column width
2. maximum column width (for example, for sequence data that can be very
   long)

Perhaps type:value%n.n=columnheading

More questions:

Do we need more column types?

Do we need a way to define the "standard" tags - to change them or to
exclude them (excluding could be done with qualifiers, changing needs a
tag syntax)

regards,

Peter

From henrikki.almusa at helsinki.fi Fri Dec 5 07:04:36 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Fri, 5 Dec 2003 14:04:36 +0200
Subject: Report format
In-Reply-To: <3FD069FB.7010600@ebi.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk>
Message-ID: <200312051404.36982.henrikki.almusa@helsinki.fi>

On Friday 05 December 2003 13:20, Peter Rice wrote:
> The code has widths for each column already.
>
> We need a syntax to give:
>
> 1. minimum column width
> 2. maximum column width (for example, for sequence data that can be very
> long)
>
> Perhaps type:value%n.n=columnheading

Might be good to make it possible to add space within there. Eg. column
heading. Possible ways could be either allow type:value%n.n='column heading'
or add tag_delim to report, which defaults on space or whitespace.

> More questions:
>
> Do we need more column types?

Perhaps name/id tag. Can't figure out much else.

> Do we need a way to define the "standard" tags - to change them or to
> exclude them (excluding could be done with qualifiers, changing needs a
> tag syntax)

This might help. For example 'cusp' could be adapted if the sequence and
start and end points could be dropped. Perhaps this can be done in some
report format, would need to check.

I think there are those '-noscore' etc. which do a similar thing. Of course
all "standard" tags then need to be named and added to there.

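For illustration only, the proposed type:value%n.n=columnheading form could be parsed roughly as below. This is a Python sketch, not EMBOSS code. The names parse_tag_spec and TagSpec are hypothetical, reading %n.n as "minimum.maximum column width" is an assumption taken from the proposal in this thread, and the single-quoted heading is the 'column heading' variant suggested above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical grammar (an assumption, not the EMBOSS parser):
#   type:name[%min[.max]]=heading
_TAG_RE = re.compile(
    r"^(?P<type>\w+):(?P<name>\w+)"
    r"(?:%(?P<min>\d+)(?:\.(?P<max>\d+))?)?"
    r"=(?P<heading>.+)$"
)


@dataclass
class TagSpec:
    type: str                  # column type, e.g. "str" or "float"
    name: str                  # tag name stored in the feature table
    min_width: Optional[int]   # None when no %n part is given
    max_width: Optional[int]   # None when no .n part is given
    heading: str               # text shown as the column heading


def _to_int(text: Optional[str]) -> Optional[int]:
    return int(text) if text is not None else None


def parse_tag_spec(spec: str) -> TagSpec:
    """Split one tag specification into its parts."""
    match = _TAG_RE.match(spec)
    if match is None:
        raise ValueError(f"not a tag spec: {spec!r}")
    heading = match.group("heading")
    # Allow a single-quoted heading so that it may contain spaces.
    if len(heading) >= 2 and heading[0] == heading[-1] == "'":
        heading = heading[1:-1]
    return TagSpec(
        type=match.group("type"),
        name=match.group("name"),
        min_width=_to_int(match.group("min")),
        max_width=_to_int(match.group("max")),
        heading=heading,
    )
```

With this sketch, "str:acc%6.10=Accession" gives a minimum width of 6 and a maximum of 10, while the existing "type:value=text_in_file" style still parses with both widths unset.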
--
Henrikki Almusa

From henrikki.almusa at helsinki.fi Fri Dec 5 07:11:53 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Fri, 5 Dec 2003 14:11:53 +0200
Subject: Report format
In-Reply-To: <3FD069FB.7010600@ebi.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk>
Message-ID: <200312051411.53703.henrikki.almusa@helsinki.fi>

Forgot to add to previous mail. One thing that would be nice to do with
report format is sorting. Would it be hard to try to create a possibility to
sort the feature table in ascending/descending order other than how it's
built? E.g. sort by sequence, then start point?

--
Henrikki Almusa

From pmr at ebi.ac.uk Fri Dec 5 07:38:29 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 05 Dec 2003 12:38:29 +0000
Subject: Report format
In-Reply-To: <200312051404.36982.henrikki.almusa@helsinki.fi>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk> <200312051404.36982.henrikki.almusa@helsinki.fi>
Message-ID: <3FD07C45.2070703@ebi.ac.uk>

Henrikki Almusa wrote:
>>Perhaps type:value%n.n=columnheading
>
> Might be good to make it possible to add space within there. Eg. column
> heading. Possible ways could be either allow type:value%n.n='column heading'
> or add tag_delim to report, which defaults on space or whitespace.

Spaces are tricky - for parsers that read the output.

I prefer column_heading (or ColumnHeading)

Any other comments on this?

>>Do we need more column types?
>
> Perhaps name/id tag. Can't figure out much else.

Are they different to "str"?

I noticed in checking 2.8.0 that "rstr" works as a right-justified
string. (str is left justified, anything else is right-justified.)
Perhaps we should strictly check the tag types (anything is allowed in
2.8.0!)

>>Do we need a way to define the "standard" tags - to change them or to
>>exclude them (excluding could be done with qualifiers, changing needs a
>>tag syntax)
> This might help. For example 'cusp' could be adapted if the sequence and start
> and end points could be dropped. Perhaps this can be done in some report
> format, would need to check.

Will do. We can put the -norstart and -norend qualifiers (see below) into
the ACD report definition.

> I think there are those '-noscore' etc. which do a similar thing. Of course all
> "standard" tags then need to be named and added to there.

Yes, will do. Have to check for any report formats that may be strange
without specific tags (they can, of course, ignore the qualifier)

regards,

Peter

From henrikki.almusa at helsinki.fi Fri Dec 5 08:46:06 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Fri, 5 Dec 2003 15:46:06 +0200
Subject: Report format
In-Reply-To: <3FD07C45.2070703@ebi.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <200312051404.36982.henrikki.almusa@helsinki.fi> <3FD07C45.2070703@ebi.ac.uk>
Message-ID: <200312051546.06872.henrikki.almusa@helsinki.fi>

On Friday 05 December 2003 14:38, Peter Rice wrote:
> Henrikki Almusa wrote:
> >>Perhaps type:value%n.n=columnheading
> >
> > Might be good to make it possible to add space within there. Eg. column
> > heading. Possible ways could be either allow type:value%n.n='column
> > heading' or add tag_delim to report, which defaults on space or
> > whitespace.
>
> Spaces are tricky - for parsers that read the output.
>
> I prefer column_heading (or ColumnHeading)
>
> Any other comments on this?

I just don't like ThisTypeOfCapsing much. But I can live with it, no problem.

> >>Do we need more column types?
> >
> > Perhaps name/id tag. Can't figure out much else.
>
> Are they different to "str"?

Ah, right, I understood it wrong then and you're right, no different.

> I noticed in checking 2.8.0 that "rstr" works as a right-justified
> string. (str is left justified, anything else is right-justified.)
> Perhaps we should strictly check the tag types (anything is allowed in
> 2.8.0!)

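The justification rule quoted above ("str" left-justified, anything else right-justified) can be sketched in a few lines. This is an illustrative Python sketch and not the EMBOSS implementation; the function name format_cell and its width parameter are hypothetical.

```python
def format_cell(tag_type: str, value: str, width: int) -> str:
    """Pad a tag value to a column width.

    Mirrors the behaviour described for 2.8.0 in this thread: only the
    "str" type is left-justified; every other type name, including
    "rstr" and unknown ones, ends up right-justified.
    """
    if tag_type == "str":
        return value.ljust(width)
    return value.rjust(width)
```

Note that ljust and rjust never truncate, so a value longer than the width is returned unchanged. A maximum column width would need separate handling.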
And perhaps put some documentation into web :). There is quite little
knowledge on what can be done with report that I could find.

--
Henrikki Almusa

From d.counsell at hgmp.mrc.ac.uk Mon Dec 8 06:28:15 2003
From: d.counsell at hgmp.mrc.ac.uk (Damian Counsell)
Date: Mon, 8 Dec 2003 11:28:15 +0000
Subject: Report format
In-Reply-To: <3FD069FB.7010600@ebi.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk>
Message-ID: <20031208112815.GB5099@dev4.hgmp.mrc.ac.uk>

* Peter Rice [031205 11:23]:
> Henrikki Almusa wrote:
> >Hello,
> >
> >Small question about report format. Since one can add tag in style
> >'type:value=text_in_file'. Is there some way to give spaces in
> >"text_in_file" or give sort of "%-xS" type of syntax to make sure at least
> >an x-sized area is used?
>
> Good idea!
>
> The code has widths for each column already.
>
> We need a syntax to give:
>
> 1. minimum column width
> 2. maximum column width (for example, for sequence data that can be very
> long)
>
> Perhaps type:value%n.n=columnheading
>
> More questions:
>
> Do we need more column types?

This may be a completely stupid suggestion, but, if you don't ask...

Could we have decimal tabs, please? You know: ones smart enough to
align themselves by the position of the floating point, even in the
absence of specifying the number of digits before and after it?

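As a rough picture of what "decimal tabs" would mean for a column of values, the Python sketch below pads each string so that the decimal points line up without the caller specifying the number of digits. It is only an illustration under assumed names (align_decimal is hypothetical) and says nothing about how the report code would do it.

```python
from typing import List


def align_decimal(values: List[str]) -> List[str]:
    """Pad numeric strings so their decimal points share one column."""
    if not values:
        return []
    # Split each value into the parts before and after the first ".".
    parts = [value.partition(".") for value in values]
    left = max(len(whole) for whole, _, _ in parts)
    right = max(len(frac) for _, _, frac in parts)
    cells = []
    for whole, dot, frac in parts:
        cell = whole.rjust(left)
        if right:
            # Values without a "." get a space where the point would be.
            cell += (dot or " ") + frac.ljust(right)
        cells.append(cell)
    return cells
```

For the inputs "1.5", "12.25" and "100", every cell comes out six characters wide with the points (or the gap standing in for one) in the fourth column.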
all the best

Damian
--
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SB, UK
phone: +44 (0)1223 494585 fax: +44 (0)1223 494512
email: d.counsell at hgmp.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk/~dcounsel/

From pmr at ebi.ac.uk Mon Dec 8 06:45:59 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 08 Dec 2003 11:45:59 +0000
Subject: Report format
In-Reply-To: <20031208112815.GB5099@dev4.hgmp.mrc.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk> <20031208112815.GB5099@dev4.hgmp.mrc.ac.uk>
Message-ID: <3FD46477.1030602@ebi.ac.uk>

Damian Counsell wrote:
> This may be a completely stupid suggestion, but, if you don't ask...
>
> Could we have decimal tabs, please? You know: ones smart enough to
> align themselves by the position of the floating point, even in the
> absence of specifying the number of digits before and after it?

Hmmmm ...

What really happens to these data types is that the values are all
strings written by the calling program and stored as tag=value pairs in
an internal feature table.

On output, the (string) tag value is reported.

"str" tags are left justified
Anything else is right justified.

We could try, for floats, reading the value in and rewriting it in a
fixed precision. A small overhead but maybe worth it.

We could also try removing extra trailing zeroes in some cases.

Comments?

Peter

From henrikki.almusa at helsinki.fi Mon Dec 8 09:24:42 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Mon, 8 Dec 2003 16:24:42 +0200
Subject: tfscan output conversion
In-Reply-To: <200312041626.32319.henrikki.almusa@helsinki.fi>
References: <200312041626.32319.henrikki.almusa@helsinki.fi>
Message-ID: <200312081624.42660.henrikki.almusa@helsinki.fi>

Hello,

This patch should actually print the info properly into tail. Still would
like to confirm that though. But the problem still remains. So any info on
why the while fails?

--
Henrikki Almusa
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tfscan_report.patch
Type: text/x-diff
Size: 6413 bytes
Desc: not available
Url : http://lists.open-bio.org/pipermail/emboss-dev/attachments/20031208/00108002/attachment.bin
-------------- next part --------------
ajFeatGetNote 'id'
try /note="*id MOUSE$A21COL_02 "
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'MOUSE$A21COL_02 '
reportWriteSeqTable subseq 6 seq 840 28..33
ajFeatGetNote 'acc'
try /note="*id HS$APOE_08 "
testing *name
try /note="*acc R00149"
testing *name
found [1] 'acc'
reportWriteSeqTable jwid 9 jmin 6 tagval 'R00149'
ajFeatGetNote 'id'
try /note="*id HS$APOE_08 "
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'HS$APOE_08 '
reportWriteSeqTable subseq 5 seq 840 374..378
ajFeatGetNote 'acc'
try /note="*id HS$ALBU_03 "
testing *name
try /note="*acc R00079"
testing *name
found [1] 'acc'
reportWriteSeqTable jwid 9 jmin 6 tagval 'R00079'
ajFeatGetNote 'id'
try /note="*id HS$ALBU_03 "
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'HS$ALBU_03 '
ajFeattableDel 80723d8
ajSeqRead: input file 'mRNA.small.twice' still there, try again
ajFeattableDel 0
seqRead: cleared
seqRead: seqin format 10 'fasta'
seqRead: one format specified
ajFileBuffNobuff mRNA.small.twice buffsize: 15
++seqRead known format 10
++seqReadFmt format 10 (fasta) 'mRNA.small.twice' feat No
ajSeqParseNcbi '>Exon_10_head_2 (copied for testing two seqs) '
trying ajSeqParseFasta
ajSeqParseFasta '>Exon_10_head_2 (copied for testing two seqs) '
result id: 'Exon_10_head_2' acc: '' desc: '(copied for testing two seqs) '
parsed id 'Exon_10_head_2' acc '' sv '' gi '' desc '(copied for testing two seqs) '
seqSetName 'Exon_10_head_2' result: 'Exon_10_head_2'
at EOF: File already read to end mRNA.small.twice
End of file - data in buffer - return ajFalse
ajFileBuffClear (0) Nobuff: Yes
first: 15 thys->Pos: 15 thys->Size: 15 thys->Nobuff: Yes
ajFileBuffClear 'mRNA.small.twice' (0 lines) Y size: 15 pos: 15 removed 15 lines add to free: 0
seqReadFmt success with format 10 (fasta)
seqQueryMatch 'Exon_10_head_2' id '' acc '' Sv '' Des '' Key '' Org ''
No accession number to test
No taxonomy to test
No keyword to test
No description to test
testing sequence 'Exon_10_head_2' type 'DNA' IsNuc No IsProt No
ajSeqTypeCheckIn type 'dna' found (DNA sequence)
Remove all gaps
ajSeqIsNuc Type ''
seqTypeGapnucS test
Convert '?XUu' to 'NNTt'
ajSeqRead: open buffer usa: 'mRNA.small.twice' returns: Yes
++keep restored 0..0 (N) 'fasta' 10
ajSeqRead: thys->Db '', seqin->Db ''
ajSeqRead: thys->Name 'Exon_10_head_2'
ajSeqRead: thys->Entryname 'Exon_10_head_2', seqin->Entryname ''
ajSeqRead: thys->Name 'Exon_10_head_2'
ajSeqSetRange (len: 840 0..0 old 0..0) result: (len: 840 0..0)
ajSeqallNext success

From pmr at ebi.ac.uk Mon Dec 8 10:01:57 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 08 Dec 2003 15:01:57 +0000
Subject: ACD changes for 2.9.0
Message-ID: <3FD49265.3030205@ebi.ac.uk>

Just committed some new ACD validations (in acdvalid). Interface
developers will need to look for them in 2.9.0.

New section "additional" for qualifiers with additional:"Y" defined. I
would suggest treating this in the same way as "advanced" (for many
programs it needed only a rename of the advanced section).

New ACD type "toggle" - this is the same as "boolean" and will be used
for those boolean values that are only used to control other ACD
qualifiers (-plot for example). acdvalid will allow these toggles in
other sections, and will (but does not yet) check for them in calculated
values.

Boolean values will be expected to appear in the required, additional or
advanced sections (but can be in the input or output sections without
problem, as before).

Input and output datatypes now must appear in the input and output
sections. matrix, datafile and cfile datatypes have been relocated.

The application name in the ACD file must match the true application
name. This is only checked by acdvalid so far to avoid breaking
third-party ACD files.

Output outfile, align, report, etc. have new attributes:

nullok - if true, can return a NULL value

nulldefault - if true, defaults to a NULL value. Setting a filename on the
   command line overrides and creates an output file. Setting to "" on the
   commandline ***creates the expected default filename***

missing - if true, can create the expected filename by simply using
   -qualname on the commandline (rather than -qualname="") if it is last on
   the command line or followed by another qualifier (if followed by a
   parameter, that will appear to be the filename value)

I have started to change "string" datatypes to other datatypes if
appropriate (for example to directory or datafile). There will be more
of these.

Peter

From d.counsell at hgmp.mrc.ac.uk Mon Dec 8 10:12:09 2003
From: d.counsell at hgmp.mrc.ac.uk (Damian Counsell)
Date: Mon, 8 Dec 2003 15:12:09 +0000
Subject: Report format
In-Reply-To: <3FD46477.1030602@ebi.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk> <20031208112815.GB5099@dev4.hgmp.mrc.ac.uk> <3FD46477.1030602@ebi.ac.uk>
Message-ID: <20031208151209.GD5099@dev4.hgmp.mrc.ac.uk>

Peter!

* Peter Rice [031208 11:49]:
> Damian Counsell wrote:
> >This may be a completely stupid suggestion, but, if you don't ask...
> >
> >Could we have decimal tabs, please? You know: ones smart enough to
> >align themselves by the position of the floating point, even in the
> >absence of specifying the number of digits before and after it?
>
> Hmmmm ...
>
> What really happens to these data types is that the values are all
> strings written by the calling program and stored as tag=value pairs in
> an internal feature table.
>
> On output, the (string) tag value is reported.
>
> "str" tags are left justified
> Anything else is right justified.
>
> We could try, for floats, reading the value in and rewriting it in a
> fixed precision. A small overhead but maybe worth it.

Thanks for the explanation. If no one else has any objections this
sounds fine to me.

> We could also try removing extra trailing zeroes in some cases.

I have no problem with trailing zeroes when there is consistent and
controllable length and precision in the output. Leading zeroes are
another matter; the right number of leading spaces would be the ideal
solution for me of course.

all the best

Damian
--
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SB, UK
phone: +44 (0)1223 494585 fax: +44 (0)1223 494512
email: d.counsell at hgmp.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk/~dcounsel/

From gwilliam at hgmp.mrc.ac.uk Fri Dec 12 08:59:18 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Fri, 12 Dec 2003 13:59:18 +0000
Subject: CpG programs
Message-ID: <3FD9C9B6.BEEF7DC8@hgmp.mrc.ac.uk>

EMBOSS has several programs for finding CpG islands:

cpgreport     Reports all CpG rich regions
newcpgseek    Reports CpG rich regions
newcpgreport  Report CpG rich areas

The documentation (originally supplied by the author) says that for all
practical purposes you should probably use newcpgreport.

There is probably a case for retiring some of these programs to the
'make check' section of the Makefile? (i.e. remove them from the standard
distribution unless explicitly compiled.)

Which of these, if any, do you use and why?

Regards,
Gary

--
Gary Williams
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
Tel: +44 1223 494522 Fax: +44 1223 494512
E-mail: gwilliam at rfcgr.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk

From rls at ebi.ac.uk Mon Dec 22 05:19:43 2003
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Mon, 22 Dec 2003 10:19:43 -0000
Subject: CpG programs
In-Reply-To: <3FD9C9B6.BEEF7DC8@hgmp.mrc.ac.uk>
Message-ID:

Hi,

Sorry for the late reply. I'm currently re-writing a small portion of the
code to speed things up together with a collaborator. As soon as this one
is tried and tested we will move to replace the old version of
newcpgreport with this one. As soon as that is done a name change from
newcpgreport to cpgreport will be requested and the old programs can be
retired.

Thanks and Merry Xmas to all!!!!

R:)

> -----Original Message-----
> From: owner-emboss-dev at hgmp.mrc.ac.uk
> [mailto:owner-emboss-dev at hgmp.mrc.ac.uk]On Behalf Of Gary Williams, Tel
> 01223 494522
> Sent: 12 December 2003 13:59
> To: emboss-dev at embnet.org
> Subject: CpG programs
>
>
> EMBOSS has several programs for finding CpG islands:
>
> cpgreport Reports all CpG rich regions
> newcpgseek Reports CpG rich regions
> newcpgreport Report CpG rich areas
>
> The documentation (originally supplied by the author) says that for all
> practical purposes you should probably use newcpgreport.
>
> There is probably a case for retiring some of these programs to the
> 'make check' section of the Makefile? (i.e remove them from the standard
> distribution unless explicitly compiled.)
>
> Which of these, if any, do you use and why?
> > Regards, > Gary > > -- > Gary Williams > MRC Rosalind Franklin Centre for Genomics Research > Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK > Tel: +44 1223 494522 Fax: +44 1223 494512 > E-mail: gwilliam at rfcgr.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk > From henrikki.almusa at helsinki.fi Thu Dec 4 14:26:32 2003 From: henrikki.almusa at helsinki.fi (Henrikki Almusa) Date: Thu, 4 Dec 2003 16:26:32 +0200 Subject: tfscan output conversion Message-ID: <200312041626.32319.henrikki.almusa@helsinki.fi> Hello, I'm trying to convert tfscan to write report output (patch attached). This basicly should work, but it doesn't. So one problem and one test request. Problem. For some reason this patch seems to make it hang if used more than one sequence. I've used -debug option and taken last 75 lines from debug to add as attachment (since whole thing is almost 1meg). I can't figure out what causes this. Test request. Since there is now the binding factor informtion added to this i've put that into tail of report. However i don't get these using tfscan here, so i'd like someone to see what it looks like with them. 
Thanks, -- Henrikki Almusa -------------- next part -------------- testing *name found [1] 'id' reportWriteSeqTable jwid 10 jmin 6 tagval 'MOUSE$A21COL_02 ' reportWriteSeqTable subseq 6 seq 840 28..33 ajFeatGetNote 'acc' try /note="*id HS$APOE_08 " testing *name try /note="*acc R00149" testing *name found [1] 'acc' reportWriteSeqTable jwid 9 jmin 6 tagval 'R00149' ajFeatGetNote 'id' try /note="*id HS$APOE_08 " testing *name found [1] 'id' reportWriteSeqTable jwid 10 jmin 6 tagval 'HS$APOE_08 ' reportWriteSeqTable subseq 5 seq 840 374..378 ajFeatGetNote 'acc' try /note="*id HS$ALBU_03 " testing *name try /note="*acc R00079" testing *name found [1] 'acc' reportWriteSeqTable jwid 9 jmin 6 tagval 'R00079' ajFeatGetNote 'id' try /note="*id HS$ALBU_03 " testing *name found [1] 'id' reportWriteSeqTable jwid 10 jmin 6 tagval 'HS$ALBU_03 ' ajStrCut 0 0 len: 1 ibegin: 0 iend: 1 ajStrCut 0 0 len: 3 ibegin: 0 iend: 1 ajFeattableDel 80723c0 ajSeqRead: input file 'mRNA.small.twice' still there, try again ajFeattableDel 0 seqRead: cleared seqRead: seqin format 10 'fasta' seqRead: one format specified ajFileBuffNobuff mRNA.small.twice buffsize: 15 ++seqRead known format 10 ++seqReadFmt format 10 (fasta) 'mRNA.small.twice' feat No ajSeqParseNcbi '>Exon_10_head_2 (copied for testing two seqs) ' trying ajSeqParseFasta ajSeqParseFasta '>Exon_10_head_2 (copied for testing two seqs) ' result id: 'Exon_10_head_2' acc: '' desc: '(copied for testing two seqs) ' parsed id 'Exon_10_head_2' acc '' sv '' gi '' desc '(copied for testing two seqs) ' seqSetName 'Exon_10_head_2' result: 'Exon_10_head_2' at EOF: File already read to end mRNA.small.twice End of file - data in buffer - return ajFalse ajFileBuffClear (0) Nobuff: Yes first: 15 thys->Pos: 15 thys->Size: 15 thys->Nobuff: Yes ajFileBuffClear 'mRNA.small.twice' (0 lines) Y size: 15 pos: 15 removed 15 lines add to free: 0 seqReadFmt success with format 10 (fasta) seqQueryMatch 'Exon_10_head_2' id '' acc '' Sv '' Des '' Key '' Org '' No 
accession number to test No taxonomy to test No keyword to test No description to test testing sequence 'Exon_10_head_2' type 'DNA' IsNuc No IsProt No ajSeqTypeCheckIn type 'dna' found (DNA sequence) Remove all gaps ajSeqIsNuc Type '' seqTypeGapnucS test Convert '?XUu' to 'NNTt' ajSeqRead: open buffer usa: 'mRNA.small.twice' returns: Yes ++keep restored 0..0 (N) 'fasta' 10 ajSeqRead: thys->Db '', seqin->Db '' ajSeqRead: thys->Name 'Exon_10_head_2' ajSeqRead: thys->Entryname 'Exon_10_head_2', seqin->Entryname '' ajSeqRead: thys->Name 'Exon_10_head_2' ajSeqSetRange (len: 840 0..0 old 0..0) result: (len: 840 0..0) ajSeqallNext success -------------- next part -------------- A non-text attachment was scrubbed... Name: tfscan_report.patch Type: text/x-diff Size: 5759 bytes Desc: not available URL: From henrikki.almusa at helsinki.fi Fri Dec 5 10:08:52 2003 From: henrikki.almusa at helsinki.fi (Henrikki Almusa) Date: Fri, 5 Dec 2003 12:08:52 +0200 Subject: Report format Message-ID: <200312051208.52153.henrikki.almusa@helsinki.fi> Hello, Small question about report format. Since one can add tag in style 'type:value=text_in_file'. Is there some way to give spaces in "text_in_file" or give sort of "%-xS" type of syntax to make sure atleast x sized are is used? Thanks, -- Henrikki Almusa From pmr at ebi.ac.uk Fri Dec 5 11:20:27 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Fri, 05 Dec 2003 11:20:27 +0000 Subject: Report format In-Reply-To: <200312051208.52153.henrikki.almusa@helsinki.fi> References: <200312051208.52153.henrikki.almusa@helsinki.fi> Message-ID: <3FD069FB.7010600@ebi.ac.uk> Henrikki Almusa wrote: > Hello, > > Small question about report format. Since one can add tag in style > 'type:value=text_in_file'. Is there some way to give spaces in "text_in_file" > or give sort of "%-xS" type of syntax to make sure atleast x sized are is > used? Good idea! The code has widths for each column already. We need a syntax to give: 1. minimum column width 2. 
maximum column width (for example, for sequence data that can be very long) Perhaps type:value%n.n=columnheading More questions: Do we need more column types? Do we need a way to define the "standard" tags - to change them or to exclude them (excluding could be done with qualifiers, changing needs a tag syntax) regards, Peter From henrikki.almusa at helsinki.fi Fri Dec 5 12:04:36 2003 From: henrikki.almusa at helsinki.fi (Henrikki Almusa) Date: Fri, 5 Dec 2003 14:04:36 +0200 Subject: Report format In-Reply-To: <3FD069FB.7010600@ebi.ac.uk> References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk> Message-ID: <200312051404.36982.henrikki.almusa@helsinki.fi> On Friday 05 December 2003 13:20, Peter Rice wrote: > The code has widths for each column already. > > We need a syntax to give: > > 1. minimum column width > 2. maximum column width (for example, for sequence data that can be very > long) > > Perhaps type:value%n.n=columnheading Might be good to make it possible to add space within there. Eg. column heading. Possible ways could be either allow type:value%n.n='column heading' or add tag_delim to report, which defaults on space or whitespace. > More questions: > > Do we need more column types? Perhaps name/id tag. Can't figure out much else. > Do we need a way to define the "standard" tags - to change them or to > exclude them (excluding could be done with qualifiers, changing needs a > tag syntax) This might help. For example 'cusp' could be adapted if the sequence and start and end points could be dropped. Perhaps this can be done in some report format, would need to check. I think there is those '-noscore' etc which do similar thing. Of course all "standard" tags then need to be named and added to there. 
-- Henrikki Almusa From henrikki.almusa at helsinki.fi Fri Dec 5 12:11:53 2003 From: henrikki.almusa at helsinki.fi (Henrikki Almusa) Date: Fri, 5 Dec 2003 14:11:53 +0200 Subject: Report format In-Reply-To: <3FD069FB.7010600@ebi.ac.uk> References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk> Message-ID: <200312051411.53703.henrikki.almusa@helsinki.fi> Forgot to add to previous mail. One thing that would be nice to do with report format is sorting. Would it be hard to try to create a possibility to sort the feature table accending/decending order other than how its built? Eg sort by sequence, then start point? -- Henrikki Almusa From pmr at ebi.ac.uk Fri Dec 5 12:38:29 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Fri, 05 Dec 2003 12:38:29 +0000 Subject: Report format In-Reply-To: <200312051404.36982.henrikki.almusa@helsinki.fi> References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk> <200312051404.36982.henrikki.almusa@helsinki.fi> Message-ID: <3FD07C45.2070703@ebi.ac.uk> Henrikki Almusa wrote: >>Perhaps type:value%n.n=columnheading > > Might be good to make it possible to add space within there. Eg. column > heading. Possible ways could be either allow type:value%n.n='column heading' > or add tag_delim to report, which defaults on space or whitespace. Spaces are tricky - for parsers that read the output. I prefer column_heading (or ColumnHeading) Any other comments on this? >>Do we need more column types? > > Perhaps name/id tag. Can't figure out much else. Are they different to "str"? I noticed in checking 2.8.0 that "rstr" works as a right-justified string. (str is left justified, anything else is right-justified. Perhaps we should structly check the tag types (anything is allowed in 2.8.0!) >>Do we need a way to define the "standard" tags - to change them or to >>exclude them (excluding could be done with qualifiers, changing needs a >>tag syntax) > This might help. 
For example 'cusp' could be adapted if the sequence and start > and end points could be dropped. Perhaps this can be done in some report > format, would need to check. Will do. We can put the -norstart and -norend qualifiers (see below) into the ACD report definition. > I think there is those '-noscore' etc which do similar thing. Of course all > "standard" tags then need to be named and added to there. Yes, will do. Have to check for any report formats that may be strange without specific tags (they can,of course, ignore the qualifier) regards, Peter From henrikki.almusa at helsinki.fi Fri Dec 5 13:46:06 2003 From: henrikki.almusa at helsinki.fi (Henrikki Almusa) Date: Fri, 5 Dec 2003 15:46:06 +0200 Subject: Report format In-Reply-To: <3FD07C45.2070703@ebi.ac.uk> References: <200312051208.52153.henrikki.almusa@helsinki.fi> <200312051404.36982.henrikki.almusa@helsinki.fi> <3FD07C45.2070703@ebi.ac.uk> Message-ID: <200312051546.06872.henrikki.almusa@helsinki.fi> On Friday 05 December 2003 14:38, Peter Rice wrote: > Henrikki Almusa wrote: > >>Perhaps type:value%n.n=columnheading > > > > Might be good to make it possible to add space within there. Eg. column > > heading. Possible ways could be either allow type:value%n.n='column > > heading' or add tag_delim to report, which defaults on space or > > whitespace. > > Spaces are tricky - for parsers that read the output. > > I prefer column_heading (or ColumnHeading) > > Any other comments on this? I just don't like ThisTypeOfCapsing much. But i can live with it, no problem. > >>Do we need more column types? > > > > Perhaps name/id tag. Can't figure out much else. > > Are they different to "str"? Ah, right, understood it then wrong and your right, no dirrefent. > I noticed in checking 2.8.0 that "rstr" works as a right-justified > string. (str is left justified, anything else is right-justified. > Perhaps we should structly check the tag types (anything is allowed in > 2.8.0!) 
And perhaps put some documentation on the web :). There is quite little
information on what can be done with report that I could find.

--
Henrikki Almusa

From d.counsell at hgmp.mrc.ac.uk Mon Dec 8 11:28:15 2003
From: d.counsell at hgmp.mrc.ac.uk (Damian Counsell)
Date: Mon, 8 Dec 2003 11:28:15 +0000
Subject: Report format
In-Reply-To: <3FD069FB.7010600@ebi.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk>
Message-ID: <20031208112815.GB5099@dev4.hgmp.mrc.ac.uk>

* Peter Rice [031205 11:23]:
> Henrikki Almusa wrote:
> >Hello,
> >
> >Small question about report format. Since one can add tag in style
> >'type:value=text_in_file'. Is there some way to give spaces in
> >"text_in_file" or give sort of "%-xS" type of syntax to make sure at least
> >x sized area is used?
>
> Good idea!
>
> The code has widths for each column already.
>
> We need a syntax to give:
>
> 1. minimum column width
> 2. maximum column width (for example, for sequence data that can be very
> long)
>
> Perhaps type:value%n.n=columnheading
>
> More questions:
>
> Do we need more column types?

This may be a completely stupid suggestion, but, if you don't ask...

Could we have decimal tabs, please? You know: ones smart enough to
align themselves by the position of the floating point, even in the
absence of specifying the number of digits before and after it?
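
A minimal sketch of the decimal-tab idea, assuming the column values are already formatted as strings and that "." is the only decimal separator. It is Python for illustration only, not EMBOSS code; align_decimal is a hypothetical name.

```python
from typing import List


def align_decimal(values: List[str]) -> List[str]:
    """Pad each value so that the decimal points line up in one column.

    No precision has to be specified: the widths on either side of the
    point are taken from the widest value in the column.
    """
    # Split each value into the part before the point and the rest
    # (the point plus any fractional digits; empty for whole numbers).
    parts = []
    for value in values:
        whole, dot, frac = value.partition(".")
        parts.append((whole, dot + frac))

    left = max((len(whole) for whole, _ in parts), default=0)
    right = max((len(rest) for _, rest in parts), default=0)

    # Right-justify the integer part, left-justify the fractional part.
    return [whole.rjust(left) + rest.ljust(right) for whole, rest in parts]


if __name__ == "__main__":
    for line in align_decimal(["1.5", "12.25", "100"]):
        print(f"[{line}]")
```

Padding with spaces on both sides keeps every cell the same width, so the column can still sit inside a fixed-width report.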
all the best

Damian

--
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SB, UK
phone: +44 (0)1223 494585 fax: +44 (0)1223 494512
email: d.counsell at hgmp.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk/~dcounsel/

From pmr at ebi.ac.uk Mon Dec 8 11:45:59 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 08 Dec 2003 11:45:59 +0000
Subject: Report format
In-Reply-To: <20031208112815.GB5099@dev4.hgmp.mrc.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk> <20031208112815.GB5099@dev4.hgmp.mrc.ac.uk>
Message-ID: <3FD46477.1030602@ebi.ac.uk>

Damian Counsell wrote:
> This may be a completely stupid suggestion, but, if you don't ask...
>
> Could we have decimal tabs, please? You know: ones smart enough to
> align themselves by the position of the floating point, even in the
> absence of specifying the number of digits before and after it?

Hmmmm ...

What really happens to these data types is that the values are all strings
written by the calling program and stored as tag=value pairs in an internal
feature table.

On output, the (string) tag value is reported.

"str" tags are left justified
Anything else is right justified.

We could try, for floats, reading the value in and rewriting it in a fixed
precision. A small overhead but maybe worth it.

We could also try removing extra trailing zeroes in some cases.

Comments?

Peter

From henrikki.almusa at helsinki.fi Mon Dec 8 14:24:42 2003
From: henrikki.almusa at helsinki.fi (Henrikki Almusa)
Date: Mon, 8 Dec 2003 16:24:42 +0200
Subject: tfscan output conversion
In-Reply-To: <200312041626.32319.henrikki.almusa@helsinki.fi>
References: <200312041626.32319.henrikki.almusa@helsinki.fi>
Message-ID: <200312081624.42660.henrikki.almusa@helsinki.fi>

Hello,

This patch should actually print the info properly into tail. Still would like
to confirm that though.

But the problem still remains. So any info on why the while fails?
--
Henrikki Almusa
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tfscan_report.patch
Type: text/x-diff
Size: 6413 bytes
Desc: not available
URL:
-------------- next part --------------
ajFeatGetNote 'id'
try /note="*id MOUSE$A21COL_02 "
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'MOUSE$A21COL_02 '
reportWriteSeqTable subseq 6 seq 840 28..33
ajFeatGetNote 'acc'
try /note="*id HS$APOE_08 "
testing *name
try /note="*acc R00149"
testing *name
found [1] 'acc'
reportWriteSeqTable jwid 9 jmin 6 tagval 'R00149'
ajFeatGetNote 'id'
try /note="*id HS$APOE_08 "
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'HS$APOE_08 '
reportWriteSeqTable subseq 5 seq 840 374..378
ajFeatGetNote 'acc'
try /note="*id HS$ALBU_03 "
testing *name
try /note="*acc R00079"
testing *name
found [1] 'acc'
reportWriteSeqTable jwid 9 jmin 6 tagval 'R00079'
ajFeatGetNote 'id'
try /note="*id HS$ALBU_03 "
testing *name
found [1] 'id'
reportWriteSeqTable jwid 10 jmin 6 tagval 'HS$ALBU_03 '
ajFeattableDel 80723d8
ajSeqRead: input file 'mRNA.small.twice' still there, try again
ajFeattableDel 0
seqRead: cleared
seqRead: seqin format 10 'fasta'
seqRead: one format specified
ajFileBuffNobuff mRNA.small.twice buffsize: 15
++seqRead known format 10
++seqReadFmt format 10 (fasta) 'mRNA.small.twice' feat No
ajSeqParseNcbi '>Exon_10_head_2 (copied for testing two seqs) '
trying ajSeqParseFasta
ajSeqParseFasta '>Exon_10_head_2 (copied for testing two seqs) '
result id: 'Exon_10_head_2' acc: '' desc: '(copied for testing two seqs) '
parsed id 'Exon_10_head_2' acc '' sv '' gi '' desc '(copied for testing two seqs) '
seqSetName 'Exon_10_head_2' result: 'Exon_10_head_2'
at EOF: File already read to end mRNA.small.twice
End of file - data in buffer - return ajFalse
ajFileBuffClear (0) Nobuff: Yes first: 15 thys->Pos: 15 thys->Size: 15 thys->Nobuff: Yes
ajFileBuffClear 'mRNA.small.twice' (0 lines) Y size: 15 pos: 15 removed 15 lines add
to free: 0
seqReadFmt success with format 10 (fasta)
seqQueryMatch 'Exon_10_head_2' id '' acc '' Sv '' Des '' Key '' Org ''
No accession number to test
No taxonomy to test
No keyword to test
No description to test
testing sequence 'Exon_10_head_2' type 'DNA' IsNuc No IsProt No
ajSeqTypeCheckIn type 'dna' found (DNA sequence)
Remove all gaps
ajSeqIsNuc Type ''
seqTypeGapnucS test
Convert '?XUu' to 'NNTt'
ajSeqRead: open buffer usa: 'mRNA.small.twice' returns: Yes
++keep restored 0..0 (N) 'fasta' 10
ajSeqRead: thys->Db '', seqin->Db ''
ajSeqRead: thys->Name 'Exon_10_head_2'
ajSeqRead: thys->Entryname 'Exon_10_head_2', seqin->Entryname ''
ajSeqRead: thys->Name 'Exon_10_head_2'
ajSeqSetRange (len: 840 0..0 old 0..0) result: (len: 840 0..0)
ajSeqallNext success

From pmr at ebi.ac.uk Mon Dec 8 15:01:57 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 08 Dec 2003 15:01:57 +0000
Subject: ACD changes for 2.9.0
Message-ID: <3FD49265.3030205@ebi.ac.uk>

Just committed some new ACD validations (in acdvalid). Interface developers
will need to look for them in 2.9.0.

New section "additional" for qualifiers with additional:"Y" defined. I would
suggest treating this in the same way as "advanced" (for many programs it
needed only a rename of the advanced section).

New ACD type "toggle" - this is the same as "boolean" and will be used for
those boolean values that are only used to control other ACD qualifiers
(-plot for example). acdvalid will allow these toggles in other sections,
and will (but not yet) check for them in calculated values.

Boolean values will be expected to appear in the required, additional or
advanced sections (but can be in the input or output sections without
problem, as before).

Input and output datatypes now must appear in the input and output sections.
matrix, datafile and cfile datatypes have been relocated.

The application name in the ACD file must match the true application name.
This is only checked by acdvalid so far to avoid breaking third-party ACD files.

Output outfile, align, report, etc. have new attributes:

nullok - if true, can return a NULL value

nulldefault - if true, defaults to a NULL value. Setting a filename on the
command line overrides and creates an output file. Setting to "" on the
command line ***creates the expected default filename***

missing - if true, can create the expected filename by simply using
-qualname on the command line (rather than -qualname="") if it is last on the
command line or followed by another qualifier (if followed by a parameter,
that will appear to be the filename value)

I have started to change "string" datatypes to other datatypes if
appropriate (for example to directory or datafile). There will be more of these.

Peter

From d.counsell at hgmp.mrc.ac.uk Mon Dec 8 15:12:09 2003
From: d.counsell at hgmp.mrc.ac.uk (Damian Counsell)
Date: Mon, 8 Dec 2003 15:12:09 +0000
Subject: Report format
In-Reply-To: <3FD46477.1030602@ebi.ac.uk>
References: <200312051208.52153.henrikki.almusa@helsinki.fi> <3FD069FB.7010600@ebi.ac.uk> <20031208112815.GB5099@dev4.hgmp.mrc.ac.uk> <3FD46477.1030602@ebi.ac.uk>
Message-ID: <20031208151209.GD5099@dev4.hgmp.mrc.ac.uk>

Peter!

* Peter Rice [031208 11:49]:
> Damian Counsell wrote:
> >This may be a completely stupid suggestion, but, if you don't ask...
> >
> >Could we have decimal tabs, please? You know: ones smart enough to
> >align themselves by the position of the floating point, even in the
> >absence of specifying the number of digits before and after it?
>
> Hmmmm ...
>
> What really happens to these data types is that the values are all
> strings written by the calling program and stored as tag=value pairs in
> an internal feature table.
>
> On output, the (string) tag value is reported.
>
> "str" tags are left justified
> Anything else is right justified.
>
> We could try, for floats, reading the value in and rewriting it in a
> fixed precision.
> A small overhead but maybe worth it.

Thanks for the explanation. If no one else has any objections this sounds
fine to me.

> We could also try removing extra trailing zeroes in some cases.

I have no problem with trailing zeroes when there is consistent and
controllable length and precision in the output. Leading zeroes are another
matter; the right number of leading spaces would be the ideal solution for me
of course.

all the best

Damian

--
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SB, UK
phone: +44 (0)1223 494585 fax: +44 (0)1223 494512
email: d.counsell at hgmp.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk/~dcounsel/

From gwilliam at hgmp.mrc.ac.uk Fri Dec 12 13:59:18 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Fri, 12 Dec 2003 13:59:18 +0000
Subject: CpG programs
Message-ID: <3FD9C9B6.BEEF7DC8@hgmp.mrc.ac.uk>

EMBOSS has several programs for finding CpG islands:

cpgreport     Reports all CpG rich regions
newcpgseek    Reports CpG rich regions
newcpgreport  Report CpG rich areas

The documentation (originally supplied by the author) says that for all
practical purposes you should probably use newcpgreport.

There is probably a case for retiring some of these programs to the
'make check' section of the Makefile? (i.e. remove them from the standard
distribution unless explicitly compiled.)

Which of these, if any, do you use and why?

Regards,
Gary

--
Gary Williams
MRC Rosalind Franklin Centre for Genomics Research
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
Tel: +44 1223 494522 Fax: +44 1223 494512
E-mail: gwilliam at rfcgr.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk

From rls at ebi.ac.uk Mon Dec 22 10:19:43 2003
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Mon, 22 Dec 2003 10:19:43 -0000
Subject: CpG programs
In-Reply-To: <3FD9C9B6.BEEF7DC8@hgmp.mrc.ac.uk>
Message-ID:

Hi,

Sorry for the late reply.
I'm currently re-writing a small portion of the code to speed things up
together with a collaborator. As soon as this one is tried and tested we will
move to replace the old version of newcpgreport with this one. As soon as
that is done a name change from newcpgreport to cpgreport will be requested
and the old programs can be retired.

Thanks and Merry Xmas to all!!!!

R:)

> -----Original Message-----
> From: owner-emboss-dev at hgmp.mrc.ac.uk
> [mailto:owner-emboss-dev at hgmp.mrc.ac.uk]On Behalf Of Gary Williams, Tel
> 01223 494522
> Sent: 12 December 2003 13:59
> To: emboss-dev at embnet.org
> Subject: CpG programs
>
>
> EMBOSS has several programs for finding CpG islands:
>
> cpgreport Reports all CpG rich regions
> newcpgseek Reports CpG rich regions
> newcpgreport Report CpG rich areas
>
> The documentation (originally supplied by the author) says that for all
> practical purposes you should probably use newcpgreport.
>
> There is probably a case for retiring some of these programs to the
> 'make check' section of the Makefile? (i.e remove them from the standard
> distribution unless explicitly compiled.)
>
> Which of these, if any, do you use and why?
>
> Regards,
> Gary
>
> --
> Gary Williams
> MRC Rosalind Franklin Centre for Genomics Research
> Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, UK
> Tel: +44 1223 494522 Fax: +44 1223 494512
> E-mail: gwilliam at rfcgr.mrc.ac.uk Web: http://www.rfcgr.mrc.ac.uk
>