From d.m.a.martin at dundee.ac.uk Tue Apr 1 08:34:16 2003
From: d.m.a.martin at dundee.ac.uk (David Martin)
Date: Tue, 01 Apr 2003 14:34:16 +0100
Subject: Staden Package
In-Reply-To: <200304011318.OAA27250@arran.mrc-lmb.cam.ac.uk>
Message-ID:

Many on this list will be users of the Staden Package. This has recently had
close ties with the EMBOSS project.

It is (IMHO) extremely shortsighted of the MRC to take this approach and I
would urge all those of you who have benefitted from this package to at
least send an email to support Rodger Staden's position and argue for
continued support for the package.

..d

--
David Martin PhD
Bioinformatics Scientific Officer
Post-Genomics and Molecular Interactions Centre
University of Dundee

------ Forwarded Message
From: Staden Package Administrator
Date: Tue, 1 Apr 2003 14:18:04 +0100 (BST)
To: d.m.a.martin at dundee.ac.uk
Subject: Devastating news

Hello,

You are on the list of people with licences for our software. I am
sorry to have to inform you that the Staden Package is no longer
available or supported as the MRC has decided to withdraw our funding.

Last year I submitted, to MRC, a three year grant proposal of 443k,
almost all of which was to pay salaries. The application gained
extremely positive referee reports and the MRC Molecular and Cellular
Medicine Board awarded it their highest banding for both past and
proposed work. Despite this, and with knowledge of the large number of
groups who are going to be badly affected, the MRC has decided not to
fund us. The current funding finishes at the end of April - just a few
weeks - and James Bonfield, Kathryn Beal, Mark Jordan and Yaping Cheng
will lose their jobs and receive no redundancy pay.

Before we saw the favourable referees' reports I asked MRC if, in the
event of our funding being cut, the package could be made Open Source
so that we and others could continue to develop and support it even if
no longer working for MRC. This seemed to us the best way of
safeguarding users and the careers of the group. MRC refused, saying
the package had potential commercial value.

The official who phoned to tell us the funding decision said he had
"devastating news". For us this is certainly true. I have been working
on various versions of the package for over 25 years, James for 11 and
Kathryn for 7. It is also very frustrating as we had so much work
nearly ready for release.

If this decision is going to affect the work of you and your
colleagues, or if you have any other comments or suggestions please
reply to this email and, if you think it might help, send a copy to
the MRC Chief Executive George Rada george.rada at headoffice.mrc.ac.uk

Rodger Staden

--
Dr Rodger Staden,                      rs at mrc-lmb.cam.ac.uk
MRC Laboratory of Molecular Biology,   http://www.mrc-lmb.cam.ac.uk/pubseq/
Hills Road,                            Tel (01223) 402389 or +44 1223 402389
Cambridge, CB2 2QH, UK.                Fax (01223) 213556 or +44 1223 213556

------ End of Forwarded Message

From squiresb at macrogenics.com Tue Apr 1 22:56:17 2003
From: squiresb at macrogenics.com (Burke Squires)
Date: Tue, 01 Apr 2003 21:56:17 -0600
Subject: Extractfeat options
Message-ID:

Hello All,

I am having a bit of trouble extracting just genes from a GenBank file. I
have tried the obvious options to no avail. I want to get JUST the gene
information but I always get gene and CDS as below. How do I do that?

Additionally, can I get the gene name instead of the stuff below?

Thanks!

Burke

>NC_001806_513_1259 [gene] Human herpesvirus 1, complete genome.
atggcccgccgccgccgccatcgcggcccccgccgcccccggccgcccgggcccacgggc
gccgtcccaaccgcacagtcccaggtaacctccacgcccaactcggaacccgcggtcagg
agcgcgcccgcggccgccccgccgccgccccccgccggtgggcccccgccttcttgttcg
ctgctgctgcgccagtggctccacgttcccgagtccgcgtccgacgacgacgatgacgac
gactggccggacagccccccgcccgagccggcgccagaggcccggcccaccgccgccgcc
ccccggccccggcccccaccgcccggcgtgggcccggggggcggggctgacccctcccac
cccccctcgcgccccttccgccttccgccgcgcctcgccctccgcctgcgcgtcaccgcg
gagcacctggcgcgcctgcgcctgcgacgcgcgggcggggagggggcgccggagcccccc
gcgacccccgcgacccccgcgacccccgcgacccccgcgacccccgcgcgggtgcgcttc
tcgccccacgtccgggtgcgccacctggtggtctgggcctcggccgcccgcctggcgcgc
cgcggctcgtgggcccgcgagcgggccgaccgggctcggttccggcgccgggtggcggag
gccgaggcggtcatcgggccgtgcctggggcccgaggcccgtgcccgggccctggcccgc
ggagccggcccggcgaactcggtctaa
>NC_001806_513_1259 [CDS] Human herpesvirus 1, complete genome.
atggcccgccgccgccgccatcgcggcccccgccgcccccggccgcccgggcccacgggc
gccgtcccaaccgcacagtcccaggtaacctccacgcccaactcggaacccgcggtcagg
agcgcgcccgcggccgccccgccgccgccccccgccggtgggcccccgccttcttgttcg
ctgctgctgcgccagtggctccacgttcccgagtccgcgtccgacgacgacgatgacgac
gactggccggacagccccccgcccgagccggcgccagaggcccggcccaccgccgccgcc
ccccggccccggcccccaccgcccggcgtgggcccggggggcggggctgacccctcccac
cccccctcgcgccccttccgccttccgccgcgcctcgccctccgcctgcgcgtcaccgcg
gagcacctggcgcgcctgcgcctgcgacgcgcgggcggggagggggcgccggagcccccc
gcgacccccgcgacccccgcgacccccgcgacccccgcgacccccgcgcgggtgcgcttc
tcgccccacgtccgggtgcgccacctggtggtctgggcctcggccgcccgcctggcgcgc
cgcggctcgtgggcccgcgagcgggccgaccgggctcggttccggcgccgggtggcggag
gccgaggcggtcatcgggccgtgcctggggcccgaggcccgtgcccgggccctggcccgc
ggagccggcccggcgaactcggtctaa
>NC_001806_2261_2317 [gene] Human herpesvirus 1, complete genome.
atggagccccgccccggagcgagtacccgccggcctgagggccgcccccagcgcgag
>NC_001806_3083_3749 [gene] Human herpesvirus 1, complete genome.
cccgccccggatgtctgggtgtttccctgcgaccgagacctgccggacagcagcgactct
gaggcggagaccgaagtgggggggcggggggacgccgaccaccatgacgacgactccgcc
tccgaggcggacagcacggacacggaactgttcgagacggggctgctggggccgcagggc
gtggatgggggggcggtctcgggggggagccccccccgcgaggaagaccccggcagttgc
gggggcgccccccctcgagaggacggggggagcgacgagggcgacgtgtgcgccgtgtgc
acggatgagatcgcgccccacctgcgctgcgacaccttcccgtgcatgcaccgcttctgc
atcccgtgcatgaaaacctggatgcaattgcgcaacacctgcccgctgtgcaacgccaag
ctggtgtacctgatagtgggcgtgacgcccagcgggtcgttcagcaccatcccgatcgtg
aacgacccccagacccgcatggaggccgaggaggccgtcagggcgggcacggccgtggac
tttatctggacgggcaatcagcggttcgccccgcggtacctgaccctgggggggcacacg
gtgagggccctgtcgcccacccacccggagcccaccacggacgaggatgacgacgacctg
gacgacg

--
Burke Squires
Bioinformatics
MacroGenics, Inc.
Dallas, TX

From oddmund.nordgard at biokjemi.uio.no Wed Apr 2 00:23:37 2003
From: oddmund.nordgard at biokjemi.uio.no (Oddmund Nordgård)
Date: Wed, 2 Apr 2003 07:23:37 +0200 (MET DST)
Subject: Staden Package
In-Reply-To:
Message-ID:

Sending an email to george.rada at headoffice.mrc.ac.uk resulted in "Delivery
failure". Perhaps the address does not exist anymore?

Oddmund Nordgård

******************************************************************
Oddmund Nordgård

Address at work:                          Address at home:
Department of Haematology and Oncology    Opalv. 28
Rogaland Central Hospital                 4318 SANDNES
P.O. Box 8100                             Tlf.: 51 67 25 65
4068 STAVANGER                            Mob.: 48 20 51 72
Phone: 51 51 89 26
Email: oddmundn at biokjemi.uio.no
*******************************************************************
Registered linux user #44149

From chenna at embl-heidelberg.de. Wed Apr 2 01:47:55 2003
From: chenna at embl-heidelberg.de. (Ramu Chenna)
Date: Wed, 2 Apr 2003 08:47:55 +0200 (CEST)
Subject: Staden Package
In-Reply-To:
Message-ID:

> Sending an email to george.rada at headoffice.mrc.ac.uk resulted in "Delivery
> failure". Perhaps the address does not exist anymore?
>
> Oddmund Nordgård

=====
the periodicity is once/year!

Ramu
---------------------------------------------------
It is not advisable to be innocent on this day!

From Marc.Logghe at devgen.com Wed Apr 2 03:27:23 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Wed, 2 Apr 2003 10:27:23 +0200
Subject: Extractfeat options
Message-ID:

Hi Burke,

> I am having a bit of trouble extracting just genes from a GenBank file. I
> have tried the obvious options to no avail. I want to get JUST the gene
> information but I always get gene and CDS as below. How do I do that?

You should set the -type arg to gene, like this:

extractfeat -filter -type gene test.gb | less

> Additionally, can I get the gene name instead of the stuff below?
Don't know how to do this with EMBOSS, I'd use BioPerl for that:

#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;

# Read the GenBank file named on the command line.
my $io = Bio::SeqIO->new(-format => 'genbank', -file => shift);
while (my $seq = $io->next_seq) {
    foreach my $feat ($seq->get_SeqFeatures('gene')) {
        next unless ($feat->primary_tag =~ /gene/i);
        # Skip gene features that carry no /gene qualifier.
        next unless $feat->has_tag('gene');
        print $feat->each_tag_value('gene'), "\n";
    }
}

HTH,
Marc

***********************************************************
Marc Logghe, Ph.D.
Senior Scientist
Scientific Computing Group
deVGen
Technologiepark 9
9052 Zwijnaarde
Belgium
tel: +32 (0) 9 324 24 88
fax: +32 (0) 9 324 24 25
***********************************************************

From gwilliam at hgmp.mrc.ac.uk Wed Apr 2 03:29:52 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 02 Apr 2003 09:29:52 +0100
Subject: Extractfeat options
References:
Message-ID: <3E8A9F7F.993DE906@hgmp.mrc.ac.uk>

It looks like you are doing:

extractfeat refseq:NC_001806 stdout -tag gene

This will pull out the features like:

     gene            513..1259
                     /gene="RL1"
or
     CDS             513..1259
                     /gene="RL1"

which include the tag name 'gene', e.g. /gene="RL1"

You should be using:

extractfeat refseq:NC_001806 stdout -type gene

which will only find the features like:

     gene            513..1259
                     /gene="RL1"

which have the type name 'gene'.

I'll add a report of specified tag values in the output description for
you soon, Burke.

Regards,
Gary

--
Gary Williams                        Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk  http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From yezq at mail.cbi.pku.edu.cn Thu Apr 3 03:52:33 2003
From: yezq at mail.cbi.pku.edu.cn (Zhiqiang Ye)
Date: Thu, 3 Apr 2003 16:52:33 +0800
Subject: translate tools
Message-ID: <200304030935.h339Z4Ex012234@mail.cbi.pku.edu.cn>

Dear all,

I have a large set of mRNAs, cDNAs and CDSs to translate, but each
sequence has a different reading frame. So I have to translate each one
in all 6 frames and see which is the best. I have so many sequences that
I cannot do this one by one. Is there any program in EMBOSS which can
translate a nucleic sequence in 6 frames and choose the best one as
output? Transeq doesn't seem to work with this.

Thanks in advance!

Best Regards!

Zhiqiang Ye
2003-04-03

###############################################################
Zhiqiang Ye, Ph. D candidate, Major in Bioinformatics
Center of BioInformatics, College of Life Sciences,
Peking University, Beijing, PR China 100871
Tel: +86 10 6275 6730
###############################################################

From rls at ebi.ac.uk Thu Apr 3 03:49:03 2003
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Thu, 3 Apr 2003 09:49:03 +0100
Subject: translate tools
In-Reply-To: <200304030935.h339Z4Ex012234@mail.cbi.pku.edu.cn>
Message-ID:

Hi,

sixpack springs to mind.

R:)

From rls at ebi.ac.uk Thu Apr 3 03:51:27 2003
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Thu, 3 Apr 2003 09:51:27 +0100
Subject: translate tools
In-Reply-To: <200304030935.h339Z4Ex012234@mail.cbi.pku.edu.cn>
Message-ID:

Ah! You might want to have a look at checktrans as well.
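
[Editorial sketch, not part of the original posts.] The question in this
thread is how to pick the "best" of six frame translations without doing
it by hand. One possible reading of "best" is the frame containing the
longest stop-free stretch. The sh/awk function below is a minimal sketch
of that post-filtering step only. It assumes the six translations already
exist as protein FASTA, that stops are written as '*', and that the frame
records of one sequence are named <id>_1 ... <id>_6; the function name
longest_frame and those naming details are assumptions for illustration,
so adjust the pattern if your output is named differently.

```shell
#!/bin/sh
# longest_frame: read protein FASTA on stdin and, for every input sequence,
# print the frame translation holding the longest stop-free stretch.
# Assumptions (not from the thread): stops are written as '*', and the six
# frame records of one sequence are named <id>_1 ... <id>_6.
longest_frame() {
    awk '
    # Score the record collected so far and remember the best per sequence.
    function flush(   n, i, parts, base) {
        if (name == "") return
        base = name
        sub(/_[1-6]$/, "", base)          # strip the frame suffix
        if (!(base in best_len)) {        # first frame seen for this id
            order[++count] = base
            best_len[base] = 0
            best_name[base] = name
            best_seq[base] = ""
        }
        n = split(seq, parts, /[*]/)      # cut the translation at stops
        for (i = 1; i <= n; i++) {
            if (length(parts[i]) > best_len[base]) {
                best_len[base] = length(parts[i])
                best_name[base] = name
                best_seq[base] = parts[i]
            }
        }
    }
    /^>/ { flush(); name = substr($1, 2); seq = ""; next }
         { seq = seq $0 }
    END  {
        flush()
        for (k = 1; k <= count; k++) {
            b = order[k]
            printf ">%s len=%d\n%s\n", best_name[b], best_len[b], best_seq[b]
        }
    }'
}
```

Usage would be along the lines of "longest_frame < six_frames.pep", where
six_frames.pep is a hypothetical file holding the six-frame output. Note
that a sequence whose own ID ends in _1 to _6 would be misgrouped by the
suffix stripping, which is the price of keeping the sketch short.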
R:)

From Marc.Logghe at devgen.com Thu Apr 3 04:31:43 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Thu, 3 Apr 2003 11:31:43 +0200
Subject: translate tools
Message-ID:

Hi Rodrigo,

> sixpack spring to mind.

I had the same problem a few days ago. I used getorf for that, with
minsize set to 3000. But this was done on only one sequence for which I
knew the size of the translation I needed.
You cannot do this if you don't know beforehand the value to set minsize
to. This also goes for sixpack (and checktrans afaik): you can only pass
a minsize argument. There should exist something like -only -largest.
So I think Zhiqiang Ye's problem persists.

Regards,
Marc

***********************************************************
Marc Logghe, Ph.D.
Senior Scientist
Scientific Computing Group
deVGen
Technologiepark 9
9052 Zwijnaarde
Belgium
tel: +32 (0) 9 324 24 88
fax: +32 (0) 9 324 24 25
***********************************************************

From rls at ebi.ac.uk Thu Apr 3 05:12:57 2003
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Thu, 3 Apr 2003 11:12:57 +0100
Subject: translate tools
In-Reply-To:
Message-ID:

Yes, I see the problem...Good to have some more specs to go by....

The closest I can get at helping is something like:

transeq emblcd:hscfo\* -auto -frame 6 | checktrans -filter -auto -orfml 200

but that would require some post-processing to find out which orf is the
longest one and then rerun transeq/sixpack to get it explicitly. A perl
or a sh/csh script could do that. However, the correct approach is like
you say: to have an option for '-onlylargest/-onlylongest'..

R:)

From aengus.stewart at cancer.org.uk Thu Apr 3 10:49:15 2003
From: aengus.stewart at cancer.org.uk (Aengus Stewart)
Date: Thu, 03 Apr 2003 15:49:15 +0000
Subject: Require 2 versions of emboss.default
References: <200303211450.h2LEomw28312@bromine.hgmp.mrc.ac.uk> <3E7B5127.DA2F05F8@cancer.org.uk>
Message-ID: <3E8C57FB.63B85F6A@cancer.org.uk>

Hi,

I maintain data libraries by having 2 copies of them: a "live" version
for the users and an "incoming" version where the indexing etc. takes
place on the raw files. After the indexing the copies are flipped by
changing the soft-links.

Therefore I would like to "run" 2 copies of emboss.default, however
looking through the documentation, I haven't really got a pointer as to
how to do this.

Does anyone have a strategy to do this or do other people not keep
parallel copies of the data?

Cheers
Aengus

--
--------------------------------------------------------------------------
Aengus Stewart                        aengus.stewart at DELcancerETE.org.uk
Computational Genome Analysis Laboratory   Tel: +44 (0)20 7269 3679
Cancer Research UK
Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK
--------------------------------------------------------------------------

From aengus.stewart at cancer.org.uk Thu Apr 3 11:13:58 2003
From: aengus.stewart at cancer.org.uk (Aengus Stewart)
Date: Thu, 03 Apr 2003 16:13:58 +0000
Subject: Require 2 versions of emboss.default
References: <200303211450.h2LEomw28312@bromine.hgmp.mrc.ac.uk> <3E7B5127.DA2F05F8@cancer.org.uk> <3E8C57FB.63B85F6A@cancer.org.uk> <3E8C4D05.2060708@imperial.ac.uk>
Message-ID: <3E8C5DC6.FD7C7AE5@cancer.org.uk>

Ooops as soon as I had posted I realised I was talking cobblers.
The indexing doesn't reference the emboss.default file at all so I have
completely dreamt up a non-existent problem :-)

Apologies............

Aengus

--
--------------------------------------------------------------------------
Aengus Stewart                        aengus.stewart at DELcancerETE.org.uk
Computational Genome Analysis Laboratory   Tel: +44 (0)20 7269 3679
Cancer Research UK
Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK
--------------------------------------------------------------------------

From kellert at ohsu.edu Thu Apr 3 16:43:21 2003
From: kellert at ohsu.edu (Thomas Keller)
Date: Thu, 3 Apr 2003 13:43:21 -0800
Subject: database help
Message-ID: <49DF0FF7-661D-11D7-9D0C-0003930405E2@ohsu.edu>

Greetings,

The sysadmin for the machine with GCG installed was kind enough to allow
me to mount the GCG databases maintained at my institution. This should
save me a lot of headaches, yet allow me the convenience of using them
with EMBOSS on my machine. But I am unsure how to make them available to
EMBOSS.

Is there some general documentation on database formats and the steps
required for use with EMBOSS? The database documentation doesn't cover
this particular case. I'm unsure, for example, whether I would put
"method: gcg" for all of these, or whether I would need to run formatdb
on the blast dbs. General stuff like that.

The directory structure looks like this:

gcgnrl3d
gcgswissprot
gcgsrs
gcgpir
gcgembl
gcggenpept
gcgblast
    est_human_00.nsq
    est_human_00.nsi
    est_human_00.nsd
    est_human_00.nin
    est_human_00.nhr
    ...
gcgpfam
gcggenbank
gcggbtags
gcgpfam.org
gcgsptrembl

From pmr at ebi.ac.uk Fri Apr 4 02:52:35 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 04 Apr 2003 08:52:35 +0100
Subject: Require 2 versions of emboss.default
References: <200303211450.h2LEomw28312@bromine.hgmp.mrc.ac.uk> <3E7B5127.DA2F05F8@cancer.org.uk> <3E8C57FB.63B85F6A@cancer.org.uk>
Message-ID: <3E8D39C3.3040609@ebi.ac.uk>

Aengus Stewart wrote:
> Therefore I would like to "run" 2 copies of emboss.default, however
> looking through the documentation, I havent really got a pointer as to
> how to do this.

Yes, I have seen your followup mail ... however ...

emboss.default can include other files. At present the include
statements do not resolve variable names but they should (and will).

Or ... you could swap soft links to 2 emboss.default files :-)

Hope this helps,

Peter

From sdowd at lbk.ars.usda.gov Fri Apr 4 17:36:28 2003
From: sdowd at lbk.ars.usda.gov (Dr. Scot E. Dowd)
Date: Fri, 4 Apr 2003 16:36:28 -0600
Subject: exception in thread
Message-ID: <000601c2fafa$a26a2af0$599385c7@Salmonella>

After installing emboss and jemboss using the server install, I seem to
get through everything OK, but when I run runJemboss.csh I get the
following:

Exception in thread "main" java.lang.NoClassDefFoundError: org/emboss/jemboss/Jemboss

Any ideas or help would be appreciated in not too technical language.

cheers
Scot

From Sean.Maceachern at nre.vic.gov.au Mon Apr 7 00:54:29 2003
From: Sean.Maceachern at nre.vic.gov.au (Sean.Maceachern at nre.vic.gov.au)
Date: Mon, 7 Apr 2003 14:54:29 +1000
Subject: Revseq
Message-ID:

Hello, I am trying to reverse and complement a few hundred FASTA
sequences and am having trouble getting the input files in the correct
format.
To date all I have as an example of an inputfile is a sequence entry that is in the example nucleic acid database 'tembl' format. If anyone could suggest the best way to reverse and complement a number of FASTA files or could show me an example of an input file it would be greatly appreciated. Thank you Sean MacEachern PhD Student Sean.Maceachern at nre.vic.gov.au From Marc.Logghe at devgen.com Mon Apr 7 03:21:07 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Mon, 7 Apr 2003 09:21:07 +0200 Subject: Revseq Message-ID: Something like should do the job seqret fasta::your_file -sreverse ? > -----Original Message----- > From: Sean.Maceachern at nre.vic.gov.au > [mailto:Sean.Maceachern at nre.vic.gov.au] > Sent: Monday, April 07, 2003 6:54 AM > To: emboss at embnet.org > Subject: Revseq > > > Hello, I am trying to reverse and compement a few hundred > FASTA sequences > and am having trouble getting the input files in the correct > format. To > date all I have as an example of an inputfile is a sequence > entry that is > in the example nucleic acid database 'tembl' format. > If anyone could suggest the best way to reverse and > complement a number of > FASTA files or could show me an example of an input file it would be > greatly appreciated. > Thank you > > Sean MacEachern > PhD Student > Sean.Maceachern at nre.vic.gov.au > > From thomas-c at esbs.u-strasbg.fr Mon Apr 7 04:37:56 2003 From: thomas-c at esbs.u-strasbg.fr (Morgane THOMAS-CHOLLIER) Date: Mon, 7 Apr 2003 10:37:56 +0200 (CEST) Subject: install Phylip Macos X Message-ID: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr> I already have EMBOSS installed on my machine and I 'm trying to install PHYLIP on Mac OS X 10.2 but it cannot compile fine. I've dowload the latest .tar.gz and unpacked it. I get the following error when trying the ./configure command : [PHYLIP-3.573c] root# ./configure --prefix=/usr/local/share/EMBOSS configure: error: cannot find install-sh or install.sh in . ./.. 
./../.. Has anyone already had that problem ? Thanks for your help. -- Morgane THOMAS-CHOLLIER DESS Bioinformatique et génomique ESBS - ULP Strasbourg From gwilliam at hgmp.mrc.ac.uk Mon Apr 7 06:04:42 2003 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Mon, 07 Apr 2003 11:04:42 +0100 Subject: Revseq References: Message-ID: <3E914D3A.373C7674@hgmp.mrc.ac.uk> The example of a standard FASTA input file is in: http://www.emboss.org/Themes/SequenceExamples/fasta The full set of input sequence formats is described in: http://www.emboss.org/Themes/SequenceFormats.html#in To reverse and complement a set of sequences that are in separate FASTA files, for example '*.seq': revseq '*.seq' result.seq This writes the results to a single file holding many sequences. Note that you should put the *.seq in quote marks if you specify it on the command line to stop the shell trying to expand the '*' for you. I prefer to output a set of sequences like this to a single file, because the resulting file name is then known and can be handled easily by scripts, but you may need to run non-EMBOSS programs on the results which might not be able to read in a file containing many FASTA format sequences - they may require one sequence per file. If you prefer to deal with many resulting sequence files, use the qualifier '-ossingle' which will force the output sequence to be written to individual files, each named using the ID name of the input sequence: revseq '*.seq' result.seq -ossingle You can specify other parts of the output file name using: -osextension reversed to specify the extension name as 'reversed' -osdirectory out to specify the output directory Run 'revseq -help -verbose' for further information. Regards, Gary Sean.Maceachern at nre.vic.gov.au wrote: > > Hello, I am trying to reverse and compement a few hundred FASTA sequences > and am having trouble getting the input files in the correct format.
To > date all I have as an example of an inputfile is a sequence entry that is > in the example nucleic acid database 'tembl' format. > If anyone could suggest the best way to reverse and complement a number of > FASTA files or could show me an example of an input file it would be > greatly appreciated. > Thank you > > Sean MacEachern > PhD Student > Sean.Maceachern at nre.vic.gov.au -- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/ Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK From can at ucsd.edu Mon Apr 7 15:15:44 2003 From: can at ucsd.edu (Can Tran) Date: Mon, 7 Apr 2003 12:15:44 -0700 Subject: install Phylip Macos X In-Reply-To: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr> References: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr> Message-ID: Have you tried the OSX packages at http://bioteam.net/MacOSX/index.html Can At 10:37 AM +0200 4/7/03, Morgane THOMAS-CHOLLIER wrote: >I already have EMBOSS installed on my machine and I 'm trying to install >PHYLIP on Mac OS X 10.2 but it cannot compile fine. > >I've dowload the latest .tar.gz and unpacked it. > >I get the following error when trying the ./configure command : > >[PHYLIP-3.573c] root# ./configure --prefix=/usr/local/share/EMBOSS >configure: error: cannot find install-sh or install.sh in . ./.. ./../.. > > >Does anyone have already had that problem ? > >Thanks for your help. 
>-- >Morgane THOMAS-CHOLLIER >DESS Bioinformatique et génomique >ESBS - ULP Strasbourg -- From aengus.stewart at cancer.org.uk Tue Apr 8 10:41:14 2003 From: aengus.stewart at cancer.org.uk (Aengus Stewart) Date: Tue, 08 Apr 2003 15:41:14 +0100 Subject: Synomyns and Datalib aggregation References: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr> Message-ID: <3E92DF8A.EF19B6B5@cancer.org.uk> Hi, Just had a couple of thoughts Synonyms: It would be nice to have a synonyms or alias tag that would take a list of alternative names for datalibs - at the moment I don't believe this is possible, so for "embl" and "em" you have to duplicate the entire datalib definition? Aggregation: When you want a SWALL that is SWISSPROT + SWISSNEW I don't see an easy way to do this. I hold both of these in separate directories with separate definitions, could the directory tag possibly take a list as a value? I imagine there may be other considerations as well. BTW small typo in databases.html - in the Attributes section the key is given as "filename:" shouldn't this be "file:" ? Cheers Aengus -- -------------------------------------------------------------------------- Aengus Stewart aengus.stewart at DELcancerETE.org.uk Computational Genome Analysis Laboratory Tel: +44 (0)20 7269 3679 Cancer Research UK Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK -------------------------------------------------------------------------- From gwilliam at hgmp.mrc.ac.uk Tue Apr 8 10:48:45 2003 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Tue, 08 Apr 2003 15:48:45 +0100 Subject: Synomyns and Datalib aggregation References: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr> <3E92DF8A.EF19B6B5@cancer.org.uk> Message-ID: <3E92E14D.F349EEAE@hgmp.mrc.ac.uk> Aengus Stewart wrote: > BTW small typo in databases.html - in the Attributes section the key is > given as "filename:" shouldnt this be "file:" ?
I believe that "file:" is an allowed abbreviation for "filename:". This is stated in the documentation for "filename:" -- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/ Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK From aengus.stewart at cancer.org.uk Tue Apr 8 10:51:11 2003 From: aengus.stewart at cancer.org.uk (Aengus Stewart) Date: Tue, 08 Apr 2003 15:51:11 +0100 Subject: Synomyns and Datalib aggregation References: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr> <3E92DF8A.EF19B6B5@cancer.org.uk> <3E92E14D.F349EEAE@hgmp.mrc.ac.uk> Message-ID: <3E92E1DF.CF860E35@cancer.org.uk> "Gary Williams, Tel 01223 494522" wrote: > > Aengus Stewart wrote: > > BTW small typo in databases.html - in the Attributes section the key is > > given as "filename:" shouldnt this be "file:" ? > > I believe that "file:" is an allowed abbreviation for "filename:". > This is stated in the documentation for "filename:" > Ooops yes indeed it helps if I read all the documentation including the last line :-) Sorry Gary. Aengus -- -------------------------------------------------------------------------- Aengus Stewart aengus.stewart at DELcancerETE.org.uk Computational Genome Analysis Laboratory Tel: +44 (0)20 7269 3679 Cancer Research UK Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK --------------------------------------------------------------------------
From asonyadi at cp.co.id Thu Apr 10 05:36:25 2003 From: asonyadi at cp.co.id (Sony Adi Susanto) Date: Thu, 10 Apr 2003 16:36:25 +0700 Subject: i want to join Message-ID: <3E953B19.1000908@cp.co.id> Dear friends and EMBOSS mailing list moderator, I want to join this mailing list. I am a research scientist from Indonesia. -sony adi susanto- From gwilliam at hgmp.mrc.ac.uk Fri Apr 11 05:14:43 2003 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Fri, 11 Apr 2003 10:14:43 +0100 Subject: [Fwd: EMBOSS:einverted and "nested" inverted repeats] Message-ID: <3E968783.D7AD0178@hgmp.mrc.ac.uk> Linda Cardle (lcardl at scri.sari.ac.uk) wrote: > > I wasn't sure who to contact about this query, but here goes: > > I'm trying to find MITES within sequences using einverted. My main question > is: > > How does einverted cope with "nested" inverted repeats? > > My reason for asking is that to simulate a sequence with both an inverted > repeat and an SSR inverted repeat I added an SSR inverted repeat inside a > known MITE. Once I'd done that it seemed that einverted could spot the SSR > but not the MITE surrounding it. > > I was hoping you'd be able to clarify how einverted would cope with this > situation, or point me to someone who could.
> > Thanks for your time, > Linda > -- > Dr Linda Cardle > Computational Biology > Scottish Crop Research Institute > Dundee, DD2 5DA -- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/ Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK From thomas-c at esbs.u-strasbg.fr Mon Apr 14 04:58:21 2003 From: thomas-c at esbs.u-strasbg.fr (Morgane THOMAS-CHOLLIER) Date: Mon, 14 Apr 2003 10:58:21 +0200 (CEST) Subject: Jalview and Jemboss Message-ID: <50670.130.79.135.12.1050310701.squirrel@esbsmail.u-strasbg.fr> Hello, I use jemboss as a standalone server on MacOS X and also need to use Jalview. The problem is that when I load my multiple alignment as a fasta file, there are no colors on the display. I tried to change the colors but it stays gray. Also, it takes a long time to display the alignment and to take into account any change made to it. I have more than 60 sequences in that alignment, could it be that I run out of memory ? Does anyone have an idea how to display the colors ? Or another way to install and run Jalview on MacOS X ? Thanks a lot for your help. Morgane THOMAS-CHOLLIER -- Morgane THOMAS-CHOLLIER DESS Bioinformatique et génomique ESBS - ULP Strasbourg From gbottu at ben.vub.ac.be Mon Apr 14 13:07:45 2003 From: gbottu at ben.vub.ac.be (Guy Bottu) Date: Mon, 14 Apr 2003 19:07:45 +0200 (CEST) Subject: Preferred isoschizomer ? Message-ID: <200304141707.h3EH7jV31381945@black.vub.ac.be> from : BEN Dear colleagues, A user of BEN complained about a serious problem with restrict/remap. He could not find the site for PstI in a sequence, where the "wet work" showed the enzyme did cut. He lost a lot of time because he thought he had made an error, while the reason was that the program reported the isoschizomer BspMAI.
Now, in the Rebase we see : <1>BspMAI <2>PstI,AinI,AjoI,Ali2882I,AliAJI,ApiI,Asp36I,Asp708I,Asp713I,AspTI,BbiI,Bce170I,BloHII,BloHIII,BmeBI,BsaNII,BsaQI,BscDI,Bsp17I,Bsp43I,Bsp63I,Bsp78I,Bsp81I,Bsp93I,Bsp107I,Bsp1..... So, PstI is clearly identified as the prototype enzyme. Yet, when restrict is requested to report only the "preferred" isoschizomer, it does not report PstI, nor even the first in the file (AinI) or the last (YenEI). Does someone understand the cause of the erratic behaviour ? And did no one else suffer from this "feature" ? Regards, Guy Bottu From ableasby at hgmp.mrc.ac.uk Mon Apr 14 14:17:54 2003 From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk) Date: Mon, 14 Apr 2003 19:17:54 +0100 (BST) Subject: Preferred isoschizomer ? Message-ID: <200304141817.h3EIHs410930@bromine.hgmp.mrc.ac.uk> 1) If your colleague had explicitly said -enzymes psti on the command line (or equivalent GUI) then it would be found. The output would be overly verbose if all isoschizomers are reported so as a compromise it reports only one. 2) If you take the emboss files from the REBASE (NEB) distro then, after renaming and putting them in data/REBASE, it will probably report PstI (haven't tried it). I arranged with NEB that they would provide only the 'common' REs in their files. I believe this is what some other packages do. Using REBASEEXTRACT on the withrefm file gives all the REs. 3) You can equate any reported RE to another by adding an entry into embossre.equ e.g. BspMAI PstI HTH Alan From can at ucsd.edu Mon Apr 14 17:44:33 2003 From: can at ucsd.edu (can at ucsd.edu) Date: Mon, 14 Apr 2003 21:44:33 GMT Subject: Jalview and Jemboss Message-ID: <200304142144.h3ELiXM4026935@smtp.ucsd.edu> Hi, What version of Jalview are you using? I use the Jalview applet to display MSAs on the web. I grabbed a version of Jalview from EBI that works well on OSX via Safari and Netscape. I haven't gotten it to work in IE though.
Best wishes, Can ___________________________________ Can Tran can at ucsd.edu University of California, San Diego Division of Biology Muir Biology 4143 http://tcdb.ucsd.edu > Hello, > > I use jemboss as a standalone server on MacOS X and also need to use Jalview. > > The problem is that when I load my multiple alignement as a fasta file, > there are no colors on the display. I tried to change the colors but it > stays gray. > Also, it takes a long time to display the alignement and to consider any > change made on it. > > I have more than 60 sequences ont that alignement, could it be that I go > out of memory ? > Does anyone as a idea to display the colors ? > Or another way to install and run Jalview on MacOS X ? > > Thanks a lot for your help. > > Morgane THOMAS-CHOLLIER > -- > Morgane THOMAS-CHOLLIER > DESS Bioinformatique et génomique > ESBS - ULP Strasbourg > > > From Sean.Maceachern at nre.vic.gov.au Mon Apr 14 18:59:25 2003 From: Sean.Maceachern at nre.vic.gov.au (Sean.Maceachern at nre.vic.gov.au) Date: Tue, 15 Apr 2003 08:59:25 +1000 Subject: Detecting selection Message-ID: Hello, I have just commenced a PhD and am interested in detecting selection in a large subset of genes between two species. Owing to the large number of samples that I am analysing I need to find a program that I can automate through the command line. To date I have not been able to find a program in EMBOSS that will calculate ratios between synonymous and nonsynonymous mutations in a manner analogous to GCG's Diverge.
Does anyone know of a program in EMBOSS that will calculate these ratios, or can anyone suggest a reliable program that would be easy to automate via the command line that I can source from the web? Thank you Sean MacEachern PhD Student Sean.Maceachern at nre.vic.gov.au From Joerg.Schaber at uv.es Tue Apr 15 04:22:25 2003 From: Joerg.Schaber at uv.es (Joerg Schaber) Date: Tue, 15 Apr 2003 10:22:25 +0200 Subject: Detecting selection References: Message-ID: <3E9BC141.1090906@uv.es> You might want to use PAML. http://abacus.gene.ucl.ac.uk/software/paml.html Sean.Maceachern at nre.vic.gov.au wrote: >Hello, I have just commenced a PhD and am interested in detecting selection >in a large subset of genes between two species. Owing to the large number >of samples that I am analysing I need to find a program that I can automate >through command line. To date I have not been able to find a program on >EMBOSS that will detect ratios between synonymous and nonsynonymous >mutations in a method analagous to GCG's Diverge. Does anyone know of a >program on EMBOSS that will calculate these ratios or can anyone suggest a >reliable program that would be easy to automate via command line that i can >source from the web >Thank you > >Sean MacEachern >PhD Student >Sean.Maceachern at nre.vic.gov.au > > > > -- ---------------------------------------------------------- Jörg Schaber Instituto Cavanilles de Biodiversidad y Biologia Evolutiva Universidad de Valencia Tel.: ++34 96 354 3666 A.C. 22085 Fax.: ++34 96 354 3670 46071 Valencia, España email : jos at uv.es From siegmund at develogen.com Tue Apr 15 11:15:44 2003 From: siegmund at develogen.com (Thomas Siegmund) Date: Tue, 15 Apr 2003 17:15:44 +0200 Subject: New release of Kaptain X11 GUI for EMBOSS Message-ID: <200304151715.44096.siegmund@develogen.com> Hi everybody, I'd like to announce that a new version of the EMBOSS GUI for Linux/Unix systems running QT/KDE is available.
Thanks to Terék Zsolt (who fixed some bugs and released Kaptain 0.71 a few days ago) we can use QT3.x now. The widgets look quite nice in a KDE3 environment and use less space on the screen than the QT2 version. Support for EMBOSS 2.6 is complete. You can get the grammars for EMBOSS and Phylip as usual at http://userpage.fu-berlin.de/~sgmd/download.html . Kaptain 0.71 is available from http://kaptain.sourceforge.net/ . From the changelog: Version 0.89 - fixes for quite a few grammars for kaptain0.71. This version understands rules like 'something -> @ | "-something"' a little bit differently. The grammars should work with kaptain 0.6 and kaptain 0.71 now. Please report any problems, especially with the older version. I will use kaptain 0.71 from now on. Version 0.88 - new: skipseq.kaptn Version 0.87 - one more fix in embosslauncher - make showalign.kaptn, emma.kaptn, est2genome.kaptn, remap.kaptn, showseq.kaptn, efitch.kaptn work with kaptain 0.7 - fix empty lines in embossdata.kaptn - fixed -sreverse option in needle.kaptn and water.kaptn with kaptain 0.7 Version 0.86 - small fix to embosslauncher Version 0.85 - finished support for EMBOSS 2.6 - new: pestfind.kaptn, sirna.kaptn, twofeat.kaptn - moved grammar files for "Protein 3D" applications to separate directory "Domainatrix". For the moment they won't get installed automatically. If you need them, please copy them manually to the appropriate locations. If you update, it might be a good idea to remove the $KDEDIRS/share/applnk/EMBOSS directory before running the install script to get rid of stale .desktop files - "other" option for msbar.kaptn - "featinname" feature for extractfeat.kaptn - small fix for cpgplot.kaptn Regards Thomas -- Thomas Siegmund, Ph.D.
DeveloGen AG Bioinformatics and Data Management Phone: +49(551) 505 58 651 From kclancy at informaxinc.com Tue Apr 15 13:24:09 2003 From: kclancy at informaxinc.com (Kevin Clancy) Date: Tue, 15 Apr 2003 11:24:09 -0600 Subject: compiling emboss on windows Message-ID: <001b01c30373$d3781310$120610ac@informaxinc.net> Dear Sirs Has anyone tackled building EMBOSS on Windows? Are there any guidelines for doing this? I was hoping to be able to run this outside of Cygwin or any other type of unix emulator. If it hasn't been done, would it be possible to let me know what types of problems I would run into? Thanks for any information. kevin Kevin Clancy, PhD Senior Bioinformatic Scientist InforMax, Inc., 433 Park Point Drive, Suite 275, Golden, CO 80401 Direct phone line: (720) 746 3707 Cell Phone: (240) 417 8604 Direct email: kclancy at informaxinc.com From gbottu at black.vub.ac.be Tue Apr 15 14:54:31 2003 From: gbottu at black.vub.ac.be (Guy Bottu) Date: Tue, 15 Apr 2003 20:54:31 +0200 Subject: Preferred isischizomer (bis) Message-ID: <20030415205431.A1419113@black.vub.ac.be> from : BEN Dear colleagues, Allow me to insist. The point is that when you ask for only one representative isoschizomer the programs should report PstI, because PstI is mentioned in Rebase as the "prototype". The GCG programs map, mapplot and mapsort did do this. The file in GCG format contains : ;BstMAI 5 C_TGCA'G -4 PstI 5 C_TGCA'G -4 While withrefm contains the same information in another way. So, IMHO, rebaseextract and/or restrict+remap are not doing their job properly.
Sincerely, Guy Bottu From leungyukfai at hotmail.com Wed Apr 16 21:35:26 2003 From: leungyukfai at hotmail.com (YUK FAI LEUNG) Date: Wed, 16 Apr 2003 21:35:26 -0400 Subject: A simple installation problem Message-ID: Hi there, I have encountered the same installation problem that many others encountered before. I tried a few methods from the web and the emboss discussion group like adding the flag -L/X11R6/lib to the Makefile but they all didn't work. I tried the installation in both the Redhat 8 & 9 in my laptop but the result is the same. Could anyone tell me how to get through this problem? Thanks! fai ---Here is the error message---- /bin/sh ../libtool --mode=link gcc -g -O2 -o aaindexextract aaindexextract.o ../nucleus/libnucleus.la ../ajax/libajaxg.la ../ajax/libajax.la ../plplot/libplplot.la -lX11 -lm gcc -g -O2 -o .libs/aaindexextract aaindexextract.o ../nucleus/.libs/libnucleus.so ../ajax/.libs/libajaxg.so ../ajax/.libs/libajax.so ../plplot/.libs/libplplot.so -lX11 -lm -Wl,--rpath -Wl,/usr/local/lib /usr/bin/ld: cannot find -lX11 collect2: ld returned 1 exit status From leungyukfai at hotmail.com Thu Apr 17 11:11:02 2003 From: leungyukfai at hotmail.com (YUK FAI LEUNG) Date: Thu, 17 Apr 2003 11:11:02 -0400 Subject: A simple installation problem Message-ID: An HTML attachment was scrubbed... URL: http://lists.open-bio.org/pipermail/emboss/attachments/20030417/774e2fcd/attachment.html From sdowd at lbk.ars.usda.gov Sat Apr 19 10:09:03 2003 From: sdowd at lbk.ars.usda.gov (Dr. Scot E. Dowd) Date: Sat, 19 Apr 2003 09:09:03 -0500 Subject: Kaptain is Koolneal! In-Reply-To: <20030415205431.A1419113@black.vub.ac.be> Message-ID: <002201c3067d$50585230$599385c7@Salmonella> Hi all, Thought to express appreciation I love kaptns grammars. Installed easy!~ BINGO right away.
Ran a few of the programs and they appear to function properly. Thanks a bunch! Dr. Scot E. Dowd Ph.D. Research Microbiologist USDA-ARS Livestock Issues Research Unit RT3 Box 215 FM 1294 Lubbock, TX 79403 806-746-5356 ext 241 mobile 806-832-0659 fax 806-744-4402 From maximtel at ibpm.pushchino.ru Mon Apr 21 03:40:31 2003 From: maximtel at ibpm.pushchino.ru (Maxim Telegin) Date: 21 Apr 2003 11:40:31 +0400 Subject: GUI Message-ID: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> Dear EMBOSS users! Our group means to write some integrated GUI frontend using the EMBOSS package. It is planed as all-in-one program, such as Vector NTI under Windows platform plus such enhancements as advanced functions in cloning-strategy planning and other. As base we are planning to use the Gtk library (because we have some experience). Is such program will be useful? We know that different GUI frontends (Jemboss, Kaptain etc) exists, but we want to realize something more powerfull. Regards Maxim A. Telegin IBPM Russian Academy of Sciences From cquijano at iib.uam.es Mon Apr 21 06:54:29 2003 From: cquijano at iib.uam.es (Carlos Quijano) Date: Mon, 21 Apr 2003 12:54:29 +0200 Subject: GUI In-Reply-To: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> References: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> Message-ID: <3EA3CDE5.3020902@iib.uam.es> Hello, Maxim. I feel GUI exceedingly usefull, and a major need if you need to support users. The need extends no more. It's difficult to reach Jemboss' or EmbosGUI's easiness. W2H is powerfull but barely intuitive, and at the end, cofiguring it is too much stress for nothing. The perl code is dusky... EmbossGUI's perl code is easy to understand. Even if you have a professional service (no users, all work for you) it's good to arrange of GUI's, but the most powerfull working method is, like always, the command line. Serious researchers have to use the Shell, they have to develop as much code as their research needs. 
It's the only way to do it well. If they don't, we do. We bioinformaticians are more useful providing clusters, parallelism in their code, or developing new biological data and related algorithms, I think. * My effort is to comprehend my users' needs, and give them the best solution. I think giving them the "easiest" solution is not the right direction for the research community, for science. We need the max. accuracy, and it's not the same concept. So, if you want to develop a new GUI, think about how you are going to overcome the other ones. For doing the same that others already do, don't waste your time. Try to innovate! (cloning focus.... good idea) Some ideas about what I think are other GUIs' weaknesses: 1- There is no GUI with all the options for all the programs (and you have to do it without turning the GUI dusky). 2- There is no GUI focused on output useful for publishing - papers, if you follow me - (great weak point). 3- There is no really windows-based GUI without using Java or web-browsing (I love GNU and Linux and Sun, so forget this unlucky advice, for more detail, read line * ;-) 4- ¿Have somebody dreamed about pipelines between emboss apps?. 5- It could be great to have an expert system. For example, send a sequence and receive all information possible (very useful, a lot of people are lost with the bioinformatics protocols, with this utility they shall see how it is all done). A cloning expert system? ;-) 6- It could be interesting to enhance the EDIT - VIEW interface of emboss (and their GUIs do little about it, only presenting the output... ). I have installed Jemboss, W2H and EmbossGUI, all of them very useful. My advice is that if you are going to spend your time on bioinformatics, it's best to improve the GNU software present by now and only start from the beginning if you are going to do something really new (for this reason I give you such ideas).
Reading your mail, it seems you are looking for something new and more powerful; you say "all-in-one". Ok, try with points 2,5 and 6. And if you think to use it in cloning, then try to make cirdna and lindna apps, for example, more usable for the typical researcher (avoiding the code-like data input file). This is the deficiency I found in EDIT-VIEW (6). You can do a lot for the emboss project developing a new graphical output interface, for playing with the graphics (now we only have pretty but boring and static png, ps or X11 graphics), and make them more publishable and modifiable. New applications for cloning would be great too! I heard there are some abandoned GUI projects, someone from Argentina? If it's true and the GUI is worth enough, you can re-take the effort. I hope that, between all these awful Spanglish lines you find something useful, at least a little help (I learned English reading Tolkien like a freak, sorry if I write queer). Thanks for your helpful GUI-dev interest! From maximtel at ibpm.pushchino.ru Mon Apr 21 08:43:36 2003 From: maximtel at ibpm.pushchino.ru (Maxim Telegin) Date: 21 Apr 2003 16:43:36 +0400 Subject: GUI In-Reply-To: <3EA3CDE5.3020902@iib.uam.es> References: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> <3EA3CDE5.3020902@iib.uam.es> Message-ID: <1050929016.4565.60.camel@lab207.ibpm.serpukhov.su> On Mon, 21.04.2003, at 14:54, Carlos Quijano wrote: > 1- There is no GUI with all the options for all the programs (and you > have to do it without turning the GUI dusky). > 2- There is no GUI focused in an output usefull for publishing - papers, > if you follow me - (great weak point). > 3- There is no a really windows-based GUI without using Java or > web-browsing (I love GNU and Linux and Sun, so forget this unlucky > advice, for more detail, read line * ;-) > 4- ¿Have somebody dreamed about pipelines between emboss apps?. > 5- It could be great to have an expert system.
For example, send a > sequence and receive all information possible (very usefull, a lot of > people is lost with the bioinformatic's protocols, with this utillity > they shall see how is all done). A cloning expret system? ;-) > 6- It could be interesting to enhance the EDIT - VIEW interface of > emboss (and their GUIs do little about it, only presenting the output... ). Big thanks for tips, more of them we are planed yet. > I have installed Jemboss, W2H and EmbossGUI, all of them very usefull. > My advice is that if you are going to spend your time for the > bioinformatics, it's best to improve the GNU software present by now and > only start from the beginning if you are going to do something really > new (for this reason I give you such ideas). > Reading your mail seems you look for something new, and more powerfull, > you say "all-in-one". Ok, try with points 2,5 and 6. And if you think to > use it in cloning, then try to make cirdna and lindna apps, for example, > more usuable for the typical researcher (avoiding the code-like data > input file). This is the deficiency I found in EDIT-VIEW (6). > You can do a lot for the emboss project developing a new graphical > output interface, for playing with the graphics (now we only have pretty > but boring and static png, ps or X11 graphics), and do them more > publishable and modificable. New applications for cloning would be great > too! Automatical designing of cloning strategy and realizing features usefull for day-to-day working of gene engeneer - our major aim. So realization of powerfull interactive window-based GUI with advanced drag'n'drop possibilities well'be we think not luxury. > > I heard there are some GUI projets abandoned, someone from Argentina? If > it's true and the GUI worths enough, you can re-take the effort. Hm.. I dont know about it. I'll try to find some about. 
> I hope that, between all this spanglish awfull lines you find something > usefull, at least a little help (I learned english reading Tolkien like > a freak, sorry if I write queer). Of course such essays will be very helpful :) Don't take it hard. > Thanks for your helpfull GUI-dev interest! > Regards Maxim A. Telegin IBPM Russian Academy of Sciences From maximtel at ibpm.pushchino.ru Mon Apr 21 08:44:26 2003 From: maximtel at ibpm.pushchino.ru (Maxim Telegin) Date: 21 Apr 2003 16:44:26 +0400 Subject: GUI In-Reply-To: References: Message-ID: <1050929066.4565.63.camel@lab207.ibpm.serpukhov.su> On Mon, 21.04.2003, at 13:46, David Martin wrote: > That would be very nice. Even better would be adding functionality (like > Jemboss) where the databases and applications can exist on a remote machine > (possibly by using SOPA or somesuch). Yes, I think it is a very important property so we are planning to implement such an architecture. > Another key element for a good gui is drag and drop cloning. That would be > really nice.. This is one of our major aims. I think the EMBOSS package is poor in this functionality, so implementing a GUI in this context will be fully justified. The CLI, despite its power, is not adequate in this case. At least we have to implement some important widgets (sequence editing with the ability to visualize additional data (features, translations, restriction sites etc) and an interactive sequence map viewer generating publication-quality graphics). I think this widget library will be useful not only for our program. Thanks for the tips. Regards Maxim A. Telegin IBPM Russian Academy of Sciences From jrvalverde at cnb.uam.es Mon Apr 21 09:33:18 2003 From: jrvalverde at cnb.uam.es (José R.
Valverde) Date: Mon, 21 Apr 2003 15:33:18 +0200 Subject: GUI In-Reply-To: <3EA3CDE5.3020902@iib.uam.es> References: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> <3EA3CDE5.3020902@iib.uam.es> Message-ID: <20030421153318.5b1e0fa4.jrvalverde@cnb.uam.es> On Mon, 21 Apr 2003 12:54:29 +0200 Carlos Quijano wrote: > * My effort is to comprehend my users' needings, and give them the > best solution. I think give them the "easiest" solution is not the right > direction for the research community, for science. We need the max. > accuracy, and it's not the same concept. > That's a common misconception. Accuracy is not the goal. Meaningfulness is. Having measures to the tenth decimal point may be absolutely meaningless in many contexts. And worst, misleading too. What users want is meaningful results. They want to get something they can a) understand and b) trust, and if by the way the get some measure of the reliability of results, well, it won't hurt. Now, what does that mean? To provide meaning, you need to understand the methods AND the data. *We* do understand the methods, but only users know the data they are using. There are two ways around this: one is to try and foresee all possible use cases and provide options and explanations for each (kind of an expert system), and the other is to give the user hints to realize when something is going airy and pointers to further information. Since foreseeing every possible use case is quite difficult (if not impossible), the first solution may give a false impression of overaccuracy and be misleading too. If you go for it, you better be *real good* at it. So, from my point of view, users need to get what they need *iff at all possible*. Note the double 'f': *if and only if*. If you can't give them what they need you better don't. Overbloating a user interface with bells and whistles may lead them to blind believing in the results, and we don't want that, what we want is escepticism on the results. Always. Dot. 
To sum it up: concentrate on meaning, and make sure the user always knows what to trust and what not, and provide enough pointers (e.g. as hyperlinks) to further explanations. An extra note on this: provide SHORT tips and explanations FIRST. Assume users won't maintain attention for more than 10 seconds. Anything that takes longer to read won't be read at first sight. Once they decide, based on your tip, that further investigation is needed, you can THEN lead them to longer descriptions. > So, if you want to develop a new GUI, think about how are you going to > overcome the other ones. For doing the same that others already do, dont > waste your time. Try to innovate! (cloning focus.... good idea) > That's a good one. And it leads to an important conclusion: it is probably a waste of time to duplicate other people's work. So, if possible, don't. Consider joining some of the existing efforts. Jemboss may be a good one since, being Java, it runs everywhere. Instead of duplicating efforts, add to it what you feel is missing. Contact the Jemboss team and find out how to add new functionality to it. > Some ideas about what I think are other GUI's weakness: > 1- There is no GUI with all the options for all the programs (and you > have to do it without turning the GUI dusky). > > 2- There is no GUI focused in an output usefull for publishing - papers, > if you follow me - (great weak point). Right. Turning emboss output into something more useful (like editable vector graphics, PostScript, etc.) is a nice goal. Furthermore, a simple output 'editor' that allows adding some arrows, notes, or simple graphics to program output might be good enough. > 3- There is no a really windows-based GUI without using Java or > web-browsing (I love GNU and Linux and Sun, so forget this unlucky > advice, for more detail, read line * ;-) Java runs everywhere. True, Jemboss is a pain to install. Why not make it easy to install? 
Furthermore, why not create 'distributions' that are ready to run (and install) for several architectures? > 4- ?Have somebody dreamed about pipelines between emboss apps?. > 5- It could be great to have an expert system. For example, send a > sequence and receive all information possible (very usefull, a lot of > people is lost with the bioinformatic's protocols, with this utillity > they shall see how is all done). A cloning expert system? ;-) For the reasons explained above, I would rather propose developing 'wizards': simple tools that guide the user through the basic process, providing tips here and there, along with hints that results may be a lot better if one uses the full power of the tools, and links to the actual tools and their documentation. Then the casual user will have an easy entry point, and after a few trials, if s/he finds it worthwhile, would-be power users will have the starting points to become proficient. > 6- It could be interesting to enhance the EDIT - VIEW interface of > emboss (and their GUIs do little about it, only presenting the output... ). > Yep, a feature browser, a sequence editor, etc. might be good add-ons to Jemboss. Note that if the extensions are properly done, so that they are independent from Jemboss, have a good interface to the main program (a bit like Jalview), and are written in Java, then they might be added with little effort to other web-based GUIs, increasing the utility of the tools. As I said, I would contact the Jemboss team and find out with them how to contribute. j -- These opinions are mine and only mine. Hey man, I saw them first! José R. 
Valverde Artificial Intelligence is of no use when the Natural kind is lacking From mayaguezcoqui at fastmail.fm Mon Apr 21 16:59:12 2003 From: mayaguezcoqui at fastmail.fm (Lorraine Cavanaugh) Date: Mon, 21 Apr 2003 16:59:12 -0400 Subject: Jemboss and Fink package problems Message-ID: <1AD1B986-743C-11D7-86A8-000393120AFA@fastmail.fm> Hi all, I have recently attempted to install Emboss and Jemboss to run in standalone mode on my G4 laptop. I was unable to get the program suite to compile properly using the Jemboss script, but was able to get Fink to install the package for me. Anyone have ideas on how to configure Emboss to run as standalone with a Jemboss interface with a Fink installation? I think Fink only installed Emboss (which works fine from the command line). I did install Jemboss, which launches, but asks for a username and password on startup, so I think it's trying to run in client mode. Thanks! Lorraine ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Lorraine Cavanaugh Hughson Lab Department of Molecular Biology 201 Schultz Labs Princeton University lcavanaugh at molbio.princeton.edu mayaguezcoqui at fastmail.fm From david at cnb.uam.es Tue Apr 22 04:19:24 2003 From: david at cnb.uam.es (David Garcia Aristegui) Date: Tue, 22 Apr 2003 10:19:24 +0200 Subject: Jemboss and Fink package problems In-Reply-To: <1AD1B986-743C-11D7-86A8-000393120AFA@fastmail.fm> References: <1AD1B986-743C-11D7-86A8-000393120AFA@fastmail.fm> Message-ID: You need Tomcat and Axis/SOAP to run Jemboss. Look up where emboss is installed with fink ( /sw ??? ). To configure Jemboss to run as standalone: http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Jemboss/install/standalone.html MacOS X http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Jemboss/install/macosx_server.html Go to the directory where the script should be run from: cd EMBOSS-2.x.x/jemboss/utils Run the install-jemboss-server.sh script. 
./install-jemboss-server.sh Go to the jemboss directory in the EMBOSS install directory ($EMBOSS_INSTALL/share/EMBOSS/jemboss) and try running Jemboss. Edit runJemboss.csh to set the following environment variables: setenv EMBOSS_INSTALL /usr/local/emboss/ setenv LD_LIBRARY_PATH $EMBOSS_INSTALL/lib For MacOSX also add: setenv DYLD_LIBRARY_PATH $EMBOSS_INSTALL/lib Also add the 'local' option for Jemboss to run in 'standalone' mode: ( very important, java1.3 or higher should be used ). java org/emboss/jemboss/Jemboss local & Then try running it by typing ./runJemboss.csh. HTH, David. >Hi all, > >I have recently attempted to install Emboss and Jemboss to run in >standalone mode on my G4 laptop. I was unable to get the program >suite to compile properly using the Jemboss script, but was able to >get Fink to install the package for me. > >Anyone have ideas on how to configure Emboss to run as standalone >with a Jemboss interface with a Fink installation? I think Fink >only installed Emboss (which works fine from the command line). I >did install Jemboss, which launches, but asks for a username and >password on startup, so I think it's trying to run in client mode. > >Thanks! >Lorraine >~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >Lorraine Cavanaugh >Hughson Lab >Department of Molecular Biology >201 Schultz Labs >Princeton University > >lcavanaugh at molbio.princeton.edu >mayaguezcoqui at fastmail.fm From tcarver at hgmp.mrc.ac.uk Tue Apr 22 05:14:31 2003 From: tcarver at hgmp.mrc.ac.uk (Dr T. Carver) Date: Tue, 22 Apr 2003 10:14:31 +0100 (BST) Subject: GUI In-Reply-To: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> Message-ID: I would be very interested to know your ideas on improvements that can be made to the existing Jemboss GUI. 
We would certainly encourage and *very* much welcome any collaboration or code improvements to the GUI. Feedback and GUI enhancement to Jemboss are actively encouraged! The code, as it is part of the EMBOSS distribution, is freely available. In the early stages of the project people have had problems with the installation process. The install script has been improved and includes different types of installations. It is tested on standard installations of Solaris, linux, AIX, OSF, MacOSX and irix platforms. The problems that people encounter are mainly to do with the site specific set up. However, I think we could do better with documentation. You may be interested to know the work we are currently doing includes an integrated sequence editor and a multiple sequence editor. The early release of the multiple sequence editor can be found in Jemboss at the HGMP. This can be run separately from the interface but initially can be found as part of the main Jemboss GUI. There is also a CCP11 project at the HGMP to work on improving the graphics to EMBOSS and these will be filtered through to Jemboss. Regards Tim Carver HGMP On 21 Apr 2003, Maxim Telegin wrote: > Dear EMBOSS users! > Our group means to write some integrated GUI frontend using the EMBOSS > package. It is planed as all-in-one program, such as Vector NTI under > Windows platform plus such enhancements as advanced functions in > cloning-strategy planning and other. As base we are planning to use the > Gtk library (because we have some experience). Is such program will be > useful? We know that different GUI frontends (Jemboss, Kaptain etc) > exists, but we want to realize something more powerfull. > > Regards > Maxim A. 
Telegin > IBPM > Russian Academy of Sciences > From ztu at msi.umn.edu Tue Apr 22 14:57:47 2003 From: ztu at msi.umn.edu (Zheng Jin Tu) Date: Tue, 22 Apr 2003 13:57:47 -0500 (CDT) Subject: database ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/ for emboss In-Reply-To: <1050929066.4565.63.camel@lab207.ibpm.serpukhov.su> Message-ID: Anyone has success story in "indexing" human genome at ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/ for emboss? They are fasta format files, I try to run formatdb these chromosomes then dbiblast. But it always gives me some errors. Some runs as ---------------------------------------------- swinst at bi7 [CHR_16up] % head chr16.fa >gi|29824587|ref|NC_000016.4|NC_000016 Homo sapiens chromosome 16, complete sequence TAACCCTAACCCTAACCCTAACCCTAACCCTAACCGACCCTCACCCTCACCCTAACCACATGAGCAATGT GGGTGTTATATTTTAGCTGTCATGGGTGCATTAGGAATGCTGCATTTGTGTTTCAACGCTGCAACTGGAC CCTGCAATGCAGCCCCTCGCCTTGCCTTGGGAGAATCTCGGTGCCCAGGATTCAGAGGGGCTTTTAGTTT CCCATTTTCCACACTGAACCGTTCTAACTGGTCTCTGACCTTGATTATTCACGGCTGCAACCGGGAAAGA TTTTATTCACTGTCAATGCGCCCCGAGTTGTCCCAAAGCCAGGCAGTGCCCCCAACGTCTGTGCTTAGCA GAATGCTGCTCCACCTTTACGGTGACCCCCAGGTCTGTGCTGAGCAGAACGCAGCTCCGCCCTCGCAGTA CCCTCAGCCCGCCCGCCCGGGTCTGACCTGAGCAGAACTCTGCTCTGCCTTCGCAGTACCACCGAAATCT GTGCAAAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGCGTCTGTGCTGAGGAGAACGCAACTCCGC CGTCGCAAAGGCGCGCGCCGCGCCGGCGCAGGCGCAGAGGGGCGCGCCGCGCCGGCGCAGGCGCAGAGAC swinst at bi7 [CHR_16up] % formatdb -i chr16.fa -p F -o T swinst at bi7 [CHR_16up] % ls -l chr16* -rw-r--r-- 1 swinst swinst 91281742 Apr 14 05:27 chr16.fa -rw-r----- 1 swinst swinst 129 Apr 22 13:53 chr16.fa.nhr -rw-r----- 1 swinst swinst 80 Apr 22 13:53 chr16.fa.nin -rw-r----- 1 swinst swinst 8 Apr 22 13:53 chr16.fa.nnd -rw-r----- 1 swinst swinst 52 Apr 22 13:53 chr16.fa.nni -rw-r----- 1 swinst swinst 147 Apr 22 13:53 chr16.fa.nsd -rw-r----- 1 swinst swinst 66 Apr 22 13:53 chr16.fa.nsi -rw-r----- 1 swinst swinst 22518829 Apr 22 13:53 chr16.fa.nsq swinst at bi7 [CHR_16up] % 
dbiblast Index a BLAST database Database name: chr16 Database directory [.]: Wildcard database filename [chr16]: chr16.fa* Release number [0.0]: 33 Index date [00/00/00]: 04/22/03 N : nucleic P : protein ? : unknown Sequence type [unknown]: N 1 : wublast and setdb/pressdb 2 : formatdb 0 : unknown Blast index version [unknown]: 2 swinst at bi7 [CHR_16up] % ls -rlt -rw-r----- 1 swinst swinst 8 Apr 22 13:53 chr16.fa.nnd -rw-r----- 1 swinst swinst 52 Apr 22 13:53 chr16.fa.nni -rw-r----- 1 swinst swinst 147 Apr 22 13:53 chr16.fa.nsd -rw-r----- 1 swinst swinst 66 Apr 22 13:53 chr16.fa.nsi -rw-r----- 1 swinst swinst 129 Apr 22 13:53 chr16.fa.nhr -rw-r----- 1 swinst swinst 80 Apr 22 13:53 chr16.fa.nin -rw-r----- 1 swinst swinst 22518829 Apr 22 13:53 chr16.fa.nsq -rw-r--r-- 1 swinst swinst 680 Apr 22 13:53 formatdb.log -rw-r--r-- 1 swinst swinst 344 Apr 22 13:55 division.lkp -rw-r--r-- 1 swinst swinst 320 Apr 22 13:55 entrynam.idx -rw-r--r-- 1 swinst swinst 300 Apr 22 13:55 acnum.trg -rw-r--r-- 1 swinst swinst 300 Apr 22 13:55 acnum.hit swinst at bi7 [CHR_16up] % -------------------------------------------------------------------------- Thanks, Tu ---------------------------------------------------------------- Zheng Jin Tu Computational Biology Specialist Supercomputing Institute 599 Walter Library 117 Pleasant Street SE University of Minnesota Minneapolis, Minnesota 55455 email: ztu at msi.umn.edu help email: help at msi.umn.edu phone: 612-624-9504, 624-0115 help phone: 612-626-0802 fax: 612-624-8861 ----------------------------------------------------------------- From ztu at msi.umn.edu Tue Apr 22 15:09:49 2003 From: ztu at msi.umn.edu (Zheng Jin Tu) Date: Tue, 22 Apr 2003 14:09:49 -0500 (CDT) Subject: database ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/ for emboss In-Reply-To: Message-ID: Here is some more message related to this question: on .embossrc file: DB chr16 [ type: N method: blast release: "33" format: ncbi dir: 
/usr/local/db/embossdb/H_sapiens/build_33/CHR_16up file: chr16.fa* comment: "Human chr 16 from ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/April_14_2003/" ] Then try to run fuzztran -sequence=chr16 -pattern="CC" -mismatch=0 -frame=6 -outf=myout Protein pattern search after translation EMBOSS An error in ajseqdb.c at line 4006: error reading file /usr/local/db/embossdb/H_sapiens/build_33/CHR_16up/chr16.fa.nhr Thanks, Tu ---------------------------------------------------------------- Zheng Jin Tu Computational Biology Specialist Supercomputing Institute 599 Walter Library 117 Pleasant Street SE University of Minnesota Minneapolis, Minnesota 55455 email: ztu at msi.umn.edu help email: help at msi.umn.edu phone: 612-624-9504, 624-0115 help phone: 612-626-0802 fax: 612-624-8861 ----------------------------------------------------------------- On Tue, 22 Apr 2003, Zheng Jin Tu wrote: > > Anyone has success story in "indexing" human genome at > ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/ > for emboss? > > They are fasta format files, I try to run formatdb these chromosomes > then dbiblast. But it always gives me some errors. 
From yezq at mail.cbi.pku.edu.cn Tue Apr 22 22:10:15 2003 From: yezq at mail.cbi.pku.edu.cn (Zhiqiang Ye) Date: Wed, 23 Apr 2003 10:10:15 +0800 Subject: database ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/for emboss Message-ID: <200304230205.h3N25D3E000398@mail.cbi.pku.edu.cn> you can just run dbifasta, you don't need formatdb. ======= 2003-04-22 13:57:00 ======= >Anyone has success story in "indexing" human genome at >ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/ >for emboss? 
> >They are fasta format files, I try to run formatdb these chromosomes >then dbiblast. But it always gives me some errors. = = = = = = = = = = = = = = = = = = = = Best Wishes! Zhiqiang Ye 2003-04-23 From calvinwangxi at yahoo.com Wed Apr 23 07:42:39 2003 From: calvinwangxi at yahoo.com (calvin wang) Date: Wed, 23 Apr 2003 04:42:39 -0700 (PDT) Subject: kaptain Message-ID: <20030423114239.6054.qmail@web20514.mail.yahoo.com> I have just installed kaptain but I can not run it... kaptain --version gives me an error msg, and I can not run any emboss program via kaptain. 
bash-2.05b# kaptain --version kaptain 0.71 Copyright (C) 2000-2002 Terék Zsolt Mutex destroy failure: Device or resource busy I assume it is kaptain wossname for example... well but anyway is that msg usual after kaptain --version? From Joerg.Schaber at uv.es Wed Apr 23 11:46:40 2003 From: Joerg.Schaber at uv.es (Joerg Schaber) Date: Wed, 23 Apr 2003 17:46:40 +0200 Subject: coderet problems Message-ID: <3EA6B560.2010308@uv.es> Hi, I am having problems extracting CDS from NCBI flat files using coderet. I keep getting the error 'Unable to read sequence'. I guess the problem could be solved if I used the appropriate command line arguments like -sformat1, for instance. However, the docs do not state what options I have for the associated qualifiers. Any idea what could be the problem or what options there are for the qualifiers? joerg From Wiepert.Mathieu at mayo.edu Wed Apr 23 12:08:52 2003 From: Wiepert.Mathieu at mayo.edu (Wiepert, Mathieu) Date: Wed, 23 Apr 2003 11:08:52 -0500 Subject: coderet problems Message-ID: <2F41CC6C9777D311ACBD009027B108EA0541C746@excsrv32.mayo.edu> Hi, Have you tried the coderet -help -verbose options? That should give you all the possible parameters available. I couldn't tell you what they all do though, sorry... 
~ $ coderet -help -verbose Mandatory qualifiers: [-seqall] seqall Sequence database USA [-seqout] seqout Output sequence USA Optional qualifiers: (none) Advanced qualifiers: -[no]cds boolean Extract CDS sequences -[no]mrna boolean Extract mrna sequences -[no]translation boolean Extract translated sequences Associated qualifiers: "-seqall" related qualifiers -sbegin1 integer First base used -send1 integer Last base used, def=seq length -sreverse1 boolean Reverse (if DNA) -sask1 boolean Ask for begin/end/reverse -snucleotide1 boolean Sequence is nucleotide -sprotein1 boolean Sequence is protein -slower1 boolean Make lower case -supper1 boolean Make upper case -sformat1 string Input sequence format -sopenfile1 string Input filename -sdbname1 string Database name -sid1 string Entryname -ufo1 string UFO features -fformat1 string Features format -fopenfile1 string Features file name "-seqout" related qualifiers -osformat2 string Output seq format -osextension2 string File name extension -osname2 string Base file name -osdbname2 string Database name to add -ossingle2 boolean Separate file for each entry -oufo2 string UFO features -offormat2 string Features format -ofname2 string Features file name General qualifiers: -auto boolean Turn off prompts -stdout boolean Write standard output -filter boolean Read standard input, write standard output -options boolean Prompt for required and optional values -debug boolean Write debug output to program.dbg -acdlog boolean Write ACD processing log to program.acdlog -acdpretty boolean Rewrite ACD file as program.acdpretty -acdtable boolean Write HTML table of options -verbose boolean Report some/full command line options -help boolean Report command line options. 
More information on associated and general qualifiers can be found with -help -verbose -warning boolean Report warnings -error boolean Report errors -fatal boolean Report fatal errors -die boolean Report deaths -----Original Message----- From: Joerg Schaber [mailto:Joerg.Schaber at uv.es] Sent: Wednesday, April 23, 2003 10:47 AM To: emboss at embnet.org Subject: coderet problems Hi, i am having problems extracting CDS from NCBI flat files using coderet. I keep getting the error 'Unable to read sequence'. I guess the problem could be solved if I used the appropriate command line arguments like -sformat1, for instance. However, in the docs is is not stated what options I have for the associated qualifiers. Any idea what could be the problem or what options there are for the qualifiers? joerg From Joerg.Schaber at uv.es Wed Apr 23 12:47:18 2003 From: Joerg.Schaber at uv.es (Joerg Schaber) Date: Wed, 23 Apr 2003 18:47:18 +0200 Subject: coderet again Message-ID: <3EA6C396.8060507@uv.es> ok, I applied coderet to the same feature table as before but from EMBL instead of NCBI and it worked. I conclude that it was indeed a format problem. Any idea if that is a general bug of coderet? It seemed to know the ncbi format, though (debug file). joerg From ablavier at wanadoo.fr Wed Apr 23 16:03:12 2003 From: ablavier at wanadoo.fr (=?iso-8859-1?Q?Andr=E9_Blavier?=) Date: Wed, 23 Apr 2003 22:03:12 +0200 Subject: EMBOSS for Windows Message-ID: <001701c309d3$63f62110$0100a8c0@bach> I have started to work on porting Emboss to Windows. I have encountered very few problems so far. 
Look at http://perso.wanadoo.fr/ablavier/embosswin/embosswin.html and tell me what you think, if this work is of interest to you. -- André Blavier From David.Bauer at SCHERING.DE Thu Apr 24 01:29:57 2003 From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE) Date: Thu, 24 Apr 2003 07:29:57 +0200 Subject: Antwort: coderet again Message-ID: What EMBOSS version are you using? In older EMBOSS releases the programs reading feature tables did not understand GenBank format. David. ok, I applied coderet to the same feature table as before but from EMBL instead of NCBI and it worked. I conclude that it was indeed a format problem. Any idea if that is a general bug of coderet? It seemed to know the ncbi format, though (debug file). joerg From pmr at ebi.ac.uk Thu Apr 24 04:05:15 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Thu, 24 Apr 2003 09:05:15 +0100 Subject: coderet again References: <3EA6C396.8060507@uv.es> Message-ID: <3EA79ABB.7040703@ebi.ac.uk> Joerg Schaber wrote: > ok, I applied coderet to the same feature table as before but from EMBL > instead of NCBI and it worked. I conclude that it was indeed a format > problem. Any idea if that is a general bug of coderet? It seemed to know > the ncbi format, though (debug file). NCBI format does not have any feature information - it is a FASTA file with an NCBI style ID. Was it perhaps GENBANK format that you were reading? 
Hope this helps, Peter From gbottu at black.vub.ac.be Thu Apr 24 04:56:15 2003 From: gbottu at black.vub.ac.be (Guy Bottu) Date: Thu, 24 Apr 2003 10:56:15 +0200 Subject: Preferred isoschizomer (bis) In-Reply-To: <003d01c30a31$4813ac70$0402a6c1@windows.csc.fi>; from eija.korpelainen@csc.fi on Thu, Apr 24, 2003 at 10:15:27AM +0300 References: <20030415205431.A1419113@black.vub.ac.be> <003d01c30a31$4813ac70$0402a6c1@windows.csc.fi> Message-ID: <20030424105615.A1078815@black.vub.ac.be> Dear colleagues, I took a second look, and it is even worse : the file withrefm.304 contains as many as 149 enzymes with restriction site CTGCAG. The file embossre.enz contains 2 sites CTGCAG (BspMAI and PstI) and 147 sites ctgcag. When I run restrict with parameter -nolimit it finds the 2 sites and when I run it in default mode it finds only BspMAI. There is clearly a bug+misfeature in the programs rebaseextract+restrict. One would expect that restrict by default finds PstI and with -nolimit finds all 149 enzymes (although this would give a monstrous output). Sincerely, Guy Bottu From arunanirudhan at yahoo.co.in Thu Apr 24 05:20:47 2003 From: arunanirudhan at yahoo.co.in (=?iso-8859-1?q?arun=20anirudhan?=) Date: Thu, 24 Apr 2003 10:20:47 +0100 (BST) Subject: Database Message-ID: <20030424092047.60966.qmail@web8204.mail.in.yahoo.com> Hello all I am new to emboss. showdb is showing the results correctly. But seqret is showing this result [arun at localhost arun]$ seqret Reads and writes (returns) sequences Input sequence(s): embl:L07770 Warning: Cannot open division file '' for database 'embl' Warning: seqCdQry failed Error: Unable to read sequence 'embl:L07770' Please help Arun Anirudhan From pmr at ebi.ac.uk Thu Apr 24 05:35:19 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Thu, 24 Apr 2003 10:35:19 +0100 Subject: Database References: <20030424092047.60966.qmail@web8204.mail.in.yahoo.com> Message-ID: <3EA7AFD7.4070906@ebi.ac.uk> arun anirudhan wrote: > Hello all > I am new to emboss. > showdb is showing the results correctly. > But seqret is showing this result > [arun at localhost arun]$ seqret > Reads and writes (returns) > sequences > Input sequence(s): embl:L07770 > Warning: Cannot open division file > '' for > database 'embl' > Warning: seqCdQry failed > Error: Unable to read sequence > 'embl:L07770' > Please help You have EMBL defined as a database, but you have either not defined the correct access method or have not indexed it correctly. Assuming you have EMBL locally, you could index it with the dbiflat program (or use SRS if you have it :-) If you have EMBL defined as remote access, remember that L07770 is an accession number, not the ID (which is XLRHODOP) You can try this definition to use ID and accession number searches: DB embl [ type: N method: srswww url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz" comment: "EMBL from EBI" dbalias: embl ] This uses the EBI's SRS server to query by ID and accession number for a USA that does not specify which kind of identifier you are using. Hope this helps, Peter Rice From Joerg.Schaber at uv.es Thu Apr 24 09:38:55 2003 From: Joerg.Schaber at uv.es (Joerg Schaber) Date: Thu, 24 Apr 2003 15:38:55 +0200 Subject: Antwort: coderet again References: Message-ID: <3EA7E8EF.9010909@uv.es> I updated to version 2.6.0 but the problem remained. However, I figured that in my emboss.default file I erroneously set the database type to P instead of N. After I changed that coderet worked perfectly well. However, I encountered a new problem with the new version. 
I tried the new, very useful '-describe' qualifier of extractfeat and received the error message "Died: unknown qualifier -describe". Any idea what could be the problem here? Thanks, joerg David.Bauer at SCHERING.DE wrote: >What embossversion are you using ? >In older EMBOSS releases the programs reading feature tables did not >understand genbank format. > >David. > >ok, I applied coderet to the same feature table as before but from EMBL >instead of NCBI and it worked. I conclude that it was indeed a format >problem. Any idea if that is a general bug of coderet? It seemed to know >the ncbi format, though (debug file). > >joerg > > > -- ---------------------------------------------------------- Jörg Schaber Instituto Cavanilles de Biodiversidad y Biologia Evolutiva Universidad de Valencia Tel.: ++34 96 354 3666 A.C. 22085 Fax.: ++34 96 354 3670 46071 Valencia, España email : jos at uv.es From jan.wuyts at gengenp.rug.ac.be Thu Apr 24 09:48:16 2003 From: jan.wuyts at gengenp.rug.ac.be (Jan Wuyts) Date: Thu, 24 Apr 2003 15:48:16 +0200 (MEST) Subject: matcher score calculation Message-ID: Dear all, I am trying to use 'matcher' to do a local alignment of a small RNA sequence against a larger one. However, the output confuses me a bit. 
For example:

matcher seq1 seq2 -alternatives 9 -stdout -auto > output

The best (first) match in the output is this:

########################################
# Program: matcher
# Rundate: Thu Apr 24 15:21:41 2003
# Align_format: markx0
# Report_file: stdout
########################################

#=======================================
#
# Aligned_sequences: 2
# 1: 21
# 2: 21-1
# Matrix: EDNAFULL
# Gap_penalty: 16
# Extend_penalty: 4
#
# Length: 18
# Identity: 16/18 (88.9%)
# Similarity: 13/18 (72.2%)
# Gaps: 0/18 ( 0.0%)
# Score: 61
#
#
#=======================================

             10        20
21     GCAGCAUCAUCAAGAUUC
       :::::: :::.:::::::
21-1   GCAGCACCAUUAAGAUUC
         440       450

#=======================================

Apparently 16 positions are identical (seems right, there are 16 ':') but only 13 are counted as similar. First of all, I don't understand why CU would be counted as similar (this score is after all negative in EDNAFULL) and second, how can it be that #similar is smaller than #identical? The manual (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Themes/AlignFormats.html) states that "Any two residues or bases are defined as similar when they have positive comparisons (as defined by the comparison matrix being used in the alignment algorithm)." and a bit further "Note that the sum of identical and similar positions is greater than 100%. This is because the count of similar positions includes the count of identical positions; if residues are identical, they must also be similar." Therefore I would think #similar must always be >= #identical.

Lastly, when I calculate the score manually, I get 16x5-2x4=72 (in EDNAFULL, 5 is used for all non-ambiguous matches, -4 for all non-ambiguous mismatches) while matcher calculates the score to be 61.

Any help on this would be greatly appreciated.

Greetings,

Jan.
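
The hand calculation in the message above can be reproduced with a short sketch. It assumes, as the message does, a simplified EDNAFULL in which every unambiguous match scores 5 and every unambiguous mismatch scores -4, with U treated like any other base. The function name is made up for illustration, and this is not how matcher itself computes its score.

```python
# Hand check of the alignment quoted above, under a simplified scoring
# scheme (assumption): +5 per identical pair, -4 per mismatch, no gaps.
MATCH, MISMATCH = 5, -4


def score_ungapped(a, b):
    """Return (identical, mismatched, score) for two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    identical = sum(1 for x, y in zip(a, b) if x == y)
    mismatched = len(a) - identical
    return identical, mismatched, identical * MATCH + mismatched * MISMATCH


seq1 = "GCAGCAUCAUCAAGAUUC"  # sequence '21' from the report
seq2 = "GCAGCACCAUUAAGAUUC"  # sequence '21-1' from the report
print(score_ungapped(seq1, seq2))  # (16, 2, 72)
```

Under these assumptions the sketch agrees with the manual figure of 72 and the reported Identity of 16/18, so the discrepancy lies in the Similarity and Score lines of the report rather than in the arithmetic.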
From gwilliam at hgmp.mrc.ac.uk Thu Apr 24 09:50:06 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Thu, 24 Apr 2003 14:50:06 +0100
Subject: Antwort: coderet again
References: <3EA7E8EF.9010909@uv.es>
Message-ID: <3EA7EB8E.BA88FE38@hgmp.mrc.ac.uk>

The '-describe' option will be available in the next release (2.7.0) of EMBOSS.

See the Change Log for version 2.7.0:
http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/ChangeLog.html#0

Gary

Joerg Schaber wrote:
> However, I encountered a new problem with the new version. I tried the
> new very useful '-describe' qualifier of extractfeat and received the
> error message "Died: unknown qualifier -describe".

--
Gary Williams  Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk  http://www.hgmp.mrc.ac.uk/
Bioinformatics, MRC HGMP Resource Centre, Hinxton, Cambridge, CB10 1SB, UK

From David.Bauer at SCHERING.DE Thu Apr 24 09:54:56 2003
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Thu, 24 Apr 2003 15:54:56 +0200
Subject: coderet again
Message-ID:

Oops, in my version 2.6.0 extractfeat does not have this option. Maybe Gary has an idea?

David.

I updated to version 2.6.0 but the problem remained. However, I figured that in my emboss.default file I erroneously set the database type to P instead of N. After I changed that coderet worked perfectly well.

However, I encountered a new problem with the new version. I tried the new very useful '-describe' qualifier of extractfeat and received the error message "Died: unknown qualifier -describe". Any idea what could be the problem here?

Thanks,

joerg

From David.Bauer at SCHERING.DE Thu Apr 24 10:07:07 2003
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Thu, 24 Apr 2003 16:07:07 +0200
Subject: Antwort: matcher score calculation
Message-ID:

I had this problem a long time ago (and assumed it was fixed in the meantime). Matcher doesn't like the "U".
If you change your RNA to DNA it will calculate the correct Similarity. David. Dear all, I am trying to use 'matcher' to do a local alignment of a small RNA sequence against a larger one. However, the output confuses me a bit. For example: matcher seq1 seq2 -alternatives 9 -stdout -auto > output The best (first) match in the output is this: ######################################## # Program: matcher # Rundate: Thu Apr 24 15:21:41 2003 # Align_format: markx0 # Report_file: stdout ######################################## #======================================= # # Aligned_sequences: 2 # 1: 21 # 2: 21-1 # Matrix: EDNAFULL # Gap_penalty: 16 # Extend_penalty: 4 # # Length: 18 # Identity: 16/18 (88.9%) # Similarity: 13/18 (72.2%) # Gaps: 0/18 ( 0.0%) # Score: 61 # # #======================================= 10 20 21 GCAGCAUCAUCAAGAUUC :::::: :::.::::::: 21-1 GCAGCACCAUUAAGAUUC 440 450 #======================================= Apparently 16 positions are identical (seems right, there are 16 ':') but only 13 are counted as similar. First of all, I don't understand why CU would be counted as similar (this score is after all negative in EDNAFULL) and second, how can it be that #similar is small than #identical. The manual (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Themes/AlignFormats.html) states that "Any two residues or bases are defined as similar when they have positive comparisons (as defined by the comparison matrix being used in the alignment algorithm)." and a bit further "Note that the sum of identical and similar positions is greater than 100%. This is because the count of similar positions includes the count of identical positions; if residues are identical, they must also be similar." Therefor I would think #similar must always be >= #identical. Lastly, when I calculate the score manually, I get 16x5-2x4=72 (in EDNAFULL, 5 is used for all non-ambiguous matches, -4 for all non-ambiguous mis-matches) while matcher calculates the score to be 61. 
Any help on this would be greatly appreciated. Greetings, Jan. From Jack.Leunissen at cmbi.kun.nl Thu Apr 24 15:10:52 2003 From: Jack.Leunissen at cmbi.kun.nl (Jack Leunissen) Date: Thu, 24 Apr 2003 21:10:52 +0200 Subject: Antwort: matcher score calculation In-Reply-To: Message-ID: <000401c30a95$4bacda50$0300000a@kuifje> What is even more surprising is that the match U->C is different from C->U. The first receives no 'dot' in the alignment, the latter does. Interesting... Jack A.M. Leunissen, Ph.D. Dept. Genome Informatics Wageningen University 6703 HA Wageningen, NL > -----Original Message----- > From: owner-emboss at hgmp.mrc.ac.uk > [mailto:owner-emboss at hgmp.mrc.ac.uk] On Behalf Of > David.Bauer at SCHERING.DE > Sent: Thursday, 24 April, 2003 16:07 > To: jan.wuyts at gengenp.rug.ac.be > Cc: emboss at embnet.org; jan.wuyts at gengenp.rug.ac.be; > owner-emboss at hgmp.mrc.ac.uk > Subject: Antwort: matcher score calculation > > > > > I had this problem long time ago (and assumed it was fixed in > the meantime). Matcher doesn't like the "U". If you change > your RNA to DNA it will calculate the correct Similarity. > > David. > > > Dear all, > > I am trying to use 'matcher' to do a local alignment of a > small RNA sequence against a larger one. However, the output > confuses me a bit. 
For example: matcher seq1 seq2 > -alternatives 9 -stdout -auto > output > > The best (first) match in the output is this: > ######################################## > # Program: matcher > # Rundate: Thu Apr 24 15:21:41 2003 > # Align_format: markx0 > # Report_file: stdout > ######################################## > #======================================= > # > # Aligned_sequences: 2 > # 1: 21 > # 2: 21-1 > # Matrix: EDNAFULL > # Gap_penalty: 16 > # Extend_penalty: 4 > # > # Length: 18 > # Identity: 16/18 (88.9%) > # Similarity: 13/18 (72.2%) > # Gaps: 0/18 ( 0.0%) > # Score: 61 > # > # > #======================================= > > > 10 20 > 21 GCAGCAUCAUCAAGAUUC > :::::: :::.::::::: > 21-1 GCAGCACCAUUAAGAUUC > 440 450 > #======================================= > > Apparently 16 positions are identical (seems right, there are > 16 ':') but only 13 are counted as similar. First of all, I > don't understand why CU would be counted as similar (this > score is after all negative in > EDNAFULL) and second, how can it be that #similar is small > than #identical. The manual > (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Themes/AlignFormats .html) states that "Any two residues or bases are defined as similar when they have positive comparisons (as defined by the comparison matrix being used in the alignment algorithm)." and a bit further "Note that the sum of identical and similar positions is greater than 100%. This is because the count of similar positions includes the count of identical positions; if residues are identical, they must also be similar." Therefor I would think #similar must always be >= #identical. Lastly, when I calculate the score manually, I get 16x5-2x4=72 (in EDNAFULL, 5 is used for all non-ambiguous matches, -4 for all non-ambiguous mis-matches) while matcher calculates the score to be 61. Any help on this would be greatly appreciated. Greetings, Jan. 
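
The workaround suggested earlier in this thread (change the RNA to DNA before running matcher) amounts to replacing U with T. A minimal sketch follows; the helper name is hypothetical. It also counts the identical pairs that involve U in the quoted alignment. That count is 3, and 16 - 3 = 13 happens to equal the reported Similarity, which would be consistent with U-U pairs not being scored as positive matches, though nothing in the thread confirms the cause.

```python
def rna_to_dna(seq):
    """Replace U/u with T/t, leaving every other character untouched."""
    return seq.replace("U", "T").replace("u", "t")


seq1 = "GCAGCAUCAUCAAGAUUC"  # sequence '21' from the report
seq2 = "GCAGCACCAUUAAGAUUC"  # sequence '21-1' from the report

print(rna_to_dna(seq1))  # GCAGCATCATCAAGATTC
print(rna_to_dna(seq2))  # GCAGCACCATTAAGATTC

# Identical columns where both bases are U in the original RNA alignment.
u_pairs = sum(1 for x, y in zip(seq1, seq2) if x == y == "U")
print(u_pairs)  # 3
```

The converted sequences can then be written to files and passed to matcher in place of the RNA originals.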
From kvddrift at earthlink.net Sat Apr 26 09:01:00 2003
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat, 26 Apr 2003 09:01:00 -0400
Subject: EMBOSS on Mac OS X
Message-ID:

Hi,

I am not sure if this has been discussed here before, but it is now very easy to install EMBOSS on Mac OS X. Through the 'Fink' project, a Debian-like packaging tool, a package for EMBOSS 2.6.0 is available. The package was originally submitted by Ben Hines, but now I am the 'maintainer' of it. I have also submitted packages for emboss-kaptain and kaptain to provide a GUI (also works with KDE) for EMBOSS on Mac OS X. Of course, all the credit goes to the developers of Kaptain and EMBOSS-kaptns - I just 'ported' the packages for Mac OS X.

If you are interested, please go to the Fink website (http://fink.sf.net), and download and install Fink. Now you can install any available package (e.g. 'fink install emboss'). This will take care of downloading, compiling and installing EMBOSS plus all the packages it needs. Be aware that this can take quite some time.

I would encourage all Mac OS X users to try this out, and see if it works as expected.

thanks,

- Koen van der Drift.

From siegmund at develogen.com Sun Apr 27 11:18:58 2003
From: siegmund at develogen.com (Thomas Siegmund)
Date: Sun, 27 Apr 2003 17:18:58 +0200
Subject: kaptain
In-Reply-To: <20030423114239.6054.qmail@web20514.mail.yahoo.com>
References: <20030423114239.6054.qmail@web20514.mail.yahoo.com>
Message-ID: <200304271718.58161.siegmund@develogen.com>

Dear Calvin,

I have not seen this message with kaptain, but I remember people saw it on RedHat systems with koffice. This was a problem with the location of the koffice binaries. Please try to search the KDE mailing list archive ( http://lists.kde.org/ ) for 'mutex destroy'. Or you could ask Terék Zsolt, the author of kaptain, directly. Maybe he has an idea.

Thomas

On Wednesday, 23 April 2003 13:42, calvin wang wrote:
> I have just installed kaptain but I can not run it...
> kaptain --version gives me an error msg, and I can not
> run any emboss program via kaptain.
> bash-2.05b# kaptain --version
> kaptain 0.71
> Copyright (C) 2000-2002 Terék Zsolt
>
> Mutex destroy failure: Device or resource busy
>
> I assume it is kaptain wossname for example... well
> but anyway is that msg usual after kaptain --version?

--
Thomas Siegmund, Ph.D.
DeveloGen AG
Bioinformatics and Data Management
Phone: +49(551) 505 58 651

From ablavier at wanadoo.fr Sun Apr 27 16:07:11 2003
From: ablavier at wanadoo.fr (André Blavier)
Date: Sun, 27 Apr 2003 22:07:11 +0200
Subject: EMBOSS for Windows: new build
Message-ID: <001901c30cf8$97f7ad80$0100a8c0@bach>

Most EMBOSS applications are now available in my Windows distribution. Visit http://perso.wanadoo.fr/ablavier/embosswin/embosswin.html

I have also added technical information about the way I produce native Windows EMBOSS programs, along with a source code distribution.

--
André Blavier

From kvddrift at earthlink.net Mon Apr 28 12:11:50 2003
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Mon, 28 Apr 2003 12:11:50 -0400
Subject: database info
Message-ID:

Hi,

I want to try out some EMBOSS programs that use database searching. From the docs it's not clear to me if and how I can search an online database. My disk space is limited, so I can't download a complete database to my HD.

Or maybe there is some small database available somewhere that I can use?

thanks,

- Koen.

From pmr at ebi.ac.uk Mon Apr 28 12:50:03 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 28 Apr 2003 17:50:03 +0100
Subject: database info
References:
Message-ID: <3EAD5BBB.9050006@ebi.ac.uk>

Koen van der Drift wrote:
> I want to try out some EMBOSS programs that use database searching. From
> the docs it's not clear to me if and how I can search an online
> database.
My disk space is limited, so I can't download a complete
> database to my HD.

You can point the database definitions to a remote SRS server (srs.ebi.ac.uk usually) using access method SRSWWW.

For example:

DB genbank [ type: N
   method: srswww format: genbank release: NCBI
   comment: "Genbank from NCBI"
   url: "http://cbr-rbc.nrc-cnrc.gc.ca/srs6bin/cgi-bin/wgetz"
]
DB embl [ type: N
   method: srswww format: embl release: EBI
   comment: "EMBL from EBI"
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
]

> Or maybe there is some small database available somwhere that I can use?

Yes ... there are databases defined in emboss.default.template with names tembl, tsw, and so on. These are small subsets of EMBL, SwissProt, etc. that we use for testing. The general principle is that any sequence used in the program documentation should appear there.

You can uncomment them, and set the value of emboss_tempdata to point to the share/EMBOSS/test directory where EMBOSS is installed (or the test directory in the source tree).

Hope this helps,

Peter Rice

From gbottu at ben.vub.ac.be Mon Apr 28 14:07:57 2003
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Mon, 28 Apr 2003 20:07:57 +0200 (CEST)
Subject: problem with mse
Message-ID: <200304281807.h3SI7v3p1213597@black.vub.ac.be>

from : BEN

Dear colleagues,

I am using EMBOSS 2.6 under Compaq Tru64 5.1A. I was at first happy to see that mse could be started (the previous installation made the terminal hang), but when I tried to use it I ran into trouble. Indeed, it is impossible to save a multiple sequence alignment. The command "exit" does nothing, and the command "write" saves only one sequence into a file. Am I missing something?
Sincerely, Guy Bottu From kvddrift at earthlink.net Mon Apr 28 15:02:46 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 28 Apr 2003 15:02:46 -0400 Subject: database info In-Reply-To: <3EAD5BBB.9050006@ebi.ac.uk> References: <3EAD5BBB.9050006@ebi.ac.uk> Message-ID: At 17:50 +0100 4/28/03, Peter Rice wrote: >DB genbank [ type: N > method: srswww format: genbank release: NCBI > comment: "Genbank from NCBI" > url: "http://cbr-rbc.nrc-cnrc.gc.ca/srs6bin/cgi-bin/wgetz" > ] >DB embl [ type: N > method: srswww format: embl release: EBI > comment: "EMBL from EBI" > url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz" > ] > > I added these entries to .embossrc, and they then indeed show up when I run showdb. Following the example in the tutorial (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Tutorial/node11.html), I now run seqret, but get the following error: Input sequence(s): embl:xlrhodop Error: Unable to read sequence 'embl:xlrhodop' Do I need to do something else before I can use seqret (and other programs)? Is there a place in the docs on how to use/access databases? thanks again, - Koen. From burke at airmail.net Mon Apr 28 15:17:09 2003 From: burke at airmail.net (Burke Squires) Date: Mon, 28 Apr 2003 14:17:09 -0500 Subject: Extracting protein translation with extractfeat Message-ID: Hello, I would like to extract the protein translation for genes in a Genbank file. I need the name to also be extracted so I must use extractfeat but I am not sure what of the myriad of option to put together to get the amino acid sequence out with the name. The Example lists a mod_res but nothing else. Got any ideas? Thanks, Burke -- Burke Squires Bioinformatics MacroGenics, Inc. 
Dallas, TX From Marc.Logghe at devgen.com Mon Apr 28 15:57:55 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Mon, 28 Apr 2003 21:57:55 +0200 Subject: database info Message-ID: Hi Koen, You probably also have to set the proxy in your .embossrc file: SET emboss_proxy "yourproxy:80" Marc > -----Original Message----- > From: Koen van der Drift [mailto:kvddrift at earthlink.net] > Sent: Monday, April 28, 2003 9:03 PM > To: Peter Rice > Cc: emboss at embnet.org > Subject: Re: database info > > > At 17:50 +0100 4/28/03, Peter Rice wrote: > > >DB genbank [ type: N > > method: srswww format: genbank release: NCBI > > comment: "Genbank from NCBI" > > url: "http://cbr-rbc.nrc-cnrc.gc.ca/srs6bin/cgi-bin/wgetz" > > ] > >DB embl [ type: N > > method: srswww format: embl release: EBI > > comment: "EMBL from EBI" > > url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz" > > ] > > > > > > > I added these entries to .embossrc, and they then indeed show up when > I run showdb. > > Following the example in the tutorial > (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Tutorial/node11.html), > I now run seqret, but get the following error: > > Input sequence(s): embl:xlrhodop > Error: Unable to read sequence 'embl:xlrhodop' > > > Do I need to do something else before I can use seqret (and > other programs)? > > Is there a place in the docs on how to use/access databases? > > > > thanks again, > > - Koen. > From kvddrift at earthlink.net Mon Apr 28 16:19:30 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 28 Apr 2003 16:19:30 -0400 Subject: database info In-Reply-To: References: Message-ID: At 21:57 +0200 4/28/03, Marc Logghe wrote: >Hi Koen, >You probably also have to set the proxy in your .embossrc file: >SET emboss_proxy "yourproxy:80" >Marc Thanks Marc, That doesn't seem to work - I never use a proxy anyway, so I guess the problem must be something else. I also tried Pauls 2nd suggestion, by using the smaller databases that are in the EMBOSS package. 
According to the docs, I added

SET emboss_tempdata /sw/share/EMBOSS/test

to emboss.default.template (in my case on Mac OS X - this is the correct location of test). When I now run showdb, there are no databases listed. Did I miss something else?

thanks again,

- Koen.

From Marc.Logghe at devgen.com Mon Apr 28 16:36:42 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Mon, 28 Apr 2003 22:36:42 +0200
Subject: some seqret questions
Message-ID:

Hi all,

I was wondering whether there was a way (preferably emboss, but other suggestions are also welcome) to fetch a native sequence record. After some seqret experiments I realized a number of tags are discarded. For instance, I don't get the reference fields back from genbank records, nor the cross-reference fields from swissprot, ipi, etc. (used methods: srswww). Is this also the case when genbank has been 'dbiflatted'?

Another thing. Is there a global variable like emboss_feature to switch on the -feature option by default?

Regards,

Marc

From kvddrift at earthlink.net Mon Apr 28 19:22:02 2003
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Mon, 28 Apr 2003 19:22:02 -0400
Subject: database info
In-Reply-To:
References:
Message-ID:

At 4:19 PM -0400 4/28/03, Koen van der Drift wrote:
>
>to emboss.default.template

Duh - that should be emboss.default

Now it works. Sorry for the noise ;)

- Koen.

From ivahser at i.com.ua Tue Apr 29 01:20:18 2003
From: ivahser at i.com.ua (Sergiy Ivakhno)
Date: Tue, 29 Apr 2003 08:20:18 +0300
Subject: Briefings in Bioinformatics
References:
Message-ID: <001001c30e0f$07c03080$75f909d4@bioinformatics>

Dear friends.

This is more a desperate appeal for help than a question. The problem is that I urgently need two articles from the journal Briefings in Bioinformatics. I have tried almost everything: I have searched all libraries and have written to the authors to ask for any kind of reprints.
If somebody holds a subscription to Briefings in Bioinformatics and has access to the online archive of this journal at Ingenta, I would be very grateful if you could send me PDF versions of these articles. The URLs are given below.

A comparison of microarray databases
Gardiner-Garden M.; Littlejohn T.G.
Briefings in Bioinformatics, May 2001, vol. 2, no. 2, pp. 143-158(16)
http://www.ingenta.com/isis/searching/Availability/ingenta?pub=infobike://hsp/bib/2001/00000002/00000002/art00004&targetId=1051380412858&WebLogicSession=PbDvzgxLljrAkAbKboZ7|2603703981800765476/-1052814329/6/7051/7051/7052/7052/7051/-1

A review of bioinformatics education in the UK
Counsell D.
Briefings in Bioinformatics, March 2003, vol. 4, no. 1, pp. 7-21(15)
http://www.ingenta.com/isis/searching/Availability/ingenta?pub=infobike://hsp/bib/2003/00000004/00000001/art00002&targetId=1051380234757&WebLogicSession=PbDvzgxLljrAkAbKboZ7|2603703981800765476/-1052814329/6/7051/7051/7052/7052/7051/-1

Thank you in advance. My email is ivahser at i.com.ua.

Sergiy.

From simon.andrews at bbsrc.ac.uk Tue Apr 29 03:22:18 2003
From: simon.andrews at bbsrc.ac.uk (simon andrews (BI))
Date: Tue, 29 Apr 2003 08:22:18 +0100
Subject: some seqret questions
Message-ID: <2DC41140A89ED411989D00508BDCD9ED01E289C2@bi-exsrv1.iapc.bbsrc.ac.uk>

> -----Original Message-----
> From: Marc Logghe [mailto:Marc.Logghe at devgen.com]
> Sent: 28 April 2003 21:37
> To: Emboss (E-mail)
> Subject: some seqret questions
>
>
> Hi all,
> I was wondering whether there was a way (preferably emboss, but other
> suggestions are also welcome) to fetch a native sequence record.
tfm entret Hope this helps Simon, PS I spent ages playing with seqret options before I was pointed to this :-) From arunanirudhan at yahoo.co.in Tue Apr 29 03:26:22 2003 From: arunanirudhan at yahoo.co.in (=?iso-8859-1?q?arun=20anirudhan?=) Date: Tue, 29 Apr 2003 08:26:22 +0100 (BST) Subject: database info In-Reply-To: <3EAD5BBB.9050006@ebi.ac.uk> Message-ID: <20030429072622.9105.qmail@web8206.mail.in.yahoo.com> Thank you very much for your valuable suggestion. I have one problem now. seqret embl:L07770 is giving result from embl website. But is there a way to search embl site via a command like this seqret embl:insulin* This is giving an error message Die seqret terminated: Bad value for option [sequence] and no prompt. Please help Arun --- Peter Rice wrote: > Koen van der Drift wrote: > > I want to try out some EMBOSS programs that use > database searching. From > > the docs it's not clear to me if and how I can > search an online > > database. My diskspace is limited, so I can't > download a complete > > database to my HD. > > You can point the database definitions to a remote > SRS server > (srs.ebi.ac.uk usually) using access method SRSWWW > > For example: > > DB genbank [ type: N > method: srswww format: genbank release: NCBI > comment: "Genbank from NCBI" > url: > "http://cbr-rbc.nrc-cnrc.gc.ca/srs6bin/cgi-bin/wgetz" > ] > DB embl [ type: N > method: srswww format: embl release: EBI > comment: "EMBL from EBI" > url: > "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz" > ] > > > > Or maybe there is some small database available > somwhere that I can use? > > Yes ... there are databases defined in > emboss.default.template with > names tembl, tsw, and so on. These are small subsets > of EMBL, SwissProt, > etc. that we use for testing. The general principle > is that any sequence > used in the program documentation should appear > there. 
> > You can uncomment them, and set the value of > emboss_tempdata to point to > the share/EMBOSS/test directory where EMBOSS is > installed (or the test > directory in the source tree). > > Hope this helps, > > Peter Rice > ________________________________________________________________________ Missed your favourite TV serial last night? Try the new, Yahoo! TV. visit http://in.tv.yahoo.com From Marc.Logghe at devgen.com Tue Apr 29 03:47:00 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue, 29 Apr 2003 09:47:00 +0200 Subject: some seqret questions Message-ID: Simon ! Thanks a lot. I think I was just repeating your seqret playing ;-) Entret is it !!! > -----Original Message----- > From: simon andrews (BI) [mailto:simon.andrews at bbsrc.ac.uk] > Sent: Tuesday, April 29, 2003 9:22 AM > To: Emboss (E-mail) > Subject: RE: some seqret questions > > > > > > -----Original Message----- > > From: Marc Logghe [mailto:Marc.Logghe at devgen.com] > > Sent: 28 April 2003 21:37 > > To: Emboss (E-mail) > > Subject: some seqret questions > > > > > > Hi all, > > I was wondering whether there was a way (preferably emboss, > but other > > suggestions are also welcome) to fetch a native sequence record. > > > tfm entret > > Hope this helps > > Simon, > > PS I spent ages playing with seqret options before I was > pointed to this :-) > From gwilliam at hgmp.mrc.ac.uk Tue Apr 29 04:22:44 2003 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Tue, 29 Apr 2003 09:22:44 +0100 Subject: Extracting protein translation with extractfeat References: Message-ID: <3EAE3654.D357DC29@hgmp.mrc.ac.uk> Try using 'coderet -nocds -nomrna'. Gary Burke Squires wrote: > > Hello, > > I would like to extract the protein translation for genes in a Genbank file. > I need the name to also be extracted so I must use extractfeat but I am not > sure what of the myriad of option to put together to get the amino acid > sequence out with the name. 
The Example lists a mod_res but nothing else. > Got any ideas? > > Thanks, > > Burke > > -- > Burke Squires > Bioinformatics > MacroGenics, Inc. > Dallas, TX -- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at rfcgr.mrc.ac.uk http://www.rfcgr.mrc.ac.uk/ Bioinformatics, MRC RFCGR, Hinxton, Cambridge, CB10 1SB, UK From Marc.Logghe at devgen.com Tue Apr 29 06:27:39 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue, 29 Apr 2003 12:27:39 +0200 Subject: entret bug or feature ? Message-ID: Hi all, Thanks to Simon Andrews I had a new toy to play with: entret. And I have already a question about that: why is a seqret returning something and entret not. I tried the following with a local ipi database, which is actually a blast database: seqret ipi:ENSP00000289136 -debug and entret ipi:ENSP00000289136 -debug Like I mentioned I don't get anything back from entret. Why is that ? I'll include both debug files <> <> In emboss.default the ipi database is defined as: DB ipi [ type: P format: ncbi method: app app: "fastacmd -d ipi -s %s" ] Thanks, ML *********************************************************** Marc Logghe, Ph.D. Senior Scientist Scientific Computing Group deVGen Technologiepark 9 9052 Zwijnaarde Belgium tel: +32 (0) 9 324 24 88 fax: +32 (0) 9 324 24 25 *********************************************************** -------------- next part -------------- A non-text attachment was scrubbed... Name: seqret.dbg Type: application/octet-stream Size: 9517 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/emboss/attachments/20030429/98bc181c/attachment.obj -------------- next part -------------- A non-text attachment was scrubbed... 
Name: entret.dbg Type: application/octet-stream Size: 8704 bytes Desc: not available Url : http://lists.open-bio.org/pipermail/emboss/attachments/20030429/98bc181c/attachment-0001.obj From pmr at ebi.ac.uk Tue Apr 29 08:58:42 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 29 Apr 2003 13:58:42 +0100 Subject: entret bug or feature ? References: Message-ID: <3EAE7702.60902@ebi.ac.uk> Marc Logghe wrote: > Hi all, > Thanks to Simon Andrews I had a new toy to play with: entret. > And I have already a question about that: why is a seqret returning > something and entret not. > In emboss.default the ipi database is defined as: > DB ipi [ > type: P > format: ncbi > method: app > app: "fastacmd -d ipi -s %s" > ] NCBI format is not saving the text (because originally we expected NCBI to be a file format, not a database format). Somehow it still does not save the text properly, and some of the other sequence formats are not happy in entret. I will add it for the next release. Thanks for pointing this out. Peter Rice From kvddrift at earthlink.net Tue Apr 29 10:26:21 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 29 Apr 2003 10:26:21 -0400 Subject: database info In-Reply-To: References: Message-ID: Hi, I am trying some EMBOSS programs out on Mac OS X, and it works very nice with the kaptain-GUI. Which EMBOSS programs should I use to list all entries in one database (either offline or online). Many times I try something by wild guessing, I get an error 'entry not in database'. thanks, - Koen. From gwilliam at hgmp.mrc.ac.uk Tue Apr 29 10:29:51 2003 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Tue, 29 Apr 2003 15:29:51 +0100 Subject: database info References: Message-ID: <3EAE8C5F.8DF21874@hgmp.mrc.ac.uk> infoseq Koen van der Drift wrote: > > Hi, > > I am trying some EMBOSS programs out on Mac OS X, and it works very > nice with the kaptain-GUI. 
> > Which EMBOSS programs should I use to list all entries in one > database (either offline or online). Many times I try something by > wild guessing, I get an error 'entry not in database'. > > thanks, > > - Koen. -- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at rfcgr.mrc.ac.uk http://www.rfcgr.mrc.ac.uk/ Bioinformatics, MRC RFCGR, Hinxton, Cambridge, CB10 1SB, UK From hchen at genetics.ac.cn Tue Apr 29 10:32:18 2003 From: hchen at genetics.ac.cn (Chen Hao) Date: 29 Apr 2003 22:32:18 +0800 Subject: is there any new tutorial for the last version EMBOSS ? Message-ID: <1051626742.3466.49.camel@localhost> Hi all, I'd like to know if there is any new tutorial for last version EMBOSS . if you know , point me out ,or if you have a ps / pdf format file ,do me the favor to e-mail my address (hchen at genetics.ac.cn) Thank you very much! Motata From d.m.a.martin at dundee.ac.uk Tue Apr 29 10:35:19 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Tue, 29 Apr 2003 15:35:19 +0100 Subject: is there any new tutorial for the last version EMBOSS ? In-Reply-To: <1051626742.3466.49.camel@localhost> Message-ID: On 29/4/03 3:32 pm, "Chen Hao" wrote: > Hi all, > > I'd like to know if there is any new tutorial for last version EMBOSS . > if you know , point me out ,or if you have a ps / pdf format file ,do me > the favor to e-mail my address (hchen at genetics.ac.cn) > I added some bits to the tutorial (report formats etc.) and rewrote a few things. Lisa Mullen at HGMP is probably the person to ask as I have passed all my modifications back to her for eventual dissemination ..d > Thank you very much! > > Motata > > > -- David Martin PhD Bioinformatics Scientific Officer Post-Genomics and Molecular Interactions Centre University of Dundee From jison at hgmp.mrc.ac.uk Tue Apr 29 10:38:40 2003 From: jison at hgmp.mrc.ac.uk (Dr J.C. Ison) Date: Tue, 29 Apr 2003 15:38:40 +0100 Subject: is there any new tutorial for the last version EMBOSS ? 
References: <1051626742.3466.49.camel@localhost> Message-ID: <3EAE8E70.238EABE1@hgmp.mrc.ac.uk> Hi Motata An EMBOSS programming course is available on-line: http://www.hgmp.mrc.ac.uk/CCP11/CCP11courses/EMBOSS-Course/emboss_index.html and that covers many of the basic concepts in using emboss too. If you're interested in more user-level documentation, what we have is available here: http://www.hgmp.mrc.ac.uk/Software/EMBOSS/userdoc.html The tutorial itself is a bit out of date, but the other documents are current. Cheers J. Chen Hao wrote: > Hi all, > > I'd like to know if there is any new tutorial for last version EMBOSS . > if you know , point me out ,or if you have a ps / pdf format file ,do me > the favor to e-mail my address (hchen at genetics.ac.cn) > > Thank you very much! > > Motata -- Jon C. Ison, PhD Bioinformatics Applications Group UK MRC Human Genome Mapping Project Resource Centre Hinxton, Cambridge, CB10 1SB, UK E-mail : jison at hgmp.mrc.ac.uk Tel : 01223 49-4548 HGMP-RC: http://www.hgmp.mrc.ac.uk/ EMBOSS : http://www.hgmp.mrc.ac.uk/Software/EMBOSS/ From kvddrift at earthlink.net Tue Apr 29 10:38:43 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 29 Apr 2003 10:38:43 -0400 Subject: database info In-Reply-To: <3EAE8C5F.8DF21874@hgmp.mrc.ac.uk> References: <3EAE8C5F.8DF21874@hgmp.mrc.ac.uk> Message-ID: At 15:29 +0100 4/29/03, Gary Williams, Tel 01223 494522 wrote: >infoseq This only gives the info of one sequence. What I was looking for is a program that lists all sequences/entries in a database, so I don't have to type in sw:foo, sw:bar until I get a valid entry. thanks, - Koen. From Marc.Logghe at devgen.com Tue Apr 29 10:51:01 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue, 29 Apr 2003 16:51:01 +0200 Subject: database info Message-ID: Koen, is it only to obtain one way or another a valid ID for a sequence from your database ? If that is the case, you can e.g. 
ask for the first entry in the database like this: seqret sw -firstonly I did not try it with remote databases; it should work with a local database. HTH, ML > -----Original Message----- > From: Koen van der Drift [mailto:kvddrift at earthlink.net] > Sent: Tuesday, April 29, 2003 4:39 PM > To: Gary Williams, Tel 01223 494522 > Cc: emboss at embnet.org > Subject: Re: database info > > > At 15:29 +0100 4/29/03, Gary Williams, Tel 01223 494522 wrote: > > >infoseq > > > This only gives the info of one sequence. What I was looking for is a > program that lists all sequences/entries in a database, so I don't > have to type in sw:foo, sw:bar until I get a valid entry. > > thanks, > > - Koen. > From d.m.a.martin at dundee.ac.uk Tue Apr 29 10:59:43 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Tue, 29 Apr 2003 15:59:43 +0100 Subject: database info In-Reply-To: Message-ID: >> -----Original Message----- >> From: Koen van der Drift [mailto:kvddrift at earthlink.net] >> Sent: Tuesday, April 29, 2003 4:39 PM >> To: Gary Williams, Tel 01223 494522 >> Cc: emboss at embnet.org >> Subject: Re: database info >> >> >> At 15:29 +0100 4/29/03, Gary Williams, Tel 01223 494522 wrote: >> >>> infoseq >> >> >> This only gives the info of one sequence. What I was looking for is a >> program that lists all sequences/entries in a database, so I don't >> have to type in sw:foo, sw:bar until I get a valid entry. infoseq sw:\* -stdout -auto |more will give you a very long list of sequences. I don't know why you are only getting reports for one sequence. 
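For anyone scripting the wildcard query above, here is a minimal sketch, assuming Python 3 and an EMBOSS installation on the PATH. The helper names `infoseq_command` and `list_entries` are made up for illustration; only the `infoseq` options already shown in this thread (`-stdout`, `-auto`) are used. It also shows why the backslash in `sw:\*` is only needed at a shell prompt: when the arguments are passed as a list, no shell expands the asterisk.

```python
import shutil
import subprocess
from typing import List


def infoseq_command(database: str) -> List[str]:
    """Build the argument list for listing every entry of one database.

    Hypothetical helper: "db:*" is the wildcard query from the message
    above. No shell is involved, so the asterisk needs no escaping.
    """
    return ["infoseq", f"{database}:*", "-stdout", "-auto"]


def list_entries(database: str) -> str:
    """Run infoseq and return its raw text output (needs EMBOSS installed)."""
    if shutil.which("infoseq") is None:
        raise RuntimeError("infoseq not found on PATH; is EMBOSS installed?")
    result = subprocess.run(
        infoseq_command(database),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; call list_entries("tsw") to actually run it.
    print(" ".join(infoseq_command("tsw")))
```

For a large database the output is very long, so in practice one would read only the first few lines to pick a valid entry name.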
..d

--
David Martin PhD
Bioinformatics Scientific Officer
Post-Genomics and Molecular Interactions Centre
University of Dundee

From kvddrift at earthlink.net Tue Apr 29 11:04:18 2003
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Tue, 29 Apr 2003 11:04:18 -0400
Subject: database info
In-Reply-To:
References:
Message-ID:

At 15:59 +0100 4/29/03, David Martin wrote:

>infoseq sw:\* -stdout -auto |more will give you a very long list of
>sequences.
>
>I don't know why you are only getting reports for one sequence.
>

Thanks, everyone, for the responses. I just needed a valid entry name/ID
that I can use to test out some EMBOSS programs. Marc, your suggestion
indeed only works with local databases, e.g.:

seqret tsw -firstonly

Thanks again,

- Koen.

From matamban at psc.edu Tue Apr 29 11:11:28 2003
From: matamban at psc.edu (Tendai Matambanadzo)
Date: Tue, 29 Apr 2003 11:11:28 -0400 (EDT)
Subject: discussion groups
Message-ID:

I was wondering if there is a forum board which will help me install the
EMBOSS application program. I seem to be having several problems in the
installation. Thank you for your help.

Tendai

From hchen at genetics.ac.cn Tue Apr 29 11:17:35 2003
From: hchen at genetics.ac.cn (Chen Hao)
Date: 29 Apr 2003 23:17:35 +0800
Subject: is there any new tutorial for the last version EMBOSS ?
In-Reply-To: <3EAE8E70.238EABE1@hgmp.mrc.ac.uk>
References: <1051626742.3466.49.camel@localhost> <3EAE8E70.238EABE1@hgmp.mrc.ac.uk>
Message-ID: <1051629459.3466.59.camel@localhost>

Hi Dr J.C. Ison,

Thank you very much! The online course is very helpful for me.

Sincerely,
Motata

On 2003-04-29 at 22:38, Dr J.C. Ison wrote:
> Hi Motata
>
> An EMBOSS programming course is available on-line:
> http://www.hgmp.mrc.ac.uk/CCP11/CCP11courses/EMBOSS-Course/emboss_index.html
>
> and that covers many of the basic concepts in using emboss too.
> > If you're interested in more user-level documentation, what we have is > available here: > http://www.hgmp.mrc.ac.uk/Software/EMBOSS/userdoc.html > > The tutorial itself is a bit out of date, but the other documents are > current. > > Cheers > > J. > > Chen Hao wrote: > > > Hi all, > > > > I'd like to know if there is any new tutorial for last version EMBOSS . > > if you know , point me out ,or if you have a ps / pdf format file ,do me > > the favor to e-mail my address (hchen at genetics.ac.cn) > > > > Thank you very much! > > > > Motata > > -- > Jon C. Ison, PhD > Bioinformatics Applications Group > UK MRC Human Genome Mapping Project Resource Centre > Hinxton, Cambridge, CB10 1SB, UK > E-mail : jison at hgmp.mrc.ac.uk > Tel : 01223 49-4548 > HGMP-RC: http://www.hgmp.mrc.ac.uk/ > EMBOSS : http://www.hgmp.mrc.ac.uk/Software/EMBOSS/ > > > From jison at hgmp.mrc.ac.uk Tue Apr 29 11:20:41 2003 From: jison at hgmp.mrc.ac.uk (Dr J.C. Ison) Date: Tue, 29 Apr 2003 16:20:41 +0100 Subject: discussion groups References: Message-ID: <3EAE9849.8DE27F76@hgmp.mrc.ac.uk> Tendai - There is no forum other than these lists, but you should have a look here ... http://www.hgmp.mrc.ac.uk/Software/EMBOSS/admin.html In particular, David Martin's EMBOSS administration guide. Cheers J. Tendai Matambanadzo wrote: > I was wondering if there is a forum board which will help me install the > EMBOSS application program.I seem to be having several problems in the > installation.Thank you for you help > > Tendai -- Jon C. 
Ison, PhD Bioinformatics Applications Group UK MRC Human Genome Mapping Project Resource Centre Hinxton, Cambridge, CB10 1SB, UK E-mail : jison at hgmp.mrc.ac.uk Tel : 01223 49-4548 HGMP-RC: http://www.hgmp.mrc.ac.uk/ EMBOSS : http://www.hgmp.mrc.ac.uk/Software/EMBOSS/ From d.m.a.martin at dundee.ac.uk Tue Apr 29 11:23:22 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Tue, 29 Apr 2003 16:23:22 +0100 Subject: discussion groups In-Reply-To: <3EAE9849.8DE27F76@hgmp.mrc.ac.uk> Message-ID: On 29/4/03 4:20 pm, "Dr J.C. Ison" wrote: > Tendai - > > There is no forum other than these lists, but you should have > a look here ... > http://www.hgmp.mrc.ac.uk/Software/EMBOSS/admin.html > > In particular, David Martin's EMBOSS administration guide. This is somewhat out of date but most of the information is still valid. ..d > > Cheers > > J. > > > Tendai Matambanadzo wrote: > >> I was wondering if there is a forum board which will help me install the >> EMBOSS application program.I seem to be having several problems in the >> installation.Thank you for you help >> >> Tendai > > -- > Jon C. 
Ison, PhD > Bioinformatics Applications Group > UK MRC Human Genome Mapping Project Resource Centre > Hinxton, Cambridge, CB10 1SB, UK > E-mail : jison at hgmp.mrc.ac.uk > Tel : 01223 49-4548 > HGMP-RC: http://www.hgmp.mrc.ac.uk/ > EMBOSS : http://www.hgmp.mrc.ac.uk/Software/EMBOSS/ > > > -- David Martin PhD Bioinformatics Scientific Officer Post-Genomics and Molecular Interactions Centre University of Dundee From pmr at ebi.ac.uk Tue Apr 29 11:47:34 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 29 Apr 2003 16:47:34 +0100 Subject: discussion groups References: Message-ID: <3EAE9E96.6000605@ebi.ac.uk> Tendai Matambanadzo wrote: > I was wondering if there is a forum board which will help me install the > EMBOSS application program.I seem to be having several problems in the > installation.Thank you for you help For installation problems, mail emboss-bug at embnet.org The Admin Guide mentioned in other replies has been updated. There is a more recent doc/manuals/admin.tex version (August 2002) in the EMBOSS distribution. Can someone please update the HTML version at http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Admin_guide/adminguide/ Hope this helps, Peter Rice From Marc.Logghe at devgen.com Tue Apr 29 16:48:39 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue, 29 Apr 2003 22:48:39 +0200 Subject: how to get the native format of DB Message-ID: Hi all, Another question. Is there a way to find out what the native sequence format of a database is ? Something like 'showdb -only -format tsw' would be useful for instance. I was just thinking about the possibility of using emboss as a generic tool to fetch sequences in Bioperl. But, in order to create the sequence objects in Bioperl, you need to know the format you are dealing with. Therefore I was looking for a way to retrieve that information, without writing 'yet another config parser'. Any suggestions ? Marc *********************************************************** Marc Logghe, Ph.D. 
Senior Scientist Scientific Computing Group deVGen Technologiepark 9 9052 Zwijnaarde Belgium tel: +32 (0) 9 324 24 88 fax: +32 (0) 9 324 24 25 *********************************************************** From arunanirudhan at yahoo.co.in Wed Apr 30 06:20:41 2003 From: arunanirudhan at yahoo.co.in (=?iso-8859-1?q?arun=20anirudhan?=) Date: Wed, 30 Apr 2003 11:20:41 +0100 (BST) Subject: blast or fasta with emboss Message-ID: <20030430102041.76056.qmail@web8207.mail.in.yahoo.com> Hello I have downloaded all the databases locally to my server. Is there any emboss progarm by which we can do a blast or fasta search to the databases installed in my server with a sequence in a text file. With regards Arun ________________________________________________________________________ Missed your favourite TV serial last night? Try the new, Yahoo! TV. visit http://in.tv.yahoo.com From d.m.a.martin at dundee.ac.uk Wed Apr 30 06:24:34 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Wed, 30 Apr 2003 11:24:34 +0100 Subject: blast or fasta with emboss In-Reply-To: <20030430102041.76056.qmail@web8207.mail.in.yahoo.com> Message-ID: On 30/4/03 11:20 am, "arun anirudhan" wrote: > Hello > I have downloaded all the databases locally to my > server. Is there any emboss progarm by which we can do > a blast or fasta search to the databases installed in > my server with a sequence in a text file. You will have to install the BLAST or FASTA software separately. EMBOSS can help in providing the right formats, but doesn't incorporate these packages directly. ..d > With regards > Arun > > ________________________________________________________________________ > Missed your favourite TV serial last night? Try the new, Yahoo! TV. 
> visit http://in.tv.yahoo.com
>

--
David Martin PhD
Bioinformatics Scientific Officer
Post-Genomics and Molecular Interactions Centre
University of Dundee

From gbottu at ben.vub.ac.be Wed Apr 30 06:49:01 2003
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Wed, 30 Apr 2003 12:49:01 +0200 (CEST)
Subject: blast or fasta with emboss
Message-ID: <200304301049.h3UAn18J1259242@black.vub.ac.be>

from : BEN

Dear Arun,

At the BEN site we have installed the BLAST and fastA packages and then
written EMBOSS "wrapper" programs to run them from EMBOSS (just like
CLUSTAL and Primer3, for which there are "wrappers" within the EMBOSS
distribution). Because I was in a hurry to offer BLAST/fastA to our users
as fast as possible, I did not make the effort to make the wrappers
readily portable (they contain absolute links, ...). I sent a copy of our
programs to Martin Sarachu from the Argentinian EMBnet node and Jose
Valverde from the Spanish EMBnet node. They were going to do the porting.
However, I have not yet heard how they fared.

Guy Bottu

From areagp61 at yahoo.it Wed Apr 30 07:19:21 2003
From: areagp61 at yahoo.it (Graziano P.)
Date: Wed, 30 Apr 2003 13:19:21 +0200
Subject: ednadist warning
Message-ID: <002a01c30f0a$5c37db80$18105709@italy.ibm.com>

Hi all,

I am analyzing EMBASSY programs. I have to construct a phylogenetic tree
starting from a multiple alignment of DNA sequences. I have performed a
bootstrap simulation by means of the "eseqboot" program. The output
consists of 100 resamples of the initial multiple alignment. At this point
I want to calculate distance matrices for each bootstrap resample using the
"ednadist" program, then use "eneighbor" and finally "econsense" to
calculate the consensus tree.

However, when I try to use the output of eseqboot as input for ednadist
(as suggested in the seqboot documentation: "... you would run SEQBOOT,
then run DNADIST using the output of SEQBOOT as its input, then run
NEIGHBOR using the output of DNADIST as its input, and then run CONSENSE
using the tree file from NEIGHBOR as its input"), it returns a warning:

> ednadist eseqboot.outfile
Nucleic acid sequence Distance Matrix program
Warning: seqReadPhylip 14 sequences partly read at end
Output file [ednadist.outfile]:
Distance methods
    Kimura : Kimura 2-parameter distance
    JinNei : Jin and Nei distance
        ML : Maximum Likelihood distance
     Jukes : Jukes-Cantor distance
Choose the method to use [Kimura]:
Transition/transversion ratio [2.0]:
Form of distance matrix
         S : Square
         L : Lower-triangular
Form [S]:
Kimura

What does this warning mean? The output file contains only one distance
matrix. My goal is to obtain a file containing 100 distance matrices, one
for each bootstrap resample; is that possible? Could this be a limitation
of the ednadist algorithm?

Regards
Graziano

From fernan at iib.unsam.edu.ar Wed Apr 30 14:19:45 2003
From: fernan at iib.unsam.edu.ar (Fernan Aguero)
Date: Wed, 30 Apr 2003 15:19:45 -0300
Subject: Preferred isoschizomer ?
In-Reply-To: <200304141817.h3EIHs410930@bromine.hgmp.mrc.ac.uk>
References: <200304141817.h3EIHs410930@bromine.hgmp.mrc.ac.uk>
Message-ID: <20030430181945.GD3138@iib.unsam.edu.ar>

Sorry to insist on this point. I am now also suffering from this
behaviour ...

+----[ ableasby at hgmp.mrc.ac.uk (14.Apr.2003 15:28):
|
| 1) If your colleague had explicitly said -enzymes psti
| on the command line (or equivalent GUI) then it would
| be found. The output would be overly verbose if all
| isoschizomers are reported so as a compromise it reports
| only one.

Right, if you ask for PstI, you would get PstI and not any other
isoschizomer. And I also agree with the compromise of reporting only one
isoschizomer.
The problem is deciding which one to report.

| 2) If you take the emboss files from the REBASE (NEB) distro
| then, after renaming and putting them in data/REBASE, it will
| probably report PstI (haven't tried it).

I don't know exactly what you mean by 'take the emboss files from the
REBASE distro'. If you mean getting the withrefm file, as explained in the
EMBOSS admin tutorial, that's what I did, and in all the cases I tried I
get BspMAI instead of PstI (this is only one particular case; I expect it
to happen for many other enzymes as well):

restrict                            -> BspMAI
restrict -commercial t              -> BspMAI
restrict -preferred t               -> BspMAI
restrict -commercial t -preferred t -> BspMAI

I've been looking at the withrefm file and, according to the description
of the format provided within the file itself, there is no provision for
'preferred' isoschizomers. At least none is explicitly declared. The fields
for each entry are: name, isoschizomers, sequence, methylation site,
organism, source, commercial provider, references.

So, if you look for 'CTGCAG' (the sequence recognized by PstI), you will
see that it occurs several times. The file is sorted in alphabetical order
by enzyme name. However, the list of isoschizomers does not seem to be in
strict alphabetical order, and it seems that the ordering is trying to
suggest a 'preferred' isoschizomer. Going through the PstI isoschizomers in
alphabetical order, in all the cases I looked at (until I got tired) PstI
is always the first in the list of isoschizomers. So, why is restrict not
using it?

And perhaps a more difficult question to answer, as pointed out before by
Guy Bottu: why is restrict preferring BspMAI over the rest of the
isoschizomers?

| I arranged with NEB
| that they would provide only the 'common' REs in their files.
| I believe this is what some other packages do. Using REBASEEXTRACT
| on the withrefm file gives all the REs.
So, the answer would be that rebaseextract does nothing to mark/tag/select a preferred isoschizomer and instead relies on the withrefm file to contain only 'preferred' isoschizomers? As far as I can see the withrefm file contains all the isoschizomers for each recognition sequence. Taken from the REBASE README: ... ... ... #31. All Enzymes (each w/ ref & isos) withrefm.### ... ... ... #37. EMBOSS emboss_e.###, emboss_r.### emboss_s.### ... ... ... I also checked all the emboss* files (apparently, REBASE already provides the same files that rebaseextract produces?) and they also contain all the isoschizomers, and not a reduced subset. However, if this is the case, what's the use of a '-preferred t/f' option for restrict? There would only be 'preferred' files in the restriction enzyme database accessed by EMBOSS ... | 3) You can equate any reported RE to another by adding an entry | into embossre.equ e.g. | BspMAI PstI And I have to do this myself for all enzymes when -- apparently -- it is all already in the withrefm file? | | HTH I hope this helps to find a solution. In the meantime a hack around this would be to have at hand a file with a list of all commonly used, commercially available enzymes, and use it like this restrict -enzymes @enz.list Such a list of enzymes may be the one containing enzyme prototypes (they are called proto.* at the REBASE site, proto.304 is the current one). I've modified it to use it as a list successfully. A comparison of what happens when one uses withrefm or the proto list does not lead to a rapid conclusion. Using withrefm sometimes gives you a prototype enzyme, even if there are other isoschizomers, and even if they appear first (in alphabetical order). I wasn't able to understand what guides restrict in choosing from the list of available enzymes. Looking at the source code was my next step, but I'm still not knowledgeable enough in C ... 
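To make the embossre.equ idea quoted above concrete, here is a minimal Python sketch of the lookup it implies: a reported enzyme name is swapped for its listed equivalent. This is only an illustration, not how `restrict` itself is implemented. The function names are hypothetical, and the assumptions that each line holds exactly two whitespace-separated names and that `#` starts a comment are mine; only the example entry `BspMAI PstI` comes from the thread.

```python
def parse_equivalences(text):
    """Map each reported enzyme name to its preferred equivalent.

    Assumes one "reported preferred" pair per line (an assumption
    based on the single example entry quoted in this thread).
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore anything that is not a simple pair
        reported, preferred = parts
        table[reported] = preferred
    return table


def preferred_name(name, table):
    """Return the preferred name, or the name itself if no entry exists."""
    return table.get(name, name)


if __name__ == "__main__":
    table = parse_equivalences("BspMAI PstI\n")
    print(preferred_name("BspMAI", table))  # prints PstI
    print(preferred_name("EcoRI", table))   # prints EcoRI (no entry)
```

Names without an entry pass through unchanged, which matches the intent of listing only the enzymes one wants renamed.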
Regards,

Fernan

|
| Alan
|
|
+----]

--
F e r n a n   A g u e r o
http://genoma.unsam.edu.ar/~fernan

From ableasby at hgmp.mrc.ac.uk Wed Apr 30 14:29:58 2003
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Wed, 30 Apr 2003 19:29:58 +0100 (BST)
Subject: Preferred isoschizomer ?
Message-ID: <200304301829.h3UITwG29534@sulphur.hgmp.mrc.ac.uk>

There are replacement files for rebaseextract.c and rebaseextract.acd in
the ftp://ftp.uk.embnet.org/pub/EMBOSS/patchfiles/ directory. By default
this program will now produce an embossre.equ file. Re-extract the withrefm
file using the new program. If you then use the -preferred option to
'restrict' it should behave as you wish.

HTH

Alan Bleasby
HGMP

From kellert at ohsu.edu Wed Apr 30 17:45:26 2003
From: kellert at ohsu.edu (Thomas Keller)
Date: Wed, 30 Apr 2003 14:45:26 -0700
Subject: database problem
Message-ID: <0DC0C708-7B55-11D7-8373-0003930405E2@ohsu.edu>

Greetings,

I just ran dbigcg on the nsf directory mounted on my machine; it contains
the GCG GenBank database. I had to create the index in a different
directory, because the data is read-only. It took about 2.5 hours to finish
indexing, and gave a bunch of warnings about ajStrFixI called with length
2048 for string with size 2048. It also warned about one accession number
that it expected but couldn't find. However, it made the correct files and
they are of substantial size. I seem to have a problem, though:

**************
kellert% seqret
Reads and writes (returns) sequences
Input sequence(s): mygenbank:L42450
Segmentation fault
**************

This sounds like a RAM issue to me, but I have 768 MB on this machine,
which seems adequate to me.
Here's the DB definition in ~/.embossrc: ************** DB mygenbank [ type: N method: gcg format: GenBank fields: "acnum seqvn des keywords taxon" dir: $emboss_db_dir/gcggenbank indexdir: $emboss_indices/gcggenbank file: "*.seq" release: "133.0" comment: "GCG genbank db from dna2 mounted locally: /Volumes/dna2.ohsu.edu" ] *************** Any suggestions? Thanks, Tom K. Thomas J. Keller, Ph.D. Director, MMI Core Facility Oregon Health & Science University 3181 SW Sam Jackson Park Rd. Portland, OR, USA, 97239 http://www.ohsu.edu/core
From oddmund.nordgard at biokjemi.uio.no Wed Apr 2 05:23:37 2003 From: oddmund.nordgard at biokjemi.uio.no (=?ISO-8859-1?Q?Oddmund_Nordg=E5rd?=) Date: Wed, 2 Apr 2003 07:23:37 +0200 (MET DST) Subject: Staden Package In-Reply-To: Message-ID: Sending an email to george.rada at headoffice.mrc.ac.uk resulted in "Delivery failure". Perhaps the address does not exist anymore? Oddmund Nordg?rd On Tue, 1 Apr 2003, David Martin wrote: > many on this list will be users of the Staden Package. This has recently had > close ties with the EMBOSS project. > > It is (IMHO) extremely shortsighted of the MRC to take this approach and I > would urge all those of you who have benefitted from this package to at > least send an email to support Rodger Staden's position and argue for > continued support for the package. > > ..d > > -- > David Martin PhD > Bioinformatics Scientific Officer > Post-Genomics and Molecular Interactions Centre > University of Dundee > > ------ Forwarded Message > From: Staden Package Administrator > Date: Tue, 1 Apr 2003 14:18:04 +0100 (BST) > To: d.m.a.martin at dundee.ac.uk > Subject: Devastating news > > Hello, > > You are on the list of people with licences for our software.
I am > sorry to have to inform you that the Staden Package is no longer > available or supported as the MRC has decided to withdraw our funding. > > Last year I submitted, to MRC, a three year grant proposal of 443k, > almost all of which was to pay salaries. The application gained > extremely positive referee reports and the MRC Molecular and Cellular > Medicine Board awarded it their highest banding for both past and > proposed work. Despite this, and with knowledge of the large number of > groups who are going to be badly affected, the MRC has decided not to > fund us. The current funding finishes at the end of April - just a few > weeks - and James Bonfield, Kathryn Beal, Mark Jordan and Yaping Cheng > will lose their jobs and receive no redundancy pay. > > Before we saw the favourable referees reports I asked MRC if, in the > event of our funding being cut, the package could be made Open Source > so that we and others could continue to develop and support it even if > no longer working for MRC. This seemed to us the best way of > safeguarding users and the careers of the group. MRC refused, saying > the package had potential commercial value. > > The official who phoned to tell us the funding decision said he had > "devastating news". For us this is certainly true. I have been working > on various versions of the package for over 25 years, James for 11 and > Kathryn for 7. It is also very frustrating as we had so much work > nearly ready for release. > > If this decision is going to affect the work of you and your > colleagues, or if you have any other comments or suggestions please > reply to this email and, if you think it might help, send a copy to > the MRC Chief Executive George Rada george.rada at headoffice.mrc.ac.uk > > > Rodger Staden > > -- > Dr Rodger Staden, rs at mrc-lmb.cam.ac.uk > MRC Laboratory of Molecular Biology, http://www.mrc-lmb.cam.ac.uk/pubseq/ > Hills Road, Tel (01223) 402389 or +44 1223 > 402389 > Cambridge, CB2 2QH, UK. 
Fax (01223) 213556 or +44 1223
> 213556
>
>
> ------ End of Forwarded Message
>
>

******************************************************************

Oddmund Nordgård

Address at work:                          Address at home:
Department of Haematology and Oncology    Opalv. 28
Rogaland Central Hospital                 4318 SANDNES
P.O. Box 8100                             Tlf.: 51 67 25 65
4068 STAVANGER                            Mob.: 48 20 51 72
Phone: 51 51 89 26
Email: oddmundn at biokjemi.uio.no

*******************************************************************
Registered linux user #44149

From chenna at embl-heidelberg.de. Wed Apr 2 06:47:55 2003
From: chenna at embl-heidelberg.de. (Ramu Chenna)
Date: Wed, 2 Apr 2003 08:47:55 +0200 (CEST)
Subject: Staden Package
In-Reply-To:
Message-ID:

>
> Sending an email to george.rada at headoffice.mrc.ac.uk resulted in "Delivery
> failure". Perhaps the address does not exist anymore?
>
> Oddmund Nordgård
>
>
> On Tue, 1 Apr 2003, David Martin wrote:

=====
the periodicity is once/year!

Ramu
---------------------------------------------------
It is not advisable to be innocent on this day!

>
> > many on this list will be users of the Staden Package. This has recently had
> > close ties with the EMBOSS project.
> >
> > It is (IMHO) extremely shortsighted of the MRC to take this approach and I
> > would urge all those of you who have benefitted from this package to at
> > least send an email to support Rodger Staden's position and argue for
> > continued support for the package.
> >
> > ..d
> >
> > --
> > David Martin PhD
> > Bioinformatics Scientific Officer
> > Post-Genomics and Molecular Interactions Centre
> > University of Dundee
> >
> > ------ Forwarded Message
> > From: Staden Package Administrator
> > Date: Tue, 1 Apr 2003 14:18:04 +0100 (BST)
> > To: d.m.a.martin at dundee.ac.uk
> > Subject: Devastating news
> >
> > Hello,
> >
> > You are on the list of people with licences for our software.
I am > > sorry to have to inform you that the Staden Package is no longer > > available or supported as the MRC has decided to withdraw our funding. > > > > Last year I submitted, to MRC, a three year grant proposal of 443k, > > almost all of which was to pay salaries. The application gained > > extremely positive referee reports and the MRC Molecular and Cellular > > Medicine Board awarded it their highest banding for both past and > > proposed work. Despite this, and with knowledge of the large number of > > groups who are going to be badly affected, the MRC has decided not to > > fund us. The current funding finishes at the end of April - just a few > > weeks - and James Bonfield, Kathryn Beal, Mark Jordan and Yaping Cheng > > will lose their jobs and receive no redundancy pay. > > > > Before we saw the favourable referees reports I asked MRC if, in the > > event of our funding being cut, the package could be made Open Source > > so that we and others could continue to develop and support it even if > > no longer working for MRC. This seemed to us the best way of > > safeguarding users and the careers of the group. MRC refused, saying > > the package had potential commercial value. > > > > The official who phoned to tell us the funding decision said he had > > "devastating news". For us this is certainly true. I have been working > > on various versions of the package for over 25 years, James for 11 and > > Kathryn for 7. It is also very frustrating as we had so much work > > nearly ready for release. 
> > > > If this decision is going to affect the work of you and your > > colleagues, or if you have any other comments or suggestions please > > reply to this email and, if you think it might help, send a copy to > > the MRC Chief Executive George Rada george.rada at headoffice.mrc.ac.uk > > > > > > Rodger Staden > > > > -- > > Dr Rodger Staden, rs at mrc-lmb.cam.ac.uk > > MRC Laboratory of Molecular Biology, http://www.mrc-lmb.cam.ac.uk/pubseq/ > > Hills Road, Tel (01223) 402389 or +44 1223 > > 402389 > > Cambridge, CB2 2QH, UK. Fax (01223) 213556 or +44 1223 > > 213556 > > > > > > ------ End of Forwarded Message > > > > > > ****************************************************************** > > Oddmund Nordg?rd > > Address at work: Adress at home: > Department of Haematology and Oncology Opalv. 28 > Rogaland Central Hospital 4318 SANDNES > P.O. Box 8100 Tlf.: 51 67 25 65 > 4068 STAVANGER Mob.: 48 20 51 72 > Phone: 51 51 89 26 > Email: oddmundn at biokjemi.uio.no > > ******************************************************************* > Registered linux user #44149 > > > > From Marc.Logghe at devgen.com Wed Apr 2 08:27:23 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Wed, 2 Apr 2003 10:27:23 +0200 Subject: Extractfeat options Message-ID: Hi Burke, > I have having a bit of trouble extracting just genes form a > Genbank file. I > have tried the obviously options to no avail. I want to get > JUST the gene > information but I always get gene and CDS as below. How do I do that? you should set the -type arg to gene like this extractfeat -filter -type gene test.gb | less > > Additionally, can I get the gene name instead of the stuff below? 
Don't know how to do this with EMBOSS; I'd use BioPerl for that:

#!/usr/bin/perl -w
# Print the /gene name of every 'gene' feature in a GenBank file.
use strict;
use Bio::SeqIO;

my $io = Bio::SeqIO->new(-format => 'genbank', -file => shift);
while (my $seq = $io->next_seq) {
    foreach my $feat ($seq->get_SeqFeatures) {
        # keep only features whose type is 'gene' (this skips CDS etc.)
        next unless $feat->primary_tag eq 'gene';
        # each_tag_value throws if the tag is absent, so check first
        next unless $feat->has_tag('gene');
        print join(',', $feat->each_tag_value('gene')), "\n";
    }
}

HTH,
Marc

***********************************************************
Marc Logghe, Ph.D.
Senior Scientist
Scientific Computing Group
deVGen
Technologiepark 9
9052 Zwijnaarde
Belgium
tel: +32 (0) 9 324 24 88
fax: +32 (0) 9 324 24 25
***********************************************************

From gwilliam at hgmp.mrc.ac.uk Wed Apr 2 08:29:52 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 02 Apr 2003 09:29:52 +0100
Subject: Extractfeat options
Message-ID: <3E8A9F7F.993DE906@hgmp.mrc.ac.uk>

It looks like you are doing:

    extractfeat refseq:NC_001806 stdout -tag gene

This will pull out features like:

    gene            513..1259
                    /gene="RL1"

or

    CDS             513..1259
                    /gene="RL1"

which include the tag name 'gene', e.g. /gene="RL1"

You should be using:

    extractfeat refseq:NC_001806 stdout -type gene

which will only find features like:

    gene            513..1259
                    /gene="RL1"

which have the type name 'gene'.

I'll add a report of specified tag values in the output description
for you soon, Burke.

Regards,
Gary

--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From yezq at mail.cbi.pku.edu.cn Thu Apr 3 08:52:33 2003
From: yezq at mail.cbi.pku.edu.cn (Zhiqiang Ye)
Date: Thu, 3 Apr 2003 16:52:33 +0800
Subject: translate tools
Message-ID: <200304030935.h339Z4Ex012234@mail.cbi.pku.edu.cn>

Dear all,

I have a large set of mRNAs, cDNAs and CDSs to translate, but each
sequence has a different reading frame, so I have to translate each one
in all 6 frames and see which is the best. I have so many sequences that
I cannot do this one by one. Is there any program in EMBOSS which can
translate a nucleic acid sequence in 6 frames and choose the best one as
output? Transeq doesn't seem to do this.

Thanks in advance!

Best Regards!

Zhiqiang Ye
2003-04-03

###############################################################
Zhiqiang Ye, Ph.D. candidate, Major in Bioinformatics
Center of BioInformatics, College of Life Sciences,
Peking University, Beijing, PR China 100871
Tel: +86 10 6275 6730
###############################################################

From rls at ebi.ac.uk Thu Apr 3 08:49:03 2003
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Thu, 3 Apr 2003 09:49:03 +0100
Subject: translate tools
In-Reply-To: <200304030935.h339Z4Ex012234@mail.cbi.pku.edu.cn>

Hi,

sixpack springs to mind.

R:)

From rls at ebi.ac.uk Thu Apr 3 08:51:27 2003
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Thu, 3 Apr 2003 09:51:27 +0100
Subject: translate tools
In-Reply-To: <200304030935.h339Z4Ex012234@mail.cbi.pku.edu.cn>

Ah! You might want to have a look at checktrans as well.
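
The "choose the best frame" step asked about in this thread can also be
sketched outside EMBOSS. What follows is only a rough sketch under several
assumptions: the six-frame translation has already been written as protein
FASTA, stop codons appear as '*', each frame's record is named after the
input sequence with a frame number appended (seq_1 ... seq_6), and "best"
is taken to mean the frame with the longest stop-free stretch. The function
name best_frames is made up for illustration and is not an EMBOSS program.

```shell
#!/bin/sh
# best_frames (hypothetical helper): read six-frame protein FASTA on stdin and
# print, for every input sequence, the frame with the longest stop-free stretch.
# Assumptions: stops are written as "*"; frame records are named <id>_<n>.
best_frames() {
  awk '
    function flush_record(   n, i, parts, best, key) {
      if (id == "") return
      key = id
      sub(/_[0-9]+$/, "", key)              # strip the assumed frame suffix
      if (!(key in top_len)) { order[++nkeys] = key; top_len[key] = -1 }
      n = split(seq, parts, "[*]")          # pieces between stop codons
      best = 0
      for (i = 1; i <= n; i++)
        if (length(parts[i]) > best) best = length(parts[i])
      if (best > top_len[key]) { top_len[key] = best; top_id[key] = id }
    }
    /^>/ { flush_record(); id = substr($1, 2); seq = ""; next }
         { gsub(/[ \t\r]/, ""); seq = seq $0 }
    END  {
      flush_record()
      for (k = 1; k <= nkeys; k++)
        print order[k] "\t" top_id[order[k]] "\t" top_len[order[k]]
    }
  '
}
```

Run as, say, `best_frames < sixframe.pep` (the file name is a placeholder),
this prints one tab-separated line per input sequence: the sequence name,
the record of the winning frame, and the length of its longest stop-free
stretch. The stretch is not required to start with a methionine, so this is
a heuristic rather than a true ORF finder.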
R:)

From Marc.Logghe at devgen.com Thu Apr 3 09:31:43 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Thu, 3 Apr 2003 11:31:43 +0200
Subject: translate tools

Hi Rodrigo,

> sixpack spring to mind.

I had the same problem a few days ago. I used getorf for that, with
minsize set to 3000. But this was done on only one sequence, for which I
knew the size of the translation I needed. You cannot do this if you
don't know beforehand what value to set minsize to. The same goes for
sixpack (and checktrans, afaik): you can only pass a minsize argument.
There should be something like -only -largest.
So I think Zhiqiang Ye's problem persists.

Regards,
Marc

***********************************************************
Marc Logghe, Ph.D.
Senior Scientist
Scientific Computing Group
deVGen
Technologiepark 9
9052 Zwijnaarde
Belgium
tel: +32 (0) 9 324 24 88
fax: +32 (0) 9 324 24 25
***********************************************************

From rls at ebi.ac.uk Thu Apr 3 10:12:57 2003
From: rls at ebi.ac.uk (Rodrigo Lopez)
Date: Thu, 3 Apr 2003 11:12:57 +0100
Subject: translate tools

Yes, I see the problem... Good to have some more specs to go by...

The closest I can get to helping is something like:

    transeq emblcd:hscfo\* -auto -frame 6 | checktrans -filter -auto -orfml 200

but that would require some post-processing to find out which ORF is the
longest one, and then rerunning transeq/sixpack to get it explicitly. A
perl or a sh/csh script could do that.

However, the correct approach is, as you say, to have an option for
'-onlylargest/-onlylongest'.

R:)

From aengus.stewart at cancer.org.uk Thu Apr 3 15:49:15 2003
From: aengus.stewart at cancer.org.uk (Aengus Stewart)
Date: Thu, 03 Apr 2003 15:49:15 +0000
Subject: Require 2 versions of emboss.default
References: <200303211450.h2LEomw28312@bromine.hgmp.mrc.ac.uk> <3E7B5127.DA2F05F8@cancer.org.uk>
Message-ID: <3E8C57FB.63B85F6A@cancer.org.uk>

Hi,

I maintain data libraries by keeping 2 copies of them: a "live" version
for the users and an "incoming" version where the indexing etc. takes
place on the raw files. After the indexing, the copies are flipped by
changing the soft links.

Therefore I would like to "run" 2 copies of emboss.default; however,
looking through the documentation, I haven't really found a pointer as
to how to do this.

Does anyone have a strategy for doing this, or do other people not keep
parallel copies of the data?

Cheers
Aengus

--
--------------------------------------------------------------------------
Aengus Stewart aengus.stewart at DELcancerETE.org.uk
Computational Genome Analysis Laboratory Tel: +44 (0)20 7269 3679
Cancer Research UK
Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK
--------------------------------------------------------------------------

From aengus.stewart at cancer.org.uk Thu Apr 3 16:13:58 2003
From: aengus.stewart at cancer.org.uk (Aengus Stewart)
Date: Thu, 03 Apr 2003 16:13:58 +0000
Subject: Require 2 versions of emboss.default
References: <200303211450.h2LEomw28312@bromine.hgmp.mrc.ac.uk> <3E7B5127.DA2F05F8@cancer.org.uk> <3E8C57FB.63B85F6A@cancer.org.uk> <3E8C4D05.2060708@imperial.ac.uk>
Message-ID: <3E8C5DC6.FD7C7AE5@cancer.org.uk>

Ooops, as soon as I had posted I realised I was talking cobblers.
The indexing doesn't reference the emboss.default file at all, so I have
completely dreamt up a non-existent problem :-)

Apologies............

Aengus

--
--------------------------------------------------------------------------
Aengus Stewart aengus.stewart at DELcancerETE.org.uk
Computational Genome Analysis Laboratory Tel: +44 (0)20 7269 3679
Cancer Research UK
Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK
--------------------------------------------------------------------------

From kellert at ohsu.edu Thu Apr 3 21:43:21 2003
From: kellert at ohsu.edu (Thomas Keller)
Date: Thu, 3 Apr 2003 13:43:21 -0800
Subject: database help
Message-ID: <49DF0FF7-661D-11D7-9D0C-0003930405E2@ohsu.edu>

Greetings,

The sysadmin for the machine with GCG installed was kind enough to allow
me to mount the GCG databases maintained at my institution. This should
save me a lot of headaches, yet allow me the convenience of using them
with EMBOSS on my machine. But I am unsure how to make them available to
EMBOSS. Is there some general documentation on database formats and the
steps required for use with EMBOSS? The database documentation doesn't
cover this particular case. I'm unsure, for example, whether I would put
"method: gcg" for all of these, or whether I would need to run formatdb
on the blast dbs. General stuff like that.

The directory structure looks like this:

gcgnrl3d
gcgswissprot
gcgsrs
gcgpir
gcgembl
gcggenpept
gcgblast
    est_human_00.nsq
    est_human_00.nsi
    est_human_00.nsd
    est_human_00.nin
    est_human_00.nhr
    ...
gcgpfam
gcggenbank
gcggbtags
gcgpfam.org
gcgsptrembl

From pmr at ebi.ac.uk Fri Apr 4 07:52:35 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 04 Apr 2003 08:52:35 +0100
Subject: Require 2 versions of emboss.default
References: <200303211450.h2LEomw28312@bromine.hgmp.mrc.ac.uk> <3E7B5127.DA2F05F8@cancer.org.uk> <3E8C57FB.63B85F6A@cancer.org.uk>
Message-ID: <3E8D39C3.3040609@ebi.ac.uk>

Aengus Stewart wrote:
> Therefore I would like to "run" 2 copies of emboss.default, however
> looking through the documentation, I havent really got a pointer as to
> how to do this.

Yes, I have seen your followup mail ... however ... emboss.default can
include other files. At present the include statements do not resolve
variable names, but they should (and will).

Or ... you could swap soft links to 2 emboss.default files :-)

Hope this helps,

Peter

From sdowd at lbk.ars.usda.gov Fri Apr 4 22:36:28 2003
From: sdowd at lbk.ars.usda.gov (Dr. Scot E. Dowd)
Date: Fri, 4 Apr 2003 16:36:28 -0600
Subject: exception in thread
Message-ID: <000601c2fafa$a26a2af0$599385c7@Salmonella>

After installing EMBOSS and Jemboss using the server install, I seem to
get through everything OK, but when I run runJemboss.csh I get the
following:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/emboss/jemboss/Jemboss

Any ideas or help would be appreciated, in not too technical language.

cheers
Scot

From Sean.Maceachern at nre.vic.gov.au Mon Apr 7 04:54:29 2003
From: Sean.Maceachern at nre.vic.gov.au (Sean.Maceachern at nre.vic.gov.au)
Date: Mon, 7 Apr 2003 14:54:29 +1000
Subject: Revseq

Hello,

I am trying to reverse and complement a few hundred FASTA sequences and
am having trouble getting the input files into the correct format. To
date, all I have as an example of an input file is a sequence entry in
the example nucleic acid database 'tembl' format.
If anyone could suggest the best way to reverse and complement a number
of FASTA files, or could show me an example of an input file, it would be
greatly appreciated.

Thank you

Sean MacEachern
PhD Student
Sean.Maceachern at nre.vic.gov.au

From Marc.Logghe at devgen.com Mon Apr 7 07:21:07 2003
From: Marc.Logghe at devgen.com (Marc Logghe)
Date: Mon, 7 Apr 2003 09:21:07 +0200
Subject: Revseq

Something like this should do the job?

    seqret fasta::your_file -sreverse

From thomas-c at esbs.u-strasbg.fr Mon Apr 7 08:37:56 2003
From: thomas-c at esbs.u-strasbg.fr (Morgane THOMAS-CHOLLIER)
Date: Mon, 7 Apr 2003 10:37:56 +0200 (CEST)
Subject: install Phylip Macos X
Message-ID: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr>

I already have EMBOSS installed on my machine and I'm trying to install
PHYLIP on Mac OS X 10.2, but it does not compile cleanly.

I've downloaded the latest .tar.gz and unpacked it.

I get the following error when trying the ./configure command:

    [PHYLIP-3.573c] root# ./configure --prefix=/usr/local/share/EMBOSS
    configure: error: cannot find install-sh or install.sh in . ./.. ./../..

Has anyone already had this problem?

Thanks for your help.
--
Morgane THOMAS-CHOLLIER
DESS Bioinformatique et génomique
ESBS - ULP Strasbourg

From gwilliam at hgmp.mrc.ac.uk Mon Apr 7 10:04:42 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Mon, 07 Apr 2003 11:04:42 +0100
Subject: Revseq
Message-ID: <3E914D3A.373C7674@hgmp.mrc.ac.uk>

An example of a standard FASTA input file is at:
http://www.emboss.org/Themes/SequenceExamples/fasta

The full set of input sequence formats is described in:
http://www.emboss.org/Themes/SequenceFormats.html#in

To reverse and complement a set of sequences that are in separate FASTA
files, for example '*.seq':

    revseq '*.seq' result.seq

This writes the results to a single file holding many sequences.

Note that you should put the *.seq in quote marks if you specify it on
the command line, to stop the shell trying to expand the '*' for you.

I prefer to output a set of sequences like this to a single file,
because the resulting file name is then known and can be handled easily
by scripts, but you may need to run non-EMBOSS programs on the results
which might not be able to read in a file containing many FASTA format
sequences - they may require one sequence per file.

If you prefer to deal with many resulting sequence files, use the
qualifier '-ossingle', which will force the output sequences to be
written to individual files, each named using the ID name of the input
sequence:

    revseq '*.seq' result.seq -ossingle

You can specify other parts of the output file name using:

    -osextension reversed    to specify the extension name as 'reversed'
    -osdirectory out         to specify the output directory

Run 'revseq -help -verbose' for further information.

Regards,
Gary

--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From can at ucsd.edu Mon Apr 7 19:15:44 2003
From: can at ucsd.edu (Can Tran)
Date: Mon, 7 Apr 2003 12:15:44 -0700
Subject: install Phylip Macos X
In-Reply-To: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr>
References: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr>

Have you tried the OSX packages at
http://bioteam.net/MacOSX/index.html

Can

From aengus.stewart at cancer.org.uk Tue Apr 8 14:41:14 2003
From: aengus.stewart at cancer.org.uk (Aengus Stewart)
Date: Tue, 08 Apr 2003 15:41:14 +0100
Subject: Synomyns and Datalib aggregation
References: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr>
Message-ID: <3E92DF8A.EF19B6B5@cancer.org.uk>

Hi,

Just had a couple of thoughts.

Synonyms:
It would be nice to have a synonyms or alias tag that would take a list
of alternative names for datalibs. At the moment I don't believe this is
possible, so for "embl" and "em" you have to duplicate the entire
datalib definition?

Aggregation:
When you want a SWALL that is SWISSPROT + SWISSNEW I don't see an easy
way to do this. I hold both of these in separate directories with
separate definitions; could the directory tag possibly take a list as a
value? I imagine there may be other considerations as well.

BTW small typo in databases.html: in the Attributes section the key is
given as "filename:"; shouldn't this be "file:"?

Cheers
Aengus

--
--------------------------------------------------------------------------
Aengus Stewart aengus.stewart at DELcancerETE.org.uk
Computational Genome Analysis Laboratory Tel: +44 (0)20 7269 3679
Cancer Research UK
Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK
--------------------------------------------------------------------------

From gwilliam at hgmp.mrc.ac.uk Tue Apr 8 14:48:45 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Tue, 08 Apr 2003 15:48:45 +0100
Subject: Synomyns and Datalib aggregation
References: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr> <3E92DF8A.EF19B6B5@cancer.org.uk>
Message-ID: <3E92E14D.F349EEAE@hgmp.mrc.ac.uk>

Aengus Stewart wrote:
> BTW small typo in databases.html - in the Attributes section the key is
> given as "filename:" shouldnt this be "file:" ?
I believe that "file:" is an allowed abbreviation for "filename:".
This is stated in the documentation for "filename:".

--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From aengus.stewart at cancer.org.uk Tue Apr 8 14:51:11 2003
From: aengus.stewart at cancer.org.uk (Aengus Stewart)
Date: Tue, 08 Apr 2003 15:51:11 +0100
Subject: Synomyns and Datalib aggregation
References: <49388.130.79.135.12.1049704676.squirrel@esbsmail.u-strasbg.fr> <3E92DF8A.EF19B6B5@cancer.org.uk> <3E92E14D.F349EEAE@hgmp.mrc.ac.uk>
Message-ID: <3E92E1DF.CF860E35@cancer.org.uk>

"Gary Williams, Tel 01223 494522" wrote:
> I believe that "file:" is an allowed abbreviation for "filename:".
> This is stated in the documentation for "filename:"

Ooops, yes indeed, it helps if I read all the documentation including
the last line :-)

Sorry Gary.

Aengus

--
--------------------------------------------------------------------------
Aengus Stewart aengus.stewart at DELcancerETE.org.uk
Computational Genome Analysis Laboratory Tel: +44 (0)20 7269 3679
Cancer Research UK
Lincoln's Inn Fields, Holborn, London, WC2A 3PX, UK
--------------------------------------------------------------------------

From davids at synpep.com Tue Apr 8 16:02:04 2003
From: davids at synpep.com (David Stephens)
Date: Tue, 8 Apr 2003 09:02:04 -0700
Subject: Custom Peptides $10/Residue
Message-ID: <20030408160934.13F927D1A8@mercury.hgmp.mrc.ac.uk>

An HTML attachment was scrubbed...
From asonyadi at cp.co.id Thu Apr 10 09:36:25 2003
From: asonyadi at cp.co.id (Sony Adi Susanto)
Date: Thu, 10 Apr 2003 16:36:25 +0700
Subject: i want to join
Message-ID: <3E953B19.1000908@cp.co.id>

Dear friends and EMBOSS mailing list moderator,

I want to join this mailing list. I am a research scientist from
Indonesia.

-sony adi susanto-

From gwilliam at hgmp.mrc.ac.uk Fri Apr 11 09:14:43 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Fri, 11 Apr 2003 10:14:43 +0100
Subject: [Fwd: EMBOSS:einverted and "nested" inverted repeats]
Message-ID: <3E968783.D7AD0178@hgmp.mrc.ac.uk>

Linda Cardle (lcardl at scri.sari.ac.uk) wrote:
>
> I wasn't sure who to contact about this query, but here goes:
>
> I'm trying to find MITES within sequences using einverted. My main question
> is:
>
> How does einverted cope with "nested" inverted repeats?
>
> My reason for asking is that to simulate a sequence with both an inverted
> repeat and an SSR inverted repeat I added an SSR inverted repeat inside a
> known MITE. Once I'd done that it seemed that einverted could spot the SSR
> but not the MITE surrounding it.
>
> I was hoping you'd be able to clarify how einverted would cope with this
> situation, or point me to someone who could.
>
> Thanks for your time,
> Linda
> --
> Dr Linda Cardle
> Computational Biology
> Scottish Crop Research Institute
> Dundee, DD2 5DA

--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From thomas-c at esbs.u-strasbg.fr Mon Apr 14 08:58:21 2003
From: thomas-c at esbs.u-strasbg.fr (Morgane THOMAS-CHOLLIER)
Date: Mon, 14 Apr 2003 10:58:21 +0200 (CEST)
Subject: Jalview and Jemboss
Message-ID: <50670.130.79.135.12.1050310701.squirrel@esbsmail.u-strasbg.fr>

Hello,

I use Jemboss as a standalone server on Mac OS X and also need to use
Jalview.
The problem is that when I load my multiple alignment as a FASTA file,
there are no colors on the display. I tried to change the colors but it
stays gray. Also, it takes a long time to display the alignment and to
register any change made to it.

I have more than 60 sequences in that alignment; could it be that I am
running out of memory? Does anyone have an idea how to display the
colors? Or another way to install and run Jalview on Mac OS X?

Thanks a lot for your help.

Morgane THOMAS-CHOLLIER
--
Morgane THOMAS-CHOLLIER
DESS Bioinformatique et génomique
ESBS - ULP Strasbourg

From gbottu at ben.vub.ac.be Mon Apr 14 17:07:45 2003
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Mon, 14 Apr 2003 19:07:45 +0200 (CEST)
Subject: Preferred isoschizomer ?
Message-ID: <200304141707.h3EH7jV31381945@black.vub.ac.be>

from : BEN

Dear colleagues,

A user of BEN complained about a serious problem with restrict/remap. He
could not find the site for PstI in a sequence where the "wet work"
showed that the enzyme did cut. He lost a lot of time because he thought
he had made an error, while the reason was that the program reported the
isoschizomer BspMAI.

Now, in REBASE we see:

<1>BspMAI
<2>PstI,AinI,AjoI,Ali2882I,AliAJI,ApiI,Asp36I,Asp708I,Asp713I,AspTI,BbiI,Bce170I,BloHII,BloHIII,BmeBI,BsaNII,BsaQI,BscDI,Bsp17I,Bsp43I,Bsp63I,Bsp78I,Bsp81I,Bsp93I,Bsp107I,Bsp1.....

So PstI is clearly identified as the prototype enzyme. Yet, when
restrict is requested to report only the "preferred" isoschizomer, it
reports neither PstI, nor the first in the file (AinI), nor the last
(YenEI). Does anyone understand the cause of this erratic behaviour? And
has no one else suffered from this "feature"?

Regards,
Guy Bottu

From ableasby at hgmp.mrc.ac.uk Mon Apr 14 18:17:54 2003
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Mon, 14 Apr 2003 19:17:54 +0100 (BST)
Subject: Preferred isoschizomer ?
Message-ID: <200304141817.h3EIHs410930@bromine.hgmp.mrc.ac.uk> 1) If your colleague had explicitly said -enzymes psti on the command line (or equivalent GUI) then it would be found. The output would be overly verbose if all isoschizomers are reported so as a compromise it reports only one. 2) If you take the emboss files from the REBASE (NEB) distro then, after renaming and putting them in data/REBASE, it will probably report PstI (haven't tried it). I arranged with NEB that they would provide only the 'common' REs in their files. I believe this is what some other packages do. Using REBASEEXTRACT on the withrefm file gives all the REs. 3) You can equate any reported RE to another by adding an entry into embossre.equ e.g. BspMAI PstI HTH Alan From can at ucsd.edu Mon Apr 14 21:44:33 2003 From: can at ucsd.edu (can at ucsd.edu) Date: Mon, 14 Apr 2003 21:44:33 GMT Subject: Jalview and Jemboss Message-ID: <200304142144.h3ELiXM4026935@smtp.ucsd.edu> Hi, What version of Jalview are you using? I use the Jalview applet to display MSAs on the web. I grabbed a version of Jalview from EBI that works well on OSX via Safari and Netscape. I haven't gotten it to work in IE though. Best wishes, Can ___________________________________ Can Tran can at ucsd.edu University of California, San Diego Division of Biology Muir Biology 4143 http://tcdb.ucsd.edu > Hello, > > I use jemboss as a standalone server on MacOS X and also need to use Jalview. > > The problem is that when I load my multiple alignement as a fasta file, > there are no colors on the display. I tried to change the colors but it > stays gray. > Also, it takes a long time to display the alignement and to consider any > change made on it. > > I have more than 60 sequences ont that alignement, could it be that I go > out of memory ? > Does anyone as a idea to display the colors ? > Or another way to install and run Jalview on MacOS X ? > > Thanks a lot for your help. 
> > Morgane THOMAS-CHOLLIER > -- > Morgane THOMAS-CHOLLIER > DESS Bioinformatique et génomique > ESBS - ULP Strasbourg > > > From Sean.Maceachern at nre.vic.gov.au Mon Apr 14 22:59:25 2003 From: Sean.Maceachern at nre.vic.gov.au (Sean.Maceachern at nre.vic.gov.au) Date: Tue, 15 Apr 2003 08:59:25 +1000 Subject: Detecting selection Message-ID: Hello, I have just commenced a PhD and am interested in detecting selection in a large subset of genes between two species. Owing to the large number of samples that I am analysing I need to find a program that I can automate from the command line. To date I have not been able to find a program in EMBOSS that will detect ratios between synonymous and nonsynonymous mutations in a manner analogous to GCG's Diverge. Does anyone know of a program in EMBOSS that will calculate these ratios, or can anyone suggest a reliable program that would be easy to automate via the command line and that I can source from the web? Thank you Sean MacEachern PhD Student Sean.Maceachern at nre.vic.gov.au From Joerg.Schaber at uv.es Tue Apr 15 08:22:25 2003 From: Joerg.Schaber at uv.es (Joerg Schaber) Date: Tue, 15 Apr 2003 10:22:25 +0200 Subject: Detecting selection References: Message-ID: <3E9BC141.1090906@uv.es> You might want to use PAML. http://abacus.gene.ucl.ac.uk/software/paml.html Sean.Maceachern at nre.vic.gov.au wrote: >Hello, I have just commenced a PhD and am interested in detecting selection >in a large subset of genes between two species. Owing to the large number >of samples that I am analysing I need to find a program that I can automate >from the command line. To date I have not been able to find a program in >EMBOSS that will detect ratios between synonymous and nonsynonymous >mutations in a manner analogous to GCG's Diverge.
Does anyone know of a >program in EMBOSS that will calculate these ratios, or can anyone suggest a >reliable program that would be easy to automate via the command line and that I can >source from the web? >Thank you > >Sean MacEachern >PhD Student >Sean.Maceachern at nre.vic.gov.au > > > > -- ---------------------------------------------------------- Jörg Schaber Instituto Cavanilles de Biodiversidad y Biologia Evolutiva Universidad de Valencia Tel.: ++34 96 354 3666 A.C. 22085 Fax.: ++34 96 354 3670 46071 Valencia, España email : jos at uv.es From siegmund at develogen.com Tue Apr 15 15:15:44 2003 From: siegmund at develogen.com (Thomas Siegmund) Date: Tue, 15 Apr 2003 17:15:44 +0200 Subject: New release of Kaptain X11 GUI for EMBOSS Message-ID: <200304151715.44096.siegmund@develogen.com> Hi everybody, I'd like to announce that a new version of the EMBOSS GUI for Linux/Unix systems running QT/KDE is available. Thanks to Terék Zsolt (who fixed some bugs and released Kaptain 0.71 a few days ago) we can use QT3.x now. The widgets look quite nice in a KDE3 environment and use less space on the screen than the QT2 version. Support for EMBOSS 2.6 is complete. You can get the grammars for EMBOSS and Phylip as usual at http://userpage.fu-berlin.de/~sgmd/download.html . Kaptain 0.71 is available from http://kaptain.sourceforge.net/ . From the changelog: Version 0.89 - fixes for quite a number of grammars for kaptain0.71. This version understands rules like 'something -> @ | "-something"' a little bit differently. The grammars should work with kaptain 0.6 and kaptain 0.71 now. Please report any problems, especially with the older version. I will use kaptain 0.71 from now on.
Version 0.88 - new: skipseq.kaptn Version 0.87 - one more fix in embosslauncher - make showalign.kaptn, emma.kaptn, est2genome.kaptn, remap.kaptn, showseq.kaptn, efitch.kaptn work with kaptain 0.7 - fix empty lines in embossdata.kaptn - fixed -sreverse option in needle.kaptn and water.kaptn with kaptain 0.7 Version 0.86 - small fix to embosslauncher Version 0.85 - finished support for EMBOSS 2.6 - new: pestfind.kaptn, sirna.kaptn, twofeat.kaptn - moved grammar files for "Protein 3D" applications to separate directory "Domainatrix". For the moment they won't get installed automatically. If you need them, please copy them manually to the appropriate locations. If you update, it might be a good idea to remove the $KDEDIRS/share/applnk/EMBOSS directory before running the install script to get rid of stale .desktop files - "other" option for msbar.kaptn - "featinname" feature for extractfeat.kaptn - small fix for cpgplot.kaptn Regards Thomas -- Thomas Siegmund, Ph.D. DeveloGen AG Bioinformatics and Data Management Phone: +49(551) 505 58 651 From kclancy at informaxinc.com Tue Apr 15 17:24:09 2003 From: kclancy at informaxinc.com (Kevin Clancy) Date: Tue, 15 Apr 2003 11:24:09 -0600 Subject: compiling emboss on windows Message-ID: <001b01c30373$d3781310$120610ac@informaxinc.net> Dear Sirs Has anyone tackled building EMBOSS on Windows? Are there any guidelines for doing this? I was hoping to be able to run this outside of Cygwin or any other type of unix emulator. If it hasn't been done, would it be possible to let me know what types of problems I would run into? Thanks for any information. kevin Kevin Clancy, PhD Senior Bioinformatic Scientist InforMax, Inc., 433 Park Point Drive, Suite 275, Golden, CO 80401 Direct phone line: (720) 746 3707 Cell Phone: (240) 417 8604 Direct email: kclancy at informaxinc.com -------------- next part -------------- An HTML attachment was scrubbed...
URL: From gbottu at black.vub.ac.be Tue Apr 15 18:54:31 2003 From: gbottu at black.vub.ac.be (Guy Bottu) Date: Tue, 15 Apr 2003 20:54:31 +0200 Subject: Preferred isischizomer (bis) Message-ID: <20030415205431.A1419113@black.vub.ac.be> from : BEN Dear colleagues, Allow me to insist. The point is that when you ask for only one representative isoschizomer the programs should report PstI, because PstI is mentioned in Rebase as the "prototype". The GCG programs map, mapplot and mapsort did do this. The file in GCG format contains : ;BstMAI 5 C_TGCA'G -4 PstI 5 C_TGCA'G -4 While withrefm contains the same information in another way. So, IMHO, rebaseextract and/or restrict+remap are not doing their job properly. Sincerely, Guy Bottu From leungyukfai at hotmail.com Thu Apr 17 01:35:26 2003 From: leungyukfai at hotmail.com (YUK FAI LEUNG) Date: Wed, 16 Apr 2003 21:35:26 -0400 Subject: A simple installation problem Message-ID: Hi there, I have encountered the same installation problem that many others encountered before. I tried a few methods from the web and the emboss discussion group like adding the flag -L/X11R6/lib to the Makefile but they all didn't work. I tried the installation in both the Redhat 8 & 9 in my laptop but the result is the same. Could anyone tell me how to get through this problem? Thanks! fai ---Here is the error message---- /bin/sh ../libtool --mode=link gcc -g -O2 -o aaindexextract aaindexextract.o ../nucleus/libnucleus.la ../ajax/libajaxg.la ../ajax/libajax.la ../plplot/libplplot.la -lX11 -lm gcc -g -O2 -o .libs/aaindexextract aaindexextract.o ../nucleus/.libs/libnucleus.so ../ajax/.libs/libajaxg.so ../ajax/.libs/libajax.so ../plplot/.libs/libplplot.so -lX11 -lm -Wl,--rpath -Wl,/usr/local/lib /usr/bin/ld: cannot find -lX11 collect2: ld returned 1 exit status _________________________________________________________________ Add photos to your e-mail with MSN 8. Get 2 months FREE*. 
http://join.msn.com/?page=features/featuredemail From leungyukfai at hotmail.com Thu Apr 17 15:11:02 2003 From: leungyukfai at hotmail.com (YUK FAI LEUNG) Date: Thu, 17 Apr 2003 11:11:02 -0400 Subject: A simple installation problem Message-ID: An HTML attachment was scrubbed... URL: From sdowd at lbk.ars.usda.gov Sat Apr 19 14:09:03 2003 From: sdowd at lbk.ars.usda.gov (Dr. Scot E. Dowd) Date: Sat, 19 Apr 2003 09:09:03 -0500 Subject: Kaptain is Koolneal! In-Reply-To: <20030415205431.A1419113@black.vub.ac.be> Message-ID: <002201c3067d$50585230$599385c7@Salmonella> Hi all, I thought I would express my appreciation: I love the Kaptain grammars. Installed easily! BINGO right away. Ran a few of the programs and they appear to function properly. Thanks a bunch! Dr. Scot E. Dowd Ph.D. Research Microbiologist USDA-ARS Livestock Issues Research Unit RT3 Box 215 FM 1294 Lubbock, TX 79403 806-746-5356 ext 241 mobile 806-832-0659 fax 806-744-4402 From maximtel at ibpm.pushchino.ru Mon Apr 21 07:40:31 2003 From: maximtel at ibpm.pushchino.ru (Maxim Telegin) Date: 21 Apr 2003 11:40:31 +0400 Subject: GUI Message-ID: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> Dear EMBOSS users! Our group intends to write an integrated GUI front end using the EMBOSS package. It is planned as an all-in-one program, like Vector NTI on the Windows platform, plus enhancements such as advanced functions for cloning-strategy planning and others. As a base we are planning to use the Gtk library (because we have some experience with it). Would such a program be useful? We know that different GUI front ends (Jemboss, Kaptain etc.) exist, but we want to build something more powerful. Regards Maxim A.
Telegin IBPM Russian Academy of Sciences From cquijano at iib.uam.es Mon Apr 21 10:54:29 2003 From: cquijano at iib.uam.es (Carlos Quijano) Date: Mon, 21 Apr 2003 12:54:29 +0200 Subject: GUI In-Reply-To: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> References: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> Message-ID: <3EA3CDE5.3020902@iib.uam.es> Hello, Maxim. I find GUIs exceedingly useful, and a major need if you have to support users. The need goes no further. It's difficult to match the ease of use of Jemboss or EmbossGUI. W2H is powerful but barely intuitive, and in the end, configuring it is too much stress for nothing. The Perl code is murky... EmbossGUI's Perl code is easy to understand. Even if you run a professional service (no users, all the work is done by you) it's good to have GUIs available, but the most powerful working method is, as always, the command line. Serious researchers have to use the shell; they have to develop as much code as their research needs. It's the only way to do it well. If they don't, we do. We bioinformaticians are more useful providing clusters, parallelism in their code, or developing new biological data and related algorithms, I think. * My effort is to understand my users' needs and give them the best solution. I think giving them the "easiest" solution is not the right direction for the research community, for science. We need the max. accuracy, and it's not the same concept. So, if you want to develop a new GUI, think about how you are going to surpass the other ones. Don't waste your time doing the same as others already do. Try to innovate! (cloning focus.... good idea) Some ideas about what I think are the other GUIs' weaknesses: 1- There is no GUI with all the options for all the programs (and you have to do it without making the GUI murky). 2- There is no GUI focused on output useful for publishing - papers, if you follow me - (great weak point).
3- There is no truly Windows-based GUI that avoids Java or web browsing (I love GNU and Linux and Sun, so forget this unlucky advice; for more detail, read line * ;-) 4- Has anybody dreamed about pipelines between EMBOSS apps? 5- It would be great to have an expert system. For example, send a sequence and receive all the information possible (very useful: a lot of people are lost with bioinformatics protocols, and with this utility they would see how it is all done). A cloning expert system? ;-) 6- It would be interesting to enhance the EDIT - VIEW interface of EMBOSS (its GUIs do little about it, only presenting the output... ). I have installed Jemboss, W2H and EmbossGUI, all of them very useful. My advice is that if you are going to spend your time on bioinformatics, it's best to improve the GNU software that exists now and only start from scratch if you are going to do something really new (which is why I give you these ideas). Reading your mail, it seems you are looking for something new and more powerful; you say "all-in-one". OK, try points 2, 5 and 6. And if you plan to use it for cloning, then try to make the cirdna and lindna apps, for example, more usable for the typical researcher (avoiding the code-like data input file). This is the deficiency I found in EDIT-VIEW (6). You can do a lot for the EMBOSS project by developing a new graphical output interface for playing with the graphics (now we only have pretty but boring and static png, ps or X11 graphics) and making them more publishable and modifiable. New applications for cloning would be great too! I heard there are some abandoned GUI projects, someone from Argentina? If that's true and the GUI is worth it, you could take up the effort again. I hope that among all these awful Spanglish lines you find something useful, at least a little help (I learned English reading Tolkien like a freak, sorry if I write queer). Thanks for your helpful GUI-dev interest!
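On the pipeline question raised in point 4 above: EMBOSS programs can already be chained through standard input and output, for example with the global -filter qualifier (read from stdin, write to stdout). The following is only a minimal sketch, not part of any existing GUI. It assumes the EMBOSS programs seqret and revseq are on the PATH; the helper name build_pipeline is hypothetical, and it only assembles the shell command without running it.

```python
import shlex


def build_pipeline(stages):
    """Join EMBOSS command stages into one shell pipeline string.

    Each stage is a list of arguments such as ["seqret"] or
    ["revseq", "-nocomplement"]. The global EMBOSS qualifier -filter
    is appended to every stage that lacks it, so each program reads
    stdin and writes stdout and the stages can be joined with pipes.
    build_pipeline is a hypothetical helper, not an EMBOSS API.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    parts = []
    for stage in stages:
        args = list(stage)
        if "-filter" not in args:
            args.append("-filter")
        # Quote every argument so the string is safe to hand to a shell.
        parts.append(" ".join(shlex.quote(a) for a in args))
    return " | ".join(parts)


pipeline = build_pipeline([["seqret"], ["revseq"]])
print(pipeline)
```

The resulting string could then be run by hand with shell redirection, for instance reading a placeholder file input.fasta on stdin and writing the reversed sequence to another file; a GUI could build such strings from a list of user-selected applications.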
From maximtel at ibpm.pushchino.ru Mon Apr 21 12:43:36 2003 From: maximtel at ibpm.pushchino.ru (Maxim Telegin) Date: 21 Apr 2003 16:43:36 +0400 Subject: GUI In-Reply-To: <3EA3CDE5.3020902@iib.uam.es> References: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> <3EA3CDE5.3020902@iib.uam.es> Message-ID: <1050929016.4565.60.camel@lab207.ibpm.serpukhov.su> On Mon, 21.04.2003, at 14:54, Carlos Quijano wrote: > 1- There is no GUI with all the options for all the programs (and you > have to do it without making the GUI murky). > 2- There is no GUI focused on output useful for publishing - papers, > if you follow me - (great weak point). > 3- There is no truly Windows-based GUI that avoids Java or > web browsing (I love GNU and Linux and Sun, so forget this unlucky > advice; for more detail, read line * ;-) > 4- Has anybody dreamed about pipelines between EMBOSS apps? > 5- It would be great to have an expert system. For example, send a > sequence and receive all the information possible (very useful: a lot of > people are lost with bioinformatics protocols, and with this utility > they would see how it is all done). A cloning expert system? ;-) > 6- It would be interesting to enhance the EDIT - VIEW interface of > EMBOSS (its GUIs do little about it, only presenting the output... ). Many thanks for the tips; most of them we had already planned. > I have installed Jemboss, W2H and EmbossGUI, all of them very useful. > My advice is that if you are going to spend your time on > bioinformatics, it's best to improve the GNU software that exists now and > only start from scratch if you are going to do something really > new (which is why I give you these ideas). > Reading your mail, it seems you are looking for something new and more powerful; > you say "all-in-one". OK, try points 2, 5 and 6.
And if you plan to > use it for cloning, then try to make the cirdna and lindna apps, for example, > more usable for the typical researcher (avoiding the code-like data > input file). This is the deficiency I found in EDIT-VIEW (6). > You can do a lot for the EMBOSS project by developing a new graphical > output interface for playing with the graphics (now we only have pretty > but boring and static png, ps or X11 graphics) and making them more > publishable and modifiable. New applications for cloning would be great > too! Automatic design of cloning strategies and features useful for the day-to-day work of a genetic engineer are our major aim. So a powerful, interactive, window-based GUI with advanced drag'n'drop capabilities would, we think, be no luxury. > > I heard there are some abandoned GUI projects, someone from Argentina? If > that's true and the GUI is worth it, you could take up the effort again. Hm.. I don't know about it. I'll try to find out more. > I hope that among all these awful Spanglish lines you find something > useful, at least a little help (I learned English reading Tolkien like > a freak, sorry if I write queer). Of course such essays will be very helpful :) Don't take it hard. > Thanks for your helpful GUI-dev interest! > Regards Maxim A. Telegin IBPM Russian Academy of Sciences From maximtel at ibpm.pushchino.ru Mon Apr 21 12:44:26 2003 From: maximtel at ibpm.pushchino.ru (Maxim Telegin) Date: 21 Apr 2003 16:44:26 +0400 Subject: GUI In-Reply-To: References: Message-ID: <1050929066.4565.63.camel@lab207.ibpm.serpukhov.su> On Mon, 21.04.2003, at 13:46, David Martin wrote: > That would be very nice. Even better would be adding functionality (like > Jemboss) where the databases and applications can exist on a remote machine > (possibly by using SOAP or somesuch). Yes, I think it is a very important property, so we are planning to implement such an architecture. > Another key element for a good gui is drag and drop cloning.
That would be > really nice.. This is one of our major aims. I think the EMBOSS package is poor in this functionality, so building a GUI in this context will be fully justified. The CLI, despite its power, is not adequate in this case. At the least we have to implement some important widgets (sequence editing with the ability to visualize additional data (features, translations, restriction sites etc.) and an interactive sequence map viewer that generates publication-quality graphics). I think this widget library will be useful not only for our program. Thanks for the tips. Regards Maxim A. Telegin IBPM Russian Academy of Sciences From jrvalverde at cnb.uam.es Mon Apr 21 13:33:18 2003 From: jrvalverde at cnb.uam.es (José R. Valverde) Date: Mon, 21 Apr 2003 15:33:18 +0200 Subject: GUI In-Reply-To: <3EA3CDE5.3020902@iib.uam.es> References: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> <3EA3CDE5.3020902@iib.uam.es> Message-ID: <20030421153318.5b1e0fa4.jrvalverde@cnb.uam.es> On Mon, 21 Apr 2003 12:54:29 +0200 Carlos Quijano wrote: > * My effort is to understand my users' needs and give them the > best solution. I think giving them the "easiest" solution is not the right > direction for the research community, for science. We need the max. > accuracy, and it's not the same concept. > That's a common misconception. Accuracy is not the goal. Meaningfulness is. Having measures to the tenth decimal point may be absolutely meaningless in many contexts. And worse, misleading too. What users want is meaningful results. They want to get something they can a) understand and b) trust, and if along the way they get some measure of the reliability of results, well, it won't hurt. Now, what does that mean? To provide meaning, you need to understand the methods AND the data. *We* do understand the methods, but only users know the data they are using.
There are two ways around this: one is to try and foresee all possible use cases and provide options and explanations for each (kind of an expert system), and the other is to give the user hints to realize when something is going awry and pointers to further information. Since foreseeing every possible use case is quite difficult (if not impossible), the first solution may give a false impression of overaccuracy and be misleading too. If you go for it, you had better be *real good* at it. So, from my point of view, users need to get what they need *iff at all possible*. Note the double 'f': *if and only if*. If you can't give them what they need, you had better not try. Overloading a user interface with bells and whistles may lead them to blind belief in the results, and we don't want that; what we want is skepticism about the results. Always. Period. To sum it up: concentrate on meaning, and make sure the user always knows what to trust and what not, and provide enough pointers (e.g. as hyperlinks) to further explanations. An extra note on this: provide SHORT tips and explanations FIRST. Assume users won't maintain attention for more than 10 seconds. Anything that takes longer to read won't be read at first sight. Once they decide, based on your tip, that further investigation is needed, you can THEN lead them to longer descriptions. > So, if you want to develop a new GUI, think about how you are going to > surpass the other ones. Don't waste your time doing the same as others > already do. Try to innovate! (cloning focus.... good idea) > That's a good one. And it leads to an important conclusion: it is probably a waste of time to duplicate other people's work. So, if possible, don't. Consider joining some of the existing efforts. Jemboss may be a good one since, being Java, it runs everywhere. Instead of duplicating efforts, add to it what you feel is missing. Contact the Jemboss team and find out how to add new functionality to it.
> Some ideas about what I think are the other GUIs' weaknesses: > 1- There is no GUI with all the options for all the programs (and you > have to do it without making the GUI murky). > > 2- There is no GUI focused on output useful for publishing - papers, > if you follow me - (great weak point). Right. Turning emboss output into something more useful (like editable vector graphics, PostScript, etc.) is a nice goal. Furthermore, a simple output 'editor' that allows adding some arrows, notes, or simple graphics to program output might be good enough. > 3- There is no truly Windows-based GUI that avoids Java or > web browsing (I love GNU and Linux and Sun, so forget this unlucky > advice; for more detail, read line * ;-) Java runs everywhere. True, Jemboss is a pain to install. Why not make it easy to install? Furthermore, why not create 'distributions' that are ready to run (and install) for several architectures? > 4- Has anybody dreamed about pipelines between EMBOSS apps? > 5- It would be great to have an expert system. For example, send a > sequence and receive all the information possible (very useful: a lot of > people are lost with bioinformatics protocols, and with this utility > they would see how it is all done). A cloning expert system? ;-) For the reasons explained above, I would rather propose development of 'wizards', simple tools that guide the user through the basic process, providing tips here and there, along with hints that results may be a lot better if one uses the full power of the tools, with links to the actual tools and to documentation on them. Then the casual user will have an easy entry point, and after a few trials, if s/he finds it worthwhile, wannabe power users will have the starting points to become proficient. > 6- It would be interesting to enhance the EDIT - VIEW interface of > EMBOSS (its GUIs do little about it, only presenting the output... ). > Yep, a feature browser, a sequence editor, etc..
might be good add-ons to Jemboss. Note that if the extensions are properly done, so that they are independent of Jemboss and have a good interface to the main program (a bit like Jalview), and are written in Java, then they could be added with little effort to other web based GUIs, increasing the utility of the tools. As I said, I would contact the Jemboss team and find out with them how to contribute. j -- These opinions are mine and only mine. Hey man, I saw them first! José R. Valverde Artificial Intelligence is of no use when the Natural kind is lacking From mayaguezcoqui at fastmail.fm Mon Apr 21 20:59:12 2003 From: mayaguezcoqui at fastmail.fm (Lorraine Cavanaugh) Date: Mon, 21 Apr 2003 16:59:12 -0400 Subject: Jemboss and Fink package problems Message-ID: <1AD1B986-743C-11D7-86A8-000393120AFA@fastmail.fm> Hi all, I have recently attempted to install Emboss and Jemboss to run in standalone mode on my G4 laptop. I was unable to get the program suite to compile properly using the Jemboss script, but was able to get Fink to install the package for me. Anyone have ideas on how to configure Emboss to run as standalone with a Jemboss interface with a Fink installation? I think Fink only installed Emboss (which works fine from the command line). I did install Jemboss, which launches, but asks for a username and password on startup, so I think it's trying to run in client mode. Thanks! Lorraine ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Lorraine Cavanaugh Hughson Lab Department of Molecular Biology 201 Schultz Labs Princeton University lcavanaugh at molbio.princeton.edu mayaguezcoqui at fastmail.fm From david at cnb.uam.es Tue Apr 22 08:19:24 2003 From: david at cnb.uam.es (David Garcia Aristegui) Date: Tue, 22 Apr 2003 10:19:24 +0200 Subject: Jemboss and Fink package problems In-Reply-To: <1AD1B986-743C-11D7-86A8-000393120AFA@fastmail.fm> References: <1AD1B986-743C-11D7-86A8-000393120AFA@fastmail.fm> Message-ID: You need Tomcat and Axis/SOAP to run Jemboss.
Look up where EMBOSS is installed with Fink ( /sw ??? ). To configure Jemboss to run as standalone: http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Jemboss/install/standalone.html MacOS X http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Jemboss/install/macosx_server.html Go to the directory where the script should be run from: cd EMBOSS-2.x.x/jemboss/utils Run the install-jemboss-server.sh script. ./install-jemboss-server.sh Go to the jemboss directory in the EMBOSS install directory ($EMBOSS_INSTALL/share/EMBOSS/jemboss) and try running Jemboss. Edit runJemboss.csh to set the following environment variables: setenv EMBOSS_INSTALL /usr/local/emboss/ setenv LD_LIBRARY_PATH $EMBOSS_INSTALL/lib For MacOSX also add: setenv DYLD_LIBRARY_PATH $EMBOSS_INSTALL/lib Also add the 'local' option for Jemboss to run in 'standalone' mode (very important: Java 1.3 or higher should be used): java org/emboss/jemboss/Jemboss local & Then try running it by typing ./runJemboss.csh. HTH, David. >Hi all, > >I have recently attempted to install Emboss and Jemboss to run in >standalone mode on my G4 laptop. I was unable to get the program >suite to compile properly using the Jemboss script, but was able to >get Fink to install the package for me. > >Anyone have ideas on how to configure Emboss to run as standalone >with a Jemboss interface with a Fink installation? I think Fink >only installed Emboss (which works fine from the command line). I >did install Jemboss, which launches, but asks for a username and >password on startup, so I think it's trying to run in client mode. > >Thanks! >Lorraine >~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >Lorraine Cavanaugh >Hughson Lab >Department of Molecular Biology >201 Schultz Labs >Princeton University > >lcavanaugh at molbio.princeton.edu >mayaguezcoqui at fastmail.fm From tcarver at hgmp.mrc.ac.uk Tue Apr 22 09:14:31 2003 From: tcarver at hgmp.mrc.ac.uk (Dr T.
Carver) Date: Tue, 22 Apr 2003 10:14:31 +0100 (BST) Subject: GUI In-Reply-To: <1050910831.3959.21.camel@lab207.ibpm.serpukhov.su> Message-ID: I would be very interested to know your ideas on improvements that can be made to the existing Jemboss GUI. We would certainly encourage and *very* much welcome any collaboration or code improvements to the GUI. Feedback and GUI enhancement to Jemboss are actively encouraged! The code, as it is part of the EMBOSS distribution, is freely available. In the early stages of the project people have had problems with the installation process. The install script has been improved and includes different types of installations. It is tested on standard installations of Solaris, linux, AIX, OSF, MacOSX and irix platforms. The problems that people encounter are mainly to do with the site specific set up. However, I think we could do better with documentation. You may be interested to know the work we are currently doing includes an integrated sequence editor and a multiple sequence editor. The early release of the multiple sequence editor can be found in Jemboss at the HGMP. This can be run separately from the interface but initially can be found as part of the main Jemboss GUI. There is also a CCP11 project at the HGMP to work on improving the graphics to EMBOSS and these will be filtered through to Jemboss. Regards Tim Carver HGMP On 21 Apr 2003, Maxim Telegin wrote: > Dear EMBOSS users! > Our group means to write some integrated GUI frontend using the EMBOSS > package. It is planed as all-in-one program, such as Vector NTI under > Windows platform plus such enhancements as advanced functions in > cloning-strategy planning and other. As base we are planning to use the > Gtk library (because we have some experience). Is such program will be > useful? We know that different GUI frontends (Jemboss, Kaptain etc) > exists, but we want to realize something more powerfull. > > Regards > Maxim A. 
Telegin > IBPM > Russian Academy of Sciences >

From ztu at msi.umn.edu Tue Apr 22 18:57:47 2003
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Tue, 22 Apr 2003 13:57:47 -0500 (CDT)
Subject: database ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/ for emboss
In-Reply-To: <1050929066.4565.63.camel@lab207.ibpm.serpukhov.su>
Message-ID:

Anyone has success story in "indexing" human genome at
ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/
for emboss?

They are fasta format files, I try to run formatdb these chromosomes
then dbiblast. But it always gives me some errors.

Some runs as

----------------------------------------------
swinst at bi7 [CHR_16up] % head chr16.fa
>gi|29824587|ref|NC_000016.4|NC_000016 Homo sapiens chromosome 16, complete sequence
TAACCCTAACCCTAACCCTAACCCTAACCCTAACCGACCCTCACCCTCACCCTAACCACATGAGCAATGT
GGGTGTTATATTTTAGCTGTCATGGGTGCATTAGGAATGCTGCATTTGTGTTTCAACGCTGCAACTGGAC
CCTGCAATGCAGCCCCTCGCCTTGCCTTGGGAGAATCTCGGTGCCCAGGATTCAGAGGGGCTTTTAGTTT
CCCATTTTCCACACTGAACCGTTCTAACTGGTCTCTGACCTTGATTATTCACGGCTGCAACCGGGAAAGA
TTTTATTCACTGTCAATGCGCCCCGAGTTGTCCCAAAGCCAGGCAGTGCCCCCAACGTCTGTGCTTAGCA
GAATGCTGCTCCACCTTTACGGTGACCCCCAGGTCTGTGCTGAGCAGAACGCAGCTCCGCCCTCGCAGTA
CCCTCAGCCCGCCCGCCCGGGTCTGACCTGAGCAGAACTCTGCTCTGCCTTCGCAGTACCACCGAAATCT
GTGCAAAGGAGAACGCAGCTCCGCCCTCGCGGTGCTCTCCGCGTCTGTGCTGAGGAGAACGCAACTCCGC
CGTCGCAAAGGCGCGCGCCGCGCCGGCGCAGGCGCAGAGGGGCGCGCCGCGCCGGCGCAGGCGCAGAGAC

swinst at bi7 [CHR_16up] % formatdb -i chr16.fa -p F -o T
swinst at bi7 [CHR_16up] % ls -l chr16*
-rw-r--r-- 1 swinst swinst 91281742 Apr 14 05:27 chr16.fa
-rw-r----- 1 swinst swinst      129 Apr 22 13:53 chr16.fa.nhr
-rw-r----- 1 swinst swinst       80 Apr 22 13:53 chr16.fa.nin
-rw-r----- 1 swinst swinst        8 Apr 22 13:53 chr16.fa.nnd
-rw-r----- 1 swinst swinst       52 Apr 22 13:53 chr16.fa.nni
-rw-r----- 1 swinst swinst      147 Apr 22 13:53 chr16.fa.nsd
-rw-r----- 1 swinst swinst       66 Apr 22 13:53 chr16.fa.nsi
-rw-r----- 1 swinst swinst 22518829 Apr 22 13:53 chr16.fa.nsq

swinst at bi7 [CHR_16up] % dbiblast
Index a BLAST database
Database name: chr16
Database directory [.]:
Wildcard database filename [chr16]: chr16.fa*
Release number [0.0]: 33
Index date [00/00/00]: 04/22/03
         N : nucleic
         P : protein
         ? : unknown
Sequence type [unknown]: N
         1 : wublast and setdb/pressdb
         2 : formatdb
         0 : unknown
Blast index version [unknown]: 2
swinst at bi7 [CHR_16up] % ls -rlt
-rw-r----- 1 swinst swinst        8 Apr 22 13:53 chr16.fa.nnd
-rw-r----- 1 swinst swinst       52 Apr 22 13:53 chr16.fa.nni
-rw-r----- 1 swinst swinst      147 Apr 22 13:53 chr16.fa.nsd
-rw-r----- 1 swinst swinst       66 Apr 22 13:53 chr16.fa.nsi
-rw-r----- 1 swinst swinst      129 Apr 22 13:53 chr16.fa.nhr
-rw-r----- 1 swinst swinst       80 Apr 22 13:53 chr16.fa.nin
-rw-r----- 1 swinst swinst 22518829 Apr 22 13:53 chr16.fa.nsq
-rw-r--r-- 1 swinst swinst      680 Apr 22 13:53 formatdb.log
-rw-r--r-- 1 swinst swinst      344 Apr 22 13:55 division.lkp
-rw-r--r-- 1 swinst swinst      320 Apr 22 13:55 entrynam.idx
-rw-r--r-- 1 swinst swinst      300 Apr 22 13:55 acnum.trg
-rw-r--r-- 1 swinst swinst      300 Apr 22 13:55 acnum.hit
swinst at bi7 [CHR_16up] %

--------------------------------------------------------------------------

Thanks,

Tu

----------------------------------------------------------------
Zheng Jin Tu
Computational Biology Specialist
Supercomputing Institute
599 Walter Library
117 Pleasant Street SE
University of Minnesota
Minneapolis, Minnesota 55455
email: ztu at msi.umn.edu        help email: help at msi.umn.edu
phone: 612-624-9504, 624-0115   help phone: 612-626-0802
fax: 612-624-8861
-----------------------------------------------------------------

From ztu at msi.umn.edu Tue Apr 22 19:09:49 2003
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Tue, 22 Apr 2003 14:09:49 -0500 (CDT)
Subject: database ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/ for emboss
In-Reply-To:
Message-ID:

Here is some more message related to this question:

on .embossrc file:

DB chr16 [
   type: N
   method: blast
   release: "33"
   format: ncbi
   dir: /usr/local/db/embossdb/H_sapiens/build_33/CHR_16up
   file: chr16.fa*
   comment: "Human chr 16 from ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/April_14_2003/"
]

Then try to run

fuzztran -sequence=chr16 -pattern="CC" -mismatch=0 -frame=6 -outf=myout
Protein pattern search after translation

   EMBOSS An error in ajseqdb.c at line 4006:
error reading file /usr/local/db/embossdb/H_sapiens/build_33/CHR_16up/chr16.fa.nhr

Thanks,

Tu

----------------------------------------------------------------
Zheng Jin Tu
Computational Biology Specialist
Supercomputing Institute
599 Walter Library
117 Pleasant Street SE
University of Minnesota
Minneapolis, Minnesota 55455
email: ztu at msi.umn.edu        help email: help at msi.umn.edu
phone: 612-624-9504, 624-0115   help phone: 612-626-0802
fax: 612-624-8861
-----------------------------------------------------------------

On Tue, 22 Apr 2003, Zheng Jin Tu wrote:

>
> Anyone has success story in "indexing" human genome at
> ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/
> for emboss?
>
> They are fasta format files, I try to run formatdb these chromosomes
> then dbiblast. But it always gives me some errors.

From yezq at mail.cbi.pku.edu.cn Wed Apr 23 02:10:15 2003
From: yezq at mail.cbi.pku.edu.cn (Zhiqiang Ye)
Date: Wed, 23 Apr 2003 10:10:15 +0800
Subject: database ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/for emboss
Message-ID: <200304230205.h3N25D3E000398@mail.cbi.pku.edu.cn>

you can just run dbifasta, you don't need formatdb.

======= On 2003-04-22 13:57:00 you wrote: =======

>Anyone has success story in "indexing" human genome at
>ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/April_14_2003/
>for emboss?
>
>They are fasta format files, I try to run formatdb these chromosomes
>then dbiblast. But it always gives me some errors.

= = = = = = = = = = = = = = = = = = = =

Best Wishes!

Zhiqiang Ye
2003-04-23

From calvinwangxi at yahoo.com Wed Apr 23 11:42:39 2003
From: calvinwangxi at yahoo.com (calvin wang)
Date: Wed, 23 Apr 2003 04:42:39 -0700 (PDT)
Subject: kaptain
Message-ID: <20030423114239.6054.qmail@web20514.mail.yahoo.com>

I have just installed kaptain but I can not run it...
kaptain --version gives me an error msg, and I can not run any emboss program via kaptain.
bash-2.05b# kaptain --version
kaptain 0.71
Copyright (C) 2000-2002 Terék Zsolt

Mutex destroy failure: Device or resource busy

I assume it is kaptain wossname for example... but anyway, is that message usual after kaptain --version?

From Joerg.Schaber at uv.es Wed Apr 23 15:46:40 2003
From: Joerg.Schaber at uv.es (Joerg Schaber)
Date: Wed, 23 Apr 2003 17:46:40 +0200
Subject: coderet problems
Message-ID: <3EA6B560.2010308@uv.es>

Hi,

I am having problems extracting CDS from NCBI flat files using coderet. I keep getting the error 'Unable to read sequence'.
I guess the problem could be solved if I used the appropriate command line arguments like -sformat1, for instance. However, the docs do not state what options I have for the associated qualifiers.
Any idea what could be the problem or what options there are for the qualifiers?

joerg

From Wiepert.Mathieu at mayo.edu Wed Apr 23 16:08:52 2003
From: Wiepert.Mathieu at mayo.edu (Wiepert, Mathieu)
Date: Wed, 23 Apr 2003 11:08:52 -0500
Subject: coderet problems
Message-ID: <2F41CC6C9777D311ACBD009027B108EA0541C746@excsrv32.mayo.edu>

Hi,

Have you tried the coderet -help -verbose options? That should give you all the possible parameters available. I couldn't tell you what they all do though, sorry...

~ $ coderet -help -verbose
   Mandatory qualifiers:
  [-seqall]            seqall     Sequence database USA
  [-seqout]            seqout     Output sequence USA

   Optional qualifiers: (none)
   Advanced qualifiers:
   -[no]cds            boolean    Extract CDS sequences
   -[no]mrna           boolean    Extract mrna sequences
   -[no]translation    boolean    Extract translated sequences

   Associated qualifiers:

   "-seqall" related qualifiers
   -sbegin1            integer    First base used
   -send1              integer    Last base used, def=seq length
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sopenfile1         string     Input filename
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-seqout" related qualifiers
   -osformat2          string     Output seq format
   -osextension2       string     File name extension
   -osname2            string     Base file name
   -osdbname2          string     Database name to add
   -ossingle2          boolean    Separate file for each entry
   -oufo2              string     UFO features
   -offormat2          string     Features format
   -ofname2            string     Features file name

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for required and optional values
   -debug              boolean    Write debug output to program.dbg
   -acdlog             boolean    Write ACD processing log to program.acdlog
   -acdpretty          boolean    Rewrite ACD file as program.acdpretty
   -acdtable           boolean    Write HTML table of options
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options.
                                  More information on associated and
                                  general qualifiers can be found with
                                  -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report deaths

From Joerg.Schaber at uv.es Wed Apr 23 16:47:18 2003
From: Joerg.Schaber at uv.es (Joerg Schaber)
Date: Wed, 23 Apr 2003 18:47:18 +0200
Subject: coderet again
Message-ID: <3EA6C396.8060507@uv.es>

ok, I applied coderet to the same feature table as before but from EMBL instead of NCBI and it worked. I conclude that it was indeed a format problem. Any idea if that is a general bug of coderet? It seemed to know the ncbi format, though (debug file).

joerg

From ablavier at wanadoo.fr Wed Apr 23 20:03:12 2003
From: ablavier at wanadoo.fr (André Blavier)
Date: Wed, 23 Apr 2003 22:03:12 +0200
Subject: EMBOSS for Windows
Message-ID: <001701c309d3$63f62110$0100a8c0@bach>

I have started to work on porting Emboss to Windows. I have encountered very few problems so far.
Look at http://perso.wanadoo.fr/ablavier/embosswin/embosswin.html and tell me what you think, if this work is of interest to you.

--
André Blavier

From David.Bauer at SCHERING.DE Thu Apr 24 05:29:57 2003
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Thu, 24 Apr 2003 07:29:57 +0200
Subject: Antwort: coderet again
Message-ID:

What EMBOSS version are you using?
In older EMBOSS releases the programs reading feature tables did not understand GenBank format.

David.

From pmr at ebi.ac.uk Thu Apr 24 08:05:15 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 24 Apr 2003 09:05:15 +0100
Subject: coderet again
References: <3EA6C396.8060507@uv.es>
Message-ID: <3EA79ABB.7040703@ebi.ac.uk>

Joerg Schaber wrote:
> ok, I applied coderet to the same feature table as before but from EMBL
> instead of NCBI and it worked. I conclude that it was indeed a format
> problem. Any idea if that is a general bug of coderet? It seemed to know
> the ncbi format, though (debug file).

NCBI format does not have any feature information - it is a FASTA file with an NCBI style ID.

Was it perhaps GENBANK format that you were reading?

Hope this helps,

Peter

From gbottu at black.vub.ac.be Thu Apr 24 08:56:15 2003
From: gbottu at black.vub.ac.be (Guy Bottu)
Date: Thu, 24 Apr 2003 10:56:15 +0200
Subject: Preferred isoschizomer (bis)
In-Reply-To: <003d01c30a31$4813ac70$0402a6c1@windows.csc.fi>; from eija.korpelainen@csc.fi on Thu, Apr 24, 2003 at 10:15:27AM +0300
References: <20030415205431.A1419113@black.vub.ac.be> <003d01c30a31$4813ac70$0402a6c1@windows.csc.fi>
Message-ID: <20030424105615.A1078815@black.vub.ac.be>

Dear colleagues,

I took a second look, and it is even worse: the file withrefm.304 contains as many as 149 enzymes with restriction site CTGCAG. The file embossre.enz contains 2 sites CTGCAG (BspMAI and PstI) and 147 sites ctgcag. When I run restrict with parameter -nolimit it finds the 2 sites, and when I run it in default mode it finds only BspMAI.

There is clearly a bug+misfeature in the programs rebaseextract+restrict. One would expect that restrict by default finds PstI and with -nolimit finds all 149 enzymes (although this would give a monstrous output).

Sincerely,
Guy Bottu

From arunanirudhan at yahoo.co.in Thu Apr 24 09:20:47 2003
From: arunanirudhan at yahoo.co.in (arun anirudhan)
Date: Thu, 24 Apr 2003 10:20:47 +0100 (BST)
Subject: Database
Message-ID: <20030424092047.60966.qmail@web8204.mail.in.yahoo.com>

Hello all,

I am new to EMBOSS. showdb is showing the results correctly, but seqret is showing this result:

[arun at localhost arun]$ seqret
Reads and writes (returns) sequences
Input sequence(s): embl:L07770
Warning: Cannot open division file '' for database 'embl'
Warning: seqCdQry failed
Error: Unable to read sequence 'embl:L07770'

Please help

Arun Anirudhan

From pmr at ebi.ac.uk Thu Apr 24 09:35:19 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 24 Apr 2003 10:35:19 +0100
Subject: Database
References: <20030424092047.60966.qmail@web8204.mail.in.yahoo.com>
Message-ID: <3EA7AFD7.4070906@ebi.ac.uk>

arun anirudhan wrote:
> Hello all
> I am new to emboss.
> showdb is showing the results correctly.
> But seqret is showing this result
> [arun at localhost arun]$ seqret
> Reads and writes (returns) sequences
> Input sequence(s): embl:L07770
> Warning: Cannot open division file '' for database 'embl'
> Warning: seqCdQry failed
> Error: Unable to read sequence 'embl:L07770'
> Please help

You have EMBL defined as a database, but you have either not defined the correct access method or have not indexed it correctly.

Assuming you have EMBL locally, you could index it with the dbiflat program (or use SRS if you have it :-)

If you have EMBL defined as remote access, remember that L07770 is an accession number, not the ID (which is XLRHODOP).

You can try this definition to use ID and accession number searches:

DB embl [
   type: N
   method: srswww
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
   comment: "EMBL from EBI"
   dbalias: embl
]

This uses the EBI's SRS server to query by ID and accession number for a USA that does not specify which kind of identifier you are using.

Hope this helps,

Peter Rice

From Joerg.Schaber at uv.es Thu Apr 24 13:38:55 2003
From: Joerg.Schaber at uv.es (Joerg Schaber)
Date: Thu, 24 Apr 2003 15:38:55 +0200
Subject: Antwort: coderet again
References:
Message-ID: <3EA7E8EF.9010909@uv.es>

I updated to version 2.6.0 but the problem remained. However, I figured out that in my emboss.default file I had erroneously set the database type to P instead of N. After I changed that, coderet worked perfectly well.

However, I encountered a new problem with the new version.
I tried the new, very useful '-describe' qualifier of extractfeat and received the error message "Died: unknown qualifier -describe". Any idea what could be the problem here?

Thanks,

joerg

David.Bauer at SCHERING.DE wrote:

>What EMBOSS version are you using?
>In older EMBOSS releases the programs reading feature tables did not
>understand GenBank format.
>
>David.

--
----------------------------------------------------------
Jörg Schaber
Instituto Cavanilles de Biodiversidad y Biologia Evolutiva
Universidad de Valencia         Tel.: ++34 96 354 3666
A.C. 22085                      Fax.: ++34 96 354 3670
46071 Valencia, España          email : jos at uv.es

From jan.wuyts at gengenp.rug.ac.be Thu Apr 24 13:48:16 2003
From: jan.wuyts at gengenp.rug.ac.be (Jan Wuyts)
Date: Thu, 24 Apr 2003 15:48:16 +0200 (MEST)
Subject: matcher score calculation
Message-ID:

Dear all,

I am trying to use 'matcher' to do a local alignment of a small RNA sequence against a larger one. However, the output confuses me a bit.

For example:

matcher seq1 seq2 -alternatives 9 -stdout -auto > output

The best (first) match in the output is this:

########################################
# Program: matcher
# Rundate: Thu Apr 24 15:21:41 2003
# Align_format: markx0
# Report_file: stdout
########################################
#=======================================
#
# Aligned_sequences: 2
# 1: 21
# 2: 21-1
# Matrix: EDNAFULL
# Gap_penalty: 16
# Extend_penalty: 4
#
# Length: 18
# Identity:      16/18 (88.9%)
# Similarity:    13/18 (72.2%)
# Gaps:           0/18 ( 0.0%)
# Score: 61
#
#
#=======================================

            10        20
21     GCAGCAUCAUCAAGAUUC
       :::::: :::.:::::::
21-1   GCAGCACCAUUAAGAUUC
           440       450
#=======================================

Apparently 16 positions are identical (seems right, there are 16 ':') but only 13 are counted as similar. First of all, I don't understand why CU would be counted as similar (this score is after all negative in EDNAFULL) and second, how can it be that #similar is smaller than #identical?

The manual (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Themes/AlignFormats.html) states that "Any two residues or bases are defined as similar when they have positive comparisons (as defined by the comparison matrix being used in the alignment algorithm)." and a bit further "Note that the sum of identical and similar positions is greater than 100%. This is because the count of similar positions includes the count of identical positions; if residues are identical, they must also be similar." Therefore I would think #similar must always be >= #identical.

Lastly, when I calculate the score manually, I get 16x5-2x4=72 (in EDNAFULL, 5 is used for all non-ambiguous matches, -4 for all non-ambiguous mismatches) while matcher calculates the score to be 61.

Any help on this would be greatly appreciated.

Greetings,

Jan.
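
The manual calculation in the message above can be checked with a few lines of Python. This is only a sketch under stated assumptions: it uses the simplified EDNAFULL scoring quoted in the message (+5 for identical unambiguous bases, -4 for mismatches), it hard-codes the two aligned strings from the matcher report, and it treats U as T. It does not reproduce matcher's own algorithm, and the function name score_ungapped is made up for illustration.

```python
# Simplified EDNAFULL values as quoted in the message (assumption:
# only unambiguous bases occur, so the full matrix is not needed).
MATCH = 5
MISMATCH = -4


def score_ungapped(a, b):
    """Return (identities, mismatches, score) for two gap-free aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned strings must have the same length")
    # Treat RNA and DNA alike by mapping U to T before comparing.
    a = a.upper().replace("U", "T")
    b = b.upper().replace("U", "T")
    identities = sum(1 for x, y in zip(a, b) if x == y)
    mismatches = len(a) - identities
    score = identities * MATCH + mismatches * MISMATCH
    return identities, mismatches, score


# The two aligned segments copied from the matcher report above.
SEQ_21 = "GCAGCAUCAUCAAGAUUC"
SEQ_21_1 = "GCAGCACCAUUAAGAUUC"

if __name__ == "__main__":
    ident, mism, score = score_ungapped(SEQ_21, SEQ_21_1)
    print(f"identities={ident} mismatches={mism} score={score}")
```

With these inputs the script reports 16 identities, 2 mismatches and a score of 72, which agrees with the hand calculation and not with the 61 printed by matcher.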

From gwilliam at hgmp.mrc.ac.uk Thu Apr 24 13:50:06 2003
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Thu, 24 Apr 2003 14:50:06 +0100
Subject: Antwort: coderet again
References: <3EA7E8EF.9010909@uv.es>
Message-ID: <3EA7EB8E.BA88FE38@hgmp.mrc.ac.uk>

The '-describe' option will be available in the next release (2.7.0) of EMBOSS.

See the Change Log for version 2.7.0:
http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/ChangeLog.html#0

Gary

Joerg Schaber wrote:

> However, I encountered a new problem with the new version. I tried the
> new very useful '-describe' qualifer of extractfeat and received the
> error message " Died: unknown qualifier -describe".

--
Gary Williams
Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk
http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From David.Bauer at SCHERING.DE Thu Apr 24 13:54:56 2003
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Thu, 24 Apr 2003 15:54:56 +0200
Subject: coderet again
Message-ID:

Ooops, in my version 2.6.0 extractfeat does not have this option. Maybe Gary has an idea?

David.

From David.Bauer at SCHERING.DE Thu Apr 24 14:07:07 2003
From: David.Bauer at SCHERING.DE (David.Bauer at SCHERING.DE)
Date: Thu, 24 Apr 2003 16:07:07 +0200
Subject: Antwort: matcher score calculation
Message-ID:

I had this problem a long time ago (and assumed it was fixed in the meantime).
Matcher doesn't like the "U".
If you change your RNA to DNA it will calculate the correct Similarity.

David.

From Jack.Leunissen at cmbi.kun.nl Thu Apr 24 19:10:52 2003
From: Jack.Leunissen at cmbi.kun.nl (Jack Leunissen)
Date: Thu, 24 Apr 2003 21:10:52 +0200
Subject: Antwort: matcher score calculation
In-Reply-To:
Message-ID: <000401c30a95$4bacda50$0300000a@kuifje>

What is even more surprising is that the match U->C is different from C->U. The first receives no 'dot' in the alignment, the latter does.

Interesting...

Jack A.M. Leunissen, Ph.D.
Dept. Genome Informatics
Wageningen University
6703 HA Wageningen, NL

> -----Original Message-----
> From: owner-emboss at hgmp.mrc.ac.uk
> [mailto:owner-emboss at hgmp.mrc.ac.uk] On Behalf Of
> David.Bauer at SCHERING.DE
> Sent: Thursday, 24 April, 2003 16:07
> To: jan.wuyts at gengenp.rug.ac.be
> Cc: emboss at embnet.org; jan.wuyts at gengenp.rug.ac.be;
> owner-emboss at hgmp.mrc.ac.uk
> Subject: Antwort: matcher score calculation
>
> I had this problem long time ago (and assumed it was fixed in
> the meantime). Matcher doesn't like the "U". If you change
> your RNA to DNA it will calculate the correct Similarity.
>
> David.
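
The workaround suggested in this thread (convert the RNA to DNA before running matcher) is easy to script. The following is a minimal sketch, not an EMBOSS tool: it assumes plain FASTA input, the function name rna_to_dna_fasta is hypothetical, and it only touches sequence lines so that a 'U' in a header line is left alone.

```python
def rna_to_dna_fasta(text):
    """Replace U/u with T/t on the sequence lines of FASTA text.

    Header lines (starting with '>') are copied unchanged.
    """
    out = []
    for line in text.splitlines():
        if line.startswith(">"):
            out.append(line)
        else:
            out.append(line.replace("U", "T").replace("u", "t"))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # The first sequence from the alignment in this thread, as a small demo.
    sample = ">21\nGCAGCAUCAUCAAGAUUC\n"
    print(rna_to_dna_fasta(sample), end="")
```

The converted text can be written to a new file and passed to matcher in place of the RNA sequence. Whether the reported Similarity and Score then come out as expected depends on the EMBOSS version in use.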

From kvddrift at earthlink.net Sat Apr 26 13:01:00 2003
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat, 26 Apr 2003 09:01:00 -0400
Subject: EMBOSS on Mac OS X
Message-ID:

Hi,

I am not sure if this has been discussed here before, but it is now very easy to install EMBOSS on Mac OS X. Through the 'Fink' project, a Debian-like packaging tool, a package for EMBOSS 2.6.0 is available. The package was originally submitted by Ben Hines, but now I am the 'maintainer' of it.

I have also submitted packages for emboss-kaptain and kaptain to provide a GUI (also works with KDE) for EMBOSS on Mac OS X. Of course, all the credit goes to the developers of Kaptain and EMBOSS-kaptns - I just 'ported' the packages for Mac OS X.

If you are interested, please go to the Fink website (http://fink.sf.net), and download and install Fink. Now you can install any available package (e.g. 'fink install emboss'). This will take care of downloading, compiling and installing EMBOSS plus all the packages it needs. Be aware that this can take quite some time.

I would encourage all Mac OS X users to try this out, and see if it works as expected.

thanks,

- Koen van der Drift.

From siegmund at develogen.com Sun Apr 27 15:18:58 2003
From: siegmund at develogen.com (Thomas Siegmund)
Date: Sun, 27 Apr 2003 17:18:58 +0200
Subject: kaptain
In-Reply-To: <20030423114239.6054.qmail@web20514.mail.yahoo.com>
References: <20030423114239.6054.qmail@web20514.mail.yahoo.com>
Message-ID: <200304271718.58161.siegmund@develogen.com>

Dear Calvin,

I have not seen this message with kaptain, but I remember people saw it on RedHat systems with koffice. This was a problem with the location of the koffice binaries. Please try to search the KDE mailing list archive ( http://lists.kde.org/ ) for 'mutex destroy'. Or you could ask Terék Zsolt, the author of kaptain, directly. Maybe he has an idea.

Thomas

On Wednesday, 23 April 2003 13:42, calvin wang wrote:
> I have just installed kaptain but I can not run it...
> kaptain --version gives me an error msg, and I can not
> run any emboss program via kaptain.
> bash-2.05b# kaptain --version
> kaptain 0.71
> Copyright (C) 2000-2002 Terék Zsolt
>
> Mutex destroy failure: Device or resource busy
>
> I assume it is kaptain wossname for example... well
> but anyway is that msg usual after kaptain --version?

--
Thomas Siegmund, Ph.D.
DeveloGen AG
Bioinformatics and Data Management
Phone: +49(551) 505 58 651

From ablavier at wanadoo.fr Sun Apr 27 20:07:11 2003
From: ablavier at wanadoo.fr (André Blavier)
Date: Sun, 27 Apr 2003 22:07:11 +0200
Subject: EMBOSS for Windows: new build
Message-ID: <001901c30cf8$97f7ad80$0100a8c0@bach>

Most EMBOSS applications are now available in my Windows distribution. Visit http://perso.wanadoo.fr/ablavier/embosswin/embosswin.html

I have also added technical information about the way I produce native Windows EMBOSS programs, along with a source code distribution.

--
André Blavier

From kvddrift at earthlink.net Mon Apr 28 16:11:50 2003
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Mon, 28 Apr 2003 12:11:50 -0400
Subject: database info
Message-ID:

Hi,

I want to try out some EMBOSS programs that use database searching. From the docs it's not clear to me if and how I can search an online database. My disk space is limited, so I can't download a complete database to my HD. Or maybe there is some small database available somewhere that I can use?

thanks,

- Koen.

From pmr at ebi.ac.uk Mon Apr 28 16:50:03 2003
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 28 Apr 2003 17:50:03 +0100
Subject: database info
References:
Message-ID: <3EAD5BBB.9050006@ebi.ac.uk>

Koen van der Drift wrote:
> I want to try out some EMBOSS programs that use database searching. From
> the docs it's not clear to me if and how I can search an online
> database. My diskspace is limited, so I can't download a complete
> database to my HD.

You can point the database definitions to a remote SRS server (srs.ebi.ac.uk usually) using access method SRSWWW.

For example:

DB genbank [
   type: N
   method: srswww
   format: genbank
   release: NCBI
   comment: "Genbank from NCBI"
   url: "http://cbr-rbc.nrc-cnrc.gc.ca/srs6bin/cgi-bin/wgetz"
]
DB embl [
   type: N
   method: srswww
   format: embl
   release: EBI
   comment: "EMBL from EBI"
   url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz"
]

> Or maybe there is some small database available somwhere that I can use?

Yes ... there are databases defined in emboss.default.template with names tembl, tsw, and so on. These are small subsets of EMBL, SwissProt, etc. that we use for testing. The general principle is that any sequence used in the program documentation should appear there.

You can uncomment them, and set the value of emboss_tempdata to point to the share/EMBOSS/test directory where EMBOSS is installed (or the test directory in the source tree).

Hope this helps,

Peter Rice

From gbottu at ben.vub.ac.be Mon Apr 28 18:07:57 2003
From: gbottu at ben.vub.ac.be (Guy Bottu)
Date: Mon, 28 Apr 2003 20:07:57 +0200 (CEST)
Subject: problem with mse
Message-ID: <200304281807.h3SI7v3p1213597@black.vub.ac.be>

from : BEN

Dear colleagues,

I am using EMBOSS 2.6 under Compaq Tru64 5.1A. I was at first happy to see that mse could be started (the previous installation left the terminal stuck), but when I tried to use it I ran into trouble. It is impossible to save a multiple sequence alignment: the command "exit" does nothing, and the command "write" saves only one sequence into a file. Am I missing something?
Sincerely, Guy Bottu From kvddrift at earthlink.net Mon Apr 28 19:02:46 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 28 Apr 2003 15:02:46 -0400 Subject: database info In-Reply-To: <3EAD5BBB.9050006@ebi.ac.uk> References: <3EAD5BBB.9050006@ebi.ac.uk> Message-ID: At 17:50 +0100 4/28/03, Peter Rice wrote: >DB genbank [ type: N > method: srswww format: genbank release: NCBI > comment: "Genbank from NCBI" > url: "http://cbr-rbc.nrc-cnrc.gc.ca/srs6bin/cgi-bin/wgetz" > ] >DB embl [ type: N > method: srswww format: embl release: EBI > comment: "EMBL from EBI" > url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz" > ] > > I added these entries to .embossrc, and they then indeed show up when I run showdb. Following the example in the tutorial (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Tutorial/node11.html), I now run seqret, but get the following error: Input sequence(s): embl:xlrhodop Error: Unable to read sequence 'embl:xlrhodop' Do I need to do something else before I can use seqret (and other programs)? Is there a place in the docs on how to use/access databases? thanks again, - Koen. From burke at airmail.net Mon Apr 28 19:17:09 2003 From: burke at airmail.net (Burke Squires) Date: Mon, 28 Apr 2003 14:17:09 -0500 Subject: Extracting protein translation with extractfeat Message-ID: Hello, I would like to extract the protein translation for genes in a Genbank file. I need the name to also be extracted so I must use extractfeat but I am not sure what of the myriad of option to put together to get the amino acid sequence out with the name. The Example lists a mod_res but nothing else. Got any ideas? Thanks, Burke -- Burke Squires Bioinformatics MacroGenics, Inc. 
Dallas, TX From Marc.Logghe at devgen.com Mon Apr 28 19:57:55 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Mon, 28 Apr 2003 21:57:55 +0200 Subject: database info Message-ID: Hi Koen, You probably also have to set the proxy in your .embossrc file: SET emboss_proxy "yourproxy:80" Marc > -----Original Message----- > From: Koen van der Drift [mailto:kvddrift at earthlink.net] > Sent: Monday, April 28, 2003 9:03 PM > To: Peter Rice > Cc: emboss at embnet.org > Subject: Re: database info > > > At 17:50 +0100 4/28/03, Peter Rice wrote: > > >DB genbank [ type: N > > method: srswww format: genbank release: NCBI > > comment: "Genbank from NCBI" > > url: "http://cbr-rbc.nrc-cnrc.gc.ca/srs6bin/cgi-bin/wgetz" > > ] > >DB embl [ type: N > > method: srswww format: embl release: EBI > > comment: "EMBL from EBI" > > url: "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz" > > ] > > > > > > > I added these entries to .embossrc, and they then indeed show up when > I run showdb. > > Following the example in the tutorial > (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Tutorial/node11.html), > I now run seqret, but get the following error: > > Input sequence(s): embl:xlrhodop > Error: Unable to read sequence 'embl:xlrhodop' > > > Do I need to do something else before I can use seqret (and > other programs)? > > Is there a place in the docs on how to use/access databases? > > > > thanks again, > > - Koen. > From kvddrift at earthlink.net Mon Apr 28 20:19:30 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 28 Apr 2003 16:19:30 -0400 Subject: database info In-Reply-To: References: Message-ID: At 21:57 +0200 4/28/03, Marc Logghe wrote: >Hi Koen, >You probably also have to set the proxy in your .embossrc file: >SET emboss_proxy "yourproxy:80" >Marc Thanks Marc, That doesn't seem to work - I never use a proxy anyway, so I guess the problem must be something else. I also tried Pauls 2nd suggestion, by using the smaller databases that are in the EMBOSS package. 
According to the docs, I added SET emboss_tempdata /sw/share/EMBOSS/test to emboss.default.template (in my case on Mac OS X - this is the correct location of test). When I now run showdb, there are no databases listed. Did I miss something else? thanks again, - Koen. From Marc.Logghe at devgen.com Mon Apr 28 20:36:42 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Mon, 28 Apr 2003 22:36:42 +0200 Subject: some seqret questions Message-ID: Hi all, I was wondering whether there was a way (preferably emboss, but other suggestions are also welcome) to fetch a native sequence record. After some seqret experiments I realized a number of tags are discarded. For instance, I don't get the reference fields back from genbank records, nor the cross-reference fields from swissprot, ipi, etc. (used methods: srswww). Is this also the case when genbank has been 'dbiflatted' ? Another thing. Is there a global variable like emboss_feature to switch on the -feature option by default ? Regards, Marc From kvddrift at earthlink.net Mon Apr 28 23:22:02 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 28 Apr 2003 19:22:02 -0400 Subject: database info In-Reply-To: References: Message-ID: At 4:19 PM -0400 4/28/03, Koen van der Drift wrote: > >to emboss.default.template Duh - that should be emboss.default Now it works. Sorry for the noise ;) - Koen. From ivahser at i.com.ua Tue Apr 29 05:20:18 2003 From: ivahser at i.com.ua (Sergiy Ivakhno) Date: Tue, 29 Apr 2003 08:20:18 +0300 Subject: Briefings in Bioinformatics References: Message-ID: <001001c30e0f$07c03080$75f909d4@bioinformatics> Dear friends. This is more a desperate appeal for help than a question. The problem is that I urgently need two articles from the journal Briefings in Bioinformatics. I have tried almost everything: I have searched all the libraries and have written to the authors to ask for any kind of reprints. 
If somebody holds a subscription to Briefings in Bioinformatics and has access to the online archive of this journal at Ingenta, I would be very grateful if you could send me PDF versions of these articles. The URLs are given below. A comparison of microarray databases. Gardiner-Garden M.; Littlejohn T.G. Briefings in Bioinformatics, May 2001, vol. 2, no. 2, pp. 143-158(16) http://www.ingenta.com/isis/searching/Availability/ingenta?pub=infobike://hsp/bib/2001/00000002/00000002/art00004&targetId=1051380412858&WebLogicSession=PbDvzgxLljrAkAbKboZ7|2603703981800765476/-1052814329/6/7051/7051/7052/7052/7051/-1 A review of bioinformatics education in the UK. Counsell D. Briefings in Bioinformatics, March 2003, vol. 4, no. 1, pp. 7-21(15) http://www.ingenta.com/isis/searching/Availability/ingenta?pub=infobike://hsp/bib/2003/00000004/00000001/art00002&targetId=1051380234757&WebLogicSession=PbDvzgxLljrAkAbKboZ7|2603703981800765476/-1052814329/6/7051/7051/7052/7052/7051/-1 Thank you in advance. My email is ivahser at i.com.ua. Sergiy. From simon.andrews at bbsrc.ac.uk Tue Apr 29 07:22:18 2003 From: simon.andrews at bbsrc.ac.uk (simon andrews (BI)) Date: Tue, 29 Apr 2003 08:22:18 +0100 Subject: some seqret questions Message-ID: <2DC41140A89ED411989D00508BDCD9ED01E289C2@bi-exsrv1.iapc.bbsrc.ac.uk> > -----Original Message----- > From: Marc Logghe [mailto:Marc.Logghe at devgen.com] > Sent: 28 April 2003 21:37 > To: Emboss (E-mail) > Subject: some seqret questions > > > Hi all, > I was wondering whether there was a way (preferably emboss, but other > suggestions are also welcome) to fetch a native sequence record. 
tfm entret Hope this helps Simon, PS I spent ages playing with seqret options before I was pointed to this :-) From arunanirudhan at yahoo.co.in Tue Apr 29 07:26:22 2003 From: arunanirudhan at yahoo.co.in (=?iso-8859-1?q?arun=20anirudhan?=) Date: Tue, 29 Apr 2003 08:26:22 +0100 (BST) Subject: database info In-Reply-To: <3EAD5BBB.9050006@ebi.ac.uk> Message-ID: <20030429072622.9105.qmail@web8206.mail.in.yahoo.com> Thank you very much for your valuable suggestion. I have one problem now. seqret embl:L07770 is giving result from embl website. But is there a way to search embl site via a command like this seqret embl:insulin* This is giving an error message Die seqret terminated: Bad value for option [sequence] and no prompt. Please help Arun --- Peter Rice wrote: > Koen van der Drift wrote: > > I want to try out some EMBOSS programs that use > database searching. From > > the docs it's not clear to me if and how I can > search an online > > database. My diskspace is limited, so I can't > download a complete > > database to my HD. > > You can point the database definitions to a remote > SRS server > (srs.ebi.ac.uk usually) using access method SRSWWW > > For example: > > DB genbank [ type: N > method: srswww format: genbank release: NCBI > comment: "Genbank from NCBI" > url: > "http://cbr-rbc.nrc-cnrc.gc.ca/srs6bin/cgi-bin/wgetz" > ] > DB embl [ type: N > method: srswww format: embl release: EBI > comment: "EMBL from EBI" > url: > "http://srs.ebi.ac.uk/srs7bin/cgi-bin/wgetz" > ] > > > > Or maybe there is some small database available > somwhere that I can use? > > Yes ... there are databases defined in > emboss.default.template with > names tembl, tsw, and so on. These are small subsets > of EMBL, SwissProt, > etc. that we use for testing. The general principle > is that any sequence > used in the program documentation should appear > there. 
> > You can uncomment them, and set the value of > emboss_tempdata to point to > the share/EMBOSS/test directory where EMBOSS is > installed (or the test > directory in the source tree). > > Hope this helps, > > Peter Rice From Marc.Logghe at devgen.com Tue Apr 29 07:47:00 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue, 29 Apr 2003 09:47:00 +0200 Subject: some seqret questions Message-ID: Simon ! Thanks a lot. I think I was just repeating your seqret playing ;-) Entret is it !!! > -----Original Message----- > From: simon andrews (BI) [mailto:simon.andrews at bbsrc.ac.uk] > Sent: Tuesday, April 29, 2003 9:22 AM > To: Emboss (E-mail) > Subject: RE: some seqret questions > > > > > > -----Original Message----- > > From: Marc Logghe [mailto:Marc.Logghe at devgen.com] > > Sent: 28 April 2003 21:37 > > To: Emboss (E-mail) > > Subject: some seqret questions > > > > > > Hi all, > > I was wondering whether there was a way (preferably emboss, > but other > > suggestions are also welcome) to fetch a native sequence record. > > > tfm entret > > Hope this helps > > Simon, > > PS I spent ages playing with seqret options before I was > pointed to this :-) > From gwilliam at hgmp.mrc.ac.uk Tue Apr 29 08:22:44 2003 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Tue, 29 Apr 2003 09:22:44 +0100 Subject: Extracting protein translation with extractfeat References: Message-ID: <3EAE3654.D357DC29@hgmp.mrc.ac.uk> Try using 'coderet -nocds -nomrna'. Gary Burke Squires wrote: > > Hello, > > I would like to extract the protein translation for genes in a Genbank file. > I need the name to also be extracted so I must use extractfeat but I am not > sure what of the myriad of option to put together to get the amino acid > sequence out with the name. 
The Example lists a mod_res but nothing else. > Got any ideas? > > Thanks, > > Burke > > -- > Burke Squires > Bioinformatics > MacroGenics, Inc. > Dallas, TX -- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at rfcgr.mrc.ac.uk http://www.rfcgr.mrc.ac.uk/ Bioinformatics, MRC RFCGR, Hinxton, Cambridge, CB10 1SB, UK From Marc.Logghe at devgen.com Tue Apr 29 10:27:39 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue, 29 Apr 2003 12:27:39 +0200 Subject: entret bug or feature ? Message-ID: Hi all, Thanks to Simon Andrews I had a new toy to play with: entret. And I have already a question about that: why is a seqret returning something and entret not. I tried the following with a local ipi database, which is actually a blast database: seqret ipi:ENSP00000289136 -debug and entret ipi:ENSP00000289136 -debug Like I mentioned I don't get anything back from entret. Why is that ? I'll include both debug files <> <> In emboss.default the ipi database is defined as: DB ipi [ type: P format: ncbi method: app app: "fastacmd -d ipi -s %s" ] Thanks, ML *********************************************************** Marc Logghe, Ph.D. Senior Scientist Scientific Computing Group deVGen Technologiepark 9 9052 Zwijnaarde Belgium tel: +32 (0) 9 324 24 88 fax: +32 (0) 9 324 24 25 *********************************************************** -------------- next part -------------- A non-text attachment was scrubbed... Name: seqret.dbg Type: application/octet-stream Size: 9517 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: entret.dbg Type: application/octet-stream Size: 8704 bytes Desc: not available URL: From pmr at ebi.ac.uk Tue Apr 29 12:58:42 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 29 Apr 2003 13:58:42 +0100 Subject: entret bug or feature ? 
References: Message-ID: <3EAE7702.60902@ebi.ac.uk> Marc Logghe wrote: > Hi all, > Thanks to Simon Andrews I had a new toy to play with: entret. > And I have already a question about that: why is a seqret returning > something and entret not. > In emboss.default the ipi database is defined as: > DB ipi [ > type: P > format: ncbi > method: app > app: "fastacmd -d ipi -s %s" > ] NCBI format is not saving the text (because originally we expected NCBI to be a file format, not a database format). Somehow it still does not save the text properly, and some of the other sequence formats are not happy in entret. I will add it for the next release. Thanks for pointing this out. Peter Rice From kvddrift at earthlink.net Tue Apr 29 14:26:21 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 29 Apr 2003 10:26:21 -0400 Subject: database info In-Reply-To: References: Message-ID: Hi, I am trying some EMBOSS programs out on Mac OS X, and it works very nice with the kaptain-GUI. Which EMBOSS programs should I use to list all entries in one database (either offline or online). Many times I try something by wild guessing, I get an error 'entry not in database'. thanks, - Koen. From gwilliam at hgmp.mrc.ac.uk Tue Apr 29 14:29:51 2003 From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522) Date: Tue, 29 Apr 2003 15:29:51 +0100 Subject: database info References: Message-ID: <3EAE8C5F.8DF21874@hgmp.mrc.ac.uk> infoseq Koen van der Drift wrote: > > Hi, > > I am trying some EMBOSS programs out on Mac OS X, and it works very > nice with the kaptain-GUI. > > Which EMBOSS programs should I use to list all entries in one > database (either offline or online). Many times I try something by > wild guessing, I get an error 'entry not in database'. > > thanks, > > - Koen. 
-- Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512 mailto:G.Williams at rfcgr.mrc.ac.uk http://www.rfcgr.mrc.ac.uk/ Bioinformatics, MRC RFCGR, Hinxton, Cambridge, CB10 1SB, UK From hchen at genetics.ac.cn Tue Apr 29 14:32:18 2003 From: hchen at genetics.ac.cn (Chen Hao) Date: 29 Apr 2003 22:32:18 +0800 Subject: is there any new tutorial for the last version EMBOSS ? Message-ID: <1051626742.3466.49.camel@localhost> Hi all, I'd like to know if there is any new tutorial for last version EMBOSS . if you know , point me out ,or if you have a ps / pdf format file ,do me the favor to e-mail my address (hchen at genetics.ac.cn) Thank you very much! Motata From d.m.a.martin at dundee.ac.uk Tue Apr 29 14:35:19 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Tue, 29 Apr 2003 15:35:19 +0100 Subject: is there any new tutorial for the last version EMBOSS ? In-Reply-To: <1051626742.3466.49.camel@localhost> Message-ID: On 29/4/03 3:32 pm, "Chen Hao" wrote: > Hi all, > > I'd like to know if there is any new tutorial for last version EMBOSS . > if you know , point me out ,or if you have a ps / pdf format file ,do me > the favor to e-mail my address (hchen at genetics.ac.cn) > I added some bits to the tutorial (report formats etc.) and rewrote a few things. Lisa Mullen at HGMP is probably the person to ask as I have passed all my modifications back to her for eventual dissemination ..d > Thank you very much! > > Motata > > > -- David Martin PhD Bioinformatics Scientific Officer Post-Genomics and Molecular Interactions Centre University of Dundee From jison at hgmp.mrc.ac.uk Tue Apr 29 14:38:40 2003 From: jison at hgmp.mrc.ac.uk (Dr J.C. Ison) Date: Tue, 29 Apr 2003 15:38:40 +0100 Subject: is there any new tutorial for the last version EMBOSS ? 
References: <1051626742.3466.49.camel@localhost> Message-ID: <3EAE8E70.238EABE1@hgmp.mrc.ac.uk> Hi Motata An EMBOSS programming course is available on-line: http://www.hgmp.mrc.ac.uk/CCP11/CCP11courses/EMBOSS-Course/emboss_index.html and that covers many of the basic concepts in using emboss too. If you're interested in more user-level documentation, what we have is available here: http://www.hgmp.mrc.ac.uk/Software/EMBOSS/userdoc.html The tutorial itself is a bit out of date, but the other documents are current. Cheers J. Chen Hao wrote: > Hi all, > > I'd like to know if there is any new tutorial for last version EMBOSS . > if you know , point me out ,or if you have a ps / pdf format file ,do me > the favor to e-mail my address (hchen at genetics.ac.cn) > > Thank you very much! > > Motata -- Jon C. Ison, PhD Bioinformatics Applications Group UK MRC Human Genome Mapping Project Resource Centre Hinxton, Cambridge, CB10 1SB, UK E-mail : jison at hgmp.mrc.ac.uk Tel : 01223 49-4548 HGMP-RC: http://www.hgmp.mrc.ac.uk/ EMBOSS : http://www.hgmp.mrc.ac.uk/Software/EMBOSS/ From kvddrift at earthlink.net Tue Apr 29 14:38:43 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 29 Apr 2003 10:38:43 -0400 Subject: database info In-Reply-To: <3EAE8C5F.8DF21874@hgmp.mrc.ac.uk> References: <3EAE8C5F.8DF21874@hgmp.mrc.ac.uk> Message-ID: At 15:29 +0100 4/29/03, Gary Williams, Tel 01223 494522 wrote: >infoseq This only gives the info of one sequence. What I was looking for is a program that lists all sequences/entries in a database, so I don't have to type in sw:foo, sw:bar until I get a valid entry. thanks, - Koen. From Marc.Logghe at devgen.com Tue Apr 29 14:51:01 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue, 29 Apr 2003 16:51:01 +0200 Subject: database info Message-ID: Koen, is it only to obtain one way or another a valid ID for a sequence from your database ? If that is the case, you can e.g. 
ask for the first entry in the database like this: seqret sw -firstonly I did not try it with remote databases; it should work with a local database. HTH, ML > -----Original Message----- > From: Koen van der Drift [mailto:kvddrift at earthlink.net] > Sent: Tuesday, April 29, 2003 4:39 PM > To: Gary Williams, Tel 01223 494522 > Cc: emboss at embnet.org > Subject: Re: database info > > > At 15:29 +0100 4/29/03, Gary Williams, Tel 01223 494522 wrote: > > >infoseq > > > This only gives the info of one sequence. What I was looking for is a > program that lists all sequences/entries in a database, so I don't > have to type in sw:foo, sw:bar until I get a valid entry. > > thanks, > > - Koen. > From d.m.a.martin at dundee.ac.uk Tue Apr 29 14:59:43 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Tue, 29 Apr 2003 15:59:43 +0100 Subject: database info In-Reply-To: Message-ID: >> -----Original Message----- >> From: Koen van der Drift [mailto:kvddrift at earthlink.net] >> Sent: Tuesday, April 29, 2003 4:39 PM >> To: Gary Williams, Tel 01223 494522 >> Cc: emboss at embnet.org >> Subject: Re: database info >> >> >> At 15:29 +0100 4/29/03, Gary Williams, Tel 01223 494522 wrote: >> >>> infoseq >> >> >> This only gives the info of one sequence. What I was looking for is a >> program that lists all sequences/entries in a database, so I don't >> have to type in sw:foo, sw:bar until I get a valid entry. infoseq sw:\* -stdout -auto |more will give you a very long list of sequences. I don't know why you are only getting reports for one sequence. 
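For scripted use, the infoseq call above could be wrapped along the following lines. This is only a minimal sketch: the function names infoseq_command and list_entries are hypothetical, the flags are the ones quoted in this thread, and list_entries assumes that EMBOSS is installed and that infoseq is on the PATH.

```python
import subprocess


def infoseq_command(database, pattern="*"):
    """Build the argument list for listing the entries of one database.

    No shell is involved, so the wildcard needs no backslash escaping.
    """
    return ["infoseq", f"{database}:{pattern}", "-stdout", "-auto"]


def list_entries(database):
    """Run infoseq and return its report lines (requires EMBOSS on PATH)."""
    result = subprocess.run(
        infoseq_command(database),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Show the command that would be run for the 'sw' database.
print(" ".join(infoseq_command("sw")))
```

Calling list_entries("tsw") on a machine with the test databases configured would then return the report one line per list element.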
..d -- David Martin PhD Bioinformatics Scientific Officer Post-Genomics and Molecular Interactions Centre University of Dundee From kvddrift at earthlink.net Tue Apr 29 15:04:18 2003 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 29 Apr 2003 11:04:18 -0400 Subject: database info In-Reply-To: References: Message-ID: At 15:59 +0100 4/29/03, David Martin wrote: >infoseq sw:\* -stdout -auto |more will give you a very long list of >sequences. > >I don't know why you are only getting reports for one sequence. > Thanks everyone for the response. I just needed a valid entry name/ID that I can use to test out some EMBOSS programs. Marc, your suggestion indeed only works with local databases, e.g.: seqret tsw -firstonly thanks again, - Koen. From matamban at psc.edu Tue Apr 29 15:11:28 2003 From: matamban at psc.edu (Tendai Matambanadzo) Date: Tue, 29 Apr 2003 11:11:28 -0400 (EDT) Subject: discussion groups Message-ID: I was wondering if there is a forum board which will help me install the EMBOSS application program. I seem to be having several problems in the installation. Thank you for your help Tendai From hchen at genetics.ac.cn Tue Apr 29 15:17:35 2003 From: hchen at genetics.ac.cn (Chen Hao) Date: 29 Apr 2003 23:17:35 +0800 Subject: is there any new tutorial for the last version EMBOSS ? In-Reply-To: <3EAE8E70.238EABE1@hgmp.mrc.ac.uk> References: <1051626742.3466.49.camel@localhost> <3EAE8E70.238EABE1@hgmp.mrc.ac.uk> Message-ID: <1051629459.3466.59.camel@localhost> Hi Dr J.C. Ison, Thank you very much! The online course is very helpful for me. sincerely, Motata On 2003-04-29 at 22:38, Dr J.C. Ison wrote: > Hi Motata > > An EMBOSS programming course is available on-line: > http://www.hgmp.mrc.ac.uk/CCP11/CCP11courses/EMBOSS-Course/emboss_index.html > > and that covers many of the basic concepts in using emboss too. 
> > If you're interested in more user-level documentation, what we have is > available here: > http://www.hgmp.mrc.ac.uk/Software/EMBOSS/userdoc.html > > The tutorial itself is a bit out of date, but the other documents are > current. > > Cheers > > J. > > Chen Hao wrote: > > > Hi all, > > > > I'd like to know if there is any new tutorial for last version EMBOSS . > > if you know , point me out ,or if you have a ps / pdf format file ,do me > > the favor to e-mail my address (hchen at genetics.ac.cn) > > > > Thank you very much! > > > > Motata > > -- > Jon C. Ison, PhD > Bioinformatics Applications Group > UK MRC Human Genome Mapping Project Resource Centre > Hinxton, Cambridge, CB10 1SB, UK > E-mail : jison at hgmp.mrc.ac.uk > Tel : 01223 49-4548 > HGMP-RC: http://www.hgmp.mrc.ac.uk/ > EMBOSS : http://www.hgmp.mrc.ac.uk/Software/EMBOSS/ > > > From jison at hgmp.mrc.ac.uk Tue Apr 29 15:20:41 2003 From: jison at hgmp.mrc.ac.uk (Dr J.C. Ison) Date: Tue, 29 Apr 2003 16:20:41 +0100 Subject: discussion groups References: Message-ID: <3EAE9849.8DE27F76@hgmp.mrc.ac.uk> Tendai - There is no forum other than these lists, but you should have a look here ... http://www.hgmp.mrc.ac.uk/Software/EMBOSS/admin.html In particular, David Martin's EMBOSS administration guide. Cheers J. Tendai Matambanadzo wrote: > I was wondering if there is a forum board which will help me install the > EMBOSS application program.I seem to be having several problems in the > installation.Thank you for you help > > Tendai -- Jon C. 
Ison, PhD Bioinformatics Applications Group UK MRC Human Genome Mapping Project Resource Centre Hinxton, Cambridge, CB10 1SB, UK E-mail : jison at hgmp.mrc.ac.uk Tel : 01223 49-4548 HGMP-RC: http://www.hgmp.mrc.ac.uk/ EMBOSS : http://www.hgmp.mrc.ac.uk/Software/EMBOSS/ From d.m.a.martin at dundee.ac.uk Tue Apr 29 15:23:22 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Tue, 29 Apr 2003 16:23:22 +0100 Subject: discussion groups In-Reply-To: <3EAE9849.8DE27F76@hgmp.mrc.ac.uk> Message-ID: On 29/4/03 4:20 pm, "Dr J.C. Ison" wrote: > Tendai - > > There is no forum other than these lists, but you should have > a look here ... > http://www.hgmp.mrc.ac.uk/Software/EMBOSS/admin.html > > In particular, David Martin's EMBOSS administration guide. This is somewhat out of date but most of the information is still valid. ..d > > Cheers > > J. > > > Tendai Matambanadzo wrote: > >> I was wondering if there is a forum board which will help me install the >> EMBOSS application program.I seem to be having several problems in the >> installation.Thank you for you help >> >> Tendai > > -- > Jon C. 
Ison, PhD > Bioinformatics Applications Group > UK MRC Human Genome Mapping Project Resource Centre > Hinxton, Cambridge, CB10 1SB, UK > E-mail : jison at hgmp.mrc.ac.uk > Tel : 01223 49-4548 > HGMP-RC: http://www.hgmp.mrc.ac.uk/ > EMBOSS : http://www.hgmp.mrc.ac.uk/Software/EMBOSS/ > > > -- David Martin PhD Bioinformatics Scientific Officer Post-Genomics and Molecular Interactions Centre University of Dundee From pmr at ebi.ac.uk Tue Apr 29 15:47:34 2003 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 29 Apr 2003 16:47:34 +0100 Subject: discussion groups References: Message-ID: <3EAE9E96.6000605@ebi.ac.uk> Tendai Matambanadzo wrote: > I was wondering if there is a forum board which will help me install the > EMBOSS application program.I seem to be having several problems in the > installation.Thank you for you help For installation problems, mail emboss-bug at embnet.org The Admin Guide mentioned in other replies has been updated. There is a more recent doc/manuals/admin.tex version (August 2002) in the EMBOSS distribution. Can someone please update the HTML version at http://www.hgmp.mrc.ac.uk/Software/EMBOSS/Doc/Admin_guide/adminguide/ Hope this helps, Peter Rice From Marc.Logghe at devgen.com Tue Apr 29 20:48:39 2003 From: Marc.Logghe at devgen.com (Marc Logghe) Date: Tue, 29 Apr 2003 22:48:39 +0200 Subject: how to get the native format of DB Message-ID: Hi all, Another question. Is there a way to find out what the native sequence format of a database is ? Something like 'showdb -only -format tsw' would be useful for instance. I was just thinking about the possibility of using emboss as a generic tool to fetch sequences in Bioperl. But, in order to create the sequence objects in Bioperl, you need to know the format you are dealing with. Therefore I was looking for a way to retrieve that information, without writing 'yet another config parser'. Any suggestions ? Marc *********************************************************** Marc Logghe, Ph.D. 
Senior Scientist Scientific Computing Group deVGen Technologiepark 9 9052 Zwijnaarde Belgium tel: +32 (0) 9 324 24 88 fax: +32 (0) 9 324 24 25 *********************************************************** From arunanirudhan at yahoo.co.in Wed Apr 30 10:20:41 2003 From: arunanirudhan at yahoo.co.in (arun anirudhan) Date: Wed, 30 Apr 2003 11:20:41 +0100 (BST) Subject: blast or fasta with emboss Message-ID: <20030430102041.76056.qmail@web8207.mail.in.yahoo.com> Hello I have downloaded all the databases locally to my server. Is there any emboss program by which we can do a blast or fasta search to the databases installed in my server with a sequence in a text file. With regards Arun From d.m.a.martin at dundee.ac.uk Wed Apr 30 10:24:34 2003 From: d.m.a.martin at dundee.ac.uk (David Martin) Date: Wed, 30 Apr 2003 11:24:34 +0100 Subject: blast or fasta with emboss In-Reply-To: <20030430102041.76056.qmail@web8207.mail.in.yahoo.com> Message-ID: On 30/4/03 11:20 am, "arun anirudhan" wrote: > Hello > I have downloaded all the databases locally to my > server. Is there any emboss program by which we can do > a blast or fasta search to the databases installed in > my server with a sequence in a text file. You will have to install the BLAST or FASTA software separately. EMBOSS can help in providing the right formats, but doesn't incorporate these packages directly. ..d > With regards > Arun 
-- David Martin PhD Bioinformatics Scientific Officer Post-Genomics and Molecular Interactions Centre University of Dundee From gbottu at ben.vub.ac.be Wed Apr 30 10:49:01 2003 From: gbottu at ben.vub.ac.be (Guy Bottu) Date: Wed, 30 Apr 2003 12:49:01 +0200 (CEST) Subject: blast or fasta with emboss Message-ID: <200304301049.h3UAn18J1259242@black.vub.ac.be> from : BEN Dear Arun, At the BEN site we have installed the BLAST and fastA packages and then written EMBOSS "wrapper" programs to run the programs from EMBOSS (just like CLUSTAL and Primer3 for which there are "wrappers" within the EMBOSS distribution). Because I was in a hurry to offer BLAST/fastA as fast as possible to our users, I did not make the effort to make them readily portable (they contain absolute links, ...). I had sent a copy of our programs to Martin Sarachu from the Argentinian EMBnet node and Jose Valverde from the Spanish EMBnet Node. They were going to do the porting. I have however not yet heard how they fared. Guy Bottu From areagp61 at yahoo.it Wed Apr 30 11:19:21 2003 From: areagp61 at yahoo.it (Graziano P.) Date: Wed, 30 Apr 2003 13:19:21 +0200 Subject: ednadist warning Message-ID: <002a01c30f0a$5c37db80$18105709@italy.ibm.com> Hi all, I am analyzing EMBASSY programs. I have to construct a phylogenetic tree starting from a multiple alignment of DNA sequences. I have performed a bootstrap simulation by means of the "eseqboot" program. The output consists of 100 resamples of the initial multialignment. At this point I want to calculate distance matrices on each bootstrap simulation using the "ednadist" program, then I have to use "eneighbor" and finally "econsense" to calculate the consensus tree. Nonetheless, when I try to use the output of eseqboot as input for ednadist (as suggested in the seqboot documentation "... 
you would run SEQBOOT, then run DNADIST using the output of SEQBOOT as its input, then run NEIGHBOR using the output of DNADIST as its input, and then run CONSENSE using the tree file from NEIGHBOR as its input"), it returns a warning: > ednadist eseqboot.outfile Nucleic acid sequence Distance Matrix program Warning: seqReadPhylip 14 sequences partly read at end Output file [ednadist.outfile]: Distance methods Kimura : Kimura 2-parameter distance JinNei : Jin and Nei distance ML : Maximum Likelihood distance Jukes : Jukes-Cantor distance Choose the method to use [Kimura]: Transition/transversion ratio [2.0]: Form of distance matrix S : Square L : Lower-triangular Form [S]: Kimura What does this warning mean? The output file contains only one distance matrix. My goal is to obtain a file containing 100 distance matrices, one for each bootstrap resample; is it possible? Could this be a limitation of the ednadist algorithm? Regards Graziano From fernan at iib.unsam.edu.ar Wed Apr 30 18:19:45 2003 From: fernan at iib.unsam.edu.ar (Fernan Aguero) Date: Wed, 30 Apr 2003 15:19:45 -0300 Subject: Preferred isoschizomer ? In-Reply-To: <200304141817.h3EIHs410930@bromine.hgmp.mrc.ac.uk> References: <200304141817.h3EIHs410930@bromine.hgmp.mrc.ac.uk> Message-ID: <20030430181945.GD3138@iib.unsam.edu.ar> Sorry to insist on this point. I am now also suffering from this behaviour ... +----[ ableasby at hgmp.mrc.ac.uk (14.Apr.2003 15:28): | | 1) If your colleague had explicitly said -enzymes psti | on the command line (or equivalent GUI) then it would | be found. The output would be overly verbose if all | isoschizomers are reported so as a compromise it reports | only one. Right, if you ask for PstI, you would get PstI and not any other isoschizomer. And I also agree with the compromise of reporting only one isoschizomer. The problem is deciding which one to report. 

| 2) If you take the emboss files from the REBASE (NEB) distro
| then, after renaming and putting them in data/REBASE, it will
| probably report PstI (haven't tried it).

I don't know exactly what you mean by 'take the emboss files from the REBASE distro'. If you mean getting the withrefm file, as explained in the EMBOSS admin tutorial, that's what I did, and in every case I tried I get BspMAI instead of PstI (this is only one particular case; I expect the same to happen for many other enzymes as well):

restrict                             -> BspMAI
restrict -commercial t               -> BspMAI
restrict -preferred t                -> BspMAI
restrict -commercial t -preferred t  -> BspMAI

I've been looking at the withrefm file and, according to the description of the format provided within the file itself, there is no provision for 'preferred' isoschizomers. At least not explicitly declared. The fields for each entry are: name, isoschizomers, sequence, methylation site, organism, source, commercial provider, references.

So, if you look for 'CTGCAG' (the sequence recognized by PstI), you will see that it occurs several times. The file is sorted in alphabetical order by enzyme name. However, the list of isoschizomers does not seem to be in strict alphabetical order, and it seems that the ordering is trying to suggest a 'preferred' isoschizomer. Going through the PstI isoschizomers in alphabetical order, in all the cases I looked at (until I got tired) PstI is always the first in the list of isoschizomers.

So, why is restrict not using it? And perhaps a more difficult question to answer, as pointed out before by Guy Bottu: why is restrict preferring BspMAI over the rest of the isoschizomers?

| I arranged with NEB
| that they would provide only the 'common' REs in their files.
| I believe this is what some other packages do. Using REBASEEXTRACT
| on the withrefm file gives all the REs.

So, the answer would be that rebaseextract does nothing to mark/tag/select a preferred isoschizomer and instead relies on the withrefm file to contain only 'preferred' isoschizomers? As far as I can see, the withrefm file contains all the isoschizomers for each recognition sequence.

Taken from the REBASE README:

... ... ...
#31. All Enzymes (each w/ ref & isos)  withrefm.###
... ... ...
#37. EMBOSS                            emboss_e.###, emboss_r.### emboss_s.###
... ... ...

I also checked all the emboss* files (apparently, REBASE already provides the same files that rebaseextract produces?) and they also contain all the isoschizomers, not a reduced subset.

However, if this is the case, what's the use of a '-preferred t/f' option for restrict? There would only be 'preferred' files in the restriction enzyme database accessed by EMBOSS ...

| 3) You can equate any reported RE to another by adding an entry
| into embossre.equ e.g.
| BspMAI PstI

And I have to do this myself for all enzymes when -- apparently -- it is all already in the withrefm file?

|
| HTH

I hope this helps to find a solution. In the meantime, a workaround would be to keep at hand a file with a list of all commonly used, commercially available enzymes, and use it like this:

restrict -enzymes @enz.list

Such a list of enzymes may be the one containing enzyme prototypes (they are called proto.* at the REBASE site; proto.304 is the current one). I've modified it and used it as a list successfully.

A comparison of what happens when one uses withrefm or the proto list does not lead to a rapid conclusion. Using withrefm sometimes gives you a prototype enzyme, even if there are other isoschizomers, and even if they appear first (in alphabetical order). I wasn't able to understand what guides restrict in choosing from the list of available enzymes. Looking at the source code was my next step, but I'm still not knowledgeable enough in C ...
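
The embossre.equ entries from point 3 would not necessarily have to be typed in by hand. What follows is only a minimal Python sketch under stated assumptions: it assumes the withrefm records carry numbered tags (<1> name, <2> isoschizomers, <3> recognition sequence, and so on) in the field order listed above, and it assumes you supply your own set of preferred enzyme names, for example taken from the proto list. The function names and the abbreviated sample records are made up for illustration; they are not part of EMBOSS or REBASE.

```python
import re

# Assumed record layout: each field sits on a line starting with <N>.
FIELD = re.compile(r"^<(\d)>(.*)$")


def parse_records(text):
    """Split withrefm-style text into one dict per enzyme, keyed by field number."""
    records, current = [], {}
    for raw in text.splitlines():
        match = FIELD.match(raw.strip())
        if match is None:
            continue  # header text, blank lines, continuation lines
        number, value = int(match.group(1)), match.group(2).strip()
        if number == 1 and current:
            # A new <1> tag starts the next enzyme record.
            records.append(current)
            current = {}
        current[number] = value
    if current:
        records.append(current)
    return records


def equivalence_lines(records, preferred):
    """Return 'Other Preferred' lines for enzymes sharing a site with a preferred one.

    The recognition sequence (field 3) is compared exactly as written, so
    enzymes whose field differs in any way are not equated.
    """
    site_to_preferred = {}
    for rec in records:
        name, site = rec.get(1), rec.get(3)
        if name in preferred and site and site not in site_to_preferred:
            site_to_preferred[site] = name
    lines = []
    for rec in records:
        name, site = rec.get(1), rec.get(3)
        target = site_to_preferred.get(site)
        if name and target and name != target:
            lines.append(f"{name} {target}")
    return lines


# Abbreviated, hand-written sample; not copied from a REBASE release.
SAMPLE = """\
<1>BspMAI
<2>PstI
<3>CTGCAG
<1>EcoRI
<2>
<3>GAATTC
<1>PstI
<2>BspMAI
<3>CTGCAG
"""

if __name__ == "__main__":
    for line in equivalence_lines(parse_records(SAMPLE), {"PstI", "EcoRI"}):
        print(line)
```

The printed lines have the same "BspMAI PstI" shape as the embossre.equ entry quoted above. Whether restrict then reports the enzyme you expect would still need checking against your own installation.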

Regards,

Fernan

|
| Alan
|
|
+----]

--
F e r n a n   A g u e r o
http://genoma.unsam.edu.ar/~fernan

From ableasby at hgmp.mrc.ac.uk Wed Apr 30 18:29:58 2003
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Wed, 30 Apr 2003 19:29:58 +0100 (BST)
Subject: Preferred isoschizomer ?
Message-ID: <200304301829.h3UITwG29534@sulphur.hgmp.mrc.ac.uk>

There are replacement files for rebaseextract.c and rebaseextract.acd in the ftp://ftp.uk.embnet.org/pub/EMBOSS/patchfiles/ directory. By default this program will now produce an embossre.equ file. Re-extract the withrefm file using the new program. If you then use the -preferred option to 'restrict' it should behave as you wish.

HTH

Alan Bleasby
HGMP

From kellert at ohsu.edu Wed Apr 30 21:45:26 2003
From: kellert at ohsu.edu (Thomas Keller)
Date: Wed, 30 Apr 2003 14:45:26 -0700
Subject: database problem
Message-ID: <0DC0C708-7B55-11D7-8373-0003930405E2@ohsu.edu>

Greetings,

I just ran dbigcg on the nsf directory mounted on my machine, which contains the GCG GenBank database. I had to create the index in a different directory, because the data is read-only. It took about 2.5 hours to finish indexing, and gave a bunch of warnings about ajStrFixI being called with length 2048 for a string with size 2048. It also warned about one accession number that it expected but couldn't find. However, it made the correct files and they are of substantial size. I seem to have a problem, though:

**************
kellert% seqret
Reads and writes (returns) sequences
Input sequence(s): mygenbank:L42450
Segmentation fault
**************

This sounds like a RAM issue to me, but I have 768 MB on this machine, which seems adequate to me.

Here's the DB definition in ~/.embossrc:

**************
DB mygenbank [
   type: N
   method: gcg
   format: GenBank
   fields: "acnum seqvn des keywords taxon"
   dir: $emboss_db_dir/gcggenbank
   indexdir: $emboss_indices/gcggenbank
   file: "*.seq"
   release: "133.0"
   comment: "GCG genbank db from dna2 mounted locally: /Volumes/dna2.ohsu.edu"
]
***************

Any suggestions?

Thanks,
Tom K.

Thomas J. Keller, Ph.D.
Director, MMI Core Facility
Oregon Health & Science University
3181 SW Sam Jackson Park Rd.
Portland, OR, USA, 97239
http://www.ohsu.edu/core