From abrown at nimr.mrc.ac.uk Thu Aug 1 07:03:42 2002
From: abrown at nimr.mrc.ac.uk (Alex Brown)
Date: Thu, 1 Aug 2002 12:03:42 +0100
Subject: EMBOSS/lib directory
Message-ID: <579E23D2-A53E-11D6-988E-0003938768AC@nimr.mrc.ac.uk>

Hi.

Another small problem, this time with 'cirdna'.

I am running EMBOSS 2.4.1 on a Mac PowerBook G4, under Darwin and XDarwin 1.1 (XFree86 4.2.0).

After installing EMBOSS, I noticed that the directory /EMBOSS/lib did not exist. This was odd, as the post-compilation setup instructions in the EMBOSS Administrators Guide (1.13.2) required the following line to be added to .cshrc:

setenv PLPLOT_LIB /blah/blah/EMBOSS/lib

With this setting, cirdna gave the following error:

Cannot open library file: plstnd5.fnt
Please set PLPLOT_LIB to the plplot/lib directory under emboss

*** PLPLOT ERROR ***
Unable to open font file
Program aborted

Setting PLPLOT_LIB to '/blah/blah/EMBOSS' did work, as the files plstnd5.fnt and plxtnd5.fnt were in this directory. However, the program 'hangs' after producing the 'cirdna' window with the graphic. I have to 'kill' the 'cirdna' window by alt-clicking it for the cirdna program to complete.

Could someone explain this behavior (or is this normal), and say what files are supposed to be in the /EMBOSS/lib directory?

Sorry to be a nuisance.

Many Thanks

Alex Brown.

From ame at esbs.u-strasbg.fr Thu Aug 1 10:31:24 2002
From: ame at esbs.u-strasbg.fr (Jean-Christophe Ame)
Date: Thu, 1 Aug 2002 16:31:24 +0200
Subject: Request
Message-ID: <5B72626A-A55B-11D6-B14F-0005024329A7@esbs.u-strasbg.fr>

Hi,

I am looking for a program that would allow to do a multiple alignment of multiple ESTs sequences against a genomic sequence almost like est2genome does but with multiple ESTs. Does it exist ??

Thanks a lot.

Sincerely,
Jean-Christophe

________________________
Jean-Christophe Amé, PhD
U.P.R.
9003 du CNRS - Cancérogénèse et Mutagénèse Moléculaire et Structurale
École Supérieure de Biotechnologie de Strasbourg
Pôle API
Boulevard Sébastien-Brant
67400 Illkirch
France

tel.: 33 3 90 24 47 05
Fax.: 33 3 90 24 46 86

http://parplink.u-strasbg.fr
http://www-esbs.u-strasbg.fr/centrerech/upr9003/upr9003.html

From elemento at club-internet.fr Thu Aug 1 16:39:34 2002
From: elemento at club-internet.fr (Elemento Olivier)
Date: Thu, 01 Aug 2002 16:39:34 -0400
Subject: Request
References: <5B72626A-A55B-11D6-B14F-0005024329A7@esbs.u-strasbg.fr>
Message-ID: <3D499C86.5090809@club-internet.fr>

I am not sure but I think the StackPack software package (which is free for academic users) allows you to do that.

Olivier.

Jean-Christophe Ame wrote:

> Hi,
>
> I am looking for a program that would allow to do a multiple alignment
> of multiple ESTs sequences against a genomic sequence almost like
> est2genome does but with multiple ESTs. Does it exist ??
> Thanks a lot.
>
> Sincerely,
> Jean-Christophe
>
> ________________________
> Jean-Christophe Amé, PhD
> U.P.R. 9003 du CNRS - Cancérogénèse et Mutagénèse Moléculaire et
> Structurale
> École Supérieure de Biotechnologie de Strasbourg
> Pôle API
> Boulevard Sébastien-Brant
> 67400 Illkirch
> France
>
> tel.: 33 3 90 24 47 05
> Fax.: 33 3 90 24 46 86
>
> http://parplink.u-strasbg.fr
> http://www-esbs.u-strasbg.fr/centrerech/upr9003/upr9003.html

From peter.rice at uk.lionbioscience.com Thu Aug 1 11:18:12 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Thu, 01 Aug 2002 16:18:12 +0100
Subject: Request (est2genome)
References: <5B72626A-A55B-11D6-B14F-0005024329A7@esbs.u-strasbg.fr>
Message-ID: <3D495134.FA79A151@uk.lionbioscience.com>

Jean-Christophe Amé
writes:
> I am looking for a program that would allow to do a multiple alignment
> of multiple ESTs sequences against a genomic sequence almost like
> est2genome does but with multiple ESTs. Does it exist ??

est2genome accepts multiple ESTs as input !!!

But it reports one set of exons and alignment for each EST.

We are looking into alignment and report format output for est2genome, so one obvious question:

What format alignment would you like from est2genome (we could build one alignment of all ESTs against the genomic sequence, or separate alignments, and allow a choice of alignment format).

(also, what format report would be most useful?)

.... we will have to keep the original format output for users who are dependent on parsing it.

regards,

Peter Rice

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From michael.hanlon at bbsrc.ac.uk Wed Aug 7 11:55:32 2002
From: michael.hanlon at bbsrc.ac.uk (michael hanlon (BITS))
Date: Wed, 7 Aug 2002 16:55:32 +0100
Subject: problem with seqret and protein sequences.
Message-ID: <41773CEF2B8FD411920200508BDCDC12BE146D@bits-exch1.bits.bbsrc.ac.uk>

Due to changes in format at Sanger, we have edited our emboss.default to use the SRS service at EBI instead.
Here's the relevant part:

DB swall [ type: P method: url format: embl
 url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[SWall-id:%s]"
 comment: "EBI Swissprot IDs and Trembl IDs" ]

DB swalla [ type: P method: url format: embl
 url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[SWall-acc:%s]"
 comment: "EBI Swissprot ACs and Trembl ACs" ]

DB embl [ type: N method: url format: embl
 url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[EMBLRELEASE-ID:%s]"
 comment: "EBI EMBL ACs" ]

DB embla [ type: N method: url format: embl
 url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[embl-acc:%s]"
 comment: "EBI EMBL IDs" ]

The embl bits work OK, but not the swall; I just get 'unable to read sequences'.

Any help much appreciated.

Mike

From ableasby at hgmp.mrc.ac.uk Thu Aug 8 04:43:46 2002
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Thu, 8 Aug 2002 09:43:46 +0100 (BST)
Subject: EMBOSS 2.5.0 available
Message-ID: <200208080843.JAA29945@bromine.hgmp.mrc.ac.uk>

EMBOSS 2.5.0 is now available for download.

As well as new programs (e.g. mwcontam, aaindexextract), this release can handle chunked HTTP and sequences as values in EMBL rpt-unit entries, and it brings improved database indexing, extensions to several programs and documentation corrections (documentation for the protein structure programs is being produced). See the ChangeLog file for full details.

Jemboss beta 2.7 contains new features including improved file management and file managers with pop-up menus, browsable EMBOSS help, report formats and multiple selection and deletion of results. It also contains a number of fixes and methods of speeding up the response of the interface.

The Jemboss server includes code which can be used to send the Jemboss batch jobs to a network queueing system. There are also extra jemboss.properties parameters to define the environment variables and the URL for the EMBOSS applications help.
Alan

From peter.rice at uk.lionbioscience.com Thu Aug 8 04:21:33 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Thu, 08 Aug 2002 09:21:33 +0100
Subject: problem with seqret and protein sequences.
References: <41773CEF2B8FD411920200508BDCDC12BE146D@bits-exch1.bits.bbsrc.ac.uk>
Message-ID: <3D522A0D.E71A0706@uk.lionbioscience.com>

"michael hanlon (BITS)" wrote:
>
> Due to changes in format at Sanger, we have edited our emboss.default to use the SRS service at EBI instead. Here's the relevant part:
>
> DB swall [ type: P method: url format: embl
> url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[SWall-id:%s]"
> comment: "EBI Swissprot IDs and Trembl IDs" ]
>
> DB embl [ type: N method: url format: embl
> url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[EMBLRELEASE-ID:%s]"
> comment: "EBI EMBL ACs" ]
>
> The embl bits work OK, but not the swall, I just get 'unable to read sequences'

You are lucky with EMBL :-)

For SwissProt, you should add +-ascii to get the URL to work. It depends on the default views defined for the EBI's SRS server.

But, with EMBOSS 2.5.0 (should be fine with 2.4.1 too, though there was a problem with chunked HTTP output) you can use method: srswww and define the SRS database name in dbalias.

With this definition you can use swall-id:amir_pseae and swall-acc:P10932.

If you add fields: "sv des key org" you can search those too in the USA, though beware of huge result sets because EMBOSS will keep them in memory. A small search would be embl-org:aardvark (27 entries, at least until an aardvark EST project starts :-)).

Without the fields definition you are safer, because users are less likely to generate a huge request to an SRS server with an ID or ACC query.

Method: srswww builds the full SRS URL for you (with +-e+-ascii), and converts the USA queries into SRS queries. It was added in EMBOSS 2.4.0.
For example:

DB swall [ type: P method: srswww format: embl
 url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz"
 comment: "EBI Swissprot and Trembl" ]

DB embl [ type: N method: srswww format: embl
 url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz"
 dbalias: emblrelease
 comment: "EBI EMBL" ]

regards,

Peter Rice

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From sylvain.foisy at bioneq.qc.ca Thu Aug 8 09:33:59 2002
From: sylvain.foisy at bioneq.qc.ca (Sylvain Foisy)
Date: Thu, 8 Aug 2002 09:33:59 -0400
Subject: EMBOSS 2.5.0 on Mac OS X - Problems with Java config
Message-ID: <7EC90FCA-AAD3-11D6-B428-0003936297DA@bioneq.qc.ca>

Hi,

Just downloaded 2.5.0 and tried to configure/compile on my Mac with Java support for Jemboss. I can't get configure to use the Java parameters that are written in the Jemboss Server set-up instructions. I get the following:

Java directory /System/Library/Frameworks/JavaVM.frameworks/Versions/1.3.1/Home does not exist

Well, it does! Can anybody solve this one?

Thanks in advance

Sylvain

++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Sylvain Foisy, Ph. D.
Directeur-Operations / Project Manager
BioNEQ - Le Reseau quebecois de bioinformatique
Genome-Quebec
Tel.: (514) 878-9911
E-mail: sylvain.foisy at bioneq.qc.ca
++++++++++++++++++++++++++++++++++++++++++++++++++++++++

From richard at seqbio.com Thu Aug 8 16:18:47 2002
From: richard at seqbio.com (Richard Cote)
Date: Thu, 08 Aug 2002 16:18:47 -0400
Subject: Discouraging jemboss perfornance
Message-ID: <3D52D227.1020800@seqbio.com>

Hello all.

We have installed emboss/jemboss locally. Jemboss is running on a linux box, with a standard installation (apache/tomcat/soap).

We have noticed a *huge* performance gap between running some applications (eg supermatcher, megamerger) through a shell command-line vs. using jemboss (jemboss being significantly slower).
How could this situation be improved?

Thank you for any help you can provide,

R. Cote

--
==========================================================
Richard Cote, M.Sc.           richard at seqbio.com
Senior Bioinformatician       http://www.seqbio.com
Sequence Bioinformatics       1-877-SEQUENCE
1410 Stanley St. Suite 704    (tel) 514-842-5356
Montreal, QC H3A 1P8          (fax) 514-842-7230
==========================================================

From rglcote at yahoo.ca Thu Aug 8 16:22:56 2002
From: rglcote at yahoo.ca (richard cote)
Date: Thu, 8 Aug 2002 16:22:56 -0400 (EDT)
Subject: Jemboss performance
In-Reply-To: <200208080843.JAA29945@bromine.hgmp.mrc.ac.uk>
Message-ID: <20020808202256.48923.qmail@web21304.mail.yahoo.com>

Hello all.

We have installed emboss/jemboss locally. Jemboss is running on a linux box, with a standard installation (apache/tomcat/soap).

We have noticed a *huge* performance gap between running some applications (eg supermatcher, megamerger) through a shell command-line vs. using jemboss (jemboss being significantly slower).

How could this situation be improved?

Thank you for any help you can provide,

R. Cote

From uma at avesthagen.com Fri Aug 16 02:37:45 2002
From: uma at avesthagen.com (Uma Maheswari)
Date: Fri, 16 Aug 2002 12:07:45 +0530 (IST)
Subject: emowse!
Message-ID: <39285.192.168.1.5.1029479865.squirrel@mail.avesthagen.com>

hai!

I am using the emowse program in EMBOSS to search a protein database for matches with mass spectrometry data. When the database is just a few proteins (100 seq) in fasta format it works fine, but when I do the same over the nr database I get the error message

"An error has been found: Sequence is not a protein"

I downloaded the nr database in fasta format (all the seq. in a single file), then used csplit to split it into individual files with a common prefix, and ran the prog. as follows.

Input sequence(s): pro*
Input file: b8
Whole sequence molwt [0]:
Output file [baa92411.emowse]:
An error has been found: Sequence is not a protein

(note: 'pro' is the common prefix for all the protein database sequences)

What can I do to avoid this error? Is there any other way to create the database for the same?

Thanks in Adv.

Uma Maheswari

--------------------------------------------
Avestha Gengraine Technologies Pvt. Ltd.
Discoverer 9th Floor, Unit 3, International Tech Park,
Bangalore-560 066, India.
Tel:91-80-8411665/8412308; Fax:91-80-8418780
http://www.avesthagen.com

From uma at avesthagen.com Fri Aug 16 05:00:35 2002
From: uma at avesthagen.com (Uma Maheswari)
Date: Fri, 16 Aug 2002 14:30:35 +0530 (IST)
Subject: Re: Antwort: emowse!
In-Reply-To:
References:
Message-ID: <46025.192.168.1.5.1029488435.squirrel@mail.avesthagen.com>

hai David!

I think you are right! I just downloaded the swiss prot and used that as database for emowse ....and it works! :)

Uma Maheswari.

>
> Hi,
>
> I had sometime problems with other emboss programms (database indexers)
> and the nr database.
> When the database is made non-redundant, an entry which is more than
> one time in the original databases gets all the headers of the original
> sequences fused to one header line.
> The original fasta headers are then joined with a Ctrl-A.
> This binary character within the fasta header line may confuse emowse.
>
> It's just an idea. I may be completely wrong :-).
>
> David.

--
"Experience is not what happens to a man; it is what a man does with what happens to him." - Aldous Huxley
--------------------------------------------
Avestha Gengraine Technologies Pvt. Ltd.
Discoverer 9th Floor, Unit 3, International Tech Park,
Bangalore-560 066, India.
Tel:91-80-8411665/8412308; Fax:91-80-8418780
http://www.avesthagen.com

From jaen at novonordisk.com Wed Aug 21 03:34:30 2002
From: jaen at novonordisk.com (JAEN (Jacob Engelbrecht))
Date: Wed, 21 Aug 2002 09:34:30 +0200
Subject: compseq: is U an amino acid
Message-ID:

I have been using compseq for protein sequences and wondered why 'U' is reported as an amino acid?

I looked in the code (nucleus/embnmer.c) and found it was specifically accounted for, whereas 'X', which in many databases denotes unknown, is not specifically accounted for.
Would it not make sense to have options which made specific symbols part of the alphabet or left them out:

-leaveout XU or -include BZXU

Jacob Engelbrecht, Phd
Insulin Research
Novo Nordisk
6A1.038 Novo Alle
DK-2880 Bagsvaerd
Denmark
tel: +45 4442 4403
mail: jaen at novonordisk.com

From gwilliam at hgmp.mrc.ac.uk Wed Aug 21 04:18:06 2002
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 21 Aug 2002 09:18:06 +0100
Subject: compseq: is U an amino acid
References:
Message-ID: <3D634CBE.1A758F63@hgmp.mrc.ac.uk>

U codes for the amino acid selenocysteine.

See the IUPAC documentation for one-letter amino-acids:
http://www.chem.qmul.ac.uk/iupac/AminoAcid/A2021.html
and
http://www.chem.qmul.ac.uk/iubmb/newsletter/1999/item3.html

regards,
Gary

> "JAEN (Jacob Engelbrecht)" wrote:
>
> I have been using compseq for protein sequences and wondered why 'U'
> is reported as an amino acid?
> I looked in the code (nucleus/embnmer.c) and found it was specifically
> accounted for, whereas 'X' which in many databases as unknown is not
> specifically accounted for.
>
> Would it not make sense to have options which made specific symbols
> part of the alphabet or left them out:
> -leaveout XU or -include BZXU
>
> Jacob Engelbrecht, Phd
> Insulin Research
> Novo Nordisk
> 6A1.038 Novo Alle
> DK-2880 Bagsvaerd
> Denmark
> tel: +45 4442 4403
> mail: jaen at novonordisk.com

--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From haruna at sgi.com Thu Aug 22 09:40:41 2002
From: haruna at sgi.com (Haruna Cofer)
Date: Thu, 22 Aug 2002 09:40:41 -0400
Subject: EMBOSS 2.5.0 available
References: <200208080843.JAA29945@bromine.hgmp.mrc.ac.uk>
Message-ID: <3D64E9D9.EBFB36FF@sgi.com>

Hello!

This is just a quick note to let SGI/IRIX users know that my porting notes for EMBOSS and Jemboss have been updated for the latest release:

http://www.sgi.com/industries/sciences/chembio/resources/emboss/

Thanks!
-- Haruna :)

--
Haruna N. Cofer
Silicon Graphics Inc.
ChemPharm Applications

From gkaiser at facstaff.wisc.edu Thu Aug 22 18:25:31 2002
From: gkaiser at facstaff.wisc.edu (Gebhard Kaiser)
Date: Thu, 22 Aug 2002 17:25:31 -0500
Subject: No subject
Message-ID: <200208221725.31504.gkaiser@facstaff.wisc.edu>

Hello

1. I have got problems configuring my databases; maybe someone can help me with it.

2. I am very new to bioinformatics. Is there any kind of "how-to" that explains essential things like file formats etc.? I tried to index some gcg files as well, but dbigcg asked for some *.ref files. What are these? What is UFO? Is a coordinate file the same as a *.pdb file?
Thank you Gebhard ------------------------------------------------------------------------------------------ Here is what you may need for 1.: ### this is the warning: schleppi:/ # infoseq Displays some simple information about sequences Input sequence(s): askn:* Warning: Cannot open division file '' for database 'askn' Error: Unable to read sequence 'askn:*' ### the part of my emboss.default: DB askn [ type: N method: emblcd format: fasta dir: $emboss_db_dir/ask file: askn.fasta comment: "ASKxx_atxgxxxxx-sp/usp" ] ### ls of $emboss_db_dir/askn: (dbifasta did not give me any warnings) . .. acnum.hit acnum.trg askn.fasta division.lkp entrynam.idx ### askn.fasta it self: >ASK01_at1g75950-sp ATGTCTGCGAAGAAGATTGTGTTGAAGAGTTCCGATGGTGAATCTTTCGAGGTTGAGGAGGCGGTGGCTCTCGAGTCACAAACCATAGCGCATATGGTTGAAGACGACTGCGTCGACAACGGAGTCCCTCTTCCTAACGTCACGAGCAAGATCCTCGCCAAGGTGATCGAGTATTGCAAGAGGCACGTCGAGGCTGCTGCCTCTAAGGCCGAGGCCGTCGAGGGTGCTGCTACCTCCGATGACGATCTTAAGGCCTGGGACGCTGATTTTATGAAGATCGATCAAGCTACTCTCTTTGAACTCATTCTGGCTGCTAATTACCTGAATATCAAGAACTTGCTTGATCTAACATGTCAGACAGTTGCGGATATGATCAAAGGAAAGACTCCAGAAGAGATCCGCACAACGTTCAACATTAAGAACGACTTCACACCAGAGGAAGAGGAAGAGGTTCGCAGAGAGAACCAATGGGCTTTTGAATGA >ASK02_at5g42190-sp ATGTCGACGGTGAGAAAAATCACTCTTAAGAGTTCGGATGGCGAAAACTTCGAAATTGACGAAGCGGTGGCGCTAGAGTCACAAACCATCAAACATATGATTGAAGATGACTGTACCGATAATGGTATCCCTCTCCCTAATGTCACAAGCAAGATCCTTTCGAAGGTGATTGAGTACTGTAAGAGACATGTCGAAGCTGCTGAGAAATCCGAAACCACGGCCGATGCTGCTGCTGCTACTACTACCACCACCGTCGCGTCGGGTTCTAGTGATGAAGATCTCAAGACTTGGGATTCTGAGTTTATCAAAGTTGATCAGGGCACTCTCTTCGATCTTATCCTGGCTGCTAACTACTTGAATATCAAGGGACTGTTGGACTTGACTTGCCAGACAGTGGCTGATATGATTAAAGGAAAAACCCCAGAAGAAATCCGTAAGACGTTCAATATCAAGAACGACTTCACGCCAGAGGAAGAAGAAGAGGTTCGCCGTGAGAATCAGTGGGCGTTTGAATGA >ASK03_at3g25700-sp 
ATGGCAGAAACGAAGAAGATGATCATCCTCAAGAGCTCCGACGGTGAATCCTTCGAGGTCGAGGAAGCCGTCGCGGTCGAGTCCCAGACGATTAAGCACATGATCGAGGACGACTGCGTCGATAACGGAATTCCACTTCCCAATGTCACCGGAGCCATCCTCGCGAAGGTTATCGAGTACTGCAAGAAACACGTTGAAGCCGCTGCTGAGGCTGGTGGAGACAAGGATTTCTATGGTTCCACCGAGAACCACGAGCTCAAGACTTGGGACAACGATTTCGTCAAAGTTGATCATCCTACTCTCTTCGATCTCCTTCGGGCTGCCAACTATTTGAACATCAGTGGACTTCTTGACCTTACGTGCAAGGCCGTGGCTGATCAGATGAGAGGCAAAACTCCAGCGCAGATGCGTGAACACTTCAACATCAAGAACGACTACACACCTGAGGAAGAGGCCGAGGTTCGCAATGAGAACAGGTGGGCGTTCGAGTGA >ASK04_at1g20140-sp ATGGCAGAAACGAAGAAGATGATCATCCTTAAGAGCTCCGACGGTGAATCCTTCGAGATCGAGGAAGCCGTCGCTGTTAAGTCCCAGACGATTAAGCACATGATTGAGGACGACTGTGCCGATAACGGAATTCCACTTCCCAATGTCACCGGAGCCATCCTCGCCAAGGTTATTGAGTATTGCAAGAAGCACGTTGAAGCCGCTGCTGAAGCTGGTGGAGACAAGGATTTCTATGGTTCCGCTGAGAACGACGAGCTTAAGAATTGGGACAGCGAATTCGTCAAAGTCGATCAGCCTACTCTCTTCGATCTCATCTTGGCTGCGAACTATTTGAACATCGGTGGACTTCTTGACCTTACGTGCAAGGCCGTGGCTGATCAGATGAGAGGCAAAACTCCAGAGCAGATGCGTGCACACTTCAACATCAAGAACGATTACACACCTGAGGAAGAGGCGGAGGTTCGCAATGAGAACAAGTGGGCGTTCGAGTGA >ASK05_at3g60020-sp ATGTCGACGAAGATCATGTTGAAGAGCTCCGATGGTAAATCGTTCGAGATCGACGAAGACGTGGCACGCAAATCAATCGCGATAAACCATATGGTTGAGGACGGCTGCGCCACTGATGTAATACCGCTTCGAAACGTCACAAGCAAGATTCTCAAGATTGTGATCGATTATTGCGAGAAGCACGTCAAGAGCAAAGAAGAAGAAGATCTCAAGGAGTGGGACGCTGATTTCATGAAGACGATCGAAACAACCATTCTCTTTGATGTTATGATGGCTGCGAATTATCTCAATATTCAAAGCCTTCTTGATCTCACATGTAAAACTGTCTCGGATTTGCTCCAGGCTGATTTGCTCTCAGGGAAAACTCCAGATGAGATTCGCGCGCACTTCAACATCGAGAACGATCTAACAGCAGAGGAAGTAGCTAAGATTCGTGAGGAGAATCAATGGGCTTTTCAATGA >ASK06_at3g53060-sp ATGGCAGAAGACGATTGTGCCGATAATGGAATCCCTCTTCCAAACGTGACAAGCAAGATACTCTTATTGGTGATCGAGTATTGCAAGAAGCACGTCGTTGAGAGCAAAGAAGAAGATCTAAAGAAGTGGGACGCTGAATTCATGAAGAAGATGGAACAATCGATTCTCTTTGATGCAAAACTCCAGGCGAGATTCGCTCATACTTCAATATCGAGAACGATTTCACAGCAGAGGGAGAAGCTGAGATCCGCAAGGTGA >ASK07_at3g21840-sp 
ATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCACGCCAATGCCAGATTATCATAGCTCATATGAGTGAAAATGATTGTACCGATAATGGAATCCCTCTTCCAAACGTGACAGGCAAGATTCTTGCGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCAAGAAGTGGGACAAGGAGTTCATGGAAAAAGATACATCCACGATCTTTGATCTCATCAAGGCTGCGAATTACCTAAACATCAAAAGCCTTTTTGATCTAGCATGCCAAACCGTCGCGGAAATCATCAAAGGCAACACTCCTGAGCAGATTCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGA >ASK08_at3g21830-sp ATGTCGACGAAAAAGATCATGTTGAAGAGCTCCGAGGGTAAAACGTTTGAGATTGAAGAAGAGACCGCACGCCAATGCCAGACCATAGCTCATATGATTGAAGCCGAATGTACAGATAACGTAATCCTGGTTTTAAAGATGACAAGCGAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCGAGAAGTGGGACAAGGAGTTCATGGAAAAAGATAAATCCACGATCTTTGCTCTCACCAATGCTGCGAATTTCCTAAACAACAAAAGCCTTCTTCATCTAGCAGGCCAAACCGTCGCGGATATGATCAAAGGCAACACTCCGAAGCAGATGCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGA >ASK09at3g21850-sp ATGTCGACGAAAAAGATCATGTTGAAGAGCTCCGAGGGTAAAACGTTTGAGATTGAAGAAGAGACCGCACGCCAATGCCAGACCATAGCTCATATGATTGAAGCCGAATGTACAGATAACGTAATCCTGGTTTTAAAGATGACAAGCGAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCGAGAAGTGGGACAAGGAGTTCATGGAAAAAGATAAATCCACGATCTTTGCTCTCACCAATGCTGCGAATTTCCTAAACAACAAAAGCCTTCTTCATCTAGCAGGCCAAACCGTCGCGGATATGATCAAAGGCAACACTCCGAAGCAGATGCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGA >ASK10_at3g21860-sp ATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCATGCCAATGCCAGACCATAGCTCATATGAGTGAAGACGATTGTACCGATAATGGAATCCCGCTTCCAGAAGTGACAGGCAAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGAGGATCTCAAGAAGTGGGACAAGGAATTCATGGAAAAATATCAATCCACGATCTTTGATCTCATTATGGCTGCGAATTACCTAAACATCAAAAGCCTTCTTGATCTAGCATGCCAAACCGTCGCGGATATGATCAAAGACAACACTGTGGAGCACACTCGCAAATTCTTCAACATTGAGAATGATTATACACATGAGGAAGAAGAAGCGGTTCGTAGGGAGAATCAATGGGGTTTTGAATGA >ASK11_at4g34210-sp 
ATGTCTTCGAAGATGATCGTGTTGATGAGCTCCGATGGTCAGTCGTTTGAGGTGGAAGAAGCGGTAGCAATCCAATCGCAGACCATAGCGCATATGGTTGAAGACGATTGCGTTGCTGATGGAATCCCTCTTGCAAACGTGGAAAGCAAGATTCTTGTGAAAGTGATCGAGTACTGCAAGAAACACCACGTCGACGAGGCTAATCCTATCTCTGAAGAGGATCTCAACAACTGGGACGAGAAGTTCATGGATCTCGAACAATCCACCATCTTTGAACTCATCCTTGCTGCGAATTACCTCAACATAAAAAGCTTGCTTGATCTCACATGCCAAACTGTTGCGGACATGATCAAAGGCAAGACTCCAGAGGAGATTCGTTCAACTTTCAACATTGAGAATGATTTTACACCTGAGGAAGAAGAAGCTGTTCGTAAGGAGAATCAATGGGCTTTTGAATGA >ASK12_at4g34470-sp ATGTCTTCGAAGATGATCGTGTTGATGAGCTCCGATGGTCAGTCGTTTGAGGTGGAAGAAGCGGTAGCAATCCAATCGCAGACCATAGCGCATATGGTTGAAGACGATTGCGTTGCTGATGGAATCCCTCTTGCAAACGTGGAAAGCAAGATTCTTGTGAAAGTGATCGAGTACTGCAAGAAATACCACGTCGACGAGGCTAATCCTATCTCTGAAGAGGATCTCAACAAGTGGGACGAGAAGTTCATGGATCTCGAACAATCCACCATCTTTGAACTCATCCTTGCTGCGAATTACCTCAACATAAAAAGCTTGTTTGATCTCACATGCCAAACTGTTGCGGACATGATCAAAGGCAAGACTCCAGAGGAGATTCGTTCAACTTTCAACATTGAGAATGATTTTACACCTGAGGAAGAAGAAGCTGTTCGTAAGGAGAATCAATGGGCTTTTGAATGA >ASK13_at3g60010-sp ATGTCGAAGATGGTTATGTTGCTGAGCTCCGATGGTGAATCTTTCCAGGTCGAAGAAGCAGTCGCGGTCCAGTCACAGACGATAGCACATATGATTGAAGACGATTGCGTCGCCAATGGAGTCCCTATCGCAAACGTTACAGGAGTCATCCTCTCGAAGGTGATCGAGTATTGCAAGAAACACGTCGTTTCTGATTCACCAACCGAAGAGAGCAAAGACGAACTCAAGAAGTGGGACGCTGAGTTCATGAAGGCCCTGGAACAGTCGTCGACTCTCTTTGATGTTATGCTGGCTGCGAATTACCTAAACATAAAAGACCTGCTTGACCTTGGTTGCCAAACTGTTGCTGACATGATCACTGGCAAGAAACCAGACGAGATTCGTGCACTTCTTGGCATCGAGAACGATTTTACACCGGAGGAGGAAGAGGAGATTCGTAAGGAGAATCAATGGGCTTTTGAATGA >ASK14_at2g03170-sp ATGTCTTCCAACAAGATTGTTTTGTCTAGCTCCGATGGCGAATCTTTCGAGGTTGAAGAAGCGGTGGCAAGAAAACTGAAAATCGTGGAACACATGATTGAAGACGACTGTGTTGTTACCGAGGTCCCTCTTCAAAACGTCACCGGAAAGATCCTCTCCATTGTTGTCGAGTATTGCAAGAAACACGTCGTTGACGAAGAAAGCGACGAGTTCAAGACTTGGGACGAAGAGTTCATGAAGAAATTTGATCAGCCTACGGTCTTCCAACTCTTGCTCGCTGCTAACTATCTCAATATCAAAGGCCTTCTTGATCTCTCTGCTCAAACCGTTGCAGATCGCATCAAAGATAAGACTCCAGAGGAAATTCGAGAAATCTTCAACATCGAGAACGATTTCACACCCGAAGAAGAAGCAGCGGTTCGCAAGGAAAACGCATGGGCTTTTGAATAG >ASK15_at3g25650-sp 
ATGTCTTCTAACAAGATTGTGTTGACTAGTTCCGATGGCGAGTCTTTCCAAGTTGAGGAAGTGGTGGCACGAAAACTGCAGATCGTAAAGCACCTGCTCGAAGACGACTGTGTTATTAACGAAATCCCTCTTCAAAACGTTACAGGAAATATTCTCTCCATCGTTCTCGAGTATTGCAAGAAACACGTCGACGATGTGGTCGATGATGATGCATCTGAGGAGCCGAAGAAGAAGAAGCCCGATGATGAGGCGAAGCAGAATCTCGATGCTTGGGACGCAGAGTTCATGAAAAATATTGATATGGAAACAATCTTCAAGCTCATTCTCGCTGCTAACTATCTCAACGTCGAAGGTCTTCTTGGTCTCACTTGCCAGACTGTTGCAGATTACATCAAAGATAAGACGCCAGAGGAAGTTCGAGAACTCTTTAATATCGAGAATGATTTCACACATGAAGAAGAAGAAGAAGCGATTCGCAAGGAGAACGCTTGGGCTTTTGAGGCTGACACAAAACACGAAGATCCAAAGCCCTAG >ASK16_at2g03190-sp ATGTCTTCGAACAAGATTGTGTTGACTAGCTCGGATGATGAATCGTTCGAGGTTGAGGAAGCGGTGGCTCGTAAATTGAAGGTCATAGCACACATGATCGATGACGACTGCGCCGATAAAGCAATCCCGCTTGAAAACGTCACCGGAAATATCCTCGCTTTGGTTATCGAGTATTGCAAGAAACACGTACTTGATGATGTTGATGATAGTGATGATTCTACTGAAGCAACAAGCGAAAATGTAAACGAGGAAGCCAAGAACGAGCTCAGGACTTGGGACGCAGAGTTCATGAAAGAATTTGATATGGAAACAGTCATGAAACTCATTCTCGCTGTTAATTATCTCAACGTCCAAGATCTTCTTGGTCTCACTTGCCAGACCGTTGCAGATCACATGAAAGATATGTCGCCAGAGGAAGTTCGAGAACTCTTTAACATTGAGAATGATTACACACCTGAAGAAGAAGACGCGATTCGTAAGGAAAACGCTTGGGCTTTTGAGGATCTAAAGTAA >ASK17_at2g20160-sp ATGTCTTCGAAGAAGATTGTGTTGACTAGCTCCGATGATGAATGTTTTGAGATTGACGAAGCGGTGGCTCGTAAGATGCAGATGGTAGCGCACATGATCGATGACGATTGCGCCGATAAAGCAATCCGGCTTCAAAACGTCACTGGAAAGATCCTCGCTATCATTATCGAGTATTGCAAGAAACACGTTGATGATGTTGAAGCCAAGAATGAGTTCGTGACTTGGGACGCAGAGTTCGTGAAAAACATTGATATGGATACACTCTTCAAACTCCTTGACGCTGCTGACTATCTCATCGTCATAGGTCTCAAGAATCTCATTGCCCAGGCCATTGCAGATTACACTGCAGATAAGACGGTAAATGAGATTCGAGAACTCTTTAACATCGAGAACGATTACACACCTGAGGAAGAAGAAGAGCTTCGCAAGAAGAACGAATGGGCTTTCAATTAA >ASK18_at1g10230-sp 
ATGTCTTCTAACAAGATTTTGTTGACGAGTTCCGATGGCGAGTCTTTCGAGATCGACGAAGCGGTGGCGCGTAAGTTTCTGATCATAGTGCACATGATGGAGGATAACTGCGCCGGTGAAGCAATTCCGCTTGAAAATGTCACCGGGGATATCCTCTCCAAGATAATCGAGTACGCGAAGATGCACGTCAATGAACCTAGTGAAGAAGACGAAGACGAGGAGGCGAAGAAGAATCTAGACTCGTGGGACGCTAAGTTCATGGAAAAGCTAGATCTGGAGACCATCTTCAAAATCATTCTCGCTGCCAACTACCTAAACTTCGAAGGACTTCTCGGTTTCGCTAGCCAGACGGTTGCTGATTACATCAAGGACAAAACACCAGAGGAAGTACGAGAGATTTTCAACATCGAGAACGATTTCACGCCTGAAGAAGAGGAAGAGATTCGCAAGGAGAATGCTTGGACTTTTAATGAGTAA >ASK19_at2g03160-sp ATGTCTTCGAAAAAGATTGTGTTGACAAGCTCCGATGGTGAATCTTTCAAGGTTGAAGAAGTGGTGGCAAGAAAACTGCAGATCGTAGGACACATTATCGAAGACGACTGTGCTACAAACAAAATCCCTATTCCAAACGTTACCGGAGAGATTCTCGCCAAGGTTATCGAGTACTGCAAGAAACACGTTGAAGACGATGATGACGTGGTGGAGACGCATGAATCATCGACGAAAGGAGATAAAACAGTTGAGGAGGCGAAGAAGAAGCCTGATGATGTGGCCGTACCTGAATCAACTGAAGGAGATGATGAAGCTGAGGATAAGAAGGAGAAGCTTAATGAGTGGGATGCAAAGTTCATGAAGGATTTCGATATTAAGACGATCTTCGACATTATTCTGGCTGCTAACTATCTCAACGTCCAAGGTCTTTTTGATCTCTGTAGCAAGACCATTGCAGATTACATAAAAGATATGACGCCAGAGGAAGTTCGAGAACTCTTTAACATCGAGAATGATTTCACACCTGAAGAAGAAGAAGCAATTCGCAATGAAAACGCTTGGACTTTTGAGCAAGATGGAAAACAACAAGTTCCAAAACCCTAG >ASK01_at1g75950-usp 
aaataaaaataaatgttcaaaaaacatgatcttaaagctgacaaagctgatttgattgactaatacttatcctacggtgatttttggttcttttactttttttgacaattatggagctggatggaaaaaaaatatatataaaatcatatattattaataatgagataaatacaacgaattaaacggatcaaagttaatatttccaaaagaaaaatagaacagagtcctaatttcattaatttcaactattgaaaacaaaatttaaaatcaataggattgatttctatttttcttttagaaaaacaaaatttgaaacaatttcctaatttccctaaacttgcgactttttaacaatcgaccgacttatcaaaattagggattgttttatatataaagagagacgcatctctttatttcattcatcgcttctccaaaattttcttcaaagaacaaatctcccaaatctaaaatctttctcttctctcttcgtttccataaccATGTCTGCGAAGAAGATTGTGTTGAAGAGTTCCGATGGTGAATCTTTCGAGGTTGAGGAGGCGGTGGCTCTCGAGTCACAAACCATAGCGCATATGGTTGAAGACGACTGCGTCGACAACGGAGTCCCTCTTCCTAACGTCACGAGCAAGATCCTCGCCAAGGTGATCGAGTATTGCAAGAGGCACGTCGAGGCTGCTGCCTCTAAGGCCGAGGCCGTCGAGGGTGCTGCTACCTCCGATGACGATCTTAAGGCCTGGGACGCTGATTTTATGAAGATCGATCAAGCTACTCTCTTTGAACTCATTCTGgtatgtttcttctctcgatctgatttgatttttccatcgaattttgaattttgggattctagggttttcgatttgggaaaattagggtttcgaaatttaggtgtttgtttcagaaattgaatctgcttgagattgatattgttagggttcttatggaaccaatcattaattgaatctatcg atttggat >ASK02_at5g42190-usp 
aaaaaaaaataagaaaaataaaataaaaaataaataagttttgtcgtaacggtggacttggtttctagaatgtggatgattttaatacaacttaattagcaaaacaactgccgcaattgattatgattcttaactctttatttgcaaagatgttcaaagaaaaatattaccggaaatcaaatcacatgaatccaattaaattatacacaccatcctacaaatagaggagtttagtatctgttcaactactgattaatcaaaagttgatgaaaagagttgaattaattttcaggtgttttcctttgaaagaaaaaaacagagaaaatgttttaaaatgaaaatttataggtaaaaaaagtcaattgggaatcgttagatctcactggttcaatgtgtgagccgggctttaaaaacattttgatttaacccatacacacatctctctgttccacgattctcttcctcagcctccacgtcgtctctaaactcagcaaaaaccaATGTCGACGGTGAGAAAAATCACTCTTAAGAGTTCGGATGGCGAAAACTTCGAAATTGACGAAGCGGTGGCGCTAGAGTCACAAACCATCAAACATATGATTGAAGATGACTGTACCGATAATGGTATCCCTCTCCCTAATGTCACAAGCAAGATCCTTTCGAAGGTGATTGAGTACTGTAAGAGACATGTCGAAGCTGCTGAGAAATCCGAAACCACGGCCGATGCTGCTGCTGCTACTACTACCACCACCGTCGCGTCGGGTTCTAGTGATGAAGATCTCAAGACTTGGGATTCTGAGTTTATCAAAGTTGATCAGGGCACTCTCTTCGATCTTATCCTGgtttgtcaaacttatttttaattgctttggttttcaaagtttgcgatttcatttctagggtttgagatcttgatttctgtttgagatctaattttagggttcaggttttgtttagattgccaatttcacagtttaaactatgatcatg tctgattg >ASK03_at3g25700-usp 
tgttggttaccaatttttacattttcattgatattcactacaaaatacacaaataataattaattcataatctatgtgaacgtggaggtttacttttattaaaacataagaccctagtcaatgattttttcacacgtagtagatgattaactgtattttctgaaatcagatacgggtaaatctggaaagtaaatattattactaaggttgagcttttggaaaagtaaaaatatttctctttaaaagaaaaaaataataaagattttgttacttaaaattccaatatttgtttcccttttattgttttcctattattaaaaggattagattaacataaaagcaatcaaccgactttaattaccaagtaagaaattgtttttacatagatctataaatagggcaccaacttcccaaaccttgagaccatcacacaattcacaatcaatcgcagagccgattctcttcaaaacttgtctagtcctttgtccttgttgcaaacgATGGCAGAAACGAAGAAGATGATCATCCTCAAGAGCTCCGACGGTGAATCCTTCGAGGTCGAGGAAGCCGTCGCGGTCGAGTCCCAGACGATTAAGCACATGATCGAGGACGACTGCGTCGATAACGGAATTCCACTTCCCAATGTCACCGGAGCCATCCTCGCGAAGGTTATCGAGTACTGCAAGAAACACGTTGAAGCCGCTGCTGAGGCTGGTGGAGACAAGGATTTCTATGGTTCCACCGAGAACCACGAGCTCAAGACTTGGGACAACGATTTCGTCAAAGTTGATCATCCTACTCTCTTCGATCTCCTTCGGgttagtaatgtctttttctttgttttttggttttatgtttttagaattagggttttttatattttttccatgactatgttagggttttatttatattattgaatgttgtgttttgatttggagactaatcgtcttggtttataaagGCTGCCAACTATTTGAACATCAGTGG ACTTCTTG >ASK04_at1g20140-usp 
tattcactacaaaatacccaaataataattcataatttcacatagatttttacatacaaacgtggaggttttctttgattaaaacataaaaaccctagtcaatgatttttgcacacgtagataaactgtattttctgaaatcagatgtaaatctgaaaaagtaaatattattaacggaatacagctaaggtgaagtttgtggaaaagtaaaaatatttactttatttaaaacaacataaattttccattttttaaaaacttaaatttccaatatttgttttactttttactgttctcctattagtaaaaggagtagattaacttaaaagcaataagccggcaaaaaaaaaaaaacttaaaagcaataaaccgacttgaataccaattaagaaattggctataaataaggctccaacttcccaaaccttgacaccatcacaacaatcaatcgcagcgccgattctccttcaaaaacttttcctaagtcgtcactttttacgATGGCAGAAACGAAGAAGATGATCATCCTTAAGAGCTCCGACGGTGAATCCTTCGAGATCGAGGAAGCCGTCGCTGTTAAGTCCCAGACGATTAAGCACATGATTGAGGACGACTGTGCCGATAACGGAATTCCACTTCCCAATGTCACCGGAGCCATCCTCGCCAAGGTTATTGAGTATTGCAAGAAGCACGTTGAAGCCGCTGCTGAAGCTGGTGGAGACAAGGATTTCTATGGTTCCGCTGAGAACGACGAGCTTAAGAATTGGGACAGCGAATTCGTCAAAGTCGATCAGCCTACTCTCTTCGATCTCATCTTGgttagtaaagtcgctttcattgtttaggtttgatgttgttagaattagggtttttaatttggggatttaggtttgattttgcaggattctgttagggttttagtgtgttgaatgtcgttttgatttgactaatcgttttggttctttggtttgtacagGCTGCGAACTATTT GAACATCG >ASK05_at3g60020-usp 
gtgttggtgtgaaatctacgagtgtaattttatgtaacgtttaattatttttactagaatatgtatcattcagctttacaagattatatgtaatgtagctttatctgatgttaaacccccgaatttgtataactacgtttttgtgtgtttgttacattttgtactttgtctttgatggcaaatcttgaagtgagtgagaataacaacaaatcacataaaacaccaaccaaatcctttttttcttttgacatttaatccatctcgatcaagttttgggccgaatctacgaaactgggccaattgtaaaattcgacccggttcagcttggtttactccactgaaaatgttgatcccaccgactctgacagttggagcttataaatacacagacgcatagattcataagttcgttacatcatttttcttcaaatcgctctctaaattcttcttcaatcttgtttcatcaaccttgctttccagagaaaaatcgctccataacaATGTCGACGAAGATCATGTTGAAGAGCTCCGATGGTAAATCGTTCGAGATCGACGAAGACGTGGCACGCAAATCAATCGCGATAAACCATATGGTTGAGGACGGCTGCGCCACTGATGTAATACCGCTTCGAAACGTCACAAGCAAGATTCTCAAGATTGTGATCGATTATTGCGAGAAGCACGTCAAGAGCAAAGAAGAAGAAGATCTCAAGGAGTGGGACGCTGATTTCATGAAGACGATCGAAACAACCATTCTCTTTGATGTTATGATGGCTGCGAATTATCTCAATATTCAAAGCCTTCTTGATCTCACATGTAAAACTGTCTCGGATTTGCTCCAGGCTGATTTGCTCTCAGGGAAAACTCCAGATGAGATTCGCGCGCACTTCAACATCGAGAACGATCTAACAGCAGAGGAAGTAGCTAAGATTCGTGAGGAGAATCAATGGGCTTTTCAATGAgagagcggatcatcaaagttgttgatgc aaatctac >ASK06_at3g53060-usp 
gtttcctactttacacgtttgttacattttttctctttgttgcaaaccccaaatgtgtatgtgccatacaatatcaatcatctttgccatattttatcgaactgtttataggttactcaatatttttctcctcaaaaaatgtgaagactgacacgtaccaaatcttttaagtgagaatcacaacaaatcacataaaacaccaaccaattccattctttttcctcttttgacacactttaatccaaatctgatcaagttttggggccacattgcaaaattggggcccgttgtaagttttggtcccaatctgaaaatgcatatcccaccgactctgacaattagggcttataaataaacgagcacataagttcataacttggttacttcattattcttccatctttttcatcaaattgctaagagagtaaaatcgccccataacgaagtaaccatgtcgaagaagataatcgtgttgacaagctccgatgatgataaagggtATGGCAGAAGACGATTGTGCCGATAATGGAATCCCTCTTCCAAACGTGACAAGCAAGATACTCTTATTGGTGATCGAGTATTGCAAGAAGCACGTCGTTGAGAGCAAAGAAGAAGATCTAAAGAAGTGGGACGCTGAATTCATGAAGAAGATGGAACAATCGATTCTCTTTGATgttatgatggctgcgaattatctcaatatccaaagccttcttgatctcacattttcaaactgtcgctgatttgctctcagGCAAAACTCCAGGCGAGATTCGCTCATACTTCAATATCGAGAACGATTTCACAGCAGAGGGAGAAGCTGAGATCCGCAAGGTGAatcaatgggcttttgaatgaaagtggatcttcaaagttctttcttctttagtgttttcggttttcttgatgcgatgttagatgattacttcgtctatttcttgtttgttctgttgtttcttttcttttggtttcttgatgcataagtaaacc aatgtttg >ASK07_at3g21840-usp 
gtagataaaaaaaaaaaaagaattacatagaaacttaggagtaaacactgtaaaacacgaatttccaaaaaaaaaaaaagaagttgtagttataatttaaaagacacatataaaaaataatttagtggaaattaaaacaacataacatgagtagataacttaatgattgtgtgacttacttgacacgatacatcatacatgtatctatgaaagttagggcttacatactaacttacaagtaaacacatgaaaaagttgtggttttaatcaaaaagacacataaaaaaagactttagtggaagaacaacatcaacatgagtagacatacaatacaacttcaagctttcgtgacttacttgacacgatacatgcatatatgaaagttattagggcttataaatagacaaacgcataggttcataacttcattaccttattactcaatcatttactcaattcttcaatcttccagagaaaaaatctcccccactgaaaaaataATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCACGCCAATGCCAGATTATCATAGCTCATATGAGTGAAAATGATTGTACCGATAATGGAATCCCTCTTCCAAACGTGACAGGCAAGATTCTTGCGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCAAGAAGTGGGACAAGGAGTTCATGGAAAAAGATACATCCACGATCTTTGATCTCATCAAGGCTGCGAATTACCTAAACATCAAAAGCCTTTTTGATCTAGCATGCCAAACCGTCGCGGAAATCATCAAAGGCAACACTCCTGAGCAGATTCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGAgcggattatcaaaacctaaacagctttc tttctttt >ASK08_at3g21830-usp 
aagaagaattatatagaaacttaggagtaaacactgtaaaacacgaattttctaattaaaaaaaaagttgtagttataatctaaaagacacatagaaaaatactttagtggaaattaaaacaacataacatgagtagataacttaatgattttgtgacttacttgacacgatacatcatacatgtatctatgaaagttagggcttacatactaacttacaagtaaacacataaaaatgttgtggttttaatcaaaaagacacataaaaaaaagacttaatggaagaacaacatcaacatgagtagacatacaacttcatgctttcgtgacttacttgacacgatacatgcatctatgaaagttagggcttataaatagacaagactcataggttcataacttcattaccttattactcaatcattttctcaattcttcaatctttcatcaaaatttcttccagagaaaaaaaacaaatctcccccacaaagaaaacaacaATGTCGACGAAAAAGATCATGTTGAAGAGCTCCGAGGGTAAAACGTTTGAGATTGAAGAAGAGACCGCACGCCAATGCCAGACCATAGCTCATATGATTGAAGCCGAATGTACAGATAACGTAATCCTGGTTTTAAAGATGACAAGCGAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCGAGAAGTGGGACAAGGAGTTCATGGAAAAAGATAAATCCACGATCTTTGCTCTCACCAATGCTGCGAATTTCCTAAACAACAAAAGCCTTCTTCATCTAGCAGGCCAAACCGTCGCGGATATGATCAAAGGCAACACTCCGAAGCAGATGCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGAgcggattatcaaaacctaaacagctttcttt cttttctt >ASK09at3g21850-usp 
gtagataaaaaaaaaaaaagaattacatagaaacttaggagtaaacactgtaaaacacgaatttccaaaaaaaaaaaaagaagttgtagttataatttaaaagacacatataaaaaataatttagtggaaattaaaacaacataacatgagtagataacttaatgattgtgtgacttacttgacacgatacatcatacatgtatctatgaaagttagggcttacatactaacttacaagtaaacacatgaaaaagttgtggttttaatcaaaaagacacataaaaaaagactttagtggaagaacaacatcaacatgagtagacatacaatacaacttcaagctttcgtgacttacttgacacgatacatgcatatatgaaagttattagggcttataaatagacaaacgcataggttcataacttcattaccttattactcaatcatttactcaattcttcaatcttccagagaaaaaatctcccccactgaaaaaataATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCACGCCAATGCCAGATTATCATAGCTCATATGAGTGAAAATGATTGTACCGATAATGGAATCCCTCTTCCAAACGTGACAGGCAAGATTCTTGCGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCAAGAAGTGGGACAAGGAGTTCATGGAAAAAGATACATCCACGATCTTTGATCTCATCAAGGCTGCGAATTACCTAAACATCAAAAGCCTTTTTGATCTAGCATGCCAAACCGTCGCGGAAATCATCAAAGGCAACACTCCTGAGCAGATTCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGAgcggattatcaaaacctaaacagctttc tttctttt >ASK10_at3g21860-usp 
tggaacatttgtggtattaagtgaaaacaaagagatctcgacatgccgtgtaataatattaagatcaattaattgcgtaagacgtgatatttatgcatgcatttatagaaactttgaagacgcgttatgtgtataaaacaacatcatagtgcatttttaataaaaaaagtttatttggaatatttgagtaggtggttagtccaatatataaaattgaattacatataatcttacaagtaaacacgataaaaagttgtggttttagctttaaaagactttagtggaaaacaacatcaacatgagtagacataaaacttcaagctttcgtgacttgcttgacacgatacatgcatctatgaaaattattagggcttattaatagacaaacgcataggttcataacttcattaccttattactgaaatccttctctcaattctcaatcattttctcaattcttcaatcttccagagaaaaaatctccccccagcgaaaaaataATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCATGCCAATGCCAGACCATAGCTCATATGAGTGAAGACGATTGTACCGATAATGGAATCCCGCTTCCAGAAGTGACAGGCAAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGAGGATCTCAAGAAGTGGGACAAGGAATTCATGGAAAAATATCAATCCACGATCTTTGATCTCATTATGGCTGCGAATTACCTAAACATCAAAAGCCTTCTTGATCTAGCATGCCAAACCGTCGCGGATATGATCAAAGACAACACTGTGGAGCACACTCGCAAATTCTTCAACATTGAGAATGATTATACACATGAGGAAGAAGAAGCGGTTCGTAGGGAGAATCAATGGGGTTTTGAATGAgcgggtgaagcaaacactaagcagctttctt tcttttct >ASK11_at4g34210-usp 
acattaaagtcaaaaacaattattatttttaaagaatattagtgatatattttacctttatgaaacattattaatatttaaacataaaaaattaaaatatcatgagacgggaagtatagcagaacgggtttggattgataggaaatacatcgagacgggtttggattgatataacacaggtaaaatatttattttattattttatcatcactaatttttttaattacaagattaatataaaattatataactttattactaaatataaaattataaaatttaaaattattatatatataaataaaatatttgtccgtagtgtaccacgcgttgaattataatatgttgttgtcaccgactctagctagggcttataaatacaaagacgcatagggtttcacaacttcatcaccttaacacaatttgttcgctctctaaattcttaaatctttcatcaaatttgcttcacagtgaaaaacctcctccacaaggaacacacaATGTCTTCGAAGATGATCGTGTTGATGAGCTCCGATGGTCAGTCGTTTGAGGTGGAAGAAGCGGTAGCAATCCAATCGCAGACCATAGCGCATATGGTTGAAGACGATTGCGTTGCTGATGGAATCCCTCTTGCAAACGTGGAAAGCAAGATTCTTGTGAAAGTGATCGAGTACTGCAAGAAACACCACGTCGACGAGGCTAATCCTATCTCTGAAGAGGATCTCAACAACTGGGACGAGAAGTTCATGGATCTCGAACAATCCACCATCTTTGAACTCATCCTTGCTGCGAATTACCTCAACATAAAAAGCTTGCTTGATCTCACATGCCAAACTGTTGCGGACATGATCAAAGGCAAGACTCCAGAGGAGATTCGTTCAACTTTCAACATTGAGAATGATTTTACACCTGAGGAAGAAGAAGCTGTTCGTAAGGAGAATCAATGGGCTTTTGAATGAgcccatgaatcaaaaccctaactagctttct ttttcttt >ASK12_at4g34470-usp 
gacttcttttggtttcttataaatgaatctgaatattatctttggataaagtttcctatattccaatgtatatctatatccctcttaaattgttacattttacgccgtacctttataattgggcctattgtaaatttaaaccggtttagctggttcactttactgaatttacattttctgaaaataaggatttgagccgaacctacaacaaaaaaatgaaaatttgacccggtttattttggtttactcatctgaaattaccttttatgaaaaggagtcgacaaataatttacttattttcttcggttaaggaaaataactatttccttcttaaaaaaggaatggttgatctcaccgactctagctagggcttataaatacaaagacgcatagggtttcacaacttcatcaccttaacacaatttgttcgctctctaaattcttaaatctttcatcaaatttgcttcacagtgaaaaacctcctccacaaggaacacacaATGTCTTCGAAGATGATCGTGTTGATGAGCTCCGATGGTCAGTCGTTTGAGGTGGAAGAAGCGGTAGCAATCCAATCGCAGACCATAGCGCATATGGTTGAAGACGATTGCGTTGCTGATGGAATCCCTCTTGCAAACGTGGAAAGCAAGATTCTTGTGAAAGTGATCGAGTACTGCAAGAAATACCACGTCGACGAGGCTAATCCTATCTCTGAAGAGGATCTCAACAAGTGGGACGAGAAGTTCATGGATCTCGAACAATCCACCATCTTTGAACTCATCCTTGCTGCGAATTACCTCAACATAAAAAGCTTGTTTGATCTCACATGCCAAACTGTTGCGGACATGATCAAAGGCAAGACTCCAGAGGAGATTCGTTCAACTTTCAACATTGAGAATGATTTTACACCTGAGGAAGAAGAAGCTGTTCGTAAGGAGAATCAATGGGCTTTTGAATGAgcccatgaatcaaaaacctaactagctttct ttttcttt >ASK13_at3g60010-usp 
aacacatgataaatttacgtgaccataattacttttcttcttttcttttgttttgaatgacataaacaaaatttgtagatatttttcttgctttcggagttttttaaaatatggaatatacacatgtaacgttttagaaagtttccatttttttttctttttacctcttttttgttggttcactcgttaaactcttgttacaattgtattgaattagagccgttcattatgacacgtagattcggatctgatttcctttttgtaccgaaattagatcaattgggactagatccatcaaaaacccggtttaggattggtttactcaactgaattttccttttaggttaaaatagtaatttcatccattatgaaaaaaggcaagaattataccgactctcacaattaggtctctctataaatacacaccttttgatatctctccatcatcgaaaaccttgtaaacaaaaatcgccaccacaaaaagaaaaaagaaggaaacgATGTCGAAGATGGTTATGTTGCTGAGCTCCGATGGTGAATCTTTCCAGGTCGAAGAAGCAGTCGCGGTCCAGTCACAGACGATAGCACATATGATTGAAGACGATTGCGTCGCCAATGGAGTCCCTATCGCAAACGTTACAGGAGTCATCCTCTCGAAGGTGATCGAGTATTGCAAGAAACACGTCGTTTCTGATTCACCAACCGAAGAGAGCAAAGACGAACTCAAGAAGTGGGACGCTGAGTTCATGAAGGCCCTGGAACAGTCGTCGACTCTCTTTGATGTTATGCTGGCTGCGAATTACCTAAACATAAAAGACCTGCTTGACCTTGGTTGCCAAACTGTTGCTGACATGATCACTGGCAAGAAACCAGACGAGATTCGTGCACTTCTTGGCATCGAGAACGATTTTACACCGGAGGAGGAAGAGGAGATTCGTAAGGAGAATCAATGGGCTTTTGAATGAttctttagttttctttttcgacgtt agtgtgct >ASK14_at2g03170-usp 
ttagtagttattcaaataaaagtagaaaattaaaacctaataaactaaaagaacgcaactgattacaaaacttaatttatagactcttatcctataattagattaataattaatcaattaaattcaattctaagtaagatggacaaattaattaaataaataaaatgtcatacaaaattttcatagaaaatagccatgtttaggaaaaaaactctttttgtatgttaaataagtttctagaatctcctgataaactcttttactaagacgtcggtttttacttttaagcccaaaagttttaaggctaaacactagagatttggaagatttgcttttcttctcttccagtattgtgaaaaggaaaacacaattgaccgactcttaaaatattttataaatagacgccttcatcgactcctctctattccattcaatatctttgcataaatcataaccaatattattcttttcatcacccaaaatctcaaaaacgaacaaacATGTCTTCCAACAAGATTGTTTTGTCTAGCTCCGATGGCGAATCTTTCGAGGTTGAAGAAGCGGTGGCAAGAAAACTGAAAATCGTGGAACACATGATTGAAGACGACTGTGTTGTTACCGAGGTCCCTCTTCAAAACGTCACCGGAAAGATCCTCTCCATTGTTGTCGAGTATTGCAAGAAACACGTCGTTGACGAAGAAAGCGACGAGTTCAAGACTTGGGACGAAGAGTTCATGAAGAAATTTGATCAGCCTACGGTCTTCCAACTCTTGCTCGCTGCTAACTATCTCAATATCAAAGGCCTTCTTGATCTCTCTGCTCAAACCGTTGCAGATCGCATCAAAGATAAGACTCCAGAGGAAATTCGAGAAATCTTCAACATCGAGAACGATTTCACACCCGAAGAAGAAGCAGCGGTTCGCAAGGAAAACGCATGGGCTTTTGAATAGacaccaaaaccctagtttttggtttcgtttcttattgtcg gttttagg >ASK15_at3g25650-usp 
acattgactcttttggatcaaaatggatggtacatatcttgtgatttataatttttagcttgttacaatcgtcattgcatagaacttatatgagtattatggattgatcttatatttaaacataaaaatatttttatacaaataaatactttataagtttccaagatataaatttgatattttatatagatatataaatataaaattttacaaataaatatctttctcttttgcggatccaataaaaggggttaatcccaacaaactagttaagttagaatttgcgttattttccttatttccttgagaaaaaggaaacaacatgaaacctaattctaattggaaaaggaaaacacaaattgaccgactcttaggatttctatagataaacacattcatcgactcttttatttttcattcaacaatctctcatcaaatttctctgcacaaaaaataaaagcaaaaacttattttttcgtttaacacacaaagcaaacaaaATGTCTTCTAACAAGATTGTGTTGACTAGTTCCGATGGCGAGTCTTTCCAAGTTGAGGAAGTGGTGGCACGAAAACTGCAGATCGTAAAGCACCTGCTCGAAGACGACTGTGTTATTAACGAAATCCCTCTTCAAAACGTTACAGGAAATATTCTCTCCATCGTTCTCGAGTATTGCAAGAAACACGTCGACGATGTGGTCGATGATGATGCATCTGAGGAGCCGAAGAAGAAGAAGCCCGATGATgtggcgggccggttcctgaatcaactgaagaaggagatgatgcatctgagGAGGCGAAGCAGAATCTCGATGCTTGGGACGCAGAGTTCATGAAAAATATTGATATGGAAACAATCTTCAAGCTCATTCTCGCTGCTAACTATCTCAACGTCGAAGGTCTTCTTGGTCTCACTTGCCAGACTGTTGCAGATTACATCAAAGATAAGACGCCAGAGGAAGTTCGAGAACTCTTTAATATCGAGAA TGATTTCA >ASK16_at2g03190-usp 
taaatgttgaaaaagtaagtatcttagaacaaagtacatgtcatcaaattaattatttctaaaacttataaataaaaaaatctcactatattatactatatttaatatttacattgagttgataagagataccaaaaattaaattttgcttttaggtgaaatcctatctataatggttaatatttttaaaatattaaattacaatattagtataacttatataaaatctaatcaatataattttattagcatgtgtgtagtatatgtatgggtcaaatattaaaagaaaatataagagcctaacaattaattatatagtaggattttgttatatttatttttccttactatatttcctaaattaaagatttgacccactctcaactcattataaaaagagagacattatttcatagcttcatcaattttattcaacaatttctctcatttcatttcttccataaaatctctcaaaaatcaacaaaacaacaaaacaaacaATGTCTTCGAACAAGATTGTGTTGACTAGCTCGGATGATGAATCGTTCGAGGTTGAGGAAGCGGTGGCTCGTAAATTGAAGGTCATAGCACACATGATCGATGACGACTGCGCCGATAAAGCAATCCCGCTTGAAAACGTCACCGGAAATATCCTCGCTTTGGTTATCGAGTATTGCAAGAAACACGTACTTGATGATGTTGATGATAGTGATGATTCTACTGAAGCAACAAGCGAAAATGTAAACGAGGAAGCCAAGAACGAGCTCAGGACTTGGGACGCAGAGTTCATGAAAGAATTTGATATGGAAACAGTCATGAAACTCATTCTCGCTGTTAATTATCTCAACGTCCAAGATCTTCTTGGTCTCACTTGCCAGACCGTTGCAGATCACATGAAAGATATGTCGCCAGAGGAAGTTCGAGAACTCTTTAACATTGAGAATGATTACACACCTGAAGAAGAAGACGCGATTCGTAAGGAAAACGCTT GGGCTTTT >ASK17_at2g20160-usp 
atgacctttttttggtgaaaacaattatttatgacattgtcatagagctaattattttaaaaacttataaatagaaaatctcactatattatacccaaaatgaaaatttatgttaggtgaaactgaatccaatatataaatataacaatatatggttaatatttttttgaatattaagatagcatctataaaatctcatgaatattttttattagcatatgtgtagtatatgggtcaaatattaaaagataagataagagcctaattaattgattaggattttgttatatttacttttgctttctcttcccttatttaggaaaaaaagagaggaaaatatattttatatatatttcctcaattatagatttgacctcactctcaactcattataaaaagagagacttgcatagcctgagactcatcaatttcatataacaatttcttccataagatatctcaaaaatcttgctcttcgttaaaccagcaaaacaaacaaaATGTCTTCGAAGAAGATTGTGTTGACTAGCTCCGATGATGAATGTTTTGAGATTGACGAAGCGGTGGCTCGTAAGATGCAGATGGTAGCGCACATGATCGATGACGATTGCGCCGATAAAGCAATCCGGCTTCAAAACGTCACTGGAAAGATCCTCGCTATCATTATCGAGTATTGCAAGAAACACGTTGATGATGTTGAAGCCAAGAATGAGTTCGTGACTTGGGACGCAGAGTTCGTGAAAAACATTGATATGGATACACTCTTCAAACTCCTTGACGCTGCTGACTATCTCATCGTCATAGGTCTCAAGAATCTCATTGCCCAGGCCATTGCAGATTACACTGCAGATAAGACGGTAAATGAGATTCGAGAACTCTTTAACATCGAGAACGATTACACACCTGAGGAAGAAGAAGAGCTTCGCAAGAAGAACGAATGGGCTTTCAATTAAtaaccctaaagttccgttgtctaagtgttgatctcga tgttcttt >ASK18_at1g10230-usp 
taatagtttggcccattattccgaatcatttccttgttaattgagtgtattaataaatgttttggactggtttgtaccaaaaaataaaataatgttaagcttagaagattagaagatatacaataaactttccaaatcggcaacaaaacaaagtatttgatattggtttgatgtgtttcacgaccatagaacaagacggtacattatactatatgaatgttggcaaaagacaagtttttatgtttagttacgtttctctgcaaacgaagatattttttagtttgatcggttttctacaaaccggttccaaatcaatatgatcccattttggttttctcttcagaatgttctagaatcagatgagtgggacttgttgagtacagccgtataggttgtttgggctttatgaaatctttaggcccataattctgatccattatatttccttttctcccctacttgtaagggtttattctggattactcatttgccttataacaatggcttcttcttccgaagagattgtgtccgccggcgaatcatcagagatcgaggaagcggttgcgagtctaaccATGTCTTCTAACAAGATTTTGTTGACGAGTTCCGATGGCGAGTCTTTCGAGATCGACGAAGCGGTGGCGCGTAAGTTTCTGATCATAGTGCACATGATGGAGGATAACTGCGCCGGTGAAGCAATTCCGCTTGAAAATGTCACCGGGGATATCCTCTCCAAGATAATCGAGTACGCGAAGATGCACGTCAATGAACCTAGTGAAGAAGACGAAGACGAGGAGGCGAAGAAGAATCTAGACTCGTGGGACGCTAAGTTCATGGAAAAGCTAGATCTGGAGACCATCTTCAAAATCATTCTCGCTGCCAACTACCTAAACTTCGAAGGACTTCTCGGTTTCGCTAGCCAGACGGTTGCTGATTACATCAAGGACAAAACACCAGAGGAAGTACGAGAGATTTTCAACATCGAGAACG ATTTCACG >ASK19_at2g03160-usp 
attcacgatcgaggaacttgtacaataaccagtaatcttgcgagtttctttttttttccatgaataataataccaaaaaattagatgtttaaggttttgttagattctacataatctaatatgctcttatatcataattaaattaataattaatcccacgtcaaacaaaattagaagattacagaaaatgttataaactttgcatcgaaaaatacaaatcaaggaacaaaaggagtatattagtatattgctaaagaagtttctagaaatatatatatcttccactaaaccctagttaaaataggatttgagtctattttccttatttctttgtgagaaaaggaaacaacatgaaacccaatatctaattgtgaaaaggaaaacacaattgactgaatcttagggttctataaatagactgattcaacaatctctcatcaaaatttctcttcacaaacaattgactcttgtttttcgttcaacacacaaagcaaaaaacaATGTCTTCGAAAAAGATTGTGTTGACAAGCTCCGATGGTGAATCTTTCAAGGTTGAAGAAGTGGTGGCAAGAAAACTGCAGATCGTAGGACACATTATCGAAGACGACTGTGCTACAAACAAAATCCCTATTCCAAACGTTACCGGAGAGATTCTCGCCAAGGTTATCGAGTACTGCAAGAAACACGTTGAAGACGATGATGACGTGGTGGAGACGCATGAATCATCGACGAAAGGAGATAAAACAGTTGAGGAGGCGAAGAAGAAGCCTGATGATGTGGCCGTACCTGAATCAACTGAAGGAGATGATGAAGCTGAGGATAAGAAGGAGAAGCTTAATGAGTGGGATGCAAAGTTCATGAAGGATTTCGATATTAAGACGATCTTCGACATTATTCTGGCTGCTAACTATCTCAACGTCCAAGGTCTTTTTGATCTCTGTAGCAAGACCATTGCAGATTACATAAAAGATATGACGCCAGAGGAAGTTC GAGAACTC ### last but not least showdb [just look for the ###] schleppi:/EMBOSS/data/ask # showdb Displays information on the currently available databases # Name Type ID Qry All Comment # ==== ==== == === === ======= qapblast P OK OK OK BLAST swissnew qapblastall P OK OK OK BLAST swissnew, all fields indexed qapblastsplit P OK OK OK BLAST swissnew split in 5 files qapblastsplitexc P OK OK OK BLAST swissnew split in 5 files, not file 02 qapblastsplitinc P OK OK OK BLAST swissnew split in 5 files, only file 02 qapfasta P OK OK OK FASTA file swissnew entries qapflat P OK OK OK Swissnew flatfiles qapflatall P OK OK OK Swissnew flatfiles, all fields indexed qapflatexc P OK OK OK Swissnew flatfiles, no updated sequence file qapflatinc P OK OK OK Swissnew flatfiles, only updated sequence file qapir P OK OK OK PIR qapirall P OK OK OK PIR qapirinc P OK OK OK PIR tpir P OK OK OK PIR using NBRF access for 4 files tsw P OK OK OK Swissprot native format with EMBL CD-ROM index tswnew P OK OK OK Swissnew as 
3 files in native format with EMBL CD-ROM index
twp          P  OK  OK  OK  EMBL new in native format with EMBL CD-ROM index
[###]askn    N  OK  OK  OK  ASKxx_atxgxxxxx-sp/usp[###]
qanfasta     N  OK  OK  OK  FASTA file EMBL rodents
qanfastaall  N  OK  OK  OK  FASTA file EMBL rodents, all fields indexed
qanflat      N  OK  OK  OK  EMBL flatfiles
qangcg       N  OK  OK  OK  GCG format EMBL
qangcgall    N  OK  OK  OK  GCG format EMBL
qangcgexc    N  OK  OK  OK  GCG format EMBL without prokaryotes
qangcginc    N  OK  OK  OK  GCG format EMBL only prokaryotes
qapirexc     N  OK  OK  OK  PIR
qasrs        N  OK  OK  -   EMBL in local srs installation
qasrsfasta   N  OK  OK  -   EMBL in local srs installation, fasta format
qasrswww     N  OK  -   -   Remote SRS web server
qawfasta     N  OK  OK  OK  FASTA file wormpep entries
tembl        N  OK  OK  OK  EMBL in native format with EMBL CD-ROM index
tgb          N  OK  -   -   Genbank IDs
tgenbank     N  OK  OK  OK  GenBank in native format with EMBL CD-ROM index
schleppi:/EMBOSS/data/ask #
-------------------------------------------------------------------------------------------

From faurie at clermont.in2p3.fr Wed Aug 28 08:13:30 2002
From: faurie at clermont.in2p3.fr (julien Faurie)
Date: Wed, 28 Aug 2002 14:13:30 +0200
Subject: generate a FASTA file with a annotation file (extension .dat)
Message-ID: <3D6CBE6A.5E492550@clermont.in2p3.fr>

Hi everybody,

I have a small question.
I downloaded the Swiss-Prot release and now I would like to generate a
FASTA file from it.

In the EMBOSS documentation I found the command "seqret". Is it a good
idea to use it, or are there other commands for creating a FASTA file
from an annotation file?
I don't want to lose any information: I want to convert all the
sequences in the annotation file to a FASTA file.

Thanks in advance for your help.
Julien Faurie

"I apologise for my English"

From peter.rice at uk.lionbioscience.com Wed Aug 28 09:04:04 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Wed, 28 Aug 2002 14:04:04 +0100
Subject: generate a FASTA file with a annotation file (extension .dat)
References: <3D6CBE6A.5E492550@clermont.in2p3.fr>
Message-ID: <3D6CCA44.7020103@uk.lionbioscience.com>

julien Faurie wrote:
> I downloaded swissprot release and now, I would like to generate my
> fasta file.
>
> In EMBOSS documention, I found the command "seqret"

Good choice. You can pick the fasta format you prefer:

% seqret seq.dat -sformat swissprot swissprot.fasta

... gives you fasta headers like this:

>100K_RAT Q62671 100 KDA PROTEIN (EC 6.3.2.-).

% seqret seq.dat -sformat swiss swissprot.ncbi -osformat ncbi

... gives you fasta headers like this:

>gnl|unk|100K_RAT (Q62671) 100 KDA PROTEIN (EC 6.3.2.-).

% seqret seq.dat -sformat swiss swissprot.blast -osformat ncbi -osdbname sp

... gives you fasta headers like this:

>gnl|sp|100K_RAT (Q62671) 100 KDA PROTEIN (EC 6.3.2.-).

Note in passing ... it seems 100K_RAT is no longer the first entry in
SwissProt, as accession Q62671 is now called KC11_RAT (Casein Kinase I
gamma 1 isoform).

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From faurie at clermont.in2p3.fr Wed Aug 28 09:12:59 2002
From: faurie at clermont.in2p3.fr (julien Faurie)
Date: Wed, 28 Aug 2002 15:12:59 +0200
Subject: generate a FASTA file with a annotation file (extension .dat)
References: <3D6CBE6A.5E492550@clermont.in2p3.fr> <3D6CCA44.7020103@uk.lionbioscience.com>
Message-ID: <3D6CCC5B.A353FA38@clermont.in2p3.fr>

> > In EMBOSS documention, I found the command "seqret"
>
> Good choice. You can pick the fasta format you prefer:
>
> % seqret seq.dat -sformat swissprot swissprot.fasta
>
> ... gives you fasta headers like this:
>
> >100K_RAT Q62671 100 KDA PROTEIN (EC 6.3.2.-).
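
The three header layouts listed in Peter Rice's reply can be compared side by side with a small Python sketch. It only imitates the header strings quoted in the message (the plain layout, `-osformat ncbi`, and `-osformat ncbi -osdbname sp`); it is not seqret and does not read Swiss-Prot files. The function name `fasta_header` and its parameters are hypothetical, chosen for illustration.

```python
def fasta_header(entry_id, accession, description, style="fasta", dbname="unk"):
    """Build a FASTA header line in one of the layouts quoted in the thread.

    Hypothetical helper for illustration only; seqret produces the real thing.
    """
    if style == "fasta":
        # Plain layout: >ID ACCESSION description
        return f">{entry_id} {accession} {description}"
    if style == "ncbi":
        # NCBI-style layout: >gnl|dbname|ID (ACCESSION) description
        return f">gnl|{dbname}|{entry_id} ({accession}) {description}"
    raise ValueError(f"unknown style: {style}")


# The example entry used in the message
entry = ("100K_RAT", "Q62671", "100 KDA PROTEIN (EC 6.3.2.-).")
print(fasta_header(*entry))
print(fasta_header(*entry, style="ncbi"))
print(fasta_header(*entry, style="ncbi", dbname="sp"))
```

The only difference between the second and third layouts is the database name field, which is what `-osdbname sp` changes from the default `unk` in the quoted output.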
Thank you very much and I prefer the first command at this time.

>
> Note in passing ... it seems 100K_RAT is no longer the first entry in
> SwissProt, as accession Q62671 is now called KC11_RAT (Casein Kinase I
> gamma 1 isoform).

this note is not important for my translation, isn't it ???

Thanks for your help.
Regards.

Julien Faurie.

From abrown at nimr.mrc.ac.uk Fri Aug 30 09:14:18 2002
From: abrown at nimr.mrc.ac.uk (Alex Brown)
Date: Fri, 30 Aug 2002 14:14:18 +0100
Subject: PROSITE Help Required
Message-ID: <6410D05B-BC1A-11D6-8606-0003938768AC@nimr.mrc.ac.uk>

Hi.

I am trying to install the PROSITE database into my EMBOSS installation.
I obtained the files prosite.dat and prosite.doc by anonymous FTP from
ftp.ebi.ac.uk, and have placed both of these in the
/EMBOSS/data/PROSITE directory.

I then ran prosextract : this gave the following error

   EMBOSS An error in prosextract.c at line 83:
Cannot open file data/PROSITE/prosite.dat

What am I doing wrong ??

I am running on a Mac PowerBook G4, under Darwin with XDarwin (XFree86
4.2.0).

Many thanks,

Alex Brown.

From haruna at sgi.com Sat Aug 31 00:56:45 2002
From: haruna at sgi.com (Haruna Cofer)
Date: Sat, 31 Aug 2002 00:56:45 -0400
Subject: EMBOSS, EMBASSY, Jemboss on SGI
Message-ID: <3D704C8D.659587C@sgi.com>

Hello!

Just FYI, I have placed my SGI porting notes for all of
EMBOSS, EMBASSY, and Jemboss 2.5.0 on the SGI web site:

http://www.sgi.com/industries/sciences/chembio/resources/emboss/

Thank you, and please do let me know if you have any
questions or suggestions!

-- Haruna :)

--
Haruna N. Cofer
Silicon Graphics Inc.
ChemPharm Applications

From peter.rice at uk.lionbioscience.com Thu Aug 1 15:18:12 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Thu, 01 Aug 2002 16:18:12 +0100
Subject: Request (est2genome)
References: <5B72626A-A55B-11D6-B14F-0005024329A7@esbs.u-strasbg.fr>
Message-ID: <3D495134.FA79A151@uk.lionbioscience.com>

Jean-Christophe Amé writes:
> I am looking for a program that would allow to do a multiple alignment
> of multiple ESTs sequences against a genomic sequence almost like
> est2genome does but with multiple ESTs. Does it exist ??

est2genome accepts multiple ESTs as input !!!

But it reports one set of exons and alignment for each EST.
We are looking into alignment and report format output for est2genome,
so one obvious question:

What format alignment would you like from est2genome (we could build
one alignment of all ESTs against the genomic sequence, or separate
alignments, and allow a choice of alignment format).

(also, what format report would be most useful?)

.... we will have to keep the original format output for users who are
dependent on parsing it.

regards,

Peter Rice

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From michael.hanlon at bbsrc.ac.uk Wed Aug 7 15:55:32 2002
From: michael.hanlon at bbsrc.ac.uk (michael hanlon (BITS))
Date: Wed, 7 Aug 2002 16:55:32 +0100
Subject: problem with seqret and protein sequences.
Message-ID: <41773CEF2B8FD411920200508BDCDC12BE146D@bits-exch1.bits.bbsrc.ac.uk>

Due to changes in format at Sanger, we have edited our emboss.default to
use the SRS service at EBI instead. Here's the relevant part:

DB swall [ type: P method: url format: embl
   url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[SWall-id:%s]"
   comment: "EBI Swissprot IDs and Trembl IDs" ]

DB swalla [ type: P method: url format: embl
   url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[SWall-acc:%s]"
   comment: "EBI Swissprot ACs and Trembl ACs" ]

DB embl [ type: N method: url format: embl
   url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[EMBLRELEASE-ID:%s]"
   comment: "EBI EMBL ACs" ]

DB embla [ type: N method: url format: embl
   url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[embl-acc:%s]"
   comment: "EBI EMBL IDs" ]

The embl bits work OK, but not the swall; I just get 'unable to read sequences'.

Any help much appreciated.
Mike

From ableasby at hgmp.mrc.ac.uk Thu Aug 8 08:43:46 2002
From: ableasby at hgmp.mrc.ac.uk (ableasby at hgmp.mrc.ac.uk)
Date: Thu, 8 Aug 2002 09:43:46 +0100 (BST)
Subject: EMBOSS 2.5.0 available
Message-ID: <200208080843.JAA29945@bromine.hgmp.mrc.ac.uk>

EMBOSS 2.5.0 is now available for download.

As well as new programs (e.g. mwcontam, aaindexextract), this release
handles chunked HTTP and sequences as values in EMBL rpt-unit entries,
and brings improved database indexing, extensions to several programs
and documentation corrections (documentation for the protein structure
programs is being produced). See the ChangeLog file for full details.

Jemboss beta 2.7 contains new features including improved file
management and file managers with pop-up menus, browsable EMBOSS help,
report formats and multiple selection and deletion of results. It also
contains a number of fixes and ways of speeding up the response of the
interface.

The Jemboss server includes code which can be used to send the Jemboss
batch jobs to a network queueing system. There are also extra
jemboss.properties parameters to define the environment variables and
the URL for the EMBOSS applications help.

Alan

From peter.rice at uk.lionbioscience.com Thu Aug 8 08:21:33 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Thu, 08 Aug 2002 09:21:33 +0100
Subject: problem with seqret and protein sequences.
References: <41773CEF2B8FD411920200508BDCDC12BE146D@bits-exch1.bits.bbsrc.ac.uk>
Message-ID: <3D522A0D.E71A0706@uk.lionbioscience.com>

"michael hanlon (BITS)" wrote:
>
> Due to changes in format at Sanger, we have edited our emboss.default to use the SRS service at EBI instead.
> Heres the relevant part:
>
> DB swall [ type: P method: url format: embl
>    url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[SWall-id:%s]"
>    comment: "EBI Swissprot IDs and Trembl IDs" ]
>
> DB embl [ type: N method: url format: embl
>    url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[EMBLRELEASE-ID:%s]"
>    comment: EBI EMBL ACs" ]
>
> The embl bits work OK, but not the swall, I just get 'unable to read sequences'

You are lucky with EMBL :-)

For SwissProt, you should add +-ascii to get the URL to work. It depends
on the default views defined for the EBI's SRS server.

But, with EMBOSS 2.5.0 (should be fine with 2.4.1 too, though there was
a problem with chunked HTTP output) you can use method: srswww and
define the SRS database name in dbalias.

With this definition you can use swall-id:amir_pseae and
swall-acc:P10932.

If you add fields: "sv des key org" you can search those too in the USA,
though beware of huge result sets because EMBOSS will keep them in
memory. A small search would be embl-org:aardvark (27 entries, at least
until an aardvark EST project starts :-)

Without the fields definition you are safer because users are less
likely to generate a huge request to an SRS server with an ID or ACC
query.

Method: srswww builds the full SRS URL for you (with +-e+-ascii), and
converts the USA queries into SRS queries. It was added in EMBOSS 2.4.0.
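
The `method: url` definitions discussed above rely on a `%s` placeholder that is replaced by the entry name. A minimal Python sketch of that substitution follows. It is an illustration of the string handling only, not EMBOSS code: the helper name `expand_url` is hypothetical, and appending `+-ascii` at the end of the query is an assumption, since the message does not say where in the URL the option belongs.

```python
def expand_url(template, entry, extra=""):
    """Fill the %s placeholder of a 'method: url' template with an entry name.

    'extra' is appended verbatim. Where SRS expects options such as
    '+-ascii' is an assumption here, not taken from the EMBOSS source.
    """
    if template.count("%s") != 1:
        raise ValueError("template must contain exactly one %s placeholder")
    return template.replace("%s", entry) + extra


# URL template from the swall definition in the original question
SWALL_ID = "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz?-e+[SWall-id:%s]"

# URL as configured in the question
print(expand_url(SWALL_ID, "amir_pseae"))
# Same query with the suggested +-ascii option added (position assumed)
print(expand_url(SWALL_ID, "amir_pseae", extra="+-ascii"))
```

Printing the expanded URL and fetching it by hand is one way to see whether the server answers with a plain-text entry or with an HTML view, which is the difference the `+-ascii` advice is about.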
For example:

DB swall [ type: P method: srswww format: embl
   url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz"
   comment: "EBI Swissprot and Trembl" ]

DB embl [ type: N method: srswww format: embl
   url: "http://srs.ebi.ac.uk/srs6bin/cgi-bin/wgetz"
   dbalias: emblrelease
   comment: "EBI EMBL" ]

regards,

Peter Rice

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From sylvain.foisy at bioneq.qc.ca Thu Aug 8 13:33:59 2002
From: sylvain.foisy at bioneq.qc.ca (Sylvain Foisy)
Date: Thu, 8 Aug 2002 09:33:59 -0400
Subject: EMBOSS 2.5.0 on Mac OS X - Problems with Java config
Message-ID: <7EC90FCA-AAD3-11D6-B428-0003936297DA@bioneq.qc.ca>

Hi,

Just downloaded 2.5.0 and tried to configure/compile on my Mac with Java
support for Jemboss. I can't get configure to use the Java parameters
that are written in the Jemboss Server set-up instructions. I get the
following:

Java directory /System/Library/Frameworks/JavaVM.frameworks/Versions/1.3.1/Home does not exist

Well, it does! Can anybody solve this one?

Thanks in advance

Sylvain

++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Sylvain Foisy, Ph. D.
Directeur-Operations / Project Manager
BioNEQ - Le Reseau quebecois de bioinformatique
Genome-Quebec
Tel.: (514) 878-9911
E-mail: sylvain.foisy at bioneq.qc.ca
++++++++++++++++++++++++++++++++++++++++++++++++++++++++

From richard at seqbio.com Thu Aug 8 20:18:47 2002
From: richard at seqbio.com (Richard Cote)
Date: Thu, 08 Aug 2002 16:18:47 -0400
Subject: Discouraging jemboss perfornance
Message-ID: <3D52D227.1020800@seqbio.com>

Hello all.

We have installed emboss/jemboss locally. Jemboss is running on a linux
box, with a standard installation (apache/tomcat/soap). We have noticed
a *huge* performance gap between running some applications (eg
supermatcher, megamerger) through a shell command-line vs. using jemboss
(jemboss being significantly slower).
How could this situation be improved?

Thank you for any help you can provide,

R. Cote

--
==========================================================
Richard Cote, M.Sc.              richard at seqbio.com
Senior Bioinformatician          http://www.seqbio.com
Sequence Bioinformatics          1-877-SEQUENCE
1410 Stanley St. Suite 704       (tel) 514-842-5356
Montreal, QC H3A 1P8             (fax) 514-842-7230
==========================================================
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 3359 bytes
Desc: S/MIME Cryptographic Signature
URL:

From rglcote at yahoo.ca Thu Aug 8 20:22:56 2002
From: rglcote at yahoo.ca (richard cote)
Date: Thu, 8 Aug 2002 16:22:56 -0400 (EDT)
Subject: Jemboss performance
In-Reply-To: <200208080843.JAA29945@bromine.hgmp.mrc.ac.uk>
Message-ID: <20020808202256.48923.qmail@web21304.mail.yahoo.com>

Hello all.

We have installed emboss/jemboss locally. Jemboss is running on a linux box, with a standard installation (apache/tomcat/soap). We have noticed a *huge* performance gap between running some applications (eg supermatcher, megamerger) through a shell command-line vs. using jemboss (jemboss being significantly slower).

How could this situation be improved?

Thank you for any help you can provide,

R. Cote

From uma at avesthagen.com Fri Aug 16 06:37:45 2002
From: uma at avesthagen.com (Uma Maheswari)
Date: Fri, 16 Aug 2002 12:07:45 +0530 (IST)
Subject: emowse!
Message-ID: <39285.192.168.1.5.1029479865.squirrel@mail.avesthagen.com>

hai!

I am using the emowse program in EMBOSS to search a protein database for matches with mass spectrometry data. When the database is just a few proteins (100 seq) in fasta format it works fine, but when I do the same over the nr database I get the error message "An error has been found: Sequence is not a protein".

I downloaded the nr database in fasta format (all the seq. in a single file), then used csplit to split it into individual files with a common prefix, and ran the prog. as follows.

Input sequence(s): pro*
Input file: b8
Whole sequence molwt [0]:
Output file [baa92411.emowse]:
An error has been found: Sequence is not a protein

(note: 'pro' is the common prefix for all the protein database sequences)

What can I do to avoid this error? Is there any other way to create the database for the same?

Thanks in Adv.

Uma Maheswari

--------------------------------------------
Avestha Gengraine Technologies Pvt. Ltd.
Discoverer 9th Floor, Unit 3,
International Tech Park,
Bangalore-560 066, India.
Tel:91-80-8411665/8412308; Fax:91-80-8418780
http://www.avesthagen.com

From uma at avesthagen.com Fri Aug 16 09:00:35 2002
From: uma at avesthagen.com (Uma Maheswari)
Date: Fri, 16 Aug 2002 14:30:35 +0530 (IST)
Subject: Re: Antwort: emowse!
In-Reply-To:
References:
Message-ID: <46025.192.168.1.5.1029488435.squirrel@mail.avesthagen.com>

hai David!

I think you are right!
I just downloaded Swiss-Prot and used that as the database for emowse ... and it works! :)

Uma Maheswari.

>
> Hi,
>
> I sometimes had problems with other emboss programs (database indexers)
> and the nr database.
> When the database is made non-redundant, an entry that occurs more than
> once in the original databases gets all the headers of the original
> sequences fused into one header line.
> The original fasta headers are then joined with a Ctrl-A.
> This binary character within the fasta header line may confuse emowse.
>
> It's just an idea. I may be completely wrong :-).
>
> David.

--
"Experience is not what happens to a man; it is what a man does with what happens to him."
- Aldous Huxley
--------------------------------------------
Avestha Gengraine Technologies Pvt. Ltd.
Discoverer 9th Floor, Unit 3,
International Tech Park,
Bangalore-560 066, India.
Tel:91-80-8411665/8412308; Fax:91-80-8418780
http://www.avesthagen.com

From jaen at novonordisk.com Wed Aug 21 07:34:30 2002
From: jaen at novonordisk.com (JAEN (Jacob Engelbrecht))
Date: Wed, 21 Aug 2002 09:34:30 +0200
Subject: compseq: is U an amino acid
Message-ID:

I have been using compseq for protein sequences and wondered why 'U' is reported as an amino acid. I looked in the code (nucleus/embnmer.c) and found it was specifically accounted for, whereas 'X', which in many databases stands for unknown, is not specifically accounted for.

Would it not make sense to have options which made specific symbols part of the alphabet or left them out:
-leaveout XU or -include BZXU

Jacob Engelbrecht, Phd
Insulin Research
Novo Nordisk
6A1.038 Novo Alle
DK-2880 Bagsvaerd
Denmark
tel: +45 4442 4403
mail: jaen at novonordisk.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gwilliam at hgmp.mrc.ac.uk Wed Aug 21 08:18:06 2002
From: gwilliam at hgmp.mrc.ac.uk (Gary Williams, Tel 01223 494522)
Date: Wed, 21 Aug 2002 09:18:06 +0100
Subject: compseq: is U an amino acid
References:
Message-ID: <3D634CBE.1A758F63@hgmp.mrc.ac.uk>

U codes for the amino acid selenocysteine.

See the IUPAC documentation for one-letter amino-acids:
http://www.chem.qmul.ac.uk/iupac/AminoAcid/A2021.html
and
http://www.chem.qmul.ac.uk/iubmb/newsletter/1999/item3.html

regards,
Gary

> "JAEN (Jacob Engelbrecht)" wrote:
>
> I have been using compseq for protein sequences and wondered why 'U'
> is reported as an amino acid.
> I looked in the code (nucleus/embnmer.c) and found it was specifically
> accounted for, whereas 'X', which in many databases stands for unknown,
> is not specifically accounted for.
>
> Would it not make sense to have options which made specific symbols
> part of the alphabet or left them out:
> -leaveout XU or -include BZXU
>
> Jacob Engelbrecht, Phd
> Insulin Research
> Novo Nordisk
> 6A1.038 Novo Alle
> DK-2880 Bagsvaerd
> Denmark
> tel: +45 4442 4403
> mail: jaen at novonordisk.com

--
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK

From haruna at sgi.com Thu Aug 22 13:40:41 2002
From: haruna at sgi.com (Haruna Cofer)
Date: Thu, 22 Aug 2002 09:40:41 -0400
Subject: EMBOSS 2.5.0 available
References: <200208080843.JAA29945@bromine.hgmp.mrc.ac.uk>
Message-ID: <3D64E9D9.EBFB36FF@sgi.com>

Hello!

This is just a quick note to let SGI/IRIX users know that my porting notes for EMBOSS and Jemboss have been updated for the latest release:

http://www.sgi.com/industries/sciences/chembio/resources/emboss/

Thanks!
-- Haruna :)

--
Haruna N. Cofer
Silicon Graphics Inc.
ChemPharm Applications

From gkaiser at facstaff.wisc.edu Thu Aug 22 22:25:31 2002
From: gkaiser at facstaff.wisc.edu (Gebhard Kaiser)
Date: Thu, 22 Aug 2002 17:25:31 -0500
Subject: No subject
Message-ID: <200208221725.31504.gkaiser@facstaff.wisc.edu>

Hello

1. I have got problems configuring my databases; maybe someone can help me with it.

2. I am very new to bioinformatics. Is there any kind of "how-to" that explains essential things like file formats etc.? I tried to index some gcg files as well, but dbigcg asked for some *.ref files. What are these? What is UFO? Is a coordinate file the same as *.pdb?

Thank you

Gebhard

------------------------------------------------------------------------------------------
Here is what you may need for 1.:

### this is the warning:
schleppi:/ # infoseq
Displays some simple information about sequences
Input sequence(s): askn:*
Warning: Cannot open division file '' for database 'askn'
Error: Unable to read sequence 'askn:*'

### the part of my emboss.default:
DB askn [
   type: N
   method: emblcd
   format: fasta
   dir: $emboss_db_dir/ask
   file: askn.fasta
   comment: "ASKxx_atxgxxxxx-sp/usp"
]

### ls of $emboss_db_dir/askn: (dbifasta did not give me any warnings)
. ..
acnum.hit acnum.trg askn.fasta division.lkp entrynam.idx ### askn.fasta it self: >ASK01_at1g75950-sp ATGTCTGCGAAGAAGATTGTGTTGAAGAGTTCCGATGGTGAATCTTTCGAGGTTGAGGAGGCGGTGGCTCTCGAGTCACAAACCATAGCGCATATGGTTGAAGACGACTGCGTCGACAACGGAGTCCCTCTTCCTAACGTCACGAGCAAGATCCTCGCCAAGGTGATCGAGTATTGCAAGAGGCACGTCGAGGCTGCTGCCTCTAAGGCCGAGGCCGTCGAGGGTGCTGCTACCTCCGATGACGATCTTAAGGCCTGGGACGCTGATTTTATGAAGATCGATCAAGCTACTCTCTTTGAACTCATTCTGGCTGCTAATTACCTGAATATCAAGAACTTGCTTGATCTAACATGTCAGACAGTTGCGGATATGATCAAAGGAAAGACTCCAGAAGAGATCCGCACAACGTTCAACATTAAGAACGACTTCACACCAGAGGAAGAGGAAGAGGTTCGCAGAGAGAACCAATGGGCTTTTGAATGA >ASK02_at5g42190-sp ATGTCGACGGTGAGAAAAATCACTCTTAAGAGTTCGGATGGCGAAAACTTCGAAATTGACGAAGCGGTGGCGCTAGAGTCACAAACCATCAAACATATGATTGAAGATGACTGTACCGATAATGGTATCCCTCTCCCTAATGTCACAAGCAAGATCCTTTCGAAGGTGATTGAGTACTGTAAGAGACATGTCGAAGCTGCTGAGAAATCCGAAACCACGGCCGATGCTGCTGCTGCTACTACTACCACCACCGTCGCGTCGGGTTCTAGTGATGAAGATCTCAAGACTTGGGATTCTGAGTTTATCAAAGTTGATCAGGGCACTCTCTTCGATCTTATCCTGGCTGCTAACTACTTGAATATCAAGGGACTGTTGGACTTGACTTGCCAGACAGTGGCTGATATGATTAAAGGAAAAACCCCAGAAGAAATCCGTAAGACGTTCAATATCAAGAACGACTTCACGCCAGAGGAAGAAGAAGAGGTTCGCCGTGAGAATCAGTGGGCGTTTGAATGA >ASK03_at3g25700-sp ATGGCAGAAACGAAGAAGATGATCATCCTCAAGAGCTCCGACGGTGAATCCTTCGAGGTCGAGGAAGCCGTCGCGGTCGAGTCCCAGACGATTAAGCACATGATCGAGGACGACTGCGTCGATAACGGAATTCCACTTCCCAATGTCACCGGAGCCATCCTCGCGAAGGTTATCGAGTACTGCAAGAAACACGTTGAAGCCGCTGCTGAGGCTGGTGGAGACAAGGATTTCTATGGTTCCACCGAGAACCACGAGCTCAAGACTTGGGACAACGATTTCGTCAAAGTTGATCATCCTACTCTCTTCGATCTCCTTCGGGCTGCCAACTATTTGAACATCAGTGGACTTCTTGACCTTACGTGCAAGGCCGTGGCTGATCAGATGAGAGGCAAAACTCCAGCGCAGATGCGTGAACACTTCAACATCAAGAACGACTACACACCTGAGGAAGAGGCCGAGGTTCGCAATGAGAACAGGTGGGCGTTCGAGTGA >ASK04_at1g20140-sp 
ATGGCAGAAACGAAGAAGATGATCATCCTTAAGAGCTCCGACGGTGAATCCTTCGAGATCGAGGAAGCCGTCGCTGTTAAGTCCCAGACGATTAAGCACATGATTGAGGACGACTGTGCCGATAACGGAATTCCACTTCCCAATGTCACCGGAGCCATCCTCGCCAAGGTTATTGAGTATTGCAAGAAGCACGTTGAAGCCGCTGCTGAAGCTGGTGGAGACAAGGATTTCTATGGTTCCGCTGAGAACGACGAGCTTAAGAATTGGGACAGCGAATTCGTCAAAGTCGATCAGCCTACTCTCTTCGATCTCATCTTGGCTGCGAACTATTTGAACATCGGTGGACTTCTTGACCTTACGTGCAAGGCCGTGGCTGATCAGATGAGAGGCAAAACTCCAGAGCAGATGCGTGCACACTTCAACATCAAGAACGATTACACACCTGAGGAAGAGGCGGAGGTTCGCAATGAGAACAAGTGGGCGTTCGAGTGA >ASK05_at3g60020-sp ATGTCGACGAAGATCATGTTGAAGAGCTCCGATGGTAAATCGTTCGAGATCGACGAAGACGTGGCACGCAAATCAATCGCGATAAACCATATGGTTGAGGACGGCTGCGCCACTGATGTAATACCGCTTCGAAACGTCACAAGCAAGATTCTCAAGATTGTGATCGATTATTGCGAGAAGCACGTCAAGAGCAAAGAAGAAGAAGATCTCAAGGAGTGGGACGCTGATTTCATGAAGACGATCGAAACAACCATTCTCTTTGATGTTATGATGGCTGCGAATTATCTCAATATTCAAAGCCTTCTTGATCTCACATGTAAAACTGTCTCGGATTTGCTCCAGGCTGATTTGCTCTCAGGGAAAACTCCAGATGAGATTCGCGCGCACTTCAACATCGAGAACGATCTAACAGCAGAGGAAGTAGCTAAGATTCGTGAGGAGAATCAATGGGCTTTTCAATGA >ASK06_at3g53060-sp ATGGCAGAAGACGATTGTGCCGATAATGGAATCCCTCTTCCAAACGTGACAAGCAAGATACTCTTATTGGTGATCGAGTATTGCAAGAAGCACGTCGTTGAGAGCAAAGAAGAAGATCTAAAGAAGTGGGACGCTGAATTCATGAAGAAGATGGAACAATCGATTCTCTTTGATGCAAAACTCCAGGCGAGATTCGCTCATACTTCAATATCGAGAACGATTTCACAGCAGAGGGAGAAGCTGAGATCCGCAAGGTGA >ASK07_at3g21840-sp ATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCACGCCAATGCCAGATTATCATAGCTCATATGAGTGAAAATGATTGTACCGATAATGGAATCCCTCTTCCAAACGTGACAGGCAAGATTCTTGCGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCAAGAAGTGGGACAAGGAGTTCATGGAAAAAGATACATCCACGATCTTTGATCTCATCAAGGCTGCGAATTACCTAAACATCAAAAGCCTTTTTGATCTAGCATGCCAAACCGTCGCGGAAATCATCAAAGGCAACACTCCTGAGCAGATTCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGA >ASK08_at3g21830-sp 
ATGTCGACGAAAAAGATCATGTTGAAGAGCTCCGAGGGTAAAACGTTTGAGATTGAAGAAGAGACCGCACGCCAATGCCAGACCATAGCTCATATGATTGAAGCCGAATGTACAGATAACGTAATCCTGGTTTTAAAGATGACAAGCGAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCGAGAAGTGGGACAAGGAGTTCATGGAAAAAGATAAATCCACGATCTTTGCTCTCACCAATGCTGCGAATTTCCTAAACAACAAAAGCCTTCTTCATCTAGCAGGCCAAACCGTCGCGGATATGATCAAAGGCAACACTCCGAAGCAGATGCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGA >ASK09at3g21850-sp ATGTCGACGAAAAAGATCATGTTGAAGAGCTCCGAGGGTAAAACGTTTGAGATTGAAGAAGAGACCGCACGCCAATGCCAGACCATAGCTCATATGATTGAAGCCGAATGTACAGATAACGTAATCCTGGTTTTAAAGATGACAAGCGAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCGAGAAGTGGGACAAGGAGTTCATGGAAAAAGATAAATCCACGATCTTTGCTCTCACCAATGCTGCGAATTTCCTAAACAACAAAAGCCTTCTTCATCTAGCAGGCCAAACCGTCGCGGATATGATCAAAGGCAACACTCCGAAGCAGATGCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGA >ASK10_at3g21860-sp ATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCATGCCAATGCCAGACCATAGCTCATATGAGTGAAGACGATTGTACCGATAATGGAATCCCGCTTCCAGAAGTGACAGGCAAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGAGGATCTCAAGAAGTGGGACAAGGAATTCATGGAAAAATATCAATCCACGATCTTTGATCTCATTATGGCTGCGAATTACCTAAACATCAAAAGCCTTCTTGATCTAGCATGCCAAACCGTCGCGGATATGATCAAAGACAACACTGTGGAGCACACTCGCAAATTCTTCAACATTGAGAATGATTATACACATGAGGAAGAAGAAGCGGTTCGTAGGGAGAATCAATGGGGTTTTGAATGA >ASK11_at4g34210-sp ATGTCTTCGAAGATGATCGTGTTGATGAGCTCCGATGGTCAGTCGTTTGAGGTGGAAGAAGCGGTAGCAATCCAATCGCAGACCATAGCGCATATGGTTGAAGACGATTGCGTTGCTGATGGAATCCCTCTTGCAAACGTGGAAAGCAAGATTCTTGTGAAAGTGATCGAGTACTGCAAGAAACACCACGTCGACGAGGCTAATCCTATCTCTGAAGAGGATCTCAACAACTGGGACGAGAAGTTCATGGATCTCGAACAATCCACCATCTTTGAACTCATCCTTGCTGCGAATTACCTCAACATAAAAAGCTTGCTTGATCTCACATGCCAAACTGTTGCGGACATGATCAAAGGCAAGACTCCAGAGGAGATTCGTTCAACTTTCAACATTGAGAATGATTTTACACCTGAGGAAGAAGAAGCTGTTCGTAAGGAGAATCAATGGGCTTTTGAATGA >ASK12_at4g34470-sp 
ATGTCTTCGAAGATGATCGTGTTGATGAGCTCCGATGGTCAGTCGTTTGAGGTGGAAGAAGCGGTAGCAATCCAATCGCAGACCATAGCGCATATGGTTGAAGACGATTGCGTTGCTGATGGAATCCCTCTTGCAAACGTGGAAAGCAAGATTCTTGTGAAAGTGATCGAGTACTGCAAGAAATACCACGTCGACGAGGCTAATCCTATCTCTGAAGAGGATCTCAACAAGTGGGACGAGAAGTTCATGGATCTCGAACAATCCACCATCTTTGAACTCATCCTTGCTGCGAATTACCTCAACATAAAAAGCTTGTTTGATCTCACATGCCAAACTGTTGCGGACATGATCAAAGGCAAGACTCCAGAGGAGATTCGTTCAACTTTCAACATTGAGAATGATTTTACACCTGAGGAAGAAGAAGCTGTTCGTAAGGAGAATCAATGGGCTTTTGAATGA >ASK13_at3g60010-sp ATGTCGAAGATGGTTATGTTGCTGAGCTCCGATGGTGAATCTTTCCAGGTCGAAGAAGCAGTCGCGGTCCAGTCACAGACGATAGCACATATGATTGAAGACGATTGCGTCGCCAATGGAGTCCCTATCGCAAACGTTACAGGAGTCATCCTCTCGAAGGTGATCGAGTATTGCAAGAAACACGTCGTTTCTGATTCACCAACCGAAGAGAGCAAAGACGAACTCAAGAAGTGGGACGCTGAGTTCATGAAGGCCCTGGAACAGTCGTCGACTCTCTTTGATGTTATGCTGGCTGCGAATTACCTAAACATAAAAGACCTGCTTGACCTTGGTTGCCAAACTGTTGCTGACATGATCACTGGCAAGAAACCAGACGAGATTCGTGCACTTCTTGGCATCGAGAACGATTTTACACCGGAGGAGGAAGAGGAGATTCGTAAGGAGAATCAATGGGCTTTTGAATGA >ASK14_at2g03170-sp ATGTCTTCCAACAAGATTGTTTTGTCTAGCTCCGATGGCGAATCTTTCGAGGTTGAAGAAGCGGTGGCAAGAAAACTGAAAATCGTGGAACACATGATTGAAGACGACTGTGTTGTTACCGAGGTCCCTCTTCAAAACGTCACCGGAAAGATCCTCTCCATTGTTGTCGAGTATTGCAAGAAACACGTCGTTGACGAAGAAAGCGACGAGTTCAAGACTTGGGACGAAGAGTTCATGAAGAAATTTGATCAGCCTACGGTCTTCCAACTCTTGCTCGCTGCTAACTATCTCAATATCAAAGGCCTTCTTGATCTCTCTGCTCAAACCGTTGCAGATCGCATCAAAGATAAGACTCCAGAGGAAATTCGAGAAATCTTCAACATCGAGAACGATTTCACACCCGAAGAAGAAGCAGCGGTTCGCAAGGAAAACGCATGGGCTTTTGAATAG >ASK15_at3g25650-sp ATGTCTTCTAACAAGATTGTGTTGACTAGTTCCGATGGCGAGTCTTTCCAAGTTGAGGAAGTGGTGGCACGAAAACTGCAGATCGTAAAGCACCTGCTCGAAGACGACTGTGTTATTAACGAAATCCCTCTTCAAAACGTTACAGGAAATATTCTCTCCATCGTTCTCGAGTATTGCAAGAAACACGTCGACGATGTGGTCGATGATGATGCATCTGAGGAGCCGAAGAAGAAGAAGCCCGATGATGAGGCGAAGCAGAATCTCGATGCTTGGGACGCAGAGTTCATGAAAAATATTGATATGGAAACAATCTTCAAGCTCATTCTCGCTGCTAACTATCTCAACGTCGAAGGTCTTCTTGGTCTCACTTGCCAGACTGTTGCAGATTACATCAAAGATAAGACGCCAGAGGAAGTTCGAGAACTCTTTAATATCGAGAATGATTTCACACATGAAGAAGAAGAAGAAGCGATTCGCAAGGAGAACGCTTGGGCTTTTGAGGCTGACACAAAACACGAAGATCCAAAGCCCTAG >ASK16_at2g03190-sp 
ATGTCTTCGAACAAGATTGTGTTGACTAGCTCGGATGATGAATCGTTCGAGGTTGAGGAAGCGGTGGCTCGTAAATTGAAGGTCATAGCACACATGATCGATGACGACTGCGCCGATAAAGCAATCCCGCTTGAAAACGTCACCGGAAATATCCTCGCTTTGGTTATCGAGTATTGCAAGAAACACGTACTTGATGATGTTGATGATAGTGATGATTCTACTGAAGCAACAAGCGAAAATGTAAACGAGGAAGCCAAGAACGAGCTCAGGACTTGGGACGCAGAGTTCATGAAAGAATTTGATATGGAAACAGTCATGAAACTCATTCTCGCTGTTAATTATCTCAACGTCCAAGATCTTCTTGGTCTCACTTGCCAGACCGTTGCAGATCACATGAAAGATATGTCGCCAGAGGAAGTTCGAGAACTCTTTAACATTGAGAATGATTACACACCTGAAGAAGAAGACGCGATTCGTAAGGAAAACGCTTGGGCTTTTGAGGATCTAAAGTAA >ASK17_at2g20160-sp ATGTCTTCGAAGAAGATTGTGTTGACTAGCTCCGATGATGAATGTTTTGAGATTGACGAAGCGGTGGCTCGTAAGATGCAGATGGTAGCGCACATGATCGATGACGATTGCGCCGATAAAGCAATCCGGCTTCAAAACGTCACTGGAAAGATCCTCGCTATCATTATCGAGTATTGCAAGAAACACGTTGATGATGTTGAAGCCAAGAATGAGTTCGTGACTTGGGACGCAGAGTTCGTGAAAAACATTGATATGGATACACTCTTCAAACTCCTTGACGCTGCTGACTATCTCATCGTCATAGGTCTCAAGAATCTCATTGCCCAGGCCATTGCAGATTACACTGCAGATAAGACGGTAAATGAGATTCGAGAACTCTTTAACATCGAGAACGATTACACACCTGAGGAAGAAGAAGAGCTTCGCAAGAAGAACGAATGGGCTTTCAATTAA >ASK18_at1g10230-sp ATGTCTTCTAACAAGATTTTGTTGACGAGTTCCGATGGCGAGTCTTTCGAGATCGACGAAGCGGTGGCGCGTAAGTTTCTGATCATAGTGCACATGATGGAGGATAACTGCGCCGGTGAAGCAATTCCGCTTGAAAATGTCACCGGGGATATCCTCTCCAAGATAATCGAGTACGCGAAGATGCACGTCAATGAACCTAGTGAAGAAGACGAAGACGAGGAGGCGAAGAAGAATCTAGACTCGTGGGACGCTAAGTTCATGGAAAAGCTAGATCTGGAGACCATCTTCAAAATCATTCTCGCTGCCAACTACCTAAACTTCGAAGGACTTCTCGGTTTCGCTAGCCAGACGGTTGCTGATTACATCAAGGACAAAACACCAGAGGAAGTACGAGAGATTTTCAACATCGAGAACGATTTCACGCCTGAAGAAGAGGAAGAGATTCGCAAGGAGAATGCTTGGACTTTTAATGAGTAA >ASK19_at2g03160-sp 
ATGTCTTCGAAAAAGATTGTGTTGACAAGCTCCGATGGTGAATCTTTCAAGGTTGAAGAAGTGGTGGCAAGAAAACTGCAGATCGTAGGACACATTATCGAAGACGACTGTGCTACAAACAAAATCCCTATTCCAAACGTTACCGGAGAGATTCTCGCCAAGGTTATCGAGTACTGCAAGAAACACGTTGAAGACGATGATGACGTGGTGGAGACGCATGAATCATCGACGAAAGGAGATAAAACAGTTGAGGAGGCGAAGAAGAAGCCTGATGATGTGGCCGTACCTGAATCAACTGAAGGAGATGATGAAGCTGAGGATAAGAAGGAGAAGCTTAATGAGTGGGATGCAAAGTTCATGAAGGATTTCGATATTAAGACGATCTTCGACATTATTCTGGCTGCTAACTATCTCAACGTCCAAGGTCTTTTTGATCTCTGTAGCAAGACCATTGCAGATTACATAAAAGATATGACGCCAGAGGAAGTTCGAGAACTCTTTAACATCGAGAATGATTTCACACCTGAAGAAGAAGAAGCAATTCGCAATGAAAACGCTTGGACTTTTGAGCAAGATGGAAAACAACAAGTTCCAAAACCCTAG >ASK01_at1g75950-usp aaataaaaataaatgttcaaaaaacatgatcttaaagctgacaaagctgatttgattgactaatacttatcctacggtgatttttggttcttttactttttttgacaattatggagctggatggaaaaaaaatatatataaaatcatatattattaataatgagataaatacaacgaattaaacggatcaaagttaatatttccaaaagaaaaatagaacagagtcctaatttcattaatttcaactattgaaaacaaaatttaaaatcaataggattgatttctatttttcttttagaaaaacaaaatttgaaacaatttcctaatttccctaaacttgcgactttttaacaatcgaccgacttatcaaaattagggattgttttatatataaagagagacgcatctctttatttcattcatcgcttctccaaaattttcttcaaagaacaaatctcccaaatctaaaatctttctcttctctcttcgtttccataaccATGTCTGCGAAGAAGATTGTGTTGAAGAGTTCCGATGGTGAATCTTTCGAGGTTGAGGAGGCGGTGGCTCTCGAGTCACAAACCATAGCGCATATGGTTGAAGACGACTGCGTCGACAACGGAGTCCCTCTTCCTAACGTCACGAGCAAGATCCTCGCCAAGGTGATCGAGTATTGCAAGAGGCACGTCGAGGCTGCTGCCTCTAAGGCCGAGGCCGTCGAGGGTGCTGCTACCTCCGATGACGATCTTAAGGCCTGGGACGCTGATTTTATGAAGATCGATCAAGCTACTCTCTTTGAACTCATTCTGgtatgtttcttctctcgatctgatttgatttttccatcgaattttgaattttgggattctagggttttcgatttgggaaaattagggtttcgaaatttaggtgtttgtttcagaaattgaatctgcttgagattgatattgttagggttcttatggaaccaatcattaattgaatctatcg atttggat >ASK02_at5g42190-usp 
aaaaaaaaataagaaaaataaaataaaaaataaataagttttgtcgtaacggtggacttggtttctagaatgtggatgattttaatacaacttaattagcaaaacaactgccgcaattgattatgattcttaactctttatttgcaaagatgttcaaagaaaaatattaccggaaatcaaatcacatgaatccaattaaattatacacaccatcctacaaatagaggagtttagtatctgttcaactactgattaatcaaaagttgatgaaaagagttgaattaattttcaggtgttttcctttgaaagaaaaaaacagagaaaatgttttaaaatgaaaatttataggtaaaaaaagtcaattgggaatcgttagatctcactggttcaatgtgtgagccgggctttaaaaacattttgatttaacccatacacacatctctctgttccacgattctcttcctcagcctccacgtcgtctctaaactcagcaaaaaccaATGTCGACGGTGAGAAAAATCACTCTTAAGAGTTCGGATGGCGAAAACTTCGAAATTGACGAAGCGGTGGCGCTAGAGTCACAAACCATCAAACATATGATTGAAGATGACTGTACCGATAATGGTATCCCTCTCCCTAATGTCACAAGCAAGATCCTTTCGAAGGTGATTGAGTACTGTAAGAGACATGTCGAAGCTGCTGAGAAATCCGAAACCACGGCCGATGCTGCTGCTGCTACTACTACCACCACCGTCGCGTCGGGTTCTAGTGATGAAGATCTCAAGACTTGGGATTCTGAGTTTATCAAAGTTGATCAGGGCACTCTCTTCGATCTTATCCTGgtttgtcaaacttatttttaattgctttggttttcaaagtttgcgatttcatttctagggtttgagatcttgatttctgtttgagatctaattttagggttcaggttttgtttagattgccaatttcacagtttaaactatgatcatg tctgattg >ASK03_at3g25700-usp 
tgttggttaccaatttttacattttcattgatattcactacaaaatacacaaataataattaattcataatctatgtgaacgtggaggtttacttttattaaaacataagaccctagtcaatgattttttcacacgtagtagatgattaactgtattttctgaaatcagatacgggtaaatctggaaagtaaatattattactaaggttgagcttttggaaaagtaaaaatatttctctttaaaagaaaaaaataataaagattttgttacttaaaattccaatatttgtttcccttttattgttttcctattattaaaaggattagattaacataaaagcaatcaaccgactttaattaccaagtaagaaattgtttttacatagatctataaatagggcaccaacttcccaaaccttgagaccatcacacaattcacaatcaatcgcagagccgattctcttcaaaacttgtctagtcctttgtccttgttgcaaacgATGGCAGAAACGAAGAAGATGATCATCCTCAAGAGCTCCGACGGTGAATCCTTCGAGGTCGAGGAAGCCGTCGCGGTCGAGTCCCAGACGATTAAGCACATGATCGAGGACGACTGCGTCGATAACGGAATTCCACTTCCCAATGTCACCGGAGCCATCCTCGCGAAGGTTATCGAGTACTGCAAGAAACACGTTGAAGCCGCTGCTGAGGCTGGTGGAGACAAGGATTTCTATGGTTCCACCGAGAACCACGAGCTCAAGACTTGGGACAACGATTTCGTCAAAGTTGATCATCCTACTCTCTTCGATCTCCTTCGGgttagtaatgtctttttctttgttttttggttttatgtttttagaattagggttttttatattttttccatgactatgttagggttttatttatattattgaatgttgtgttttgatttggagactaatcgtcttggtttataaagGCTGCCAACTATTTGAACATCAGTGG ACTTCTTG >ASK04_at1g20140-usp 
tattcactacaaaatacccaaataataattcataatttcacatagatttttacatacaaacgtggaggttttctttgattaaaacataaaaaccctagtcaatgatttttgcacacgtagataaactgtattttctgaaatcagatgtaaatctgaaaaagtaaatattattaacggaatacagctaaggtgaagtttgtggaaaagtaaaaatatttactttatttaaaacaacataaattttccattttttaaaaacttaaatttccaatatttgttttactttttactgttctcctattagtaaaaggagtagattaacttaaaagcaataagccggcaaaaaaaaaaaaacttaaaagcaataaaccgacttgaataccaattaagaaattggctataaataaggctccaacttcccaaaccttgacaccatcacaacaatcaatcgcagcgccgattctccttcaaaaacttttcctaagtcgtcactttttacgATGGCAGAAACGAAGAAGATGATCATCCTTAAGAGCTCCGACGGTGAATCCTTCGAGATCGAGGAAGCCGTCGCTGTTAAGTCCCAGACGATTAAGCACATGATTGAGGACGACTGTGCCGATAACGGAATTCCACTTCCCAATGTCACCGGAGCCATCCTCGCCAAGGTTATTGAGTATTGCAAGAAGCACGTTGAAGCCGCTGCTGAAGCTGGTGGAGACAAGGATTTCTATGGTTCCGCTGAGAACGACGAGCTTAAGAATTGGGACAGCGAATTCGTCAAAGTCGATCAGCCTACTCTCTTCGATCTCATCTTGgttagtaaagtcgctttcattgtttaggtttgatgttgttagaattagggtttttaatttggggatttaggtttgattttgcaggattctgttagggttttagtgtgttgaatgtcgttttgatttgactaatcgttttggttctttggtttgtacagGCTGCGAACTATTT GAACATCG >ASK05_at3g60020-usp 
gtgttggtgtgaaatctacgagtgtaattttatgtaacgtttaattatttttactagaatatgtatcattcagctttacaagattatatgtaatgtagctttatctgatgttaaacccccgaatttgtataactacgtttttgtgtgtttgttacattttgtactttgtctttgatggcaaatcttgaagtgagtgagaataacaacaaatcacataaaacaccaaccaaatcctttttttcttttgacatttaatccatctcgatcaagttttgggccgaatctacgaaactgggccaattgtaaaattcgacccggttcagcttggtttactccactgaaaatgttgatcccaccgactctgacagttggagcttataaatacacagacgcatagattcataagttcgttacatcatttttcttcaaatcgctctctaaattcttcttcaatcttgtttcatcaaccttgctttccagagaaaaatcgctccataacaATGTCGACGAAGATCATGTTGAAGAGCTCCGATGGTAAATCGTTCGAGATCGACGAAGACGTGGCACGCAAATCAATCGCGATAAACCATATGGTTGAGGACGGCTGCGCCACTGATGTAATACCGCTTCGAAACGTCACAAGCAAGATTCTCAAGATTGTGATCGATTATTGCGAGAAGCACGTCAAGAGCAAAGAAGAAGAAGATCTCAAGGAGTGGGACGCTGATTTCATGAAGACGATCGAAACAACCATTCTCTTTGATGTTATGATGGCTGCGAATTATCTCAATATTCAAAGCCTTCTTGATCTCACATGTAAAACTGTCTCGGATTTGCTCCAGGCTGATTTGCTCTCAGGGAAAACTCCAGATGAGATTCGCGCGCACTTCAACATCGAGAACGATCTAACAGCAGAGGAAGTAGCTAAGATTCGTGAGGAGAATCAATGGGCTTTTCAATGAgagagcggatcatcaaagttgttgatgc aaatctac >ASK06_at3g53060-usp 
gtttcctactttacacgtttgttacattttttctctttgttgcaaaccccaaatgtgtatgtgccatacaatatcaatcatctttgccatattttatcgaactgtttataggttactcaatatttttctcctcaaaaaatgtgaagactgacacgtaccaaatcttttaagtgagaatcacaacaaatcacataaaacaccaaccaattccattctttttcctcttttgacacactttaatccaaatctgatcaagttttggggccacattgcaaaattggggcccgttgtaagttttggtcccaatctgaaaatgcatatcccaccgactctgacaattagggcttataaataaacgagcacataagttcataacttggttacttcattattcttccatctttttcatcaaattgctaagagagtaaaatcgccccataacgaagtaaccatgtcgaagaagataatcgtgttgacaagctccgatgatgataaagggtATGGCAGAAGACGATTGTGCCGATAATGGAATCCCTCTTCCAAACGTGACAAGCAAGATACTCTTATTGGTGATCGAGTATTGCAAGAAGCACGTCGTTGAGAGCAAAGAAGAAGATCTAAAGAAGTGGGACGCTGAATTCATGAAGAAGATGGAACAATCGATTCTCTTTGATgttatgatggctgcgaattatctcaatatccaaagccttcttgatctcacattttcaaactgtcgctgatttgctctcagGCAAAACTCCAGGCGAGATTCGCTCATACTTCAATATCGAGAACGATTTCACAGCAGAGGGAGAAGCTGAGATCCGCAAGGTGAatcaatgggcttttgaatgaaagtggatcttcaaagttctttcttctttagtgttttcggttttcttgatgcgatgttagatgattacttcgtctatttcttgtttgttctgttgtttcttttcttttggtttcttgatgcataagtaaacc aatgtttg >ASK07_at3g21840-usp 
gtagataaaaaaaaaaaaagaattacatagaaacttaggagtaaacactgtaaaacacgaatttccaaaaaaaaaaaaagaagttgtagttataatttaaaagacacatataaaaaataatttagtggaaattaaaacaacataacatgagtagataacttaatgattgtgtgacttacttgacacgatacatcatacatgtatctatgaaagttagggcttacatactaacttacaagtaaacacatgaaaaagttgtggttttaatcaaaaagacacataaaaaaagactttagtggaagaacaacatcaacatgagtagacatacaatacaacttcaagctttcgtgacttacttgacacgatacatgcatatatgaaagttattagggcttataaatagacaaacgcataggttcataacttcattaccttattactcaatcatttactcaattcttcaatcttccagagaaaaaatctcccccactgaaaaaataATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCACGCCAATGCCAGATTATCATAGCTCATATGAGTGAAAATGATTGTACCGATAATGGAATCCCTCTTCCAAACGTGACAGGCAAGATTCTTGCGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCAAGAAGTGGGACAAGGAGTTCATGGAAAAAGATACATCCACGATCTTTGATCTCATCAAGGCTGCGAATTACCTAAACATCAAAAGCCTTTTTGATCTAGCATGCCAAACCGTCGCGGAAATCATCAAAGGCAACACTCCTGAGCAGATTCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGAgcggattatcaaaacctaaacagctttc tttctttt >ASK08_at3g21830-usp 
aagaagaattatatagaaacttaggagtaaacactgtaaaacacgaattttctaattaaaaaaaaagttgtagttataatctaaaagacacatagaaaaatactttagtggaaattaaaacaacataacatgagtagataacttaatgattttgtgacttacttgacacgatacatcatacatgtatctatgaaagttagggcttacatactaacttacaagtaaacacataaaaatgttgtggttttaatcaaaaagacacataaaaaaaagacttaatggaagaacaacatcaacatgagtagacatacaacttcatgctttcgtgacttacttgacacgatacatgcatctatgaaagttagggcttataaatagacaagactcataggttcataacttcattaccttattactcaatcattttctcaattcttcaatctttcatcaaaatttcttccagagaaaaaaaacaaatctcccccacaaagaaaacaacaATGTCGACGAAAAAGATCATGTTGAAGAGCTCCGAGGGTAAAACGTTTGAGATTGAAGAAGAGACCGCACGCCAATGCCAGACCATAGCTCATATGATTGAAGCCGAATGTACAGATAACGTAATCCTGGTTTTAAAGATGACAAGCGAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCGAGAAGTGGGACAAGGAGTTCATGGAAAAAGATAAATCCACGATCTTTGCTCTCACCAATGCTGCGAATTTCCTAAACAACAAAAGCCTTCTTCATCTAGCAGGCCAAACCGTCGCGGATATGATCAAAGGCAACACTCCGAAGCAGATGCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGAgcggattatcaaaacctaaacagctttcttt cttttctt >ASK09at3g21850-usp 
gtagataaaaaaaaaaaaagaattacatagaaacttaggagtaaacactgtaaaacacgaatttccaaaaaaaaaaaaagaagttgtagttataatttaaaagacacatataaaaaataatttagtggaaattaaaacaacataacatgagtagataacttaatgattgtgtgacttacttgacacgatacatcatacatgtatctatgaaagttagggcttacatactaacttacaagtaaacacatgaaaaagttgtggttttaatcaaaaagacacataaaaaaagactttagtggaagaacaacatcaacatgagtagacatacaatacaacttcaagctttcgtgacttacttgacacgatacatgcatatatgaaagttattagggcttataaatagacaaacgcataggttcataacttcattaccttattactcaatcatttactcaattcttcaatcttccagagaaaaaatctcccccactgaaaaaataATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCACGCCAATGCCAGATTATCATAGCTCATATGAGTGAAAATGATTGTACCGATAATGGAATCCCTCTTCCAAACGTGACAGGCAAGATTCTTGCGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGATGATCTCAAGAAGTGGGACAAGGAGTTCATGGAAAAAGATACATCCACGATCTTTGATCTCATCAAGGCTGCGAATTACCTAAACATCAAAAGCCTTTTTGATCTAGCATGCCAAACCGTCGCGGAAATCATCAAAGGCAACACTCCTGAGCAGATTCGCGAATTCTTCAACATTGAGAATGATTTAACACCTGAGGAAGAAGCAGCGATTCGTAGGGAGAATAAATGGGCTTTTGAATGAgcggattatcaaaacctaaacagctttc tttctttt >ASK10_at3g21860-usp 
tggaacatttgtggtattaagtgaaaacaaagagatctcgacatgccgtgtaataatattaagatcaattaattgcgtaagacgtgatatttatgcatgcatttatagaaactttgaagacgcgttatgtgtataaaacaacatcatagtgcatttttaataaaaaaagtttatttggaatatttgagtaggtggttagtccaatatataaaattgaattacatataatcttacaagtaaacacgataaaaagttgtggttttagctttaaaagactttagtggaaaacaacatcaacatgagtagacataaaacttcaagctttcgtgacttgcttgacacgatacatgcatctatgaaaattattagggcttattaatagacaaacgcataggttcataacttcattaccttattactgaaatccttctctcaattctcaatcattttctcaattcttcaatcttccagagaaaaaatctccccccagcgaaaaaataATGTCGACGAAGAAGATCATATTGAAGAGCTCCGATGGTCACTCATTCGAGGTCGAAGAAGAGGCCGCATGCCAATGCCAGACCATAGCTCATATGAGTGAAGACGATTGTACCGATAATGGAATCCCGCTTCCAGAAGTGACAGGCAAGATTCTTGAGATGGTGATCGAGTACTGCAACAAGCACCACGTCGATGCCGCTAATCCTTGCTCCGACGAGGATCTCAAGAAGTGGGACAAGGAATTCATGGAAAAATATCAATCCACGATCTTTGATCTCATTATGGCTGCGAATTACCTAAACATCAAAAGCCTTCTTGATCTAGCATGCCAAACCGTCGCGGATATGATCAAAGACAACACTGTGGAGCACACTCGCAAATTCTTCAACATTGAGAATGATTATACACATGAGGAAGAAGAAGCGGTTCGTAGGGAGAATCAATGGGGTTTTGAATGAgcgggtgaagcaaacactaagcagctttctt tcttttct >ASK11_at4g34210-usp 
acattaaagtcaaaaacaattattatttttaaagaatattagtgatatattttacctttatgaaacattattaatatttaaacataaaaaattaaaatatcatgagacgggaagtatagcagaacgggtttggattgataggaaatacatcgagacgggtttggattgatataacacaggtaaaatatttattttattattttatcatcactaatttttttaattacaagattaatataaaattatataactttattactaaatataaaattataaaatttaaaattattatatatataaataaaatatttgtccgtagtgtaccacgcgttgaattataatatgttgttgtcaccgactctagctagggcttataaatacaaagacgcatagggtttcacaacttcatcaccttaacacaatttgttcgctctctaaattcttaaatctttcatcaaatttgcttcacagtgaaaaacctcctccacaaggaacacacaATGTCTTCGAAGATGATCGTGTTGATGAGCTCCGATGGTCAGTCGTTTGAGGTGGAAGAAGCGGTAGCAATCCAATCGCAGACCATAGCGCATATGGTTGAAGACGATTGCGTTGCTGATGGAATCCCTCTTGCAAACGTGGAAAGCAAGATTCTTGTGAAAGTGATCGAGTACTGCAAGAAACACCACGTCGACGAGGCTAATCCTATCTCTGAAGAGGATCTCAACAACTGGGACGAGAAGTTCATGGATCTCGAACAATCCACCATCTTTGAACTCATCCTTGCTGCGAATTACCTCAACATAAAAAGCTTGCTTGATCTCACATGCCAAACTGTTGCGGACATGATCAAAGGCAAGACTCCAGAGGAGATTCGTTCAACTTTCAACATTGAGAATGATTTTACACCTGAGGAAGAAGAAGCTGTTCGTAAGGAGAATCAATGGGCTTTTGAATGAgcccatgaatcaaaaccctaactagctttct ttttcttt >ASK12_at4g34470-usp 
gacttcttttggtttcttataaatgaatctgaatattatctttggataaagtttcctatattccaatgtatatctatatccctcttaaattgttacattttacgccgtacctttataattgggcctattgtaaatttaaaccggtttagctggttcactttactgaatttacattttctgaaaataaggatttgagccgaacctacaacaaaaaaatgaaaatttgacccggtttattttggtttactcatctgaaattaccttttatgaaaaggagtcgacaaataatttacttattttcttcggttaaggaaaataactatttccttcttaaaaaaggaatggttgatctcaccgactctagctagggcttataaatacaaagacgcatagggtttcacaacttcatcaccttaacacaatttgttcgctctctaaattcttaaatctttcatcaaatttgcttcacagtgaaaaacctcctccacaaggaacacacaATGTCTTCGAAGATGATCGTGTTGATGAGCTCCGATGGTCAGTCGTTTGAGGTGGAAGAAGCGGTAGCAATCCAATCGCAGACCATAGCGCATATGGTTGAAGACGATTGCGTTGCTGATGGAATCCCTCTTGCAAACGTGGAAAGCAAGATTCTTGTGAAAGTGATCGAGTACTGCAAGAAATACCACGTCGACGAGGCTAATCCTATCTCTGAAGAGGATCTCAACAAGTGGGACGAGAAGTTCATGGATCTCGAACAATCCACCATCTTTGAACTCATCCTTGCTGCGAATTACCTCAACATAAAAAGCTTGTTTGATCTCACATGCCAAACTGTTGCGGACATGATCAAAGGCAAGACTCCAGAGGAGATTCGTTCAACTTTCAACATTGAGAATGATTTTACACCTGAGGAAGAAGAAGCTGTTCGTAAGGAGAATCAATGGGCTTTTGAATGAgcccatgaatcaaaaacctaactagctttct ttttcttt >ASK13_at3g60010-usp 
aacacatgataaatttacgtgaccataattacttttcttcttttcttttgttttgaatgacataaacaaaatttgtagatatttttcttgctttcggagttttttaaaatatggaatatacacatgtaacgttttagaaagtttccatttttttttctttttacctcttttttgttggttcactcgttaaactcttgttacaattgtattgaattagagccgttcattatgacacgtagattcggatctgatttcctttttgtaccgaaattagatcaattgggactagatccatcaaaaacccggtttaggattggtttactcaactgaattttccttttaggttaaaatagtaatttcatccattatgaaaaaaggcaagaattataccgactctcacaattaggtctctctataaatacacaccttttgatatctctccatcatcgaaaaccttgtaaacaaaaatcgccaccacaaaaagaaaaaagaaggaaacgATGTCGAAGATGGTTATGTTGCTGAGCTCCGATGGTGAATCTTTCCAGGTCGAAGAAGCAGTCGCGGTCCAGTCACAGACGATAGCACATATGATTGAAGACGATTGCGTCGCCAATGGAGTCCCTATCGCAAACGTTACAGGAGTCATCCTCTCGAAGGTGATCGAGTATTGCAAGAAACACGTCGTTTCTGATTCACCAACCGAAGAGAGCAAAGACGAACTCAAGAAGTGGGACGCTGAGTTCATGAAGGCCCTGGAACAGTCGTCGACTCTCTTTGATGTTATGCTGGCTGCGAATTACCTAAACATAAAAGACCTGCTTGACCTTGGTTGCCAAACTGTTGCTGACATGATCACTGGCAAGAAACCAGACGAGATTCGTGCACTTCTTGGCATCGAGAACGATTTTACACCGGAGGAGGAAGAGGAGATTCGTAAGGAGAATCAATGGGCTTTTGAATGAttctttagttttctttttcgacgtt agtgtgct >ASK14_at2g03170-usp 
ttagtagttattcaaataaaagtagaaaattaaaacctaataaactaaaagaacgcaactgattacaaaacttaatttatagactcttatcctataattagattaataattaatcaattaaattcaattctaagtaagatggacaaattaattaaataaataaaatgtcatacaaaattttcatagaaaatagccatgtttaggaaaaaaactctttttgtatgttaaataagtttctagaatctcctgataaactcttttactaagacgtcggtttttacttttaagcccaaaagttttaaggctaaacactagagatttggaagatttgcttttcttctcttccagtattgtgaaaaggaaaacacaattgaccgactcttaaaatattttataaatagacgccttcatcgactcctctctattccattcaatatctttgcataaatcataaccaatattattcttttcatcacccaaaatctcaaaaacgaacaaacATGTCTTCCAACAAGATTGTTTTGTCTAGCTCCGATGGCGAATCTTTCGAGGTTGAAGAAGCGGTGGCAAGAAAACTGAAAATCGTGGAACACATGATTGAAGACGACTGTGTTGTTACCGAGGTCCCTCTTCAAAACGTCACCGGAAAGATCCTCTCCATTGTTGTCGAGTATTGCAAGAAACACGTCGTTGACGAAGAAAGCGACGAGTTCAAGACTTGGGACGAAGAGTTCATGAAGAAATTTGATCAGCCTACGGTCTTCCAACTCTTGCTCGCTGCTAACTATCTCAATATCAAAGGCCTTCTTGATCTCTCTGCTCAAACCGTTGCAGATCGCATCAAAGATAAGACTCCAGAGGAAATTCGAGAAATCTTCAACATCGAGAACGATTTCACACCCGAAGAAGAAGCAGCGGTTCGCAAGGAAAACGCATGGGCTTTTGAATAGacaccaaaaccctagtttttggtttcgtttcttattgtcg gttttagg >ASK15_at3g25650-usp 
acattgactcttttggatcaaaatggatggtacatatcttgtgatttataatttttagcttgttacaatcgtcattgcatagaacttatatgagtattatggattgatcttatatttaaacataaaaatatttttatacaaataaatactttataagtttccaagatataaatttgatattttatatagatatataaatataaaattttacaaataaatatctttctcttttgcggatccaataaaaggggttaatcccaacaaactagttaagttagaatttgcgttattttccttatttccttgagaaaaaggaaacaacatgaaacctaattctaattggaaaaggaaaacacaaattgaccgactcttaggatttctatagataaacacattcatcgactcttttatttttcattcaacaatctctcatcaaatttctctgcacaaaaaataaaagcaaaaacttattttttcgtttaacacacaaagcaaacaaaATGTCTTCTAACAAGATTGTGTTGACTAGTTCCGATGGCGAGTCTTTCCAAGTTGAGGAAGTGGTGGCACGAAAACTGCAGATCGTAAAGCACCTGCTCGAAGACGACTGTGTTATTAACGAAATCCCTCTTCAAAACGTTACAGGAAATATTCTCTCCATCGTTCTCGAGTATTGCAAGAAACACGTCGACGATGTGGTCGATGATGATGCATCTGAGGAGCCGAAGAAGAAGAAGCCCGATGATgtggcgggccggttcctgaatcaactgaagaaggagatgatgcatctgagGAGGCGAAGCAGAATCTCGATGCTTGGGACGCAGAGTTCATGAAAAATATTGATATGGAAACAATCTTCAAGCTCATTCTCGCTGCTAACTATCTCAACGTCGAAGGTCTTCTTGGTCTCACTTGCCAGACTGTTGCAGATTACATCAAAGATAAGACGCCAGAGGAAGTTCGAGAACTCTTTAATATCGAGAA TGATTTCA >ASK16_at2g03190-usp 
taaatgttgaaaaagtaagtatcttagaacaaagtacatgtcatcaaattaattatttctaaaacttataaataaaaaaatctcactatattatactatatttaatatttacattgagttgataagagataccaaaaattaaattttgcttttaggtgaaatcctatctataatggttaatatttttaaaatattaaattacaatattagtataacttatataaaatctaatcaatataattttattagcatgtgtgtagtatatgtatgggtcaaatattaaaagaaaatataagagcctaacaattaattatatagtaggattttgttatatttatttttccttactatatttcctaaattaaagatttgacccactctcaactcattataaaaagagagacattatttcatagcttcatcaattttattcaacaatttctctcatttcatttcttccataaaatctctcaaaaatcaacaaaacaacaaaacaaacaATGTCTTCGAACAAGATTGTGTTGACTAGCTCGGATGATGAATCGTTCGAGGTTGAGGAAGCGGTGGCTCGTAAATTGAAGGTCATAGCACACATGATCGATGACGACTGCGCCGATAAAGCAATCCCGCTTGAAAACGTCACCGGAAATATCCTCGCTTTGGTTATCGAGTATTGCAAGAAACACGTACTTGATGATGTTGATGATAGTGATGATTCTACTGAAGCAACAAGCGAAAATGTAAACGAGGAAGCCAAGAACGAGCTCAGGACTTGGGACGCAGAGTTCATGAAAGAATTTGATATGGAAACAGTCATGAAACTCATTCTCGCTGTTAATTATCTCAACGTCCAAGATCTTCTTGGTCTCACTTGCCAGACCGTTGCAGATCACATGAAAGATATGTCGCCAGAGGAAGTTCGAGAACTCTTTAACATTGAGAATGATTACACACCTGAAGAAGAAGACGCGATTCGTAAGGAAAACGCTT GGGCTTTT >ASK17_at2g20160-usp 
atgacctttttttggtgaaaacaattatttatgacattgtcatagagctaattattttaaaaacttataaatagaaaatctcactatattatacccaaaatgaaaatttatgttaggtgaaactgaatccaatatataaatataacaatatatggttaatatttttttgaatattaagatagcatctataaaatctcatgaatattttttattagcatatgtgtagtatatgggtcaaatattaaaagataagataagagcctaattaattgattaggattttgttatatttacttttgctttctcttcccttatttaggaaaaaaagagaggaaaatatattttatatatatttcctcaattatagatttgacctcactctcaactcattataaaaagagagacttgcatagcctgagactcatcaatttcatataacaatttcttccataagatatctcaaaaatcttgctcttcgttaaaccagcaaaacaaacaaaATGTCTTCGAAGAAGATTGTGTTGACTAGCTCCGATGATGAATGTTTTGAGATTGACGAAGCGGTGGCTCGTAAGATGCAGATGGTAGCGCACATGATCGATGACGATTGCGCCGATAAAGCAATCCGGCTTCAAAACGTCACTGGAAAGATCCTCGCTATCATTATCGAGTATTGCAAGAAACACGTTGATGATGTTGAAGCCAAGAATGAGTTCGTGACTTGGGACGCAGAGTTCGTGAAAAACATTGATATGGATACACTCTTCAAACTCCTTGACGCTGCTGACTATCTCATCGTCATAGGTCTCAAGAATCTCATTGCCCAGGCCATTGCAGATTACACTGCAGATAAGACGGTAAATGAGATTCGAGAACTCTTTAACATCGAGAACGATTACACACCTGAGGAAGAAGAAGAGCTTCGCAAGAAGAACGAATGGGCTTTCAATTAAtaaccctaaagttccgttgtctaagtgttgatctcga tgttcttt >ASK18_at1g10230-usp 
taatagtttggcccattattccgaatcatttccttgttaattgagtgtattaataaatgttttggactggtttgtaccaaaaaataaaataatgttaagcttagaagattagaagatatacaataaactttccaaatcggcaacaaaacaaagtatttgatattggtttgatgtgtttcacgaccatagaacaagacggtacattatactatatgaatgttggcaaaagacaagtttttatgtttagttacgtttctctgcaaacgaagatattttttagtttgatcggttttctacaaaccggttccaaatcaatatgatcccattttggttttctcttcagaatgttctagaatcagatgagtgggacttgttgagtacagccgtataggttgtttgggctttatgaaatctttaggcccataattctgatccattatatttccttttctcccctacttgtaagggtttattctggattactcatttgccttataacaatggcttcttcttccgaagagattgtgtccgccggcgaatcatcagagatcgaggaagcggttgcgagtctaaccATGTCTTCTAACAAGATTTTGTTGACGAGTTCCGATGGCGAGTCTTTCGAGATCGACGAAGCGGTGGCGCGTAAGTTTCTGATCATAGTGCACATGATGGAGGATAACTGCGCCGGTGAAGCAATTCCGCTTGAAAATGTCACCGGGGATATCCTCTCCAAGATAATCGAGTACGCGAAGATGCACGTCAATGAACCTAGTGAAGAAGACGAAGACGAGGAGGCGAAGAAGAATCTAGACTCGTGGGACGCTAAGTTCATGGAAAAGCTAGATCTGGAGACCATCTTCAAAATCATTCTCGCTGCCAACTACCTAAACTTCGAAGGACTTCTCGGTTTCGCTAGCCAGACGGTTGCTGATTACATCAAGGACAAAACACCAGAGGAAGTACGAGAGATTTTCAACATCGAGAACG ATTTCACG >ASK19_at2g03160-usp 
attcacgatcgaggaacttgtacaataaccagtaatcttgcgagtttctttttttttccatgaataataataccaaaaaattagatgtttaaggttttgttagattctacataatctaatatgctcttatatcataattaaattaataattaatcccacgtcaaacaaaattagaagattacagaaaatgttataaactttgcatcgaaaaatacaaatcaaggaacaaaaggagtatattagtatattgctaaagaagtttctagaaatatatatatcttccactaaaccctagttaaaataggatttgagtctattttccttatttctttgtgagaaaaggaaacaacatgaaacccaatatctaattgtgaaaaggaaaacacaattgactgaatcttagggttctataaatagactgattcaacaatctctcatcaaaatttctcttcacaaacaattgactcttgtttttcgttcaacacacaaagcaaaaaacaATGTCTTCGAAAAAGATTGTGTTGACAAGCTCCGATGGTGAATCTTTCAAGGTTGAAGAAGTGGTGGCAAGAAAACTGCAGATCGTAGGACACATTATCGAAGACGACTGTGCTACAAACAAAATCCCTATTCCAAACGTTACCGGAGAGATTCTCGCCAAGGTTATCGAGTACTGCAAGAAACACGTTGAAGACGATGATGACGTGGTGGAGACGCATGAATCATCGACGAAAGGAGATAAAACAGTTGAGGAGGCGAAGAAGAAGCCTGATGATGTGGCCGTACCTGAATCAACTGAAGGAGATGATGAAGCTGAGGATAAGAAGGAGAAGCTTAATGAGTGGGATGCAAAGTTCATGAAGGATTTCGATATTAAGACGATCTTCGACATTATTCTGGCTGCTAACTATCTCAACGTCCAAGGTCTTTTTGATCTCTGTAGCAAGACCATTGCAGATTACATAAAAGATATGACGCCAGAGGAAGTTC GAGAACTC ### last but not least showdb [just look for the ###] schleppi:/EMBOSS/data/ask # showdb Displays information on the currently available databases # Name Type ID Qry All Comment # ==== ==== == === === ======= qapblast P OK OK OK BLAST swissnew qapblastall P OK OK OK BLAST swissnew, all fields indexed qapblastsplit P OK OK OK BLAST swissnew split in 5 files qapblastsplitexc P OK OK OK BLAST swissnew split in 5 files, not file 02 qapblastsplitinc P OK OK OK BLAST swissnew split in 5 files, only file 02 qapfasta P OK OK OK FASTA file swissnew entries qapflat P OK OK OK Swissnew flatfiles qapflatall P OK OK OK Swissnew flatfiles, all fields indexed qapflatexc P OK OK OK Swissnew flatfiles, no updated sequence file qapflatinc P OK OK OK Swissnew flatfiles, only updated sequence file qapir P OK OK OK PIR qapirall P OK OK OK PIR qapirinc P OK OK OK PIR tpir P OK OK OK PIR using NBRF access for 4 files tsw P OK OK OK Swissprot native format with EMBL CD-ROM index tswnew P OK OK OK Swissnew as 
3 files in native format with EMBL CD-ROM index
twp              P    OK  OK  OK  EMBL new in native format with EMBL CD-ROM index
[###]askn        N    OK  OK  OK  ASKxx_atxgxxxxx-sp/usp[###]
qanfasta         N    OK  OK  OK  FASTA file EMBL rodents
qanfastaall      N    OK  OK  OK  FASTA file EMBL rodents, all fields indexed
qanflat          N    OK  OK  OK  EMBL flatfiles
qangcg           N    OK  OK  OK  GCG format EMBL
qangcgall        N    OK  OK  OK  GCG format EMBL
qangcgexc        N    OK  OK  OK  GCG format EMBL without prokaryotes
qangcginc        N    OK  OK  OK  GCG format EMBL only prokaryotes
qapirexc         N    OK  OK  OK  PIR
qasrs            N    OK  OK  -   EMBL in local srs installation
qasrsfasta       N    OK  OK  -   EMBL in local srs installation, fasta format
qasrswww         N    OK  -   -   Remote SRS web server
qawfasta         N    OK  OK  OK  FASTA file wormpep entries
tembl            N    OK  OK  OK  EMBL in native format with EMBL CD-ROM index
tgb              N    OK  -   -   Genbank IDs
tgenbank         N    OK  OK  OK  GenBank in native format with EMBL CD-ROM index
schleppi:/EMBOSS/data/ask #

-------------------------------------------------------------------------------------------

From faurie at clermont.in2p3.fr Wed Aug 28 12:13:30 2002
From: faurie at clermont.in2p3.fr (julien Faurie)
Date: Wed, 28 Aug 2002 14:13:30 +0200
Subject: generate a FASTA file with a annotation file (extension .dat)
Message-ID: <3D6CBE6A.5E492550@clermont.in2p3.fr>

Hi everybody,

I have a little question.

I downloaded swissprot release and now, I would like to generate my
fasta file.

In EMBOSS documentation, I found the command "seqret", and I would like
to know whether it is a good idea to use it, or whether there are other
commands to create a fasta file from an annotation file.

I don't want to lose any information: I want to convert all the
sequences in the annotation file to a fasta file.

Thanks in advance for your help.
Julien Faurie

"I apologise for my English"

From peter.rice at uk.lionbioscience.com Wed Aug 28 13:04:04 2002
From: peter.rice at uk.lionbioscience.com (Peter Rice)
Date: Wed, 28 Aug 2002 14:04:04 +0100
Subject: generate a FASTA file with a annotation file (extension .dat)
References: <3D6CBE6A.5E492550@clermont.in2p3.fr>
Message-ID: <3D6CCA44.7020103@uk.lionbioscience.com>

julien Faurie wrote:
> I downloaded swissprot release and now, I would like to generate my
> fasta file.
>
> In EMBOSS documentation, I found the command "seqret"

Good choice. You can pick the fasta format you prefer:

% seqret seq.dat -sformat swissprot swissprot.fasta

... gives you fasta headers like this:

>100K_RAT Q62671 100 KDA PROTEIN (EC 6.3.2.-).

% seqret seq.dat -sformat swiss swissprot.ncbi -osformat ncbi

... gives you fasta headers like this:

>gnl|unk|100K_RAT (Q62671) 100 KDA PROTEIN (EC 6.3.2.-).

% seqret seq.dat -sformat swiss swissprot.blast -osformat ncbi -osdbname sp

... gives you fasta headers like this:

>gnl|sp|100K_RAT (Q62671) 100 KDA PROTEIN (EC 6.3.2.-).

Note in passing ... it seems 100K_RAT is no longer the first entry in
SwissProt, as accession Q62671 is now called KC11_RAT (Casein Kinase I
gamma 1 isoform).

--
------------------------------------------------
Peter Rice, LION Bioscience Ltd, Cambridge, UK
peter.rice at uk.lionbioscience.com +44 1223 224723

From faurie at clermont.in2p3.fr Wed Aug 28 13:12:59 2002
From: faurie at clermont.in2p3.fr (julien Faurie)
Date: Wed, 28 Aug 2002 15:12:59 +0200
Subject: generate a FASTA file with a annotation file (extension .dat)
References: <3D6CBE6A.5E492550@clermont.in2p3.fr> <3D6CCA44.7020103@uk.lionbioscience.com>
Message-ID: <3D6CCC5B.A353FA38@clermont.in2p3.fr>

> > In EMBOSS documentation, I found the command "seqret"
>
> Good choice. You can pick the fasta format you prefer:
>
> % seqret seq.dat -sformat swissprot swissprot.fasta
>
> ... gives you fasta headers like this:
>
> >100K_RAT Q62671 100 KDA PROTEIN (EC 6.3.2.-).

Thank you very much. I prefer the first command for now.

> Note in passing ... it seems 100K_RAT is no longer the first entry in
> SwissProt, as accession Q62671 is now called KC11_RAT (Casein Kinase I
> gamma 1 isoform).

This note is not important for my conversion, is it?

Thanks for your help.

Regards.
Julien Faurie.

From abrown at nimr.mrc.ac.uk Fri Aug 30 13:14:18 2002
From: abrown at nimr.mrc.ac.uk (Alex Brown)
Date: Fri, 30 Aug 2002 14:14:18 +0100
Subject: PROSITE Help Required
Message-ID: <6410D05B-BC1A-11D6-8606-0003938768AC@nimr.mrc.ac.uk>

Hi.

I am trying to install the PROSITE database into my EMBOSS installation.
I obtained the files prosite.dat and prosite.doc by anonymous FTP from
ftp.ebi.ac.uk, and have placed both of them in the /EMBOSS/data/PROSITE
directory. I then ran prosextract, which gave the following error:

EMBOSS An error in prosextract.c at line 83:
Cannot open file data/PROSITE/prosite.dat

What am I doing wrong?

I am running on a Mac PowerBook G4, under Darwin with XDarwin (XFree86
4.2.0).

Many thanks,

Alex Brown.

From haruna at sgi.com Sat Aug 31 04:56:45 2002
From: haruna at sgi.com (Haruna Cofer)
Date: Sat, 31 Aug 2002 00:56:45 -0400
Subject: EMBOSS, EMBASSY, Jemboss on SGI
Message-ID: <3D704C8D.659587C@sgi.com>

Hello!

Just FYI, I have placed my SGI porting notes for all of EMBOSS, EMBASSY,
and Jemboss 2.5.0 on the SGI web site:

http://www.sgi.com/industries/sciences/chembio/resources/emboss/

Thank you, and please do let me know if you have any questions or
suggestions!

-- Haruna :)

--
Haruna N. Cofer
Silicon Graphics Inc.
ChemPharm Applications