From ajb at ebi.ac.uk Thu Jul 15 06:18:06 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Thu, 15 Jul 2010 11:18:06 +0100 (BST)
Subject: [EMBOSS] EMBOSS 6.3.0 released
Message-ID: <58375.86.26.12.63.1279189086.squirrel@webmail.ebi.ac.uk>

EMBOSS 6.3.0 is now available and can be downloaded from our ftp server:

ftp://emboss.open-bio.org/pub/EMBOSS/

The associated (optional) EMBASSY packages are in the same directory.

mEMBOSS, the native Microsoft Windows port, can be downloaded from the
directory:

ftp://emboss.open-bio.org/pub/EMBOSS/windows/

This release provides a platform for further application development.
Some highlights include:

  Network access to BioMart, Ensembl and general SQL databases
  Support for BAM/SAM files
  Parsing and validation for NCBI taxonomy and OBO files
  Scalable graphics options
  Rabin-Karp multi-pattern search algorithm implemented
  EDAM ontology identifiers added

Full details are in the attached ChangeLog.

Installation:

There are some new optional steps in the installation.

For UNIX:
To enable PDF graphics support you will need to have installed the
libhpdf development files (a.k.a. libharu, source available via
libharu.org). To enable MySQL and/or PostgreSQL support their
development files will need to have been installed. For example, under
Linux RPM systems the packages would typically be called libharu-devel,
mysql-devel & postgresql-devel. Such installations need to be performed
prior to the EMBOSS configuration step. The configuration will
automatically include support for the above if the relevant files are
detected.

For Windows (mEMBOSS):
No action is required. PDF and MySQL support DLLs are included by the
installation.

Alan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ChangeLog
Type: application/octet-stream
Size: 11537 bytes
Desc: not available
URL:

From shrish at ccmb.res.in Mon Jul 19 07:32:46 2010
From: shrish at ccmb.res.in (Shrish Tiwari)
Date: Mon, 19 Jul 2010 17:02:46 +0530 (IST)
Subject: [EMBOSS] compiling problems
Message-ID: <444095438.19601279539166895.JavaMail.root@127.0.0.1>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ajb at ebi.ac.uk Mon Jul 19 10:08:11 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Mon, 19 Jul 2010 15:08:11 +0100 (BST)
Subject: [EMBOSS] EMBOSS 6.3.1 available
Message-ID: <42192.86.26.12.63.1279548491.squirrel@webmail.ebi.ac.uk>

EMBOSS 6.3.1 is now available and can be downloaded from our ftp server:

ftp://emboss.open-bio.org/pub/EMBOSS/

The associated (optional) EMBASSY packages are in the same directory.

mEMBOSS, the native Microsoft Windows port, can be downloaded from the
directory:

ftp://emboss.open-bio.org/pub/EMBOSS/windows/

This is a maintenance release. It fixes a compilation failure under
Linux Ubuntu distributions. The bug causing the failure was in a
function not actually used by the EMBOSS main package, but it was used
by two of the EMBASSY packages. It could cause some output file
permissions to be set incorrectly on other operating systems or
distributions.

The remaining minor differences over 6.3.0 are:

a) The optional configuration switch for PDF is now called --with-hpdf
   to bring it into line with the documentation.
b) The default restriction isoschizomer data file has been updated.
c) svg & pdf are now only reported once under graphics devices.

For mEMBOSS, the only changes are 'b)' and 'c)'.

We apologise for any inconvenience caused.
Alan

From ajb at ebi.ac.uk Mon Jul 19 10:41:47 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Mon, 19 Jul 2010 15:41:47 +0100 (BST)
Subject: [EMBOSS] compiling problems
In-Reply-To: <444095438.19601279539166895.JavaMail.root@127.0.0.1>
References: <444095438.19601279539166895.JavaMail.root@127.0.0.1>
Message-ID: <53997.86.26.12.63.1279550507.squirrel@webmail.ebi.ac.uk>

Hi Shrish,

The problem appears to be with your PostgreSQL development files under
RHEL. The EMBOSS configuration is picking up that your system says it
has the development files installed and is including the relevant
library that should contain the function 'PQescapeStringConn', i.e.
libpq (-lpq). From the error it appears the function isn't there for
some reason.

I have not seen this problem on other systems but, as the SQL stuff is
a new addition from 6.3.0, I'd be interested to know if people have had
similar experiences on non-RHEL machines.

I'll likely contact you off-list with a follow-up question. In the
meantime, if you cannot update your PostgreSQL then you can always
'make clean' and configure again using:

  --without-postgresql

as your system is reporting that it also has MySQL (which will allow
you access to mysql servers [obviously] and the public Ensembl ones).

ATB

Alan

From biopython at maubp.freeserve.co.uk Tue Jul 20 12:27:42 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 20 Jul 2010 17:27:42 +0100
Subject: [EMBOSS] Counting the number of sequences in a file
Message-ID:

Hi all,

Is there a tool in EMBOSS to just count the number of sequences in a
file?

For simple file formats like FASTA or GenBank I'd typically just use
grep:

$ grep -c "^LOCUS " gbvrt1.seq
31065

However, this becomes more complicated for general file formats (e.g.
FASTQ files, where in addition to identifiers the quality lines can
also start with @) or binary files like BAM, which EMBOSS now supports.
Right now I could handle this by using seqret to convert the file into
FASTA and then pipe that through grep to count the records. But an
EMBOSS tool would be more elegant, e.g.

$ countseq -sformat=genbank gbvrt1.seq
31065

For the implementation you might offer the choice between using the
normal EMBOSS parsing (as in seqret) versus file format specific
regular expression searches which just look for marker lines (without
checking validity), which should be really fast.

Regards,

Peter C.

From pmr at ebi.ac.uk Tue Jul 20 13:02:12 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 20 Jul 2010 18:02:12 +0100
Subject: [EMBOSS] Counting the number of sequences in a file
In-Reply-To:
References:
Message-ID: <4C45D694.9090405@ebi.ac.uk>

On 20/07/10 17:27, Peter C. wrote:
> Hi all,
>
> Is there a tool in EMBOSS to just count the number of sequences in a file?
>
> Right now I could handle this by using seqret to convert the file into FASTA
> and then pipe that through grep to count the records. But an EMBOSS tool
> would be more elegant, e.g.
>
> $ countseq -sformat=genbank gbvrt1.seq
> 31065
>
> For the implementation you might offer the choice between using the normal
> EMBOSS parsing (as in seqret) versus file format specific regular expression
> searches which just look for marker lines (without checking validity) which
> should be really fast.

Very easy to write ... you could do it yourself for practice (we will
help of course). Just use seqret as the basis, don't write any
sequences out, but add an outfile for the results.

We will add countseq to the next release.

regards,

Peter Rice

From pmr at ebi.ac.uk Tue Jul 20 13:04:58 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 20 Jul 2010 18:04:58 +0100
Subject: [EMBOSS] Counting the number of sequences in a file
In-Reply-To:
References:
Message-ID: <4C45D73A.1010406@ebi.ac.uk>

On 20/07/10 17:27, Peter C. wrote:
> $ countseq -sformat=genbank gbvrt1.seq
> 31065

Of course, you could just use:

$ seqret -filter -sformat=genbank gbvrt1.seq | grep -c '^>'
31065

:-)

Peter Rice

From biopython at maubp.freeserve.co.uk Tue Jul 20 16:01:24 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 20 Jul 2010 21:01:24 +0100
Subject: [EMBOSS] Counting the number of sequences in a file
In-Reply-To: <4C45D73A.1010406@ebi.ac.uk>
References: <4C45D73A.1010406@ebi.ac.uk>
Message-ID:

On Tue, Jul 20, 2010 at 6:04 PM, Peter Rice wrote:
>
> On 20/07/10 17:27, Peter C. wrote:
>> $ countseq -sformat=genbank gbvrt1.seq
>> 31065
>
> Of course, you could just use:
>
> $ seqret -filter -sformat=genbank gbvrt1.seq | grep -c '^>'
> 31065
>
> :-)
>

Exactly what I had in mind as the workaround ("handle this by using
seqret to convert the file into FASTA and then pipe that through grep
to count the records"), although I'd not thought about the fact that
FASTA is the default output format, which keeps it nice and short.

The (Unix) command line can be great :)

Peter C

From uludag at ebi.ac.uk Tue Jul 20 16:57:25 2010
From: uludag at ebi.ac.uk (Mahmut Uludag)
Date: Tue, 20 Jul 2010 21:57:25 +0100
Subject: [EMBOSS] Counting the number of sequences in a file
In-Reply-To:
References: <4C45D73A.1010406@ebi.ac.uk>
Message-ID: <8DA3AC5172EB49319DE921F7E99A8C51@MedionPC>

>> $ seqret -filter -sformat=genbank gbvrt1.seq | grep -c '^>'

infoseq prints a separate line for each sequence, so the following
command line may also work.
> $ infoseq -filter -sformat=genbank gbvrt1.seq | wc -l

Mahmut

From andrew.warry at bbsrc.ac.uk Wed Jul 21 05:28:48 2010
From: andrew.warry at bbsrc.ac.uk (andrew warry (BITS))
Date: Wed, 21 Jul 2010 10:28:48 +0100
Subject: [EMBOSS] Counting the number of sequences in a file
In-Reply-To: <8DA3AC5172EB49319DE921F7E99A8C51@MedionPC>
References: <4C45D73A.1010406@ebi.ac.uk> <8DA3AC5172EB49319DE921F7E99A8C51@MedionPC>
Message-ID:

Watch out for the infoseq header line: you should use -noheading to
avoid a +1 to your total.

infoseq -sformat=genbank gbvrt1.seq -noheading -auto | wc -l

Andrew

_______________________________________________
EMBOSS mailing list
EMBOSS at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/emboss
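The FASTQ pitfall raised at the start of this thread (quality lines can also begin with @) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not how EMBOSS or a future countseq works: the function names are hypothetical, and the record counter assumes unwrapped FASTQ with exactly four lines per record.

```python
import io


def count_marker_lines(handle, marker):
    """Count lines starting with `marker`, like grep -c "^marker"."""
    return sum(1 for line in handle if line.startswith(marker))


def count_fastq_records(handle):
    """Count records in an unwrapped FASTQ stream (assumed: four lines per record)."""
    count = 0
    while True:
        title = handle.readline()
        if not title:
            break  # end of input
        if not title.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\r\n")
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed record %d" % (count + 1))
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in record %d" % (count + 1))
        count += 1
    return count


# Two records; the first quality string happens to start with '@'.
EXAMPLE = "@read1\nACGT\n+\n@III\n@read2\nGGCC\n+\nIIII\n"

naive = count_marker_lines(io.StringIO(EXAMPLE), "@")
exact = count_fastq_records(io.StringIO(EXAMPLE))
print("lines starting with @:", naive)
print("four-line records:", exact)
```

On this example the marker-line count and the record count disagree, which is the reason a plain grep is not reliable for FASTQ, whereas the same marker approach is fine for FASTA ('>') or GenBank ('LOCUS ').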
From ajb at ebi.ac.uk Wed Jul 21 05:42:17 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Wed, 21 Jul 2010 10:42:17 +0100 (BST)
Subject: [EMBOSS] Patch/Fix 1 for EMBOSS 6.3.1 available
Message-ID: <52577.86.26.12.63.1279705337.squirrel@webmail.ebi.ac.uk>

This is available as separate drop-in replacement files, or as a patch
file, from the ftp://emboss.open-bio.org/pub/EMBOSS/fixes/ directory
hierarchy. Instructions are also available there.

Problem addressed:
In some circumstances the inclusion of PDF support could prevent the
inclusion of MySQL support.

If you required MySQL support and have already successfully installed
it then there is no need to apply this patch. Similarly, the patch is
unnecessary if you didn't want MySQL support included.

Alan

From biopython at maubp.freeserve.co.uk Thu Jul 22 07:16:48 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 22 Jul 2010 12:16:48 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BD080F6.3060409@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
 <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
 <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>
 <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk>
Message-ID:

On Thu, Apr 22, 2010 at 6:01 PM, Peter Rice wrote:
>
> On 22/04/2010 16:48, Peter Cock wrote:
>
>> Does this mean there is an updated seqret in a public repository where I
>> can convert an ABI file to FASTQ taking the ABI basecaller's sequence
>> and PHRED scores? I'd be interested to test that... or a patch against
>> EMBOSS 6.2.0.
>
> It is in the latest CVS code and will appear in the July release.
>

Hi Peter R et al,

I've just compiled and installed EMBOSS 6.3.1 on Mac OS X, and had a go
converting some ABI (extension .ab1) files from our in house sequencing
service to FASTQ - so far all the examples give Sanger FASTQ quality
strings of "!" (ASCII 33, PHRED quality zero) or Illumina FASTQ quality
strings of "@" (ASCII 64, again PHRED quality zero).

I remember you saying ABI files can have two sets of quality scores, so
perhaps my files have one set all of PHRED zero?

I tried to find some 3rd party example files via Google, for example on
http://www.elimbio.com/sequencing_sample_files.htm they have a zip file
http://www.elimbio.com/Forms/pGEM.zip containing one ABI file. The
output of this is more interesting:

$ seqret -sformat abi -osformat fastq -auto -stdout -sequence pGEM_\(ABI\)_A01.ab1
@pGEM_(ABI)
NANTCTATAGGCGAATTCGAGCTCGGTA...GNN
+
"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"...!"!"!"

I truncated this for brevity. Here the quality string repeats ASCII 34,
ASCII 33 (PHRED quality 1, quality 0), which is rather strange. The
sequence appears to agree with the provided file pGEM_(ABI)_A01.seq

Have I just been unlucky with the AB1 files that I have looked at? Thus
far all the quality scores seem meaningless.

Peter C.

From biopython at maubp.freeserve.co.uk Thu Jul 22 07:22:06 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 22 Jul 2010 12:22:06 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To:
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
 <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
 <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>
 <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk>
Message-ID:

On Thu, Jul 22, 2010 at 12:16 PM, Peter wrote:
> Have I just been unlucky with the AB1 files that I have looked at? Thus
> far all the quality scores seem meaningless.

I went back through my old emails, and see you had been testing with
http://www.appliedbiosystems.com/support/software_community/ab1_files.zip
(I had trouble downloading this with curl - Firefox worked). Looking at
these ABI files with seqret as FASTQ does seem to give meaningful
quality scores. Curious.

Peter C.
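The character codes quoted in this thread ("!" is ASCII 33, "@" is ASCII 64, and the double quote is ASCII 34) follow from the two FASTQ offsets being discussed. A minimal Python sketch of that arithmetic, with hypothetical function names and no connection to the EMBOSS source, is:

```python
SANGER_OFFSET = 33    # fastq-sanger: character code = PHRED score + 33
ILLUMINA_OFFSET = 64  # Illumina-style FASTQ mentioned above: score + 64


def encode_quality(scores, offset=SANGER_OFFSET):
    """Turn a list of integer PHRED scores into a FASTQ quality string."""
    return "".join(chr(score + offset) for score in scores)


def decode_quality(text, offset=SANGER_OFFSET):
    """Turn a FASTQ quality string back into integer PHRED scores."""
    return [ord(char) - offset for char in text]


all_zero_sanger = encode_quality([0, 0, 0])
all_zero_illumina = encode_quality([0, 0, 0], ILLUMINA_OFFSET)
alternating = encode_quality([1, 0, 1, 0])

print(all_zero_sanger)
print(all_zero_illumina)
print(alternating)
print(decode_quality(alternating))
```

Decoding a suspicious quality string this way is a quick check on whether a file carries real scores or only a placeholder pattern such as all zeros or alternating 1 and 0.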
From biopython at maubp.freeserve.co.uk Thu Jul 22 07:36:04 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 22 Jul 2010 12:36:04 +0100
Subject: [EMBOSS] transeq and ambiguous codons
In-Reply-To: <320fb6e00907100214v6799a217l507e089f635ef781@mail.gmail.com>
References: <320fb6e00907081450y2fd135e0x817f03c41357e297@mail.gmail.com>
 <4A55B395.4090301@ebi.ac.uk>
 <320fb6e00907100214v6799a217l507e089f635ef781@mail.gmail.com>
Message-ID:

Hi again,

Now that I have installed the latest and greatest version, EMBOSS 6.3.1,
I'm revisiting some old issues I had with EMBOSS. In this case
'unambiguous ambiguous codons' and other translation issues.

On Fri, Jul 10, 2009 at 10:14 AM, Peter C. wrote:
> On Thu, Jul 9, 2009 at 10:08 AM, Peter Rice wrote:
>>
>> Peter C. wrote:
>>> However, consider the codon TRR. R means A or G, so this can mean TAA,
>>> TGA, TAG or TGG which translate to stop or W (both EMBOSS and the NCBI
>>> standard table agree here). Therefore the translation of TRR should be
>>> "* or W", which I would expect based on the above examples to result
>>> in "X". But instead EMBOSS transeq gives "*":
>>
>> This is a side effect of the way backtranslation works...
>
> OK, leaving TRR aside for the moment (I'm not sure I'd have done it that
> way, but I think I follow your logic), I have some more problem cases for
> you to consider (all using the default standard NCBI table 1).
>
> Most of these are 'unambiguous ambiguous codons' as you put it, and
> I would agree using X when a more specific letter is possible isn't ideal
> but isn't actually wrong. The "ATS" and related codons (see below)
> however are simply wrong.
>
> --------------------------------------------------------------------------------------
>
> TRA means TAA or TGA, which are both stop codons. Therefore TRA
> should translate as a stop, not as an X:
>
> $ transeq asis:TAATGATRA -stdout -auto -osformat raw
> **X

Same on EMBOSS 6.3.1, shouldn't TRA translate as stop?

> --------------------------------------------------------------------------------------
>
> Now look at YTA, which means CTA or TTA which encode L, so
> YTA should be L not X:
>
> $ transeq asis:CTATTAYTA -stdout -auto -osformat raw
> LLX

Same on EMBOSS 6.3.1, giving X instead of specific amino acid
(i.e. YTA is an "unambiguous ambiguous codon" for L)

> Likewise for YTG and YTR, and YTN.

I haven't re-checked these.

> --------------------------------------------------------------------------------------
>
> Another example, ATW means ATA or ATT, which both translate as I,
> so ATW should translate as I not X:
>
> $ transeq asis:ATAATTATW -stdout -auto -osformat raw
> IIX

Same on EMBOSS 6.3.1, giving X instead of specific amino acid
(i.e. ATW is an "unambiguous ambiguous codon" for I)

> --------------------------------------------------------------------------------------
>
> Conversely, ATS which means ATC or ATG which translate as I and M.
> Remember S means G or C. Therefore ATS should translate as X, and
> not I:
>
> $ transeq asis:ATCATGATS -stdout -auto -osformat raw
> IMI

Same on EMBOSS 6.3.1, giving potentially wrong amino acid instead of X.

> Likewise H means A, G or C, so ATH shows the same bug, as do some
> other AT* codons:
>
> $ transeq asis:ATAATCATGATH -stdout -auto -osformat raw
> IIMI
>
> [*** This one strikes me as a clear bug ***]

Same on EMBOSS 6.3.1, giving potentially wrong amino acid instead of X.

As I noted before, this list is only partial, and only for the standard
table. I could compile a much longer list of oddities using the
Biopython translation as a reference if you wanted.

Regards,

Peter C.
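The rule argued for in this message (expand each ambiguous codon into its concrete codons, report the amino acid if they all agree, otherwise X) can be written as a short Python sketch. This is an illustration of that rule only, assuming NCBI genetic code table 1 and the standard IUPAC nucleotide codes; it is not the transeq algorithm, and the function names are hypothetical.

```python
from itertools import product

# NCBI standard genetic code (table 1), codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Return the amino acid shared by every expansion of the codon, else 'X'."""
    expansions = product(*(IUPAC[base] for base in codon.upper()))
    residues = {CODON_TABLE["".join(e)] for e in expansions}
    return residues.pop() if len(residues) == 1 else "X"


def translate(sequence):
    """Translate frame 1 of a nucleotide string, ignoring a trailing partial codon."""
    usable = len(sequence) - len(sequence) % 3
    return "".join(translate_codon(sequence[i:i + 3]) for i in range(0, usable, 3))


for example in ("TAATGATRA", "CTATTAYTA", "ATAATTATW", "ATCATGATS", "TRR", "ATH"):
    print(example, translate(example))
```

In this sketch TRA, YTA and ATW resolve to a single residue while ATS and TRR come out as X. One caveat when comparing with the cases listed above: in the IUPAC table used here H stands for A, C or T (G is excluded), so ATH expands to ATA, ATC and ATT and resolves to I.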
From pmr at ebi.ac.uk Thu Jul 22 08:28:00 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 22 Jul 2010 13:28:00 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To:
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
 <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
 <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>
 <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk>
Message-ID: <4C483950.7070004@ebi.ac.uk>

On 22/07/10 12:22, Peter C. wrote:

>> I truncated this for brevity. Here the quality string repeats ASCII 34, ASCII 33
>> (PHRED quality 1, quality 0) which is rather strange. The sequence appears
>> to agree with the provided file pGEM_(ABI)_A01.seq
>>
>> Have I just been unlucky with the AB1 files that I have looked at? Thus
>> far all the quality scores seem meaningless.

There are two sets of quality scores in that file. Both are the
alternating characters 1 and 0. Adding 33 gives the scores you see.

Looks as though EMBOSS is just reporting what it finds.

The file offset is the value returned by function
ajSeqABIGetConfidOffset. It simply reads one byte from there for each
base of sequence length.

> I went back through my old emails, and see you had been testing with
> http://www.appliedbiosystems.com/support/software_community/ab1_files.zip
> (I had trouble downloading this with curl - Firefox worked). Looking at these
> ABI files with seqret as FASTQ does seem to give meaningful quality scores.
> Curious.

It should look for a PCON tag in the file and pick up the second of
two, or the first if there is only one.

Can anyone on the list enlighten us further on what is intended for the
quality scores in these example files?

regards,

Peter Rice

From pmr at ebi.ac.uk Thu Jul 22 08:29:41 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 22 Jul 2010 13:29:41 +0100
Subject: [EMBOSS] transeq and ambiguous codons
In-Reply-To:
References: <320fb6e00907081450y2fd135e0x817f03c41357e297@mail.gmail.com>
 <4A55B395.4090301@ebi.ac.uk>
 <320fb6e00907100214v6799a217l507e089f635ef781@mail.gmail.com>
Message-ID: <4C4839B5.2070208@ebi.ac.uk>

On 22/07/10 12:36, Peter C. wrote:
> Hi again,
>
> Now that I have installed the latest and greatest version, EMBOSS 6.3.1,
> I'm revisiting some old issues I had with EMBOSS. In this case 'unambiguous
> ambiguous codons' and other translation issues.

Interesting. I'll take a look. We certainly intended to handle these
cases.

regards,

Peter Rice

From biopython at maubp.freeserve.co.uk Thu Jul 22 09:13:46 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 22 Jul 2010 14:13:46 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4C483950.7070004@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
 <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
 <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>
 <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk>
 <4C483950.7070004@ebi.ac.uk>
Message-ID:

On Thu, Jul 22, 2010 at 1:28 PM, Peter Rice wrote:
>
> On 22/07/10 12:22, Peter C. wrote:
>
>>> I truncated this for brevity. Here the quality string repeats ASCII 34, ASCII 33
>>> (PHRED quality 1, quality 0) which is rather strange. The sequence appears
>>> to agree with the provided file pGEM_(ABI)_A01.seq
>>>
>>> Have I just been unlucky with the AB1 files that I have looked at? Thus
>>> far all the quality scores seem meaningless.
>
> There are two sets of quality scores in that file. Both are the
> alternating characters 1 and 0. Adding 33 gives the scores you see.
>
> Looks as though EMBOSS is just reporting what it finds.
>
> The file offset is the value returned by function
> ajSeqABIGetConfidOffset. It simply reads one byte from there for each
> base of sequence length.

Looks like that particular random example from the internet was just
odd.

>> I went back through my old emails, and see you had been testing with
>> http://www.appliedbiosystems.com/support/software_community/ab1_files.zip
>> (I had trouble downloading this with curl - Firefox worked). Looking at these
>> ABI files with seqret as FASTQ does seem to give meaningful quality scores.
>> Curious.
>
> It should look for a PCON tag in the file and pick up the second of two,
> or the first if there is only one.
>
> Can anyone on the list enlighten us further on what is intended for the
> quality scores in these example files?

The pGEM example I have no idea - I just found it with Google.

I can send you a couple of our locally produced AB1 files off list if
you wouldn't mind having a look at them. It may be that however these
are being generated there simply are no useful scores inside.

Peter

From Bastien.Chevreux at dsm.com Thu Jul 22 10:42:40 2010
From: Bastien.Chevreux at dsm.com (Chevreux, Bastien)
Date: Thu, 22 Jul 2010 16:42:40 +0200
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To:
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
 <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
 <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>
 <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk>
 <4C483950.7070004@ebi.ac.uk>
Message-ID:

AFAIK ab1 files do not have phred quality scores included. At least
they did not a couple of years ago. You need to mangle them through a
basecaller (TraceTuner, phred, others) to get these scores.

B.

--
DSM Nutritional Products AG
R&D Human Nutrition & Health
Bioinformatics - Bldg. 203.4 / 188
P.O. Box 2676
CH-4002 Basel / Switzerland
Tel. +41 61 815 8264

From biopython at maubp.freeserve.co.uk Thu Jul 22 12:45:47 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Thu, 22 Jul 2010 17:45:47 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To:
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
 <4BB1FBE1.8030400@ebi.ac.uk> <4BD06999.2080502@ebi.ac.uk>
 <4BD080F6.3060409@ebi.ac.uk> <4C483950.7070004@ebi.ac.uk>
Message-ID:

On Thu, Jul 22, 2010 at 5:33 PM, Tom Keller wrote:
> Greetings,
> The latest versions of the ABI basecaller does indeed give quality scores.

I suspect the problem is that my ABI files were not created using the
latest ABI basecaller then. Do you have any more details (e.g. which
version)?

I've sent a couple of *.ab1 files off list to Peter Rice to confirm
they really don't have quality scores. Tomorrow I will try and find out
who to contact locally about the base calling, and what version of the
base caller they have.

Peter

From kellert at ohsu.edu Thu Jul 22 12:33:43 2010
From: kellert at ohsu.edu (Tom Keller)
Date: Thu, 22 Jul 2010 09:33:43 -0700
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To:
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
 <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
 <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>
 <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk>
 <4C483950.7070004@ebi.ac.uk>
Message-ID:

Greetings,

The latest versions of the ABI basecaller do indeed give quality
scores. Nicola Vitacolonna wrote a Perl module that accesses the
metadata encoded in the ab1 files.

    use strict;
    use warnings;
    use Bio::Trace::ABIF;

    my $abif = Bio::Trace::ABIF->new();
    $abif->open_abif('/Path/to/my/file.ab1');

    my $sequence       = $abif->sequence();
    my @quality_values = $abif->quality_values();

    # A FASTQ title line starts with '@'.
    print '@', $abif->sample_name(), "\n";
    print $sequence, "\n";
    print "+\n";
    # fastq-sanger stores each score as the character with code score + 33.
    print join('', map { chr($_ + 33) } @quality_values), "\n";

This will generate fastq-sanger format.

regards,

Tom

Thomas (Tom) Keller, PhD
kellert at ohsu.edu
503.494.2442
6339b R Jones Hall (BSc/CROET)
www.ohsu.edu/xd/research/research-cores/dna-analysis/

On Jul 22, 2010, at 7:42 AM, Chevreux, Bastien wrote:
> AFAIK ab1 files do not have phred quality scores included. At least
> they did not a couple of years ago. You need to mangle them through a
> basecaller (TraceTuner, phred, others) to get these scores.

From korndoerfer at crelux.com Sat Jul 24 06:56:50 2010
From: korndoerfer at crelux.com (Ingo P. Korndoerfer)
Date: Sat, 24 Jul 2010 12:56:50 +0200
Subject: [EMBOSS] emboss stand-in for fasta
Message-ID: <4C4AC6F2.3000403@crelux.com>

could anybody help me out with what to use as a stand-in for fasta ?

fasta by itself is fine, but under windows there is no way to make
fasta accept filenames with spaces. neither "" nor """" nor '' seem to
alleviate the problem.

so i was hoping emboss would have something (which would also save me
having to install fasta on all of our pcs).

what i need to do is run a sequence against an in house library and
have it return the top hit as an alignment.

1000 thanks in advance

greetings

ingo

--
CRELUX GmbH
Dr. Ingo Korndoerfer
Head of Crystallography
Am Klopferspitz 19a
82152 Martinsried
Germany
Phone: +49 89 700760210
Fax: +49 89 700760222
korndoerfer at crelux.com
www.crelux.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: korndoerfer.vcf
Type: text/x-vcard
Size: 318 bytes
Desc: not available
URL:

From biopython at maubp.freeserve.co.uk Sat Jul 24 08:05:11 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Sat, 24 Jul 2010 13:05:11 +0100
Subject: [EMBOSS] emboss stand-in for fasta
In-Reply-To: <4C4AC6F2.3000403@crelux.com>
References: <4C4AC6F2.3000403@crelux.com>
Message-ID:

On Sat, Jul 24, 2010 at 11:56 AM, Ingo P. Korndoerfer wrote:
> could anybody help me out with what to use as a stand-in for fasta ?
>
> fasta by itself is fine, but under windows there is no way to make fasta
> accept filenames with spaces. neither "" nor """" nor '' seem to alleviate
> the problem.

You are talking about Bill Pearson's FASTA command line tools, right? Have you tried wrapping the filename with double quote characters, "like this.fasta", which usually works on Windows. If not, I'd also try escaping with a backslash, "like\ this.fasta", just in case.

> so i was hoping emboss would have something (which would also save
> me having to install fasta on all of our pcs).
>
> what i need to do is run a sequence against an in house library and
> return me the top hit in alignment.

Sounds like BLAST might be a sensible choice to me - it works fine on Windows, although I'm not sure about filenames with spaces.

Personally I avoid filenames with spaces - they just cause trouble. Can't you rename things before calling FASTA? e.g. Write a wrapper script for FASTA to turn spaces into underscores?

Peter

From koenvanderdrift at gmail.com Mon Jul 26 08:30:20 2010
From: koenvanderdrift at gmail.com (Koen van der Drift)
Date: Mon, 26 Jul 2010 08:30:20 -0400
Subject: [EMBOSS] libtool mismatch for 6.3.1 on OS X (fink)
Message-ID:

Hi,

When updating the emboss package file for fink on OS X, I got an error about a libtool version mismatch. Fink already has version 2.2.10, but emboss requires 2.2.6b.
I tried recreating the libtool-related files as written in the install file, but that did not work, still got the same error. I am not very familiair with libtool et al, so I could have made a mistake there. I am not at my Mac right now, so cannot give you more detailed info and error messages. But I was wondering if there is a way to use compile emboss with libtool 2.2.10 instead of 2.2.6b? Thanks, - Koen. From ajb at ebi.ac.uk Mon Jul 26 09:14:26 2010 From: ajb at ebi.ac.uk (ajb at ebi.ac.uk) Date: Mon, 26 Jul 2010 14:14:26 +0100 (BST) Subject: [EMBOSS] libtool mismatch for 6.3.1 on OS X (fink) In-Reply-To: References: Message-ID: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> Hi Koen, Unless you are doing something in a non-standard way with the EMBOSS-x.y.z.gz file contents then you shouldn't get that message. It might appear, for example, if you have been hand-modifying some of the configuration files. If, on the other hand, you are using the CVS version then you will get such a message. Generally, if your libtool (2.x) is newer than that used by a given package then typing "autoreconf -fi" before doing any other configuration steps should fix up the libtool side. Note that doing so may not work for modified non-developer distributions. Please let us know if you are getting the error using just: gunzip EMBOSS-x.y.z.gz ./configure --prefix=/fu/bar make HTH Alan > Hi, > > When updating the emboss package file for fink on OS X, I got an error > about a libtool version mismatch. Fink already has version 2.2.10, but > emboss requires 2.2.6b. I tried recreating the libtool-related files > as written in the install file, but that did not work, still got the > same error. I am not very familiair with libtool et al, so I could > have made a mistake there. I am not at my Mac right now, so cannot > give you more detailed info and error messages. But I was wondering if > there is a way to use compile emboss with libtool 2.2.10 instead of > 2.2.6b? 
> > Thanks, > > - Koen. > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > From koenvanderdrift at gmail.com Mon Jul 26 10:03:54 2010 From: koenvanderdrift at gmail.com (Koen van der Drift) Date: Mon, 26 Jul 2010 10:03:54 -0400 Subject: [EMBOSS] libtool mismatch for 6.3.1 on OS X (fink) In-Reply-To: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> References: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> Message-ID: Hi Alan, Thanks for the reply. As far as I know I am uisng the x.y.z release that I downloaded from the ftp site this weekend. I don't use the cvs version, and am not manipulating any of the configuration files. I'll try the autoreconf command as well as the configure/make commands outside of fink later today and will let you know how that went. - Koen. On Mon, Jul 26, 2010 at 9:14 AM, wrote: > Hi Koen, > > Unless you are doing something in a non-standard way with the > EMBOSS-x.y.z.gz file contents then you shouldn't get that message. It > might appear, for example, if you have been hand-modifying some of the > configuration files. > > If, on the other hand, you are using the CVS version then you will > get such a message. Generally, if your libtool (2.x) is newer > than that used by a given package then typing "autoreconf -fi" > before doing any other configuration steps should fix up > the libtool side. Note that doing so may not work for > modified non-developer distributions. > > > Please let us know if you are getting the error using just: > ? ?gunzip EMBOSS-x.y.z.gz > ? ?./configure --prefix=/fu/bar > ? ?make > > HTH > > Alan > > > >> Hi, >> >> When updating the emboss package file for fink on OS X, I got an error >> about a libtool version mismatch. Fink already has version 2.2.10, but >> emboss requires 2.2.6b. 
I tried recreating the libtool-related files >> as written in the install file, but that did not work, still got the >> same error. I am not very familiair with libtool et al, so I could >> have made a mistake there. I am not at my Mac right now, so cannot >> give you more detailed info and error messages. But I was wondering if >> there is a way to use compile emboss with libtool 2.2.10 instead of >> 2.2.6b? >> >> Thanks, >> >> - Koen. >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss >> > > > From koenvanderdrift at gmail.com Tue Jul 27 00:03:26 2010 From: koenvanderdrift at gmail.com (Koen van der Drift) Date: Tue, 27 Jul 2010 00:03:26 -0400 Subject: [EMBOSS] libtool mismatch for 6.3.1 on OS X (fink) In-Reply-To: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> References: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> Message-ID: <4401ABFA-39A5-4868-A638-6CA4A614148E@gmail.com> On Jul 26, 2010, at 9:14 AM, ajb at ebi.ac.uk wrote: > Generally, if your libtool (2.x) is newer > than that used by a given package then typing "autoreconf -fi" > before doing any other configuration steps should fix up > the libtool side. That indeed solved the issue. Thanks, - Koen. From dag at sonsorol.org Wed Jul 28 07:49:17 2010 From: dag at sonsorol.org (Chris Dagdigian) Date: Wed, 28 Jul 2010 07:49:17 -0400 Subject: [EMBOSS] accessing emboss ftp site In-Reply-To: References: <6F57C2D1-8927-420C-940C-C6EC0C62AABE@gmail.com> <4BD0DB9B.5050005@sonsorol.org> <4BD1C05D.5010109@sonsorol.org> Message-ID: <4C50193D.5060500@bioteam.net> I can't seem to replicate this at all with any of my available FTP clients or browser-based FTP clients. It works for me from both Mac OS X clients as well as a CentOS 5.4 based Linux system. Is there an FTP client / OS combo that seems in particular not to work? 
-Chris Here is one example: > dag-static:~ dag$ ftp emboss.open-bio.org > Connected to emboss.open-bio.org. > 220 (vsFTPd 2.0.1) > Name (emboss.open-bio.org:dag): anonymous > 331 Please specify the password. > Password: > 230 Login successful. > Remote system type is UNIX. > Using binary mode to transfer files. > ftp> dir > 229 Entering Extended Passive Mode (|||54472|) > 150 Here comes the directory listing. > drwxr-xr-x 8 14 50 4096 May 22 2006 pub > 226 Directory send OK. > ftp> cd pub/EMBOSS > 250 Directory successfully changed. > ftp> ls > 229 Entering Extended Passive Mode (|||38343|) > 150 Here comes the directory listing. > -rw-rw-r-- 1 501 503 389856 Jul 21 09:25 CBSTOOLS-1.0.0.tar.gz > -rw-rw-r-- 1 501 503 426218 Jul 21 09:25 DOMAINATRIX-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 441025 Jul 21 09:25 DOMALIGN-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 452787 Jul 21 09:25 DOMSEARCH-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 23572243 Jul 19 13:41 EMBOSS-6.3.1.tar.gz > lrwxrwxrwx 1 501 503 19 Jul 19 13:42 EMBOSS-latest.tar.gz -> EMBOSS-6.3.1.tar.gz > -rw-rw-r-- 1 501 503 373798 Jul 21 09:25 EMNU-1.05.tar.gz > -rw-rw-r-- 1 501 503 415096 Jul 21 09:25 ESIM4-1.0.0.tar.gz > -rw-rw-r-- 1 501 503 569581 Jul 21 09:25 HMMER-2.3.2.tar.gz > -rw-rw-r-- 1 501 503 350791 Jul 21 09:25 IPRSCAN-4.3.1.tar.gz > drwxrwsr-x 7 501 503 4096 Feb 01 2006 Jemboss > -rw-rw-r-- 1 501 503 513418 Jul 21 09:25 MEMENEW-4.0.0.tar.gz > -rw-rw-r-- 1 501 503 823636 Jul 21 09:25 MIRA-2.8.2.tar.gz > -rw-rw-r-- 1 501 503 435315 Jul 21 09:25 MSE-3.0.0.tar.gz > -rw-rw-r-- 1 501 503 328540 Jul 21 09:25 MYEMBOSS-6.3.0.tar.gz > -rw-rw-r-- 1 501 503 359488 Jul 21 09:25 MYEMBOSSDEMO-6.3.0.tar.gz > -rw-rw-r-- 1 501 503 1667760 Jul 21 09:25 PHYLIPNEW-3.69.tar.gz > -rw-rw-r-- 1 501 503 571008 Jul 21 09:25 SIGNATURE-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 531604 Jul 21 09:25 STRUCTURE-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 386287 Jul 21 09:25 TOPO-2.0.0.tar.gz > -rw-rw-r-- 1 501 503 652433 Jul 21 09:25 VIENNA-1.7.2.tar.gz > drwxrwsr-x 
3 522 503 4096 Aug 21 2006 contrib > drwxrwsr-x 2 501 503 4096 Nov 11 2005 doc > drwxrwsr-x 3 501 503 4096 Jul 21 09:25 fixes > drwxrwsr-x 14 501 503 4096 Jul 19 13:39 old > drwxrwsr-x 2 501 503 4096 Jul 06 2005 tutorials > drwxrwsr-x 4 501 503 4096 Jul 19 13:36 windows > 226 Directory send OK. > ftp> Hamish McWilliam wrote: > Hi Chris, > > I'm also seeing problems with the FTP site, but using Mirror rather > than Transmit, it looks like the server does not like options being > specified to the LIST command: > > Scanning remote directory /pub/EMBOSS > ---> CWD /pub/EMBOSS > 250 Directory successfully changed. > ---> TYPE A > 200 Switching to ASCII mode. > ---> PORT 172,21,22,1,171,245 > 200 PORT command successful. Consider using PASV. > ---> PASV > 227 Entering Passive Mode (208,94,50,58,104,178) > ---> LIST -lat > timed out > Cannot get remote directory listing because: timed out > Cannot get remote directory details (/pub/EMBOSS) > disconnecting from emboss.open-bio.org > > Trying it with a command line client I get the response to hang if I > try using any options to ls or dir, without options they are fine. > > All the best, > > Hamish > > On 29 May 2010 03:44, Koen van der Drift wrote: >> Hi Chris, >> >> Did you have a chance to look at this? Just tried again, and Transmit still >> won't let me access the emboss ftp site. >> >> Thanks, >> >> - Koen. >> >> >> On Apr 23, 2010, at 11:44 AM, Chris Dagdigian wrote: >> >>> In the last few months the open-bio.org servers switched datacenters, IP >>> addresses and firewall/IDS appliances. Lots of juicy things to look at and >>> debug. >>> >>> Koen - if you have a chance can you send me the IP address that you are >>> using to connect from? I might be able to find some relevant log entries >>> with that info. >>> >>> -Chris >>> >>> >>> >>> Koen van der Drift wrote: >>>> Just for the record, it used to work with Transmit, this is only from >>>> the last few months. >>>> >>>> - Koen. 
>>>> >>>> On Thu, Apr 22, 2010 at 7:28 PM, Chris Dagdigian >>>> wrote: >>>>> Might be an issue with the Juniper Netscreen firewall/IDS security >>>>> appliance >>>>> that sits upstream of the EMBOSS FTP server. I'll take a look at the >>>>> security logs and alerts. >>>>> >>>>> -Chris >>>>> >>>>> >>>>> Koen van der Drift wrote: >>>>>> Hi, >>>>>> >>>>>> For a while now I am unable to access the emboss ftp site using the OS >>>>>> X >>>>>> client Transmit. Loggin in works fine, but it chokes on the LIST >>>>>> command. I have no problems accessing it from the command line. I have >>>>>> added the output from Transmit below. I don't know if this is a >>>>>> Transmit >>>>>> or emboss issue, but just wanted to let you know. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> - Koen. >>>>>> >>>>>> >>>>>> Transmit 3.6.9 Session Transcript >>>>>> LibNcFTP 3.2.1 (August 13, 2007) compiled for UNIX >>>>>> Uname: Darwin|exile.local|9.8.0|Darwin Kernel Version 9.8.0: Wed Jul 15 >>>>>> 16:57:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_PPC|Power Macintosh >>>>>> 220: (vsFTPd 2.0.1) >>>>>> Connected to emboss.open-bio.org. >>>>>> Cmd: USER anonymous >>>>>> 331: Please specify the password. >>>>>> Cmd: PASS NcFTP@ >>>>>> 230: Login successful. >>>>>> Cmd: TYPE A >>>>>> 200: Switching to ASCII mode. >>>>>> Logged in to emboss.open-bio.org as anonymous. >>>>>> Cmd: SYST >>>>>> 215: UNIX Type: L8 >>>>>> Cmd: PWD >>>>>> 257: "/" >>>>>> Cmd: CWD /pub/EMBOSS/fixes >>>>>> 250: Directory successfully changed. >>>>>> Cmd: PWD >>>>>> 257: "/pub/EMBOSS/fixes" >>>>>> Cmd: PASV >>>>>> 227: Entering Passive Mode (208,94,50,58,83,232) >>>>>> Cmd: LIST -a >>>>>> Could not read reply from control connection -- timed out. (SReadline >>>>>> 1) >>>>>> 220: (vsFTPd 2.0.1) >>>>>> Connected to emboss.open-bio.org. >>>>>> Cmd: USER anonymous >>>>>> 331: Please specify the password. >>>>>> Cmd: PASS NcFTP@ >>>>>> 230: Login successful. >>>>>> Logged in to emboss.open-bio.org as anonymous. 
>>>>>> Cmd: SYST >>>>>> 215: UNIX Type: L8 >>>>>> Cmd: PWD >>>>>> 257: "/" >>>>>> Cmd: CWD /pub/EMBOSS/fixes >>>>>> 250: Directory successfully changed. >>>>>> Cmd: PWD >>>>>> 257: "/pub/EMBOSS/fixes" >>>>>> Cmd: PASV >>>>>> 227: Entering Passive Mode (208,94,50,58,222,100) >>>>>> Cmd: LIST -a >>>>>> Could not read reply from control connection -- timed out. (SReadline >>>>>> 1) >>>>>> >>>>>> _______________________________________________ >>>>>> EMBOSS mailing list >>>>>> EMBOSS at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/emboss >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss >> > > >
From ajb at ebi.ac.uk Mon Jul 19 14:41:47 2010 From: ajb at ebi.ac.uk (ajb at ebi.ac.uk) Date: Mon, 19 Jul 2010 15:41:47 +0100 (BST) Subject: [EMBOSS] compiling problems In-Reply-To: <444095438.19601279539166895.JavaMail.root@127.0.0.1> References: <444095438.19601279539166895.JavaMail.root@127.0.0.1> Message-ID: <53997.86.26.12.63.1279550507.squirrel@webmail.ebi.ac.uk> Hi Shrish, The problem appears to be with your PostgreSQL development files under RHEL. The EMBOSS configuration is picking up that your system says it has the development files installed and is including the relevant library that should contain the function 'PQescapeStringConn' i.e. libpq (-lpq). From the error it appears the function isn't there for some reason. I have not seen this problem on other systems but, as the SQL stuff is a new addition from 6.3.0 I'd be interested to know if people have had similar experiences on non-RHEL machines. I'll likely contact you off-list with a follow-up question. In the meantime, if you cannot update your PostgreSQL then you can always 'make clean' and configure again using: --without-postgresql as your system is reporting that it also has MySQL (which will allow you access to mysql servers [obviously] and the public Ensembl ones). ATB Alan From biopython at maubp.freeserve.co.uk Tue Jul 20 16:27:42 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 20 Jul 2010 17:27:42 +0100 Subject: [EMBOSS] Counting the number of sequences in a file Message-ID: Hi all, Is there a tool in EMBOSS to just count the number of sequences in a file?
For simple file formats like FASTA or GenBank I'd typically just use grep: $ grep -c "^LOCUS " gbvrt1.seq 31065 However, this becomes more complicated for general file formats (e.g. FASTQ files where in addition to identifiers the quality lines can also start with @) or binary files like BAM which EMBOSS now supports. Right now I could handle this by using seqret to convert the file into FASTA and then pipe that though grep to count the records. But an EMBOSS tool would be more elegant, e.g. $ countseq -sformat=genbank gbvrt1.seq 31065 For the implementation you might offer the choice between using the normal EMBOSS parsing (as in seqret) versus file format specific regular expression searches which just look for marker lines (without checking validity) which should be really fast. Regards, Peter C. From pmr at ebi.ac.uk Tue Jul 20 17:02:12 2010 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 20 Jul 2010 18:02:12 +0100 Subject: [EMBOSS] Counting the number of sequences in a file In-Reply-To: References: Message-ID: <4C45D694.9090405@ebi.ac.uk> On 20/07/10 17:27, Peter C. wrote: > Hi all, > > Is there a tool in EMBOSS to just count the number of sequences in a file? > > Right now I could handle this by using seqret to convert the file into FASTA > and then pipe that though grep to count the records. But an EMBOSS tool > would be more elegant, e.g. > > $ countseq -sformat=genbank gbvrt1.seq > 31065 > > For the implementation you might offer the choice between using the normal > EMBOSS parsing (as in seqret) versus file format specific regular expression > searches which just look for marker lines (without checking validity) which > should be really fast. Very easy to write ... you could do it yourself for practise (we will help of course). Just use seqret as the basis, don't write any sequences out, but add an outfile for the results. We will add countseq to the next release. 
regards, Peter Rice From pmr at ebi.ac.uk Tue Jul 20 17:04:58 2010 From: pmr at ebi.ac.uk (Peter Rice) Date: Tue, 20 Jul 2010 18:04:58 +0100 Subject: [EMBOSS] Counting the number of sequences in a file In-Reply-To: References: Message-ID: <4C45D73A.1010406@ebi.ac.uk> On 20/07/10 17:27, Peter C. wrote: > $ countseq -sformat=genbank gbvrt1.seq > 31065 Of course, you could just use: $ seqret -filter -sformat=genbank gbvrt1.seq | grep -c '^>' 31065 :-) Peter Rice From biopython at maubp.freeserve.co.uk Tue Jul 20 20:01:24 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Tue, 20 Jul 2010 21:01:24 +0100 Subject: [EMBOSS] Counting the number of sequences in a file In-Reply-To: <4C45D73A.1010406@ebi.ac.uk> References: <4C45D73A.1010406@ebi.ac.uk> Message-ID: On Tue, Jul 20, 2010 at 6:04 PM, Peter Rice wrote: > > On 20/07/10 17:27, Peter C. wrote: >> $ countseq -sformat=genbank gbvrt1.seq >> 31065 > > Of course, you could just use: > > $ seqret -filter -sformat=genbank gbvrt1.seq | grep -c '^>' > 31065 > > :-) > Exactly what I had in mind as the work around ("handle this by using seqret to convert the file into FASTA and then pipe that though grep to count the records"), although I'd not thought about the fact that FASTA is the default output format which keeps it nice and short. The (Unix) command line can be great :) Peter C From uludag at ebi.ac.uk Tue Jul 20 20:57:25 2010 From: uludag at ebi.ac.uk (Mahmut Uludag) Date: Tue, 20 Jul 2010 21:57:25 +0100 Subject: [EMBOSS] Counting the number of sequences in a file In-Reply-To: References: <4C45D73A.1010406@ebi.ac.uk> Message-ID: <8DA3AC5172EB49319DE921F7E99A8C51@MedionPC> >> $ seqret -filter -sformat=genbank gbvrt1.seq | grep -c '^>' infoseq prints a separate line for each sequence, following command line may also work. 
> $ infoseq -filter -sformat=genbank gbvrt1.seq | wc -l Mahmut From andrew.warry at bbsrc.ac.uk Wed Jul 21 09:28:48 2010 From: andrew.warry at bbsrc.ac.uk (andrew warry (BITS)) Date: Wed, 21 Jul 2010 10:28:48 +0100 Subject: [EMBOSS] Counting the number of sequences in a file In-Reply-To: <8DA3AC5172EB49319DE921F7E99A8C51@MedionPC> References: <4C45D73A.1010406@ebi.ac.uk> <8DA3AC5172EB49319DE921F7E99A8C51@MedionPC> Message-ID: Watch out for the infoseq header line you should use -noheading to avoid a +1 to your total infoseq -sformat=genbank gbvrt1.seq -noheading -auto | wc -l Andrew -----Original Message----- From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Mahmut Uludag Sent: 20 July 2010 21:57 To: Peter; Peter Rice Cc: emboss at emboss.open-bio.org Subject: Re: [EMBOSS] Counting the number of sequences in a file >> $ seqret -filter -sformat=genbank gbvrt1.seq | grep -c '^>' infoseq prints a separate line for each sequence, following command line may also work. > $ infoseq -filter -sformat=genbank gbvrt1.seq | wc -l Mahmut _______________________________________________ EMBOSS mailing list EMBOSS at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/emboss -- Disclaimer: This e-mail and any attachments are confidential and intended solely for the use of the recipient(s) to whom they are addressed. If you have received it in error, please destroy all copies and inform the sender. This email and any attachments are believed to be free from viruses but BBSRC accepts no liability in connection therewith. 
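
The seqret-plus-grep approach in this thread comes down to counting FASTA header lines. A minimal Python sketch of that counting step follows, assuming the input is already FASTA text such as seqret writes by default; the function name count_fasta_records is made up for illustration and is not part of EMBOSS.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines, i.e. lines starting with '>'.

    `lines` is any iterable of text lines, for example an open file or
    sys.stdin. No validation is done, just like `grep -c '^>'`.
    """
    return sum(1 for line in lines if line.startswith(">"))


# Small in-memory example: two records, one of them spanning two sequence lines.
example = [
    ">seq1 first record\n",
    "ACGTACGT\n",
    "ACGT\n",
    ">seq2 second record\n",
    "GGCC\n",
]
print(count_fasta_records(example))
```

To use it on a real file one could pass sys.stdin to the function and pipe the output of "seqret -filter -sformat=genbank gbvrt1.seq" into the script. The same caveat as for grep applies: it only works because '>' marks record starts in FASTA, which is not true of '@' in FASTQ.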
From ajb at ebi.ac.uk Wed Jul 21 09:42:17 2010 From: ajb at ebi.ac.uk (ajb at ebi.ac.uk) Date: Wed, 21 Jul 2010 10:42:17 +0100 (BST) Subject: [EMBOSS] Patch/Fix 1 for EMBOSS 6.3.1 available Message-ID: <52577.86.26.12.63.1279705337.squirrel@webmail.ebi.ac.uk> This is available as separate drop-in replacement files, or as a patch file, from the ftp://emboss.open-bio.org/pub/EMBOSS/fixes/ directory hierarchy. Instructions are also available there. Problem addressed: In some circumstances the inclusion of PDF support could prevent the inclusion of MySQL support. If you required MySQL support and have already successfully installed it then there is no need to apply this patch. Similarly, the patch is unnecessary if you didn't want MySQL support included. Alan From biopython at maubp.freeserve.co.uk Thu Jul 22 11:16:48 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 22 Jul 2010 12:16:48 +0100 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: <4BD080F6.3060409@ebi.ac.uk> References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com> <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk> Message-ID: On Thu, Apr 22, 2010 at 6:01 PM, Peter Rice wrote: > > On 22/04/2010 16:48, Peter Cock wrote: > >> Does this mean there is an updated seqret in a public repository where I >> can convert an ABI file to FASTQ taking the ABI basecaller's sequence >> and PHRED scores? I'd be interested to test that... or a patch against >> EMBOSS 6.2.0. > > It is in the latest CVS code and will appeart in the July release. > Hi Peter R et al, I've just compiled and installed EMBOSS 6.3.1 on Mac OS X, and had a go converting some ABI (extension .ab1) files from our in house sequencing service to FASTQ - so far all the examples give Sanger FASTQ quality strings of "!" 
(ASCII 33, PHRED quality zero) or Illumina FASTQ quality strings of "@" (ASCII 64, again PHRED quality zero). I remember you saying ABI files can have two sets of quality scores, so perhaps my files have one set all of PHRED zero? I tried to find some 3rd party example files via Google, for example on http://www.elimbio.com/sequencing_sample_files.htm they have a zip file http://www.elimbio.com/Forms/pGEM.zip containing one ABI file. The output of this is more interesting: $ seqret -sformat abi -osformat fastq -auto -stdout -sequence pGEM_\(ABI\)_A01.ab1 @pGEM_(ABI) NANTCTATAGGCGAATTCGAGCTCGGTA...GNN + "!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"...!"!"!" I truncated this for brevity. Here the quality string repeats ASCI 34, ASCI 33 (PHRED quality 1, quality 0) which is rather strange. The sequence appears to agree with the provided file pGEM_(ABI)_A01.seq Have I just been unlucky with the AB1 files that I have looked at? Thus far all the quality scores seem meaningless. Peter C. From biopython at maubp.freeserve.co.uk Thu Jul 22 11:22:06 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 22 Jul 2010 12:22:06 +0100 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com> <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk> Message-ID: On Thu, Jul 22, 2010 at 12:16 PM, Peter wrote: > On Thu, Apr 22, 2010 at 6:01 PM, Peter Rice wrote: >> >> On 22/04/2010 16:48, Peter Cock wrote: >> >>> Does this mean there is an updated seqret in a public repository where I >>> can convert an ABI file to FASTQ taking the ABI basecaller's sequence >>> and PHRED scores? I'd be interested to test that... or a patch against >>> EMBOSS 6.2.0. >> >> It is in the latest CVS code and will appeart in the July release. 
>> > > Hi Peter R et al, > > I've just compiled and installed EMBOSS 6.3.1 on Mac OS X, and had a > go converting some ABI (extension .ab1) files from our in house sequencing > service to FASTQ - so far all the examples give Sanger FASTQ quality strings > of "!" (ASCII 33, PHRED quality zero) or Illumina FASTQ quality strings of > "@" (ASCII 64, again PHRED quality zero). > > I remember you saying ABI files can have two sets of quality scores, > so perhaps my files have one set all of PHRED zero? > > I tried to find some 3rd party example files via Google, for example on > http://www.elimbio.com/sequencing_sample_files.htm they have a zip > file http://www.elimbio.com/Forms/pGEM.zip containing one ABI file. > The output of this is more interesting: > > $ seqret -sformat abi -osformat fastq ?-auto -stdout -sequence > pGEM_\(ABI\)_A01.ab1 > @pGEM_(ABI) > NANTCTATAGGCGAATTCGAGCTCGGTA...GNN > + > "!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"!"...!"!"!" > > I truncated this for brevity. Here the quality string repeats ASCI 34, ASCI 33 > (PHRED quality 1, quality 0) which is rather strange. The sequence appears > to agree with the provided file pGEM_(ABI)_A01.seq > > Have I just been unlucky with the AB1 files that I have looked at? Thus > far all the quality scores seem meaningless. I went back through my old emails, and see you had been testing with http://www.appliedbiosystems.com/support/software_community/ab1_files.zip (I had trouble downloading this with curl - Firefox worked). Looking at these ABI files with seqret as FASTQ does seem to give meaningful quality scores. Curious. Peter C. 
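
The quality strings discussed above are easier to read once decoded back to numbers. A minimal Python sketch of the two encodings mentioned in the message follows, assuming the usual offsets of 33 for Sanger FASTQ and 64 for Illumina FASTQ as stated there; the function names are hypothetical and this is not the EMBOSS implementation.

```python
def phred_to_fastq(scores, offset=33):
    """Encode a list of PHRED scores as a FASTQ quality string.

    offset=33 gives Sanger FASTQ, offset=64 the older Illumina variant.
    """
    return "".join(chr(score + offset) for score in scores)


def fastq_to_phred(quality, offset=33):
    """Decode a FASTQ quality string back into a list of PHRED scores."""
    return [ord(char) - offset for char in quality]


# The pGEM example: alternating '"' (ASCII 34) and '!' (ASCII 33).
print(fastq_to_phred('"!"!"!'))           # [1, 0, 1, 0, 1, 0]
# An all-zero read looks like '!' in Sanger and '@' in Illumina encoding.
print(phred_to_fastq([0, 0, 0]))          # !!!
print(phred_to_fastq([0, 0, 0], 64))      # @@@
```

Decoding a suspicious quality line this way makes it obvious when a file carries only placeholder values such as 0 and 1 rather than real basecaller scores.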
From biopython at maubp.freeserve.co.uk Thu Jul 22 11:36:04 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 22 Jul 2010 12:36:04 +0100 Subject: [EMBOSS] transeq and ambiguous codons In-Reply-To: <320fb6e00907100214v6799a217l507e089f635ef781@mail.gmail.com> References: <320fb6e00907081450y2fd135e0x817f03c41357e297@mail.gmail.com> <4A55B395.4090301@ebi.ac.uk> <320fb6e00907100214v6799a217l507e089f635ef781@mail.gmail.com> Message-ID: Hi again, Now that I have installed the latest and greatest version, EMBOSS 6.3.1, I'm revisiting some old issues I had with EMBOSS. In this case 'unambiguous ambiguous codons' and other translation issues. On Fri, Jul 10, 2009 at 10:14 AM, Peter C. wrote: > On Thu, Jul 9, 2009 at 10:08 AM, Peter Rice wrote: >> >> Peter C. wrote: >>> However, consider the codon TRR. R means A or G, so this can mean TAA, >>> TGA, TAG or TGG which translate to stop or W (both EMBOSS and the NCBI >>> standard table agree here). Therefore the translation of TRR should be >>> "* or W", which I would expect based on the above examples to result >>> in "X". But instead EMBOSS transeq gives "*": >> >> This is a side effect of the way backtranslation works... > > OK, leaving TRR aside for the moment (I'm not sure I'd have done it that > way, but I think I follow your logic), I have some more problem cases for > you to consider (all using the default standard NCBI table 1). > > Most of these are 'unambiguous ambiguous codons' as you put it, and > I would agree using X when a more specific letter is possible isn't ideal > but isn't actually wrong. The "ATS" and related codons (see below) > however are simply wrong. > > -------------------------------------------------------------------------------------- > > TRA means TAA or TGA, which are both stop codons. Therefore TRA > should translate as a stop, not as an X: > > $ transeq asis:TAATGATRA -stdout -auto -osformat raw > **X Same on EMBOSS 6.3.1, shouldn't TRA translate as stop? 
> -------------------------------------------------------------------------------------- > > Now look at YTA, which means CTA or TTA which encode L, so > YTA should be L not X: > > $ transeq asis:CTATTAYTA -stdout -auto -osformat raw > LLX Same on EMBOSS 6.3.1, giving X instead of specific amino acid (i.e. YTA is an "unambiguous ambiguous codon" for L) > Likewise for YTG and YTR, and YTN. I haven't re-checked these. > -------------------------------------------------------------------------------------- > > Another example, ATW means ATA or ATT, which both translate as I, > so ATW should translate as I not X: > > $ transeq asis:ATAATTATW -stdout -auto -osformat raw > IIX Same on EMBOSS 6.3.1, giving X instead of specific amino acid (i.e. ATW is an "unambiguous ambiguous codon" for I) > -------------------------------------------------------------------------------------- > > Conversely, ATS which means ATC or ATG which translate as I and M. > Remember S means G or C. Therefore ATS should translate as X, and > not I: > > $ transeq asis:ATCATGATS -stdout -auto -osformat raw > IMI Same on EMBOSS 6.3.1, giving potentially wrong amino acid instead of X. > Likewise H means A, G or C, so ATH shows the same bug, as do some > other AT* codons: > > $ transeq asis:ATAATCATGATH -stdout -auto -osformat raw > IIMI > > [*** This one strikes me as a clear bug ***] Same on EMBOSS 6.3.1, giving potentially wrong amino acid instead of X. As I noted before, this list is only partial, and only for the standard table. I could compile a much longer list of oddities using the Biopython translation as a reference if you wanted. Regards, Peter C. 
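The rule argued for in the message above (report a specific residue only when every expansion of an ambiguous codon agrees, otherwise X) can be sketched in a few lines. This is a standalone illustration under stated assumptions, not the transeq implementation and not Biopython's API: the names TABLE, IUPAC, translate_codon and translate are invented here, the table is NCBI standard table 1, and the ambiguity codes are the standard IUPAC ones.

```python
from itertools import product

# NCBI standard genetic code (table 1), bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Return one residue if all expansions agree, otherwise 'X'."""
    options = [IUPAC[base] for base in codon.upper()]
    residues = {TABLE["".join(c)] for c in product(*options)}
    return residues.pop() if len(residues) == 1 else "X"


def translate(seq):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, usable, 3))


if __name__ == "__main__":
    for dna in ("TAATGATRA", "CTATTAYTA", "ATAATTATW", "ATCATGATS", "TRR"):
        print(dna, translate(dna))
```

Under this rule TRA, YTA and ATW come out as *, L and I, while ATS and TRR come out as X, which is the behaviour requested above. One caveat on the ATH case: in the IUPAC alphabet H stands for A, C or T (the code for A, C or G is V), so ATH expands to ATA, ATC and ATT and this sketch returns I for it; ATV would be the codon that mixes I and M.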
From pmr at ebi.ac.uk Thu Jul 22 12:28:00 2010 From: pmr at ebi.ac.uk (Peter Rice) Date: Thu, 22 Jul 2010 13:28:00 +0100 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com> <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk> Message-ID: <4C483950.7070004@ebi.ac.uk> On 22/07/10 12:22, Peter C. wrote: >> I truncated this for brevity. Here the quality string repeats ASCII 34, ASCII 33 >> (PHRED quality 1, quality 0) which is rather strange. The sequence appears >> to agree with the provided file pGEM_(ABI)_A01.seq >> >> Have I just been unlucky with the AB1 files that I have looked at? Thus >> far all the quality scores seem meaningless. There are two sets of quality scores in that file. Both are the alternating characters 1 and 0. Adding 33 gives the scores you see. Looks as though EMBOSS is just reporting what it finds. The file offset is the value returned by function ajSeqABIGetConfidOffset. It simply reads one byte from there for each base of sequence length. > I went back through my old emails, and see you had been testing with > http://www.appliedbiosystems.com/support/software_community/ab1_files.zip > (I had trouble downloading this with curl - Firefox worked). Looking at these > ABI files with seqret as FASTQ does seem to give meaningful quality scores. > Curious. It should look for a PCON tag in the file and pick up the second of two, or the first if there is only one. Can anyone on the list enlighten us further on what is intended for the quality scores in these example files?
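As a rough illustration of the behaviour described in this message (take the second PCON entry when there are two, otherwise the first, then read one byte per base and add 33), a sketch might look like the following. It is not the EMBOSS C code: parsing the ABI file is not shown, the PCON entries are assumed to be already extracted as bytes objects, and the names pick_confidences and to_sanger_quality are hypothetical.

```python
def pick_confidences(pcon_entries):
    """Choose which PCON entry to use.

    pcon_entries is assumed to be a list of bytes objects, one per PCON
    tag found in the file, in file order. The message only covers the
    cases of one or two entries; treating "two or more" like "two" is an
    assumption made here.
    """
    if not pcon_entries:
        raise ValueError("no PCON entries found")
    return pcon_entries[1] if len(pcon_entries) >= 2 else pcon_entries[0]


def to_sanger_quality(confidences, n_bases):
    """Read one byte per base and add 33 to get Sanger FASTQ characters."""
    return "".join(chr(value + 33) for value in confidences[:n_bases])


if __name__ == "__main__":
    # Two made-up PCON entries; the second alternates 1 and 0 like the pGEM file.
    entries = [bytes([20, 30, 40, 50]), bytes([1, 0, 1, 0])]
    print(to_sanger_quality(pick_confidences(entries), 4))
```

With the alternating 1 and 0 bytes this prints the same "! pattern seen in the seqret output, which is consistent with the tool simply reporting what the file contains.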
regards, Peter Rice From pmr at ebi.ac.uk Thu Jul 22 12:29:41 2010 From: pmr at ebi.ac.uk (Peter Rice) Date: Thu, 22 Jul 2010 13:29:41 +0100 Subject: [EMBOSS] transeq and ambiguous codons In-Reply-To: References: <320fb6e00907081450y2fd135e0x817f03c41357e297@mail.gmail.com> <4A55B395.4090301@ebi.ac.uk> <320fb6e00907100214v6799a217l507e089f635ef781@mail.gmail.com> Message-ID: <4C4839B5.2070208@ebi.ac.uk> On 22/07/10 12:36, Peter C. wrote: > Hi again, > > Now that I have installed the latest and greatest version, EMBOSS 6.3.1, > I'm revisiting some old issues I had with EMBOSS. In this case 'unambiguous > ambiguous codons' and other translation issues. Interesting. I'll take a look. We certainly intended to handle these cases. regards, Peter Rice From biopython at maubp.freeserve.co.uk Thu Jul 22 13:13:46 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 22 Jul 2010 14:13:46 +0100 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: <4C483950.7070004@ebi.ac.uk> References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com> <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk> <4C483950.7070004@ebi.ac.uk> Message-ID: On Thu, Jul 22, 2010 at 1:28 PM, Peter Rice wrote: > > On 22/07/10 12:22, Peter C. wrote: > >>> I truncated this for brevity. Here the quality string repeats ASCII 34, ASCII 33 >>> (PHRED quality 1, quality 0) which is rather strange. The sequence appears >>> to agree with the provided file pGEM_(ABI)_A01.seq >>> >>> Have I just been unlucky with the AB1 files that I have looked at? Thus >>> far all the quality scores seem meaningless. > > There are two sets of quality scores in that file. Both are the > alternating characters 1 and 0. Adding 33 gives the scores you see. > > Looks as though EMBOSS is just reporting what it finds.
> > The file offset is the value returned by function > ajSeqABIGetConfidOffset. It simply reads one byte from there for each > base of sequence length. Looks like that particular random example from the internet was just odd. >> I went back through my old emails, and see you had been testing with >> http://www.appliedbiosystems.com/support/software_community/ab1_files.zip >> (I had trouble downloading this with curl - Firefox worked). Looking at these >> ABI files with seqret as FASTQ does seem to give meaningful quality scores. >> Curious. > > It should look for a PCON tag in the file and pick up the second of two, > or the first if there is only one. > > Can anyone on the list enlighten us further on what is intended for the > quality scores in these example files? I have no idea about the pGEM example - I just found it with Google. I can send you a couple of our locally produced AB1 files off list if you wouldn't mind having a look at them. It may be that, however these are being generated, there simply are no useful scores inside. Peter From Bastien.Chevreux at dsm.com Thu Jul 22 14:42:40 2010 From: Bastien.Chevreux at dsm.com (Chevreux, Bastien) Date: Thu, 22 Jul 2010 16:42:40 +0200 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com> <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk> <4C483950.7070004@ebi.ac.uk> Message-ID: AFAIK ab1 files do not have phred quality scores included. At least they did not a couple of years ago. You need to mangle them through a basecaller (TraceTuner, phred, others) to get these scores. B. -- DSM Nutritional Products AG R&D Human Nutrition & Health Bioinformatics - Bldg. 203.4 / 188 P.O. Box 2676 CH-4002 Basel / Switzerland Tel.
+41 61 815 8264 > -----Original Message----- > From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open- > bio.org] On Behalf Of Peter > Sent: Donnerstag, 22. Juli 2010 15:14 > To: Peter Rice > Cc: emboss at lists.open-bio.org > Subject: Re: [EMBOSS] ABI to FASTQ with seqret > > On Thu, Jul 22, 2010 at 1:28 PM, Peter Rice wrote: > > > > On 22/07/10 12:22, Peter C. wrote: > > > >>> I truncated this for brevity. Here the quality string repeats ASCI 34, > ASCI 33 > >>> (PHRED quality 1, quality 0) which is rather strange. The sequence > appears > >>> to agree with the provided file pGEM_(ABI)_A01.seq > >>> > >>> Have I just been unlucky with the AB1 files that I have looked at? > Thus > >>> far all the quality scores seem meaningless. > > > > There are two sets of quality scores in that file. Both are the > > alternating characters 1 and 0. Adding 33 gives the scores you see. > > > > Looks as though EMBOSS is just reporting what it finds. > > > > The file offset is the value returned by function > > ajSeqABIGetConfidOffset. It simply reads one byte from there for each > > base of sequence length. > > Looks like that particular random example from the internet was just odd. > > >> I went back through my old emails, and see you had been testing with > >> > http://www.appliedbiosystems.com/support/software_community/ab1_files.zip > >> (I had trouble downloading this with curl - Firefox worked). Looking at > these > >> ABI files with seqret as FASTQ does seem to give meaningful quality > scores. > >> Curious. > > > > It should look for a PCON tag in the file and pick up the second of two, > > or the first if there is only one. > > > > Can anyone on the list enlighten us further on what is intended for the > > quality socrss in these example files? > > The gGEM example I have no idea - I just found it with Google. > > I can send you a couple of our locally produced AB1 files off list > if you wouldn't mind having a look at them. 
It may be that however > these are being generated there simply are no useful scores inside. > > Peter > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss DISCLAIMER : This e-mail is for the intended recipient only If you have received it by mistake please let us know by reply and then delete it from your system; access, disclosure, copying, distribution or reliance on any of it by anyone else is prohibited. If you as intended recipient have received this e-mail incorrectly, please notify the sender (via e-mail) immediately. From biopython at maubp.freeserve.co.uk Thu Jul 22 16:45:47 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Thu, 22 Jul 2010 17:45:47 +0100 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1FBE1.8030400@ebi.ac.uk> <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk> <4C483950.7070004@ebi.ac.uk> Message-ID: On Thu, Jul 22, 2010 at 5:33 PM, Tom Keller wrote: > Greetings, > The latest versions of the ABI basecaller does indeed give quality scores. I suspect the problem is my ABI files were not created using the latest ABI basecaller then. Do you have any more details (e.g. which version)? I've sent a couple of *.ab1 files off list to Peter Rice to confirm they really don't have quality scores. Tomorrow I will try and find out who to contact locally about the base calling, and what version of the base caller they have. 
Peter From kellert at ohsu.edu Thu Jul 22 16:33:43 2010 From: kellert at ohsu.edu (Tom Keller) Date: Thu, 22 Jul 2010 09:33:43 -0700 Subject: [EMBOSS] ABI to FASTQ with seqret In-Reply-To: References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com> <4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk> <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com> <4BD06999.2080502@ebi.ac.uk> <4BD080F6.3060409@ebi.ac.uk> <4C483950.7070004@ebi.ac.uk> Message-ID: Greetings, The latest versions of the ABI basecaller do indeed give quality scores. Nicola Vitacolonna wrote a Perl module that accesses the metadata encoded in the ab1 files.

    use strict;
    use warnings;
    use Bio::Trace::ABIF;

    my $abif = Bio::Trace::ABIF->new();
    $abif->open_abif('/Path/to/my/file.ab1');
    my $sequence = $abif->sequence();
    my @quality_values = $abif->quality_values();

    # FASTQ record: @name, sequence, "+", then one character per base
    print '@', $abif->sample_name(), "\n";
    print $sequence, "\n";
    print "+\n";
    # Sanger FASTQ encodes each PHRED score as chr(score + 33)
    print join("", map { chr($_ + 33) } @quality_values), "\n";

This will generate fastq-sanger format output. regards, Tom Thomas (Tom) Keller, PhD kellert at ohsu.edu 503.494.2442 6339b R Jones Hall (BSc/CROET) www.ohsu.edu/xd/research/research-cores/dna-analysis/ On Jul 22, 2010, at 7:42 AM, Chevreux, Bastien wrote: AFAIK ab1 files do not have phred quality scores included. At least they did not a couple of years ago. You need to mangle them through a basecaller (TraceTuner, phred, others) to get these scores. B. -- DSM Nutritional Products AG R&D Human Nutrition & Health Bioinformatics - Bldg. 203.4 / 188 P.O. Box 2676 CH-4002 Basel / Switzerland Tel. +41 61 815 8264 -----Original Message----- From: emboss-bounces at lists.open-bio.org [mailto:emboss-bounces at lists.open-bio.org] On Behalf Of Peter Sent: Thursday, 22 July 2010 15:14 To: Peter Rice Cc: emboss at lists.open-bio.org Subject: Re: [EMBOSS] ABI to FASTQ with seqret On Thu, Jul 22, 2010 at 1:28 PM, Peter Rice wrote: On 22/07/10 12:22, Peter C. wrote: I truncated this for brevity.
Here the quality string repeats ASCI 34, ASCI 33 (PHRED quality 1, quality 0) which is rather strange. The sequence appears to agree with the provided file pGEM_(ABI)_A01.seq Have I just been unlucky with the AB1 files that I have looked at? Thus far all the quality scores seem meaningless. There are two sets of quality scores in that file. Both are the alternating characters 1 and 0. Adding 33 gives the scores you see. Looks as though EMBOSS is just reporting what it finds. The file offset is the value returned by function ajSeqABIGetConfidOffset. It simply reads one byte from there for each base of sequence length. Looks like that particular random example from the internet was just odd. I went back through my old emails, and see you had been testing with http://www.appliedbiosystems.com/support/software_community/ab1_files.zip (I had trouble downloading this with curl - Firefox worked). Looking at these ABI files with seqret as FASTQ does seem to give meaningful quality scores. Curious. It should look for a PCON tag in the file and pick up the second of two, or the first if there is only one. Can anyone on the list enlighten us further on what is intended for the quality socrss in these example files? The gGEM example I have no idea - I just found it with Google. I can send you a couple of our locally produced AB1 files off list if you wouldn't mind having a look at them. It may be that however these are being generated there simply are no useful scores inside. Peter _______________________________________________ EMBOSS mailing list EMBOSS at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/emboss DISCLAIMER : This e-mail is for the intended recipient only If you have received it by mistake please let us know by reply and then delete it from your system; access, disclosure, copying, distribution or reliance on any of it by anyone else is prohibited. 
If you as intended recipient have received this e-mail incorrectly, please notify the sender (via e-mail) immediately. _______________________________________________ EMBOSS mailing list EMBOSS at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/emboss From korndoerfer at crelux.com Sat Jul 24 10:56:50 2010 From: korndoerfer at crelux.com (Ingo P. Korndoerfer) Date: Sat, 24 Jul 2010 12:56:50 +0200 Subject: [EMBOSS] emboss stand-in for fasta Message-ID: <4C4AC6F2.3000403@crelux.com> could anybody help me out with what to use as a stand-in for fasta ? fasta by itself is fine, but under windows there is no way to make fasta accept filenames with spaces. neither "" nor """" nor '' seem to alleviate the problem. so i was hoping emboss would have something (which would also save me having to install fasta on all of our pcs). what i need to do is run a sequence against an in house library and return me the top hit in alignment. 1000 thanks in advance greetings ingo -- CRELUX GmbH Dr. Ingo Korndoerfer Head of Crystallography Am Klopferspitz 19a 82152 Martinsried Germany Phone: +49 89 700760210 Fax: +49 89 700760222 korndoerfer at crelux.com www.crelux.com Amtsgericht München HRB 165552 - Managing Directors: Dr. Michael Schäffer, Dr. Ismail Moarefi This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden. Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. -------------- next part -------------- A non-text attachment was scrubbed...
Name: korndoerfer.vcf Type: text/x-vcard Size: 318 bytes Desc: not available URL: From biopython at maubp.freeserve.co.uk Sat Jul 24 12:05:11 2010 From: biopython at maubp.freeserve.co.uk (Peter) Date: Sat, 24 Jul 2010 13:05:11 +0100 Subject: [EMBOSS] emboss stand-in for fasta In-Reply-To: <4C4AC6F2.3000403@crelux.com> References: <4C4AC6F2.3000403@crelux.com> Message-ID: On Sat, Jul 24, 2010 at 11:56 AM, Ingo P. Korndoerfer wrote: > could anybody help me out with what to use as a stand-in for fasta ? > > fasta by itself is fine, but under windows there is no way to make fasta > accept filenames with spaces. neither "" nor """" nor '' seem to alleviate > the problem. You are talking about Bill Pearson's FASTA command line tools, right? Have you tried wrapping the filename with double quote characters, "like this.fasta"? That usually works on Windows. If not, I'd also try escaping the space with a backslash, "like\ this.fasta", just in case. > so i was hoping emboss would have something (which would also save > me having to install fasta on all of our pcs). > > what i need to do is run a sequence against an in house library and > return me the top hit in alignment. Sounds like BLAST might be a sensible choice to me - it works fine on Windows, although I'm not sure about filenames with spaces. Personally I avoid filenames with spaces - they just cause trouble. Can't you rename things before calling FASTA? e.g. write a wrapper script for FASTA to turn spaces into underscores? Peter From koenvanderdrift at gmail.com Mon Jul 26 12:30:20 2010 From: koenvanderdrift at gmail.com (Koen van der Drift) Date: Mon, 26 Jul 2010 08:30:20 -0400 Subject: [EMBOSS] libtool mismatch for 6.3.1 on OS X (fink) Message-ID: Hi, When updating the emboss package file for fink on OS X, I got an error about a libtool version mismatch. Fink already has version 2.2.10, but emboss requires 2.2.6b.
I tried recreating the libtool-related files as written in the install file, but that did not work; I still got the same error. I am not very familiar with libtool et al, so I could have made a mistake there. I am not at my Mac right now, so cannot give you more detailed info and error messages. But I was wondering if there is a way to compile emboss with libtool 2.2.10 instead of 2.2.6b? Thanks, - Koen. From ajb at ebi.ac.uk Mon Jul 26 13:14:26 2010 From: ajb at ebi.ac.uk (ajb at ebi.ac.uk) Date: Mon, 26 Jul 2010 14:14:26 +0100 (BST) Subject: [EMBOSS] libtool mismatch for 6.3.1 on OS X (fink) In-Reply-To: References: Message-ID: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> Hi Koen, Unless you are doing something in a non-standard way with the EMBOSS-x.y.z.gz file contents then you shouldn't get that message. It might appear, for example, if you have been hand-modifying some of the configuration files. If, on the other hand, you are using the CVS version then you will get such a message. Generally, if your libtool (2.x) is newer than that used by a given package then typing "autoreconf -fi" before doing any other configuration steps should fix up the libtool side. Note that doing so may not work for modified non-developer distributions. Please let us know if you are getting the error using just: gunzip EMBOSS-x.y.z.gz ./configure --prefix=/fu/bar make HTH Alan > Hi, > > When updating the emboss package file for fink on OS X, I got an error > about a libtool version mismatch. Fink already has version 2.2.10, but > emboss requires 2.2.6b. I tried recreating the libtool-related files > as written in the install file, but that did not work, still got the > same error. I am not very familiar with libtool et al, so I could > have made a mistake there. I am not at my Mac right now, so cannot > give you more detailed info and error messages. But I was wondering if > there is a way to compile emboss with libtool 2.2.10 instead of > 2.2.6b?
> > Thanks, > > - Koen. > _______________________________________________ > EMBOSS mailing list > EMBOSS at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/emboss > From koenvanderdrift at gmail.com Mon Jul 26 14:03:54 2010 From: koenvanderdrift at gmail.com (Koen van der Drift) Date: Mon, 26 Jul 2010 10:03:54 -0400 Subject: [EMBOSS] libtool mismatch for 6.3.1 on OS X (fink) In-Reply-To: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> References: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> Message-ID: Hi Alan, Thanks for the reply. As far as I know I am using the x.y.z release that I downloaded from the ftp site this weekend. I don't use the cvs version, and am not manipulating any of the configuration files. I'll try the autoreconf command as well as the configure/make commands outside of fink later today and will let you know how that went. - Koen. On Mon, Jul 26, 2010 at 9:14 AM, wrote: > Hi Koen, > > Unless you are doing something in a non-standard way with the > EMBOSS-x.y.z.gz file contents then you shouldn't get that message. It > might appear, for example, if you have been hand-modifying some of the > configuration files. > > If, on the other hand, you are using the CVS version then you will > get such a message. Generally, if your libtool (2.x) is newer > than that used by a given package then typing "autoreconf -fi" > before doing any other configuration steps should fix up > the libtool side. Note that doing so may not work for > modified non-developer distributions. > > > Please let us know if you are getting the error using just: > gunzip EMBOSS-x.y.z.gz > ./configure --prefix=/fu/bar > make > > HTH > > Alan > > > >> Hi, >> >> When updating the emboss package file for fink on OS X, I got an error >> about a libtool version mismatch. Fink already has version 2.2.10, but >> emboss requires 2.2.6b.
I tried recreating the libtool-related files >> as written in the install file, but that did not work, still got the >> same error. I am not very familiair with libtool et al, so I could >> have made a mistake there. I am not at my Mac right now, so cannot >> give you more detailed info and error messages. But I was wondering if >> there is a way to use compile emboss with libtool 2.2.10 instead of >> 2.2.6b? >> >> Thanks, >> >> - Koen. >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss >> > > > From koenvanderdrift at gmail.com Tue Jul 27 04:03:26 2010 From: koenvanderdrift at gmail.com (Koen van der Drift) Date: Tue, 27 Jul 2010 00:03:26 -0400 Subject: [EMBOSS] libtool mismatch for 6.3.1 on OS X (fink) In-Reply-To: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> References: <36262.86.26.12.63.1280150066.squirrel@webmail.ebi.ac.uk> Message-ID: <4401ABFA-39A5-4868-A638-6CA4A614148E@gmail.com> On Jul 26, 2010, at 9:14 AM, ajb at ebi.ac.uk wrote: > Generally, if your libtool (2.x) is newer > than that used by a given package then typing "autoreconf -fi" > before doing any other configuration steps should fix up > the libtool side. That indeed solved the issue. Thanks, - Koen. From dag at sonsorol.org Wed Jul 28 11:49:17 2010 From: dag at sonsorol.org (Chris Dagdigian) Date: Wed, 28 Jul 2010 07:49:17 -0400 Subject: [EMBOSS] accessing emboss ftp site In-Reply-To: References: <6F57C2D1-8927-420C-940C-C6EC0C62AABE@gmail.com> <4BD0DB9B.5050005@sonsorol.org> <4BD1C05D.5010109@sonsorol.org> Message-ID: <4C50193D.5060500@bioteam.net> I can't seem to replicate this at all with any of my available FTP clients or browser-based FTP clients. It works for me from both Mac OS X clients as well as a CentOS 5.4 based Linux system. Is there an FTP client / OS combo that seems in particular not to work? 
-Chris Here is one example: > dag-static:~ dag$ ftp emboss.open-bio.org > Connected to emboss.open-bio.org. > 220 (vsFTPd 2.0.1) > Name (emboss.open-bio.org:dag): anonymous > 331 Please specify the password. > Password: > 230 Login successful. > Remote system type is UNIX. > Using binary mode to transfer files. > ftp> dir > 229 Entering Extended Passive Mode (|||54472|) > 150 Here comes the directory listing. > drwxr-xr-x 8 14 50 4096 May 22 2006 pub > 226 Directory send OK. > ftp> cd pub/EMBOSS > 250 Directory successfully changed. > ftp> ls > 229 Entering Extended Passive Mode (|||38343|) > 150 Here comes the directory listing. > -rw-rw-r-- 1 501 503 389856 Jul 21 09:25 CBSTOOLS-1.0.0.tar.gz > -rw-rw-r-- 1 501 503 426218 Jul 21 09:25 DOMAINATRIX-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 441025 Jul 21 09:25 DOMALIGN-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 452787 Jul 21 09:25 DOMSEARCH-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 23572243 Jul 19 13:41 EMBOSS-6.3.1.tar.gz > lrwxrwxrwx 1 501 503 19 Jul 19 13:42 EMBOSS-latest.tar.gz -> EMBOSS-6.3.1.tar.gz > -rw-rw-r-- 1 501 503 373798 Jul 21 09:25 EMNU-1.05.tar.gz > -rw-rw-r-- 1 501 503 415096 Jul 21 09:25 ESIM4-1.0.0.tar.gz > -rw-rw-r-- 1 501 503 569581 Jul 21 09:25 HMMER-2.3.2.tar.gz > -rw-rw-r-- 1 501 503 350791 Jul 21 09:25 IPRSCAN-4.3.1.tar.gz > drwxrwsr-x 7 501 503 4096 Feb 01 2006 Jemboss > -rw-rw-r-- 1 501 503 513418 Jul 21 09:25 MEMENEW-4.0.0.tar.gz > -rw-rw-r-- 1 501 503 823636 Jul 21 09:25 MIRA-2.8.2.tar.gz > -rw-rw-r-- 1 501 503 435315 Jul 21 09:25 MSE-3.0.0.tar.gz > -rw-rw-r-- 1 501 503 328540 Jul 21 09:25 MYEMBOSS-6.3.0.tar.gz > -rw-rw-r-- 1 501 503 359488 Jul 21 09:25 MYEMBOSSDEMO-6.3.0.tar.gz > -rw-rw-r-- 1 501 503 1667760 Jul 21 09:25 PHYLIPNEW-3.69.tar.gz > -rw-rw-r-- 1 501 503 571008 Jul 21 09:25 SIGNATURE-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 531604 Jul 21 09:25 STRUCTURE-0.1.0.tar.gz > -rw-rw-r-- 1 501 503 386287 Jul 21 09:25 TOPO-2.0.0.tar.gz > -rw-rw-r-- 1 501 503 652433 Jul 21 09:25 VIENNA-1.7.2.tar.gz > drwxrwsr-x 
3 522 503 4096 Aug 21 2006 contrib > drwxrwsr-x 2 501 503 4096 Nov 11 2005 doc > drwxrwsr-x 3 501 503 4096 Jul 21 09:25 fixes > drwxrwsr-x 14 501 503 4096 Jul 19 13:39 old > drwxrwsr-x 2 501 503 4096 Jul 06 2005 tutorials > drwxrwsr-x 4 501 503 4096 Jul 19 13:36 windows > 226 Directory send OK. > ftp> Hamish McWilliam wrote: > Hi Chris, > > I'm also seeing problems with the FTP site, but using Mirror rather > than Transmit, it looks like the server does not like options being > specified to the LIST command: > > Scanning remote directory /pub/EMBOSS > ---> CWD /pub/EMBOSS > 250 Directory successfully changed. > ---> TYPE A > 200 Switching to ASCII mode. > ---> PORT 172,21,22,1,171,245 > 200 PORT command successful. Consider using PASV. > ---> PASV > 227 Entering Passive Mode (208,94,50,58,104,178) > ---> LIST -lat > timed out > Cannot get remote directory listing because: timed out > Cannot get remote directory details (/pub/EMBOSS) > disconnecting from emboss.open-bio.org > > Trying it with a command line client I get the response to hang if I > try using any options to ls or dir, without options they are fine. > > All the best, > > Hamish > > On 29 May 2010 03:44, Koen van der Drift wrote: >> Hi Chris, >> >> Did you have a chance to look at this? Just tried again, and Transmit still >> won't let me access the emboss ftp site. >> >> Thanks, >> >> - Koen. >> >> >> On Apr 23, 2010, at 11:44 AM, Chris Dagdigian wrote: >> >>> In the last few months the open-bio.org servers switched datacenters, IP >>> addresses and firewall/IDS appliances. Lots of juicy things to look at and >>> debug. >>> >>> Koen - if you have a chance can you send me the IP address that you are >>> using to connect from? I might be able to find some relevant log entries >>> with that info. >>> >>> -Chris >>> >>> >>> >>> Koen van der Drift wrote: >>>> Just for the record, it used to work with Transmit, this is only from >>>> the last few months. >>>> >>>> - Koen. 
>>>> >>>> On Thu, Apr 22, 2010 at 7:28 PM, Chris Dagdigian >>>> wrote: >>>>> Might be an issue with the Juniper Netscreen firewall/IDS security >>>>> appliance >>>>> that sits upstream of the EMBOSS FTP server. I'll take a look at the >>>>> security logs and alerts. >>>>> >>>>> -Chris >>>>> >>>>> >>>>> Koen van der Drift wrote: >>>>>> Hi, >>>>>> >>>>>> For a while now I am unable to access the emboss ftp site using the OS >>>>>> X >>>>>> client Transmit. Loggin in works fine, but it chokes on the LIST >>>>>> command. I have no problems accessing it from the command line. I have >>>>>> added the output from Transmit below. I don't know if this is a >>>>>> Transmit >>>>>> or emboss issue, but just wanted to let you know. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> - Koen. >>>>>> >>>>>> >>>>>> Transmit 3.6.9 Session Transcript >>>>>> LibNcFTP 3.2.1 (August 13, 2007) compiled for UNIX >>>>>> Uname: Darwin|exile.local|9.8.0|Darwin Kernel Version 9.8.0: Wed Jul 15 >>>>>> 16:57:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_PPC|Power Macintosh >>>>>> 220: (vsFTPd 2.0.1) >>>>>> Connected to emboss.open-bio.org. >>>>>> Cmd: USER anonymous >>>>>> 331: Please specify the password. >>>>>> Cmd: PASS NcFTP@ >>>>>> 230: Login successful. >>>>>> Cmd: TYPE A >>>>>> 200: Switching to ASCII mode. >>>>>> Logged in to emboss.open-bio.org as anonymous. >>>>>> Cmd: SYST >>>>>> 215: UNIX Type: L8 >>>>>> Cmd: PWD >>>>>> 257: "/" >>>>>> Cmd: CWD /pub/EMBOSS/fixes >>>>>> 250: Directory successfully changed. >>>>>> Cmd: PWD >>>>>> 257: "/pub/EMBOSS/fixes" >>>>>> Cmd: PASV >>>>>> 227: Entering Passive Mode (208,94,50,58,83,232) >>>>>> Cmd: LIST -a >>>>>> Could not read reply from control connection -- timed out. (SReadline >>>>>> 1) >>>>>> 220: (vsFTPd 2.0.1) >>>>>> Connected to emboss.open-bio.org. >>>>>> Cmd: USER anonymous >>>>>> 331: Please specify the password. >>>>>> Cmd: PASS NcFTP@ >>>>>> 230: Login successful. >>>>>> Logged in to emboss.open-bio.org as anonymous. 
>>>>>> Cmd: SYST >>>>>> 215: UNIX Type: L8 >>>>>> Cmd: PWD >>>>>> 257: "/" >>>>>> Cmd: CWD /pub/EMBOSS/fixes >>>>>> 250: Directory successfully changed. >>>>>> Cmd: PWD >>>>>> 257: "/pub/EMBOSS/fixes" >>>>>> Cmd: PASV >>>>>> 227: Entering Passive Mode (208,94,50,58,222,100) >>>>>> Cmd: LIST -a >>>>>> Could not read reply from control connection -- timed out. (SReadline >>>>>> 1) >>>>>> >>>>>> _______________________________________________ >>>>>> EMBOSS mailing list >>>>>> EMBOSS at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/emboss >> _______________________________________________ >> EMBOSS mailing list >> EMBOSS at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/emboss >> > > >