From ajb at ebi.ac.uk  Tue Mar  2 05:44:51 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Tue, 2 Mar 2010 10:44:51 -0000 (UTC)
Subject: [EMBOSS] EMBOSS-6.2.0 patch 1-18 available
Message-ID: <34743.86.26.12.63.1267526691.squirrel@webmail.ebi.ac.uk>

The first patch file for the EMBOSS-6.2.0 release is now
available at:

    ftp://emboss.open-bio.org/pub/EMBOSS/fixes/patches/patch-1-18.gz

Discrete files used to create the above patch are held in the
directory:

    ftp://emboss.open-bio.org/pub/EMBOSS/fixes/

The file README.fixes in the same directory describes what the
fixes address and is attached to this email for convenience.

A new mEMBOSS incorporating all relevant changes from the above
is available as:

    ftp://emboss.open-bio.org/pub/EMBOSS/windows/mEMBOSS-6.2.0.2-setup.exe


Alan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: README.fixes
Type: application/octet-stream
Size: 5674 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/emboss/attachments/20100302/c41b6753/attachment.obj>

From ajb at ebi.ac.uk  Tue Mar  2 06:22:58 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Tue, 2 Mar 2010 11:22:58 -0000 (UTC)
Subject: [EMBOSS] mEMBOSS-6.2.0.1 reinstated
Message-ID: <56161.86.26.12.63.1267528978.squirrel@webmail.ebi.ac.uk>

A stability problem has been noticed with mEMBOSS-6.2.0.2 on the ftp
server. As a result we've reinstated mEMBOSS-6.2.0.1.

A further announcement will be posted when things are resolved.
Apologies for any inconvenience.

Alan


From ajb at ebi.ac.uk  Tue Mar  2 11:03:30 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Tue, 2 Mar 2010 16:03:30 -0000 (UTC)
Subject: [EMBOSS] mEMBOSS 6.2.0.2 re-released
Message-ID: <57834.86.26.12.63.1267545810.squirrel@webmail.ebi.ac.uk>

The stability issues have been resolved. mEMBOSS 6.2.0.2 is
now re-released as:

ftp://emboss.open-bio.org/pub/EMBOSS/windows/beta/mEMBOSS-6.2.0.2-setup.exe

Note that this release is based on the current developers' CVS
code and, as such, has not had the rigorous testing performed
for major releases (or for patches to the UNIX version of EMBOSS).
We are providing it as a beta release in the event it may be useful.

Alan


From michael.watson at bbsrc.ac.uk  Fri Mar  5 09:26:06 2010
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 5 Mar 2010 14:26:06 +0000
Subject: [EMBOSS] EMBASSY/PHYLIP Question: which option do I use to
 bootstrap with FPROTPARS
Message-ID: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>

Hello

Perhaps I am missing something fundamental, but I'd like to draw a phylogenetic tree of some protein sequences I have, and use bootstrapping for confidence.

The phylip help pages seem to suggest I can do this by setting the resampling option to "bootstrap" but I cannot find this option in FPROTPARS.

I'm using EMBOSS-6.1.0 and PHYLIPNEW-3.68

Many thanks
Mick


From pmr at ebi.ac.uk  Fri Mar  5 10:40:54 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 05 Mar 2010 15:40:54 +0000
Subject: [EMBOSS] EMBASSY/PHYLIP Question: which option do I use to
 bootstrap with FPROTPARS
In-Reply-To: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4B912606.9080906@ebi.ac.uk>

Dear Michael,

On 05/03/10 14:26, michael watson (IAH-C) wrote:
> Hello
> 
> Perhaps I am missing something fundamental, but I'd like to draw a phylogenetic tree of some protein sequences I have, and use bootstrapping for confidence.
> 
> The phylip help pages seem to suggest I can do this by setting the resampling option to "bootstrap" but I cannot find this option in FPROTPARS.
> 
> I'm using EMBOSS-6.1.0 and PHYLIPNEW-3.68

My understanding of the phylip documentation is that you use (EMBOSS
name) fseqboot to generate the bootstrap resampling of your original
sequences and then use fprotpars to analyse the resulting output.

In the original phylip package the seqboot application bootstraps
several types of data. In the EMBASSY package, to make the input types
clearer, we split it into fseqboot, fseqbootall, fdiscboot, ffreqboot
and frestboot.

Hope that helps,

Peter Rice
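
The two-step workflow described in this reply can be sketched as follows.
This is a minimal sketch, not a tested recipe: the file names are
hypothetical, and the qualifier names (-sequence, -outfile, -test, -reps,
-outtreefile) are assumptions that should be checked against
"fseqboot -help" and "fprotpars -help" on your installation. The script
only builds and prints the command lines; it does not run them.

```python
import shlex


def bootstrap_commands(alignment, reps=100):
    """Build the fseqboot and fprotpars command lines for a bootstrap run.

    Qualifier names are assumptions; verify them with -help before use.
    """
    resampled = alignment + ".fseqboot"  # hypothetical output name
    fseqboot = [
        "fseqboot",
        "-sequence", alignment,
        "-outfile", resampled,
        "-test", "b",            # assumed code for bootstrap resampling
        "-reps", str(reps),
    ]
    fprotpars = [
        "fprotpars",
        "-sequence", resampled,  # the resampled data sets from step 1
        "-outfile", alignment + ".fprotpars",
        "-outtreefile", alignment + ".treefile",
    ]
    return [fseqboot, fprotpars]


if __name__ == "__main__":
    for argv in bootstrap_commands("proteins.phy"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

In PHYLIP the replicate trees are normally summarised afterwards with
consense (fconsense in EMBASSY) to obtain bootstrap support values; that
step is left out of the sketch.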


From jeedward at yahoo.com  Fri Mar  5 19:35:35 2010
From: jeedward at yahoo.com (John Edward)
Date: Fri, 5 Mar 2010 16:35:35 -0800 (PST)
Subject: [EMBOSS] Call for papers: BCBGC-10, USA, July 2010
Message-ID: <915762.86810.qm@web45916.mail.sp1.yahoo.com>

It would be highly appreciated if you could share this announcement with your
colleagues, students and individuals whose research is in bioinformatics,
computational biology, genomics, data-mining, and related areas.

Call for papers: BCBGC-10, USA, July 2010

The 2010 International Conference on Bioinformatics, Computational Biology,
Genomics and Chemoinformatics (BCBGC-10) (website:
http://www.PromoteResearch.org ) will be held during 12-14 of July 2010 in
Orlando, FL, USA. BCBGC is an important event in the areas of
bioinformatics, computational biology, genomics and chemoinformatics and
focuses on all areas related to the conference.

The conference will be held at the same time and location where several other major
international conferences will be taking place. The conference will be held as
part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held during
July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF is to
promote research and developmental activities in computer science, information
technology, control engineering, and related fields. Another goal is to promote
the dissemination of research to a multidisciplinary audience and to facilitate
communication among researchers, developers, practitioners in different fields.
The following conferences are planned to be organized as part of MULTICONF-10.

* International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10)
* International Conference on Automation, Robotics and Control Systems (ARCS-10)
* International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10)
* International Conference on Computer Communications and Networks (CCN-10)
* International Conference on Enterprise Information Systems and Web Technologies (EISWT-10)
* International Conference on High Performance Computing Systems (HPCS-10)
* International Conference on Information Security and Privacy (ISP-10)
* International Conference on Image and Video Processing and Computer Vision (IVPCV-10)
* International Conference on Software Engineering Theory and Practice (SETP-10)
* International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10)

MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a
full-service resort that puts you in the middle of the fun!
Located 1/2 block south of the famed International Drive, the hotel is just
minutes from great entertainment like Walt Disney World Resort, Universal
Studios and Sea World Orlando. Guests can enjoy free scheduled transportation
to these theme parks, as well as spacious accommodations, outdoor pools and
on-site dining, all situated on 10 tropically landscaped acres. Here, guests
can experience a full-service resort with discount hotel pricing in Orlando.

We invite draft paper submissions. Please see the website
http://www.PromoteResearch.org for more details.

Sincerely
John Edward


From mbk0asis at gmail.com  Sun Mar  7 10:05:27 2010
From: mbk0asis at gmail.com (Byungkuk Min)
Date: Sun, 7 Mar 2010 07:05:27 -0800
Subject: [EMBOSS] A question about 'showdb'
Message-ID: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>

When I typed 'showdb', no list of databases appeared like the example in the
tutorial.
How can I set up the databases?

xxxxx at ubuntu:~$ showdb
Displays information on configured databases
# Name         Type  ID  Qry All Comment
# ============ ==== ==  === === =======

From pmr at ebi.ac.uk  Sun Mar  7 17:28:23 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Sun, 07 Mar 2010 22:28:23 +0000
Subject: [EMBOSS] A question about 'showdb'
In-Reply-To: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>
References: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>
Message-ID: <4B942887.7090608@ebi.ac.uk>

Dear Byungkuk,

On 07/03/2010 15:05, Byungkuk Min wrote:
> When I typed 'showdb', no list of databases appeared like the example in the
> tutorial.
> How can I set up the databases?

The databases are defined in a file emboss.default in the share/EMBOSS/
directory where EMBOSS is installed.

In that directory you will find a file emboss.default.template with 
example database definitions.

Some databases are remote (e.g. method: "srs") and can be defined and used.

Others need local data files and a local index (method: emboss and
method: emblcd) created by the dbx* and dbi* programs in EMBOSS.

Let us know if you need any more help. We are working on more detailed
instructions which will appear on the EMBOSS website.

regards,

Peter Rice
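
A database definition of the kind described above can be sketched as
follows. The script renders one stanza in the general style of
emboss.default.template. The database name, paths and values are
hypothetical, and the field names (type, method, format, dir, file,
comment) are assumptions to be checked against the template shipped with
your installation.

```python
def render_db(name, **fields):
    """Render a DB stanza in the style of emboss.default.template."""
    lines = ["DB %s [" % name]
    for key, value in fields.items():
        # Quote values that contain spaces, such as comments.
        if " " in str(value):
            value = '"%s"' % value
        lines.append("    %s: %s" % (key, value))
    lines.append("]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_db(
        "myprot",                    # hypothetical database name
        type="P",                    # assumed: P = protein, N = nucleotide
        method="emblcd",             # local index built by a dbi* program
        format="fasta",
        dir="/data/myprot",          # hypothetical location of data and index
        file="*.fasta",
        comment="Local protein set",
    ))
```

Once a stanza like this is in place and the index has been built, the
database would be expected to appear in the showdb listing.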

From shrish at ccmb.res.in  Mon Mar  8 03:58:26 2010
From: shrish at ccmb.res.in (Shrish Tiwari)
Date: Mon, 8 Mar 2010 14:28:26 +0530 (IST)
Subject: [EMBOSS] (no subject)
Message-ID: <777482836.160381268038706946.JavaMail.root@127.0.0.1>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <http://lists.open-bio.org/pipermail/emboss/attachments/20100308/6db40829/attachment.pl>

From simon.andrews at bbsrc.ac.uk  Mon Mar  8 08:53:06 2010
From: simon.andrews at bbsrc.ac.uk (Simon Andrews)
Date: Mon, 8 Mar 2010 13:53:06 +0000
Subject: [EMBOSS] Data for Jasextract
Message-ID: <ED344C56-8DB5-4A52-A23D-3CC6393EACFB@bbsrc.ac.uk>

I've been trying to use EMBOSS to search using the Jaspar database  
(jaspextract /  jaspscan), but with no success.

I think the problem is coming from jaspextract.  TFM says:

Input file format

    The input files are the uncompressed and extracted JASPAR_CORE.tgz,
    JASPAR_FAM.tgz and JASPAR_PHYLOFACTS.tgz files provided in the JASPAR
    MatrixDir download directory of the JASPAR homepage
    (http://jaspar.genereg.net).


..but there are no files named that way (the only google hit to those  
names is the jaspextract manpage!).

The main jaspar archive file is Archive.zip.  If I unzip this and run  
jaspextract on the expanded directory it runs with no errors or  
warnings, but if I subsequently try to run jaspscan I get an error  
saying:

Warning: Matrix file(s) *.pfm not found

    EMBOSS An error in jaspscan.c at line 870:
Matrix list file JASPAR_CORE/matrix_list.txt not found

I've tried loads of different subdirectories within the JASPAR  
database dump, but can't find anything which actually puts data into  
the appropriate EMBOSS data directories.

Can anyone else make this work?

Thanks

Simon.

From ajb at ebi.ac.uk  Mon Mar  8 11:07:48 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Mon, 8 Mar 2010 16:07:48 -0000 (UTC)
Subject: [EMBOSS] Data for Jasextract
In-Reply-To: <ED344C56-8DB5-4A52-A23D-3CC6393EACFB@bbsrc.ac.uk>
References: <ED344C56-8DB5-4A52-A23D-3CC6393EACFB@bbsrc.ac.uk>
Message-ID: <45660.86.26.12.63.1268064468.squirrel@webmail.ebi.ac.uk>

Hello Simon,

The Jaspar people altered the structure and content of their
ftp  server recently. There is a patch in the fixes/patches
area of the EMBOSS ftp server which updates jaspextract and
jaspscan appropriately. The README.fixes file in the 'fixes'
directory explains further.

HTH

Alan


> I've been trying to use EMBOSS to search using the Jaspar database
> (jaspextract /  jaspscan), but with no success.
>
> I think the problem is coming from jaspextract.  TFM says:
>
> Input file format
>
>     The input files are the uncompressed and extracted JASPAR_CORE.tgz,
>     JASPAR_FAM.tgz and JASPAR_PHYLOFACTS.tgz files provided in the
> JASPAR
>     MatrixDir download directory of the JASPAR homepage
>     (http://jaspar.genereg.net).
>
>
> ..but there are no files named that way (the only google hit to those
> names is the jaspextract manpage!).
>
> The main jaspar archive file is Archive.zip.  If I unzip this and run
> jaspextract on the expanded directory it runs with no errors or
> warnings, but if I subsequently try to run jaspscan I get an error
> saying:
>
> Warning: Matrix file(s) *.pfm not found
>
>     EMBOSS An error in jaspscan.c at line 870:
> Matrix list file JASPAR_CORE/matrix_list.txt not found
>
> I've tried loads of different subdirectories within the JASPAR
> database dump, but can't find anything which actually puts data into
> the appropriate EMBOSS data directories.
>
> Can anyone else make this work?
>
> Thanks
>
> Simon.
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss
>


From stephen.taylor at imm.ox.ac.uk  Tue Mar  9 09:20:38 2010
From: stephen.taylor at imm.ox.ac.uk (Steve Taylor)
Date: Tue, 09 Mar 2010 14:20:38 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
Message-ID: <4B965936.6030102@imm.ox.ac.uk>

Hi,

I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone on this list know if there is planned support for EMBOSS 6? We have found using our local installation of EMBOSS 6 that a few tools don't work. Is there a person who maintains the Galaxy/EMBOSS configuration?

I know this is *really* a Galaxy question but I posted this to the Galaxy list but haven't had any response so far. :-)

Thanks,

Steve

From pmr at ebi.ac.uk  Tue Mar  9 10:29:59 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 09 Mar 2010 15:29:59 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B965936.6030102@imm.ox.ac.uk>
References: <4B965936.6030102@imm.ox.ac.uk>
Message-ID: <4B966977.9050101@ebi.ac.uk>

On 09/03/2010 14:20, Steve Taylor wrote:
> Hi,
>
> I notice in the Galaxy distribution there is support for EMBOSS 5. Does
> anyone on this list know if there planned support for EMBOSS 6? We have
> found using our local installation of EMBOSS 6 that a few tools don't
> work. Is there a person who maintains the Galaxy/EMBOSS configuration?
>
> I know this is *really* a Galaxy question but I posted this to the
> Galaxy list but haven't had any response so far. :-)

I am looking into it and will be going to the Galaxy Developers meeting in May.

Any other interest among the EMBOSS users?

regards,

Peter Rice

From hrh at fmi.ch  Tue Mar  9 10:40:19 2010
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Tue, 09 Mar 2010 16:40:19 +0100
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B965936.6030102@imm.ox.ac.uk>
Message-ID: <C7BC2A73.7121%hrh@fmi.ch>

Steve

> I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone
> on this list know if there planned support for EMBOSS 6? We have found using
> our local installation of EMBOSS 6 that a few tools don't work.

which tools don't work?

We are using most of the EMBOSS 6.1.0 tools in our local Galaxy server, and I
don't remember running into problems with the Galaxy EMBOSS 5 tool
definitions (i.e. emboss_*.xml files).

I haven't checked EMBOSS 6.2.0, but I guess there are just a few additions
(based on the release notes) you need to make. Generally speaking: EMBOSS
tools are pretty stable.


Maybe if you provide a list of problems/incompatibilities and resend this to
the galaxy mailing list, you will get a response...

Hans


> Is there a person who maintains the Galaxy/EMBOSS configuration?
> 
> I know this is *really* a Galaxy question but I posted this to the Galaxy list
> but haven't had any response so far. :-)
> 
> Thanks,
> 
> Steve


From stephen.taylor at imm.ox.ac.uk  Tue Mar  9 11:30:08 2010
From: stephen.taylor at imm.ox.ac.uk (Steve Taylor)
Date: Tue, 09 Mar 2010 16:30:08 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <C7BC2A73.7121%hrh@fmi.ch>
References: <C7BC2A73.7121%hrh@fmi.ch>
Message-ID: <4B967790.8000609@imm.ox.ac.uk>

Hi Hans,

>> I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone
>> on this list know if there planned support for EMBOSS 6? We have found using
>> our local installation of EMBOSS 6 that a few tools don't work.
> 
> which tools don't work?
> 
> We are using most of the EMBOSS 6.1.0 tools in or local galaxy server. And I
> don't remember running into problems with the galaxy emboss 5 tool
> definitions (ie emboss_*.xml files).
> 

Ok. That's good to know. 

> I haven't checked EMBOSS 6.2.0, but I guess there are just a few additions
> (based on the release notes) you need to make. Generally speaking: EMBOSS
> tools are pretty stable.
> 

To give a bit of history, we are fairly new to using Galaxy and previously we used EMBOSS Explorer as our main web interface. With this we found when EMBOSS releases changed lots of things broke, so we ended up staying with EMBOSS v3. I am hoping this is not going to be true for EMBOSS/Galaxy because they are both great tools and I want them to be used routinely without us/users worrying if things are going to break, especially if they are going to be incorporated routinely into workflows. 

> Maybe if you provide a list of problems/incompatibilities and resend this to
> the galaxy mailing list, you will get a response...
> 


Maybe I was a bit unlucky because I tried a few more tools and generally things are ok. A couple of minor issues I came across:

* antigenic

(ran but gave an error)

 14: antigenic on data 13
An error occurred running this job: Error: Unable to read feature tags data file 'Etags.gff3protein'

* etandem produced two outputs (not exactly an error but I wondered if it was a misconfiguration in the xml)

there may be more ...

Your email answers my question: in general EMBOSS 6 is compatible with EMBOSS 5, but some minor tweaks may be required for certain tools. It would be great if some form of unit testing could be employed to check compatibility with new builds.

Thanks,

Steve


> Hans
> 
> 
>> Is there a person who maintains the Galaxy/EMBOSS configuration?
>>
>> I know this is *really* a Galaxy question but I posted this to the Galaxy list
>> but haven't had any response so far. :-)
>>
>> Thanks,
>>
>> Steve
> 


From pmr at ebi.ac.uk  Tue Mar  9 12:11:30 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 09 Mar 2010 17:11:30 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B967790.8000609@imm.ox.ac.uk>
References: <C7BC2A73.7121%hrh@fmi.ch> <4B967790.8000609@imm.ox.ac.uk>
Message-ID: <4B968142.9080905@ebi.ac.uk>

On 09/03/2010 16:30, Steve Taylor wrote:

> To give a bit of history, we are fairly new to using Galaxy and
> previously we used EMBOSS Explorer as our main web interface. With this
> we found when EMBOSS releases changed lots of things broke, so we ended
> up staying with EMBOSS v3. I am hoping this is not going to be true for
> EMBOSS/Galaxy because they are both great tools and I want them to be
> used routinely without us/users worrying if things are going to break,
> especially if they are going to be incorporated routinely into workflows.
>> Maybe if you provide a list of problems/incompatibilities and resend
>> this to
>> the galaxy mailing list, you will get a response...

Yes please do ... I am on the Galaxy list too.

> 14: antigenic on data 13
> An error occurred running this job: Error: Unable to read feature tags
> data file 'Etags.gff3protein'

Could be you have more than one version of EMBOSS running. That looks like
a pure EMBOSS error, suggesting EMBOSS 6 is trying to use EMBOSS 5's data
directory. Should be fixable by copying the missing file.

Peter


From n.binns at ed.ac.uk  Tue Mar  9 11:59:13 2010
From: n.binns at ed.ac.uk (Nigel Binns)
Date: Tue, 09 Mar 2010 16:59:13 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B967790.8000609@imm.ox.ac.uk>
References: <C7BC2A73.7121%hrh@fmi.ch> <4B967790.8000609@imm.ox.ac.uk>
Message-ID: <4B967E61.4060502@ed.ac.uk>

I'm running EMBOSS 6.2.0 (and Jemboss via JWS) with the latest patch
applied and EMBOSS Explorer (v2.2.0) without any problems. The only
issue I've experienced is that the link to the EMBOSS help files is
broken. The workaround is to copy the EMBOSS HTML help files to the
location EE expects to find them, which is not where the current
release of EMBOSS places them :-)

Nigel


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


From biopython at maubp.freeserve.co.uk  Fri Mar 12 07:07:48 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 12 Mar 2010 12:07:48 +0000
Subject: [EMBOSS] Broken links on Emboss webpages
Message-ID: <320fb6e01003120407u794bf5e6ue6c84522ac588c91@mail.gmail.com>

Hi,

I was just looking for the EMBOSS EMBASSY documentation for the
PHYLIPNEW packages, and noticed they are missing from this page:
http://emboss.sourceforge.net/embassy/

Perhaps this should redirect to the latest release? i.e.
http://emboss.sourceforge.net/apps/release/6.2/embassy/index.html

I also found the links on this page seem to be broken:
http://emboss.sourceforge.net/apps/release/6.2/emboss/apps/phylogeny_molecular_sequence_group.html

Regards,

Peter

From Perdeep.Mehta at STJUDE.ORG  Fri Mar 12 10:22:35 2010
From: Perdeep.Mehta at STJUDE.ORG (Mehta, Perdeep)
Date: Fri, 12 Mar 2010 09:22:35 -0600
Subject: [EMBOSS] Antwort:  restrict
In-Reply-To: <OF27EDD1C2.7567862F-ONC12576D3.0024E80E-C12576D3.002611AB@bayer.de>
References: <6EAE916704479E4BB6AB5A133BA224F728A54626D5@SJMEMXMBS11.stjude.sjcrh.local>
	<OF27EDD1C2.7567862F-ONC12576D3.0024E80E-C12576D3.002611AB@bayer.de>
Message-ID: <6EAE916704479E4BB6AB5A133BA224F728A5462746@SJMEMXMBS11.stjude.sjcrh.local>

Hi List,

We now have the Rebase locally installed.  Strangely, I see a new error;


"Input nucleotide sequence(s): chr10.fa Uncaught exception:  Allocation failed, insufficient memory available, raised at ajstr.c:2170"


The above example is just a test with chromosome 10; I plan to run either the whole genome (all 23 chromosomes) or each of the 23 chromosomes separately. I have tested running on a queue with more memory using the following command:


 qsub -q normal-ib /path/restrict -sequence chr10.fa -enzymes hinfI -fragments -outfile chr10.res


It then threw the following error:


"Unable to run job: Script length does not match declared length."

It may not be a restrict problem; I am just mentioning it here to see if anyone else has seen such a problem.

Any guesses?

Thanks,
perdeep
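
One possible cause of the second error, offered only as a guess: if the
queueing system is Grid Engine, qsub treats its argument as a shell script
unless told otherwise, and submitting a compiled program directly can fail
in this way. The sketch below builds a submission that marks the command as
a binary with "-b y" and adds the EMBOSS -auto qualifier so that restrict
does not prompt for input in batch mode. The scheduler, the helper name and
the chromosome file naming are assumptions; the script only prints the
command lines.

```python
import shlex


def restrict_job(fasta, enzyme="hinfI", queue="normal-ib",
                 restrict="/path/restrict"):
    """Build a qsub command line that runs restrict on one FASTA file."""
    outfile = fasta.rsplit(".", 1)[0] + ".res"
    return [
        "qsub", "-q", queue,
        "-b", "y",   # assumption: Grid Engine; submit a binary, not a script
        "-cwd",      # run in the current directory so relative paths resolve
        restrict,
        "-sequence", fasta,
        "-enzymes", enzyme,
        "-fragments",
        "-outfile", outfile,
        "-auto",     # EMBOSS: take defaults rather than prompting
    ]


if __name__ == "__main__":
    # Hypothetical file naming: chr1.fa ... chr22.fa, chrX.fa
    names = [str(n) for n in range(1, 23)] + ["X"]
    for name in names:
        argv = restrict_job("chr%s.fa" % name)
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Submitting one job per chromosome keeps each run's memory requirement
bounded by the largest chromosome rather than the whole genome.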


From: david.bauer at bayerhealthcare.com [mailto:david.bauer at bayerhealthcare.com]
Sent: Tuesday, February 23, 2010 12:56 AM
To: Mehta, Perdeep
Cc: emboss; emboss-bounces at lists.open-bio.org
Subject: Antwort: [EMBOSS] restrict


Hi,

emboss-bounces at lists.open-bio.org wrote on 23/02/2010 00:21:38:

> I have a few questions on EMBOSS restriction analysis and will
> appreciate any ideas or thoughts on these.
>
> 1. What Rebase file we need to download to get "restrict" working? I
> tried but there are files with different formats.

Go to the /pub/rebase dir on ftp.neb.com.
Download the withrefm.xxx and proto.xxx files
(xxx stands for the version number, just take the latest that's there)
Run rebaseextract -infile withrefm.xxx -protofile proto.xxx
This reformats the neb files for use with emboss. You should now see
4 files embossre.... in the REBASE directory

> 2. Is there a maximum size limit of a nucleotide sequence that I can
> use? Can I use the whole Human genome or at least a full chromosome
> to digest with a particular restriction enzyme?

I'm not sure about the whole genome but I have used it for individual
chromosomes without problems.

> 3. What program can give me the list of all possible fragments
> generated as well? Since I have not seen the output of "restrict",
> perhaps that is already doing that.

You can run restrict with the option -fragments to get them.

Hope this helps,
David.



From michael.watson at bbsrc.ac.uk  Thu Mar 18 05:11:59 2010
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Thu, 18 Mar 2010 09:11:59 +0000
Subject: [EMBOSS] Memory problem with extractseq
Message-ID: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>

Hi

I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.

I find it strange that extractseq reports a memory problem:

-bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq  -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
Extract regions from a sequence
Uncaught exception:  Allocation failed, insufficient memory available, raised at ajstr.c:2406

Whereas if I write a Bioperl script using SeqIO and the trunc() function, it works perfectly.

I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?

Thanks
Mick


From david.bauer at bayerhealthcare.com  Thu Mar 18 06:01:33 2010
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Thu, 18 Mar 2010 11:01:33 +0100
Subject: [EMBOSS] Antwort:  Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <OFA9BE2347.416D8AEA-ONC12576EA.00368EC3-C12576EA.00371218@bayer.de>

Hi,

I tested this on a larger machine and the job grew to ~7.3 Gb before it
output the requested part of the sequence.
The memory size is the same for extractseq and seqret.
The chromosome 1 FASTA file is ~250 Mb, so it seems that EMBOSS is not
very memory efficient  ;-)

David.

emboss-bounces at lists.open-bio.org wrote on 18/03/2010 10:11:59:

> Hi
> 
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
> 
> I find it strange that extractseq reports a memory problem:
> 
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq  -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception:  Allocation failed, insufficient memory available, raised at ajstr.c:2406
> 
> Whereas if I write a Bioperl script using SeqIO and the trunk() function, it works perfectly.
> 
> I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?
> 
> Thanks
> Mick

From pmr at ebi.ac.uk  Thu Mar 18 08:39:28 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 18 Mar 2010 12:39:28 +0000
Subject: [EMBOSS] Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4BA21F00.2060609@ebi.ac.uk>

On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
> 
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
> 
> I find it strange that extractseq reports a memory problem:
> 
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq  -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception:  Allocation failed, insufficient memory available, raised at ajstr.c:2406
> 
> Whereas if I write a Bioperl script using SeqIO and the trunk() function, it works perfectly.
> 
> I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?

It appears to be in the buffering of input to detect the format.

While we try to improve the performance, you can simply specify the format:

-sformat fasta

to turn off the file input buffering.

Reading an unknown format requires a lot of input to be buffered, in
case a GCG ".." checksum line appears.

Hope that helps

Peter
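
Applied to the command from the original report, the workaround might look
like the sketch below. The helper name is hypothetical; the only change to
the command is the added "-sformat fasta" qualifier. The script prints the
command line rather than running it.

```python
import shlex


def with_input_format(argv, fmt="fasta"):
    """Return a copy of an EMBOSS command line with -sformat appended."""
    if "-sformat" in argv:
        return list(argv)  # already specified; leave unchanged
    return list(argv) + ["-sformat", fmt]


if __name__ == "__main__":
    original = [
        "/usr/local/EMBOSS-6.1.0/bin/extractseq",
        "-sequence", "chr1.fasta",
        "-outseq", "chr1_.1.fasta",
        "-regions", "34415690-34415711",
    ]
    print(" ".join(shlex.quote(arg) for arg in with_input_format(original)))
```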


From pmr at ebi.ac.uk  Thu Mar 18 09:30:12 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 18 Mar 2010 13:30:12 +0000
Subject: [EMBOSS] Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4BA22AE4.4050507@ebi.ac.uk>

On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
> 
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
> 
> I find it strange that extractseq reports a memory problem:

Some further investigation suggests several improvements for the next
release:

The input was being buffered with the entire input buffer (2000 bytes)
saved per line. That is why it used so much memory. This can be reduced
to a more reasonable figure (and we can save space in some other string
copies).

When processing FASTA format (and various others), once the '>' line has
been found it cannot fail. It will read everything up to the next '>' or
continue to the end of the file. This means we can turn off buffering of
FASTA input (and other formats) once they no longer have any format
tests that can fail.

Both changes will have a similar effect to specifying the format on the
command line for large input files. That should work for any release.
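
The point that FASTA input needs no look-back once the '>' line has been seen can be sketched as a streaming reader. This is an illustration in Python, not the EMBOSS implementation; `extract_region` is a hypothetical helper, and coordinates are assumed to be 1-based and inclusive, as in the -regions value quoted earlier.

```python
import io

def extract_region(handle, start, end):
    """Return residues start..end (1-based, inclusive) of the first FASTA record.

    Lines are consumed one at a time and discarded, so memory use does not
    grow with the size of the input.
    """
    header = handle.readline()
    if not header.startswith(">"):
        raise ValueError("not FASTA: first line does not start with '>'")
    pieces = []
    offset = 0                      # residues seen before the current line
    for line in handle:
        if line.startswith(">"):    # next record begins: stop
            break
        residues = line.strip()
        line_start, line_end = offset + 1, offset + len(residues)
        if line_end >= start and line_start <= end:
            lo = max(start, line_start) - line_start
            hi = min(end, line_end) - line_start + 1
            pieces.append(residues[lo:hi])
        offset = line_end
        if offset >= end:           # region complete: no need to read further
            break
    return "".join(pieces)

demo = io.StringIO(">demo\nACGTACGTAC\nGGGGCCCCTT\n>other\nAAAA\n")
print(extract_region(demo, 8, 13))
```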

Hope that helps,

Peter

From d.m.a.martin at dundee.ac.uk  Tue Mar 23 07:12:42 2010
From: d.m.a.martin at dundee.ac.uk (David Martin)
Date: Tue, 23 Mar 2010 11:12:42 +0000
Subject: [EMBOSS] tfscan output
Message-ID: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>

TFscan appears to be a bit of a dinosaur in EMBOSS as there is no option to change the report format. It would be really nice to be able to get (eg) GFF output or similar. How easy would this be to do?
 
..d
 
 
David Martin PhD
College of Life Sciences
University of Dundee 
01382 388704
The University of Dundee is a Scottish Registered Charity, No. SC015096.
 
 


From david.bauer at bayerhealthcare.com  Tue Mar 23 07:59:16 2010
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Tue, 23 Mar 2010 12:59:16 +0100
Subject: [EMBOSS] Re: tfscan output
In-Reply-To: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
Message-ID: <OFC2727EB5.F025E1CA-ONC12576EF.004198A7-C12576EF.0041D9FA@bayer.de>

Have you considered using jaspscan?
It uses the JASPAR database of transcription factor binding profiles
(http://jaspar.cgb.ki.se/)

David.

emboss-bounces at lists.open-bio.org wrote on 23/03/2010 12:12:42:

> TFscan appears to be a bit of a dinosaur in EMBOSS as there is no 
> option to change the report format. It would be really nice to be 
> able to get (eg) GFF output or similar. How easy would this be to do?
> 
> ..d
> 
> 
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss

From pmr at ebi.ac.uk  Tue Mar 23 09:09:51 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 23 Mar 2010 13:09:51 +0000
Subject: [EMBOSS] tfscan output
In-Reply-To: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
References: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
Message-ID: <4BA8BD9F.1090104@ebi.ac.uk>

On 23/03/10 11:12, David Martin wrote:
> TFscan appears to be a bit of a dinosaur in EMBOSS as there is no option to change the report format. It would be really nice to be able to get (eg) GFF output or similar. How easy would this be to do?

Not difficult, but the extra line needs to be attached to all hits to
meet the requirements of report formats.

It will be in the next release.

Peter

From jeedward at yahoo.com  Wed Mar 24 19:59:52 2010
From: jeedward at yahoo.com (John Edward)
Date: Wed, 24 Mar 2010 16:59:52 -0700 (PDT)
Subject: [EMBOSS] Call for papers (Deadline Extended): BCBGC-10, USA,
	July 2010
Message-ID: <268706.8648.qm@web45903.mail.sp1.yahoo.com>

It would be highly appreciated if you could share this announcement with your
colleagues, students and individuals whose research is in bioinformatics,
computational biology, genomics, data-mining, and related areas.

Call for papers (Deadline Extended): BCBGC-10, USA, July 2010

The 2010 International Conference on Bioinformatics, Computational Biology,
Genomics and Chemoinformatics (BCBGC-10) (website:
http://www.PromoteResearch.org ) will be held during 12-14 of July 2010 in
Orlando, FL, USA. BCBGC is an important event in the areas of bioinformatics,
computational biology, genomics and chemoinformatics and focuses on all areas
related to the conference.

The conference will be held at the same time and location where several other
major international conferences will be taking place. The conference will be
held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held
during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF
is to promote research and developmental activities in computer science,
information technology, control engineering, and related fields. Another goal
is to promote the dissemination of research to a multidisciplinary audience and
to facilitate communication among researchers, developers, practitioners in
different fields. The following conferences are planned to be organized as part
of MULTICONF-10.

* International Conference on Artificial Intelligence and Pattern Recognition
  (AIPR-10)
* International Conference on Automation, Robotics and Control Systems
  (ARCS-10)
* International Conference on Bioinformatics, Computational Biology, Genomics
  and Chemoinformatics (BCBGC-10)
* International Conference on Computer Communications and Networks (CCN-10)
* International Conference on Enterprise Information Systems and Web
  Technologies (EISWT-10)
* International Conference on High Performance Computing Systems (HPCS-10)
* International Conference on Information Security and Privacy (ISP-10)
* International Conference on Image and Video Processing and Computer Vision
  (IVPCV-10)
* International Conference on Software Engineering Theory and Practice
  (SETP-10)
* International Conference on Theoretical and Mathematical Foundations of
  Computer Science (TMFCS-10)

MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a
full-service resort that puts you in the middle of the fun! Located 1/2 block
south of the famed International Drive, the hotel is just minutes from great
entertainment like Walt Disney World® Resort, Universal Studios and Sea World
Orlando. Guests can enjoy free scheduled transportation to these theme parks,
as well as spacious accommodations, outdoor pools and on-site dining - all
situated on 10 tropically landscaped acres. Here, guests can experience a
full-service resort with discount hotel pricing in Orlando.

We invite draft paper submissions. Please see the website
http://www.PromoteResearch.org for more details.

Sincerely
John Edward


From biopython at maubp.freeserve.co.uk  Tue Mar 30 07:46:10 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 12:46:10 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
Message-ID: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>

Hi all,

I've got some "Sanger" capillary sequence files in ABI trace file
format, which I understand includes the probabilities of the 4 bases
along the sequencing run. I'd like to extract this as a FASTQ file
with meaningful quality scores based on the trace data (for use in
assembly).

This doesn't seem to work - the FASTQ quality score characters are all
double quotes (ASCII 34), meaning PHRED quality 1.

seqret -sformat abi -osformat fastq-sanger -sequence example.ab1
-outseq example.fastq -auto

Output as FASTA seems fine:

seqret -sformat abi -osformat fasta -sequence example.ab1 -outseq
example.fasta -auto

Is ABI to FASTQ a reasonable thing to expect seqret to support? If so, could
it be added to the TODO list please?

Peter C.

P.S. I'd be interested to hear suggestions for alternative tools to
tackle this conversion.
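
For readers unfamiliar with the encoding mentioned above: in fastq-sanger each quality character is the PHRED score plus 33, which is why a double quote (ASCII 34) decodes to quality 1. A minimal sketch, with hypothetical helper names:

```python
SANGER_OFFSET = 33   # fastq-sanger stores PHRED quality q as chr(q + 33)

def decode_sanger_qualities(quality_string):
    """Turn a fastq-sanger quality string into a list of PHRED scores."""
    return [ord(ch) - SANGER_OFFSET for ch in quality_string]

def encode_sanger_qualities(scores):
    """Turn a list of PHRED scores into a fastq-sanger quality string."""
    return "".join(chr(q + SANGER_OFFSET) for q in scores)

print(decode_sanger_qualities('""""'))        # a run of double quotes, as in the report
print(encode_sanger_qualities([1, 20, 40]))
```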

From pmr at ebi.ac.uk  Tue Mar 30 08:02:25 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 13:02:25 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
Message-ID: <4BB1E851.1060607@ebi.ac.uk>

On 30/03/2010 12:46, Peter C. wrote:
> Hi all,
>
> I've got some "Sanger" capillary sequence files in ABI trace file
> format, which I understand includes the probabilities of the 4 bases
> along the sequencing run. I'd like to extract this as a FASTQ file
> with meaningful quality scores based on the trace data (for use in
> assembly).
>
> This doesn't seem to work - the FASTQ quality score characters are all
> double quotes (ASCII 34), meaning PHRED quality 1.

I will take a look. I don't recall anyone using the quality scores from ABI
data when we first implemented it (at that time Staden Experiment files
were the only supported output format with any quality scores).

regards,

Peter R

From biopython at maubp.freeserve.co.uk  Tue Mar 30 08:17:23 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 13:17:23 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1E851.1060607@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1E851.1060607@ebi.ac.uk>
Message-ID: <320fb6e01003300517q6e9358bj4a45112d3e23c57f@mail.gmail.com>

On Tue, Mar 30, 2010 at 1:02 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> On 30/03/2010 12:46, Peter C. wrote:
>>
>> Hi all,
>>
>> I've got some "Sanger" capillary sequence files in ABI trace file
>> format, which I understand includes the probabilities of the 4 bases
>> along the sequencing run. I'd like to extract this as a FASTQ file
>> with meaningful quality scores based on the trace data (for use in
>> assembly).
>>
>> This doesn't seem to work - the FASTQ quality score characters are all
>> double quotes (ASCII 34), meaning PHRED quality 1.
>
> I will take a look. I don't recall anyone using the quality scores from ABI
> data when we first implemented it (at that time Staden Experiment files
> were the only supported output format with any quality scores)
>

Thanks Peter,

Regarding other possible tools, there is the obvious choice of
PHRED (although getting a copy is non-trivial), and based on
this thread: http://seqanswers.com/forums/showthread.php?t=3165
I've just tried TraceTuner 3.0.6beta which is open source
(specifically, GPL v2 or later):
https://sourceforge.net/projects/tracetuner/

Using the ttuner -nocall option to reuse the sequence as-is from
the ABI file results in zero quality scores.

Allowing ttuner to re-call the bases (the default), it can output
FASTA/QUAL/PHD with meaningful qualities (from which I can
easily make a FASTQ file).

Peter C.

From pmr at ebi.ac.uk  Tue Mar 30 09:13:28 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 14:13:28 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
Message-ID: <4BB1F8F8.8050608@ebi.ac.uk>

On 30/03/2010 12:46, Peter wrote:
> Hi all,
>
> I've got some "Sanger" capillary sequence files in ABI trace file
> format, which I understand includes the probabilities of the 4 bases
> along the sequencing run. I'd like to extract this as a FASTQ file
> with meaningful quality scores based on the trace data (for use in
> assembly).
>
> This doesn't seem to work - the FASTQ quality score characters are all
> double quotes (ASCII 34), meaning PHRED quality 1.

We have code to extract various fields from ABI trace files, but I'm not
familiar with the details of the format, and documentation appears hard to
find.

Where do I look to find scores that we can use (and how do we convert those 
to phred quality scores)?

regards,

Peter

From pmr at ebi.ac.uk  Tue Mar 30 09:25:53 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 14:25:53 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1F8F8.8050608@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1F8F8.8050608@ebi.ac.uk>
Message-ID: <4BB1FBE1.8030400@ebi.ac.uk>

On 30/03/2010 14:13, Peter Rice wrote:

> Where do I look to find scores that we can use (and how do we convert
> those to phred quality scores)?

Aha, found something. The field is called PCON (confidence values), with 
values 0-255.

There is a possibility that these could be phred scores, but I suspect they 
are whatever the basecaller has decided to write there.

http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
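
If the PCON bytes do turn out to be PHRED-scaled, which is an open question in this thread, writing them as fastq-sanger characters would be a one-line mapping. The sketch below is purely illustrative: the function name is hypothetical, the input values are made up, and the clamp at 93 reflects only the printable range of fastq-sanger ('!' to '~').

```python
def pcon_to_fastq_sanger(pcon_bytes, max_quality=93):
    """Encode per-base confidence bytes as a fastq-sanger quality string.

    ASSUMPTION: the bytes are already PHRED-scaled. Values above max_quality
    are clamped, because fastq-sanger characters run from '!' (0) to '~' (93).
    """
    return "".join(chr(min(value, max_quality) + 33) for value in pcon_bytes)

example = bytes([1, 20, 45, 255])   # made-up values, not read from a real trace
print(pcon_to_fastq_sanger(example))
```

If the basecaller uses some other scale, a conversion step would be needed before this mapping.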

Peter R.

From ztu at msi.umn.edu  Tue Mar 30 09:33:56 2010
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Tue, 30 Mar 2010 08:33:56 -0500 (CDT)
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1FBE1.8030400@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
Message-ID: <Pine.LNX.4.63.1003300832400.6854@cl4.msi.umn.edu>


Hi Peter:

You may want to check this URL about how to 
convert quality score:

 http://maq.sourceforge.net/fastq.shtml

Thanks, TU

=======================================

On Tue, 30 Mar 2010, Peter Rice wrote:

> On 30/03/2010 14:13, Peter Rice wrote:
> 
> > Where do I look to find scores that we can use (and how do we convert
> > those to phred quality scores)?
> 
> Aha, found something. The field is called PCON (confidence values), with
> values 0-255.
> 
> There is a possibility that these could be phred scores, but I suspect they
> are whatever the basecaller has decided to write there.
> 
> http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
> 
> Peter R.

From biopython at maubp.freeserve.co.uk  Tue Mar 30 09:56:34 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 14:56:34 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1FBE1.8030400@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
Message-ID: <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>

On Tue, Mar 30, 2010 at 2:25 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> On 30/03/2010 14:13, Peter Rice wrote:
>
>> Where do I look to find scores that we can use (and how do we convert
>> those to phred quality scores)?
>
> Aha, found something. The field is called PCON (confidence values), with
> values 0-255.
>
> There is a possibility that these could be phred scores, but I suspect they
> are whatever the basecaller has decided to write there.
>
> http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
>
> Peter R.

Hmm. Good question - I don't know, although if they are PHRED scores
they could go unusually high (we'd expect say 0 to 50 for a raw read).
It could be some other encoding (e.g. scaled from 0 for a poor base to
255 for a perfect base). Do you have any contacts at Applied Biosystems
to ask?

Peter C.

From biopython at maubp.freeserve.co.uk  Tue Mar 30 09:58:18 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 14:58:18 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <Pine.LNX.4.63.1003300832400.6854@cl4.msi.umn.edu>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
	<Pine.LNX.4.63.1003300832400.6854@cl4.msi.umn.edu>
Message-ID: <320fb6e01003300658l2742a656ge766c3fcd5a2fa44@mail.gmail.com>

On Tue, Mar 30, 2010 at 2:33 PM, Zheng Jin Tu <ztu at msi.umn.edu> wrote:
>
>
> Hi Peter:
>
> You may want to check this URL about how to
> convert quality score:
>
> http://maq.sourceforge.net/fastq.shtml
>
> Thanks, TU

Thanks - but that just covers converting between PHRED scores
and Solexa Scores. Peter Rice and I are well aware of this.
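
For context, the PHRED/Solexa conversion referred to here is a fixed formula relating the two log scales; it says nothing about what ABI confidence values mean. A short sketch (function names are this example's own):

```python
import math

def solexa_to_phred(q_solexa):
    """Convert a Solexa-scaled quality to the PHRED scale."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)

def phred_to_solexa(q_phred):
    """Convert a PHRED quality to the Solexa scale (undefined for q_phred <= 0)."""
    return 10 * math.log10(10 ** (q_phred / 10) - 1)

for q in (-5, 0, 10, 40):
    print(q, round(solexa_to_phred(q), 2))
```

The two scales diverge at low quality and agree closely at high quality.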

The question here is what do the numbers in ABI files mean?

Peter C.


From georgios at biotek.uio.no  Wed Mar 31 14:08:07 2010
From: georgios at biotek.uio.no (Georgios Magklaras)
Date: Wed, 31 Mar 2010 20:08:07 +0200
Subject: [EMBOSS] MRS/EMBOSS lecture notes and videos
Message-ID: <4BB38F87.3020803@biotek.uio.no>

Hi,

Just to let people know (some folks expressed interest). You can find 
some interesting lecture notes, as part of an EMBnet course given in 
Mexico about sequence mining with EMBOSS/MRS here:

http://folk.uio.no/georgios/other/mrskurs.pdf

Some video shots of the presented material can be obtained from this URL:
http://www.nnb.unam.mx/video/track

(I will try and obtain the videos in a non-flash format, however the URL 
should make them available in the meantime).

Best regards,
GM

-- 
Best regards,
--

George Magklaras BSc (Hons) MPhil RHCE
IT Systems Manager/Senior Systems Engineer
The Biotechnology Center of Oslo
University of Oslo

http://www.biotek.uio.no
http://www.no.embnet.org
http://folk.uio.no/georgios


From michael.watson at bbsrc.ac.uk  Fri Mar  5 14:26:06 2010
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Fri, 5 Mar 2010 14:26:06 +0000
Subject: [EMBOSS] EMBASSY/PHYLIP Question: which option do I use to
 bootstrap with FPROTPARS
Message-ID: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>

Hello

Perhaps I am missing something fundamental, but I'd like to draw a phylogenetic tree of some protein sequences I have, and use bootstrapping for confidence.

The phylip help pages seem to suggest I can do this by setting the resampling option to "bootstrap" but I cannot find this option in FPROTPARS.

I'm using EMBOSS-6.1.0 and PHYLIPNEW-3.68

Many thanks
Mick


From pmr at ebi.ac.uk  Fri Mar  5 15:40:54 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Fri, 05 Mar 2010 15:40:54 +0000
Subject: [EMBOSS] EMBASSY/PHYLIP Question: which option do I use to
 bootstrap with FPROTPARS
In-Reply-To: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC501F911EA1A@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4B912606.9080906@ebi.ac.uk>

Dear Michael,

On 05/03/10 14:26, michael watson (IAH-C) wrote:
> Hello
> 
> Perhaps I am missing something fundamental, but I'd like to draw a phylogenetic tree of some protein sequences I have, and use bootstrapping for confidence.
> 
> The phylip help pages seem to suggest I can do this by setting the resampling option to "bootstrap" but I cannot find this option in FPROTPARS.
> 
> I'm using EMBOSS-6.1.0 and PHYLIPNEW-3.68

My understanding of the phylip documentation is that you use (EMBOSS
name) fseqboot to generate the bootstrap resampling of your original
sequences and then use fprotpars to analyse the resulting output.

In the original phylip package the seqboot application bootstraps
several types of data. In the EMBASSY package, to make the input types
clearer, we split it into fseqboot, fseqbootall, fdiscboot, ffreqboot
and frestboot.
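
To illustrate what bootstrap resampling of sequences means in this workflow, the sketch below draws alignment columns with replacement to make replicate alignments, each of which would then be analysed separately. It is a simplified stand-in written for this note, not the PHYLIP or EMBASSY code, and the toy alignment is made up.

```python
import random

def bootstrap_alignment(sequences, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    # Pick as many column indices as the alignment has, with replacement.
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in columns) for seq in sequences]

alignment = ["MKVLA", "MRVLA", "MKILS"]   # toy protein alignment, made up
rng = random.Random(0)                    # fixed seed so runs are repeatable
replicates = [bootstrap_alignment(alignment, rng) for _ in range(3)]
for rep in replicates:
    print(rep)
```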

Hope that helps,

Peter Rice


From jeedward at yahoo.com  Sat Mar  6 00:35:35 2010
From: jeedward at yahoo.com (John Edward)
Date: Fri, 5 Mar 2010 16:35:35 -0800 (PST)
Subject: [EMBOSS] Call for papers: BCBGC-10, USA, July 2010
Message-ID: <915762.86810.qm@web45916.mail.sp1.yahoo.com>

It would be highly appreciated if you could share this announcement with your
colleagues, students and individuals whose research is in bioinformatics,
computational biology, genomics, data-mining, and related areas.

Call for papers: BCBGC-10, USA, July 2010

The 2010 International Conference on Bioinformatics, Computational Biology,
Genomics and Chemoinformatics (BCBGC-10) (website:
http://www.PromoteResearch.org ) will be held during 12-14 of July 2010 in
Orlando, FL, USA. BCBGC is an important event in the areas of bioinformatics,
computational biology, genomics and chemoinformatics and focuses on all areas
related to the conference.

The conference will be held at the same time and location where several other
major international conferences will be taking place. The conference will be
held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be held
during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of MULTICONF
is to promote research and developmental activities in computer science,
information technology, control engineering, and related fields. Another goal
is to promote the dissemination of research to a multidisciplinary audience and
to facilitate communication among researchers, developers, practitioners in
different fields. The following conferences are planned to be organized as part
of MULTICONF-10.

* International Conference on Artificial Intelligence and Pattern Recognition
  (AIPR-10)
* International Conference on Automation, Robotics and Control Systems
  (ARCS-10)
* International Conference on Bioinformatics, Computational Biology, Genomics
  and Chemoinformatics (BCBGC-10)
* International Conference on Computer Communications and Networks (CCN-10)
* International Conference on Enterprise Information Systems and Web
  Technologies (EISWT-10)
* International Conference on High Performance Computing Systems (HPCS-10)
* International Conference on Information Security and Privacy (ISP-10)
* International Conference on Image and Video Processing and Computer Vision
  (IVPCV-10)
* International Conference on Software Engineering Theory and Practice
  (SETP-10)
* International Conference on Theoretical and Mathematical Foundations of
  Computer Science (TMFCS-10)

MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a
full-service resort that puts you in the middle of the fun! Located 1/2 block
south of the famed International Drive, the hotel is just minutes from great
entertainment like Walt Disney World® Resort, Universal Studios and Sea World
Orlando. Guests can enjoy free scheduled transportation to these theme parks,
as well as spacious accommodations, outdoor pools and on-site dining - all
situated on 10 tropically landscaped acres. Here, guests can experience a
full-service resort with discount hotel pricing in Orlando.

We invite draft paper submissions. Please see the website
http://www.PromoteResearch.org for more details.

Sincerely
John Edward


From mbk0asis at gmail.com  Sun Mar  7 15:05:27 2010
From: mbk0asis at gmail.com (Byungkuk Min)
Date: Sun, 7 Mar 2010 07:05:27 -0800
Subject: [EMBOSS] A question about 'showdb'
Message-ID: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>

When I typed 'showdb', no list of databases appeared like the example in the
tutorial.
How can I set up the databases?

xxxxx at ubuntu:~$ showdb
Displays information on configured databases
# Name         Type  ID  Qry All Comment
# ============ ==== ==  === === =======


From pmr at ebi.ac.uk  Sun Mar  7 22:28:23 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Sun, 07 Mar 2010 22:28:23 +0000
Subject: [EMBOSS] A question about 'showdb'
In-Reply-To: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>
References: <31f50bc71003070705l6164972bqe127ae0c2d9d7fc1@mail.gmail.com>
Message-ID: <4B942887.7090608@ebi.ac.uk>

Dear Byungkuk,

On 07/03/2010 15:05, Byungkuk Min wrote:
> When I typed 'showdb', no list of databases appeared like the example in the
> tutorial.
> How can I set up the databases?

The databases are defined in a file emboss.default in the share/EMBOSS/
directory where EMBOSS is installed.

In that directory you will find a file emboss.default.template with
example database definitions.

Some databases are remote (e.g. method: "srs") and can be defined and used.

Others need local data files and a local index (method: emboss and
method: emblcd) created by the dbx* and dbi* programs in EMBOSS.

Let us know if you need any more help. We are working on more detailed
instructions which will appear on the EMBOSS website.

regards,

Peter Rice


From shrish at ccmb.res.in  Mon Mar  8 08:58:26 2010
From: shrish at ccmb.res.in (Shrish Tiwari)
Date: Mon, 8 Mar 2010 14:28:26 +0530 (IST)
Subject: [EMBOSS] (no subject)
Message-ID: <777482836.160381268038706946.JavaMail.root@127.0.0.1>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <http://lists.open-bio.org/pipermail/emboss/attachments/20100308/6db40829/attachment.ksh>

From simon.andrews at bbsrc.ac.uk  Mon Mar  8 13:53:06 2010
From: simon.andrews at bbsrc.ac.uk (Simon Andrews)
Date: Mon, 8 Mar 2010 13:53:06 +0000
Subject: [EMBOSS] Data for Jasextract
Message-ID: <ED344C56-8DB5-4A52-A23D-3CC6393EACFB@bbsrc.ac.uk>

I've been trying to use EMBOSS to search using the Jaspar database  
(jaspextract /  jaspscan), but with no success.

I think the problem is coming from jaspextract.  TFM says:

Input file format

    The input files are the uncompressed and extracted JASPAR_CORE.tgz,
    JASPAR_FAM.tgz and JASPAR_PHYLOFACTS.tgz files provided in the JASPAR
    MatrixDir download directory of the JASPAR homepage
    (http://jaspar.genereg.net).


..but there are no files named that way (the only google hit to those  
names is the jaspextract manpage!).

The main jaspar archive file is Archive.zip.  If I unzip this and run  
jaspextract on the expanded directory it runs with no errors or  
warnings, but if I subsequently try to run jaspscan I get an error  
saying:

Warning: Matrix file(s) *.pfm not found

    EMBOSS An error in jaspscan.c at line 870:
Matrix list file JASPAR_CORE/matrix_list.txt not found

I've tried loads of different subdirectories within the JASPAR  
database dump, but can't find anything which actually puts data into  
the appropriate EMBOSS data directories.

Can anyone else make this work?

Thanks

Simon.


From ajb at ebi.ac.uk  Mon Mar  8 16:07:48 2010
From: ajb at ebi.ac.uk (ajb at ebi.ac.uk)
Date: Mon, 8 Mar 2010 16:07:48 -0000 (UTC)
Subject: [EMBOSS] Data for Jasextract
In-Reply-To: <ED344C56-8DB5-4A52-A23D-3CC6393EACFB@bbsrc.ac.uk>
References: <ED344C56-8DB5-4A52-A23D-3CC6393EACFB@bbsrc.ac.uk>
Message-ID: <45660.86.26.12.63.1268064468.squirrel@webmail.ebi.ac.uk>

Hello Simon,

The Jaspar people altered the structure and content of their
ftp  server recently. There is a patch in the fixes/patches
area of the EMBOSS ftp server which updates jaspextract and
jaspscan appropriately. The README.fixes file in the 'fixes'
directory explains further.

HTH

Alan


> I've been trying to use EMBOSS to search using the Jaspar database
> (jaspextract /  jaspscan), but with no success.
>
> I think the problem is coming from jaspextract.  TFM says:
>
> Input file format
>
>     The input files are the uncompressed and extracted JASPAR_CORE.tgz,
>     JASPAR_FAM.tgz and JASPAR_PHYLOFACTS.tgz files provided in the JASPAR
>     MatrixDir download directory of the JASPAR homepage
>     (http://jaspar.genereg.net).
>
>
> ..but there are no files named that way (the only google hit to those
> names is the jaspextract manpage!).
>
> The main jaspar archive file is Archive.zip.  If I unzip this and run
> jaspextract on the expanded directory it runs with no errors or
> warnings, but if I subsequently try to run jaspscan I get an error
> saying:
>
> Warning: Matrix file(s) *.pfm not found
>
>     EMBOSS An error in jaspscan.c at line 870:
> Matrix list file JASPAR_CORE/matrix_list.txt not found
>
> I've tried loads of different subdirectories within the JASPAR
> database dump, but can't find anything which actually puts data into
> the appropriate EMBOSS data directories.
>
> Can anyone else make this work?
>
> Thanks
>
> Simon.


From stephen.taylor at imm.ox.ac.uk  Tue Mar  9 14:20:38 2010
From: stephen.taylor at imm.ox.ac.uk (Steve Taylor)
Date: Tue, 09 Mar 2010 14:20:38 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
Message-ID: <4B965936.6030102@imm.ox.ac.uk>

Hi,

I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone on this list know if there is planned support for EMBOSS 6? We have found using our local installation of EMBOSS 6 that a few tools don't work. Is there a person who maintains the Galaxy/EMBOSS configuration?

I know this is *really* a Galaxy question, but I posted it to the Galaxy list and haven't had any response so far. :-)

Thanks,

Steve


From pmr at ebi.ac.uk  Tue Mar  9 15:29:59 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 09 Mar 2010 15:29:59 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B965936.6030102@imm.ox.ac.uk>
References: <4B965936.6030102@imm.ox.ac.uk>
Message-ID: <4B966977.9050101@ebi.ac.uk>

On 09/03/2010 14:20, Steve Taylor wrote:
> Hi,
>
> I notice in the Galaxy distribution there is support for EMBOSS 5. Does
> anyone on this list know if there is planned support for EMBOSS 6? We have
> found using our local installation of EMBOSS 6 that a few tools don't
> work. Is there a person who maintains the Galaxy/EMBOSS configuration?
>
> I know this is *really* a Galaxy question but I posted this to the
> Galaxy list but haven't had any response so far. :-)

I am looking into it and will be going to the Galaxy Developers meeting in May.

Any other interest among the EMBOSS users?

regards,

Peter Rice


From hrh at fmi.ch  Tue Mar  9 15:40:19 2010
From: hrh at fmi.ch (Hotz, Hans-Rudolf)
Date: Tue, 09 Mar 2010 16:40:19 +0100
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B965936.6030102@imm.ox.ac.uk>
Message-ID: <C7BC2A73.7121%hrh@fmi.ch>

Steve

> I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone
> on this list know if there is planned support for EMBOSS 6? We have found using
> our local installation of EMBOSS 6 that a few tools don't work.

which tools don't work?

We are using most of the EMBOSS 6.1.0 tools in our local Galaxy server. And I
don't remember running into problems with the Galaxy EMBOSS 5 tool
definitions (i.e. emboss_*.xml files).

I haven't checked EMBOSS 6.2.0, but I guess there are just a few additions
(based on the release notes) you need to make. Generally speaking: EMBOSS
tools are pretty stable.


Maybe if you provide a list of problems/incompatibilities and resend this to
the galaxy mailing list, you will get a response...

Hans


> Is there a person who maintains the Galaxy/EMBOSS configuration?
> 
> I know this is *really* a Galaxy question but I posted this to the Galaxy list
> but haven't had any response so far. :-)
> 
> Thanks,
> 
> Steve
> _______________________________________________
> EMBOSS mailing list
> EMBOSS at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/emboss


From stephen.taylor at imm.ox.ac.uk  Tue Mar  9 16:30:08 2010
From: stephen.taylor at imm.ox.ac.uk (Steve Taylor)
Date: Tue, 09 Mar 2010 16:30:08 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <C7BC2A73.7121%hrh@fmi.ch>
References: <C7BC2A73.7121%hrh@fmi.ch>
Message-ID: <4B967790.8000609@imm.ox.ac.uk>

Hi Hans,

>> I notice in the Galaxy distribution there is support for EMBOSS 5. Does anyone
>> on this list know if there is planned support for EMBOSS 6? We have found using
>> our local installation of EMBOSS 6 that a few tools don't work.
> 
> which tools don't work?
> 
> We are using most of the EMBOSS 6.1.0 tools in our local galaxy server. And I
> don't remember running into problems with the galaxy emboss 5 tool
> definitions (ie emboss_*.xml files).
> 

Ok. That's good to know. 

> I haven't checked EMBOSS 6.2.0, but I guess there are just a few additions
> (based on the release notes) you need to make. Generally speaking: EMBOSS
> tools are pretty stable.
> 

To give a bit of history, we are fairly new to using Galaxy and previously we used EMBOSS Explorer as our main web interface. With this we found when EMBOSS releases changed lots of things broke, so we ended up staying with EMBOSS v3. I am hoping this is not going to be true for EMBOSS/Galaxy because they are both great tools and I want them to be used routinely without us/users worrying if things are going to break, especially if they are going to be incorporated routinely into workflows. 

> Maybe if you provide a list of problems/incompatibilities and resend this to
> the galaxy mailing list, you will get a response...
> 


Maybe I was a bit unlucky because I tried a few more tools and generally things are ok. A couple of minor issues I came across:

* antigenic

(ran but gave an error)

 14: antigenic on data 13
An error occurred running this job: Error: Unable to read feature tags data file 'Etags.gff3protein'

* etandem produced two outputs (not exactly an error but I wondered if it was a misconfiguration in the xml)

there may be more ...

Your email answers my question: in general EMBOSS 6 is compatible with EMBOSS 5, but some minor tweaks may be required for certain tools. It would be great if some form of unit testing could be employed to check compatibility with new builds.

Thanks,

Steve


> Hans
> 
> 
>> Is there a person who maintains the Galaxy/EMBOSS configuration?
>>
>> I know this is *really* a Galaxy question but I posted this to the Galaxy list
>> but haven't had any response so far. :-)
>>
>> Thanks,
>>
>> Steve


From pmr at ebi.ac.uk  Tue Mar  9 17:11:30 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 09 Mar 2010 17:11:30 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B967790.8000609@imm.ox.ac.uk>
References: <C7BC2A73.7121%hrh@fmi.ch> <4B967790.8000609@imm.ox.ac.uk>
Message-ID: <4B968142.9080905@ebi.ac.uk>

On 09/03/2010 16:30, Steve Taylor wrote:

> To give a bit of history, we are fairly new to using Galaxy and
> previously we used EMBOSS Explorer as our main web interface. With this
> we found when EMBOSS releases changed lots of things broke, so we ended
> up staying with EMBOSS v3. I am hoping this is not going to be true for
> EMBOSS/Galaxy because they are both great tools and I want them to be
> used routinely without us/users worrying if things are going to break,
> especially if they are going to be incorporated routinely into workflows.
>> Maybe if you provide a list of problems/incompatibilities and resend
>> this to
>> the galaxy mailing list, you will get a response...

Yes please do ... I am on the Galaxy list too.

> 14: antigenic on data 13
> An error occurred running this job: Error: Unable to read feature tags
> data file 'Etags.gff3protein'

Could be you have more than one version of EMBOSS running. That looks like
a pure EMBOSS error suggesting EMBOSS 6 is trying to use EMBOSS 5's data
directory. Should be fixable by copying the missing file.
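
To see which installation actually ships the file named in the error, one could check each candidate data directory. This is only a minimal sketch: the directory paths in the usage line are hypothetical and need to be replaced with the data directories of the local EMBOSS 5 and EMBOSS 6 installs.

```python
from pathlib import Path


def find_data_file(name, data_dirs):
    """Return the directories in data_dirs that contain a file called name."""
    return [Path(d) for d in data_dirs if (Path(d) / name).is_file()]


if __name__ == "__main__":
    # Hypothetical install locations; adjust to the local setup.
    candidates = [
        "/usr/local/EMBOSS-5.0.0/share/EMBOSS/data",
        "/usr/local/EMBOSS-6.2.0/share/EMBOSS/data",
    ]
    print(find_data_file("Etags.gff3protein", candidates))
```

If only one directory is listed, the other install is the one missing the file.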

Peter


From n.binns at ed.ac.uk  Tue Mar  9 16:59:13 2010
From: n.binns at ed.ac.uk (Nigel Binns)
Date: Tue, 09 Mar 2010 16:59:13 +0000
Subject: [EMBOSS] Galaxy and EMBOSS
In-Reply-To: <4B967790.8000609@imm.ox.ac.uk>
References: <C7BC2A73.7121%hrh@fmi.ch> <4B967790.8000609@imm.ox.ac.uk>
Message-ID: <4B967E61.4060502@ed.ac.uk>

I'm running EMBOSS 6.2.0 (and Jemboss via JWS) with the latest patch
applied and EMBOSS Explorer (v2.2.0) without any problems. The only
issue I've experienced is that the link to the EMBOSS help files is
broken. The workaround is to copy the EMBOSS HTML help files to the
location EE expects to find them, which is not where the current
release of EMBOSS places them :-)

Nigel

On 09/03/2010 16:30, Steve Taylor wrote:
> Hi Hans,
>
>>> I notice in the Galaxy distribution there is support for EMBOSS 5. 
>>> Does anyone
>>> on this list know if there is planned support for EMBOSS 6? We have
>>> found using
>>> our local installation of EMBOSS 6 that a few tools don't work.
>>
>> which tools don't work?
>>
>> We are using most of the EMBOSS 6.1.0 tools in our local galaxy
>> server. And I
>> don't remember running into problems with the galaxy emboss 5 tool
>> definitions (ie emboss_*.xml files).
>>
>
> Ok. That's good to know.
>> I haven't checked EMBOSS 6.2.0, but I guess there are just a few 
>> additions
>> (based on the release notes) you need to make. Generally speaking: 
>> EMBOSS
>> tools are pretty stable.
>>
>
> To give a bit of history, we are fairly new to using Galaxy and 
> previously we used EMBOSS Explorer as our main web interface. With 
> this we found when EMBOSS releases changed lots of things broke, so we 
> ended up staying with EMBOSS v3. I am hoping this is not going to be 
> true for EMBOSS/Galaxy because they are both great tools and I want 
> them to be used routinely without us/users worrying if things are 
> going to break, especially if they are going to be incorporated 
> routinely into workflows.
>> Maybe if you provide a list of problems/incompatibilities and resend 
>> this to
>> the galaxy mailing list, you will get a response...
>>
>
>
> Maybe I was a bit unlucky because I tried a few more tools and 
> generally things are ok. A couple of minor issues I came across:
>
> * antigenic
>
> (ran but gave an error)
>
> 14: antigenic on data 13
> An error occurred running this job: Error: Unable to read feature tags 
> data file 'Etags.gff3protein'
>
> * etandem produced two outputs (not exactly an error but I wondered if 
> it was a misconfiguration in the xml)
>
> there may be more ...
>
> Your email answers my question that in general EMBOSS 6 is compatible 
> with EMBOSS 5 but probably some minor tweaks may be required for 
> certain tools. It would be great if some form of unit testing could be
> employed to check compatibility with new builds.
>
> Thanks,
>
> Steve
>
>
>
>
>> Hans
>>
>>
>>> Is there a person who maintains the Galaxy/EMBOSS configuration?
>>>
>>> I know this is *really* a Galaxy question but I posted this to the 
>>> Galaxy list
>>> but haven't had any response so far. :-)
>>>
>>> Thanks,
>>>
>>> Steve

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


From biopython at maubp.freeserve.co.uk  Fri Mar 12 12:07:48 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri, 12 Mar 2010 12:07:48 +0000
Subject: [EMBOSS] Broken links on Emboss webpages
Message-ID: <320fb6e01003120407u794bf5e6ue6c84522ac588c91@mail.gmail.com>

Hi,

I was just looking for the EMBOSS EMBASSY documentation for the
PHYLIPNEW packages, and noticed they are missing from this page:
http://emboss.sourceforge.net/embassy/

Perhaps this should redirect to the latest release? i.e.
http://emboss.sourceforge.net/apps/release/6.2/embassy/index.html

I also found the links on this page seem to be broken:
http://emboss.sourceforge.net/apps/release/6.2/emboss/apps/phylogeny_molecular_sequence_group.html

Regards,

Peter


From Perdeep.Mehta at STJUDE.ORG  Fri Mar 12 15:22:35 2010
From: Perdeep.Mehta at STJUDE.ORG (Mehta, Perdeep)
Date: Fri, 12 Mar 2010 09:22:35 -0600
Subject: [EMBOSS] Antwort:  restrict
In-Reply-To: <OF27EDD1C2.7567862F-ONC12576D3.0024E80E-C12576D3.002611AB@bayer.de>
References: <6EAE916704479E4BB6AB5A133BA224F728A54626D5@SJMEMXMBS11.stjude.sjcrh.local>
	<OF27EDD1C2.7567862F-ONC12576D3.0024E80E-C12576D3.002611AB@bayer.de>
Message-ID: <6EAE916704479E4BB6AB5A133BA224F728A5462746@SJMEMXMBS11.stjude.sjcrh.local>

Hi List,

We now have REBASE installed locally. Strangely, I now see a new error:


"Input nucleotide sequence(s): chr10.fa Uncaught exception:  Allocation failed, insufficient memory available, raised at ajstr.c:2170"


The above example is just a test with chromosome 10. I plan to run either the whole genome (all 23 chromosomes) at once or each of the 23 chromosomes separately. I have tested running on a queue with more memory using the following command:


 qsub -q normal-ib /path/restrict -sequence chr10.fa -enzymes hinfI -fragments -outfile chr10.res


It then threw the following error:


"Unable to run job: Script length does not match declared length."

It may not be a restrict problem; I am just throwing it in here to see if anyone else has seen such a problem.

Any guesses?
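
One possible reading of the "Script length" error is that qsub was handed the restrict binary where it expected a shell script. That is an assumption, not something confirmed in this thread. Under that assumption, a minimal sketch is to put the command into a small wrapper script and submit the script instead. The script name restrict_chr10.sh is hypothetical; the queue name and restrict arguments are the ones from the command above.

```python
import shlex


def job_script_text(argv):
    """Return a minimal /bin/sh script that runs the given argument list."""
    command_line = " ".join(shlex.quote(arg) for arg in argv)
    return "#!/bin/sh\n" + command_line + "\n"


def qsub_command(queue, script_path):
    """Build the qsub argument list that submits a script to a queue."""
    return ["qsub", "-q", queue, script_path]


RESTRICT_ARGV = [
    "/path/restrict", "-sequence", "chr10.fa", "-enzymes", "hinfI",
    "-fragments", "-outfile", "chr10.res",
]

if __name__ == "__main__":
    # Write the wrapper, then show the submission command (not executed here).
    with open("restrict_chr10.sh", "w") as handle:
        handle.write(job_script_text(RESTRICT_ARGV))
    print(" ".join(qsub_command("normal-ib", "restrict_chr10.sh")))
```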

Thanks,
perdeep


From: david.bauer at bayerhealthcare.com [mailto:david.bauer at bayerhealthcare.com]
Sent: Tuesday, February 23, 2010 12:56 AM
To: Mehta, Perdeep
Cc: emboss; emboss-bounces at lists.open-bio.org
Subject: Antwort: [EMBOSS] restrict


Hi,

emboss-bounces at lists.open-bio.org wrote on 23/02/2010 00:21:38:

> I have a few questions on EMBOSS restriction analysis and will
> appreciate any ideas or thoughts on these.
>
> 1. What Rebase file we need to download to get "restrict" working? I
> tried but there are files with different formats.

Go to the /pub/rebase directory on ftp.neb.com.
Download the withrefm.xxx and proto.xxx files
(xxx stands for the version number; just take the latest that's there).
Run rebaseextract -infile withrefm.xxx -protofile proto.xxx
This reformats the NEB files for use with EMBOSS. You should now see
4 files embossre.... in the REBASE directory.
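
The rebaseextract step above can be scripted once the two files are downloaded. The following is a minimal sketch, assuming both files sit in the current directory and share one version suffix; the helper names are made up for illustration, and only the -infile and -protofile options quoted above are used.

```python
import subprocess


def rebaseextract_command(version):
    """Build the rebaseextract argument list for one REBASE version suffix."""
    return [
        "rebaseextract",
        "-infile", f"withrefm.{version}",
        "-protofile", f"proto.{version}",
    ]


def run_rebaseextract(version):
    """Run rebaseextract; check=True raises if it exits with an error."""
    subprocess.run(rebaseextract_command(version), check=True)


if __name__ == "__main__":
    # "xxx" is the placeholder from the text; substitute the real suffix.
    print(" ".join(rebaseextract_command("xxx")))
```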

> 2. Is there a maximum size limit of a nucleotide sequence that I can
> use? Can I use the whole Human genome or at least a full chromosome
> to digest with a particular restriction enzyme?

I'm not sure about the whole genome but I have used it for individual
chromosomes without problems.

> 3. What program can give me the list of all possible fragments
> generated as well? Since I have not seen the output of "restrict",
> perhaps that is already doing that.

You can run restrict with the option -fragments to get them.

Hope this helps,
David.



From michael.watson at bbsrc.ac.uk  Thu Mar 18 09:11:59 2010
From: michael.watson at bbsrc.ac.uk (michael watson (IAH-C))
Date: Thu, 18 Mar 2010 09:11:59 +0000
Subject: [EMBOSS] Memory problem with extractseq
Message-ID: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>

Hi

I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3 GB of RAM.

I find it strange that extractseq reports a memory problem:

-bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq  -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
Extract regions from a sequence
Uncaught exception:  Allocation failed, insufficient memory available, raised at ajstr.c:2406

Whereas if I write a Bioperl script using SeqIO and the trunc() function, it works perfectly.

I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?

Thanks
Mick


From david.bauer at bayerhealthcare.com  Thu Mar 18 10:01:33 2010
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Thu, 18 Mar 2010 11:01:33 +0100
Subject: [EMBOSS] Antwort:  Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <OFA9BE2347.416D8AEA-ONC12576EA.00368EC3-C12576EA.00371218@bayer.de>

Hi,

I tested this on a larger machine and the job grew to ~7.3 GB before it
output the requested sequence part.
The memory size is the same for extractseq and seqret.
The chromosome 1 FASTA file is ~250 MB, so it seems that EMBOSS is not
very memory efficient  ;-)

David.

emboss-bounces at lists.open-bio.org wrote on 18/03/2010 10:11:59:

> Hi
> 
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
> 
> I find it strange that extractseq reports a memory problem:
> 
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq  -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception:  Allocation failed, insufficient memory 
> available, raised at ajstr.c:2406
> 
> Whereas if I write a Bioperl script using SeqIO and the trunc() 
> function, it works perfectly.
> 
> I'd have thought EMBOSS would be more streamlined and memory 
> efficient than Bioperl?
> 
> Thanks
> Mick


From pmr at ebi.ac.uk  Thu Mar 18 12:39:28 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 18 Mar 2010 12:39:28 +0000
Subject: [EMBOSS] Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4BA21F00.2060609@ebi.ac.uk>

On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
> 
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
> 
> I find it strange that extractseq reports a memory problem:
> 
> -bash-3.00# /usr/local/EMBOSS-6.1.0/bin/extractseq  -sequence chr1.fasta -outseq chr1_.1.fasta -regions '34415690-34415711'
> Extract regions from a sequence
> Uncaught exception:  Allocation failed, insufficient memory available, raised at ajstr.c:2406
> 
> Whereas if I write a Bioperl script using SeqIO and the trunc() function, it works perfectly.
> 
> I'd have thought EMBOSS would be more streamlined and memory efficient than Bioperl?

It appears to be in the buffering of input to detect the format.

While we try to improve the performance, you can simply specify the format:

-sformat fasta

to turn off the file input buffering.

Reading an unknown format requires a lot of input to be buffered, in
case a GCG ".." checksum line appears.
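
For scripted runs, the workaround above amounts to adding one option to the original command. A minimal sketch, using the file names and region from the report; the helper names are hypothetical, and extractseq is assumed to be on the PATH:

```python
import subprocess


def extractseq_command(sequence, outseq, regions, sformat=None):
    """Build an extractseq argument list, optionally naming the input format."""
    command = ["extractseq", "-sequence", sequence,
               "-outseq", outseq, "-regions", regions]
    if sformat is not None:
        # Naming the format skips the buffering needed for format detection.
        command += ["-sformat", sformat]
    return command


def run_extractseq(sequence, outseq, regions, sformat=None):
    """Run extractseq; check=True raises if it exits with an error."""
    subprocess.run(extractseq_command(sequence, outseq, regions, sformat),
                   check=True)


if __name__ == "__main__":
    print(" ".join(extractseq_command(
        "chr1.fasta", "chr1_.1.fasta", "34415690-34415711", sformat="fasta")))
```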

Hope that helps

Peter


From pmr at ebi.ac.uk  Thu Mar 18 13:30:12 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Thu, 18 Mar 2010 13:30:12 +0000
Subject: [EMBOSS] Memory problem with extractseq
In-Reply-To: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
References: <8D08960C647E64438CE5740657CBBDC5020F056B7C@iahcexch1.iah.bbsrc.ac.uk>
Message-ID: <4BA22AE4.4050507@ebi.ac.uk>

On 18/03/10 09:11, michael watson (IAH-C) wrote:
> Hi
> 
> I'm using EMBOSS 6.1.0 on a fairly small Linux VM which has about 3Gb of RAM.
> 
> I find it strange that extractseq reports a memory problem:

Some further investigation suggests several improvements for the next
release:

The input was being buffered with the entire input buffer (2000 bytes)
saved per line. That is why it used so much memory. This can be reduced
to a more reasonable figure (and we can save space in some other string
copies).
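
A rough back-of-the-envelope check of that explanation, as a sketch only: the 2000-byte buffer is the figure given above and the ~250 MB file size is the one David reported, but the 60 bases per FASTA line is an assumption about how chr1.fasta is wrapped.

```python
def buffered_bytes(file_size, line_length, buffer_size=2000):
    """Estimate memory used when every input line is saved in a full buffer."""
    bytes_per_line = line_length + 1           # sequence characters plus newline
    n_lines = -(-file_size // bytes_per_line)  # ceiling division
    return n_lines * buffer_size


if __name__ == "__main__":
    estimate = buffered_bytes(250_000_000, 60)
    print(f"{estimate / 1e9:.1f} GB")
```

With these inputs the estimate comes to roughly 8 GB, the same order of magnitude as the ~7.3 GB observed; a different line width would shift the figure.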

When processing FASTA format (and various others), once the '>' line has
been found it cannot fail. It will read everything up to the next '>' or
continue to the end of the file. This means we can turn off buffering of
FASTA input (and other formats) once they no longer have any format
tests that can fail.
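
The idea can be illustrated outside EMBOSS with a short streaming reader. This is not the EMBOSS implementation, just a sketch of the principle: once the '>' line has been seen, lines are consumed one at a time and only the requested bases are kept. The function name and the 1-based inclusive coordinates are choices made for this example.

```python
def extract_region(lines, start, end):
    """Return bases start..end (1-based, inclusive) of the first FASTA record.

    `lines` is any iterable of text lines, so an open file streams from disk.
    """
    pieces = []
    seen = 0            # number of bases read so far
    in_record = False
    for line in lines:
        if line.startswith(">"):
            if in_record:
                break   # the next record begins, so stop reading
            in_record = True
            continue
        if not in_record:
            continue
        chunk = line.strip()
        lo = max(start - 1 - seen, 0)
        hi = min(end - seen, len(chunk))
        if lo < hi:
            pieces.append(chunk[lo:hi])
        seen += len(chunk)
        if seen >= end:
            break       # everything requested has been collected
    return "".join(pieces)


if __name__ == "__main__":
    demo = [">chr\n", "ACGTAC\n", "GTTTGA\n", ">other\n", "CCCC\n"]
    print(extract_region(demo, 5, 8))
```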

Both changes will have a similar effect to specifying the format on the
command line for large input files. That should work for any release.

Hope that helps,

Peter


From d.m.a.martin at dundee.ac.uk  Tue Mar 23 11:12:42 2010
From: d.m.a.martin at dundee.ac.uk (David Martin)
Date: Tue, 23 Mar 2010 11:12:42 +0000
Subject: [EMBOSS] tfscan output
Message-ID: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>

TFscan appears to be a bit of a dinosaur in EMBOSS as there is no option to change the report format. It would be really nice to be able to get (eg) GFF output or similar. How easy would this be to do?
 
..d
 
 
David Martin PhD
College of Life Sciences
University of Dundee 
01382 388704
The University of Dundee is a Scottish Registered Charity, No. SC015096.
 
 


From david.bauer at bayerhealthcare.com  Tue Mar 23 11:59:16 2010
From: david.bauer at bayerhealthcare.com (david.bauer at bayerhealthcare.com)
Date: Tue, 23 Mar 2010 12:59:16 +0100
Subject: [EMBOSS] Antwort:  tfscan output
In-Reply-To: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
Message-ID: <OFC2727EB5.F025E1CA-ONC12576EF.004198A7-C12576EF.0041D9FA@bayer.de>

Have you considered using jaspscan?
It uses the JASPAR database of transcription factors 
(http://jaspar.cgb.ki.se/)

David.

emboss-bounces at lists.open-bio.org wrote on 23/03/2010 12:12:42:

> TFscan appears to be a bit of a dinosaur in EMBOSS as there is no 
> option to change the report format. It would be really nice to be 
> able to get (eg) GFF output or similar. How easy would this be to do?
> 
> ..d
> 
> 
> David Martin PhD
> College of Life Sciences
> University of Dundee 
> 01382 388704
> The University of Dundee is a Scottish Registered Charity, No. SC015096.
> 


From pmr at ebi.ac.uk  Tue Mar 23 13:09:51 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 23 Mar 2010 13:09:51 +0000
Subject: [EMBOSS] tfscan output
In-Reply-To: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
References: <4BA8A22A020000E00000D05B@IA-GEN-A2.DUNDEE.AC.UK>
Message-ID: <4BA8BD9F.1090104@ebi.ac.uk>

On 23/03/10 11:12, David Martin wrote:
> TFscan appears to be a bit of a dinosaur in EMBOSS as there is no option to change the report format. It would be really nice to be able to get (eg) GFF output or similar. How easy would this be to do?

Not difficult, but the extra line needs to be attached to all hits to
meet the requirements of report formats.

It will be in the next release.

Peter


From jeedward at yahoo.com  Wed Mar 24 23:59:52 2010
From: jeedward at yahoo.com (John Edward)
Date: Wed, 24 Mar 2010 16:59:52 -0700 (PDT)
Subject: [EMBOSS] Call for papers (Deadline Extended): BCBGC-10, USA,
	July 2010
Message-ID: <268706.8648.qm@web45903.mail.sp1.yahoo.com>

It would be highly appreciated if you could share this announcement with your
colleagues, students and individuals whose research is in bioinformatics,
computational biology, genomics, data-mining, and related areas.

Call for papers (Deadline Extended): BCBGC-10, USA, July 2010

The 2010 International Conference on Bioinformatics, Computational Biology,
Genomics and Chemoinformatics (BCBGC-10) (website: http://www.PromoteResearch.org)
will be held during 12-14 of July 2010 in Orlando, FL, USA. BCBGC is an
important event in the areas of bioinformatics, computational biology,
genomics and chemoinformatics and focuses on all areas related to the
conference.

The conference will be held at the same time and location where several other
major international conferences will be taking place. The conference will be
held as part of 2010 multi-conference (MULTICONF-10). MULTICONF-10 will be
held during July 12-14, 2010 in Orlando, Florida, USA. The primary goal of
MULTICONF is to promote research and developmental activities in computer
science, information technology, control engineering, and related fields.
Another goal is to promote the dissemination of research to a
multidisciplinary audience and to facilitate communication among researchers,
developers, practitioners in different fields. The following conferences are
planned to be organized as part of MULTICONF-10.

* International Conference on Artificial Intelligence and Pattern Recognition (AIPR-10)
* International Conference on Automation, Robotics and Control Systems (ARCS-10)
* International Conference on Bioinformatics, Computational Biology, Genomics and Chemoinformatics (BCBGC-10)
* International Conference on Computer Communications and Networks (CCN-10)
* International Conference on Enterprise Information Systems and Web Technologies (EISWT-10)
* International Conference on High Performance Computing Systems (HPCS-10)
* International Conference on Information Security and Privacy (ISP-10)
* International Conference on Image and Video Processing and Computer Vision (IVPCV-10)
* International Conference on Software Engineering Theory and Practice (SETP-10)
* International Conference on Theoretical and Mathematical Foundations of Computer Science (TMFCS-10)

MULTICONF-10 will be held at Imperial Swan Hotel and Suites. It is a
full-service resort that puts you in the middle of the fun! Located 1/2 block
south of the famed International Drive, the hotel is just minutes from great
entertainment like Walt Disney World Resort, Universal Studios and Sea World
Orlando. Guests can enjoy free scheduled transportation to these theme parks,
as well as spacious accommodations, outdoor pools and on-site dining, all
situated on 10 tropically landscaped acres. Here, guests can experience a
full-service resort with discount hotel pricing in Orlando.

We invite draft paper submissions. Please see the website
http://www.PromoteResearch.org for more details.

Sincerely
John Edward


From biopython at maubp.freeserve.co.uk  Tue Mar 30 11:46:10 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 12:46:10 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
Message-ID: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>

Hi all,

I've got some "Sanger" capillary sequence files in ABI trace file
format, which I understand includes the probabilities of the 4 bases
along the sequencing run. I'd like to extract this as a FASTQ file
with meaningful quality scores based on the trace data (for use in
assembly).

This doesn't seem to work - the FASTQ quality score characters are all
double quotes (ASCII 34), meaning PHRED quality 1.

seqret -sformat abi -osformat fastq-sanger -sequence example.ab1
-outseq example.fastq -auto

Output as FASTA seems fine:

seqret -sformat abi -osformat fasta -sequence example.ab1 -outseq
example.fasta -auto

Is ABI to FASTQ a reasonable conversion to expect seqret to support? If so,
could it be added to the TODO list please?
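
For reference, the reading of the quality characters above follows from the fastq-sanger convention of storing each PHRED score as the character with ASCII code score + 33. A minimal sketch of the decoding (the function name is made up for this example):

```python
def sanger_qualities(quality_string):
    """Decode a fastq-sanger quality string into a list of PHRED scores."""
    return [ord(character) - 33 for character in quality_string]


if __name__ == "__main__":
    # A run of double quotes, as seen in the seqret output described above.
    print(sanger_qualities('"""'))
```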

Peter C.

P.S. I'd be interested to hear suggestions for alternative tools to
tackle this conversion.


From pmr at ebi.ac.uk  Tue Mar 30 12:02:25 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 13:02:25 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
Message-ID: <4BB1E851.1060607@ebi.ac.uk>

On 30/03/2010 12:46, Peter C. wrote:
> Hi all,
>
> I've got some "Sanger" capillary sequence files in ABI trace file
> format, which I understand includes the probabilities of the 4 bases
> along the sequencing run. I'd like to extract this as a FASTQ file
> with meaningful quality scores based on the trace data (for use in
> assembly).
>
> This doesn't seem to work - the FASTQ quality score characters are all
> double quotes (ASCII 34), meaning PHRED quality 1.

I will take a look. I don't recall anyone using the quality scores from ABI 
data when we first implemented it (at that time Staden Experiment files 
were the only supported output format with any quality scores).

regards,

Peter R


From biopython at maubp.freeserve.co.uk  Tue Mar 30 12:17:23 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 13:17:23 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1E851.1060607@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1E851.1060607@ebi.ac.uk>
Message-ID: <320fb6e01003300517q6e9358bj4a45112d3e23c57f@mail.gmail.com>

On Tue, Mar 30, 2010 at 1:02 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> On 30/03/2010 12:46, Peter C. wrote:
>>
>> Hi all,
>>
>> I've got some "Sanger" capillary sequence files in ABI trace file
>> format, which I understand includes the probabilities of the 4 bases
>> along the sequencing run. I'd like to extract this as a FASTQ file
>> with meaningful quality scores based on the trace data (for use in
>> assembly).
>>
>> This doesn't seem to work - the FASTQ quality score characters are all
>> double quotes (ASCII 34), meaning PHRED quality 1.
>
> I will take a look. I don't recall anyone using the quality scores from ABI
> data when we first implemented it (at that time Staden Experiment files
> were the only supported output format with any quality scores)
>

Thanks Peter,

Regarding other possible tools, there is the obvious choice of
PHRED (although getting a copy is non-trivial), and based on
this thread: http://seqanswers.com/forums/showthread.php?t=3165
I've just tried TraceTuner 3.0.6beta which is open source
(specifically, GPL v2 or later):
https://sourceforge.net/projects/tracetuner/

Using the ttuner -nocall option to reuse the sequence as-is from
the ABI file results in zero quality scores.

Allowing ttuner to re-call the bases (the default), it can output
FASTA/QUAL/PHD with meaningful qualities (from which I can
easily make a FASTQ file).
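
The last step could look roughly like this. It is a sketch under assumptions: the sequence and its whitespace-separated quality values are taken to be already read from the FASTA and QUAL files, and scores are clamped to the 0 to 93 range that fastq-sanger can represent. The function names are made up for this example.

```python
def parse_qual_values(text):
    """Parse whitespace-separated integer scores from a QUAL record body."""
    return [int(token) for token in text.split()]


def fastq_record(name, sequence, qualities):
    """Format one fastq-sanger record from a sequence and its PHRED scores."""
    if len(sequence) != len(qualities):
        raise ValueError("sequence and quality lengths differ")
    # fastq-sanger stores PHRED 0..93 as ASCII 33..126.
    encoded = "".join(chr(min(max(q, 0), 93) + 33) for q in qualities)
    return f"@{name}\n{sequence}\n+\n{encoded}\n"


if __name__ == "__main__":
    print(fastq_record("read1", "ACGT", parse_qual_values("0 1 40 93")), end="")
```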

Peter C.


From pmr at ebi.ac.uk  Tue Mar 30 13:13:28 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 14:13:28 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
Message-ID: <4BB1F8F8.8050608@ebi.ac.uk>

On 30/03/2010 12:46, Peter wrote:
> Hi all,
>
> I've got some "Sanger" capillary sequence files in ABI trace file
> format, which I understand includes the probabilities of the 4 bases
> along the sequencing run. I'd like to extract this as a FASTQ file
> with meaningful quality scores based on the trace data (for use in
> assembly).
>
> This doesn't seem to work - the FASTQ quality score characters are all
> double quotes (ASCII 34), meaning PHRED quality 1.

We have code to extract various fields from ABI trace files, but I'm not 
familiar with the details of the format, and documentation appears hard to 
find.

Where do I look to find scores that we can use (and how do we convert those 
to phred quality scores)?

regards,

Peter


From pmr at ebi.ac.uk  Tue Mar 30 13:25:53 2010
From: pmr at ebi.ac.uk (Peter Rice)
Date: Tue, 30 Mar 2010 14:25:53 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1F8F8.8050608@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1F8F8.8050608@ebi.ac.uk>
Message-ID: <4BB1FBE1.8030400@ebi.ac.uk>

On 30/03/2010 14:13, Peter Rice wrote:

> Where do I look to find scores that we can use (and how do we convert
> those to phred quality scores)?

Aha, found something. The field is called PCON (confidence values), with 
values 0-255.

There is a possibility that these could be phred scores, but I suspect they 
are whatever the basecaller has decided to write there.

http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
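
One quick sanity check on a given file is to look at the range of the PCON values. The sketch below assumes the PCON bytes have already been pulled out of the ABIF file by other code (not shown), and the 93 cutoff is simply the largest score fastq-sanger can encode. A small range would be consistent with phred scores but would not prove it.

```python
def summarize_pcon(raw):
    """Summarize raw PCON bytes (one unsigned value, 0-255, per base)."""
    values = list(raw)  # iterating over bytes yields integers
    return {
        "min": min(values),
        "max": max(values),
        "fits_fastq_sanger": max(values) <= 93,
    }


if __name__ == "__main__":
    # Made-up byte values purely for illustration.
    print(summarize_pcon(bytes([12, 35, 50, 47])))
```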

Peter R.


From ztu at msi.umn.edu  Tue Mar 30 13:33:56 2010
From: ztu at msi.umn.edu (Zheng Jin Tu)
Date: Tue, 30 Mar 2010 08:33:56 -0500 (CDT)
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1FBE1.8030400@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
Message-ID: <Pine.LNX.4.63.1003300832400.6854@cl4.msi.umn.edu>


Hi Peter:

You may want to check this URL about how to 
convert quality score:

 http://maq.sourceforge.net/fastq.shtml

Thanks, TU

=======================================

On Tue, 30 Mar 2010, Peter Rice wrote:

> On 30/03/2010 14:13, Peter Rice wrote:
> 
> > Where do I look to find scores that we can use (and how do we convert
> > those to phred quality scores)?
> 
> Aha, found something. The field is called PCON (confidence values), with
> values 0-255.
> 
> There is a possibility that these could be phred scores, but I suspect they
> are whatever the basecaller has decided to write there.
> 
> http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
> 
> Peter R.


From biopython at maubp.freeserve.co.uk  Tue Mar 30 13:56:34 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 14:56:34 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <4BB1FBE1.8030400@ebi.ac.uk>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
Message-ID: <320fb6e01003300656g6d9b5c28oe0f98f8deedbd484@mail.gmail.com>

On Tue, Mar 30, 2010 at 2:25 PM, Peter Rice <pmr at ebi.ac.uk> wrote:
>
> On 30/03/2010 14:13, Peter Rice wrote:
>
>> Where do I look to find scores that we can use (and how do we convert
>> those to phred quality scores)?
>
> Aha, found something. The field is called PCON (confidence values), with
> values 0-255.
>
> There is a possibility that these could be phred scores, but I suspect they
> are whatever the basecaller has decided to write there.
>
> http://www.appliedbiosystems.com/support/software_community/ABIF_File_Format.pdf
>
> Peter R.

Hmm. Good question - I don't know, although if they are PHRED scores
they could go unusually high (we'd expect say 0 to 50 for a raw read).
It could be some other encoding (e.g. scaled from 0 for a poor base to
255 for a perfect base). Do you have any contacts at Applied Biosystems
to ask?

Peter C.


From biopython at maubp.freeserve.co.uk  Tue Mar 30 13:58:18 2010
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Tue, 30 Mar 2010 14:58:18 +0100
Subject: [EMBOSS] ABI to FASTQ with seqret
In-Reply-To: <Pine.LNX.4.63.1003300832400.6854@cl4.msi.umn.edu>
References: <320fb6e01003300446g487fb954jdc94112224182772@mail.gmail.com>
	<4BB1F8F8.8050608@ebi.ac.uk> <4BB1FBE1.8030400@ebi.ac.uk>
	<Pine.LNX.4.63.1003300832400.6854@cl4.msi.umn.edu>
Message-ID: <320fb6e01003300658l2742a656ge766c3fcd5a2fa44@mail.gmail.com>

On Tue, Mar 30, 2010 at 2:33 PM, Zheng Jin Tu <ztu at msi.umn.edu> wrote:
>
>
> Hi Peter:
>
> You may want to check this URL about how to
> convert quality score:
>
> http://maq.sourceforge.net/fastq.shtml
>
> Thanks, TU

Thanks - but that just covers converting between PHRED scores
and Solexa Scores. Peter Rice and I are well aware of this.

The question here is what do the numbers in ABI files mean?

Peter C.


From georgios at biotek.uio.no  Wed Mar 31 18:08:07 2010
From: georgios at biotek.uio.no (Georgios Magklaras)
Date: Wed, 31 Mar 2010 20:08:07 +0200
Subject: [EMBOSS] MRS/EMBOSS lecture notes and videos
Message-ID: <4BB38F87.3020803@biotek.uio.no>

Hi,

Just to let people know (some folks expressed interest). You can find 
some interesting lecture notes, as part of an EMBnet course given in 
Mexico about sequence mining with EMBOSS/MRS here:

http://folk.uio.no/georgios/other/mrskurs.pdf

Some video shots of the presented material can be obtained from this URL:
http://www.nnb.unam.mx/video/track

(I will try and obtain the videos in a non-flash format, however the URL 
should make them available in the meantime).

Best regards,
GM

-- 

George Magklaras BSc (Hons) MPhil RHCE
IT Systems Manager/Senior Systems Engineer
The Biotechnology Center of Oslo
University of Oslo

http://www.biotek.uio.no
http://www.no.embnet.org
http://folk.uio.no/georgios