From pmr at ebi.ac.uk  Mon Nov  3 06:19:44 2008
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 03 Nov 2008 11:19:44 +0000
Subject: [emboss-dev] USA syntax and `%' character in sequence file names
In-Reply-To: <20081024074322.GB216223@medusa.sis.pasteur.fr>
References: <20081023192000.GA222588@medusa.sis.pasteur.fr>	<39434.86.9.126.186.1224798474.squirrel@webmail.ebi.ac.uk>
	<20081024074322.GB216223@medusa.sis.pasteur.fr>
Message-ID: <490EDE50.2010406@ebi.ac.uk>

Nicolas Joly wrote:
> On Thu, Oct 23, 2008 at 10:47:54PM +0100, ajb at ebi.ac.uk wrote:
>> Hi Nicolas,
>>
>> What it does, given a USA like:
>>
>>     foo%10
>>
>> is to seek 10 bytes into file foo and try to start
>> reading a sequence from there. It does not, however, currently check that
>> what appears after the '%' is a valid number. I believe invalid numbers
>> are equivalent to an offset of 0.
>>
>> I suspect it might have been intended as a useful debugging tool for
>> the programmer rather than as something for the biologist.
>> If we leave it as an option we ought to mention it in the documentation
>> in some form though.
> 
> Thanks, Alan. Personally, I would get rid of it. But if you plan to
> keep it, please check for valid numbers before using it.

We do need it - for saving USAs when reading files.

For example, it is needed for sequence file formats where the ID is not
unique or has to be generated. It is also potentially useful together with
the offsets stored by the database indexing systems, and for future use
with other data types.
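
As a rough illustration of the first case, here is a minimal Python sketch
(not EMBOSS code) of how byte offsets can identify records whose IDs are
not unique. It assumes a FASTA-style file in which each record starts with
a '>' line, and the function name offset_usas is hypothetical.

```python
def offset_usas(path):
    """Yield a 'path%offset' string for each record in a FASTA-style file.

    Assumption for illustration only: a record starts at every line
    beginning with '>'. The file is read in binary mode so that the
    offsets are true byte positions suitable for seeking.
    """
    offset = 0
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                yield f"{path}%{offset}"
            # len(line) includes the line terminator, so the running
            # total is the byte position of the next line.
            offset += len(line)
```

For a file containing two records that are both named "seq", the two
strings differ only in the offset, so each record can still be addressed
on its own.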

Yes, we will fix it to check that the number is valid... and add to the 
documentation.
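
The kind of check meant here could look roughly like the following Python
sketch. It is not the EMBOSS implementation: it handles only the plain
file%offset form (no format prefix, entry name or range), and the names
split_offset_usa and open_at_offset are hypothetical.

```python
def split_offset_usa(usa):
    """Split 'file%offset' into (path, offset).

    Raises ValueError when the text after the last '%' is not a
    non-negative decimal integer, instead of silently using offset 0.
    A string without '%' is returned with offset 0.
    """
    if "%" not in usa:
        return usa, 0
    path, _, text = usa.rpartition("%")
    if not path:
        raise ValueError(f"missing file name in USA {usa!r}")
    # isdigit() alone accepts some non-ASCII digits, so require ASCII too.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"invalid byte offset {text!r} in USA {usa!r}")
    return path, int(text)


def open_at_offset(usa):
    """Open the file named in the USA and seek to the validated offset."""
    path, offset = split_offset_usa(usa)
    handle = open(path, "rb")
    handle.seek(offset)
    return handle
```

With this, "foo%10" gives ("foo", 10), while "foo%", "foo%1x" and "foo%-3"
are rejected rather than being read from the start of the file.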

regards,

Peter

From njoly at pasteur.fr  Tue Nov  4 09:16:35 2008
From: njoly at pasteur.fr (Nicolas Joly)
Date: Tue, 4 Nov 2008 15:16:35 +0100
Subject: [emboss-dev] USA syntax and `%' character in sequence file names
In-Reply-To: <490EDE50.2010406@ebi.ac.uk>
References: <20081023192000.GA222588@medusa.sis.pasteur.fr>
	<39434.86.9.126.186.1224798474.squirrel@webmail.ebi.ac.uk>
	<20081024074322.GB216223@medusa.sis.pasteur.fr>
	<490EDE50.2010406@ebi.ac.uk>
Message-ID: <20081104141635.GA297195@medusa.sis.pasteur.fr>

On Mon, Nov 03, 2008 at 11:19:44AM +0000, Peter Rice wrote:
> Nicolas Joly wrote:
> >On Thu, Oct 23, 2008 at 10:47:54PM +0100, ajb at ebi.ac.uk wrote:
> >>Hi Nicolas,
> >>
> >>What it does, given a USA like:
> >>
> >>    foo%10
> >>
> >>is to seek 10 bytes into file foo and try to start
> >>reading a sequence from there. It does not, however, currently check that
> >>what appears after the '%' is a valid number. I believe invalid numbers
> >>are equivalent to an offset of 0.
> >>
> >>I suspect it might have been intended as a useful debugging tool for
> >>the programmer rather than as something for the biologist.
> >>If we leave it as an option we ought to mention it in the documentation
> >>in some form though.
> >
> >Thanks, Alan. Personally, I would get rid of it. But if you plan to
> >keep it, please check for valid numbers before using it.
> 
> We do need it - for saving USAs when reading files.
> 
> For example, it is needed for sequence file formats where the ID is not
> unique or has to be generated. It is also potentially useful together with
> the offsets stored by the database indexing systems, and for future use
> with other data types.
> 
> Yes, we will fix it to check that the number is valid... and add to the 
> documentation.

Ok. Thanks.

-- 
Nicolas Joly

Biological Software and Databanks.
Institut Pasteur, Paris.
