From pmr at ebi.ac.uk Mon Nov 3 06:19:44 2008
From: pmr at ebi.ac.uk (Peter Rice)
Date: Mon, 03 Nov 2008 11:19:44 +0000
Subject: [emboss-dev] USA syntax and `%' character in sequence file names
In-Reply-To: <20081024074322.GB216223@medusa.sis.pasteur.fr>
References: <20081023192000.GA222588@medusa.sis.pasteur.fr>
 <39434.86.9.126.186.1224798474.squirrel@webmail.ebi.ac.uk>
 <20081024074322.GB216223@medusa.sis.pasteur.fr>
Message-ID: <490EDE50.2010406@ebi.ac.uk>

Nicolas Joly wrote:
> On Thu, Oct 23, 2008 at 10:47:54PM +0100, ajb at ebi.ac.uk wrote:
>> Hi Nicolas,
>>
>> What it does, given a USA like:
>>
>>    foo%10
>>
>> is to seek 10 bytes into file foo and try to start
>> reading a sequence from there. It does not, however, currently check that
>> what appears after the '%' is a valid number. I believe invalid numbers
>> are equivalent to an offset of 0.
>>
>> I suspect it might have been intended as a useful debugging tool for
>> the programmer rather than as something for the biologist.
>> If we leave it as an option we ought to mention it in the documentation
>> in some form though.
>
> Thanks, Alan. Personally, I would get rid of it. But if you plan to
> keep it, please check for valid numbers before using it.

We do need it - for saving USAs when reading files.

For example, sequence file formats where the ID is not unique or has to be
generated. Also potentially useful together with the offsets stored by the
database indexing systems and for future use with other data types.

Yes, we will fix it to check that the number is valid... and add to the
documentation.

regards,

Peter

From njoly at pasteur.fr Tue Nov 4 09:16:35 2008
From: njoly at pasteur.fr (Nicolas Joly)
Date: Tue, 4 Nov 2008 15:16:35 +0100
Subject: [emboss-dev] USA syntax and `%' character in sequence file names
In-Reply-To: <490EDE50.2010406@ebi.ac.uk>
References: <20081023192000.GA222588@medusa.sis.pasteur.fr>
 <39434.86.9.126.186.1224798474.squirrel@webmail.ebi.ac.uk>
 <20081024074322.GB216223@medusa.sis.pasteur.fr>
 <490EDE50.2010406@ebi.ac.uk>
Message-ID: <20081104141635.GA297195@medusa.sis.pasteur.fr>

On Mon, Nov 03, 2008 at 11:19:44AM +0000, Peter Rice wrote:
> Nicolas Joly wrote:
> >On Thu, Oct 23, 2008 at 10:47:54PM +0100, ajb at ebi.ac.uk wrote:
> >>Hi Nicolas,
> >>
> >>What it does, given a USA like:
> >>
> >>   foo%10
> >>
> >>is to seek 10 bytes into file foo and try to start
> >>reading a sequence from there. It does not, however, currently check that
> >>what appears after the '%' is a valid number. I believe invalid numbers
> >>are equivalent to an offset of 0.
> >>
> >>I suspect it might have been intended as a useful debugging tool for
> >>the programmer rather than as something for the biologist.
> >>If we leave it as an option we ought to mention it in the documentation
> >>in some form though.
> >
> >Thanks, Alan. Personally, I would get rid of it. But if you plan to
> >keep it, please check for valid numbers before using it.
>
> We do need it - for saving USAs when reading files.
>
> For example, sequence file formats where the ID is not unique or has to be
> generated. Also potentially useful together with the offsets stored by the
> database indexing systems and for future use with other data types.
>
> Yes, we will fix it to check that the number is valid... and add to the
> documentation.

Ok. Thanks.

-- 
Nicolas Joly

Biological Software and Databanks.
Institut Pasteur, Paris.
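
For readers of this thread, the behaviour discussed above (seek to the byte offset given after `%', but reject anything that is not a valid number) can be sketched roughly as follows. This is only an illustration in Python, not the EMBOSS code: the helper names split_offset and read_from are made up for the example, and it handles only the plain `file%offset' form, ignoring the rest of the USA syntax.

```python
import os
import tempfile
from typing import Tuple


def split_offset(usa: str) -> Tuple[str, int]:
    """Split a 'file%offset' string into (file name, byte offset).

    Hypothetical helper: a string without '%' gets offset 0, and an
    invalid offset raises ValueError instead of silently becoming 0.
    """
    name, sep, tail = usa.rpartition("%")
    if not sep:
        return usa, 0
    # isascii() and isdigit() together accept only the characters 0-9.
    if not name or not (tail.isascii() and tail.isdigit()):
        raise ValueError(f"invalid byte offset in {usa!r}")
    return name, int(tail)


def read_from(usa: str) -> bytes:
    """Seek to the requested offset and return the rest of the file."""
    name, offset = split_offset(usa)
    if offset > os.path.getsize(name):
        raise ValueError(f"offset {offset} is past the end of {name!r}")
    with open(name, "rb") as handle:
        handle.seek(offset)
        return handle.read()


if __name__ == "__main__":
    # Two FASTA-style records; start reading at the second one.
    data = b">seq1\nACGT\n>seq2\nTTGA\n"
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        offset = data.index(b">seq2")
        print(read_from(f"{tmp.name}%{offset}"))
    finally:
        os.remove(tmp.name)
```

Raising an error on a bad offset, rather than falling back to 0, is the design choice Nicolas asks for; a string such as `foo%abc' is then reported instead of being read from the start of the file.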