[Bioperl-l] Why does Bio::DB::GFF::Feature::gff3_string swap start and stop coordinates??
cjfields at uiuc.edu
Mon May 21 23:13:24 UTC 2007
Glimmer2 and Glimmer3 both assume the genome is circular by default
(I'm assuming because they are used for bacterial genomes). According to
the Glimmer3 release notes, the detail file has this information in the
header; from the Glimmer3 data used for tests:
Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA
Sequence file = ../BCTDNA
ICM model file = Glimmer3.icm
Excluded regions file = none
List of orfs file = none
Truncated orfs = false
Circular genome = true
There are options available for glimmer3 (-L, -X) that specify a
linear sequence or allow ORFs to extend past the end of the sequence
analyzed (the latter assumes a linear sequence).
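A parser could check that header before deciding how to treat wraparound coordinates. The following is only a minimal sketch in Python, assuming the header lines look exactly like the excerpt above; the function name parse_glimmer3_header is hypothetical and is not part of BioPerl or Glimmer.

```python
def parse_glimmer3_header(lines):
    """Collect 'key = value' pairs from a Glimmer3 detail-file header.

    Hypothetical helper: assumes each setting is on its own line in the
    form 'Key = value', as in the excerpt quoted in this message.
    """
    settings = {}
    for line in lines:
        key, sep, value = line.partition(" = ")
        if sep:  # lines without ' = ' (e.g. 'Command: ...') are skipped
            settings[key.strip()] = value.strip()
    return settings


header = [
    "Command: /bio/sw/glimmer3/bin/glimmer3 -o 50 -g 110 -t 30 ../BCTDNA",
    "Sequence file = ../BCTDNA",
    "ICM model file = Glimmer3.icm",
    "Truncated orfs = false",
    "Circular genome = true",
]

settings = parse_glimmer3_header(header)
is_circular = settings.get("Circular genome", "").lower() == "true"
```

With is_circular set, a forward-strand prediction whose start is greater than its end can be treated as wrapping around the origin instead of having its coordinates swapped.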
On May 21, 2007, at 4:21 PM, Mark Johnson wrote:
> That makes sense. Is that behavior documented anywhere? I'll
> feel like less of an idiot if it's not. 8) Either way, if you're
> sure that's what's going on, I'll fix up the parser to handle that as a
> split location.
>> I think I know what it is. If you mean these predictions:
>> 27 29263 6 [+1 L= 684 r=-1.187]
>> orf00001 29263 9 +1 9.60
>> Glimmer2/3 are predicting a gene on a circular chromosome that
>> starts at 29263 and ends at +9 (+6 for Glimmer2, which leaves off
>> the stop codon). Note that in the Glimmer2 detailed output the end is
>> 29946 and the length of the sequence is 29940, so Glimmer2 artificially
>> extends the end of the sequence with part of the start.
>> This is handled as a split location in bioperl and in most GenBank
>> files; the above would be a location string like
>> 'join(29263..29940,1..9)'. If you switched the start and stop, the
>> location would be '9..29263', which wouldn't be correct (and would be
>> a huge gene).
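The arithmetic behind that split location can be sketched as follows. This is an illustration in Python, not the BioPerl parser code; the function names are hypothetical, and it assumes a forward-strand prediction with 1-based inclusive coordinates on a circular sequence.

```python
def split_circular(start, end, seq_length):
    """Return (start, end) segments for a forward-strand feature.

    Assumption: on a circular sequence, start > end means the feature
    runs past the end of the sequence and continues from position 1.
    """
    if start <= end:
        return [(start, end)]
    return [(start, seq_length), (1, end)]


def location_string(segments):
    """Format segments as a GenBank-style location string."""
    parts = ",".join(f"{s}..{e}" for s, e in segments)
    return parts if len(segments) == 1 else f"join({parts})"


# Glimmer3 prediction orf00001 on the 29940 bp test sequence
segments = split_circular(29263, 9, 29940)
location = location_string(segments)          # 'join(29263..29940,1..9)'
length = sum(e - s + 1 for s, e in segments)  # 678 + 9 = 687
```

The length of 687 agrees with the Glimmer2 line above: L= 684 without the stop codon, plus 3 bases for the stop codon.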
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign