[BioRuby] GFF3 status (possible bug?)
Tomoaki NISHIYAMA
tomoakin at kenroku.kanazawa-u.ac.jp
Fri Feb 6 05:27:10 UTC 2009
Hi,
Today, I got the code from git and tried parsing a GFF3 file (from
TAIR8).
Here is a code fragment:
open(transcriptgff, "r").each_line do |gffline|
  record = Bio::GFF::GFF3::Record.new(gffline)
  p record
  curid = record.id
  p curid
  ...
Results:
#<Bio::GFF::GFF3::Record:0x2b439aa9c640 @frame=nil, @start=3631,
@strand="+", @feature="gene", @score=nil, @source="TAIR8",
@attributes=[["ID", "AT1G01010"], ["Note", "protein_coding_gene"],
["Name", "AT1G01010"]], @end=5899, @seqname="Chr1">
/usr/local/lib/ruby/site_ruby/1.8/bio/db/gff.rb:1084:in `[]': can't
convert String into Integer (TypeError)
from /usr/local/lib/ruby/site_ruby/1.8/bio/db/gff.rb:1084:in
`id'
It seems that @attributes is no longer a hash, but an array of
[key, value] pairs.
On the other hand, id still expects it to be a hash.
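The mismatch can be reproduced without BioRuby. The following is a minimal sketch in plain Ruby, using the attribute pairs copied from the inspected record above; the variable names are only for illustration, and the exact wording of the exception message depends on the Ruby version.

```ruby
# Attribute pairs as they appear in the inspected record.
attributes = [["ID", "AT1G01010"], ["Note", "protein_coding_gene"],
              ["Name", "AT1G01010"]]

error_message = nil
begin
  # Array#[] expects an Integer index (or a range), not a String key.
  attributes['ID']
rescue TypeError => e
  error_message = e.message
end
puts "TypeError: #{error_message}"

# A Hash built from the same pairs does accept a String key.
as_hash = Hash[attributes]
id_from_hash = as_hash['ID']
puts id_from_hash
```

Note that converting to a Hash this way silently keeps only the last value when a key is repeated.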
The code in gff.rb looks like this:
    # Represents a single line of a GFF3-formatted file.
    # See Bio::GFF::GFF3 for more information.
    class Record < GFF2::Record
      include GFF3::Escape
      # shortcut to the ID attribute
      def id
        @attributes['ID']
      end
I suppose this is a remnant of the earlier GFF code, when attributes
were a hash.
The change from hash to array was presumably made because
a key may not be unique within the attributes.
One way to straighten this out may be to build a hash that maps each
key to an [array of values] when the same key is specified more than
once. (When each occurrence of a key itself carries multiple values,
it would be represented as key to [array of arrays].)
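The key to [array of values] idea can be sketched as follows. This is only an illustration in plain Ruby: group_attributes is a hypothetical helper, not an existing BioRuby method, and the second "Note" value is made up to show a repeated key.

```ruby
# Hypothetical helper: collect [key, value] pairs into a Hash
# that maps each key to an Array of all its values.
def group_attributes(pairs)
  grouped = {}
  pairs.each do |key, value|
    (grouped[key] ||= []) << value
  end
  grouped
end

# "second note" is an invented value, used only to repeat the key.
pairs = [["ID", "AT1G01010"],
         ["Note", "protein_coding_gene"],
         ["Note", "second note"]]

grouped = group_attributes(pairs)
p grouped["ID"]
p grouped["Note"]
```

Values for one key keep their relative order, but the relative order between different keys is lost on Ruby versions whose Hash does not preserve insertion order.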
Otherwise, we could define id to scan the array, like this:
      def id
        val = nil
        @attributes.each do |keyval|
          if keyval[0] == 'ID'
            val = keyval[1]
            break
          end
        end
        val
      end
It would also be nice to have a method that returns the attribute
value for a specific key. This should be easier with the key to
array of values approach, although the order of attributes
would not be preserved.
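Such a lookup is also possible while keeping the array of pairs, so that the attribute order is retained. The sketch below uses only core Ruby (Array#assoc, select, map); attribute_values and first_attribute are hypothetical names, not part of BioRuby, and the second "Note" value is made up.

```ruby
# Hypothetical helper: all values for a key, in file order.
def attribute_values(pairs, key)
  pairs.select { |k, _v| k == key }.map { |_k, v| v }
end

# Hypothetical helper: first value for a key, or nil if absent.
# Array#assoc returns the first sub-array whose first element == key.
def first_attribute(pairs, key)
  pair = pairs.assoc(key)
  pair && pair[1]
end

# "second note" is an invented value, used only to repeat the key.
pairs = [["ID", "AT1G01010"],
         ["Note", "protein_coding_gene"],
         ["Note", "second note"]]

p first_attribute(pairs, "ID")
p attribute_values(pairs, "Note")
```

With helpers of this kind, id could reduce to a single lookup of 'ID' over the attribute pairs.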
Which way are you going?
I hope this can be corrected before the 1.3.0 release.
Best wishes
--
Tomoaki NISHIYAMA
Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan