[BioRuby] GFF3 status (possible bug?)
Naohisa GOTO
ngoto at gen-info.osaka-u.ac.jp
Fri Feb 6 11:29:40 UTC 2009
Hi,
Thank you for reporting bugs.
On Fri, 6 Feb 2009 14:27:10 +0900
Tomoaki NISHIYAMA <tomoakin at kenroku.kanazawa-u.ac.jp> wrote:
> Hi,
>
> Today, I got the code from git and tried parsing a GFF3 file (from
> TAIR8).
>
> a code fragment
>
> open(transcriptgff,"r").each_line do |gffline|
>   record=Bio::GFF::GFF3::Record.new(gffline)
>   p record
>   curid = record.id
>   p curid
>   ...
>
> results
>
> #<Bio::GFF::GFF3::Record:0x2b439aa9c640 @frame=nil, @start=3631,
> @strand="+", @feature="gene", @score=nil, @source="TAIR8",
> @attributes=[["ID", "AT1G01010"], ["Note", "protein_coding_gene"],
> ["Name", "AT1G01010"]], @end=5899, @seqname="Chr1">
> /usr/local/lib/ruby/site_ruby/1.8/bio/db/gff.rb:1084:in `[]': can't
> convert String into Integer (TypeError)
> from /usr/local/lib/ruby/site_ruby/1.8/bio/db/gff.rb:1084:in
> `id'
>
> It seems that the @attributes is now not a hash, but an array of key,
> value pairs.
Yes, @attributes is now an array of [ key, value ] pairs.
See doc/Changes-1.3.rdoc for details of the changes.
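
The TypeError in the report follows directly from this change: Array#[]
takes an Integer index, so a String key is rejected. A minimal plain-Ruby
sketch using the attribute pairs from the record shown above (BioRuby is
not needed, and the constant name REPORTED_ATTRIBUTES is only for
illustration):

```ruby
# Attribute pairs copied from the inspected record in the report.
REPORTED_ATTRIBUTES = [
  ['ID', 'AT1G01010'],
  ['Note', 'protein_coding_gene'],
  ['Name', 'AT1G01010']
]

# Array#[] expects an Integer index, so a String key raises TypeError.
begin
  REPORTED_ATTRIBUTES['ID']
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

# An Integer index returns the whole [ key, value ] pair.
p REPORTED_ATTRIBUTES[0]
```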
> On the other hand, id expects it to be a hash.
>
> The code in gff.rb looks like this:
>
> # Represents a single line of a GFF3-formatted file.
> # See Bio::GFF::GFF3 for more information.
> class Record < GFF2::Record
>
>   include GFF3::Escape
>
>   # shortcut to the ID attribute
>   def id
>     @attributes['ID']
>   end
>
> I suppose this is a remnant of the GFF code from when attributes were a hash.
You are right. This is apparently a bug.
I've just fixed it.
http://github.com/bioruby/bioruby/commit/5258d88ef98a12fd7829eb86aa8664a18a672a43
> The change from hash to array is presumably because
> the key may not be unique in attributes.
That is one reason why @attributes was changed.
> One way to straighten this out may be to create a key to [array of values]
> hash when the same key is specified more than once. (When multiple values
> are given for each key, it should be represented as key to
> [array of arrays].)
>
> Otherwise, we may define id to scan the array as
> def id
>   val = nil
>   @attributes.each do |keyval|
>     if(keyval[0] == 'ID')
>       val = keyval[1]
>       break
>     end
>   end
>   val
> end
Ruby has built-in support for arrays of [ key, value ] pairs.
See the Ruby reference manual for Array#assoc.
For example,
key, val = @attributes.assoc('ID')
This is almost the same as
key, val = @attributes.find { |a| a[0] == 'ID' }
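
To make the comparison concrete, here is a small plain-Ruby sketch of both
lookups. The helper names below are hypothetical and are not part of
BioRuby; the attribute pairs are the ones from the report.

```ruby
SAMPLE_ATTRIBUTES = [
  ['ID', 'AT1G01010'],
  ['Note', 'protein_coding_gene'],
  ['Name', 'AT1G01010']
]

# Hypothetical helper: value of the first pair whose key matches, or nil.
def first_value_by_assoc(attributes, key)
  pair = attributes.assoc(key)   # first sub-array whose first element == key
  pair ? pair[1] : nil
end

# The same lookup written with Enumerable#find.
def first_value_by_find(attributes, key)
  pair = attributes.find { |a| a[0] == key }
  pair ? pair[1] : nil
end

p first_value_by_assoc(SAMPLE_ATTRIBUTES, 'ID')
p first_value_by_find(SAMPLE_ATTRIBUTES, 'ID')
p first_value_by_assoc(SAMPLE_ATTRIBUTES, 'Parent')  # key not present
```

Both forms return only the first matching pair, and both give nil when the
key is absent.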
> It is also nice if a function to get the attribute value for
> a specific key is provided.
New methods to set/get/replace attributes have been added.
See doc/Changes-1.3.rdoc and the RDoc of Bio::GFF::GFF2 and
Bio::GFF::GFF3 for details.
> This should be easier with key to
> array of values approach, although the order of attributes
> will not be conserved.
I think it is better to keep the order of attributes, so
I decided to use an array containing [ key, value ] pairs.
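
As a rough illustration of that trade-off, the plain-Ruby sketch below
shows that an array of pairs keeps both repeated keys and the original
order. The attribute values are made up for the example and the helper
name is hypothetical, not a BioRuby method.

```ruby
# Made-up attribute list in which the key 'Parent' appears twice.
REPEATED_KEY_ATTRIBUTES = [
  ['ID', 'exon1'],
  ['Parent', 'mRNA1'],
  ['Parent', 'mRNA2'],
  ['Name', 'e1']
]

# Hypothetical helper: every value stored under a key, in original order.
def all_values_for(attributes, key)
  attributes.select { |k, _| k == key }.map { |_, v| v }
end

p all_values_for(REPEATED_KEY_ATTRIBUTES, 'Parent')
p REPEATED_KEY_ATTRIBUTES.map { |k, _| k }  # keys in their original order
```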
>
> Which way are you going?
>
> I hope this can be corrected before 1.3.0 release.
>
> Best wishes
>
> --
> Tomoaki NISHIYAMA
>
> Advanced Science Research Center,
> Kanazawa University,
> 13-1 Takara-machi,
> Kanazawa, 920-0934, Japan
>
Thanks,
Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org