[BioRuby] Parsing GFF3 attributes
Michael Han
mh6 at sanger.ac.uk
Tue May 15 16:10:20 UTC 2007
On 15 May 2007, at 16:30, hienle at club-internet.fr wrote:
> Hello all,
>
> I am working with a GFF3-formatted file and have noticed that the
> attributes field is not parsed properly.
>
> In bio/db/gff.rb,
>
> 75 def parse_attributes(attributes)
> 76 hash = Hash.new
> 77 attributes.split(/[^\\];/).each do |atr|
> 78 key, value = atr.split(' ', 2)
> 79 hash[key] = value
> 80 end
> 81 return hash
> 82 end
> 83 end
>
> I changed :
> 78 key, value = atr.split(' ', 2)
> to:
> 78 key, value = atr.split('=', 2)
>
> and it now appears to behave properly. However, I am not certain if
> this is appropriate for backward compatibility with GFF and GFF2.
I normally use spaces between the key and the value of the attributes
for GFF2, like: Gene "1234" ; Transcript "1234"
as described in <"http://www.sanger.ac.uk/Software/formats/GFF/GFF_Spec.shtml">,
so the change would break GFF / GFF2 parsing.
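To illustrate the problem, here is a minimal sketch in plain Ruby. The helper split_attribute and the sample attribute strings are made up for this example and are not part of BioRuby; it only shows what String#split does with each separator.

```ruby
# Hypothetical helper (not BioRuby code): split one attribute into
# key and value on the given separator, at most once.
def split_attribute(attribute, separator)
  attribute.strip.split(separator, 2)
end

gff2_attribute = 'Gene "1234"'   # GFF2 style: whitespace between key and value
gff3_attribute = 'ID=gene1234'   # GFF3 style: '=' between key and value

# Splitting on whitespace works for the GFF2 attribute.
gff2_by_space = split_attribute(gff2_attribute, ' ')

# Splitting on '=' leaves the GFF2 attribute in one piece, so
# "key, value = ..." would assign the whole string to key and nil to value.
gff2_by_equals = split_attribute(gff2_attribute, '=')
key, value = gff2_by_equals

# Splitting on '=' works for the GFF3 attribute.
gff3_by_equals = split_attribute(gff3_attribute, '=')
```

With the '=' separator the GFF2 attribute ends up as a single key with no value, which is why changing line 78 in place is not backward compatible.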
Maybe you could create a separate GFF3 parser inheriting from the
parser in gff.rb.
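Something along these lines might work, as a rough sketch only. SimpleGFF and SimpleGFF3 are hypothetical stand-ins, not the actual classes in bio/db/gff.rb, and for brevity the sketch splits on a plain ';' and ignores escaped semicolons.

```ruby
# Hypothetical, simplified stand-in for the GFF / GFF2 parser.
class SimpleGFF
  def parse_attributes(attributes)
    hash = {}
    attributes.split(';').each do |atr|
      key, value = split_pair(atr.strip)
      hash[key] = value unless key.nil?   # skip empty fragments
    end
    hash
  end

  private

  # GFF / GFF2: key and value are separated by whitespace.
  def split_pair(atr)
    atr.split(' ', 2)
  end
end

# Hypothetical GFF3 subclass: only the key/value separator changes.
class SimpleGFF3 < SimpleGFF
  private

  # GFF3: key and value are separated by '='.
  def split_pair(atr)
    atr.split('=', 2)
  end
end

gff2 = SimpleGFF.new.parse_attributes('Gene "1234" ; Transcript "1234"')
gff3 = SimpleGFF3.new.parse_attributes('ID=gene1234;Name=abc')
```

This way the existing GFF / GFF2 behaviour stays untouched and only the subclass uses the '=' separator.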
A GFF3 reference (note: the latest version is from a few weeks ago):
<"http://www.sequenceontology.org/gff3.shtml">
> Is anyone working on parsing GFF3 files?
>
> Thank you in advance for your help,
> -Hien
Michael