[BioRuby] Parsing MSF alignment file
Fredrik Johansson
fredjoha at bioreg.kyushu-u.ac.jp
Mon Apr 13 09:19:27 EDT 2009
Yes, that's what happened. A regular expression matched all the way to
the last occurrence of two dots at the end of a line.
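
The effect can be reproduced outside BioRuby. Below is a minimal sketch, assuming a made-up and heavily shortened MSF-like text (the sequence names and values are hypothetical, not from a real alignment). It compares the old pattern applied to the whole string with the same pattern applied only to the part before the "//" separator:

```ruby
# Hypothetical, heavily shortened MSF-like text; not a real alignment.
msf = <<EOS
 test.msf  MSF: 8  Type: N  Check: 0 ..

 Name: seq1  Len: 8  Check: 0  Weight: 1.00
 Name: seq2  Len: 8  Check: 0  Weight: 1.00

//

seq1  ACGT..
seq2  ACGTAC
EOS

# Old approach: with /m, '.' also matches newlines, so the greedy '.*'
# runs to the LAST line ending in '..', which here is an alignment row.
old_match = msf[/.*\.\.$/m]

# Patched approach: cut at the '//' separator first, then apply the
# same pattern to the preamble only.
preamble, data = msf.split(/^\/\/$/)
new_match = preamble[/.*\.\.$/m]

puts old_match.include?("seq1  ACGT..")  # the match swallowed an alignment row
puts new_match.include?("seq1  ACGT..")  # the match stops at the header line
```

With the old pattern the match contains the Name: lines and the first alignment row, so they are removed from the string before they can be parsed. After the split, the match ends at the header line.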
> Thank you very much. Do you mean that Bio::GCG::Msf fails to
> parse an alignment that contains two contiguous gaps (..) at the
> end of a line? This is a bug, and the patch will soon be
> applied to the git repository.
>
>
>> $ diff msf.rb.old msf.rb.new
>>
>
> Next time, please use "diff -u" (unified context format).
>
>
Ok, I'll attach the output of diff -u here:
--- /usr/lib/ruby/gems/1.8/gems/bio-1.3.0/lib/bio/appl/gcg/msf.rb.old  2009-04-13 11:32:53.000000000 +0900
+++ /usr/lib/ruby/gems/1.8/gems/bio-1.3.0/lib/bio/appl/gcg/msf.rb  2009-04-13 13:36:26.000000000 +0900
@@ -30,11 +30,12 @@
# Creates a new Msf object.
def initialize(str)
str = str.sub(/\A[\r\n]+/, '')
- if /^\!\![A-Z]+\_MULTIPLE\_ALIGNMNENT/ =~ str[/.*/] then
- @heading = str[/.*/] # '!!NA_MULTIPLE_ALIGNMENT 1.0' or like this
- str.sub!(/.*/, '')
+ preamble, @data = str.split(/^\/\/$/)
+ if /^\!\![A-Z]+\_MULTIPLE\_ALIGNMNENT/ =~ preamble[/.*/] then
+ @heading = preamble[/.*/] # '!!NA_MULTIPLE_ALIGNMENT 1.0' or like this
+ preamble.sub!(/.*/, '')
end
- str.sub!(/.*\.\.$/m, '')
+ preamble.sub!(/.*\.\.$/m, '')
@description = $&.to_s.sub(/^.*\.\.$/, '').to_s
d = $&.to_s
if m = /(.+)\s+MSF\:\s+(\d+)\s+Type\:\s+(\w)\s+(.+)\s+(Comp)?Check\:\s+(\d+)/.match(d) then
@@ -45,10 +46,8 @@
@checksum = (m[6] ? m[6].to_i : nil)
end
- str.sub!(/.*\/\/$/m, '')
- a = $&.to_s.split(/^/)
@seq_info = []
- a.each do |x|
+ preamble.split(/^/).each do |x|
if /Name\: / =~ x then
s = {}
x.scan(/(\S+)\: +(\S*)/) { |y| s[$1] = $2 }
@@ -56,7 +55,6 @@
end
end
- @data = str
@description.sub!(/\A(\r\n|\r|\n)/, '')
@align = nil
end
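
The Name: loop that now reads from preamble can also be tried on its own. This is only a sketch, assuming made-up sample lines. seq_info and fields are local names chosen for illustration, not the instance variables in msf.rb:

```ruby
# Hypothetical preamble fragment; names and values are made up.
preamble = <<EOS

 Name: seq1  Len: 8  Check: 1234  Weight: 1.00
 Name: seq2  Len: 8  Check: 5678  Weight: 1.00

EOS

seq_info = []
# split(/^/) cuts the text into lines and keeps each trailing newline.
preamble.split(/^/).each do |line|
  next unless /Name\: / =~ line
  fields = {}
  # Each "Key: value" pair on the line becomes one hash entry.
  line.scan(/(\S+)\: +(\S*)/) { |key, value| fields[key] = value }
  seq_info << fields
end

seq_info.each { |fields| puts fields.inspect }
```

All values stay strings here. Any conversion to numbers would be a separate step.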