[BioRuby] EMBL / ENA parser error

Michael Paulini mh6 at sanger.ac.uk
Thu Dec 1 11:30:22 EST 2011


Hi fellow biorubysts,

I tried to parse EMBL/ENA entry DQ471885 with the BioRuby EMBL parser,
and it raises an error when it reaches the line:
OS   uncultured nematode

due to the regexp in embl/common.rb being:
==================================
if tmp =~ /([A-Z][a-z]* *[\w\d \:\'\+\-]+[\w\d])/
  org = $1
  tmp =~ /(\(.+\))/
  os.push({'name' => $1, 'os' => org})
else
  raise "Error: OS Line. #{$!}\n#{fetch('OS')}\n"
end
================================
as the organism name doesn't start with an uppercase letter.
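
To illustrate, here is a minimal standalone sketch of the failure and of one
possible relaxation. It assumes the text handed to the regexp is the OS line
with the "OS" tag already stripped. RELAXED_OS and parse_os are hypothetical
names made up for this example; they are not part of BioRuby, and the relaxed
pattern is only one option, not a tested patch.

```ruby
# Pattern quoted above from embl/common.rb: needs an uppercase letter
# to start the organism name.
STRICT_OS = /([A-Z][a-z]* *[\w\d \:\'\+\-]+[\w\d])/

# Hypothetical relaxed variant: also accepts a lowercase first letter.
RELAXED_OS = /([A-Za-z][a-z]* *[\w\d \:\'\+\-]+[\w\d])/

# Hypothetical helper mirroring the quoted branch. Returns a hash with the
# organism name ('os') and the optional parenthesised part ('name'),
# or nil when the pattern does not match (where the parser would raise).
def parse_os(text, pattern)
  return nil unless text =~ pattern
  org = $1
  name = text[/(\(.+\))/, 1] # nil when there is no "(...)" part
  { 'name' => name, 'os' => org }
end

p parse_os('uncultured nematode', STRICT_OS)           # => nil
puts parse_os('uncultured nematode', RELAXED_OS)['os'] # uncultured nematode
puts parse_os('Homo sapiens (human)', STRICT_OS)['os'] # Homo sapiens
```

With the relaxed pattern the existing capitalised names still match as
before, so the change would only affect entries that currently raise.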

Should we change the regexp, or file a bug with ENA?

thanks,

Michael


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

