[BioRuby] Preparing for 1.1 release
Mikael Borg
mikael.borg at utoronto.ca
Tue Jul 10 14:58:48 UTC 2007
On Tue, 2007-07-10 at 19:40 +0900, Naohisa GOTO wrote:
> Hi,
>
> On Mon, 09 Jul 2007 16:00:47 -0400
> Mikael Borg <mikael.borg at utoronto.ca> wrote:
>
> > There are still a few bugs in the pdb parser. I have tried to correct
> > the ones I've found (see below), but as I find the original code
> > difficult to understand, I might have introduced new bugs. Maybe you can
> > have a look and either use my suggested changes, or come up with other
> > solutions?
> >
> > Cheers,
> >
> > Mikael
> >
> > 1. Empty records cause the parser to crash through
> > Bio::PDB::Record.Pdb_LString(nil).
> > Solution: if the record is empty, create an empty string with String.new('').
>
> Thank you for the bug report.
> I changed "str" to "str.to_s" to fix the bug.
>
> > 2. Calling the sheet method (Bio::PDB) on a Bio::PDB structure that
> > doesn't contain any sheets crashes the parser.
> > Solution: return nil if there are no sheets in the structure.
>
> The same or a similar error could also occur for REMARK (remark),
> JRNL (jrnl), HELIX (helix), TURN (turn), SHEET (sheet),
> SSBOND (ssbond), SEQRES (seqres), DBREF (dbref), KEYWDS (keywords),
> AUTHOR (authors), HEADER (entry_id, accession, classification),
> TITLE (definition), and REVDAT (version) records (methods).
>
> This is mostly caused by the Bio::PDB#record method which
> returned nil when the specified record did not exist.
> I changed it to return an empty array for nonexistent records.
>
> All of the above bugs are now fixed and committed into CVS.
> For your convenience, the patch is attached below.
>
> Thanks,
>
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp / ngoto at bioruby.org
>
> -------------------------------------------------------------------
> --- lib/bio/db/pdb/pdb.rb 19 Apr 2007 13:59:29 -0000 1.22
> +++ lib/bio/db/pdb/pdb.rb 10 Jul 2007 10:17:38 -0000
> @@ -119,7 +119,7 @@
> m
> end
> def self.new(str)
> - String.new(str)
> + String.new(str.to_s)
> end
> end
>
> @@ -1674,7 +1674,7 @@
> # p pdb.record['HETATM']
> #
> def record(name = nil)
> - name ? @hash[name] : @hash
> + name ? (@hash[name] || []) : @hash
> end
>
> #--
> @@ -1837,12 +1837,13 @@
>
> # Classification in "HEADER".
> def classification
> - self.record('HEADER').first.classification
> + f = self.record('HEADER').first
> + f ? f.classification : nil
> end
>
> # Get authors in "AUTHOR".
> def authors
> - self.record('AUTHOR').first.authorList
> + self.record('AUTHOR').collect { |f| f.authorList }.flatten
> end
>
> #--
> @@ -1851,7 +1852,10 @@
>
> # PDB identifier written in "HEADER". (e.g. 1A00)
> def entry_id
> - @id = self.record('HEADER').first.idCode unless @id
> + unless @id
> + f = self.record('HEADER').first
> + @id = f ? f.idCode : nil
> + end
> @id
> end
>
> @@ -1862,12 +1866,14 @@
>
> # Title of this entry in "TITLE".
> def definition
> - self.record('TITLE').first.title
> + f = self.record('TITLE').first
> + f ? f.title : nil
> end
>
> # Current modification number in "REVDAT".
> def version
> - self.record('REVDAT').first.modNum
> + f = self.record('REVDAT').first
> + f ? f.modNum : nil
> end
>
> end #class PDB
> -------------------------------------------------------------------
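
The two changes in the patch above can be tried in isolation. What follows is only a minimal sketch: MiniPDB, Header, Author and lstring are hypothetical stand-ins, not BioRuby classes or methods. It shows why str.to_s avoids the crash on nil, and why returning an empty array from record lets callers use .first or collect safely.

```ruby
# Hypothetical stand-ins for parsed records (not BioRuby classes).
Header = Struct.new(:classification, :idCode)
Author = Struct.new(:authorList)

# Mirrors the first hunk: String.new(nil) raises TypeError,
# whereas nil.to_s is an empty string.
def lstring(str)
  String.new(str.to_s)
end

# Hypothetical miniature of the record lookup in the patch.
class MiniPDB
  def initialize(hash)
    @hash = hash
  end

  # Return an empty array instead of nil for nonexistent records.
  def record(name = nil)
    name ? (@hash[name] || []) : @hash
  end

  # [].first is nil, so guard before calling a method on it.
  def classification
    f = record('HEADER').first
    f ? f.classification : nil
  end

  # collect over an empty array simply yields an empty array.
  def authors
    record('AUTHOR').collect { |f| f.authorList }.flatten
  end
end

empty = MiniPDB.new({})
p lstring(nil)
p empty.record('SHEET')
p empty.classification
p empty.authors
```

With an empty structure, none of these calls raise; the accessors return an empty string, an empty array or nil instead.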
Thank you for taking care of this so fast, great job!
Have you considered adding an optional argument to Bio::PDB.new, so that
it would be possible to skip parsing parts of the pdb info, e.g.
remarks/hydrogen atoms/water molecules? The parser uses a lot of
memory, especially when Bio::PDB.inspect is called, because that forces
every record to be parsed. Maybe something for the next version, after 1.1 is done?
/Mikael
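
Until such an option exists, one possible workaround is to drop unwanted lines before the text reaches the parser. The sketch below is an assumption-laden illustration, not a Bio::PDB feature: filter_pdb_text and its keyword arguments are hypothetical names, water is recognised only by the residue name HOH (columns 18-20), and hydrogens only by the element symbol in columns 77-78, which older files may leave blank.

```ruby
# Hypothetical pre-filter; not part of Bio::PDB.
# Returns the PDB text with the selected lines removed.
def filter_pdb_text(text, skip_records: [], skip_water: false, skip_hydrogens: false)
  kept = text.each_line.reject do |line|
    name = line[0, 6].to_s.strip            # record name, columns 1-6
    next true if skip_records.include?(name)
    next false unless %w[ATOM HETATM].include?(name)

    res_name = line[17, 3].to_s.strip       # residue name, columns 18-20
    element  = line[76, 2].to_s.strip       # element symbol, columns 77-78
    (skip_water && res_name == 'HOH') || (skip_hydrogens && element == 'H')
  end
  kept.join
end
```

The filtered string could then be handed to Bio::PDB.new in place of the full file contents, so the skipped lines are never stored or parsed.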