[BioRuby] Parse big PDB use up all memory
Alex Gutteridge
alexg at kuicr.kyoto-u.ac.jp
Thu Dec 13 03:49:04 UTC 2007
Hi,
Could you give some more details on what system and ruby/bioruby
version you are running? The same script uses less than 20MB on my
machine (ruby 1.8.6 / bioruby 1.1.0 / ubuntu linux), which doesn't
seem so bad. Also 1w6k is biggish, but there are certainly bigger PDB
files out there so if you're having trouble with this one then others
will certainly be a problem.
In answer to your second question: yes, you should be able to extract
just the header (everything up to the ATOM records). But if you're
really running out of memory just parsing that file, then I suspect you
have deeper issues. Anyway, the sample below works for me for parsing
the header from 1w6k:
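require 'bio'

# Fetch the raw PDB entry as one string
serv = Bio::Fetch.new
entry = serv.fetch('pdb', '1w6k')

# Copy lines until the first ATOM record, i.e. keep only the header.
# each_line works on ruby 1.8 and later (String#each is 1.8 only).
header = ''
entry.each_line do |l|
  break if l.match(/^ATOM/)
  header << l
end

# Parse just the header
pdb = Bio::PDB.new(header)
p pdb.accession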
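
The same idea can be wrapped in a small helper so it can be tried without a network fetch. This is only a sketch: pdb_header is a hypothetical name, not part of BioRuby, and the sample text is a made-up toy entry rather than real data from 1w6k. It assumes, as above, that the header is everything before the first line starting with ATOM.

```ruby
# Hypothetical helper (not part of BioRuby): return everything that
# precedes the first ATOM record in a PDB-format string.
def pdb_header(text)
  header = String.new
  text.each_line do |line|
    # Stop at the first coordinate record; nothing after it is kept
    break if line =~ /\AATOM/
    header << line
  end
  header
end

# Made-up toy entry, only to show the behaviour
sample = "HEADER    EXAMPLE\n" \
         "TITLE     TOY ENTRY\n" \
         "ATOM      1  N   MET A   1\n" \
         "END\n"

puts pdb_header(sample)
```

With BioRuby loaded, the result could then be handed to the parser in the same way as the script above, e.g. Bio::PDB.new(pdb_header(entry)). Cutting on a record boundary like this avoids splitting a line in half, which can happen when slicing the string at a fixed character count.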
On 13 Dec 2007, at 10:54, Yen-Ju Chen wrote:
> This is what I did:
>
> require 'bio'
> serv = Bio::Fetch.new()
> entry = serv.fetch('pdb', '1w6k')
> pdb = Bio::PDB.new(entry)
>
> The last step uses up all memory and quits.
> The pdb file is quite big and I only need the information from the header.
> Is it possible to do something like this?
>
> pdb = Bio::PDB.new(entry[0-40000])
>
> Thanx for the help
Alex Gutteridge
Bioinformatics Center
Kyoto University