[Biopython-dev] bug-request
Thomas Hamelryck
thamelry at vub.ac.be
Tue Dec 10 08:22:24 EST 2002
> Trying to apply the Biopython PDB module to all PDB structures available
> to me, I noticed that the job is sometimes killed while processing
> a file, especially a large PDB file.
> The cause seems to be a lack of memory.
Well, yes. PDB file 1HTQ is a monster of 70 MB, containing almost a million
atoms. If you want to use the PDB module for this you'll have to buy some
more memory, I guess. :-)
> Seemingly the problem files are
> read several times (due to an error within the header reading routine?).
> I came upon this because the program prints the same discontinuity
> messages several times to the screen.
No, those messages are printed because the chains in the file are
discontinuous. That is not what gets the job killed.
> Again the core problem: note that, for example, "1htq" and "1bxr" are not
> processed correctly; the job is killed after some time.
1BXR contains an error. It has two residues with the same identifier.
HETATM23384 K K 3985 -8.986 34.229 -48.036 1.00 54.69 K
HETATM47621 K K 3985 -19.641 -25.353 -32.655 1.00 39.94 K
Normally, this should be handled by using PDBParser(PERMISSIVE=1),
which leaves out the duplicated atoms, but there was a bug in the
error handling code: an old assert statement stood where a "raise
PDBConstructionError" statement should have been. That has been corrected
now, so you can try out the new version. 1BXR should parse OK now with
PDBParser(PERMISSIVE=1).
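To illustrate why the assert mattered, here is a minimal sketch in plain Python. It is not the Bio.PDB code: ConstructionError, build and the two add_residue_* functions are hypothetical names made up for this example, and the residue identifier is reduced to the (name, number) pair taken from the two HETATM lines above. The point is only that a permissive handler that catches the construction exception never sees an AssertionError.

```python
class ConstructionError(Exception):
    """Hypothetical stand-in for the parser's construction exception."""


def add_residue_with_assert(chain, res_id):
    # Old style: a duplicate trips an AssertionError, which the
    # permissive handler in build() does not catch.
    assert res_id not in chain, "duplicate residue %r" % (res_id,)
    chain.add(res_id)


def add_residue_with_raise(chain, res_id):
    # Fixed style: a duplicate raises the exception that the
    # permissive handler is written to catch.
    if res_id in chain:
        raise ConstructionError("duplicate residue %r" % (res_id,))
    chain.add(res_id)


def build(res_ids, add_residue, permissive=True):
    """Collect residue ids; in permissive mode skip the ones that fail."""
    chain = set()
    skipped = []
    for res_id in res_ids:
        try:
            add_residue(chain, res_id)
        except ConstructionError:
            if not permissive:
                raise
            skipped.append(res_id)
    return chain, skipped


# The two potassium records quoted above share the same name and number.
records = [("K", 3985), ("K", 3985)]

chain, skipped = build(records, add_residue_with_raise)
print("kept:", sorted(chain), "skipped:", skipped)

try:
    build(records, add_residue_with_assert)
except AssertionError as exc:
    print("assert version escapes permissive mode:", exc)
```

With the raise version the duplicate is skipped and the build completes; with the assert version the error propagates even though permissive mode is on. An assert also disappears entirely when Python runs with -O, which is another reason not to use it for input validation.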
Cheers,
-Thomas