[BioPython] parsing pdb files

Peter biopython at maubp.freeserve.co.uk
Mon Feb 9 13:11:06 EST 2009


On Mon, Feb 9, 2009 at 5:25 PM, Bala subramanian wrote:
> Dear Peter,

Hi Bala,

I've CC'd this message back to the mailing list.

> Thank you for your reply. I don't have any way to write the B-factor. I tried
> to write the B-factor with a small script, but the parser still doesn't work
> and I get a ValueError as follows:
>
> Traceback (most recent call last):
> ...
>   File "/usr/lib/python2.5/site-packages/Bio/PDB/PDBParser.py", line 158, in
> _parse_coordinates
>     occupancy=float(line[54:60])
> ValueError: invalid literal for float(): 8    1
>
> A few lines from my PDB file are as follows:
>
> ATOM    29    P    C    2    -5.431    7.793    -13.678    1
> ATOM    60    P    A    3    -7.411    4.185    -9.249    1
> ATOM    93    P    C    4    -10.773    -0.351    -8.752    1
> ATOM    124    P    A    5    -12.260    -6.200    -8.124    1
> ATOM    157    P    A    6    -13.059    -12.433    -8.140    1

You said your PDB file is from a simulation program - did you write
this program, or did someone else?  If you wrote it, you should be able
to add the missing fields.  Even if you didn't, it would be fairly easy
to "fix" the file using a short Python script like this:

bad_pdb_filename = "1AD5_invalid.pdb"
fixed_pdb_filename = "1AD5_fixed.pdb"
input_handle = open(bad_pdb_filename, "r")
output_handle = open(fixed_pdb_filename, "w")
for number, line in enumerate(input_handle):
    if line.startswith("ATOM "):
        #Drop the line ending so that short lines can be padded safely
        line = line.rstrip("\r\n")
        if not line[54:60].strip():
            print("Line %i missing occupancy" % (number+1))
            #Pad very short lines out to column 54, then add the occupancy
            line = line[:54].ljust(54) + "  0.0 " + line[60:]
        if not line[60:66].strip():
            print("Line %i missing B factor" % (number+1))
            #Pad out to column 60, then add the B factor
            line = line[:60].ljust(60) + "  0.0 " + line[66:]
        line += "\n"
    #Any line which is not an ATOM line is written out unchanged
    output_handle.write(line)
input_handle.close()
output_handle.close()
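
Note that the script above only fills in blank occupancy and B factor
columns; it assumes everything else already sits in the fixed PDB columns.
The "8    1" in the traceback hints that the fields in this file may instead
be separated by arbitrary whitespace, in which case each ATOM line would
need rebuilding.  Here is a rough sketch of that, not tested on the real
file.  The function name reformat_atom_line is made up for illustration,
and the field order (serial, atom name, residue name, residue number, x, y,
z, then an optional occupancy and B factor) is only a guess from the sample
lines, so in particular the trailing "1" might not be an occupancy at all:

```python
def reformat_atom_line(line, default_occupancy=1.0, default_bfactor=0.0):
    """Rebuild a whitespace-delimited ATOM line using fixed PDB columns.

    ASSUMED field order (guessed from the sample lines, not confirmed):
    ATOM serial atom_name res_name res_seq x y z [occupancy [bfactor]]
    """
    fields = line.split()
    if len(fields) < 8 or fields[0] != "ATOM":
        raise ValueError("Unexpected ATOM line: %r" % line)
    serial = int(fields[1])
    atom_name = fields[2]
    res_name = fields[3]
    res_seq = int(fields[4])
    x, y, z = [float(value) for value in fields[5:8]]
    # Fall back on the defaults when the last two fields are absent
    occupancy = float(fields[8]) if len(fields) > 8 else default_occupancy
    bfactor = float(fields[9]) if len(fields) > 9 else default_bfactor
    # Simple heuristic: names shorter than four characters start in
    # column 14, which is the usual layout for one-letter elements
    if len(atom_name) < 4:
        atom_name = " " + atom_name
    # Columns: 1-6 record name, 7-11 serial, 13-16 atom name,
    # 18-20 residue name, 22 chain (left blank), 23-26 residue number,
    # 31-54 x/y/z, 55-60 occupancy, 61-66 B factor
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f\n" % (
        serial, atom_name, res_name, " ", res_seq,
        x, y, z, occupancy, bfactor)
```

You would call this on each line starting with "ATOM" and write the result
to the new file, leaving all other lines alone as in the script above.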

Also, if you wanted to try the latest Biopython code from CVS it
should ignore these missing fields in permissive mode (but will print
warning messages - see Bug 2751 for details).

Peter


