[Biopython] a parsing error

Rodrigo Faccioli rodrigo_faccioli at uol.com.br
Tue May 25 21:46:07 UTC 2010


Hey João,

Good question. So far we have used this function only with files from the
PDB database. However, I think it can be split into two other functions:
isPDBFileATOM and isPDBFileSEQRES. The isPDBFile function would then call
both of them.

The other functions (isPDBFileATOM and isPDBFileSEQRES) can also be called
separately.

The code below is an example of one of these functions:

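import os

def isPDBFileATOM(pathFileName):
    # File is our own helper class (not shown here)
    path, name = os.path.split(pathFileName)
    FilePDB = File(path, name)
    # Assuming find() returns a true value when the record is present,
    # complain only when no ATOM record was found
    if not FilePDB.find("ATOM"):
        message = "The %s file is not a PDB File. Please, check it." % pathFileName
        raise Exception(message)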
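
To show how the three functions could fit together, here is a rough sketch of
the split. It is only an illustration under some assumptions, not the code in
our repository: it reads the file directly with open() instead of using our
File class, hasRecord is a hypothetical helper name, and the error messages
are made up for the example.

```python
def hasRecord(pathFileName, record):
    # Hypothetical helper: True if any line starts with the record name
    with open(pathFileName) as handle:
        return any(line.startswith(record) for line in handle)


def isPDBFileATOM(pathFileName):
    # Raise if the file has no ATOM records (an empty file also fails here)
    if not hasRecord(pathFileName, "ATOM"):
        raise Exception("The %s file has no ATOM records. Please, check it."
                        % pathFileName)


def isPDBFileSEQRES(pathFileName):
    # Raise if the file has no SEQRES records
    if not hasRecord(pathFileName, "SEQRES"):
        raise Exception("The %s file has no SEQRES records. Please, check it."
                        % pathFileName)


def isPDBFile(pathFileName):
    # The full check simply calls both partial checks
    isPDBFileATOM(pathFileName)
    isPDBFileSEQRES(pathFileName)
```

With this split, a file that has only ATOM records passes isPDBFileATOM but is
rejected by isPDBFile, and an empty file is rejected by all three functions.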

If you want, I can implement this idea.

Thanks,

--
Rodrigo Antonio Faccioli
Ph.D Student in Electrical Engineering
University of Sao Paulo - USP
Engineering School of Sao Carlos - EESC
Department of Electrical Engineering - SEL
Intelligent System in Structure Bioinformatics
http://laips.sel.eesc.usp.br
Phone: 55 (16) 3373-9366 Ext 229
Curriculum Lattes - http://lattes.cnpq.br/1025157978990218
Public Profile - http://br.linkedin.com/pub/rodrigo-faccioli/7/589/a5


On Tue, May 25, 2010 at 5:25 PM, João Rodrigues <anaryin at gmail.com> wrote:

> Hey Rodrigo,
>
> About that isPDB function of yours. What if the protein is the output of a
> web server that writes only ATOM records?
>
>
> Best!
>
> João [...] Rodrigues
> @ http://stanford.edu/~joaor/
>
>
>
> On Tue, May 25, 2010 at 12:56 PM, Rodrigo Faccioli <rodrigo_faccioli at uol.com.br> wrote:
>
>> Hi,
>>
>> Along these lines, we developed a method that checks whether a file is in
>> valid PDB format. The method is called isPDBFile. You can see it at [1]. If
>> you want, I can write new code for you.
>>
>> I hope this message helps.
>>
>> [1] http://github.com/rodrigofaccioli/ContributeToBioPython/blob/master/fcfrp/PDB.py
>>
>> Thanks in advance,
>>
>>
>>
>> On Tue, May 25, 2010 at 1:47 PM, Bala subramanian <bala.biophysics at gmail.com> wrote:
>>
>> > Hi,
>> > Thank you very much. I just checked and in fact one of the files was a
>> > corrupted one.
>> >
>> > Thank you,
>> > Bala
>> >
>> > On Tue, May 25, 2010 at 6:41 PM, João Rodrigues <anaryin at gmail.com> wrote:
>> >
>> > > Hello,
>> > >
>> > > I usually get that error when the parser finds an empty PDB file.
>> > >
>> > > Try outputting the name of the file you're currently parsing so you
>> > > know when it breaks.
>> > >
>> > > Best
>> > >
>> > > João [...] Rodrigues
>> > > @ http://stanford.edu/~joaor/
>> > >
>> > >
>> > >
>> > > On Tue, May 25, 2010 at 9:29 AM, Bala subramanian <
>> > > bala.biophysics at gmail.com> wrote:
>> > >
>> > >> Friends,
>> > >>
>> > >> The following code takes all the PDB files in the current directory and
>> > >> creates a matrix. I get a parsing error. Please tell me what is going wrong.
>> > >>
>> > >> from numpy import zeros,savetxt,matrix
>> > >> from Bio.PDB import *
>> > >> import glob
>> > >> donor=[ 'ARG','ASN', 'GLN', 'LYS', 'TRP' ]
>> > >> ali=['ALA', 'ARG', 'CYS', 'ILE', 'LEU', 'LYS', 'MET', 'PRO', 'THR', 'VAL']
>> > >> parser=PDBParser()
>> > >> X5_MAT=matrix(zeros((34,34),int))
>> > >> files=glob.glob('*.pdb')
>> > >> for i in range(len(files)):
>> > >>    strng=str(i)
>> > >>    structure=parser.get_structure(strng,files[i])
>> > >>    res=Selection.unfold_entities(structure,'R')
>> > >>
>> > >>    for x in range(len(res)):
>> > >>        for y in range(len(res)):
>> > >>            # count donor/aliphatic pairs of distinct residues
>> > >>            if (x != y and res[x].get_resname() in donor
>> > >>                    and res[y].get_resname() in ali):
>> > >>                X5_MAT[x,y] = X5_MAT[x,y] + 1
>> > >>
>> > >>
>> > >> savetxt('myfile.txt', matrix(X5_MAT), fmt='%d')
>> > >>
>> > >>
>> > >> *The error is pasted below*
>> > >>
>> > >> Traceback (most recent call last):
>> > >>  File "un_don_ali.py", line 14, in <module>
>> > >>    structure=parser.get_structure(' ',files[i])
>> > >>  File "/usr/lib/python2.5/site-packages/Bio/PDB/PDBParser.py", line 64, in get_structure
>> > >>    self._parse(file.readlines())
>> > >>  File "/usr/lib/python2.5/site-packages/Bio/PDB/PDBParser.py", line 82, in _parse
>> > >>    self.header, coords_trailer=self._get_header(header_coords_trailer)
>> > >>  File "/usr/lib/python2.5/site-packages/Bio/PDB/PDBParser.py", line 95, in _get_header
>> > >>    header=header_coords_trailer[0:i]
>> > >> UnboundLocalError: local variable 'i' referenced before assignment
>> > >> _______________________________________________
>> > >> Biopython mailing list  -  Biopython at lists.open-bio.org
>> > >> http://lists.open-bio.org/mailman/listinfo/biopython
>> > >>
>> > >
>> > >
>> >
>
>



