[Biopython] Bio.PDB local MMCIF files
João Rodrigues
anaryin at gmail.com
Thu Dec 5 16:53:46 UTC 2013
Hi Dave,
I understand your concern. Python has the gzip module, which can decompress
the files on the fly and provide a handle for the content. This will not
work for the parsers as they stand, since they expect a filename. I will have
a look at the parsers' code and, if it's simple, I'll add a layer to do
exactly this.
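
To illustrate what I mean, here is a minimal sketch, assuming Python 3. It
uses only the standard library: open_maybe_gzip and first_data_line are
hypothetical helpers I made up for this example, not part of Bio.PDB, and
first_data_line only stands in for a parser that would accept either a
filename or an open handle.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzip(path):
    """Hypothetical helper: return a text handle, decompressing on the fly
    if the file starts with the gzip magic bytes."""
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == GZIP_MAGIC
    if is_gzip:
        return gzip.open(path, "rt")
    return open(path, "r")


def first_data_line(source):
    """Hypothetical stand-in for a parser: accepts a filename or a handle."""
    if isinstance(source, str):
        with open_maybe_gzip(source) as handle:
            return handle.readline().strip()
    return source.readline().strip()


# Build a throwaway gzipped file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "example.cif.gz")
with gzip.open(gz_path, "wt") as out:
    out.write("data_EXAMPLE\n_entry.id EXAMPLE\n")

# Option 1: the caller opens the decompression layer and passes the handle.
with gzip.open(gz_path, "rt") as handle:
    header_from_handle = first_data_line(handle)

# Option 2: the callee detects the compression itself from the filename.
header_from_name = first_data_line(gz_path)
```

Both options read the same first line; the difference is only in who is
responsible for opening the file through the gzip layer.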
Cheers,
João
2013/12/5 Dave Howorth <dhoworth at mrc-lmb.cam.ac.uk>
> João Rodrigues wrote:
> > Dear Dave,
> >
> > I'm not quite sure I understood your question. PDBList is used to download
> > and maintain a local copy of the PDB, which would not suit you since you
> > are looking for mmCIF data. It could be tweaked however to download mmCIF
> > files. Is this what you are looking for?
>
> Sorry, I didn't express myself very well. I misunderstood the purpose of
> PDBList, and at the time thought it was simply a way to tell Biopython
> where the local archive was. I already have access to a PDB/mmCIF
> archive; I don't need to create one.
>
> > As for mmCIF parsing and manipulation, currently the parser accepts a path
> > to the file (relative paths should do) but indeed it does not handle
> > compression. I think it would be up to the user to inflate the gz file
> > before parsing.
>
> I don't think that is very convenient, since all the files are normally
> stored compressed. That's the usual case. Using a filename as the only
> way to specify a file means that I would have to open the file in the
> archive, read and uncompress it and store it in another file before
> passing the name of that file to the mmCIF parser. Unless Python
> supports some means to incorporate a decompression layer specification
> into the 'filename'? (Sorry, I'm new to Python)
>
> I would think that the 'nicest' solution would be for the parser to
> recognize a compressed file and use a gzip layer to decompress it on the
> fly. Alternatively, the parser could accept an open file handle as an
> alternative to a filename and the caller would be responsible for
> opening the file through a decompression layer.
>
> Since the caller is going to have to deal with prepending the library
> base to the filename anyway, I suppose having it produce a decompressed
> stream is not a great problem, if only it could pass the stream to the
> parser!
>
> Cheers, Dave
>
> > Best,
> >
> > João
>
> PS, Sorry if I'm breaking threads by replying to the copy of the email
> that João sent to me, but the copy from the mail server hasn't arrived
> here yet, despite being visible at gmane.
>