[Biopython] can I use the xml parser in biopython on other xml files? how?
Brandon Invergo
b.invergo at gmail.com
Thu Jul 12 09:25:27 UTC 2012
With regard to the error that you receive, it's because you're passing
a string to `read()`, when that method requires a file-like object (a
handle). This would fix that:
>>> ttt=Bio.Entrez.read(open('ipa111229.xml', 'r'))
However, that still wouldn't work, because the Bio.Entrez parser
requires a DTD from NCBI to read the file, and the patent XML is not
one of the NCBI formats.
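To illustrate the TypeError on its own, separate from the DTD issue:
judging by the traceback, the message comes from expat's ParseFile(),
which Bio.Entrez calls internally. Here is a minimal sketch using only
the standard library (the function name element_names is made up for
illustration, and this is Python 3 syntax, where ParseFile() wants a
handle that returns bytes):

```python
import io
import xml.parsers.expat


def element_names(handle):
    """Return the element names seen while parsing a file-like object."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # Record every start tag that expat reports.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    # ParseFile() calls handle.read(), so a plain string is rejected.
    parser.ParseFile(handle)
    return names


# A file-like object is accepted:
print(element_names(io.BytesIO(b"<a><b/></a>")))

# A plain string has no read() method, so this raises TypeError:
try:
    element_names("<a><b/></a>")
except TypeError as err:
    print("TypeError:", err)
```

An open file works the same way as the BytesIO object here, which is
why passing open(...) instead of the string gets past this error.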
Why not use one of Python's standard XML libraries (xml.sax, xml.dom or
xml.dom.minidom)?
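As a rough sketch of the standard-library route, here is one way it
might look. Two assumptions on my part: I'm using
xml.etree.ElementTree (another standard module, not one of those named
above), and I'm assuming the bulk file is several XML documents
concatenated together, each starting with its own "<?xml" declaration
line. The function names and the "title" element in the usage note are
made up for illustration; check the actual file for the real tag names.

```python
import io
import xml.etree.ElementTree as ET


def iter_xml_documents(handle):
    """Yield each concatenated XML document in a text handle as a string."""
    chunk = []
    for line in handle:
        # A new XML declaration marks the start of the next document
        # (assumption: each declaration begins its own line).
        if line.startswith("<?xml") and chunk:
            yield "".join(chunk)
            chunk = []
        chunk.append(line)
    if chunk:
        yield "".join(chunk)


def parse_documents(handle):
    """Parse every document in the handle and return the root elements."""
    # Encode to bytes so the parser honours the declared encoding.
    return [ET.fromstring(doc.encode("utf-8"))
            for doc in iter_xml_documents(handle)]
```

Usage would be something like

    with open('ipa111229.xml', encoding='utf-8') as handle:
        for root in parse_documents(handle):
            print(root.tag, root.findtext('title'))

Note this holds every parsed document in memory at once; for a large
file you would probably loop over iter_xml_documents() and parse one
document at a time instead.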
-brandon
On Thu, 2012-07-12 at 16:53 +0800, Wheaton Little wrote:
> I would like to use the Biopython xml parser, if possible, on google
> patent xmls:
>
> http://www.google.com/googlebooks/uspto-patents-applications-text.html
>
> unfortunately, this is what I get:
>
> >>> t=open('ipa111229.xml','r').read()
> >>> import Bio
> >>> ttt=Bio.Entrez.read(t[:30000])
>
> Traceback (most recent call last):
> File "<pyshell#20>", line 1, in <module>
> ttt=Bio.Entrez.read(t[:30000])
> File "/Library/Python/2.7/site-packages/Bio/Entrez/__init__.py",
> line 351, in read
> record = handler.read(handle)
> File "/Library/Python/2.7/site-packages/Bio/Entrez/Parser.py", line
> 169, in read
> self.parser.ParseFile(handle)
> TypeError: argument must have 'read' attribute
>
> What would I have to do to use the parser on this xml?
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython