[Biopython] can I use the xml parser in biopython on other xml files? how?

Brandon Invergo b.invergo at gmail.com
Thu Jul 12 09:25:27 UTC 2012


The error you receive occurs because you're passing a string to
`read()`, which requires a file-like object (an open handle). This
would fix that:
>>> ttt=Bio.Entrez.read(open('ipa111229.xml', 'r'))

However, that still wouldn't work: the Bio.Entrez parser relies on
NCBI's DTDs to interpret the XML it reads, so it isn't suited to
arbitrary XML such as these patent files.

Why not use one of Python's standard XML libraries, such as xml.sax or
xml.dom (or xml.dom.minidom)?
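
As a rough sketch of the xml.sax route: the handler below just counts how
often each element name occurs. It assumes Python 3, and the sample
document and its tag names (`doc`, `claim`) are made up for illustration,
not taken from the patent files. `TagCounter` and `count_tags` are
hypothetical names, too.

```python
import io
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Count how often each element name occurs in a document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once for every opening tag the parser encounters.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(handle):
    """Parse XML from a file-like object and return {tag name: count}."""
    handler = TagCounter()
    xml.sax.parse(handle, handler)
    return handler.counts


# A made-up sample; a real file would be passed as open(path, 'rb').
sample = b"<doc><claim>first</claim><claim>second</claim></doc>"
counts = count_tags(io.BytesIO(sample))
print(counts)
```

Like `Bio.Entrez.read()`, `xml.sax.parse()` wants a file name or a
file-like object rather than the text itself; text already in memory can
be wrapped in `io.BytesIO` as above. Because SAX streams through the
input, it should cope with large files better than building a full DOM.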

-brandon

On Thu, 2012-07-12 at 16:53 +0800, Wheaton Little wrote:
> I would like to use the Biopython xml parser, if possible, on google
> patent xmls:
> 
> http://www.google.com/googlebooks/uspto-patents-applications-text.html
> 
> unfortunately, this is what I get:
> 
> >>> t=open('ipa111229.xml','r').read()
> >>> import Bio
> >>> ttt=Bio.Entrez.read(t[:30000])
> 
> Traceback (most recent call last):
>   File "<pyshell#20>", line 1, in <module>
>     ttt=Bio.Entrez.read(t[:30000])
>   File "/Library/Python/2.7/site-packages/Bio/Entrez/__init__.py",
> line 351, in read
>     record = handler.read(handle)
>   File "/Library/Python/2.7/site-packages/Bio/Entrez/Parser.py", line
> 169, in read
>     self.parser.ParseFile(handle)
> TypeError: argument must have 'read' attribute
> 
> What would I have to do to use the parser on this xml?
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython




