[Biopython-dev] [Bug 2448] New: Bio.EUtils can't handle accented author names

bugzilla-daemon at portal.open-bio.org
Sun Feb 10 15:29:37 EST 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2448

           Summary: Bio.EUtils can't handle accented author names
           Product: Biopython
           Version: 1.44
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: baoilleach at gmail.com


The following code exhibits the bug:

from Bio import EUtils
from Bio.EUtils import DBIdsClient

pmids = ["17299727", "17118524"]

client = DBIdsClient.DBIdsClient()

for pmid in pmids:
    paper = client.search(pmid)
    print paper.efetch().read()
    summary = paper.summary()
    data = summary.dataitems
    authors = ", ".join(data['AuthorList'].allvalues())
    p = {'title': data['Title'], 'journal': data['Source'],
         'volume': data['Volume'],
         'authors': authors, 'pages': data['Pages']}
    try:
        p['year'] = data['PubDate'].year
    except:
        p['year'] = "----"
    if hasattr(data, "DOI"):
        p['doi'] = data['DOI']
    print pmid, p['authors'], p['title'], p['journal'], p['year'], \
        p['volume'], p['pages']


The result is:
Traceback (most recent call last):
  File "pmids.py", line 11, in <module>
    summary = paper.summary()
  File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1.44\Bio\EUtils\DBIdsClient.py", line 105, in summary
    return parse.parse_summary_xml(self.esummary("xml"))
  File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1.44\Bio\EUtils\parse.py", line 412, in parse_summary_xml
    pom = xml_parser.parse_using_dtd(infile)
  File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1.44\Bio\EUtils\parse.py", line 48, in parse_using_dtd
    parser.parse(file)
  File "C:\Program Files\Python25\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "C:\Program Files\Python25\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "C:\Program Files\Python25\lib\xml\sax\expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "C:\Documents and Settings\AvrilNoel\Desktop\Tools\Biopython\biopython-1.44\Bio\EUtils\POM.py", line 774, in characters
    self.stack[-1].append(Text(text))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 4: ordinal not in range(128)
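
The last frame suggests that the text handed to Text() in POM.py is a
unicode string that gets converted to a byte string with the default ASCII
codec. I have not confirmed that in the POM.py source, so treat the
following as a minimal sketch of the failure mode rather than the actual
Biopython code path. The author name and the helper functions are
hypothetical, and the snippet is written to run on a current Python, where
an explicit encode("ascii") raises the same error that Python 2 raises
implicitly.

```python
# Hypothetical author name with U+00ED (i with acute accent) at index 4.
# It is an illustration only, not taken from the PubMed records above.
name = u"Mart\u00edn"


def to_ascii_bytes(text):
    """Mimic an implicit ASCII conversion such as Python 2's str(unicode)."""
    return text.encode("ascii")


def to_utf8_bytes(text):
    """Encode explicitly as UTF-8, which can represent any code point."""
    return text.encode("utf-8")


# The ASCII conversion fails on the accented character.
try:
    to_ascii_bytes(name)
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(repr(error))

# An explicit UTF-8 encoding succeeds on the same input.
print(repr(to_utf8_bytes(name)))
```

If this is indeed what happens, one possible direction for a fix would be
to keep the parsed text as unicode, or to encode it explicitly as UTF-8,
instead of relying on the default codec.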



