Error when parsing with lxml

Stack Overflow user
Asked 2017-10-11 13:20:24
Answers: 1, Views: 858, Followers: 0, Votes: 2

While parsing XML with lxml, I get the error "reading file objects must return bytes objects". Here is the code:

from lxml import etree
from io import StringIO
def parseXML(xmlFile):
    """
    parse the xml
    """
    data=open(xmlFile)
    xml=data.read()
    data.close()

    tree=etree.parse(StringIO(xml))
    context=etree.iterparse(StringIO(xml))
    for action, elem in context:
        # Fall back to the literal "None" for elements without text
        if not elem.text:
            text="None"
        else:
            text=elem.text
        print(elem.tag + "=>" + text)
if __name__ == "__main__":
    parseXML("C:\\Users\\karthik\\Desktop\\xml_path\\bgm.xml")

The file bgm.xml:

<?xml version="1.0" ?>
<zAppointments reminder="15">
    <appointment>
        <begin>1181251680</begin>
        <uid>040000008200E000</uid>
        <alarmTime>1181572063</alarmTime>
        <state></state>
        <location></location>
        <duration>1800</duration>
        <subject>Bring pizza home</subject>
    </appointment>
    <appointment>
        <begin>1234360800</begin>
        <duration>1800</duration>
        <subject>Check MS Office website for updates</subject>
        <location></location>
        <uid>604f4792-eb89-478b-a14f-dd34d3cc6c21-1234360800</uid>
        <state>dismissed</state>
    </appointment>
</zAppointments>

Error:

Traceback (most recent call last):
  File "C:/Users/karthik/source/ChartAttributes/crecords", line 34, in <module>
    parseXML("C:\\Users\\karthik\\Desktop\\xml_path\\bgm.xml")
  File "C:/Users/karthik/source/ChartAttributes/crecords", line 26, in parseXML
    for action, elem in context:
  File "src\lxml\iterparse.pxi", line 208, in lxml.etree.iterparse.__next__ (src\lxml\lxml.etree.c:150010)
  File "src\lxml\iterparse.pxi", line 193, in lxml.etree.iterparse.__next__ (src\lxml\lxml.etree.c:149708)
  File "src\lxml\iterparse.pxi", line 221, in lxml.etree.iterparse._read_more_events (src\lxml\lxml.etree.c:150208)
TypeError: reading file objects must return bytes objects

Process finished with exit code 1

1 Answer

Stack Overflow user

Answered 2017-10-11 13:34:13

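I think you need to supply the XML as bytes rather than as a string. open(xmlFile) opens the file in text mode, so read() returns a str, and etree.iterparse then fails because the StringIO wrapper hands it str data instead of bytes.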
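The difference can be reproduced without any file on disk. The following is a minimal sketch, assuming a recent lxml under Python 3; the helper names tags_from and iterparse_accepts and the sample document are hypothetical and exist only for illustration.

```python
from io import BytesIO, StringIO

from lxml import etree

# Hypothetical sample document used only for this illustration.
XML_TEXT = "<root><item>hello</item></root>"


def tags_from(source):
    """Collect element tags in the order iterparse reports their 'end' events."""
    return [elem.tag for _action, elem in etree.iterparse(source)]


def iterparse_accepts(source):
    """Return True if iterparse can consume the given file-like object."""
    try:
        tags_from(source)
        return True
    except TypeError:
        # Raised when read() on the source returns str instead of bytes.
        return False


if __name__ == "__main__":
    print(iterparse_accepts(StringIO(XML_TEXT)))                 # text source
    print(iterparse_accepts(BytesIO(XML_TEXT.encode("utf-8"))))  # bytes source
```

Only the source type differs between the two calls, which isolates the cause of the error reported in the question.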

Open the file in binary mode to get a bytes object (the data then has to be wrapped in io.BytesIO rather than StringIO):

    data=open(xmlFile, 'rb')
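
Put together, the binary-mode approach might look like the sketch below. It is one possible arrangement rather than the answerer's exact code: the function name parse_xml_file is hypothetical, and the binary file object is handed to iterparse directly instead of being read into memory first.

```python
from lxml import etree


def parse_xml_file(path):
    """Return (tag, text) pairs for every element in the file at `path`.

    Hypothetical helper: the file is opened in binary mode so that
    iterparse receives bytes from read().
    """
    pairs = []
    with open(path, "rb") as xml_file:
        for _action, elem in etree.iterparse(xml_file):
            # Elements without text content report None; show "None" instead.
            pairs.append((elem.tag, elem.text or "None"))
    return pairs


if __name__ == "__main__":
    import sys

    # Usage: python script.py path/to/file.xml
    for tag, text in parse_xml_file(sys.argv[1]):
        print(tag + "=>" + text)
```

Using a with block also closes the file if parsing fails partway through.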

However, it is probably easier to pass the filename to lxml and let it handle opening and reading the file:

from lxml import etree

def parseXML(xmlFile):
    for action, elem in etree.iterparse(xmlFile):
        text = elem.text or "None"
        print(elem.tag + "=>" + text)
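
The action variable in the loop above is the event name. As a small sketch (the helper list_events and the sample bytes are hypothetical), iterparse reports only "end" events by default, and other events can be requested through its events argument:

```python
from io import BytesIO

from lxml import etree


def list_events(xml_bytes, events=None):
    """Return (action, tag) pairs produced by iterparse for the given bytes."""
    if events is None:
        # Default behaviour: only "end" events are reported.
        context = etree.iterparse(BytesIO(xml_bytes))
    else:
        context = etree.iterparse(BytesIO(xml_bytes), events=events)
    return [(action, elem.tag) for action, elem in context]


if __name__ == "__main__":
    sample = b"<a><b>x</b></a>"  # hypothetical sample document
    print(list_events(sample))
    print(list_events(sample, events=("start", "end")))
```

Because an element's text is only guaranteed to be complete at its "end" event, the default is the right choice for printing tag and text pairs.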
Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/46689317
