Q: Simple XPath library for Python 3.x

Stack Overflow user

Asked 2013-04-03 20:40:45

Answers: 2 · Views: 3.3K · Followers: 0 · Votes: 3

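I am trying to parse some very simple XML with Python.

Before Python 3, I used the "webscraping" library, which has XPath support. It worked very simply:

xpath.search(xml, "//search")  # xml is the XML string, the second argument is the XPath query

It returns the elements found for the given XPath query.

Now I have decided to switch to Python 3, and the library mentioned above does not work properly on it (even after running 2to3.py), so I decided to use the built-in xml.etree.ElementTree library.

Maybe I am misunderstanding something, but this is a real nightmare. It does not work by simply passing the XML and an XPath query to a function and getting the result back. Instead, you need 10+ lines of code, dealing with children of elements and so on, and it still doesn't work...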

import xml.etree.ElementTree as ET
doc = ET.fromstring(xml)
result = doc.findall("//XPath Query")

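This raises SyntaxError: cannot use absolute path on element. Adding . in front of //XPath Query did not help much either.

Is there a reason the ElementTree and lxml libraries are so complicated and do not simply let you use XPath, instead of dealing with elements, using for loops every time, and so on?

Can anyone recommend a simple Python 3 library that just takes an XPath query and returns the result?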

python

xpath

python-3.x

2 Answers

Stack Overflow user

Accepted answer

Posted 2013-04-03 22:37:20

Using the sample XML from http://docs.python.org/2/library/xml.etree.elementtree.html, the search seems to work fine:

>>> import xml.etree.ElementTree as ET
>>> xml = """..."""
>>> doc = ET.fromstring(xml)
>>> doc.findall(".//rank")
[<Element 'rank' at 0x10199ebd0>, <Element 'rank' at 0x10199e210>, <Element 'rank' at 0x10199e4d0>]

Or, if you want to search explicitly from the root:

>>> ET.ElementTree(doc).findall('//rank')
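
To make the difference between the two path forms concrete, here is a minimal sketch. The SAMPLE document is a made-up, trimmed stand-in loosely modelled on the documentation example, not the asker's data. It relies on the standard-library behaviour that Element.findall rejects a path starting with /, while the same query prefixed with . searches all descendants.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, loosely modelled on the ElementTree docs example.
SAMPLE = """<data>
  <country name="Liechtenstein"><rank>1</rank></country>
  <country name="Singapore"><rank>4</rank></country>
  <country name="Panama"><rank>68</rank></country>
</data>"""

doc = ET.fromstring(SAMPLE)

# Relative path: search every descendant of the element.
ranks = [el.text for el in doc.findall(".//rank")]

# Absolute path on an Element is rejected by ElementTree.
try:
    doc.findall("//rank")
    absolute_path_error = None
except SyntaxError as exc:
    absolute_path_error = str(exc)
```

So the only change needed in the question's code is the leading dot; no loops over child elements are required.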

Votes: 2

Stack Overflow user

Posted 2013-04-04 02:58:11

Found the problem now.

My XML response contains the following:

<?xml version="1.0" encoding="utf-8"?>
<GetOrdersResponse xmlns="urn:ebay:apis:eBLBaseComponents">
  <!-- Call-specific Output Fields -->
  <HasMoreOrders> boolean </HasMoreOrders>
  <OrderArray> OrderArrayType
    <Order> OrderType
      <AdjustmentAmount currencyID="CurrencyCodeType"> AmountType (double) </AdjustmentAmount>
      <AmountPaid currencyID="CurrencyCodeType"> AmountType (double) </AmountPaid>
      <AmountSaved currencyID="CurrencyCodeType"> AmountType (double) </AmountSaved>
      <BuyerCheckoutMessage> string </BuyerCheckoutMessage>
      <BuyerUserID> UserIDType (string) </BuyerUserID>
      <CheckoutStatus> CheckoutStatusType
      ...

After parsing this XML:

root = ET.fromstring(xml)
result = root.findall("*")

every element that comes back has its tag prefixed with {urn:ebay:apis:eBLBaseComponents}.
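
A short sketch of what that prefix looks like in practice. The SAMPLE document and the local_name helper are illustrative assumptions, not part of the real eBay response; only the namespace URI is taken from the XML above.

```python
import xml.etree.ElementTree as ET

NS = "urn:ebay:apis:eBLBaseComponents"

# Hypothetical, heavily trimmed response using the same default namespace.
SAMPLE = f"""<GetOrdersResponse xmlns="{NS}">
  <HasMoreOrders>false</HasMoreOrders>
  <OrderArray/>
</GetOrdersResponse>"""

root = ET.fromstring(SAMPLE)

# ElementTree stores each tag as "{namespace-uri}localname".
child_tags = [child.tag for child in root.findall("*")]


def local_name(tag):
    """Strip a leading {namespace} part from a tag, if present."""
    return tag.rsplit("}", 1)[-1]


local_names = [local_name(tag) for tag in child_tags]
```

Printing child_tags is a quick way to check whether a document uses a default namespace before writing any queries against it.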

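For example, suppose I need to search for <BuyerCheckoutMessage>.

Because the element's tag is actually {urn:ebay:apis:eBLBaseComponents}BuyerCheckoutMessage, a query for the bare name returns nothing.

So to find an element, I need to put {urn:ebay:apis:eBLBaseComponents} in front of each element name in the XPath query.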

So the solution is to use:

result = root.findall(".//{urn:ebay:apis:eBLBaseComponents}BuyerCheckoutMessage")

result[0].text then returns the element's value.
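
As an alternative to spelling out the full {uri} in every query, findall also accepts a prefix-to-URI mapping. The sketch below assumes a made-up response body (SAMPLE) and an arbitrary prefix e; only the namespace URI comes from the XML above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed response with an invented message text.
SAMPLE = """<GetOrdersResponse xmlns="urn:ebay:apis:eBLBaseComponents">
  <OrderArray>
    <Order>
      <BuyerCheckoutMessage>Please ship quickly</BuyerCheckoutMessage>
    </Order>
  </OrderArray>
</GetOrdersResponse>"""

# The prefix "e" is arbitrary; only the URI has to match the document.
NAMESPACES = {"e": "urn:ebay:apis:eBLBaseComponents"}

root = ET.fromstring(SAMPLE)
messages = root.findall(".//e:BuyerCheckoutMessage", NAMESPACES)
first_message = messages[0].text
```

On Python 3.8 and later, the path .//{*}BuyerCheckoutMessage should also match the element regardless of its namespace, which avoids the mapping altogether.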

Why it cannot simply work as ET.search(xml, "XPath-query") is the biggest mystery to me. So much time wasted.
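
A one-call interface of that shape can be approximated with a small wrapper. This is only a sketch: xpath_search is a hypothetical helper name, not an ElementTree function, and it supports only the XPath subset that ElementTree implements.

```python
import xml.etree.ElementTree as ET


def xpath_search(xml_text, query, namespaces=None):
    """Parse xml_text and return the elements matching an ElementTree path.

    Hypothetical helper; limited to ElementTree's XPath subset.
    """
    root = ET.fromstring(xml_text)
    # Element.findall rejects absolute paths, so rewrite "//x" as ".//x".
    # Note that ".//x" searches descendants only, never the root itself.
    if query.startswith("//"):
        query = "." + query
    return root.findall(query, namespaces)


results = xpath_search("<a><b>1</b><c><b>2</b></c></a>", "//b")
texts = [el.text for el in results]
```

For namespaced documents, pass a prefix mapping as the third argument and use the prefix in the query.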

Votes: 2

Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/15787637
