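I'm using the SAX parser (xml.sax), and it works the way I want. However, I'm parsing a fairly large file (which is why I'm using SAX), and I'd like to stop parsing at some point (for example, when I reach a certain limit, or when I find a particular piece of data).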
    import xml.sax

    class ProductHandler(xml.sax.ContentHandler):
        def startElement(self, tag, attrs):
            pass  # [.. process start ..]

        def endElement(self, tag):
            pass  # [.. process end ..]

        def characters(self, content):
            pass  # [.. process characters ..]

    product_handler = ProductHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(product_handler)
    parser.parse(xmlfile)

How can I do that? Is there some return value I can use in one of the handler methods? I checked the documentation but couldn't find anything along these lines.
Posted on 2022-09-06 12:48:24
Using this sample data, if we wanted to find a <description> that contains the word "sourdough", we might write something like this:
    import xml.sax


    class IAmAllDone(Exception):
        """Raised from a handler method to abort parsing early."""


    class ProductHandler(xml.sax.handler.ContentHandler):
        def __init__(self):
            super().__init__()
            self.description = None
            self.name = None
            self.tree = []  # stack of currently open element names

        def startElement(self, name, attrs):
            self.tree.append(name)

        def endElement(self, name):
            self.tree.pop()  # remove the innermost open element

        def characters(self, content):
            if self.tree[-1] == "name" and content.strip():
                self.name = content
                print("name:", content)
            elif self.tree[-1] == "description" and "sourdough" in content:
                self.description = content
                raise IAmAllDone()  # propagates out of parser.parse()


    product_handler = ProductHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(product_handler)

    try:
        parser.parse("data.xml")
    except IAmAllDone:
        pass  # parsing was stopped on purpose

    if product_handler.description is not None:
        print("found description:", product_handler.description)

The above produces the following output:
name: Belgian Waffles
name: Strawberry Belgian Waffles
name: Berry-Berry Belgian Waffles
name: French Toast
found description: Thick slices made from our homemade sourdough bread

As you can see, we stop SAX parsing before reading the final "Homestyle Breakfast" item.
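The question also mentions stopping once a limit is reached. Here is a minimal sketch of the same exception technique applied to that case. Everything named here (`StopParsing`, `LimitedNameHandler`, `first_names`, and the inline `SAMPLE` document) is made up for illustration and is not part of xml.sax. It also buffers text until the closing tag, since a parser is allowed to deliver an element's text to `characters()` in more than one chunk.

```python
import xml.sax


class StopParsing(Exception):
    """Hypothetical sentinel exception, used only to abort the parse."""


class LimitedNameHandler(xml.sax.ContentHandler):
    """Collects the text of <name> elements and stops after `limit` of them."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.names = []
        self._buffer = None  # list of text chunks while inside <name>, else None

    def startElement(self, name, attrs):
        if name == "name":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate instead of assigning.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "name" and self._buffer is not None:
            self.names.append("".join(self._buffer))
            self._buffer = None
            if len(self.names) >= self.limit:
                raise StopParsing()  # limit reached: unwind out of the parser


# Small inline document (an assumption standing in for the real data file).
SAMPLE = b"""<menu>
  <food><name>Belgian Waffles</name></food>
  <food><name>French Toast</name></food>
  <food><name>Homestyle Breakfast</name></food>
</menu>"""


def first_names(xml_bytes, limit):
    """Return at most `limit` names, abandoning the parse once they are found."""
    handler = LimitedNameHandler(limit)
    try:
        xml.sax.parseString(xml_bytes, handler)
    except StopParsing:
        pass  # expected: we stopped on purpose
    return handler.names


print(first_names(SAMPLE, 2))  # ['Belgian Waffles', 'French Toast']
```

Using a dedicated exception class, rather than something generic like `StopIteration`, keeps the `except` clause from accidentally swallowing a real error raised elsewhere in the handler.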
https://stackoverflow.com/questions/73621602