I'm trying to find a way to get the index number when parsing an XML file with minidom. The XML will look like this:
<stuff>
    <morestuff>
        <sometag>catagory1</sometag>
        <path pathversion="1">/path Im looking to for</path> #<-- info I'm after
        <path pathversion="2">/path I don't need</path>
        <path pathversion="3">/path I don't need</path>
    </morestuff>
    <morestuff>
        <sometag>catagory2</sometag>
        <path pathversion="1">/other path I'm looking for</path> #<-- info I'm after
        <path pathversion="2">/path I don't need</path>
        <path pathversion="3">/path I don't need</path>
    </morestuff>
</stuff>

I'd like to do something like this:
for element in node.getElementsByTagName('sometag'):
    if element.firstChild.data == 'catagory1':
        elementid = element.indexnumber  #<---- how do I write the [0] or [1] to a variable so I can use it to describe the position in the next line?
        var1 = node.getElementsByTagName('path')[elementid].firstChild.data
    if element.firstChild.data == 'catagory2':
        elementid = element.indexnumber
        var2 = node.getElementsByTagName('path')[elementid].firstChild.data

Posted on 2013-03-01 18:42:03
This will create a dictionary containing the information you need:

import xml.dom.minidom

# `test` is assumed to hold the XML document from the question as a string
doc = xml.dom.minidom.parseString(test)
paths = {}
for element in doc.getElementsByTagName('morestuff'):
    # get the text value of the sometag tag
    category = element.getElementsByTagName('sometag')[0].firstChild.nodeValue
    # get all the paths which are children of the morestuff element
    for path in element.getElementsByTagName('path'):
        if path.getAttribute('pathversion') == '1':
            pathstr = path.firstChild.nodeValue
            paths[category] = pathstr
print(paths)

The output I get is:
{u'catagory1': u'/path Im looking to for', u'catagory2': u"/other path I'm looking for"}

Posted on 2013-03-01 19:07:46
How about using etree, as Keith suggested? This gives:

['/path Im looking to for', "/other path I'm looking for"]

using the following code:
import xml.etree.ElementTree as ET
tree = ET.fromstring('''<stuff>
<morestuff>
<sometag>catagory1</sometag>
<path pathversion="1">/path Im looking to for</path>
<path pathversion="2">/path I don't need</path>
<path pathversion="3">/path I don't need</path>
</morestuff>
<morestuff>
<sometag>catagory2</sometag>
<path pathversion="1">/other path I'm looking for</path>
<path pathversion="2">/path I don't need</path>
<path pathversion="3">/path I don't need</path>
</morestuff>
</stuff>
''')
print([e.text for e in tree.findall('.//morestuff/path[@pathversion="1"]')])

https://stackoverflow.com/questions/13548408
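The etree approach returns a flat list, so the link between each path and its category is lost. If that link matters, one option is to walk the morestuff blocks and build a dictionary instead. The following is only a sketch: `xml_text` and `paths_by_category` are names made up for this example, and the XML is a trimmed copy of the document from the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document; xml_text is a name chosen for this sketch.
xml_text = '''<stuff>
  <morestuff>
    <sometag>catagory1</sometag>
    <path pathversion="1">/path Im looking to for</path>
    <path pathversion="2">/path I don't need</path>
  </morestuff>
  <morestuff>
    <sometag>catagory2</sometag>
    <path pathversion="1">/other path I'm looking for</path>
    <path pathversion="2">/path I don't need</path>
  </morestuff>
</stuff>'''

root = ET.fromstring(xml_text)

paths_by_category = {}
for block in root.findall('morestuff'):
    # Text of the <sometag> child, or None if the block has no such child
    category = block.findtext('sometag')
    # First <path> child of this block whose pathversion attribute is "1"
    version_one = block.find('path[@pathversion="1"]')
    if category is not None and version_one is not None:
        paths_by_category[category] = version_one.text

print(paths_by_category)
```

Because the lookup is done per block, it does not depend on how many path elements each block contains.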
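On the literal question of getting the index with minidom: nodes have no index attribute, but `enumerate()` over the node list yields the position alongside each element. Note that the index of a sometag cannot be reused directly on the flat list of all path elements, because each block contains three of them. Below is a minimal sketch that enumerates the morestuff blocks instead; `xml_text` and `found` are names introduced here for illustration, and the XML is a shortened copy of the question's document.

```python
import xml.dom.minidom

# Shortened copy of the question's document; xml_text is a name chosen for this sketch.
xml_text = '''<stuff>
  <morestuff>
    <sometag>catagory1</sometag>
    <path pathversion="1">/path Im looking to for</path>
    <path pathversion="2">/path I don't need</path>
  </morestuff>
  <morestuff>
    <sometag>catagory2</sometag>
    <path pathversion="1">/other path I'm looking for</path>
    <path pathversion="2">/path I don't need</path>
  </morestuff>
</stuff>'''

doc = xml.dom.minidom.parseString(xml_text)
blocks = doc.getElementsByTagName('morestuff')

found = {}
for index, block in enumerate(blocks):
    # index is the position of this <morestuff> block within the document
    category = block.getElementsByTagName('sometag')[0].firstChild.data
    for path in block.getElementsByTagName('path'):
        if path.getAttribute('pathversion') == '1':
            # Store the block position together with the wanted path text
            found[category] = (index, path.firstChild.data)

print(found)
```

With the index stored, `blocks[index]` gives back the same block later if it is needed again.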