I have this text file, 20150731100543_1.txt:
GI-eSTB-MIB-NPH::eSTBGeneralErrorCode.0 = INTEGER: 0
GI-eSTB-MIB-NPH::eSTBGeneralConnectedState.0 = INTEGER: true(1)
GI-eSTB-MIB-NPH::eSTBGeneralPlatformID.0 = INTEGER: 2075
GI-eSTB-MIB-NPH::eSTBMoCAfrequency.0 = INTEGER: 0
GI-eSTB-MIB-NPH::eSTBMoCAMACAddress.0 = STRING: 0:0:0:0:0:0
GI-eSTB-MIB-NPH::eSTBMoCANumberOfNodes.0 = INTEGER: 0
I want to convert it to XML like this (20150731100543_1.xml):
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<GI-eSTB-MIB-NPH>
<eSTBGeneralErrorCode.0>
INTEGER: 0
</eSTBGeneralErrorCode.0>
</GI-eSTB-MIB-NPH>
<GI-eSTB-MIB-NPH>
<eSTBGeneralConnectedState.0>
INTEGER: true(1)
</eSTBGeneralConnectedState.0>
</GI-eSTB-MIB-NPH>
<GI-eSTB-MIB-NPH>
<eSTBGeneralPlatformID.0>
INTEGER: 2075
</eSTBGeneralPlatformID.0>
</GI-eSTB-MIB-NPH>
<GI-eSTB-MIB-NPH>
<eSTBMoCAfrequency.0>
INTEGER: 0
</eSTBMoCAfrequency.0>
</GI-eSTB-MIB-NPH>
<GI-eSTB-MIB-NPH>
<eSTBMoCAMACAddress.0>
STRING: 0:0:0:0:0:0
</eSTBMoCAMACAddress.0>
</GI-eSTB-MIB-NPH>
<GI-eSTB-MIB-NPH>
<eSTBMoCANumberOfNodes.0>
INTEGER: 0
</eSTBMoCANumberOfNodes.0>
</GI-eSTB-MIB-NPH>
</doc>
I can do this with the following script:
import sys
import time
import commands
from xml.etree.ElementTree import Element, SubElement
from xml.etree import ElementTree
from xml.dom import minidom

def prettify(elem):
    """Return a pretty-printed XML string for the Element.
    """
    rough_string = ElementTree.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent=" ", newl="\n", encoding="UTF-8")

if len(sys.argv) != 2:
    print "\nUsage: python script.py <IP>\n";
    exit(0)

filename_xml = '20150731100543_1.xml'#filename_xml = temp + ".xml"
print "xml filename is: %s\n" % filename_xml
xml = open(filename_xml, 'w+')
top = Element('doc')
with open('20150731100543_1.txt') as f:
    for line in f:
        b = line.split(':')
        child = SubElement(top, b[0])
        c = line.split()
        d = c[0].split(':')
        property = SubElement(child, d[2])
        property.text = c[2] + " " + c[3]
xml.write(prettify(top))
xml.close()
I have three questions:
So, if possible, the XML should be formatted like this:
<?xml version="1.0" encoding="UTF-8"?>
<doc>
<GI-eSTB-MIB-NPH>
<eSTBGeneralErrorCode.0>INTEGER: 0</eSTBGeneralErrorCode.0>
<eSTBGeneralConnectedState.0>INTEGER: true(1)</eSTBGeneralConnectedState.0>
<eSTBGeneralPlatformID.0>INTEGER: 2075</eSTBGeneralPlatformID.0>
<eSTBMoCAfrequency.0>INTEGER: 0</eSTBMoCAfrequency.0>
<eSTBMoCAMACAddress.0>STRING: 0:0:0:0:0:0</eSTBMoCAMACAddress.0>
<eSTBMoCANumberOfNodes.0>INTEGER: 0</eSTBMoCANumberOfNodes.0>
</GI-eSTB-MIB-NPH>
</doc>
I am trying to do this because it would greatly reduce the number of lines in the XML.
The last and least important question is:
Please forgive me for writing such a long post.
Answered 2016-04-01 20:39:52
1 & 2: I use etree.tostring and do not run into any of these problems.
3: The multiple split operations can be replaced with a single regex.
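To illustrate point 3 on its own, here is a minimal sketch of how one regular expression can pull apart a line of this file. The names `LINE_RE` and `parse_lines` are hypothetical, and the pattern assumes every line has the shape `<MIB>::<object> = <TYPE>: <value>`, as in the sample above. It uses only the standard library.

```python
import re

# Hypothetical pattern: assumes each line is "<MIB>::<object> = <TYPE>: <value>".
# re.MULTILINE makes ^ and $ match at the start and end of every line.
LINE_RE = re.compile(
    r'^(?P<mib>[^:\s]+)::(?P<name>\S+) = (?P<value>.*)$',
    re.MULTILINE,
)

def parse_lines(text):
    """Return one (mib, name, value) tuple per matching line of text."""
    return [
        (m.group('mib'), m.group('name'), m.group('value').strip())
        for m in LINE_RE.finditer(text)
    ]

# Two lines copied from the sample file in the question.
sample = (
    "GI-eSTB-MIB-NPH::eSTBGeneralErrorCode.0 = INTEGER: 0\n"
    "GI-eSTB-MIB-NPH::eSTBMoCAMACAddress.0 = STRING: 0:0:0:0:0:0\n"
)

rows = parse_lines(sample)
for row in rows:
    print(row)
```

Because the value group takes everything after ` = `, values that themselves contain colons (such as the MAC address) stay intact, which is where the chained `split` calls become awkward.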
Something like this should work:
from lxml import etree
import re

filename_xml = '20150731100543_1.xml'
root = etree.Element('doc')
node = etree.SubElement(root, 'GI-eSTB-MIB-NPH')

# read the whole input file
with open('20150731100543_1.txt') as f:
    text = f.read()

# get tag and value from each row
for tag, value in re.findall(r'GI-eSTB-MIB-NPH::(.*) = (.*$)', text, re.MULTILINE):
    # create child node
    etree.SubElement(node, tag).text = value

# with encoding='utf-8', tostring() returns a byte string, so write in binary mode
xml = etree.tostring(root, pretty_print=True, encoding='utf-8', xml_declaration=True)
with open(filename_xml, 'wb') as f:
    f.write(xml)
https://stackoverflow.com/questions/31754461