Question: Parsing text into XML format

Stack Overflow user

Asked 2015-09-22 03:19:06

Answers: 1 | Views: 201 | Followers: 0 | Votes: 0

I'm not sure whether this is a JSON file or what kind of data structure it is, but I need to parse this format into XML using Python.

(Table){
   classA= "false"
   classB= "150538"
   classC= "AE_T_C"
   classD= "510150 DLCX DEPOSITION"
   classE= "233344"
   classF= "516"
   classG= "150131"
   classH= "CARJOB"
   classI= "23001367"
   classJ= "960"
   classK= "16"
   classL= "true"
   classM= "P_GENERIC_HARMONY.2"
 }
 (Table){
   LetterA= "true"
   LetterB= "15"
   LetterC= "x"
   LetterD= "Carbon"
   LetterE= "44"
   LetterF= "test"
   LetterG= "Dump"
   LetterH= "NA"
   LetterI= "2"
   LetterJ= "9"
   LetterK= "1"
   LetterL= "done"
   LetterM= "test"
 }
 .
 .
 .

Below is my current script for parsing a JSON file, but I don't think this is JSON, and I'm still confused about how to parse it:

import json
import urllib
import dicttoxml

filename = 'c:/myFile'
file = open(filename,"r") 
lines = file.read() 

content = lines
obj = json.loads(content)
print(obj)

xml = dicttoxml.dicttoxml(obj)
print(xml)

Is there any method or suggestion for parsing this kind of file?

Thanks in advance.

python

xml

1 Answer

Stack Overflow user

Accepted answer

Answered 2015-09-22 03:57:35

Here is a quick little script. It relies on the xmltodict module, which helps convert a dict to XML:

import xmltodict

mydict = {}
# I added the contents to a file named 'afile.txt'
with open("afile.txt", "r") as f:
    for line in f:
        # only key/value lines contain "="; this skips the "(Table){" and "}"
        # lines as well as blank lines and the "." filler lines
        if "=" in line:
            # split on the first "=" only and strip whitespace from the key
            key, value = line.split("=", 1)
            # drop the trailing newline; the value keeps its quotes
            mydict[key.strip()] = value.rstrip("\n")
# define xml root tag
root = {
    'body': mydict
}
# unparse the dict to xml
print(xmltodict.unparse(root, pretty=True))

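Running it on the (initial) content you provided prints the following (the element order follows dict iteration order, so it may differ on your Python version):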

<?xml version="1.0" encoding="utf-8"?>
<body>
    <classL> "true"</classL>
    <classM> "P_GENERIC_HARMONY.2"</classM>
    <classJ> "960"</classJ>
    <classK> "16"</classK>
    <classH> "CARJOB"</classH>
    <classI> "23001367"</classI>
    <classF> "516"</classF>
    <classG> "150131"</classG>
    <classD> "510150 DLCX DEPOSITION"</classD>
    <classE> "233344"</classE>
    <classB> "150538"</classB>
    <classC> "AE_T_C"</classC>
    <classA> "false"</classA>
</body>
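
The values above still carry their surrounding double quotes, because the script only removes the trailing newline. If bare values are preferred, something along these lines might work. This is a minimal sketch, not part of the original answer: `clean_value` and `parse_pairs` are hypothetical helper names, and it assumes every value is wrapped in at most one pair of double quotes.

```python
def clean_value(raw):
    """Strip surrounding whitespace, then one pair of double quotes if present."""
    value = raw.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    return value


def parse_pairs(lines):
    """Collect key/value pairs from lines of the form  key= "value"."""
    pairs = {}
    for line in lines:
        # lines without "=" (brackets, blanks, dots) are skipped
        if "=" in line:
            key, raw = line.split("=", 1)
            pairs[key.strip()] = clean_value(raw)
    return pairs


# a small excerpt of the data from the question
sample = [
    '(Table){\n',
    '   classA= "false"\n',
    '   classD= "510150 DLCX DEPOSITION"\n',
    ' }\n',
]
print(parse_pairs(sample))
# prints {'classA': 'false', 'classD': '510150 DLCX DEPOSITION'}
```

The resulting dict can be passed to xmltodict.unparse in the same way as mydict.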

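That does the job. If you happen to know the contents of each table beforehand, you can also define a list containing these labels and make the XML look more structured: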

# define the appropriate labels:
TableValues = ['Class', 'Letter']

# and create the dictionary based on these tags:
# this uses a dictionary comprehension in a dictionary
# comprehension. Comprehensive stuff.
new_root = {
    'body': {
        label: {
            key: value
            for key, value in mydict.items()
            if label.lower() in key.lower()
        }
        for label in TableValues
    }
}

print(xmltodict.unparse(new_root, pretty=True))

Doing the same with the additional content you provided yields a more structured result:

<?xml version="1.0" encoding="utf-8"?>
<body>
    <Class>
        <classL>"true"</classL>
        <classM>"P_GENERIC_HARMONY.2"</classM>
        <classJ>"960"</classJ>
        <classK>"16"</classK>
        <classH>"CARJOB"</classH>
        <classI>"23001367"</classI>
        <classF>"516"</classF>
        <classG>"150131"</classG>
        <classD>"510150 DLCX DEPOSITION"</classD>
        <classE>"233344"</classE>
        <classB>"150538"</classB>
        <classC>"AE_T_C"</classC>
        <classA>"false"</classA>
    </Class>
    <Letter>
        <LetterG>"Dump"</LetterG>
        <LetterF>"test"</LetterF>
        <LetterE>"44"</LetterE>
        <LetterD>"Carbon"</LetterD>
        <LetterC>"x"</LetterC>
        <LetterB>"15"</LetterB>
        <LetterA>"true"</LetterA>
        <LetterM>"test"</LetterM>
        <LetterL>"done"</LetterL>
        <LetterK>"1"</LetterK>
        <LetterJ>"9"</LetterJ>
        <LetterI>"2"</LetterI>
        <LetterH>"NA"</LetterH>
    </Letter>
</body>
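
If the labels are not known in advance, another option is to group the pairs by the (Table){ ... } block they appear in. The following is only a sketch under stated assumptions, not the method used above: it swaps xmltodict for the standard library's xml.etree.ElementTree, the names `parse_tables` and `tables_to_xml` are hypothetical, and it assumes each block opens with a line ending in "{", closes with a line containing only "}", and that every key is a valid XML element name.

```python
import xml.etree.ElementTree as ET


def parse_tables(lines):
    """Return one dict per "(Table){ ... }" block, in file order."""
    tables = []
    current = None
    for line in lines:
        stripped = line.strip()
        if stripped.endswith("{"):
            current = {}            # a new table starts
        elif stripped == "}":
            if current is not None:
                tables.append(current)
            current = None          # the table is closed
        elif "=" in stripped and current is not None:
            key, raw = stripped.split("=", 1)
            # remove whitespace and the surrounding double quotes
            current[key.strip()] = raw.strip().strip('"')
    return tables


def tables_to_xml(tables):
    """Build <body><Table><key>value</key>...</Table>...</body>."""
    body = ET.Element("body")
    for table in tables:
        table_el = ET.SubElement(body, "Table")
        for key, value in table.items():
            ET.SubElement(table_el, key).text = value
    return ET.tostring(body, encoding="unicode")


# a shortened version of the data from the question
sample = """(Table){
   classA= "false"
   classB= "150538"
 }
 (Table){
   LetterA= "true"
   LetterB= "15"
 }
 .
"""
print(tables_to_xml(parse_tables(sample.splitlines())))
```

Each block becomes its own Table element, so keys that repeat across tables no longer overwrite one another as they would in a single flat dict.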

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/32707815
