
Extracting data from an .xml file into pandas

Stack Overflow user
Asked 2022-04-26 04:53:16
Answers: 1 | Views: 47 | Followers: 0 | Votes: 0

I have an .xml file with the following content:

<detailedreport xmlns:xsi="http://"false">
<severity level="5">
<category categoryid="3" categoryname="Buffer Overflow" pcirelated="false">
<cwe cweid="121" cwename="Stack-based Buffer Overflow" pcirelated="false" sans="120" certc="1160">
<description>
<text text="code."/>
</description>
<staticflaws>
<flaw severity="5" categoryname="Stack-based Buffer Overflow" count="1" issueid="6225" module="Jep" type="strcpy" description="This call to strcpy() contains a buffer overflow. The source string has an allocated size of 80 bytes " note="" cweid="121" remediationeffort="2" exploitLevel="0" categoryid="3" pcirelated="false">
<exploitability_adjustments>
<exploitability_adjustment score_adjustment="0">
</exploitability_adjustment>
</exploitability_adjustments>
</flaw>
</staticflaws>
</cwe>
</category>
</severity>
</detailedreport>

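Below is the Python program I use to extract some of the fields under the "flaw" tag from the .xml file. But when I print the fields in the program, they are empty.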

from lxml import etree
root = etree.parse(r'fps_change.xml')
xroot = root.getroot()

df_cols = ["categoryname", "issueid", "module"]
rows = []

for node in xroot:
    #s_name = node.attrib.get("name")
    s_categoryname = node.find("categoryname")
    s_issueid = node.find("issueid")
    s_module = node.find("module")

rows.append({"categoryname": s_categoryname,
             "issueid": s_issueid, "module": s_module})

out_df = pd.DataFrame(rows, columns=df_cols)
print(out_df)             #this prints empty.

Expected output:

 Stack-based Buffer Overflow 6225 Jep

What should I change in my program to get the expected output?


1 Answer

Stack Overflow user

Accepted answer

Posted 2022-04-26 05:45:24

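from bs4 import BeautifulSoup

# `string` holds the XML text shown below; `df_cols` is the list from the question.
df_cols = ["categoryname", "issueid", "module"]
html_obj = BeautifulSoup(string, "html.parser")  # name a parser explicitly to avoid the bs4 warning
flaw = html_obj.find('flaw')  # first <flaw> element anywhere in the document
[flaw[key] for key in df_cols]  # read the values as attributes of <flaw>

['Stack-based Buffer Overflow', '6225', 'Jep']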

string = '''
<detailedreport xmlns:xsi="http://"false">
<severity level="5">
<category categoryid="3" categoryname="Buffer Overflow" pcirelated="false">
<cwe cweid="121" cwename="Stack-based Buffer Overflow" pcirelated="false" sans="120" certc="1160">
<description>
<text text="code."/>
</description>
<staticflaws>
<flaw severity="5" categoryname="Stack-based Buffer Overflow" count="1" issueid="6225" module="Jep" type="strcpy" description="This call to strcpy() contains a buffer overflow. The source string has an allocated size of 80 bytes " note="" cweid="121" remediationeffort="2" exploitLevel="0" categoryid="3" pcirelated="false">
<exploitability_adjustments>
<exploitability_adjustment score_adjustment="0">
</exploitability_adjustment>
</exploitability_adjustments>
</flaw>
</staticflaws>
</cwe>
</category>
</severity>
</detailedreport>'''
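
For readers who would rather stay close to the question's ElementTree-style code, the following is a minimal sketch of one possible fix, not the answerer's method. It rests on two observations about the question's program: `find("categoryname")` searches for a child element with that tag, while `categoryname`, `issueid` and `module` are attributes of `<flaw>`; and looping over `xroot` only visits its direct children, not the nested `<flaw>` elements. The sketch uses the standard library's `xml.etree.ElementTree`, whose `iter` and `attrib` behave like their `lxml.etree` counterparts. `SAMPLE_XML` and `extract_flaw_rows` are hypothetical names, and the sample is a trimmed, well-formed stand-in for the report (the garbled `xmlns:xsi` attribute is left out, on the assumption that a strict XML parser would reject it).

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed stand-in for the report (assumption: the garbled
# xmlns:xsi attribute is omitted so that a strict XML parser accepts it).
SAMPLE_XML = """
<detailedreport>
  <severity level="5">
    <category categoryid="3" categoryname="Buffer Overflow">
      <cwe cweid="121" cwename="Stack-based Buffer Overflow">
        <staticflaws>
          <flaw severity="5" categoryname="Stack-based Buffer Overflow"
                issueid="6225" module="Jep" type="strcpy"/>
        </staticflaws>
      </cwe>
    </category>
  </severity>
</detailedreport>
"""

df_cols = ["categoryname", "issueid", "module"]


def extract_flaw_rows(xml_text, columns):
    """Return one dict per <flaw> element holding the requested attributes."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() walks the whole tree, so deeply nested <flaw> elements are found.
    for flaw in root.iter("flaw"):
        # The values are attributes, so read .attrib instead of calling find().
        rows.append({col: flaw.attrib.get(col) for col in columns})
    return rows


rows = extract_flaw_rows(SAMPLE_XML, df_cols)
print(rows)
```

If pandas is installed, passing the result to `pd.DataFrame(rows, columns=df_cols)` should give one table row per `<flaw>` element. Note that `rows.append` sits inside the loop here, so every flaw is collected rather than only the last one.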
Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/72008560
