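I have this XML data that I need to parse in order to extract certain information. However, I run into a problem when I try to extract the name field from the XML.
Question 2: I also need to extract the ID from the XML, i.e. the id attribute in <attribute-item id="mydata.core.customization.requirements._noSpwIUSEei1hLMz9D9OBw">. I use BeautifulSoup as my standard tool and cannot switch to any other package, so a solution that uses it would be much appreciated.
Here is the XML data. The values I need to extract are the <name> text and the id attribute of each <attribute-item>.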
<configurations>
<attributes-configuration>
<attributes>
<attribute-item id="mydata.core.customization.requirements._noSpwIUSEei1hLMz9D9OBw">
<name>priority</name>
<description>priority of a requirement</description>
<customization-element>mydata.core.customization.requirements</customization-element>
<attribute-type>mydata.attribute_type.list</attribute-type>
<options>
<option>
<key>DEFAULT_LIST</key>
<value class="java.lang.String"> high,low,medium</value>
</option>
<option>
<key>LIST_TYPE</key>
<value class="java.lang.String">CUSTOM</value>
</option>
</options>
<editable>true</editable>
<userDefined>true</userDefined>
<internal>false</internal>
</attribute-item>
<attribute-item id="mydata.core.customization.teststep.prerequisite">
<name>Prerequisite</name>
<description>User Defined Attribute</description>
<customization-element>mydata.core.customization.teststep</customization-element>
<attribute-type>mydata.attribute_type.string</attribute-type>
<options>
<option>
<key>DEFAULT_VALUE</key>
<value/>
</option>
<option>
<key>MAX_CHARACTERS</key>
<value class="java.lang.String">5000</value>
</option>
</options>
<editable>true</editable>
<userDefined>true</userDefined>
<internal>false</internal>
</attribute-item>
</attributes>
</attributes-configuration>
<test-management/>
</configurations>
Here is my Python code:
import os
from bs4 import BeautifulSoup as bs
fileName = 'Configuration.xml'
fullFile = os.path.abspath(os.path.join('DataTransporter', fileName))
attributeList = []
with open(fullFile) as f:
    soup = bs(f, 'xml')
for attribData in soup.find_all('attribute-item'):
    dat = {
        'attribName' : attribData.name,
        'attribDesc' : attribData.description.text,
        'attribValue' : attribData.options.value.text,
    }
    attributeList.append(dat)
    #for attribParams in soup.find_all(name = 'value'):
        #newdict[attribName.text] = attribParams.text
print(attributeList)
My output:
[{'attribName': 'attribute-item', 'attribDesc': 'priority of a requirement', 'attribValue': ' high,low,medium'}, {'attribName': 'attribute-item', 'attribDesc': 'User Defined Attribute', 'attribValue': ''}]
Expected output:
[{'attribName': 'priority', 'attribDesc': 'priority of a requirement', 'attribValue': ' high,low,medium'}, {'attribName': 'prerequisite', 'attribDesc': 'User Defined Attribute', 'attribValue': ''}]
Answered 2018-08-09 14:42:09
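At first I thought attribData.name.text would do it, but name is a built-in attribute of every BeautifulSoup Tag: it holds the tag's own name, which is why attribData.name returns 'attribute-item' instead of the <name> child element. To get the child's text, use the findChildren(<key>) method: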
attribData.findChildren('name')[0].text
findChildren() returns a list of matching elements. Here it contains a single element, so [0] selects it and .text returns the expected value.
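To make the difference between the tag's own name and its <name> child concrete, here is a minimal sketch. It assumes the 'xml' parser is available (it needs lxml installed) and uses a trimmed copy of one attribute-item from the question. find('name') is used as a shorter alternative to findChildren('name')[0], since it returns the first match directly.

```python
from bs4 import BeautifulSoup

# Trimmed copy of one <attribute-item> from the question's XML.
xml = """<attribute-item id="mydata.core.customization.teststep.prerequisite">
  <name>Prerequisite</name>
  <description>User Defined Attribute</description>
</attribute-item>"""

soup = BeautifulSoup(xml, "xml")  # assumes lxml is installed
item = soup.find("attribute-item")

tag_name = item.name                  # the tag's own name, not the <name> child
child_name = item.find("name").text   # text of the <name> child element
item_id = item["id"]                  # attributes are read like dict keys

print(tag_name, child_name, item_id)
```

Attribute-style access such as item.description still works for children whose names do not collide with an existing Tag attribute, which is why only name needed special handling in the question.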
To get the id, use attribData['id']. Altogether, your code would look like this (inside the for loop):
dat = {
    'attribName' : attribData.findChildren('name')[0].text,
    'id': attribData['id'],
    'attribDesc' : attribData.description.text,
    'attribValue' : attribData.options.value.text,
}
The output will look like this:
[{'attribName': 'priority', 'id': 'mydata.core.customization.requirements._noSpwIUSEei1hLMz9D9OBw', 'attribDesc': 'priority of a requirement', 'attribValue': ' high,low,medium'}, {'attribName': 'Prerequisite', 'id': 'mydata.core.customization.teststep.prerequisite', 'attribDesc': 'User Defined Attribute', 'attribValue': ''}]
Hope this helps!
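For reference, the whole extraction can also be sketched as a self-contained script that parses the XML from a string instead of Configuration.xml. The helper names child_text and extract_attributes are hypothetical, introduced only for this illustration, and the XML is a shortened version of the data in the question. The helper returns a default instead of raising when a child element is missing, which is an addition not present in the code above.

```python
from bs4 import BeautifulSoup

# Shortened version of the question's XML (only the first option of each item).
XML = """<configurations>
<attributes-configuration>
<attributes>
<attribute-item id="mydata.core.customization.requirements._noSpwIUSEei1hLMz9D9OBw">
<name>priority</name>
<description>priority of a requirement</description>
<options>
<option><key>DEFAULT_LIST</key><value class="java.lang.String"> high,low,medium</value></option>
</options>
</attribute-item>
<attribute-item id="mydata.core.customization.teststep.prerequisite">
<name>Prerequisite</name>
<description>User Defined Attribute</description>
<options>
<option><key>DEFAULT_VALUE</key><value/></option>
</options>
</attribute-item>
</attributes>
</attributes-configuration>
</configurations>"""


def child_text(tag, child, default=""):
    """Hypothetical helper: text of the first descendant named `child`, or `default`."""
    found = tag.find(child)
    return found.text if found is not None else default


def extract_attributes(xml_text):
    """Hypothetical helper: build one dict per <attribute-item>."""
    soup = BeautifulSoup(xml_text, "xml")  # assumes lxml is installed
    return [
        {
            "attribName": child_text(item, "name"),
            "id": item.get("id"),
            "attribDesc": child_text(item, "description"),
            # First <value> under the item, like attribData.options.value above.
            "attribValue": child_text(item, "value"),
        }
        for item in soup.find_all("attribute-item")
    ]


attribute_list = extract_attributes(XML)
print(attribute_list)
```

Using find() with an explicit tag name for every child avoids the collision with Tag.name and keeps the lookups uniform.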
https://stackoverflow.com/questions/51769008