
Python 3.x: Parsing an XML file with namespaces using Python xml.etree

Stack Overflow user
Asked on 2020-07-13 02:19:39

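I am trying to parse a large XML file using xml.etree. Its structure is shown below.

I am specifically interested in extracting the references, with their Title, that have a Publisher, as shown below.

Below is a sample of the code I have tried. It prints nothing. Any help is much appreciated.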

import xml.etree.ElementTree as et

data = """<exist:result xmlns:exist="http://exist.sourceforge.net/NS/exist" exist:hits="1" exist:start="1" exist:count="1" exist:compilation-time="0" exist:execution-time="0">
    <events>
        <paging page="9" pageNumberOfRecords="20" totalNumberOfRecords="215"/>
        <WeatherEvent xmlns="http://hwe.niwa.co.nz/schema/2011" xmlns:gml="http://www.opengis.net/gml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hwe.niwa.co.nz/schema/2011 ../hwe.xsd">
    <Identifier>November_2019_Timaru_Hail</Identifier>
    <Title> November 2019 Timaru Hail</Title>
    <StartDate>2019-11-20</StartDate> 
        <Abstract>A severe hailstorm over Timaru, with golf ball-sized hail stones, caused extensive damage to buildings and vehicles.</Abstract>
        <Notes/>
        <Regions>
            <Region name="Canterbury">
                <Hazards>
                    <Hazard type="Hail">
                        <Location name="Timaru">
                            <gml:Point gml:id="Timaru_1" srsName="urn:ogc:def:crs:EPSG:6.6:4326" srsDimension="2">
                                <gml:pos>-44.398445 171.255200</gml:pos>
                            </gml:Point>
                        </Location>
                        <Impacts>
                            <Impact type="InsuranceClaim" unit="$" value="130700000">Insurance claims totalled $130.7 million.</Impact>
                            
                            <Impact type="GeneralComment">Large hail stones smashed windows, pelted holes in roofs, damaged vehicles and forced the closure of businesses.</Impact>
                            <Impact type="GeneralComment">The Fire and Emergency NZ Mid-South Canterbury area commander said they had received 30 call-outs between noon and 2.40pm. Twenty one  of them were for hail or rain damage.</Impact>
                            <Impact type="GeneralComment">The South Canterbury Chamber of Commerce said there had been considerable damage and flooding with a number of businesses forced to close until their premises were secure and safe to open.  The Timaru library and the Aigantighe Art Gallery were both closed due to damage sustained.</Impact>
                            <Impact type="GeneralComment">A Timaru panel beating business estimated there were at least 10,000 vehicles in Timaru that were damaged by the hail.  Vehicles had dents, broken windscreens and broken wing mirrors.  The structural integrity of many of the damaged vehicles was found to be compromised.</Impact>
                            <Impact type="GeneralComment">An Australian-based team of hail damage repairers set up a base in Timaru to fix cars damaged in the hailstorm.  They anticipated that repairing hail-damaged cars in Timaru would take at least six months.</Impact>
                        </Impacts>
                    </Hazard>
                    <Hazard type="Hail">
                        <Location name="St Andrews">
                            <gml:Point gml:id="St_Andrews_1" srsName="urn:ogc:def:crs:EPSG:6.6:4326" srsDimension="2">
                                <gml:pos>-44.5301 171.1909</gml:pos>
                            </gml:Point>
                        </Location>
                        <Impacts>
                            <Impact type="GeneralComment">Federated Farmers reported there had been significant crop damage near St Andrews.</Impact>
                        </Impacts>
                    </Hazard>
                </Hazards>
            </Region>
        </Regions>
        <References>
            <Reference>
                <Title>Insurance Council of New Zealand (https://www.icnz.org.nz/natural-disasters/cost-of-natural-disasters/)</Title>
                <Type>Reference</Type>
            </Reference>            
            <Reference>
                <Title>Headline:  Giant hail stones hammer Timaru as storm moves up the country.</Title>
                <Type>Reference</Type>
                <Publisher>www.stuff.co.nz, 20 November 2019.  </Publisher>
            </Reference>
            <Reference>
                <Title>Headline:  Insurance companies face deluge of hail damage claims.</Title>
                <Type>Reference</Type>
                <Publisher>www.stuff.co.nz, 21 November 2019.  </Publisher>
            </Reference>
            <Reference>
                <Title>Headline:  Cars damaged in severe Timaru hailstorm failing warrents of fitness.</Title>
                <Type>Reference</Type>
                <Publisher>www.stuff.co.nz, 4 December 2019.  </Publisher>
            </Reference>
            <Reference>
                <Title>Headline:  Record insurance repairs for cars smashed by hail in Timaru.</Title>
                <Type>Reference</Type>
                <Publisher>www.stuff.co.nz, 23 December 2019.  </Publisher>
            </Reference>
            
        </References>
</WeatherEvent>
</events>
</exist:result>
"""

root = et.fromstring(data)

ns = {'exist':'http://exist.sourceforge.net/NS/exist', 'niwa':'http://hwe.niwa.co.nz/schema/2011'}


results = root.findall('exist:result', ns)
for event in results:
    weatherEvnt = event.find('niwa: events', ns)
    for WE in weatherEvnt:
        Ref = WE.find('niwa: WeatherEvent', ns)
        for x in Ref.find('niwa: References', ns):
            print(x.text)

1 Answer

Stack Overflow user

Accepted answer

Answered on 2020-07-13 03:39:24

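The problems include at least the following:

  1. The root element is already exist:result, so

results = root.findall('exist:result', ns)

returns an empty list, because exist:result has no child elements of that name.

  2. There should be no space after the colon that separates a namespace prefix from its local name. niwa: events should be niwa:events, and likewise for the other paths.

  3. niwa:References has no immediate text children. The text is in the Title, Type, and Publisher elements nested inside each Reference.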
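The points above can be sketched on a cut-down document. This is only an illustration under assumptions: the `sample` string is a hypothetical, shortened stand-in for the XML in the question, and the variable names are made up for the example. It also relies on one observation that is not part of the list above: in the question's XML the default namespace is declared on WeatherEvent, so the enclosing events element is in no namespace and is addressed without a prefix.

```python
import xml.etree.ElementTree as et

# Hypothetical, shortened stand-in for the XML in the question.
sample = """<exist:result xmlns:exist="http://exist.sourceforge.net/NS/exist">
  <events>
    <WeatherEvent xmlns="http://hwe.niwa.co.nz/schema/2011">
      <Title>Sample event</Title>
      <References>
        <Reference>
          <Title>Sample reference</Title>
          <Type>Reference</Type>
        </Reference>
      </References>
    </WeatherEvent>
  </events>
</exist:result>"""

ns = {'exist': 'http://exist.sourceforge.net/NS/exist',
      'niwa': 'http://hwe.niwa.co.nz/schema/2011'}

root = et.fromstring(sample)

# Point 1: the root itself is exist:result, so searching its children
# for exist:result finds nothing.
root_tag = root.tag
no_match = root.findall('exist:result', ns)

# Point 2: no space after the prefix colon. 'events' carries no prefix
# because the default namespace is only declared on WeatherEvent.
references = root.find('events/niwa:WeatherEvent/niwa:References', ns)

# Point 3: References holds only whitespace text of its own; the useful
# text sits in the children of each Reference.
own_text = references.text.strip()
titles = [ref.findtext('niwa:Title', namespaces=ns) for ref in references]

print(root_tag)
print(no_match)
print(repr(own_text))
print(titles)
```

Iterating over `references` yields its Reference children directly, which is why `findtext` is called on each of them rather than on References itself.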

I am not sure what your end goal is, but this code,

import xml.etree.ElementTree as et

data = "" # As specified in question.

root = et.fromstring(data)
ns = {'exist':'http://exist.sourceforge.net/NS/exist',
      'niwa':'http://hwe.niwa.co.nz/schema/2011'}

for ref in root.findall('.//niwa:Title', ns):
    print('Title=' + ref.text)

demonstrates successfully selecting text in namespaced XML, and outputs:

Title= November 2019 Timaru Hail
Title=Insurance Council of New Zealand (https://www.icnz.org.nz/natural-disasters/cost-of-natural-disasters/)
Title=Headline:  Giant hail stones hammer Timaru as storm moves up the country.
Title=Headline:  Insurance companies face deluge of hail damage claims.
Title=Headline:  Cars damaged in severe Timaru hailstorm failing warrents of fitness.
Title=Headline:  Record insurance repairs for cars smashed by hail in Timaru.
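If the goal is the one stated in the question (references that have both a Title and a Publisher), one possible approach is to loop over each Reference and skip those without a Publisher child. The following is a minimal sketch, not the accepted answer's code: the `sample` data, the example.com publisher string, and the `published` list are hypothetical and only mimic the shape of the question's XML.

```python
import xml.etree.ElementTree as et

# Hypothetical data shaped like the References block in the question.
sample = """<events>
  <WeatherEvent xmlns="http://hwe.niwa.co.nz/schema/2011">
    <References>
      <Reference>
        <Title>Reference without publisher</Title>
        <Type>Reference</Type>
      </Reference>
      <Reference>
        <Title>Headline:  Sample headline.</Title>
        <Type>Reference</Type>
        <Publisher>www.example.com, 1 January 2020.  </Publisher>
      </Reference>
    </References>
  </WeatherEvent>
</events>"""

ns = {'niwa': 'http://hwe.niwa.co.nz/schema/2011'}
root = et.fromstring(sample)

published = []
for ref in root.findall('.//niwa:Reference', ns):
    # findtext returns None when the Reference has no Publisher child.
    publisher = ref.findtext('niwa:Publisher', namespaces=ns)
    if publisher is None:
        continue
    title = ref.findtext('niwa:Title', default='', namespaces=ns)
    # Strip the leading/trailing whitespace seen in the source data.
    published.append((title.strip(), publisher.strip()))

for title, publisher in published:
    print(f'{title} | {publisher}')
```

Filtering in Python rather than in the path expression keeps the check explicit and makes it easy to add further conditions, for example on the Type element.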
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/62868383
