
R XML: Get the ancestor attribute for each match of a descendant attribute

Stack Overflow user
Asked 2016-10-14 02:16:36
2 answers · 93 views · 0 followers · 0 votes

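I am trying to build a data frame by extracting outage data from an XML file and associating each outage with a specific meter. A simplified example of the data: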

  <MeterReadings Irn="311" Source="Remote">
       <Meter MeterIrn="311" IsActive="true" /> 
       <ConsumptionData>
       </ConsumptionData>
       <IntervalData>
           <Reading TimeStamp="2016-10-13" />
       </IntervalData>
       <EventData>
           <EventSpec Type="Outage Detected from Interval Data" Category="Full Power Outage / Restoration" />
           <Event TimeStamp="2014-10-31 14:17:40" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data.">
           </Event>
           <Event TimeStamp="2014-10-31 14:16:20" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data.">
           </Event>
           <Event TimeStamp="2014-10-31 14:16:16" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data.">
           </Event>
           <Event TimeStamp="2014-10-31 14:15:12" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data.">
           </Event>
           <Event TimeStamp="2014-10-31 14:12:00" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data">
           </Event>
       </EventData>
  </MeterReadings>

What I want is a data frame with the meter number in the first column and the time of each outage in the second column.

I have tried the following expressions:

    outage.inv <- data.frame(xpathSApply(doc, '//Event[contains(@EventInfo, "Outage detected from Interval Data")]/ancestor::MeterReadings', xmlGetAttr, "Irn"))
    outage.df <- data.frame(xpathSApply(doc, '//MeterReadings/EventData/EventSpec[@Type="Outage Detected from Interval Data"]/following-sibling::Event', xmlGetAttr, "TimeStamp"))
    outage.inv <- cbind(outage.inv, outage.df)

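But the first expression returns the meter number only once, so the column lengths do not match: in this case, 1 meter number and 5 outage times. Is there a way to extract the ancestor attribute once for every occurrence of the descendant attribute?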
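
The mismatch can be reproduced in a few lines. The sketch below uses the xml2 package rather than the XML package from the question, and a trimmed, hypothetical two-event sample. It suggests the cause: an XPath result is a node-set of unique nodes, so every matching `Event` leads to the same single `MeterReadings` ancestor.

```r
library(xml2)

# Hypothetical sample, trimmed from the data above to two events.
txt <- '<MeterReadings Irn="311">
  <EventData>
    <Event TimeStamp="2014-10-31 14:17:40" EventInfo="Outage detected from Interval Data." />
    <Event TimeStamp="2014-10-31 14:16:20" EventInfo="Outage detected from Interval Data." />
  </EventData>
</MeterReadings>'
doc <- read_xml(txt)

# One node per matching event.
events <- xml_find_all(doc, "//Event[contains(@EventInfo, 'Outage')]")

# Stepping to the ancestor inside the same XPath collapses the result:
# both events share one ancestor, and a node-set holds each node once.
ancestors <- xml_find_all(
  doc, "//Event[contains(@EventInfo, 'Outage')]/ancestor::MeterReadings"
)

n_events <- length(events)
n_ancestors <- length(ancestors)
```

Here `n_events` and `n_ancestors` differ, which is why the two columns cannot be bound together directly.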

I have looked at the following answers but have not been able to work it out.

XPath to select element based on childs child value

R: How to get parent attributes and node values at the same time?

Any help would be greatly appreciated.


2 Answers

Stack Overflow user

Answered 2016-10-14 03:32:30

Another approach.

Here is the data:

txt <- '  <MeterReadings Irn="311" Source="Remote">
       <Meter MeterIrn="311" IsActive="true" />
       <ConsumptionData>
       </ConsumptionData>
       <IntervalData>
           <Reading TimeStamp="2016-10-13" />
       </IntervalData>
       <EventData>
           <EventSpec Type="Outage Detected from Interval Data" Category="Full Power Outage / Restoration" />
           <Event TimeStamp="2014-10-31 14:17:40" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data.">
           </Event>
           <Event TimeStamp="2014-10-31 14:16:20" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data.">
           </Event>
           <Event TimeStamp="2014-10-31 14:16:16" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data.">
           </Event>
           <Event TimeStamp="2014-10-31 14:15:12" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data.">
           </Event>
           <Event TimeStamp="2014-10-31 14:12:00" DiscoveredAt="2014-11-01 12:05:28" Source="Event Log" EventInfo="Outage detected from Interval Data">
           </Event>
       </EventData>
  </MeterReadings>'

We can process the records in a different way:

library(xml2)
library(purrr)
library(dplyr)

doc <- read_xml(txt)

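# For each meter, pair its Irn with every outage event found beneath it.
xml_find_all(doc, "//MeterReadings") %>%
  map_df(function(x) {
    meter <- xml_attr(x, "Irn")
    # The leading "." keeps the search inside the current <MeterReadings> node;
    # a bare "//Event" would match events from every meter in the document.
    xml_find_all(x, ".//Event[contains(@EventInfo, 'Outage')]") %>%
      map_df(function(y) {
        # tibble() replaces the deprecated data_frame()
        tibble(
          meter=meter,
          timestamp=xml_attr(y, "TimeStamp"),
          discovered_at=xml_attr(y, "DiscoveredAt")
        )
      })
  })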

This produces:

## # A tibble: 5 × 3
##   meter           timestamp       discovered_at
##   <chr>               <chr>               <chr>
## 1   311 2014-10-31 14:17:40 2014-11-01 12:05:28
## 2   311 2014-10-31 14:16:20 2014-11-01 12:05:28
## 3   311 2014-10-31 14:16:16 2014-11-01 12:05:28
## 4   311 2014-10-31 14:15:12 2014-11-01 12:05:28
## 5   311 2014-10-31 14:12:00 2014-11-01 12:05:28
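
A vectorised variant is also possible, sketched here under some assumptions: the two-meter sample, its `<Readings>` root element, and the second meter's timestamp are hypothetical and not taken from the question. In xml2, `xml_find_first()` applied to a node-set returns one result per input node, so it can look up the `MeterReadings` ancestor of each event without collapsing duplicates.

```r
library(xml2)

# Hypothetical sample: two meters under an assumed <Readings> root element.
txt <- '<Readings>
  <MeterReadings Irn="311">
    <EventData>
      <Event TimeStamp="2014-10-31 14:17:40" EventInfo="Outage detected from Interval Data." />
      <Event TimeStamp="2014-10-31 14:16:20" EventInfo="Outage detected from Interval Data." />
    </EventData>
  </MeterReadings>
  <MeterReadings Irn="312">
    <EventData>
      <Event TimeStamp="2014-11-02 09:00:00" EventInfo="Outage detected from Interval Data." />
    </EventData>
  </MeterReadings>
</Readings>'
doc <- read_xml(txt)

# All outage events in the document, one node per event.
events <- xml_find_all(doc, "//Event[contains(@EventInfo, 'Outage')]")

# One ancestor per event: the result has the same length as `events`.
meters <- xml_find_first(events, "ancestor::MeterReadings")

outage_df <- data.frame(
  meter = xml_attr(meters, "Irn"),
  timestamp = xml_attr(events, "TimeStamp"),
  stringsAsFactors = FALSE
)
```

Because the lookup starts from each event rather than from each meter, the meter column always has as many entries as there are matching events.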

Stack Overflow user

Answered 2016-10-14 21:24:26

Revised the answer to filter on both meter and timestamp, so that every timestamp is not repeated for every meter:

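# Keep only meters that have at least one outage event, then collect those events.
outage.df <- xml_find_all(doc, "//MeterReadings[EventData/Event[contains(@EventInfo, 'Outage')]]") %>%
  map_df(function(x) {
    meter <- xml_attr(x, "Irn")
    # paste0() with a quoted value avoids the stray spaces paste() inserts
    # and also works when Irn is not numeric.
    xml_find_all(x, paste0("//MeterReadings[@Irn='", meter, "']/EventData/Event[contains(@EventInfo, 'Outage')]")) %>%
      map_df(function(y) {
        tibble(
          meter=meter,
          timestamp=xml_attr(y, "TimeStamp"),
          discovered_at=xml_attr(y, "DiscoveredAt")
        )
      })
  })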
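
The repetition this revision guards against appears to come from how XPath treats a leading `//`: even when the search is called on a single `MeterReadings` node, the path is evaluated from the document root. A minimal sketch of the difference follows, using a hypothetical two-meter sample (the `<Readings>` root and the second meter are assumptions, not part of the question's data):

```r
library(xml2)

# Hypothetical sample: meter 311 has two events, meter 312 has one.
txt <- '<Readings>
  <MeterReadings Irn="311">
    <EventData>
      <Event TimeStamp="2014-10-31 14:17:40" EventInfo="Outage detected from Interval Data." />
      <Event TimeStamp="2014-10-31 14:16:20" EventInfo="Outage detected from Interval Data." />
    </EventData>
  </MeterReadings>
  <MeterReadings Irn="312">
    <EventData>
      <Event TimeStamp="2014-11-02 09:00:00" EventInfo="Outage detected from Interval Data." />
    </EventData>
  </MeterReadings>
</Readings>'
doc <- read_xml(txt)

first_meter <- xml_find_all(doc, "//MeterReadings")[[1]]

# "//Event" ignores the context node and searches the whole document.
n_absolute <- length(xml_find_all(first_meter, "//Event"))

# ".//Event" searches only the descendants of the context node.
n_relative <- length(xml_find_all(first_meter, ".//Event"))
```

With this sample, `n_absolute` counts the events of both meters while `n_relative` counts only those of meter 311. Filtering on `@Irn`, as above, and using a relative `.//` path are two ways to reach the same per-meter result.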
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/40028173
