
Converting medical device XML data to a data frame in R

Stack Overflow user
Asked 2020-05-06 22:45:06
Answers: 1 · Views: 56 · Followers: 0 · Votes: 2

I have XML data generated by a medical device. I am trying to parse it, but I cannot work out how. Here is a sample of the XML data:

<?xml version="1.0" encoding="UTF-8"?>
<Report Type="CprReport" Version="1.3" Generated="2020-05-06T09:08:47.94976" IncidentID="F0000001" PatientID="DE-IDENTIFIED" SoftwareVersion="11.0.0.1036">
  <DeviceID>BLANK</DeviceID>
  <DeviceType>LP15</DeviceType>
  <CPRAnnotationEdited>false</CPRAnnotationEdited>
  <PowerOn>2020-04-18T20:57:55.00000</PowerOn>
  <DeviceConfiguration>2L355RRB0200UR</DeviceConfiguration>
  <TimeAdjustment>PT1S</TimeAdjustment>
  <CPRStatistics SACVersion="10" CPRPauseThreshold="10" CompressionPauseThreshold="3" MinimumTimeInterval="30" MinimumCompressions="5">
    <OverallStatistic>
      <AverageCompressionRatePerMin>96.3298058454348</AverageCompressionRatePerMin>
      <AverageVentilationRatePerMin>9.34405124270701</AverageVentilationRatePerMin>
      <MedianCompressionRatePerMin>99.8261838893409</MedianCompressionRatePerMin>
      <MedianVentilationRatePerMin>6.51749943203332</MedianVentilationRatePerMin>
      <TotalTimeCompressionsDuringPromptedCprSec>0</TotalTimeCompressionsDuringPromptedCprSec>
      <TotalTimeCompressionsDuringValidSec>1277.72862806776</TotalTimeCompressionsDuringValidSec>
      <TotalTimeCprDuringPromptedCprSec>0</TotalTimeCprDuringPromptedCprSec>
      <TotalTimeCprDuringValidSec>1298.04384154134</TotalTimeCprDuringValidSec>
      <TotalTimePromptedCprSec>0</TotalTimePromptedCprSec>
      <TotalTimeValidSec>1339.14938235198</TotalTimeValidSec>
      <TotalTimeValidSecEx>1339.14938235198</TotalTimeValidSecEx>
    </OverallStatistic>
    <IntervalStatistics>
      <CPRStatisticsItem Interval="1" ResponsibleForCPR="" CPRDurationSec="0" PauseDurationSec="88.355" ReasonForPause="" IntervalComments="" AverageCompressionRatePerMin="-1" AverageVentilationRatePerMin="-1" MedianCompressionRatePerMin="-1" MedianVentilationRatePerMin="-1" TotalTimeCompressionsDuringPromptedCprSec="0" TotalTimeCompressionsDuringValidSec="0.016383236672237" TotalTimeCprDuringPromptedCprSec="0" TotalTimeCprDuringValidSec="0" TotalTimePromptedCprSec="0" TotalTimeValidSec="0.016383236672237" TotalTimeValidSecEx="0.016383236672237"/>
      <CPRStatisticsItem Interval="2" ResponsibleForCPR="" CPRStartTime="2020-04-18T20:59:23.37100" CPREndTime="2020-04-18T21:00:53.69200" CPRDurationSec="90.321" PauseDurationSec="12.337" ReasonForPause="" IntervalComments="" AverageCompressionRatePerMin="95.8528439195659" AverageVentilationRatePerMin="0" MedianCompressionRatePerMin="127.518878980892" MedianVentilationRatePerMin="0" TotalTimeCompressionsDuringPromptedCprSec="0" TotalTimeCompressionsDuringValidSec="80.0812608538943" TotalTimeCprDuringPromptedCprSec="0" TotalTimeCprDuringValidSec="90.3207837740424" TotalTimePromptedCprSec="0" TotalTimeValidSec="102.673744224909" TotalTimeValidSecEx="102.673744224909"/>
      <CPRStatisticsItem Interval="3" ResponsibleForCPR="" CPRStartTime="2020-04-18T21:01:06.04500" CPREndTime="2020-04-18T21:05:05.83000" CPRDurationSec="239.785" PauseDurationSec="18.89" ReasonForPause="" IntervalComments="" AverageCompressionRatePerMin="91.8527379821395" AverageVentilationRatePerMin="0" MedianCompressionRatePerMin="99.4940232081911" MedianVentilationRatePerMin="0" TotalTimeCompressionsDuringPromptedCprSec="0" TotalTimeCompressionsDuringValidSec="239.78505193486" TotalTimeCprDuringPromptedCprSec="0" TotalTimeCprDuringValidSec="239.78505193486" TotalTimePromptedCprSec="0" TotalTimeValidSec="258.691307054622" TotalTimeValidSecEx="258.691307054622"/>
      <CPRStatisticsItem Interval="4" ResponsibleForCPR="" CPRStartTime="2020-04-18T21:05:24.73600" CPREndTime="2020-04-18T21:18:09.50600" CPRDurationSec="764.77" PauseDurationSec="9.322" ReasonForPause="" IntervalComments="" AverageCompressionRatePerMin="97.1202954559885" AverageVentilationRatePerMin="8.60363351605325" MedianCompressionRatePerMin="99.6300586675938" MedianVentilationRatePerMin="8.55159640102828" TotalTimeCompressionsDuringPromptedCprSec="0" TotalTimeCompressionsDuringValidSec="754.693797306596" TotalTimeCprDuringPromptedCprSec="0" TotalTimeCprDuringValidSec="764.769487860022" TotalTimePromptedCprSec="0" TotalTimeValidSec="774.107932763197" TotalTimeValidSecEx="774.107932763197"/>
      <CPRStatisticsItem Interval="5" ResponsibleForCPR="" CPRStartTime="2020-04-18T21:18:18.84400" CPREndTime="2020-04-18T21:21:42.01300" CPRDurationSec="203.169" PauseDurationSec="0.491000000000014" ReasonForPause="" IntervalComments="" AverageCompressionRatePerMin="99.283111575899" AverageVentilationRatePerMin="9.42747647011503" MedianCompressionRatePerMin="99.4707780795654" MedianVentilationRatePerMin="9.45028304169019" TotalTimeCompressionsDuringPromptedCprSec="0" TotalTimeCompressionsDuringValidSec="203.152134735738" TotalTimeCprDuringPromptedCprSec="0" TotalTimeCprDuringValidSec="203.168517972411" TotalTimePromptedCprSec="0" TotalTimeValidSec="203.660015072578" TotalTimeValidSecEx="203.660015072578"/>
    </IntervalStatistics>
  </CPRStatistics>
  <CPRShockPauseStatistics/>
</Report>

I would like to generate two data frames, one from <OverallStatistic> and one from <IntervalStatistics>.

I am using the XML package to parse the data, and this is what I have so far:

df <- xmlParse(file = "file.xml", useInternalNodes = TRUE)

df_1 <- xmlToDataFrame(df, nodes = getNodeSet(df, "//CPRStatistics"))

This is where I am stuck.


1 Answer

Stack Overflow user

Accepted answer

Answered 2020-05-06 23:22:55

In general, the xml2 package is easier to work with, although the XML package will at least try to build a data frame for you where that is reasonable.
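For comparison, here is a minimal sketch of what the xml2 route might look like. It is not part of the original answer. The inline XML is a trimmed, hypothetical stand-in for the device file so that the snippet runs on its own; with the real file you would pass the path to `read_xml()` instead.

```r
library(xml2)

# Trimmed stand-in for the device report (assumption: same element and
# attribute names as the sample, but far fewer of them)
doc <- read_xml('<CPRStatistics>
  <OverallStatistic>
    <AverageCompressionRatePerMin>96.3298058454348</AverageCompressionRatePerMin>
    <TotalTimeValidSec>1339.14938235198</TotalTimeValidSec>
  </OverallStatistic>
  <IntervalStatistics>
    <CPRStatisticsItem Interval="2" CPRDurationSec="90.321"/>
    <CPRStatisticsItem Interval="3" CPRDurationSec="239.785"/>
  </IntervalStatistics>
</CPRStatistics>')

# Child elements of <OverallStatistic>: element names become column names,
# element text becomes the (numeric) values of a one-row data frame
stat_nodes <- xml_children(xml_find_first(doc, "//OverallStatistic"))
overall <- as.data.frame(
  as.list(setNames(xml_double(stat_nodes), xml_name(stat_nodes)))
)

# <CPRStatisticsItem> values live in attributes: xml_attrs() returns one
# named character vector per node, which rbind stacks into rows
items <- xml_find_all(doc, "//CPRStatisticsItem")
intervals <- as.data.frame(do.call(rbind, xml_attrs(items)),
                           stringsAsFactors = FALSE)
```

The plain `rbind` here relies on every item carrying the same attributes, which holds for this stand-in but not for every item in the sample file.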

The node set obtained from <OverallStatistic> is the easier one to turn into a data frame:

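library(XML)

# Parse the file; note that `df` holds the parsed XML document, not a data frame
df <- xmlParse(file = "file.xml", useInternalNodes = TRUE)

# Each child element of <OverallStatistic> becomes one column
overall <- xmlToDataFrame(df, nodes = getNodeSet(df, "//OverallStatistic"))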

This data frame has only one row:

overall
#>   AverageCompressionRatePerMin AverageVentilationRatePerMin
#> 1             96.3298058454348             9.34405124270701
#>   MedianCompressionRatePerMin MedianVentilationRatePerMin
#> 1            99.8261838893409            6.51749943203332
#>   TotalTimeCompressionsDuringPromptedCprSec TotalTimeCompressionsDuringValidSec
#> 1                                         0                    1277.72862806776
#>   TotalTimeCprDuringPromptedCprSec TotalTimeCprDuringValidSec TotalTimePromptedCprSec
#> 1                                0           1298.04384154134                       0
#>   TotalTimeValidSec TotalTimeValidSecEx
#> 1  1339.14938235198    1339.14938235198
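The values in `overall` are typically returned as text rather than numbers, so a conversion step is usually needed before any arithmetic. The sketch below is an addition to the original answer. It uses a hand-built stand-in for `overall` with three of the columns copied from the output above, so that it runs without the XML file.

```r
# Stand-in for the one-row `overall` data frame (assumption: columns arrive
# as character strings, as they usually do from xmlToDataFrame)
overall <- data.frame(
  AverageCompressionRatePerMin = "96.3298058454348",
  AverageVentilationRatePerMin = "9.34405124270701",
  TotalTimeValidSec            = "1339.14938235198",
  stringsAsFactors = FALSE
)

# Convert every column to numeric. Going through as.character() first keeps
# this correct even if the columns came back as factors.
overall[] <- lapply(overall, function(x) as.numeric(as.character(x)))

str(overall)
```

Assigning into `overall[]` replaces the columns in place, so the result stays a data frame with the same column names.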

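The second node, <IntervalStatistics>, is harder to parse because the values are stored in attributes rather than as text nodes. You need to find all of the <CPRStatisticsItem> nodes, extract each node's attributes into a list, and then rbind them together into a data frame. The [-1] in the code below drops the first item, which lacks the CPRStartTime and CPREndTime attributes and so has fewer values than the other rows. Because the result has so many columns, I have converted it to a tibble rather than a data frame so that it prints neatly to the screen: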

dplyr::as_tibble(do.call(rbind, lapply(getNodeSet(df, "//CPRStatisticsItem")[-1], xmlAttrs)))
#> # A tibble: 4 x 19
#>   Interval ResponsibleForC~ CPRStartTime CPREndTime CPRDurationSec PauseDurationSec
#>   <chr>    <chr>            <chr>        <chr>      <chr>          <chr>           
#> 1 2        ""               2020-04-18T~ 2020-04-1~ 90.321         12.337          
#> 2 3        ""               2020-04-18T~ 2020-04-1~ 239.785        18.89           
#> 3 4        ""               2020-04-18T~ 2020-04-1~ 764.77         9.322           
#> 4 5        ""               2020-04-18T~ 2020-04-1~ 203.169        0.4910000000000~
#> # ... with 13 more variables: ReasonForPause <chr>, IntervalComments <chr>,
#> #   AverageCompressionRatePerMin <chr>, AverageVentilationRatePerMin <chr>,
#> #   MedianCompressionRatePerMin <chr>, MedianVentilationRatePerMin <chr>,
#> #   TotalTimeCompressionsDuringPromptedCprSec <chr>,
#> #   TotalTimeCompressionsDuringValidSec <chr>,
#> #   TotalTimeCprDuringPromptedCprSec <chr>, TotalTimeCprDuringValidSec <chr>,
#> #   TotalTimePromptedCprSec <chr>, TotalTimeValidSec <chr>, TotalTimeValidSecEx <chr>
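The call above leaves out the first <CPRStatisticsItem> by using `[-1]`. If you want to keep every interval, one possible approach (an addition to the original answer) is to look up each node's attributes against the union of all attribute names, so that missing attributes become `NA`. The inline XML is a trimmed, hypothetical stand-in for the device file, and the name `num_cols` is chosen here for illustration.

```r
library(XML)

# Trimmed stand-in: the first item lacks CPRStartTime / CPREndTime,
# mirroring Interval 1 in the sample file
xml_text <- '<IntervalStatistics>
  <CPRStatisticsItem Interval="1" CPRDurationSec="0" PauseDurationSec="88.355"/>
  <CPRStatisticsItem Interval="2" CPRStartTime="2020-04-18T20:59:23.37100" CPREndTime="2020-04-18T21:00:53.69200" CPRDurationSec="90.321" PauseDurationSec="12.337"/>
</IntervalStatistics>'

doc <- xmlParse(xml_text, asText = TRUE)
attr_list <- lapply(getNodeSet(doc, "//CPRStatisticsItem"), xmlAttrs)

# Union of attribute names across all items, in first-seen order
all_names <- unique(unlist(lapply(attr_list, names)))

# Indexing a named vector by a name it does not have yields NA,
# so every row ends up with the same length and column order
rows <- lapply(attr_list, function(a) unname(a[all_names]))
intervals <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(intervals) <- all_names

# Attributes are always text; convert the numeric ones explicitly
num_cols <- c("Interval", "CPRDurationSec", "PauseDurationSec")
intervals[num_cols] <- lapply(intervals[num_cols], as.numeric)

intervals
```

The timestamp columns are left as character strings here. How to parse them depends on the time zone the device records in, which the sample does not state.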
Votes: 3

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/61638148
