
Q: R: calculate for each level of a factor separately, then calculate min/mean/max across the levels

Stack Overflow user

Asked 2018-09-13 07:41:50

Answers: 2 | Views: 723 | Followers: 0 | Score: 1

I have output from a water allocation model: hourly inflow and discharge values for a river. I did 5 model runs.

Reproducible example:

df <- data.frame(
  time      = rep(seq(from = as.POSIXct("2012-1-1 0:00", tz = "UTC"),
                      to   = as.POSIXct("2012-1-1 23:00", tz = "UTC"),
                      by   = "hour"), 5),
  run       = as.factor(rep(1:5, each = 24)),
  inflow    = rep(seq(1, 300, length.out = 24), 5),
  discharge = rep(seq(1, 180, length.out = 24), 5)
)

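In reality, of course, the values differ between runs. (I also have far more data: 100 runs and 35 years of hourly values.)

First, I want to calculate the scarcity factor for each run, which means computing something like 1 - (discharge / inflow 6 hours earlier), because the water needs 6 hours to travel through the catchment.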

 scarcityfactor <- 1 - (discharge / lag(inflow,6))
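One detail worth checking in the line above: with only base R loaded, lag() resolves to stats::lag(), which adjusts the time-series attributes of a vector but leaves its values in place, so on a plain column it does not return "inflow 6 hours earlier". Below is a minimal sketch of an actual shift. The helper name shift_by() and the toy values are assumptions made for illustration and are not part of the original post.

```r
# Toy inflow series for a single run (made-up values, not model output)
inflow <- seq(10, 120, by = 10)   # 12 hourly values

# Hypothetical helper: value from k steps earlier, NA where none exists
shift_by <- function(x, k) c(rep(NA, k), x)[seq_along(x)]

shifted <- shift_by(inflow, 6)
shifted
# NA NA NA NA NA NA 10 20 30 40 50 60

# stats::lag() on a plain vector only changes the "tsp" attribute;
# the values themselves stay where they were
unshifted <- as.numeric(stats::lag(inflow, 6))
all(unshifted == inflow)
# TRUE
```

dplyr::lag() and data.table::shift() both shift values in the way shift_by() does, and both answers below rely on one of them.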

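Then I want to calculate the mean, maximum and minimum of the scarcity factor across all runs, to find the highest, lowest and average scarcity that could occur at each time step depending on the model run. So I think I could calculate the mean, max and min for each time step like this: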

f1 <- function(x) c(Mean = mean(x), Max = max(x), Min = min(x))
results <- do.call(data.frame, aggregate(scarcityfactor ~ time,
                                         data = df,
                                         FUN = f1))

Could anybody help me with the code?

Tags: factors, levels

2 Answers

Stack Overflow user

Accepted answer

Answered 2018-09-13 08:09:54

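library(tidyverse)

df %>%
  # lag inflow within each run so values never leak across runs
  group_by(run) %>%
  mutate(scarcityfactor = 1 - discharge / lag(inflow, 6)) %>%
  # then summarise across runs for each time step
  group_by(time) %>%
  summarise(Mean = mean(scarcityfactor),
            Max = max(scarcityfactor),
            Min = min(scarcityfactor))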

# # A tibble: 24 x 4
#  time                   Mean     Max     Min
#  <dttm>                <dbl>   <dbl>   <dbl>
# 1 2012-01-01 00:00:00  NA      NA      NA    
# 2 2012-01-01 01:00:00  NA      NA      NA    
# 3 2012-01-01 02:00:00  NA      NA      NA    
# 4 2012-01-01 03:00:00  NA      NA      NA    
# 5 2012-01-01 04:00:00  NA      NA      NA    
# 6 2012-01-01 05:00:00  NA      NA      NA    
# 7 2012-01-01 06:00:00 -46.7   -46.7   -46.7  
# 8 2012-01-01 07:00:00  -2.96   -2.96   -2.96 
# 9 2012-01-01 08:00:00  -1.34   -1.34   -1.34 
#10 2012-01-01 09:00:00  -0.776  -0.776  -0.776
# # ... with 14 more rows
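For comparison, the same two steps can be sketched in base R, staying close to the aggregate() attempt in the question. This is one possible approach, not the answerer's code. It assumes the rows are ordered by time within each run, and shift_by() is a hypothetical helper introduced here for illustration.

```r
# Rebuild the question's example data (5 identical runs, 24 hourly steps)
time <- seq(as.POSIXct("2012-01-01 00:00", tz = "UTC"),
            as.POSIXct("2012-01-01 23:00", tz = "UTC"), by = "hour")
df <- data.frame(
  time      = rep(time, 5),
  run       = factor(rep(1:5, each = 24)),
  inflow    = rep(seq(1, 300, length.out = 24), 5),
  discharge = rep(seq(1, 180, length.out = 24), 5)
)

# Hypothetical helper: value from k rows earlier, NA-padded at the start
shift_by <- function(x, k) c(rep(NA, k), x)[seq_along(x)]

# Step 1: lag inflow separately within each run, then compute the factor
df$inflow_lag6    <- ave(df$inflow, df$run, FUN = function(x) shift_by(x, 6))
df$scarcityfactor <- 1 - df$discharge / df$inflow_lag6

# Step 2: summarise across runs for each time step.
# The formula interface of aggregate() drops rows with NA by default
# (na.action = na.omit), so the first 6 hours do not appear in the result.
f1 <- function(x) c(Mean = mean(x), Max = max(x), Min = min(x))
results <- do.call(data.frame,
                   aggregate(scarcityfactor ~ time, data = df, FUN = f1))
head(results, 3)
```

Unlike the tibble above, this result has 18 rows rather than 24, because the hours without a 6-hour-old inflow value are dropped instead of being reported as NA.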

Score: 0

Stack Overflow user

Answered 2018-09-13 08:21:14

If I understand the problem description correctly, I believe this is what you're after.

I'll use data.table:

library(data.table)
setDT(df)

# add scarcity_factor (group by run)
df[ , scarcity_factor := 1 - discharge/shift(inflow, 6L), by = run]

# group by time, excluding times for which the
#   scarcity factor is missing
df[!is.na(scarcity_factor), by = time,
   .(min_scarcity = min(scarcity_factor),
     mean_scarcity = mean(scarcity_factor),
     max_scarcity = max(scarcity_factor))]

#                    time  min_scarcity mean_scarcity  max_scarcity
#  1: 2012-01-01 06:00:00 -46.695652174 -46.695652174 -46.695652174
#  2: 2012-01-01 07:00:00  -2.962732919  -2.962732919  -2.962732919
#  3: 2012-01-01 08:00:00  -1.342995169  -1.342995169  -1.342995169
#  4: 2012-01-01 09:00:00  -0.776086957  -0.776086957  -0.776086957
#  5: 2012-01-01 10:00:00  -0.487284660  -0.487284660  -0.487284660
#  6: 2012-01-01 11:00:00  -0.312252964  -0.312252964  -0.312252964
#  7: 2012-01-01 12:00:00  -0.194826637  -0.194826637  -0.194826637
#  8: 2012-01-01 13:00:00  -0.110586011  -0.110586011  -0.110586011
#  9: 2012-01-01 14:00:00  -0.047204969  -0.047204969  -0.047204969
# 10: 2012-01-01 15:00:00   0.002210759   0.002210759   0.002210759
# 11: 2012-01-01 16:00:00   0.041818785   0.041818785   0.041818785
# 12: 2012-01-01 17:00:00   0.074275362   0.074275362   0.074275362
# 13: 2012-01-01 18:00:00   0.101356965   0.101356965   0.101356965
# 14: 2012-01-01 19:00:00   0.124296675   0.124296675   0.124296675
# 15: 2012-01-01 20:00:00   0.143977192   0.143977192   0.143977192
# 16: 2012-01-01 21:00:00   0.161047028   0.161047028   0.161047028
# 17: 2012-01-01 22:00:00   0.175993343   0.175993343   0.175993343
# 18: 2012-01-01 23:00:00   0.189189189   0.189189189   0.189189189

You can be more concise by programmatically lapply-ing over the different aggregators:

df[!is.na(scarcity_factor), by = time,
   lapply(list(min, mean, max), function(f) f(scarcity_factor))]
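A side effect of the lapply form is that an unnamed list of functions yields default column names (V1, V2, V3). A small sketch of one way to keep descriptive names is to name the list elements. The variable name aggs and the column names are illustrative choices, and the data is rebuilt here only so the snippet runs on its own.

```r
library(data.table)

# Rebuild the question's example data as a data.table
time <- seq(as.POSIXct("2012-01-01 00:00", tz = "UTC"),
            as.POSIXct("2012-01-01 23:00", tz = "UTC"), by = "hour")
df <- data.table(
  time      = rep(time, 5),
  run       = factor(rep(1:5, each = 24)),
  inflow    = rep(seq(1, 300, length.out = 24), 5),
  discharge = rep(seq(1, 180, length.out = 24), 5)
)
df[, scarcity_factor := 1 - discharge / shift(inflow, 6L), by = run]

# Named list of aggregators: lapply() keeps the names, and data.table
# uses them as the names of the result columns
aggs <- list(min_scarcity = min, mean_scarcity = mean, max_scarcity = max)
res <- df[!is.na(scarcity_factor),
          lapply(aggs, function(f) f(scarcity_factor)),
          by = time]
names(res)
```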

Finally, you can think of this as reshaping with aggregation and use dcast:

dcast(df, time ~ ., value.var = 'scarcity_factor',
      fun.aggregate = list(min, mean, max))

(Use df[!is.na(scarcity_factor)] as the first argument of dcast if you want to exclude the meaningless rows.)

Score: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/52308847
