
Date continuity

Stack Overflow user
Asked 2020-09-18 21:05:36
Answers: 2 · Views: 65 · Followers: 0 · Votes: 0

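I have survey data in which each employee has to enter data on consecutive days to answer the questions below. I am trying to use R to check whether each employee filled in the data continuously.

The data look like this: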

EmployeeID <- c(101,101,101,102,102,102,102,104,104)

Created <- c(2020-06-19,2020-06-20,2020-06-21,2020-06-24,2020-06-25,2020-06-28,2020-06-28,2020-06-23,2020-06-24)

Updated <- c(2020-06-19,2020-06-20,2020-06-21,2020-06-24,2020-06-25,2020-06-28,2020-06-28,2020-06-23,2020-06-24)

happy <- c("True", "false", "false"," ", "false", "True","false", "True", "false")

active <- c("false", "false", " "," ", "false", "True"," ", "false", "false")

sad <- c("True", "false", "false"," ", "false", "True","false", "True", "false")

energitic <- c("True", "false", "false"," ", "false", "True","false", "True", "false")

df <- data.frame(EmployeeID, Created, Updated, happy, active, sad, energitic)

Expected output:


2 Answers

Stack Overflow user

Accepted answer

Posted 2020-09-18 21:16:56

One approach using dplyr:

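library(dplyr)
df %>%
  group_by(EmployeeID) %>%
  summarize(
    # Build every calendar day from the earliest to the latest Created date,
    # then check that each of those days actually appears in Created
    continuous = all(do.call(seq, c(as.list(range(Created)), by = "day")) %in% Created),
    .groups = "drop"
  )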
# # A tibble: 3 x 2
#   EmployeeID continuous
#        <dbl> <lgl>     
# 1        101 TRUE      
# 2        102 FALSE     
# 3        104 TRUE      

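By the way, your sample data do not represent what you describe. Unquoted, 2020-06-19 is not a date but an arithmetic expression that evaluates to 1995. I updated the answer above on the assumption that the columns are real Date objects.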
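To make the point above concrete, here is a minimal sketch of what R does with the unquoted value compared with a quoted string converted by as.Date. The variable names x and d are only illustrative and do not appear in the question.

```r
# Unquoted, R parses this as subtraction: 2020 - 6 - 19
x <- 2020-06-19
print(x)          # 1995
print(class(x))   # "numeric"

# Quoted and converted, the value becomes a real Date object
d <- as.Date("2020-06-19")
print(class(d))   # "Date"
print(d + 1)      # "2020-06-20"
```

Date arithmetic such as d + 1 only works once the column has been converted, which is why the data below are rebuilt from quoted strings.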

Here is the updated data:

df <- local({
  EmployeeID <- c(101, 101, 101, 102, 102, 102, 102, 104, 104)
  Created <- c("2020-06-19", "2020-06-20", "2020-06-21", "2020-06-24", "2020-06-25", "2020-06-28", "2020-06-28", "2020-06-23", "2020-06-24")
  Updated <- c("2020-06-19", "2020-06-20", "2020-06-21", "2020-06-24", "2020-06-25", "2020-06-28", "2020-06-28", "2020-06-23", "2020-06-24")
  happy <- c("True", "false", "false"," ", "false", "True","false", "True", "false")
  active <- c("false", "false", " "," ", "false", "True"," ", "false", "false")
  sad <- c("True", "false", "false"," ", "false", "True","false", "True", "false")
  energitic <- c("True", "false", "false"," ", "false", "True","false", "True", "false")
  data.frame(EmployeeID, Created, Updated, happy, active, sad, energitic)
})
df[,c("Created","Updated")] <- lapply(df[,c("Created","Updated")], as.Date)
Votes: 1

Stack Overflow user

Posted 2020-09-19 06:09:21

You can check continuity by taking the differences between consecutive Created dates and testing whether all of them are equal to 1.
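As a small illustration of the idea, the sketch below applies diff to two standalone Date vectors. The vectors no_gap and with_gap and the helper is_continuous are hypothetical names made up for this example, and the sketch assumes the dates are already sorted in ascending order, as they are in the sample data.

```r
# Hypothetical example vectors, not the question's data frame
no_gap   <- as.Date(c("2020-06-19", "2020-06-20", "2020-06-21"))
with_gap <- as.Date(c("2020-06-24", "2020-06-25", "2020-06-28"))

# diff() on Date vectors returns the gaps between neighbours, in days
print(diff(no_gap))    # gaps of 1 and 1
print(diff(with_gap))  # gaps of 1 and 3

# Hypothetical helper: TRUE only when every gap is exactly one day
is_continuous <- function(dates) all(diff(dates) == 1)

print(is_continuous(no_gap))    # TRUE
print(is_continuous(with_gap))  # FALSE
```

Note that a date repeated within a group produces a gap of 0, so this check also reports such a group as not continuous.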

This can be done with dplyr:

library(dplyr)
df %>%
  group_by(EmployeeID) %>%
  summarise(Result = all(diff(Created) == 1))

# EmployeeID Result
#       <dbl> <lgl> 
#1        101 TRUE  
#2        102 FALSE 
#3        104 TRUE  

Base R:

aggregate(Created~EmployeeID, df, function(x) all(diff(x) == 1))

data.table

library(data.table)
setDT(df)[, .(Result = all(diff(Created) == 1)), EmployeeID]
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/63962813
