
Finding duplicates based on the preceding data

Stack Overflow user
Asked 2020-04-28 07:16:47
Answers: 2 | Views: 24 | Followers: 0 | Votes: 1

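Hi, I need to find duplicates. I have attached an image of the dataset and an example of a duplicate: a row with the same id and the same Result as the row for the preceding date.

Any help would be appreciated.

[Screenshot of the dataset]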

structure(list(id = c(1010001, 1010001, 1010001, 1010001, 1010001, 
1010001, 1010001, 1010001, 1010001, 1010001, 1010001, 1010001, 
1010001, 1010001, 1010001, 1010001, 1010001, 1010001, 1010001, 
1010001, 1010001, 1010001, 1010001, 1010001, 1010001, 1010001, 
1010001, 1010001, 1010001), DateCollected = structure(c(1145664000, 
1145750400, 1145836800, 1145923200, 1146009600, 1146096000, 1146096000, 
1146096000, 1146096000, 1146096000, 1146096000, 1146182400, 1146268800, 
1146355200, 1146441600, 1146528000, 1146614400, 1146700800, 1146787200, 
1146787200, 1146787200, 1146787200, 1146787200, 1146787200, 1146873600, 
1146960000, 1147046400, 1147132800, 1147219200), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), Test = c("Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)", 
"Tacrolimus (FK506)", "Tacrolimus (FK506)", "Tacrolimus (FK506)"
), Result = c(3, 4.1, 5.9, 8.1, 4.6, 7, 7.8, 11.2, 18.1, 18.4, 
27, 4, 7.8, 8.4, 8.4, 6.1, 6.8, 5.4, 5.4, 6.5, 6.7, 8.1, 14.2, 
32.4, 7.2, 8.6, 8.9, 7.2, 9.6), Units = c("ug/L", "ug/L", "ug/L", 
"ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", 
"ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", 
"ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", "ug/L", 
"ug/L", "ug/L")), row.names = c(NA, -29L), class = c("tbl_df", 
"tbl", "data.frame"))

2 Answers

Stack Overflow user

Accepted answer

Answered 2020-04-28 12:08:26

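We can write a function that takes the difference between consecutive values of Result and returns the row indices where a duplicate is found, that is, where the difference is zero.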

find_duplicates <- function(x) {
  # positions i where x[i + 1] equals x[i]
  inds <- which(diff(x) == 0)
  # return both rows of each matching pair, without repeats
  sort(unique(c(inds, inds + 1)))
}
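
To make the index arithmetic concrete, here is a minimal sketch that runs the helper on a small made-up vector. The vector `toy` and the name `dup_idx` are illustrative assumptions, not part of the question's data; the function is repeated only so the snippet runs on its own.

```r
# Same helper as in the answer, repeated so this snippet is self-contained.
find_duplicates <- function(x) {
  inds <- which(diff(x) == 0)      # positions i where x[i + 1] == x[i]
  sort(unique(c(inds, inds + 1)))  # keep both members of each equal pair
}

# Hypothetical toy vector (not taken from the question's dataset).
toy <- c(3, 8.4, 8.4, 6.1, 5.4, 5.4, 5.4, 7)

# diff(toy) is zero at positions 2, 5 and 6, so rows 2-3 and 5-7 are returned.
dup_idx <- find_duplicates(toy)
print(dup_idx)
# [1] 2 3 5 6 7
```

Note that a run of three equal values (rows 5 to 7 here) is reported once per row, because `unique()` removes the index that two overlapping pairs share.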

We can apply this function by group.

To get the duplicated rows, we can do:

library(dplyr)
df %>% group_by(id) %>% slice(find_duplicates(Result))

#      id DateCollected       Test               Result Units
#    <dbl> <dttm>              <chr>               <dbl> <chr>
#1 1010001 2006-04-30 00:00:00 Tacrolimus (FK506)    8.4 ug/L 
#2 1010001 2006-05-01 00:00:00 Tacrolimus (FK506)    8.4 ug/L 
#3 1010001 2006-05-04 00:00:00 Tacrolimus (FK506)    5.4 ug/L 
#4 1010001 2006-05-05 00:00:00 Tacrolimus (FK506)    5.4 ug/L 

To get an additional flag column instead, we can use:

df %>% 
  group_by(id) %>% 
  mutate(is_duplicate = row_number() %in% find_duplicates(Result))
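
The reason for grouping by id is easier to see with more than one id. The following is a base-R sketch of the same per-group flag, using `ave()` in place of `group_by()` and `mutate()`. The data frame `toy` and its values are made up for illustration, and this is an alternative formulation rather than the answer's own code.

```r
find_duplicates <- function(x) {
  inds <- which(diff(x) == 0)
  sort(unique(c(inds, inds + 1)))
}

# Hypothetical two-id toy data; ids and values are invented.
toy <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2),
  Result = c(5.4, 5.4, 7.0, 7.0, 8.4, 8.4)
)

# ave() applies the function within each id and returns values in row order.
# It stores the logical result as 0/1 in a numeric vector, hence the "== 1".
toy$is_duplicate <- ave(
  toy$Result, toy$id,
  FUN = function(x) seq_along(x) %in% find_duplicates(x)
) == 1

print(toy)
```

Rows 3 and 4 both hold 7.0 and are adjacent, but they belong to different ids, so neither is flagged. Without the grouping they would be reported as a duplicate pair.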
Votes: 1

Stack Overflow user

Answered 2020-04-28 07:23:33

We can group by 'id' and create a flag by checking whether Result equals the adjacent value on either side, using lag or lead.

library(dplyr)
df1 %>%
   group_by(id) %>%
   mutate(flag= Result == lag(Result)|Result == lead(Result)) %>%
   filter(flag)
# A tibble: 4 x 6
# Groups:   id [1]
#      id DateCollected       Test               Result Units flag 
#    <dbl> <dttm>              <chr>               <dbl> <chr> <lgl>
#1 1010001 2006-04-30 00:00:00 Tacrolimus (FK506)    8.4 ug/L  TRUE 
#2 1010001 2006-05-01 00:00:00 Tacrolimus (FK506)    8.4 ug/L  TRUE 
#3 1010001 2006-05-04 00:00:00 Tacrolimus (FK506)    5.4 ug/L  TRUE 
#4 1010001 2006-05-05 00:00:00 Tacrolimus (FK506)    5.4 ug/L  TRUE 
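
One detail worth knowing about this approach: the first row has no lagged value and the last row has no lead value, so the comparison can yield NA rather than FALSE at the group boundaries. `filter()` drops NA rows, which is why the output above is still correct, but the flag column itself is not strictly TRUE/FALSE. The sketch below reproduces the behaviour in base R. The helpers `lag1` and `lead1` are hypothetical stand-ins for `dplyr::lag()` and `dplyr::lead()`, and the values are made up.

```r
# Hypothetical base-R stand-ins for dplyr::lag() and dplyr::lead().
lag1  <- function(x) c(NA, head(x, -1))
lead1 <- function(x) c(tail(x, -1), NA)

toy <- c(3, 8.4, 8.4, 6.1, 5.4)  # invented values

# NA | FALSE is NA, so unmatched boundary rows come out as NA.
raw_flag <- toy == lag1(toy) | toy == lead1(toy)
print(raw_flag)
# [1]    NA  TRUE  TRUE FALSE    NA

# Treat NA as "not a duplicate" to get a clean logical column.
clean_flag <- raw_flag %in% TRUE
print(clean_flag)
# [1] FALSE  TRUE  TRUE FALSE FALSE
```

If the flag is kept as a column rather than used only for filtering, wrapping the comparison in `%in% TRUE` as above is one way to avoid carrying NA values forward.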
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/61470052
