I have a data frame returned from Microsoft365R:
SKA_student <- structure(list(
  name = "Computing SKA 2021-22.xlsx",
  size = 22266L,
  lastModifiedBy = structure(list(
    user = structure(list(
      email = "my@email.com",
      id = "8ae50289-d7af-4779-91dc-e4638421f422",
      displayName = "Name, My"),
      class = "data.frame", row.names = c(NA, -1L))),
    class = "data.frame", row.names = c(NA, -1L)),
  fileSystemInfo = structure(list(
    createdDateTime = "2021-09-08T16:03:38Z",
    lastModifiedDateTime = "2021-09-16T00:09:04Z"),
    class = "data.frame", row.names = c(NA, -1L))),
  row.names = c(NA, -1L), class = "data.frame")

I can return all of the lastModifiedBy data with:
SKA_student %>% select(lastModifiedBy)
  lastModifiedBy.user.email               lastModifiedBy.user.id lastModifiedBy.user.displayName
1              my@email.com 8ae50289-d7af-4779-91dc-e4638421f422                        Name, My

But if I want a specific item from the lastModifiedBy list, it does not work. For example:
SKA_student %>% select(lastModifiedBy.user.email)
Error: Can't subset columns that don't exist.
x Column `lastModifiedBy.user.email` doesn't exist.

I can get this working with base R, but I would really like a dplyr answer.
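
The error makes more sense once the column names are inspected. The following is a minimal sketch in base R, rebuilding the data frame from the question so that it runs on its own. It suggests that the dotted names only appear when the nested data frames are printed, while the top level holds just four columns.

```r
# Rebuild the data frame from the question so the sketch is self-contained.
SKA_student <- structure(list(
  name = "Computing SKA 2021-22.xlsx",
  size = 22266L,
  lastModifiedBy = structure(list(
    user = structure(list(
      email = "my@email.com",
      id = "8ae50289-d7af-4779-91dc-e4638421f422",
      displayName = "Name, My"),
      class = "data.frame", row.names = c(NA, -1L))),
    class = "data.frame", row.names = c(NA, -1L)),
  fileSystemInfo = structure(list(
    createdDateTime = "2021-09-08T16:03:38Z",
    lastModifiedDateTime = "2021-09-16T00:09:04Z"),
    class = "data.frame", row.names = c(NA, -1L))),
  row.names = c(NA, -1L), class = "data.frame")

# Top-level columns: lastModifiedBy is a single column holding a data frame.
top_names <- names(SKA_student)
# "name" "size" "lastModifiedBy" "fileSystemInfo"

# One level down there is only `user`, which is itself a data frame.
mid_names <- names(SKA_student$lastModifiedBy)
# "user"

# The email lives two levels down, under its short name.
leaf_names <- names(SKA_student$lastModifiedBy$user)
# "email" "id" "displayName"

# The dotted name from the printed output is not a real column name.
has_dotted <- "lastModifiedBy.user.email" %in% top_names
# FALSE
```

Because select() looks columns up by their real names, it has nothing called lastModifiedBy.user.email to match against.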
Answered 2021-11-01 10:48:19

This function lets you flatten all of the list columns (I found it a long time ago and can no longer find the original post):
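SO_flat_cols <- function(data) {
  # Flag the columns that are lists (nested data frames count as lists)
  ListCols <- sapply(data, is.list)
  # Unlist each row of the list columns and bind the result to the ordinary columns
  cbind(data[!ListCols], t(apply(data[ListCols], 1, unlist)))
}

You can then select whatever you like: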
SO_flat_cols(SKA_student) %>%
  select(lastModifiedBy.user.email)

Alternatively, you can reach the innermost element by pulling out the nested columns one level at a time:
SKA_student %>%
  pull(lastModifiedBy) %>%
  pull(user) %>%
  select(email)

Answered 2021-11-01 10:51:40
You can use
library(dplyr)
library(tidyr)

SKA_student %>%
  unnest_wider(lastModifiedBy) %>%
  select(email)

This returns
# A tibble: 1 x 1
  email
  <chr>
1 my@email.com

https://stackoverflow.com/questions/69795666
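
A further option, not taken from either answer, is tidyr::unpack(), which is intended for data frame columns like these. The sketch below assumes tidyr 1.0.0 or later and rebuilds the data frame from the question. The name `flat` is only illustrative. With names_sep = ".", each unpacked column is prefixed with its parent's name, so two passes should produce the dotted names seen in the printed output.

```r
library(dplyr)
library(tidyr)

# Rebuild the data frame from the question so the sketch is self-contained.
SKA_student <- structure(list(
  name = "Computing SKA 2021-22.xlsx",
  size = 22266L,
  lastModifiedBy = structure(list(
    user = structure(list(
      email = "my@email.com",
      id = "8ae50289-d7af-4779-91dc-e4638421f422",
      displayName = "Name, My"),
      class = "data.frame", row.names = c(NA, -1L))),
    class = "data.frame", row.names = c(NA, -1L)),
  fileSystemInfo = structure(list(
    createdDateTime = "2021-09-08T16:03:38Z",
    lastModifiedDateTime = "2021-09-16T00:09:04Z"),
    class = "data.frame", row.names = c(NA, -1L))),
  row.names = c(NA, -1L), class = "data.frame")

# `flat` is a hypothetical name for the partly flattened result.
flat <- SKA_student %>%
  # First pass: lastModifiedBy -> lastModifiedBy.user (still a data frame column)
  unpack(lastModifiedBy, names_sep = ".") %>%
  # Second pass: lastModifiedBy.user -> lastModifiedBy.user.email, .id, .displayName
  unpack(lastModifiedBy.user, names_sep = ".")

# The dotted name is now a real column, so select() can find it.
email_only <- flat %>%
  select(lastModifiedBy.user.email)
```

Unlike the generic flattening helper, this leaves fileSystemInfo packed and only expands the columns that are named explicitly.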