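Given a list containing several elements, the goal is to put them into a data frame. The map_df function from the purrr package works well for regular lists, but it throws an error for irregular ones.
For example, following this tutorial, the following works: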
library(purrr)
library(repurrrsive) # The data comes from this package
map_dfr(got_chars, magrittr::extract, c("name", "culture", "gender", "id", "born", "alive"))
A tibble: 30 x 6
name culture gender id born alive
<chr> <chr> <chr> <int> <chr> <lgl>
1 Theon Greyjoy Ironborn Male 1022 In 278 AC or 279 AC, at Pyke TRUE
2 Tyrion Lannister "" Male 1052 In 273 AC, at Casterly Rock TRUE
3 Victarion Greyjoy Ironborn Male 1074 In 268 AC or before, at Pyke TRUE
4 Will "" Male 1109 "" FALSE
5 Areo Hotah Norvoshi Male 1166 In 257 AC or before, at Norvos TRUE
6 Chett "" Male 1267 At Hag's Mire FALSE
7 Cressen "" Male 1295 In 219 AC or 220 AC FALSE
8 Arianne Martell Dornish Female 130 In 276 AC, at Sunspear TRUE
9 Daenerys Targaryen Valyrian Female 1303 In 284 AC, at Dragonstone TRUE
10 Davos Seaworth Westeros Male 1319 In 260 AC or before, at King's Landing TRUE
# … with 20 more rows
However, if an element is removed from the list, the function fails.
got_chars[[1]]["gender"]<-NULL
map_dfr(got_chars, magrittr::extract, c("name", "culture", "gender", "id", "born", "alive"))
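#Error: Argument 3 is a list, must contain atomic vectors
The desired output would have NA for the missing element. What is an elegant solution? I suspect it involves purrr::possibly(), but I haven't figured it out.
Posted on 2019-08-15 20:41:16
One approach is to define a partial()ly specified pluck() that extracts the name of interest and returns NA if it is missing. Pass the modified pluck() to a double map: the inner map iterates over the names to extract, and the outer map iterates over the got_chars list: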
v <- set_names(c("name", "culture", "gender", "id", "born", "alive"))
map_dfr( got_chars, ~map(v, partial(pluck, .x, .default=NA)) )
# # A tibble: 30 x 6
# name culture gender id born alive
# <chr> <chr> <chr> <int> <chr> <lgl>
# 1 Theon Greyjoy Ironborn NA 1022 In 278 AC or 279 AC, at Pyke TRUE
# 2 Tyrion Lannister "" Male 1052 In 273 AC, at Casterly Rock TRUE
# 3 Victarion Greyj… Ironborn Male 1074 In 268 AC or before, at Pyke TRUE
# 4 Will "" Male 1109 "" FALSE
# 5 Areo Hotah Norvoshi Male 1166 In 257 AC or before, at Norvos TRUE
# 6 Chett "" Male 1267 At Hag's Mire FALSE
# 7 Cressen "" Male 1295 In 219 AC or 220 AC FALSE
# 8 Arianne Martell Dornish Female 130 In 276 AC, at Sunspear TRUE
# 9 Daenerys Targar… Valyrian Female 1303 In 284 AC, at Dragonstone TRUE
# 10 Davos Seaworth Westeros Male 1319 In 260 AC or before, at King's … TRUE
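# # … with 20 more rows
To clarify, .x iterates over got_chars because it appears in the lambda function specified with ~, so it belongs to the outer map. The inner map's function is specified with partial(), which fixes the got_chars element currently being processed (i.e., .x) as the first argument of pluck(). The modified pluck() then takes the name to extract as its (new) first argument, so it can be passed to the inner map as is, without any additional ~.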
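To make the role of partial() concrete, here is a minimal sketch on a single toy record. The record `rec` and the names `pluck_rec`, `via_partial`, and `via_lambda` are made up for illustration; only `pluck()`, `partial()`, `map()`, and `set_names()` come from purrr. It is meant to show that the partial()ly applied pluck() should behave like an explicit function that takes the name to extract.

```r
library(purrr)

# Hypothetical toy record that has no "gender" element
rec <- list(name = "Theon Greyjoy", id = 1022L)
v <- set_names(c("name", "gender", "id"))

# partial() pre-fills pluck()'s first argument and the default value ...
pluck_rec <- partial(pluck, rec, .default = NA)

# ... so the inner map can call it with just the name to extract
via_partial <- map(v, pluck_rec)

# Equivalent explicit form, written without partial()
via_lambda <- map(v, function(nm) pluck(rec, nm, .default = NA))

str(via_partial)
```

In the answer's code, `rec` plays the role of `.x`, which is why the outer lambda has to build a fresh partial function for every element of got_chars.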
Posted on 2019-08-16 10:35:21
An inherent problem is how [ (or its alias magrittr::extract) behaves when the element we are trying to extract is missing:
list(a = 1)["b"]
# $<NA>
# NULL
magrittr::extract(list(a = 1), "b")
# $<NA>
# NULL
We can define:
extract_if_present <- function(x, y) {
x[intersect(y, names(x))]
}
which behaves as follows:
extract_if_present(list(a = 1), "b")
# named list()
Row-binding with missing elements then "just works":
map_dfr(
got_chars_mutilated,
extract_if_present,
c("name", "culture", "gender", "id", "born", "alive")
)
# # A tibble: 30 x 6
# name culture id born alive gender
# <chr> <chr> <int> <chr> <lgl> <chr>
# 1 Theon Greyjoy Ironborn 1022 In 278 AC or 279 AC, at Pyke TRUE NA
# 2 Tyrion Lannister "" 1052 In 273 AC, at Casterly Rock TRUE Male
# 3 Victarion Greyjoy Ironborn 1074 In 268 AC or before, at Pyke TRUE Male
# 4 Will "" 1109 "" FALSE Male
# 5 Areo Hotah Norvoshi 1166 In 257 AC or before, at Norvos TRUE Male
# 6 Chett "" 1267 At Hag's Mire FALSE Male
# 7 Cressen "" 1295 In 219 AC or 220 AC FALSE Male
# 8 Arianne Martell Dornish 130 In 276 AC, at Sunspear TRUE Female
# 9 Daenerys Targaryen Valyrian 1303 In 284 AC, at Dragonstone TRUE Female
# 10 Davos Seaworth Westeros 1319 In 260 AC or before, at King's Landing TRUE Male
# # … with 20 more rows
The column order is a bit scrambled: it depends on the order of the rows and on which elements each row is missing.
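If the column order matters, one possible variation is to pad missing names with NA instead of dropping them, so that every row carries all requested names in the requested order. The helper `extract_or_na` below is hypothetical (it is not part of the answer above or of any package), and the record is toy data; treat it as a sketch rather than a tested replacement.

```r
# Hypothetical helper (not from the answer above): like `[`, but every
# requested name is returned, in the requested order, with NA for gaps.
extract_or_na <- function(x, y) {
  out <- x[y]                     # absent names come back as NULL entries
  names(out) <- y                 # restore the requested names
  out[lengths(out) == 0] <- NA    # NULL (or zero-length) entries become NA
  out
}

wanted <- c("name", "gender", "id")
rec <- list(id = 1022L, name = "Theon Greyjoy")  # toy record without "gender"

res <- extract_or_na(rec, wanted)
str(res)
```

Passing such a helper to map_dfr() in place of extract_if_present should keep the columns in the requested order, assuming the row-binding step accepts a logical NA in an otherwise character column, as the output of the first answer suggests. Note that it also turns elements that are present but zero-length into NA.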
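Posted on 2021-09-15 23:00:01
Love that tutorial! At the end of the tutorial, the author says:
When programming, it is safer, though more cumbersome, to explicitly specify types and build the data frame the usual way.
You can use that more verbose approach and set the default value to NA.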
library(purrr)
library(tibble) # for tibble()
got_chars %>% {
tibble(
name = map_chr(., "name"),
culture = map_chr(., "culture"),
gender = map_chr(., "gender", .default = NA_character_), # NA when "gender" is absent
id = map_int(., "id"), # id is stored as an integer
born = map_chr(., "born"),
alive = map_lgl(., "alive") # alive is stored as a logical
)
}
# # A tibble: 30 x 6
# name culture gender id born alive
# <chr> <chr> <chr> <int> <chr> <lgl>
# 1 Theon Greyjoy "Ironborn" NA 1022 "In 278 AC or 279 AC, at Pyke" TRUE
# 2 Tyrion Lannister "" Male 1052 "In 273 AC, at Casterly Rock" TRUE
# 3 Victarion Greyjoy "Ironborn" Male 1074 "In 268 AC or before, at Pyke" TRUE
# 4 Will "" Male 1109 "" FALSE
# 5 Areo Hotah "Norvoshi" Male 1166 "In 257 AC or before, at Norvos" TRUE
# 6 Chett "" Male 1267 "At Hag's Mire" FALSE
# 7 Cressen "" Male 1295 "In 219 AC or 220 AC" FALSE
# 8 Arianne Martell "Dornish" Female 130 "In 276 AC, at Sunspear" TRUE
# 9 Daenerys Targaryen "Valyrian" Female 1303 "In 284 AC, at Dragonstone" TRUE
# 10 Davos Seaworth "Westeros" Male 1319 "In 260 AC or before, at King's Landing" TRUE
https://stackoverflow.com/questions/57515535