I'm having some trouble joining tables with dplyr::left_join. I created the following data.frame:
conservation <- structure(list(conservation1 = c("EX ", "EW ", "CR ", "EN ",
"VU ", "NT ", "LC ", "DD ", "NE ", "PE ", "PEW "), description = c(" Extinct",
" Extinct em the wild", " Critically Endangered", " Endangered",
" Vulnerable", " Near Threatened", " Least Concern", " Data deficient",
" Not evaluated", " Probably extinct (informal)", " Probably extinct em the wild (informal)"
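)), class = "data.frame", row.names = c(NA, -11L))
I would like to join it to the conservation column of ggplot2's msleep data, so that each msleep conservation status comes with its description.
The msleep variables I'm considering are: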
msleep <- structure(list(name = c("Cheetah", "Owl monkey", "Mountain beaver",
"Greater short-tailed shrew", "Cow", "Three-toed sloth", "Northern fur seal",
"Vesper mouse", "Dog", "Roe deer", "Goat", "Guinea pig", "Grivet",
"Chinchilla", "Star-nosed mole", "African giant pouched rat",
"Lesser short-tailed shrew", "Long-nosed armadillo", "Tree hyrax",
"North American Opossum", "Asian elephant", "Big brown bat",
"Horse", "Donkey", "European hedgehog", "Patas monkey", "Western american chipmunk",
"Domestic cat", "Galago", "Giraffe", "Pilot whale", "Gray seal",
"Gray hyrax", "Human", "Mongoose lemur", "African elephant",
"Thick-tailed opposum", "Macaque", "Mongolian gerbil", "Golden hamster",
"Vole ", "House mouse", "Little brown bat", "Round-tailed muskrat",
"Slow loris", "Degu", "Northern grasshopper mouse", "Rabbit",
"Sheep", "Chimpanzee", "Tiger", "Jaguar", "Lion", "Baboon", "Desert hedgehog",
"Potto", "Deer mouse", "Phalanger", "Caspian seal", "Common porpoise",
"Potoroo", "Giant armadillo", "Rock hyrax", "Laboratory rat",
"African striped mouse", "Squirrel monkey", "Eastern american mole",
"Cotton rat", "Mole rat", "Arctic ground squirrel", "Thirteen-lined ground squirrel",
"Golden-mantled ground squirrel", "Musk shrew", "Pig", "Short-nosed echidna",
"Eastern american chipmunk", "Brazilian tapir", "Tenrec", "Tree shrew",
"Bottle-nosed dolphin", "Genet", "Arctic fox", "Red fox"), conservation = c("lc",
NA, "nt", "lc", "domesticated", NA, "vu", NA, "domesticated",
"lc", "lc", "domesticated", "lc", "domesticated", "lc", NA, "lc",
"lc", "lc", "lc", "en", "lc", "domesticated", "domesticated",
"lc", "lc", NA, "domesticated", NA, "cd", "cd", "lc", "lc", NA,
"vu", "vu", "lc", NA, "lc", "en", NA, "nt", NA, "nt", NA, "lc",
"lc", "domesticated", "domesticated", NA, "en", "nt", "vu", NA,
"lc", "lc", NA, NA, "vu", "vu", NA, "en", "lc", "lc", NA, NA,
"lc", NA, NA, "lc", "lc", "lc", NA, "domesticated", NA, NA, "vu",
NA, NA, NA, NA, NA, NA)), row.names = c(NA, -83L), class = c("tbl_df",
"tbl", "data.frame"))
To do this, I am running:
msleep %>%
select(name, conservation) %>%
mutate(conservation = toupper(conservation)) %>%
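left_join(conservation, by = c('conservation'='conservation1'))
My intuition tells me this should work; however, the description column comes back as all missing values. Can anyone help me? I'm a new dplyr user and would really appreciate the help.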
Answered on 2020-06-16 23:06:11
Welcome to SO!
The problem here is that the key values in the two columns you want to join on don't match, so dplyr finds nothing to pair up.
unique(msleep$conservation)
[1] "lc" NA "nt" "domesticated" "vu" "en" "cd"
unique(conservation$conservation1)
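[1] "EX " "EW " "CR " "EN " "VU " "NT " "LC " "DD " "NE " "PE " "PEW "
The key values being joined should be identical (or at least share some common values). Here the lookup keys are uppercase with a trailing space, while the msleep values are lowercase with no padding.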
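One way to confirm a mismatch like this before joining is to test the keys directly. The following is a minimal sketch in base R, using small stand-in tables (`lookup` and `obs` are hypothetical names and an illustrative subset, not the full data from the question):

```r
# Small stand-ins for the two tables in the question (illustrative subset).
lookup <- data.frame(
  conservation1 = c("LC ", "NT ", "VU "),
  description   = c(" Least Concern", " Near Threatened", " Vulnerable")
)
obs <- data.frame(
  name         = c("Cheetah", "Mountain beaver", "Cow"),
  conservation = c("lc", "nt", "domesticated")
)

# Keys as the join would see them after toupper().
obs_keys <- toupper(obs$conservation)

# No observed key appears in the raw lookup: the trailing space blocks every match.
raw_match <- obs_keys %in% lookup$conservation1

# After trimming whitespace, the keys that exist in the lookup do match.
trimmed_match <- obs_keys %in% trimws(lookup$conservation1)

# nchar() makes the hidden trailing space visible: "LC " has 3 characters, not 2.
key_lengths <- nchar(lookup$conservation1)
```

Printing `raw_match` and `trimmed_match` side by side shows which rows would gain a description once the whitespace is removed; keys such as "DOMESTICATED" stay unmatched either way because the lookup has no entry for them.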
Answered on 2020-06-16 23:15:39
As @csgroen noted, but without showing an explicit solution, you can use trimws from base R to remove the trailing whitespace:
msleep %>%
select(name, conservation) %>%
mutate(conservation = toupper(conservation)) %>%
left_join(conservation %>% mutate(conservation1 = trimws(conservation1)),
by = c("conservation" = "conservation1"))
## A tibble: 83 x 3
# name conservation description
# <chr> <chr> <chr>
# 1 Cheetah LC " Least Concern"
# 2 Owl monkey NA NA
# 3 Mountain beaver NT " Near Threatened"
# 4 Greater short-tailed shrew LC " Least Concern"
# 5 Cow DOMESTICATED NA
# 6 Three-toed sloth NA NA
# 7 Northern fur seal VU " Vulnerable"
# 8 Vesper mouse NA NA
# 9 Dog DOMESTICATED NA
#10 Roe deer LC " Least Concern"
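## … with 73 more rows
Answered on 2020-06-16 23:10:16
As @csgroen said, the underlying problem is that your keys don't match: the keys in your lookup table are uppercase and have trailing spaces. You can also make things easier for yourself by giving the key column in the lookup the same name as in the observed data. This will give you what you want: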
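# Rename the lookup key to match msleep and trim its trailing whitespace
conservation <- conservation %>%
  rename(conservation = conservation1) %>%
  mutate(conservation = trimws(conservation))
msleep %>%
  mutate(conservation = toupper(conservation)) %>%
  left_join(conservation, by = "conservation")
https://stackoverflow.com/questions/62411438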