Data setup
I have a dataset that looks something like this simple data frame:
library(dplyr)  # provides mutate(), case_when(), %>% and re-exports tibble()

CAD_EXCHANGE <- 1.34
EUR_EXCHANGE <- 0.88
df <- tibble(
  shipment = c("A", "B", "C", "D", "E"),
  invoice = c(rep(500, 5)),
  currency = factor(c("USD", "EUR", "CAD", NA, "SDD"))
)
df
# A tibble: 5 x 3
  shipment invoice currency
  <chr>      <dbl> <fct>
1 A            500 USD
2 B            500 EUR
3 C            500 CAD
4 D            500 NA
5 E            500 SDD
levels(df$currency)
[1] "CAD" "EUR" "SDD" "USD"
End goal
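I am trying to convert the invoices to USD for some common currencies (EUR and CAD), but not for every currency, and not when the currency is missing (i.e. SDD and NA stay as they are). My final data frame should look like this: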
# A tibble: 5 x 5
  shipment invoice currency invoice_converted currency_converted
  <chr>      <dbl> <fct>                <dbl> <fct>
1 A            500 USD                    500 USD
2 B            500 EUR                    568 USD
3 C            500 CAD                    373 USD
4 D            500 NA                     500 NA
5 E            500 SDD                    500 SDD
Attempt 1 - doesn't work
In the future I may have to convert more than just these few currencies, so I used a case_when() statement. Here is my first attempt:
df_USD1 <- df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ),
    currency_converted = case_when(
      currency == "EUR" ~ "USD",
      currency == "CAD" ~ "USD",
      TRUE ~ currency
    )
  )
Error: Problem with `mutate()` column `currency_converted`.
i `currency_converted = case_when(...)`.
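x must be a character vector, not a `factor` object.
From the above, I understand that I am mixing character and factor values in the assignment to currency_converted, because my default is TRUE ~ currency (and currency is a factor). So I tried doing the assignment with factor values only.
Attempt 2 - works, but not reliable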
df_USD2 <- df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ),
    currency_converted = case_when(
      currency == "EUR" ~ currency[1],
      currency == "CAD" ~ currency[1],
      TRUE ~ currency
    )
  )
This works, but only because USD happens to be the first element in the data I set up for this question, and I can't rely on that.
> df$currency
[1] USD EUR CAD <NA> SDD
Levels: CAD EUR SDD USD
Attempt 3 - doesn't work
I figured I could try some other way of getting hold of the factor value, but this doesn't work:
df_USD3 <- df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ),
    currency_converted = case_when(
      currency == "EUR" ~ df$currency[df$currency == "USD"],
      currency == "CAD" ~ df$currency[df$currency == "USD"],
      TRUE ~ currency
    )
  )
Error: Problem with `mutate()` column `currency_converted`.
i `currency_converted = factor(...)`.
x `currency == "EUR" ~ df$currency[df$currency == "USD"]`, `currency == "CAD" ~ df$currency[df$currency == "USD"]` must be length 5 or one, not 2.
Run `rlang::last_error()` to see where the error occurred.
This seems to be because an NA is returned along with the match.
> df$currency[df$currency == "USD"]
[1] USD <NA>
Levels: CAD EUR SDD USD
...because if I go back to my original df and replace the NA with some other currency, it works. But obviously I need to be able to keep the NA where it belongs.
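To illustrate where the extra NA comes from, here is a minimal base R sketch using the same factor as above. The variable names na_kept and usd_only are made up for this illustration. Comparing NA with "USD" yields NA, and subsetting with a logical NA returns an NA element, which is why the result has length 2. Wrapping the comparison in which() is one possible way to drop the NA positions; it is shown only as an assumption about what might help, not as the fix for the whole problem.

```r
# Illustrative sketch: the same factor as in the question
currency <- factor(c("USD", "EUR", "CAD", NA, "SDD"))

# The comparison is NA (not FALSE) for the missing value
is_usd <- currency == "USD"

# Subsetting with a logical NA keeps an NA element in the result
na_kept <- currency[is_usd]
length(na_kept)   # 2: the USD match plus one NA

# which() returns only the positions that are TRUE, so the NA is dropped
usd_only <- currency[which(is_usd)]
length(usd_only)  # 1
```

A length-one value such as usd_only could then stand in for currency[1] from attempt 2, although it still assumes that "USD" occurs somewhere in the column.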
I feel like there is a good way to do this and I am missing it, despite reading up on factors and trying different things. Help?
Posted on 2021-11-26 18:03:16
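case_when does not do automatic type conversion, i.e. currency is a factor, while the values returned by the other conditions in case_when are plain character. So we can coerce currency to character so that every branch returns the same class, and it should work.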
library(dplyr)
df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ),
    currency_converted = case_when(
      currency == "EUR" ~ "USD",
      currency == "CAD" ~ "USD",
      TRUE ~ as.character(currency)  # coerce so all branches are character
    )
  )
-output
# A tibble: 5 × 5
  shipment invoice currency invoice_converted currency_converted
  <chr>      <dbl> <fct>                <dbl> <chr>
1 A            500 USD                    500 USD
2 B            500 EUR                    568 USD
3 C            500 CAD                    373 USD
4 D            500 <NA>                   500 <NA>
5 E            500 SDD                    500 SDD
If we want to keep it as a factor, we can either wrap the case_when call in factor(), or use fct_recode directly instead of case_when:
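For the first option, a minimal sketch of wrapping case_when() in factor() might look like the following. The name df_fct is made up for this illustration, and the two currency conditions are collapsed into one %in% test for brevity. Note that factor() rebuilds the levels from the values actually present, so the EUR and CAD levels disappear.

```r
library(dplyr)

# Same toy data as in the question
df <- tibble(
  shipment = c("A", "B", "C", "D", "E"),
  invoice = rep(500, 5),
  currency = factor(c("USD", "EUR", "CAD", NA, "SDD"))
)

# df_fct is a hypothetical name used only in this sketch
df_fct <- df %>%
  mutate(
    currency_converted = factor(
      case_when(
        currency %in% c("EUR", "CAD") ~ "USD",  # NA %in% ... is FALSE, so NA falls through
        TRUE ~ as.character(currency)           # all branches return character
      )
    )
  )

levels(df_fct$currency_converted)
# [1] "SDD" "USD"
```

The second option, fct_recode, avoids the round trip through character: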
library(forcats)
df %>%
  mutate(
    invoice_converted = case_when(
      currency == "EUR" ~ round(invoice / EUR_EXCHANGE),
      currency == "CAD" ~ round(invoice / CAD_EXCHANGE),
      TRUE ~ invoice
    ),
    # relabel the EUR and CAD levels as USD; other levels and NA are untouched
    currency_converted = fct_recode(currency, USD = "EUR", USD = "CAD")
  )
-output
# A tibble: 5 × 5
  shipment invoice currency invoice_converted currency_converted
  <chr>      <dbl> <fct>                <dbl> <fct>
1 A            500 USD                    500 USD
2 B            500 EUR                    568 USD
3 C            500 CAD                    373 USD
4 D            500 <NA>                   500 <NA>
5 E            500 SDD                    500 SDD
https://stackoverflow.com/questions/70128436