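Creating multiple new columns using a conditional statement
I would like to know whether there is a way to create multiple columns based on a single condition.
For example, below I have a data frame, and I want to create two columns based on ccy: one holding the amount converted to GBP and the other the amount converted to CAD.
I can get this to work by chaining the mutate calls in a pipe, but that involves repetition (in my real problem I have a long, complicated list of ifelse conditions, so repeating the code for each column produces a lot of duplication).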
df <- structure(list(product = c('option', 'forward', 'forward', 'option'),
ccy = c('usd', 'usd', 'eur', 'usd'),
amount = c(1000, 2000, 1000, 5000)),
.Names = c('product', 'ccy', 'amount'),
row.names = c(NA, 4L),
class = "data.frame")
df
product ccy amount
1 option usd 1000
2 forward usd 2000
3 forward eur 1000
4 option usd 5000
library(dplyr)  # provides %>% and mutate()
df %>% mutate(gbp_amount =
ifelse(ccy == 'usd', round(amount / 1.8, 2),
ifelse(ccy == 'eur', round(amount / 1.3, 2),
'not_converted'))) %>%
mutate(cad_amount =
ifelse(ccy == 'usd', round(amount / 0.85, 2),
ifelse(ccy == 'eur', round(amount / .7, 2),
'not_converted')))
product ccy amount gbp_amount cad_amount
1 option usd 1000 555.56 1176.47
2 forward usd 2000 1111.11 2352.94
3 forward eur 1000 769.23 1428.57
4 option usd 5000 2777.78 5882.35
Is there a way to create multiple columns based on a single if condition?
For example, something like this pseudocode:
df %>% ifelse(df$ccy == 'usd',
(mutate(gbp_amount = round(amount / 1.8, 2)),
mutate(cad_amount = round(amount / 0.85, 2))),
ifelse(df$ccy == 'eur',
(mutate(gbp_amount = round(amount / 1.3, 2)),
mutate(cad_amount = round(amount / 0.7, 2))),
'not_converted'))
Answered 2018-06-29 20:33:00
If there are many "equals" conditions, you can use an SQL-like join.
I am using data.table syntax here, but you could also use dplyr:
library(data.table)
setDT(df)
# add a row which cannot be found ("joined") to demonstrate missing rates
df <- rbind(df, data.table(product = "option", ccy = "aud", amount = 3000))
df
lookup <- data.table(ccy = c("usd", "eur"),
gbp_rate = c( 1.8, 1.3),
cad_rate = c( 0.85, 0.7))
lookup
# ccy gbp_rate cad_rate
# 1: usd 1.8 0.85
# 2: eur 1.3 0.70
df[lookup, `:=`(gbp_amount = round(amount / gbp_rate, 2),
cad_amount = round(amount / cad_rate, 2)),
on = "ccy"]
df
# product ccy amount gbp_amount cad_amount
# 1: option usd 1000 555.56 1176.47
# 2: forward usd 2000 1111.11 2352.94
# 3: forward eur 1000 769.23 1428.57
# 4: option usd 5000 2777.78 5882.35
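# 5: option aud 3000 NA NA
If needed, sort the result as required and flag lookup failures (missing conversion rates) with a value other than NA. Do not use the string "not_converted" from the question, though, because that would mix data types within the column (double vs. character).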
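Since the answer notes that dplyr could be used instead, here is a minimal sketch of the same lookup-table join written with dplyr's left_join(). It rebuilds the data inline so that it runs on its own. The `converted` flag column is a hypothetical addition (not part of the original answer), shown as one possible way to mark missing rates without putting strings into a numeric column.

```r
library(dplyr)

# Same data as in the question, plus one currency with no known rate
df <- data.frame(
  product = c("option", "forward", "forward", "option", "option"),
  ccy     = c("usd", "usd", "eur", "usd", "aud"),
  amount  = c(1000, 2000, 1000, 5000, 3000),
  stringsAsFactors = FALSE
)

# One row per source currency, one column per target rate
lookup <- data.frame(
  ccy      = c("usd", "eur"),
  gbp_rate = c(1.8, 1.3),
  cad_rate = c(0.85, 0.7),
  stringsAsFactors = FALSE
)

result <- df %>%
  left_join(lookup, by = "ccy") %>%   # keeps every row of df; unmatched rates become NA
  mutate(
    converted  = !is.na(gbp_rate),    # hypothetical flag: FALSE where no rate was found
    gbp_amount = round(amount / gbp_rate, 2),
    cad_amount = round(amount / cad_rate, 2)
  ) %>%
  select(-gbp_rate, -cad_rate)        # drop the helper rate columns

result
```

Adding another target currency then only means adding one column to `lookup` and one line to `mutate()`, rather than another nested ifelse chain.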
Answered 2018-06-29 22:05:57
Consider building a rates data set and merging it with the original data, which avoids the nested ifelse calls.
rates_df <- data.frame(ccy = c('usd', 'eur'),
type = c('gbp', 'gbp', 'cad', 'cad'),
rate = c(1.8, 1.3, 0.85, 0.7),
stringsAsFactors = FALSE)
rates_df
library(dplyr)  # provides inner_join(), mutate(), group_by(), summarise_all()
df %>%
inner_join(rates_df, by="ccy") %>%
mutate(gbp_amount = ifelse(type=="gbp", round(amount / rate, 2), 0),
cad_amount = ifelse(type=="cad", round(amount / rate, 2), 0)) %>%
select(product, ccy, matches("amount")) %>%
group_by(product, ccy, amount) %>%
summarise_all(sum)
# # A tibble: 4 x 5
# # Groups: product, ccy [?]
# product ccy amount gbp_amount cad_amount
# <chr> <chr> <dbl> <dbl> <dbl>
# 1 forward eur 1000 769.23 1428.57
# 2 forward usd 2000 1111.11 2352.94
# 3 option usd 1000 555.56 1176.47
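# 4 option usd 5000 2777.78 5882.35
Answered 2018-06-29 21:12:40
If you want to perform multiple operations per condition, you have to use a for-loop. @R's solution is probably better. As he said, I would use NA rather than a string, so that you do not mix data types in a vector; otherwise the whole column is coerced to character.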
# Initialise both columns as numeric NA so rows with an unknown currency stay NA
df$gbp_amount <- NA_real_
df$cad_amount <- NA_real_
for (i in seq_len(nrow(df))) {
if (df$ccy[i] == "usd") {
df$gbp_amount[i] <- round(df$amount[i] / 1.8, 2)
df$cad_amount[i] <- round(df$amount[i] / 0.85, 2)
} else if (df$ccy[i] == "eur") {
df$gbp_amount[i] <- round(df$amount[i] / 1.3, 2)
df$cad_amount[i] <- round(df$amount[i] / 0.7, 2)
}
}
https://stackoverflow.com/questions/51107745