I have some data that looks like this:
Seller Name Price
ⒽomeⓄnline Harper Hand Truck and Dolly 51.7
HomeOnline Harper Hand Truck and Dolly 62.54
Amazon.com Harper Hand Truck and Dolly 41.83
XpW Honeywell Safe Chest 41.37
XoXoGroupLLC Honeywell Safe Chest 51.78
Toys Online Honeywell Safe Chest 43.01
Tempus & Co. Honeywell Safe Chest 52.7
stores123 Honeywell Safe Chest 51.21
ⒽomeⓄnline Honeywell Safe Chest 43.88
HomeOnline Honeywell Safe Chest 43.87
Great Brands Outlet Honeywell Safe Chest 64.95
Connect Buy Honeywell Safe Chest 30.1
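Amazon.com Honeywell Safe Chest 24.6
I would like to calculate, by Name, the percentage difference between each row and the row where Seller is "Amazon.com". The output would look like this, where "etc." means the remaining rows are filled in the same way: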
Seller Name Price Pct_Diff
ⒽomeⓄnline Harper Hand Truck and Dolly 51.7 .23
HomeOnline Harper Hand Truck and Dolly 62.54 .49
Amazon.com Harper Hand Truck and Dolly 41.83
XpW Honeywell Safe Chest 41.37 .68
XoXoGroupLLC Honeywell Safe Chest 51.78 1.0
Toys Online Honeywell Safe Chest 43.01 etc...
Tempus & Co. Honeywell Safe Chest 52.7
stores123 Honeywell Safe Chest 51.21
ⒽomeⓄnline Honeywell Safe Chest 43.88
HomeOnline Honeywell Safe Chest 43.87
Great Brands Outlet Honeywell Safe Chest 64.95
Connect Buy Honeywell Safe Chest 30.1
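Amazon.com Honeywell Safe Chest 24.6
I think there is a good data.table solution. However, I don't know how to compare the rows that do not have "Amazon.com" as Seller with the row that does have "Amazon.com" as Seller.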
Posted on 2017-06-27 17:58:17
You can use:
dt[, pct := (Price - Price[Seller == 'Amazon.com']) / Price[Seller == 'Amazon.com'], by = Name]
which gives:
                 Seller                        Name Price       pct
 1:          ⒽomeⓄnline Harper Hand Truck and Dolly 51.70 0.2359551
 2:          HomeOnline Harper Hand Truck and Dolly 62.54 0.4950992
 3:          Amazon.com Harper Hand Truck and Dolly 41.83 0.0000000
 4:                 XpW        Honeywell Safe Chest 41.37 0.6817073
 5:        XoXoGroupLLC        Honeywell Safe Chest 51.78 1.1048780
 6:         Toys Online        Honeywell Safe Chest 43.01 0.7483740
 7:        Tempus & Co.        Honeywell Safe Chest 52.70 1.1422764
 8:           stores123        Honeywell Safe Chest 51.21 1.0817073
 9:          ⒽomeⓄnline        Honeywell Safe Chest 43.88 0.7837398
10:          HomeOnline        Honeywell Safe Chest 43.87 0.7833333
11: Great Brands Outlet        Honeywell Safe Chest 64.95 1.6402439
12:         Connect Buy        Honeywell Safe Chest 30.10 0.2235772
13:          Amazon.com        Honeywell Safe Chest 24.60 0.0000000
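One case the data above does not exercise is a Name group with no "Amazon.com" row at all. A minimal sketch of one way to make the per-group lookup tolerate that, assuming you want NA for such groups: look up the reference price with match(), which returns NA when the seller is absent. The table dt2 and the "Widget" product are hypothetical and are not part of the question's data.

```r
library(data.table)

# Hypothetical data: the "Widget" group has no Amazon.com row
dt2 <- data.table(
  Seller = c("HomeOnline", "Amazon.com", "XpW", "Connect Buy"),
  Name   = c("Harper Hand Truck and Dolly", "Harper Hand Truck and Dolly",
             "Widget", "Widget"),
  Price  = c(62.54, 41.83, 10, 12)
)

# match() gives the position of the first "Amazon.com" row in the group,
# or NA if there is none, so ref is always a single value (possibly NA)
dt2[, pct := {
  ref <- Price[match("Amazon.com", Seller)]
  (Price - ref) / ref
}, by = Name]

print(dt2)
```

Because match() only returns the first hit, this sketch also uses a single reference price if a group happens to contain several "Amazon.com" rows; whether that is the right choice depends on your data.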
The same logic implemented in dplyr:
library(dplyr)
dt %>%
  group_by(Name) %>%
  mutate(pct = (Price - Price[Seller == 'Amazon.com']) / Price[Seller == 'Amazon.com'])
Data used:
dt <- structure(list(Seller = c("ⒽomeⓄnline", "HomeOnline", "Amazon.com", "XpW", "XoXoGroupLLC", "Toys Online", "Tempus & Co.", "stores123", "ⒽomeⓄnline", "HomeOnline", "Great Brands Outlet", "Connect Buy", "Amazon.com"),
Name = c("Harper Hand Truck and Dolly", "Harper Hand Truck and Dolly", "Harper Hand Truck and Dolly", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest"),
Price = c(51.7, 62.54, 41.83, 41.37, 51.78, 43.01, 52.7, 51.21, 43.88, 43.87, 64.95, 30.1, 24.6)),
.Names = c("Seller", "Name", "Price"), class = c("data.table", "data.frame"), row.names = c(NA, -13L))
Posted on 2017-06-27 18:02:45
Here is a dplyr solution:
library(dplyr)
df <- data.frame(
Seller = c("ⒽomeⓄnline", "HomeOnline", "Amazon.com", "XpW", "XoXoGroupLLC", "Toys Online", "Tempus & Co.", "stores123", "ⒽomeⓄnline", "HomeOnline", "Great Brands Outlet", "Connect Buy", "Amazon.com"),
Name = c("Harper Hand Truck and Dolly","Harper Hand Truck and Dolly","Harper Hand Truck and Dolly","Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest", "Honeywell Safe Chest"),
Price = c(51.7, 62.54, 41.83, 41.37, 51.78, 43.01, 52.7, 51.21, 43.88, 43.87, 64.95, 30.1, 24.6)
)
df %>%
  # Join each row with the "Amazon.com" price for this item
  left_join(df %>% filter(Seller == "Amazon.com"), by = "Name", suffix = c("", ".amazon")) %>%
  # Remove unused "Seller" column
  select(-Seller.amazon) %>%
  # Calculate percentage for each row, except for
  # "Amazon.com" rows, for which the percent difference is NA
  mutate(Pct_Diff = ifelse(Seller == "Amazon.com", NA, round((Price - Price.amazon) / Price.amazon, 2)))
#                      Seller                        Name Price Price.amazon Pct_Diff
#  1 <U+24BD>ome<U+24C4>nline Harper Hand Truck and Dolly 51.70        41.83     0.24
#  2               HomeOnline Harper Hand Truck and Dolly 62.54        41.83     0.50
#  3               Amazon.com Harper Hand Truck and Dolly 41.83        41.83       NA
#  4                      XpW        Honeywell Safe Chest 41.37        24.60     0.68
#  5             XoXoGroupLLC        Honeywell Safe Chest 51.78        24.60     1.10
#  6              Toys Online        Honeywell Safe Chest 43.01        24.60     0.75
#  7             Tempus & Co.        Honeywell Safe Chest 52.70        24.60     1.14
#  8                stores123        Honeywell Safe Chest 51.21        24.60     1.08
#  9 <U+24BD>ome<U+24C4>nline        Honeywell Safe Chest 43.88        24.60     0.78
# 10               HomeOnline        Honeywell Safe Chest 43.87        24.60     0.78
# 11      Great Brands Outlet        Honeywell Safe Chest 64.95        24.60     1.64
# 12              Connect Buy        Honeywell Safe Chest 30.10        24.60     0.22
# 13               Amazon.com        Honeywell Safe Chest 24.60        24.60       NA
https://stackoverflow.com/questions/44786703