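I want to run a simulation that randomly selects rows according to a set of rules and sums the total value of the selected rows. I'm new to simulation, so I don't know where to start.
Rules: each sim picks 9 rows in total. Each 9-row sim must include the following number of players at each "position":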
QB: 1
RB: 2
WR: 3
TE: 1
K: 1
DST: 1
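I'd like each sim to sum the group's value (the WAR column), and the output to show, for each player, the percentage of the time that player appears in the top 10% of groups by total WAR. Hopefully that makes sense. The end goal is to determine which players are most likely to lead to success.
Below is a dput of the top 10 players at each position, as an example.
dput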
structure(list(player = c("Justin Tucker", "Harrison Butker",
"Wil Lutz", "Greg Zuerlein", "Matt Gay", "Brandon McManus", "Jake Elliott",
"Robbie Gould", "Stephen Hauschka", "Dan Bailey", "Patrick Mahomes",
"Lamar Jackson", "Dak Prescott", "Russell Wilson", "Kyler Murray",
"Deshaun Watson", "Matt Ryan", "Josh Allen", "Tom Brady", "Carson Wentz",
"Christian McCaffrey", "Saquon Barkley", "Ezekiel Elliott", "Alvin Kamara",
"Dalvin Cook", "Clyde Edwards-Helaire", "Derrick Henry", "Miles Sanders",
"Joe Mixon", "Josh Jacobs", "Travis Kelce", "George Kittle",
"Mark Andrews", "Zach Ertz", "Darren Waller", "Evan Engram",
"Hayden Hurst", "Tyler Higbee", "Hunter Henry", "Mike Gesicki",
"Michael Thomas", "Davante Adams", "Julio Jones", "Tyreek Hill",
"DeAndre Hopkins", "Chris Godwin", "Kenny Golladay", "Allen Robinson",
"DJ Moore", "Odell Beckham"), adp = c(3, 3, 2, 2, 1, 1, 1, 1,
1, 1, 26, 23, 12, 11, 10, 9, 5, 4, 4, 4, 66, 57, 53, 50, 45,
43, 41, 40, 40, 39, 29, 26, 18, 15, 10, 8, 7, 6, 4, 4, 48, 40,
38, 37, 36, 34, 29, 27, 27, 27), WAR = c(0.27, 0.27, 0.1, 0.23,
0.09, 0.19, -0.83, -0.3, -0.1, -0.62, 2.26, 1.41, 0.91, 1.7,
2.28, 1.74, 0.28, 2.29, 1.12, 0.06, 1.02, -0.05, 1.36, 3.57,
3.48, 1.04, 2.91, 1.13, 0.69, 1.49, 2.79, 0.71, 0.85, -0.22,
1.67, 0.07, 0.26, 0.06, 0.35, 0.64, -0.04, 2.74, 0.63, 2.35,
1.49, 0.49, 0.33, 1.17, 0.61, 0.28), position = c("K", "K", "K",
"K", "K", "K", "K", "K", "K", "K", "QB", "QB", "QB", "QB", "QB",
"QB", "QB", "QB", "QB", "QB", "RB", "RB", "RB", "RB", "RB", "RB",
"RB", "RB", "RB", "RB", "TE", "TE", "TE", "TE", "TE", "TE", "TE",
"TE", "TE", "TE", "WR", "WR", "WR", "WR", "WR", "WR", "WR", "WR",
"WR", "WR")), row.names = c(NA, -50L), groups = structure(list(
position = c("K", "QB", "RB", "TE", "WR"), .rows = structure(list(
1:10, 11:20, 21:30, 31:40, 41:50), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
Posted on 2021-07-23 20:11:15
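One idea is to use a lookup table that sets the number of samples for each group, then write a function that runs one "simulation" by sampling n_samples rows from each group. I'm not entirely sure what the purpose of the WAR sum is, but once you have the simulations, operations like a grouped sum should be straightforward.
Note that there is no "DST" position in your sample data, so each simulation has only 8 rows.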
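library(tidyverse)

# df is the data frame from the question's dput
# Lookup table: number of players to draw per position
df_sample <- data.frame(position = c("K", "QB", "RB", "TE", "WR", "DST"),
                        n_samples = c(1, 1, 2, 1, 3, 1))

# Attach n_samples to each row, then nest each position's players into a list-column
df_nest <- df %>%
  left_join(df_sample, by = "position") %>%
  group_by(position, n_samples) %>%
  nest()

# One simulation: sample n_samples rows from each position's nested data
run_sim <- function(nested_df = df_nest) {
  nested_df %>%
    mutate(sim = map2(data, n_samples, sample_n)) %>%
    ungroup() %>%
    select(-data, -n_samples) %>%
    unnest(sim)
}

# Run 10 simulations and stack them, labelling each with a sim id
map_dfr(1:10, ~run_sim(df_nest), .id = 'sim')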
#----
# A tibble: 80 x 5
   sim   position player             adp   WAR
   <chr> <chr>    <chr>            <dbl> <dbl>
 1 1     K        Dan Bailey           1 -0.62
 2 1     QB       Patrick Mahomes     26  2.26
 3 1     RB       Miles Sanders       40  1.13
 4 1     RB       Joe Mixon           40  0.69
 5 1     TE       Evan Engram          8  0.07
 6 1     WR       Julio Jones         38  0.63
 7 1     WR       Michael Thomas      48 -0.04
 8 1     WR       DeAndre Hopkins     36  1.49
 9 2     K        Stephen Hauschka     1 -0.1
10 2     QB       Russell Wilson      11  1.7
# ... with 70 more rows
Source: https://stackoverflow.com/questions/68504089
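The answer stops at generating the simulations and leaves the WAR sum and the "top 10%" share open. The following is a minimal sketch of one way those two steps might be done, not the answer author's method. It is written in base R so that it runs without extra packages. The reduced `pool` (a subset of rows copied from the question's dput), the names `n_by_pos`, `totals`, `top_sims` and `pct_in_top`, the seed, and the count of 1000 simulations are all assumptions made for illustration.

```r
# Hypothetical reduced player pool: a subset of the rows in the question's dput.
pool <- data.frame(
  player = c("Justin Tucker", "Harrison Butker", "Wil Lutz",
             "Patrick Mahomes", "Lamar Jackson", "Dak Prescott",
             "Christian McCaffrey", "Saquon Barkley", "Ezekiel Elliott",
             "Alvin Kamara",
             "Travis Kelce", "George Kittle", "Mark Andrews",
             "Michael Thomas", "Davante Adams", "Julio Jones",
             "Tyreek Hill", "DeAndre Hopkins"),
  position = rep(c("K", "QB", "RB", "TE", "WR"), times = c(3, 3, 4, 3, 5)),
  WAR = c(0.27, 0.27, 0.10,
          2.26, 1.41, 0.91,
          1.02, -0.05, 1.36, 3.57,
          2.79, 0.71, 0.85,
          -0.04, 2.74, 0.63, 2.35, 1.49),
  stringsAsFactors = FALSE
)

# Roster slots per position (DST omitted because the sample data has none).
n_by_pos <- c(K = 1, QB = 1, RB = 2, TE = 1, WR = 3)

# Draw one random lineup: sample the required number of rows within each position.
run_sim <- function(pool, n_by_pos) {
  picked <- unlist(lapply(names(n_by_pos), function(pos) {
    rows <- which(pool$position == pos)
    rows[sample.int(length(rows), n_by_pos[[pos]])]
  }))
  pool[picked, ]
}

set.seed(42)    # arbitrary seed, only for reproducibility
n_sims <- 1000  # arbitrary number of simulations

# Stack all lineups into one long data frame with a sim id column.
sims <- do.call(rbind, lapply(seq_len(n_sims), function(i) {
  cbind(sim = i, run_sim(pool, n_by_pos))
}))

# Total WAR of each simulated lineup.
totals <- aggregate(WAR ~ sim, data = sims, FUN = sum)

# Lineups whose total WAR is at or above the 90th percentile.
cutoff <- quantile(totals$WAR, probs = 0.90)
top_sims <- totals$sim[totals$WAR >= cutoff]

# Percentage of those top lineups that contain each player.
top_rows <- sims[sims$sim %in% top_sims, ]
pct_in_top <- sort(100 * table(top_rows$player) / length(top_sims),
                   decreasing = TRUE)
print(round(pct_in_top, 1))
```

Because every lineup holds exactly one K, one QB and one TE, the percentages within each of those positions add up to 100, which is a quick sanity check. The same three steps (sum per sim, filter by quantile, count players) should carry over to the long tibble that `map_dfr` returns, using its `sim` column as the grouping key.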