I have a dataset containing this data:

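From the Rslt column, I need to extract the first and second numbers, each of which can have one or two digits, and put each into its own new column: the first number into Runs1 and the second into Runs2.
Sample output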

Code:
I tried this solution, but it did not work:
ms2 |>
  mutate(runs = stri_extract_all(Rslt, regex="\\d+")[[1]])
This did not work either, when I tried to get just the first number:
ms2 |>
  mutate(R1st = str_extract(Rslt,"^.*(\\d+)"))
This one puts <chr [2]> in the runs1 column:
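ms %>%
  mutate(runs1 = str_split(Rslt, "-"))
I would prefer a dplyr solution; however, I am open to other approaches. Also, if there is an existing Stack Overflow answer that solves my problem, I would appreciate a link to it.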
dput:
structure(list(Date = structure(c(1399161600, 1399334400, 1399507200,
1399766400, 1400025600), tzone = "UTC", class = c("POSIXct",
"POSIXt")), Tm = c("TOR", "TOR", "TOR", "TOR", "TOR"), Opp = c("PIT",
"PHI", "PHI", "LAA", "CLE"), Rslt = c("W 7-2", "W 6-5", "W 12-6",
"L 3-9", "L 4-15"), AppDec = c("8-8", "9-10 W", "7-8", "5-6",
"7-8")), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))

Answered 2022-03-16 14:00:45
You can use tidyr::separate for this, for example:
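library(tidyverse)
# dat is assumed to be the data frame from the question's dput() output
dat %>%
  separate(col = Rslt,
           into = c("result", "Runs1", "Runs2"),
           sep = "[ -]",      # split on a space or a hyphen
           remove = FALSE)    # keep the original Rslt column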
  Date                Tm    Opp   Rslt   result Runs1 Runs2 AppDec
  <dttm>              <chr> <chr> <chr>  <chr>  <chr> <chr> <chr>
1 2014-05-04 00:00:00 TOR   PIT   W 7-2  W      7     2     8-8
2 2014-05-06 00:00:00 TOR   PHI   W 6-5  W      6     5     9-10 W
3 2014-05-08 00:00:00 TOR   PHI   W 12-6 W      12    6     7-8
4 2014-05-11 00:00:00 TOR   LAA   L 3-9  L      3     9     5-6
5 2014-05-14 00:00:00 TOR   CLE   L 4-15 L      4     15    7-8

Answered 2022-03-16 14:10:24
You can also use tidyr's extract function.
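library(tidyr)
# df is assumed to be the data frame from the question's dput() output
df %>%
  extract(Rslt,
          into = c("Runs1", "Runs2"),
          regex = "(\\d+)-(\\d+)",  # one capture group per new column
          remove = FALSE)           # keep the original Rslt column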
# A tibble: 5 × 7
  Date                Tm    Opp   Rslt   Runs1 Runs2 AppDec
  <dttm>              <chr> <chr> <chr>  <chr> <chr> <chr>
1 2014-05-04 00:00:00 TOR   PIT   W 7-2  7     2     8-8
2 2014-05-06 00:00:00 TOR   PHI   W 6-5  6     5     9-10 W
3 2014-05-08 00:00:00 TOR   PHI   W 12-6 12    6     7-8
4 2014-05-11 00:00:00 TOR   LAA   L 3-9  3     9     5-6
5 2014-05-14 00:00:00 TOR   CLE   L 4-15 4     15    7-8

Source: https://stackoverflow.com/questions/71498317
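
Since the question asked for a dplyr-style solution, here is a minimal sketch that stays inside mutate() and uses stringr::str_match. It is not taken from either answer above. It rebuilds only three columns of the question's dput() data, the name scores is made up for illustration, and it assumes every Rslt value contains a "number-number" score. Unlike the <chr> columns shown in the outputs above, it converts the results to integers.

```r
library(dplyr)
library(stringr)

# Sample rows copied from the question's dput() output (only the columns needed here).
ms2 <- tibble::tibble(
  Tm   = c("TOR", "TOR", "TOR", "TOR", "TOR"),
  Opp  = c("PIT", "PHI", "PHI", "LAA", "CLE"),
  Rslt = c("W 7-2", "W 6-5", "W 12-6", "L 3-9", "L 4-15")
)

# str_match() returns a character matrix with one row per input string:
# column 1 is the full match, columns 2 and 3 are the two capture groups.
# "scores" is a hypothetical name used only in this sketch.
scores <- ms2 |>
  mutate(
    Runs1 = as.integer(str_match(Rslt, "(\\d+)-(\\d+)")[, 2]),
    Runs2 = as.integer(str_match(Rslt, "(\\d+)-(\\d+)")[, 3])
  )

scores
```

str_match is used here rather than str_extract because str_extract returns the whole matched text and ignores capture groups, which is why the "^.*(\\d+)" attempt in the question did not isolate the number.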