I have a table called DATA_TEST. It contains a single column whose values come in seven different forms.

#DATA
DATA_TEST<-data.frame(
CUSTOMS_RATE=c("10","20.1","15+0,41 eur/kg","10+0,1 eur/kg
max.17","0,1 eur/l max.17","0,04 eur/kg
max.10","NA"))
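View(DATA_TEST)

My goal is to split this column into three separate columns so that I can carry on with other statistical operations (calculating means, etc.), as shown in the table below (DATA_TEST1).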

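Thanks to this community, I got the code below. But it has an error: the second record, "20.1", does not stay in the RATE column; it ends up in the next column, SPECIFIC_RATE, instead.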
library(tidyverse)

DATA_TEST %>%
  mutate(CUSTOMS_RATE = str_replace_all(CUSTOMS_RATE, ",", "."),
         RATE = str_extract(CUSTOMS_RATE, "^[0-9]+(?=\\+|$)"),
         SPECIFIC_RATE = str_extract(CUSTOMS_RATE, "\\d+\\.\\d+"),
         MAXIMUM_RATE = str_extract(CUSTOMS_RATE, "(?<=max\\.)\\d+")) %>%
  select(2:4) %>%
  mutate_all(as.numeric)

Can anyone help me solve this problem?
Posted on 2019-05-08 19:23:42
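One option is to change the pattern for RATE to

RATE = str_extract(CUSTOMS_RATE, "^[0-9]+(?=\\+|$)|^[0-9.]+$")

The added alternative, ^[0-9.]+$, matches a value made up only of digits and dots, so "20.1" now stays in RATE. In the full code, SPECIFIC_RATE also gets a lookahead, (?=\\s), so that it only matches a decimal followed by whitespace and no longer picks up a bare "20.1".

Full code: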
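library(tidyverse)

DATA_TEST %>%
  mutate(CUSTOMS_RATE = str_replace_all(CUSTOMS_RATE, ",", "."),  # decimal comma -> dot
         # integer followed by "+" or end of string, or a value made up only of digits and dots
         RATE = str_extract(CUSTOMS_RATE, "^[0-9]+(?=\\+|$)|^[0-9.]+$"),
         # decimal number followed by whitespace
         SPECIFIC_RATE = str_extract(CUSTOMS_RATE, "\\d+\\.\\d+(?=\\s)"),
         # integer directly after "max."
         MAXIMUM_RATE = str_extract(CUSTOMS_RATE, "(?<=max\\.)\\d+")) %>%
  select(2:4) %>%
  mutate_all(as.numeric)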
#   RATE SPECIFIC_RATE MAXIMUM_RATE
#1  10.0            NA           NA
#2  20.1            NA           NA
#3  15.0          0.41           NA
#4  10.0          0.10           17
#5    NA          0.10           17
#6    NA          0.04           10
#7    NA            NA           NA

https://stackoverflow.com/questions/56039584
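
To see why the change works, the old and new patterns can be tried on a few values in isolation. This is only a minimal sketch, assuming stringr is installed; the vector x and the names old_rate, new_rate, old_specific and new_specific are illustrative and do not appear in the original post. The values are written with dots already, as they would be after the str_replace_all() step.

```r
library(stringr)

# Illustrative values, with decimal commas already replaced by dots
x <- c("10", "20.1", "15+0.41 eur/kg", "0.1 eur/l max.17")

# Original RATE pattern: digits only, then "+" or end of string.
# "20.1" fails because the digits "20" are followed by "."
old_rate <- str_extract(x, "^[0-9]+(?=\\+|$)")

# Revised RATE pattern: the second alternative accepts a whole value
# made up only of digits and dots, which covers "20.1"
new_rate <- str_extract(x, "^[0-9]+(?=\\+|$)|^[0-9.]+$")

# Original SPECIFIC_RATE pattern: any decimal, so it also grabs "20.1"
old_specific <- str_extract(x, "\\d+\\.\\d+")

# Revised SPECIFIC_RATE pattern: the decimal must be followed by whitespace
new_specific <- str_extract(x, "\\d+\\.\\d+(?=\\s)")

print(data.frame(x, old_rate, new_rate, old_specific, new_specific))
```

One thing to keep in mind with this approach: ^[0-9.]+$ would also accept malformed values such as "1.2.3", which as.numeric() then turns into NA with a warning. If such values can occur in the real data, a stricter pattern like ^[0-9]+(\\.[0-9]+)?$ might be a safer choice.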