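I asked this question earlier (How to mutate a new column by modifying another column?).
Now I have run into another problem. I have to work with more "messy" IDs:
df1 <- data.frame(id=c("A-1","A-10","A-100","b-1","b-10","b-100"),n=c(1,2,3,4,5,6))
From these IDs, I want to assign new "tidy" IDs, for example:
df2 <- data.frame(id=c("A0001","A0010","A0100","B0001","B0010","B0100"),n=c(1,2,3,4,5,6))
(Note that I now need an uppercase "B" instead of "b".)
I tried using the str_pad function, but I could not get it to work.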
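Posted on 2020-03-28 10:03:13
We can split the id column into two columns on "-", convert the letter part to uppercase, left-pad the numeric part with zeros using sprintf, and then combine the two columns again with unite.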
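Before the data-frame version, here is a minimal base R sketch of the same three steps (split on "-", uppercase, zero-pad) applied to a plain character vector. The helper name tidy_id is hypothetical and not part of the original answer, and the sketch assumes every ID contains a hyphen followed by an integer. It pads with %04d on an integer because R's sprintf documentation notes that the 0 flag with %s zero-pads on some platforms and is ignored on others.

```r
# Hypothetical helper (not from the original answer): tidy one vector of IDs.
# Assumes each ID looks like "<prefix>-<integer>".
tidy_id <- function(id) {
  prefix <- toupper(sub("-.*$", "", id))        # text before the first "-", uppercased
  number <- as.integer(sub("^[^-]*-", "", id))  # digits after the first "-"
  paste0(prefix, sprintf("%04d", number))       # zero-pad the number to width 4
}

ids <- c("A-1", "A-10", "A-100", "b-1", "b-10", "b-100")
tidy_id(ids)
# [1] "A0001" "A0010" "A0100" "B0001" "B0010" "B0100"
```

The pipeline with separate and unite applies these same steps to the id column of the data frame.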
library(dplyr)
library(tidyr)
df1 %>%
  separate(id, c("id1", "id2"), sep = "-") %>%
  mutate(id1 = toupper(id1),
         # %04d on an integer pads reliably; the 0 flag with %s is platform-dependent
         id2 = sprintf('%04d', as.integer(id2))) %>%
  unite(id, id1, id2, sep = "")
#     id n
#1 A0001 1
#2 A0010 2
#3 A0100 3
#4 B0001 4
#5 B0010 5
#6 B0100 6
Based on the comments, if some IDs have no separator and we also want to change certain id1 values, the following approach can be used.
df1 %>%
  # "-?" makes the hyphen optional; [[:alpha:]] is the bracketed form of the POSIX class
  extract(id, c("id1", "id2"), regex = "([[:alpha:]])-?(\\d+)") %>%
  mutate(id1 = case_when(id1 == 'c' ~ 'B',
                         TRUE ~ id1),
         id1 = toupper(id1),
         id2 = sprintf('%04d', as.integer(id2))) %>%
  unite(id, id1, id2, sep = "")
Posted on 2020-03-28 10:02:33
As you said, the str_pad function is very handy for this. But you have to extract the number first, pad it, and then paste the parts back together.
library(stringr)
paste0(toupper(str_extract(df1$id, "[A-Za-z]")),
       str_pad(str_extract(df1$id, "\\d+"), width=4, pad="0"))
[1] "A0001" "A0010" "A0100" "B0001" "B0010" "B0100"
Posted on 2020-03-28 10:17:23
Base R solution
df1$id <- sub("^(.)0+?(.{4})$","\\1\\2", sub("-", "0000", toupper(df1$id)))
Tidyverse solution
library(tidyverse)
df1$id <- str_to_upper(df1$id) %>%
  str_replace("-","0000") %>%
  str_replace("^(.)0+?(.{4})$","\\1\\2")
Output
df1
#      id n
# 1 A0001 1
# 2 A0010 2
# 3 A0100 3
# 4 B0001 4
# 5 B0010 5
# 6 B0100 6
Data
df1 <- data.frame(id=c("A-1","A-10","A-100","b-1","b-10","b-100"),n=c(1,2,3,4,5,6))
Source: https://stackoverflow.com/questions/60899599