
Paste consecutive non-digit elements together

Stack Overflow user
Asked 2018-09-16 19:28:49
Answers: 1, Views: 63, Followers: 0, Votes: 1

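Question

After identifying the elements of a character vector that match a non-digit pattern, how can I paste consecutive elements that match that regex pattern together?

Overview

sample_text mainly follows two patterns:

  • Odd elements: a 6-digit number followed by text;
  • Even elements: a 4-digit number followed by text.

However, some elements contain only non-digit text, and that text belongs to the odd element that precedes it.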
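
As a rough illustration of these three kinds of element, the sketch below classifies a short excerpt with base R's `grepl()`. The name `excerpt` and the exact patterns (a run of digits followed by a space) are assumptions made for this example, not part of the original question.

```r
# Hypothetical three-element excerpt of sample_text, for illustration only
excerpt <- c("811411 Home and Garden Equipment Repair and",
             "7699 Repair Services, Nec",
             "Maintenance")

# TRUE where the element starts with exactly six digits followed by a space
is_six_digit <- grepl("^\\d{6} ", excerpt)

# TRUE where the element starts with exactly four digits followed by a space
is_four_digit <- grepl("^\\d{4} ", excerpt)

# TRUE where the element contains no digits at all (the continuation text)
is_text_only <- grepl("^\\D*$", excerpt)
```

Here only the third element is text-only, and it continues the description started by the first (6-digit) element.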

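Previous solution

Previously, I did not know that the character vector contained consecutive non-digit elements. That allowed me to paste each non-digit element onto the element two positions before it.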

# load necessary package --------
library(tidyverse)

# load necessary data ------
sample_text <-
  c("811411 Home and Garden Equipment Repair and"   
    , "7699 Repair Services, Nec"
    , "Maintenance" # non-digit pattern
    , "811412 Appliance Repair and Maintenance"       
    , "7623 Refrigeration Service and Repair"         
    , "811412 Appliance Repair and Maintenance"       
    , "7629 Electrical Repair Shops")

# previous solution -------
sample_text %>%
  # for those elements which satisfy the non-digit pattern
  # identify the index of those elements which are exactly two behind
  replace(list = str_detect(., "^\\D*$") %>% which() - 2
          # of those elements which are exactly two behind the non-digit pattern
          # paste the non-digit pattern to the end of them.
          , values = paste(.[str_detect(., "^\\D*$") %>% which() - 2]
                           , str_subset(., "^\\D*$"))) %>%
  # only keep elements with digits
  str_subset("\\d")

# [1] "811411 Home and Garden Equipment Repair and Maintenance" # successfully copied the non-digit element and pasted it two elements behind
# [2] "7699 Repair Services, Nec"                             
# [3] "811412 Appliance Repair and Maintenance"                
# [4] "7623 Refrigeration Service and Repair"                  
# [5] "811412 Appliance Repair and Maintenance"                
# [6] "7629 Electrical Repair Shops" 

# end of script #

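Current problem

Now that I realize sample_text contains consecutive non-digit elements, I do not know how to update my previous solution. Any help would be greatly appreciated!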

# sample data -----
sample_text <-
  c("811310 Commercial and Industrial Machinery and"
    , "7692 Welding Repair"                           
    , "Equipment (except Automotive and" # non-digit pattern (1/2)           
    , "Electronic) Repair and Maintenance" # non-digit pattern (2/2)         
    , "811310 Commercial and Industrial Machinery and"
    , "7694 Armature Rewinding Shops"                 
    , "Equipment (except Automotive and" # non-digit pattern (1/2)           
    , "Electronic) Repair and Maintenance" # non-digit pattern (2/2)    
    , "811310 Commercial and Industrial Machinery and"
    , "7699 Repair Services, Nec"                     
    , "Equipment (except Automotive and" # non-digit pattern (1/2)         
    , "Electronic) Repair and Maintenance" # non-digit pattern (2/2)        
    , "811411 Home and Garden Equipment Repair and"   
    , "7699 Repair Services, Nec"                     
    , "Maintenance" # non-digit pattern (1/1)                               
    , "811412 Appliance Repair and Maintenance"       
    , "7623 Refrigeration Service and Repair"         
    , "811412 Appliance Repair and Maintenance"       
    , "7629 Electrical Repair Shops"                  
    , "811412 Appliance Repair and Maintenance"       
    , "7699 Repair Services, Nec")

# desired output ------
 [1] "811310 Commercial and Industrial Machinery and Equipment (except Automotive and Electronic) Repair and Maintenance"
 [2] "7692 Welding Repair"           
 [3] "811310 Commercial and Industrial Machinery and Equipment (except Automotive and Electronic) Repair and Maintenance"
 [4] "7694 Armature Rewinding Shops"           
 [5] "811310 Commercial and Industrial Machinery and Equipment (except Automotive and Electronic) Repair and Maintenance"
 [6] "7699 Repair Services, Nec"           
 [7] "811411 Home and Garden Equipment Repair and Maintenance"   
 [8] "7699 Repair Services, Nec"                       
 [9] "811412 Appliance Repair and Maintenance"       
[10] "7623 Refrigeration Service and Repair"         
[11] "811412 Appliance Repair and Maintenance"       
[12] "7629 Electrical Repair Shops"                  
[13] "811412 Appliance Repair and Maintenance"       
[14] "7699 Repair Services, Nec"
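
A small sketch of why the earlier `which() - 2` approach seems to break on this data: with two consecutive text-only elements, subtracting 2 from each index sends the second piece to the 4-digit element instead of the 6-digit one. The name `excerpt` is a hypothetical four-element slice used only for illustration.

```r
# Hypothetical slice of the new sample_text, for illustration only
excerpt <- c("811310 Commercial and Industrial Machinery and",
             "7692 Welding Repair",
             "Equipment (except Automotive and",
             "Electronic) Repair and Maintenance")

# Positions of the text-only elements: 3 and 4
text_only <- which(grepl("^\\D*$", excerpt))

# The previous solution pastes each piece two positions back: 1 and 2
targets <- text_only - 2

# The second target is the 4-digit element, not the 6-digit one
excerpt[targets]
```

Both pieces should land on element 1, so a fixed offset of 2 is no longer enough.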

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-09-16 21:33:36

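Guessing from the expected output: iterate over the vector, and if a line does not contain any digit, append it to the second-to-last element collected so far, `x[length(x) - 1]`, which is the 6-digit element preceding the most recent 4-digit one: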

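x <- character(0)  # accumulates the merged result
for (i in sample_text) {
  if (grepl("^\\D*$", i, perl = TRUE)) {
    # no digits: continuation text, so append it to the second-to-last kept element
    x[length(x) - 1] <- paste(x[length(x) - 1], i)
  } else {
    x <- c(x, i)  # line contains digits: keep it as a new element
  }
}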

Output:

 [1] "811310 Commercial and Industrial Machinery and Equipment (except Automotive and Electronic) Repair and Maintenance"
 [2] "7692 Welding Repair"                                                                                               
 [3] "811310 Commercial and Industrial Machinery and Equipment (except Automotive and Electronic) Repair and Maintenance"
 [4] "7694 Armature Rewinding Shops"                                                                                     
 [5] "811310 Commercial and Industrial Machinery and Equipment (except Automotive and Electronic) Repair and Maintenance"
 [6] "7699 Repair Services, Nec"                                                                                         
 [7] "811411 Home and Garden Equipment Repair and Maintenance"                                                           
 [8] "7699 Repair Services, Nec"                                                                                         
 [9] "811412 Appliance Repair and Maintenance"                                                                           
[10] "7623 Refrigeration Service and Repair"                                                                             
[11] "811412 Appliance Repair and Maintenance"                                                                           
[12] "7629 Electrical Repair Shops"                                                                                      
[13] "811412 Appliance Repair and Maintenance"                                                                           
[14] "7699 Repair Services, Nec"        
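
The loop can also be sketched without explicit iteration. The version below is an alternative, not the answer's method: it assumes every text-only element belongs to the most recent 6-digit element and that the vector does not start with text-only elements. The helper name `merge_continuations` is hypothetical.

```r
# Hypothetical helper: paste text-only elements onto the latest 6-digit element
merge_continuations <- function(txt) {
  has_digit <- grepl("\\d", txt)
  if (all(has_digit)) return(txt)  # nothing to merge

  is_six <- grepl("^\\d{6} ", txt)
  # For every position, the index of the most recent 6-digit element
  owner <- cummax(ifelse(is_six, seq_along(txt), 0L))

  # Collapse the text-only pieces that share the same owner, in original order
  pieces <- tapply(txt[!has_digit], owner[!has_digit], paste, collapse = " ")
  idx <- as.integer(names(pieces))
  txt[idx] <- paste(txt[idx], pieces)

  # Keep only the elements that contain digits
  txt[has_digit]
}

# Hypothetical six-element slice of sample_text, for illustration only
excerpt <- c("811310 Commercial and Industrial Machinery and",
             "7692 Welding Repair",
             "Equipment (except Automotive and",
             "Electronic) Repair and Maintenance",
             "811412 Appliance Repair and Maintenance",
             "7623 Refrigeration Service and Repair")

merged <- merge_continuations(excerpt)
```

Calling `merge_continuations(sample_text)` on the full vector should give the same kind of result as the loop, under the assumptions stated above.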


Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/52357694
