Q: Conditional string concatenation within the same column in R

Stack Overflow user

Asked 2022-03-25 12:32:32

Answers: 4 | Views: 172 | Followers: 0 | Votes: 3

I am new to R and have a very large, irregular column in a data frame like this:

x <- data.frame(section = c("BOOK I: Introduction", "Page one: presentation", "Page two: acknowledgments", "MAGAZINE II: Considerations", "Page one: characters", "Page two: index", "BOOK III: General Principles", "BOOK III: General Principles", "Page one: invitation"))

section
BOOK I: Introduction
Page one: presentation
Page two: acknowledgments
MAGAZINE II: Considerations
Page one: characters
Page two: index
BOOK III: General Principles
BOOK III: General Principles
Page one: invitation

I need to concatenate the values in this column as follows:

section
BOOK I: Introduction 
BOOK I: Introduction / Page one: presentation
BOOK I: Introduction / Page two: acknowledgments
MAGAZINE II: Considerations
MAGAZINE II: Considerations / Page one: characters
MAGAZINE II: Considerations / Page two: index
BOOK III: General Principles
BOOK III: General Principles
BOOK III: General Principles / Page one: invitation

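Basically, the goal is to pick out the header string above each row based on a condition (perhaps with a regular expression) and then concatenate that header with the rows below it, but I really don't know how to do this.

Thanks in advance.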

string

stringi

4 Answers

Stack Overflow user

Accepted answer

Posted 2022-03-25 12:47:09

You can do:

unlist(lapply(split(x$section, cumsum(grepl('^[A-Z]{3}', x$section))), 
              function(y) {
                  if(length(y) == 1) return(y)
                  else c(y[1], paste(y[1], y[-1], sep = " / "))
                }), use.names = FALSE)
#> [1] "BOOK I: Introduction"                               
#> [2] "BOOK I: Introduction / Page one: presentation"      
#> [3] "BOOK I: Introduction / Page two: acknowledgments"   
#> [4] "MAGAZINE II: Considerations"                        
#> [5] "MAGAZINE II: Considerations / Page one: characters" 
#> [6] "MAGAZINE II: Considerations / Page two: index"      
#> [7] "BOOK III: General Principles"                       
#> [8] "BOOK III: General Principles"                       
#> [9] "BOOK III: General Principles / Page one: invitation"
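
To see why the split above works, it can help to look at the group ids that cumsum(grepl(...)) produces. What follows is a minimal base R sketch, not part of the original answer; the names is_header, group, header and out are illustrative only. It assumes, as the answer's regex does, that header rows begin with three uppercase letters and that the first row is a header.

```r
x <- data.frame(section = c("BOOK I: Introduction", "Page one: presentation",
                            "Page two: acknowledgments", "MAGAZINE II: Considerations",
                            "Page one: characters", "Page two: index",
                            "BOOK III: General Principles", "BOOK III: General Principles",
                            "Page one: invitation"),
                stringsAsFactors = FALSE)

# TRUE for rows that start with three uppercase letters (the header rows)
is_header <- grepl("^[A-Z]{3}", x$section)

# Running count of headers: each row gets the id of the most recent header
group <- cumsum(is_header)
# group is 1 1 1 2 2 2 3 4 4

# Look up the header text for each row by its group id
header <- x$section[is_header][group]

# Header rows stay as they are; other rows get their header prepended
out <- ifelse(is_header, x$section, paste(header, x$section, sep = " / "))
```

Because the repeated "BOOK III" row is itself a header, it starts its own group (id 4), which is why it is left unchanged and only the following page row is prefixed.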

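Votes: 2

Stack Overflow user

Posted 2022-03-25 12:52:00

Using data.table:

library(data.table)
library(magrittr)  # provides %>%, which data.table does not export

# 1. copy the section into `header` on every row that is not a "Page" row
setDT(x)[!grepl("^Page.", section), header := section] %>% 
  # 2. carry each header forward into the NA rows below it
  .[, header := zoo::na.locf(header)] %>% 
  # 3. on page rows, prepend the header to the section
  .[section != header, header := paste0(header, " / ", section)] %>% 
  # 4. keep only the result, renamed back to `section`
  .[, .(section = header)] %>% 
  .[]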

1:                                BOOK I: Introduction
2:       BOOK I: Introduction / Page one: presentation
3:    BOOK I: Introduction / Page two: acknowledgments
4:                         MAGAZINE II: Considerations
5:  MAGAZINE II: Considerations / Page one: characters
6:       MAGAZINE II: Considerations / Page two: index
7:                        BOOK III: General Principles
8:                        BOOK III: General Principles
9: BOOK III: General Principles / Page one: invitation
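
The second step of this pipeline relies on zoo::na.locf ("last observation carried forward") to copy each header down into the NA rows beneath it. Here is a small standalone illustration, assuming the zoo package is installed; the vector h is made up for the example.

```r
library(zoo)

# A header column as it looks after the first step:
# set on header rows, NA everywhere else
h <- c("BOOK I", NA, NA, "MAGAZINE II", NA)

# Carry the last non-NA value forward into each following NA
filled <- na.locf(h)
# filled is "BOOK I" "BOOK I" "BOOK I" "MAGAZINE II" "MAGAZINE II"
```

Note that na.locf drops leading NAs by default (na.rm = TRUE), so if the first row were not a header the result would be shorter than the input. Passing na.rm = FALSE keeps the original length.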

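Votes: 3

Stack Overflow user

Posted 2022-03-25 12:52:38

A rolling join can do this. With data.table:

library( data.table )

# convert x to a data.table by reference (needed if x is still a plain data.frame)
setDT( x )

# add a row column for joining by reference
x[ , row := .I ]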

# pick out just the title rows. It looks like these start with either "BOOK" or "MAGAZINE"
books_magazines <- x[ grepl("^BOOK|^MAGAZINE", section),
                      .(row, book_magazine = section) ]

# join the 2 tables, using a rolling join to add the title row to subsequent rows
both_cols <- books_magazines[ x, on = .(row), roll = TRUE ]

# concatenate the 2 columns together where necessary, leave it alone if it's the title row
result <- both_cols[ , .(
    section_string = fifelse( book_magazine == section,
                              book_magazine,
                              sprintf("%s / %s", book_magazine, section) )
) ]

This gives:

> result$section_string

[1] "BOOK I: Introduction"                               
[2] "BOOK I: Introduction / Page one: presentation"      
[3] "BOOK I: Introduction / Page two: acknowledgments"   
[4] "MAGAZINE II: Considerations"                        
[5] "MAGAZINE II: Considerations / Page one: characters" 
[6] "MAGAZINE II: Considerations / Page two: index"      
[7] "BOOK III: General Principles"                       
[8] "BOOK III: General Principles"                       
[9] "BOOK III: General Principles / Page one: invitation"
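
For readers unfamiliar with rolling joins, the key step is roll = TRUE: when a row has no exact match in the lookup table, the nearest earlier entry is used instead. Below is a reduced sketch of just that step, assuming data.table is installed; the tables titles and rows and their contents are invented for illustration.

```r
library(data.table)

# Lookup table: the row number at which each title starts
titles <- data.table(row = c(1L, 4L), title = c("BOOK I", "MAGAZINE II"))

# The rows we want to label
rows <- data.table(row = 1:6)

# roll = TRUE: rows without an exact match take the nearest earlier title
labelled <- titles[rows, on = .(row), roll = TRUE]
# labelled$title is "BOOK I" for rows 1-3 and "MAGAZINE II" for rows 4-6
```

This is the same pattern as books_magazines[ x, on = .(row), roll = TRUE ] in the answer, with the title rows acting as the lookup table.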

Votes: 3

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/71617071
