How do I split out a specific paragraph in R?

Stack Overflow user
Asked 2017-08-17 15:16:42
Answers: 1 · Views: 85 · Followers: 0 · Votes: 0

I want to extract a specific paragraph from a long text. For example:

txt1 <- "What is claimed is:
1. A hybridized CMP conditioner, comprising: a base; 
a first abrasive unit, provided on said base and comprising a first
bonding layer fixed on said base, a substrate for abrasive unit provided     
on said first bonding layer and an abrasive layer provided on said 
substrate for abrasive unit, said abrasive layer being a diamond coating 
formed through a chemical vapor deposition process, and said diamond 
coating being provided on the surface thereof with a plurality of abrasive     
tips. 
2.  The hybridized CMP conditioner according to claim 1, wherein said base
is provided on the surface thereof with a central region and an annular     
outer region around the outside of said central region. 
3.  The hybridized CMP conditioner according to claim 2, wherein said
central region is provided with a recessed portion for said first abrasive
unit to be provided therein, and said annular outer region is provided 
with a plurality of first accommodating portions spaced apart from each 
other for said second abrasive units to be provided therein. "

I only want to extract the first paragraph, like this:

1. A hybridized CMP conditioner, comprising: a base; 
a first abrasive unit, provided on said base and comprising a first
bonding layer fixed on said base, a substrate for abrasive unit provided     
on said first bonding layer and an abrasive layer provided on said 
substrate for abrasive unit, said abrasive layer being a diamond coating 
formed through a chemical vapor deposition process, and said diamond 
coating being provided on the surface thereof with a plurality of abrasive     
tips. 

I have tried doing this with the strsplit function:

strsplit(txt1, "\n1.", perl = TRUE)

But the result is not what I want:

[1] "What is claimed is:"

[2] " A hybridized CMP conditioner, comprising: a base; \na first abrasive 
unit, provided on said base and comprising a first bonding layer fixed on 
said base, a substrate for abrasive unit provided on said first bonding 
layer and an abrasive layer provided on said substrate for abrasive unit, 
said abrasive layer being a diamond coating formed through a chemical 
vapor deposition process, and said diamond coating being provided on the 
surface thereof with a plurality of abrasive tips; and \na plurality of 
second abrasive units, provided on said base and comprising a second 
bonding layer fixed on said base, a carrying post provided on said second 
bonding layer, an abrasive particle provided on said carrying post and an 
abrasive material-bonding layer provided between said carrying post and 
said abrasive particle. \n2.  The hybridized CMP conditioner according to 
claim 1, wherein said base is provided on the surface thereof with a 
central region and an annular outer region around the outside of said 
central region. "

1 Answer

Stack Overflow user

Answered 2017-08-17 16:23:03

Use strsplit:

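# split at each newline that is followed by a number, '.' and a space;
# the lookahead (?=...) leaves the number attached to its paragraph
paragraphs <- unlist(strsplit(txt1, "\\n(?=(\\d+\\. ))", perl = TRUE))
# replace line breaks with spaces and select the 1st paragraph
# (element 1 is the "What is claimed is:" header, so claim 1 is element 2)
gsub(" *\\n", " ", paragraphs)[2]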
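The pattern in the question, "\n1.", matches only once (before claim 1), and the match swallows the "1." itself, with the unescaped dot standing for any character. That is why the result has just two pieces and the second one starts with " A hybridized". The following is a minimal sketch of how the lookahead version behaves. It uses a shortened, hypothetical stand-in for txt1 (the variable txt and its contents are made up for illustration) and picks claim 1 by its leading number rather than by position, which is an optional variation on the code above.

```r
# Hypothetical short stand-in for txt1: a header line followed by numbered claims
txt <- "What is claimed is:\n1. A conditioner, comprising: a base;\na first unit.\n2.  The conditioner according to claim 1.\n3.  The conditioner according to claim 2."

# Only the newline is consumed by the split; the lookahead checks for
# "<digits>. " without removing it, so every piece keeps its claim number
paragraphs <- unlist(strsplit(txt, "\\n(?=\\d+\\. )", perl = TRUE))

# Select claim 1 by its leading "1. " instead of relying on index 2
first_claim <- grep("^1\\. ", paragraphs, value = TRUE)

# The internal line break is kept here;
# gsub(" *\\n", " ", first_claim) would flatten it into one line
cat(first_claim, "\n")
```

With this input, paragraphs has four elements: the header and one element per claim. Selecting by the "^1\\. " prefix should keep working if the header line is absent, which an index of 2 would not.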
Votes: 0
The original content of this page is provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/45728809
