Q: How do I extract the date that precedes a specific string in unstructured data?

Stack Overflow user

Asked 2017-03-25 17:01:25

Answers: 1 | Views: 63 | Followers: 0 | Votes: 1

I have unstructured text that contains many dates, and I want to extract the date that appears before the word "Message". My data looks like this:

21 March 2017 23:10:45 text1
21 March 2017 23:10:45  More text…..
21 March 2017 23:10:45 And more text …..
21 March 2017 23:10:45 some more text **Message:** more text 
22 March 2017 23:10:45 text1
22 March 2017 23:10:45  More text…..
22 March 2017 23:10:45 And more text …..
22 March 2017 23:10:45 some more text **Message:** more text 
23 March 2017 23:10:45 text1
23 March 2017 23:10:45  More text…..
23 March 2017 23:10:45 And more text …..
23 March 2017 23:10:45 some more text **Message:** more text 
24 March 2017 23:10:45 text1
24 March 2017 23:10:45  More text…..
24 March 2017 23:10:45 And more text …..
24 March 2017 23:10:45 some more text **Message:** more text

The output should be a new data frame with a single column holding the dates:

21 March 2017 
22 March 2017 
23 March 2017 
24 March 2017

1 Answer

Stack Overflow user

Accepted answer

Posted 2017-03-25 17:11:30

How about

sub("(?<=\\d{4}).*", "", grep("Message", txt, value=TRUE), perl=TRUE)
# [1] "21 March 2017" "22 March 2017" "23 March 2017" "24 March 2017"

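We first use grep() to reduce txt to only the elements that contain "Message", then use sub() to delete everything after the first run of four digits (the year).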
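
To see what each step does, here is the same call split into two steps. This is only a sketch: txt_demo is a made-up two-line stand-in for txt, and hits and dates are names chosen for illustration.

# A small stand-in for txt (two lines only, for illustration)
txt_demo <- c(
  "21 March 2017 23:10:45 text1",
  "21 March 2017 23:10:45 some more text **Message:** more text"
)

# Step 1: keep only the elements that contain "Message"
hits <- grep("Message", txt_demo, value = TRUE)

# Step 2: the lookbehind (?<=\\d{4}) matches the position right after the
# first four digits, so .* consumes the rest of the line and sub()
# replaces it with an empty string
dates <- sub("(?<=\\d{4}).*", "", hits, perl = TRUE)
dates
# [1] "21 March 2017"

Note that perl=TRUE is needed because the lookbehind syntax is not supported by R's default regular expression engine. The pattern also assumes the year is the first run of four digits on each line.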

Data:

txt <- readLines(textConnection("21 March 2017 23:10:45 text1
21 March 2017 23:10:45  More text…..
21 March 2017 23:10:45 And more text …..
21 March 2017 23:10:45 some more text **Message:** more text 
22 March 2017 23:10:45 text1
22 March 2017 23:10:45  More text…..
22 March 2017 23:10:45 And more text …..
22 March 2017 23:10:45 some more text **Message:** more text 
23 March 2017 23:10:45 text1
23 March 2017 23:10:45  More text…..
23 March 2017 23:10:45 And more text …..
23 March 2017 23:10:45 some more text **Message:** more text 
24 March 2017 23:10:45 text1
24 March 2017 23:10:45  More text…..
24 March 2017 23:10:45 And more text …..
24 March 2017 23:10:45 some more text **Message:** more text 
"))

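The question asks for a data frame with one date column. A minimal sketch of that last step, assuming a shortened stand-in vector txt_demo and a column name date picked only for illustration:

# Shortened stand-in for txt (hypothetical sample, same line format)
txt_demo <- c(
  "21 March 2017 23:10:45 some more text **Message:** more text",
  "22 March 2017 23:10:45 text1",
  "22 March 2017 23:10:45 some more text **Message:** more text"
)

dates <- sub("(?<=\\d{4}).*", "", grep("Message", txt_demo, value = TRUE), perl = TRUE)

# One-column data frame; the column name "date" is an arbitrary choice
out <- data.frame(date = dates, stringsAsFactors = FALSE)

# Optional: parse the strings into Date objects. %B matches full month
# names in the current locale, so this assumes an English locale.
out$parsed <- as.Date(out$date, format = "%d %B %Y")
out

Keeping the original string next to the parsed value makes it easy to spot lines that failed to parse, since as.Date() returns NA for them rather than raising an error.
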
Votes: 3

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/43019274
