
Extracting relevant text from a chat conversation

Stack Overflow user
Asked 2018-04-11 12:52:07
2 answers, 50 views, 0 followers, score 2

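I have the following chat conversation:

Chat Started: Monday, April 02, 2018, 10:23:30 (+0100) Chat Origin: GB - My Account (Signed In) Agent Navin P ( 1s ) Navin: Thanks for contacting XYZ, you are talking to Navin. How can I help? ( 34s ) Visitor: Hello , I?ve just currently switched from brillian broadband to fab fibre ( 39s ) Navin: Hi ( 42s ) Navin: Good morning. ( 47s ) Visitor: I find the brilliant broadband so slow ( 52s ) Navin: How can i help you today? ( 1m 1s ) Visitor: And we got told would be a day to get switched over ( 1m 5s ) Navin: I'll help you with it. ( 1m 11s ) Visitor: Can you tell me when this will happen by

I want to extract only the relevant text from the above using R. Essentially, I want only the Visitor's comments to appear in the result.

The result I want looks like this:

I find the brilliant broadband so slow And we got told would be a day to get switched over Can you tell me when this will happen by

I tried to do this with gsub and strsplit, but without success. Any input here is appreciated.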

mytext <- paste(c("Agent Navin P ( 1s ) Navin: Thanks for contacting XYZ, you are talking to Navin. How can I help? ( 34s ) Visitor:", 
    "Hello , I?ve just currently switched from brillian broadband to fab fibre ( 39s ) Navin: Hi ( 42s ) Navin: Good morning. ( 47s )", 
    "Visitor: I find the brilliant broadband so slow ( 52s ) Navin: How can i help you today? ( 1m 1s ) Visitor: And we got told would be", 
    "a day to get switched over ( 1m 5s ) Navin: I'll help you with it. ( 1m 11s ) Visitor: Can you tell me when this will happen by"
), collapse = ' ')

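2 Answers

Stack Overflow user

Answered 2018-04-11 13:20:43

I am sure there are many ways to do this. I split on "Visitor" and used sub to remove Navin's replies. We need to add [-1] at the end of the sub call, because we don't want anything that comes before the first "Visitor" we split on.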

str <- "Chat Started: Monday, April 02, 2018, 10:23:30 (+0100) Chat Origin: GB - My Account (Signed In) Agent Navin P ( 1s ) Navin: Thanks for contacting XYZ, you are talking to Navin. How can I help? ( 34s ) Visitor: Hello , I?ve just currently switched from brillian broadband to fab fibre ( 39s ) Navin: Hi ( 42s ) Navin: Good morning. ( 47s ) Visitor: I find the brilliant broadband so slow ( 52s ) Navin: How can i help you today? ( 1m 1s ) Visitor: And we got told would be a day to get switched over ( 1m 5s ) Navin: I'll help you with it. ( 1m 11s ) Visitor: Can you tell me when this will happen by"
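# Split on the visitor marker; the first piece is everything before the first "Visitor: "
str <- strsplit(str," Visitor: ")[[1]]
# Remove each trailing "( time ) Navin: ..." reply, then drop the first (preamble) piece
sub(" \\((.*?)\\) Navin:.*","",str)[-1]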

# [1] "Hello , I?ve just currently switched from brillian broadband to fab fibre"
# [2] "I find the brilliant broadband so slow"                                   
# [3] "And we got told would be a day to get switched over"                      
# [4] "Can you tell me when this will happen by"
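
To see why the [-1] is needed, here is a minimal sketch on a made-up toy string (the string and the names `toy`, `pieces`, and `cleaned` are illustrative assumptions, not part of the original chat). The first piece returned by strsplit is whatever precedes the first "Visitor: ", so it never contains visitor text.

```r
# Toy transcript (hypothetical), same layout as the real chat
toy <- "preamble ( 1s ) Navin: Hi ( 2s ) Visitor: first ( 3s ) Navin: ok ( 4s ) Visitor: second"

# Splitting on the visitor marker leaves the preamble as piece 1
pieces <- strsplit(toy, " Visitor: ", fixed = TRUE)[[1]]
print(pieces[1])

# Strip the agent reply that trails each piece
cleaned <- sub(" \\((.*?)\\) Navin:.*", "", pieces)

# Piece 1 is not a visitor message, so it is dropped
print(cleaned[-1])
```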

If you want a single line like in your example, you can use paste:

paste(sub(" \\((.*?)\\) Navin:.*","",str)[-1],collapse = " ")
# [1] "Hello , I?ve just currently switched from brillian broadband to fab fibre I find the brilliant broadband so slow And we got told would be a day to get switched over Can you tell me when this will happen by"

If the agent's name is not consistently "Navin", you can use \\w+ in the sub pattern to match any person's name and remove it.

sub(" \\((.*?)\\) \\w+:.*","",str)[-1]
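
As a possible variant, one could split on the timestamp markers instead of on the speaker name, which avoids depending on the agent's name at all. The sketch below uses base R only and assumes every turn is separated by a marker of the form "( 34s )" or "( 1m 5s )"; the names `chat`, `turns`, and `visitor_lines` are illustrative.

```r
chat <- paste(
  "Agent Navin P ( 1s ) Navin: Thanks for contacting XYZ, you are talking to Navin. How can I help?",
  "( 34s ) Visitor: Hello , I?ve just currently switched from brillian broadband to fab fibre",
  "( 39s ) Navin: Hi ( 42s ) Navin: Good morning.",
  "( 47s ) Visitor: I find the brilliant broadband so slow",
  "( 52s ) Navin: How can i help you today?",
  "( 1m 1s ) Visitor: And we got told would be a day to get switched over",
  "( 1m 5s ) Navin: I'll help you with it.",
  "( 1m 11s ) Visitor: Can you tell me when this will happen by"
)

# Split on timestamp markers such as "( 34s )" or "( 1m 5s )"
turns <- strsplit(chat, "\\s*\\( [0-9ms ]+ \\)\\s*", perl = TRUE)[[1]]

# Keep only the turns spoken by the visitor and strip the speaker prefix
is_visitor <- startsWith(turns, "Visitor: ")
visitor_lines <- sub("^Visitor: ", "", turns[is_visitor])

print(visitor_lines)
print(paste(visitor_lines, collapse = " "))
```

This relies on the timestamp format staying regular; a message that itself contains text like "( 5s )" would be split incorrectly.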
Score: 1

Stack Overflow user

Answered 2018-04-11 13:33:29

One possible solution that retains more information:

mytext <- paste(c("Agent Navin P ( 1s ) Navin: Thanks for contacting XYZ, you are talking to Navin. How can I help? ( 34s ) Visitor:", 
    "Hello , I?ve just currently switched from brillian broadband to fab fibre ( 39s ) Navin: Hi ( 42s ) Navin: Good morning. ( 47s )", 
    "Visitor: I find the brilliant broadband so slow ( 52s ) Navin: How can i help you today? ( 1m 1s ) Visitor: And we got told would be", 
    "a day to get switched over ( 1m 5s ) Navin: I'll help you with it. ( 1m 11s ) Visitor: Can you tell me when this will happen by"
), collapse = ' ')


library(dplyr); library(tidyr); library(textshape); library(stringi)  # tidyr provides extract()

mytext %>%
    stri_replace_all_regex('(\\( [0-9ms ]+ \\))(\\s+)', '$1<<splithere>>') %>%
    stri_split_fixed('<<splithere>>') %>%
    lapply(function(x) {
        x %>%
            split_transcript() %>%
            mutate(dialogue = ifelse(!grepl('\\(\\s*([0-9ms ]+)\\s\\)', dialogue), paste(dialogue, '( - )'), dialogue)) %>%
            extract(dialogue, c('dialogue', 'timestamp'), '(^.+)\\s\\(\\s*([0-9ms -]+)\\s\\)')
    })

## [[1]]
##                  person                                                                  dialogue timestamp
## 1  Agent Navin P ( 1s )                                                             Agent Navin P        1s
## 2                 Navin      Thanks for contacting XYZ, you are talking to Navin. How can I help?       34s
## 3               Visitor Hello , I?ve just currently switched from brillian broadband to fab fibre       39s
## 4                 Navin                                                                        Hi       42s
## 5                 Navin                                                             Good morning.       47s
## 6               Visitor                                    I find the brilliant broadband so slow       52s
## 7                 Navin                                                 How can i help you today?     1m 1s
## 8               Visitor                       And we got told would be a day to get switched over     1m 5s
## 9                 Navin                                                    I'll help you with it.    1m 11s
## 10              Visitor                                  Can you tell me when this will happen by         -

Then you can filter by person, and so on.
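
As a rough illustration of that filtering step, the sketch below uses base R on a small hand-built data frame that copies a few rows of the output above (the name `turns` and the selection of rows are assumptions for the example). With the real result, which is a list of data frames, the same subsetting would be applied to each list element.

```r
# Hypothetical data frame mirroring a few rows of the output shown above
turns <- data.frame(
  person = c("Navin", "Visitor", "Navin", "Visitor"),
  dialogue = c(
    "Hi",
    "I find the brilliant broadband so slow",
    "How can i help you today?",
    "And we got told would be a day to get switched over"
  ),
  timestamp = c("42s", "52s", "1m 1s", "1m 5s"),
  stringsAsFactors = FALSE
)

# Keep only the visitor's rows
visitor_turns <- turns[turns$person == "Visitor", ]

# Collapse the visitor's dialogue into a single string
visitor_text <- paste(visitor_turns$dialogue, collapse = " ")
print(visitor_text)
```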

Score: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/49775686
