
Finding the intersection of words in different columns

Stack Overflow user
Asked 2017-11-27 20:44:30
2 answers, 84 views, 0 followers, 0 votes

I have a data frame with the following information:

installed_players    active_software
vlc.exe;wmplayer.exe    smss.exe;csrss.exe;winlogon.exe;vlc.exe
wmplayer.exe    smss.exe;csrss.exe;winlogon.exe;wmplayer.exe
wmplayer.exe    smss.exe;csrss.exe;winlogon.exe

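What I need is to find which of the installed players are currently active (which software from column A also appears in column B) and write the result to column C.

So the output should look like this: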

installed_players    active_software    active_players
vlc.exe;wmplayer.exe    smss.exe;csrss.exe;winlogon.exe;vlc.exe    vlc.exe
wmplayer.exe    smss.exe;csrss.exe;winlogon.exe;wmplayer.exe    wmplayer.exe
wmplayer.exe    smss.exe;csrss.exe;winlogon.exe    none

I think this has something to do with intersect, but I don't know how to apply it in a data frame where columns A and B are strings.

Data:

df1 <- structure(list(players_installed = c("wmplayer.exe", "fsquirt.exe XMP.exe" ),
                      processes_run = c("smss.exe;csrss.exe;wininit.exe;reg.exe",
                                        "smss.exe;csrss.exe;wininit.exe;services.exe;lsass.exe;svchost.exe;fontdrvhost.exe;WUDFHost.exe;igfxCUIService.exe;ibmpmsvc.exe;Lenovo.Modern.ImController.exe;ibtsiva.exe;lvvsst.exe;micmute.exe;CamMute.exe;AppleMobileDeviceService.exe;armsvc.exe;")),
                      .Names = c("players_installed", "processes_run"),
                      row.names = 3:4, class = "data.frame")

2 Answers

Stack Overflow user

Answered 2017-11-27 21:28:46

Here is the solution:

df1 <- structure(list(players_installed = c("wmplayer.exe", "fsquirt.exe XMP.exe" ),
                  processes_run = c("smss.exe;csrss.exe;wininit.exe;reg.exe;wmplayer.exe",
                                    "smss.exe;csrss.exe;wininit.exe;services.exe;lsass.exe;svchost.exe;fontdrvhost.exe;WUDFHost.exe;igfxCUIService.exe;ibmpmsvc.exe;Lenovo.Modern.ImController.exe;ibtsiva.exe;lvvsst.exe;micmute.exe;CamMute.exe;AppleMobileDeviceService.exe;armsvc.exe;")),
             .Names = c("players_installed", "processes_run"),
             row.names = 3:4, class = "data.frame")
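# Replace semicolons with spaces so both columns use the same separator.
df1$processes_run <- gsub(";", " ", df1$processes_run)
# Split every cell of both columns into a vector of words.
x <- lapply(df1, strsplit, " ")
# Intersect the two word vectors row by row.
x <- Map(intersect, x[[1]], x[[2]])
# Collapse each row's common words back into a single string.
df1$common <- sapply(x, paste0, collapse = " ")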
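
The same strsplit and intersect idea can also be applied directly to the semicolon-separated layout shown in the question. The following is only a minimal sketch: the data frame is rebuilt from the question's sample rows, and the "none" label for rows without a match is an assumption taken from the desired output, not something the code above produces.

```r
# Sample data in the question's layout (semicolon-separated in both columns).
df <- data.frame(
  installed_players = c("vlc.exe;wmplayer.exe", "wmplayer.exe", "wmplayer.exe"),
  active_software = c("smss.exe;csrss.exe;winlogon.exe;vlc.exe",
                      "smss.exe;csrss.exe;winlogon.exe;wmplayer.exe",
                      "smss.exe;csrss.exe;winlogon.exe"),
  stringsAsFactors = FALSE
)

# Split each cell on ";" so every row becomes a character vector.
installed <- strsplit(df$installed_players, ";", fixed = TRUE)
active <- strsplit(df$active_software, ";", fixed = TRUE)

# Intersect row by row. The "none" label for empty results is an assumption
# based on the desired output in the question.
df$active_players <- mapply(function(a, b) {
  common <- intersect(a, b)
  if (length(common) == 0) "none" else paste(common, collapse = ";")
}, installed, active, USE.NAMES = FALSE)

df$active_players
```

Splitting on the literal ";" avoids the intermediate gsub step and keeps any file name that happens to contain a space in one piece.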

This is the second time I have posted a question and answered it myself. Thanks for your help, zx8754.

Votes: 0

Stack Overflow user

Answered 2017-11-27 21:52:05

Another solution using the stringi and qdapTools packages:

# stri_extract_all_words() is exported by stringi, so attach that package.
library(stringi)
library(qdapTools)

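# For every row, tabulate the words of both columns: mtabulate() returns one
# count row per column. lapply() always returns a list, whereas sapply() may
# simplify the result when every row yields the same number of distinct words.
X <- lapply(seq_len(nrow(df)), function(i)
              mtabulate(stri_extract_all_words(list(df[i,1], df[i,2]))))

# A word is treated as common when its total count equals the number of columns.
Y <- lapply(X, function(x) colSums(x) == nrow(x))

# Keep only the names of the common words.
Z <- lapply(Y, function(y) names(y)[y])

# Rows without any common word get NA.
Z[lengths(Z) == 0] <- NA_character_

df$common <- unlist(Z)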

df
##      installed_players                              active_software       common
## 1 vlc.exe;wmplayer.exe      smss.exe;csrss.exe;winlogon.exe;vlc.exe      vlc.exe
## 2         wmplayer.exe smss.exe;csrss.exe;winlogon.exe;wmplayer.exe wmplayer.exe
## 3         wmplayer.exe              smss.exe;csrss.exe;winlogon.exe         <NA>

Data:

 df <- structure(list(installed_players = c("vlc.exe;wmplayer.exe",                                  
     "wmplayer.exe", "wmplayer.exe"), active_software = c("smss.exe;csrss.exe;winlogon.exe;vlc.exe", 
     "smss.exe;csrss.exe;winlogon.exe;wmplayer.exe", "smss.exe;csrss.exe;winlogon.exe"               
     )), .Names = c("installed_players", "active_software"), row.names = c(NA,                       
     3L), class = "data.frame")  
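
To illustrate why comparing the column sums with the number of rows picks out the shared words, here is a rough base R sketch of the same counting idea for a single row. It is not the qdapTools implementation; the variable names are illustrative, and table() stands in for mtabulate() as an assumption made for this example.

```r
# One row of the sample data, as two semicolon-separated strings.
installed <- "vlc.exe;wmplayer.exe"
active <- "smss.exe;csrss.exe;winlogon.exe;vlc.exe"

# One vector of unique words per column, so each column counts a word once.
words <- lapply(strsplit(c(installed, active), ";", fixed = TRUE), unique)

# Count in how many of the two columns each word occurs.
counts <- table(unlist(words))

# A word present in every column has a count equal to the number of columns.
common <- names(counts)[as.vector(counts) == length(words)]
common
```

Taking unique words per column first means a word repeated within one column cannot be mistaken for a word shared by both.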
Votes: 0

The original page content is provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/47519499
