I have a dataframe with the following information:
installed_players active_software
vlc.exe;wmplayer.exe smss.exe;csrss.exe;winlogon.exe;vlc.exe
wmplayer.exe smss.exe;csrss.exe;winlogon.exe;wmplayer.exe
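wmplayer.exe smss.exe;csrss.exe;winlogon.exe
What I need is to find which installed players are currently active (which software from column A also appears in column B) and write the result to column C.
So the output should look like this: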
installed_players active_software active_players
vlc.exe;wmplayer.exe smss.exe;csrss.exe;winlogon.exe;vlc.exe vlc.exe
wmplayer.exe smss.exe;csrss.exe;winlogon.exe;wmplayer.exe wmplayer.exe
wmplayer.exe smss.exe;csrss.exe;winlogon.exe none
I think this has something to do with intersect, but I don't know how to apply it in a dataframe where columns A and B are strings.
Data:
df1 <- structure(list(players_installed = c("wmplayer.exe", "fsquirt.exe XMP.exe" ),
processes_run = c("smss.exe;csrss.exe;wininit.exe;reg.exe",
"smss.exe;csrss.exe;wininit.exe;services.exe;lsass.exe;svchost.exe;fontdrvhost.exe;WUDFHost.exe;igfxCUIService.exe;ibmpmsvc.exe;Lenovo.Modern.ImController.exe;ibtsiva.exe;lvvsst.exe;micmute.exe;CamMute.exe;AppleMobileDeviceService.exe;armsvc.exe;")),
.Names = c("players_installed", "processes_run"),
row.names = 3:4, class = "data.frame")
Posted on 2017-11-27 21:28:46
Here is the solution:
df1 <- structure(list(players_installed = c("wmplayer.exe", "fsquirt.exe XMP.exe" ),
processes_run = c("smss.exe;csrss.exe;wininit.exe;reg.exe;wmplayer.exe",
"smss.exe;csrss.exe;wininit.exe;services.exe;lsass.exe;svchost.exe;fontdrvhost.exe;WUDFHost.exe;igfxCUIService.exe;ibmpmsvc.exe;Lenovo.Modern.ImController.exe;ibtsiva.exe;lvvsst.exe;micmute.exe;CamMute.exe;AppleMobileDeviceService.exe;armsvc.exe;")),
.Names = c("players_installed", "processes_run"),
row.names = 3:4, class = "data.frame")
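# Make both columns space-delimited so one split pattern works for each.
df1$processes_run <- gsub(";", " ", df1$processes_run)
# Split every cell of both columns into a vector of executable names.
x <- lapply(df1, strsplit, " ")
# Intersect installed players with running processes, row by row.
x <- Map(intersect, x[[1]], x[[2]])
# Collapse each result back into one string (empty string when nothing matches).
df1$common <- sapply(x, paste0, collapse = " ")
This is the second time I have posted a question and answered it myself. Thanks for your help, zx8754.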
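The same intersect idea can be applied directly to the layout shown in the question, where both columns are semicolon-delimited. The following is a minimal base R sketch, not the original poster's code: it assumes the sample values from the question, uses the column name active_players requested there, and writes "none" when nothing matches.

```r
# Sample data in the question's layout (both columns use ";" as the delimiter).
df <- data.frame(
  installed_players = c("vlc.exe;wmplayer.exe", "wmplayer.exe", "wmplayer.exe"),
  active_software = c("smss.exe;csrss.exe;winlogon.exe;vlc.exe",
                      "smss.exe;csrss.exe;winlogon.exe;wmplayer.exe",
                      "smss.exe;csrss.exe;winlogon.exe"),
  stringsAsFactors = FALSE
)

# Split each cell into a character vector of executable names.
installed <- strsplit(df$installed_players, ";", fixed = TRUE)
active <- strsplit(df$active_software, ";", fixed = TRUE)

# Intersect the two vectors row by row.
common <- Map(intersect, installed, active)

# Collapse each match set back into one string; use "none" for empty sets.
df$active_players <- vapply(
  common,
  function(x) if (length(x) > 0) paste(x, collapse = ";") else "none",
  character(1)
)
```

Using vapply with character(1) keeps the result a plain character vector with one entry per row, even when a row has several matches (they are joined with ";").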
Posted on 2017-11-27 21:52:05
Another solution using the stringi and qdapTools packages:
library(stringi)  # provides stri_extract_all_words()
library(qdapTools)
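# Tabulate word counts per row of df: one tabulated row for each of the two columns.
# lapply (rather than sapply) always returns a list, whatever the word counts are.
X <- lapply(seq_len(nrow(df)), function(i)
  mtabulate(stri_extract_all_words(list(df[i, 1], df[i, 2]))))
# Flag words whose total count equals the number of tabulated rows (2),
# i.e. once in each column when names are not repeated within a cell.
Y <- lapply(X, function(x) colSums(x) == nrow(x))
# Keep the names of the flagged words; rows without a match become NA.
Z <- lapply(Y, function(y) names(y)[y])
Z[lengths(Z) == 0] <- NA_character_
# Assumes at most one common player per row.
df$common <- unlist(Z)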
df
## installed_players active_software common
## 1 vlc.exe;wmplayer.exe smss.exe;csrss.exe;winlogon.exe;vlc.exe vlc.exe
## 2 wmplayer.exe smss.exe;csrss.exe;winlogon.exe;wmplayer.exe wmplayer.exe
## 3 wmplayer.exe smss.exe;csrss.exe;winlogon.exe <NA>
Data:
df <- structure(list(installed_players = c("vlc.exe;wmplayer.exe",
"wmplayer.exe", "wmplayer.exe"), active_software = c("smss.exe;csrss.exe;winlogon.exe;vlc.exe",
"smss.exe;csrss.exe;winlogon.exe;wmplayer.exe", "smss.exe;csrss.exe;winlogon.exe"
)), .Names = c("installed_players", "active_software"), row.names = c(NA,
3L), class = "data.frame")
https://stackoverflow.com/questions/47519499