
Comparing words against a raw file in R

Stack Overflow user
Asked 2017-02-09 22:47:28
Answers: 1 · Views: 40 · Followers: 0 · Votes: 0

I have a raw dataset in JSON format. Let's load it in R.

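library("rjson")
setwd("mydir")  # folder that contains N1.json
getwd()
# Read the whole file and parse it as a single JSON string
json_data <- fromJSON(paste(readLines("N1.json"), collapse = ""))
# Flatten the nested list and keep only the elements named "text"
uu <- unlist(json_data)
uutext <- uu[names(uu) == "text"]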

I also have another dataset, mydata2:

mydata2 <- read.csv("path to data/words")

I need to find the words from mydata2 that occur in the messages of the JSON file, and then write those messages to a new document, "xyz.txt". How can I do this?

    chalk      indirect          pick    reaction            team      skip    pumpkin     surprise         bless  ignorance
1     time       patient          road      extent          decade  cemetery  staircase      monarch        bubble      abbey
2  service  conglomerate        banish         pan        friendly  position      tight    highlight          rice  disappear
3    write         swear         break        tire             jam   neutral   momentum  requirement  relationship     matrix
4  inspire          dose          jump     promote           trace    latest   absolute       adjust      joystick      habit
5    wrong        behave         claim    dedicate          threat      sell   particle    statement         teach       lamb
6      eye        tissue  prescription     problem       secretion   revenge     barrel        beard     mechanism   platform
7   forest          kick          face   wisecrack     uncertainty     ratio   complain        doubt    reflection    realism
8    total           fee        debate        hall            soft     smart        sip       ritual          pill   category
9  contain      headline          lump  absorption  superintendent   digital   increase          key        banner     second

I mean:
chalk -1 number1     indirect -2 number2

Template:

Word1-1 number1-1; Word1-2 number 1-2; …; Word 1-10 number 1-10
Word2-1 number2-1; Word2-2 number 2-2; …; Word 2-10 number 2-10

1 Answer

Stack Overflow user

Answered 2017-02-09 23:00:13

Next time, please include real data. A simplified example:

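library(data.table)
word <- c("test", "meh", "blah")
jsonF <- c("let's do test", "blah is right", "test blah", "test test")

outp <- list()
for (i in seq_along(word)) {
  # Messages containing word[i] as a literal substring
  # (drop fixed = TRUE and add ignore.case = TRUE for case-insensitive matching)
  outp[[i]] <- data.frame(text = grep(word[i], jsonF, value = TRUE, fixed = TRUE))
}

qq <- rbindlist(outp)  # stack the per-word results
qq <- unique(qq)       # drop messages matched by more than one word
print(qq)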

1:               let's do test
2:                   test blah
3:                   test test
4:               blah is right
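
To apply the same idea to the data in the question, something like the following sketch might work. It assumes `uutext` is the character vector of messages taken from the JSON file and that the word table was read into a data frame. The small `uutext` and `words_df` objects and the temporary output path are hypothetical stand-ins, not the asker's real data. Unlike the loop above with `fixed = TRUE`, this variant matches whole words only, and it assumes the words contain no regex metacharacters.

```r
# Hypothetical stand-ins for the question's objects (assumptions, not real data):
# uutext   - character vector of messages pulled from the JSON file
# words_df - the word table as read by read.csv()
uutext <- c("we need more time", "the team will pick a road", "nothing here")
words_df <- data.frame(a = c("chalk", "time"), b = c("pick", "road"),
                       stringsAsFactors = FALSE)

# Flatten the table into a vector of distinct words
words <- unique(unlist(words_df, use.names = FALSE))

# One regex that matches any of the words as a whole word
pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")

# Keep only the messages that contain at least one of the words
hits <- grep(pattern, uutext, value = TRUE, ignore.case = TRUE, perl = TRUE)

# Write the matching messages, one per line (placeholder path)
out_file <- file.path(tempdir(), "xyz.txt")
writeLines(hits, out_file)
```

For a very long word list a single alternation can become slow, in which case looping over the words as in the answer may be the simpler choice.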

Edit: a quick-and-dirty paste/collapse:

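library(data.table)

x <- LETTERS[1:10]
y <- LETTERS[11:20]

df <- rbind(x, y)  # 2 x 10 character matrix, one row per line of output

L <- list()
for (i in seq_len(nrow(df))) {
  # Pair each word with its column index and a "row-column" label, joined by "; "
  L[i] <- paste0(df[i, ], "-", seq(1, 10), " ", i, "-", seq(1, 10), collapse = "; ")
}
Fin <- cbind(L)
View(Fin)  # interactive viewer; use print(Fin) in a script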

This gives:

> Fin
L                                                                                           
    [1,] "A-1 1-1; B-2 1-2; C-3 1-3; D-4 1-4; E-5 1-5; F-6 1-6; G-7 1-7; H-8 1-8; I-9 1-9; J-10 1-10"
    [2,] "K-1 2-1; L-2 2-2; M-3 2-3; N-4 2-4; O-5 2-5; P-6 2-6; Q-7 2-7; R-8 2-8; S-9 2-9; T-10 2-10"
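
The same strings can also be built without an explicit loop and written straight to a file. This is only a sketch of one alternative: it uses `vapply` in place of the `for` loop, and the output path `template.txt` in the temporary directory is a hypothetical placeholder.

```r
x <- LETTERS[1:10]
y <- LETTERS[11:20]
df <- rbind(x, y)

# For each row i, build "word-j i-j" pairs and join them with "; "
fin <- vapply(seq_len(nrow(df)), function(i) {
  j <- seq_len(ncol(df))
  paste0(df[i, ], "-", j, " ", i, "-", j, collapse = "; ")
}, character(1))

# Write one formatted line per row (placeholder path)
out_file <- file.path(tempdir(), "template.txt")
writeLines(fin, out_file)
```

Here `fin` is a plain character vector rather than a list column, which makes it easier to pass to `writeLines`.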
Votes: 0
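The original page content is provided by Stack Overflow; translation support was provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/42139712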
