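I have a data frame in R, into which I imported a CSV file of questionnaire data.
One of my columns is called "NewsMethods"; in it, respondents were asked to list the methods they use to get the news. The data in the data set look like this:
...and so on.
What I want to do is replace each entry in that column with the number of elements it contains. For example, I'd like to replace the first list with the number 5.
If anyone has any ideas on how I could do this, I'd be very grateful. TIA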
Edit
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;Social Media websites or apps;Word of mouth
Television;Social Media websites or apps
Newspaper;Radio;Television;News websites (such as BBC News)
Television
Radio;Television;Word of mouth
Television;Social Media websites or apps;Word of mouth
Television;Word of mouth
Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
I'm expecting this column to be changed to: 5 5 4 2 4 1 3 3 2 6
Answered 2017-04-23 10:33:05
We can use str_count from stringr:
library(stringr)
df1$Count <- str_count(df1$NewsMethods, ";")+1
df1$Count
#[1] 5 5 4 2 4 1 3 3 2 6
Or a base R option with gregexpr:
lengths(lapply(gregexpr(";", df1$NewsMethods), function(x) x[x>0]) )+1
#[1] 5 5 4 2 4 1 3 3 2 6
Data
df1 <- structure(list(NewsMethods = c('Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;Social Media websites or apps;Word of mouth',
'Television;Social Media websites or apps',
'Newspaper;Radio;Television;News websites (such as BBC News)',
'Television',
'Radio;Television;Word of mouth',
'Television;Social Media websites or apps;Word of mouth',
'Television;Word of mouth',
'Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth')),
.Names = "NewsMethods", row.names = c(NA, -10L), class = "data.frame")
Answered 2017-04-23 11:09:40
A base R option combining strsplit and lengths:
lengths(strsplit(dfr$NewsMethods, split = ';'))
which gives:
> lengths(strsplit(dfr$NewsMethods, split = ';'))
[1] 5 5 4 2 4 1 3 3 2 6
To assign the result to a count variable in the data frame:
dfr$count <- lengths(strsplit(dfr$NewsMethods, split = ';'))
Now your data frame looks like this:
> dfr
NewsMethods count
1 Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 5
2 Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 5
3 Radio;Television;Social Media websites or apps;Word of mouth 4
4 Television;Social Media websites or apps 2
5 Newspaper;Radio;Television;News websites (such as BBC News) 4
6 Television 1
7 Radio;Television;Word of mouth 3
8 Television;Social Media websites or apps;Word of mouth 3
9 Television;Word of mouth 2
10 Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 6
Used data:
dfr <- structure(list(NewsMethods = c('Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;Social Media websites or apps;Word of mouth',
'Television;Social Media websites or apps',
'Newspaper;Radio;Television;News websites (such as BBC News)',
'Television',
'Radio;Television;Word of mouth',
'Television;Social Media websites or apps;Word of mouth',
'Television;Word of mouth',
'Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth')),
.Names = "NewsMethods", row.names = c(NA, -10L), class = "data.frame")
https://stackoverflow.com/questions/43569942
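Both answers assume that every response is a non-empty character string. Real questionnaire exports often contain blank or missing responses, and depending on the R version and import options the column may arrive as a factor, which strsplit rejects. The following is only a minimal sketch of one way to guard against those cases. The helper name count_items and the choice to count blank and NA responses as 0 are assumptions made for illustration; they are not part of either answer.

```r
# Hypothetical helper (not from the answers above): count the items in a
# delimited string, treating NA and blank responses as zero items.
count_items <- function(x, sep = ";") {
  x <- as.character(x)                 # also accepts factor columns
  x[is.na(x)] <- ""                    # assumption: a missing response has 0 items
  pieces <- strsplit(x, split = sep, fixed = TRUE)
  # Drop blank pieces so "" and stray separators such as "a;;b" are not counted
  vapply(pieces, function(p) sum(nzchar(trimws(p))), integer(1))
}

responses <- factor(c("Radio;Television;Word of mouth", "Television", "", NA))
count_items(responses)
#[1] 3 1 0 0
```

If a missing response should stay NA rather than become 0, the line that replaces NA with "" can be dropped and the NA positions restored after counting.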