
Subsetting my data loses all my observations

Stack Overflow user
Asked 2015-04-03 10:26:34
Answers: 1 · Views: 54 · Followers: 0 · Votes: 1

I have a data frame "Test" that I want to subset, but when I try, I lose all of the observations. Why does this happen?

> str(Test)
'data.frame':   157025 obs. of  13 variables:
$ Cancellations    : int  1 1 1 1 1 1 1 1 1 1 ...
$ Benefit          : chr  "Single Parent Support                          "               "Single Parent Support                          " "Job Seeker                                         " "Job Seeker                                     " ...
$ Region           : chr  "        Northland    " "        Northland    " "            Northland    " "        Northland    " ...
$ Month            : chr  "Jun 14" "Jun 14" "Jun 14" "Jun 14" ...
$ CanReason        : chr  "Change in Marital Status           " "Change in     Marital Status           " "Change in Marital Status           " "Change in     Marital Status           " ...
$ Age              : chr  " 20-24 " " 20-24 " " 20-24 " " 20-24 " ...
$ Ethnicity        : chr  "NZ European/Pakeha" "Maori             " "Other                      " "NZ European/Pakeha" ...
$ SMS              : chr  "General Case Management               " "Work     Focused Case Management          " "Work Focused Case Management          " "Work     Search Support                   " ...
$ Duration         : chr  "2-4 yrs " "2-4 yrs " "6-9 mth " "0-3 mth " ...
$ SMSDuration      : int  361 348 59 69 150 37 63 294 107 107 ...
$ AgeYoungest      : chr  "0-4 yrs    " "0-4 yrs    " "No Children" "No    Children" ...
$ AgeYoungestNonSub: chr  "0-4 yrs" "0-4 yrs" "No Children" "No Children" ...
$ Liability        : chr  " 166,000 " " 166,000 " " 102,000 " " 102,000 " ...


> subDie <- Test[CanReason == "Died",]

> str(subDie)
'data.frame':   0 obs. of  13 variables:
$ Cancellations    : int 
$ Benefit          : chr 
$ Region           : chr 
$ Month            : chr 
$ CanReason        : chr 
$ Age              : chr 
$ Ethnicity        : chr 
$ SMS              : chr 
$ Duration         : chr 
$ SMSDuration      : int 
$ AgeYoungest      : chr 
$ AgeYoungestNonSub: chr 
$ Liability        : chr 

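I have tried converting the factor variables to character. When I put the comma in front of the CanReason index (subDie <- Test[, CanReason == "Died"]), R tells me I have 157025 observations of 0 variables... I'm stumped.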


1 Answer

Stack Overflow user

Answered 2015-04-03 11:27:13

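Use a regular expression to search the character vector CanReason for the string "Died". grepl() returns a logical vector indicating, element by element, whether there was a match. Use that vector to subset Test.

For example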

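set.seed(12)
# Simulated reasons with inconsistent padding, as in the question's data
CanReason <- sample(c("Change in      Marital status",
                      "Change in   Marital status ",
                      " Died    ",
                      "Died                ",
                      "Died"), 10000, replace = TRUE)
# TRUE wherever the string contains "Died", regardless of surrounding spaces
ind <- grepl("Died", CanReason)

sum(ind)                 # number of matches
length(CanReason[ind])   # same count, via logical subsetting
head(CanReason[ind])     # first few matched values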

gives:

> sum(ind)
[1] 6037
> length(CanReason[ind])
[1] 6037
> head(CanReason[ind])
[1] "Died"                 "Died"                 "Died                "
[4] "Died"                 " Died    "            " Died    "
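
To apply the same idea to the data frame itself, here is a minimal sketch. The small Test data frame below is a made-up stand-in for the real one, and its values are assumptions for illustration only. The column is referenced as Test$CanReason, because a bare CanReason inside the brackets is looked up outside the data frame. The str() output in the question shows the values padded with spaces, which is why an exact comparison with "Died" would not match those rows.

```r
# Hypothetical toy data frame standing in for `Test`; the values are made up.
Test <- data.frame(
  Cancellations = c(1L, 1L, 1L, 1L),
  CanReason = c("Change in Marital Status   ", " Died    ", "Died", "Died      "),
  stringsAsFactors = FALSE
)

# Exact comparison only matches the unpadded value.
exact <- Test[Test$CanReason == "Died", ]

# Pattern match on the column, as in the answer above.
subDie <- Test[grepl("Died", Test$CanReason), ]

# Alternative: strip leading/trailing whitespace, then compare exactly.
# trimws() is in base R from version 3.2.0 onwards.
subDieTrim <- Test[trimws(Test$CanReason) == "Died", ]

nrow(exact)       # 1
nrow(subDie)      # 3
nrow(subDieTrim)  # 3
```

Note that grepl("Died", ...) also matches any longer string that merely contains "Died", so the trimws() variant may be the safer choice when an exact category match is wanted.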
Votes: 1
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/29425399
