I have a data frame, Test, that I want to subset, but when I try, I lose all of the observations. Why does this happen?
> str(Test)
'data.frame': 157025 obs. of 13 variables:
$ Cancellations : int 1 1 1 1 1 1 1 1 1 1 ...
$ Benefit : chr "Single Parent Support " "Single Parent Support " "Job Seeker " "Job Seeker " ...
$ Region : chr " Northland " " Northland " " Northland " " Northland " ...
$ Month : chr "Jun 14" "Jun 14" "Jun 14" "Jun 14" ...
$ CanReason : chr "Change in Marital Status " "Change in Marital Status " "Change in Marital Status " "Change in Marital Status " ...
$ Age : chr " 20-24 " " 20-24 " " 20-24 " " 20-24 " ...
$ Ethnicity : chr "NZ European/Pakeha" "Maori " "Other " "NZ European/Pakeha" ...
$ SMS : chr "General Case Management " "Work Focused Case Management " "Work Focused Case Management " "Work Search Support " ...
$ Duration : chr "2-4 yrs " "2-4 yrs " "6-9 mth " "0-3 mth " ...
$ SMSDuration : int 361 348 59 69 150 37 63 294 107 107 ...
$ AgeYoungest : chr "0-4 yrs " "0-4 yrs " "No Children" "No Children" ...
$ AgeYoungestNonSub: chr "0-4 yrs" "0-4 yrs" "No Children" "No Children" ...
$ Liability : chr " 166,000 " " 166,000 " " 102,000 " " 102,000 " ...
> subDie <- Test[CanReason == "Died",]
> str(subDie)
'data.frame': 0 obs. of 13 variables:
$ Cancellations : int
$ Benefit : chr
$ Region : chr
$ Month : chr
$ CanReason : chr
$ Age : chr
$ Ethnicity : chr
$ SMS : chr
$ Duration : chr
$ SMSDuration : int
$ AgeYoungest : chr
$ AgeYoungestNonSub: chr
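$ Liability : chr
I have tried converting the factor variables to character. When I put the comma in front of the CanReason index (subDie <- Test[, CanReason == "Died"]), R tells me there are 157025 observations of 0 variables... I am stumped.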
Posted on 2015-04-03 11:27:13
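The values in CanReason appear to be padded with spaces (the str() output shows "Change in Marital Status " with a trailing space), so an exact comparison with "Died" most likely matches nothing. Instead, search the character vector CanReason for the string "Died" with a regular expression: grepl() returns a logical vector indicating which elements match. Use that vector to subset Test.
For example: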
set.seed(12)
CanReason <- sample(c("Change in Marital status",
"Change in Marital status ",
" Died ",
"Died ",
"Died"), 10000, replace = TRUE)
ind <- grepl("Died", CanReason)
sum(ind)
length(CanReason[ind])
head(CanReason[ind])
gives:
> sum(ind)
[1] 6037
> length(CanReason[ind])
[1] 6037
> head(CanReason[ind])
[1] "Died" "Died" "Died "
[4] "Died" " Died " " Died "
https://stackoverflow.com/questions/29425399
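Applied to a data frame rather than a bare vector, the same idea might look like the sketch below. The small Test data frame here is a hypothetical stand-in with invented, space-padded values, not the asker's data, and the names exact, byPattern and byTrim are illustrative only. The trimws() alternative assumes R 3.2.0 or later.

```r
# Hypothetical stand-in for the asker's Test data frame; the values are
# invented and padded with spaces like those in the str() output.
Test <- data.frame(
  Cancellations = c(1L, 1L, 1L, 1L),
  CanReason = c("Change in Marital Status ", " Died ", "Died ",
                "Change in Marital Status "),
  stringsAsFactors = FALSE
)

# Exact comparison: no value equals "Died" once padding is counted,
# so this keeps zero rows.
exact <- Test[Test$CanReason == "Died", ]

# Option 1: pattern match on the column; fixed = TRUE treats "Died"
# as a literal string rather than a regular expression.
byPattern <- Test[grepl("Died", Test$CanReason, fixed = TRUE), ]

# Option 2: strip leading and trailing whitespace, then compare exactly.
byTrim <- Test[trimws(Test$CanReason) == "Died", ]

nrow(exact)      # 0
nrow(byPattern)  # 2
nrow(byTrim)     # 2
```

Note that the column is referenced as Test$CanReason inside the brackets, and the comma comes after the row condition. grepl() keeps any row whose value contains "Died" anywhere, whereas the trimws() comparison keeps only rows whose whole value is "Died" once the padding is removed, so the second option is the stricter of the two.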