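I need to count how many times each value occurs in one column of a data frame, broken down by the values of another column.
The core logic is to get the number of occurrences of a particular string based on another column.
For example: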
df <- data.frame(fruits = c("apples", "apples", "orange", "pears", "apples", "pears", "pears", "papaya", "papaya"),
                 veggies = c("beans", "carrots", "carrots", "carrots", "brinjal", "carrots", "brinjal", "brinjal", "beans"),
                 branches = c("Area1", "Area1", "Area1", "Area2", "Area2", "Area2", "Area2", "Area3", "Area3"))

This is my data frame. I need the count of fruits or veggies based on the branches column.
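When I use table(df$fruits),
the output is:
apples-3 orange-1 papaya-2 pears-3

This output shows the total count of apples and the other fruits across all branches. I need the exact count for each branch.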
My required output should be based on the column df$branches:
for Area1
apples-2 orange-1,
for Area2
pears-3 apples-1
for Area3
papaya-2

Answered 2015-03-27 17:38:52
Try this:
library(data.table)
# .N is the number of rows in each (branches, fruits) group
setDT(df)[, list(count = .N), by = list(branches, fruits)]
#    branches fruits count
# 1:    Area1 apples     2
# 2:    Area1 orange     1
# 3:    Area2  pears     3
# 4:    Area2 apples     1
# 5:    Area3 papaya     2

Answered 2015-03-27 17:53:11
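The question asks for one line of name-count pairs per branch, which the long-format counts above do not produce directly. A minimal base R sketch of that formatting step follows; the helper name format_counts and the variable per_branch are hypothetical, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Example data copied from the question.
df <- data.frame(
  fruits = c("apples", "apples", "orange", "pears", "apples",
             "pears", "pears", "papaya", "papaya"),
  veggies = c("beans", "carrots", "carrots", "carrots", "brinjal",
              "carrots", "brinjal", "brinjal", "beans"),
  branches = c("Area1", "Area1", "Area1", "Area2", "Area2",
               "Area2", "Area2", "Area3", "Area3")
)

# Hypothetical helper: tabulate one group and paste "name-count" pairs.
format_counts <- function(x) {
  counts <- table(x)
  paste0(names(counts), "-", as.integer(counts), collapse = " ")
}

# Split the fruits by branch (as.character guards against unused factor
# levels on older R versions), then format each group as one string.
per_branch <- vapply(split(as.character(df$fruits), df$branches),
                     format_counts, character(1))
print(per_branch)
```

With the example data, per_branch["Area1"] is "apples-2 orange-1". Names within each branch come out in alphabetical order because table() sorts them, so Area2 reads "apples-1 pears-3" rather than the order written in the question.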
Maybe just use ftable:
> ftable(fruits ~ branches, data = df)
         fruits apples orange papaya pears
branches
Area1                2      1      0     0
Area2                1      0      0     3
Area3                0      0      2     0
> ftable(veggies ~ branches, data = df)
         veggies beans brinjal carrots
branches
Area1                1       0       2
Area2                0       2       2
Area3                1       1       0

Answered 2015-03-27 17:59:40
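Since the question started from table(df$fruits), the same cross-tabulation can also be sketched by passing table() a second vector. This is an illustrative alternative, not part of any answer here; the names fruit_counts and long_counts are hypothetical, and the data frame is rebuilt from the question so the snippet is self-contained.

```r
# Example data copied from the question.
df <- data.frame(
  fruits = c("apples", "apples", "orange", "pears", "apples",
             "pears", "pears", "papaya", "papaya"),
  veggies = c("beans", "carrots", "carrots", "carrots", "brinjal",
              "carrots", "brinjal", "brinjal", "beans"),
  branches = c("Area1", "Area1", "Area1", "Area2", "Area2",
               "Area2", "Area2", "Area3", "Area3")
)

# Two-way contingency table: rows are branches, columns are fruits.
fruit_counts <- table(df$branches, df$fruits)
print(fruit_counts)

# Look up a single cell by its row and column names.
print(fruit_counts["Area2", "pears"])

# Convert to long format and keep only combinations that actually occur.
long_counts <- as.data.frame(fruit_counts, stringsAsFactors = FALSE)
names(long_counts) <- c("branches", "fruits", "count")
long_counts <- long_counts[long_counts$count > 0, ]
print(long_counts)
```

The wide table is convenient for looking up one cell, while the filtered long form drops the zero rows and is easier to pass on to further processing.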
I'm not sure what output you expect, but you can get the counts with the dplyr package.
For example:
library(dplyr)
df %>% count(fruits, branches)
# OR
count(df, fruits, branches)

Output:
Source: local data frame [5 x 3]
Groups: fruits

  fruits branches n
1 apples    Area1 2
2 apples    Area2 1
3 orange    Area1 1
4 papaya    Area3 2
5  pears    Area2 3

https://stackoverflow.com/questions/29297398