
Using the ave function in R: unique number of months per person

Stack Overflow user
Asked 2014-01-29 12:35:47
Answers: 1 · Views: 2.4K · Followers: 0 · Votes: 1

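I want to count the number of rows per person (hai_dispense_number) per month (month) in the data below. My overall aim is to see whether the average number of rows increases from April to September. I was sure I should use the ave function to create a count variable, but none of my attempts have worked; see them below. Once I have the counts, I think I will be able to use ddply to produce a monthly average summary. Below is a toy df; the column "obs" is my desired output.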

df
         hai_dispense_number date_of_claim hai_atc     month obs
9972511   Patient HAI0002664    2010-04-07 A10BA02     april   1
11376245  Patient HAI0002664    2010-05-04 A10BA02       may   1
12508505  Patient HAI0002664    2010-05-31 A10BA02       may   2
13480611  Patient HAI0002664    2010-06-30 A10BA02      june   1
13486327  Patient HAI0002664    2010-06-30 A10BH03      june   2
13567944  Patient HAI0002664    2010-06-08 A10BA02      june   3
15003657  Patient HAI0002664    2010-07-27 A10BA02      july   1
15003658  Patient HAI0002664    2010-07-27 A10BH03      july   2
16600413  Patient HAI0002664    2010-08-31 A10BB09    august   1
16600866  Patient HAI0002664    2010-08-23 A10BA02    august   2
16600867  Patient HAI0002664    2010-08-23 A10BH03    august   3
17537505  Patient HAI0002664    2010-08-27 A10BB09    august   4
19176349  Patient HAI0002664    2010-09-17 A10BB09 september   1
19176350  Patient HAI0002664    2010-09-17 A10BH03 september   2
19176358  Patient HAI0002664    2010-09-17 A10BA02 september   3
17765433  Patient HAI0006637    2010-09-17 A10BA02 september   4
12953451  Patient HAI0007418    2010-06-04 A10BA02      june   1
15786889  Patient HAI0007418    2010-07-28 A10BB09      july   1
15787103  Patient HAI0007418    2010-07-12 A10BB09      july   2
15787233  Patient HAI0007418    2010-07-05 A10BA02      july   3
15878776  Patient HAI0007418    2010-07-08 A10BB09      july   4
15908690  Patient HAI0007418    2010-07-23 A10BB09      july   5
17363576  Patient HAI0007418    2010-08-20 A10BB09    august   1
17554737  Patient HAI0007418    2010-08-13 A10BB09    august   2

Previous attempts

df$obs<-with(df, ave(month, hai_dispense_number, FUN=seq_along))  ##doesn't split by month

df$obs<-with(df, ave(month, hai_dispense_number, FUN=cumsum))  ##gives all NA values, think seq_along is actually what I want

df$obs <- ave(df$month, df$month, FUN=seq_along)  ##this is better than the previous two, but doesn't seem to split by person

ddply(df,~month,summarise,mean=mean(obs)) ##this works absolutely fine, just need to get the counts right first!

I would appreciate any input anyone can give me. It looks like I am getting something fundamental wrong.


1 Answer

Stack Overflow user

Accepted answer

Answered 2014-01-29 12:58:49

OK, I have cut your data down to:

> head(df)
            patient month
9972511  HAI0002664 april
11376245 HAI0002664   may
12508505 HAI0002664   may
13480611 HAI0002664  june
13486327 HAI0002664  june
13567944 HAI0002664  june

That is all we need, since only the patient identifier and the month matter here. To get the new column you want, try the following:

library(plyr)

> ddply(df, .(patient, month), mutate, obs = 1:length(month))
      patient     month obs
1  HAI0002664     april   1
2  HAI0002664    august   1
3  HAI0002664    august   2
4  HAI0002664    august   3
5  HAI0002664    august   4
6  HAI0002664      july   1
7  HAI0002664      july   2
8  HAI0002664      june   1
9  HAI0002664      june   2
10 HAI0002664      june   3
11 HAI0002664       may   1
12 HAI0002664       may   2
13 HAI0002664 september   1
14 HAI0002664 september   2
15 HAI0002664 september   3
16 HAI0006637 september   1
17 HAI0007418    august   1
18 HAI0007418    august   2
19 HAI0007418      july   1
20 HAI0007418      july   2
21 HAI0007418      july   3
22 HAI0007418      july   4
23 HAI0007418      july   5
24 HAI0007418      june   1
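
Since the question asked about ave specifically, here is a minimal base R sketch of the same counter. It assumes the trimmed data frame with columns patient and month shown above, rebuilt by hand here from the toy data. ave accepts several grouping vectors, so passing both the patient and the month splits by each person-month combination, whereas the attempts in the question grouped by only one of the two. A numeric first argument is used so the result stays numeric instead of taking the type of month.

```r
# Rebuild the trimmed toy data (patient and month only), in the original row order.
df <- data.frame(
  patient = c(rep("HAI0002664", 15), "HAI0006637", rep("HAI0007418", 8)),
  month = c("april", "may", "may", rep("june", 3), rep("july", 2),
            rep("august", 4), rep("september", 3), "september",
            "june", rep("july", 5), rep("august", 2)),
  stringsAsFactors = FALSE
)

# ave() takes any number of grouping vectors after x; FUN must be passed by name.
# Using a row index as x keeps the result an integer counter.
df$obs <- ave(seq_along(df$month), df$patient, df$month, FUN = seq_along)

print(df)
```

Unlike the ddply call, this keeps the rows in their original order rather than sorting them by patient and month.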

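By the way, I am assuming that obs = 4 for September in your example output is a typo, since the patient identifier changes there from that of the previous three rows (2664 to 6637).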
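
On the overall aim stated in the question (whether the average number of rows rises from April to September), one possible sketch in base R follows. It assumes the quantity of interest is the number of rows per patient per month, averaged over the patients who have rows in that month. That interpretation, and the names n, counts, monthly and month_levels, are assumptions for illustration and do not come from the thread. Note that mean(obs) per month averages the running counter rather than the per-patient totals, so the two can differ.

```r
# Rebuild the trimmed toy data (patient and month only).
df <- data.frame(
  patient = c(rep("HAI0002664", 15), "HAI0006637", rep("HAI0007418", 8)),
  month = c("april", "may", "may", rep("june", 3), rep("july", 2),
            rep("august", 4), rep("september", 3), "september",
            "june", rep("july", 5), rep("august", 2)),
  stringsAsFactors = FALSE
)

# One row per observed patient-month combination, holding its row count.
df$n <- 1L
counts <- aggregate(n ~ patient + month, data = df, FUN = sum)

# Average number of rows per patient within each month.
monthly <- aggregate(n ~ month, data = counts, FUN = mean)

# Order the months chronologically rather than alphabetically.
month_levels <- c("april", "may", "june", "july", "august", "september")
monthly <- monthly[order(match(monthly$month, month_levels)), ]

print(monthly)
```

The same two-step summary could presumably be written with plyr's ddply and summarise; base R's aggregate is used here only to keep the sketch free of extra packages.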

Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/21431321
