I have data similar to the following, but larger:
boat = c(1,1,1,1,1,1,1,2,2,2,2,2,2)
species = c("cod", "haddock", "ling",
"cod", "haddock", "ling", "tusk",
"cod", "haddock", "ling",
"cod", "haddock", "ling")
date = c(as.Date(c("1.03.2017","1.03.2017","1.03.2017",
"2.03.2017", "2.03.2017", "2.03.2017","2.03.2017",
"4.03.2017","4.03.2017","4.03.2017",
"7.03.2017", "7.03.2017", "7.03.2017"), "%d.%m.%Y"))
df <- data.frame(boat, species, date)
df
boat species date
1 cod 01.03.2017
1 haddock 01.03.2017
1 ling 01.03.2017
1 cod 02.03.2017
1 haddock 02.03.2017
1 ling 02.03.2017
1 tusk 02.03.2017
2 cod 04.03.2017
2 haddock 04.03.2017
2 ling 04.03.2017
2 cod 07.03.2017
2 haddock 07.03.2017
2 ling 07.03.2017
I would like to create an additional column that ranks the dates within each boat, so that my dataset looks like this:
boat species date rank
1 cod 01.03.2017 1
1 haddock 01.03.2017 1
1 ling 01.03.2017 1
1 cod 02.03.2017 2
1 haddock 02.03.2017 2
1 ling 02.03.2017 2
1 tusk 02.03.2017 2
2 cod 04.03.2017 1
2 haddock 04.03.2017 1
2 ling 04.03.2017 1
2 cod 07.03.2017 2
2 haddock 07.03.2017 2
2 ling 07.03.2017 2
I tried the following code:
library(dplyr)
df %>%
group_by(boat, species) %>%
mutate(Order = rank(date))
However, a species that has not appeared before is given a rank of "1" the first time it appears. Any help would be greatly appreciated.
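The behaviour described above can be reproduced with a minimal sketch. Grouping by both boat and species ranks each species' dates separately, so tusk, which boat 1 lands only once, has a single date in its group and that date ranks first. The data frame `demo` is a hypothetical cut-down version of `df`, and `rank_by_species` is a column name chosen only for illustration.

```r
library(dplyr)

# Hypothetical cut-down version of the question's data: boat 1 only.
demo <- data.frame(
  boat    = c(1, 1, 1, 1, 1),
  species = c("cod", "haddock", "cod", "haddock", "tusk"),
  date    = as.Date(c("2017-03-01", "2017-03-01",
                      "2017-03-02", "2017-03-02", "2017-03-02"))
)

# Grouping by boat AND species ranks the dates within each species,
# so tusk's only date is the earliest one in its own group.
by_species <- demo %>%
  group_by(boat, species) %>%
  mutate(rank_by_species = rank(date)) %>%
  ungroup()

# The tusk row (last row) gets rank 1 even though its date is the
# second fishing date for boat 1.
by_species$rank_by_species
```

Dropping `species` from the grouping makes the ranking depend on the boat's dates alone.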
Posted on 2017-08-25 21:28:53
We can use group_by and dense_rank from dplyr to create the desired output.
library(dplyr)
df2 <- df %>%
group_by(boat) %>%
mutate(rank = dense_rank(date))
df2
# A tibble: 13 x 4
# Groups: boat [2]
boat species date rank
<dbl> <fctr> <date> <int>
1 1 cod 2017-03-01 1
2 1 haddock 2017-03-01 1
3 1 ling 2017-03-01 1
4 1 cod 2017-03-02 2
5 1 haddock 2017-03-02 2
6 1 ling 2017-03-02 2
7 1 tusk 2017-03-02 2
8 2 cod 2017-03-04 1
9 2 haddock 2017-03-04 1
10 2 ling 2017-03-04 1
11 2 cod 2017-03-07 2
12 2 haddock 2017-03-07 2
13 2 ling 2017-03-07 2
Posted on 2017-08-25 21:19:35
library(dplyr)
left_join(df,
unique(df[,c(1,3)]) %>%
group_by(boat) %>%
mutate(Order = rank(date)))
## boat species date Order
## 1 1 cod 2017-03-01 1
## 2 1 haddock 2017-03-01 1
## 3 1 ling 2017-03-01 1
## 4 1 cod 2017-03-02 2
## 5 1 haddock 2017-03-02 2
## 6 1 ling 2017-03-02 2
## 7 1 tusk 2017-03-02 2
## 8 2 cod 2017-03-04 1
## 9 2 haddock 2017-03-04 1
## 10 2 ling 2017-03-04 1
## 11 2 cod 2017-03-07 2
## 12 2 haddock 2017-03-07 2
## 13 2 ling 2017-03-07 2
Posted on 2017-08-25 21:21:22
Solution
df %>%
group_by(boat) %>%
mutate(Order=cumsum(lag(date,default=head(date,1)) != date)+1)
Output
boat species date Order
1 1 cod 2017-03-01 1
2 1 haddock 2017-03-01 1
3 1 ling 2017-03-01 1
4 1 cod 2017-03-02 2
5 1 haddock 2017-03-02 2
6 1 ling 2017-03-02 2
7 1 tusk 2017-03-02 2
8 2 cod 2017-03-04 1
9 2 haddock 2017-03-04 1
10 2 ling 2017-03-04 1
11 2 cod 2017-03-07 2
12 2 haddock 2017-03-07 2
13 2 ling 2017-03-07 2
https://stackoverflow.com/questions/45882247
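One difference between the answers above is worth sketching. The lag/cumsum approach counts date changes between consecutive rows, so it relies on the rows already being ordered by date within each boat, as they are in the question's data. dense_rank does not depend on row order. The data frame `shuffled` and the column names `dense` and `by_change` below are hypothetical, chosen only for illustration.

```r
library(dplyr)

# Hypothetical data for one boat with the dates deliberately out of order.
shuffled <- data.frame(
  boat = c(1, 1, 1, 1),
  date = as.Date(c("2017-03-02", "2017-03-01", "2017-03-02", "2017-03-01"))
)

# dense_rank looks at the values; the cumsum/lag version looks at row-to-row changes.
res <- shuffled %>%
  group_by(boat) %>%
  mutate(
    dense     = dense_rank(date),
    by_change = cumsum(lag(date, default = first(date)) != date) + 1
  ) %>%
  ungroup()

# Sorting by boat and date first makes the change-counting version
# agree with dense_rank.
sorted <- shuffled %>%
  arrange(boat, date) %>%
  group_by(boat) %>%
  mutate(by_change = cumsum(lag(date, default = first(date)) != date) + 1) %>%
  ungroup()

res$dense         # rank of each date within the boat
res$by_change     # increments on every change, so it keeps growing here
sorted$by_change  # matches the dense rank once the rows are ordered
```

If the row order of the real data is not guaranteed, either sort with arrange() first or use dense_rank.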