
Add the values of two rows and create a new row

Stack Overflow user
Asked 2016-06-20 20:13:15
2 answers, 87 views, 0 followers, score 1

I'm new to R and am having trouble modifying my data file:

id <- c(1, 2,3,4,5,6,7,8,9,10)
number <- c(1,1,1,1,1,1,8,8,2,2)
country <- c("France", "France", "France", "France", "France", "France", "Spain", "Spain", "Belgium", "Belgium")
year <- c(2010,2010,2011,2011,2010,2010,2009,2009,1996,1996)
sex <- c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F")
disease <- c("hiv","hiv","hiv","hiv","cancer","cancer","cancer","cancer","tubercolosis","tubercolosis")
value <- c(15,1,0,2,50,120,600,47,0,0)

What I want is the same data with 5 new rows that hold the sum of the M and F entries of the value column. Like this:

id <- c(1, 2,3,4,5,6,7,8,9,10,11,12,13,14,15)
number <- c(1,1,1,1,1,1,8,8,2,2,1,1,1,8,2)
country <- c("France", "France", "France", "France", "France", "France", "Spain", "Spain", "Belgium", "Belgium","France", "France", "France", "Spain", "Belgium")
year <- c(2010,2010,2011,2011,2010,2010,2009,2009,1996,1996,2010,2011,2010,2009,1996)
sex <- c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F","T","T","T","T","T")
disease <- c("hiv","hiv","hiv","hiv","cancer","cancer","cancer","cancer","tubercolosis","tubercolosis","hiv","hiv","cancer","cancer","tubercolosis")
value <- c(15,1,0,2,50,120,600,47,0,0,16,2,170,647,0)

To be clear:

> whatIhave
   id number country year sex      disease value
1   1      1  France 2010   M          hiv    15
2   2      1  France 2010   F          hiv     1
3   3      1  France 2011   M          hiv     0
4   4      1  France 2011   F          hiv     2
5   5      1  France 2010   M       cancer    50
6   6      1  France 2010   F       cancer   120
7   7      8   Spain 2009   M       cancer   600
8   8      8   Spain 2009   F       cancer    47
9   9      2 Belgium 1996   M tubercolosis     0
10 10      2 Belgium 1996   F tubercolosis     0

> whatIwant
   id number country year sex      disease value
1   1      1  France 2010   M          hiv    15
2   2      1  France 2010   F          hiv     1
3   3      1  France 2011   M          hiv     0
4   4      1  France 2011   F          hiv     2
5   5      1  France 2010   M       cancer    50
6   6      1  France 2010   F       cancer   120
7   7      8   Spain 2009   M       cancer   600
8   8      8   Spain 2009   F       cancer    47
9   9      2 Belgium 1996   M tubercolosis     0
10 10      2 Belgium 1996   F tubercolosis     0
11 11      1  France 2010   T          hiv    16
12 12      1  France 2011   T          hiv     2
13 13      1  France 2010   T       cancer   170
14 14      8   Spain 2009   T       cancer   647
15 15      2 Belgium 1996   T tubercolosis     0

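This creates a new value, T, in the sex column, indicating the sum F + M. The 5 new rows are the last 5 rows. There are 5 of them because I need to add the F and M values for each combination of country, year and disease. number is tied to the country. id is just the identifier of each row. My real data frame is obviously much bigger than this one.

How can I do this? Thanks.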


2 Answers

Stack Overflow user

Accepted answer

Posted 2016-06-20 23:00:00

Here is a quick solution using the data.table approach:

library(data.table)

# calculate the sums and store it in a separate data table dtpart2 
dtpart2 <- setDT(df)[ , .(value= sum(value)), by = .(number, country, year, disease)]

# create columns of sex and id
dtpart2[, id := max(df$id)+1: nrow(dtpart2) ][, sex := "T"]

# set the same column order as in the original data frame 
setcolorder(dtpart2, names(df))

# Append the two data sets
newdata <- rbind(df,dtpart2)

#>     id number country year  sex      disease value
#>  1:  1      1  France 2010    M          hiv    15
#>  2:  2      1  France 2010    F          hiv     1
#>  3:  3      1  France 2011    M          hiv     0
#>  4:  4      1  France 2011    F          hiv     2
#>  5:  5      1  France 2010    M       cancer    50
#>  6:  6      1  France 2010    F       cancer   120
#>  7:  7      8   Spain 2009    M       cancer   600
#>  8:  8      8   Spain 2009    F       cancer    47
#>  9:  9      2 Belgium 1996    M tubercolosis     0
#> 10: 10      2 Belgium 1996    F tubercolosis     0
#> 11: 11      1  France 2010    T          hiv    16
#> 12: 12      1  France 2011    T          hiv     2
#> 13: 13      1  France 2010    T       cancer   170
#> 14: 14      8   Spain 2009    T       cancer   647
#> 15: 15      2 Belgium 1996    T tubercolosis     0

Data:

df <- data.frame(id, number, country, year, sex, disease, value)
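
When new ids are generated by adding a sequence to the current maximum id, R's operator precedence matters: `:` binds more tightly than `+`. The sketch below uses made-up stand-in values (`last_id` and `n_new` are hypothetical names, not part of the answer) to show which groupings give the intended ids.

```r
# Hypothetical stand-ins: the largest existing id and the number of new rows
last_id <- 10
n_new   <- 5

# `:` is evaluated before `+`, so this means last_id + (1:n_new)
ids_a <- last_id + 1:n_new

# A more explicit spelling; seq_len() also behaves sensibly when n_new is 0
ids_b <- last_id + seq_len(n_new)

# Grouped the other way, the sequence runs from 11 down to 5 instead
ids_c <- (last_id + 1):n_new

print(ids_a)
print(ids_b)
print(ids_c)
```

The first two forms give the ids 11 to 15; the third one does not, which is why the parentheses (or `seq_len()`) are worth spelling out when in doubt.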
Votes 0

Stack Overflow user

Posted 2016-06-20 20:28:53

# Build the data frame with named columns (use `=`, not `<-`, inside data.frame())
df <- data.frame(
  number  = c(1,1,1,1,1,1,8,8,2,2),
  country = c("France", "France", "France", "France", "France", "France", "Spain", "Spain", "Belgium", "Belgium"),
  year    = c(2010,2010,2011,2011,2010,2010,2009,2009,1996,1996),
  sex     = c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F"),
  disease = c("hiv","hiv","hiv","hiv","cancer","cancer","cancer","cancer","tubercolosis","tubercolosis"),
  value   = c(15,1,0,2,50,120,600,47,0,0))

# Sum value within each number/country/disease/year group.
# number is a grouping key here: summing it would double it on the total rows.
df2 <- aggregate(df["value"],
                 by = list(number = df$number, country = df$country,
                           disease = df$disease, year = df$year),
                 FUN = sum)
df2$sex <- "T"

# Put the columns in the same order as df before appending
df2 <- df2[, colnames(df)]

newdf <- rbind(df, df2)

newdf

   number country year sex      disease value
1       1  France 2010   M          hiv    15
2       1  France 2010   F          hiv     1
3       1  France 2011   M          hiv     0
4       1  France 2011   F          hiv     2
5       1  France 2010   M       cancer    50
6       1  France 2010   F       cancer   120
7       8   Spain 2009   M       cancer   600
8       8   Spain 2009   F       cancer    47
9       2 Belgium 1996   M tubercolosis     0
10      2 Belgium 1996   F tubercolosis     0
11      2 Belgium 1996   T tubercolosis     0
12      8   Spain 2009   T       cancer   647
13      1  France 2010   T       cancer   170
14      1  France 2010   T          hiv    16
15      1  France 2011   T          hiv     2
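
For readers who also need the id column from the question, a possible base R variant is sketched below using the formula interface of `aggregate()`. It assumes that number is fixed per country, as the question states, so it is listed as a grouping variable and carried through unchanged. The names `totals` and `result` are illustrative only.

```r
# Data from the question, including the id column
df <- data.frame(
  id      = 1:10,
  number  = c(1, 1, 1, 1, 1, 1, 8, 8, 2, 2),
  country = c("France", "France", "France", "France", "France", "France",
              "Spain", "Spain", "Belgium", "Belgium"),
  year    = c(2010, 2010, 2011, 2011, 2010, 2010, 2009, 2009, 1996, 1996),
  sex     = c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F"),
  disease = c("hiv", "hiv", "hiv", "hiv", "cancer", "cancer", "cancer",
              "cancer", "tubercolosis", "tubercolosis"),
  value   = c(15, 1, 0, 2, 50, 120, 600, 47, 0, 0),
  stringsAsFactors = FALSE
)

# Sum value over M and F within each number/country/year/disease group
totals <- aggregate(value ~ number + country + year + disease,
                    data = df, FUN = sum)

# Mark the new rows as totals and continue the id sequence
totals$sex <- "T"
totals$id  <- max(df$id) + seq_len(nrow(totals))

# Append the totals, with columns in the same order as df
result <- rbind(df, totals[, names(df)])

# Sanity check: the T rows together must equal the sum of all M and F rows
stopifnot(sum(result$value[result$sex == "T"]) == sum(df$value))

print(result)
```

The order of the T rows follows the sort order of `aggregate()`, so it may differ from the order shown in the question; the ids are assigned in whatever order the totals come out.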
Votes 0

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/37931061
