Q: Cast only one variable into a new column in R

Stack Overflow user

Asked 2018-07-25 02:30:37

Answers: 2 · Views: 391 · Followers: 0 · Votes: 2

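I am trying to dcast my data so that only the Actual values are split out into a new column. However, the only way I have managed to do this is to dcast and then melt back again. I would like to know whether there is a more efficient solution.

Step 1:

I have already done some preparation on my data, and it looks like this: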

> test_m <- melt(test, id.vars = c("category", "Budget_year", "State"))
> test_m <- test_m[,c("Year", "Type_of_observation"):= tstrsplit(variable, " ", fixed = TRUE)]
> test_m[,variable := NULL]
> head(test_m, n = 10)

          category Budget_year State value    Year Type_of_observation
 1:  Transfer Duty     2000_01     N  1916 1998-99              Actual
 2:       Land Tax     2000_01     N   948 1998-99              Actual
 3:    Payroll Tax     2000_01     N  3605 1998-99              Actual
 4: Total Gambling     2000_01     N  1419 1998-99              Actual
 5:            GST     2000_01     N  4705 1998-99              Actual
 6:  Transfer Duty     2000_01     N  1747 1999-00              Budget
 7:       Land Tax     2000_01     N   830 1999-00              Budget
 8:    Payroll Tax     2000_01     N  3616 1999-00              Budget
 9: Total Gambling     2000_01     N  1558 1999-00              Budget
10:            GST     2000_01     N  5162 1999-00              Budget

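Now I want to create a new column from the Type_of_observation column, but using only the Actual observations and leaving all other observation types in long form. My current approach is to dcast and then melt, as follows:

Step 2: Desired output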

> test_c <- dcast(test_m, category + Budget_year + State + Year ~ Type_of_observation)
> test_mc <- melt(test_c, id.vars = c("category", "Budget_year", "State", "Year", "Actual"), measure.vars = c("Budget", "Estimate", "Revised"))
> head(test_mc, n = 10)
    category Budget_year State    Year Actual variable value
 1:      GST     2000_01     N 1998-99   4705   Budget    NA
 2:      GST     2000_01     N 1999-00     NA   Budget  5162
 3:      GST     2000_01     N 2000-01     NA   Budget  8318
 4:      GST     2000_01     N 2001-02     NA   Budget    NA
 5:      GST     2000_01     N 2002-03     NA   Budget    NA
 6:      GST     2000_01     N 2003-04     NA   Budget    NA
 7: Land Tax     2000_01     N 1998-99    948   Budget    NA
 8: Land Tax     2000_01     N 1999-00     NA   Budget   830
 9: Land Tax     2000_01     N 2000-01     NA   Budget   921
10: Land Tax     2000_01     N 2001-02     NA   Budget    NA

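Now I have a column of Actual values, and all other observation types remain in the variable column.

Is there any way to get from test_m to test_mc without having to dcast and then melt? Ideally I am looking for a data.table solution, but I am open to anything.

Here is the dput for test_m: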

> dput(test_m)
structure(list(category = c("Transfer Duty", "Land Tax", "Payroll Tax", 
"Total Gambling", "GST", "Transfer Duty", "Land Tax", "Payroll Tax", 
"Total Gambling", "GST", "Transfer Duty", "Land Tax", "Payroll Tax", 
"Total Gambling", "GST", "Transfer Duty", "Land Tax", "Payroll Tax", 
"Total Gambling", "GST", "Transfer Duty", "Land Tax", "Payroll Tax", 
"Total Gambling", "GST", "Transfer Duty", "Land Tax", "Payroll Tax", 
"Total Gambling", "GST", "Transfer Duty", "Land Tax", "Payroll Tax", 
"Total Gambling", "GST"), Budget_year = c("2000_01", "2000_01", 
"2000_01", "2000_01", "2000_01", "2000_01", "2000_01", "2000_01", 
"2000_01", "2000_01", "2000_01", "2000_01", "2000_01", "2000_01", 
"2000_01", "2000_01", "2000_01", "2000_01", "2000_01", "2000_01", 
"2000_01", "2000_01", "2000_01", "2000_01", "2000_01", "2000_01", 
"2000_01", "2000_01", "2000_01", "2000_01", "2000_01", "2000_01", 
"2000_01", "2000_01", "2000_01"), State = c("N", "N", "N", "N", 
"N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", 
"N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", 
"N", "N", "N", "N", "N"), value = c("1916", "948", "3605", "1419", 
"4705", "1747", "830", "3616", "1558", "5162", "2249", "889", 
"3742", "1578", "5173", "1746", "921", "3931", "1212", "8318", 
"1686", "948", "4146", "1241", "9520", "1756", "971", "4258", 
"1309", "9789", "1904", "991", "4503", "1374", "10006"), Year = c("1998-99", 
"1998-99", "1998-99", "1998-99", "1998-99", "1999-00", "1999-00", 
"1999-00", "1999-00", "1999-00", "1999-00", "1999-00", "1999-00", 
"1999-00", "1999-00", "2000-01", "2000-01", "2000-01", "2000-01", 
"2000-01", "2001-02", "2001-02", "2001-02", "2001-02", "2001-02", 
"2002-03", "2002-03", "2002-03", "2002-03", "2002-03", "2003-04", 
"2003-04", "2003-04", "2003-04", "2003-04"), Type_of_observation = c("Actual", 
"Actual", "Actual", "Actual", "Actual", "Budget", "Budget", "Budget", 
"Budget", "Budget", "Revised", "Revised", "Revised", "Revised", 
"Revised", "Budget", "Budget", "Budget", "Budget", "Budget", 
"Estimate", "Estimate", "Estimate", "Estimate", "Estimate", "Estimate", 
"Estimate", "Estimate", "Estimate", "Estimate", "Estimate", "Estimate", 
"Estimate", "Estimate", "Estimate")), .Names = c("category", 
"Budget_year", "State", "value", "Year", "Type_of_observation"
), row.names = c(NA, -35L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x026c24a0>)

Tags: data.table, melt, dcast

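2 Answers

Stack Overflow user

Answered 2018-07-25 03:19:01

You can first complete the cases, that is, build every combination of the key columns with the non-Actual observation types, and then join your dataset onto that grid.

Finally, perform an update join to look up the Actual values.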

#create complete cases
ans <- test_m[CJ(category=category, Budget_year=Budget_year, State=State, Year=Year, Type_of_observation=c("Budget", "Estimate", "Revised"), unique=TRUE),
    on=.(category, Budget_year, State, Year, Type_of_observation)][
        #update join
        test_m[Type_of_observation=="Actual"], 
        Actual := i.value,
        on=.(category, Budget_year, State, Year)]

#order to match test_mc
setorder(ans, category, Budget_year, State, Year, Type_of_observation)[]
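
A minimal sketch of the same two steps on a hypothetical three-row table may make the mechanics easier to follow. The table `toy` and the intermediate `grid` are illustrative names, not part of the original post, and only two key columns are used to keep the output small.

```r
library(data.table)

# Hypothetical miniature of test_m: one category, two years.
toy <- data.table(
  category = "GST",
  Year = c("1998-99", "1999-00", "1999-00"),
  Type_of_observation = c("Actual", "Budget", "Revised"),
  value = c("4705", "5162", "5173")
)

# Step 1: build every (category, Year, non-Actual type) combination and
# join toy onto it; combinations with no data get value = NA.
grid <- toy[, CJ(category = category, Year = Year,
                 Type_of_observation = c("Budget", "Revised"),
                 unique = TRUE)]
ans <- toy[grid, on = .(category, Year, Type_of_observation)]

# Step 2: update join; copy each Actual value onto all rows sharing its keys.
ans[toy[Type_of_observation == "Actual"],
    Actual := i.value,
    on = .(category, Year)]

setorder(ans, category, Year, Type_of_observation)
print(ans)
```

Completing the cases first is what keeps the 1998-99 rows in the result even though that year has no Budget or Revised observation; without the grid, the Actual value would have no row to attach to.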

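Votes: 1

Stack Overflow user

Answered 2018-10-18 02:25:46

I think I have a simple data.table way to do this, using setkey and a join inside the brackets.

I will use a simpler data.table. The goal is to move interest_rate into its own column.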

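library(data.table)

samp <- data.table(
  group=c("a","a","a","b","b","b","c","c","c"),
  # repeat the three variable names explicitly for each group
  # instead of relying on implicit recycling
  variable=rep(c("balance", "end_balance","interest_rate"), times=3),
  value=c(1000, 940, .05, 1200, 1040, .08, 980, 970, .10)
)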


setkey(samp, group)

#  This will create a data.table with just our desired variable value, interest_rate, by group
samp[variable=="interest_rate", .(interest_rate=unique(value)), by=.(group)]

#  We then join this to the original data.table using the already set key and
#  drop the interest_rate rows in the final data.table
samp[samp[variable=="interest_rate", .(interest_rate=unique(value)), by=.(group)]][variable!="interest_rate"]
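
One caveat worth sketching: the join above is driven by the interest_rate rows, so a group with no interest_rate row would drop out of the result. A possible variation, offered here as an assumption rather than as the answerer's method, is an update join with `on=`, which adds the column by reference and keeps every row of the original table. The names `rates` and `out` are illustrative.

```r
library(data.table)

samp <- data.table(
  group    = rep(c("a", "b", "c"), each = 3),
  variable = rep(c("balance", "end_balance", "interest_rate"), times = 3),
  value    = c(1000, 940, 0.05, 1200, 1040, 0.08, 980, 970, 0.10)
)

# One row per group holding the value to promote into its own column.
rates <- samp[variable == "interest_rate", .(group, interest_rate = value)]

# Update join on a copy (so samp itself is left unchanged), then drop the
# rows that supplied the new column.
out <- copy(samp)[rates, interest_rate := i.interest_rate, on = "group"][
  variable != "interest_rate"]
print(out)
```

Because the update join modifies its left-hand table in place, `copy()` is used so that `samp` stays as it was; omit it if overwriting the original is acceptable.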

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/51510008
