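I'm very new to programming and have recently started using R for data analysis. I'm trying to add a new column to my data frame based on the values in another column, and then sum the total rainfall for each group. I got my climate data from the PRISM Climate Group site and added the following code to split the date field (formatted like 1980-01) into year and month:
climate <- tidyr::separate(climate, date, c("year", "month"), sep = "-")
My question is: how do I go about adding a new column that holds text based on the month?
My current pseudocode approach is:
if climate$month is 1, 2, or 3 then climate$season = winter
else if climate$month is 4, 5, or 6 then climate$season = spring
else if climate$month is 7, 8, or 9 then climate$season = summer
else if climate$month is 10, 11, or 12 then climate$season = fall
My goal is to produce a new data frame holding the total rainfall for each season of each year, while avoiding Excel.
Thanks for any suggestions!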
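For anyone following along, here is a minimal sketch of the date split described in the question. The toy data frame and its values are made up for illustration and are not PRISM data. It also uses the `convert = TRUE` argument of `tidyr::separate`, which the question did not use, so that the new columns come out as integers rather than text.

```r
library(tidyr)

# Toy stand-in for the PRISM export; the values are invented for illustration.
climate <- data.frame(
  date   = c("1980-01", "1980-04", "1980-07", "1980-10"),
  ppt_mm = c(10.2, 35.1, 80.4, 22.7)
)

# convert = TRUE asks separate() to type-convert the new columns,
# so "01" becomes the integer 1 instead of staying a character string.
climate <- separate(climate, date, c("year", "month"), sep = "-", convert = TRUE)

str(climate)
```

With integer months, comparisons such as `month < 4` work directly, without a separate conversion step.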
Solved. Here is the final working code, for future reference:
# Read in the PRISM data
prism <- read.csv('PRISM.csv')
# Separate the date into year and month
prism <- tidyr::separate(prism, date, c("year", "month"), sep = "-")
# Convert the month column (character or factor) to numeric
library(dplyr)
prism <- prism %>% mutate(month = as.numeric(as.character(month)))
# Generate a new season column based on month
prism <- prism %>% mutate(season = case_when(
  month < 4 ~ "winter",
  month < 7 ~ "spring",
  month < 10 ~ "summer",
  month < 13 ~ "fall",
  TRUE ~ NA_character_
))
# Generate a new data frame with the sum of each variable per year and season
clima <- prism %>%
  group_by(year, season) %>%
  summarise(ppt_mm = sum(ppt_mm), tmin_c = sum(tmin_c), tmean_c = sum(tmean_c),
            tmax_c = sum(tmax_c), vdpmin_hpa = sum(vdpmin_hpa), vdpmax_hpa = sum(vdpmax_hpa))
# Split by season
spring <- clima[clima$season == "spring", ]
summer <- clima[clima$season == "summer", ]
fall <- clima[clima$season == "fall", ]
winter <- clima[clima$season == "winter", ]

Posted on 2020-05-02 13:51:35
You can use dplyr::case_when. It is cleaner than chained conditionals:
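library(dplyr)
# Assumed example input: months 1 to 13 (13 exercises the NA fallback)
df <- data.frame(month = 1:13)
# Conditions are checked in order, so each month gets the first season that matches;
# month < 4 puts months 1 to 3 in winter, as the question asks
df %>% mutate(season = case_when(
  month < 4 ~ "winter",
  month < 7 ~ "spring",
  month < 10 ~ "summer",
  month < 13 ~ "fall",
  TRUE ~ NA_character_
))
   month season
1      1 winter
2      2 winter
3      3 winter
4      4 spring
5      5 spring
6      6 spring
7      7 summer
8      8 summer
9      9 summer
10    10   fall
11    11   fall
12    12   fall
13    13   <NA>

https://stackoverflow.com/questions/61555037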
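As a follow-up to the per-season totals in the asker's final code, the sketch below shows one possible way to avoid writing `sum()` once per column and to avoid subsetting once per season. It is an alternative, not the method used in the thread. It assumes dplyr 1.0.0 or later (for `across()` and the `.groups` argument). The toy data and the `value_cols` and `by_season` names are made up for illustration.

```r
library(dplyr)

# Toy data shaped like the season-labelled table; the values are invented.
prism <- data.frame(
  year   = c(1980, 1980, 1980, 1980, 1981, 1981),
  season = c("winter", "winter", "spring", "spring", "winter", "winter"),
  ppt_mm = c(10, 20, 30, 40, 5, 15),
  tmax_c = c(1, 2, 3, 4, 5, 6)
)

# Columns to total; extend this vector to cover the remaining PRISM variables.
value_cols <- c("ppt_mm", "tmax_c")

# Sum every listed column within each year and season, then drop the grouping.
clima <- prism %>%
  group_by(year, season) %>%
  summarise(across(all_of(value_cols), sum), .groups = "drop")

# One data frame per season, held in a named list
# (by_season$winter, by_season$spring, and so on).
by_season <- split(clima, clima$season)

print(clima)
```

Keeping the seasons in one named list means a new season label, or a missing one, needs no extra line of subsetting code.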