I am writing a function that downloads multiple .csv files from a GitHub repository and initially stores them in a single (long-format) tibble, like this:
# write different endings of urls "by hand" with 'ctrl-c' & 'ctrl-v' to get a list.
hobo_id <- c("10088310_Th.csv", "10234637_Th.csv", "10347313_Th.csv", "10347320_Th.csv", "10347321_th.csv", "10347327_Th.csv", "10347328_Th.csv", "10347356_Th.csv", "10347362_Th.csv", "10347366_Th.csv", "10347384_Th.csv", "10347394_Th.csv", "10350002_Th.csv ", "10350005_Th.csv", "10350049_Th.csv", "10610854_Th.csv", "10760709_Th.csv", "10760710_Th.csv", "10760811_Th.csv", "10760820_Th.csv", "10760822_Th.csv", "10801139_th.csv", "10801141_Th.csv")
library(readr)  # provides read_csv()

# import function:
import_csv <- function(hobo_id){
  # create urls
  HOBO_urls <- paste0('https://raw.githubusercontent.com/data-hydenv/data/master/hobo/2022/hourly/', hobo_id)
  # HOBO_urls is a character vector with one link per file, which read_csv will download in the next step
  # read in files
  hobo_coll <- read_csv(as.character(HOBO_urls))
  return(hobo_coll)
}
hobo_coll <- import_csv(hobo_id)

This works so far. However, I would like to add a column named 'ID'.
One of my approaches was this:
library(dplyr)      # provides mutate() and %>%
library(lubridate)  # provides parse_date_time()

import_csv <- function(hobo_id){
  # create urls
  HOBO_urls <- paste0('https://raw.githubusercontent.com/data-hydenv/data/master/hobo/2022/hourly/', hobo_id)
  # read in files
  hobo_coll <- read_csv(as.character(HOBO_urls))
  # add column ID
  hobo_coll1 <- hobo_coll %>%
    mutate(dttm = parse_date_time(dttm, "%Y-%m-%d %H:%M:%S")) %>%
    mutate(ID = ifelse(dttm >= "2021-12-13 00:00:00" & dttm <= "2022-01-09 23:00:00", hobo_id, NA))
  return(hobo_coll1)
}
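}

This runs, but it does not give the result I want: for 4032 rows (from "2021-12-13 00:00:00" to "2022-01-09 23:00:00") the ID from 'hobo_id' should stay the same, then change to the next ID (hobo_id[2]), then after the next period of 4032 rows switch to the one after that (hobo_id[3]), and so on.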
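To make the desired layout concrete, here is a minimal base-R sketch on toy data. The three file names and the block size of 4 are hypothetical stand-ins for the 23 real IDs and the 4032 rows per file. It shows that ifelse() recycles a shorter `yes` vector element by element, and that rep(..., each = n) produces the block-wise pattern instead. This only works under the assumption that every file contributes exactly the same number of rows and that the rows are stacked in the same order as the IDs.

```r
# Hypothetical stand-ins for the real file IDs and the rows per file
ids <- c("a_Th.csv", "b_Th.csv", "c_Th.csv")
rows_per_file <- 4

# Pretend every row falls inside the time window
in_window <- rep(TRUE, length(ids) * rows_per_file)

# Passing the short vector directly: ifelse() recycles it row by row
recycled <- ifelse(in_window, ids, NA)

# Repeating each ID for a whole block gives one ID per file-sized block
blocked <- ifelse(in_window, rep(ids, each = rows_per_file), NA)

print(recycled)
print(blocked)
```

In the original function this would correspond to replacing `hobo_id` inside ifelse() with `rep(hobo_id, each = 4032)`, again assuming equal-length files.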
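I suspect there is a way to do this with tidyr::extract(), but I can't figure out how.
I also considered a for loop, but would prefer to stick with the import_csv() function approach.
Thanks in advance for any help!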
Answered 2022-01-25 15:25:07
Use the function argument directly, without any indexing, and change the line
mutate(ID = ifelse(dttm >= "2021-12-13 00:00:00" & dttm <= "2022-01-09 23:00:00", .[[hobo_id]], NA))
to
mutate(ID = ifelse(dttm >= "2021-12-13 00:00:00" & dttm <= "2022-01-09 23:00:00", hobo_id, NA))
https://stackoverflow.com/questions/70851068
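An alternative that does not depend on a fixed row count is to read each file separately, attach its ID, and bind the results. The sketch below is only one possible approach, not the answer's method: the function name `import_csv_with_id` and the `base_url` parameter are hypothetical, and it uses small local temporary files in place of the GitHub URLs so that it runs offline.

```r
library(readr)
library(dplyr)

# Hypothetical variant of import_csv(): reads one file per ID and tags its rows
import_csv_with_id <- function(hobo_id, base_url) {
  files <- lapply(hobo_id, function(id) {
    df <- read_csv(paste0(base_url, id), show_col_types = FALSE)
    df$ID <- id  # every row of this file gets the same ID
    df
  })
  bind_rows(files)  # stack all files into one long tibble
}

# Stand-in data: two tiny files in a temporary directory (made-up values)
tmp <- tempfile("hobo_demo_")
dir.create(tmp)
demo_ids <- c("111_Th.csv", "222_Th.csv")
for (f in demo_ids) {
  write_csv(
    data.frame(dttm = c("2021-12-13 00:00:00", "2021-12-13 01:00:00"),
               temp = c(1.5, 2.5)),
    file.path(tmp, f)
  )
}

hobo_demo <- import_csv_with_id(demo_ids, base_url = paste0(tmp, "/"))
print(hobo_demo)
```

With the real data, `base_url` would be the raw.githubusercontent.com prefix from the question. readr 2.0 and later also offer an `id` argument to read_csv() that stores each row's source file path in a column; whether that column holds the original URL for remote files is worth checking before relying on it.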