To analyse a data set, I have the following code in R:
da_if <- function(x, TYPE) {
  # Skip the first 61 lines (they only contain settings) and read six columns
  data <- read.csv(x, sep = "", skip = 61, na.strings = "NA", header = FALSE,
                   col.names = paste0("V", seq_len(6)), fill = TRUE)
  data$V2 <- as.factor(data$V2)
  data$V5 <- as.numeric(data$V5)
  data$V6 <- as.numeric(data$V6) # 8&9 are needed in order to sum up values and get rid of two words!
  y <- split(data, data$V2)
  if (TYPE == "SCALES") {
    y <- y$SCALES
    y[is.na(y)] <- 0
    y$V5 <- y$V5 + y$V6
    y <- y[, -6]
    return(y)
  } else if (TYPE == "TRACK") {
    y <- y$TRACK
    return(y)
  } else if (TYPE == "RESMAN") {
    y <- y$RESMAN
    return(y)
  } else if (TYPE == "COMMUN") {
    y <- y$COMMUN
    return(y)
  } else {
    print("Insert valid datatype...")
  }
}

I have a list of files to analyse, generated by this code:
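fta <- list.files(pattern = ".log", full.names = TRUE) # files to analyse

Is there a way to change my function so that the analysis runs on all 32 data sets at once? I tried using apply, but frankly I could not work out how to store the results in 32 different variables for further analysis, and for some reason I am too dumb to write a for loop that does anything. For now I only want to analyse the SCALES entries in my data.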
Thanks in advance, everyone!

Edit: My raw data looks like this. It is part of the NASA toolbox multitasking exercise:
14:29:00.467154 TRACK STATE CURSOR X 0.012751340110832256
14:29:00.467154 TRACK STATE CURSOR Y -0.08704373265652304
14:29:00.487683 TRACK STATE CURSOR X 0.012479403159392622
14:29:00.488668 TRACK STATE CURSOR Y -0.08733692625790845
14:29:00.491681 MAIN STATE PAUSE
14:29:00.515652 MAIN STATE GENERICSCALES START
14:30:53.308644 SCALES INPUT Mentale Anforderung 7
14:30:53.309640 SCALES INPUT Körperliche Beanspruchung 6
14:30:53.310467 SCALES INPUT Zeitdruck 5
14:30:53.311462 SCALES INPUT Leistung 3
14:30:53.311462 SCALES INPUT Anstrengung 7
14:30:53.312459 SCALES INPUT Frustration 5
14:30:53.316458 MAIN STATE RESUME
14:30:53.319470 MAIN STATE PUMPSTATUS STOP
14:30:53.320461 MAIN STATE RESMAN STOP
14:30:53.321456 MAIN STATE SYSMON STOP
14:30:53.322470 MAIN STATE COMMUNICATIONS STOP

Answered on 2020-11-14 11:11:09
Don't store the data in 32 different variables. Use lapply and store the data in a list:
list_data <- lapply(fta, da_if, TYPE = 'SCALES')

If you want to combine the list of data frames into a single data frame, you can use:
combine_data <- do.call(rbind, list_data)

https://stackoverflow.com/questions/64833236
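To make the list approach from the answer a bit more concrete, here is a minimal sketch of how the results could be accessed by file name and how each row could keep track of the file it came from. It assumes a toy stand-in function, `fake_da_if`, in place of `da_if` so that it runs without any log files. The file names and the added `file` column are also hypothetical and not part of the original question or answer.

```r
# Hypothetical stand-in for da_if(): returns a one-row data frame per "file".
# It exists only so this example runs without real .log files.
fake_da_if <- function(x, TYPE) {
  data.frame(V2 = TYPE, V5 = nchar(x), stringsAsFactors = FALSE)
}

# Hypothetical file list, in the shape list.files(full.names = TRUE) returns
fta <- c("./a.log", "./bb.log", "./ccc.log")

# One list element per file, instead of one variable per file
list_data <- lapply(fta, fake_da_if, TYPE = "SCALES")

# Name the elements after their files so they can be looked up directly
names(list_data) <- basename(fta)
one_result <- list_data[["bb.log"]]

# Record the source file in each data frame before stacking them,
# so rows can still be told apart after rbind
list_data <- Map(function(d, f) {
  d$file <- f
  d
}, list_data, names(list_data))

combine_data <- do.call(rbind, list_data)
```

With real data, `fake_da_if` would be replaced by `da_if` and `fta` by the output of `list.files()`. A single result is then available as `list_data[["some_file.log"]]`, so 32 separate variables are not needed.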