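I am very new to loops in R, so I apologize if this question has already been asked elsewhere.
Read all 30 CSV files -> compare File A with each of the 30 CSV files by species -> write a new CSV file for each of the 30 files, containing only the matching species
File A has a single column ($name) with the names of 190 species. Each of the other 30 CSV files has a species column ($SBSname) containing a varying number of species, which can include 100-500 replicates (so a CSV file can have more than 190 rows). However, I don't know how to write the code for this...
This is all I have so far...
I have looped over all the CSV files: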
csv_files = list.files(pattern="*.csv")
for (i in 1:length(csv_files)) assign(csv_files[i], read.csv(csv_files[i]))
I have code that compares just one CSV file (branching.csv) with File A:
> str(FileA)
'data.frame': 190 obs. of 1 variable:
$ name: Factor w/ 190 levels "Acaena novae zelandiae",..: 1 2 3 4 5 6 7 8 9 10 ...
> str(branching.csv)
'data.frame': 4055 obs. of 7 variables:
$ SBSname : Factor w/ 2877 levels "Abies alba","Abies nordmanniana",..: 794 2075 1049 162 132 333 541 1840 272 1553 ...
$ SBS.number : int 16443 26711 40171 40398 40867 41151 37871 42412 35847 36245 ...
$ general.method : Factor w/ 5 levels "derivation from morphologies or other plant traits",..: 3 1 2 2 2 2 2 2 2 2 ...
$ branching : Factor w/ 2 levels "no","yes": 2 2 1 1 1 1 1 1 1 1 ...
$ valid : int 1 1 1 1 1 1 1 1 1 1 ...
$ reference : Factor w/ 6 levels "Barkman, J.J.(1988): New systems of plant growth forms and phenological plant types",..: 1 1 3 3 3 3 3 3 3 3 ...
$ original.reference: Factor w/ 97 levels "Aarssen, L.W. (1981): The biology of Canadian weeds. 50. Hypochoeris radicata L.",..: 9 9 20 3 3 3 3 3 33 33 ...
Species<-branching.csv[(branching.csv$SBSname %in% FileA$name),]
write.csv(Species, file = "Branching.csv")
> str(Species)
'data.frame': 298 obs. of 7 variables:
$ name : Factor w/ 2877 levels "Abies alba","Abies nordmanniana",..: 1049 162 1548 47 57 1647 1060 2788 2094 1976 ...
$ SBS.number : int 40171 40398 36280 40532 41629 42495 40103 32792 32892 30583 ...
$ general.method : Factor w/ 5 levels "derivation from morphologies or other plant traits",..: 2 2 2 2 2 2 2 2 2 2 ...
$ branching : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 2 1 2 ...
$ valid : int 1 1 1 1 1 1 1 1 1 1 ...
$ reference : Factor w/ 6 levels "Barkman, J.J.(1988): New systems of plant growth forms and phenological plant types",..: 3 3 3 3 3 3 3 3 3 3 ...
$ original.reference: Factor w/ 97 levels "Aarssen, L.W. (1981): The biology of Canadian weeds. 50. Hypochoeris radicata L.",..: 20 3 33 33 33 33 33 44 44 44 ...
Any help or advice would be great. It doesn't have to be a loop!
Posted on 2018-01-27 23:58:27
How about this simple loop?
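library(dplyr)
# List the CSV files to process
csv_files <- list.files(pattern = "\\.csv$")
for (i in seq_along(csv_files))
{
  # Keep only the rows whose SBSname matches a name in FileA
  csv.matching = read.csv(csv_files[i]) %>% inner_join(FileA, by = c("SBSname" = "name"))
  write.csv(csv.matching, file = gsub("\\.csv$", "_matching.csv", csv_files[i]), na = "")
}
https://stackoverflow.com/questions/48477257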
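Since the question says it does not have to be a loop, here is a minimal base R sketch of the same idea, using the `%in%` filter from the question instead of `inner_join`. The helper name `filter_species_file`, the `_matching.csv` suffix, the temporary demo directory, and the tiny demo data are all assumptions made up for illustration; only `SBSname`, `name`, and `FileA` come from the question. One difference worth noting: `%in%` only filters rows, whereas `inner_join` would also duplicate rows if File A ever contained the same species twice.

```r
# Hypothetical helper: keep only the rows of one CSV whose SBSname appears
# in the reference species list, then write them to a new "_matching" CSV.
filter_species_file <- function(path, species, out_dir = dirname(path)) {
  trait <- read.csv(path, stringsAsFactors = FALSE)
  matched <- trait[trait$SBSname %in% species, , drop = FALSE]
  out_path <- file.path(out_dir,
                        sub("\\.csv$", "_matching.csv", basename(path)))
  write.csv(matched, out_path, row.names = FALSE, na = "")
  out_path
}

# Made-up demo data in a temporary directory (not the asker's real files).
demo_dir <- file.path(tempdir(), "species_demo")
dir.create(demo_dir, showWarnings = FALSE)
FileA <- data.frame(name = c("Abies alba", "Acaena novae zelandiae"),
                    stringsAsFactors = FALSE)
write.csv(data.frame(SBSname = c("Abies alba", "Abies nordmanniana", "Abies alba"),
                     branching = c("yes", "no", "no")),
          file.path(demo_dir, "branching.csv"), row.names = FALSE)

# List the input CSVs, skipping any output files left over from earlier runs.
trait_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
trait_files <- trait_files[!grepl("_matching\\.csv$", trait_files)]

# Apply the helper to every file; the result holds the paths that were written.
written <- vapply(trait_files, filter_species_file, character(1),
                  species = as.character(FileA$name))
```

With real data, File A's own CSV would probably need to be kept out of `trait_files` as well (for example by storing it in a different folder), because `list.files` returns every CSV in the directory.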