我有这两份资料:
df1<-structure(list(Measures = c("space and shape", "space and shape",
"space and shape", "space and shape", "space and shape", "change and relationships",
"change and relationships", "change and relationships", "change and relationships",
"change and relationships", "space and shape", "space and shape",
"space and shape", "space and shape", "uncertainty and data",
"quantity", "uncertainty and data", "uncertainty and data", "uncertainty and data",
"quantity", "change and relationships", "change and relationships",
"space and shape", "space and shape", "space and shape", "quantity",
"quantity", "quantity", "quantity", "quantity", "uncertainty and data",
"change and relationships", "quantity", "quantity", "uncertainty and data",
"change and relationships", "uncertainty and data", "quantity",
"change and relationships", "change and relationships", "quantity",
"quantity", "quantity", "quantity", "quantity", "quantity", "change and relationships",
"uncertainty and data", "change and relationships", "uncertainty and data",
"uncertainty and data", "uncertainty and data", "quantity", "quantity",
"quantity", "space and shape", "change and relationships", "quantity",
"space and shape", "space and shape", "change and relationships",
"change and relationships", "uncertainty and data", "uncertainty and data",
"quantity", "change and relationships", "quantity", "change and relationships",
"space and shape", "quantity", "quantity", "quantity", "space and shape",
"space and shape", "space and shape", "uncertainty and data",
"uncertainty and data", "uncertainty and data", "change and relationships",
"change and relationships", "change and relationships", "uncertainty and data",
"uncertainty and data", "uncertainty and data", "change and relationships",
"change and relationships", "change and relationships", "change and relationships",
"change and relationships", "uncertainty and data", "space and shape",
"space and shape", "uncertainty and data", "uncertainty and data",
"uncertainty and data", "uncertainty and data", "uncertainty and data",
"quantity", "quantity", "space and shape", "space and shape",
"space and shape", "space and shape", "change and relationships",
"space and shape", "space and shape", "quantity", "change and relationships",
"change and relationships"), Format = c("Constructed Response Expert",
"Constructed Response Manual", "Constructed Response Expert",
"Simple Multiple Choice", "Constructed Response Auto-coded",
"Constructed Response Expert", "Constructed Response Expert",
"Constructed Response Expert", "Complex Multiple Choice", "Complex Multiple Choice",
"Complex Multiple Choice", "Simple Multiple Choice", "Constructed Response Expert",
"Constructed Response Expert", "Complex Multiple Choice", "Constructed Response Manual",
"Simple Multiple Choice", "Complex Multiple Choice", "Simple Multiple Choice",
"Constructed Response Manual", "Constructed Response Manual",
"Constructed Response Expert", "Simple Multiple Choice", "Constructed Response Expert",
"Constructed Response Auto-coded", "Constructed Response Manual",
"Complex Multiple Choice", "Constructed Response Manual", "Simple Multiple Choice",
"Simple Multiple Choice", "Simple Multiple Choice", "Simple Multiple Choice",
"Complex Multiple Choice", "Simple Multiple Choice", "Constructed Response Auto-coded",
"Constructed Response Expert", "Constructed Response Manual",
"Constructed Response Manual", "Constructed Response Expert",
"Constructed Response Manual", "Complex Multiple Choice", "Constructed Response Expert",
"Simple Multiple Choice", "Constructed Response Expert", "Constructed Response Manual",
"Simple Multiple Choice", "Constructed Response Expert", "Simple Multiple Choice",
"Constructed Response Manual", "Simple Multiple Choice", "Simple Multiple Choice",
"Simple Multiple Choice", "Constructed Response Manual", "Constructed Response Manual",
"Simple Multiple Choice", "Simple Multiple Choice", "Constructed Response Expert",
"Constructed Response Manual", "Constructed Response Manual",
"Simple Multiple Choice", "Constructed Response Manual", "Constructed Response Expert",
"Simple Multiple Choice", "Simple Multiple Choice", "Simple Multiple Choice",
"Constructed Response Expert", "Constructed Response Manual",
"Simple Multiple Choice", "Constructed Response Expert", "Simple Multiple Choice",
"Constructed Response Manual", "Constructed Response Expert",
"Complex Multiple Choice", "Complex Multiple Choice", "Constructed Response Expert",
"Constructed Response Expert", "Constructed Response Manual",
"Constructed Response Expert", "Constructed Response Manual",
"Constructed Response Expert", "Constructed Response Expert",
"Constructed Response Manual", "Constructed Response Expert",
"Constructed Response Expert", "Simple Multiple Choice", "Simple Multiple Choice",
"Constructed Response Manual", "Constructed Response Expert",
"Simple Multiple Choice", "Constructed Response Expert", "Constructed Response Manual",
"Complex Multiple Choice", "Constructed Response Manual", "Constructed Response Manual",
"Complex Multiple Choice", "Simple Multiple Choice", "Simple Multiple Choice",
"Simple Multiple Choice", "Constructed Response Manual", "Simple Multiple Choice",
"Constructed Response Expert", "Constructed Response Manual",
"Constructed Response Manual", "Constructed Response Expert",
"Constructed Response Manual", "Constructed Response Expert",
"Simple Multiple Choice", "Constructed Response Manual", "Complex Multiple Choice"
)), row.names = c(NA, -109L), class = "data.frame")
df2<- structure(list(Measures = c("space and shape", "space and shape",
"space and shape", "space and shape"), Format = c("Constructed Response Expert",
"Constructed Response Manual", "Simple Multiple Choice", "Constructed Response Auto-coded"
)), row.names = c(1L, 2L, 4L, 5L), class = "data.frame")我使用这段代码来子集第一个dataframe的所有行,这些行都在第二个dataframe中。
library(tidyverse)
inner_join(df1, df2)但是,我也希望计数,第一次数据中为每一行df2可用的行数,并将这些计数存储在一个向量中。这有点难以用文字来解释,所以我制作了一个简短的视频来解释:
我试着自己做这件事,并创建了一个for循环来完成这个任务。这是我第一次真正使用循环,它不幸地失败了:
for(i in 1:nrow(df2)) {
b<- data.frame('NA','NA')
b[i,]<- inner_join(df1, df2[i,])
num[i]<- nrow(b)
}当我打印数字时
[1] 1 2 3 4 5 6 7 8发布于 2021-11-26 20:17:33
一种策略可以不是使用循环,而是在df1中的组数在inner_join之前。有一种方法可能如下所示:
library(dplyr)
df1 %>%
count(across(everything())) %>%
inner_join(df2)
#> Joining, by = c("Measures", "Format")
#> Measures Format n
#> 1 space and shape Constructed Response Auto-coded 1
#> 2 space and shape Constructed Response Expert 2
#> 3 space and shape Constructed Response Manual 1
#> 4 space and shape Simple Multiple Choice 1我选择了df1的一个子集,因为它非常大:
df1 <- structure(list(
Measures = c(
"space and shape",
"space and shape",
"space and shape",
"space and shape",
"space and shape",
"change and relationships"
),
Format = c(
"Constructed Response Expert",
"Constructed Response Manual",
"Constructed Response Expert",
"Simple Multiple Choice",
"Constructed Response Auto-coded",
"Constructed Response Expert"
)
),
row.names = c(NA, 6L),
class = "data.frame"
)
df2 <- structure(list(
Measures = c(
"space and shape",
"space and shape",
"space and shape",
"space and shape"
),
Format = c(
"Constructed Response Expert",
"Constructed Response Manual",
"Simple Multiple Choice",
"Constructed Response Auto-coded"
)
),
row.names = c(1L, 2L, 4L, 5L),
class = "data.frame")https://stackoverflow.com/questions/70129541
复制相似问题