首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >在向量化函数中调用向量化函数

在向量化函数中调用向量化函数
EN

Stack Overflow用户
提问于 2020-07-07 18:09:46
回答 1查看 67关注 0票数 1

我正在为一系列的认知测试编写评分代码,这些测试是通过电池进行的。在下面的示例中,我有一个名为SHAPES_v1的虚拟测试,但在我的应用程序中,有许多不同版本的测试。我试图通过使用sapply()和Vectorize()来向量化我的函数,但是输出(scored_battery_1)与我试图实现的输出(desired_output)不匹配。当我在每个函数中对一个示例项运行单独的调用时,一切都正常,所以我非常肯定我的矢量化失败了。我已经实现了Vectorize(),sapply()注释掉了。Vectorize()方法包含正确的输出,但仍然具有初始变量,是嵌套列表而不是数据框架。知道我做错了什么吗?

代码语言:javascript
复制
library('dplyr')

battery_1 <- data.frame(PID=paste0('PID', 1:5), SHAPES_v1_QID1_RESP=c(rep(4, 3),
  rep(2, 2)), SHAPES_v1_QID2_RESP=c(rep(2, 3), rep(3, 2)),
  LETTERS_v1_QID1_RESP=c(rep(5, 3), rep(2, 2)),
  LETTERS_v1_QID2_RESP=c(rep(5, 1), rep(6, 4)))

SHAPES_v1 <- data.frame(QID=1:2, CorrectResponse=c(4, 3))

LETTERS_v1 <- data.frame(QID=1:2, CorrectResponse=c(5, 6))

########### Simplify names
simpNames <- function(i, varnames) {
  return(paste(varnames[[i]][1], varnames[[i]][2], sep='_'))
}
simpNames <- Vectorize(simpNames, vectorize.args='i', SIMPLIFY=TRUE)

########### Score a specific item
scoreItem <- function(battery, answers, item, num) {
  corrItem <- gsub('RESP', 'CORR', item)
  ans <- answers[answers$QID == num, 'CorrectResponse']
  battery <- battery %>% mutate_at( .funs = funs(ifelse(. == ans,
                            yes = 1, no = 0)), .vars = item)
  names(battery)[names(battery) == item] <- corrItem
  return(battery)
}
scoreItem <- Vectorize(scoreItem, vectorize.args=c('item', 'num'), SIMPLIFY=FALSE)

########### Score a specific test
scoreTest <- function(battery, test) {
  if (exists(test) & length(grep('DISC', test)) == 0) {
    answers <- get(test)

    # List items
    items <- paste0(test, '_', 'QID', answers$QID, '_RESP')
    nums <- answers$QID

    # Score items
    battery <- scoreItem(battery, answers, items, nums)
    #battery <- sapply(1:length(nums), function(i) scoreItem(battery, answers, items[i], nums[i]))
  } else {
    print(paste('Answer key does not exist for', test))
  }
  return(battery)
}
scoreTest <- Vectorize(scoreTest, vectorize.args=c('test'), SIMPLIFY=FALSE)

########### Score the whole battery
score <- function(battery) {
  varnames <- names(battery)[!(names(battery) %in% grep('PID', names(battery), value=TRUE))]
  varnames <- strsplit(varnames, '_')
  varnames <- simpNames(1:length(varnames), varnames)
  tests <- unique(varnames)

  # Score a specific test
  battery <- scoreTest(battery, tests)
  #battery <- sapply(1:length(tests), function(i) scoreTest(battery, tests[i]))

  return(battery)
}

#################### Score the batteries ####################
scored_battery_1 <- score(battery_1)
scored_battery_1

####################### Desired Output ######################
desired_output <- data.frame(PID=paste0('PID', 1:5), SHAPES_v1_QID1_CORR=c(rep(1, 3),
  rep(0, 2)), SHAPES_v1_QID2_CORR=c(rep(0, 3), rep(1, 2)),
  LETTERS_v1_QID1_CORR=c(rep(1, 3), rep(0, 2)),
  LETTERS_v1_QID2_CORR=c(rep(0, 1), rep(1, 4)))
desired_output
EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2020-07-07 18:49:49

不知怎么的,我觉得你把一些事情搞得太复杂了。

我尝试完成您所描述的相同的输出。请告诉我以下几点是否适用于您:

代码语言:javascript
复制
library(dplyr)
library(tidyr)
library(purrr)

score <- function(battery) {
  battery %>%
    pivot_longer(-PID, names_to = 'response_id', values_to = 'response_value') %>%
    mutate(
      test_name = str_extract(response_id, '^[^_]+_[^_]+(?=_)'),
      QID = as.integer(str_extract(response_id, '(?<=QID)\\d+(?=_)'))
    ) %>%
    filter(test_name %in% ls(envir = .GlobalEnv)) %>%
    split(f = .$test_name) %>%
    imap(.f = function(test_results, test_name){
      test_results %>%
        left_join(get(test_name), by = 'QID') %>%
        filter(!is.na(CorrectResponse)) %>%
        mutate(
          is_correct = as.integer(response_value == CorrectResponse)
        )
    }) %>%
    do.call(bind_rows, .) %>%
    select(PID, response_id, is_correct) %>%
    spread(key = response_id, value = is_correct)
}

这实际上是在做以下工作:

scoring

  • filter

  • 将响应列转换为pivot_longer的行表示形式,将PID列保留在

  • 提取test_nameQID的位置,我认为只有在响应加载了

H 112将数据分解为列表的情况下,才需要对test_nameQID进行测试,这样我们就可以……H 213H 114.在每个块上左加入正确的响应df,然后将测试

  • 重新加入到

  • 中,一旦

  • 只选择PID列,原始列名和我们的得分

  • 将这些重新展开成列格式

( Tada :)

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/62781421

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档