Wrapping long code using a loop or a similar function

Stack Overflow user
Asked 2020-11-20 08:54:08
Answers 2 · Views 46 · Followers 0 · Votes 2

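I was wondering whether there is any way to wrap the code below to make it shorter; I am thinking of using a loop or a similar function to do this. The code uses AgeatDeath and Disability to generate a new variable (cat). For example, if AgeatDeath is between 75.6 and 77.1 and Disability equals 'No Intelectual and Developmental Disabilities', the code sets cat to '75.6-77.1'. Thanks, Nader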

IDD <- IDD %>%
      mutate(
        cat = case_when(
          AgeatDeath >= 75.6 &
            AgeatDeath < 77.1  &
            Disability == 'No Intelectual and Developmental Disabilities' ~ '75.6-77.1',
          AgeatDeath >= 74.3 &
            AgeatDeath < 75.6  &
            Disability == 'No Intelectual and Developmental Disabilities' ~ '74.3-75.6',
          AgeatDeath >= 72.5 &
            AgeatDeath < 74.3  &
            Disability == 'No Intelectual and Developmental Disabilities' ~ '72.5-74.3',
          AgeatDeath >= 66.5 &
            AgeatDeath < 72.5  &
            Disability == 'No Intelectual and Developmental Disabilities' ~ '66.6-72.5',
          
          AgeatDeath >= 64.1 &
            AgeatDeath < 71.9  &
            Disability == 'Intellectual disability' ~ '64.1-71.9',
          AgeatDeath >= 62.3 &
            AgeatDeath < 64.1  &
            Disability == 'Intellectual disability' ~ '62.3-64.1',
          AgeatDeath >= 59.4 &
            AgeatDeath < 62.3  &
            Disability == 'Intellectual disability' ~ '59.4-62.3',
          AgeatDeath >= 50.4 &
            AgeatDeath < 59.4  &
            Disability == 'Intellectual disability' ~ '50.4-59.4',
          
          AgeatDeath >= 56.47 &
            AgeatDeath < 59.1  &
            Disability == 'Down syndrome' ~ '56.47-59',
          AgeatDeath >= 55.59 &
            AgeatDeath < 56.47  &
            Disability == 'Down syndrome' ~ '55.59-56.47',
          AgeatDeath >= 54.42 &
            AgeatDeath < 55.59  &
            Disability == 'Down syndrome' ~ '54.42-55.59',
          AgeatDeath >= 50.92 &
            AgeatDeath < 54.42  &
            Disability == 'Down syndrome' ~ '50.92-54.42',
          
          AgeatDeath >= 53.3 &
            AgeatDeath < 58.2  &
            Disability == 'Cerebral palsy' ~ '53.3-58.2',
          AgeatDeath >= 50.6 &
            AgeatDeath < 53.3  &
            Disability == 'Cerebral palsy' ~ '50.6-53.3',
          AgeatDeath >= 48.9 &
            AgeatDeath < 50.6  &
            Disability == 'Cerebral palsy' ~ '48.9-50.6',
          AgeatDeath >= 41.38 &
            AgeatDeath < 48.9  &
            Disability == 'Cerebral palsy' ~ '41.4-48.9',
          
          AgeatDeath >= 44.2 &
            AgeatDeath < 51.1  &
            Disability == 'Other rare developmental disabilities' ~ '44.2-51',
          AgeatDeath >= 41.6 &
            AgeatDeath < 44.2  &
            Disability == 'Other rare developmental disabilities' ~ '41.6-44.2',
          AgeatDeath >= 30.6 &
            AgeatDeath < 38.4  &
            Disability == 'Other rare developmental disabilities' ~ '30.6-38.4',
          AgeatDeath >= 38.4 &
            AgeatDeath < 41.6  &
            Disability == 'Other rare developmental disabilities' ~ '38.4-41.6'
        )
      )

2 Answers

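Stack Overflow user

Answered 2020-11-20 09:50:40

A bit of subsetting and the function cut() can go a long way. What I'll demonstrate doesn't involve dplyr.

First create an empty new variable. The rest of the code fills it in with just a few lines.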

IDD$cat <- NA_character_

Next, create a list with the values of Disability as names and the corresponding cut points as elements. We'll iterate over this list.

L <- list(
`No Intelectual and Developmental Disabilities` = c(66.6, 72.5, 74.3, 75.6, 77.1),
`Intellectual disability` = c(50.4, 59.4, 62.3, 64.1, 71.9)
)

You can fill in the rest. Now we'll use a loop to subset by each value of Disability, use cut() to divide the ages into categories, and rename those categories.

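for (d in names(L)) {
   # rows belonging to this disability group (which() drops NA comparisons)
   rows <- which(IDD$Disability == d)
   # cut only those rows, so both sides of the assignment have the same length
   IDD$cat[rows] <- as.character(
                       cut(IDD$AgeatDeath[rows],
                         breaks = L[[d]],
                         # one label per interval: each break paired with the next
                         labels = paste(head(L[[d]], -1), L[[d]][-1], sep = "-"),
                         include.lowest = TRUE,
                         right = FALSE))
}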

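cut() splits AgeatDeath according to the break points we supplied in L, and we label the intervals from those same break points. right = FALSE makes each category include its lower bound and exclude its upper bound, and include.lowest = TRUE ensures that any value exactly at the top break is included in the highest category. We use as.character() to make sure the result is a character vector rather than a factor.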
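To see how those two arguments behave at the boundaries, here is a minimal sketch on made-up data. The sample ages are an assumption chosen to hit the break points; the column names and the breaks for 'Intellectual disability' are taken from the code above.

```r
# Hypothetical sample data; the column names follow the question.
IDD <- data.frame(
  AgeatDeath = c(50.4, 59.4, 64.0, 71.9, 80.0),
  Disability = "Intellectual disability"
)

breaks <- c(50.4, 59.4, 62.3, 64.1, 71.9)

# One label per interval: pair each break with the next one.
labels <- paste(head(breaks, -1), breaks[-1], sep = "-")

IDD$cat <- as.character(
  cut(IDD$AgeatDeath,
      breaks = breaks,
      labels = labels,
      include.lowest = TRUE,  # keep 71.9 in the top interval
      right = FALSE)          # intervals are [lower, upper)
)

print(IDD)
```

An age equal to an inner break (59.4) lands in the interval that starts there, the top break (71.9) is kept in the last interval, and an age outside the breaks (80.0) is left as NA.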

Votes 1

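Stack Overflow user

Answered 2020-11-20 10:01:43

Whichever approach you take, you still need to store the thresholds and conditions somewhere. Right now they are hard-coded in your code, but they could be moved into a table.

Consider setting up a table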

order | min_age | max_age | disability
------+---------+---------+------------------------------------------------
1     | 75.6    | 77.1    | 'No Intelectual and Developmental Disabilities'
2     | 74.3    | 75.6    | 'No Intelectual and Developmental Disabilities'
etc.
...
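One way such a table might be held in R is a plain data frame. This is only a sketch: the name table_w_conditions and the choice to store the single quotes inside the disability values are assumptions, and only the two rows shown above are included.

```r
# Hypothetical construction of the lookup table; only the two rows shown
# in the answer are included, the remaining rows follow the same pattern.
table_w_conditions <- data.frame(
  order      = c(1, 2),
  min_age    = c(75.6, 74.3),
  max_age    = c(77.1, 75.6),
  # The single quotes are part of the stored value so that the text can
  # later be pasted into an expression as a string literal.
  disability = c("'No Intelectual and Developmental Disabilities'",
                 "'No Intelectual and Developmental Disabilities'"),
  stringsAsFactors = FALSE
)

print(table_w_conditions)
```

Keeping the thresholds in a table like this means a change to a cut point is a data edit rather than a code edit.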

You can then use this table to build the conditions, following the parse_exprs approach from this question:

library(dplyr)
library(rlang)  # provides parse_exprs()

# loading of condition table
# other setup
# etc.

# ensure conditions are in the preferred order
twc = table_w_conditions %>%
  arrange(order)

# make text strings of conditions
# (paste0 avoids stray spaces inside the label; the disability column
#  is assumed to already contain the surrounding quotes, as in the table)
conditions = paste0("AgeatDeath >= ", twc$min_age,
                    " & AgeatDeath < ", twc$max_age,
                    " & Disability == ", twc$disability,
                    " ~ '", twc$min_age, "-", twc$max_age, "'")

# mutate treating text strings as code
IDD <- IDD %>%
  mutate(
        cat = case_when(!!!parse_exprs(conditions))
  )

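If you take this approach, I recommend checking, before you use conditions, that it really is a vector of text strings containing the correct condition text.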
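A minimal sketch of such a check, using base R only. The thresholds are the two rows from the table above, the strings are assembled with paste0(), and the helper names (parsed, is_formula) are made up for illustration.

```r
# Hypothetical thresholds; in practice these come from the condition table.
min_age    <- c(75.6, 74.3)
max_age    <- c(77.1, 75.6)
disability <- rep("'No Intelectual and Developmental Disabilities'", 2)

conditions <- paste0("AgeatDeath >= ", min_age,
                     " & AgeatDeath < ", max_age,
                     " & Disability == ", disability,
                     " ~ '", min_age, "-", max_age, "'")

# Read the generated text before handing it to case_when().
writeLines(conditions)

# Base R's parse() fails on malformed text, so this catches syntax errors
# such as a missing quote. It does not check that the thresholds are right.
parsed <- lapply(conditions, function(txt) parse(text = txt)[[1]])

# Each parsed condition should be a two-sided formula (condition ~ label).
is_formula <- vapply(parsed,
                     function(e) identical(e[[1]], as.name("~")),
                     logical(1))
print(is_formula)
```

This only confirms that each string is syntactically a `condition ~ label` formula; whether the numbers and labels are the intended ones still has to be checked by eye.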

Votes 1
Page content originally provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/64922211
