Q: How do I use combinat::combn() in RStudio?

Stack Overflow user

Asked 2021-04-21 01:01:13

1 answer · 61 views · 0 followers · 0 votes

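I have started using RStudio, but I am having trouble understanding how the combinat::combn() function works. I really cannot see how to use it in the following academic exercise:

Please use the dataset "mtcars" and answer:
1- Build an OLS model for the fuel efficiency variable, this time limiting the number of predictors.
2- Out of the 10 predictors, restrict the number of predictors to between 1 and at most 5.
3- Create a routine that iterates over all possible permutations and derives the "top 3" models that are likely to perform best in terms of effect size and significance.

The steps I want to follow are:

1- First, I need to obtain all the permutations and evaluate the models with a suitable KPI. I think the whole thing has to be wrapped in a loop, based on what I have read so far in the documentation for combinat::combn(). -> The problem is that I have searched the internet, without success, for an example that would show me how to start building the solution. I explored the data in several ways and ended up using the "lm" function, but I do not know how to use it together with the "combn" function. Perhaps it is not the best or most suitable choice.
2- Then select the best KPI.

Below is what I have built so far to explore the data, with my limited knowledge of RStudio and the help of various internet sources: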

version  # I include the R version I am using in case it matters later

# platform       x86_64-w64-mingw32          
# arch           x86_64                      
# os             mingw32                     
# system         x86_64, mingw32             
# status                                     
# major          4                           
# minor          0.4                         
# year           2021                        
# month          02                          
# day            15                          
# svn rev        80002                       
# language       R                           
# version.string **R version 4.0.4 (2021-02-15)**
# nickname       Lost Library Book.

library(tidyverse) 
#  25-03-2021

#  -- Attaching packages ------------------------------------------------------------------------------ tidyverse 1.3.0 --
#  v ggplot2 3.3.3     v purrr   0.3.4
#  v tibble  3.1.0     v dplyr   1.0.5
#  v tidyr   1.1.3     v stringr 1.4.0
#  v readr   1.4.0     v forcats 0.5.1
#  -- Conflicts --------------------------------------------------------------------------------- tidyverse_conflicts() --
#  x dplyr::filter() masks stats::filter()
#  x dplyr::lag()    masks stats::lag()


data("mtcars")
view(mtcars)
?mtcars

#A data frame with 32 observations on 11 (numeric) variables.
#[, 1]  mpg Miles/(US) gallon --------->>>>>>> (fuel consumption efficiency)  <<<<<<<-------------------
#[, 2]  cyl     Number of cylinders
#[, 3]  disp    Displacement (cu.in.)
#[, 4]  hp      Gross horsepower
#[, 5]  drat    Rear axle ratio
#[, 6]  wt      Weight (1000 lbs)
#[, 7]  qsec    1/4 mile time
#[, 8]  vs      Engine (0 = V-shaped, 1 = straight)
#[, 9]  am      Transmission (0 = automatic, 1 = manual)
#[,10]  gear    Number of forward gears
#[,11]  carb    Number of carburetors

summary(mtcars)  # I explore the data a bit: 

#mpg             cyl             disp             hp             drat             wt             qsec      
#Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0   Min.   :2.760   Min.   :1.513   Min.   :14.50  
#1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5   1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89  
#Median :19.20   Median :6.000   Median :196.3   Median :123.0   Median :3.695   Median :3.325   Median :17.71  
#Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7   Mean   :3.597   Mean   :3.217   Mean   :17.85  
#3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0   3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90  
#Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0   Max.   :4.930   Max.   :5.424   Max.   :22.90  
#vs               am              gear            carb      
#Min.   :0.0000   Min.   :0.0000   Min.   :3.000   Min.   :1.000  
#1st Qu.:0.0000   1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
#Median :0.0000   Median :0.0000   Median :4.000   Median :2.000  
#Mean   :0.4375   Mean   :0.4062   Mean   :3.688   Mean   :2.812  
#3rd Qu.:1.0000   3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
#Max.   :1.0000   Max.   :1.0000   Max.   :5.000   Max.   :8.000  

df <- mtcars  # assign the data to df so the dataset name is easier to remember
view(df)
print(df)

# I look at the correlation between each variable and "mpg"

cor.test(df$cyl, df$mpg)    # -0.852162   --> relevant
cor.test(df$disp, df$mpg)   # -0.8475514  --> relevant
cor.test(df$hp, df$mpg)     # -0.7761684
cor.test(df$drat, df$mpg)   #  0.6811719
cor.test(df$wt, df$mpg)     # -0.8676594  --> relevant
cor.test(df$qsec, df$mpg)   #  0.418684
cor.test(df$vs, df$mpg)     #  0.6640389
cor.test(df$am, df$mpg)     #  0.5998324
cor.test(df$gear, df$mpg)   #  0.4802848
cor.test(df$carb, df$mpg)   # -0.5509251

# I build the "lm" with all the variables to further explore the data 

model <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear
+ carb, df, na.action = na.exclude)

anova(model)

# Response: mpg
# Df Sum Sq Mean Sq  F value    Pr(>F)    
# cyl        1 817.71  817.71 116.4245 5.034e-10 ***
# disp       1  37.59   37.59   5.3526  0.030911 *  
# hp         1   9.37    9.37   1.3342  0.261031    
# drat       1  16.47   16.47   2.3446  0.140644    
# wt         1  77.48   77.48  11.0309  0.003244 ** 
# qsec       1   3.95    3.95   0.5623  0.461656    
# vs         1   0.13    0.13   0.0185  0.893173    
# am         1  14.47   14.47   2.0608  0.165858    
# gear       1   0.97    0.97   0.1384  0.713653    
# carb       1   0.41    0.41   0.0579  0.812179    
# Residuals 21 147.49    7.02                       
# ---
#   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

summary(model)

# Call:
#   lm(formula = mpg ~ cyl + disp + hp + drat + wt + qsec + vs + 
#        am + gear + carb, data = df, na.action = na.exclude)
# 
# Residuals:
#   Min      1Q  Median      3Q     Max 
# -3.4506 -1.6044 -0.1196  1.2193  4.6271 
# 
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)  
# (Intercept) 12.30337   18.71788   0.657   0.5181  
# cyl         -0.11144    1.04502  -0.107   0.9161  
# disp         0.01334    0.01786   0.747   0.4635  
# hp          -0.02148    0.02177  -0.987   0.3350  
# drat         0.78711    1.63537   0.481   0.6353  
# wt          -3.71530    1.89441  -1.961   0.0633 .
# qsec         0.82104    0.73084   1.123   0.2739  
# vs           0.31776    2.10451   0.151   0.8814  
# am           2.52023    2.05665   1.225   0.2340  
# gear         0.65541    1.49326   0.439   0.6652  
# carb        -0.19942    0.82875  -0.241   0.8122  
# ---
#   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 2.65 on 21 degrees of freedom
# Multiple R-squared:  0.869,   Adjusted R-squared:  0.8066 
# F-statistic: 13.93 on 10 and 21 DF,  p-value: 3.793e-07

#I build the graph to visualize the data obtained from the "lm" 

par(mfrow = c(2, 2))
plot(model, pch = 16, col = "blue")

# As you can see, everything so far is very basic. I am now trying to start on the solution
# to the exercise. If you can give me a pointer on how to begin, I would appreciate it a lot; I know almost nothing about R.
# I hope to learn as much as I can. Thank you very much in advance for any help.

method-combination

1 Answer

Stack Overflow user

Answered 2021-04-21 13:50:12

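If I understand your question correctly, this tip is useful for exploring what combinat::combn is, or any other function in R that you want to explore:

Type ?combinat::combn -> this opens the documentation, which describes how the combn function is normally used.

Since combn is a function, simply typing combn (without parentheses) prints the function's source code.

See the working examples of combn in the documentation, for example: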

combn(letters[1:4], 2)

combn(c(1, 1, 1, 1, 2, 2, 3, 3, 4), 3, tabulate, nbin = 4)
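To make the shape of the result concrete, here is a minimal sketch. It assumes utils::combn, which ships with base R and takes the same x and m arguments as combinat::combn, so nothing needs to be installed. The variable names are only illustrative.

```r
# utils::combn is part of base R; the x and m arguments match combinat::combn.
pairs <- utils::combn(letters[1:4], 2)   # 2 x 6 character matrix, one combination per column
print(pairs)

# simplify = FALSE returns a list of vectors instead, which is easier to loop over
pair_list <- utils::combn(letters[1:4], 2, simplify = FALSE)
length(pair_list)                        # choose(4, 2) combinations

# A function passed as FUN is applied to every combination
sums <- utils::combn(1:4, 2, FUN = sum)
print(sums)
```

The list form is usually the more convenient one when each combination is going to be turned into a model formula.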

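Judging from the pattern of your question, combinations between variables can also be generated with expand.grid.

Example: expand.grid(letters[1:5], letters[1:5]) generates the combinations between a first letter (a-e) and a second letter (a-e).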
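One caveat is worth checking before relying on expand.grid here: it returns every ordered pair, including an element paired with itself, whereas combn returns each unordered pair once. A small sketch of the difference (the column names first and second are just illustrative):

```r
# expand.grid: every ordered pair, including ("a", "a")
grid <- expand.grid(first = letters[1:5], second = letters[1:5],
                    stringsAsFactors = FALSE)
nrow(grid)                                  # 5 * 5 rows

# combn: each unordered pair exactly once
n_pairs <- ncol(utils::combn(letters[1:5], 2))
n_pairs                                     # choose(5, 2) columns

# Keeping only rows where first sorts before second leaves the unordered pairs
unordered <- grid[grid$first < grid$second, ]
nrow(unordered)
```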

So, if you want to apply this approach to the mtcars dataset, you can also do:

expand.grid(colnames(mtcars), colnames(mtcars))
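For the exercise itself, one possible approach is to enumerate predictor subsets with combn and fit one lm per subset. The following is only a sketch under assumptions: it uses utils::combn from base R, it ranks by adjusted R-squared, and the helper fit_one and the KPI columns are hypothetical choices rather than anything the exercise prescribes.

```r
# Sketch: one OLS model per predictor subset of size 1 to 5, then rank them.
predictors <- setdiff(colnames(mtcars), "mpg")   # the 10 candidate predictors

# All subsets of size 1..5 as a flat list of character vectors
subsets <- unlist(
  lapply(1:5, function(m) utils::combn(predictors, m, simplify = FALSE)),
  recursive = FALSE
)

# Fit one model and collect a few KPIs (the choice of KPIs is an assumption)
fit_one <- function(vars) {
  fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
  s <- summary(fit)
  f <- s$fstatistic
  data.frame(
    formula    = paste("mpg ~", paste(vars, collapse = " + ")),
    n_pred     = length(vars),
    adj_r2     = s$adj.r.squared,
    # p-value of the overall F test
    model_p    = unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)),
    # largest p-value among the slope coefficients (intercept excluded)
    max_coef_p = max(coef(s)[-1, "Pr(>|t|)"]),
    stringsAsFactors = FALSE
  )
}

results <- do.call(rbind, lapply(subsets, fit_one))

# "Top 3" by adjusted R-squared; another KPI could be substituted here
top3 <- head(results[order(-results$adj_r2), ], 3)
print(top3)
```

Adjusted R-squared is used for the ranking because plain R-squared never decreases when predictors are added; the max_coef_p column gives a rough way to also screen on significance.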

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/67183279
