
Pandas > Statsmodels: syntax error when implementing variance_inflation_factor

Stack Overflow user
Asked 2016-05-10 03:53:22
Answers: 3, Views: 2.2K, Followers: 0, Score: 2

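I have been using Statsmodels for less than two days and am not at all familiar with the import commands and so on. I want to run a simple variance_inflation_factor from here, but I'm running into some problems. My code is as follows: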

from numpy import *
import numpy as np
import pandas as pd
from pandas import DataFrame, Series
import statsmodels.formula.api as sm
from sklearn.linear_model import LinearRegression
import scipy, scipy.stats
import matplotlib.pyplot as plt
import matplotlib
matplotlib.style.use('ggplot')
from statsmodels.api import add_constant
from numpy import linalg as LA
import statsmodels as sm

## I have been adding libraries and modules/packages with the intention of erring on the side of caution 

a = df1.years_exp
b = df1.leg_totalbills
c = df1.log_diff_rgdp
d = df1.unemployment
e = df1.expendituresfor
f = df1.direct_expenditures
g = df1.indirect_expenditures

sm.variance_inflation_factor((['a', 'b', 'c', 'd', 'e', 'f']), g)

Then I get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-61-bb126535eadd> in <module>()
----> 1 sm.variance_inflation_factor((['a', 'b', 'c', 'd', 'e', 'f']), g)

AttributeError: module 'statsmodels' has no attribute 'variance_inflation_factor'

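Can someone tell me the correct syntax for loading and running this module? If it would be more convenient for me to post a link to some source code, please ask. However, I have a feeling that this is just a simple syntax issue.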


3 Answers

Stack Overflow user

Answered 2017-02-13 11:44:13

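Thanks for asking this question! I had the same problem today, except that I wanted to compute the variance inflation factor for every feature. Here is a programmatic way to do it: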

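import pandas as pd
from patsy import dmatrices
from statsmodels.stats.outliers_influence import variance_inflation_factor

# 'feature_1 + feature_2 ... feature_p'
# Index.drop removes the target column; subtracting a list from an Index
# is not supported as a set operation in current pandas versions
features_formula = "+".join(df1.columns.drop("indirect_expenditures"))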

# get y and X dataframes based on this formula:
# indirect_expenditures ~ feature_1 + feature_2 ... feature_p
y, X = dmatrices('indirect_expenditures ~' + features_formula, df1, return_type='dataframe')

# For each Xi, calculate VIF and save in dataframe
vif = pd.DataFrame() 
vif["vif"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
vif["features"] = X.columns
vif

Note that the code above works only if you have imported pandas and df1 is a pandas DataFrame.
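To make clear what each entry of that list comprehension measures, here is a minimal NumPy-only sketch of the usual definition, VIF_i = 1 / (1 - R_i^2), where R_i^2 comes from regressing column i on the remaining columns. The helper name `vif_by_hand` and the synthetic columns `x1`, `x2`, `x3` are made up for illustration and are not part of the original answer. The sketch assumes the design matrix already contains an intercept column and that `idx` points at a non-intercept column.

```python
import numpy as np


def vif_by_hand(exog, idx):
    """VIF of column `idx`: regress it on the other columns, then 1 / (1 - R^2).

    Assumes `exog` contains an intercept column and `idx` is not that column.
    """
    y = exog[:, idx]
    others = np.delete(exog, idx, axis=1)
    coef, *_ = np.linalg.lstsq(others, y, rcond=None)
    resid = y - others @ coef
    centered = y - y.mean()
    r_squared = 1.0 - (resid @ resid) / (centered @ centered)
    return 1.0 / (1.0 - r_squared)


# Synthetic data (hypothetical): x2 is nearly a copy of x1, x3 is independent.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
exog = np.column_stack([np.ones(n), x1, x2, x3])

vif_x1 = vif_by_hand(exog, 1)
vif_x3 = vif_by_hand(exog, 3)
print("VIF x1:", vif_x1)
print("VIF x3:", vif_x3)
```

The collinear column gets a large VIF, while the independent column stays close to 1.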

Score: 2

Stack Overflow user

Answered 2016-05-10 04:03:56

The function variance_inflation_factor is found in statsmodels.stats.outliers_influence, as shown in the docs, so to use it you must import it correctly. One option is:

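from statsmodels.stats import outliers_influence
# code here
# exog must be a 2-D array with one column per variable; the second argument is a column index
exog = np.column_stack([a, b, c, d, e, f, g])
outliers_influence.variance_inflation_factor(exog, 6)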
Score: 1

Stack Overflow user

Answered 2016-05-10 05:54:29

a = df1.years_exp
b = df1.leg_totalbills
c = df1.log_diff_rgdp
d = df1.unemployment
e = df1.expendituresfor
f = df1.direct_expenditures
g = df1.indirect_expenditures

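from statsmodels.stats import outliers_influence

# Stack the series as columns: one column per variable, one row per observation
ck = np.column_stack([a, b, c, d, e, f, g])
# VIF of column 6 (g, indirect_expenditures) against the other columns
outliers_influence.variance_inflation_factor(ck, 6)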
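variance_inflation_factor treats each column of its first argument as one variable, so the orientation of the array matters. A small sketch of the difference, using made-up values in place of the `df1` columns:

```python
import numpy as np

# Hypothetical stand-ins for three of the df1 columns (4 observations each).
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 1.0, 4.0, 3.0])
c = np.array([0.5, 0.1, 0.9, 0.3])

by_rows = np.array([a, b, c])         # each variable becomes a row
by_cols = np.column_stack([a, b, c])  # each variable becomes a column

print(by_rows.shape)  # (3, 4): 3 variables as rows, 4 observations as columns
print(by_cols.shape)  # (4, 3): 4 observations as rows, 3 variables as columns
```

With `np.array([...])`, index 6 would select the seventh observation rather than the seventh variable, so the column-stacked (or transposed) form is the one that matches the function's expectations.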
Score: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/37124342
