Regression analysis using statsmodels

Stack Overflow user
Asked on 2018-01-03 17:47:52
Answers: 1, Views: 278, Followers: 0, Votes: 0

Please help me get output from this code. Why is the output of this code NaN? Where did I go wrong?

import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd
import matplotlib.pyplot as plt
import math
import datetime as dt
#importing Data
es_url = 'https://www.stoxx.com/document/Indices/Current/HistoricalData/hbrbcpe.txt'
vs_url = 'https://www.stoxx.com/document/Indices/Current/HistoricalData/h_vstoxx.txt'
#creating DataFrame
cols=['SX5P','SX5E','SXXP','SXXE','SXXF','SXXA','DK5f','DKXF']
es=pd.read_csv(es_url,index_col=0,parse_dates=True,sep=';',dayfirst=True,header=None,skiprows=4,names=cols)
vs=pd.read_csv(vs_url,index_col=0,header=2,parse_dates=True,sep=',',dayfirst=True)
data=pd.DataFrame({'EUROSTOXX' : es['SX5E'][es.index > dt.datetime(1999,1,1)]},dtype=float)
data=data.join(pd.DataFrame({'VSTOXX' : vs['V2TX'][vs.index > dt.datetime(1999,1,1)]},dtype=float))
data=data.fillna(method='ffill')
rets=(((data/data.shift(1))-1)*100).round(2)
xdat = rets['EUROSTOXX']
ydat = rets['VSTOXX']
#regression analysis
model = smf.ols('ydat ~ xdat',data=rets).fit()
print(model.summary())

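1 Answer

Stack Overflow user

Accepted answer

Posted on 2018-01-04 04:00:57

The problem is that when you compute rets you divide by 0, which produces inf. In addition, shift introduces NaNs, so the missing values have to be handled in some way before you run the regression.

Walk through this example with your data and see: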

df = data.loc['2016-03-20':'2016-04-01'].copy()

df looks like this:

            EUROSTOXX   VSTOXX
2016-03-21    3048.77  35.6846
2016-03-22    3051.23  35.6846
2016-03-23    3042.42  35.6846
2016-03-24    2986.73  35.6846
2016-03-25       0.00  35.6846
2016-03-28       0.00  35.6846
2016-03-29    3004.87  35.6846
2016-03-30    3044.10  35.6846
2016-03-31    3004.93  35.6846
2016-04-01    2953.28  35.6846

Shift by 1 and divide:

df = (((df/df.shift(1))-1)*100).round(2)

Printed output:

             EUROSTOXX  VSTOXX
2016-03-21         NaN     NaN
2016-03-22    0.080688     0.0
2016-03-23   -0.288736     0.0
2016-03-24   -1.830451     0.0
2016-03-25 -100.000000     0.0
2016-03-28         NaN     0.0
2016-03-29         inf     0.0
2016-03-30    1.305547     0.0
2016-03-31   -1.286751     0.0
2016-04-01   -1.718842     0.0

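Takeaways: shifting by 1 always creates a NaN in the first row. Dividing 0.00 by 0.00 produces NaN (the 2016-03-28 row), and dividing a nonzero value by 0.00 produces inf (the 2016-03-29 row).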
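The same three cases can be reproduced on a tiny series. This is only a sketch with made-up prices (the `prices` values are hypothetical, not taken from the STOXX files); the return formula is the one from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical price levels containing two consecutive zeros
prices = pd.Series([100.0, 0.0, 0.0, 50.0])

# Same percentage-return formula as in the question (without rounding)
returns = (prices / prices.shift(1) - 1) * 100

# Row 0: nothing to divide by after the shift  -> NaN
# Row 1: 0 / 100 - 1                           -> -100.0
# Row 2: 0 / 0                                 -> NaN
# Row 3: 50 / 0                                -> inf
print(returns)
```

pandas does not raise on float division by zero, so these values flow silently into the regression unless they are cleaned up first.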

One possible way to handle the missing values:

...
xdat = rets['EUROSTOXX']
ydat = rets['VSTOXX']

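# handle missing values: collect the rows of xdat that are inf, -inf, or NaN
messed_up_indices = xdat[xdat.isin([-np.inf, np.inf, np.nan])].index
# turn +/-inf into NaN, then fill those rows with the mean of the remaining values
xdat[messed_up_indices] = xdat[messed_up_indices].replace([-np.inf, np.inf], np.nan)
xdat[messed_up_indices] = xdat[messed_up_indices].fillna(xdat.mean())
# on the same rows, set any NaN in ydat to 0.0
ydat[messed_up_indices] = ydat[messed_up_indices].fillna(0.0)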

#regression analysis
model = smf.ols('ydat ~ xdat',data=rets, missing='raise').fit()
print(model.summary())

Note that I added the missing='raise' argument to ols to see what is going on.
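Before fitting, it can also help to count how many non-finite values each column holds, and dropping those rows is a simpler alternative to imputing them. The following is a sketch of that variation, not the method used in this answer; the small `rets` frame is a hypothetical stand-in for the real returns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `rets`, containing the same kinds of bad values
rets = pd.DataFrame({
    "EUROSTOXX": [np.nan, 0.08, -100.0, np.nan, np.inf, 1.31],
    "VSTOXX": [np.nan, 0.0, 0.0, 0.0, 0.0, 0.0],
})

# Count NaN and +/-inf entries per column
values = rets.to_numpy(dtype=float)
bad_counts = pd.Series((~np.isfinite(values)).sum(axis=0), index=rets.columns)
print(bad_counts)

# Alternative cleanup: turn +/-inf into NaN, then drop every incomplete row
clean = rets.replace([np.inf, -np.inf], np.nan).dropna()
print(clean)
```

A frame cleaned this way could then be passed to the formula API by column name, for example `smf.ols('VSTOXX ~ EUROSTOXX', data=clean)`. Note that dropping rows changes the number of observations compared with the imputation shown above, so the fitted coefficients would not match the summary in this answer exactly.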

Final printed output:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                   ydat   R-squared:                       0.259
Model:                            OLS   Adj. R-squared:                  0.259
Method:                 Least Squares   F-statistic:                     1593.
Date:                Wed, 03 Jan 2018   Prob (F-statistic):          5.76e-299
Time:                        12:01:14   Log-Likelihood:                -13856.
No. Observations:                4554   AIC:                         2.772e+04
Df Residuals:                    4552   BIC:                         2.773e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.1608      0.075      2.139      0.033       0.013       0.308
xdat          -1.4209      0.036    -39.912      0.000      -1.491      -1.351
==============================================================================
Omnibus:                     4280.114   Durbin-Watson:                   2.074
Prob(Omnibus):                  0.000   Jarque-Bera (JB):          4021394.925
Skew:                          -3.446   Prob(JB):                         0.00
Kurtosis:                     148.415   Cond. No.                         2.11
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
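Another option is to stop the inf values from appearing at all by repairing the price levels before computing returns. The sketch below assumes that the 0.00 prices are placeholders for days without a quote rather than real index levels; the source does not confirm this, so treat it as an assumption. The `prices` frame is a hypothetical excerpt modeled on the rows shown earlier.

```python
import numpy as np
import pandas as pd

# Hypothetical excerpt of price levels; 0.0 mimics the placeholder rows
prices = pd.DataFrame(
    {"EUROSTOXX": [3042.42, 2986.73, 0.0, 0.0, 3004.87, 3044.10]},
    index=pd.to_datetime([
        "2016-03-23", "2016-03-24", "2016-03-25",
        "2016-03-28", "2016-03-29", "2016-03-30",
    ]),
)

# Assumption: a zero price means "no quote", so treat it as missing
# and carry the last valid price forward
filled = prices.replace(0.0, np.nan).ffill()

# Percentage returns; only the first row is NaN after the shift, so drop it
rets = ((filled / filled.shift(1) - 1) * 100).dropna()
print(rets)
```

With the zeros filled forward, the two placeholder days get a return of 0.0 instead of -100, NaN, and inf, and no division by zero occurs.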
Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/48074862
