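I built a regression model to fit my data (x: year, y: number of cars), and now I'm stuck. How can I assess whether the estimated parameters (p = 0.0001695867, q = 0.349592505) are significant? How can I run statistical tests (t-statistics and the corresponding p-values for p and q) to check the significance of p and q, and possibly an F-test for the overall significance of the regression? For certain reasons I don't want confidence intervals for p and q; I'm more interested in p-values, t-statistics, or some other calculation. So:
H0: p is not statistically significant; H1: p is statistically significant. The same for q.
And an F-test:
H0: p = q = 0. H1: at least one of p and q is nonzero.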
import numpy as np
import pandas as pd
import scipy.optimize

x = pd.read_excel('fitting_data.xlsx', sheet_name="bevshyb cars (2)", index_col=None, dtype={'Name': str, 'Value': float})

# regression function
def fit(t, p, q):
    return 22500000*(((p*p*p+2*p*p*q+p*q*q)*np.exp(-p*t-q*t))/(((p+q*np.exp(-p*t-q*t))*(p+q*np.exp(-p*t-q*t)))))

# initial values
g = [0.000001, 0.000001]

t = x['t'].values
carsfact = x['BEVSHYB'].values
c, cov = scipy.optimize.curve_fit(fit, t, carsfact, g)
print(round(c[0], 10))
print(round(c[1], 10))

Estimated parameters p and q: 0.0001695867 and 0.349592505, respectively.
import sklearn.metrics

# y was not defined in the original post; assumed here to be the fitted values
y = fit(t, *c)
print('R^2: ', sklearn.metrics.r2_score(x['BEVSHYB'], y))
print('explained_variance_score: ', sklearn.metrics.explained_variance_score(x['BEVSHYB'], y))

Goodness of fit of the regression model: R^2: 0.9143477744061798, explained_variance_score: 0.9168457427666166
Posted on 2021-11-27 12:35:13
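See the answer to the question posted at this link: it shows one way to assess the significance of optimized parameters:
Here is the sample code provided there; note the use of scipy.stats: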
from scipy.optimize import curve_fit
import numpy as np
import scipy.odr
import scipy.stats

x = np.array([5.357, 5.797, 5.936, 6.161, 6.697, 6.731, 6.775, 8.442, 9.861])
y = np.array([0.376, 0.874, 1.049, 1.327, 2.054, 2.077, 2.138, 4.744, 7.104])

def f(x, b0, b1):
    return b0 + (b1 * x)

def f_wrapper_for_odr(beta, x):  # parameter order for odr
    return f(x, *beta)

parameters, cov = curve_fit(f, x, y)

# maxit=0 with fit_type=2 (ordinary least squares): ODR only computes
# the statistics at the parameters already found by curve_fit
model = scipy.odr.Model(f_wrapper_for_odr)
data = scipy.odr.Data(x, y)
myodr = scipy.odr.ODR(data, model, beta0=parameters, maxit=0)
myodr.set_job(fit_type=2)
parameterStatistics = myodr.run()

df_e = len(x) - len(parameters)  # degrees of freedom, error
cov_beta = parameterStatistics.cov_beta  # parameter covariance matrix from ODR
sd_beta = parameterStatistics.sd_beta  # parameter standard errors

# 95% confidence intervals
ci = []
t_df = scipy.stats.t.ppf(0.975, df_e)
for i in range(len(parameters)):
    ci.append([parameters[i] - t_df * sd_beta[i], parameters[i] + t_df * sd_beta[i]])

tstat_beta = parameters / sd_beta  # coeff t-statistics
pstat_beta = (1.0 - scipy.stats.t.cdf(np.abs(tstat_beta), df_e)) * 2.0  # coef. p-values

for i in range(len(parameters)):
    print('parameter:', parameters[i])
    print(' conf interval:', ci[i][0], ci[i][1])
    print(' tstat:', tstat_beta[i])
    print(' pstat:', pstat_beta[i])
    print()

https://stackoverflow.com/questions/70134760
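The same idea can probably be applied to the model in the question without going through ODR, because curve_fit already returns a parameter covariance matrix. The following is only a minimal sketch under some assumptions: the data are synthetic (generated with made-up values p = 0.001, q = 0.35 and Gaussian noise, standing in for fitting_data.xlsx), the starting values are chosen for this synthetic data, and the F-test takes the null model for H0: p = q = 0 to be the identically zero curve. For a nonlinear model both the t-tests and the F-test are approximate.

```python
import numpy as np
from scipy import optimize, stats

M = 22_500_000  # constant taken from the question's fit()

def fit(t, p, q):
    """Same curve as in the question, written in factored form."""
    e = np.exp(-(p + q) * t)
    return M * p * (p + q) ** 2 * e / (p + q * e) ** 2

# Hypothetical synthetic data (NOT the asker's spreadsheet)
rng = np.random.default_rng(0)
t = np.arange(1.0, 31.0)
y = fit(t, 0.001, 0.35) + rng.normal(0.0, 50_000.0, t.size)

# Starting values are an assumption suited to this synthetic data
popt, pcov = optimize.curve_fit(fit, t, y, p0=[0.002, 0.3])

n, k = t.size, popt.size
dof = n - k                                   # residual degrees of freedom
se = np.sqrt(np.diag(pcov))                   # standard errors of p and q
t_stats = popt / se                           # H0: parameter == 0
p_values = 2.0 * stats.t.sf(np.abs(t_stats), dof)   # two-sided p-values

# Approximate overall F-test: fitted model vs. the zero curve (p = q = 0)
ssr_full = np.sum((y - fit(t, *popt)) ** 2)
ssr_null = np.sum(y ** 2)
f_stat = ((ssr_null - ssr_full) / k) / (ssr_full / dof)
f_pvalue = stats.f.sf(f_stat, k, dof)

for name, est, s, ts, pv in zip(("p", "q"), popt, se, t_stats, p_values):
    print(f"{name}: estimate={est:.6g}  se={s:.3g}  t={ts:.3g}  p-value={pv:.3g}")
print(f"F({k}, {dof}) = {f_stat:.4g}, p-value = {f_pvalue:.3g}")
```

To try this on the real data, replace t and y with x['t'].values and x['BEVSHYB'].values from the question. Note that curve_fit's default absolute_sigma=False scales pcov by the residual variance, which is what the standard errors above rely on.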