
How do I estimate a Cox model with the lifelines package?

Stack Overflow user
Asked 2021-01-27 20:10:53
Answers 1 · Views 233 · Followers 0 · Votes 0

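I want to estimate a Cox model, but I get an error when I run my code. An existing question about CoxPHFitter() does not seem to solve this problem. I suspect the lifelines library cannot compute the coefficients by maximum likelihood here. The sample code and the error are copied below. Note that I have included only part of the code (the model inputs), not all of it.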

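from lifelines import CoxPHFitter

# One frame per event type: duration, covariate, event indicator
df_l = df[['Observed', 'HighLTV', 'Liquidation']]
df_c = df[['Observed', 'HighLTV', 'Cure']]

cph_l = CoxPHFitter()
cph_c = CoxPHFitter()
cph_l.fit(df_l, 'Observed', event_col='Liquidation')
cph_c.fit(df_c, 'Observed', event_col='Cure')

# Coefficient of HighLTV in each model, rounded to 3 decimals
beta_cure = float('{:.3f}'.format(cph_c.params_.iloc[0]))
beta_liquidation = float('{:.3f}'.format(cph_l.params_.iloc[0]))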

Error:

LinAlgError                               Traceback (most recent call last)
~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _newton_rhapson_for_efron_model(self, X, T, E, weights, entries, initial_point, step_size, precision, show_progress, max_steps)
   1497             try:
-> 1498                 inv_h_dot_g_T = spsolve(-h, g, assume_a="pos", check_finite=False)
   1499             except (ValueError, LinAlgError) as e:

~\anaconda3\lib\site-packages\scipy\linalg\basic.py in solve(a, b, sym_pos, lower, overwrite_a, overwrite_b, debug, check_finite, assume_a, transposed)
    247                            overwrite_b=overwrite_b)
--> 248         _solve_check(n, info)
    249         rcond, info = pocon(lu, anorm)

~\anaconda3\lib\site-packages\scipy\linalg\basic.py in _solve_check(n, info, lamch, rcond)
     28     elif 0 < info:
---> 29         raise LinAlgError('Matrix is singular.')
     30 

LinAlgError: Matrix is singular.

During handling of the above exception, another exception occurred:

ConvergenceError                          Traceback (most recent call last)
<ipython-input-145-7cb92b8db8fe> in <module>
      8     k.append(list(map(lambda x: random.choice(o),range(10))))
      9     s=pd.DataFrame(k[i],columns=df.columns)
---> 10     c.append(CCR(s))

<ipython-input-144-da506c585def> in CCR(data)
     30     cph_c=CoxPHFitter()
     31     cph_l.fit(df_l,'Observed',event_col='Liquidation')
---> 32     cph_c.fit(df_c,'Observed',event_col='Cure')
     33     beta_cure=float('{:.3f}'.format((cph_c.params_[0])))
     34     beta_liquidation=float('{:.3f}'.format((cph_l.params_[0])))

~\anaconda3\lib\site-packages\lifelines\utils\__init__.py in f(model, *args, **kwargs)
     52         def f(model, *args, **kwargs):
     53             cls.set_censoring_type(model, cls.RIGHT)
---> 54             return function(model, *args, **kwargs)
     55 
     56         return f

~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in fit(self, df, duration_col, event_col, show_progress, initial_point, strata, step_size, weights_col, cluster_col, robust, batch_mode, timeline, formula, entry_col)
    274         """
    275         self.strata = utils.coalesce(strata, self.strata)
--> 276         self._model = self._fit_model(
    277             df,
    278             duration_col,

~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _fit_model(self, *args, **kwargs)
    595     def _fit_model(self, *args, **kwargs):
    596         if self.baseline_estimation_method == "breslow":
--> 597             return self._fit_model_breslow(*args, **kwargs)
    598         elif self.baseline_estimation_method == "spline":
    599             return self._fit_model_spline(*args, **kwargs)

~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _fit_model_breslow(self, *args, **kwargs)
    608         )
    609         if utils.CensoringType.is_right_censoring(self):
--> 610             model.fit(*args, **kwargs)
    611             return model
    612         else:

~\anaconda3\lib\site-packages\lifelines\utils\__init__.py in f(model, *args, **kwargs)
     52         def f(model, *args, **kwargs):
     53             cls.set_censoring_type(model, cls.RIGHT)
---> 54             return function(model, *args, **kwargs)
     55 
     56         return f

~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in fit(self, df, duration_col, event_col, show_progress, initial_point, strata, step_size, weights_col, cluster_col, robust, batch_mode, timeline, formula, entry_col)
   1225         )
   1226 
-> 1227         params_, ll_, variance_matrix_, baseline_hazard_, baseline_cumulative_hazard_, model = self._fit_model(
   1228             X_norm,
   1229             T,

~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _fit_model(self, X, T, E, weights, entries, initial_point, step_size, show_progress)
   1353         show_progress: bool = True,
   1354     ):
-> 1355         beta_, ll_, hessian_ = self._newton_rhapson_for_efron_model(
   1356             X, T, E, weights, entries, initial_point=initial_point, step_size=step_size, show_progress=show_progress
   1357         )

~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _newton_rhapson_for_efron_model(self, X, T, E, weights, entries, initial_point, step_size, precision, show_progress, max_steps)
   1505                     )
   1506                 elif isinstance(e, LinAlgError):
-> 1507                     raise exceptions.ConvergenceError(
   1508                         """Convergence halted due to matrix inversion problems. Suspicion is high collinearity. {0}""".format(
   1509                             CONVERGENCE_DOCS

ConvergenceError: Convergence halted due to matrix inversion problems. Suspicion is high collinearity. Please see the following tips in the lifelines documentation: https://lifelines.readthedocs.io/en/latest/Examples.html#problems-with-convergence-in-the-cox-proportional-hazard-modelMatrix is singular.

1 Answer

Stack Overflow user

Accepted answer

Posted 2021-02-12 12:18:31

The error message itself states the problem clearly:

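ConvergenceError: Convergence halted due to matrix inversion problems. Suspicion is high collinearity. Please see the following tips in the lifelines documentation: https://lifelines.readthedocs.io/en/latest/Examples.html#problems-with-convergence-in-the-cox-proportional-hazard-model Matrix is singular.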
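To illustrate why collinearity ends in "Matrix is singular", here is a minimal sketch. It assumes a toy design matrix and uses X^T X as a simplified stand-in for the matrix solved in a Newton step. It is not lifelines' actual Hessian, and the names `X`, `neg_hessian`, and `gradient` are made up for the illustration.

```python
import numpy as np

# Toy design matrix (hypothetical data): the second column is exactly
# twice the first, i.e. the two covariates are perfectly collinear.
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

# Simplified stand-in for the matrix a Newton step has to solve against.
neg_hessian = X.T @ X
gradient = np.array([1.0, 1.0])

# With collinear columns the matrix loses rank.
rank = np.linalg.matrix_rank(neg_hessian)
print("rank:", rank, "of", neg_hessian.shape[0])

try:
    step = np.linalg.solve(neg_hessian, gradient)
    solved = True
except np.linalg.LinAlgError as exc:
    # Same family of failure as in the traceback above.
    solved = False
    print("solve failed:", exc)
```

When the matrix has no unique inverse, the update step is undefined, so the optimizer stops instead of returning coefficients.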

Without the real data I cannot give any more specific advice, but the lifelines documentation offers plenty of suggestions on this issue:

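Convergence halted due to matrix inversion problems: This means that there is high collinearity in your dataset. That is, a column is equal to a linear combination of one or more other columns. A common cause of this error is dummying categorical variables without dropping a column, or some hierarchical structure in your dataset. Try to find the relationship by:

1. Adding a penalizer to the model (ex: CoxPHFitter(penalizer=0.1).fit(…)) until the model converges. In print_summary(), the coefficients with high collinearity will have large (absolute) magnitude in the coefs column.
2. Using the variance inflation factor (VIF) to find redundant variables.
3. Looking at the correlation matrix of your dataset, or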
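The VIF and correlation-matrix checks from the list above can be sketched with pandas and NumPy alone. This is an illustration under assumptions, not code from the lifelines documentation: the helper `variance_inflation_factors` and the columns `x1`, `x2`, `x3` are hypothetical, and the VIF follows the textbook definition 1 / (1 - R^2).

```python
import numpy as np
import pandas as pd

def variance_inflation_factors(covariates: pd.DataFrame) -> pd.Series:
    """VIF_j = 1 / (1 - R^2_j), regressing column j on the others plus an intercept."""
    vifs = {}
    for col in covariates.columns:
        y = covariates[col].to_numpy(dtype=float)
        others = covariates.drop(columns=col).to_numpy(dtype=float)
        design = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residual = y - design @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        # A constant column (ss_tot == 0) is treated as fully redundant.
        r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
        vifs[col] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return pd.Series(vifs, name="VIF")

# Hypothetical covariates: x3 is an exact linear combination of x1 and x2.
rng = np.random.default_rng(0)
demo = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
demo["x3"] = demo["x1"] + demo["x2"]

vif = variance_inflation_factors(demo)
corr = demo.corr()
print(vif)
print(corr.round(2))
```

Very large or infinite VIF values point at the redundant columns. If statsmodels is installed, its `variance_inflation_factor` function in `statsmodels.stats.outliers_influence` is an alternative to the hand-rolled helper. The penalizer suggestion needs no extra code beyond the `CoxPHFitter(penalizer=0.1)` call quoted above.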

This is most likely not a bug in lifelines, but a problem with your data or with how you are applying the model to your data.
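One plausible data-side cause, offered as an assumption because the real data is not shown: the traceback in the question suggests the fit runs on a resampled frame built from `range(10)` draws, and a sample that small can easily have a constant HighLTV column or no Cure events at all. A minimal pre-fit check might look like the sketch below. The helper name `find_degenerate_inputs` and the sample values are hypothetical.

```python
import pandas as pd

def find_degenerate_inputs(df: pd.DataFrame, duration_col: str, event_col: str) -> list:
    """Return human-readable reasons why a Cox fit on df is likely to fail."""
    problems = []
    # No observed events means there is nothing to estimate from.
    if df[event_col].sum() == 0:
        problems.append(f"no events observed in '{event_col}'")
    # A covariate that never varies carries no information.
    covariates = df.drop(columns=[duration_col, event_col])
    for col in covariates.columns:
        if covariates[col].nunique() <= 1:
            problems.append(f"covariate '{col}' is constant")
    return problems

# Hypothetical small resample in the shape of df_c from the question.
sample = pd.DataFrame({
    "Observed": [5, 8, 12, 3],
    "HighLTV": [1, 1, 1, 1],
    "Cure": [0, 0, 0, 0],
})
problems = find_degenerate_inputs(sample, "Observed", "Cure")
print(problems)
```

Running such a check before each `fit` call in a resampling loop lets you skip or redraw degenerate samples instead of hitting the ConvergenceError.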

Votes 1
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/65926665
