
Passing an estimator from a scikit-learn Pipeline to scikit-survival's as_concordance_index_ipcw_scorer

Stack Overflow user
Asked 2022-01-07 23:18:54
Answers: 1 · Views: 179 · Followers: 0 · Votes: 1

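I have a pipeline that runs preprocessing followed by a Random Survival Forest from the scikit-survival package. I am trying to use scikit-survival's as_concordance_index_ipcw_scorer() class (linked from the original post).

My pipeline looks like this: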

    Pipeline(steps=[('columntransformer',
                 ColumnTransformer(transformers=[('num',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(strategy='median')),
                                                                  ('scaler',
                                                                   StandardScaler())]),
                                                  Index(['IntVar1', 'IntVar2', 'IntVar3',
       'IntVar4'],
      dtype='object')),
                                                 ('cat',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(fill_value='missing',
                                                                                 strategy='constant')),
                                                                  ('onehot',
                                                                   OneHotEncoder(handle_unknown='ignore',
                                                                                 sparse=False))]),
                                                  Index(['CharVar1', 'CharVar2', 'CharVar3'], dtype='object'))])),
                ('randomsurvivalforest',
                 RandomSurvivalForest(max_features='sqrt',
                                      min_samples_leaf=0.005,
                                      min_samples_split=0.01, n_estimators=150,
                                      n_jobs=-1, oob_score=True,
                                      random_state=200))])

Here is the Python code leading up to the pipeline, and the fitting of the pipeline:

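from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import as_concordance_index_ipcw_scorer
from sksurv.util import Surv

## `df` is assumed to be a pandas DataFrame that has already been loaded
print("Importing global DF")
print("Creating X & Y set")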
X = df.iloc[:,:-2].copy()
y = Surv.from_dataframe("AliveStatus","Target_Age",df.iloc[:,-2:].copy()) ## Creates structured array for Scikit Surv

print("Defining feature categories by data type")
numerical_features = X.select_dtypes(include=['int64', 'float64']).columns
categorical_features = X.select_dtypes(include=['object']).columns

print("Splitting dataset")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5) #SkLearn splitter

print("Defining preprocessing steps using SciKitLearn pipeline...")
## Pipeline Steps
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])


categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ## Use "sparse=False" because Random Forests cannot take sparse matrices, only dense ones.
    ## (On scikit-learn >= 1.2 this parameter is named "sparse_output".)
    ('onehot', OneHotEncoder(sparse=False,handle_unknown='ignore'))])

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)])

## Pipeline defining
print("Defining Random Survival Forest pipeline from SciKit Survival")
rsf = make_pipeline(
    preprocessor,
    RandomSurvivalForest(n_estimators=150, ## Default 100
                        min_samples_split=0.01, ## Default 6
                        min_samples_leaf=0.005, ## Default 3
                        max_features="sqrt", ## Defaults to none when not defined
                        n_jobs=-1, ## Use all cores (default is None)
                        oob_score = True,
                        random_state=200) ## Fixed seed for reproducibility
                        )


##Fitting & Scoring
print("Fitting dataframe to RSF Pipeline")
rsf.fit(X_train,y_train)
print("Fitting completed.")

After fitting completes, I try to run the following step:

as_concordance_index_ipcw_scorer(rsf).score(X_test,y_test)

I then get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-97-9a92b22d8026> in <module>
----> 1 as_concordance_index_ipcw_scorer(rsf).score(X_test,y_test)

C:\ProgramData\Anaconda3\lib\site-packages\sksurv\metrics.py in score(self, X, y)
    788         score : float
    789         """
--> 790         estimate = self._do_predict(X)
    791         score = self._score_func(
    792             survival_train=self._train_y,

C:\ProgramData\Anaconda3\lib\site-packages\sksurv\metrics.py in _do_predict(self, X)
    768 
    769     def _do_predict(self, X):
--> 770         predict_func = getattr(self.estimator_, self._predict_func)
    771         return predict_func(X)
    772 

AttributeError: 'as_concordance_index_ipcw_scorer' object has no attribute 'estimator_'

One option I tried was passing only the RSF step of the pipeline, but that did not work either:

as_concordance_index_ipcw_scorer(rsf[1]).score(X_test,y_test)

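Any suggestions?

Apologies for the length, or for any missing information. I am new to pipelines and scikit-survival, and I wanted to give as much detail as possible.

Thanks.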


1 Answer

Stack Overflow user

Answered 2022-01-09 23:08:37

You need to fit the estimator instance returned by as_concordance_index_ipcw_scorer; in this case it does not help that the base estimator has already been fitted.

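In the source code (the Mixin class), fitting one of these wrappers fits the underlying estimator and saves it in a new attribute, estimator_ (the attribute your error reports as missing), and also saves the training labels. So you might be able to create those attributes directly without ill effects, but you would be stepping outside the intended usage.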
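The mechanism can be illustrated with a toy wrapper in plain Python. This is not scikit-survival's source: MeanPredictor and ToyScorerWrapper are hypothetical names, and the scoring rule (negative mean squared error) is chosen only to keep the sketch short. It shows why scoring an unfitted wrapper raises AttributeError even when the estimator passed in was already fitted.

```python
class MeanPredictor:
    """Hypothetical stand-in for a fitted pipeline or survival model."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ToyScorerWrapper:
    """Toy imitation of the wrapper pattern; not the library's actual code."""

    def __init__(self, estimator):
        # Only stores the estimator; estimator_ does not exist yet.
        self.estimator = estimator

    def fit(self, X, y):
        self._train_y = list(y)                     # keep the training labels
        self.estimator_ = self.estimator.fit(X, y)  # created only here
        return self

    def score(self, X, y):
        # Raises AttributeError if fit() was never called on the wrapper.
        preds = self.estimator_.predict(X)
        return -sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


X_train, y_train = [[0], [1], [2]], [1.0, 2.0, 3.0]

# The base estimator is already fitted, but the wrapper is not.
base = MeanPredictor().fit(X_train, y_train)
try:
    ToyScorerWrapper(base).score(X_train, y_train)
    raised = False
except AttributeError:
    raised = True

# Fitting the wrapper first makes score() work.
fitted_score = ToyScorerWrapper(MeanPredictor()).fit(X_train, y_train).score(X_train, y_train)
print(raised, fitted_score)
```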

Votes: 0
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/70628206
