How to calculate SHAP values for an AdaBoost model?

Stack Overflow user

Asked 2020-02-27 20:36:03

2 answers · 1.5K views · 0 followers · 2 votes

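I am running 3 different models (Random Forest, Gradient Boosting, AdaBoost) and a model ensemble based on these 3 models.

I managed to use SHAP for GB and RF, but not for AdaBoost. It fails with the following error: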

Exception                                 Traceback (most recent call last)
in engine
----> 1 explainer = shap.TreeExplainer(model,data = explain_data.head(1000), model_output= 'probability')

/home/cdsw/.local/lib/python3.6/site-packages/shap/explainers/tree.py in __init__(self, model, data, model_output, feature_perturbation, **deprecated_options)
    110         self.feature_perturbation = feature_perturbation
    111         self.expected_value = None
--> 112         self.model = TreeEnsemble(model, self.data, self.data_missing)
    113 
    114         if feature_perturbation not in feature_perturbation_codes:

/home/cdsw/.local/lib/python3.6/site-packages/shap/explainers/tree.py in __init__(self, model, data, data_missing)
    752             self.tree_output = "probability"
    753         else:
--> 754             raise Exception("Model type not yet supported by TreeExplainer: " + str(type(model)))
    755 
    756         # build a dense numpy version of all the tree objects

Exception: Model type not yet supported by TreeExplainer: <class 'sklearn.ensemble._weight_boosting.AdaBoostClassifier'>

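I found this link on GitHub, which says:

TreeExplainer creates a TreeEnsemble object from whatever model type we are trying to explain, and then works with that object downstream. So all you would need to do is add another if statement to the TreeEnsemble constructor, similar to the one for gradient boosting.

But I really don't know how to implement it, since I am quite new to this.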

adaboost

shap

2 Answers

Stack Overflow user

Answered 2020-04-09 03:05:48

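I had the same problem, and what I did was modify the file discussed in the GitHub issue you mention.

In my case I am on Windows, so the file is in C:\Users\my_user\AppData\Local\Continuum\anaconda3\Lib\site-packages\shap\explainers, but you can also double-click the error message and the file will open.

The next step is to add another elif, as the answer in the GitHub issue says. In my case I did it starting at line 404, as follows:

1) Modify the source code.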

... 
    self.objective = objective_name_map.get(model.criterion, None)
    self.tree_output = "probability"
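elif str(type(model)).endswith((
        "sklearn.ensemble.weight_boosting.AdaBoostClassifier'>",
        "sklearn.ensemble._weight_boosting.AdaBoostClassifier'>",  # module path reported in the question's traceback
)): #From this line I have modified the code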
    scaling = 1.0 / len(model.estimators_) # output is average of trees
    self.trees = [Tree(e.tree_, normalize=True, scaling=scaling) for e in model.estimators_]
    self.objective = objective_name_map.get(model.base_estimator_.criterion, None) #This line is done to get the decision criteria, for example gini.
    self.tree_output = "probability" #This is the last line I added
elif str(type(model)).endswith("sklearn.ensemble.forest.ExtraTreesClassifier'>"): # TODO: add unit test for this case
    scaling = 1.0 / len(model.estimators_) # output is average of trees
    self.trees = [Tree(e.tree_, normalize=True, scaling=scaling) for e in model.estimators_]
...
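One thing worth checking with a string-based test like the one in the AdaBoost branch above: the traceback in the question reports the class as `sklearn.ensemble._weight_boosting.AdaBoostClassifier` (with a leading underscore on the module), and `endswith` only matches the exact suffix it is given. The sketch below is just an illustration of that behaviour. It uses no scikit-learn at all; `make_dummy`, `ADABOOST_SUFFIXES` and `is_adaboost` are hypothetical names made up for this example, and the dummy classes only imitate how the real class would print.

```python
def make_dummy(module_path):
    """Build a stand-in class whose repr mimics a class defined in `module_path`."""
    return type("AdaBoostClassifier", (), {"__module__": module_path})


# Hypothetical helper: accept both module spellings by passing a tuple to endswith.
ADABOOST_SUFFIXES = (
    "sklearn.ensemble.weight_boosting.AdaBoostClassifier'>",
    "sklearn.ensemble._weight_boosting.AdaBoostClassifier'>",
)


def is_adaboost(model):
    """Return True if the printed type of `model` ends with a known AdaBoost path."""
    return str(type(model)).endswith(ADABOOST_SUFFIXES)


# Dummy instances imitating the two module paths (not real scikit-learn models).
old_style = make_dummy("sklearn.ensemble.weight_boosting")()
new_style = make_dummy("sklearn.ensemble._weight_boosting")()

# A single-suffix check without the underscore misses the underscored path.
single = "sklearn.ensemble.weight_boosting.AdaBoostClassifier'>"
print(str(type(new_style)).endswith(single))
print(is_adaboost(old_style), is_adaboost(new_style))
```

If the new branch never seems to be reached, printing `str(type(model))` for your own fitted model and comparing it with the string in the `elif` is a quick way to see whether this is the reason.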

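Note: for the other models, shap's code relies on the 'criterion' attribute, which the AdaBoost classifier does not expose directly. In this case the attribute is taken from the "weak" classifier that AdaBoost was trained with, which is why I used model.base_estimator_.criterion.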
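To make the lookup in the note a bit more concrete, here is a minimal sketch of reading the criterion from the weak learner instead of the ensemble. It is not shap or scikit-learn code: `base_criterion` is a hypothetical helper, and the models are plain `SimpleNamespace` stand-ins. The `estimator_` name is an assumption on my part (I believe newer scikit-learn releases use it in place of `base_estimator_`), so check the attributes on your installed version.

```python
from types import SimpleNamespace


def base_criterion(model):
    """Return the split criterion of the weak learner behind an ensemble, or None."""
    # Try the template-estimator attributes first (assumed newer name first).
    for attr in ("estimator_", "base_estimator_"):
        weak = getattr(model, attr, None)
        if weak is not None and hasattr(weak, "criterion"):
            return weak.criterion
    # Fall back to the first fitted weak learner, if there is one.
    fitted = getattr(model, "estimators_", None)
    if fitted:
        return getattr(fitted[0], "criterion", None)
    return None


# Stand-ins for fitted models (hypothetical, for illustration only).
stump = SimpleNamespace(criterion="gini")
older_style = SimpleNamespace(base_estimator_=stump, estimators_=[stump])
newer_style = SimpleNamespace(estimator_=stump, estimators_=[stump])

print(base_criterion(older_style))
print(base_criterion(newer_style))
```

The fallback to `estimators_[0]` is there because the patched branch already iterates over `model.estimators_`, so that attribute has to exist for the patch to work at all.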

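Finally, you have to import the library again, train your model and get the SHAP values. Here is an example:

2) Import the library again and try: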

from sklearn import datasets
from sklearn.ensemble import AdaBoostClassifier
import shap

# import some data to play with
iris = datasets.load_iris()
X = iris.data
y = iris.target

ADABoost_model = AdaBoostClassifier()
ADABoost_model.fit(X, y)

shap_values = shap.TreeExplainer(ADABoost_model).shap_values(X)
shap.summary_plot(shap_values, X, plot_type="bar")

3) Get your new results. This generates the following:

[Figure: SHAP summary bar plot for the AdaBoost model]

Votes: 3

Stack Overflow user

Answered 2020-06-25 23:51:41

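It looks like the shap package has been updated since then, but it still does not include AdaBoostClassifier. Based on the previous answer, I adapted the change to the current shap/explainers/tree.py file, at lines 598-610: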

### Added AdaBoostClassifier based on the outdated StackOverflow response and Github issue here
### https://stackoverflow.com/questions/60433389/how-to-calculate-shap-values-for-adaboost-model/61108156#61108156
### https://github.com/slundberg/shap/issues/335
elif safe_isinstance(model, ["sklearn.ensemble.AdaBoostClassifier", "sklearn.ensemble._weight_boosting.AdaBoostClassifier"]):
    assert hasattr(model, "estimators_"), "Model has no `estimators_`! Have you called `model.fit`?"
    self.internal_dtype = model.estimators_[0].tree_.value.dtype.type
    self.input_dtype = np.float32
    scaling = 1.0 / len(model.estimators_) # output is average of trees
    self.trees = [Tree(e.tree_, normalize=True, scaling=scaling) for e in model.estimators_]
    self.objective = objective_name_map.get(model.base_estimator_.criterion, None) #This line is done to get the decision criteria, for example gini.
    self.tree_output = "probability" #This is the last line added

I am also running tests to get this code added to the package :)

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/60433389
