Q: Decision path of a random forest classifier

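Stack Overflow user

Asked 2018-02-19 15:31:19

Answers: 1 · Views: 8K · Followers: 0 · Votes: 3

Here is my code so that you can run it in your environment. I am using RandomForestClassifier and trying to work out the decision_path for a selected sample in the forest.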

import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000,
                           n_features=6,
                           n_informative=3,
                           n_classes=2,
                           random_state=0,
                           shuffle=False)

# Creating a dataFrame
df = pd.DataFrame({'Feature 1':X[:,0],
                                  'Feature 2':X[:,1],
                                  'Feature 3':X[:,2],
                                  'Feature 4':X[:,3],
                                  'Feature 5':X[:,4],
                                  'Feature 6':X[:,5],
                                  'Class':y})


y_train = df['Class']
X_train = df.drop('Class',axis = 1)

rf = RandomForestClassifier(n_estimators=50,
                               random_state=0)

rf.fit(X_train, y_train)

The furthest I have got is:

#Extracting the decision path for instance i = 12
i_data = X_train.iloc[12].values.reshape(1,-1)
d_path = rf.decision_path(i_data)

print(d_path)

But the output doesn't make much sense:

(<1x7046 sparse matrix of type '...' with 486 stored elements in Compressed Sparse Row format>, array([0, 133, 282, 415, 588, 761, 910, 1041, 1182, 1309, 1432, 1569, 1728, 1869, 2000, 2143, 2284, 2419, 2572, 2711, 2856, 2987, 3128, 3261, 3430, 3549, 3704, 3839, 3980, 4127, 4258, 4389, 4534, 4671, 4808, 4947, 5088, 5247, 5378, 5517, ..., 6925, 7046], dtype=int32))

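I am trying to find the decision path for a particular sample in the data. Can anyone tell me how to do that?

The idea is to end up with something similar to this.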

scikit-learn

python

machine-learning

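1 Answer

Stack Overflow user

Accepted answer

Posted 2018-02-19 15:47:21

The RandomForestClassifier.decision_path method returns a tuple of (indicator, n_nodes_ptr). See the scikit-learn documentation for this method.

So your variable node_indicator is a tuple, not the sparse matrix you expected. A tuple object has no attribute 'indices', which is why you get an error when you do: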

node_index = node_indicator.indices[node_indicator.indptr[sample_id]:
                                    node_indicator.indptr[sample_id + 1]]

Try:

(node_indicator, _) = rf.decision_path(X_train)
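
To make the two returned values concrete, here is a minimal sketch of how they fit together. It rebuilds a small stand-in forest (200 samples and 5 trees are my choice, only to keep the output short), and the name per_tree_paths is just illustrative. The indicator matrix has one column per node of the whole forest, and n_nodes_ptr marks where each tree's columns start and stop, so subtracting the offset should recover node ids local to each tree.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small stand-in for the question's data and model (sizes chosen for illustration).
X, y = make_classification(n_samples=200, n_features=6, n_informative=3,
                           n_classes=2, random_state=0, shuffle=False)
rf = RandomForestClassifier(n_estimators=5, random_state=0).fit(X, y)

sample = X[12].reshape(1, -1)
indicator, n_nodes_ptr = rf.decision_path(sample)

# Row 0 of the CSR matrix: forest-wide column ids of every node the sample visits.
visited = indicator.indices[indicator.indptr[0]:indicator.indptr[1]]

per_tree_paths = []
for j in range(len(rf.estimators_)):
    # Columns n_nodes_ptr[j]:n_nodes_ptr[j + 1] belong to tree j.
    start, stop = n_nodes_ptr[j], n_nodes_ptr[j + 1]
    in_tree = visited[(visited >= start) & (visited < stop)]
    # Subtract the offset to get node ids local to tree j.
    per_tree_paths.append(np.sort(in_tree - start))

for j, path in enumerate(per_tree_paths):
    print("tree", j, "visits nodes", path.tolist())
```

Each entry of per_tree_paths should match what tree.decision_path(sample).indices gives for the corresponding tree in rf.estimators_.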

You can also print the decision path through each tree of the forest for a single sample id:

X_train = X_train.values

sample_id = 0

for j, tree in enumerate(rf.estimators_):

    n_nodes = tree.tree_.node_count
    children_left = tree.tree_.children_left
    children_right = tree.tree_.children_right
    feature = tree.tree_.feature
    threshold = tree.tree_.threshold

    print("Decision path for DecisionTree {0}".format(j))
    node_indicator = tree.decision_path(X_train)
    leave_id = tree.apply(X_train)
    node_index = node_indicator.indices[node_indicator.indptr[sample_id]:
                                        node_indicator.indptr[sample_id + 1]]



    print('Rules used to predict sample %s: ' % sample_id)
    for node_id in node_index:
        # Skip the leaf: it has no split test, so its feature and threshold are placeholders.
        if leave_id[sample_id] == node_id:
            continue

        if (X_train[sample_id, feature[node_id]] <= threshold[node_id]):
            threshold_sign = "<="
        else:
            threshold_sign = ">"

        print("decision id node %s : (X_train[%s, %s] (= %s) %s %s)"
              % (node_id,
                 sample_id,
                 feature[node_id],
                 X_train[sample_id, feature[node_id]],
                 threshold_sign,
                 threshold[node_id]))

Note that in your case you have 50 estimators, so the output may be a bit tedious to read.

Votes: 4
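
One way to keep the output manageable is to compute the path for the single sample only and to cap the number of trees inspected. The sketch below does that. sample_rules and its max_trees parameter are hypothetical names of mine, not part of scikit-learn, and the data is a smaller stand-in for the question's setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def sample_rules(forest, x, max_trees=3):
    """Return {tree index: list of rule strings} for one sample x (1-D array).

    Hypothetical helper for illustration, not a scikit-learn function.
    """
    x = x.reshape(1, -1)
    rules = {}
    for j, tree in enumerate(forest.estimators_[:max_trees]):
        t = tree.tree_
        leaf_id = tree.apply(x)[0]
        # With a single row, .indices holds exactly the nodes on that row's path.
        path = tree.decision_path(x).indices
        tree_rules = []
        for node_id in path:
            if node_id == leaf_id:
                continue  # leaves carry no split test
            f, thr = t.feature[node_id], t.threshold[node_id]
            sign = "<=" if x[0, f] <= thr else ">"
            tree_rules.append("node %d: X[%d] (= %.3f) %s %.3f"
                              % (node_id, f, x[0, f], sign, thr))
        rules[j] = tree_rules
    return rules


X, y = make_classification(n_samples=200, n_features=6, n_informative=3,
                           n_classes=2, random_state=0, shuffle=False)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

rules = sample_rules(rf, X[12], max_trees=3)
for j, tree_rules in rules.items():
    print("Tree %d" % j)
    for rule in tree_rules:
        print("  " + rule)
```

Returning the rules as strings instead of printing them inside the loop makes it easier to filter or store them, for example to look only at trees that split on a given feature.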

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/48869343
