I have the following data on which I run predictions with the XGBoost regression algorithm. The problem is that the regressor predicts the same output for every input, and I'm really not sure why.
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error as mse, r2_score

data = pd.read_csv("depthwise_data.csv", delimiter=',', header=None, skiprows=1,
                   names=['input_size', 'input_channels', 'conv_kernel',
                          'conv_strides', 'running_time'])

X = data[['input_size', 'input_channels', 'conv_kernel', 'conv_strides']]
Y = data[["running_time"]]

X_train, X_test, y_train, y_test = train_test_split(
    np.array(X), np.array(Y), test_size=0.2, random_state=42)

y_train_log = np.log(y_train)
y_test_log = np.log(y_test)

xgb_depth_conv = xgb.XGBRegressor(objective='reg:squarederror',
                                  n_estimators=1000,
                                  seed=123,
                                  tree_method='hist',
                                  max_depth=10)

xgb_depth_conv.fit(X_train, y_train_log)

y_pred_train = xgb_depth_conv.predict(X_train)
#y_pred_test = xgb_depth_conv.predict(X_test)

X_data = [[8, 576, 3, 2]]  # instance
X_test = np.log(X_data)
y_pred_test = xgb_depth_conv.predict(X_test)
print(np.exp(y_pred_test))

MSE_test, MSE_train = mse(y_test_log, y_pred_test), mse(y_train_log, y_pred_train)
R_squared = r2_score(y_pred_test, y_test_log)
print("MSE-Train = {}".format(MSE_train))
print("MSE-Test = {}".format(MSE_test))
print("R-Squared: ", np.round(R_squared, 2))

Output for the first instance:
X_data=[[8,576,3,2]]
print(np.exp(y_pred_test))
[0.7050679]

Output for the second instance:
X_data=[[4,960,3,1]]
print(np.exp(y_pred_test))
[0.7050679]

Posted on 2021-10-03 17:45:25
Your problem stems from this line: X_test = np.log(X_data)
Why apply a log transform to the test instance's features when you never applied one to the training samples? The model was fit on raw feature values, so log-scaled inputs all fall into the same region of feature space and hit the same leaves, producing identical predictions.
If you remove np.log entirely, even from the target (y), you get good results. I tested this myself with the data you provided.
https://stackoverflow.com/questions/69424041