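I am trying to minimize the quadratic weighted kappa function using scipy's fmin_powell minimizer.
The two functions digitize_train and digitize_train2 give 100% identical results.
However, when I use these functions with the scipy minimizer, the second one fails.
I have spent hours trying to debug this. To my surprise, even though the two functions are exactly the same, the numpy digitize version fails under fmin_powell minimization.
How can I fix the error?
Question
Using numpy.digitize in fmin_powell
Setup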
# imports
import numpy as np
import pandas as pd
import seaborn as sns
from scipy.optimize import fmin_powell
from sklearn import metrics
# data
train_labels = [1,1,8,7,6,5,3,2,4,4]
train_preds = [0.1,1.2,8.9, 7.6, 5.5, 5.5, 2.99, 2.4, 3.5, 4.0]
guess_lst = (1.5,2.9,3.1,4.5,5.5,6.1,7.1)
# functions
# here I am trying to convert real numbers (-inf to +inf) to integers 1 to 8
def digitize_train(train_preds, guess_lst):
    (x1, x2, x3, x4, x5, x6, x7) = list(guess_lst)
    res = []
    for y in list(train_preds):
        if y < x1:
            res.append(1)
        elif y < x2:
            res.append(2)
        elif y < x3:
            res.append(3)
        elif y < x4:
            res.append(4)
        elif y < x5:
            res.append(5)
        elif y < x6:
            res.append(6)
        elif y < x7:
            res.append(7)
        else:
            res.append(8)
    return res

def digitize_train2(train_preds, guess_lst):
    return np.digitize(train_preds, guess_lst) + 1

# compare two functions
df = pd.DataFrame({'train_labels': train_labels,
                   'train_preds': train_preds,
                   'method_1': digitize_train(train_preds, guess_lst),
                   'method_2': digitize_train2(train_preds, guess_lst)
                   })
df

**Note: the two functions give exactly the same output**

Method 1: without numpy digitize, runs fine
# using fmin_powell for method 1
def get_offsets_minimizing_train_preds_kappa(guess_lst):
    res = digitize_train(train_preds, guess_lst)
    return -metrics.cohen_kappa_score(train_labels, res, weights='quadratic')

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa, guess_lst, disp=True)
print(offsets)

Method 2: with numpy digitize, fails
# using fmin_powell for method 2
def get_offsets_minimizing_train_preds_kappa2(guess_lst):
    res = digitize_train2(train_preds, guess_lst)
    return -metrics.cohen_kappa_score(train_labels, res, weights='quadratic')

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa2, guess_lst, disp=True)
print(offsets)

How can I make the numpy digitize version work?
Update
As suggested, I tried pandas cut, but it still raises an error: ValueError: bins must increase monotonically.
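The error does not seem to need the optimizer to show up. Below is a minimal sketch, assuming a made-up threshold list with its last two entries out of order. These values are illustrative and are not ones fmin_powell actually produced, and `try_digitize` is a hypothetical helper added only for this check.

```python
import numpy as np

preds = [0.1, 1.2, 8.9, 7.6, 5.5]

# Hypothetical thresholds: the last two entries are swapped on purpose.
unsorted_cuts = [1.5, 2.9, 3.1, 4.5, 5.5, 7.1, 6.1]

def try_digitize(values, cuts):
    """Return classes 1..len(cuts)+1, or the error text if the bins are rejected."""
    try:
        return np.digitize(values, cuts) + 1
    except ValueError as exc:
        return f"ValueError: {exc}"

# Non-monotonic bins are rejected by np.digitize.
print(try_digitize(preds, unsorted_cuts))

# The same thresholds in ascending order are accepted.
print(try_digitize(preds, sorted(unsorted_cuts)))
```

If this is what happens inside the minimizer, any binning routine that validates its bin edges would fail the same way. The pd.cut attempt that raised the error follows.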
# using fmin_powell for method 3
def get_offsets_minimizing_train_preds_kappa3(guess_lst):
    res = pd.cut(train_preds, bins=[-np.inf] + list(guess_lst) + [np.inf],
                 right=False)
    res = pd.Series(res).cat.codes + 1
    res = res.to_numpy()
    return -metrics.cohen_kappa_score(train_labels, res, weights='quadratic')

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa3, guess_lst, disp=True)
print(offsets)

Posted on 2020-06-19 19:24:47
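During the minimization, the values in guess_lst apparently stop being monotonically increasing. One workaround is to pass sorted(guess_lst) to np.digitize, for example:
def digitize_train2(train_preds, guess_lst):
    return np.digitize(train_preds, sorted(guess_lst)) + 1

Then you get: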
# using fmin_powell for method 2
def get_offsets_minimizing_train_preds_kappa2(guess_lst):
    res = digitize_train2(train_preds, guess_lst)
    return -metrics.cohen_kappa_score(train_labels, res, weights='quadratic')

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa2, guess_lst, disp = True)
print(offsets)
Optimization terminated successfully.
Current function value: -0.990792
Iterations: 2
Function evaluations: 400
[1.5 2.7015062 3.1 4.50379942 4.72643334 8.12463415
 7.13652301]

https://stackoverflow.com/questions/62476872
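One detail in the printed offsets is worth noting: the last two values (8.12463415 and 7.13652301) are not in ascending order. Because the workaround sorts inside digitize_train2, the optimizer is free to return the thresholds in any order. A minimal sketch of reusing the result, assuming the cut points are meant to be read in ascending order, as sorted() does inside digitize_train2:

```python
import numpy as np

# Offsets copied from the optimizer output above (not in ascending order).
offsets = np.array([1.5, 2.7015062, 3.1, 4.50379942,
                    4.72643334, 8.12463415, 7.13652301])

train_preds = [0.1, 1.2, 8.9, 7.6, 5.5, 5.5, 2.99, 2.4, 3.5, 4.0]

# Assumption: sort once, so later np.digitize calls see valid bin edges.
cuts = np.sort(offsets)

# Map each prediction to a class in 1..8.
classes = np.digitize(train_preds, cuts) + 1

print(cuts)
print(classes)
```

With these cut points the predicted classes agree with train_labels in every position except one (the second 5.5, whose label is 5), which is consistent with a kappa just below 1.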