
Using numpy.digitize output in a scipy minimization problem

Stack Overflow user
Asked 2020-06-19 18:57:04
Answers: 1 · Views: 58 · Followers: 0 · Votes: 1

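I am trying to minimize a quadratic weighted kappa function using scipy's fmin_powell.

The two functions digitize_train and digitize_train2 give exactly the same results.

However, when I use them with the scipy minimizer, the second one fails.

I have spent several hours debugging this. To my surprise, although the two functions return identical output, the numpy.digitize version does not work with fmin_powell.

How can I fix the error?

Question

Using numpy.digitize with fmin_powell

Setup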

# imports
import numpy as np
import pandas as pd
import seaborn as sns
from scipy.optimize import fmin_powell
from sklearn import metrics

# data
train_labels = [1,1,8,7,6,5,3,2,4,4]
train_preds = [0.1,1.2,8.9, 7.6, 5.5, 5.5, 2.99, 2.4, 3.5, 4.0]
guess_lst = (1.5,2.9,3.1,4.5,5.5,6.1,7.1)


# functions
# here I am trying to convert real numbers in (-inf, +inf) to integers 1 to 8
def digitize_train(train_preds, guess_lst):
    (x1,x2,x3,x4,x5,x6,x7) = list(guess_lst)   
    res = []
    for y in list(train_preds):
        if y < x1:
            res.append(1)
        elif y < x2:
            res.append(2)
        elif y < x3:
            res.append(3)
        elif y < x4:
            res.append(4)
        elif y < x5:
            res.append(5)
        elif y < x6:
            res.append(6)
        elif y < x7:
            res.append(7)
        else: res.append(8)
    return res

def digitize_train2(train_preds, guess_lst):
    return np.digitize(train_preds,guess_lst) + 1

# compare two functions
df = pd.DataFrame({'train_labels': train_labels,
                   'train_preds': train_preds,
                   'method_1': digitize_train(train_preds, guess_lst),
                   'method_2': digitize_train2(train_preds, guess_lst)
                    })

df

**Note: these two functions give exactly the same results**

Method 1: without numpy.digitize, works fine

# using fmin_powell for method 1
def get_offsets_minimizing_train_preds_kappa(guess_lst):
    res = digitize_train(train_preds, guess_lst)
    return - metrics.cohen_kappa_score(train_labels, res,weights='quadratic')  

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa, guess_lst, disp = True)
print(offsets)

Method 2: using numpy.digitize, fails

# using fmin_powell for method 2
def get_offsets_minimizing_train_preds_kappa2(guess_lst):
    res = digitize_train2(train_preds, guess_lst)
    return -metrics.cohen_kappa_score(train_labels, res,weights='quadratic')  

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa2, guess_lst, disp = True)
print(offsets)

How can I use the numpy.digitize approach?

Update

As suggested, I tried pandas.cut, but it still raises an error: ValueError: bins must increase monotonically.

# using fmin_powell for method 3
def get_offsets_minimizing_train_preds_kappa3(guess_lst):
    res = pd.cut(train_preds, bins=[-np.inf] + list(guess_lst) + [np.inf],
                        right=False)
    res = pd.Series(res).cat.codes + 1
    res = res.to_numpy()

    return -metrics.cohen_kappa_score(train_labels, res,weights='quadratic')  

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa3, guess_lst, disp = True)
print(offsets)
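
The error message quoted in the update points at the constraint involved: `np.digitize` only accepts bin edges that are monotonically increasing or decreasing, while the if/elif chain in `digitize_train` accepts thresholds in any order. The sketch below reproduces this in isolation. The `unsorted_guess` values are hypothetical, chosen only to mimic thresholds an optimizer might try when it moves one value past its neighbor.

```python
import numpy as np

train_preds = [0.1, 1.2, 8.9, 7.6, 5.5, 5.5, 2.99, 2.4, 3.5, 4.0]

# Hypothetical thresholds (not from the original post): the last two are out of order.
unsorted_guess = [1.5, 2.7, 3.1, 4.5, 4.7, 8.1, 7.1]

# np.digitize refuses bins that are neither increasing nor decreasing.
try:
    np.digitize(train_preds, unsorted_guess)
    raised = False
except ValueError as err:
    raised = True
    print("np.digitize rejected the bins:", err)

# Sorting the thresholds first gives np.digitize valid bin edges.
fixed = np.digitize(train_preds, sorted(unsorted_guess)) + 1
print(fixed)
```

With the sorted initial `guess_lst` both functions agree, which is why the comparison table in the setup shows no difference.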

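1 Answer

Stack Overflow user

Accepted answer

Answered 2020-06-19 19:24:47

It seems that during the minimization the values in guess_lst stop being monotonically increasing. One workaround is to pass the sorted guess_lst to digitize, for example:
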
def digitize_train2(train_preds, guess_lst):
    return np.digitize(train_preds,sorted(guess_lst)) + 1

Then you get:

# using fmin_powell for method 2
def get_offsets_minimizing_train_preds_kappa2(guess_lst):
    res = digitize_train2(train_preds, guess_lst)
    return -metrics.cohen_kappa_score(train_labels, res,weights='quadratic')  

offsets = fmin_powell(get_offsets_minimizing_train_preds_kappa2, guess_lst, disp = True)
print(offsets)
Optimization terminated successfully.
         Current function value: -0.990792
         Iterations: 2
         Function evaluations: 400
[1.5        2.7015062  3.1        4.50379942 4.72643334 8.12463415
 7.13652301]
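
One caveat, offered as a sketch rather than as part of the accepted fix: when thresholds cross, sorting them does not reproduce the if/elif chain from the question, so the two objectives can differ at such points. The chain returns the position of the first threshold greater than `y`, which is what `np.digitize` computes on the running maximum of the thresholds. The helper names `digitize_chain` and `digitize_cummax` and the `crossed` values are hypothetical, introduced only for this illustration.

```python
import numpy as np

def digitize_chain(preds, thresholds):
    """Generic form of the question's if/elif chain: 1-based position of the
    first threshold that is greater than y, or len(thresholds) + 1 if none is."""
    res = []
    for y in preds:
        for i, t in enumerate(thresholds, start=1):
            if y < t:
                res.append(i)
                break
        else:
            res.append(len(thresholds) + 1)
    return res

def digitize_cummax(preds, thresholds):
    """Same labels via np.digitize, using the running maximum of the thresholds
    so that the bin edges are always non-decreasing."""
    bins = np.maximum.accumulate(np.asarray(thresholds, dtype=float))
    return np.digitize(preds, bins) + 1

train_preds = [0.1, 1.2, 8.9, 7.6, 5.5, 5.5, 2.99, 2.4, 3.5, 4.0]
crossed = [1.5, 2.7, 3.1, 4.5, 4.7, 8.1, 7.1]  # hypothetical, last two out of order

chain = digitize_chain(train_preds, crossed)
cummax = digitize_cummax(train_preds, crossed)
by_sorting = np.digitize(train_preds, sorted(crossed)) + 1

print(chain)
print(list(cummax))
print(list(by_sorting))
```

Either variant gives `fmin_powell` a well-defined objective. Which one is preferable depends on whether crossed thresholds should be reordered (sorting) or whether a later, smaller threshold should simply never match (running maximum, as in the original loop).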
Votes: 1

Page content originally provided by Stack Overflow; translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/62476872
