Precision, recall and F1 score are equal with sklearn

Stack Overflow user
Asked on 2017-01-12 23:26:40
Answers: 1, Views: 4.1K, Followers: 0, Votes: 3

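I am trying to compare different distance measures and different voting schemes in the k-nearest-neighbours algorithm. My problem is that, no matter what I do, the precision_recall_fscore_support function from scikit-learn returns exactly the same value for precision, recall and F-score. Why is that? I have tried it on different datasets (iris, glass and wine). What am I doing wrong? The code so far: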

#!/usr/bin/env python3
from collections import Counter
from data_loader import DataLoader
from sklearn.metrics import precision_recall_fscore_support as pr
import random
import math

def euclidean_distance(x, y):
    return math.sqrt(sum([math.pow((a - b), 2) for a, b in zip(x, y)]))

def manhattan_distance(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def get_neighbours(training_set, test_instance, k):
    names = [instance[4] for instance in training_set]
    training_set = [instance[0:4] for instance in training_set]
    distances = [euclidean_distance(test_instance, training_set_instance) for training_set_instance in training_set]
    distances = list(zip(distances, names))
    print(list(filter(lambda x: x[0] == 0.0, distances)))
    distances.sort(key=lambda x: x[0])  # sort in place so the slice below takes the k nearest
    return distances[:k]

def plurality_voting(nearest_neighbours):
    classes = [nearest_neighbour[1] for nearest_neighbour in nearest_neighbours]
    count = Counter(classes)
    return count.most_common()[0][0]

def weighted_distance_voting(nearest_neighbours):
    # Each neighbour votes with weight 1/d; the class with the largest total wins.
    weights = Counter()
    for distance, name in nearest_neighbours:
        weights[name] += 1 / max(distance, 1e-9)  # guard against a zero distance
    return weights.most_common(1)[0][0]

def weighted_distance_squared_voting(nearest_neighbours):
    # Same scheme, with each neighbour weighted by 1/d**2.
    weights = Counter()
    for distance, name in nearest_neighbours:
        weights[name] += 1 / max(distance, 1e-9) ** 2
    return weights.most_common(1)[0][0]

def main():
    data = DataLoader.load_arff("datasets/iris.arff")
    dataset = data["data"]
    # random.seed(42)
    random.shuffle(dataset)
    train = dataset[:100]
    test = dataset[100:150]
    classes = [instance[4] for instance in test]
    predictions = []
    for test_instance in test:
        prediction = weighted_distance_voting(get_neighbours(train, test_instance[0:4], 15))
        predictions.append(prediction)
    print(pr(classes, predictions, average="micro"))

if __name__ == "__main__":
    main()

1 Answer

Stack Overflow user

Answered on 2017-03-10 07:16:05

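The problem is that you are using the "micro" average.

As stated here:

As the documentation (evaluation.html) puts it: "Note that 'micro'-averaging in a multiclass setting will produce equal precision, recall and F, while 'weighted' averaging may produce an F-score that is not between precision and recall." However, if you drop a majority label using the labels parameter, micro-averaging no longer equals accuracy, and precision no longer equals recall.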
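To see why the micro average behaves this way, here is a minimal pure-Python sketch (not scikit-learn's implementation). The helper `micro_and_macro` and the toy labels are hypothetical, made up for illustration. In a single-label multiclass problem every wrong prediction counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled totals match and micro precision equals micro recall, which in turn equals accuracy.

```python
from collections import Counter

def micro_and_macro(y_true, y_pred):
    """Return ((micro_p, micro_r), (macro_p, macro_r)) for single-label data."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # false positive for the predicted class
            fn[truth] += 1  # the same mistake is a false negative for the true class

    # Micro: pool the counts over all classes first, then divide.
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    micro_p = total_tp / (total_tp + total_fp)
    micro_r = total_tp / (total_tp + total_fn)

    # Macro: compute the ratio per class, then take the unweighted mean.
    def ratio(num, den):
        return num / den if den else 0.0

    macro_p = sum(ratio(tp[c], tp[c] + fp[c]) for c in labels) / len(labels)
    macro_r = sum(ratio(tp[c], tp[c] + fn[c]) for c in labels) / len(labels)
    return (micro_p, micro_r), (macro_p, macro_r)

# Hypothetical toy data: 7 instances, 3 classes, 2 mistakes.
y_true = ["a", "a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "a", "b", "b", "c", "c"]
micro, macro = micro_and_macro(y_true, y_pred)
print("micro precision/recall:", micro)
print("macro precision/recall:", macro)
```

With scikit-learn itself, passing average="macro" or average=None to precision_recall_fscore_support instead of average="micro" gives an unweighted mean or per-class figures, which will generally not coincide.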

Votes: 4
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/41624878
