<blockquote>
 Which object of these 2 or more should be chosen as the 5th nearest neighbor?
</blockquote>

It really depends on how you want to implement it.

Most algorithms will do one of three things:

<ol>
<li>Include all equidistant points, so for this estimate they use 6 points instead of 5.</li>
<li>Use the "first" point found among the equidistant ones.</li>
<li>Pick one of the tied points at random (usually with a fixed seed, so results are reproducible).</li>
</ol>
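
The three options above can be sketched roughly as follows. This is only an illustration, not how any particular library does it: the function name `kth_neighbors`, the `ties` parameter, and the use of Euclidean distance are all assumptions made for the example.

```python
import math
import random

def kth_neighbors(points, query, k, ties="include", seed=0):
    """Return indices of the nearest points, resolving ties at the k-th distance.

    ties="include": keep every point tied at the cutoff (may return more than k)
    ties="first":   keep the tied points that come first in the input order
    ties="random":  sample among the tied points with a fixed seed
    """
    dists = [math.dist(p, query) for p in points]
    # sorted() is stable, so equidistant points keep their input order.
    order = sorted(range(len(points)), key=lambda i: dists[i])
    if k >= len(order):
        return order

    cutoff = dists[order[k - 1]]
    tied = [i for i in order if math.isclose(dists[i], cutoff)]
    tied_set = set(tied)
    closer = [i for i in order if dists[i] < cutoff and i not in tied_set]
    need = k - len(closer)  # how many tied points are needed to reach k

    if ties == "include":
        return closer + tied
    if ties == "first":
        return closer + tied[:need]
    if ties == "random":
        return closer + random.Random(seed).sample(tied, need)
    raise ValueError(f"unknown ties option: {ties!r}")

# Distances from the origin are 1, 2, 3, 4, 5, 5: the 5th neighbor is tied.
points = [(1, 0), (0, 2), (-3, 0), (0, -4), (5, 0), (0, 5)]
all_tied = kth_neighbors(points, (0, 0), k=5, ties="include")
first_found = kth_neighbors(points, (0, 0), k=5, ties="first")
random_pick = kth_neighbors(points, (0, 0), k=5, ties="random", seed=42)
```

With `ties="include"` the call returns six indices, while the other two modes return exactly five; the random mode gives the same answer on every run as long as the seed is unchanged.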

That being said, most algorithms based on radial searching have an inherent assumption of stationarity, in which case, it really shouldn't matter which of the options above you choose. In general, any of them should, theoretically, provide reasonable defaults (especially since they're the furthest points in the approximation, and should have the lowest effective weightings).

Another interesting option is to use the nearest neighbors like this:

<ul>
<li>For each class, compute the distances from the sample to that class's 5 nearest neighbors, so you end up with 5 distances per class.</li>
<li>Then take the mean of those distances for each class.</li>
<li>Assign the sample to the class with the lowest mean distance.</li>
</ul>
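
A minimal sketch of these steps, assuming Euclidean distance and plain tuples for the samples (the function name `classify_by_mean_distance` is made up for this example):

```python
import math
from collections import defaultdict

def classify_by_mean_distance(samples, labels, query, k=5):
    """Assign query to the class whose k nearest members are closest on average."""
    # Group the distance from the query to every sample by class label.
    by_class = defaultdict(list)
    for point, label in zip(samples, labels):
        by_class[label].append(math.dist(point, query))

    mean_dist = {}
    for label, dists in by_class.items():
        nearest = sorted(dists)[:k]  # fewer than k if the class is small
        mean_dist[label] = sum(nearest) / len(nearest)

    # The class with the lowest mean distance wins.
    return min(mean_dist, key=mean_dist.get), mean_dist

samples = [(1, 0), (2, 0), (3, 0), (4, 0)]
labels = ["a", "a", "b", "b"]
predicted, means = classify_by_mean_distance(samples, labels, (0, 0), k=2)
```

Because every class contributes its own k neighbors, the tie at the k-th position of a single shared neighbor list no longer arises, although two classes can still end up with equal mean distances.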

This approach works well for datasets whose classes overlap.

If you have another distance function, you can use it to break the tie. Even a poor one can do the job, though it is better if it encodes some heuristic. For instance, if you know that one of the features used to compute your main distance is more significant than the others, use only that feature to resolve the tie.

Otherwise, pick at random. Then run your program several times on the same test set to check whether the random choice matters.
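
As a rough illustration of the single-feature tie-break, assume the candidates are already known to be tied under the main distance and that feature index `feature` is the one you consider most significant (both the helper name and the choice of feature are hypothetical):

```python
def break_tie_by_feature(points, tied_indices, query, feature=0):
    """Among tied candidates, pick the one closest to the query on one feature."""
    return min(tied_indices, key=lambda i: abs(points[i][feature] - query[feature]))

# All three points are at Euclidean distance 5 from the origin.
points = [(5, 0), (0, 5), (3, 4)]
tied = [0, 1, 2]
by_first_feature = break_tie_by_feature(points, tied, (0, 0), feature=0)
by_second_feature = break_tie_by_feature(points, tied, (0, 0), feature=1)
```

If the candidates are still tied on that feature, `min` simply returns the first of them, so a random choice may still be needed as a last resort.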

With k=5, you look at the five nearest records and take the most common result among them. You could well end up with two pairs (for example, a 2-2-1 split among three classes), which puts you in a bind: you then have a 50/50 chance between the two tied classes.

So that makes life challenging. How do you pick a value for k? There are metrics you can use to analyze the result after the fact, but there is no strict rule for what k must be. When you are just starting out, I would make it easy on yourself and stick with k=3 instead of k=5, then later look into strategies that help you optimize k based on the actual accuracy of your predictions.
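
One common way to compare values of k by prediction accuracy is leave-one-out evaluation. The sketch below is only an assumption about how you might set that up (majority vote, Euclidean distance, a toy dataset invented for the example), not a prescribed method:

```python
import math
from collections import Counter

def knn_predict(train, train_labels, query, k):
    """Majority vote among the k nearest training points."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(points, labels, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(points)):
        train = points[:i] + points[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        hits += knn_predict(train, train_labels, points[i], k) == labels[i]
    return hits / len(points)

# Two well-separated clusters of four points each (toy data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a"] * 4 + ["b"] * 4
scores = {k: loo_accuracy(points, labels, k) for k in (1, 3, 7)}
```

On this toy data a small k classifies every point correctly, while k=7 always votes for the other cluster, because each held-out point has only three neighbors of its own class left.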

Using the KNN algorithm with, say, k=5, I try to classify an unknown object by getting its 5 nearest neighbors. What should I do if, after determining the 4 nearest neighbors, the next 2 (or more) nearest objects are at the same distance? Which object of these 2 or more should be chosen as the 5th nearest neighbor?

K Nearest-Neighbor Algorithm
