首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >最近邻搜索kdTree

最近邻搜索kdTree
EN

Stack Overflow用户
提问于 2018-01-06 11:16:27
回答 4查看 30.3K关注 0票数 6

对于一个N点列表,[(x_1,y_1), (x_2,y_2), ... ],我试图根据距离找到每个点最近的邻居。我的数据集太大,无法使用蛮力方法,因此KDtree似乎是最好的。

我看到sklearn.neighbors.KDTree可以找到最近的邻居,而不是从头开始实现一个。这能用来找到每个粒子的近邻,即返回一个dim(N)列表吗?

EN

回答 4

Stack Overflow用户

回答已采纳

发布于 2018-01-06 11:57:01

这个问题非常广泛,而且缺少细节。目前还不清楚你尝试了什么,你的数据是什么样子,最近的邻居是什么(身份?)

假设您对标识不感兴趣(距离为0),您可以查询两个最近的邻居并删除第一列。这可能是这里最简单的方法。

代码:

代码语言:javascript
复制
 import numpy as np
 from sklearn.neighbors import KDTree
 np.random.seed(0)
 X = np.random.random((5, 2))  # 5 points in 2 dimensions
 tree = KDTree(X)
 nearest_dist, nearest_ind = tree.query(X, k=2)  # k=2 nearest neighbors where k1 = identity
 print(X)
 print(nearest_dist[:, 1])    # drop id; assumes sorted -> see args!
 print(nearest_ind[:, 1])     # drop id 

输出

代码语言:javascript
复制
 [[ 0.5488135   0.71518937]
  [ 0.60276338  0.54488318]
  [ 0.4236548   0.64589411]
  [ 0.43758721  0.891773  ]
  [ 0.96366276  0.38344152]]
 [ 0.14306129  0.1786471   0.14306129  0.20869372  0.39536284]
 [2 0 0 0 1]
票数 15
EN

Stack Overflow用户

发布于 2018-01-06 11:33:34

您可以使用sklearn.neighbors.KDTreequery_radius()方法,它返回某个半径内最近的邻居的索引列表(而不是返回k个最近的邻居)。

代码语言:javascript
复制
from sklearn.neighbors import KDTree

points = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]

tree = KDTree(points, leaf_size=2)
all_nn_indices = tree.query_radius(points, r=1.5)  # NNs within distance of 1.5 of point
all_nns = [[points[idx] for idx in nn_indices] for nn_indices in all_nn_indices]
for nns in all_nns:
    print(nns)

产出:

代码语言:javascript
复制
[(1, 1), (2, 2)]
[(1, 1), (2, 2), (3, 3)]
[(2, 2), (3, 3), (4, 4)]
[(3, 3), (4, 4), (5, 5)]
[(4, 4), (5, 5)]

注意,每个点都包含在给定半径范围内的近邻列表中。如果要删除这些标识点,则可以将行计算all_nns更改为:

代码语言:javascript
复制
all_nns = [
    [points[idx] for idx in nn_indices if idx != i]
    for i, nn_indices in enumerate(all_nn_indices)
]

其结果是:

代码语言:javascript
复制
[(2, 2)]
[(1, 1), (3, 3)]
[(2, 2), (4, 4)]
[(3, 3), (5, 5)]
[(4, 4)]
票数 8
EN

Stack Overflow用户

发布于 2019-02-05 14:35:09

滑雪应该是最好的。我写了下面的一段时间,在那里我需要定制的距离。(我猜sklearn不支持自定义距离fn 具有自定义距离度量的'KD树。添加以供参考

改编自我的2D https://gist.github.com/alexcpn/1f187f2114976e748f4d3ad38dea17e8的要点

代码语言:javascript
复制
# From https://gist.github.com/alexcpn/1f187f2114976e748f4d3ad38dea17e8
# Author alex punnen
from collections import namedtuple
from operator import itemgetter
import numpy as np

def find_nearest_neighbour(node,point,distance_fn,current_axis):
    # Algorith to find nearest neighbour in a KD Tree;the KD tree has done a spatial sort
    # of the given co-ordinates, such that to the left of the root lies co-ordinates nearest to the x-axis
    # and to the right of the root ,lies the co-ordinates farthest from the x axis
    # On the y axis split on the left of the parent/root node lies co-ordinates nearest to the y-axis and to
    # the right of the root, lies the co-ordinates farthest from the y axis
    # to find the nearest neightbour, from the root, you first check left and right node; if distance is closer
    # to the right node,then the entire left node can be discarded from search, because of the spatial split
    # and that node becomes the root node. This process is continued recursively till the nearest is found
    # param:node: The current node
    # param: point: The point to which the nearest neighbour is to be found
    # param: distance_fn: to calculate the nearest neighbour
    # param: current_axis: here assuming only two dimenstion and current axis will be either x or y , 0 or 1

    if node is None:
        return None,None
    current_closest_node = node
    closest_known_distance = distance_fn(node.cell[0],node.cell[1],point[0],point[1])
    print closest_known_distance,node.cell

    x = (node.cell[0],node.cell[1])
    y = point

    new_node = None
    new_closest_distance = None
    if x[current_axis] > y[current_axis]:
        new_node,new_closest_distance= find_nearest_neighbour(node.left_branch,point,distance_fn,
                                                          (current_axis+1) %2)
    else:
        new_node,new_closest_distance = find_nearest_neighbour(node.right_branch,point,distance_fn,
                                                           (current_axis+1) %2) 

    if  new_closest_distance and new_closest_distance < closest_known_distance:
        print 'Reset closest node to ',new_node.cell
        closest_known_distance = new_closest_distance
        current_closest_node = new_node

    return current_closest_node,closest_known_distance


class Node(namedtuple('Node','cell, left_branch, right_branch')):
    # This Class is taken from wikipedia code snippet for  KD tree
    pass

def create_kdtree(cell_list,current_axis,no_of_axis):
    # Creates a KD Tree recursively following the snippet from wikipedia for KD tree
    # but making it generic for any number of axis and changes in data strucure
    if not cell_list:
        return
    # get the cell as a tuple list this is for 2 dimensions
    k= [(cell[0],cell[1])  for cell  in cell_list]
    # say for three dimension
    # k= [(cell[0],cell[1],cell[2])  for cell  in cell_list]
    k.sort(key=itemgetter(current_axis)) # sort on the current axis
    median = len(k) // 2 # get the median of the list
    axis = (current_axis + 1) % no_of_axis # cycle the axis
    return Node(k[median], # recurse 
                create_kdtree(k[:median],axis,no_of_axis),
                create_kdtree(k[median+1:],axis,no_of_axis))

def eucleaden_dist(x1,y1,x2,y2):
    a= np.array([x1,y1])
    b= np.array([x2,y2])
    dist = np.linalg.norm(a-b)
    return dist


np.random.seed(0)
#cell_list = np.random.random((2, 2))
#cell_list = cell_list.tolist()
cell_list = [[2,2],[4,8],[10,2]]
print(cell_list)
tree = create_kdtree(cell_list,0,2)

node,distance = find_nearest_neighbour(tree,(1, 1),eucleaden_dist,0)
print 'Nearest Neighbour=',node.cell,distance

node,distance = find_nearest_neighbour(tree,(8, 1),eucleaden_dist,0)
print 'Nearest Neighbour=',node.cell,distance
票数 3
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/48126771

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档