Q: Find the most frequent value at each coordinate across multiple 2D arrays

Stack Overflow user

Asked 2022-06-11 07:59:06

Answers: 3 | Views: 132 | Followers: 0 | Votes: 2

I have several 2D arrays, for example:

A = [[-1, -1, 0, 1, -1], [1, 1, 0, -1, -1], [-1, -1, -1, -1, -1], [-1, 1, -1, -1, 0]]
B = [[-1, -1, 0, 1, -1], [1, -1, 0, -1, -1], [0, 1, -1, 1, -1], [-1, 1, -1, -1, -1]]
C = [[0, -1, 0, 1, -1], [1, -1, 0, -1, -1], [0, 1, -1, 1, -1], [-1, 1, -1, -1, -1]]
D = [[-1, -1, 0, 1, 0], [0, 0, -1, 0, 1], [0, 1, -1, 1, -1], [-1, 1, -1, -1, -1]]

I need to find the most frequent value at each coordinate, so that the output looks like this:

E = [[-1 -1 0 1 -1],[1 -1 0 -1 -1],[0 1 -1 1 -1],[-1 1 -1 -1 -1]]

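I could certainly loop over every array, but I am looking for a vectorized approach. There can be around 10-11 arrays, each roughly 900x900 in size.

Is it possible to solve this with a list comprehension?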

numpy-ndarray

python

arrays

numpy

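3 Answers

Stack Overflow user

Accepted answer

Posted 2022-06-11 09:05:44

Doing this with a list comprehension is a bit awkward. It took some work, but it can be done.

Basically, you need nested list comprehensions, and all the arrays must have the same size for this to work.

A single matrix needs only one level of nesting, but a list of matrices is three-dimensional, so two nested levels are required.

I import mode from the statistics module to get the most common value.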

from statistics import mode


A = [[-1, -1, 0, 1, -1], [1, 1, 0, -1, -1], [-1, -1, -1, -1, -1], [-1, 1, -1, -1, 0]]
B = [[-1, -1, 0, 1, -1], [1, -1, 0, -1, -1], [0, 1, -1, 1, -1], [-1, 1, -1, -1, -1]]
C = [[0, -1, 0, 1, -1], [1, -1, 0, -1, -1], [0, 1, -1, 1, -1], [-1, 1, -1, -1, -1]]
D = [[-1, -1, 0, 1, 0], [0, 0, -1, 0, 1], [0, 1, -1, 1, -1], [-1, 1, -1, -1, -1]]

matrixes = [A, B, C, D]

# Outer loop runs over rows (len(matrixes[0])), inner loop over columns (len(matrixes[0][0]))
result = [[mode([x[k][j] for x in matrixes]) for j in range(len(matrixes[0][0]))] for k in range(len(matrixes[0]))]


print(result)

Result:

[[-1, -1, 0, 1, -1], [1, -1, 0, -1, -1], [0, 1, -1, 1, -1], [-1, 1, -1, -1, -1]]
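The same nested comprehension can be wrapped in a small function that also checks the same-size requirement mentioned above. This is only a sketch: the name `cellwise_mode` and the `ValueError` guard are illustrative assumptions, not part of the original answer.

```python
from statistics import mode


def cellwise_mode(matrices):
    """Return the most common value per cell across equally sized 2D lists.

    Hypothetical helper for illustration only.
    """
    n_rows = len(matrices[0])
    n_cols = len(matrices[0][0])
    # Guard for the "same size" requirement (an assumption added here)
    for m in matrices:
        if len(m) != n_rows or any(len(row) != n_cols for row in m):
            raise ValueError("all matrices must have the same shape")
    # Outer comprehension walks rows, inner one walks columns;
    # the innermost list gathers one cell from every matrix
    return [
        [mode([m[r][c] for m in matrices]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


print(cellwise_mode([[[1, 2, 3]], [[1, 2, 0]], [[9, 2, 3]]]))
```

Because the row count and column count are read separately, the sketch also handles non-square inputs such as the 1x3 matrices in the usage line.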

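Votes: 1

Stack Overflow user

Posted 2022-06-11 08:32:38

You can zip all the arrays together, then for each cell (row and column) count the values across all arrays and store the most common one, like this: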

import numpy as np
from collections import Counter
def cell_wise_cnt(arrs):
    n_row = 0
    res = np.empty((len(arrs[0]),len(arrs[0][0])))
    for row in zip(*arrs):
        arr = np.array(row)
        num_col = len(arr[0])
        for col in range(num_col):
            res[n_row][col] = Counter(arr[:, col]).most_common()[0][0]
        n_row += 1
    return res

Output:

>>> cell_wise_cnt(arrs = (A,B,C,D))

array([[-1., -1.,  0.,  1., -1.],
       [ 1., -1.,  0., -1., -1.],
       [ 0.,  1., -1.,  1., -1.],
       [-1.,  1., -1., -1., -1.]])

Benchmark on Colab

%timeit cell_wise_cnt(arrs = (A,B,C,D))
# 136 µs per loop

%timeit scipy.stats.mode([A,B,C,D]).mode
# 585 µs per loop

%timeit stats.mode((A,B,C,D)*100_000).mode
# 1.73 s per loop

%timeit cell_wise_cnt(arrs = (A,B,C,D)*100_000)
# 2.38 s per loop

With Python 3.9 and Julio Lopes's answer, we can get a better run time.

import statistics
def Julio_Lopes(arrs):
    return [[statistics.mode(j)  for j in zip(*i)] for i in zip(*arrs)]

%timeit Julio_Lopes(arrs = (A,B,C,D))
# 106 µs per loop

%timeit Julio_Lopes(arrs = (A,B,C,D)*100_000)
# 653 ms per loop
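Outside IPython or Colab, where the `%timeit` magic is not available, a comparable measurement can be sketched with the standard-library `timeit` module. The function name `zip_mode`, the small inputs, and the repeat count of 1000 are assumptions chosen for illustration; absolute timings will differ by machine.

```python
import statistics
import timeit


def zip_mode(arrs):
    # Same zip-based idea as above: pair up rows, then pair up cells
    return [[statistics.mode(cell) for cell in zip(*rows)] for rows in zip(*arrs)]


# Small illustrative inputs (not the arrays from the question)
A = [[-1, 0], [1, 1]]
B = [[-1, 0], [0, 1]]
C = [[0, 0], [1, -1]]

# Run the call 1000 times and report the average time per call
runs = 1000
elapsed = timeit.timeit(lambda: zip_mode((A, B, C)), number=runs)
per_call_us = elapsed / runs * 1e6
print(f"{per_call_us:.1f} µs per call")
```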

Votes: 1

Stack Overflow user

Posted 2022-06-11 18:20:20

You can simply use scipy.stats.mode

from scipy import stats

arr = [A,B,C,D]
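# mode is a function in scipy.stats; a Python list has no .mode method
# On recent SciPy releases where keepdims defaults to False, drop the second [0]
stats.mode(arr)[0][0].tolist()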

#output
 [[-1, -1, 0, 1, -1],
 [1, -1, 0, -1, -1],
 [0, 1, -1, 1, -1],
 [-1, 1, -1, -1, -1]]
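If SciPy is not available, a vectorized alternative can be sketched with NumPy alone. This is one possible approach under stated assumptions, not the method from any of the answers: it assumes the set of distinct values is small (here -1, 0, 1), since it builds one boolean comparison per distinct value. The name `cellwise_mode_np` is hypothetical.

```python
import numpy as np


def cellwise_mode_np(arrays):
    """Per-cell most frequent value across equally shaped 2D arrays.

    Hypothetical helper; memory use grows with the number of distinct values.
    """
    stack = np.stack(arrays)   # shape: (n_arrays, rows, cols)
    values = np.unique(stack)  # sorted distinct values
    # counts[v, i, j] = number of arrays holding values[v] at cell (i, j)
    counts = (stack[None, ...] == values[:, None, None, None]).sum(axis=1)
    # argmax picks the first maximum, so ties resolve to the smallest value
    return values[counts.argmax(axis=0)]


X = [[1, 0], [0, 1]]
Y = [[1, 1], [0, -1]]
Z = [[0, 1], [0, 1]]
print(cellwise_mode_np([X, Y, Z]).tolist())
```

With about 11 arrays of 900x900 and three distinct values, the intermediate boolean array has roughly 27 million elements, which should fit comfortably in memory on most machines.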

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/72582649
