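I am running deep learning on a large rectangular image using a sliding window. The image has shape (height, width).
The prediction output has shape (height, width, prediction_probability). My predictions come out in overlapping windows, and I need to add those windows together to get a pixel-wise prediction for the whole input image. The windows overlap by more than 2 pixels in (height, width).
In C++ I have done this before by creating one large result array (the board) and then adding all the ROIs into it.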
#include <vector>
#include <opencv2/core/core.hpp>
using namespace std;
using namespace cv;

// Adds one window of predictions (blobData, stored channel by channel) into the ROI of board.
template <int numberOfChannels>
static void AddBlobToBoard(Mat& board, const vector<float>& blobData,
                           int blobWidth, int blobHeight,
                           Rect roi) {
    for (int y = roi.y; y < roi.y + roi.height; y++) {
        auto vecPtr = board.ptr< Vec<float, numberOfChannels> >(y);
        for (int x = roi.x; x < roi.x + roi.width; x++) {
            for (int channel = 0; channel < numberOfChannels; channel++) {
                vecPtr[x][channel] +=
                    blobData[(channel * blobHeight + y - roi.y) * blobWidth + x - roi.x];
            }
        }
    }
}
Is there a vectorized way to do this in Python?
Posted on 2018-08-28 08:21:07
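Edit:
@Kevin, if you are training a network, you should do this step with a fully connected layer. That said...
I have a non-vectorized solution, if you want it. Any solution will be memory intensive. On my laptop it runs quickly for CIFAR-sized grayscale images (32x32). Perhaps someone clever can vectorize the key step.
First, split the test array arr into windows win using skimage. This is the test data.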
>>> import numpy as np
>>> from skimage.util.shape import view_as_windows as viewW
>>> arr = np.arange(20).reshape(5,4)
>>> win = viewW(arr, (3,3))
>>> arr # test data
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19]])
>>> win[0,0]==arr[:3,:3] # it works.
array([[ True, True, True],
[ True, True, True],
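[ True, True, True]])
Now, to recombine, create an output array out with shape (5,4,6). The 6 is the number of windows in win, and (5,4) is arr.shape. We will fill this array with one window per slice along the -1 axis.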
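As a quick check of where the 6 comes from, here is a minimal sketch. It assumes NumPy 1.20 or later and uses `sliding_window_view` as a stand-in for skimage's `view_as_windows`, purely so the snippet has no extra dependency; `n_rows` and `n_cols` are illustrative names.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.arange(20).reshape(5, 4)
win = sliding_window_view(arr, (3, 3))  # stand-in for view_as_windows(arr, (3,3))

# A 3x3 window with step 1 fits in (size - 3 + 1) positions along each axis.
n_rows = arr.shape[0] - 3 + 1  # 3 positions down
n_cols = arr.shape[1] - 3 + 1  # 2 positions across

print(win.shape)        # (3, 2, 3, 3)
print(n_rows * n_cols)  # 6
```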
# the array to be filled
out = np.zeros((5,4,6)) # shape of original arr stacked to the number of windows
# now make the set of indices of the window corners in arr
inds = np.indices((3,2)).T.reshape(3*2,2)
# and generate a list of slices. each selects the position of one window in out
slices = [np.s_[i[0]:i[0]+3:1,i[1]:i[1]+3:1,j] for i,j in zip(inds,range(6))]
# this will be the slow part. You have to loop through the slices.
# does anyone know a vectorized way to do this?
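for (ii,jj),slc in zip(inds,slices):
    out[slc] = win[ii,jj,:,:]
Now the out array contains every window in its correct position, but separated into panes along the -1 axis. To recover the original array, average all the nonzero elements along this axis.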
>>> out = np.true_divide(out.sum(-1),(out!=0).sum(-1))
>>> # this can't handle scenario where all elements in an out[i,i,:] are 0
>>> # so set nan to zero
>>> out = np.nan_to_num(out)
>>> out
array([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.],
[12., 13., 14., 15.],
[16., 17., 18., 19.]])
Can you think of a way to operate on a set of slices in a vectorized manner?
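One possible answer to that question, offered as a sketch rather than as part of the solution above, is to replace the list of slices with broadcast index arrays and let `np.add.at` do an unbuffered scatter-add. The function name `from_windows_scatter` is hypothetical, and `sliding_window_view` (NumPy 1.20+) is assumed here as a stand-in for `view_as_windows`. Instead of averaging nonzero entries, it keeps a per-pixel count of how many windows touched each position.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def from_windows_scatter(win):
    """Hypothetical helper: average overlapping step-1 windows back into one image."""
    a0, b0, h, w = win.shape
    # Image row/column hit by element (dy, dx) of window (i, j) is (i + dy, j + dx).
    rows = np.arange(a0)[:, None, None, None] + np.arange(h)[None, None, :, None]
    cols = np.arange(b0)[None, :, None, None] + np.arange(w)[None, None, None, :]
    rows, cols = np.broadcast_arrays(rows, cols)  # both have shape (a0, b0, h, w)

    total = np.zeros((a0 + h - 1, b0 + w - 1))
    count = np.zeros_like(total)
    # np.add.at accumulates correctly even when the same index appears many times.
    np.add.at(total, (rows, cols), np.asarray(win, dtype=float))
    np.add.at(count, (rows, cols), 1.0)
    return total / count  # every pixel is covered by at least one window

arr = np.arange(20).reshape(5, 4)
win = sliding_window_view(arr, (3, 3))
restored = from_windows_scatter(win)
```

This removes the Python loop over windows, but it still materializes index arrays as large as win itself, and whether it is actually faster than the loop has not been measured here. The loop-based steps above are what the rest of the answer assembles.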
Putting it all together:
def from_windows(win):
    """takes in an array of windows win and returns the original array from which they come"""
    a0,b0,w,_ = win.shape # number of windows along each axis, then window size (assumed square)
    a,b = a0+w-1,b0+w-1 # a,b are shape of original image
    n = a0*b0 # number of windows
    out = np.zeros((a,b,n)) # empty output to be summed over last axis
    inds = np.indices((a0,b0)).T.reshape(a0*b0,2) # indices of window corners into out
    slices = [np.s_[i[0]:i[0]+w:1,i[1]:i[1]+w:1,j] for i,j in zip(inds,range(n))] # make em slices
    for (ii,jj),slc in zip(inds,slices): # do the replacement into out
        out[slc] = win[ii,jj,:,:]
    out = np.true_divide(out.sum(-1),(out!=0).sum(-1)) # average over all nonzeros
    out = np.nan_to_num(out) # replace any nans left where every entry of out[i,j,:] is 0
    return out # hope you've got ram
And the test:
>>> arr = np.arange(32**2).reshape(32,32)
>>> win = viewW(arr, (3,3))
>>> np.all(arr==from_windows(win))
True
>>> %timeit from_windows(win)
6.3 ms ± 117 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Realistically, this is not going to be fast enough for you to train with.
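For comparison, a lower-memory variant can be sketched by keeping a running sum and a per-pixel count instead of stacking one pane per window. This is a sketch under stated assumptions, not the method timed above: `from_windows_sum` is a hypothetical name, the windows are assumed to have step 1, and `sliding_window_view` (NumPy 1.20+) stands in for `view_as_windows` so the block is self-contained. It loops over the offsets inside a window rather than over the windows themselves, so each step is a single slice addition, much like adding ROIs onto a board.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def from_windows_sum(win):
    """Hypothetical helper: average overlapping step-1 windows using a sum and a count."""
    a0, b0, h, w = win.shape
    total = np.zeros((a0 + h - 1, b0 + w - 1))
    count = np.zeros_like(total)
    for dy in range(h):
        for dx in range(w):
            # win[:, :, dy, dx] holds element (dy, dx) of every window at once;
            # window (i, j) places it at image position (i + dy, j + dx).
            total[dy:dy + a0, dx:dx + b0] += win[:, :, dy, dx]
            count[dy:dy + a0, dx:dx + b0] += 1
    return total / count  # every pixel is covered by at least one window

arr = np.arange(32 ** 2).reshape(32, 32)
win = sliding_window_view(arr, (3, 3))
restored = from_windows_sum(win)
```

Because this divides by a count rather than by the number of nonzero entries, a genuine 0 in the input is averaged like any other value. The extra memory is two arrays the size of the image. No timing is claimed for it here.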
https://stackoverflow.com/questions/52044113