How to implement a superpixel pooling layer?

Stack Overflow user

Asked 2017-05-11 02:58:35

1 answer, 1.1K views, 0 followers, 4 votes

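I want to implement the superpixel pooling layer defined in the paper "Weakly Supervised Semantic Segmentation Using Superpixel Pooling Network", which was originally implemented in Torch (the implementation is not available). I would like to do this in Keras, preferably with the Theano backend.

I will give a small example to show what this layer does. It takes the following inputs:

feature_map: shape = (batch_size, height, width, feature_dim)

superpixel_map: shape = (batch_size, height, width)

Let us assume two small matrices with batch_size = 1, height = width = 2, feature_dim = 1: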

feature_map = np.array([[[[ 0.1], [ 0.2 ]], [[ 0.3], [ 0.4]]]])  
superpixel_map = np.array([[[ 0,  0], [ 1,  2]]])

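The output has shape (batch_size, n_superpixels, feature_dim). Here, n_superpixels is basically np.amax(superpixel_map) + 1.

The output is computed as follows.

Find the positions where superpixel_map == i, with i ranging from 0 to n_superpixels - 1. Consider i = 0. The positions for i = 0 are (0, 0, 0) and (0, 0, 1).

Now average the elements of the feature map at those positions. This gives (0.1 + 0.2) / 2 = 0.15. Doing the same for i = 1 and i = 2 gives 0.3 and 0.4 respectively.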
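The computation described above can be written down as a plain NumPy reference, which is handy for checking any faster implementation. This is only a sketch: the function name superpixel_pool_reference is made up for illustration, and returning zeros for a superpixel index that never occurs in an image is an assumption, since the question does not say what should happen in that case.

```python
import numpy as np

def superpixel_pool_reference(feature_map, superpixel_map):
    """Average the features inside each superpixel, one boolean mask per index."""
    batch_size, height, width, feature_dim = feature_map.shape
    n_superpixels = int(np.amax(superpixel_map)) + 1
    out = np.zeros((batch_size, n_superpixels, feature_dim), dtype=feature_map.dtype)
    for b in range(batch_size):
        for i in range(n_superpixels):
            mask = superpixel_map[b] == i  # (height, width) boolean mask
            if mask.any():  # assumption: absent superpixels stay at zero
                out[b, i] = feature_map[b][mask].mean(axis=0)
    return out

feature_map = np.array([[[[0.1], [0.2]], [[0.3], [0.4]]]])
superpixel_map = np.array([[[0, 0], [1, 2]]])
pooled = superpixel_pool_reference(feature_map, superpixel_map)
print(pooled.shape)
print(pooled)
```

For the 2 x 2 example this yields one row per superpixel, holding the averages 0.15, 0.3 and 0.4 worked out by hand above.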

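The problem gets complicated because usually batch_size > 1 and height, width >> 1.

I implemented a new layer in Keras that basically does this, but I used for loops. With height = width = 32, Theano raises a maximum recursion depth error. Does anyone know how to solve this? If TensorFlow offers something new for this, I am ready to switch to the TensorFlow backend as well.

The code of my new layer is as follows: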

import theano.tensor as T
from keras import backend as K
from keras.layers import Layer

class SuperpixelPooling(Layer):
    def __init__(self, n_superpixels=None, n_features=None, batch_size=None, 
                 input_shapes=None, **kwargs):
        super(SuperpixelPooling, self).__init__(**kwargs)
        self.n_superpixels = n_superpixels
        self.n_features = n_features
        self.batch_size = batch_size
        self.input_shapes = input_shapes  # has to be a length-2 tuple. The first tuple has the
                                          # shape of the feature map and the second tuple has the
                                          # shape of the superpixel map. Shapes are of the
                                          # form (height, width, feature_dim)
    def compute_output_shape(self, input_shapes):
        return (input_shapes[0][0],
                    self.n_superpixels,
                    self.n_features)
    def call(self, inputs):
        # x = feature map
        # y = superpixel map, index from [0, n-1]
        x = inputs[0]  # batch_size x m x n x k
        y = inputs[1]  # batch_size x m x n
        ht = self.input_shapes[0][0]
        wd = self.input_shapes[0][1]
        z = K.zeros(shape=(self.batch_size, self.n_superpixels, self.n_features), 
                    dtype=float)
        count = K.zeros(shape=(self.batch_size, self.n_superpixels, self.n_features), 
                        dtype=int)
        for b in range(self.batch_size):
            for i in range(ht):
                for j in range(wd):
                    z = T.inc_subtensor(z[b, y[b, i, j], :], x[b, i, j, :])
                    count = T.inc_subtensor(count[b, y[b, i, j], :], 1)
        z /= count   
        return z

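I think the recursion depth error occurs because I am using nested for loops. I do not see any way to avoid these loops. If anyone has any suggestions, please let me know.

Cross-posted here. If I get any replies there, I will update this post.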

tensorflow

neural-network

deep-learning

keras

theano

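1 Answer

Stack Overflow user

Answered 2017-05-13 01:51:54

I have a preliminary implementation on my GitHub. It is not ready for use yet. Read on for more details. For completeness, I will post the implementation and a brief explanation of it here (basically taken from the README).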

import theano.tensor as T
from keras import backend as K
from keras.layers import Layer

class SuperpixelPooling(Layer):
    def __init__(self, n_superpixels=None, n_features=None, batch_size=None,
                 input_shapes=None, positions=None, superpixel_positions=None,
                 superpixel_hist=None, **kwargs):
        super(SuperpixelPooling, self).__init__(**kwargs)

        # self.input_spec = InputSpec(ndim=4)
        self.n_superpixels = n_superpixels
        self.n_features = n_features
        self.batch_size = batch_size
        self.input_shapes = input_shapes  # has to be a length-2 tuple. The first tuple has the shape of the feature map
                                          # and the second has the shape of the superpixel map.
                                          # Shapes are of the form (height, width, feature_dim)
        self.positions = positions  # has three columns
        self.superpixel_positions = superpixel_positions  # has two columns
        self.superpixel_hist = superpixel_hist  # is a vector
    def compute_output_shape(self, input_shapes):
        return (self.batch_size, self.n_superpixels, self.n_features)
    def call(self, inputs):
        # x = feature map
        # y = superpixel map, index from [0, n-1]
        x = inputs[0]  # batch_size x k x m x n
        y = inputs[1]  # batch_size x m x n
        ht = self.input_shapes[0][0]
        wd = self.input_shapes[0][1]
        z = K.zeros(shape=(self.batch_size, self.n_superpixels, self.n_features), dtype=float)
        z = T.inc_subtensor(z[self.superpixel_positions[:, 0], self.superpixel_positions[:, 1], :], x[self.positions[:, 0], :, self.positions[:, 1], self.positions[:, 2]])
        z /= self.superpixel_hist
        return z

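Explanation:

Implementation of a superpixel pooling layer in Keras. See keras.layers.pooling for the implementation.

The concept of the superpixel pooling layer was proposed in the paper "Weakly Supervised Semantic Segmentation Using Superpixel Pooling Network", AAAI 2017. The layer takes two inputs, a superpixel map (size M x N) and a feature map (size K x M x N). It pools the features belonging to the same superpixel (average pooling in this implementation) and forms a 1 x K vector, where K is the feature map depth (number of channels).

A naive implementation requires three for loops: one iterating over the batch, another over the rows, and the last over the columns of the feature map, pooling on the fly. However, this gives a "maximum recursion depth exceeded" error in Theano when you try to compile a model containing this layer. The error occurs even when the feature map width and height are only 32.

To overcome this, I figured that passing everything as arguments to the layer would get rid of at least two of the for loops. In the end, I came up with a one-liner that implements the core of the whole average pooling operation. You need to pass: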

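1. The number of superpixels in the image.
2. The feature map depth (number of channels).
3. The batch size.
4. The shapes of the feature map and the superpixel map.
5. An N x 3 matrix called positions, containing all possible index combinations of (batch_size, row, column). It needs to be generated only once during training, provided the input image size and the batch size stay the same.
6. An N x 2 matrix called superpixel_positions. Row i contains the superpixel index corresponding to the indices in row i of the positions matrix. For example, if row i of positions contains (12, 10, 20), then the same row of superpixel_positions contains (12, sp_i), where sp_i = superpixel_map[12, 10, 20].
7. An N x S matrix, superpixel_hist, where S is the number of superpixels in that image. As the name suggests, this matrix holds the histogram of the superpixels present in the current image.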
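As a rough illustration of how the three index arrays in the list could be precomputed, here is a NumPy sketch. The helper names build_pooling_indices and superpixel_pool_indexed are hypothetical and not part of the answer's code. The histogram is stored here as per-image counts of shape (batch_size, n_superpixels), which is an assumption chosen so that it broadcasts against the pooled output. np.add.at plays the role that T.inc_subtensor plays in the Theano layer, and the feature map is channels-first (batch_size x K x M x N) as in that layer.

```python
import numpy as np

def build_pooling_indices(superpixel_map, n_superpixels):
    """Precompute positions, superpixel_positions and per-image pixel counts."""
    batch_size, height, width = superpixel_map.shape
    # positions: one (batch, row, column) row per pixel, N = batch_size * height * width
    b, r, c = np.meshgrid(np.arange(batch_size), np.arange(height),
                          np.arange(width), indexing="ij")
    positions = np.stack([b.ravel(), r.ravel(), c.ravel()], axis=1)
    # superpixel_positions: (batch, superpixel index) for the same pixel, row by row
    sp = superpixel_map[positions[:, 0], positions[:, 1], positions[:, 2]]
    superpixel_positions = np.stack([positions[:, 0], sp], axis=1)
    # counts: number of pixels per superpixel per image (assumed histogram layout)
    counts = np.zeros((batch_size, n_superpixels), dtype=np.int64)
    np.add.at(counts, (superpixel_positions[:, 0], superpixel_positions[:, 1]), 1)
    return positions, superpixel_positions, counts

def superpixel_pool_indexed(feature_map, positions, superpixel_positions, counts):
    """Average pooling via one scatter-add; feature_map is (batch, K, M, N)."""
    batch_size, n_features = feature_map.shape[:2]
    z = np.zeros((batch_size, counts.shape[1], n_features))
    # Gather one K-vector per pixel: result has shape (N, K)
    gathered = feature_map[positions[:, 0], :, positions[:, 1], positions[:, 2]]
    # Unbuffered scatter-add, so repeated (batch, superpixel) pairs accumulate
    np.add.at(z, (superpixel_positions[:, 0], superpixel_positions[:, 1]), gathered)
    # Guard against division by zero for superpixels absent from an image
    return z / np.maximum(counts, 1)[:, :, None]

superpixel_map = np.array([[[0, 0], [1, 2]]])
feature_map_cf = np.array([[[[0.1, 0.2], [0.3, 0.4]]]])  # channels-first, K = 1
positions, superpixel_positions, counts = build_pooling_indices(superpixel_map, 3)
pooled_indexed = superpixel_pool_indexed(feature_map_cf, positions,
                                         superpixel_positions, counts)
print(pooled_indexed)
```

Because the arrays depend on the superpixel map, items 6 and 7 have to be rebuilt for every new batch, while positions can be reused as long as the batch size and image size stay fixed.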

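The drawback of this implementation is that these parameters have to change for every image (specifically, the parameters mentioned in points 6 and 7). This is impractical when the GPU processes a whole batch at once. I think this can be solved by passing all of these parameters to the layer as external inputs. Basically, they could be read from (say) an HDF5 file. I plan to do this shortly and will update this once it is done.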
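One possible way around per-image index arrays, which is not what the answer implements, is to one-hot encode the superpixel map and express the pooling as a batched matrix product. The NumPy sketch below only shows the arithmetic; the function name superpixel_pool_one_hot is hypothetical, the feature map is assumed to be channels-last, and porting it to a Keras backend would rely on that backend offering equivalent one-hot and batched product operations. Note that the intermediate one-hot tensor has batch_size x (height * width) x n_superpixels entries, so memory use grows with the number of superpixels.

```python
import numpy as np

def superpixel_pool_one_hot(feature_map, superpixel_map, n_superpixels):
    """Loop-free average pooling; feature_map is (batch, H, W, K), labels are ints."""
    batch_size, height, width, n_features = feature_map.shape
    flat_features = feature_map.reshape(batch_size, height * width, n_features)
    flat_labels = superpixel_map.reshape(batch_size, height * width)
    # one_hot[b, p, s] is 1 when pixel p of image b belongs to superpixel s
    one_hot = np.eye(n_superpixels, dtype=feature_map.dtype)[flat_labels]
    # Sum the features of each superpixel: (B, P, S) and (B, P, K) give (B, S, K)
    sums = np.einsum("bps,bpk->bsk", one_hot, flat_features)
    # Pixel count per superpixel, kept as (B, S, 1) so it broadcasts over K
    counts = one_hot.sum(axis=1)[:, :, None]
    # Assumption: superpixels absent from an image pool to zero
    return sums / np.maximum(counts, 1.0)

features = np.array([[[[0.1], [0.2]], [[0.3], [0.4]]],
                     [[[0.5], [0.7]], [[0.9], [1.1]]]])
labels = np.array([[[0, 0], [1, 2]],
                   [[1, 1], [1, 0]]])
pooled_one_hot = superpixel_pool_one_hot(features, labels, n_superpixels=3)
print(pooled_one_hot.shape)
```

Here the superpixel map travels with the batch as an ordinary input, so nothing has to be precomputed or stored per image.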

3 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/43905784