Attentive convolution with Keras

Stack Overflow user

Asked 2018-02-05 19:37:50

1 answer, 436 views, 0 followers, 1 vote

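I have implemented an attentive convolution layer in Keras, as described in this paper.

You can see the code in the gist below.

I am new to implementing custom layers, and the layer is still very slow. I use tf.map_fn a lot, which I suspect is the reason it is so slow, but I don't know another way to do it. It would be great if someone had tips on how to improve the layer, or general tips on implementing custom layers, such as how to avoid backend (TensorFlow) functions.

I am using Keras 2.1.3 with TensorFlow 1.5 as the backend.

Thanks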

keras-layer

tensorflow

machine-learning

keras

convolution

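1 Answer

Stack Overflow user

Answered 2018-02-05 20:34:03

I don't see why you are using tf.map_fn; you can avoid it everywhere...

Here are some hints (which may or may not make the code faster).

Casting

Do you really need to cast the values to float? If (at least) x[0] is an embedding, it is already a float, right? (I'm not sure about the nature of "context".)

Lines 37 and 38: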

text = x[0]
context = x[1]

Why map functions that Keras already supports?

For example, why do this (line 42):

weighted_attentive_context = tf.map_fn(self._compute_attentive_context, (text, context), dtype=K.floatx())

when you could do this instead?

weighted_attentive_context = self._compute_attentive_context(text,context)

by defining the method as:

def _compute_attentive_context(self, text, context):

Suggestion for _compute_attentive_context:

def _compute_attentive_context(self, text, context):

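    # Computes the context score for every vector, as in equation 2.
    # Assumed shapes: text (batch, words, emb), self.We (emb, emb),
    # context (batch, context_words, emb).
    temp = K.dot(text, self.We)
    # transpose_b swaps only the last two axes of context, per batch item;
    # K.transpose would reverse every axis, including the batch axis
    scores = tf.matmul(temp, context, transpose_b=True)

    # why not?
    scores_softmax = K.softmax(scores)

    # Computes the context feature map, as in equation 4.
    res = tf.matmul(scores_softmax, context)

    # why not?
    res = self._weight_for_output(res)
    return res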
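
As a sanity check on the idea of dropping tf.map_fn, here is a minimal NumPy sketch (not the layer itself) comparing a per-sample loop with a single batched computation. The shapes, the helper names and the random data are assumptions made for illustration, NumPy's @ stands in for the backend matmul, and the final _weight_for_output step is left out.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_context_single(text, context, We):
    # one sample: text (words, emb), context (ctx_words, emb), We (emb, emb)
    scores = text @ We @ context.T            # (words, ctx_words)
    return softmax(scores) @ context          # (words, emb)

def attentive_context_batched(text, context, We):
    # whole batch: text (batch, words, emb), context (batch, ctx_words, emb)
    temp = text @ We                              # (batch, words, emb)
    scores = temp @ np.swapaxes(context, 1, 2)    # (batch, words, ctx_words)
    return softmax(scores) @ context              # (batch, words, emb)

rng = np.random.default_rng(0)
text = rng.normal(size=(4, 7, 5))
context = rng.normal(size=(4, 9, 5))
We = rng.normal(size=(5, 5))

# what mapping over the batch computes, one sample at a time
looped = np.stack([attentive_context_single(t, c, We)
                   for t, c in zip(text, context)])
# the same result from one batched call
batched = attentive_context_batched(text, context, We)
print(batched.shape, np.allclose(looped, batched))
```

Note that only the last two axes of context are swapped; the batch axis stays in front, which is what makes the batched call equivalent to the loop.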

Suggestion for _conv: why not use K.conv1d instead of all those complicated repeats, concatenations, etc.?

def _conv(self, x):
    # here, you should make self.W1 with shape (filterLength, em_dim, desired_output_dim)
    return K.conv1d(x, self.W1, padding='same')

# If you have special reasons for what you're doing, please share them in the comments,
# and please also share the exact shapes of the inputs and the desired outputs.
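
To make the kernel shape concrete, here is a rough NumPy sketch of what a stride-1, 'same'-padded 1D convolution computes with a kernel of shape (filter_length, emb, out_dim). The function name conv1d_same and the example sizes are hypothetical; this is an illustration of the arithmetic, not the Keras implementation.

```python
import numpy as np

def conv1d_same(x, kernel):
    # x: (batch, words, emb); kernel: (filter_length, emb, out_dim)
    filter_length = kernel.shape[0]
    words = x.shape[1]
    # 'same' padding at stride 1: filter_length - 1 zeros in total,
    # with the extra one (if any) placed on the right
    pad_left = (filter_length - 1) // 2
    pad_right = filter_length - 1 - pad_left
    padded = np.pad(x, ((0, 0), (pad_left, pad_right), (0, 0)), mode="constant")
    out = np.zeros((x.shape[0], words, kernel.shape[2]))
    for k in range(filter_length):
        # each kernel tap is an (emb, out_dim) matrix applied to a shifted window
        out += padded[:, k:k + words, :] @ kernel[k]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 6, 3))        # (batch, words, emb)
kernel = rng.normal(size=(3, 3, 4))   # (filter_length, emb, out_dim)
y = conv1d_same(x, kernel)
print(y.shape)
```

The output keeps the words axis at its input length, so it can be added elementwise to another tensor of shape (batch, words, out_dim).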

Suggestion for call:

def call(self, x, mask=None):
    #x is a list of two tensors
    text = x[0]
    context = x[1]

    # applies the bilinear energy function (text * We * context)
    # and weights the computed feature map as in equation 6 (W2 * ci)
    weighted_attentive_context = self._compute_attentive_context(text, context)

    # does the actual convolution; this is still kind of hacky
    conv = K.conv1d(text, self.W1, padding='same')

    added = conv + weighted_attentive_context
    batch = K.bias_add(added, self.bias)
    return batch

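Batch matrix multiplications

For these multiplications, you can use K.dot(), as follows:

If batch x weights: K.dot(x, self.W)
If weights x batch: K.permute_dimensions(K.dot(self.W, x), (1, 0, 2))

Considering you have these shapes:

If batch x weights -> x: (batch, words, emb) | W: (emb, any)
If weights x batch -> W: (any, words) | x: (batch, words, emb)

The results will be:

If batch x weights: (batch, words, any) <- this seems the sensible choice
If weights x batch: (batch, any, emb)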
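
The two shape rules can be checked with a small NumPy sketch. It assumes that np.dot contracts the last axis of its first argument with the second-to-last axis of its second, which is the rule the Keras documentation describes for K.dot; the sizes and variable names are made up for illustration.

```python
import numpy as np

batch, words, emb, any_dim = 2, 3, 4, 5
x = np.ones((batch, words, emb))

# batch x weights: contracts emb, keeps the batch axis in front
W_right = np.ones((emb, any_dim))
out_bw = np.dot(x, W_right)                           # (batch, words, any)

# weights x batch: contracts words, but the batch axis ends up second,
# so it has to be moved back to the front
W_left = np.ones((any_dim, words))
raw = np.dot(W_left, x)                               # (any, batch, emb)
out_wb = np.transpose(raw, (1, 0, 2))                 # (batch, any, emb)

print(out_bw.shape, raw.shape, out_wb.shape)
```

The transpose with (1, 0, 2) plays the role of K.permute_dimensions in the second rule.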

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/48621769
