I'm trying to implement a basic RNN cell with flax.nn.Module. The equations for an RNN cell are quite simple:
a_t = W h_{t-1} + U x_t + b
h_t = \tanh(a_t)
o_t = V h_t + c
where h_t is the updated state at time t, x_t is the input, o_t is the output, and tanh is the activation function.
My code, using flax.nn.Module:
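class ElmanCell(nn.Module):
    @nn.compact
    def __call__(self, h, x):
        # W, U and b are not defined anywhere yet; that is the question
        nextState = jnp.tanh(jnp.dot(W, h) + jnp.dot(U, x) + b)
        return nextState

I don't know how to implement the parameters W, U and b. Should they be attributes of the nn.Module?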
Answered 2022-03-19 22:15:11
Try something like this:
import jax.numpy as jnp
from flax import linen as nn

class RNNCell(nn.Module):
    @nn.compact
    def __call__(self, state, x):
        # Wh @ h + Wx @ x + b can be efficiently computed
        # by concatenating the vectors and then having a single dense layer
        x = jnp.concatenate([state, x])
        # nn.Dense creates and owns the kernel and bias parameters
        new_state = jnp.tanh(nn.Dense(state.shape[0])(x))
        return new_state

This allows the parameters to be learned. See https://schmit.github.io/jax/2021/06/20/jax-language-model-rnn.html
https://stackoverflow.com/questions/71475589