
Deep Learning

Very simple RNN example


 

1. Notation

 

I_{t} : input vector to RNN at time t

S_{t} : vector that represents RNN state at time t (also called hidden state)

O_{t} : output vector of RNN at time t

W_{1}, b_{1} : weight matrix and bias vector to be trained from data, which determine the next state vector of RNN given the current input and previous state vectors.

W_{2}, b_{2} : weight matrix and bias vector to be trained from data, which determine the current output of RNN given the current state

 

2. Explanation

 

Eqn. 1 is for the state update and Eqn. 2 is for the output. In the notation above:

Eqn. 1: S_{t} = sigmoid(W_{1} [I_{t}; S_{t-1}] + b_{1})

Eqn. 2: O_{t} = g(W_{2} S_{t} + b_{2})

where [I_{t}; S_{t-1}] denotes the concatenation of the current input and the previous state. The size of the hidden state S_{t} is a design choice.

For Eqn. 1, the sigmoid activation is used; the output activation g in Eqn. 2 depends on your application (e.g., softmax for classification, identity for regression).
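The two equations can be sketched in a few lines of NumPy. This is a minimal illustration of the state update and output computation, with hypothetical dimensions (input size 3, hidden state size 4, output size 2) and random untrained weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical dimensions: input size 3, hidden state size 4, output size 2
input_size, state_size, output_size = 3, 4, 2

rng = np.random.default_rng(0)
W1 = rng.standard_normal((state_size, input_size + state_size))  # state-update weights
b1 = np.zeros(state_size)
W2 = rng.standard_normal((output_size, state_size))              # output weights
b2 = np.zeros(output_size)

def rnn_step(I_t, S_prev):
    """One RNN step: Eqn. 1 (state update) then Eqn. 2 (output)."""
    concat = np.concatenate([I_t, S_prev])   # [I_t; S_{t-1}]
    S_t = sigmoid(W1 @ concat + b1)          # Eqn. 1
    O_t = W2 @ S_t + b2                      # Eqn. 2 (identity output activation here)
    return S_t, O_t

# Run the cell over a short input sequence, threading the state through time
S = np.zeros(state_size)
for t in range(5):
    I = rng.standard_normal(input_size)
    S, O = rnn_step(I, S)

print(S.shape, O.shape)  # (4,) (2,)
```

Note that the same W_{1}, b_{1}, W_{2}, b_{2} are reused at every time step; only the state S carries information across steps.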

 

3. Example

 

Toy examples can be found at https://medium.com/@erikhallstrm/hello-world-rnn-83cd7105b767



* A good question and answer from Stack Overflow: https://stackoverflow.com/questions/40384791/for-the-tf-nn-rnn-cell-basicrnn-whats-the-difference-between-the-state-and-outp


Q) For tf.nn.rnn_cell.BasicRNNCell, what's the difference between the state and the output?

As I understand it, state = tanh(w * input + u * pre_state + b) and output = state * w_out. But for tf.nn.rnn_cell.BasicRNNCell, I only set num_units (I think it's the dimension of the state), and the API page says: "Most basic RNN: output = new_state = activation(W * input + U * state + B)". So can I assume that in this cell state = output, and that the cell only has w, u, b but no w_out?


A) The "vanilla" RNN that you describe computes the new hidden state, and then uses some output projection to compute the output. In TensorFlow, the "compute new hidden state" and "compute output projection" parts are separated. BasicRNNCell just outputs the hidden state as its output; another class, OutputProjectionWrapper, can then apply a projection to it (multiplying by w_out is just applying a projection). To get the behavior you want, you need to do:


tf.nn.rnn_cell.OutputProjectionWrapper(tf.nn.rnn_cell.BasicRNNCell(...), num_output_units)

This also allows you to have a different number of neurons in your hidden state and in your output projection.
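The separation the answer describes can be sketched in plain NumPy (a hypothetical re-implementation for illustration, not the actual TensorFlow code): the basic cell returns its new state as its output, and the projection by w_out is a separate step applied afterwards.

```python
import numpy as np

def basic_rnn_cell(x, state, W, U, b):
    """Vanilla cell, as in the TF docs: output = new_state = tanh(W*x + U*state + b)."""
    new_state = np.tanh(W @ x + U @ state + b)
    return new_state, new_state   # output and state are the same array

def output_projection(output, w_out):
    """OutputProjectionWrapper equivalent: apply w_out to the cell output."""
    return w_out @ output

# Hypothetical sizes: 3 inputs, 5 hidden units, 2 projected outputs
rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 3, 5, 2
W = rng.standard_normal((n_hidden, n_in))
U = rng.standard_normal((n_hidden, n_hidden))
b = np.zeros(n_hidden)
w_out = rng.standard_normal((n_out, n_hidden))

x = rng.standard_normal(n_in)
state = np.zeros(n_hidden)

out, new_state = basic_rnn_cell(x, state, W, U, b)
assert out is new_state                 # BasicRNNCell: state == output
proj = output_projection(out, w_out)    # separate projection step

print(out.shape, proj.shape)  # (5,) (2,)
```

This makes the answer concrete: the cell alone never sees w_out, so its output dimension equals the hidden-state dimension; only the wrapper changes the output size.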