
Deep Learning

Very simple LSTM example


 

1. Notation

 

x_{t} : input vector at time t

h_{t} : hidden state of the RNN at time t

C_{t} : cell state of the RNN at time t

f_{t} = sigmoid(W1 * [h_{t-1}, x_{t}] + b1)   (forget gate)

i_{t} = sigmoid(W2 * [h_{t-1}, x_{t}] + b2)   (input gate)

tilde{C}_{t} = tanh(W3 * [h_{t-1}, x_{t}] + b3)   (candidate cell state)

where [h_{t-1}, x_{t}] denotes the concatenation of h_{t-1} and x_{t}.
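The three gate equations above can be sketched directly in NumPy. The sizes (hidden state of 3 units, input of 2 features) and the random weights are toy assumptions for illustration only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed toy sizes: hidden state of 3 units, input of 2 features.
n_h, n_x = 3, 2
rng = np.random.default_rng(0)

# Each gate has its own weight matrix (W1..W3 in the notation) acting on the
# concatenation [h_{t-1}, x_{t}], which has length n_h + n_x.
W1 = rng.standard_normal((n_h, n_h + n_x)); b1 = np.zeros(n_h)  # forget gate
W2 = rng.standard_normal((n_h, n_h + n_x)); b2 = np.zeros(n_h)  # input gate
W3 = rng.standard_normal((n_h, n_h + n_x)); b3 = np.zeros(n_h)  # candidate cell

h_prev = np.zeros(n_h)                  # previous hidden state h_{t-1}
x_t = rng.standard_normal(n_x)          # current input x_{t}
concat = np.concatenate([h_prev, x_t])  # [h_{t-1}, x_{t}]

f_t = sigmoid(W1 @ concat + b1)         # in (0, 1): how much old info to keep
i_t = sigmoid(W2 @ concat + b2)         # in (0, 1): how much new info to write
C_tilde = np.tanh(W3 @ concat + b3)     # in (-1, 1): candidate new information
```

Note that the sigmoid outputs lie strictly between 0 and 1, which is what lets f_{t} and i_{t} act as soft on/off switches per cell-state component.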

 

2. Explanation

 

The cell state can be regarded as a bag of information accumulated from past inputs. Every time the RNN receives a new input, it decides how much new information from that input to store in the bag (the cell state) and how much old information in the bag to discard.

 

 

a) As in a simple RNN (explained in the previous post, 2018/03/06 - [Deep Learning] - Very simple RNN example), the current input (x_{t}) and the previous hidden state (h_{t-1}) are concatenated and used to update both the hidden state and the cell state.
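The concatenation step is just stacking the two vectors end to end. A tiny check with assumed toy values:

```python
import numpy as np

# Assumed toy values: h_{t-1} has 3 units, x_{t} has 2 features, so the
# concatenated vector fed to every gate has length 3 + 2 = 5.
h_prev = np.array([0.1, -0.2, 0.3])
x_t = np.array([1.0, 2.0])
concat = np.concatenate([h_prev, x_t])
```

This is why each weight matrix W1..W4 has shape (hidden size) x (hidden size + input size): it multiplies the single concatenated vector rather than h_{t-1} and x_{t} separately.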

 

b) From the information point of view, C_{t-1} carries old information while tilde{C}_{t}, the candidate cell state, carries new information. The forget gate f_{t} determines how much of the old information in C_{t-1} to keep, and the input gate i_{t} determines how much of the new information in tilde{C}_{t} to write. Therefore C_{t}, the current cell state, is determined by

 

C_{t} = f_{t} * C_{t-1} + i_{t} * tilde{C}_{t}   (* : elementwise multiplication)
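The multiplications in this update are elementwise, so each component of the cell state is gated independently. A toy NumPy check with assumed three-unit values:

```python
import numpy as np

# Assumed toy values for a three-unit cell state.
C_prev  = np.array([0.5, -1.0, 0.2])   # old information C_{t-1}
f_t     = np.array([1.0,  0.0, 0.5])   # keep all / forget all / keep half
i_t     = np.array([0.0,  1.0, 0.5])   # write none / write all / write half
C_tilde = np.array([0.9,  0.9, 0.9])   # candidate new information

# Elementwise update: per component, C_t = [0.5, 0.9, 0.55].
C_t = f_t * C_prev + i_t * C_tilde
```

The first component keeps the old value untouched, the second is fully overwritten by the candidate, and the third blends the two — exactly the forget/remember behavior described above.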

 

c) Now, the current hidden state h_{t} is determined from the current input, the previous hidden state, and the current cell state as follows:

 

o_{t} = sigmoid(W4 * [h_{t-1}, x_{t}] + b4)   (output gate)

h_{t} = o_{t} * tanh(C_{t})
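Putting a), b), and c) together gives one full LSTM step. Below is a minimal NumPy sketch of that step, with toy dimensions and random weights assumed; a real implementation would learn W1..W4 and b1..b4 by backpropagation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W1, b1, W2, b2, W3, b3, W4, b4):
    """One LSTM step following the formulas in this post (toy sketch)."""
    concat = np.concatenate([h_prev, x_t])  # [h_{t-1}, x_{t}]
    f_t = sigmoid(W1 @ concat + b1)         # forget gate
    i_t = sigmoid(W2 @ concat + b2)         # input gate
    C_tilde = np.tanh(W3 @ concat + b3)     # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde      # elementwise cell state update
    o_t = sigmoid(W4 @ concat + b4)         # output gate
    h_t = o_t * np.tanh(C_t)                # new hidden state
    return h_t, C_t

# Assumed toy dimensions: hidden size 3, input size 2.
n_h, n_x = 3, 2
rng = np.random.default_rng(1)
# Alternate random weight matrices (even indices) and zero biases (odd).
params = [rng.standard_normal((n_h, n_h + n_x)) if k % 2 == 0 else np.zeros(n_h)
          for k in range(8)]

h, C = np.zeros(n_h), np.zeros(n_h)
for t in range(4):                          # run a few steps on random inputs
    h, C = lstm_step(rng.standard_normal(n_x), h, C, *params)
```

Because h_{t} = o_{t} * tanh(C_{t}) with o_{t} in (0, 1) and tanh in (-1, 1), every component of the hidden state stays strictly inside (-1, 1), no matter how many steps are run.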

 

 

For more details, visit http://colah.github.io/posts/2015-08-Understanding-LSTMs/