- Sequence modelling powers Google Translate
- A self-parking car is another example of sequence modelling
- Classic sequence-modelling task: predict the next word
- Standard ML models are not designed for sequences
- A feed-forward network (FFN) fixes the size of its input up front
- Sequences are variable-length inputs
- Goal: use all the information available in the sequence while still producing a fixed-length vector
- BoW (bag of words): each slot represents a word and the number is its count of occurrences; the vector size stays the same
- Sequential information is lost in BoW, as the sketch below shows
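
A minimal sketch of why BoW loses order, using a made-up vocabulary and sentences: two sentences with opposite meanings collapse to the same vector.

```python
from collections import Counter

# Toy vocabulary; each slot of the vector counts one of these words
vocab = ["all", "at", "bad", "food", "good", "is", "not", "the"]

def bag_of_words(sentence):
    # Vector length equals vocab size, no matter how long the sentence is
    counts = Counter(sentence.lower().split())
    return [counts[w] for w in vocab]

# Opposite meanings, identical vectors: word order is lost
print(bag_of_words("the food is good not bad at all"))
print(bag_of_words("the food is bad not good at all"))
# Both print [1, 1, 1, 1, 1, 1, 1, 1]
```
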
- Goal: preserve the sequence while also maintaining a fixed-length representation
- Design criteria: deal with variable-length sequences
- Maintain sequence order
- Keep track of long-term dependencies
- Share parameters across the sequence
- RNNs are architected much like a standard NN
- Each hidden unit applies a slightly different function
- A hidden unit is a function of the current input and its own previous output (the cell state)
- At each time step: current input + previous cell state = the new state
- Parameter sharing is taken care of: the same weights are reused at every time step
- The state Sn contains information from all past time steps
- In principle this captures long-term dependencies (a minimal cell sketch follows below)
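
A minimal NumPy sketch of one vanilla RNN step, h_t = tanh(W_xh·x_t + W_hh·h_{t-1} + b); the sizes and random weights here are illustrative assumptions.

```python
import numpy as np

input_dim, hidden_dim = 4, 8
rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden
b_h = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    # New state = function of current input + previous cell state;
    # the same weights are shared across every time step
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_dim)                      # initial state
for x_t in rng.normal(size=(5, input_dim)):   # a toy 5-step sequence
    h = rnn_step(x_t, h)                      # h carries information from all past steps
```
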
- Training is similar to a standard NN
- Backpropagation through time (BPTT): gradient descent takes the derivative of the loss with respect to each parameter and shifts the parameters in the opposite direction to minimise the loss
- A loss is computed at each time step; total loss = sum of the losses at every time step (see the sketch below)
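
A hedged PyTorch sketch of the per-step loss summed over time (autograd then performs BPTT when .backward() is called); the model sizes and random data are illustrative assumptions.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=8, batch_first=True)
head = nn.Linear(8, 3)                    # per-step classifier over 3 classes
criterion = nn.CrossEntropyLoss()

x = torch.randn(2, 5, 4)                  # batch of 2, 5 time steps, 4 features
targets = torch.randint(0, 3, (2, 5))     # a label at every time step

outputs, _ = rnn(x)                       # hidden state at every step: (2, 5, 8)
logits = head(outputs)                    # (2, 5, 3)

# Total loss = sum of the loss at every time step
loss = sum(criterion(logits[:, t, :], targets[:, t]) for t in range(5))
loss.backward()                           # backpropagation through time
```
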
- Vanishing gradient problem: as the number of time steps grows, the chain of gradients gets longer and longer and the repeated products can shrink towards zero
- Fix 1: choose activation functions carefully (ReLU, tanh, sigmoid)
- Fix 2: initialise weights to something like the identity matrix, to prevent the product from shrinking (see the toy demo below)
- Fix 3: add more complex cells (gated cells)
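
A toy demo of the problem and of the identity-initialisation fix: multiplying many small matrices (stand-ins for the per-step Jacobians) drives the product towards zero, while an identity-like matrix preserves its scale. The sizes and scales are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
small = rng.normal(scale=0.1, size=(8, 8))   # "small" recurrent weights
ident = np.eye(8)                            # identity initialisation

prod_small, prod_ident = np.eye(8), np.eye(8)
for _ in range(50):                          # 50 time steps of backprop
    prod_small = prod_small @ small
    prod_ident = prod_ident @ ident

print(np.linalg.norm(prod_small))  # ~0: the gradient signal has vanished
print(np.linalg.norm(prod_ident))  # unchanged: the identity keeps the norm
```
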
- RNN vs gated variants: LSTM, GRU
- Long Short-Term Memory (LSTM): can keep memory unchanged for many time steps
- Three-step process:
- Step 1 - Forget irrelevant parts of the previous state (forget gate)
- Step 2 - Selectively update the cell state (kept separate from what is output)
- Step 3 - Output certain parts of the cell state
- The three steps are implemented using gates
- Gates are implemented using sigmoid functions (outputs between 0 and 1 act as soft switches)
- The cell-state update happens through an additive function, which helps gradients flow
- The final cell state summarises all information from the sequence (see the cell sketch below)
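
A minimal NumPy sketch of one LSTM step following the three gates above; the shapes, random weights, and omitted biases are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden_dim, input_dim = 8, 4
rng = np.random.default_rng(0)
# One weight matrix per gate, acting on [h_prev, x_t]; biases omitted for brevity
W_f, W_i, W_c, W_o = (rng.normal(scale=0.1, size=(hidden_dim, hidden_dim + input_dim))
                      for _ in range(4))

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W_f @ z)             # Step 1: forget gate drops irrelevant past state
    i = sigmoid(W_i @ z)             # Step 2: input gate selects what to update...
    c_tilde = np.tanh(W_c @ z)       # ...and the candidate values to write
    c = f * c_prev + i * c_tilde     # additive update: memory passes unchanged when f≈1, i≈0
    o = sigmoid(W_o @ z)             # Step 3: output gate exposes parts of the cell state
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):  # a toy 5-step sequence
    h, c = lstm_step(x_t, h, c)              # final c summarises the whole sequence
```
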
- Application: music generation using an RNN
- Machine translation: two RNNs side by side, an encoder and a decoder
- The encoder's final cell state is passed to the decoder, which decodes it and produces the sentence in the target language
- With attention in machine translation, the decoder instead takes a weighted sum of all the encoder's previous cell states (a minimal attention sketch follows below)
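
A minimal dot-product attention sketch: the decoder forms a softmax-weighted sum of all encoder states instead of relying only on the final one. The random states stand in for real encoder/decoder hidden states; real systems would score with learned projections.

```python
import numpy as np

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(6, 8))   # 6 source time steps, hidden size 8
decoder_state = rng.normal(size=8)         # current decoder hidden state

scores = encoder_states @ decoder_state           # one relevance score per source step
weights = np.exp(scores) / np.exp(scores).sum()   # softmax over source steps
context = weights @ encoder_states                # weighted sum of all encoder states
```
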
Happy Mastering DL!!!