"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

November 23, 2018

Day #153 - Sequence Modelling with Neural Networks

  • Sequence modelling in Google Translate
  • Self-parking cars - a sequence modelling problem
Challenges
  • Sequence modelling - predict the next word
  • Most ML models are not designed for sequences
  • A feed-forward network (FFN) specifies the size of the input at the outset (fixed)
  • Sequences are variable-length inputs
  • Need to use all the information available in the sequence while still producing a fixed-length vector
  • Bag of words (BoW): each slot represents a word and the value is its number of occurrences, so the vector size stays the same (sketched after this list)
  • Sequential (order) information is lost in BoW
  • Goal: preserve the sequence order but also maintain a fixed length
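
A minimal sketch of the bag-of-words idea above (a toy example of my own, assuming the vocabulary is built from the corpus itself): every sentence maps to a fixed-length count vector regardless of its length, but the two sentences below, which mean opposite things, produce the same vector because word order is discarded.

    from collections import Counter

    # Toy corpus: same words, opposite meanings
    corpus = ["the food was good not bad at all",
              "the food was bad not good at all"]

    vocab = sorted({w for sent in corpus for w in sent.split()})

    def bow_vector(sentence, vocab):
        counts = Counter(sentence.split())
        return [counts[w] for w in vocab]   # one slot per word, value = occurrences

    for sent in corpus:
        print(bow_vector(sent, vocab))      # both sentences print the same vector
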
To Model Sequences
  • Deal with variable-length sequences
  • Maintain sequence order
  • Keep track of long-term dependencies
  • Share parameters across the sequence
RNN (Recurrent Neural Network)
  • Architecture similar to a standard NN
  • Each hidden unit applies a slightly different function
  • Hidden unit - a function of the current input and its own previous output (cell state)
  • At each time step, the input plus the previous cell state forms the new input
  • Parameter sharing is taken care of - the same weights are reused at every time step
  • Sn contains information from all past time steps (sketched below)
  • Helps capture long-term dependencies
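
A minimal NumPy sketch of the RNN idea above (illustrative names and sizes of my own choosing, not code from the post): the same three weight matrices are shared across every time step, the state s is a function of the current input and the previous state, and the final s summarises the whole variable-length sequence.

    import numpy as np

    rng = np.random.default_rng(0)
    input_dim, hidden_dim, output_dim = 8, 16, 4

    W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input -> hidden
    W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden -> hidden (recurrence)
    W_hy = rng.normal(scale=0.1, size=(output_dim, hidden_dim))  # hidden -> output

    def rnn_forward(inputs):
        """inputs: a list of vectors, one per time step (any length)."""
        s = np.zeros(hidden_dim)                # initial cell state
        outputs = []
        for x in inputs:                        # same weights reused at every step
            s = np.tanh(W_xh @ x + W_hh @ s)    # new state = f(current input, previous state)
            outputs.append(W_hy @ s)
        return outputs, s                       # final s carries information from all past steps

    sequence = [rng.normal(size=input_dim) for _ in range(10)]
    outputs, final_state = rnn_forward(sequence)
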
Train RNN
  • Similar to training a NN
  • Backpropagation through time (gradient descent - take the derivative of the loss with respect to each parameter and shift the parameters in the opposite direction to minimise the loss)
  • Loss at each time step; total loss = sum of the losses at every time step
  • Vanishing gradient problem - as the number of time steps increases, the chain of gradients gets longer and the product of many small factors shrinks towards zero (toy illustration below)
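
A toy illustration of the vanishing gradient point (my own numbers, not from the post): backpropagation through time multiplies one factor per time step into the gradient, so when those factors are smaller than 1 the contribution from distant time steps shrinks exponentially.

    # Each earlier time step contributes a product of per-step factors
    # (activation derivatives times recurrent weights) to the gradient.
    per_step_factor = 0.9
    for T in (1, 10, 50, 100):
        print(T, per_step_factor ** T)
    # 1   0.9
    # 10  ~0.35
    # 50  ~0.005
    # 100 ~0.00003  -> signal from early time steps effectively disappears
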
Methods to Address the Vanishing Gradient Problem in RNNs
  • Activation functions (ReLU, tanh, sigmoid)
  • Initializing the weights to something like the identity matrix (prevents the repeated product from shrinking - see the sketch after this list)
  • Add more complex cells (gated cells)
  • Plain RNN vs LSTM, GRU
  • Long Short-Term Memory (keeps the memory unchanged for many time steps)
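
A small sketch of the first two tricks in the list above (my own illustration): a ReLU activation keeps the derivative at 1 for positive inputs, and initialising the recurrent weight matrix near the identity stops the repeated matrix product from shrinking at the start of training.

    import numpy as np

    hidden_dim = 16
    relu = lambda x: np.maximum(0.0, x)   # derivative is 1 for positive inputs

    # Recurrent weights initialised to (approximately) the identity matrix,
    # so multiplying by W_hh many times does not immediately shrink the state.
    W_hh = np.eye(hidden_dim) + np.random.default_rng(0).normal(
        scale=0.01, size=(hidden_dim, hidden_dim))
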
LSTMs Overview
  • 3-step process
  • Step 1 - Forget the irrelevant parts of the previous state (forget gate)
  • Step 2 - Selectively update the cell state (kept separate from what is outputted)
  • Step 3 - Output certain parts of the cell state
  • The 3 steps are implemented using gates
  • Gates are implemented using sigmoid functions
  • The cell-state update happens through an additive function
  • The final cell state summarises all the information from the sequence (sketched below)
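
A minimal NumPy sketch of one LSTM step following the standard LSTM equations (biases omitted for brevity; the names and sizes are my own, not the post's): sigmoid gates implement the three steps, and the additive cell-state update is what lets memory persist across many time steps.

    import numpy as np

    rng = np.random.default_rng(0)
    input_dim, hidden_dim = 8, 16

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # One weight matrix per gate, acting on [previous hidden state, current input]
    Wf, Wi, Wo, Wc = (rng.normal(scale=0.1, size=(hidden_dim, hidden_dim + input_dim))
                      for _ in range(4))

    def lstm_step(x, h_prev, c_prev):
        z = np.concatenate([h_prev, x])
        f = sigmoid(Wf @ z)             # Step 1: forget gate - drop irrelevant past state
        i = sigmoid(Wi @ z)             # Step 2: input gate - choose what to update
        c_tilde = np.tanh(Wc @ z)       #         candidate values for the cell state
        c = f * c_prev + i * c_tilde    #         additive update keeps gradients flowing
        o = sigmoid(Wo @ z)             # Step 3: output gate - expose part of the cell state
        h = o * np.tanh(c)
        return h, c

    h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
    for x in [rng.normal(size=input_dim) for _ in range(10)]:
        h, c = lstm_step(x, h, c)       # final c summarises the whole sequence
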
Applications
  • Music generation using RNNs
  • Machine translation (two RNNs side by side - encoder / decoder)
  • The encoder's final cell state is passed to the decoder, which figures out the meaning and produces the sentence in a different language
  • With attention in machine translation we take a weighted sum of all previous encoder cell states (sketched below)
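
A toy sketch of the attention idea in the last bullet (my own illustration with random vectors): instead of handing the decoder only the encoder's final cell state, score every encoder state against the current decoder state, softmax the scores, and take the weighted sum as the context for the next output word.

    import numpy as np

    rng = np.random.default_rng(0)
    hidden_dim, src_len = 16, 7

    encoder_states = rng.normal(size=(src_len, hidden_dim))  # one state per source word
    decoder_state = rng.normal(size=hidden_dim)               # current decoder state

    scores = encoder_states @ decoder_state                   # similarity score per source step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                                   # softmax -> attention weights

    context = weights @ encoder_states                         # weighted sum of all encoder states
    # The decoder combines `context` with its own state to produce the next word.
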



Happy Mastering DL!!!
