- Input layer - hidden layer(s) - output layer
- Each layer can be rewritten as a matrix multiplication (see the sketch after this list)
- Activation functions apply a non-linear transformation; real data is non-linear, so they give the model non-linear capacity
- Iteratively step in the direction of descent until the loss converges: stochastic gradient descent (SGD)
- Backpropagation computes the gradient
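A minimal sketch of the ideas above (not from the lecture; all names are illustrative): a one-hidden-layer network written as matrix multiplications plus a non-linearity, with the gradient found by backpropagation and one SGD update applied.

```python
import numpy as np

# Minimal sketch (illustrative, not lecture code): a 1-hidden-layer network
# as matrix multiplications + a non-linear activation, with one SGD step
# computed by backpropagation.

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

# Toy data: 4 examples, 3 features, binary labels.
X = rng.normal(size=(4, 3))
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# Layer weights: input (3) -> hidden (5) -> output (1).
W1 = rng.normal(scale=0.1, size=(3, 5))
W2 = rng.normal(scale=0.1, size=(5, 1))
lr = 0.1  # SGD step size

# Forward pass: each layer is a matrix multiply followed by a non-linearity.
h = relu(X @ W1)                      # hidden activations
logits = h @ W2
p = 1.0 / (1.0 + np.exp(-logits))     # sigmoid output
loss = np.mean((p - y) ** 2)          # squared-error loss, kept simple
print("loss before update:", loss)

# Backpropagation: chain rule from the loss back to each weight matrix.
dp = 2.0 * (p - y) / len(X)
dlogits = dp * p * (1.0 - p)          # through the sigmoid
dW2 = h.T @ dlogits
dh = dlogits @ W2.T
dW1 = X.T @ (dh * (h > 0))            # through the ReLU

# One SGD update: step in the direction of descent.
W1 -= lr * dW1
W2 -= lr * dW2
```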
Modeling Sequences
- Represent a sentence as a bag of words (BOW)
- Each sentence becomes a fixed-length count vector
- BOW does not preserve word order
- A much longer feature vector would be needed to maintain order (see the sketch after this list)
- Rules
- States
- Transitions
- Next-word prediction: transition probabilities between states, i.e. a Markov model
- Markov assumption: each state depends only on the previous state
- The sequence here is the sentence, modeled as a function of its state transitions
- Deep models have been very successful on sequence tasks (e.g. Alexa)
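A minimal sketch of the two representations above (illustrative, not lecture code): a bag-of-words count vector, which throws away word order, and a bigram Markov model whose next-word prediction depends only on the previous word.

```python
from collections import Counter, defaultdict

# Minimal sketch (illustrative): bag-of-words vectors and a bigram Markov
# model for next-word prediction.

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

# Bag of words: one count per vocabulary word; word order is lost.
vocab = sorted({w for s in corpus for w in s.split()})

def bow_vector(sentence):
    counts = Counter(sentence.split())
    return [counts[w] for w in vocab]

print(bow_vector("the cat sat on the mat"))
print(bow_vector("the mat sat on the cat"))  # same vector: order is gone

# Markov model: states are words, transitions are next-word counts.
# The next state depends only on the previous state.
transitions = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        transitions[prev][nxt] += 1

def next_word_probs(prev):
    counts = transitions[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# Each of 'cat', 'mat', 'dog', 'rug' follows "the" with probability 0.25.
print(next_word_probs("the"))
```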
RNN Key Needs
- Maintain the sequence
- Learn the order of elements
- Preserve history across time steps
- Produce output as a function of the previous state
RNN
- The weight matrices W and U stay the same (are shared) across all time steps
- The cell state at time n contains information from all past time steps
- So the output is a function of all previous states (see the sketch after this list)
- Machine translation: two RNNs, an encoder-decoder model
- The encoder's last cell state is the representation of the whole sentence
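A minimal sketch of the recurrence (illustrative names, plain tanh cell): the same W and U are reused at every time step, so the final cell state is a function of the whole input sequence, the kind of sentence representation an encoder hands to a decoder.

```python
import numpy as np

# Minimal sketch of a vanilla RNN cell (illustrative, not lecture code).
# The same weight matrices W (state-to-state) and U (input-to-state) are
# applied at every time step, so the final state depends on all inputs.

rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 8, 16, 5

W = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
U = rng.normal(scale=0.1, size=(input_dim, hidden_dim))
xs = rng.normal(size=(seq_len, input_dim))   # one input vector per word

h = np.zeros(hidden_dim)
for x_t in xs:
    # Cell state update: a function of the previous state and the input.
    h = np.tanh(h @ W + x_t @ U)

sentence_representation = h  # e.g. what an encoder passes to a decoder
```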
Train RNN
- Backpropagation
- With an added time dimension
- The chain rule applied to an RNN creates a dependency on all previous states
- The cell state depends on all previous cell states
- Backpropagation through time (BPTT)
- Hard to train due to the vanishing gradient problem (illustrated in the first sketch after this list)
- As a result, plain RNNs only capture short-term dependencies
- One partial fix: initialize the weights differently
- Gated cells are really effective: a recurrent unit with several gating steps that control information flow
- Gates decide what information to pass through (elementwise multiplication by values between 0 and 1)
- LSTM functions: forget irrelevant history, selectively update the cell state, and output certain parts of the cell (see the LSTM sketch after this list)
- A fixed-length encoding is a bottleneck for the encoder-decoder model; the solution is to attend over all encoder states (see the attention sketch after this list)
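A quick sketch of why backpropagation through time is hard (illustrative, not lecture code): the gradient flowing from the last state back to the early states is a long product of per-step Jacobians, and with typical weights that product shrinks toward zero.

```python
import numpy as np

# Minimal sketch of the vanishing gradient problem in BPTT (illustrative).
# The gradient of the last state w.r.t. early states is a product of
# per-step Jacobians, which shrinks (or explodes) with sequence length.

rng = np.random.default_rng(0)
hidden_dim, seq_len = 16, 50
W = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))

# Forward pass of a tanh RNN with no inputs (for simplicity), keeping states.
hs = [rng.normal(size=hidden_dim)]
for _ in range(seq_len):
    hs.append(np.tanh(hs[-1] @ W))

# Backpropagation through time: chain the per-step Jacobians
# dh_t/dh_{t-1} = diag(1 - h_t^2) @ W.T backwards from the last step.
grad = np.eye(hidden_dim)
for t in range(seq_len, 0, -1):
    jacobian = (1.0 - hs[t] ** 2)[:, None] * W.T
    grad = grad @ jacobian
    if t % 10 == 0:
        print(f"norm of dh_{seq_len}/dh_{t-1} ~= {np.linalg.norm(grad):.2e}")
# The norm collapses toward zero: early time steps get almost no signal.
```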
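A minimal sketch of a single LSTM step, assuming the standard gate equations (forget, input/update, output); the names are illustrative and biases are dropped for brevity.

```python
import numpy as np

# Minimal sketch of one LSTM step (illustrative, not lecture code).
# Gates are sigmoids in [0, 1] that decide what to forget, what to write
# into the cell state, and what to expose as output.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    Wf, Wi, Wc, Wo = params          # one weight matrix per gate
    z = np.concatenate([h_prev, x])  # combined previous state and input

    f = sigmoid(z @ Wf)              # forget gate: what history to drop
    i = sigmoid(z @ Wi)              # input gate: what to update
    c_tilde = np.tanh(z @ Wc)        # candidate cell values
    o = sigmoid(z @ Wo)              # output gate: what to reveal

    c = f * c_prev + i * c_tilde     # selectively forget and update
    h = o * np.tanh(c)               # output certain parts of the cell
    return h, c

# Example usage with random parameters.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 8, 16
params = [rng.normal(scale=0.1, size=(hidden_dim + input_dim, hidden_dim))
          for _ in range(4)]
h = c = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):
    h, c = lstm_step(x_t, h, c, params)
```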
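A minimal sketch of dot-product attention over encoder states (the lecture only names the idea; the specific scoring function here is an assumption): the decoder takes a weighted sum of all encoder states instead of relying on a single fixed-length encoding.

```python
import numpy as np

# Minimal sketch of dot-product attention over encoder states (illustrative).
# Rather than compressing the sentence into one fixed-length vector, the
# decoder attends over all encoder states at each step.

def attend(decoder_state, encoder_states):
    scores = encoder_states @ decoder_state          # one score per state
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax over time steps
    context = weights @ encoder_states               # weighted sum of states
    return context, weights

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(6, 16))   # one state per source word
decoder_state = rng.normal(size=16)
context, weights = attend(decoder_state, encoder_states)
```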
Deep Learning Frameworks
- GPU Acceleration
- Code Reusability
- TPU
- Session
- Computation Graph
- Feed data in, get results out
- Core objects: variables, sessions, tensors
- Perceptron classifier (see the TensorFlow sketch below)
- Share weights
- Input Gate / Forget Gate / Update Gate / Output
- Code - https://github.com/nicholaslocascio/bcs-lstm
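A minimal sketch of the graph-and-session workflow above, assuming the TensorFlow 1.x API (tf.placeholder, tf.Session); the perceptron classifier and toy data are illustrative, not the linked lab code.

```python
import numpy as np
import tensorflow as tf  # assumes the TensorFlow 1.x graph/session API

# Minimal sketch (illustrative, not the linked lab code): build a
# computation graph for a perceptron classifier, then feed data into a
# session and get results back.

# Graph definition: placeholders, variables, and tensors.
x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
y = tf.placeholder(tf.float32, shape=[None, 1], name="y")
W = tf.Variable(tf.zeros([2, 1]), name="weights")
b = tf.Variable(tf.zeros([1]), name="bias")

logits = tf.matmul(x, W) + b
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# Toy linearly separable data (logical AND).
data_x = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]], dtype=np.float32)
data_y = np.array([[0.], [0.], [0.], [1.]], dtype=np.float32)

# Session: feed data in, get results out.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(200):
        _, loss_val = sess.run([train_op, loss],
                               feed_dict={x: data_x, y: data_y})
    print("final loss:", loss_val)
    print("predictions:", sess.run(tf.sigmoid(logits), feed_dict={x: data_x}))
```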
"Hidden Technical Debt in Machine Learning Systems https://t.co/szYRTtSpDd on some of the new joys and struggles of deploying machine learning models in the wild. Still a long way to go to establish new language and design patterns for programming the 2.0 stack pic.twitter.com/6qR9BAA6qS"
— Andrej Karpathy (@karpathy), November 5, 2018
Next Talks List
Yann LeCun - How does the brain learn so much so quickly? (CCN 2017)
Frank Hutter and Joaquin Vanschoren: Automatic Machine Learning (NeurIPS 2018 Tutorial)
Fernanda Viégas and Martin Wattenberg: Visualization for Machine Learning (NeurIPS 2018 Tutorial)
Memory: why it matters and how it works
The Neuroscience of Emotions
NIPS 2018 Videos
Happy Mastering DL!!!