"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

November 19, 2018

Day #151 - Back to Basics - Geoff Hinton Papers

Paper 1 - Learning Representations by Back-Propagating Errors (1986)

Key Summary
  • The procedure repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output and the desired output
  • Hidden units learn to represent new distinguishing features that are not explicit in the input
  • The aim is to find a set of weights that ensures that, for each input vector, the output vector produced by the network is the same as the desired output vector (a minimal sketch follows this list)
  • The drawback of the learning procedure is that the error surface may contain local minima, so gradient descent is not guaranteed to find a global minimum
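
A minimal sketch of the weight-adjustment idea, assuming a single linear unit trained by gradient descent on a squared-error measure (the data, shapes and learning rate are illustrative, not from the paper):

    import numpy as np

    x = np.array([0.5, -1.0, 2.0])    # one input vector (made-up values)
    d = np.array([1.0])               # desired output vector
    w = np.zeros((1, 3))              # connection weights, adjusted repeatedly
    lr = 0.1                          # learning rate (assumed)

    for step in range(100):
        y = w @ x                             # actual output
        E = 0.5 * np.sum((y - d) ** 2)        # measure of the difference
        dE_dw = np.outer(y - d, x)            # gradient of the error w.r.t. the weights
        w -= lr * dE_dw                       # adjust weights to reduce the error

    # Caveat from the paper: on a non-convex error surface this descent can
    # settle in a local minimum rather than the global minimum.
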
Paper 2 - Deep learning (2015)

Key Summary

Deep Learning
  • Machine Learning systems are used to identify objects in images, transcribe speech into text, match news items, posts or products with users' interests, and select relevant results of search
  • Multiple processing layers to learn representations of data with multiple levels of abstraction
  • Recurrent Networks for sequential data such as text and speech
  • Deep Learning methods are representation learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level into a representation at a higher, slightly more abstract level
  • The layers are learned from data using a general-purpose learning procedure
  • The conventional option is to hand-design good feature extractors, which requires a considerable amount of engineering skill and domain expertise. The key advantage of deep learning is that good features are learned automatically using a general-purpose learning procedure
  • A deep learning architecture is a multi-layer stack of simple modules, most of which compute simple non-linear input-output mappings
  • The backpropagation procedure to compute the gradient of an objective function with respect to the weights of a multi-layer stack of modules is nothing more than a practical application of the chain rule of derivatives (a rough sketch follows this list)
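
A rough illustration of the chain-rule view, assuming a tiny two-module stack (a linear layer followed by a sigmoid) with made-up values; the backward pass propagates dE/dy through each module in turn:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.2, -0.4])
    W = np.array([[0.1, 0.3], [-0.2, 0.5]])
    t = np.array([1.0, 0.0])           # target output (illustrative)

    # Forward pass: each module turns its input into an activity y.
    z = W @ x                          # module 1: linear mapping
    y = sigmoid(z)                     # module 2: non-linearity
    E = 0.5 * np.sum((y - t) ** 2)     # objective function

    # Backward pass: apply the chain rule module by module.
    dE_dy = y - t                      # gradient at the output
    dE_dz = dE_dy * y * (1 - y)        # back through the sigmoid
    dE_dW = np.outer(dE_dz, x)         # gradient w.r.t. the stack's weights
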
Convolutional Neural Networks
  • Composed of Convolutional layers and pooling layers
  • Units in a convolutional layer are organized into feature maps
  • The filtering operation performed by a feature map is a discrete convolution
  • Pooling computes the maximum of local patches
  • Two or three stages of convolution, non-linearity and pooling are stacked, followed by more convolutional and fully connected layers (see the sketch after this list)
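
A small sketch of a single convolution-plus-pooling stage, assuming a 2D input, one 3x3 filter and 2x2 max pooling (all sizes and values are illustrative):

    import numpy as np

    def conv2d(image, kernel):
        # Discrete convolution (implemented as cross-correlation, as in most CNN code).
        h, w = kernel.shape
        out = np.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i+h, j:j+w] * kernel)
        return out

    def max_pool(fmap, size=2):
        # Pooling computes the maximum of local patches.
        out = np.zeros((fmap.shape[0] // size, fmap.shape[1] // size))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = fmap[i*size:(i+1)*size, j*size:(j+1)*size].max()
        return out

    image = np.random.rand(8, 8)
    kernel = np.random.rand(3, 3)
    feature_map = np.maximum(conv2d(image, kernel), 0)   # convolution + non-linearity (ReLU)
    pooled = max_pool(feature_map)                       # 3x3 pooled feature map
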
Recurrent Neural Networks
  • RNNs process an input sequence one element at a time, maintaining a state vector in their hidden units that implicitly contains the history of all past elements of the sequence
  • Good at predicting the next word in a sequence (a minimal sketch follows)
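
A minimal sketch of the recurrence, assuming a plain (vanilla) RNN with made-up dimensions and inputs:

    import numpy as np

    rng = np.random.default_rng(0)
    W_xh = rng.normal(size=(4, 3))     # input-to-hidden weights (sizes are illustrative)
    W_hh = rng.normal(size=(4, 4))     # hidden-to-hidden (recurrent) weights
    h = np.zeros(4)                    # state vector: implicit history of the sequence so far

    sequence = [rng.normal(size=3) for _ in range(5)]   # e.g. a sequence of word vectors
    for x_t in sequence:
        # Process one element at a time, updating the hidden state.
        h = np.tanh(W_xh @ x_t + W_hh @ h)

    # h now summarises the whole sequence; a softmax layer on h could score the next word.
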
Paper #3 - Dropout: A Simple Way to Prevent Neural Networks from Overfitting
Key Summary 
  • Randomly drop units from the neural network during training
  • Units dropped out can be hidden or visible
  • A dropped unit is temporarily removed from the network, along with all of its incoming and outgoing connections (a minimal sketch follows this list)
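
A minimal sketch of the training-time behaviour, assuming a drop probability of 0.5 for hidden units and the common "inverted" scaling so that no change is needed at test time (the layer size is illustrative):

    import numpy as np

    def dropout(activations, p_drop=0.5, training=True):
        # Randomly drop units during training: a dropped unit's output, and hence all of
        # its outgoing connections, is zeroed for this training case.
        if not training:
            return activations
        mask = np.random.rand(*activations.shape) >= p_drop
        return activations * mask / (1.0 - p_drop)

    hidden = np.random.rand(10)                          # activities of a hidden layer
    hidden = dropout(hidden, p_drop=0.5, training=True)  # roughly half the units are zeroed
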

Paper #4 - Speech Recognition with Deep Recurrent Neural Networks (2013)
Key Summary
  • Long Short-Term Memory (LSTM) - an RNN architecture
  • RNNs are deep in time, since their hidden state is a function of all previous hidden states
  • Makes use of previous context
  • Deep bidirectional LSTM RNNs for speech recognition
LSTM Components
  • Input gate
  • Forget gate
  • Output Gate
  • Cell Activation Vectors
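
A rough sketch of a single LSTM step using the four components listed above, assuming a simplified formulation without peephole connections (dimensions and weights are illustrative):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    n_in, n_hid = 3, 4
    rng = np.random.default_rng(0)
    # One weight matrix per component, acting on [current input, previous hidden state].
    W_i, W_f, W_o, W_c = (rng.normal(size=(n_hid, n_in + n_hid)) for _ in range(4))

    def lstm_step(x_t, h_prev, c_prev):
        z = np.concatenate([x_t, h_prev])
        i = sigmoid(W_i @ z)                     # input gate: how much new information to write
        f = sigmoid(W_f @ z)                     # forget gate: how much of the old cell to keep
        o = sigmoid(W_o @ z)                     # output gate: how much of the cell to expose
        c = f * c_prev + i * np.tanh(W_c @ z)    # cell activation vector
        h = o * np.tanh(c)                       # new hidden state
        return h, c

    h, c = np.zeros(n_hid), np.zeros(n_hid)
    for x_t in [rng.normal(size=n_in) for _ in range(5)]:
        h, c = lstm_step(x_t, h, c)
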
Bidirectional RNN has
  • Forward Hidden Sequence
  • Backward Hidden Sequence
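
A sketch of the bidirectional idea, assuming a simple tanh recurrence for each direction with separate (made-up) weights; at every time step the two hidden sequences are combined so both past and future context are available:

    import numpy as np

    def rnn_pass(sequence, W_xh, W_hh):
        h, hiddens = np.zeros(W_hh.shape[0]), []
        for x_t in sequence:
            h = np.tanh(W_xh @ x_t + W_hh @ h)
            hiddens.append(h)
        return hiddens

    rng = np.random.default_rng(0)
    seq = [rng.normal(size=3) for _ in range(5)]
    Wf_xh, Wf_hh = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))   # forward weights
    Wb_xh, Wb_hh = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))   # backward weights

    forward = rnn_pass(seq, Wf_xh, Wf_hh)                  # forward hidden sequence
    backward = rnn_pass(seq[::-1], Wb_xh, Wb_hh)[::-1]     # backward hidden sequence, re-aligned
    outputs = [np.concatenate([f, b]) for f, b in zip(forward, backward)]
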
CTC - Connectionist Temporal Classification
  • Uses a Softmax layer to define a separate output distribution at every step along the input sequence
  • CTC uses the forward-backward algorithm to sum over all the possible alignments and determine the normalised probability of the target sequence (a toy sketch follows this list)
  • RNNs trained with CTC are typically bidirectional
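
A toy sketch of the alignment-summing idea, using brute-force enumeration rather than the efficient forward-backward recursion, with made-up per-frame softmax outputs:

    import itertools
    import numpy as np

    labels = ["-", "a", "b"]           # "-" is the CTC blank symbol
    T = 3                              # number of input frames (illustrative)
    # Per-frame softmax output distribution over the labels (rows sum to 1, values made up).
    probs = np.array([[0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3],
                      [0.5, 0.1, 0.4]])

    def collapse(path):
        # CTC mapping: merge repeated labels, then remove blanks.
        out, prev = [], None
        for s in path:
            if s != prev and s != "-":
                out.append(s)
            prev = s
        return "".join(out)

    target = "ab"
    # Sum the probability of every frame-level alignment that collapses to the target.
    p_target = 0.0
    for path in itertools.product(range(len(labels)), repeat=T):
        if collapse([labels[i] for i in path]) == target:
            p_target += np.prod([probs[t, i] for t, i in enumerate(path)])

    # CTC computes this same sum efficiently with the forward-backward algorithm.
    print(p_target)
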

The brain creates internal representations in order to learn without any explicit instructions
  • ANNs are networks of simplified model neurons
  • The behavior of an ANN depends on its weights and activation functions
  • The backpropagation algorithm is used to train the neural network
Backpropagation Challenges
  • Requires labeled training data
  • Forward Pass - the signal is the activity y
  • Backward Pass - the signal is the error derivative dE/dy
  • Learning alters the shape of the search space and provides a good evolutionary path
  • Learning organisms evolve much faster
Key Summary
  • The interaction between learning and evolution was proposed by Baldwin
  • Learning alters the search space in which evolution operates
  • Inspired by the theory of natural evolution
  • Motivated by Darwinian theory
Unimodal vs Multimodal
  • A landscape is unimodal if it has a single minimum
  • It is multimodal if it has several minima with equal function values
More Papers - Link

Happy Learning!!!!
