"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

January 01, 2019

Day #176 - Unsupervised Deep Learning (NeurIPS 2018 Tutorial)

Types of Learning
  • Supervised (predefined training targets) / Unsupervised (learn from the data)
  • Reinforcement (reward-based learning)
  • Unsupervised learning is how children learn: explore and interact with the world
  • Generalization to new tasks and situations

In supervised learning, maximum likelihood is over the targets, whereas in unsupervised learning it is over the data provided.

"Stop Learning Tasks, Start Learning Skills" - Satinder Singh

Unsupervised Learning
  • The task is undefined
  • Maximum likelihood on the data instead of on a target

Challenges
  • Curse of Dimensionality
  • Not all bits are created equal
Generative Models
  • Modelling densities also gives us a generative model
  • Sampling lets us see what the model has learnt
Autoregressive Models
  • A simple, powerful class of models
  • Based on the chain rule of probability
  • Decompose the joint as a chain of conditional probabilities
  • Split high-dimensional data into a sequence of small pieces
  • Condition them via a network state (LSTM / GRU)
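The chain-rule decomposition can be sketched with a toy order-1 model (the vocabulary and probability tables below are invented for illustration; PixelRNN/WaveNet-style models instead condition on the whole prefix through a network state):

```python
import numpy as np

# Invented first-symbol distribution and order-1 conditional table
# p(x_t | x_{t-1}); real autoregressive models condition on the full
# prefix via an RNN/CNN state.
vocab = ["a", "b", "c"]
p_first = np.array([0.5, 0.3, 0.2])        # p(x_1)
p_next = np.array([[0.6, 0.3, 0.1],        # rows: previous symbol
                   [0.2, 0.5, 0.3],        # cols: current symbol
                   [0.1, 0.4, 0.5]])

def log_prob(seq):
    """Chain rule: log p(x) = log p(x_1) + sum_t log p(x_t | x_{t-1})."""
    idx = [vocab.index(s) for s in seq]
    lp = np.log(p_first[idx[0]])
    for prev, cur in zip(idx, idx[1:]):
        lp += np.log(p_next[prev, cur])
    return lp

lp = log_prob(["a", "b", "b"])  # log(0.5 * 0.3 * 0.5)
```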

Four senses of the digital world - images, audio, video, text

Disadvantages of Autoregressive Models
  • Very expensive with high-dimensional data
  • Order dependent
Papers
  • Exploring the Limits of Language Modeling (2016)
  • WaveNet - A Generative Model for Raw Audio (2016)
  • PixelRNN - Pixel Recurrent Neural Networks (2016)
  • Conditional PixelCNN (2016)
  • Handwriting Synthesis with RNNs
  • Contrastive Predictive Coding (2018) - maximize mutual information between codes
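The last paper's objective is usually implemented as the InfoNCE loss: score one positive future code against negatives and minimize cross-entropy, which maximizes a lower bound on the mutual information. A minimal numpy sketch, with random vectors standing in for encoder outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

def info_nce(context, codes, pos_index):
    """Classify the positive code among negatives by dot-product score;
    minimizing this loss maximizes a mutual-information lower bound."""
    scores = codes @ context            # similarity of each code to the context
    scores = scores - scores.max()      # numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum())
    return -log_softmax[pos_index]

context = rng.normal(size=8)      # summary of the past (illustrative)
codes = rng.normal(size=(5, 8))   # 1 positive code + 4 negatives
loss = info_nce(context, codes, pos_index=0)
```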
DL
  • Learns complex representations of data
  • The network learns the data distribution
  • Autoencoders and Variational Autoencoders
  • Learn the dataset, not the data points
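A minimal linear autoencoder in numpy shows the compress-and-reconstruct idea; real autoencoders use deep nonlinear encoders/decoders, and a VAE additionally makes the latent probabilistic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data; a linear autoencoder with a k-dim bottleneck,
# trained by plain gradient descent on reconstruction error.
X = rng.normal(size=(200, 10))
X = X - X.mean(axis=0)
k = 3
We = rng.normal(scale=0.1, size=(10, k))   # encoder weights
Wd = rng.normal(scale=0.1, size=(k, 10))   # decoder weights

def loss(We, Wd):
    return ((X - X @ We @ Wd) ** 2).mean()

initial = loss(We, Wd)
lr = 0.05
for _ in range(500):
    Z = X @ We                       # encode: compress to k dims
    G = 2 * (Z @ Wd - X) / len(X)    # gradient of reconstruction error
    grad_Wd = Z.T @ G
    grad_We = X.T @ (G @ Wd.T)
    Wd -= lr * grad_Wd
    We -= lr * grad_We
final = loss(We, Wd)
```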

Empowered Agents
Recipes of Unsupervised Learning
  • Associate a feature vector to data points
  • PCA and K-means are strong baselines if the dimensionality is not too large
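That baseline fits in a few lines of numpy; two synthetic Gaussian blobs stand in for a real dataset here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated blobs stand in for a real dataset.
X = np.concatenate([rng.normal(0.0, 0.5, size=(50, 5)),
                    rng.normal(4.0, 0.5, size=(50, 5))])

# PCA via SVD of the centered data; keep the top-2 components.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
features = Xc @ Vt[:2].T

# Plain K-means (Lloyd's algorithm), k=2; init one center per blob
# for determinism in this sketch.
centers = features[[0, -1]].copy()
for _ in range(10):
    dists = ((features[:, None] - centers) ** 2).sum(-1)
    labels = np.argmin(dists, axis=1)
    centers = np.array([features[labels == c].mean(axis=0) for c in range(2)])
```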
Self-Supervised Learning
  • With domain expertise, define a prediction task that requires some semantic understanding
  • Conditional prediction (less uncertainty, lower dimensional)
  • Often the original regression is turned into a classification task
  • Take two patches and predict the relationship between them
  • UCF-101 Action Recognition Dataset
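A toy version of the patch task (in the spirit of context prediction; the patch size, direction set, and random "image" are made up here): sample a reference patch and a neighbour, and the classification label is the relative position.

```python
import numpy as np

rng = np.random.default_rng(0)

# Label = relative position of the neighbour patch: up, down, left, right.
DIRS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

def make_example(image, patch=8):
    """Return (reference patch, neighbour patch, relative-position label)."""
    h, w = image.shape
    r = int(rng.integers(patch, h - 2 * patch))  # reference top-left row
    c = int(rng.integers(patch, w - 2 * patch))  # reference top-left col
    label = int(rng.integers(4))
    dr, dc = DIRS[label]
    ref = image[r:r + patch, c:c + patch]
    nbr = image[r + dr * patch:r + (dr + 1) * patch,
                c + dc * patch:c + (dc + 1) * patch]
    return ref, nbr, label

image = rng.normal(size=(64, 64))   # random array standing in for an image
ref, nbr, label = make_example(image)
```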


Learning by Clustering
  • Extract features from each image and run K-means
  • Train the CNN in supervised mode to predict the cluster assignments
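The loop alternates clustering and supervised training; in this sketch a random linear map stands in for the CNN and the supervised step is left as a comment:

```python
import numpy as np

rng = np.random.default_rng(1)

X = rng.normal(size=(100, 16))           # stand-in dataset
W = rng.normal(scale=0.1, size=(16, 4))  # stand-in "CNN" feature extractor

def kmeans_labels(F, k=3, iters=10):
    """Plain Lloyd's K-means returning cluster assignments."""
    centers = F[:k].copy()
    for _ in range(iters):
        labels = np.argmin(((F[:, None] - centers) ** 2).sum(-1), axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = F[labels == c].mean(axis=0)
    return labels

for _ in range(3):                  # alternate between the two steps
    F = X @ W                       # 1) extract features with current model
    pseudo = kmeans_labels(F)       # 2) K-means assignments = pseudo-labels
    # 3) train the network with a supervised loss on `pseudo`
    #    (omitted in this sketch)
```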

Summary
  • Domain knowledge is important for semantic understanding
  • Check for bias in the data

Unsupervised Feature Learning for Text
  • NLP - atomic unit is the word/token - discrete - modelling is easy
  • Vision - atomic unit is the pixel - continuous


Word2vec
  • The meaning of a word is determined by its context
  • Semantically similar words share information
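The first step of skip-gram word2vec is generating (center, context) pairs from a window around each position; a minimal sketch:

```python
# Generate (center, context) training pairs for the skip-gram objective:
# predict each context word within `window` positions of the center word.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

pairs = skipgram_pairs(["the", "cat", "sat", "on", "the", "mat"])
```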

Representing Sentences
  • Auto-encoding
  • BERT - Deep Bidirectional Transformers for Language Understanding
  • Using attention


Multiple techniques - word2vec, bi-LSTM, ELMo, GPT, BERT

Key idea - learn deep representations by predicting a word from its context
Generative models - learning representations, planning
Generative model families - GANs, autoregressive models, GLO, flow-based algorithms
Text challenges - generate documents, track state, model uncertainty, meaningful metrics
Unsupervised machine translation - the context of a word is similar across languages


Steps
  • Learn embeddings separately for each language
  • Learn a joint space via adversarial training + refinement
  • Paper - Word Translation Without Parallel Data (ICLR 2018)
  • MUSE approach (Facebook)
  • Seq2Seq model to translate
  • Paper - Phrase-Based & Neural Unsupervised Machine Translation (EMNLP 2018)
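The refinement step in the MUSE pipeline is orthogonal Procrustes: given source embeddings X and their seed-dictionary translations Y, the best orthogonal map is W = UVᵀ where UΣVᵀ = svd(XᵀY). A numpy sketch with a synthetic hidden rotation (the adversarial stage that produces the seed dictionary is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 5, 40
X = rng.normal(size=(n, d))                        # "source" embeddings
R_true, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden rotation
Y = X @ R_true                                     # "target" embeddings

# Orthogonal Procrustes: argmin_W ||X W - Y||_F over orthogonal W.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt
err = np.abs(X @ W - Y).max()   # should recover the hidden rotation
```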

Open Research Problems
  • What is a good metric?
  • Which downstream tasks?
  • Dialog systems, sentiment analysis?
  • Generalizing unsupervised learning algorithms
  • Metrics based on dimensions / noise
  • Modelling uncertainty
  • Learning the skill, not the task

Happy Mastering DL!!!
