- Supervised (predefined training targets) / Unsupervised (learn from the data itself)
- Reinforcement (reward-based learning)
- Unsupervised learning is closer to how children learn: explore and interact with the world
- Generalization to new tasks and situations
In supervised learning, maximum likelihood is taken over the targets, whereas in unsupervised learning it is over the data provided.
Unsupervised Learning
- The task is undefined
- Maximum likelihood on the data instead of a target (see the objective below)
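A compact way to state the difference (standard maximum-likelihood notation, not from the original lecture):

```latex
% Supervised: maximize the likelihood of the targets given the inputs
\theta^\star = \arg\max_\theta \sum_i \log p_\theta(y_i \mid x_i)
% Unsupervised: maximize the likelihood of the data itself
\theta^\star = \arg\max_\theta \sum_i \log p_\theta(x_i)
```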
Challenges
- Curse of Dimensionality
- Not all bits are created equal
- Modelling densities also gives us a generative model
- See what the model has learnt
Autoregressive Models
- Simple, powerful class of models
- Chain rule of probability
- Decompose the joint distribution into a chain of conditional probabilities
- Split high-dimensional data into a sequence of small pieces
- Condition on the past via the network state (LSTM / GRU), as in the sketch below
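To make the decomposition p(x) = ∏ₜ p(xₜ | x₍<t₎) concrete, here is a minimal autoregressive sketch (assuming PyTorch; the vocabulary size and shapes are illustrative):

```python
import torch
import torch.nn as nn

class AutoregressiveLSTM(nn.Module):
    def __init__(self, vocab_size=256, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, x):
        # x: (batch, seq) integer tokens; the LSTM state summarizes x_{<t}
        h, _ = self.lstm(self.embed(x))
        return self.head(h)                      # (batch, seq, vocab) logits

# Maximum-likelihood training: cross-entropy of each token given its prefix.
model = AutoregressiveLSTM()
x = torch.randint(0, 256, (8, 32))               # toy batch of token sequences
logits = model(x[:, :-1])                        # predict x_t from x_{<t}
loss = nn.functional.cross_entropy(logits.reshape(-1, 256), x[:, 1:].reshape(-1))
loss.backward()
```

Sampling works the same way: feed the tokens generated so far back in and sample the next token from the predicted distribution.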
Disadvantages of Autoregressive Models
- Very expensive with high-dimensional data
- Order-dependent
- Exploring the Limits of Language Modeling (2016)
- WaveNet - a generative model for raw audio (2016)
- PixelRNN - Pixel Recurrent Neural Networks (2016)
- Conditional PixelCNN (2016)
- Handwriting synthesis with RNNs
- Contrastive Predictive Coding (2018) - maximize mutual information between codes (see the InfoNCE sketch below)
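The CPC objective can be illustrated with a minimal InfoNCE sketch (assuming PyTorch; the encoders that produce the codes are omitted):

```python
import torch
import torch.nn.functional as F

def info_nce(z_pred, z_future):
    # z_pred: (batch, dim) codes predicted from context; z_future: (batch, dim)
    # codes of the actual future. Each row's positive is the matching row;
    # every other row in the batch serves as a negative.
    logits = z_pred @ z_future.t()               # (batch, batch) similarities
    labels = torch.arange(z_pred.size(0))
    return F.cross_entropy(logits, labels)       # bounds the mutual information

loss = info_nce(torch.randn(32, 128), torch.randn(32, 128))
```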
- Learn complex representations of the data
- The network learns the data
- Autoencoders and Variational Autoencoders
- Learn the dataset, not the data points (a minimal sketch follows)
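A minimal autoencoder sketch (assuming PyTorch; sizes are illustrative) - the reconstruction objective forces the code to summarize the dataset rather than memorize individual points:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    # Compress x to a low-dimensional code, then reconstruct it.
    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, code_dim))
        self.dec = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))

    def forward(self, x):
        return self.dec(self.enc(x))

x = torch.rand(16, 784)                          # toy batch of flattened images
recon = Autoencoder()(x)
loss = nn.functional.mse_loss(recon, x)          # reconstruction objective
```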
Recipes of Unsupervised Learning
- Associate a feature vector with each data point
- PCA and K-means are strong baselines if the dimensionality is not too large (see the sketch below)
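A minimal baseline sketch (assuming scikit-learn; the data matrix is a stand-in for real features):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

X = np.random.rand(1000, 50)                     # stand-in for a real feature matrix
codes = PCA(n_components=10).fit_transform(X)    # linear low-dimensional features
clusters = KMeans(n_clusters=8, n_init=10).fit_predict(codes)
```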
- With domain expertise, define a prediction task that requires some semantic understanding
- Conditional prediction (less uncertainty, less high-dimensional)
- Often the original regression is turned into a classification task
- Take two patches and predict the relationship between them (sketch below)
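One way this recipe looks in code (illustrative sketch in the spirit of context prediction; the encoder and the 8-way relative-position labels are assumptions):

```python
import torch
import torch.nn as nn

# Pretext task: given two neighbouring patches, classify their relative
# position (8 possible neighbour positions) - regression turned into classification.
class PatchRelationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU())
        self.classifier = nn.Linear(2 * 256, 8)  # 8 relative positions

    def forward(self, patch_a, patch_b):
        feats = torch.cat([self.encoder(patch_a), self.encoder(patch_b)], dim=1)
        return self.classifier(feats)

logits = PatchRelationNet()(torch.rand(4, 3, 32, 32), torch.rand(4, 3, 32, 32))
```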
- UCF-101 action recognition dataset
- Extract features from each image and run K-means
- Train the CNN in supervised mode to predict the cluster assignments (sketch below)
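A sketch of the cluster-then-predict loop (assuming scikit-learn for the clustering step; feature and cluster counts are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

# Step 1: cluster off-the-shelf features to get pseudo-labels.
features = np.random.rand(500, 64)               # stand-in for extracted image features
pseudo_labels = KMeans(n_clusters=10, n_init=10).fit_predict(features)
# Step 2: train a CNN with ordinary supervised cross-entropy on
# (image, pseudo_label) pairs, as if the clusters were real classes.
```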
- Domain knowledge is important for semantic understanding
- Check for bias in the data
- NLP - atomic unit is the word / token - discrete - easy to model
- Vision - atomic unit is the pixel - continuous
- The meaning of a word is determined by its context
- Semantically similar words share information
- Auto-encoding
- BERT - deep bidirectional transformers for language
- Using attention
Key Idea - Learn deep representations by predicting a word from its context (a minimal sketch follows)
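A minimal CBOW-style sketch of this idea (assuming PyTorch; BERT replaces the averaging with masking and attention):

```python
import torch
import torch.nn as nn

# Predict the centre word from the average of its context word embeddings.
class ContextPredictor(nn.Module):
    def __init__(self, vocab=10000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.out = nn.Linear(dim, vocab)

    def forward(self, context):                  # context: (batch, window)
        return self.out(self.embed(context).mean(dim=1))

context = torch.randint(0, 10000, (8, 4))        # 4 context words per example
target = torch.randint(0, 10000, (8,))           # the word to predict
loss = nn.functional.cross_entropy(ContextPredictor()(context), target)
```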
Generative Models - learning representations, planning
GANs, autoregressive models, GLO, flow-based algorithms
Text challenges - generate documents, track state, model uncertainty, meaningful metrics
Unsupervised machine translation - the context of a word is similar across languages
Steps
- Learn embeddings separately for each language
- Learn a joint space via adversarial training + refinement (see the Procrustes sketch after this list)
- Paper - Word Translation Without Parallel Data (ICLR 2018)
- MUSE approach (Facebook)
- Seq2seq model to translate
- Paper - Phrase-Based & Neural Unsupervised Machine Translation (EMNLP 2018)
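A sketch of the refinement step in this pipeline (orthogonal Procrustes, as used in MUSE; numpy, with stand-in seed-pair matrices):

```python
import numpy as np

# Orthogonal Procrustes: find the rotation W that best maps source
# embeddings X onto their target-language counterparts Y (X @ W ≈ Y).
def procrustes(X, Y):
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

X = np.random.rand(100, 300)    # stand-in seed-pair embeddings (source)
Y = np.random.rand(100, 300)    # stand-in seed-pair embeddings (target)
W = procrustes(X, Y)
mapped = X @ W                  # source embeddings in the target space
```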
- What is a good metric?
- Downstream tasks?
- Dialog systems, sentiment analysis?
- Generalize unsupervised learning algorithms
- Metrics based on dimensions / noise
- Modelling uncertainty
- Learning the skill, not the task
Happy Mastering DL!!!