"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

November 30, 2018

Day #156 - Reinforcement Learning

"Rewards for right moves, Starve for wrong moves"

Key Summary
  • Intelligence Systems Stack
  • Agents to Effectors
  • Raw Data - Features - Gain Knowledge - Reason - Short term and Long Term Actions
  • Sensory Data - Create Representations
  • Raw Sensory Data - Feature Learning (Higher Order Representations) - Extract Actionable usable Knowledge
  • Supervised learning - Memorizers
  • Reinforcement learning - brute force reasoning
  • Reinforcement learning components (Goal - State - Actions - Reward)
Step 1 - Reinforcement Learning Stack


Step 2 - Data Sources

Step 3 - Feature Extraction


Step 4 - Representations


Step 5 - Reasoning

Step 6 - Actions


Types of Deep Learning

Reinforcement Learning Components

Learning States Logic



Markov Decision Process
  • State - Action - Reward - State
  • Policy - Behavior function
  • Value Function - How good a state or action is (expected future reward)
  • Model - Agents representation of Environment
  • Stochastic System (having a random probability distribution or pattern that may be analysed statistically but may not be predicted precisely)
  • Reward structure changes the next step strategy
  • Encourage Exploration with positive reward
  • Goal is to maximize cumulative reward
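
The pieces listed above (state, action, reward, next state, policy, value function, model, stochastic transitions) can be sketched on a toy problem. This is only an illustration under assumptions of my own: a hypothetical 4-state corridor, a made-up slip probability and discount factor, and value iteration as the way to compute the value function. None of these come from the talk.

```python
# Hypothetical 4-state corridor: states 0..3, state 3 is the terminal goal.
STATES = [0, 1, 2, 3]
ACTIONS = ["left", "right"]
GOAL = 3
SLIP = 0.2   # assumed probability that a move goes the opposite way (stochastic system)
GAMMA = 0.9  # assumed discount factor


def model(state, action):
    """Agent's model of the environment: [(probability, next_state, reward), ...]."""
    def move(s, a):
        return max(s - 1, 0) if a == "left" else min(s + 1, GOAL)

    other = "left" if action == "right" else "right"
    outcomes = []
    for prob, actual in ((1 - SLIP, action), (SLIP, other)):
        nxt = move(state, actual)
        reward = 1.0 if nxt == GOAL else 0.0  # reward only for reaching the goal
        outcomes.append((prob, nxt, reward))
    return outcomes


def q_value(values, state, action):
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + GAMMA * values[s2]) for p, s2, r in model(state, action))


def value_iteration(sweeps=100):
    """Value function: how good each state is under the best behavior."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES:
            if s != GOAL:  # the terminal state keeps value 0
                values[s] = max(q_value(values, s, a) for a in ACTIONS)
    return values


def greedy_policy(values):
    """Policy (behavior function): best action in each non-terminal state."""
    return {
        s: max(ACTIONS, key=lambda a: q_value(values, s, a))
        for s in STATES
        if s != GOAL
    }


values = value_iteration()
policy = greedy_policy(values)
print(policy)
print({s: round(v, 3) for s, v in values.items()})
```

Changing the reward inside `model` changes which policy comes out, which is the "reward structure changes the next step strategy" point.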
Summary
Intelligence - Ability to accomplish complex goals
Understanding - Ability to turn complex information into simple, useful information

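DQN - Deep Q-Network (Deep Q-Learning)
  • A neural network takes the place of the Q table
  • The network is trained to approximate the Q function (expected reward of an action in a state)
  • DeepMind uses DQN
  • Greedy choice: pick the action with the highest Q value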
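
To make the Q function and the greedy choice concrete, here is a small sketch of plain tabular Q-learning. It is not DQN: a DQN would replace the `q` table below with a neural network. The corridor environment, the hyperparameters and all function names are assumptions made for illustration.

```python
import random

N_STATES = 5                 # hypothetical corridor: states 0..4, goal at state 4
GOAL = N_STATES - 1
ACTIONS = ["left", "right"]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # assumed learning rate, discount, exploration rate


def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    nxt = max(state - 1, 0) if action == "left" else min(state + 1, GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def best_action(q, state):
    """Greedy choice: the action with the highest Q value."""
    return max(ACTIONS, key=lambda a: q[state][a])


def choose_action(q, state, rng):
    """Epsilon-greedy: mostly greedy, sometimes random to keep exploring."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    top = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == top])  # random tie-break


def train(episodes=500, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q, state, rng)
            nxt, reward, done = step(state, action)
            # Q-learning target uses the best next action, not the one actually taken.
            target = reward if done else reward + GAMMA * max(q[nxt].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


q_table = train()
greedy = [best_action(q_table, s) for s in range(GOAL)]
print(greedy)
```

The table works here because there are only five states. With raw pixels as state, the table becomes impossibly large, which is where a network that approximates Q comes in.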
Policy Gradients
  • DQN - Q-Learning - Off-policy (learns action values, then acts greedily on them)
  • Policy Gradient - Directly optimizes the policy itself
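
As a contrast with Q-learning, the sketch below optimizes a policy directly with a basic REINFORCE-style update on a two-armed bandit. The bandit, its payout probabilities, the learning rate and the softmax parameterization are all assumptions chosen to keep the example tiny. No value table is learned, only the policy parameters `theta`.

```python
import math
import random

REWARD_PROB = [0.2, 0.8]   # assumed payout probability of each arm
LEARNING_RATE = 0.1        # assumed step size


def softmax(prefs):
    """Turn action preferences into a probability distribution (the policy)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # policy parameters, one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        action = 0 if rng.random() < probs[0] else 1          # sample from the policy
        reward = 1.0 if rng.random() < REWARD_PROB[action] else 0.0
        for a in range(len(theta)):
            # Gradient of log pi(action) w.r.t. theta[a] for a softmax policy.
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += LEARNING_RATE * reward * grad_log_pi  # gradient ascent on reward
    return softmax(theta)


final_probs = train()
print([round(p, 3) for p in final_probs])
```

Rewarded actions become more probable and unrewarded ones leave the policy unchanged, so the policy drifts toward the better arm without ever estimating Q.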
DeepStack
  • Built to beat professional poker players
"Deep Learning for Perception tasks but not for forming actions"

Happy Mastering DL!!!
