"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 26, 2018

Day #173 - CS231N - Lecture 11: Deep Learning libraries - Notes

Key Summary

Caffe
  • From Berkeley
  • Widely used for CNN
  • Written in C++
  • Python and Matlab bindings
  • Good for standard feedforward vanilla CNN
  • Blob - Weights, Pixel values, Intermediate values (n dimensional tensor)
  • Layer - Function - Input / Output blob
  • Common Problem - Not much documentation on layer types
  • Net - Combines bunch of layers
  • Solver - Runs the forward / backward passes through the net, updates the weights, and saves / resumes checkpoints
  • Gradient Descent, RMSProp are in the solver
  • Protocol buffers - Like a binary, strongly typed JSON for serializing data to disk
  • caffe.proto - Defines all the protocol buffer message types
  • Convert data - Typically to the LMDB file format
  • .prototxt file to define the net
  • Solver - Learning Rate, Regularization rates
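To make the Blob / Layer / Net / Solver split above concrete, here is a minimal pure Python sketch. It is not Caffe's API: the class names only mirror the roles described in the notes, and `ScaleLayer`, the squared-error loss and the learning rate are assumptions chosen to keep the example tiny.

```python
# Illustrative sketch only: these classes mimic the roles of Caffe's
# Blob / Layer / Net / Solver. They are NOT Caffe's real API.

class Blob:
    """Holds values (data) and their gradients (diff) as flat lists."""
    def __init__(self, data):
        self.data = list(data)
        self.diff = [0.0] * len(self.data)


class ScaleLayer:
    """Hypothetical layer computing y = w * x with one learnable weight."""
    def __init__(self, w):
        self.weight = Blob([w])

    def forward(self, bottom):
        self.bottom = bottom  # remember the input blob for the backward pass
        return Blob([self.weight.data[0] * x for x in bottom.data])

    def backward(self, top):
        w = self.weight.data[0]
        # Gradient w.r.t. the weight: sum of x * dL/dy
        self.weight.diff[0] = sum(x * g for x, g in zip(self.bottom.data, top.diff))
        # Gradient w.r.t. the input: w * dL/dy
        self.bottom.diff = [w * g for g in top.diff]


class Net:
    """Chains layers and runs them forward, then backward in reverse order."""
    def __init__(self, layers):
        self.layers = layers

    def forward(self, blob):
        self.tops = []
        for layer in self.layers:
            blob = layer.forward(blob)
            self.tops.append(blob)
        return blob

    def backward(self):
        for layer, top in zip(reversed(self.layers), reversed(self.tops)):
            layer.backward(top)


class Solver:
    """Plain gradient descent on a mean squared-error loss."""
    def __init__(self, net, learning_rate):
        self.net = net
        self.lr = learning_rate

    def step(self, inputs, targets):
        out = self.net.forward(Blob(inputs))
        n = len(targets)
        loss = sum((y - t) ** 2 for y, t in zip(out.data, targets)) / n
        out.diff = [2.0 * (y - t) / n for y, t in zip(out.data, targets)]
        self.net.backward()
        for layer in self.net.layers:
            layer.weight.data[0] -= self.lr * layer.weight.diff[0]
        return loss


net = Net([ScaleLayer(0.0)])
solver = Solver(net, learning_rate=0.05)
# Fit y = 2x; the learned weight should approach 2.
losses = [solver.step([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]) for _ in range(50)]
```

In real Caffe the net and solver settings would live in .prototxt files rather than Python objects; the sketch only shows who owns what.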




Torch
  • From NYU
  • Written in C and Lua
  • Used at Facebook and DeepMind
  • Lua - High-level scripting language intended for embedded devices, similar to JavaScript
  • JIT compilation to make things fast
  • Learn Lua in 5 mins site
  • Torch tensors are just like numpy arrays
  • GPU is just another data type
  • optim package implements momentum, Adam
  • Caffe has Nets and Layers
  • Torch just has modules
  • Modules are classes written in Lua
  • Containers to combine multiple modules
  • nngraph - Hook up more complex topologies easily
  • Not great for RNN
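As a rough illustration of the kind of update rule the optim package provides, here is SGD with momentum on a one-dimensional problem. This is a Python sketch rather than Lua, and `sgd_momentum`, its parameters and the quadratic objective are hypothetical names chosen for the example, not part of Torch.

```python
def sgd_momentum(grad_fn, x, learning_rate=0.1, momentum=0.9, steps=300):
    """Minimise a 1-D function given its gradient, using a velocity term."""
    v = 0.0
    for _ in range(steps):
        # The velocity keeps a decaying memory of past gradients
        v = momentum * v - learning_rate * grad_fn(x)
        x = x + v
    return x


# Minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = sgd_momentum(lambda x: 2.0 * (x - 3.0), x=0.0)
```

The result should land very close to 3, the minimiser of the quadratic.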
Backward Pass
  • updateGradInput
  • accGradParameters - Accumulate gradients of the parameters, given the gradients received from upstream
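The two backward methods above can be sketched in Python as follows. Real Torch modules are Lua classes; this sketch only borrows the method names (`updateOutput`, `updateGradInput`, `accGradParameters`, `zeroGradParameters`), and the `Linear1D` class with its scalar weight and bias is a hypothetical example.

```python
# Python sketch mirroring the method names of a Torch7 nn module.
# Real Torch modules are written in Lua; this class is only an illustration.

class Linear1D:
    """y = w * x + b with scalar w and b, applied to each element of a list."""
    def __init__(self, w, b):
        self.w, self.b = w, b
        self.grad_w, self.grad_b = 0.0, 0.0

    def updateOutput(self, inp):
        # Forward pass
        return [self.w * x + self.b for x in inp]

    def updateGradInput(self, inp, grad_output):
        # Gradient of the loss with respect to the module's input
        return [self.w * g for g in grad_output]

    def accGradParameters(self, inp, grad_output, scale=1.0):
        # Accumulate (add to, not overwrite) the parameter gradients
        self.grad_w += scale * sum(x * g for x, g in zip(inp, grad_output))
        self.grad_b += scale * sum(grad_output)

    def zeroGradParameters(self):
        self.grad_w, self.grad_b = 0.0, 0.0


layer = Linear1D(w=3.0, b=1.0)
x = [1.0, 2.0]
out = layer.updateOutput(x)
upstream = [1.0, 1.0]                 # gradient arriving from upstream
grad_in = layer.updateGradInput(x, upstream)
layer.accGradParameters(x, upstream)
layer.accGradParameters(x, upstream)  # a second call adds to the first
```

Because the parameter gradients accumulate across calls, they have to be zeroed before each new training step.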
Workflow in Torch
  • Preprocess data
  • Train a model in Lua / Torch
  • Use Trained model


Theano
  • University of Montreal
  • High Level Wrappers - Keras, Lasagne
  • Computational graphs
  • Debugging hard
Lasagne - High Level Wrapper for Theano
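The computational graph idea can be shown with a toy symbolic graph in plain Python: build an expression first, derive a gradient expression from it, and only then plug in numbers. None of these classes are Theano's API; `Node`, `Var`, `Const`, `Add` and `Mul` are hypothetical names for the sketch.

```python
# Toy symbolic graph, loosely in the spirit of Theano's "build the graph,
# then evaluate it" style. These classes are NOT Theano's API.

class Node:
    def __add__(self, other):
        return Add(self, other)

    def __mul__(self, other):
        return Mul(self, other)


class Var(Node):
    def __init__(self, name):
        self.name = name

    def eval(self, env):
        return env[self.name]

    def grad(self, wrt):
        return Const(1.0 if self is wrt else 0.0)


class Const(Node):
    def __init__(self, value):
        self.value = value

    def eval(self, env):
        return self.value

    def grad(self, wrt):
        return Const(0.0)


class Add(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def eval(self, env):
        return self.a.eval(env) + self.b.eval(env)

    def grad(self, wrt):
        return self.a.grad(wrt) + self.b.grad(wrt)


class Mul(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def eval(self, env):
        return self.a.eval(env) * self.b.eval(env)

    def grad(self, wrt):
        # Product rule, returned as a new symbolic expression
        return self.a.grad(wrt) * self.b + self.a * self.b.grad(wrt)


x, y = Var("x"), Var("y")
z = x * y + x                 # nothing is computed yet, only the graph is built
dz_dx = z.grad(x)             # another graph, representing y + 1
value = z.eval({"x": 2.0, "y": 3.0})
slope = dz_dx.eval({"x": 2.0, "y": 3.0})
```

The gap between where the graph is built and where it runs is one reason debugging is hard: errors surface at evaluation time, far from the line that defined the expression.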





TensorFlow
  • Similar to Theano
  • From Google, built by professional engineers
  • First major framework built from the ground up in industry
  • Create Placeholders for data and labels - Create input nodes
  • Initialize variables with numpy arrays
  • Compute Score, Probs, Loss
  • SGD to minimise loss
  • Wrap it in Session Code
  • Labels - In many frameworks y is always an integer class index
  • One hot - A vector where everything is zero except the entry for the correct class
  • TensorFlow wants one hot
  • TensorBoard to visualise the network
  • Async or Sync training
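The steps listed above (inputs and labels, scores, probs, loss, SGD) and the one hot label format can be sketched without TensorFlow at all. The following is a plain Python linear softmax classifier, not TensorFlow code: there are no placeholders or sessions here, and `one_hot`, `softmax`, `train_step` and the toy data are hypothetical names and values chosen for illustration.

```python
import math


def one_hot(label, num_classes):
    """Turn an integer class index into a vector with a single 1.0."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(W, xs, ys_one_hot, learning_rate):
    """One SGD step on a linear softmax classifier; returns the mean loss."""
    num_classes, dim = len(W), len(W[0])
    grad = [[0.0] * dim for _ in range(num_classes)]
    loss = 0.0
    for x, y in zip(xs, ys_one_hot):
        scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        probs = softmax(scores)
        # Cross-entropy against the one hot label
        loss -= sum(yk * math.log(pk) for yk, pk in zip(y, probs))
        for k in range(num_classes):
            for j in range(dim):
                grad[k][j] += (probs[k] - y[k]) * x[j]
    n = len(xs)
    for k in range(num_classes):
        for j in range(dim):
            W[k][j] -= learning_rate * grad[k][j] / n
    return loss / n


xs = [[1.0, 0.0], [0.0, 1.0]]         # two toy inputs
labels = [0, 1]                       # integer class indices
ys = [one_hot(y, 2) for y in labels]  # converted to one hot vectors
W = [[0.0, 0.0], [0.0, 0.0]]          # weights initialised to zero
losses = [train_step(W, xs, ys, learning_rate=0.5) for _ in range(20)]
```

In TensorFlow the same pieces would be declared once as graph nodes and then run repeatedly inside a session, instead of being recomputed by hand in a Python loop.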
Projects and Architecture Inputs
#1. Image Captioning
  • Need pretrained models
  • Need RNNs
#2. Semantic Segmentation
  • Need pretrained model
  • Need loss function
#3. Object Detection
  • Pretrained models
  • Custom imperative code
  • Caffe + Python





Keras - Good Presentation

Happy Mastering DL!!!
