Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): Day #173 - CS231N - Lecture 11: Deep Learning libraries - Notes

December 26, 2018

Key Summary

Caffe

From Berkeley
Widely used for CNN
Written in C++
Python and Matlab bindings
Good for standard feedforward vanilla CNN
Blob - Weights, Pixel values, Intermediate values (n dimensional tensor)
Layer - Function - Input / Output blob
Common Problem - Not much documentation on layer types
Net - Combines bunch of layers
Solver - Runs forward / backward passes through the net, updates weights, saves and resumes checkpoints
Update rules such as SGD and RMSProp are implemented in the solver
Protocol buffers - Binary, strongly typed format (like a typed JSON) for serializing data to disk
caffe.proto - Defines all the protocol buffer message types
Convert Data - Convert to LMDB file format
.prototxt file to define the net
Solver - Learning Rate, Regularization rates
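
To make the solver idea concrete, here is a minimal sketch in plain Python of what a solver loop does with a learning rate and a regularization rate. This is not Caffe's API; the names sgd_step, solve and toy_grad are made up for illustration, and the loss is a toy quadratic rather than a real net.

```python
def sgd_step(weights, grads, learning_rate=0.1, weight_decay=0.0):
    """One plain SGD update with L2 regularization (weight decay)."""
    return [w - learning_rate * (g + weight_decay * w)
            for w, g in zip(weights, grads)]


def solve(weights, grad_fn, num_iters=100, learning_rate=0.1, weight_decay=0.0):
    """Toy 'solver' loop: repeatedly compute gradients and update weights."""
    for _ in range(num_iters):
        grads = grad_fn(weights)
        weights = sgd_step(weights, grads, learning_rate, weight_decay)
    return weights


def toy_grad(weights):
    """Gradient of the toy loss L(w) = sum((w_i - 3)^2)."""
    return [2.0 * (w - 3.0) for w in weights]


# Without regularization the weights move towards the minimum at 3.
final = solve([0.0, 10.0], toy_grad, num_iters=200, learning_rate=0.1)

# With weight decay the solution is pulled towards zero.
final_decay = solve([0.0], toy_grad, num_iters=200,
                    learning_rate=0.1, weight_decay=0.5)
```

In Caffe itself these hyperparameters are set in the solver definition file rather than in code; the loop above only shows what they control.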

Torch

NYU
Written in C and Lua
Used at Facebook and DeepMind
Lua - High-level scripting language, designed to be embedded, similar to JavaScript
JIT compilation to make things fast
Learn Lua in 5 mins site
Torch tensors are just like numpy arrays
GPU is just another data type
optim package implements momentum, Adam
Caffe has Nets and Layers
Torch just has modules
Modules are classes written in Lua
Containers to combine multiple modules
nngraph - Hook up more complex topologies easily
Not great for RNN
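
The "modules plus containers" idea can be sketched in plain Python. This is only an analogue of Torch's Lua classes, not Torch code: Scale and ReLU are hypothetical toy modules, and Sequential mimics the role of a container that chains modules.

```python
class Scale:
    """Toy module: multiplies every input element by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def forward(self, x):
        return [self.factor * v for v in x]


class ReLU:
    """Toy module: elementwise max(0, x)."""

    def forward(self, x):
        return [max(0.0, v) for v in x]


class Sequential:
    """Container: feeds the output of each module into the next one."""

    def __init__(self, *modules):
        self.modules = list(modules)

    def add(self, module):
        self.modules.append(module)
        return self

    def forward(self, x):
        for module in self.modules:
            x = module.forward(x)
        return x


net = Sequential(Scale(2.0), ReLU())
net.add(Scale(0.5))
out = net.forward([-1.0, 3.0])

# A container is itself a module, so containers can be nested.
bigger = Sequential(net, Scale(2.0))
out_nested = bigger.forward([1.0])
```

Because a container exposes the same forward interface as any other module, there is no need for a separate Net concept as in Caffe.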

Backward Pass

updateGradInput - Receives gradients from upstream, computes the gradient with respect to the input
accGradParameters - Accumulates the gradients with respect to the module's parameters
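
A rough Python analogue of the two backward-pass methods, assuming a toy module y = w * x with a single scalar weight. The method names are snake_case versions of the Torch names; the Scale class and its attributes are hypothetical and only meant to show which gradient each method is responsible for.

```python
class Scale:
    """Toy module y = w * x (elementwise) with one learnable scalar w."""

    def __init__(self, w):
        self.w = w
        self.grad_w = 0.0  # accumulated dL/dw

    def update_output(self, x):
        # Forward pass.
        self.output = [self.w * v for v in x]
        return self.output

    def update_grad_input(self, x, grad_output):
        # dL/dx_i = dL/dy_i * w, using the gradient received from upstream.
        self.grad_input = [self.w * g for g in grad_output]
        return self.grad_input

    def acc_grad_parameters(self, x, grad_output):
        # dL/dw = sum_i dL/dy_i * x_i, added to whatever is already stored.
        self.grad_w += sum(g * v for g, v in zip(grad_output, x))

    def backward(self, x, grad_output):
        # Full backward pass: input gradient first, then parameter gradient.
        grad_input = self.update_grad_input(x, grad_output)
        self.acc_grad_parameters(x, grad_output)
        return grad_input

    def zero_grad_parameters(self):
        self.grad_w = 0.0


m = Scale(3.0)
x = [1.0, 2.0]
y = m.update_output(x)
grad_x = m.backward(x, [1.0, 1.0])
grad_w_once = m.grad_w

# Calling backward again accumulates instead of overwriting.
m.backward(x, [1.0, 1.0])
grad_w_twice = m.grad_w
```

Since parameter gradients accumulate, they have to be reset (zero_grad_parameters here) before each new training step.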

Workflow in Torch

Preprocess data
Train a model in Lua / Torch
Use Trained model

Theano

University of Montreal
High Level Wrappers - Keras, Lasagne
Computational graphs
Debugging hard
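
A tiny sketch of what "computational graph" means, in plain Python rather than Theano. The graph is built symbolically first and only evaluated later with concrete inputs. Node, variable, add, mul, evaluate and grad are all hypothetical names, and the gradient here is computed by simple recursion, which is not how Theano derives its symbolic gradients.

```python
class Node:
    """A symbolic node: nothing is computed until evaluate() is called."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "input", "add", or "mul"
        self.inputs = inputs
        self.name = name


def variable(name):
    return Node("input", name=name)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


def evaluate(node, feed):
    """Compute the value of `node` given input values in `feed`."""
    if node.op == "input":
        return feed[node.name]
    a, b = (evaluate(n, feed) for n in node.inputs)
    return a + b if node.op == "add" else a * b


def grad(node, wrt, feed):
    """Derivative of `node` with respect to the input named `wrt`."""
    if node.op == "input":
        return 1.0 if node.name == wrt else 0.0
    a, b = node.inputs
    da, db = grad(a, wrt, feed), grad(b, wrt, feed)
    if node.op == "add":
        return da + db
    # Product rule for "mul".
    return da * evaluate(b, feed) + evaluate(a, feed) * db


x, y, z = variable("x"), variable("y"), variable("z")
f = mul(add(x, y), z)   # f = (x + y) * z, still purely symbolic
feed = {"x": 1.0, "y": 2.0, "z": 4.0}
value = evaluate(f, feed)
df_dx = grad(f, "x", feed)
df_dz = grad(f, "z", feed)
```

The split between building f and evaluating it is also one reason debugging is hard: an error shows up when the graph runs, far from the line that defined the faulty node.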

Lasagne - High-level wrapper for Theano

TensorFlow

Similar to Theano
Built by professional engineers at Google
First major framework built from the ground up in industry
Create Placeholders for data and labels - Create input nodes
Initialize variables with numpy arrays
Compute Score, Probs, Loss
SGD to minimise loss
Run the graph inside a Session
Labels - In most frameworks y is an integer class index
One-hot - In some frameworks y is a vector where everything is zero except the correct class
TensorFlow wants one-hot
TensorBoard to visualise the network
Async or Sync training
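
The scores, probs and loss steps, together with one-hot labels, can be sketched in plain Python. This is not TensorFlow code (no placeholders, variables or Session); one_hot, softmax and cross_entropy are hypothetical helper names, and the scores are made-up numbers standing in for the output of a network.

```python
import math


def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    return [[1.0 if c == y else 0.0 for c in range(num_classes)]
            for y in labels]


def softmax(scores):
    """Numerically stable softmax over one row of class scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Loss for one example; `target` is a one-hot row."""
    return -sum(t * math.log(p) for p, t in zip(probs, target))


labels = [2, 0]                               # integer class indices
targets = one_hot(labels, num_classes=3)      # one-hot rows
scores = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]   # made-up network outputs
probs = [softmax(row) for row in scores]
losses = [cross_entropy(p, t) for p, t in zip(probs, targets)]
mean_loss = sum(losses) / len(losses)

# Gradient of the loss with respect to the scores is (probs - target);
# an SGD step would move the scores' upstream weights against it.
grad_scores = [[p - t for p, t in zip(pr, tg)]
               for pr, tg in zip(probs, targets)]
```

The first example has its correct class at the highest score, so its loss is lower than that of the second example, where the correct class has the lowest score.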

Projects and Architecture Inputs

#1. Image Captioning

Need pretrained models
Need RNNs

#2. Semantic Segmentation

Need pretrained model
Need loss function

#3. Object Detection

Pretrained models
Custom imperative code
Caffe + Python

Keras - Good Presentation

Happy Mastering DL!!!
