- RNNs for modelling sequences - Vanilla RNN, LSTM (step sketch below)
- RNN for language models
- CNN + RNN for image captioning
- Feedforward networks - a fixed feedforward function of the input, with no state carried across time
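A minimal numpy sketch of one vanilla RNN step, as a reminder of what the recurrence looks like; the weight names (Wxh, Whh, Why) and the toy dimensions are illustrative, not from the notes:

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, Why, bh, by):
    # Vanilla RNN: new hidden state from input and previous state, then an output
    h = np.tanh(Wxh @ x + Whh @ h_prev + bh)
    y = Why @ h + by
    return h, y

rng = np.random.default_rng(0)
D_in, D_h, D_out = 10, 20, 5                      # toy sizes
Wxh = 0.01 * rng.standard_normal((D_h, D_in))
Whh = 0.01 * rng.standard_normal((D_h, D_h))
Why = 0.01 * rng.standard_normal((D_out, D_h))
bh, by = np.zeros(D_h), np.zeros(D_out)

h = np.zeros(D_h)                                 # initial hidden state
for x in rng.standard_normal((7, D_in)):          # a length-7 input sequence
    h, y = rnn_step(x, h, Wxh, Whh, Why, bh, by)
```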
Making the Most of Data
- Data Augmentation
- Images + Labels -> CNN -> Compute Loss -> Backpropagate
- Images + Transformations + Labels -> CNN -> Compute Loss -> Backpropagate
- Artificially expand training set, Preserve Labels, Widely used in practice
- Types of Transformation
- Horizontal Flip
- Random Crops / Samples from Training Images / Random Scale and Rotation
- Color Jitter (Randomly jitter contrast)
- Color Jitter with PCA
- In general - a random mix of translation, rotation, stretching, shearing, lens distortions (see the augmentation sketch below)
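A minimal numpy sketch of the label-preserving transforms listed above (the crop size, flip probability, and jitter ranges are illustrative assumptions):

```python
import numpy as np

def augment(img, crop=224):
    """Random crop + horizontal flip + simple color jitter; the label is unchanged."""
    H, W, _ = img.shape
    top = np.random.randint(0, H - crop + 1)       # random crop location
    left = np.random.randint(0, W - crop + 1)
    out = img[top:top + crop, left:left + crop].astype(np.float32)
    if np.random.rand() < 0.5:                     # horizontal flip half the time
        out = out[:, ::-1]
    # Color jitter: random contrast scale and brightness shift (ranges are arbitrary)
    out = out * np.random.uniform(0.8, 1.2) + np.random.uniform(-10, 10)
    return np.clip(out, 0, 255)

img = np.random.randint(0, 256, size=(256, 256, 3)).astype(np.uint8)
aug = augment(img)                                 # shape (224, 224, 3)
```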
- Dropout / DropConnect - randomly set activations (Dropout) or weights (DropConnect) to zero during training (see the sketch after this list)
- Simple to implement, Use it
- Useful for small datasets
- Fits into the framework of training with noise and marginalizing over it at test time
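A minimal sketch of inverted dropout (p = 0.5 is just the common default; DropConnect would apply the mask to the weights instead of the activations):

```python
import numpy as np

def dropout_forward(x, p=0.5, train=True):
    # Inverted dropout: zero activations with probability p at train time and
    # rescale by 1/(1-p) so that test time needs no change.
    if not train:
        return x
    mask = (np.random.rand(*x.shape) >= p) / (1.0 - p)
    return x * mask

h = np.random.randn(4, 100)                  # a batch of hidden activations
h_train = dropout_forward(h, p=0.5, train=True)
h_test = dropout_forward(h, train=False)     # identity at test time
```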
- "You need a lot of data if you want to train / use CNNs" - not quite true, thanks to transfer learning
- Train on ImageNet, or download a pre-trained model
- Treat it as feature extractor
- Replace last layer with Linear Classifier
- Freeze network and retrain top layer
- Or train only the final few layers (fine-tuning)
- Works better for similar types of data
- Low-level features (edges, color blobs, Gabor-like filters) transfer to any type of visual data
- Image captioning also reuses pre-trained word vectors (see the feature-extractor sketch below)
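A sketch of the freeze-and-replace recipe above, assuming PyTorch / torchvision (the notes don't name a framework, and ResNet-18 is just an arbitrary pre-trained ImageNet model):

```python
import torch
import torch.nn as nn
from torchvision import models

# Download a model pre-trained on ImageNet
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the network: treat it as a fixed feature extractor
for param in model.parameters():
    param.requires_grad = False

# Replace the last layer with a fresh linear classifier for the target classes
num_classes = 10                                   # e.g. a small target dataset
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Retrain only the new top layer
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
```

With a somewhat larger target dataset, the same pattern can fine-tune the last few layers by leaving their `requires_grad` set to `True`.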
- Convolutions: the computational workhorse
- A stack of three 3x3 convolutions has the same receptive field as one 7x7 convolution
- Assume input H x W x C and C filters at stride 1 (padding to preserve size)
- Replace Large Convolutions (5x5, 7x7) with stacks of 3 x 3 convolutions
- 1 x 1 bottleneck convolutions are very efficient
- Can factor N x N convolutions into 1 x N and N x 1
- All of the above give fewer parameters, less compute, and more non-linearity (parameter counts in the sketch below)
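A quick parameter count (biases ignored; C = 64 is an illustrative choice) showing why the factorizations above are cheaper:

```python
C = 64                                             # input and output channels (illustrative)

# One 7x7 conv, C -> C channels
params_7x7 = 7 * 7 * C * C                         # 200,704

# Three stacked 3x3 convs (same 7x7 receptive field), C -> C each
params_3x3_stack = 3 * (3 * 3 * C * C)             # 110,592, plus two extra non-linearities

# 1x1 bottleneck sandwich: 1x1 (C -> C/2), 3x3 (C/2 -> C/2), 1x1 (C/2 -> C)
params_bottleneck = C * (C // 2) + 3 * 3 * (C // 2) ** 2 + (C // 2) * C   # 13,312

print(params_7x7, params_3x3_stack, params_bottleneck)
```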
- im2col (convolution recast as matrix multiply)
- im2col has memory overhead: overlapping patches duplicate input pixels
- Each filter has depth C to match the input
- Flatten each filter's weights into a row and compute inner products with the unrolled patches (see the sketch below)
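A minimal im2col sketch (stride 1, no padding, and technically cross-correlation as in most deep learning code), showing both the matrix-multiply recast and the memory overhead from duplicated patch pixels:

```python
import numpy as np

def im2col(x, k):
    """Unroll every k x k x C patch of x (H, W, C) into one row of a matrix."""
    H, W, C = x.shape
    out_h, out_w = H - k + 1, W - k + 1
    cols = np.empty((out_h * out_w, k * k * C))    # overlapping patches duplicate pixels
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = x[i:i + k, j:j + k, :].ravel()
    return cols

def conv_as_matmul(x, weights):
    """weights: (F, k, k, C) with depth C matching the input."""
    F, k, _, C = weights.shape
    cols = im2col(x, k)                            # (out_h*out_w, k*k*C)
    W_mat = weights.reshape(F, -1)                 # each filter flattened into a row
    out = cols @ W_mat.T                           # all inner products as one matrix multiply
    out_h = x.shape[0] - k + 1
    return out.reshape(out_h, -1, F)

x = np.random.randn(8, 8, 3)
w = np.random.randn(4, 3, 3, 3)                    # 4 filters of size 3x3x3
y = conv_as_matmul(x, w)                           # shape (6, 6, 4)
```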
- FFT - convolution theorem: convolution in the spatial domain equals elementwise multiplication in the frequency domain
- Compute the FFT of the weights and of the input image
- Multiply them elementwise
- Compute the inverse FFT; gives a speed-up only for larger filters
- FFT doesn't help much in practice for the small (e.g. 3x3) filters that dominate modern nets
- FFT doesn't handle striding well (sketch below)
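A numpy sketch of the convolution theorem in use (true convolution with a flipped kernel, stride 1); as noted above, the win only appears for fairly large filters:

```python
import numpy as np

def fft_conv2d(img, kernel):
    # Convolution theorem: FFT both signals, multiply elementwise, inverse FFT.
    H, W = img.shape
    k, _ = kernel.shape
    size = (H + k - 1, W + k - 1)                  # pad so circular convolution == linear
    F_img = np.fft.rfft2(img, size)
    F_ker = np.fft.rfft2(kernel, size)
    full = np.fft.irfft2(F_img * F_ker, size)      # elementwise product in frequency domain
    return full[k - 1:H, k - 1:W]                  # keep only the 'valid' region

img = np.random.randn(64, 64)
ker = np.random.randn(7, 7)
out = fft_conv2d(img, ker)                         # shape (58, 58)
```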
- Strassen's algorithm - multiplies matrices in about O(N^2.81)
- Naive matrix multiplication is O(N^3) (see the sketch below)
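A minimal Strassen sketch for square power-of-two matrices (the leaf size of 64 is an arbitrary cutoff below which it falls back to the naive multiply):

```python
import numpy as np

def strassen(A, B, leaf=64):
    """Strassen: 7 recursive multiplies per level instead of 8 -> about O(N^2.81)."""
    n = A.shape[0]
    if n <= leaf:
        return A @ B                               # naive O(N^3) multiply on small blocks
    m = n // 2                                     # assumes n is a power of two
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    M1 = strassen(A11 + A22, B11 + B22, leaf)
    M2 = strassen(A21 + A22, B11, leaf)
    M3 = strassen(A11, B12 - B22, leaf)
    M4 = strassen(A22, B21 - B11, leaf)
    M5 = strassen(A11 + A12, B22, leaf)
    M6 = strassen(A21 - A11, B11 + B12, leaf)
    M7 = strassen(A12 - A22, B21 + B22, leaf)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

A, B = np.random.randn(256, 256), np.random.randn(256, 256)
assert np.allclose(strassen(A, B), A @ B)
```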
Processing
- NVIDIA is much more common for GPU deep learning
- GPU good at matrix multiplication
- Floating point precision matters for speed and memory
- 16-bit floating point kernels from Nervana
- Lower precision makes things faster and still works (see the fp16 sketch below)
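A small numpy illustration of the precision trade-off: float16 halves memory and bandwidth at a modest cost in accuracy (float16 arithmetic is emulated on CPU in numpy, so the actual speed-up needs GPU / hardware support such as the kernels mentioned above):

```python
import numpy as np

W = np.random.randn(1024, 1024).astype(np.float32)
x = np.random.randn(1024).astype(np.float32)

W16, x16 = W.astype(np.float16), x.astype(np.float16)   # half the storage
y32 = W @ x
y16 = (W16 @ x16).astype(np.float32)

print(W.nbytes // 1024, "KiB vs", W16.nbytes // 1024, "KiB")
print("max relative error:", np.max(np.abs(y16 - y32) / (np.abs(y32) + 1e-8)))
```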