"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 19, 2018

Day #168 - Lecture 7: Convolutional Neural Networks - Deep Learning Class Notes - CS231N

Key Summary
  • Data loss, Regularization loss
  • Optimization (SGD, Momentum, Adagrad, Adadelta, RMSProp)
  • Cortex arranged as simple to complex cells
  • CNN operate over volumes
  • Convolve the filter with the image (Slide over image spatially computing dot products)
  • Activation maps (each filter produces an independent activation map)
  • Convolution Operation -> Activation Maps
  • Input - Low-Level Features - Mid-Level Features - High-Level Features - Trainable Classifier
  • Building Hierarchy of features and composing them spatially
  • Convolutional Layer - ReLU (non-linearity) - Pooling - Fully Connected
  • 3D volumes of higher level abstraction
  • Stride - how far the filter shifts at each step
  • Output size = (N - F)/Stride + 1
  • F - filter size, N - input size (see the sketch after this list)
  • Pad the input with zeros on the borders to preserve the spatial size so it doesn't shrink too fast; with padding P the output size becomes (N - F + 2P)/Stride + 1, and the added zeros contribute nothing extra to the dot products
  • Filter sizes are usually odd numbers; odd filters have a nice, centered representation
  • Smallest in common use is 3 x 3 - convenience
  • Depth becomes large while the spatial dimensions become small - this is handled fine
  • All images resized to squares by default
  • Filters of different size not used in same layer
  • Only do convolution where data is present
  • More computation at fixed spatial size
  • Each convolution operates on the output of the previous layer (the first one on the input image)
  • Filter and kernel are used interchangeably
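
A minimal sketch of the output-size arithmetic from the bullets above; the function name and example numbers are mine, not from the lecture:

```python
def conv_output_size(n, f, stride=1, pad=0):
    """Spatial output size of a convolution: (N - F + 2*P) / stride + 1."""
    return (n - f + 2 * pad) // stride + 1

# 7 x 7 input, 3 x 3 filter, stride 1: (7 - 3)/1 + 1 = 5
print(conv_output_size(7, 3, stride=1))           # 5
# zero padding of 1 preserves the spatial size: (7 - 3 + 2)/1 + 1 = 7
print(conv_output_size(7, 3, stride=1, pad=1))    # 7
```
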
Neuron View of Convolution layer
  • Filter computes dot product on inputs
  • Output of filter is neuron fixed in space
  • Local connectivity pattern
  • Neuron Receptive field 5 x 5
  • All neurons in one activation map share the same weights and local connectivity but operate on different parts of the image (sketched below)
  • With zero padding, conv operations can preserve the spatial size
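
A naive single-channel sketch of this neuron view - every output value is one neuron computing a dot product over its receptive field (plain loops, no library conv call; names and shapes are mine):

```python
import numpy as np

def conv_single_channel(image, filt, stride=1):
    """Slide one filter over a 2D input; every output value is a dot product
    between the filter and the local patch it is connected to."""
    n, f = image.shape[0], filt.shape[0]
    out = (n - f) // stride + 1
    activation_map = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            patch = image[i * stride:i * stride + f, j * stride:j * stride + f]
            activation_map[i, j] = np.sum(patch * filt)  # same weights, different location
    return activation_map

print(conv_single_channel(np.random.randn(7, 7), np.random.randn(3, 3)).shape)  # (5, 5)
```
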



Pooling

  • Performs a downsampling operation
  • Activations are downsampled
  • MAX pooling - shrinks the size by taking the max in each window
  • A 2 x 2 filter with stride 2 reduces each spatial dimension by half (see the sketch below)
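
A small sketch of 2 x 2 max pooling with stride 2 (plain loops; the function name is mine):

```python
import numpy as np

def max_pool(activation, size=2, stride=2):
    """Keep only the largest value in each window, halving each spatial dimension."""
    n = activation.shape[0]
    out = (n - size) // stride + 1
    pooled = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            pooled[i, j] = activation[i * stride:i * stride + size,
                                      j * stride:j * stride + size].max()
    return pooled

print(max_pool(np.arange(16.0).reshape(4, 4)))  # [[ 5.  7.] [13. 15.]] - a 4 x 4 map becomes 2 x 2
```
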



FC
  • Ended up 4 x 4 x 10 
  • Has a neuron fully connected to output
  • Last matrix operation
  • Contains neurons that connect to the entire input volume as in ordinary neural networks
  • Anything you can backpropagate through can be put into a ConvNet
  • Visualized first-layer filters show edge-like patterns clearly
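
A sketch of that last fully connected step - flatten the final conv volume and do one matrix multiply (the weight values here are random placeholders):

```python
import numpy as np

volume = np.random.randn(4, 4, 10)        # e.g. the 4 x 4 x 10 volume above
num_classes = 10
W = 0.01 * np.random.randn(num_classes, volume.size)  # every output connects to every input value
b = np.zeros(num_classes)

scores = W @ volume.reshape(-1) + b       # the last matrix operation
print(scores.shape)                       # (10,)
```
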
LeNet
  • 32 x 32 input image
  • 6 filters 5 x 5 - Brought it down to 28 x 28
  • Subsample / Maxpooling
  • Stride 1
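
Worked out with the output-size formula from the summary: (32 - 5)/1 + 1 = 28, so six 5 x 5 filters at stride 1 turn the 32 x 32 input into a 28 x 28 x 6 volume before subsampling.
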
AlexNet
  • 227 x 227 x 3 images
  • Two separate streams that compute similar things
  • GPU memory was limited at the time, so the network was split across two GPUs
  • First conv layer: 96 filters
  • Output volume: 55 x 55 x 96
  • Pooling 3 x 3 stride 2
  • Zero parameters in pooling layer
  • First to use ReLU
  • Heavy Data Augmentation (Sizing, color)
  • Dropout used only in the last few (fully connected) layers
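
Checking AlexNet's numbers with the same formula: the first conv layer uses 11 x 11 filters at stride 4, so (227 - 11)/4 + 1 = 55, which gives the 55 x 55 x 96 output above; the 3 x 3 pooling at stride 2 then gives (55 - 3)/2 + 1 = 27.
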
ZFNet
  • Built on top of AlexNet
  • Modifications done on Conv1
VGGNet
  • 3 x 3 filters with stride 1 and pad 1 throughout the conv net
  • 16 layer model performed best
  • 224 x 224 image 
  • Filters increase as spatial size decreases
  • 140 million total parameters
  • Most memory is used in the very first few convolution layers
  • Linear structure
  • Nicer and Uniform Architecture
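
A quick helper (mine, not from the lecture) for counting parameters in one conv layer - useful for sanity-checking where the roughly 140 million parameters come from:

```python
def conv_params(f, depth_in, depth_out):
    """Weights plus biases for a conv layer with depth_out filters of size f x f x depth_in."""
    return f * f * depth_in * depth_out + depth_out

# first VGG conv layer: 64 filters of 3 x 3 over an RGB image
print(conv_params(3, 3, 64))      # 1792
# a deep 3 x 3 layer mapping 512 -> 512 channels
print(conv_params(3, 512, 512))   # 2359808
```

Most of the parameters actually sit in the fully connected layers at the end, while the early conv layers dominate the memory used for activations.
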
GoogLeNet
  • Introduced inception module
ResNet (Residual net)
  • Residual Network
  • MS Research Asia
  • More layers work better, but only if you are careful about how the depth is increased
  • The winning model had 152 layers
  • ResNet has skip connections
  • Batch Normalization
  • Xavier/2 initialization (from He et al.)
  • SGD + Momentum
  • Mini-Batch size 256
  • Rapid reduction in spatial size early on, so capacity can be packed into more layers

ResNet

  • Stacked up Layers
  • Regular block plus the residual input added back in - the skip connection
  • Variants exist (batch norm after addition, ReLU before addition, pre-activation where batch norm and ReLU come first)
  • ResNet-18 - 18 weight layers

Skip Connection

  • Output is f(x) + x
  • The residual is learned: f(x) = y - x
  • Works as a refinement of the input (sketched below)
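
A minimal NumPy sketch of the skip connection, where f stands for whatever stack of conv / batch-norm layers sits inside the block (the function names are mine):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def residual_block(x, f):
    # y = f(x) + x : the block only has to learn the residual f(x) = y - x
    return relu(f(x) + x)

# if f outputs zeros, the block just passes (the ReLU of) x straight through,
# which is why the skip connection acts as a refinement of the input
x = np.random.randn(4, 8)
y = residual_block(x, lambda v: np.zeros_like(v))
assert np.allclose(y, relu(x))
```
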

Happy Mastering DL!!!
