Key Summary
Pooling
FC
Resnet
Skip Connection
- Total loss = data loss + regularization loss
- Optimization methods: SGD, Momentum, Adagrad, Adadelta, RMSProp (SGD + momentum sketched below)
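A minimal NumPy sketch of the SGD + momentum update mentioned above; the learning rate and momentum values are illustrative assumptions, not from these notes:

```python
import numpy as np

def sgd_momentum(w, dw, v, lr=1e-2, mu=0.9):
    # Velocity accumulates a decaying sum of past gradients,
    # then the parameters move along the velocity.
    v = mu * v - lr * dw
    return w + v, v

# Illustrative usage with random stand-ins for parameters/gradient.
w = np.random.randn(10)
v = np.zeros_like(w)
dw = np.random.randn(10)
w, v = sgd_momentum(w, dw, v)
```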
- The visual cortex is arranged in a hierarchy from simple to complex cells
- CNNs operate over volumes
- Convolve the filter with the image (slide over the image spatially, computing dot products)
- Each independent filter produces its own activation map
- Convolution operation -> activation maps (a minimal sketch follows)
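A minimal sketch of the sliding-window dot product described above, assuming a single-channel square image and one filter (all shapes are illustrative):

```python
import numpy as np

def conv2d_naive(image, filt, stride=1):
    # Slide the filter over the image spatially, computing a dot
    # product at each location (valid convolution, no padding).
    N, F = image.shape[0], filt.shape[0]
    out_size = (N - F) // stride + 1
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            patch = image[i*stride:i*stride+F, j*stride:j*stride+F]
            out[i, j] = np.sum(patch * filt)  # dot product
    return out

activation_map = conv2d_naive(np.random.randn(7, 7), np.random.randn(3, 3))
print(activation_map.shape)  # (5, 5): (7 - 3)/1 + 1
```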
- Input -> low-level features -> mid-level features -> high-level features -> trainable classifier
- Build a hierarchy of features and compose them spatially
- Convolutional layer -> ReLU (non-linearity) -> pooling -> fully connected (sketched below)
- Each stage produces a 3D volume at a higher level of abstraction
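The Conv -> ReLU -> Pool -> FC pattern, sketched in PyTorch; the layer widths and the 32 x 32 input are assumptions for illustration:

```python
import torch
import torch.nn as nn

# One Conv -> ReLU -> Pool stage followed by a fully connected
# classifier; sizes are illustrative, not from the post.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # preserves spatial size
    nn.ReLU(),
    nn.MaxPool2d(2),                             # downsample by half
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),                 # for 32x32 inputs
)
out = model(torch.randn(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 10])
```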
- Stride: how far you shift the filter each step
- Output size = (N - F)/stride + 1
- F: filter size, N: input size
- Pad the input borders with zeros to preserve spatial size, so the volume doesn't shrink too fast; the zeros contribute nothing further to the dot products (checked below)
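A quick check of the output-size formula, including the padded form (N - F + 2P)/stride + 1, which follows from the same reasoning:

```python
def conv_output_size(N, F, stride=1, pad=0):
    # Spatial output size of a convolution: (N - F + 2*pad)/stride + 1.
    assert (N - F + 2 * pad) % stride == 0, "filter doesn't tile the input"
    return (N - F + 2 * pad) // stride + 1

print(conv_output_size(32, 5))         # 28: size shrinks without padding
print(conv_output_size(32, 5, pad=2))  # 32: zero padding preserves size
```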
- Filter sizes are usually odd; odd filters have a well-defined center
- The smallest common size is 3 x 3, for convenience
- Large depth with small spatial dimensions is fine and is handled naturally
- All images are resized to squares by default
- Filters of different sizes are not used in the same layer
- Only do convolution where data is present
- More computation at fixed spatial size
- Each convolution operates on the previous layer's output volume
- "Filter" and "kernel" are used interchangeably
- A filter computes a dot product over its inputs
- Each output of a filter is a neuron fixed in space
- Local connectivity pattern
- Each neuron has a receptive field, e.g., 5 x 5
- All neurons share the same weights (local connectivity) but operate on different parts of the image
- With padding, conv operations preserve spatial size
Pooling
- Performs a downsampling operation
- Activations are downsampled
- Max pooling shrinks the size by keeping the max in each window
- Typically reduces each spatial dimension by half (sketched below)
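A minimal NumPy sketch of 2 x 2 max pooling with stride 2, which halves each spatial dimension:

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    # Keep the max of each window, halving each spatial dimension.
    H, W = x.shape
    out = np.zeros((H // stride, W // stride))
    for i in range(0, H - size + 1, stride):
        for j in range(0, W - size + 1, stride):
            out[i // stride, j // stride] = x[i:i+size, j:j+size].max()
    return out

x = np.arange(16).reshape(4, 4).astype(float)
print(max_pool(x))  # 4x4 -> 2x2, each entry the max of a 2x2 window
```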
FC
- The conv/pool stack ended up at a 4 x 4 x 10 volume
- Each FC neuron is fully connected to that output volume
- The final matrix operation produces the scores (sketched below)
- Contains neurons that connect to the entire input volume, as in ordinary neural networks
- Anything you can backpropagate through can be put in a ConvNet
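The last matrix operation, sketched in NumPy for the 4 x 4 x 10 volume mentioned above; 10 output classes is an assumption for illustration:

```python
import numpy as np

# A fully connected layer connects to the entire input volume:
# flatten the final 4 x 4 x 10 activations, then one matrix
# multiply maps them to class scores.
volume = np.random.randn(4, 4, 10)
x = volume.reshape(-1)        # 160-dim vector
W = np.random.randn(10, 160)  # each row: one output neuron's weights
b = np.zeros(10)
scores = W @ x + b            # the last matrix operation
print(scores.shape)           # (10,)
```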
- Learned filters with edges pop out clearly when visualized
- 32 x 32 input image
- 6 filters 5 x 5 - Brought it down to 28 x 28
- Subsample / Maxpooling
- Stride 1
- 227 x 227 x 3 images
- Two separate streams that compute similar things
- GPUs at the time could not fit the whole network, hence the split
- First conv layer: 96 filters
- Output 55 x 55 x 96 depth
- Pooling 3 x 3 stride 2
- Zero parameters in pooling layer
- First to use ReLU
- Heavy data augmentation (sizing, color)
- Dropout used only in the last few layers (a size check follows)
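A quick arithmetic check of the conv1 sizes above; the 11 x 11, stride 4 filter shape comes from the AlexNet paper, not from these notes:

```python
# AlexNet conv1: 96 filters of 11x11 applied at stride 4
# to a 227x227x3 input.
N, F, S = 227, 11, 4
out = (N - F) // S + 1
print(out)                     # 55 -> output volume 55 x 55 x 96

# Pooling (3x3, stride 2) has zero parameters; conv1 has:
print(96 * (11 * 11 * 3 + 1))  # 34,944 weights + biases
```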
- Built on top of AlexNet
- Modifications made to conv1
- 3 x 3 filters with stride 1 and pad 1 throughout the conv net
- 16 layer model performed best
- 224 x 224 image
- The number of filters increases as the spatial size decreases
- ~138 million total parameters
- Most memory is consumed in the very first few convolution layers (counted below)
- Linear structure
- Nicer, more uniform architecture
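A quick parameter/memory count for the 3 x 3 convolutions described above; the channel widths (3 -> 64 -> 64) match the standard VGG-16 configuration:

```python
# Parameter count of a 3x3 conv layer: each filter has
# 3*3*depth_in weights plus a bias; VGG stacks these with
# stride 1, pad 1 so the spatial size is preserved.
def conv3x3_params(depth_in, depth_out):
    return depth_out * (3 * 3 * depth_in + 1)

# First two VGG-16 conv layers on a 224x224x3 image.
print(conv3x3_params(3, 64))   # 1,792
print(conv3x3_params(64, 64))  # 36,928
# Memory is dominated by these early layers' activations:
print(224 * 224 * 64)          # ~3.2M numbers per image, per layer
```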
- GoogLeNet introduced the inception module
- Residual Network (ResNet), from Microsoft Research Asia
- More layers work better
- But be careful about naively increasing the number of layers
- 152 layers
- ResNet has skip connections
- Batch Normalization
- Xavier Initialization
- SGD + Momentum
- Mini-Batch size 256
- Rapid early reduction in spatial size, packing all the capacity into more layers (training recipe sketched below)
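The training recipe listed above, sketched in PyTorch; the learning rate is an assumption, the rest follows the bullets:

```python
import torch
import torch.nn as nn

# Batch norm in the model, Xavier initialization, SGD + momentum,
# mini-batch size 256 (values as listed in the notes).
layer = nn.Conv2d(64, 64, kernel_size=3, padding=1)
nn.init.xavier_uniform_(layer.weight)  # Xavier initialization
bn = nn.BatchNorm2d(64)                # batch normalization

optimizer = torch.optim.SGD(
    nn.Sequential(layer, bn).parameters(),
    lr=0.1, momentum=0.9,              # SGD + momentum
)
batch_size = 256
```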
Resnet
- Stacked-up layers
- A regular block with the input added back in: a skip connection
- Variants exist (batch norm after the addition, ReLU last, batch norm first, pre-activation)
- ResNet-18: 18 weight layers
Skip Connection
- f(x) + x
- The residual is learned: f(x) = y - x, so the output is y = f(x) + x
- Works as a refinement of the input (sketched below)
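A minimal PyTorch sketch of a residual block: the stacked layers learn f(x) and the skip connection adds x back. The channel count and the post-addition ReLU placement are one common variant, per the list above:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # The stacked layers learn the residual f(x);
    # the block outputs f(x) + x, a refinement of x.
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        f = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(f + x)  # add the input back: f(x) + x

y = ResidualBlock(16)(torch.randn(1, 16, 8, 8))
print(y.shape)  # torch.Size([1, 16, 8, 8])
```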
Happy Mastering DL!!!