- RNNs for modelling sequences - Vanilla RNN, LSTM (step sketch below)
- RNN for language models
- CNN + RNN for image captioning
- Feedforward networks - a fixed feedforward function of the input, with no state carried across time
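A minimal numpy sketch of one vanilla RNN step, as a reminder of what the recurrence looks like; the weight names (Wxh, Whh, Why) and the toy dimensions are illustrative, not from the notes:

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, Why, bh, by):
    # Vanilla RNN: new hidden state from input and previous state, then an output
    h = np.tanh(Wxh @ x + Whh @ h_prev + bh)
    y = Why @ h + by
    return h, y

rng = np.random.default_rng(0)
D_in, D_h, D_out = 10, 20, 5                      # toy sizes
Wxh = 0.01 * rng.standard_normal((D_h, D_in))
Whh = 0.01 * rng.standard_normal((D_h, D_h))
Why = 0.01 * rng.standard_normal((D_out, D_h))
bh, by = np.zeros(D_h), np.zeros(D_out)

h = np.zeros(D_h)                                 # initial hidden state
for x in rng.standard_normal((7, D_in)):          # a length-7 input sequence
    h, y = rnn_step(x, h, Wxh, Whh, Why, bh, by)
```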
Making the Most of Data
- Data Augmentation
- Images + Labels -> CNN -> Compute Loss -> Backpropagate
- Images + Transformations + Labels -> CNN -> Compute Loss -> Backpropagate
- Artificially expand training set, Preserve Labels, Widely used in practice
- Types of Transformation
- Horizontal Flip
- Random Crops / Samples from Training Images / Random Scale and Rotation
- Color Jitter (Randomly jitter contrast)
- Color Jitter with PCA
- In general - a random mix of translation, rotation, stretching, shearing, lens distortions (see the augmentation sketch below)
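A minimal numpy sketch of the label-preserving transforms listed above (the crop size, flip probability, and jitter ranges are illustrative assumptions):

```python
import numpy as np

def augment(img, crop=224):
    """Random crop + horizontal flip + simple color jitter; the label is unchanged."""
    H, W, _ = img.shape
    top = np.random.randint(0, H - crop + 1)       # random crop location
    left = np.random.randint(0, W - crop + 1)
    out = img[top:top + crop, left:left + crop].astype(np.float32)
    if np.random.rand() < 0.5:                     # horizontal flip half the time
        out = out[:, ::-1]
    # Color jitter: random contrast scale and brightness shift (ranges are arbitrary)
    out = out * np.random.uniform(0.8, 1.2) + np.random.uniform(-10, 10)
    return np.clip(out, 0, 255)

img = np.random.randint(0, 256, size=(256, 256, 3)).astype(np.uint8)
aug = augment(img)                                 # shape (224, 224, 3)
```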
- Dropout / DropConnect - randomly set activations (Dropout) or weights (DropConnect) to zero during training (see the sketch after this list)
- Simple to implement, Use it
- Useful for small datasets
- Fits into the framework of training with noise and marginalizing over it at test time
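A minimal sketch of inverted dropout (p = 0.5 is just the common default; DropConnect would apply the mask to the weights instead of the activations):

```python
import numpy as np

def dropout_forward(x, p=0.5, train=True):
    # Inverted dropout: zero activations with probability p at train time and
    # rescale by 1/(1-p) so that test time needs no change.
    if not train:
        return x
    mask = (np.random.rand(*x.shape) >= p) / (1.0 - p)
    return x * mask

h = np.random.randn(4, 100)                  # a batch of hidden activations
h_train = dropout_forward(h, p=0.5, train=True)
h_test = dropout_forward(h, train=False)     # identity at test time
```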
- "You need a lot of data if you want to train / use CNNs" - not quite true, thanks to transfer learning
- Train on ImageNet, or download a pre-trained model
- Treat it as feature extractor
- Replace last layer with Linear Classifier
- Freeze network and retrain top layer
- Or train only the final few layers (fine-tuning)
- Works better for similar types of data
- Low-level features (edges, color blobs, Gabor-like filters) transfer to any type of visual data
- Image captioning also reuses pre-trained word vectors (see the feature-extractor sketch below)
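A sketch of the freeze-and-replace recipe above, assuming PyTorch / torchvision (the notes don't name a framework, and ResNet-18 is just an arbitrary pre-trained ImageNet model):

```python
import torch
import torch.nn as nn
from torchvision import models

# Download a model pre-trained on ImageNet
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the network: treat it as a fixed feature extractor
for param in model.parameters():
    param.requires_grad = False

# Replace the last layer with a fresh linear classifier for the target classes
num_classes = 10                                   # e.g. a small target dataset
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Retrain only the new top layer
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
```

With a somewhat larger target dataset, the same pattern can fine-tune the last few layers by leaving their `requires_grad` set to `True`.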
- Convolutions: the computational workhorse
- A stack of three 3x3 convolutions has the same receptive field as one 7x7 convolution
- Assume input H x W x C and C filters at stride 1 (padding to preserve size)
- Replace Large Convolutions (5x5, 7x7) with stacks of 3 x 3 convolutions
- 1 x 1 bottleneck convolutions are very efficient
- Can factor N x N convolutions into 1 x N and N x 1
- All of the above give fewer parameters, less compute, and more non-linearity (parameter counts in the sketch below)
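A quick parameter count (biases ignored; C = 64 is an illustrative choice) showing why the factorizations above are cheaper:

```python
C = 64                                             # input and output channels (illustrative)

# One 7x7 conv, C -> C channels
params_7x7 = 7 * 7 * C * C                         # 200,704

# Three stacked 3x3 convs (same 7x7 receptive field), C -> C each
params_3x3_stack = 3 * (3 * 3 * C * C)             # 110,592, plus two extra non-linearities

# 1x1 bottleneck sandwich: 1x1 (C -> C/2), 3x3 (C/2 -> C/2), 1x1 (C/2 -> C)
params_bottleneck = C * (C // 2) + 3 * 3 * (C // 2) ** 2 + (C // 2) * C   # 13,312

print(params_7x7, params_3x3_stack, params_bottleneck)
```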
- im2col (convolution recast as matrix multiply)
- im2col has memory overhead: overlapping patches duplicate input pixels
- Each filter has depth C to match the input
- Flatten each filter's weights into a row and compute inner products with the unrolled patches (see the sketch below)
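A minimal im2col sketch (stride 1, no padding, and technically cross-correlation as in most deep learning code), showing both the matrix-multiply recast and the memory overhead from duplicated patch pixels:

```python
import numpy as np

def im2col(x, k):
    """Unroll every k x k x C patch of x (H, W, C) into one row of a matrix."""
    H, W, C = x.shape
    out_h, out_w = H - k + 1, W - k + 1
    cols = np.empty((out_h * out_w, k * k * C))    # overlapping patches duplicate pixels
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = x[i:i + k, j:j + k, :].ravel()
    return cols

def conv_as_matmul(x, weights):
    """weights: (F, k, k, C) with depth C matching the input."""
    F, k, _, C = weights.shape
    cols = im2col(x, k)                            # (out_h*out_w, k*k*C)
    W_mat = weights.reshape(F, -1)                 # each filter flattened into a row
    out = cols @ W_mat.T                           # all inner products as one matrix multiply
    out_h = x.shape[0] - k + 1
    return out.reshape(out_h, -1, F)

x = np.random.randn(8, 8, 3)
w = np.random.randn(4, 3, 3, 3)                    # 4 filters of size 3x3x3
y = conv_as_matmul(x, w)                           # shape (6, 6, 4)
```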
- FFT - convolution theorem: convolution in the spatial domain equals elementwise multiplication in the frequency domain
- Compute the FFT of the weights and of the input image
- Multiply them elementwise
- Compute the inverse FFT; gives a speed-up only for larger filters
- FFT doesn't help much in practice for the small (e.g. 3x3) filters that dominate modern nets
- FFT doesn't handle striding well (sketch below)
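A numpy sketch of the convolution theorem in use (true convolution with a flipped kernel, stride 1); as noted above, the win only appears for fairly large filters:

```python
import numpy as np

def fft_conv2d(img, kernel):
    # Convolution theorem: FFT both signals, multiply elementwise, inverse FFT.
    H, W = img.shape
    k, _ = kernel.shape
    size = (H + k - 1, W + k - 1)                  # pad so circular convolution == linear
    F_img = np.fft.rfft2(img, size)
    F_ker = np.fft.rfft2(kernel, size)
    full = np.fft.irfft2(F_img * F_ker, size)      # elementwise product in frequency domain
    return full[k - 1:H, k - 1:W]                  # keep only the 'valid' region

img = np.random.randn(64, 64)
ker = np.random.randn(7, 7)
out = fft_conv2d(img, ker)                         # shape (58, 58)
```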
- Strassen's algorithm - multiplies matrices in about O(N^2.81)
- Naive matrix multiplication is O(N^3) (see the sketch below)
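A minimal Strassen sketch for square power-of-two matrices (the leaf size of 64 is an arbitrary cutoff below which it falls back to the naive multiply):

```python
import numpy as np

def strassen(A, B, leaf=64):
    """Strassen: 7 recursive multiplies per level instead of 8 -> about O(N^2.81)."""
    n = A.shape[0]
    if n <= leaf:
        return A @ B                               # naive O(N^3) multiply on small blocks
    m = n // 2                                     # assumes n is a power of two
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    M1 = strassen(A11 + A22, B11 + B22, leaf)
    M2 = strassen(A21 + A22, B11, leaf)
    M3 = strassen(A11, B12 - B22, leaf)
    M4 = strassen(A22, B21 - B11, leaf)
    M5 = strassen(A11 + A12, B22, leaf)
    M6 = strassen(A21 - A11, B11 + B12, leaf)
    M7 = strassen(A12 - A22, B21 + B22, leaf)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

A, B = np.random.randn(256, 256), np.random.randn(256, 256)
assert np.allclose(strassen(A, B), A @ B)
```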
Processing
- NVIDIA is much more common for GPU deep learning
- GPU good at matrix multiplication
- Floating point precision matters for speed and memory
- 16-bit floating point kernels from Nervana
- Lower precision makes things faster and still works (see the fp16 sketch below)
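A small numpy illustration of the precision trade-off: float16 halves memory and bandwidth at a modest cost in accuracy (float16 arithmetic is emulated on CPU in numpy, so the actual speed-up needs GPU / hardware support such as the kernels mentioned above):

```python
import numpy as np

W = np.random.randn(1024, 1024).astype(np.float32)
x = np.random.randn(1024).astype(np.float32)

W16, x16 = W.astype(np.float16), x.astype(np.float16)   # half the storage
y32 = W @ x
y16 = (W16 @ x16).astype(np.float32)

print(W.nbytes // 1024, "KiB vs", W16.nbytes // 1024, "KiB")
print("max relative error:", np.max(np.abs(y16 - y32) / (np.abs(y32) + 1e-8)))
```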