
November 08, 2020

Weekend Reads - Advanced Models for Computer Vision

Key Notes

What a Classifier Will Miss - Human-level scene understanding

  • Parsing the scene
  • The angle of the bicycle (pose, relative pose)
  • Person on Bicycle
  • Closer Inspection

Tasks

  • Object Detection
  • Pose Estimation
  • Accuracy vs Efficiency of Models

CNN as Deep Learning Puzzle

Input-Output Node, Loss Computation and Backprop


Classification - sparse description of the image

Object Detection

  • Multi-task problem
  • Classification & Localisation
  • Object, Location, Bounding box
  • Dataset, Samples, List of Objects, Labels, Bbox for each object


Predict BBOX Coordinates

  • Continuous Output
  • Minimize the MSE over samples
  • Regression for bbox prediction
  • The first step is classification (what the object is)
  • The second step is regression (where its bounding box is)
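The two steps above can be combined into a single multi-task loss. The following is a minimal sketch, not the lecture's implementation: it assumes a softmax cross-entropy term for the class and an MSE term for the four box coordinates. The names `detection_loss` and `box_weight`, and the sample numbers, are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities (shifted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def detection_loss(class_logits, true_class, pred_box, true_box, box_weight=1.0):
    """Classification (cross-entropy) plus localisation (MSE) for one object.

    box_weight is an assumed knob for balancing the two terms.
    """
    probs = softmax(class_logits)
    cls_loss = -math.log(probs[true_class])
    # Boxes are continuous outputs, so they are regressed with mean squared error.
    box_loss = sum((p - t) ** 2 for p, t in zip(pred_box, true_box)) / len(true_box)
    return cls_loss + box_weight * box_loss, cls_loss, box_loss

# Illustrative values only: 3 classes, box given as (cx, cy, w, h) in [0, 1].
total, cls_loss, box_loss = detection_loss(
    class_logits=[2.0, 0.5, -1.0],
    true_class=0,
    pred_box=[0.48, 0.52, 0.30, 0.42],
    true_box=[0.50, 0.50, 0.30, 0.40],
)
print(total, cls_loss, box_loss)
```

Both terms are differentiable with respect to the network outputs, so one backward pass can update the shared features for both tasks.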








Faster RCNN

  • Two-Stage Detector
  • Good Candidate BBOX
  • Refine through Regression
  • Discretize bbox space
  • Anchor points distributed
  • Candidate boxes of different scale and ratio
  • n candidates per anchor
  • Is there an object or not in the box
  • Refine through regression
  • We cannot backprop through the bbox parameters (see Spatial Transformer Networks)
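The discretised bbox space can be pictured as a grid of anchor points, each carrying n candidate boxes. Here is a rough sketch under assumed settings: the function name `generate_anchors`, the stride, the scales, and the ratios are illustrative choices, not values from the lecture or from any particular Faster R-CNN implementation.

```python
import math

def generate_anchors(image_size, stride, scales, ratios):
    """Return candidate boxes as (cx, cy, w, h) for a square image.

    Anchor points sit on a regular grid with the given stride; each point
    gets len(scales) * len(ratios) candidates (the "n candidates per anchor").
    """
    boxes = []
    for cy in range(stride // 2, image_size, stride):
        for cx in range(stride // 2, image_size, stride):
            for s in scales:
                for r in ratios:
                    # r is the width/height ratio; the area stays s * s.
                    w = s * math.sqrt(r)
                    h = s / math.sqrt(r)
                    boxes.append((cx, cy, w, h))
    return boxes

anchors = generate_anchors(image_size=64, stride=16,
                           scales=(16, 32), ratios=(0.5, 1.0, 2.0))
print(len(anchors))  # 4 x 4 anchor points, 6 candidates each
```

For each candidate, the first stage then predicts whether it contains an object and a small regression offset that refines the box.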





One Stage Detector - Train end to end

  • Employ Hard negative mining

RetinaNet uses Focal Loss. (A loss function is a mathematical measure of how far a prediction is from the true value of a data point.) Focal Loss down-weights easy, correctly classified examples so that training concentrates on the examples that are hard to classify.
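To make the re-weighting concrete, here is a small sketch of the binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), next to plain cross-entropy. The defaults gamma = 2 and alpha = 0.25 are the values reported in the RetinaNet paper; the sample probabilities are made up for illustration.

```python
import math

def cross_entropy(p, y):
    """Binary cross-entropy for predicted foreground probability p and label y."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: cross-entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy = focal_loss(0.95, 1)  # confident and correct: contributes very little
hard = focal_loss(0.30, 1)  # object scored low: keeps most of its loss
print(easy, hard)
```

With gamma = 0 the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy, which makes it easy to check the effect of gamma in isolation.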




Semantic Segmentation

  • Pooling - reduce the resolution of feature maps
  • Upsample based on the nearest neighbor approach
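A tiny pure-Python sketch of the two operations above, assuming 2x2 max pooling and a factor-2 nearest-neighbor upsample on a single-channel feature map. The helper names are hypothetical.

```python
def max_pool_2x2(fmap):
    """Halve the resolution by keeping the max of each 2x2 block."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

def upsample_nearest(fmap, factor=2):
    """Restore the resolution by repeating each value factor x factor times."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
pooled = max_pool_2x2(fmap)          # [[6, 8], [14, 16]]
restored = upsample_nearest(pooled)  # 4x4 again, but in constant 2x2 blocks
print(pooled)
print(restored)
```

The restored map has the original size but not the original detail, which is why plain upsampling produces blobby outputs.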



U-Net

  • Segmenting medical images
  • Input Image -> Convolution -> ReLU -> Pooling
  • Encoder - Similar to Image Classifier
  • Upsampling through Decoder for same resolution output
  • Upsampling - blobby feature map
  • For every location, a distribution over classes
  • Cross Entropy (Avg Over all Locations)
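The last two points can be sketched as a per-pixel softmax cross-entropy averaged over all locations. This is an illustrative pure-Python version with a made-up 2x2, two-class example; the function name `pixelwise_cross_entropy` is hypothetical, and real code would use a tensor library.

```python
import math

def pixelwise_cross_entropy(logits, labels):
    """Average cross-entropy over all pixel locations.

    logits[i][j] is a list of class scores for pixel (i, j);
    labels[i][j] is the index of the true class at that pixel.
    """
    total, count = 0.0, 0
    for logit_row, label_row in zip(logits, labels):
        for scores, label in zip(logit_row, label_row):
            # log-sum-exp gives the softmax normaliser in a stable way.
            m = max(scores)
            log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
            total += log_sum - scores[label]  # equals -log softmax(scores)[label]
            count += 1
    return total / count

# Illustrative 2x2 "image" with 2 classes per pixel.
logits = [[[2.0, 0.0], [0.0, 2.0]],
          [[0.0, 0.0], [3.0, -1.0]]]
labels = [[0, 1],
          [1, 0]]
loss = pixelwise_cross_entropy(logits, labels)
print(loss)
```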




Keep Thinking!!!
