"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

December 18, 2018

Day #167 - YoFlow and Dock

Reading #1 - YoFlow
YoFlow - Link1
Poster - Link2

Key Summary
  • You Only Look Once (YOLO) implemented in TensorFlow
  • K-means clustering across short rolling windows to group similar objects across frames
  • Predict if object is similar between frames
Yolo Basics
  • Single Feedforward network
  • Yolo reframes object detection as a single regression problem
  • Images -> Pixels -> Bounding Box -> Probabilities
  • Divides image into S X S Grid
  • If an object's centre falls in a grid cell, that grid cell is responsible for detecting the object
  • Predict bounding boxes and class probabilities
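The grid-cell responsibility rule above can be sketched in a few lines. This is a minimal illustration, not the YoFlow code: the function name `responsible_cell` and the pixel-coordinate convention (origin at the top-left corner) are assumptions made for the example.

```python
def responsible_cell(cx, cy, img_w, img_h, S=7):
    """Return (row, col) of the S x S grid cell that contains the object centre.

    cx, cy are the centre coordinates in pixels (assumed origin: top-left).
    """
    # Scale the centre into grid units, then clamp so a centre lying exactly
    # on the right or bottom edge still maps to the last cell.
    col = min(int(cx / img_w * S), S - 1)
    row = min(int(cy / img_h * S), S - 1)
    return row, col


# Example: an object centred at (224, 100) in a 448 x 448 image.
print(responsible_cell(224, 100, 448, 448))
```

Only the cell returned here is expected to predict that object; the other cells are not responsible for it.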
Yolo Implementation
  • 32 layer Deep CNN
  • Input Image resized into 448 x 448
  • Yolo Network Output = S x S x (B*5 + C) tensor of predictions ( 7 x 7 x 30 )
  • S - Number of rows and columns into which the image is divided
  • B - Number of bounding boxes predicted per grid cell
  • C - Number of classes
  • 5 - Terms per bounding box: x-axis grid offset, y-axis grid offset, width, height and confidence
  • Dataset - Pascal VOC Dataset
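To make the output size concrete, here is a small sketch of the arithmetic behind the 7 x 7 x 30 tensor. It assumes S = 7, B = 2 and C = 20 (the 20 Pascal VOC classes), which is the combination that gives 30 values per cell. The ordering of values inside each cell (box terms first, class probabilities last) is an assumption for illustration; real implementations may lay the vector out differently.

```python
import numpy as np

S, B, C = 7, 2, 20      # grid size, boxes per cell, classes (Pascal VOC)
depth = B * 5 + C       # 5 terms per box plus C class probabilities

# Placeholder prediction tensor with the same shape as the network output.
pred = np.zeros((S, S, depth))

# Assumed layout: the first B*5 values are box terms, the rest are class scores.
boxes = pred[..., : B * 5].reshape(S, S, B, 5)   # x, y, w, h, confidence
class_probs = pred[..., B * 5 :]                 # one score per class

print(pred.shape, boxes.shape, class_probs.shape)
```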
K-Means Extension to Yolo
  • K-means clustering across images within the short rolling window to group similar objects across frames
  • Define Distance between two images I1, I2 given dimensions x,y and color channels c
  • Works well when images are similar
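The notes do not give the distance formula, so the sketch below assumes a sum of squared pixel differences over x, y and colour channel c, and runs a plain k-means over the crops in one rolling window. The names `image_distance` and `kmeans_frames` are hypothetical, and all crops are assumed to have been resized to the same shape beforehand.

```python
import numpy as np


def image_distance(i1, i2):
    """Assumed distance: sum over x, y, c of squared pixel differences."""
    diff = i1.astype(np.float64) - i2.astype(np.float64)
    return float(np.sum(diff ** 2))


def kmeans_frames(frames, k, iters=10, seed=0):
    """Cluster same-shaped image crops from one rolling window into k groups."""
    rng = np.random.default_rng(seed)
    # Flatten each crop so that squared Euclidean distance between rows
    # equals image_distance between the original crops.
    X = np.stack([f.astype(np.float64).ravel() for f in frames])
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Distance from every crop to every centroid, then nearest centroid.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Move each non-empty centroid to the mean of its assigned crops.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels


# Toy window: three dark crops and three bright crops of shape 4 x 4 x 3.
window = [np.full((4, 4, 3), v) for v in (0, 10, 20, 230, 240, 250)]
labels = kmeans_frames(window, k=2)
print(labels)
```

Crops that receive the same label are treated as the same object across frames. As the notes say, this works well only when the crops are visually similar, because the distance is computed directly on pixels.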
YoFlow Weights - https://github.com/johnwlambert/YoloTensorFlow229

Reading #2 - DOCK: Detecting Objects by transferring Common-sense Knowledge
Key Summary
  • Transfer Learning to transfer knowledge from source categories
  • Encode similarity, spatial, attribute and scene cues
  • Model Objects at region level
  • Leverage pretrained object detectors
  • Similarity at Region Level
  • Context used for Object Detection, Semantic Segmentation and Object Discovery
  • Use External Knowledge to interpret common sense
  • Pascal VOC Dataset
Architecture
  • Base Network CNN + Common Sense
  • Classification Matrix, Common Sense Matrix
  • Image with Region proposals
  • Common-sense cues based on similarity, attributes, spatial relations
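The notes list a classification matrix and a common-sense matrix over region proposals but do not say how they are combined. The sketch below shows one plausible reading, stated as an assumption rather than the paper's exact formulation: both matrices have shape (regions, classes), they are multiplied element-wise, and the result is summed over regions to get image-level class scores. The function name `image_level_scores` is hypothetical.

```python
import numpy as np


def image_level_scores(classification, common_sense):
    """Combine per-region class scores with per-region common-sense scores.

    Both inputs are assumed to be arrays of shape (num_regions, num_classes).
    Assumed combination: element-wise product, then sum over regions.
    """
    classification = np.asarray(classification, dtype=np.float64)
    common_sense = np.asarray(common_sense, dtype=np.float64)
    if classification.shape != common_sense.shape:
        raise ValueError("matrices must have the same (regions, classes) shape")
    return (classification * common_sense).sum(axis=0)


# Two region proposals, three classes.
cls = [[0.9, 0.1, 0.0],
       [0.2, 0.7, 0.1]]
cs = [[1.0, 0.5, 0.0],
      [0.5, 1.0, 0.0]]
scores = image_level_scores(cls, cs)
print(scores)
```

Under this reading, a region only contributes to a class when both the classifier and the common-sense cues agree on it.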
Happy Mastering DL!!!
