"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

March 13, 2019

Day #219 - Person Re-Identification

All the research papers with code are available on the paperswithcode site
Reid Resources 

Overall Lessons
  • CNN-based autoencoders encode an input image; a k-nearest-neighbor search then finds the closest matches to that encoding among the encoded images in a database
  • Query2Gallery Similarity using Euclidean distance
  • Foreground, Head, Upper Body, Lower Body used for Cues
  • Detection + Classification
  • Local Maximal Occurrence (LOMO) analyzes the horizontal occurrence of local features, and maximizes the occurrence to make a stable representation against viewpoint changes
  • Video tracklets in person re-identification
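The query-to-gallery matching idea above can be sketched in a few lines. This is only a minimal illustration, assuming the encoder has already produced one embedding per image; `rank_gallery` and the toy vectors are hypothetical names and values, not taken from any of the talks.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_gallery(query, gallery, k=10):
    """Return (gallery_index, distance) for the k gallery embeddings closest to the query."""
    scored = sorted((euclidean(query, g), i) for i, g in enumerate(gallery))
    return [(i, d) for d, i in scored[:k]]

# Toy embeddings standing in for encoder outputs (hypothetical values).
gallery = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0]]
query = [1.0, 1.0]
top_matches = rank_gallery(query, gallery, k=2)
```

With k=10 this gives the "Top 10 Retrieved Matches" style of output; a real system would use an approximate nearest-neighbor index for a large gallery.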
Key Lessons
Talk #1 - Human Semantic Parsing for Person Re-identification
  • Query Image
  • Retrieve all images of the same identity
  • Query, Top 10 Retrieved Matches
Challenges
  • Illumination Condition
  • Background Clutter
  • Occlusion
  • Some body parts not visible
  • Hard to obtain posture
  • Extracting robust visual representations
  • Low-Resolution Images
Questions
  • Develop complex models?
  • Extract Local Visual Cues?
  • Human Pose Estimation used to estimate 
  • Unable to identify arbitrary contours of body parts
  • Methods of Horizontal stripes
Contributions
  • Human Semantic Parsing (SPReid)
  • Simple holistic models work
SPReid
  • Inception-V3 architecture
  • Modified Inception-V3 architecture
  • Dilated Convolution


Architecture
  • Image - Inception V3
  • Average pooling gives the final representation
  • Foreground, Head, Upper Body, Lower Body used for Cues
  • One Global
  • One Foreground
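A rough sketch of how a global descriptor and a foreground descriptor could be pooled from the same feature map. The notes only say that average pooling gives the final representation and that region cues are used, so the mask-weighted average below is an assumption, and `global_avg_pool`, `masked_avg_pool` and the toy values are hypothetical.

```python
def global_avg_pool(fmap):
    """fmap: list of channels, each an H x W nested list. Returns one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def masked_avg_pool(fmap, mask, eps=1e-8):
    """Average each channel weighted by a region probability map (H x W, values in [0, 1])."""
    total = sum(sum(row) for row in mask)
    pooled = []
    for ch in fmap:
        weighted = sum(v * m for frow, mrow in zip(ch, mask) for v, m in zip(frow, mrow))
        pooled.append(weighted / (total + eps))  # eps avoids division by zero for empty masks
    return pooled

# Two channels of a 2 x 2 feature map and a foreground mask covering the bottom row (toy values).
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [8.0, 8.0]]]
foreground = [[0.0, 0.0], [1.0, 1.0]]

# Concatenate the global and the foreground representation into one descriptor.
descriptor = global_avg_pool(fmap) + masked_avg_pool(fmap, foreground)
```

The same `masked_avg_pool` call could be repeated with head, upper-body and lower-body masks to add more region cues.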
Training and Evaluation
  • Softmax cross entropy loss
  • Train on low resolution, fine tune on high resolution
  • Look Into Person (LIP) dataset
  • Query2Gallery Similarity using Euclidean distance
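For reference, the softmax cross-entropy loss used for training can be written out as below. This is a generic sketch for a single sample, not the talk's training code; the logits are made-up values and each class index stands for one person identity.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the true identity index."""
    return -math.log(softmax(logits)[target])

logits = [2.0, 0.5, -1.0]                # hypothetical scores for three identities
loss_correct = cross_entropy(logits, 0)  # true identity has the highest score
loss_wrong = cross_entropy(logits, 2)    # true identity has the lowest score
```

The loss is small when the true identity gets the highest score and grows as its score drops.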


Talk #2 - Joint Detection and Identification Feature Learning for Person Search | Spotlight 2-2B

Key Lessons
  • Match Photo with Manually Crafted
  • Find from the whole image, Detect People and Extract People and Features
  • Softmax classifier
  • Detection + Classification
  • Online instance Matching
  • Lookup table for labeled identities
  • Minimize the distance between instances of the same person
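A simplified sketch of the lookup-table idea behind online instance matching. It covers labeled identities only; the `temperature` and `momentum` parameters, the function names and the toy vectors are assumptions for illustration, and this should not be read as the paper's exact formulation.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def oim_loss(feature, table, target, temperature=0.1):
    """Cross-entropy over similarities between a feature and every table entry."""
    x = l2_normalize(feature)
    logits = [dot(v, x) / temperature for v in table]
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def update_table(table, feature, target, momentum=0.5):
    """Blend the new feature into the stored entry for its identity, then renormalize."""
    x = l2_normalize(feature)
    mixed = [momentum * v + (1 - momentum) * xi for v, xi in zip(table[target], x)]
    table[target] = l2_normalize(mixed)

table = [[1.0, 0.0], [0.0, 1.0]]  # one stored feature per labeled identity (toy values)
feature = [0.9, 0.1]              # feature of a newly detected person
loss_match = oim_loss(feature, table, target=0)
loss_mismatch = oim_loss(feature, table, target=1)
update_table(table, feature, target=0)
```

Minimizing this loss pulls a feature toward the stored entry of its own identity and away from the others, which matches the "minimize the distance" note above.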


Talk #3 - Unsupervised Person Re-identification by Deep Learning Tracklet Association



Key Lessons
  • Supervised (Pairwise Neighboring)
  • Triplet Loss
  • Manually Labelled, Impose huge constraint
  • Completely unsupervised using tracklet associations
  • Collect Tracklet Data
  • Tracklet Sampling
  • Tracklet Association
  • Histogram Loss, Surrogate Loss








Siamese Network
Key Lessons
  • Find similar faces
  • Sequence of CNN, Pooling and Feature vector
  • Fed to make classification
  • The computed vector of numbers F(x1) is the encoding of the input image
  • Feed the second picture to get another encoding F(x2)
  • If the encoding is a good representation, compare x1 and x2 by the distance between F(x1) and F(x2)
  • Running two identical CNNs and comparing their encodings is the Siamese network architecture
  • Train the NN that generates the encoding
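The shared-encoder idea can be shown with a toy example. The single linear layer with ReLU below stands in for the CNN, and `WEIGHTS`, `encode`, the threshold and the input vectors are all hypothetical; the point is only that both inputs go through the same function F.

```python
import math

# Hypothetical fixed weights standing in for a trained CNN encoder.
WEIGHTS = [[0.5, -0.2, 0.1],
           [0.3, 0.8, -0.5]]

def encode(x):
    """F(x): the same weights are applied to every input (shared parameters)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in WEIGHTS]

def distance(x1, x2):
    """Euclidean distance between the encodings F(x1) and F(x2)."""
    f1, f2 = encode(x1), encode(x2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def same_identity(x1, x2, threshold=0.5):
    """Decide 'same person' when the encodings are close enough."""
    return distance(x1, x2) < threshold

img_a = [1.0, 2.0, 3.0]  # toy "images" as flat vectors
img_b = [1.1, 2.0, 2.9]  # small variation of img_a
img_c = [9.0, 0.0, 1.0]  # very different input
```

In practice the threshold would be tuned on validation pairs rather than fixed by hand.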


More Reads
One Shot Learning with Siamese Networks using Keras
Image Similarity with Siamese Networks
Keras Example1
Siamese Network
Survey on Deep Learning Techniques for Person Re-Identification Task
Unsupervised Person Re-identification by Deep Learning Tracklet Association
Enhanced Deep Feature Representation for Person Re-identification
WACV18: Vehicle Re-identification by Adversarial Bi-directional LSTM Network

Survey on Deep Learning Techniques for Person Re-Identification Task
Key Notes
  • On-line applications for people/object detection and tracking
  • Recognizing a suspicious action/behavior from the camera network
  • Off-line applications to support operators and forensic investigators 
Image Challenges
  • Low image resolution
  • Unconstrained pose
  • Illumination changes
  • Occlusions 
Features to Exploit
  • Face
  • Clothing appearance
  • Gait
  • A CNN generates a set of feature maps in which each pixel of the given image corresponds to a specific feature representation
  • Image Size - 128 × 64
DNN Key Considerations
  • Objective function
  • Loss functions
  • Data augmentation
Feature fusion deep neural network
  • The network takes a single 224 × 224 × 3 image as input
  • Hand-crafted features are extracted by one of the standard person re-identification descriptors
  • Both sets of extracted features are followed by a buffer layer and a fully connected layer, which together act as the fusion layer
  • A softmax loss layer then takes the output vector of the fully connected layer in order to minimize the cross-entropy loss
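A minimal sketch of the fusion step, assuming the buffer layer is reduced to plain concatenation and the fully connected layer is a single weight matrix. All names, weights and feature values are hypothetical and only illustrate the data flow.

```python
import math

def fuse(cnn_feat, hand_feat, weights, bias):
    """Concatenate both feature vectors and apply one fully connected layer."""
    joint = cnn_feat + hand_feat
    return [sum(w * v for w, v in zip(row, joint)) + b for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax turning fused scores into identity probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

cnn_feat = [0.2, 0.7]          # toy CNN features
hand_feat = [1.0]              # toy hand-crafted descriptor value
weights = [[1.0, 0.0, 0.5],    # one row per identity class
           [0.0, 1.0, -0.5]]
bias = [0.0, 0.0]

logits = fuse(cnn_feat, hand_feat, weights, bias)
probs = softmax(logits)
```

During training, the cross-entropy loss on `probs` would update both the CNN and the fusion weights, so the learned features are shaped by the hand-crafted ones.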
Siamese network
  • Siamese network models have been widely employed in person re-identification task
  • Employed as pairwise
  • Two subnetworks included
  • Output is similarity score
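One common way to train such a pairwise model is a contrastive loss on the two encodings. The sketch below uses one common form of that loss; the `margin` value and the function names are assumptions, not something stated in the survey notes.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(f1, f2, same, margin=1.0):
    """Pull matched pairs together; push mismatched pairs at least `margin` apart."""
    d = euclidean(f1, f2)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Toy encodings from the two subnetworks.
loss_same_close = contrastive_loss([0.0, 0.0], [0.0, 0.0], same=True)
loss_diff_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same=False)
loss_diff_close = contrastive_loss([0.0, 0.0], [0.0, 0.0], same=False)
```

Mismatched pairs that are already farther apart than the margin contribute no loss, so training focuses on the hard pairs.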
Triplet models
  • The three samples of a triplet are fed separately into three identical networks that share one parameter set
  • Each triplet unit is trained to maximize the margin between the matched pair and the mismatched pair
  • Losses used: hinge loss, cosine similarity loss, contrastive loss
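The margin idea can be sketched as a hinge-style triplet loss on three encodings. This is a generic form with an assumed `margin` of 1.0 and toy vectors, not the exact loss of any one paper in the survey.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: the matched pair should be closer than the mismatched pair by `margin`."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy encodings produced by the three weight-sharing networks.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # same identity as the anchor
negative = [3.0, 4.0]  # different identity

loss_good = triplet_loss(anchor, positive, negative)
loss_bad = triplet_loss(anchor, negative, positive)  # roles swapped on purpose
```

A triplet that already satisfies the margin gives zero loss, while the swapped triplet is penalized in proportion to how badly it violates it.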

Happy Mastering DL!!!!
