"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;
Showing posts with label Person Re-Identification. Show all posts

May 28, 2020

Learning Notes - Papers - Face Detection / Reidentification Papers

Paper #1 - Facial Keypoints Detection
Key Notes
  • Using PCA and LBP 
  • Apply different models
  • Combine LBP and PCA together
Key Tasks
  • Face Alignment
  • Face Verification
Key Points locations
  • lefteyecenter, righteyecenter,
  • lefteyeinnercorner, lefteyeoutercorner,
  • righteyeinnercorner, righteyeoutercorner,
  • lefteyebrowinnerend, lefteyebrowouterend,
  • righteyebrowinnerend, righteyebrowouterend,
  • nosetip,
  • mouthleftcorner, mouthrightcorner,
  • mouthcentertoplip, mouthcenterbottomlip
LBP (Local Binary Pattern) is an operator used to describe the local texture features of images. It has the advantages of rotation invariance and gray-scale invariance.
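A minimal sketch of the basic 3×3 LBP operator in plain Python, to make the note above concrete. The function names (lbp_code, rotation_invariant, lbp_histogram) are my own and not from the paper. Each neighbour is compared with the centre pixel, so adding a constant to every pixel leaves the code unchanged (gray-scale invariance). Taking the minimum over circular bit rotations is one common way to get a rotation-invariant code.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def rotation_invariant(code):
    """Smallest value among the 8 circular rotations of an 8-bit code."""
    return min(((code >> k) | (code << (8 - k))) & 0xFF for k in range(8))


def lbp_histogram(image):
    """256-bin histogram of LBP codes over the interior pixels of a 2D image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The histogram is what usually gets fed to a classifier, or reduced with PCA as in the notes above.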

Paper #2 - CNN architecture for Key Point Detection presented in Facial Key Points Detection using Deep Convolutional Neural Network - NaimishNet

Paper #3 - FACE RECOGNITION SYSTEM 
Key Notes
  • Face detection
  • Face preprocessing 
  • Face recognition processes
Facial feature points
  • Points such as eye and mouth center points, eye and mouth contour points, organ contour points, etc.
Siamese Network for Face Comparison
  • A Siamese network is a neural network for measuring similarity
  • It can be used for category identification and classification
Key Notes
  • Face detector is used to localize faces in images
  • Facial landmark detector
Face Matching
  • Cosine distance or L2 distance
  • Nearest neighbor (NN) and threshold comparison 
  • Use a GAN to generate makeup faces and similar fake faces, then compare them
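A small sketch of the matching step in plain Python: compute the cosine or L2 distance between embeddings, take the nearest neighbour, and accept it only if it is under a threshold. The gallery layout (a dict of identity to embedding), the function names and the threshold values are assumptions for illustration. In practice the threshold would be tuned on validation pairs.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def l2_distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_face(query, gallery, threshold, distance=cosine_distance):
    """Nearest-neighbour match; returns (None, dist) when nothing is close enough."""
    best_id, best_dist = None, float("inf")
    for identity, embedding in gallery.items():
        d = distance(query, embedding)
        if d < best_dist:
            best_id, best_dist = identity, d
    if best_dist > threshold:
        return None, best_dist
    return best_id, best_dist


# Hypothetical 2-D embeddings, only to show the call.
gallery = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
identity, dist = match_face([0.9, 0.1], gallery, threshold=0.2)
```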
Paper #5 - A Fast and Accurate System for Face Detection, Identification, and Verification
Key Notes
  • Deep CNN based detector
Face Detection
  • Region proposal Networks
  • Sliding window based
Multi-task learning for Facial Analysis
  • Simultaneous face detection
  • Landmark localization
  • Headpose estimation
Single DCNN which can accomplish multiple tasks such as face detection, landmark localization, attribute prediction, age estimation, face recognition
More Reads
KPNet: Towards Minimal Face Detector
Face Recognition Based on the Key Points of High-dimensional Feature and Triplet Loss
Automatic landmark annotation and dense correspondence registration for 3D human facial images
Keypoint Detection and Local Feature Matching for Textured 3D Face Recognition

September 04, 2019

Day #274 - Re_id Notes from papers / Analysis - Reidentification of person from historical data

Approach
  • Extract Features
  • Cluster to find similar faces
  • Approximate k-NN search
Survey on Deep Learning Techniques for Person Re-Identification
Classification Model
  • Using SIFT, Color Histograms
  • Determining the individual identity (aka class)
  • Image Categorization by Age / Gender and Search
Siamese Network 
  • Learning a similarity function, which takes two images as input and expresses how similar they are.
  • Triplet Siamese model, Pairwise Model
  • Triplet models - The triplet loss function takes the face encodings of three images: anchor, positive and negative. The anchor and positive are images of the same person, whereas the negative is an image of a different person
Face Search at Scale: 80 Million Gallery
Key Points
  • Represent objects with feature vectors 
  • Employ an indexing or approximate search scheme in the feature space
Performance-oriented Design
  • Fast filtering step (Approximate k-NN search)
  • Re-ranking step (K Candidates Deep Feature Similarity)
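A rough sketch of the two-stage idea in plain Python: a cheap filter picks K candidates, then only those K are re-ranked with the full feature vectors. The cheap filter here is a stand-in that compares only the first few dimensions. It is not a real approximate k-NN index and not the method from the paper. All names and parameters are hypothetical.

```python
import heapq


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def search(query, gallery, k_candidates=3, top=1, coarse_dims=2):
    """Two-stage search over a dict of identity -> feature vector."""
    # Stage 1: fast filtering on truncated vectors (stand-in for approximate k-NN).
    candidates = heapq.nsmallest(
        k_candidates,
        gallery.items(),
        key=lambda item: sq_dist(query[:coarse_dims], item[1][:coarse_dims]),
    )
    # Stage 2: re-rank only the K candidates using the full feature vectors.
    reranked = sorted(candidates, key=lambda item: sq_dist(query, item[1]))
    return [identity for identity, _ in reranked[:top]]


# Hypothetical 4-D features, only to show the call.
gallery = {"a": [0, 0, 0, 0], "b": [0, 0, 5, 5], "c": [9, 9, 0, 0]}
best = search([0, 0, 5, 4], gallery, k_candidates=2, top=1)
```

The expensive comparison runs on K items instead of the whole gallery, which is what makes very large galleries workable.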
Using Siamese Networks - Retail Use Cases
  • Scenario #1 – Find a person in Camera1 and Find him across all other cameras
  • Scenario #2 – Find a person at Entrance and Track him across in-store video
  • Scenario #3 – Retrain roughly every 10 minutes, track every single customer dynamically, retrain with each customer as a class (query image scenario)
Happy Learning!!!

April 09, 2019

Day #236 - Papers on Person Re-Identification

Paper #1 - Camera Style Adaptation for Person Re-identification

Key Lessons
  • Person Reidentification - Given Query Person, Retrieve person from multiple sources
  • Challenges - Resolution, Environment, Illumination
  • Camera Style Adaptation Approach - unsupervised, camera-invariant property
Techniques
  • Each image in an input pair is partitioned into three overlapping horizontal parts, and the parts are passed through a Siamese CNN model that learns their similarity using cosine distance
Paper #2 - SIMPLE ONLINE AND REALTIME TRACKING WITH A DEEP ASSOCIATION METRIC
Techniques
  • Kalman filtering in image space and frame by frame
  • Kalman filter with constant velocity motion
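A minimal constant-velocity Kalman filter in plain Python, reduced to a single coordinate so the predict and update steps are easy to follow. The paper tracks bounding boxes, so treat this 1-D version as a simplification. The noise values q and r and the function names are assumptions for illustration.

```python
def predict(state, cov, dt=1.0, q=1e-2):
    """Predict step: position advances by velocity * dt, velocity stays constant."""
    p, v = state
    (p00, p01), (p10, p11) = cov
    # F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I, written out by hand.
    a = p00 + dt * p10
    b = p01 + dt * p11
    new_cov = ((a + dt * b + q, b), (p10 + dt * p11, p11 + q))
    return (p + v * dt, v), new_cov


def update(state, cov, z, r=1.0):
    """Update step with a position-only measurement z (H = [1, 0])."""
    p, v = state
    (p00, p01), (p10, p11) = cov
    s = p00 + r                    # innovation variance
    k0, k1 = p00 / s, p10 / s      # Kalman gain
    y = z - p                      # innovation
    new_state = (p + k0 * y, v + k1 * y)
    new_cov = (((1 - k0) * p00, (1 - k0) * p01),
               (p10 - k1 * p00, p11 - k1 * p01))
    return new_state, new_cov


# Frame-by-frame use: an object moving 2 units per frame.
state, cov = (0.0, 0.0), ((10.0, 0.0), (0.0, 10.0))
for frame in range(1, 21):
    state, cov = predict(state, cov)
    state, cov = update(state, cov, z=2.0 * frame)
```

Only the position is measured. The velocity estimate comes from the filter itself, which is the point of the constant-velocity model.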
Paper #3 - In Defense of the Triplet Loss for Person Re-Identification
Techniques
  • A plain CNN with a triplet loss 
Triplet Loss
Key Lessons
  • Look at Anchor, Distance with Positive Example, Distance with Negative Example
  • 3 Images at a time Anchor, Positive, Negative Image
  • APNN
  • Example: with d(A,P) = 0.5, the margin sets how much larger d(A,N) must be, separating positive from negative
  • L(A,P,N) = max(||f(A)-f(P)||^2 - ||f(A)-f(N)||^2 + alpha, 0)
  • Choosing Triplets Randomly
  • Map Training Set into Triplets
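A short sketch of the triplet loss in its hinge form, max(d(A,P) - d(A,N) + alpha, 0), with d the squared Euclidean distance between encodings, plus random triplet selection from a labelled set. The default alpha = 0.2, the dict layout and the function names are assumptions for illustration.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance ||u - v||^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """Zero once the negative is farther than the positive by at least alpha."""
    return max(squared_distance(f_a, f_p) - squared_distance(f_a, f_n) + alpha, 0.0)


def sample_triplet(items_by_identity, rng):
    """Pick (anchor, positive, negative) at random from identity -> list of items."""
    # Anchor and positive need an identity with at least two items.
    eligible = [k for k, v in items_by_identity.items() if len(v) >= 2]
    pos_id = rng.choice(eligible)
    neg_id = rng.choice([k for k in items_by_identity if k != pos_id])
    anchor, positive = rng.sample(items_by_identity[pos_id], 2)
    negative = rng.choice(items_by_identity[neg_id])
    return anchor, positive, negative


# Hypothetical 2-D encodings: an easy negative and a hard negative.
easy = triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 0.0])
hard = triplet_loss([0.0, 0.0], [0.1, 0.0], [0.2, 0.0])
```

Random triplets are often easy (loss already zero), which is why hard-triplet mining is discussed in the re-identification papers above.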
Example - Link1, Link2



Happy Mastering DL!!!

March 27, 2019

Day #227 - Learning Re-Id

1. Installed the Pytorch framework for deep-learning person re-identification - Link
2. Followed the steps commenting out the GPU code and ran it on CPU
3. Code to Check Available Models


4. Running it on Other Datasets

5. Next Step - On Custom Dataset Setting up, Testing Models

Happy Mastering DL!!!

March 13, 2019

Day #219 - Person Re-Identification

All the research papers with code are available on the paperswithcode site
Reid Resources 

Overall Lessons
  • CNN-based autoencoders encode an input image; a k-nearest neighbor search then finds the closest match among the encoded images in a database
  • Query2Gallery Similarity using Euclidean distance
  • Foreground, Head, Upper Body, Lower Body used for Cues
  • Detection + Classification
  • Local Maximal Occurrence (LOMO) analyzes the horizontal occurrence of local features, and maximizes the occurrence to make a stable representation against viewpoint changes
  • Video tracklets in person re-identification
Key Lessons
Talk #1 - Human Semantic Parsing for Person Re-identification
  • Query Image
  • Retrieve all images of the same identity
  • Query, Top 10 Retrieved Matches
Challenges
  • Illumination Condition
  • Background Clutter
  • Occlusion
  • Observable Body parts not visible
  • Hard to obtain posture
  • Extracting Robust visual representation
  • Low-Resolution Images
Questions
  • Develop complex models?
  • Extract Local Visual Cues?
  • Human Pose Estimation used to estimate 
  • Unable to identify arbitrary contours of body parts
  • Methods of Horizontal stripes
Contributions
  • Human Semantic Parsing (SPReid)
  • Simple holistic models work
SPReid
  • Inception-V3 architecture
  • Modified Inception-V3 architecture
  • Dilated Convolution


Architecture
  • Image - Inception V3
  • Avg Pooling get final representation
  • Foreground, Head, Upper Body, Lower Body used for Cues
  • One Global
  • One Foreground
Training and Evaluation
  • Softmax cross entropy loss
  • Train on low resolution, fine tune on high resolution
  • Look into person (Dataset)
  • Query2Gallery Similarity using Euclidean distance


Talk #2 - Joint Detection and Identification Feature Learning for Person Search | Spotlight 2-2B

Key Lessons
  • Match Photo with Manually Crafted
  • Find from the whole image, Detect People and Extract People and Features
  • Softmax classifier
  • Detection + Classification
  • Online instance Matching
  • Labeled One's Lookup Table
  • Minimize the distance between images of the same person


Talk #3 - Unsupervised Person Re-identification by Deep Learning Tracklet Association



Key Lessons
  • Supervised (Pairwise Neighboring)
  • Triplet Loss
  • Manually Labelled, Impose huge constraint
  • Completely Unsupervised using tracklet associations
  • Collect Tracklet Data
  • Tracklet Sampling
  • Tracklet Association
  • Histogram Loss, Surrogate Loss

Siamese Network
Key Lessons
  • Find similar faces
  • Sequence of CNN, Pooling and Feature vector
  • Fed to make classification
  • The computed vector of numbers F(x1) is the encoding of the input image
  • Feed second pic and get another F(x2)
  • Encoding is good representation, Find distance between x1 and x2
  • Two CNN and comparing them is Siamese Network Architecture
  • Train NN that generates encoding
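A toy sketch of the Siamese idea in plain Python: the same encoder, with the same weights, produces F(x1) and F(x2), and the distance between the two encodings is the similarity measure. The encoder here is a single linear layer with tanh standing in for the CNN. The weights and names are made up for illustration, and no training is shown.

```python
import math


def encode(x, weights):
    """Shared encoder F(x): one linear layer + tanh (stand-in for a CNN)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def siamese_distance(x1, x2, weights):
    """Encode both inputs with the SAME weights and compare the encodings."""
    f1 = encode(x1, weights)
    f2 = encode(x2, weights)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


# Hypothetical weights mapping 3 input values to a 2-D encoding.
weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
d_same = siamese_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], weights)
d_diff = siamese_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], weights)
```

Training then adjusts the shared weights so that the distance is small for the same person and large for different people.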


More Reads
One Shot Learning with Siamese Networks using Keras
Image Similarity with Siamese Networks
Keras Example1
Siamese Network
Survey on Deep Learning Techniques for Person Re-Identification Task
Unsupervised Person Re-identification by Deep Learning Tracklet Association
Enhanced Deep Feature Representation for Person Re-identification
WACV18: Vehicle Re-identification by Adversarial Bi-directional LSTM Network

Survey on Deep Learning Techniques for Person Re-Identification Task
Key Notes
  • On-line applications for people/object detection and tracking
  • Recognizing a suspicious action/behavior from the camera network
  • Off-line applications to support operators and forensic investigators 
Image Challenges
  • Low image resolution
  • Unconstrained pose
  • Illumination changes
  • Occlusions 
Features to Exploit
  • Face
  • Clothing appearance
  • Gait
  • A CNN generates a set of feature maps in which each pixel of the given image corresponds to a specific feature representation
  • Image Size - 128 × 64
DNN Key Considerations
  • Objective function
  • Loss functions
  • Data augmentation
Feature fusion deep neural network
  • The network takes a single image of size 224 × 224 × 3 as input
  • Hand-crafted features are extracted by one of the standard person re-identification descriptors
  • Both sets of extracted features are followed by a buffer layer and a fully connected layer, which act as the fusion layer
  • A softmax loss layer then takes the output vector of the fully connected layer in order to minimize the cross-entropy loss
Siamese network
  • Siamese network models have been widely employed in person re-identification tasks
  • Employed as pairwise
  • Two subnetworks included
  • Output is similarity score
Triplet models
  • Training samples are fed separately into three identical networks that share one parameter set
  • Each triplet unit is organized to maximize the margin between the matched pairs and the mismatched pairs
  • Losses used: Hinge loss, Cosine similarity loss, Contrastive loss
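Of the losses listed, contrastive loss is the simplest to sketch for the pairwise case. One common form is d^2 for a matched pair and max(margin - d, 0)^2 for a mismatched pair, where d is the distance between the two encodings. The margin default and function name below are assumptions, and some write-ups add a factor of 1/2.

```python
def contrastive_loss(distance, same_identity, margin=1.0):
    """Pull matched pairs together; push mismatched pairs beyond the margin."""
    if same_identity:
        # Matched pair: any distance is penalized.
        return distance ** 2
    # Mismatched pair: penalized only while closer than the margin.
    return max(margin - distance, 0.0) ** 2


matched = contrastive_loss(0.5, same_identity=True)
too_close = contrastive_loss(0.5, same_identity=False)
far_enough = contrastive_loss(2.0, same_identity=False)
```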

Happy Mastering DL!!!!