"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

November 18, 2019

Day #297 - Paper Analysis - WIDER Face and Pedestrian Challenge

WIDER Face and Pedestrian Challenge
Tasks - face detection, pedestrian detection, person search
Dataset - WIDER Pedestrian Track - 20,000 images from surveillance cameras and driving vehicles

FACE DETECTION TRACK
  • Approach 1 - a single-stage detector with a network structure based on RetinaNet [7] and FAN (Face Attention Network)
  • Approach 2 - a two-stage face detector following the Faster R-CNN [12] and FPN (Feature Pyramid Network) [13] framework
  • Approach 3 - a two-stage face detection framework built on RetinaNet [7] and RefineDet [15]. The team uses two-stage classification and regression to improve classification accuracy
PEDESTRIAN DETECTION TRACK
  • Approach 1 - The champion's basic detection framework is Cascade R-CNN. Five models are ensembled: ResNet50 [18], DenseNet-161 [19], SENet-154 [20] and two ResNeXt-101 [21] models.
  • Approach 2 - The second-place team uses FPN [13] and Faster R-CNN [12] as the basis of their detection framework
  • Approach 3 - The third-place team uses Cascade R-CNN [16] as the detection framework
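These notes do not say how the champion combined its five models, so the snippet below is only a minimal sketch of one common way to ensemble detectors: pool the boxes from every model and run greedy non-maximum suppression (NMS) over the pooled set. The function names, the box format and the 0.5 IoU threshold are all assumptions made for illustration, not details from the paper.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2, score)
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ensemble_nms(per_model_boxes: List[List[Box]], iou_thr: float = 0.5) -> List[Box]:
    """Pool detections from all models, then keep the highest-scoring
    box of every overlapping group (greedy NMS)."""
    pooled = sorted(
        (box for boxes in per_model_boxes for box in boxes),
        key=lambda box: box[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in pooled:
        # Keep a box only if it does not overlap a stronger kept box.
        if all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return kept


# Toy detections from two hypothetical models on the same image.
model_a = [(10.0, 10.0, 50.0, 100.0, 0.9)]
model_b = [(12.0, 11.0, 52.0, 102.0, 0.8), (200.0, 50.0, 240.0, 140.0, 0.7)]
merged = ensemble_nms([model_a, model_b])
```

Here the two near-identical boxes collapse into the higher-scoring one, while the pedestrian seen by only one model is kept. Averaging overlapping boxes instead of discarding them is another option.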
PERSON SEARCH TRACK
  • Approach 1 - The winning team designs a cascaded model that utilizes both face and body features for person search. (1) The face detector used here is MTCNN [26] trained on WIDER FACE [4]. (2) The face recognition model backbones include ResNet [18], InceptionResNet-v2 [27], DenseNet [19], DPN and MobileNet [28]. (3) The Re-ID backbones include ResNet-50, ResNet-101, DenseNet-161 and DenseNet-201
  • Approach 2 - The solution is decomposed into two stages - the first stage retrieves faces, and the second stage retrieves bodies. Finally, the retrieval results of the two stages are combined as the ranking result. (1) Face Detection. The face detectors used here are PCN [29] and MTCNN [26]. (2) Face Retrieval. Second-order networks [30], [31], [32] (ResNet34 as backbone) trained on VGGFace2 [33] with softmax loss and ring loss [34] are used here
  • Approach 3 - In the first step, the face in the query is used, via face recognition, to find the persons whose faces can be detected. These images are then used to search all candidate images again with person re-identification features to get the final result
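The two-step idea in Approach 3 can be sketched roughly as follows. This is a toy illustration only: the feature vectors, the cosine similarity measure, the 0.8 face threshold and the rule that face matches rank first are all assumptions, since the notes do not give those details.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def two_step_search(
    query_face: Sequence[float],
    face_feats: Dict[str, Optional[Sequence[float]]],  # None = no face detected
    body_feats: Dict[str, Sequence[float]],
    face_thr: float = 0.8,  # hypothetical threshold
) -> List[str]:
    """Rank candidates: face matches first, then body (re-ID) matches."""
    # Step 1: face recognition on candidates whose face was detected.
    seeds = {
        cid
        for cid, feat in face_feats.items()
        if feat is not None and cosine(query_face, feat) >= face_thr
    }

    # Step 2: use the matched images as new queries and compare body
    # features against every candidate, including those without a face.
    def score(cid: str) -> float:
        if cid in seeds:
            return 2.0  # assumed rule: face matches always rank on top
        return max(
            (cosine(body_feats[cid], body_feats[s]) for s in seeds),
            default=0.0,
        )

    return sorted(body_feats, key=score, reverse=True)


# Toy data: "b" has no visible face but a body similar to "a".
query_face = [1.0, 0.0]
face_feats = {"a": [0.9, 0.1], "b": None, "c": [0.0, 1.0]}
body_feats = {"a": [1.0, 0.0, 0.0], "b": [0.9, 0.1, 0.0], "c": [0.0, 0.0, 1.0]}
ranking = two_step_search(query_face, face_feats, body_feats)
```

Candidate "b" is reached only through the second step, which is why the body search is run over all candidates rather than just those with a detectable face.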
Happy Learning!!!

