Paper #1 - PaDNet: Pan-Density Crowd Counting
Key Notes
- Density-Aware Network (DAN) contains multiple subnetworks pretrained on scenarios with different densities
- Capturing pan-density information
- Feature Enhancement Layer (FEL) effectively captures the global and local contextual features
- Feature Fusion Network (FFN) embeds spatial context and fuses these density-specific features
- Inconsistent densities due to camera perspective
- Sliding window detector
- Regression-based approaches
- Hand-crafted features
- Detection-based methods are affected by severe occlusions
- Switch-CNN trains a switch classifier to select the optimal regressor for each input patch
- Each subnetwork of Switch-CNN is trained on a specific density sub-dataset and thus cannot utilize the whole dataset
- High computation complexity in predicting the global and local contexts
- Feature Extraction Network (FEN) extracts low-level features of the image
- DAN employs multiple subnetworks to recognize different density levels in crowds and to generate the feature map
Regression Based Methods
- Mapping from low-level features extracted from local image patches to the count
- Extracted features include foreground features, edge features, textures, and gradient features such as local binary patterns (LBP) and histograms of oriented gradients (HOG)
- Regression approaches include linear regression [24], piecewise linear regression [25], ridge regression [26], and Gaussian process regression
- Five-branch contextual pyramid CNN
- GAN-based method to generate high-quality density maps
- Input: crowd image patch dataset S
- Output: the parameters Θ_PaDNet
- Init: Divide the image patches S into N clusters S_1, S_2, ..., S_N via the K-means clustering algorithm.
- P is the number of people in an image patch; d_ij is the distance between the i-th subject and its j-th nearest neighbor
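The initialisation step and the P / d_ij definitions above can be illustrated with a small sketch. It assumes that a patch's density is summarised as the mean distance from each head to its K nearest neighbors, and that patches are then clustered on that single value. The function names `density_degree` and `kmeans_1d`, the default `k=3`, and the quantile-based initialisation are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def density_degree(heads, k=3):
    """Mean distance from each head to its k nearest neighbours in one patch.

    heads: array-like of shape (P, 2) with head coordinates.
    Smaller values mean a denser patch.
    """
    heads = np.asarray(heads, dtype=float)
    p = len(heads)
    if p < 2:
        # Assumption: a patch with 0 or 1 people is treated as maximally sparse.
        return float("inf")
    k = min(k, p - 1)
    # Pairwise Euclidean distances between all heads.
    diff = heads[:, None, :] - heads[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero self-distance, so skip it.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    return float(knn.mean())


def kmeans_1d(values, n_clusters, n_iter=50):
    """Plain 1-D K-means on finite scalar values (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # Deterministic initialisation: centres at evenly spaced quantiles.
    centres = np.quantile(values, np.linspace(0.0, 1.0, n_clusters))
    for _ in range(n_iter):
        labels = np.abs(values[:, None] - centres[None, :]).argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centres[c] = values[labels == c].mean()
    # Final assignment against the converged centres.
    labels = np.abs(values[:, None] - centres[None, :]).argmin(axis=1)
    return labels, centres
```

Each resulting cluster S_n would then serve as the training subset for one density-specific subnetwork. Patches with fewer than two people need separate handling before clustering, because their density value is infinite in this sketch.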
Crowd Counting Challenges
Heavy occlusions, perspective distortions, scale variations and varying density of people
Combine the detection result with regression result for crowd counting
Implementation
Point-level annotations on person heads
Novelty
- Online pseudo ground truth updating scheme which initializes the pseudo ground truth bounding boxes from point-level annotations
- Novel locally-constrained regression loss
- Curriculum learning strategy
- Patch-based density estimation
- Estimate crowd counts via the detection of each individual pedestrian
- Regression of density maps
- Crowd counting is cast as estimating a continuous density function
Average distance from a given head j to its K nearest neighbors (K-NN)
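To make the density-function view concrete, here is a minimal sketch that turns point-level head annotations into a density map, with each Gaussian's width tied to the average K-NN distance of that head. The names `gaussian_kernel_at` and `density_map`, the scale factor `beta=0.3`, the sigma floor of 1 pixel, and the fallback `default_sigma` are assumptions chosen for illustration. Normalising each kernel over the image, so the map sums exactly to the head count, is also a simplification.

```python
import numpy as np


def gaussian_kernel_at(shape, centre, sigma):
    """Gaussian centred at `centre` (row, col), normalised to sum to 1 over the image."""
    rows = np.arange(shape[0])[:, None]
    cols = np.arange(shape[1])[None, :]
    sq_dist = (rows - centre[0]) ** 2 + (cols - centre[1]) ** 2
    g = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return g / g.sum()


def density_map(shape, heads, k=3, beta=0.3, default_sigma=4.0):
    """Sum of one normalised Gaussian per annotated head (heads assumed inside the image)."""
    heads = np.asarray(heads, dtype=float).reshape(-1, 2)
    dmap = np.zeros(shape, dtype=float)
    n = len(heads)
    for i in range(n):
        if n > 1:
            # Distances from head i to every head; index 0 after sorting is itself.
            d = np.sqrt(((heads - heads[i]) ** 2).sum(axis=1))
            kk = min(k, n - 1)
            mean_knn = np.sort(d)[1:kk + 1].mean()
            # Crowded heads get narrow kernels, isolated heads get wide ones.
            sigma = max(beta * mean_knn, 1.0)
        else:
            sigma = default_sigma
        dmap += gaussian_kernel_at(shape, heads[i], sigma)
    return dmap
```

Integrating (summing) the map over any region gives the estimated number of people in that region, which is what a regression network is trained to reproduce.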
Datasets
- ShanghaiTech
- WorldExpo’10
- UCF_CC_50
- UCSD
Congested Scenes
- Convolutional neural network
- Front-end for 2D feature extraction and a dilated CNN for the back-end
- Multi-column based architecture (MCNN) for crowd counting.
- Dilated convolutional layers have been demonstrated in segmentation tasks with significant improvement of accuracy
- Dilated convolution shows distinct advantages compared to the scheme of using convolution + pooling + deconvolution.
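The effect of dilation can be shown with a small NumPy sketch. It implements a single-channel "valid" dilated convolution (strictly, cross-correlation, as in most deep learning libraries) and is meant only to illustrate how the receptive field grows without pooling or extra parameters. The function name `dilated_conv2d` is hypothetical, and this is not how CSRNet or any framework implements the layer.

```python
import numpy as np


def dilated_conv2d(image, kernel, rate=1):
    """'Valid' 2-D cross-correlation with kernel taps spaced `rate` pixels apart."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    # Effective kernel extent once gaps are inserted: k + (k - 1) * (rate - 1).
    eff_h = kh + (kh - 1) * (rate - 1)
    eff_w = kw + (kw - 1) * (rate - 1)
    out_h = image.shape[0] - eff_h + 1
    out_w = image.shape[1] - eff_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            # Shift the image by the dilated offset and accumulate the weighted slice.
            r0, c0 = i * rate, j * rate
            out += kernel[i, j] * image[r0:r0 + out_h, c0:c0 + out_w]
    return out
```

A 3x3 kernel with rate 2 covers a 5x5 window while still using only nine weights, so the output resolution is preserved without the convolution + pooling + deconvolution round trip.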
Deep Learning-Based Crowd Scene Analysis Survey
Detection based approaches
Datasets
Make it a High / Medium / Low crowd scenario
Code
Network for Congested Scene Recognition, called CSRNet
Ref Code
Dilated convolutional layers to aggregate the multiscale contextual information in the congested scenes.
More Papers
Benchmark data and method for real-time people counting in cluttered scenes using depth sensors
Learning Spatial Awareness to Improve Crowd Counting
Image Crowd Counting Using Convolutional Neural Network and Markov Random Field
Fast Video Crowd Counting with a Temporal Aware Network
Paper with Code - Reference
Example - Keras - Dilated Convolution
7 Technologies that Count People (Buildings & Offices)
Keep Thinking!!!