Academia.edu no longer supports Internet Explorer.
To browse Academia.edu and the wider internet faster and more securely, please take a few seconds to upgrade your browser.
2018, 2018 24th International Conference on Pattern Recognition (ICPR)
…
6 pages
1 file
We investigate generative adversarial networks as an effective solution to the crowd counting problem. These networks not only learn the mapping from crowd image to corresponding density map, but also learn a loss function to train this mapping. There are many challenges to the task of crowd counting, such as severe occlusions in extremely dense crowd scenes, perspective distortion, and high visual similarity between pedestrians and background elements. To address these problems, we proposed multi-scale generative adversarial network to generate highquality crowd density maps of arbitrary crowd density scenes. We utilized the adversarial loss from discriminator to improve the quality of the estimated density map, which is critical to accurately predict crowd counts. The proposed multi-scale generator can extract multiple hierarchy features from the crowd image. The results showed that the proposed method provided better performance compared to current state-of-the-art methods .
Journal of Intelligent Systems, 2020
The purpose of crowd counting is to estimate the number of pedestrians in crowd images. Crowd counting or density estimation is an extremely challenging task in computer vision, due to large scale variations and dense scene. Current methods solve these issues by compounding multi-scale Convolutional Neural Network with different receptive fields. In this paper, a novel end-to-end architecture based on Multi-Scale Adversarial Convolutional Neural Network (MSA-CNN) is proposed to generate crowd density and estimate the amount of crowd. Firstly, a multi-scale network is used to extract the globally relevant features in the crowd image, and then fractionally-strided convolutional layers are designed for up-sampling the output to recover the loss of crucial details caused by the earlier max pooling layers. An adversarial loss is directly employed to shrink the estimated value into the realistic subspace to reduce the blurring effect of density estimation. Joint training is performed in a...
Our work proposes a novel deep learning framework for estimating crowd density from static images of highly dense crowds. We use a combination of deep and shallow, fully convolutional networks to predict the density map for a given crowd image. Such a combination is used for effectively capturing both the high-level semantic information (face/body detectors) and the low-level features (blob detectors), that are necessary for crowd counting under large scale variations. As most crowd datasets have limited training samples (<100 images) and deep learning based approaches require large amounts of training data, we perform multiscale data augmentation. Augmenting the training samples in such a manner helps in guiding the CNN to learn scale invariant representations. Our method is tested on the challenging UCF CC 50 dataset, and shown to outperform the state of the art methods.
arXiv: Computer Vision and Pattern Recognition, 2021
Recently the crowd counting has received more and more attention. Especially the technology of high-density environment has become an important research content, and the relevant methods for the existence of extremely dense crowd are not optimal. In this paper, we propose a multi-level attentive Convolutional Neural Network (MLAttnCNN) for crowd counting. We extract high-level contextual information with multiple different scales applied in pooling, and use multilevel attention modules to enrich the characteristics at different layers to achieve more efficient multi-scale feature fusion, which is able to be used to generate a more accurate density map with dilated convolutions and a 1 × 1 convolution. The extensive experiments on three available public datasets show that our proposed network achieves outperformance to the state-of-the-art approaches.
Big Data and Cognitive Computing, 2021
Automatically estimating the number of people in unconstrained scenes is a crucial yet challenging task in different real-world applications, including video surveillance, public safety, urban planning, and traffic monitoring. In addition, methods developed to estimate the number of people can be adapted and applied to related tasks in various fields, such as plant counting, vehicle counting, and cell microscopy. Many challenges and problems face crowd counting, including cluttered scenes, extreme occlusions, scale variation, and changes in camera perspective. Therefore, in the past few years, tremendous research efforts have been devoted to crowd counting, and numerous excellent techniques have been proposed. The significant progress in crowd counting methods in recent years is mostly attributed to advances in deep convolution neural networks (CNNs) as well as to public crowd counting datasets. In this work, we review the papers that have been published in the last decade and provi...
International Journal of Computational Intelligence Systems
People counting has been investigated extensively as a tool to increase the individual’s safety and to avoid crowd hazards at public places. It is a challenging task especially in high-density environment such as Hajj and Umrah, where millions of people gathered in a constrained environment to perform rituals. This is due to large variations of scales of people across different scenes. To solve scale problem, a simple and effective solution is to use an image pyramid. However, heavy computational cost is required to process multiple levels of the pyramid. To overcome this issue, we propose deep-fusion model that efficiently and effectively leverages the hierarchical features exits in various convolutional layers deep neural network. Specifically, we propose a network that combine multiscale features from shallow to deep layers of the network and map the input image to a density map. The summation of peaks in the density map provides the final crowd count. To assess the effectiveness...
2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019
In this work, we propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation. The proposed method uses VGG16 as the backbone network and employs density map generated by the final layer as a coarse prediction to refine and generate finer density maps in a progressive fashion using residual learning. Additionally, the residual learning is guided by an uncertainty-based confidence weighting mechanism that permits the flow of only high-confidence residuals in the refinement path. The proposed Confidence Guided Deep Residual Counting Network (CG-DRCN) is evaluated on recent complex datasets, and it achieves significant improvements in errors. Furthermore, we introduce a new large scale unconstrained crowd counting dataset (JHU-CROWD) that is ∼2.8 × larger than the most recent crowd counting datasets in terms of the number of images. It contains 4,250 images with 1.11 million annotations. In comparison to existing datasets, the proposed dataset is collected under a variety of diverse scenarios and environmental conditions. Specifically, the dataset includes several images with weatherbased degradations and illumination variations in addition to many distractor images, making it a very challenging dataset. Additionally, the dataset consists of rich annotations at both image-level and head-level. Several recent methods are evaluated and compared on this dataset.
Proceedings of the AAAI Conference on Artificial Intelligence
Counting people in dense crowds is a demanding task even for humans. This is primarily due to the large variability in appearance of people. Often people are only seen as a bunch of blobs. Occlusions, pose variations and background clutter further compound the difficulty. In this scenario, identifying a person requires larger spatial context and semantics of the scene. But the current state-of-the-art CNN regressors for crowd counting are feedforward and use only limited spatial context to detect people. They look for local crowd patterns to regress the crowd density map, resulting in false predictions. Hence, we propose top-down feedback to correct the initial prediction of the CNN. Our architecture consists of a bottom-up CNN along with a separate top-down CNN to generate feedback. The bottom-up network, which regresses the crowd density map, has two columns of CNN with different receptive fields. Features from various layers of the bottom-up CNN are fed to the top-down network. T...
IEEE Access, 2019
Crowd counting and density estimation is an important and challenging problem in the visual analysis of the crowd. Most of the existing approaches use regression on density maps for the crowd count from a single image. However, these methods cannot localize individual pedestrian and therefore cannot estimate the actual distribution of pedestrians in the environment. On the other hand, detection-based methods detect and localize pedestrians in the scene, but the performance of these methods degrades when applied in high-density situations. To overcome the limitations of pedestrian detectors, we proposed a motion-guided filter (MGF) that exploits spatial and temporal information between consecutive frames of the video to recover missed detections. Our framework is based on the deep convolution neural network (DCNN) for crowd counting in the low-to-medium density videos. We employ various state-of-the-art network architectures, namely, Visual Geometry Group (VGG16), Zeiler and Fergus (ZF), and VGGM in the framework of a region-based DCNN for detecting pedestrians. After pedestrian detection, the proposed motion guided filter is employed. We evaluate the performance of our approach on three publicly available datasets. The experimental results demonstrate the effectiveness of our approach, which significantly improves the performance of the state-of-the-art detectors. INDEX TERMS Deep convolutional neural networks, crowd counting and density estimation, Motion Guided Filter, faster R-CNN.
Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2017
Counting pedestrians in surveillance applications is a common scenario. However, it is often challenging to obtain sufficient annotated training data, especially so for creating models using deep learning which require a large amount of training data. To address this problem, this paper explores the possibility of training a deep convolutional neural network (CNN) entirely from synthetically generated images for the purpose of counting pedestrians. Nuances of transfer learning are exploited to train models from a base model trained for image classification. A direct approach and a hierarchical approach are used during training to enhance the capability of the model for counting higher number of pedestrians. The trained models are then tested on natural images of completely different scenes captured by different acquisition systems not experienced by the model during training. Furthermore, the effectiveness of the cross entropy cost function and the squared error cost function are evaluated and analyzed for the scenario where a model is trained entirely using synthetic images. The performance of the trained model for the test images from the target site can be improved by fine-tuning using the image of the background of the target site.
Loading Preview
Sorry, preview is currently unavailable. You can download the paper by clicking the button above.
International journal of electrical and computer engineering systems, 2020
Fourth International Workshop on Pattern Recognition,, 2019
Computers, Materials & Continua
Computer Science and Information Systems, 2022
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
Journal of Imaging, 2020
IEEE Access, 2019
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018
IEEExplore, 2019 Innovations in Intelligent Systems and Applications Conference, 2019
IEEE Access
2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019
2019 IEEE International Conference on Image Processing (ICIP), 2019
Sensors, 2019
Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods, 2017
IEEE Access, 2022
arXiv (Cornell University), 2020