Journal of Imaging
Many scientific studies deal with person counting and density estimation from single images. Recently, convolutional neural networks (CNNs) have been applied to these tasks. Although better results are often reported, it is often unclear where the improvements come from and whether the proposed approaches would generalize. Thus, the main goal of this paper was to identify the critical aspects of these tasks and to show how they limit state-of-the-art approaches. Based on these findings, we show how to mitigate these limitations. To this end, we implemented a CNN-based baseline approach, which we extended to deal with the identified problems. These include the discovery of bias in the reference data sets, ambiguity in ground truth generation, and mismatching of evaluation metrics w.r.t. the training loss function. The experimental results show that our modifications allow for significantly outperforming the baseline in terms of the accuracy of person counts and density estima...
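The mismatch between training loss and evaluation metrics named in this abstract can be illustrated with a small sketch. It assumes, as is common for density-based counting but not stated in the abstract itself, that training uses a pixel-wise squared error on density maps while evaluation uses the mean absolute error (MAE) and root mean squared error (RMSE) of image-level counts. All function names and the toy 2x2 maps are hypothetical.

```python
import math

def count(density):
    """Person count = sum of all density-map pixels."""
    return sum(sum(row) for row in density)

def pixel_mse(pred, truth):
    """Pixel-wise mean squared error, a typical training loss (assumption)."""
    n = len(truth) * len(truth[0])
    return sum((p - t) ** 2
               for pred_row, truth_row in zip(pred, truth)
               for p, t in zip(pred_row, truth_row)) / n

def count_mae(preds, truths):
    """Mean absolute error of the image-level counts."""
    return sum(abs(count(p) - count(t)) for p, t in zip(preds, truths)) / len(truths)

def count_rmse(preds, truths):
    """Root mean squared error of the image-level counts."""
    return math.sqrt(
        sum((count(p) - count(t)) ** 2 for p, t in zip(preds, truths)) / len(truths)
    )

truth = [[1.0, 0.0], [0.0, 1.0]]     # two people
shifted = [[0.0, 1.0], [1.0, 0.0]]   # correct count, wrong positions
blurred = [[0.4, 0.4], [0.4, 0.4]]   # smooth map that undercounts

# The shifted map has the larger pixel loss but a perfect count;
# the blurred map has the smaller pixel loss but a worse count.
print(pixel_mse(shifted, truth), count_mae([shifted], [truth]))
print(pixel_mse(blurred, truth), count_mae([blurred], [truth]))
```

In this toy case a lower training loss goes together with a higher counting error, which is one way such a mismatch can show up.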
Big Data and Cognitive Computing, 2021
Automatically estimating the number of people in unconstrained scenes is a crucial yet challenging task in different real-world applications, including video surveillance, public safety, urban planning, and traffic monitoring. In addition, methods developed to estimate the number of people can be adapted and applied to related tasks in various fields, such as plant counting, vehicle counting, and cell microscopy. Crowd counting faces many challenges, including cluttered scenes, extreme occlusions, scale variation, and changes in camera perspective. Therefore, in the past few years, tremendous research efforts have been devoted to crowd counting, and numerous excellent techniques have been proposed. The significant progress in crowd counting methods in recent years is mostly attributed to advances in deep convolutional neural networks (CNNs) as well as to public crowd counting datasets. In this work, we review the papers that have been published in the last decade and provi...
Our work proposes a novel deep learning framework for estimating crowd density from static images of highly dense crowds. We use a combination of deep and shallow fully convolutional networks to predict the density map for a given crowd image. This combination effectively captures both the high-level semantic information (face/body detectors) and the low-level features (blob detectors) that are necessary for crowd counting under large scale variations. As most crowd datasets have limited training samples (<100 images) and deep learning based approaches require large amounts of training data, we perform multiscale data augmentation. Augmenting the training samples in this manner helps guide the CNN to learn scale-invariant representations. Our method is tested on the challenging UCF_CC_50 dataset and shown to outperform the state-of-the-art methods.
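The multiscale data augmentation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the scale factors, the nearest-neighbour resizing, and the names `resize_nearest` and `multiscale_samples` are all assumptions made for the example. The point it shows is that each rescaled copy keeps the same number of annotated heads while changing their apparent size.

```python
def resize_nearest(image, scale):
    """Resize a 2D list image by `scale` using nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    return [[image[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
             for x in range(new_w)]
            for y in range(new_h)]

def multiscale_samples(image, heads, scales=(0.5, 0.75, 1.0, 1.25)):
    """Yield (scaled_image, scaled_heads) pairs; the head count is unchanged.

    The scale factors are hypothetical defaults chosen for illustration.
    """
    for s in scales:
        scaled_heads = [(x * s, y * s) for x, y in heads]
        yield resize_nearest(image, s), scaled_heads

image = [[y * 4 + x for x in range(4)] for y in range(4)]  # toy 4x4 "image"
heads = [(1.0, 1.0), (3.0, 2.0)]                           # (x, y) annotations
samples = list(multiscale_samples(image, heads))

for scaled_image, scaled_heads in samples:
    print(len(scaled_image), len(scaled_image[0]), scaled_heads)
```

A real pipeline would typically also crop patches from each scale and regenerate the density map from the rescaled annotations.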
Journal of Intelligent Systems, 2020
The purpose of crowd counting is to estimate the number of pedestrians in crowd images. Crowd counting or density estimation is an extremely challenging task in computer vision, due to large scale variations and dense scenes. Current methods address these issues by combining multi-scale convolutional neural networks with different receptive fields. In this paper, a novel end-to-end architecture based on Multi-Scale Adversarial Convolutional Neural Network (MSA-CNN) is proposed to generate the crowd density map and estimate the crowd count. Firstly, a multi-scale network is used to extract the globally relevant features in the crowd image, and then fractionally-strided convolutional layers are designed for up-sampling the output to recover the loss of crucial details caused by the earlier max pooling layers. An adversarial loss is directly employed to shrink the estimated value into the realistic subspace to reduce the blurring effect of density estimation. Joint training is performed in a...
Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2017
Counting pedestrians in surveillance applications is a common scenario. However, it is often challenging to obtain sufficient annotated training data, especially for creating deep learning models, which require a large amount of training data. To address this problem, this paper explores the possibility of training a deep convolutional neural network (CNN) entirely from synthetically generated images for the purpose of counting pedestrians. Nuances of transfer learning are exploited to train models from a base model trained for image classification. A direct approach and a hierarchical approach are used during training to enhance the capability of the model for counting higher numbers of pedestrians. The trained models are then tested on natural images of completely different scenes captured by different acquisition systems not experienced by the model during training. Furthermore, the effectiveness of the cross entropy cost function and the squared error cost function is evaluated and analyzed for the scenario where a model is trained entirely using synthetic images. The performance of the trained model on test images from the target site can be improved by fine-tuning using an image of the background of the target site.
Computer Science and Information Systems, 2022
Crowd counting has a range of applications and is an important task that can help prevent accidents such as crowd crushes and stampedes at political protests, concerts, sports, and other social events. Many crowd counting approaches have been proposed in recent years. In this paper we compare five deep-learning-based approaches to crowd counting, reevaluate them, and present a novel CSRNet-based approach. We base our implementation on five convolutional neural network (CNN) architectures: CSRNet, Bayesian Crowd Counting, DM-Count, SFA-Net, and SGA-Net, and present a novel approach that upgrades CSRNet with a Bayesian crowd counting loss function and pixel modeling. The models are trained and evaluated on three widely used crowd image datasets: ShanghaiTech part A, ShanghaiTech part B, and UCF-QNRF. The results show that models based on SFA-Net and DM-Count outperform the state of the art when trained and evaluated on similar data, and the proposed extended model outperforms the base model with the same backbone when trained and evaluated on significantly different data, suggesting improved robustness.
IEEE Access
Nowadays, crowd analysis is one of the most important concepts to rely upon, as it contributes to decision making and to ensuring the safety and security of the crowd. There are a variety of interesting research problems within the scope of crowd analysis, including crowd tracking, crowd behaviour recognition, and crowd counting. Crowd counting based on images and videos has been studied in past years. Nonetheless, estimating and detecting the number of human heads remains a challenging task due to occlusions, resolution, and lighting changes. This paper provides an overview and performance comparison of crowd counting techniques using convolutional neural networks (CNN) based on density map estimation. We present a comprehensive analysis and benchmarking of crowd counting based on the UCF-QNRF dataset, which contains the largest number of crowd count images and head annotations available in the public domain. We also show the generation of density maps and their empirical evaluation, along with a performance comparison. INDEX TERMS Large-scale crowd, crowd counting, computer vision, deep learning, convolutional neural networks, bio-inspired model, density map estimation.
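Density map generation from head annotations, as mentioned in this abstract, is commonly done by placing one normalised Gaussian at each annotated head so that the map sums to the person count. The sketch below follows that general idea only. The fixed `sigma`, the renormalisation of each kernel inside the image bounds, and the function name `density_map` are assumptions for illustration, not details taken from the paper.

```python
import math

def density_map(heads, height, width, sigma=1.5):
    """Sum one normalised 2D Gaussian per annotated head position (x, y).

    Each kernel is renormalised over the image so that it contributes exactly
    1 to the total, even for heads near the border (an assumption here).
    Heads are expected to lie inside or close to the image.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for hx, hy in heads:
        kernel = [[math.exp(-((x - hx) ** 2 + (y - hy) ** 2) / (2 * sigma ** 2))
                   for x in range(width)]
                  for y in range(height)]
        total = sum(map(sum, kernel))
        for y in range(height):
            for x in range(width):
                dmap[y][x] += kernel[y][x] / total
    return dmap

heads = [(2.0, 3.0), (10.0, 8.0), (10.5, 8.5)]  # hypothetical annotations
dmap = density_map(heads, height=12, width=16)
estimated_count = sum(map(sum, dmap))           # integrates back to len(heads)
print(round(estimated_count, 6))
```

In practice the kernel width is often adapted to the local head size or crowd density rather than fixed, and array libraries replace the nested loops.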
2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019
In this work, we propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation. The proposed method uses VGG16 as the backbone network and employs the density map generated by the final layer as a coarse prediction to refine and generate finer density maps in a progressive fashion using residual learning. Additionally, the residual learning is guided by an uncertainty-based confidence weighting mechanism that permits the flow of only high-confidence residuals in the refinement path. The proposed Confidence Guided Deep Residual Counting Network (CG-DRCN) is evaluated on recent complex datasets, and it achieves significant improvements in errors. Furthermore, we introduce a new large-scale unconstrained crowd counting dataset (JHU-CROWD) that is ∼2.8× larger than the most recent crowd counting datasets in terms of the number of images. It contains 4,250 images with 1.11 million annotations. In comparison to existing datasets, the proposed dataset is collected under a variety of diverse scenarios and environmental conditions. Specifically, the dataset includes several images with weather-based degradations and illumination variations in addition to many distractor images, making it a very challenging dataset. Additionally, the dataset consists of rich annotations at both image-level and head-level. Several recent methods are evaluated and compared on this dataset.
Proceedings of the AAAI Conference on Artificial Intelligence
Counting people in dense crowds is a demanding task even for humans. This is primarily due to the large variability in appearance of people. Often people are only seen as a bunch of blobs. Occlusions, pose variations and background clutter further compound the difficulty. In this scenario, identifying a person requires larger spatial context and semantics of the scene. But the current state-of-the-art CNN regressors for crowd counting are feedforward and use only limited spatial context to detect people. They look for local crowd patterns to regress the crowd density map, resulting in false predictions. Hence, we propose top-down feedback to correct the initial prediction of the CNN. Our architecture consists of a bottom-up CNN along with a separate top-down CNN to generate feedback. The bottom-up network, which regresses the crowd density map, has two columns of CNN with different receptive fields. Features from various layers of the bottom-up CNN are fed to the top-down network. T...
IEEE Access, 2019
Crowd counting and density estimation is an important and challenging problem in the visual analysis of the crowd. Most of the existing approaches use regression on density maps for the crowd count from a single image. However, these methods cannot localize individual pedestrians and therefore cannot estimate the actual distribution of pedestrians in the environment. On the other hand, detection-based methods detect and localize pedestrians in the scene, but the performance of these methods degrades when applied in high-density situations. To overcome the limitations of pedestrian detectors, we propose a motion-guided filter (MGF) that exploits spatial and temporal information between consecutive frames of the video to recover missed detections. Our framework is based on the deep convolutional neural network (DCNN) for crowd counting in low-to-medium density videos. We employ various state-of-the-art network architectures, namely, Visual Geometry Group (VGG16), Zeiler and Fergus (ZF), and VGGM in the framework of a region-based DCNN for detecting pedestrians. After pedestrian detection, the proposed motion-guided filter is employed. We evaluate the performance of our approach on three publicly available datasets. The experimental results demonstrate the effectiveness of our approach, which significantly improves the performance of the state-of-the-art detectors. INDEX TERMS Deep convolutional neural networks, crowd counting and density estimation, Motion Guided Filter, faster R-CNN.
International journal of electrical and computer engineering systems, 2020
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
2018 24th International Conference on Pattern Recognition (ICPR), 2018
arXiv: Computer Vision and Pattern Recognition, 2021
Sensors, 2019
Fourth International Workshop on Pattern Recognition, 2019
Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods, 2017
IEEE Access, 2019
International Journal of Computational Intelligence Systems
Journal of Imaging, 2020
2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019
International Journal of Engineering & Technology
2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2015
Expert Systems with Applications, 2022
Computers, Materials & Continua