2011, CVPR 2011 WORKSHOPS
Conventional human detection is mostly performed on images taken by visible-light cameras. These methods imitate the detection process that humans use, relying on gradient-based features such as histograms of oriented gradients (HOG) or on interest points such as the scale-invariant feature transform (SIFT). In this paper, we present a novel human detection method using depth information captured by the Kinect for Xbox 360. We propose a model-based approach that detects humans using a 2-D head contour model and a 3-D head surface model. We propose a segmentation scheme that separates a human from his/her surroundings and extracts the whole contour of the figure based on our detection point. We also explore a tracking algorithm based on our detection results. The methods are tested on a database captured with the Kinect in our lab and yield superior results.
Although the Kinect was designed as a gaming tool, studies in recent years have shown that this sensor can be used for real-time environmental scanning, segmentation, classification and scene understanding. Our approach, based on the Kinect or any similar device, gathers depth and RGB data from the sensors and processes the information in near real time. The purpose is to divide the data into distinct regions based on depth and colour, and then to calculate the distance for each detected area (depth labelling). To achieve good performance in many real situations involving humans, compared to other existing segmentation or depth-calculation solutions, we considered from the beginning the fact that humans differ from objects: most objects are static, and are therefore unlikely to change their dimensions and location from frame to frame. We propose a method in which regions are detected by merging two different segmentation methods: human detection using skeletal tracking, and the RANSAC algorithm as a method for object detection. Our experimental results show that the solution, running on a mobile device (notebook), achieves a modest improvement of up to 7% over the RANSAC object-detection method alone.
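The RANSAC object-detection component mentioned in the abstract above can be illustrated with a plane-fitting sketch. This is not the authors' implementation; it is a minimal NumPy version run on synthetic floor/object data, and the iteration count and inlier threshold are arbitrary illustrative choices.

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.02, rng=None):
    """Fit a dominant plane to an (N, 3) point cloud with RANSAC.

    Returns (normal, d, inlier_mask) for the plane n.x + d = 0.
    """
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(n_iters):
        # Sample 3 distinct points and derive the plane through them.
        idx = rng.choice(len(points), size=3, replace=False)
        p0, p1, p2 = points[idx]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal.dot(p0)
        # Points within `threshold` of the plane count as inliers.
        dist = np.abs(points @ normal + d)
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (normal, d)
    return best_model[0], best_model[1], best_inliers

# Synthetic scene: a flat floor (z ~ 0) plus a scattered "object" above it.
rng = np.random.default_rng(0)
floor = np.column_stack([rng.uniform(0, 1, 500), rng.uniform(0, 1, 500),
                         rng.normal(0, 0.005, 500)])
obj = rng.uniform(0.3, 0.5, (100, 3)) + np.array([0.0, 0.0, 0.3])
cloud = np.vstack([floor, obj])

normal, d, inliers = ransac_plane(cloud, rng=1)
```

Everything not on the fitted plane (here, the object points) survives as a detection candidate, which is the role RANSAC plays in the merged pipeline.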
This paper presents a novel method for people detection and tracking using depth images provided by a Kinect camera. The depth image captured by the Kinect is analysed using its histogram, allowing the depth image to be divided into slices; this makes the retrieval of regions of interest a simple and computationally light process compared to point clouds. These regions are then classified as human or not using a template-matching technique. An efficient gradient-descent algorithm, RPROP, is used to perform the template matching, and tracking is performed by comparing colour-image histograms for each region of interest in consecutive frames. The proposed method is viable for on-line detection and tracking of people and has been tested on a mobile platform in an unconstrained environment.
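The histogram-based depth slicing described above can be sketched as follows. This is an illustrative NumPy reconstruction, not the paper's code; the bin width and minimum-pixel support are assumed values.

```python
import numpy as np

def depth_slices(depth, bin_width=250, min_pixels=200):
    """Split a depth image (mm) into candidate regions of interest by
    histogramming depth values and keeping well-populated slices."""
    valid = depth > 0                      # 0 = no Kinect reading
    edges = np.arange(0, depth.max() + bin_width + 1, bin_width)
    hist, _ = np.histogram(depth[valid], bins=edges)
    rois = []
    for i, count in enumerate(hist):
        if count >= min_pixels:            # enough support to be an object
            mask = valid & (depth >= edges[i]) & (depth < edges[i + 1])
            rois.append(mask)
    return rois

# Toy 100x100 depth map: background wall at 3000 mm, a "person" blob at 1500 mm.
depth = np.full((100, 100), 3000, dtype=np.int32)
depth[20:80, 30:60] = 1500
rois = depth_slices(depth)
```

Each returned mask is a depth slice that downstream template matching would classify as human or not.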
International Journal of Research, 2015
Head and hand blobs are crucial inputs for human-computer interaction (HCI) applications, as they play an important role in bridging the information gap between a human and a computer. One well-known technology that serves as an advanced input device for HCI is the Kinect camera developed by Microsoft. The Kinect (codenamed Project Natal) has a distinct advantage over other 3D cameras because it obtains accurate depth information about a subject easily and very quickly. Using the Kinect, one can track up to six people concurrently and also perform motion analysis with feature extraction. While extremely useful in indoor HCI applications, it cannot be used outdoors because its infrared depth sensor makes it extremely sensitive to sunlight.
Lecture Notes in Computer Science, 2012
Detecting humans and objects in images has been a very challenging problem due to variations in illumination, pose, clothing, background and other complexities. Depth information is an important cue when humans recognize objects and other humans. In this work we utilize the depth information provided by an Xtion Pro Live sensor (a Kinect-like device) to detect humans and obstacles in real time for a blind or visually impaired user. The system runs in two modes. In the first mode, we focus on how to track and/or detect multiple humans and moving objects and transduce the information to the user. In the second mode, we present a novel approach to obstacle avoidance for safe navigation by a blind or visually impaired user in an indoor environment. In addition, we present a user study with blindfolded users to measure the efficiency and robustness of our algorithms and approaches.
Lecture Notes in Computer Science, 2014
In this paper the authors present a method to detect humans in Kinect-captured Gray-Depth (G-D) video using Continuous Hidden Markov Models (C-HMMs). In the proposed approach, we first generate multiple gray-scale images from a single gray-scale image or video frame based on depth connectivity: the G image is segmented using depth information and the relevant components are extracted. These components are then filtered, and features are extracted from the candidate components only. A robust feature named the Local Gradient Histogram (LGH) is used to detect humans in the G-D video. We evaluated our system on the dataset published by LIRIS at ICPR 2012 and on our own dataset captured in our lab, and observed that the proposed method detects humans with 94.25% accuracy.
International Journal of Signal Processing, Image Processing and Pattern Recognition, 2015
Kinect is a motion-sensing device originally developed for the Xbox 360 gaming console. This low-cost sensor detects body position, motion, and voice; it consists of a microphone, an RGB camera, and a depth sensor. Kinect is a PC-centric sensor that allows developers to build real-life applications driven by human gestures and body motions. This paper presents an approach to interpreting indoor room objects by matching object features in depth images drawn from an RGB-D video database. The dataset consists of colour and depth image pairs gathered in a real indoor home environment. Object features are matched across depth image pairs with a feature-association method to detect features that remain stable at different time instances.
Springer eBooks, 2015
In recent years, low-cost multichannel sensors have become widespread. The best known of these devices is certainly the Microsoft Kinect, which provides a colour image and a depth map of the scene at the same time. However, the Kinect focuses specifically on human-computer interaction, so the SDK supplied with the sensor enables efficient detection of foreground people but not of generic objects. This paper presents an alternative, more general solution for foreground segmentation and a comparison with the Kinect's standard background-subtraction algorithm. The proposed algorithm is a port of a previous one developed for a Time-of-Flight camera, based on a combination of Otsu thresholding and region growing. The new implementation exploits the particular characteristics of the Kinect sensor to achieve a fast and precise result.
2014
Automatic people detection is a widely adopted technology that has applications in retail stores, crowd management and surveillance. The goal of this work is to create a general purpose people detection framework. Studies on people detection, tracking and re-identification are reviewed. The emphasis is on people detection from depth images. Furthermore, an approach based on a network of smart depth cameras is presented. The performance is evaluated with four image sequences, totalling over 20 000 depth images. Experimental results show that simple and lightweight algorithms are very useful in practical applications.
Automatic human joint detection is used in many applications nowadays. In this paper, we propose an approach to detecting full-body human joints using depth and colour images. The proposed solution is divided into three stages: an image preprocessing stage, a distance transform stage, and an anthropometric constraint analysis stage. The output of our solution is a stickman model with the same pose as in the given input image. Our implementation uses a Microsoft Kinect RGB and depth camera with 640x480 image resolution. The performance of the solution is demonstrated on several human postures.
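The distance-transform stage named above can be illustrated with a classic two-pass chamfer approximation. This is a generic textbook algorithm, not the paper's implementation; the silhouette and the chamfer weights (1 / 1.4) are illustrative choices.

```python
import numpy as np

def chamfer_distance(mask):
    """Two-pass chamfer approximation of the distance from each foreground
    pixel to the nearest background pixel (weights 1 straight, 1.4 diagonal)."""
    h, w = mask.shape
    INF = 1e9
    d = np.where(mask, INF, 0.0)
    # Forward pass (top-left to bottom-right).
    for y in range(h):
        for x in range(w):
            if d[y, x] == 0:
                continue
            best = d[y, x]
            if x > 0: best = min(best, d[y, x - 1] + 1)
            if y > 0:
                best = min(best, d[y - 1, x] + 1)
                if x > 0: best = min(best, d[y - 1, x - 1] + 1.4)
                if x < w - 1: best = min(best, d[y - 1, x + 1] + 1.4)
            d[y, x] = best
    # Backward pass (bottom-right to top-left).
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            best = d[y, x]
            if x < w - 1: best = min(best, d[y, x + 1] + 1)
            if y < h - 1:
                best = min(best, d[y + 1, x] + 1)
                if x < w - 1: best = min(best, d[y + 1, x + 1] + 1.4)
                if x > 0: best = min(best, d[y + 1, x - 1] + 1.4)
            d[y, x] = best
    return d

# A 21x21 square silhouette: the distance peaks at the centre, the kind of
# "body axis" evidence a joint-detection stage can exploit.
mask = np.zeros((31, 31), dtype=bool)
mask[5:26, 5:26] = True
dist = chamfer_distance(mask)
peak = np.unravel_index(np.argmax(dist), dist.shape)
```

Ridges and peaks of the distance map trace the medial axis of the silhouette, on which anthropometric constraints can then place candidate joints.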
Proceedings of the 2013 ACM conference on Pervasive and ubiquitous computing adjunct publication, 2013
In this paper the authors present a method to localize and detect human beings in a Kinect-captured sequence of images. The proposed method takes a sequence of gray-scale (G) images and the corresponding depth (D) images as input. The gray-scale image and the depth information are captured by two different sensors within the same device, and the processing is executed on the processor attached to the Kinect. The proposed method localizes a human by their motion along the x and y directions, then considers all pixels connected to those pixels over a 3D plane to accomplish the segmentation, with an accuracy of 77%. Experimental results demonstrate that our method is robust compared with existing methods for human localization.
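The motion-based localization step can be sketched as a depth frame difference. This is a hedged NumPy reconstruction, not the authors' code; the change threshold and minimum region size are assumed values.

```python
import numpy as np

def motion_mask(prev_depth, cur_depth, delta=50, min_pixels=30):
    """Flag pixels whose depth changed by more than `delta` mm between
    consecutive frames; a person moving in x/y shows up as such a region."""
    moved = np.abs(cur_depth.astype(np.int32) - prev_depth.astype(np.int32)) > delta
    return moved if moved.sum() >= min_pixels else np.zeros_like(moved)

# Static room at 3000 mm; a 20x10 "person" at 1200 mm shifts 5 px to the right.
frame0 = np.full((60, 80), 3000, dtype=np.uint16)
frame1 = frame0.copy()
frame0[20:40, 30:40] = 1200
frame1[20:40, 35:45] = 1200
mask = motion_mask(frame0, frame1)
ys, xs = np.nonzero(mask)
```

The bounding box of the flagged pixels gives the x/y localization seed from which depth-connected segmentation can then grow.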
Conventional human detection is mostly performed on images taken by visible-light cameras. These methods imitate the detection process that humans use, relying on gradient-based features such as histograms of oriented gradients (HOG) or on interest points such as the scale-invariant feature transform (SIFT). In this paper, we present a novel human detection method using depth information captured by the Kinect for Xbox 360. We propose a model-based approach that detects humans using a 2-D head contour model and a 3-D head surface model, together with eye detection implemented with OpenCV. We also explore a tracking algorithm based on our detection results.
The detection of persons in an image has been the subject of several studies, most of which use images taken by visible-light (RGB) cameras. In this paper, we are interested in detecting people's contours in Kinect 3D images. We investigate the application of gradient approaches and optimal filters to depth images, and we use this detection to monitor a person via her gestures. Results show that Canny edge detection performs well on people in both lighting conditions, but the Sobel algorithm performed better on depth images taken in the dark.
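The gradient approach on depth images can be illustrated with a plain Sobel operator, written here with shifted array slices so no OpenCV dependency is needed. The step scene and edge threshold are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def sobel_magnitude(depth):
    """Gradient magnitude of a depth image via 3x3 Sobel kernels,
    computed with explicit shifted slices."""
    d = depth.astype(float)
    p = np.pad(d, 1, mode="edge")
    # Horizontal and vertical Sobel responses.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.hypot(gx, gy)

# Depth step: a person at 1500 mm against a wall at 3000 mm; the silhouette
# boundary produces a strong response regardless of scene illumination.
depth = np.full((50, 50), 3000)
depth[10:40, 15:35] = 1500
mag = sobel_magnitude(depth)
edges = mag > 1000
```

Because depth is independent of illumination, the silhouette edge is equally strong in the dark, which is consistent with the paper's observation about Sobel on dark-scene depth images.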
The detection and tracking of a hand is an emerging research issue nowadays for controlling devices by hand motion. Conventional hand detection methods use colour and shape information from an RGB camera. With the recent advent of the depth camera, some researchers have shown that hand detection performance can be improved by combining colour (or intensity) information with information from the depth camera. In this paper, we propose a novel method for hand detection using both colour and depth information from Microsoft's Kinect device. The proposed method extracts candidate hand regions from the depth image and selects the best candidate based on the colour and shape features of each candidate region. The contour of the selected candidate is then determined in the higher-resolution RGB image to improve positional accuracy. For tracking the detected hand, we propose a boundary-tracking method based on the Generalized Hough Transform (GHT). Experimental results show that the proposed method improves the accuracy of hand-motion detection over conventional methods.
Proceedings of the Fifth Symposium on Information and Communication Technology - SoICT '14, 2014
This paper investigates an approach to extracting and tracking multiple subjects from a sequence of depth images. A Kinect camera is used to obtain depth images revealing the depth information of a scene. Our proposed system includes an object-clustering module that segments the isolated regions corresponding to objects in a depth image, and a foreground-detection module that finds moving regions across a sequence of frames. Combining the two modules lets us determine which object is moving within a sequence of frames and thereby locate a human subject. To extract the depth silhouettes of multiple subjects over time, we propose a matching algorithm between two consecutive frames to track their movements. We evaluate the algorithm on a long sequence of frames in a complex environment containing backgrounds with furniture, and show that the algorithm can precisely extract and separate different human subjects at a fast processing speed. The proposed approach is therefore suitable for a wide range of practical applications in human activity recognition, human pose estimation and human tracking from depth images.
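The two modules described above, foreground detection plus clustering of isolated regions, can be sketched generically as background subtraction followed by connected-component labelling. This is an illustrative reconstruction, not the paper's pipeline; the depth values and thresholds are synthetic.

```python
import numpy as np
from collections import deque

def connected_components(mask):
    """Label 4-connected foreground regions with a BFS flood fill."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        count += 1
        q = deque([(sy, sx)])
        labels[sy, sx] = count
        while q:
            y, x = q.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = count
                    q.append((ny, nx))
    return labels, count

# Background model = empty room; two people appear as isolated depth blobs.
background = np.full((60, 80), 4000)
frame = background.copy()
frame[10:30, 10:25] = 1800          # subject A
frame[20:50, 50:70] = 2200          # subject B
foreground = np.abs(frame - background) > 100
labels, n = connected_components(foreground)
```

Each label is one isolated subject silhouette; matching labelled silhouettes between consecutive frames is then the tracking step.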
In this paper we present a module for the detection and segmentation of human torsos for Human-Robot Interaction. People detection is a very important task in Human-Robot Interaction, particularly at the beginning of the interaction. Visual and 3D data from a Kinect sensor are used to achieve the task. The proposed module uses three main processes: a) a face detector, b) a skin-colour detector, and c) a 3D region-growing segmentation technique. Open-source libraries such as OpenNI, OpenKinect and the Point Cloud Library are used in our implementation to handle the device and sensor data, with OpenCV and GLUT used for image processing and display.
This paper addresses the problem of people extraction in a closed context from a video sequence including colour and depth information. The study is based on the low-cost depth sensors included in products such as the Kinect or Asus devices, which contain a pair of cameras, one colour and one depth. Depth cameras lack precision, especially where a discontinuity in depth occurs, and sometimes fail to return a value; colour information, meanwhile, can be ambiguous for discriminating between background and foreground. This led us to first use depth information to achieve a coarse segmentation, which is then refined with colour information. Furthermore, colour information is used only where the classification of pixels into foreground and background classes is sufficiently clear. The developed method provides reliable and robust segmentation and a natural visual rendering while maintaining real-time processing.
In this paper, we propose a system to keep track of human body movements in real time. Kinect sensors are used to capture depth and audio streams. The system is designed as an integration of two modules, a Kinect module and an Augmented Reality module. The Kinect module performs voice recognition and captures depth images that are used by the Augmented Reality module for computing distance parameters. The Augmented Reality module also captures real-time image data streams from a high-resolution camera. The system generates a 3D model that is superimposed on the real-time data.
Object segmentation is a fundamental research topic in computer vision. While colour information alone has long been the main focus of segmentation research, the availability of low-cost colour-plus-range sensors means that depth segmentation is now attracting significant attention. This paper presents a novel algorithm for depth segmentation. The proposed technique exploits the divergence of a 2D vector field to segment three-dimensional (3D) objects in depth maps. For a given depth image acquired with a low-resolution Kinect sensor, a 2D vector field is first computed at each point of the range image. The depth map is then converted to a div map by computing the divergence of the 2D vector field, which maps the vector field to a scalar field. The variation of divergence values over the surface contour of a 3D object helps to extract its boundaries. Finally, depth segmentation is accomplished by thresholding the div map to separate the 3D object from the background. In addition to removing the background, the proposed technique also separates the object from the surface on which it rests. The technique was tested on the low-resolution Washington RGB-D (Kinect) object dataset. Preliminary experimental results suggest that the proposed algorithm achieves better depth segmentation than state-of-the-art graph-based depth segmentation, while also being 40% more computationally efficient.
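The div-map construction can be sketched in NumPy. One assumption up front: the abstract does not specify its 2D vector field, so this sketch uses the raw depth gradient, whose divergence is simply the Laplacian of the depth map; the tilted-plane scene and the threshold are also illustrative.

```python
import numpy as np

def div_map(depth):
    """Divergence of the 2D depth-gradient field (i.e. the Laplacian of
    the depth map): ~0 on planar surfaces, spiking at object boundaries."""
    gy, gx = np.gradient(depth.astype(float))
    return np.gradient(gx, axis=1) + np.gradient(gy, axis=0)

# A sloping support surface with a box sitting on it: both planar regions
# stay near zero, while the box boundary produces large |divergence|.
yy, xx = np.mgrid[0:60, 0:60]
depth = 2000.0 - 2.0 * yy             # tilted table plane
depth[20:40, 20:40] -= 300            # raised box
d = div_map(depth)
boundary = np.abs(d) > 10
```

Because a plane has zero Laplacian regardless of its tilt, thresholding the div map suppresses both the background and the supporting surface, leaving only the object boundary, matching the behaviour the abstract claims.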
Multimedia Tools and Applications, 2019
In this paper we present a new 3D descriptor for human classification and a human detection method based on this descriptor. The proposed 3D descriptor allows an object represented by a point cloud to be classified as human or non-human. It is derived from the well-known Histogram of Oriented Gradients by employing surface normals instead of gradients. The process consists of an appropriate subdivision of the object point cloud into blocks; these blocks model the spatial distribution of surface-normal orientations across the different parts of the object, expressed in the form of a histogram. In addition, we have set up a multi-Kinect acquisition system that provides us with Complete Point Clouds (CPCs), i.e. 360° views. Such CPCs enable suitable processing, particularly in the case of occlusions, and allow the frontal orientation of the human to be determined. Based on the proposed 3D descriptor, we have developed a human detection method that operates on CPCs. We first evaluated the 3D descriptor over a set of CPC candidates using a Support Vector Machine (SVM) classifier, with the learning process conducted on the original CPC database that we built. The results are very promising: the descriptor can discriminate human from non-human candidates and provides the frontal direction of humans with high precision. In addition, we demonstrated that using CPCs significantly improves classification results in comparison with single point clouds.
Hand detection is one of the most crucial steps in a human-computer interaction environment. This paper presents a distance-based technique for hand detection using depth-image information obtained from a Kinect sensor. First, the Kinect sensor is used to obtain the depth image. Second, background subtraction and an iterative shadow-removal method are applied to reduce noise in the depth image. The official Microsoft SDK is then used for the extraction process. Finally, the two hands are segmented, marked in different colours, at a specific distance. Experimental results show that the hands and head can be detected in different positions with good accuracy.
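The distance criterion at the core of the method can be sketched as a simple depth-band threshold. The interaction-band limits and the toy scene are assumed values for illustration, not the paper's parameters.

```python
import numpy as np

def segment_by_distance(depth, near=400, far=800):
    """Keep only pixels inside the interaction band [near, far] mm -- the
    'specific distance' at which hands held toward the sensor appear."""
    return (depth >= near) & (depth <= far)

# Toy scene: body at 1500 mm, two hands reached toward the Kinect at 600 mm.
depth = np.full((48, 64), 1500, dtype=np.uint16)
depth[10:20, 10:18] = 600    # left hand
depth[10:20, 45:53] = 600    # right hand
hands = segment_by_distance(depth)
```

Because hands held toward the sensor are closer than the rest of the body, this single band isolates both hand blobs, which can then be coloured or labelled separately.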