Theft Detection using Deep Learning

Sairaj Shirole (  [email protected] )

Research Article

Keywords: YOLO (You Only Look Once), OpenCV (Open-Source Computer Vision Library), NCRB (National Crime Records Bureau), ML (machine learning)

Posted Date: November 8th, 2023

DOI: https://doi.org/10.21203/rs.3.rs-3540282/v1

License:   This work is licensed under a Creative Commons Attribution 4.0 International License.

Additional Declarations: No competing interests reported.


Theft Detection Using Deep Learning

Parth Virdhe, Anuj Nemanwar, Sairaj Shirole, Aditya Chouthankar
Students, I2IT, Pune

Prof. Prashant Mandale
Professor, I2IT, Pune

Abstract: In earlier times, crime detection relied solely on human observation and lacked efficient methods. The advent of CCTV cameras marked a significant advancement in crime detection, but the manual review of video footage proved to be a time-consuming process. Now that Artificial Intelligence (AI) and Machine Learning (ML) have made significant strides, the need for intelligent systems that automate crime detection in CCTV surveillance has become paramount. Such systems can not only detect crimes but also classify them and alert nearby police stations and ambulances, thereby contributing to the reduction of crime rates in any given country. Object detection and tracking in computer vision have gained widespread attention due to their diverse applications, including surveillance and security systems, and researchers have worked diligently to improve the accuracy and efficiency of these processes. Our system aims to enhance security measures and facilitate swift responses to potential threats by employing real-time object detection on live video feeds. The system can be further optimized through the integration of specialized hardware, making crime detection more robust and efficient.

Keywords: YOLO (You Only Look Once), OpenCV (Open-Source Computer Vision Library), NCRB (National Crime Records Bureau), ML (machine learning).

I. INTRODUCTION

Theft is a prevalent global crime that accounts for a significant portion of criminal offenses. As reported by the National Crime Records Bureau (NCRB), theft incidents make up approximately 80% of all reported crimes. The consequences of increasing theft rates are not only financial but also emotional for victims. This underscores the pressing need to develop a surveillance system that is user-friendly, minimizes false alarms, reduces human intervention, and is cost-effective.

Machine learning (ML) techniques offer a valuable avenue for the development of such efficient systems. These techniques can be instrumental in achieving various key objectives, including:

i) Detecting motion in otherwise static environments.
ii) Recognizing facial expressions and identifying individuals wearing masks using ML models.
iii) Detecting suspicious activities in the vicinity, including the presence of weapons, and promptly notifying relevant authorities.

It is worth noting that crimes, including theft, often exhibit patterns that can be predicted through the analysis of large volumes of data. These patterns, once identified, can be invaluable to law enforcement efforts. Unfortunately, in many instances, thefts go unreported due to societal pressures and other factors. Intelligent systems have the potential to swiftly detect theft incidents, thereby bypassing the need for individuals to report them and automatically alerting the appropriate authorities. This proactive approach can help curb manipulative activities associated with theft and enhance overall security.

II. RELATED WORK

1) The authors of this paper propose the use of machine learning (ML) models for real-time handgun identification in surveillance. Their approach uses a sliding window and a region-based technique to detect handguns. They find that the Faster Region-based Convolutional Neural Network (Faster R-CNN) provides faster, more precise, and more accurate results, achieving a precision rate of 84.21%, a 100% recall rate, and a higher true negative rate. To make security alarm decisions when detecting a firearm, they introduce the Alert Activation Time per Interval (AATpI) mechanism, which validates the presence of a handgun in the following frames before making a decision. This approach contributes to more accurate decisions regarding security alerts. Notable strengths of their project include real-time detection of weapons, testing on low-quality YouTube footage, and obtaining the predicted results. However, the project has some limitations, such as the inability to detect handguns in the background and faster-moving objects. Additionally, the system is designed to detect only handguns, limiting its scope to a specific type of firearm.

2) In this paper, the authors utilized the "Change of trajectory by theta angle" method, as proposed by W.
Kobanne, to detect suspicious motion. The method compares two theta angles, theta 1 and theta 2; if theta 1 is greater than theta 2, it triggers suspicious behavior detection through object tracking. The system they developed incorporates multiple levels of surveillance, each involving meticulous monitoring of actions within each video frame using machine learning models trained for their specific tasks. These models consider various parameters to assess whether a behavior qualifies as criminal. The decision-making process integrates the outputs of various sub-models, each with its own priority settings. These sub-models cover functions like mask detection, weapon detection, pose detection, and motion detection. By aggregating the outputs of these sub-models, the system can make informed decisions regarding potential criminal activity.

3) In their paper, the authors applied a Markov chain to estimate the probabilities associated with different types of crimes. They utilized a Transition Probability Matrix (TPM) to predict the probabilities of the next occurrences of crimes. The TPM consists of two key components: a vector representing the probabilities from the training dataset and a matrix that characterizes the Markov chain. In this context, the crime vector holds the probabilities found from the given dataset, and the matrix is built from these probabilities. They introduced the concept of a "crime growth factor," which quantifies how likely it is for one type of crime (e.g., Crime A) to occur on day d + 1 given that another type of crime (e.g., Crime B) occurred on day d. These crime growth factors are then converted into probability values. To calculate these factors, the authors divided each day into four parts and eventually merged them into a single matrix. Subsequently, they employed the Naïve Bayes algorithm to identify the primary hotspots, or locations where these crimes are most likely to occur, based on the calculated probabilities.

4) In their paper, the authors recognized the importance of having a substantial dataset of weapon images for training machine learning models. They collected these images manually from Google and organized them in a specific format, typically saved as ".jpg" files and stored within a folder named "images." They gathered a minimum of 50 images for each distinct weapon class to ensure a diverse and representative dataset. Before training, a preprocessing step was undertaken: all the collected images were resized to a uniform size of 416x416 pixels. Standardizing the image dimensions streamlines the subsequent processing of images in batches, making it more computationally efficient and consistent. The main goal of this data acquisition and preprocessing was to facilitate the training of machine learning models, particularly for the task of object detection. Object detection is a computer vision (CV) field dedicated to the identification and localization of objects within digital images or video frames. By preparing a comprehensive dataset and resizing images to a consistent size, the authors laid the foundation for effective object detection model training and subsequent analysis.

5) The authors of this project used the "UCF-Crime" dataset. This dataset contains 128 hours of surveillance video, made up of 1,900 lengthy surveillance videos showing different abnormalities such as accidents, shoplifting, robberies, and other events. The resulting model is able to detect crime without human involvement and alerts the police to speed up the response. They considered different pre-trained models, such as GoogLeNet and VGGNet-19, which have been well trained and can recognize objects with fewer mistakes. They chose the VGGNet-19 model due to its high accuracy. It can classify and recognize items in real time.

6) In their project, the authors plan to employ various object detection approaches, including Faster R-CNN, RetinaNet, Single Shot MultiBox Detector (SSD), and YOLO. Of these approaches, YOLO can give the maximum accuracy for detecting objects in the real world and is particularly suited to real-time scenarios.
YOLO utilizes neural networks to achieve object detection and incorporates several key techniques:

a) Residual Blocks: Residual blocks are a fundamental component in YOLO's neural network architecture. They help address the vanishing gradient problem and allow the training of deeper networks, which can capture complex features in the input data.

b) Bounding Box Regression: YOLO includes bounding box regression, a technique used to refine the locations of detected objects. This helps in accurately localizing objects within the image.

c) Intersection Over Union (IOU): IOU is a very important metric in object detection. It measures the overlap between predicted bounding boxes and ground truth boxes. YOLO uses IOU to evaluate the accuracy of object detection and to decide which detections to keep.

In YOLO, the bounding boxes are weighted based on probabilities assigned to different objects in the image. These probabilities are determined by the model during inference. The final weights are then used to determine which bounding boxes should be considered valid detections.
Each bounding box in YOLO is described by five values: the center coordinates of the bounding box (two values), its width, its height, and a confidence score. This representation makes YOLO well-suited for applications that require fast and robust object detection.
Overall, the use of YOLO and its associated techniques in this project aims to provide efficient and accurate object detection, particularly in real-time scenarios where rapid detection of objects is crucial.

7) In their paper [7], the authors present a CCTV surveillance system designed to automatically detect gestures
or signs of aggression and brutality in real time. This system is made of two main modules, each serving a distinct purpose:

a) Object Detection: The first module focuses on detecting objectionable objects like guns and knives. This is crucial for identifying potential threats within the surveillance footage.

b) Abnormal Human Activity Detection: The second module is designed to identify abnormal human activities, such as aggressive gestures or actions that may indicate violence. This module aims to recognize patterns of behavior that are not typical in a given context.

The primary objective of the system is to minimize the need for human intervention in monitoring CCTV feeds and to minimize false alarms. To achieve this, the system is designed to activate surveillance only when there is movement in a room. This approach not only conserves resources but also ensures privacy when surveillance is unnecessary.
Machine learning techniques are leveraged in this system, including the Faster R-CNN (Region-based Convolutional Neural Network) for detecting objects and optical flow for motion estimation. Faster R-CNN is employed for accurate and efficient object detection, while optical flow helps in tracking motion patterns.
When potential criminal activities or signs of aggression are detected, the system triggers alerts or buzzers to bring attention to the situation. Additionally, it can notify relevant authorities, such as law enforcement agencies, to facilitate a prompt response.
The authors also mention potential future enhancements to the system, including the incorporation of night vision capabilities using infrared image enhancement. This would further improve the system's effectiveness in low-light conditions, enhancing overall surveillance capabilities.

8) In research paper [8], the authors present a comprehensive comparative study of various machine learning algorithms for image recognition, with a particular focus on the following four algorithms: Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), Random Forests (RFs), and k-Nearest Neighbors (KNN).
The study evaluates these machine learning algorithms based on several critical criteria, including:

a) Accuracy: This criterion assesses how well each algorithm can correctly classify and recognize objects or patterns in images.
b) Computational Efficiency: It examines the computational resources required by each algorithm, including training and inference times.
c) Robustness: The study investigates how well the models perform under different conditions, i.e., with noisy images and fluctuations in image quality.

The paper also delves into the architecture and training process of Deep Belief Networks (DBNs) and Decision Trees (DTs). In particular, it highlights the interpretability and scalability of these methods.
The comparative results obtained from this study offer valuable insights for both researchers and practitioners who work on image recognition tasks. They help them make informed decisions when choosing the most suitable machine learning algorithm for their specific needs and requirements.
In summary, the research paper [8] provides a rigorous analysis of machine learning algorithms in the context of image recognition, aiding in the understanding of their strengths and weaknesses regarding accuracy, computational efficiency, and robustness.

9) In paper [9], the authors propose a method for the detection of larceny using a combination of Inception V3 and bidirectional long short-term memory (BiLSTM). Here is an overview of the key components and findings of the study:

a) Inception V3: Inception V3 is a CNN architecture noted for its effectiveness in capturing both local and global attributes. It employs various convolutional filters to capture details at different scales, making it suitable for detecting suspicious activities or objects in surveillance footage.
b) Bidirectional long short-term memory (BiLSTM): BiLSTM is a type of RNN that excels at modeling temporal dependencies in sequential data. It processes input sequences in both forward and backward directions, allowing it to capture both the preceding and the following context. This is particularly useful for analyzing sequences of video frames.
c) Method Combination: The proposed method combines the strengths of Inception V3 and BiLSTM. Inception V3 is used for feature extraction, while BiLSTM is employed for sequence analysis. This combination aims to provide comprehensive feature extraction and sequence modeling for shoplifting detection.
d) Dataset: The study uses a dataset called 'shoplift-23,' which consists of 900 videos categorized into two classes: shoplifting and non-shoplifting. Each video provides 90 frames for training input, resulting in a total of 81,000 frames. The problem is treated as a supervised classification task.
e) Model Evaluation: Several approaches, such as a 2D CNN, a 3D CNN, and the proposed model, are evaluated using an 80:20 random split for training and validation.
f) Results: The results demonstrate that the proposed method surpassed the baseline approaches in terms of accuracy, precision, recall, and F1-score. The proposed model achieves an accuracy of 82%, a precision of 88.80%, a recall of 78.40%, and an F1-score of 83.01%.
g) Reasons for Superior Performance: The paper discusses the reasons behind the superior performance of the proposed model, including its multi-scale processing, efficient use of parameters, regularization techniques, and the ability of BiLSTM to capture long-term dependencies in sequential video data.
In summary, the authors present a method that combines Inception V3 and BiLSTM for shoplifting detection, demonstrating its effectiveness in outperforming baseline methods on a dataset of surveillance videos. The method's
success is attributed to its feature extraction capabilities, sequence modeling, and various architectural optimizations.

III. METHODOLOGY

The steps of our system are listed below:

1. Acquiring real-time footage
2. Different techniques of our model
3. Deciding theft activity depending on our techniques

Fig 1: CNN Training Technique

A) Acquiring Real-time Footage:
To gather and convert video footage into useful data, an IP camera is utilized to capture the video feed. The camera is linked to a remote server or controller. After the video is recorded, it undergoes a processing step where it is broken down into individual image frames. These image frames are then prepared for analysis using machine learning models.

During data preprocessing, a unique vocabulary is generated, consisting of words that appear in all the dataset's image captions. This vocabulary is saved for reference in subsequent steps, and the data is prepared for further processing.

B) Data Acquisition and Analysis:
The image frames obtained from the video are processed and passed through various machine learning models. Each of these models performs specific tasks in a predefined sequence, focusing on different evaluation parameters. In this phase, the following machine learning models are utilized:

1) Mask Detection:
Detecting instances of robbery and identifying individuals wearing masks is a crucial application of computer vision and machine learning. To accomplish this, a machine learning model is trained to classify images or video frames into distinct categories, including "normal," "robbery," "mask," and "no mask." Transfer learning with pre-trained models like ResNet or MobileNet can be employed to expedite the training process. To specifically detect masks, object detection techniques like YOLO or Faster R-CNN (FR-CNN) are used to locate faces within the frames and then determine the presence of masks.

The mask detection process involves two phases:

a) In the initial phase, a face mask dataset is used to train the model to recognize face masks using Keras or TensorFlow. This classifier is serialized and saved to disk for future use.

b) In the second phase, the previously trained face mask classifier is loaded from disk. Faces are then detected in the image frames or video stream. A region of interest (ROI) is determined for each detected face, and the face mask classifier is applied to these ROIs to determine whether a mask is present or not.

2) Weapon Detection:
Weapon detection is a critical task in various security applications, and the k-nearest neighbors (KNN) machine learning algorithm can be used for classification in such scenarios. Here is an overview of how KNN can be used for weapon detection:

I) Step 1: Select the number of neighbors (K). The first step in using KNN is to decide the number of nearest neighbors (K) that will be considered when classifying new data points. This is typically chosen through experimentation and cross-validation to find the optimal K value for the specific problem.

II) Step 2: Calculate the Euclidean distance to the neighbors. For each new data point (e.g., an image frame in a video stream), calculate the Euclidean distance to every data point in the training dataset. The Euclidean distance measures the similarity or closeness between data points in a multidimensional space.
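As a rough illustration of the KNN procedure described in this subsection (choosing K, computing Euclidean distances, then selecting the nearest neighbors and taking a majority vote), the sketch below classifies one feature vector. It assumes each frame has already been reduced to a numeric feature vector, which the text does not specify. The two-dimensional vectors and the names `euclidean_distance`, `knn_classify`, and `TRAINING_DATA` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, training_data, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `training_data` is a list of (feature_vector, label) pairs.
    """
    if not 1 <= k <= len(training_data):
        raise ValueError("k must be between 1 and the number of training points")
    # Distance from the query to every labelled training point.
    distances = [
        (euclidean_distance(query, features), label)
        for features, label in training_data
    ]
    # Keep the k points with the smallest distances.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count labels among the neighbors and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, hand-made feature vectors (not real image features).
TRAINING_DATA = [
    ((0.9, 0.1), "Handgun"),
    ((0.8, 0.2), "Handgun"),
    ((0.1, 0.9), "Knife"),
    ((0.2, 0.8), "Knife"),
    ((0.1, 0.1), "No Weapon"),
    ((0.2, 0.1), "No Weapon"),
]

if __name__ == "__main__":
    print(knn_classify((0.85, 0.15), TRAINING_DATA, k=3))
```

In practice K would be chosen by cross-validation, as Step 1 notes, and the feature vectors would come from an image descriptor or a CNN embedding rather than hand-written numbers.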
III) Step 3: Find the K closest neighbors. Select the K data points with the smallest Euclidean distances. These are the K nearest neighbors.

IV) Step 4: Count the data points in every category. Among the K nearest neighbors, count the number of data points that belong to each class. In the context of weapon detection, the categories may be "No Weapon," "Handgun," and "Knife."

V) Step 5: Assign to the majority category. Finally, assign the new data point to the class that has the maximum count among the K nearest neighbors. In other words, the category with the most representatives among the neighbors is the predicted category for the new data point.

This process allows the KNN algorithm to classify new data points (e.g., video frames) into predefined categories based on the similarity between the new data and previously labeled data. For weapon detection, the KNN model can be trained to distinguish between different classes, such as the presence of a handgun, a knife, or no weapon in a given image or video frame.
It is worth noting that the success of KNN depends on selecting an appropriate K value and having a well-preprocessed and representative training dataset that includes various examples of the categories to be classified. Additionally, KNN is just one of many machine learning techniques that can be used for this purpose, and its effectiveness should be evaluated in comparison to other approaches to determine the best fit for the specific weapon detection task.

3) Object Detection:
Selecting an object detection framework or library depends on the specific requirements of the application. Here is a brief overview of the main options:

a) YOLO - Real-time Performance: YOLO is renowned for its high-precision object detection capabilities. It can process images or video frames very quickly, making it suitable for applications requiring low-latency responses, such as autonomous vehicles and real-time surveillance.

b) Faster R-CNN (FR-CNN) - Accuracy: FR-CNN is known for its accuracy in object detection. It typically achieves high precision and recall rates, making it suitable for applications where detecting objects with high precision is crucial, such as medical imaging or fine-grained object recognition.

c) SSD (Single Shot MultiBox Detector) - Speed and Accuracy Balance: SSD is designed to strike a balance between speed and accuracy. It offers relatively high-speed object detection while maintaining good accuracy. It is suitable for a variety of applications, such as robotics, retail, and surveillance.

d) Mask R-CNN - Instance Segmentation: Mask R-CNN extends Faster R-CNN by including instance segmentation, which not only detects objects but also segments them at the pixel level. This is useful in applications where precise object boundaries are required, such as medical imaging or computer vision research.

The choice of framework should consider factors like the real-time requirements of the application, the level of accuracy needed, and whether instance segmentation is essential. It is often good practice to experiment with multiple frameworks and fine-tune them to meet specific needs, as each framework has its strengths and trade-offs.

4) Motion Detection:
Here we use the "Change of trajectory by theta angle" method proposed by W. Kobanne, which is used for detecting suspicious motion. It consists of the following steps:
a) Initialization: a first rectangle enclosing the object. We manually draw a rectangle around the object in the first frame.
b) Interest points of the object are then extracted in the frame.
c) 'x' is the displacement from the starting position along the horizontal axis, and 'y' is the displacement from the starting position along the vertical axis.
d) After every ten frames, a mean displacement vector is determined, as well as the angle $\theta$ between two successive mean displacement vectors $\mathbf{a}$ and $\mathbf{b}$, obtained from $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos\theta$.

5) Theft Detection using CNN:
Real-time theft detection using Convolutional Neural Networks (CNNs) is a challenging but achievable task. Implementing a real-time theft detection system requires considering factors such as low latency, real-time video processing, and efficient use of computational resources.
a) We must design a CNN architecture that is suitable for real-time video frame analysis. We will consider employing lightweight models like MobileNet, SqueezeNet, or custom architectures optimized for real-time performance.
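One reason MobileNet-style models are considered lightweight is that they replace standard convolutions with depthwise separable ones, which need far fewer weights. The short worked example below compares the two weight counts for a single layer. The layer sizes and the function names are illustrative assumptions and do not describe our system's actual architecture.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard convolution layer (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one filter per input channel
    pointwise = in_channels * out_channels               # 1x1 convolution mixing channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 32 input channels, 64 output channels.
    print(standard_conv_params(3, 32, 64))
    print(depthwise_separable_params(3, 32, 64))
```

For this hypothetical 3x3 layer with 32 input and 64 output channels, the standard convolution uses 18,432 weights and the depthwise separable version uses 2,336, roughly an eightfold reduction.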
b) We will optimize our model for real-time processing by reducing its size, utilizing quantization, and leveraging hardware acceleration (e.g., GPUs or TPUs) if available.
c) CNNs are widely used for image and video analysis. They are effective for detecting theft in images and video frames, especially when combined with techniques for object detection and tracking.

6) Decision making to check theft activity:

Theft detection is a critical aspect of security and loss prevention. It is a multifaceted process that combines various elements, including technology, strategies, and vigilance. To effectively identify and mitigate theft-related incidents, individuals and organizations need to adopt a holistic approach.

This approach encompasses:

a) Implementing advanced surveillance systems, machine learning algorithms, and object detection methods to monitor and analyze security footage in real time.

b) Employee Training: Ensuring that employees are well-trained and aware of theft prevention and detection strategies. This includes recognizing suspicious behavior and following established protocols.

c) Physical Security Measures: Employing physical security measures such as access control systems, locks, alarms, and security personnel to deter theft and unauthorized access.

d) Data Analysis: Utilizing data analysis techniques to identify patterns and anomalies that may indicate theft or suspicious activities. Machine learning models and algorithms play a vital role in automating this process.

e) Alarm and Notification Systems: As mentioned, raising alarms to alert authorities or owners when potential theft is detected. This includes using the results of various algorithms and surveillance systems to trigger timely responses.

f) Continuous Improvement: Recognizing that theft detection is an ongoing effort, and continually refining strategies and technologies to adapt to evolving threats.

IV. CONCLUSION
In conclusion, theft detection and security challenges require a multi-faceted and adaptive approach. Utilizing advanced algorithms, machine learning, and continuous improvement efforts can enhance the effectiveness of security systems. The dynamic nature of security threats underscores the need for ongoing research, development, and vigilance in addressing these challenges.

V. REFERENCES
[1] Wen, C.Y., Chiu, S.H., Tseng, Y.R. and Lu, C.P. “The mask detection technology for occluded face analysis in the surveillance system”. Journal of Forensic Science, 50(3), pp.1-9, 2005.

[2] Suhr, J.K., Eum, S., Jung, H.G., Li, G., Kim, G. and Kim, J. “Recognizability assessment of facial images for automated teller machine applications”. Pattern Recognition, 45(5), pp.1899-1914, 2012.

[3] Min, R., d'Angelo, A. and Dugelay, J. “Efficient scarf detection prior to face recognition”. In 2010 18th European Signal Processing Conference (pp. 259-263). IEEE, August 2010.

[4] Kim, G., Suhr, J.K., Jung, H.G. and Kim, J. “Face occlusion detection by using B-spline active contour and skin color information”. In 2010 11th International Conference on Control Automation Robotics & Vision (pp. 627-632). IEEE, December 2010.

[5] Ba, S.O.; Odobez, J. “Recognizing Visual Focus of Attention From Head Pose in Natural Meetings”. IEEE Trans. Syst. Man, Cybern. Part B (Cybernetics) 2009, 39, 16–33.

[6] Nayak, N.M.; Sethi, R.J.; Song, B.; Roy-Chowdhury, A.K. “Modeling and Recognition of Complex Human Activities”. In Visual Analysis of Humans: Looking at People; Springer: London, UK, 2011; pp. 289–309.

[7] Rankin, S.; Cohen, N.; Maclennan-Brown, K.; Sage, K. “CCTV Operator Performance Benchmarking”. In Proceedings of the 2012 IEEE International Carnahan Conference on Security Technology (ICCST), Newton, MA, USA, 15–18 October 2012; pp. 325–330.

[8] Castillo, A., Tabik, S., Pérez, F., Olmos, R. and Herrera, F. “Brightness guided preprocessing for automatic cold steel weapon detection in surveillance videos with deep learning”. Neurocomputing, 330, pp.151-161, 2019.

[9] Dorogyy, Y., Kolisnichenko, V., & Levchenko, K. (2018, September). “Violent crime detection system”. In 2018 IEEE 13th International Scientific and Technical Conference on Computer Sciences and Information Technologies (CSIT) (Vol. 1, pp. 352-355). IEEE.

[10] Samuel, D. J., & Cuzzolin, F. (2021). SVD-GAN for Real-Time Unsupervised Video Anomaly Detection.

[11] S. Chackravarthy, S. Schmitt and L. Yang, "Intelligent Crime Anomaly Detection in Smart Cities Using Deep Learning," 2018 IEEE 4th International Conference on Collaboration and Internet Computing (CIC), 2018, pp. 399-404, doi: 10.1109/CIC.2018.00060

[12] Dr. S.V. Viraktamath, Rachita Byahatti, Madhuri Yavagal, “Object Detection and Classification using YOLOv3”, Dept. of Electronics and Communication Engineering, SDM College of Engineering and Technology, Dharwad, India. International Journal of Engineering Research & Technology (IJERT), ISSN: 2278-0181, IJERTV10IS020078, Vol. 10 Issue 02, February 2021.

[13] Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi, “You Only Look Once: Unified, Real-Time Object Detection”, University of Washington, Allen Institute for AI, Facebook AI Research.

[14] Tanvir Ahmad, Yinglong Ma, Muhammad Yahya, Belal Ahmad, Shah Nazir, and Amin ul Haq. Hindawi Scientific Programming Volume, Article ID 8403262, 2020.

[15] Y. Lee, T. Song, H. Kim, D. K. Hant, and H. Ko, “Hostile intent and behaviour detection in elevators”, in 4th International Conference on Imaging for Crime Detection and Prevention 2011 (ICDP 2011), pp. 1–6, London, 2011.

[16] M. Nakib, R. T. Khan, M. S. Hasan, and J. Uddin, “Crime scene prediction by detecting threatening objects using convolutional neural network”, International Conference on Computer, Communication, Chemical, Material and Electronic Engineering (IC4ME2), pp. 1–4, Rajshahi, Bangladesh, 2018.

[17] Toshev, A. and Szegedy, C., “DeepPose: Human pose estimation via deep neural networks”. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1653-1660), 2014.

[18] Xu, X., Tang, J., Zhang, X., Liu, X., Zhang, H. and Qiu, Y., “Exploring techniques for vision based human activity recognition: Methods, systems, and evaluation”. Sensors, 13(2), pp.1635-1650, 2013.

[19] El Maadi, A. and Djouadi, M.S., “Suspicious motion patterns detection and tracking in crowded scenes”. In 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR) (pp. 1-6). IEEE, October 2013.

[20] Airfares, W., Kobbane, A. and Krioula, A., “Suspicious behavior detection of people by monitoring cameras”. In 2016 5th International Conference on Multimedia Computing and Systems (ICMCS) (pp. 113-117). IEEE, September 2016.

[21] Pérez-Hernández, F., Tabik, S., Lamas, A., Olmos, R., Fujita, H. and Herrera, F., “Object detection binary classifiers methodology based on deep learning to identify small objects handled similarly: Application in video surveillance”, Knowledge-Based Systems, 194, p.105590, 2020.

[22] Simo-Serra, E., Ramisa, A., Alenyà, G., Torras, C. and Moreno-Noguer, F., “Single image 3D human pose estimation from noisy observations”. In 2012 IEEE Conference on Computer Vision and Pattern Recognition (pp. 2673-2680). IEEE, June 2012.

[23] Munea, T.L., Jembre, Y.Z., Weldegebriel, H.T., Chen, L., Huang, C. and Yang, C., “The progress of human pose estimation: a survey and taxonomy of models applied in 2D human pose estimation”, IEEE Access, 8, pp.133330-133348, 2020.

[24] Zhang, L.; Zhu, G.; Shen, P.; Song, J. “Learning Spatiotemporal Features Using 3DCNN and Convolutional LSTM for Gesture Recognition”. In Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops (ICCVW), Venice, Italy, 22–29 October 2017; pp. 3120–3128.

[25] Ogwueleka, F.N.; Misra, S.; Colomo-Palacios, R.; Fernandez, L. “Neural Network and Classification Approach in Identifying Customer Behavior in the Banking Sector: A Case Study of an International Bank”. Hum. Factors Ergon. Manuf. Serv. Ind. 25, 28–42, 2015.
