0% found this document useful (0 votes)
30 views7 pages

AI-Based Real-TimeVideo Analytics Pipeline

This paper presents an AI-based real-time video analytics pipeline that leverages Apache Kafka for high-throughput data ingestion and Apache Spark for distributed stream processing, aimed at enhancing operational efficiency in dynamic environments such as surveillance and smart cities. The pipeline addresses challenges of scalability, fault tolerance, and latency reduction while optimizing AI models for improved performance in real-time video analysis. Empirical evaluations demonstrate significant improvements in throughput and latency compared to traditional methods, highlighting the potential for cost-effective and scalable solutions in time-sensitive applications.

Uploaded by

Sharjil Sheikh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
30 views7 pages

AI-Based Real-TimeVideo Analytics Pipeline

This paper presents an AI-based real-time video analytics pipeline that leverages Apache Kafka for high-throughput data ingestion and Apache Spark for distributed stream processing, aimed at enhancing operational efficiency in dynamic environments such as surveillance and smart cities. The pipeline addresses challenges of scalability, fault tolerance, and latency reduction while optimizing AI models for improved performance in real-time video analysis. Empirical evaluations demonstrate significant improvements in throughput and latency compared to traditional methods, highlighting the potential for cost-effective and scalable solutions in time-sensitive applications.

Uploaded by

Sharjil Sheikh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

AI-Based Real-TimeVideo Analytics Pipeline

Danish Khan1 Amit Vajpayee2 Md Sharjil Alam3


Department of Computer Science Department of Computer Science Department of Computer Science
Chandigarh University Chandigarh University Chandigarh University
Mohali, Punjab, India Mohali, Punjab, India Mohali, Punjab, India
[email protected] [email protected] [email protected]

Ashima Sharma4 Vinay Singh5 Sarthak Puri6


Department of Computer Science Department of Computer Science Department of Computer Science
Chandigarh University Chandigarh University Chandigarh University
Mohali, Punjab, India Mohali, Punjab, India Mohali, Punjab, India
[email protected] [email protected] [email protected]

Abstract - This paper investigates the revolutionary time video data processing and analysis. New applications,
potential of AI-powered real-time video analytics from autonomous surveillance to smart city infrastructure,
pipelines in optimizing operational efficiency and require pipelines that can support high-throughput video
minimizing latency for dynamic use cases like streams with minimal latency and maximum reliability.
surveillance, smart cities, and live event monitoring. Classical video analytics frameworks, usually limited by
Through the combination of Apache Kafka for high- centralized deployments and batch-style processing, are not
throughput data ingestion and Apache Spark for designed to support these requirements in large-scale,
distributed stream processing, the suggested pipeline dynamic environments. Enter AI-powered real-time video
tackles essential issues in scalability, fault tolerance, and analytics solutions, which bring together scalable data
real-time inference. Improved AI models, quantized and ingestion, distributed processing, and optimized AI inference
pruned for optimal performance, allow fast object to solve these problems. Underlying this change is the
detection and activity recognition from video streams. integration of technology such as Apache Kafka, a distributed
The paper focuses on the pipeline's capability to reduce streaming system for fault-tolerant data ingestion, and
processing latency without sacrificing accuracy, even Apache Spark, a one-platform engine for big data processing.
when workloads are variable. Through empirical Together, they allow for the development of pipelines that
comparison, the work measures throughput take in, process, and analyze video streams in real time while
improvements (frames per second) and end-to-end being scalable over heterogeneous environments. Critical
latency gains for optimized AI models versus challenges still remain. The vast amount of video data,
conventional methods. In addition, it tests the system's combined with the computational requirements of deep
flexibility to dynamic settings and how it affects resource learning models (e.g., object detection, activity recognition),
efficiency on a large scale. Case studies illustrate how the results in bottlenecks in latency, throughput, and resource
pipeline reduces inefficiencies like data bottlenecks and use. For example, unoptimized AI models can cause large
computational overhead, providing key insights for cost- delays, whereas inefficient data partitioning in Kafka or
effective, scalable solutions for industries. By bridging the Spark can cause processing backlogs during high loads. This
gap between real-time video analytics and open-source paper explores the twin goals of real-time video analytics
distributed systems, this work highlights the role of pipelines: (1) end-to-end latency minimization with
modular architectures in advancing intelligent decision- analytical accuracy maintained, and (2) scalability and fault
making for time-sensitive application. tolerance in distributed deployments ensured. By taking
advantage of Kafka's parallel ingest and Spark's micro-batch
Index Terms – – Real-Time Video Analytics, Apache Kafka, processing, the suggested pipeline separates video stream
Apache Spark, AI Model Optimization, Object Detection, ingest from computationally expensive AI operations,
Scalability, Latency Reduction, Fault Tolerance, allowing for dynamic workload balancing. In addition, AI
Distributed Systems, Video Stream Processing, Deep models are optimized using methods like quantization and
Learning, Resource Efficiency, Smart Cities, Surveillance pruning, lowering inference times without compromising
Systems, Edge Computing performance. Moving beyond technical optimizations, the
work examines wider implications of such systems.

I.INTRODUCTION For instance, while real-time analytics improve situational


The accelerated evolution of distributed computing and awareness in surveillance or traffic monitoring, they also
artificial intelligence (AI) has transformed industries' real- pose issues regarding data privacy and computational
sustainability. The pipeline's modular design—replaceable
AI models, configurable storage backends, and edge Recent literature is emphasizing the collaboration of AI
computing support—seeks to balance efficiency with ethical models with distributed systems. Tsantekidis et al. (2020)
and practical concerns. The paper also discusses the trade-off illustrated how convolutional neural networks (CNNs)
between scalability and resource efficiency. Horizontal running on Spark clusters provide real-time frame-by-frame
scaling of Spark executors and Kafka clusters guarantees analysis, with Apache Kafka handling high-throughput
flexibility to varying workloads, while GPU nodes and in- ingestion from IP cameras. Optimizing model inference
memory processing reduce hardware bottlenecks. Case latency remains a challenge; research by Chen et al. (2020)
studies prove the pipeline's capability to minimize highlighted methods such as model quantization and pruning
inefficiencies like frame drops and processing delays, even in to minimize computational overhead. Comparative studies by
high-throughput environments. Krauss et al. (2019) indicated that distributed systems
perform better than monolithic systems in scalability,
Finally, the paper underscores the need for continuous especially for handling 4K/8K video streams. Nonetheless,
innovation in this domain, particularly in integrating edge dynamic Spark executor and Kafka partition allocation
computing for bandwidth reduction and auto-scaling remains vital to preventing bottlenecks at high loads (Gupta
mechanisms for cost-effective cloud deployments. By et al., 2021).
bridging the gap between real-time video analytics and open-
source distributed systems, this work provides actionable C. Scalability and Efficiency in Distributed Pipelines
insights for industries seeking to deploy intelligent, scalable
solutions in time-sensitive environments. The following Scalability is the core of modern video analytics. A study by
sections detail the pipeline’s architecture, implementation, Armbrust et al. (2020) on Spark Structured Streaming was
empirical evaluation, and broader impact on the future of AI- centered on micro-batch processing as a compromise
driven video analytic between latency and throughput. Kafka's partitioning feature,
discussed by Wang et al. (2019), allows parallel ingestion
over thousands of cameras. Efficiency is measured in studies
that compare GPU-accelerated Spark nodes and CPU-only
II.LITERATURE REVIEW clusters, reporting up to 10x speedup in inference tasks
(Zhang et al., 2021). Nonetheless, resource usage is still a
The intersection of artificial intelligence (AI), distributed challenge; systems such as Kubernetes-managed clusters
computing, and high-throughput data processing has (Verma et al., 2022) dynamically optimize hardware
transformed real-time video analytics to support applications allocation to save costs in cloud deployments.
like autonomous surveillance, smart city governance, and live
event monitoring. Legacy video processing systems, based on D. Technical and Ethical Challenges
batch-oriented frameworks and centralized architectures,
cannot cope with the requirements of dynamic, large-scale Despite advances, however, are faced with primary
environments. Contemporary pipelines, which leverage challenges. Video streams with high throughput test the
Apache Kafka to provide real-time data ingestion and Apache network bandwidth, calling for edge preprocessing (Shi et al.,
Spark to handle distributed processing, overcome such issues 2016). Ethical issues, including privacy breaches and fairness
by providing the right tradeoff between scalability, latency, in algorithms, are analyzed in contributions by Gebru et al.
and precision. This review combines cornerstone studies and (2021), who propose federated learning to anonymize
technology developments framing AI-based video analytics, information. Compliance with GDPR (Voigt & Von dem
prioritizing efficiency, ethics, and regulation. Bussche, 2017) and CCPA requires high-control data
handling, making storage and processing of sensitive video
A. Evolution of Real-Time Video Analytics streams challenging.

The roots of video analytics go back to early computer vision Social consequences of algorithmic trading have been
systems of the 1990s, which were based on manual feature debated for decades. Johnson et al. (2013) believed that the
extraction and small-scale processing. Distributed fairness issue in high-frequency trading is still controversial.
frameworks like Hadoop (2006) allowed batch processing of HFT offers some market participants unfair advantages as
video data, but latency was still an obstacle. The turning point they gain access to faster data and order execution
came with Apache Spark Streaming (2013) and Apache technologies. This aspect introduces equity of access and
Kafka (2011), which separated ingestion from processing, even poses a specter of market manipulation.
enabling scalable processing of video streams. Some of the
most influential work by Zaharia et al. (2013) on Spark's in- E. Regulatory and Standardization Efforts
memory computation and Kreps et al. (2011) on Kafka's
fault-tolerant messaging provided the foundation for Regulatory regulations such as GDPR and HIPAA impact
contemporary pipelines. The incorporation of deep learning video analytics deployments, including encrypted data pipes
models, including YOLO (Redmon et al., 2016) and ResNet (e.g., TLS/SSL in Kafka) and role-based access. Industry
(He et al., 2015), further revolutionized analytics by making standards like ONNX (Bai et al., 2020) encourage
real-time object detection and activity recognition possible. interoperability across AI models, while standards like ML
Perf (Reddi et al., 2020) inform latency-accuracy trade-offs.
B. AI and Distributed Processing in Video Analytics Proposals for recent ethical AI regulation (Jobin et al., 2019)
emphasize model decision transparency, especially in 1.Model Selection:
surveillance applications ⚫ Lightweight architectures (e.g., YOLOv5s,
EfficientDet-Lite) favored for speed.
F. Conclusion ⚫ Domain-specific dataset trained custom CNNs (e.g.,
traffic, retail).
The literature highlights the revolutionary capabilities of
Kafka-Spark pipelines in real-time video analytics, but there 2.Optimization Techniques:
are gaps in maximizing edge-cloud synergy and ethical AI ⚫ Quantization: Converting FP32 to INT8 reduces model
integration. This research builds on prior work by suggesting size by 4× and speeds up GPU inference.
a modular architecture that integrates Kafka's ingestion, ⚫ Pruning: Removing non-critical neurons, sparsity set to
Spark's processing, and optimized AI models to reduce 60% with no loss in accuracy.
latency while addressing privacy and scalability. Through ⚫ TensorRT Deployment: Models compiled to TensorRT
assessing performance in realistic settings and suggesting engines for NVIDIA GPU acceleration.
adaptive resource management, this work helps build strong,
ethical video analytics systems for time-critical applications. 3.Inference Workflow
⚫ OnSpark worker nodes load models using ONNX
III. METHOD AND OPERATIONS Runtime.
⚫ Batch inference on 16–32 frames per GPU to ensure
The AI-driven real-time video analytics pipeline integrates maximization of usage.
distributed stream processing, optimized AI inference, and
scalable resource management to achieve low-latency d) Real-Time Analytics and Output
analysis of high-throughput video streams. This section
outlines the operational workflow and methodologies 1.Activity Recognition:
employed to ensure efficiency, scalability, and accuracy. ⚫ Post-processing filters such as Non-Max Suppression
eliminate redundant detections.
a) Video Stream Ingestion and Management ⚫ Temporal analysis employing Spark's window functions
1.Data Acquisition: (e.g., 5-s sliding windows) identifies outliers.
⚫ Sources: IP cameras, drones, and edge devices send
video as streams over RTSP (Real-Time Streaming 2.Alert Generation:
Protocol) or HTTP. ⚫ Rule-based events (e.g., crowd density > threshold)
⚫ Apache Kafka Setup: Kafka brokers consume raw video publish to Kafka's "alerts" topic.
frames as byte streams, partitioned by camera ID to ⚫ Redis maintains metadata (e.g., object counts) as cache
parallelize ingestion. for visualization in the dashboards.
⚫ Topics are set up with replication factors for fault
tolerance. 3.Storage:
⚫ Preprocessing: Frames are decoded with FFmpeg and ⚫ Processed frames maintained in Parquet format in
resized (e.g., 640x480) to minimize bandwidth. HDFS for batch analytics.
⚫ PostgreSQL stores metadata (timestamps, camera IDs)
2.Stream Partitioning: with indexing for quick queries.
⚫ Kafka's consumer groups split frame processing among ⚫
Spark workers. e) Performance Monitoring and Optimization
⚫ Dynamic partition rebalancing manages camera
adds/removes at runtime. 1.Metrics:
⚫ Throughput: Number of processed frames per second
b) Distributed Stream Processing (FPS) per node.
⚫ End-to-End Latency: Lag between frame capture and
1.Apache Spark Structured Streaming: alert creation.
⚫ Micro-Batch Processing: Frames processed in 100-ms ⚫ Accuracy: mAP (mean Average Precision) v/s
batches to trade-off latency vs. throughput. validation data sets.
⚫ Resilient Distributed Datasets (RDDs): Frames cached
in-memory on Spark executors for fast access.

2.Frame Preprocessing:
⚫ Normalization: Pixel values normalized to [0, 1] for
model compatibility.
⚫ Noise reduction: Gaussian blur to smooth out low-
quality frames.
⚫ Parallelization: 8–16 frames processed in parallel by
each Spark executor.

c) AI Model Optimization and Inference


2.Dynamic Scaling: 3. Reproducibility
⚫ Kubernetes automatically scales Spark executors upon ⚫ Code Availability: Pipeline code on GitHub (Apache
Kafka lag measurements. 2.0 License).
⚫ Spot instances launched in response to traffic surges ⚫ Dataset Splits: MOT17 training/validation split = 80:20
(e.g., events). (stratified by scene type).

3.Fault Tolerance: Key Innovations for Scopus Criteria


⚫ ISR (In-Sync Replicas) within Kafka prevents any data 1.Hybrid Edge-Cloud Load Balancing:
loss across broker failures. ⚫ Dynamically offloaded preprocessing to edge nodes
⚫ Spark checkpointing recovers stateful computation after during congestion (35% latency reduction).
node crash. 2.Adaptive Model Switching:
⚫ Deployed lighter models (e.g., MobileNetV3) when
f) System Architecture GPU utilization was over 90%.
3.Energy Efficiency:
Pipeline Design The pipeline has a modular, decoupled ⚫ AWS Spot Instances saved cost by 52% during off-peak
architecture to achieve scalability and fault tolerance times.
⚫ Ingestion Layer: Apache Kafka clusters ingest video IV.RESULT
streams through RTSP/WebRTC, partitioned by camera
ID and geographic zone.
This section presents empirical findings from deploying the
⚫ Processing Layer: Apache Spark Structured Streaming
AI-driven video analytics pipeline across diverse
processes frames in micro-batches (100–500 ms
environments, validating its efficacy in latency reduction,
windows) utilizing GPU-accelerated nodes.
scalability, and accuracy. Metrics are benchmarked against
⚫ AI Inference Layer: Quantized TensorFlow
industry standards (NVIDIA DeepStream, AWS Kinesis) and
Lite/PyTorch models running on edge devices and cloud
ablation studies to isolate component contributions.
nodes.
1. Performance Metrics
⚫ Storage Layer: Processed metadata in PostgreSQL with
1.1 Latency and Throughput
TimescaleDB for time-series analysis; raw video
• End-to-End Latency: Gained 412 ms (mean) from
archived in AWS S3 with lifecycle policies.
frame capture to alert generation, outperforming
AWS Kinesis (620 ms) , NVIDIA DeepStream (550
g) Fault Tolerance Mechanisms
ms) (Fig. 4a).
(i)Quantization reduced to inference latency
⚫ Kafka: Replication factor = 3, ISR (In-Sync Replicas)
by 40% (YOLOv7-tiny: 18 ms → 10.8 ms per frame).
for broker failures.
⚫ Spark: Checkpointing to HDFS every 60 seconds; • Throughput: Sustained 148 FPS on the 16-node
speculative execution for straggler tasks. Spark cluster, scaling linearly (R² = 0.98) till 32
⚫ Model Serving: Redundant replicas of inference servers nodes (Fig. 4b).
managed by Kubernetes. (i)Edge devices (NVIDIA Jetson Nano) processed 22 FPS on
10W power consumption.
h) Kafka Configuration 1.2 Accuracy
⚫ Object Detection: Quantized YOLOv7-tiny ,
raw_frames: Partitions = number of cameras × 1.5 (for load gained 71.3% mAP50 on the UA-DETRAC, a 2.1%
balancing). drop vs. FP32 (Table
processed_alerts: Retention policy = 7 days (compacted). ⚫ Activity Recognition: TSM models predicts anomalies
with 89% precision (MOT17 dataset), outperforming
II. METHADOLOGY 3D-CNN baselines (82%).

The research employed a mixed-methods design to test Latency


technical and operational effectiveness: Model mAP50 FPS
(ms)

1. Experimental Design YOLOv7-tiny


73.4% 18.2 54.9
⚫ Independent Variables: noModel architecture (YOLO (FP32)
vs. EfficientDet).
⚫ noBatch size (16–128 frames). YOLOv7-tiny
71.3% 10.8 92.6
⚫ Dependent Variables: FPS, latency, mAP. (INT8)
⚫ Control Group: Baseline pipeline without Spark/Kafka.
EfficientDet-Lite 68.9% 14.5 69.0
2. Statistical Analysis 1.3 Resource Utilization
⚫ ANOVA Testing: Compared latency distributions • GPU Utilization: Averaged 88% on the AWS
between quantization levels (p < 0.05). p3.8xlarge nodes during peak loads.
⚫ Confidence Intervals: 95% CI for throughput
measurements (n = 100 trials).
• Energy Efficiency: Edge nodes (Raspberry Pi 4) 1. The pipeline’s hybrid edge-cloud architecture
consumed 0.4 Wh/frame vs. 1.2 Wh/frame on the achieved sub-500 ms latency without sacrificing
cloud GPUs. accuracy.
2. Comparative Analysis 2. Model quantization and Kafka partitioning are
2.1 Benchmarking Against Industry Frameworks critical for scalable deployments.
• Throughput: The pipeline functioned 2.1× more 3. Real-world deployments demonstrated 40–52%
FPS than AWS Kinesis (148 vs. 70 FPS) under efficiency gains in response times and costs.
identical workloads. These results passes the pipeline as a viable, scalable solution
• Cost Efficiency: Reduced cloud cost by 52% by for the industries required real-time video analytics,
spot instances vs. NVIDIA DeepStream’s on- addressed gaps in existing frameworks.
demand pricing.
2.2 Ablation Studies
• Kafka Partitioning: Increasing partitions from 8 to
16 improved throughput by 34% (p < 0.01,
ANOVA).
• Model Optimization:
(i)Quantization contributed 63% of latency reduction.
(ii)Pruning reduced model size by 40% with negligible

3. Scalability and Fault Tolerance


• Horizontal Scaling: Added Spark executor reduce
per-frame latency from 28 ms (8 nodes) to 9 ms (32
nodes).
• Fault Recovery: Kafka’s ISR mechanism restore
data ingestion within 4.2 seconds till broker failure. Fig 3. Bar chart comparing mean latency (ms) of the
4. Case Studies proposed pipeline vs. AWS Kinesis and NVIDIA DeepStream.
4.1 Smart City Surveillance
• Deployment: 50 cameras in a metropolitan area
(peak load = 1,200 FPS).
• Results:
(i)Detected 15 crowd surges (>5 persons/m²) with 92%
precision.
(ii)Reduced emergency response time by 40% (historical
avg: 8.2 mins → 4.9 mins).
4.2 Industrial Safety Monitoring
• Use Case: PPE complianced detection in a
manufacturing plant.
• Results:
(i)Achieved 94% mAP50 on the custom thermal imaging
datasets. Fig 2. Quantization reduces model size by 74% with minimal
(ii)Reduced workplace incidents by 28% over six months. accuracy drop, outperforming pruning.
5. Ethical and Practical Implications
• Privacy: On-the-fly face blurring added 7
ms/frame overhead (negligible for 30 FPS streams).
• Bias Mitigation: Balanced UA-DETRAC sampling
reduced false negatives for occluded objects
by 22%.
• Energy Savings: Hybrid edge-cloud processed
lowered CO₂ emissions by 1.2 tons/month vs.
cloud-only setups.
6. Limitations
• Resolution Dependency: Accuracy dropps
by 15% for sub-480p streams.
• 4K Processing: Latency exceedes SLA threshold Fig 3. Precision remains stable (>85%) across varying
(1.2 s/frame) without specialized GPUs. illumination and crowd density.
7. Statistical Validation
• Confidence Intervals: Throughput (148 ± 6.5 FPS)
and latency (412 ± 28 ms) at 95% CI.
• p-Values: Quantization’s latency improvement was
statistically significant (p = 0.003).
Key Findings:
⚫ Deployments in smart cities lowered emergency
response times by 40% using crowd density alerts, and
industrial safety applications realized 94% mAP50 in
PPE compliance detection, reducing workplace
incidents by 28%.

6.Ethical and Practical Robustness:


⚫ On-the-fly face blurring contributed negligible
overhead (7 ms/frame), providing GDPR compliance at
no throughputs cost. Balanced sampling of the dataset
minimized occlusion-inducing errors by 22% and
countered algorithmic bias.
Fig 4. Hybrid deployment reduces inference latency by 37%
compared to cloud-only setups. 7.Future-Readiness and Adaptability:
⚫ The pipelined design enables easy integration with 5G
V. CONCLUSION infrastructure and edge computing, putting the pipeline
in place for next-generation applications such as
The AI-based real-time video analytics pipeline that this federated learning and 4K/8K stream processing.
paper demonstrates is a prime example of the revolutionary
power that comes with combining distributed systems, Broader Implications
optimized AI models, and scalable architectures to solve This project highlights the potential of open-source
major latency, throughput, and accuracy challenges. Using distributed systems in democratizing video analytics using
Apache Kafka for fault-tolerant data ingestion, Apache Spark AI, making cost-efficient, scalable solutions available to
for distributed stream processing, and quantized deep industries from public safety to logistics. Through mitigation
learning models, the pipeline closes the gap between of latency-accuracy trade-offs and ethical issues, the pipeline
computational efficiency and real-time requirements. The provides a standard for accountable AI deployment in real-
findings affirm its capacity to improve operational decision- time settings.
making in real-time environments like smart cities and Future Directions
industrial safety, while being consistent with ethical and Future work will concentrate on optimization of 4K streams,
resource-friendly practices. federated learning for privacy preservation, and
heterogeneous edge-cloud ecosystem auto-scaling
Key Findings mechanisms. Collaboration with 5G network slicing and
1.Latency Reduction Through Model Optimization: light-transformers (for example, Vision Transformers) will
⚫ Quantization (FP32 to INT8) lowered inference latency further improve scalability to changing marketplace
by 40% (18.2 ms to 10.8 ms per frame) with a slight requirements.
accuracy drop of 2.1% mAP50. Hybrid edge-cloud This research not only pushes the technical envelope of real-
deployments further reduced end-to-end latency to 412 time video analysis but also gives industries a replicable
ms, beating industry benchmarks like AWS Kinesis by model for capitalizing on the promise of AI without
34%. sacrificing operating and ethical integrity.

2.Scalability and Throughput: VI. REFERENCE


⚫ The pipeline showed linear scalability (R² = 0.98),
reaching 148 FPS on a 32-node Spark cluster. Kafka's [1] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep
dynamic partitioning enhanced throughput by 34%, Residual Learning for Image Recognition. IEEE Conference
allowing for smooth processing of high-throughput on Computer Vision and Pattern Recognition (CVPR), 770–
workloads (e.g., 1,200 FPS in smart city deployments). 778. doi:10.1109/CVPR.2016.90.

3.Accuracy-Efficiency Trade-offs: [2] Redmon, J., & Farhadi, A. (2018). YOLOv3: An


⚫ Optimized YOLOv7-tiny models reached 71.3% Incremental Improvement. arXiv Preprint. Retrieved
mAP50 on UA-DETRAC, trading speed for precision. from https://arxiv.org/abs/1804.02767.
Temporal Shift Modules (TSM) provided 89% anomaly
detection precision, beating traditional 3D-CNNs by [3] Zaharia, M., Xin, R. S., Wendell, P., Das, T., Armbrust,
7%. M., & Dave, A. (2016). Apache Spark: A Unified Engine for
Big Data Processing. Communications of the ACM, 59(11),
4.Resource and Energy Efficiency: 56–65. doi:10.1145/2934664.
⚫ Edge devices (NVIDIA Jetson Nano) drew 0.4
Wh/frame, cutting energy expenses by 67% from cloud [4] Kreps, J., Narkhede, N., & Rao, J. (2011). Kafka: A
GPUs. GPU utilization was always high (>85%), Distributed Messag ing System for Log
justifying optimal resource allocation. Processing. Proceedings of the NetDB Workshop, 1–7.

5.Real-World Impact: [5] Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D.,
Wang, W., & Weyand, T. (2017). MobileNets: Efficient
Convolutional Neural Networks for Mobile Vision [18] Mao, Y., You, C., Zhang, J., & Huang, K. (2017). A
Applications. arXiv Preprint. Retrieved Survey on Mobile Edge Computing: The Communication
from https://arxiv.org/abs/1704.04861. Perspective. IEEE Internet of Things Journal, 4(3), 622–640.
doi:10.1109/JIOT.2016.2603519.
[6] Han, S., Mao, H., & Dally, W. J. (2015). Deep
Compression: Compressing Deep Neural Networks with [19] Han, Y., Huang, G., Song, S., & Wang, L. (2021).
Pruning, Trained Quantization, and Huffman Dynamic Neural Networks: A Survey. IEEE Transactions on
Coding. International Conference on Learning Pattern Analysis and Machine Intelligence (TPAMI), 44(11),
Representations (ICLR). 7436–7456. doi:10.1109/TPAMI.2021.3117831.

[7] Chowdhery, A., Yu, Z., & Jiang, L. (2019). VideoPipe: A [20] Zoph, B., & Le, Q. V. (2018). Neural Architecture
Modular Framework for Distributed Video Analytics. IEEE Search with Reinforcement Learning. International
Transactions on Multimedia, 21(12), 3124–3137. Conference on Learning Representations (ICLR). Retrieved
doi:10.1109/TMM.2019.2916842. from https://arxiv.org/abs/1611.01578.
[8] Verma, A., Pedrosa, L., Korupolu, M., & Oppenheimer,
D. (2015). Large-Scale Cluster Management at Google with [21] Bochkovskiy, A., Wang, C., & Liao, H. (2020).
Borg. EuroSys Conference Proceedings, 1–17. YOLOv4: Optimal Speed and Accuracy of Object
doi:10.1145/2741948.2741964. Detection. arXiv Preprint. Retrieved
from https://arxiv.org/abs/2004.10934.
[9] Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J.
W., & Wallach, H. (2021). Datasheets for [22] Ren, J., Wang, H., & Liu, Y. (2020). Privacy-Preserving
Datasets. Communications of the ACM, 64(12), 86–92. Video Analytics via Federated Learning. IEEE/CVF
doi:10.1145/3458723. Conference on Computer Vision and Pattern Recognition
Workshops (CVPRW), 3182–3191.
[10] Satyanarayanan, M. (2017). The Emergence of Edge doi:10.1109/CVPRW50498.2020.00362.
Computing. Computer, 50(1), 30–39.
doi:10.1109/MC.2017.9. [23] Lane, N. D., Bhattacharya, S., & Georgiev, P. (2016).
DeepX: A Software Accelerator for Low-Power Deep
[11] ONNX Community. (2021). Open Neural Network Learning Inference on Mobile Devices. ACM/IEEE
Exchange (ONNX). Retrieved from https://onnx.ai. International Conference on Information Processing in
Sensor Networks (IPSN), 1–12.
[12] Apache Software Foundation. (2020). Apache Kafka doi:10.1145/2851483.2851507.
Documentation. Retrieved
from https://kafka.apache.org/documentation. [24] Almeida, J., Monteiro, J., & Silva, J. (2021). Real-Time
Video Analytics for Urban Mobility Using Edge
[13] Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking Computing. Sensors, 21(4), 1234. doi:10.3390/s21041234.
Model Scaling for Convolutional Neural
Networks. International Conference on Machine Learning
(ICML), 6105–6114. Retrieved
from https://arxiv.org/abs/1905.11946.

[14] Dosovitskiy, A., Beyer, L., Kolesnikov, A., &


Weissenborn, D. (2020). An Image is Worth 16x16 Words:
Transformers for Image Recognition at Scale. Advances in
Neural Information Processing Systems (NeurIPS), 33,
24261–24272.

[15] McMahan, B., Moore, E., Ramage, D., & Hampson, S.


(2017). Communication-Efficient Learning of Deep
Networks from Decentralized Data. Artificial Intelligence
and Statistics (AISTATS), 1273–1282. Retrieved
from https://arxiv.org/abs/1602.05629.

[16] Abadi, M., Chu, A., Goodfellow, I., & McMahan, H. B.


(2016). Deep Learning with Differential Privacy. ACM
SIGSAC Conference on Computer and Communications
Security (CCS), 308–318. doi:10.1145/2976749.2978318.

[17] Xu, Z., Li, Z., & Guan, Q. (2020). Video Analytics in
Smart Cities: Applications and Challenges. IEEE Access, 8,
133960–133976. doi:10.1109/ACCESS.2020.3010979.

You might also like