Hand Tracking Using MediaPipe
Rohit Prasad, Anirudha Sawant, Chinmaay Sharma, Siddhesh Navghare,
Beatrice S
*Department of Computer Engineering, Xavier Institute of Engineering, Mahim, Mumbai 400016, India
Abstract
The fusion of computer vision and the Unity game engine offers promising opportunities for advancing 3D hand gesture recognition, enabling immersive interactions in a variety of applications. The approach involves real-time hand detection and precise 2D landmark localization from a webcam feed, followed by mapping of these landmarks to 3D space within the Unity environment. Interactive elements and visual feedback are incorporated to enhance the system's usability. The system enables accurate hand positioning and manipulation of a 3D hand model in Unity while accounting for depth perception. Realistic rendering and immersive display of the 3D hand model are achieved, showcasing potential applications in gaming, education, and virtual reality. The developed system demonstrates accurate, responsive tracking that maps real-world hand gestures onto a virtual hand model in real time.
1 Introduction
The integration of 3D hand gesture recognition with computer vision and the Unity game engine marks a
significant leap in intuitive human-computer interaction. However, achieving seamless harmony between
these technologies poses a key challenge. This paper addresses this challenge by presenting a unified
framework that amalgamates computer vision methodologies with Unity's capabilities for precise 3D hand
gesture recognition. The primary goal is to enable real-time hand detection, accurate 2D landmark
localization, and seamless mapping of these landmarks into Unity's 3D space. By intricately blending
computer vision techniques and Unity's environment, the framework allows for the creation and
manipulation of a realistic 3D hand model while ensuring considerations for depth perception and responsive
gesture-driven interactions. This integration of interactive elements within Unity not only enhances user
engagement but also extends its applicability across gaming, education, and virtual reality domains.
2 Literature Survey
The literature survey delves into the diverse approaches in gesture recognition, a pivotal domain in computer
science crucial for human-computer interaction, virtual reality, and gaming. Research by Zhang et al.
explores real-time reconstruction methodologies, employing joint learning networks and energy
optimization functions to achieve dynamic gesture recognition with single-depth-camera systems. This marks a significant shift towards real-time performance using minimal hardware setups.[1] The integration of deep learning techniques, as highlighted in recent advancements, contributes to improved accuracy and
efficiency in gesture recognition systems. CNN-based pose regression and neural network-based shape
estimation enhance real-time performance, paving the way for more robust recognition models.[2] The
literature also underscores advancements in dataset creation, including hybrid real-synthetic datasets, crucial
for training gesture recognition models. These datasets alleviate challenges associated with manual labeling
and foster the development of more accurate recognition systems.[3]
Furthermore, hand pose estimation techniques have undergone significant evolution, transitioning from
depth-based to RGB-based approaches with a focus on real-time tracking. Methods such as dense geometry
representation and machine learning-based depth estimation address challenges related to depth ambiguity
and interaction handling, bolstering the accuracy of pose estimation systems.[4] Notably, single-camera
setups like RGB2Hands and VNect have demonstrated potential for gesture recognition, enabling real-time
capture of hand gestures using only a single RGB camera, thus broadening the applicability of gesture
recognition technology.[5] Gesture recognition systems find diverse applications in human-computer
interaction, virtual reality, gaming, and sign language recognition, enhancing user experience and accessibility.[6]
Despite these advancements, challenges persist, including depth ambiguity, interaction handling, and
occlusion in gesture recognition. Techniques such as dense geometry representation, neural network-based
depth estimation, and 3D convolutional neural networks offer promising solutions, improving the accuracy
and robustness of gesture recognition systems.[7] The practical implications of gesture recognition extend
to healthcare, education, and beyond, offering potential applications in rehabilitation, assistive technologies,
and interactive learning experiences.[8] Looking ahead, future directions in gesture recognition research
include the development of more efficient algorithms, creation of larger datasets, and exploration of novel
applications in emerging fields such as augmented reality and human-robot interaction.[9] These
advancements hold promise for further enhancing the capabilities and accessibility of gesture recognition
technology.
4 Algorithm
# Step 1: Hand Landmarks Extraction and Transmission
# Python Script
import cv2
import socket
from cvzone.HandTrackingModule import HandDetector  # MediaPipe-based hand tracking module

cap = cv2.VideoCapture(0)                                   # Webcam feed
detector = HandDetector(maxHands=1, detectionCon=0.8)       # Hand landmark detector
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)     # UDP socket
server_address = ("127.0.0.1", 12345)                       # Unity listens on this port
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))                 # Frame height, used to flip the y-axis for Unity

while True:
    success, img = cap.read()
    hands, img = detector.findHands(img)
    data = []
    if hands:
        hand = hands[0]
        lmList = hand["lmList"]                             # 21 landmarks, each [x, y, z]
        for lm in lmList:
            data.extend([lm[0], h - lm[1], lm[2]])          # Extract hand landmarks (flip y for Unity)
        sock.sendto(str.encode(str(data)), server_address)  # Send landmark data to Unity
In the Python script, the process begins by initializing the HandTrackingModule, which performs hand landmark detection, and establishing a UDP socket connection for communication with the Unity environment. Using OpenCV, the script continuously captures frames from the webcam feed, and the HandTrackingModule analyzes each frame to extract hand landmarks. These landmarks, represented as 2D coordinates that capture hand movements and gestures, are formatted into a single string and transmitted over the socket connection to Unity. This part of the algorithm provides the foundation for capturing hand landmark data in Python and passing it to Unity for subsequent processing and visualization. Combining Python's computer vision capabilities with Unity's visualization tools enables the creation of dynamic and interactive hand gesture recognition systems.
// Step 2: Receiving Landmark Data and Updating the Hand Model in Unity (HandTracking.cs)
using UnityEngine;
using System.Net;
using System.Net.Sockets;
using System.Text;

public class HandTracking : MonoBehaviour
{
    public GameObject[] handPoints;   // 21 landmark GameObjects assigned in the Inspector

    UdpClient client;
    IPEndPoint endPoint;

    void Start()
    {
        client = new UdpClient(12345);                                    // Listen for data from the Python script
        endPoint = new IPEndPoint(IPAddress.Parse("127.0.0.1"), 12345);
    }

    void Update()
    {
        if (client.Available == 0) return;                                // No new datagram this frame

        byte[] data = client.Receive(ref endPoint);
        string receivedData = Encoding.UTF8.GetString(data);
        string[] coordinates = receivedData.Trim('[', ']').Split(',');    // Payload is "[x1, y1, z1, x2, ...]"

        // Map coordinates to 3D space and update hand model positions (one x, y, z triple per landmark)
        for (int i = 0; i + 2 < coordinates.Length; i += 3)
        {
            // Map 2D coordinates to 3D space
            Vector3 position = MapTo3DSpace(coordinates[i], coordinates[i + 1], coordinates[i + 2]);
            // Update hand model positions
            UpdateHandModelPosition(i / 3, position);
        }
    }

    Vector3 MapTo3DSpace(string x, string y, string z)
    {
        // Mapping logic from 2D image coordinates to 3D world space (scaling and translation; see Section 5.2)
        return new Vector3(float.Parse(x) / 100f, float.Parse(y) / 100f, float.Parse(z) / 100f);
    }

    void UpdateHandModelPosition(int index, Vector3 position)
    {
        handPoints[index].transform.localPosition = position;            // Move the landmark GameObject
    }
}
In the Unity script, the received coordinate string is parsed, mapped into 3D world space, and used to update the positions of GameObjects so that the virtual hand model mirrors the detected hand movements. This section of the algorithm encapsulates the processing and visualization aspects within the Unity environment, facilitating the interactive and immersive representation of hand gestures in 3D space.
5 Methodology
The complete working of the project is shown in Fig. 1: the system takes in a 2D video capture, extracts the hand landmarks from each frame, and maps them into a 3D environment.
Fig. 1: Working
5.1 Hand Landmarks Extraction and Transmission
Here, the goal is to extract hand landmarks using Python with OpenCV and the MediaPipe library. The hand landmarks, represented as 2D coordinates, are captured in real time. A UDP communication channel is then established to transmit these coordinates to the Unity application. The data, consisting of the hand landmark coordinates, is converted into string format and sent over the network to the IP address 127.0.0.1 (localhost). This communication mechanism enables the seamless integration of hand tracking data into the Unity environment for further interactive and immersive applications, such as virtual reality or augmented reality experiences.
5.2 Unity Processing and 3D Mapping
A Unity script named HandTracking.cs is created to receive and process the 2D coordinates transmitted from the Python script. The received coordinates, representing hand landmarks, undergo a mapping function within the Unity script to convert them into 3D world space. This mapping applies scaling and translation parameters to position the landmarks accurately in the Unity environment. The script then dynamically updates the positions of GameObjects in real time to reflect the movement and positioning of the hand landmarks. This integration allows hand tracking data to be incorporated seamlessly into the Unity application, providing an interactive and responsive experience in which GameObjects align with the detected hand movements and positions.
The LineCode.cs script in Unity is employed to visually connect specific hand landmarks using a LineRenderer component. This provides a clear visual representation of hand movements, fostering a more immersive and interactive user experience by visually conveying gestures through the manipulation of GameObjects and the LineRenderer component in Unity.
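As an illustration of this step, a minimal LineRenderer script is sketched below; the field names and the choice of which landmarks to connect are assumptions for illustration rather than the exact contents of LineCode.cs.

// Minimal LineRenderer sketch in the style of LineCode.cs; field names and landmark pairing are assumed.
using UnityEngine;

public class LineCode : MonoBehaviour
{
    public Transform origin;       // first landmark GameObject of the connected pair (assumed field name)
    public Transform destination;  // second landmark GameObject of the pair (assumed field name)
    LineRenderer lineRenderer;

    void Start()
    {
        lineRenderer = GetComponent<LineRenderer>();
        lineRenderer.positionCount = 2;    // one straight segment between the two landmarks
    }

    void Update()
    {
        // Redraw the segment every frame so it follows the moving landmark GameObjects
        lineRenderer.SetPosition(0, origin.position);
        lineRenderer.SetPosition(1, destination.position);
    }
}

One such script (or one landmark pair) is used per connection, so the full hand skeleton is drawn by attaching the component to each segment between adjacent landmarks.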
Let P2D(xi, yi) represent the i-th 2D landmark coordinate extracted from the hand using MediaPipe.
The mapping to 3D coordinates P3D(Xi, Yi, Zi) is achieved through a transformation function F:
P3D(Xi, Yi, Zi) = F(P2D(xi, yi))
Where F consists of: a Scaling Factor (S), a Translation Matrix (T), and the Mapping Equation (M).
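For illustration, F can be realized in the Unity script as a uniform scaling followed by a translation; the numeric values below are placeholders rather than the project's calibrated parameters.

// Sketch of the transformation F (inside HandTracking.cs): scale the 2D landmark, then translate it into the scene.
Vector3 ApplyMapping(Vector3 p2d)
{
    const float S = 1f / 100f;                    // Scaling Factor S (placeholder value)
    Vector3 T = new Vector3(-6.4f, -3.6f, 0f);    // Translation T (placeholder value)
    return p2d * S + T;                           // Mapping Equation M combines both steps
}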
5.5 Magnification and Scale Calculation
The changing distances between specific hand landmarks are analyzed over time in both Python and Unity. The goal is to determine a magnification factor by comparing the hand size in the 2D image to a reference size. This factor is then used to dynamically calculate and adjust the scale of GameObjects in the Unity environment based on hand movements. By continuously monitoring the evolving hand size and the distances between landmarks, this approach provides a dynamic and responsive scaling mechanism in Unity.
● Scaling Factor (S):
Scale the 2D coordinates to the 3D space, where zmax and zmin define the desired range along the Z-axis, and xmax − xmin and ymax − ymin represent the range along the X and Y axes.
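A sketch of one way to implement this normalization inside the Unity script is given below; the image bounds and depth ranges are assumed values for illustration, not the project's exact parameters.

// Sketch of the scaling step (inside HandTracking.cs): normalize pixel coordinates into an assumed target range.
Vector3 ScaleTo3D(float x, float y, float z)
{
    const float xMin = 0f, xMax = 1280f;     // assumed image width range
    const float yMin = 0f, yMax = 720f;      // assumed image height range
    const float zMin = -1f, zMax = 1f;       // assumed desired depth range in the scene

    float X = (x - xMin) / (xMax - xMin);                                  // normalized X
    float Y = (y - yMin) / (yMax - yMin);                                  // normalized Y
    float Z = Mathf.Lerp(zMin, zMax, Mathf.InverseLerp(-100f, 100f, z));   // relative depth mapped into [zMin, zMax]
    return new Vector3(X, Y, Z);
}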
In this phase, the project focuses on achieving dynamic interaction with 3D models in Unity. A Z-axis
adjustment factor is calculated by analyzing changes in the hand’s depth over time. This factor is applied to
adjust the Z-axis position of GameObjects representing hand landmarks, ensuring that the 3D models
respond dynamically to changes in the hand’s depth. This enhancement significantly enriches the user’s
interactive experience by allowing more nuanced and realistic control over the virtual environment.
● Euclidean Distance (∆D):
∆D = √((xt − xt−1)² + (yt − yt−1)² + (zt − zt−1)²)
Calculate the Euclidean distance between the current hand position (xt, yt, zt) and the previous one (xt−1, yt−1, zt−1) to represent the overall change in hand position.
● Threshold Function (Θ):
Θ(∆D, threshold) = 1 if ∆D > threshold, and 0 otherwise
Evaluate whether the change in hand position exceeds a predefined threshold, indicating a significant gesture.
● Mapping to 3D Magnification (∆Z):
∆Z = Θ(∆D, threshold) · scaling factor · ∆D
Map the detected changes to the Z-axis of the 3D hand model, considering the scaling factor for
appropriate magnification.
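Combining the three quantities, the depth-adjustment step can be sketched as follows; the threshold and scaling-factor values, and the handModel field, are assumptions for illustration.

// Sketch of the depth adjustment (inside HandTracking.cs): ∆D, Θ, and ∆Z applied to the hand model.
public Transform handModel;               // root of the 3D hand model (assumed field)
Vector3 previousHandPosition;

void AdjustDepth(Vector3 currentHandPosition)
{
    const float threshold = 0.05f;        // assumed gesture threshold
    const float scalingFactor = 2.0f;     // assumed magnification scaling factor

    float deltaD = Vector3.Distance(currentHandPosition, previousHandPosition); // Euclidean Distance (∆D)
    float theta = deltaD > threshold ? 1f : 0f;                                  // Threshold Function (Θ)
    float deltaZ = theta * scalingFactor * deltaD;                               // Magnification (∆Z)

    handModel.position += new Vector3(0f, 0f, deltaZ);                           // apply along the Z-axis
    previousHandPosition = currentHandPosition;
}

In practice, the sign of the hand's depth change would determine whether the model moves closer to or farther from the camera; the sketch applies the magnitude only, matching the formula above.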
5.7 Spatial Constraints and Scene Boundaries
To enhance user control within the Unity scene, spatial constraints are defined by determining scene
boundaries based on the X, Y, and Z coordinates of hand landmarks. Both Python and Unity scripts are
integrated to enforce these boundaries, constraining hand movements within a specified area. This ensures
a more guided and controlled interaction, preventing unintended movements outside the defined limits and
enhancing the overall usability and safety of the application.
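One simple way to enforce these boundaries on the Unity side is to clamp every mapped position to the scene limits before it is assigned to a GameObject; the boundary values below are illustrative assumptions.

// Sketch of boundary enforcement (inside HandTracking.cs): clamp mapped positions to assumed scene limits.
Vector3 ClampToSceneBounds(Vector3 position)
{
    float x = Mathf.Clamp(position.x, -5f, 5f);   // assumed X limits
    float y = Mathf.Clamp(position.y, -3f, 3f);   // assumed Y limits
    float z = Mathf.Clamp(position.z, 0f, 10f);   // assumed Z limits
    return new Vector3(x, y, z);
}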
5.8 Precise 3D Model Manipulation
The project advances into providing users with precise control over 3D models in Unity. Mapped 2D coordinates, triggered by user input, are received and used to dynamically set the position of 3D models via the Transform component. Additionally, rotation adjustments based on user interactions are incorporated, allowing users to finely tune the orientation of 3D models. This level of granularity enhances the interactive experience, enabling a more detailed and personalized interaction with the virtual elements in the Unity environment.
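A minimal sketch of this Transform-based manipulation is shown below; the rotation speed and the way user input is read are assumptions for illustration.

// Sketch of precise model manipulation (inside a Unity script): position from mapped coordinates, rotation from input.
public Transform targetModel;             // 3D model controlled by the user (assumed field)

void MoveAndRotateModel(Vector3 mappedPosition, float rotationInput)
{
    const float rotationSpeed = 45f;      // degrees per second (assumed)

    targetModel.position = mappedPosition;                                          // position from mapped 2D coordinates
    targetModel.Rotate(Vector3.up, rotationInput * rotationSpeed * Time.deltaTime); // fine-tune orientation
}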
6 Summary and Results
In summary, the project successfully integrated Mediapipe for 2D hand landmark extraction with Unity
for 3D gesture recognition and visualization. The Python script utilized Mediapipe to extract 2D hand
landmarks from a standard 2D camera feed and transmitted them to Unity in real-time. In Unity, the C#
script received the landmark data and mapped it onto 3D game objects that represent the hand landmarks.
The system also implemented a mechanism to detect changes in hand position and adjust the Z-axis of the
3D hand model accordingly, providing users with an immersive and dynamic gesture recognition
experience. Throughout the project, iterative testing and optimization were performed to ensure accuracy,
efficiency, and user-friendliness.
Fig. 2: Original gestures captured by the camera (1(a), 2(a)) and the corresponding generated gestures in Unity (1(b), 2(b))
In the results, the original gestures captured by the camera (images 1(a) and 2(a)) are closely replicated by the corresponding 3D models generated in Unity (images 1(b) and 2(b)). Image 1(a) shows a user's original gesture, accurately mirrored in image 1(b) by the project's 3D model. Similarly, image 2(a) displays another original gesture, faithfully reproduced in image 2(b) by the Unity-generated 3D model. This demonstrates the system's precision in translating real-world hand gestures into their virtual counterparts.
7 Discussion
The paper presents a comprehensive approach to integrating computer vision techniques with the Unity
game engine for accurate 3D hand gesture recognition. The key components of the system include real-time
hand detection and 2D landmark localization using a webcam feed and computer vision algorithms, as well
as the mapping of these 2D landmarks to 3D space within the Unity environment, considering depth
perception.
The authors leverage the strengths of both Python (computer vision) and Unity (3D rendering and
interactivity) to create a robust and responsive system. The use of socket communication to transmit hand
landmark data from Python to Unity enables seamless integration between the two environments. The paper
highlights the techniques used for 2D-to-3D mapping, scaling, and depth adjustment, which are crucial for
achieving accurate and immersive 3D hand gesture representation in Unity. The iterative testing and
optimization mentioned suggest the system can provide reliable and precise hand gesture recognition. The
system's integration of hand gesture recognition with Unity can enable more intuitive and natural
interactions within virtual reality (VR) and augmented reality (AR) environments, expanding the
possibilities for immersive experiences. Additionally, the system can be utilized in educational and training
applications, allowing users to manipulate 3D models and visualizations using hand gestures, making the
learning process more engaging and interactive.
The hand gesture recognition capabilities can improve accessibility by providing alternative input
methods for individuals with disabilities, enabling them to interact with digital systems more effectively.
Furthermore, the system's ability to track and recognize hand gestures can be beneficial in healthcare and
rehabilitation applications, such as monitoring hand movements for physical therapy or assisting in the
development of assistive technologies.
Overall, the paper presents a well-designed framework that successfully combines computer vision and
Unity engine technologies to achieve accurate 3D hand gesture recognition. The proposed system has the
potential to significantly enhance user interactions and experiences across various domains, from gaming
and virtual reality to education and healthcare.
References
1. H. Zhang, Y. Zhou, Y. Tian, J.-H. Yong, and F. Xu, "Single depth view based real-time reconstruction of hand-object interactions," ACM Transactions on Graphics, vol. 40, no. 3, pp. 1–12, Jun. 2021, doi: 10.1145/3451341.
2. J. Wang et al., "RGB2Hands: Real-time tracking of 3D hand interactions from monocular RGB video," arXiv preprint, Jun. 2021. [Online]. Available: http://arxiv.org/pdf/2106.11725.pdf
3. F. Mueller et al., "Real-time pose and shape reconstruction of two interacting hands with a single depth camera," ACM Transactions on Graphics, vol. 38, no. 4, pp. 1–13, Jul. 2019, doi: 10.1145/3306346.3322958.
4. J. Segen and S. Kumar, "Shadow gestures: 3D hand pose estimation using a single camera," in Proc. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No. PR00149), Fort Collins, CO, USA, 1999, pp. 479–485 vol. 1, doi: 10.1109/CVPR.1999.786981.
5. N. Shimada, K. Kimura, and Y. Shirai, "Real-time 3D hand posture estimation based on 2D appearance retrieval using monocular camera," in Proc. IEEE ICCV Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, Vancouver, BC, Canada, 2001, pp. 23–30, doi: 10.1109/RATFG.2001.938906.
6. D. Mehta et al., "VNect: Real-time 3D human pose estimation with a single RGB camera," ACM Transactions on Graphics, vol. 36, no. 4, pp. 1–14, 2017.
7. L. Ge, H. Liang, J. Yuan, and D. Thalmann, "Real-time 3D hand pose estimation with 3D convolutional neural networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 4, pp. 956–970, Apr. 2019, doi: 10.1109/TPAMI.2018.2827052.
8. S. S. Rautaray, "Real time hand gesture recognition system for dynamic applications," International Journal of UbiComp, vol. 3, no. 1, pp. 21–31, Jan. 2012, doi: 10.5121/iju.2012.3103.
9. Y. Shi, Y. Li, X. Fu, K. Miao, and Q. Miao, "Review of dynamic gesture recognition," Virtual Reality & Intelligent Hardware, vol. 3, no. 3, pp. 183–206, Jun. 2021, doi: 10.1016/j.vrih.2021.05.001.
10. V. L. Patil, S. R. Sutar, S. B. Ghadge, and S. Palkar, "Gesture recognition for media interaction: A Streamlit implementation with OpenCV and MediaPipe," International Journal for Research in Applied Science and Engineering Technology, vol. 11, no. 9, pp. 1039–1046, Sep. 2023, doi: 10.22214/ijraset.2023.55775.
11. Indriani, Moh. Harris, and A. S. Agoes, "Applying hand gesture recognition for user guide application using MediaPipe."