INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XV, Issue III, March 2026
System Architecture and Implementation Details
The overall architecture of the proposed hand gesture control system consists of three major modules: gesture
acquisition, gesture processing, and robotic actuation. These modules work together to achieve real-time hand
gesture recognition and robotic hand movement replication. The gesture acquisition module captures the user’s
hand gestures through a camera, which continuously records video frames during system operation. These frames
are transmitted to the computer where the gesture recognition algorithm processes the visual data. The accuracy
of gesture detection largely depends on the quality of captured frames, lighting conditions, and camera
positioning.
The gesture processing module performs several computer vision operations to identify the gesture performed
by the user. Initially, each captured frame is converted into a format suitable for processing using the OpenCV
library. Image preprocessing operations such as resizing, color conversion, and noise reduction are performed to
improve detection accuracy. After preprocessing, the MediaPipe framework is used to detect hand landmarks.
MediaPipe provides a machine learning-based hand tracking model that identifies 21 key points on the human
hand. These key points represent important finger joints and palm locations that allow the system to understand
the orientation and position of the hand.
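The landmark-detection step described above can be sketched as follows. This is a minimal illustration, assuming the `opencv-python` and `mediapipe` packages are installed; the function and variable names are illustrative, not taken from the paper's implementation. MediaPipe returns landmark coordinates normalized to [0, 1], so a small helper converts them to pixel positions.

```python
def normalized_to_pixel(x_norm, y_norm, frame_width, frame_height):
    """MediaPipe reports landmarks in normalized [0, 1] coordinates;
    convert them to pixel coordinates for a given frame size."""
    return int(x_norm * frame_width), int(y_norm * frame_height)


def detect_hand_landmarks(frame_bgr):
    """Return a list of 21 (x, y) pixel coordinates for the first
    detected hand, or None if no hand is found."""
    import cv2                   # deferred imports: these third-party
    import mediapipe as mp       # packages may not be installed

    h, w = frame_bgr.shape[:2]
    # MediaPipe expects RGB input, while OpenCV captures frames in BGR order
    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)

    with mp.solutions.hands.Hands(static_image_mode=False,
                                  max_num_hands=1) as hands:
        results = hands.process(frame_rgb)

    if not results.multi_hand_landmarks:
        return None
    hand = results.multi_hand_landmarks[0]
    return [normalized_to_pixel(lm.x, lm.y, w, h) for lm in hand.landmark]
```

In a live system, `detect_hand_landmarks` would be called once per captured frame, with the `Hands` object created once and reused for efficiency.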
Once the hand landmarks are detected, the system analyzes the relative positions of the fingers to determine the
gesture performed by the user. For example, when all fingers are extended, the system recognizes an open-hand
gesture. When all fingers are folded, the system interprets the gesture as a closed fist. Similarly, when two fingers
are extended while others remain folded, the system identifies a two-finger gesture. These gestures are mapped
to specific control commands that correspond to the movement of the robotic hand.
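The finger-state rules above can be expressed as a short classification function. This is a hedged sketch using MediaPipe's landmark indexing (fingertips at indices 4, 8, 12, 16, 20 and the joints below them at 3, 6, 10, 14, 18); it assumes image coordinates where y grows downward and an upright right hand facing the camera, which is a simplification of any real implementation.

```python
FINGERTIPS = [4, 8, 12, 16, 20]   # thumb, index, middle, ring, pinky tips
BELOW_TIPS = [3, 6, 10, 14, 18]   # the joint immediately below each tip


def finger_states(landmarks):
    """landmarks: list of 21 (x, y) points. Returns five booleans,
    True where the corresponding finger (thumb first) looks extended."""
    states = []
    # The thumb folds sideways, so compare x rather than y
    # (assumes a right hand facing the camera).
    states.append(landmarks[4][0] > landmarks[3][0])
    for tip, joint in zip(FINGERTIPS[1:], BELOW_TIPS[1:]):
        # An extended finger's tip sits above (smaller y than) its joint
        states.append(landmarks[tip][1] < landmarks[joint][1])
    return states


def classify(states):
    """Map the count of extended fingers to the gestures described above."""
    count = sum(states)
    if count == 5:
        return "open_hand"
    if count == 0:
        return "fist"
    if count == 2:
        return "two_finger"
    return "unknown"
```

A production system would typically add hand-orientation handling and smoothing over several frames to avoid flicker between gestures.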
The recognized gesture is converted into digital control signals and transmitted to the Arduino microcontroller over a serial port interface. The Python program sends encoded data representing the detected gesture, which the Arduino program interprets to determine the appropriate motor control signals required to replicate the gesture.
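One possible shape for the serial link is sketched below. The single-byte gesture codes and the use of the pyserial package (and the port name `/dev/ttyUSB0`) are assumptions for illustration, not the paper's actual protocol.

```python
GESTURE_CODES = {"open_hand": b"O", "fist": b"F", "two_finger": b"T"}


def encode_gesture(gesture):
    """Map a recognized gesture name to the one-byte code sent over serial."""
    code = GESTURE_CODES.get(gesture)
    if code is None:
        raise ValueError(f"no code defined for gesture {gesture!r}")
    return code


def send_gesture(gesture, port="/dev/ttyUSB0", baud=9600):
    """Transmit the encoded gesture to the Arduino (requires pyserial)."""
    import serial                  # third-party: pip install pyserial
    with serial.Serial(port, baud, timeout=1) as link:
        link.write(encode_gesture(gesture))
```

On the Arduino side, the sketch would read one byte per loop iteration and switch on it to select the servo positions. Single-byte codes keep the protocol simple and avoid framing issues at low baud rates.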
The robotic actuation module is responsible for converting the digital control commands into physical movement
of the robotic hand. Servo motors are used to control the movement of individual fingers. Each servo motor is
connected to a specific finger mechanism in the robotic hand. The Arduino microcontroller generates pulse width
modulation (PWM) signals that determine the angular position of each servo motor. By adjusting the PWM
signal, the servo motor rotates to a desired angle, thereby moving the finger of the robotic hand.
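The angle-to-pulse-width relationship can be illustrated numerically. The 544–2400 µs endpoints below match the Arduino Servo library's defaults over 0–180°, and the 20 ms (50 Hz) period is typical for hobby servos; individual servos vary, so these values are assumptions.

```python
MIN_PULSE_US = 544    # pulse width at 0 degrees (Arduino Servo default)
MAX_PULSE_US = 2400   # pulse width at 180 degrees (Arduino Servo default)


def angle_to_pulse_us(angle_deg):
    """Linearly map a servo angle (0-180 deg) to a pulse width in microseconds."""
    if not 0 <= angle_deg <= 180:
        raise ValueError("servo angle must be between 0 and 180 degrees")
    return MIN_PULSE_US + (MAX_PULSE_US - MIN_PULSE_US) * angle_deg / 180


def duty_cycle(angle_deg, period_us=20_000):
    """Fraction of the (typically 20 ms / 50 Hz) period the pulse is high."""
    return angle_to_pulse_us(angle_deg) / period_us
```

For example, a 90° command corresponds to a 1472 µs pulse, i.e. a duty cycle of roughly 7.4% at 50 Hz, which is why servo control is often described as pulse-width rather than duty-cycle based.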
The integration of these modules allows the system to perform gesture recognition and robotic control in real
time. The entire process—from gesture detection to robotic movement—occurs within a short time interval,
enabling smooth and responsive interaction between the user and the robotic hand. This architecture
demonstrates the effectiveness of combining computer vision algorithms with embedded systems to create an
intuitive and contactless human–machine interface.
Experimental Setup
To evaluate the performance of the proposed system, a prototype experimental setup was developed using
commonly available hardware and software components. The system was implemented on a computer running
Python, where the OpenCV and MediaPipe libraries were used for gesture detection and processing. A standard
webcam was used to capture hand gestures, while an Arduino UNO microcontroller controlled the robotic hand
mechanism through servo motors.
During experimentation, different hand gestures were performed in front of the camera to test the system’s
recognition capability. The system successfully detected the hand landmarks and classified gestures in real time.
The recognized gestures were transmitted to the Arduino microcontroller, which generated appropriate PWM