INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)

ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue X, October 2025

www.ijltemas.in Page 151

Real Time Yoga Pose Detection Using AI
Ms. Priti Uddhav Pawar, Mrs. H. H. Kulkarni

Electronics and Telecommunication (SPPU) GES RH Sapat College of Engineering, Management Studies and
Research (SPPU) Nashik, India

DOI: https://doi.org/10.51583/IJLTEMAS.2025.1410000020

Abstract—After the pandemic, a new era began in which activities such as education, training, shopping, and work moved online. Taking advantage of AI technology and the online trend of practicing yoga at home, there is an opportunity to shift offline yoga training online. This study introduces a method to detect yoga poses in real time using artificial intelligence (AI) and camera input and to provide corrective feedback accordingly. The method uses a real-time pose-prediction model that leverages MediaPipe to capture human body landmarks from camera frames. These key points are then analyzed to identify different yoga poses, and the system provides real-time corrective feedback on posture. By combining AI-based pose recognition and feedback, this solution aims to improve yoga practice, especially for beginners. This research presents a deep learning-based approach for yoga pose detection using the YOLOv5 object-detection framework. The dataset, named YOLO Yoga Dataset.v1i.yolov5, consists of 1,013 annotated images of five yoga poses: Bridge, Downward Dog, Plank, Shoulderstand, and Tree Pose. The dataset was sourced and preprocessed through Roboflow, and YOLOv5 was trained on it with data augmentation to improve generalization.

Index Terms—Yoga Pose, OpenCV, Pose Estimation, MediaPipe, Real-Time Feedback

I. Introduction

Yoga originated in a spiritual tradition. Because it combines physical poses, breathing exercises, and meditation, it improves flexibility and strength and reduces stress and pain. It helps people begin their day with intention, encourages mindful movement, and supports mental clarity. Today, yoga is a widely practiced form of exercise known for its numerous physical and mental benefits. However, performing yoga correctly can be difficult, especially for beginners: incorrect postures make the practice ineffective, reduce its benefits, or even cause injuries. Recent advances in artificial intelligence (AI) and computer vision provide opportunities to build systems that offer real-time feedback to yoga practitioners, helping them correct their postures instantly. This paper presents a system that detects and analyzes yoga poses using AI and camera-based input. The solution uses YOLO, OpenCV, and the MediaPipe framework to identify human body landmarks, allowing the system to track and classify poses accurately. With this system, users receive real-time feedback on their postures, promoting safe and effective yoga practice at home. Object-detection algorithms such as YOLOv5, based on convolutional neural networks, have shown remarkable success in real-time detection tasks, combining speed and accuracy.

This study introduces a yoga pose detection system using YOLOv5 trained on a curated dataset prepared via Roboflow. The goal
is to accurately identify five fundamental yoga poses from RGB images.

II. Literature Review

Several studies have explored AI-based pose-estimation techniques for fitness and rehabilitation applications. Previously, a range of technologies was used to detect yoga postures, and human-activity detection has been used in many commercial applications. Kundu, R., Reinders, J. (2019) discuss how deep learning models can be applied to yoga pose recognition, offering an overview of methods and challenges. [19] Redmon, J. et al. (2016) introduce YOLO, a real-time object-detection system that could be adapted to yoga pose detection, offering insights into how real-time systems can function effectively. [18] Rutuja Gajbhiye et al. (2022) propose artificial-intelligence-based estimation and correction of human pose, using PoseNet and a CNN to locate key points on the human body from a dataset comprising six yoga poses. [25] Kumar et al. (2020) introduce a yoga pose-estimation system that takes advantage of OpenPose, a renowned pose-estimation library that enables real-time estimation of the user's body pose. Their system is accurate and efficient, enabling instantaneous feedback on yoga poses; however, the computational demands of OpenPose can impede its applicability in real-time scenarios such as yoga practice. [2] Vedangi Agrwal et al. (2022) demonstrate the growing impact of artificial intelligence and computer vision on yoga practice: their system uses MediaPipe and video streaming for real-time pose estimation and guidance, making yoga more accessible to home users, but it is sensitive to lighting conditions, struggles to detect complex poses accurately, and needs more robust pose-classification methods. [3] Harshada Dhakate et al. (2024) argue that accurate and efficient yoga pose-recognition systems could revolutionize the teaching and practice of yoga; their methodology, based on PoseNet and deep learning algorithms, is evaluated on a publicly available Kaggle dataset comprising 1,551 images. [6]

Proposed System

This survey highlights these challenges and the need for a more advanced approach, which is addressed here by taking advantage of YOLOv5, a cutting-edge computer-vision object-detection model. The superior speed, accuracy, and ability of YOLOv5 to detect multiple objects in real time make it an ideal choice for real-time yoga pose detection. By integrating YOLOv5 with an AI-driven feedback mechanism, this study aims to overcome the limitations identified in existing solutions, ensuring accurate pose detection, real-time feedback, and a more adaptable user experience. This approach not only improves the precision of yoga pose analysis but also provides a safer and more effective training environment for users. The proposed system is an advanced Yoga Pose Detection System using YOLO (You Only Look Once), MediaPipe, and OpenCV: it detects people in the frame, tracks their body pose, and provides real-time feedback on their yoga poses.

Fig. 1. Flowchart

III. Methodology

The suggested methodology uses a camera to capture an image of a yoga practitioner performing an asana, which is then input to the deep learning architectures. By comparing it with the pre-trained model, these architectures determine the pose performed by the practitioner. [6]

Dataset Description

The YOLO Yoga Dataset.v1i.yolov5 was obtained from Roboflow. The dataset is organized into three subsets:

Training: 709 images

Validation: 203 images

Testing: 101 images

Each image is annotated with bounding boxes in YOLO format, containing class IDs and normalized coordinates. There are five classes in total: Yoga Pose Bridge, Yoga Pose Downward, Yoga Pose Plank, Yoga Pose Shoulderstand, and Yoga Pose Tree.
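As an illustration of this annotation format, the sketch below parses one YOLO-format label line into a class name and a normalized bounding box, and converts it to pixel coordinates. The class ordering and the sample label line are assumptions for illustration, not taken from the dataset files themselves.

```python
# Parse a YOLO-format annotation line: "<class_id> <x_center> <y_center> <width> <height>",
# where the four box values are normalized to [0, 1] relative to the image size.

CLASS_NAMES = [  # assumed ordering of the five dataset classes
    "Yoga Pose Bridge",
    "Yoga Pose Downward",
    "Yoga Pose Plank",
    "Yoga Pose Shoulderstand",
    "Yoga Pose Tree",
]

def parse_yolo_label(line: str) -> tuple[str, tuple[float, float, float, float]]:
    """Return (class name, (x_center, y_center, width, height)) for one label line."""
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(p) for p in parts[1:5])
    return CLASS_NAMES[class_id], (x_c, y_c, w, h)

def to_pixel_box(box, img_w, img_h):
    """Convert a normalized YOLO box to pixel corner coordinates (x1, y1, x2, y2)."""
    x_c, y_c, w, h = box
    x1 = (x_c - w / 2) * img_w
    y1 = (y_c - h / 2) * img_h
    return x1, y1, x1 + w * img_w, y1 + h * img_h
```

For a 640×640 image, a hypothetical label line `4 0.5 0.5 0.25 0.75` would map to the Tree Pose class with a 160×480-pixel box centered in the frame.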

System Design and Architecture

The system is designed using OpenCV, YOLO, MediaPipe, and Python, with the following key components:

Video Input - A camera captures video of the user performing yoga; OpenCV reads the captured frames in real time.

Object Detection - The YOLO model is applied to each video frame to detect the human, then identify and isolate the region of interest (ROI) containing the person.

Pose Detection - The MediaPipe pose-detection model analyzes the human body and extracts 33 key points (head, shoulders, knees, elbows, hips, etc.) in real time.

After detecting the body landmarks, the system compares them with predefined yoga poses to identify the current posture. OpenCV and Python are used for the calculations in this posture-evaluation process.

This process includes drawing the skeleton using the detected key points, determining the angles at the joints using the trigonometric cosine rule, comparing the calculated angles for the identified yoga pose with standard reference values, and providing corrective feedback immediately.
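The joint-angle step can be sketched as follows: given three landmarks a, b, c, the cosine rule yields the angle at the middle joint b. The sample coordinates are illustrative, not actual MediaPipe output.

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) formed by landmarks a-b-c, via the cosine rule."""
    ab = math.dist(a, b)   # side lengths of the triangle a-b-c
    bc = math.dist(b, c)
    ac = math.dist(a, c)
    # Cosine rule: ac^2 = ab^2 + bc^2 - 2*ab*bc*cos(angle_b)
    cos_b = (ab**2 + bc**2 - ac**2) / (2 * ab * bc)
    # Clamp against floating-point drift before taking the arccosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_b))))
```

For example, an elbow landmark at (1, 0) between a shoulder at (0, 0) and a wrist at (1, 1) gives a 90-degree joint angle.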

Real Time Feedback - The system gives real-time feedback, highlighting the key points it has identified and labeling each detected bounding box with the pose name and a confidence score.

Model Training

The YOLOv5 model was trained using the YOLOv5 framework and the YOLO Yoga Dataset.v1i.yolov5. Training ran for 50 epochs with a batch size of 16 and an input image resolution of 640×640 pixels.
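A typical YOLOv5 training invocation with these settings might look like the sketch below; the dataset YAML name and the choice of the `yolov5s.pt` starting weights are assumptions, not stated in the paper.

```shell
# Hedged sketch of the training run (assumed paths and weights file);
# hyperparameters match the paper: 50 epochs, batch size 16, 640x640 input.
python train.py --img 640 --batch 16 --epochs 50 \
    --data yolo_yoga_dataset.yaml --weights yolov5s.pt
```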

The YOLO variant was chosen for its balance between accuracy and real-time inference speed. Training used the default SGD optimizer with a learning rate of 0.01 and a weight decay of 0.0005; a cosine learning-rate scheduler gradually reduced the learning rate to ensure smoother convergence. Data-augmentation techniques such as random flipping, rotation, brightness and contrast variation, and mosaic augmentation were applied during training to improve generalization and reduce overfitting, and an early-stopping callback monitored the validation loss to avoid unnecessary epochs once the model reached optimal performance. Each image was annotated with bounding boxes and pose-class labels. The dataset of 1,013 labeled images covers diverse postures and variations in human body alignment across the five classes, and was split 70% for training, 20% for validation, and 10% for testing.
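The 70/20/10 split is consistent with the subset counts given in the dataset description; a quick arithmetic check:

```python
# Verify that a 70/20/10 split of the 1,013 images yields the reported subset sizes
# (709 training, 203 validation, 101 testing), with the test set taking the remainder.
TOTAL = 1013

n_train = round(TOTAL * 0.70)      # 709
n_val = round(TOTAL * 0.20)        # 203
n_test = TOTAL - n_train - n_val   # 101
```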

MediaPipe Pose Detection

MediaPipe is a machine learning framework that uses deep learning models for real-time pose estimation. It detects 33 key points on the human body, including major joints such as the shoulders, elbows, wrists, knees, and ankles. These points are used to analyze the body's position and determine whether a user is performing a particular yoga pose. The system uses Python, OpenCV, and MediaPipe to process video frames from the camera. Once the landmarks are detected, the system compares them with known reference poses to classify the current pose.
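The comparison with reference poses can be sketched as a nearest-reference classifier over joint angles. The reference angle values and joint names below are illustrative placeholders, not measured values from the system.

```python
# Nearest-reference pose classification from joint angles: pick the reference pose
# whose stored angles are closest to the measured ones (sum of absolute differences).
# The angle values here are illustrative assumptions, not calibrated references.

REFERENCE_POSES = {
    "Tree Pose": {"left_knee": 45.0, "right_knee": 175.0},
    "Plank":     {"left_knee": 178.0, "right_knee": 178.0},
}

def classify_pose(angles: dict[str, float]) -> str:
    """Return the name of the reference pose closest to the measured joint angles."""
    def distance(ref: dict[str, float]) -> float:
        return sum(abs(angles[joint] - ref[joint]) for joint in ref)
    return min(REFERENCE_POSES, key=lambda name: distance(REFERENCE_POSES[name]))
```

A measured left-knee angle of 50 degrees with a nearly straight right leg would classify as Tree Pose under these placeholder references; two nearly straight legs would classify as Plank.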

Real-Time Feedback and Analysis

The feedback mechanism provides visual indications of whether the user is aligned correctly by displaying the key body points identified throughout the yoga exercise. When the system detects deviations from the optimal position, it can provide remedial measures; for example, it can suggest changing the arm angles or aligning the body more effectively to match the target pose.
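A minimal sketch of this angle-based corrective feedback is given below; the tolerance value and the joint and target angles are illustrative assumptions.

```python
# Suggest a correction when a measured joint angle deviates from its target angle
# by more than a tolerance. Tolerance and angles are illustrative assumptions.

TOLERANCE_DEG = 15.0

def feedback(joint: str, measured: float, target: float) -> str:
    """Return a short corrective message for one joint, or an OK confirmation."""
    deviation = measured - target
    if abs(deviation) <= TOLERANCE_DEG:
        return f"{joint}: OK"
    # A measured angle below the target means the joint should be straightened.
    direction = "straighten" if deviation < 0 else "bend"
    return f"{joint}: {direction} by about {abs(deviation):.0f} degrees"
```

For instance, a left knee measured at 120 degrees against a 170-degree target would prompt the user to straighten the leg.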

IV. Results

A. Quantitative Results

Table I

Performance Metrics of YOLOv5 on the Yoga Pose Dataset

Metric      Train    Validation    Test
Precision   94.2%    92.3%         91.8%
Recall      93.1%    90.8%         89.7%
mAP@0.5     96.1%    94.6%         93.8%

YOLOv5 achieved a mean Average Precision of 94.6% on the validation dataset, demonstrating strong capability in recognizing diverse yoga poses. The Tree and Plank poses yielded the highest accuracy, while the Shoulderstand pose showed slightly lower precision due to limited training examples and variation in posture representation.

The confusion matrix in the given figure shows how well the model performs by comparing its predictions to the actual results. The average frame rate achieved was 25-30 FPS, [3] enabling smooth real-time performance.

The system was tested on a wide range of yoga poses, and the results showed that the pose-detection model is highly effective in real-time applications. Compared with existing research in the field, this model achieves higher accuracy.

Table II

Research Paper Accuracies

Paper: Enhancing Yoga Practice: Real-time Pose Analysis and Personalized Feedback
Accuracy: 80%



Fig. 2. Confusion Matrix

Fig. 3. Recall-Confidence Curve

Fig. 4. Normalized Confusion Matrix

Fig. 5. F1-Confidence Curve


V. Conclusion

This paper presents an AI-based system for real-time yoga pose detection using camera input. By combining the MediaPipe framework for pose prediction with a deep learning (YOLOv5) algorithm for classification, the system offers an efficient way for users to receive feedback on their yoga practice. Future improvements will focus on refining pose-recognition accuracy, expanding the training dataset, and providing more comprehensive feedback to users.

VI. Acknowledgment

I would like to express special thanks and sincere gratitude to Prof. H. H. Kulkarni for her valuable guidance in refining the project methodology using MediaPipe and OpenCV. I am also thankful to all individuals who assisted in the testing phase and provided valuable feedback.

References

1. K. R, V. R and M. A, "Yoga Asanas Pose Detection using Feature Level Fusion with Deep Learning-Based Model," 2022 International Conference on Computational Modelling, Simulation and Optimization (ICCMSO), Pathum Thani, Thailand, 2022, pp. 331-339, doi: 10.1109/ICCMSO58359.2022.00071.

2. Kumar et al., "Exercise Posture Monitoring System using Pose Estimation," International Journal of Scientific and Technology Research, 2020. http://www.ijstr.org/final-print/nov2020/Exercise-Posture-Monitoring-System-Using-Pose-Estimation.pdf

3. V. Agarwal, K. Sharma and A. K. Rajpoot, ”AI based Yoga Trainer Simplifying home yoga using mediapipe and video
streaming,” 2022 3rd International Conference for Emerging Technology (INCET), Belgaum, India, 2022, pp. 1-5, doi:
10.1109/INCET54531.2022.9824332.

4. S. A. Elavarasi, P. Ankit Kumar and J. Jayanthi, "Development of AI-Based Posture Monitoring System to Assist Yoga Training," 2023 International Conference on Research Methodologies in Knowledge Management, Artificial Intelligence and Telecommunication Engineering (RMKMATE), Chennai, India, 2023, pp. 1-4, doi: 10.1109/RMKMATE59243.2023.10368735.

5. Chaudhary, N. Thoiba Singh, M. Chaudhary and K. Yadav, ”Real- Time Yoga Pose Detection Using OpenCV and
MediaPipe,” 2023 4th International Conference for Emerging Technology (INCET), Belgaum, India, 2023, pp. 1-5, doi:
10.1109/INCET57972.2023.10170485.

6. H. Dhakate, S. Anasane, S. Shah, R. Thakare and S. G. Rawat, ”Enhancing Yoga Practice: Real-time Pose Analysis and
Personalized Feedback,” 2024 International Conference on Emerging Systems and Intelligent Computing (ESIC),
Bhubaneswar, India, 2024, pp. 35-40, doi: 10.1109/ESIC60604.2024.10481659.

7. Q. Su, J. Zhang, M. Chen and H. Peng, ”PW-YOLO-Pose: A Novel Algorithm for Pose Estimation of Power Workers,”
in IEEE Access, vol. 12, pp. 116841-116860, 2024, doi: 10.1109/ACCESS.2024.3437359.

8. R. S. Alruwaythi, T. A. Alruwaili, R. M. Alsehli and L. Syed, ”Enhancing Pedestrian Pose Detection through
YOLO Deep Learning Techniques,” 2024 17th International Conference on Development in eSystem Engineering
(DeSE), Khorfakkan, United Arab Emirates, 2024, pp. 509-514, doi: 10.1109/DeSE63988.2024.10911952.

9. W. Supanich, S. Kulkarineetham, P. Sukphokha and P. Wisarnsart, "Machine Learning-Based Exercise Posture Recognition System Using MediaPipe Pose Estimation Framework," 2023 9th International Conference on Advanced Computing and Communication Systems (ICACCS), Coimbatore, India, 2023, pp. 2003-2007, doi: 10.1109/ICACCS57279.2023.10112726.

10. S. Ranjan, S. Tyagi, S. Gupta, M. Kaur and M. Kumar Goyal, "Image Processing-Based Real-Time Detection and Tracking of Human," 2023 3rd International Conference on Technological Advancements in Computational Sciences (ICTACS), Tashkent, Uzbekistan, 2023, pp. 1542-1547, doi: 10.1109/ICTACS59847.2023.10390106.

11. Google, “MediaPipe: Cross-platform, customizable ML solutions for live and streaming media,” https://mediapipe.dev/,
Accessed: Mar. 2025

12. L. J. Wilson, S. Patel and A.S. Sharma, ’Real-time human pose detection for fitness applications’, Journal of AI Research,
vol. 45, pp. 125-134, Mar. 2023.

13. T. J. Nakamura, S. D. Moore, and K. L. Henson, “Leveraging machine learning for real-time pose correction in yoga,”
IEEE Transactions on Biomedical Engineering, vol. 72, no. 6, pp. 832-841, Jun. 2024.

14. Z. Cao, G. Hidalgo, T. Simon, S. Wei, and Y. Sheikh, ”OpenPose: Realtime multi-person 2D pose estimation using part
affinity fields,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 1, pp. 172-186, Jan. 2021.

15. TensorFlow, "Pose estimation with TensorFlow models," https://www.tensorflow.org/lite/models/poseestimation, Accessed: Mar. 2025.