INTERNATIONAL JOURNAL OF LATEST TECHNOLOGY IN ENGINEERING,
MANAGEMENT & APPLIED SCIENCE (IJLTEMAS)
ISSN 2278-2540 | DOI: 10.51583/IJLTEMAS | Volume XIV, Issue XI, November 2025
“A Theoretical and Practical Study of Linear Regression”
Dr. Pranesh Kulkarni
Assistant Professor, T. John Institute of Technology (Affiliated to VTU, Belagavi), Gottigere near NICE
Road Junction, Bannerghatta Road, Bangalore - 560083
Received: 05 December 2025; Accepted: 12 December 2025; Published: 22 December 2025
ABSTRACT
This article provides a self-contained description of linear regression, covering both the necessary linear algebra
concepts and their implementation in Python. Linear regression remains one of the most interpretable and widely
used tools in the data scientist’s toolbox. By mastering both its theoretical foundations and practical applications,
one can build robust and explainable models.
In this paper, we explain the fundamentals of linear regression, outline how it works, and guide the reader through
the implementation process step by step. We also discuss essential techniques such as feature scaling and
gradient descent, which are crucial for improving model accuracy and efficiency. Whether applied to business
trend analysis or broader data science applications, this paper serves as a comprehensive introduction to linear
regression for beginners and practitioners alike.
Keywords: Linear Regression, Regression Analysis, Statistical Modeling, Predictive Modeling, Machine
Learning, Least Squares Method, Model Evaluation, Data Analysis, Regression Theory
INTRODUCTION
Linear regression is a supervised machine learning algorithm used to model the linear relationship between a
dependent variable and one or more independent features by fitting a linear equation to observed data. When
there is only one independent feature, the method is referred to as Simple Linear Regression. When multiple
independent features are involved, it is known as Multiple Linear Regression. Similarly, if there is only one
dependent variable, the model is called Univariate Linear Regression, whereas the presence of multiple
dependent variables leads to Multivariate Regression.
To illustrate, consider the case of a used car dealership that sells only cars of the same model and year. In this
setting, it is reasonable to assume that the selling price of a car depends primarily on the number of miles it has
been driven. Suppose we acquire a car with 55,000 miles and wish to determine its selling price. If we had a
function y = f(x), where y represents the selling price and x represents the mileage, we could
simply substitute x = 55,000 into the function to obtain the expected price. However, in
practice, such an exact function is unknown and may not even exist.
What we do have, instead, is historical data: assume that five cars have previously been sold, with their respective
mileages and selling prices summarized in Table 1. The problem now becomes: Based on our past experience,
at what price should we sell a car with 55,000 miles? While multiple answers are possible, since sellers are free
to set asking prices, linear regression [1], the focus of this paper, provides a systematic and data-driven approach
to estimating such values.
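The dealership scenario can be sketched in a few lines of Python. The mileage/price pairs below are hypothetical placeholders standing in for the historical sales of Table 1 (the actual values are not reproduced here); the fit itself uses ordinary least squares via NumPy's `polyfit`.

```python
import numpy as np

# Hypothetical historical sales (NOT the actual Table 1 values):
# mileage of each previously sold car and its selling price.
mileage = np.array([30_000, 40_000, 50_000, 60_000, 70_000], dtype=float)
price = np.array([15_000, 13_500, 12_200, 11_000, 9_800], dtype=float)

# Fit the line y = b0 + b1 * x by ordinary least squares.
b1, b0 = np.polyfit(mileage, price, deg=1)

# Estimate a selling price for a car with 55,000 miles.
y_hat = b0 + b1 * 55_000
print(f"slope = {b1:.4f}, intercept = {b0:.2f}, estimate = {y_hat:.2f}")
```

With any data showing prices falling as mileage rises, the fitted slope is negative and the estimate at 55,000 miles lands between the prices of the neighbouring historical sales, which is exactly the "data-driven guess" the dealership needs.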
Why Is Linear Regression Important?
The interpretability of linear regression is one of its greatest strengths. The model’s coefficients clearly show the
impact of each independent variable on the dependent variable, providing valuable insights into the underlying
dynamics of the data. Its simplicity is also a virtue: linear regression is transparent, easy to implement, and forms
the foundation for more advanced machine learning algorithms. Many techniques, such as regularization