Linear Regression Explained (2026): Basics, Examples & Metrics
Updated on January 10, 2026 4 minutes read
Linear regression is a simple way to model the relationship between variables. In 2026, it is still widely used as a baseline because it is quick to train and easy to interpret.
You can use it to predict a numeric outcome (like revenue, time, or price) and to estimate how strongly different inputs relate to that outcome.
What linear regression models
Linear regression connects an input (or several inputs) to an output by fitting a straight line (or, with multiple inputs, a plane or hyperplane). The goal is to capture the trend in your data and make reasonable predictions.
Two terms show up often and mean the same thing across most courses and tools:
- Target / dependent variable (y): the value you want to predict.
- Feature(s) / independent variable(s) (x): the value(s) used to predict it.
The core equation
For simple linear regression (one feature), the model is often written as:
y = m*x + b
- m (slope): how much y changes when x increases by 1.
- b (intercept): the predicted value of y when x = 0.
In real datasets, points do not land perfectly on the line. The difference between an observed value and the model's prediction for it is called a residual (often loosely called an error).
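To make the equation and the idea of a residual concrete, here is a minimal sketch. The data points and the slope and intercept values are made up for illustration; the line is picked by hand, not fitted.

```python
def predict(x, m, b):
    """Return the line's prediction for a single input x."""
    return m * x + b


# Hypothetical data: hours studied (x) and exam score (y).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [52.0, 61.0, 68.0, 81.0]

# Hypothetical slope and intercept, picked by hand (not fitted).
m, b = 9.0, 43.0

# Residual = observed value minus predicted value.
residuals = [y - predict(x, m, b) for x, y in zip(xs, ys)]
print(residuals)  # [0.0, 0.0, -2.0, 2.0]
```

A positive residual means the line predicted too low for that point; a negative one means it predicted too high.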
Simple vs. multiple linear regression
Simple linear regression uses one feature. It is useful when you expect a single main driver and want a clear baseline.
Multiple linear regression uses several features at once. For example, you might predict weekly sales from marketing spend, seasonality, and product price.
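With several features, the prediction is the intercept plus a weighted sum of the inputs. The sketch below shows only the prediction step; the coefficients, intercept, and feature values are hypothetical numbers chosen for illustration, not results from real data.

```python
def predict_multiple(features, coefficients, intercept):
    """Return intercept + sum(coefficient_i * feature_i)."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    return intercept + sum(w * x for w, x in zip(coefficients, features))


# Hypothetical model for weekly sales (all numbers are made up):
# marketing spend (in thousands), peak-season flag (0 or 1), product price.
coefficients = [2.0, 15.0, -3.0]
intercept = 100.0

# One hypothetical week: 10k spend, peak season, price of 20.
week = [10.0, 1.0, 20.0]

predicted_sales = predict_multiple(week, coefficients, intercept)
print(predicted_sales)  # 100 + 20 + 15 - 60 = 75.0
```

Each coefficient describes the change in the prediction when its feature increases by 1 and the other features stay fixed.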
How the best-fit line is chosen
The most common approach is ordinary least squares (OLS). It chooses the parameters that minimize the sum of squared residuals across the data.
Squared error is practical: it is easy to compute and penalizes big mistakes more than small ones, which can be helpful in many business settings.
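For a single feature, least squares has a well-known closed-form solution: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means. The sketch below uses made-up data and a hypothetical hand-picked line for comparison; the function names are illustrative.

```python
def fit_simple_ols(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Return the total squared error of a line on the data."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: hours studied (x) and exam score (y).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [52.0, 61.0, 68.0, 81.0]

m, b = fit_simple_ols(xs, ys)
sse_fitted = sum_squared_residuals(xs, ys, m, b)

# A hand-picked line (slope 9, intercept 43) for comparison.
sse_hand_picked = sum_squared_residuals(xs, ys, 9.0, 43.0)

print(m, b)
print(sse_fitted, sse_hand_picked)
```

On this data, the fitted line has a smaller sum of squared residuals than the hand-picked one, which is exactly what least squares guarantees.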
A practical workflow in 2026
1) Start with a clear question
Define what you are predicting and why it matters. Also decide what level of error is acceptable for your use case.
2) Explore and prepare your data
Check for missing values, obvious outliers, and inconsistent units.
A quick scatter plot of x vs. y can reveal whether a straight-line model is a reasonable starting point.
3) Fit the model
Train on historical data and test on data the model has not seen. In practice, compare linear regression to a simple baseline (like predicting the mean) to confirm you are improving.
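The comparison against a mean baseline can be sketched in a few lines. The data below is made up, and holding out every third point is just a simple, reproducible choice for this example; in practice a random or time-based split is more typical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) from ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(y_true, y_pred):
    """Return the average absolute difference between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data with a roughly linear trend.
xs = [float(i) for i in range(1, 11)]
ys = [12.0, 15.0, 21.0, 24.0, 30.0, 33.0, 38.0, 41.0, 47.0, 50.0]

# Hold out every third point as unseen test data.
test_idx = [i for i in range(len(xs)) if i % 3 == 2]
train_idx = [i for i in range(len(xs)) if i % 3 != 2]

x_train = [xs[i] for i in train_idx]
y_train = [ys[i] for i in train_idx]
x_test = [xs[i] for i in test_idx]
y_test = [ys[i] for i in test_idx]

# Fit on the training data only.
m, b = fit_line(x_train, y_train)
model_pred = [m * x + b for x in x_test]

# Baseline: always predict the mean of the training targets.
train_mean = sum(y_train) / len(y_train)
baseline_pred = [train_mean] * len(y_test)

model_mae = mean_absolute_error(y_test, model_pred)
baseline_mae = mean_absolute_error(y_test, baseline_pred)
print(model_mae, baseline_mae)
```

If the model's error on the held-out points is not clearly lower than the baseline's, the features are adding little predictive value.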
4) Evaluate with the right metrics
Common regression metrics include:
- MAE (Mean Absolute Error): average absolute difference between prediction and truth.
- RMSE (Root Mean Squared Error): the square root of the average squared error. It is in the same units as MAE but penalizes large errors more.
- R² (R-squared): the share of variance in y that the model explains (useful, but not the only score).
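These three metrics are short enough to compute by hand. The sketch below uses their standard definitions on made-up values; the function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r_squared(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true is constant; R-squared is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical true values and predictions; the last prediction is far off.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 12.0]

print(mae(y_true, y_pred))        # 1.25
print(rmse(y_true, y_pred))       # about 1.66
print(r_squared(y_true, y_pred))  # about 0.45
```

RMSE comes out higher than MAE here because the single large miss (9 vs. 12) is squared before averaging.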
5) Interpret and communicate results
Linear regression is popular because the coefficients are interpretable. A coefficient is easiest to explain in real units, such as: "an increase of 1 unit in x is associated with an increase of m units in y."
For decision-making, combine interpretation with error analysis and sanity checks, not just a single score.
Assumptions worth checking
Linear regression can still be useful when assumptions are imperfect, but you will usually get more reliable results when these are approximately true:
- The relationship is roughly linear in the range you care about.
- Residuals do not show obvious patterns (a sign you are missing structure).
- Residual spread is reasonably consistent across prediction values.
- Features are not so correlated that coefficients become unstable (multicollinearity).
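One quick way to check the first two assumptions is to fit a line and look at the residuals in order of x. The sketch below deliberately fits a straight line to made-up curved data (y = x squared) to show what a residual pattern looks like.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) from ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical curved data: y = x squared.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]

m, b = fit_line(xs, ys)
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals are positive at both ends and negative in the middle. A U-shape like this suggests the straight line is missing curvature; residuals from a well-specified model should look like unstructured scatter around zero.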
Common pitfalls
- Extrapolation: predictions become risky outside your training range.
- Correlation is not causation: a strong coefficient does not prove cause and effect.
- Outliers: a small number of extreme points can pull the line in the wrong direction.
- Non-linear patterns: if the relationship curves, consider transformations or other models.
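The outlier pitfall is easy to demonstrate. In this sketch, the data is made up: five points lie exactly on a line with slope 2, and then one value is replaced with an extreme one.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) from ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Clean hypothetical data: exactly y = 2x.
ys_clean = [2.0, 4.0, 6.0, 8.0, 10.0]

# Same data, but the last observation is an extreme value.
ys_outlier = [2.0, 4.0, 6.0, 8.0, 30.0]

slope_clean, intercept_clean = fit_line(xs, ys_clean)
slope_outlier, intercept_outlier = fit_line(xs, ys_outlier)

print(slope_clean, intercept_clean)      # slope 2, intercept 0
print(slope_outlier, intercept_outlier)  # slope 6, intercept -8
```

A single extreme point triples the slope here, which is why it is worth inspecting outliers before trusting the coefficients.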
Real-world examples
Ice cream sales vs. temperature: warmer days often correlate with higher sales.
Study time vs. exam score: more study time can correlate with better scores, with many factors involved.
House size vs. price: size may explain part of price, but location and condition often matter too.
Next steps
Once you are comfortable with linear regression, explore regularized variants (like Ridge and Lasso) and feature engineering (like polynomial features).
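As a small taste of regularization, here is a sketch of Ridge regression for a single feature. It assumes the common formulation that adds a penalty of lam times the squared slope to the sum of squared residuals and leaves the intercept unpenalized. The data and the penalty value are made up, and the function name is illustrative.

```python
def fit_ridge_simple(xs, ys, lam):
    """Return (slope, intercept) minimizing squared error + lam * slope**2."""
    if lam < 0:
        raise ValueError("lam must be non-negative")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx + lam == 0:
        raise ValueError("slope is undefined: constant x and zero penalty")
    # The penalty inflates the denominator, shrinking the slope toward zero.
    slope = sxy / (sxx + lam)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied (x) and exam score (y).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [52.0, 61.0, 68.0, 81.0]

ols_slope, ols_intercept = fit_ridge_simple(xs, ys, lam=0.0)
ridge_slope, ridge_intercept = fit_ridge_simple(xs, ys, lam=5.0)

print(ols_slope, ols_intercept)
print(ridge_slope, ridge_intercept)
```

With lam set to 0 this reduces to ordinary least squares. A larger lam shrinks the slope, trading a little bias for more stable coefficients. Libraries typically handle many features at once and expect features on comparable scales, so treat this as an illustration of the idea rather than a replacement for them.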
If you want guided practice, explore Code Labs Academy's Data Science & AI Bootcamp or browse All bootcamps.