Writing · Research Note · 2025

Regression Model Evaluation Metrics: MAE, MSE, RMSE, and R² Explained

A comprehensive guide to key regression model evaluation metrics. Understand the differences, advantages, and limitations of MAE, MSE, RMSE, and R² to choose the right metric for your machine learning projects.

· 6 min read · #AI #study #machine-learning #regression #evaluation-metrics #MAE #MSE #RMSE #R-squared
On this page
  1. Introduction: The Essence of Regression Models
  2. Theoretical Foundation of Evaluation Metrics
  3. Residuals: The Building Blocks
  4. Bias and Error
  5. The Modeling Cycle
  6. Quick Reference: Metrics Comparison Table
  7. Basic Evaluation Metrics
  8. Mean Absolute Error (MAE)
  9. Mean Squared Error (MSE) and Root Mean Squared Error (RMSE)
  10. Coefficient of Determination (R²)
  11. Advanced Evaluation Metrics
  12. Percentage-Based Errors
  13. Logarithmic Error Metrics
  14. Information Criteria
  15. How to Choose the Right Evaluation Metric?
  16. Conclusion: Best Practices

— Il n’y a qu’une loi en sentiment. C’est de faire le bonheur de ce qu’on aime
— There is only one law in emotion: to bring happiness to those we love

— Stendhal, diary entry dated June 19, 1805

Introduction: The Essence of Regression Models

When we talk about prediction models, we’re typically referring to one of two main types:

  • Regression models: These produce continuous outputs (like predicting house prices, temperature, or stock values)
  • Classification models: These generate categorical or binary outputs (like predicting yes/no, spam/not spam, or classifying images)

Regression models attempt to find relationships between variables to predict continuous numeric values. But how do we know if our regression model is any good? This is where evaluation metrics come in.

Theoretical Foundation of Evaluation Metrics

Residuals: The Building Blocks

At the core of regression evaluation is the concept of residuals - the differences between predicted values and actual observed values:

\text{Residual} = y_i - \hat{y}_i

Where $y_i$ is the actual value and $\hat{y}_i$ is the predicted value.

Bias and Error

Bias represents systematic errors in predictions. Ideally, the sum of residuals should be close to zero, indicating that your model isn’t consistently over or under-predicting values.
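As a quick illustration, residuals and bias can be computed in a few lines of NumPy; the actual and predicted values below are made up for the example:

```python
import numpy as np

# Hypothetical actual and predicted values, for illustration only
y_true = np.array([3.0, 5.0, 2.5, 7.0, 4.5])
y_pred = np.array([2.8, 5.4, 2.9, 6.5, 4.9])

# Residuals: actual minus predicted
residuals = y_true - y_pred

# Bias: the mean residual; a value near zero means the model is not
# systematically over- or under-predicting
bias = residuals.mean()
print(residuals)
print(round(bias, 3))  # -0.1
```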

The Modeling Cycle

The process typically flows from model construction to evaluation to refinement:

  1. Build the model
  2. Evaluate the model
  3. Improve the model
  4. Repeat until reaching satisfactory performance
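The cycle above can be sketched in a few lines; this is a minimal illustration, assuming synthetic linear data, a hypothetical `rmse` helper, and polynomial degree as the "improve" knob:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error in the target's original units."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Synthetic linear data: y = 2x + 0.5 plus Gaussian noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = 2.0 * x + 0.5 + rng.normal(0, 0.1, 200)
x_tr, x_va, y_tr, y_va = x[:150], x[150:], y[:150], y[150:]

best_deg, best_rmse = 0, np.inf
for degree in (1, 2, 3):
    coeffs = np.polyfit(x_tr, y_tr, degree)       # 1. build the model
    score = rmse(y_va, np.polyval(coeffs, x_va))  # 2. evaluate on held-out data
    if score < best_rmse:                         # 3. improve (keep the best)
        best_deg, best_rmse = degree, score
print(best_deg, round(best_rmse, 3))              # 4. stop when satisfied
```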

Quick Reference: Metrics Comparison Table

| Metric | Range | Interpretation | Best For | Limitations |
|--------|-------|----------------|----------|-------------|
| MAE | [0, ∞) | Average error magnitude | General use, robust to outliers | Not differentiable at zero |
| MSE | [0, ∞) | Average squared error | Optimization | Sensitive to outliers |
| RMSE | [0, ∞) | Error in original units | Easy interpretation | Sensitive to outliers |
| R² | (-∞, 1] | Explained variance | Model comparison | Increases with added features |
| MAPE | [0%, ∞) | Percentage error | Business contexts | Biased, undefined at zero |
| SMAPE | [0%, 200%] | Symmetric % error | Forecasting | Unintuitive range |

Basic Evaluation Metrics

Mean Absolute Error (MAE)

Mean Absolute Error measures the average magnitude of errors without considering their direction.

Formula:

\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|

Advantages:

  • Intuitive interpretation: Directly represents the average error magnitude
  • Robust to outliers: Doesn’t amplify the effect of outliers
  • Unit preservation: Maintains the original unit dimension
  • Straightforward: Directly reflects the average deviation

Limitations:

  • Equal error weighting: Assumes all error magnitudes are equally important
  • Optimization challenges: Not differentiable at zero, which complicates gradient-based optimization
  • Scale dependence: Difficult to compare across different datasets
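A minimal MAE implementation translating the formula above into NumPy; the `mae` helper and sample values are illustrative, not from any particular library:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error: the average magnitude of the residuals."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs(y_true - y_pred))

# Errors are 0.5, 0.0, 1.0, 1.0 -> average magnitude 0.625
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 3.5, 6.0]
print(mae(y_true, y_pred))  # 0.625
```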

Mean Squared Error (MSE) and Root Mean Squared Error (RMSE)

MSE calculates the average of squared differences between predicted and actual values. RMSE brings the metric back to the original units.

MSE Formula:

\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2

RMSE Formula:

\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}

Advantages:

  • Penalizes larger errors: Due to the squared term
  • Differentiable: Easier to use in optimization algorithms
  • Unbiased predictions: Minimizing MSE yields predictions centered on the conditional mean, so residuals average to zero

Limitations:

  • Outlier sensitivity: Heavily influenced by outliers
  • Interpretation difficulty: MSE is not in original units (RMSE solves this)
  • Scale dependence: Cannot be directly compared across datasets with different units
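A sketch of both metrics, with a deliberately planted outlier to show how strongly a single large error inflates RMSE (values are made up for illustration):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average of squared residuals."""
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def rmse(y_true, y_pred):
    """RMSE: square root of MSE, back in the target's original units."""
    return np.sqrt(mse(y_true, y_pred))

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 3.5, 6.0]
print(rmse(y_true, y_pred))  # 0.75

# One outlier (error of 8) dominates the squared-error average
y_true_out = y_true + [10.0]
y_pred_out = y_pred + [2.0]
print(rmse(y_true_out, y_pred_out))  # jumps to ~3.64
```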

Coefficient of Determination (R²)

R² represents the proportion of variance in the dependent variable that can be explained by the independent variables.

Formula:

R^2 = 1 - \frac{\text{RSS}}{\text{TSS}}

Where:

  • RSS (Residual Sum of Squares): $\sum_i (y_i - \hat{y}_i)^2$
  • TSS (Total Sum of Squares): $\sum_i (y_i - \bar{y})^2$

Interpretation:

  • Ranges typically from 0 to 1 (can be negative for very poor models)
  • Higher values indicate better fit (more variance explained)
  • An R² of 0.75 means the model explains 75% of the variance

Advantages:

  • Scale-independent: Can compare models across different target variables
  • Intuitive percentage: Easy to understand as “percent of variance explained”
  • Widely recognized: Standard reporting metric in statistics

Limitations:

  • Always increases with additional features (even irrelevant ones)
  • No penalty for model complexity (unlike adjusted R²)
  • Can be misleading if model assumptions are violated
  • Not ideal for time series or non-linear relationships
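The formula translates directly into NumPy (the `r_squared` helper is illustrative). Note that predicting the mean everywhere makes RSS equal TSS, giving exactly R² = 0:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - RSS / TSS."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    rss = np.sum((y_true - y_pred) ** 2)            # residual sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return 1.0 - rss / tss

y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.9, 5.1]
print(r_squared(y_true, y_pred))  # 0.992

# A constant prediction at the mean explains no variance
print(r_squared(y_true, [3.0] * 5))  # 0.0
```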

Advanced Evaluation Metrics

Percentage-Based Errors

Mean Absolute Percentage Error (MAPE)

MAPE measures the percentage difference between actual and predicted values.

\text{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%

Advantages and Limitations of MAPE:

  • Business-friendly: Easy to communicate as a percentage
  • Scale-independent: Useful for comparing across different datasets
  • Issues with zeros: Undefined when actual values are zero
  • Asymmetric penalty: Penalizes over-forecasting more than under-forecasting
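The asymmetry is easy to see numerically; this sketch assumes a hypothetical `mape` helper and made-up values:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error; undefined when any actual value is 0."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# For a fixed actual value, an under-forecast can never exceed 100%...
print(mape([100.0], [0.0]))    # 100.0 -- the worst possible under-forecast
# ...but an over-forecast is unbounded
print(mape([100.0], [300.0]))  # 200.0
```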

Symmetric Mean Absolute Percentage Error (SMAPE)

SMAPE attempts to address MAPE’s asymmetry problem:

\text{SMAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{(y_i + \hat{y}_i)/2}\right| \times 100\%

Advantages of SMAPE:

  • Bounded range: Output is limited to [0%, 200%]
  • More balanced: Treats over and under-predictions more equally
  • Handles zeros better: Though still problematic when both values are zero
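A minimal sketch of the formula above (the helper name is illustrative): swapping the actual and predicted values leaves the score unchanged, which is the symmetry MAPE lacks:

```python
import numpy as np

def smape(y_true, y_pred):
    """Symmetric MAPE as defined above; bounded by [0%, 200%]."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    denom = (y_true + y_pred) / 2.0  # mean of actual and predicted
    return np.mean(np.abs((y_true - y_pred) / denom)) * 100

# Same score regardless of which side the error falls on
print(smape([100.0], [150.0]))  # 40.0
print(smape([150.0], [100.0]))  # 40.0
```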

Logarithmic Error Metrics

Mean Squared Logarithmic Error (MSLE)

MSLE applies logarithmic transformation before calculating squared error:

\text{MSLE} = \frac{1}{n}\sum_{i=1}^{n}(\log(y_i + 1) - \log(\hat{y}_i + 1))^2

Root Mean Squared Logarithmic Error (RMSLE):

\text{RMSLE} = \sqrt{\text{MSLE}}

These metrics are particularly useful for:

  • Data with exponential growth patterns
  • Cases where errors at lower values should be penalized more than at higher values
  • Datasets with a wide range of values
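A small RMSLE sketch using `np.log1p`, which computes the log(x + 1) in the formula; the values are made up to contrast the same absolute error at a low-valued and a high-valued target:

```python
import numpy as np

def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error via log1p(x) = log(x + 1)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2))

# An error of 10 at a small actual value is a large relative error...
print(rmsle([10.0], [20.0]))
# ...while the same absolute error at a large value barely registers
print(rmsle([1000.0], [1010.0]))
```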

Information Criteria

Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC)

These metrics help balance model fit with complexity:

\text{AIC} = 2k - 2\ln(L)

\text{BIC} = k\ln(n) - 2\ln(L)

Where:

  • k is the number of model parameters
  • L is the likelihood function
  • n is the number of observations

These metrics:

  • Penalize models with more parameters
  • Help prevent overfitting
  • Cannot be compared across different datasets
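For least-squares fits with Gaussian errors, -2 ln(L) equals n·ln(RSS/n) up to an additive constant, so AIC and BIC can be compared from the residual sum of squares alone. A sketch with illustrative numbers (the helper names and RSS values are assumptions, not from the text):

```python
import numpy as np

def aic(n, rss, k):
    """AIC for a least-squares fit with Gaussian errors, up to a constant."""
    return 2 * k + n * np.log(rss / n)

def bic(n, rss, k):
    """BIC analogue: the per-parameter penalty ln(n) exceeds AIC's 2 once n > ~8."""
    return np.log(n) * k + n * np.log(rss / n)

# Hypothetical comparison: a 6-parameter model fits slightly better
# (lower RSS) than a 2-parameter model, but the complexity penalty
# outweighs the improvement, so the simpler model scores lower (better).
n = 50
print(aic(n, rss=5.0, k=2))  # simpler model
print(aic(n, rss=4.6, k=6))  # higher AIC despite the lower RSS
```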

How to Choose the Right Evaluation Metric?

Key Considerations

  1. What is the impact of outliers?

    • If outliers should be heavily penalized: MSE/RMSE
    • If outliers shouldn’t dominate evaluation: MAE
  2. Is over-prediction or under-prediction more costly?

    • If both are equally important: Symmetric measures like RMSE or MAE
    • If one direction is more costly: Consider asymmetric measures or custom loss functions
  3. Do you need relative or absolute measures?

    • For comparing across datasets: R², MAPE
    • For interpretation in original units: RMSE, MAE

General Selection Guidelines

  • Optimize one primary metric during model training
  • Evaluate with multiple metrics to understand different aspects of performance
  • Consider domain standards - some fields have conventional metrics
  • Understand limitations of each metric in your specific context

Conclusion: Best Practices

Understanding regression model evaluation metrics is crucial for building effective predictive models. Different metrics highlight different aspects of model performance:

  • MAE: Best for understanding average error magnitude
  • MSE/RMSE: Best for optimization and when large errors are important
  • R²: Best for explaining variance and model comparison
  • MAPE/SMAPE: Best for business contexts requiring percentage interpretation
  • MSLE/RMSLE: Best for exponential data and when relative errors matter

Remember that the ultimate goal isn’t just to optimize metrics but to create models that generalize well to new, unseen data and provide valuable insights for your specific problem.


EnderJones

PhD candidate · Tokyo, Japan