Linear regression is the most popular method to forecast demand, but it does not always give the best results. Using an example from the fast-moving consumer goods industry, we introduce what are often better alternatives and discuss how to measure predictive accuracy (and no, it is not R²).
We start with a linear regression model that predicts monthly consumption from income; the fit gives an R² of 0.94. To test the model, we fit it on monthly demand from 2000 to 2011 and use it to forecast the known monthly demand for 2012 and 2013.
Our predicted values do not fit the actual values well*, even after we adjust for other factors such as autocorrelation and seasonality. Unfortunately, many analysts stop at the attractive R² and never run this kind of check.
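The holdout test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic (the income trend, the seasonal swing and the noise levels are invented, not the article's FMCG data), the variable names are hypothetical, and the held-out months use the actual income values as the predictor.

```python
import math
import random

random.seed(0)

# Synthetic stand-in data for 168 months (2000-2013); NOT the article's data.
months = range(168)
income = [1000.0 + 5.0 * m + random.gauss(0.0, 20.0) for m in months]
consumption = [
    50.0 + 0.4 * x + 30.0 * math.sin(2.0 * math.pi * m / 12.0) + random.gauss(0.0, 10.0)
    for m, x in zip(months, income)
]

n_train = 144  # 2000-2011 used for fitting, 2012-2013 held out
x_train, y_train = income[:n_train], consumption[:n_train]
x_test, y_test = income[n_train:], consumption[n_train:]

# Ordinary least squares with one predictor: y = intercept + slope * x.
x_mean = sum(x_train) / n_train
y_mean = sum(y_train) / n_train
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_train, y_train))
sxx = sum((x - x_mean) ** 2 for x in x_train)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# In-sample R^2, computed on the training window only.
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(x_train, y_train))
ss_tot = sum((y - y_mean) ** 2 for y in y_train)
r_squared = 1.0 - ss_res / ss_tot

# Out-of-sample MAPE on the 24 held-out months.
predictions = [intercept + slope * x for x in x_test]
holdout_mape = 100.0 * sum(abs((y - p) / y) for y, p in zip(y_test, predictions)) / len(y_test)

print(f"in-sample R^2: {r_squared:.3f}")
print(f"holdout MAPE: {holdout_mape:.2f}%")
```

The point of the sketch is the separation of the two numbers: R² describes how well the line fits the months it was fitted on, while the holdout error describes how well it predicts months it has never seen.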
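Let’s look at a different forecast model called ARMAX (autoregressive moving average with exogenous inputs). It is a major forecasting method that combines regression on an external predictor with autoregressive and moving-average terms. We run the same analysis, again with household income as the predictor.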
The fit between our predicted values and actual values improves when compared to the linear regression model.
Finally, we turn to another method, the time-series model. Unlike the two previous models, it uses no explanatory variable such as income. Instead, it is a sophisticated moving average method that predicts the future solely from historical demand. The idea is that historical data sends strong signals about the future.
Our model results indicate that the fit between our predicted values and actual values improves over the ARMAX model.
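To make the idea of forecasting from history alone concrete, here is a minimal sketch of simple exponential smoothing, one common weighted moving average. The article does not say which time-series method was used, so the choice of technique, the function name, the smoothing weight and the demand figures are all assumptions made for illustration.

```python
def exponential_smoothing_forecast(history, alpha=0.3):
    """One-step-ahead forecast from simple exponential smoothing.

    alpha in (0, 1] is the weight on the most recent observation;
    the default of 0.3 is an arbitrary illustration, not a recommendation.
    """
    if not history:
        raise ValueError("history must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # initialise the level with the first observation
    for demand in history[1:]:
        # New level = weighted average of the latest demand and the old level.
        level = alpha * demand + (1.0 - alpha) * level
    return level


# Made-up monthly demand, for illustration only.
demand_history = [120.0, 130.0, 125.0, 140.0, 135.0]
next_month = exponential_smoothing_forecast(demand_history, alpha=0.5)
print(next_month)  # prints 133.75
```

No income or other predictor appears anywhere in the function: the forecast is built only from past demand, with recent months weighted more heavily than older ones.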
In conclusion, the time-series model gives the best fit among the three models. The linear regression has the lowest accuracy. This is often the case, especially for short-term forecasts.
In fact, almost 60% of executives express satisfaction with time-series models.
Lesson 2: To forecast demand, don’t always rely on linear regression.
In the third post, we will look at adding multiple predictors to our analysis, and further explain how to interpret model results beyond just R².
* To analyze forecast accuracy, FMCG companies mostly use mean absolute percentage error (MAPE). MAPE is the average, across all periods, of the absolute difference between the forecast and the actual value, expressed as a percentage of the actual value.
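The MAPE calculation in the footnote can be written as a short function. This is a sketch with a hypothetical function name and made-up numbers; it assumes no actual value is zero, since the percentage error is undefined in that case.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned in percent.

    Assumes no actual value is zero (the ratio would be undefined).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Made-up numbers for illustration only, not the article's data.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(round(mape(actual, forecast), 2))  # prints 6.67
```

Here the three periods are off by 10%, 10% and 0%, so the average error is about 6.67%.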