Tourism Income Time Series & Intervention Analysis
This project investigates the long-term behaviour of South Africa’s tourist accommodation income using classical and seasonal time series techniques. By modelling over a decade of monthly revenue data, it uncovers underlying trends, seasonal patterns, and structural changes in the tourism sector.
It applies SARIMA modelling, forecast evaluation, and intervention analysis to quantify the impact of COVID-19 on industry income and assess recovery patterns. The project highlights how unexpected shocks can alter the trajectory of economic time series and demonstrates how statistical models can be used to measure, forecast, and communicate these effects clearly.
Category
Time Series Analysis
Language
R
Start Date
September 30, 2024
Designer
Sbusiso Mdingi




This project analyses over a decade of monthly tourism accommodation income in South Africa to understand long-term patterns, seasonal behaviour, and structural changes in the industry. Using classical time series methods, seasonal ARIMA modelling, and intervention analysis, I built a complete analytical workflow that quantifies how revenue evolved before, during, and after COVID-19, and how the industry may recover in future periods.
The objective was to model the underlying trend, seasonality, and stochastic structure of the data, then measure how the COVID-19 intervention disrupted the historical pattern. By comparing pre-intervention forecasts to post-intervention observations, the analysis identifies both the immediate drop in tourism income and the gradual recovery trajectory. The project demonstrates how statistical modelling can uncover meaningful economic insights and translate complex temporal patterns into intuitive visual stories.
FROM DATA TO INSIGHT
Tourism revenue data captures the pulse of an entire sector, yet the raw series alone cannot show how seasonal patterns evolve, what underlying growth exists, or how major shocks reshape the industry. I wanted to build a transparent, step-by-step modelling process that reveals these hidden components and quantifies the true impact of COVID-19 on the tourism economy.
Instead of simply forecasting revenue, the goal was to separate trend, seasonality, and structural change, showing exactly how and when the industry deviated from expected behaviour. This approach makes it possible to measure the magnitude of the shock, the duration of the recovery, and the potential long-term consequences for the sector.
By treating the pandemic as a measurable intervention, the project illustrates how real-world events can be incorporated into statistical models and used to explain economic outcomes with clarity and rigour.
MODELLING STRATEGY
My approach follows a classical time series pipeline. I began by analysing raw income data from 2013–2024 to identify its underlying structure, including trend, seasonal cycles, and extreme events. After confirming non-stationarity with the Augmented Dickey–Fuller (ADF), Phillips–Perron (PP), and KPSS tests, I applied both regular and seasonal differencing to produce a stationary series suitable for modelling.
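The differencing step can be sketched in base R as follows. This is a minimal illustration, not the project's code: the series is simulated (the real income figures are not reproduced here), and only the Phillips–Perron test is shown because it ships with base R as `PP.test`. The ADF and KPSS tests are commonly run with `adf.test` and `kpss.test` from the tseries package, which this sketch does not load.

```r
# Synthetic monthly series standing in for the income data (NOT the real figures):
# a stochastic trend plus a fixed 12-month seasonal pattern.
set.seed(1)
n <- 144                                   # Jan 2013 to Dec 2024
season <- rep(c(8, 3, 5, 2, -4, -8, -6, -3, 1, 2, 4, 10), length.out = n)
income <- ts(200 + cumsum(rnorm(n, mean = 0.5, sd = 2)) + season,
             start = c(2013, 1), frequency = 12)

# Phillips-Perron test (null hypothesis: the series has a unit root).
pp_raw <- PP.test(income)

# Seasonal difference (lag 12) followed by a regular first difference.
income_diff <- diff(diff(income, lag = 12), lag = 1)
pp_diff <- PP.test(income_diff)

# Each differencing step drops observations: 144 - 12 - 1 = 131 remain.
length(income_diff)
round(c(raw = pp_raw$p.value, differenced = pp_diff$p.value), 3)
```

A small p-value after differencing would be consistent with a stationary series; on real data the three tests should be read together, since KPSS reverses the null hypothesis.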
Candidate SARIMA models were evaluated using AIC, BIC, RMSE, MAPE, residual analysis, and autocorrelation diagnostics. The final selected model, SARIMA(2,1,1)×(0,1,1)₁₂, offered the best balance between parsimony and predictive accuracy, capturing both the seasonal pattern and short-term dynamics of tourism income.
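A comparison of this kind might look like the sketch below. It is an assumption-laden illustration: the data are simulated, the second candidate order and the 12-month holdout are hypothetical choices, and base R's `arima` is used instead of whichever fitting function the project actually relied on.

```r
# Synthetic monthly series (NOT the project's data).
set.seed(42)
n <- 144
season <- rep(c(8, 3, 5, 2, -4, -8, -6, -3, 1, 2, 4, 10), length.out = n)
increments <- 0.5 + arima.sim(model = list(ar = c(0.5, -0.3)), n = n, sd = 1.5)
income <- ts(200 + cumsum(as.numeric(increments)) + season,
             start = c(2013, 1), frequency = 12)

# Hypothetical split: fit on 2013-2023, hold out 2024 for forecast evaluation.
train <- window(income, end = c(2023, 12))
test  <- window(income, start = c(2024, 1))

# Candidate non-seasonal orders; the seasonal part is fixed at (0,1,1)[12].
orders <- list(
  "SARIMA(2,1,1)x(0,1,1)[12]" = c(2, 1, 1),
  "SARIMA(0,1,1)x(0,1,1)[12]" = c(0, 1, 1)
)

evaluate <- function(ord) {
  fit <- arima(train, order = ord,
               seasonal = list(order = c(0, 1, 1), period = 12),
               method = "ML")
  fc  <- predict(fit, n.ahead = length(test))$pred
  err <- as.numeric(test) - as.numeric(fc)
  c(AIC  = AIC(fit),
    BIC  = BIC(fit),
    RMSE = sqrt(mean(err^2)),
    MAPE = 100 * mean(abs(err / as.numeric(test))))
}

# One row per candidate, one column per criterion.
scores <- t(sapply(orders, evaluate))
print(round(scores, 2))
```

Lower values are better on every column; AIC and BIC reward in-sample fit while penalising extra parameters, whereas RMSE and MAPE measure out-of-sample accuracy on the holdout.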
To measure COVID-19’s impact, I applied multiple intervention modelling strategies, including covariate-based step functions, ratio adjustments, and targeted trial-and-error calibration. These approaches allowed me to compare expected (counterfactual) income to observed revenue and quantify percentage losses month by month throughout the pandemic and recovery period.
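Of these strategies, the covariate-based step function is the easiest to sketch. The example below is illustrative only: the series is simulated, the April 2020 start month and the size of the drop (`true_drop`) are hypothetical values chosen for the demonstration, and the ratio and calibration approaches are not shown.

```r
# Synthetic baseline series (NOT the project's data).
set.seed(7)
n <- 144                                   # Jan 2013 to Dec 2024
season <- rep(c(8, 3, 5, 2, -4, -8, -6, -3, 1, 2, 4, 10), length.out = n)
increments <- 0.5 + arima.sim(model = list(ar = c(0.5, -0.3)), n = n, sd = 1.5)
baseline <- 200 + cumsum(as.numeric(increments)) + season

# Step regressor: 0 before the intervention, 1 from index 88 (April 2020,
# an assumed start month) onward.
step <- as.numeric(seq_len(n) >= 88)
true_drop <- -80                           # hypothetical permanent level shift
observed <- ts(baseline + true_drop * step, start = c(2013, 1), frequency = 12)

# SARIMA errors with the step function as an external regressor.
fit <- arima(observed, order = c(2, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12),
             xreg = cbind(step = step), method = "ML")

# Estimated level shift attributed to the intervention.
effect <- unname(coef(fit)["step"])

# Counterfactual = observed income with the estimated shift removed.
counterfactual <- as.numeric(observed) - effect * step
pct_loss <- 100 * (counterfactual - as.numeric(observed)) / counterfactual

round(effect, 1)
round(head(pct_loss[step == 1], 6), 1)     # first six affected months
```

A single step assumes a permanent shift of constant size. A shock followed by gradual recovery would need a richer regressor, such as a decaying pulse, which is one reason to compare several intervention specifications.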
TECHNICAL BREAKDOWN
The modelling pipeline incorporates decomposition, seasonality tests, SARIMA parameter estimation, model comparison, and intervention effect measurement. Diagnostic tools such as residual plots, Ljung–Box tests, and normality checks ensured that the final model met all key assumptions and captured the essential structure of the data without overfitting.
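The decomposition and residual checks can be sketched with base R alone. The snippet assumes a simulated series, a lag of 24 for the Ljung–Box test, and the Shapiro–Wilk test as the normality check; none of these specifics are confirmed by the write-up above.

```r
# Synthetic monthly series (NOT the project's data).
set.seed(11)
n <- 144
season <- rep(c(8, 3, 5, 2, -4, -8, -6, -3, 1, 2, 4, 10), length.out = n)
increments <- 0.5 + arima.sim(model = list(ar = c(0.5, -0.3)), n = n, sd = 1.5)
income <- ts(200 + cumsum(as.numeric(increments)) + season,
             start = c(2013, 1), frequency = 12)

# Decomposition into seasonal, trend, and remainder components.
dec <- stl(income, s.window = "periodic")
head(dec$time.series, 3)

# Fit the selected specification and extract its residuals.
fit <- arima(income, order = c(2, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12),
             method = "ML")
res <- residuals(fit)

# Ljung-Box test for leftover autocorrelation; fitdf = 4 accounts for the
# estimated ARMA coefficients (2 AR + 1 MA + 1 seasonal MA).
lb <- Box.test(res, lag = 24, type = "Ljung-Box", fitdf = 4)

# Shapiro-Wilk test as one possible normality check.
sw <- shapiro.test(as.numeric(res))

round(c(ljung_box_p = lb$p.value, shapiro_p = unname(sw$p.value)), 3)
```

Large p-values here would mean neither test finds evidence against white-noise, approximately normal residuals; plots of `res` and `acf(res)` are the natural visual companions.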
Forecasts produced using the selected SARIMA model illustrate how the industry would have evolved in the absence of COVID-19, while the intervention models show how actual income diverged from these expectations. The final analysis quantifies the total income lost during the pandemic, the percentage drop at each point in time, and the rate at which the sector is recovering, highlighting the possibility of a long-term structural shift.
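The counterfactual comparison can be sketched as follows. Everything numeric here is hypothetical: the series is simulated, March 2020 is assumed to be the last pre-pandemic month, and the shock (an 80% initial drop that shrinks by 5% a month) is invented purely to give the calculation something to measure.

```r
# Synthetic "no-pandemic" path (NOT the project's data).
set.seed(3)
n <- 144                                   # Jan 2013 to Dec 2024
cutoff <- 87                               # March 2020, assumed last normal month
season <- rep(c(8, 3, 5, 2, -4, -8, -6, -3, 1, 2, 4, 10), length.out = n)
increments <- 0.5 + arima.sim(model = list(ar = c(0.5, -0.3)), n = n, sd = 1.5)
baseline <- ts(200 + cumsum(as.numeric(increments)) + season,
               start = c(2013, 1), frequency = 12)

# Hypothetical shock: proportional loss that fades geometrically after the cutoff.
h <- n - cutoff
shock <- c(rep(0, cutoff), -0.8 * 0.95^(0:(h - 1)))
observed <- baseline * (1 + shock)

# Fit on pre-intervention data only, then forecast the counterfactual path.
pre  <- window(observed, end = c(2020, 3))
post <- window(observed, start = c(2020, 4))
fit  <- arima(pre, order = c(2, 1, 1),
              seasonal = list(order = c(0, 1, 1), period = 12),
              method = "ML")
fc <- as.numeric(predict(fit, n.ahead = length(post))$pred)

# Month-by-month percentage gap and cumulative income lost versus the forecast.
pct_gap    <- 100 * (as.numeric(post) - fc) / fc
total_loss <- sum(fc - as.numeric(post))

round(head(pct_gap, 6), 1)
round(total_loss)
```

Forecast uncertainty grows with the horizon, so gaps several years out are best reported alongside the prediction intervals implied by `predict(...)$se` rather than as point estimates alone.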
This project demonstrates not only my ability to apply advanced time series techniques, but also my capacity to turn statistical modelling into clear, economically meaningful insight.