Lab: forecast a time series

Decompose a synthetic monthly sales series, check its ACF, fit exponential smoothing, evaluate on the held-out tail, and interpret each step.

Forecasting is a sequence of decisions: what structure is present, what model fits that structure, and how well that model generalises to future data. This lab walks through that sequence on a controlled synthetic series where you know the ground truth and can verify your conclusions against it.

The series

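The series is 48 months of synthetic sales: an upward trend of about 2.5 units per month from a base near 200, a repeating annual cycle of about ±30, and noise with a standard deviation of 12. The first 36 months are used for training; the last 12 months are held out for evaluation.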
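
A minimal sketch of how such a series could be generated, for readers who want to reproduce it outside the lab. The base of 200, slope of 2.5, amplitude of 30, and noise standard deviation of 12 are taken from the expected values quoted in Checkpoint 2; the sine shape of the annual cycle and the seed are assumptions, so the numbers will not match the lab's own cell exactly.

```python
import numpy as np

rng = np.random.default_rng(42)   # assumed seed, only for reproducibility

n_months, period, n_test = 48, 12, 12
t = np.arange(n_months)

trend = 200 + 2.5 * t                            # assumed base 200, slope 2.5 per month
seasonal = 30 * np.sin(2 * np.pi * t / period)   # assumed sine-shaped annual cycle, amplitude 30
noise = rng.normal(0, 12, n_months)              # Gaussian noise, standard deviation 12
sales = trend + seasonal + noise

# Hold out the last 12 months; they are not used while fitting
train, test = sales[:-n_test], sales[-n_test:]
print(f"train months: {train.size}, test months: {test.size}")
```

Splitting by position rather than at random matters here: the test set has to lie strictly after the training data, otherwise the evaluation no longer measures forecasting.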

Checkpoint 2 — decompose

The trend should span roughly 200 to 288 (a rise of 2.5 × 35 = 87.5 across the 36 training months). The seasonal amplitude should be around ±30. The residual standard deviation should be close to the true noise standard deviation of 12.
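
The three checks above can be sketched with a classical additive decomposition written by hand in NumPy. This is an illustration, not necessarily what the lab's cell runs: the series parameters, the seed, and the choice of a centred moving average for the trend are all assumptions.

```python
import numpy as np

# Assumed series: base 200, slope 2.5, sine cycle of amplitude 30, noise sd 12
rng = np.random.default_rng(42)
t = np.arange(48)
sales = 200 + 2.5 * t + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 12, 48)
train = sales[:36]
period = 12

# Trend: centred 2x12 moving average (half weight on the two end months)
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
trend = np.full(train.size, np.nan)
trend[period // 2 : train.size - period // 2] = np.convolve(train, weights, mode="valid")

# Seasonal: mean detrended value per calendar month, centred to sum to zero
detrended = train - trend
month_effect = np.array([np.nanmean(detrended[m::period]) for m in range(period)])
month_effect -= month_effect.mean()
seasonal = np.tile(month_effect, train.size // period)

# Residual: what trend and seasonal leave unexplained
residual = train - trend - seasonal

print(f"trend range:        {np.nanmin(trend):.0f} to {np.nanmax(trend):.0f}")
print(f"seasonal amplitude: {np.abs(month_effect).max():.1f}")
print(f"residual std:       {np.nanstd(residual):.1f}")
```

Two caveats apply to this particular sketch. The moving average is undefined for the first and last six months, so the printed trend range covers only the middle of the training window and is narrower than the full range. Each monthly effect is also estimated from just two or three values, which tends to absorb some noise into the seasonal component and pull the residual standard deviation below 12.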

Checkpoint 3 — check ACF

Expect high autocorrelation at lag 1 (the trend carries memory forward) and a visible spike at lag 12 (the annual cycle). If the lag-12 value is not clearly above the values at lags 11 and 13, the seasonal component is too weak relative to the noise to detect reliably in 36 observations.
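
The lag comparison described above can be sketched with a hand-written sample ACF. The helper `acf` and the series parameters are assumptions made for illustration; the band of ±1.96/√n is the usual approximate 95% interval under white noise.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (standard biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: x.size - k], x[k:]) / denom for k in range(max_lag + 1)])

# Assumed series: base 200, slope 2.5, sine cycle of amplitude 30, noise sd 12
rng = np.random.default_rng(42)
t = np.arange(48)
sales = 200 + 2.5 * t + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 12, 48)
train = sales[:36]

r = acf(train, max_lag=14)
band = 1.96 / np.sqrt(train.size)   # approximate 95% band under white noise

for lag in (1, 11, 12, 13):
    flag = "outside band" if abs(r[lag]) > band else "inside band"
    print(f"lag {lag:2d}: {r[lag]:+.2f} ({flag})")
```

The band is only a rough guide here: it assumes white noise, and the trend inflates the low-lag autocorrelations. At lag 12 the estimate also rests on just 24 pairs of observations, so treat a borderline value with caution.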

Checkpoint 4 — fit and evaluate

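Fit Holt-Winters exponential smoothing to the 36 training months, forecast the 12 held-out months, and score the forecast with mean absolute error (MAE).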
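
A minimal sketch of this step, using an additive Holt-Winters recursion written by hand. The function `holt_winters_additive`, its fixed smoothing weights (0.3, 0.05, 0.2), the simple initialisation from the first two cycles, and the series parameters are all assumptions; a library implementation would normally optimise the weights instead.

```python
import numpy as np

def holt_winters_additive(y, period, horizon, alpha=0.3, beta=0.05, gamma=0.2):
    """Additive Holt-Winters with fixed smoothing weights (assumed, not optimised)."""
    y = np.asarray(y, dtype=float)
    first, second = y[:period].mean(), y[period:2 * period].mean()
    slope = (second - first) / period
    # `first` is the level at the middle of cycle one; step back to the month before the data
    level = first - (period + 1) / 2 * slope
    season = y[:period] - (first + (np.arange(period) - (period - 1) / 2) * slope)

    for i, obs in enumerate(y):
        s = season[i % period]
        prev_level = level
        level = alpha * (obs - s) + (1 - alpha) * (prev_level + slope)
        slope = beta * (level - prev_level) + (1 - beta) * slope
        season[i % period] = gamma * (obs - level) + (1 - gamma) * s

    steps = np.arange(1, horizon + 1)
    return level + steps * slope + season[(y.size + steps - 1) % period]

# Assumed series: base 200, slope 2.5, sine cycle of amplitude 30, noise sd 12
rng = np.random.default_rng(42)
t = np.arange(48)
sales = 200 + 2.5 * t + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 12, 48)
train, test = sales[:36], sales[36:]

forecast = holt_winters_additive(train, period=12, horizon=test.size)
errors = test - forecast            # positive means the model under-predicted
mae = np.mean(np.abs(errors))
mae_pct = 100 * mae / test.mean()
print(f"MAE: {mae:.1f}   MAE%: {mae_pct:.1f}%")
```

The `errors` array keeps the sign of each monthly miss, which is what the per-month check in the interpretation guide needs; MAE alone hides whether the model is consistently high or low.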

Interpretation guide

Work through these after running the code:

Is MAE as a percentage of the test mean below 10%? That is a reasonable target for this series: a noise standard deviation of 12 against a mean around 290 sets a floor of roughly 4%. If your MAE% sits close to that floor, the remaining error is mostly noise and the model cannot do better without more signal. If it is well above 10%, the model is probably missing part of the trend or the seasonal pattern.
Do the per-month errors show a pattern? Systematic over-prediction in summer and under-prediction in winter suggests the seasonal component was not estimated well. This can happen because 36 training months give only 3 complete annual cycles, barely enough for reliable seasonal estimation.
Compare to a naïve baseline. If Holt-Winters does not substantially beat "repeat last year's value", the complexity is not justified for this series length.
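
The baseline comparison can be sketched as follows. The helper `mae_pct` and the series parameters are assumptions for illustration; the noise-floor line uses the fact that the mean absolute value of Gaussian noise with standard deviation σ is σ·√(2/π).

```python
import numpy as np

def mae_pct(actual, predicted):
    """Mean absolute error as a percentage of the mean of the actual values."""
    return 100 * np.mean(np.abs(actual - predicted)) / np.mean(actual)

# Assumed series: base 200, slope 2.5, sine cycle of amplitude 30, noise sd 12
rng = np.random.default_rng(42)
t = np.arange(48)
sales = 200 + 2.5 * t + 30 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 12, 48)
train, test = sales[:36], sales[36:]

# Seasonal naive baseline: each test month is forecast as the same month one year earlier
naive = train[-12:]
signed = test - naive               # positive means the baseline under-predicted

print(f"seasonal naive MAE%: {mae_pct(test, naive):.1f}%")
print(f"mean signed error:   {signed.mean():+.1f}")
# MAE that pure Gaussian noise with sd 12 would produce on its own
print(f"noise-floor MAE:     {12 * np.sqrt(2 / np.pi):.1f}")
```

Under these assumptions the baseline ignores a year of trend (12 × 2.5 = 30 units), so its signed errors should be mostly positive. A Holt-Winters MAE% that lands near the noise floor and clearly below the baseline is evidence that the extra complexity is paying for itself.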

In production, you would re-evaluate the model periodically as new data arrives and retrain if accuracy degrades. The held-out evaluation in this lab simulates that first production test: "does the model generalise to future months it has never seen?"

Where to go next

The Time Series module is complete. Next: Pipeline Design — the engineering concerns that determine whether an analysis done in a notebook can become a reliable, reproducible system.
