"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

August 11, 2022

Forecasting - Timeseries

Key Notes

  • ML for forecasting
  • M5 dataset hierarchical information
  • A large number of correlated time series (about 30K)


  • Sparsity of data
  • Weekly / Seasonal patterns
  • LightGBM performed better
  • Baseline exponential smoothing was better
  • Benchmark with simple methods

  • Table of features
  • Features with info from past
  • Feature / Lag1 / Lag2
  • Known past/future values
  • Future advertising spend


  • Use weather forecast for future
  • Create a naive forecast from the previous value
  • Use lag features
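
The lag table and the naive forecast noted above can be sketched in a few lines. This is a minimal illustration in plain Python, not the speaker's code: the helper names (`make_lag_features`, `naive_forecast`) and the sales numbers are made up for the example. With pandas, `Series.shift` would do the same job.

```python
def make_lag_features(series, lags=(1, 2)):
    """Return one row per time step with the target and its lagged values.

    Rows whose lags would reach before the start of the series are skipped.
    """
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        row = {"y": series[t]}
        for lag in lags:
            row[f"lag{lag}"] = series[t - lag]
        rows.append(row)
    return rows


def naive_forecast(series):
    """Naive baseline: the forecast for step t is the value observed at t - 1."""
    return series[:-1]  # aligned with the actuals series[1:]


sales = [10, 12, 11, 15, 14]  # made-up daily sales
table = make_lag_features(sales)
preds = naive_forecast(sales)

# Mean absolute error of the naive baseline, a benchmark for any ML model
mae = sum(abs(a - p) for a, p in zip(sales[1:], preds)) / len(preds)

for row in table:
    print(row)
print("naive MAE:", mae)
```

Any model built on the lag table should at least beat the naive MAE; if it does not, the extra complexity is not paying for itself.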

  • Metadata static features

  • Multi-step forecasting
  • Direct forecasting
  • Recursive forecasting

  • Models to build one step, two steps ahead
  • Recursive forecasting
  • Fit once and recursively use the model with one step ahead forecast
  • Append to training data
  • Recreate features
  • Plug that back into the model
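
The two multi-step strategies above can be contrasted in a small sketch. It assumes a deliberately simple model (a least-squares line on lag 1) so that it runs without dependencies; in the talk's setting the model would be something like LightGBM with many features. The function names and the toy series are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def recursive_forecast(series, horizon):
    """Fit one one-step-ahead model, then feed each forecast back in as the next lag."""
    a, b = fit_line(series[:-1], series[1:])  # lag 1 -> next value
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        y_hat = a + b * history[-1]  # recreate the lag feature from the latest value
        forecasts.append(y_hat)
        history.append(y_hat)        # append the forecast as if it were observed
    return forecasts


def direct_forecast(series, horizon):
    """Fit a separate model per step ahead: model h maps y[t] to y[t + h]."""
    forecasts = []
    for h in range(1, horizon + 1):
        a, b = fit_line(series[:-h], series[h:])
        forecasts.append(a + b * series[-1])
    return forecasts


series = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]  # made-up series with a constant step of 2
rec = recursive_forecast(series, 3)
direct = direct_forecast(series, 3)
print("recursive:", rec)
print("direct:   ", direct)
```

On this noise-free series both strategies agree. On real data they differ: the recursive loop reuses its own forecasts as inputs, so an early error carries forward, while the direct approach needs one fitted model per horizon.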


  • Independent models have issues
  • Recursive is less complicated
  • Correlated but errors may propagate
  • Split data by time to replicate the actual forecasting process
  • Split by time horizon
  • Split & Forecast Horizon
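
Splitting by time and by forecast horizon can be sketched as a small generator. This is one possible scheme (an expanding training window with consecutive test blocks), assumed here for illustration; `time_splits` and its parameters are hypothetical names.

```python
def time_splits(n_obs, horizon, n_folds):
    """Yield (train_indices, test_indices) with every test block after its training data.

    Each fold trains on everything before a cutoff and tests on the next
    `horizon` observations; the cutoff moves forward by `horizon` per fold.
    """
    first_cutoff = n_obs - horizon * n_folds
    if first_cutoff <= 0:
        raise ValueError("not enough observations for the requested folds")
    for fold in range(n_folds):
        cutoff = first_cutoff + fold * horizon
        yield list(range(cutoff)), list(range(cutoff, cutoff + horizon))


splits = list(time_splits(n_obs=10, horizon=2, n_folds=3))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  test {test_idx}")
```

Unlike a random split, no fold ever trains on observations that come after its test block, which mirrors how the forecast is produced in practice.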

  • Differences for time series
  • Split by time

  • Feature Engineering
  • Data Imputation
  • Encode Categorical variable
  • Temporal aspects - Time
  • Future data (Marketing info / Promos)
  • Do not let future data leak into the past


  • Weekly Seasonality
  • Exogenous features - Advertising spend
  • Effect distributed in time
  • Spend on a daily basis (Distributed lags)
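
Distributed lags of daily spend can be sketched as one feature column per lag, so that a model can give a separate weight to spend from today, yesterday, and so on. The helper name `distributed_lags` and the spend figures are made up for this example.

```python
def distributed_lags(spend, max_lag):
    """Build rows [spend[t], spend[t-1], ..., spend[t-max_lag]] for each usable t."""
    rows = []
    for t in range(max_lag, len(spend)):
        rows.append([spend[t - k] for k in range(max_lag + 1)])
    return rows


daily_spend = [100, 0, 0, 50, 0]  # made-up advertising spend per day
features = distributed_lags(daily_spend, max_lag=2)

for row in features:
    print(row)  # columns: spend today, 1 day ago, 2 days ago
```

Lag 0 is included on the assumption that planned spend is known ahead of time; if it is not, the columns should start at lag 1.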

  • Lag selection

  • Seasonal trends - Festivals / Seasons

  • Create a bunch of lags

  • Window features (Function over a window of time)

  • Rolling standard deviation
  • Rolling mean
  • Month trend
  • Biweekly trend
  • Weekend trend
  • Festive trend
  • Nested window features
  • Model learns seasonality
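
A few of the window and calendar features listed above can be sketched with the standard library. The window length, feature names, and data are assumptions for the example; with pandas, `Series.rolling` on a shifted series would be the usual route.

```python
from datetime import date, timedelta
from statistics import mean, stdev


def window_features(dates, values, window=3):
    """Rolling mean/std over the previous `window` values, plus a weekend flag."""
    rows = []
    for t in range(window, len(values)):
        past = values[t - window:t]  # the `window` values strictly before step t
        rows.append({
            "date": dates[t],
            "rolling_mean": mean(past),
            "rolling_std": stdev(past),  # sample standard deviation
            "is_weekend": dates[t].weekday() >= 5,  # Saturday = 5, Sunday = 6
        })
    return rows


start = date(2022, 8, 8)  # a Monday
dates = [start + timedelta(days=i) for i in range(7)]
values = [3, 5, 7, 9, 11, 13, 15]  # made-up daily values

rows = window_features(dates, values)
for row in rows:
    print(row)
```

Flags such as `is_weekend` give the model a direct handle on seasonality, while the rolling statistics summarise the recent level and volatility.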

  • Future values are discarded during feature computation
  • Use value before the timestamp
  • Expanding window of mean
  • Useful libraries
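
The expanding-window mean, computed only from values before each timestamp, can be sketched as follows. This is a plain-Python illustration with a hypothetical helper name; the second series exists only to show that changing a later value leaves earlier features untouched.

```python
def expanding_mean(values):
    """Expanding mean of everything strictly before each step; None for the first step."""
    features = []
    running_total = 0.0
    for t, value in enumerate(values):
        features.append(running_total / t if t > 0 else None)
        running_total += value  # the current value only becomes visible to later steps
    return features


original = [4.0, 6.0, 8.0, 10.0]
changed = [4.0, 6.0, 8.0, 999.0]  # same history, different final value

feats_original = expanding_mean(original)
feats_changed = expanding_mean(changed)
print(feats_original)
print(feats_changed)
```

Both calls give the same features because the value at a step never enters its own feature. Comparing features before and after altering later values is a cheap check for leakage.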


Ref Slides

Exogenous Variables - having an external cause or origin

  • Agriculture - Crop-eating pests, weather, crop diseases
  • Supply chain - Economy, changes in consumer attitudes
  • Retail - Economy indicator, Weather, Unemployment rates, Inflation

Keep Forecasting!!!!
