We treat each (product number, store number) pair as a separate entity.
We include an additional 'open' flag to denote whether data is present on a given day.
Data is resampled at regular daily intervals, imputing any missing days using the last available observation.
We apply a log transform to the sales data and adopt z-score normalization across all entities.
We drop any record with missing values.
The training set is made up of samples taken between 2015-01-01 and 2015-12-01. The validation set consists of samples from the 30 days after the training set. The test set covers all entities over the 30-day horizon following the validation set.
We consider log sales, transactions, and oil to be real-valued and the rest to be categorical.
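The steps above can be sketched in pandas as follows. This is a minimal illustration under stated assumptions, not the original pipeline: the column names `date`, `item_nbr`, `store_nbr`, and `unit_sales` are hypothetical, the log transform is taken as log(1 + x), the normalization statistics are computed on the training split only, and the split boundaries are treated as half-open intervals. None of these details are specified in the text.

```python
import numpy as np
import pandas as pd


def resample_daily(df: pd.DataFrame) -> pd.DataFrame:
    """Resample every (item_nbr, store_nbr) entity to one row per day.

    Missing days are forward-filled from the last available observation,
    and an 'open' flag records whether the day existed in the raw data.
    Assumes at most one raw row per entity and day.
    """
    frames = []
    for _, g in df.groupby(["item_nbr", "store_nbr"]):
        g = g.set_index("date").sort_index()
        observed = g.index
        days = pd.date_range(observed.min(), observed.max(), freq="D")
        g = g.reindex(days).ffill()
        g["open"] = days.isin(observed).astype(int)
        frames.append(g.rename_axis("date").reset_index())
    # Drop any record that still has a missing value.
    return pd.concat(frames, ignore_index=True).dropna()


def split_by_date(df, train_start="2015-01-01", train_end="2015-12-01",
                  horizon_days=30):
    """Split into train, validation, and test sets by date.

    Assumption: train covers [train_start, train_end), followed by two
    consecutive windows of `horizon_days` days each.
    """
    start = pd.Timestamp(train_start)
    valid_start = pd.Timestamp(train_end)
    test_start = valid_start + pd.Timedelta(days=horizon_days)
    test_end = test_start + pd.Timedelta(days=horizon_days)
    date = df["date"]
    train = df[(date >= start) & (date < valid_start)].copy()
    valid = df[(date >= valid_start) & (date < test_start)].copy()
    test = df[(date >= test_start) & (date < test_end)].copy()
    return train, valid, test


def add_scaled_log_sales(train, *others):
    """Add 'log_sales': log1p of sales, z-scored with one global mean and
    standard deviation (shared by all entities) taken from the training set."""
    log_train = np.log1p(train["unit_sales"])
    mu, sigma = log_train.mean(), log_train.std()
    scaled = []
    for part in (train, *others):
        part = part.copy()
        part["log_sales"] = (np.log1p(part["unit_sales"]) - mu) / sigma
        scaled.append(part)
    return scaled
```

A typical call chain would be `daily = resample_daily(raw)`, then `train, valid, test = split_by_date(daily)`, then `train, valid, test = add_scaled_log_sales(train, valid, test)`. Computing the mean and standard deviation on the training split alone avoids leaking validation and test statistics into the inputs; if the original pipeline normalized over the full dataset, the statistics would instead be taken before splitting.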