Predicting Hotel Cancellations with Machine Learning

This project predicts hotel cancellations for two Portuguese hotels (H1 and H2) on both a classification and a time series basis. The GitHub repository includes the datasets and notebooks for all models run. The Python version used is 3.6.5.

The original datasets and research by Antonio et al. can be found here: Using Data Science to Predict Hotel Booking Cancellations

The classification models were built on the H1 dataset, and their predictions were then evaluated against the H2 dataset.

Time series forecasting was conducted on H1 and H2 independently.

Findings

H1 Results

Metric    ARIMA     LSTM
MDA        0.86     0.8
RMSE      57.95    60.92
MFE      -12.72   -51.62

H2 Results

Metric    ARIMA     LSTM
MDA        0.86     0.8
RMSE     274.08   109.53
MFE      156.33    54.07
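
The three metrics in the tables above are read here as mean directional accuracy (MDA), root mean squared error (RMSE) and mean forecast error (MFE). The sketch below shows one common way to compute them in plain Python. It is an illustration, not the code from the project notebooks: the function names and the sample numbers are hypothetical, the MFE sign convention (actual minus predicted) is an assumption, and MDA is assumed to compare each forecast with the previous actual value.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: typical size of the forecast error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mfe(actual, predicted):
    """Mean forecast error (bias), assumed here to be mean(actual - predicted).

    Under this convention a negative value means the model over-forecasts
    on average; some sources define it with the opposite sign.
    """
    return sum(a - p for a, p in zip(actual, predicted)) / len(actual)


def _sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def mda(actual, predicted):
    """Mean directional accuracy.

    Share of time steps where the predicted move from the previous actual
    value has the same sign as the move that actually occurred.
    """
    hits = [
        _sign(actual[t] - actual[t - 1]) == _sign(predicted[t] - actual[t - 1])
        for t in range(1, len(actual))
    ]
    return sum(hits) / len(hits)


# Hypothetical weekly cancellation counts, for illustration only.
actual = [100, 120, 110, 130, 125]
predicted = [98, 115, 118, 128, 131]

print("MDA: ", mda(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("MFE: ", mfe(actual, predicted))
```

Read together, MDA shows how often a model gets the direction of change right, RMSE shows the typical size of the error in the same units as the series, and MFE shows whether the forecasts lean consistently high or low.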

The individual articles with the relevant findings can be accessed below:

Feature Selection and Classification

Time Series Forecasting