Forecasting Average Daily Rate Trends For Hotels Using LSTM
Here is how an LSTM model can be used to forecast the ADR (average daily rate) for hotels - a cornerstone metric within the industry.
Average Daily Rate (ADR) is recognised as one of the most important metrics for hotels.
It is calculated as follows:
ADR = Room revenue ÷ Number of rooms sold
Essentially, ADR is measuring the average price of a hotel room over a given period.
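As a quick illustration of the formula, with purely illustrative figures (not taken from the dataset under study):

```python
# Worked example of the ADR formula (illustrative numbers only)
revenue = 12500.0   # room revenue over the period
rooms_sold = 100    # number of rooms sold over the same period
adr = revenue / rooms_sold
print(adr)  # 125.0
```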
The dataset under study consists of booking records (including cancellations) for a Portuguese hotel, with the ADR of each individual booking included as one of the variables.
Using time series analysis, let us assume that the hotel wishes to 1) calculate the average ADR value across all bookings in a given week (hereafter referred to as "weekly ADR") and 2) use this data to forecast future weekly ADR trends.
One will note that the dataset contains numerous cancelled bookings with a positive ADR value; it is assumed in this case that even though the customer cancelled, they were still ultimately charged for the booking (e.g. they cancelled past the cancellation deadline).
A long short-term memory (LSTM) network is used to do this. LSTMs are sequential neural networks that assume dependence between the observations in a particular series. As such, they have increasingly come to be used for time series forecasting purposes.
For reference, the ADR per customer is used here; some customers are companies rather than individuals, which results in more than one room per booking in many cases.
Using pandas, the full date (year and week number) is joined with the corresponding ADR Value for each booking.
These data points were then grouped together to obtain the average ADR per week across all bookings as follows:
```python
df4 = df3.groupby('FullDate').agg("mean")
df4 = df4.sort_values(['FullDate'], ascending=True)
df4
```
Here is what the new dataframe looks like:
As a side note, the full notebook and datasets are available at the link for the GitHub repository provided below, where the data manipulation procedures are illustrated in more detail.
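As a rough sketch of the date construction and grouping described above (the column names here are illustrative placeholders, not necessarily those of the actual dataset):

```python
import pandas as pd

# Illustrative placeholder columns; the actual dataset's column names may differ
df3 = pd.DataFrame({
    'year': [2015, 2015, 2015, 2015],
    'week': [27, 27, 28, 28],
    'ADR':  [75.0, 85.0, 90.0, 110.0],
})

# Join year and week number into a single full-date key
df3['FullDate'] = df3['year'].astype(str) + '-' + df3['week'].astype(str)

# Average ADR across all bookings in each week
df4 = df3.groupby('FullDate')[['ADR']].agg('mean')
print(df4)
```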
A plot of the time series is generated:
```python
import matplotlib.pyplot as plt

plt.plot(tseries)
plt.tick_params(
    axis='x',           # changes apply to the x-axis
    which='both',       # both major and minor ticks are affected
    bottom=False,       # ticks along the bottom edge are off
    top=False,          # ticks along the top edge are off
    labelbottom=False)  # labels along the bottom edge are off
plt.ylabel('ADR')
plt.title("Weekly ADR")
plt.show()
```
LSTM Model Configuration
Let’s begin the analysis for the H1 dataset. The first 100 observations from the created time series are selected. Then, a dataset matrix is created and the data is scaled.
```python
df = df[:100]

# Form dataset matrix
def create_dataset(df, previous=1):
    dataX, dataY = [], []
    for i in range(len(df)-previous-1):
        a = df[i:(i+previous), 0]
        dataX.append(a)
        dataY.append(df[i + previous, 0])
    return np.array(dataX), np.array(dataY)
```
The data is then normalized with MinMaxScaler in order to allow the neural network to interpret it properly:
```python
# normalize dataset with MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
df = scaler.fit_transform(df)
df
```
Here is a sample of the output:
```
array([[0.35915778],
       [0.42256282],
       [0.53159902],
       ...
       [0.27125524],
       [0.26293747],
       [0.25547682]])
```
The data is partitioned into training and test sets, with the previous parameter set to 5:
```python
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM

# Training and Validation data partition
train_size = int(len(df) * 0.8)
val_size = len(df) - train_size
train, val = df[0:train_size,:], df[train_size:len(df),:]

# Number of previous time steps to use as predictors
previous = 5
X_train, Y_train = create_dataset(train, previous)
X_val, Y_val = create_dataset(val, previous)
```
With the previous parameter set to 5, the value at time t (Y_train for the training data) is predicted using the values at t-1, t-2, t-3, t-4, and t-5 (all under X_train).
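The windowing can be illustrated on a toy series of consecutive integers, using previous=3 here purely for readability (the function mirrors create_dataset as defined above):

```python
import numpy as np

# Mirror of the create_dataset function defined earlier
def create_dataset(df, previous=1):
    dataX, dataY = [], []
    for i in range(len(df) - previous - 1):
        dataX.append(df[i:(i + previous), 0])  # window of `previous` values
        dataY.append(df[i + previous, 0])      # value immediately after it
    return np.array(dataX), np.array(dataY)

series = np.arange(10, dtype=float).reshape(-1, 1)  # 0.0, 1.0, ..., 9.0
X, y = create_dataset(series, previous=3)
print(X[0], y[0])  # [0. 1. 2.] 3.0
```

Each target value is thus predicted from the window of values immediately preceding it.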
Here is a sample of the Y_train array:
```
array([0.70858066, 0.75574219, 0.7348692 , 0.63555916, 0.34629856,
       0.32723163, 0.18514608, 0.21056117, 0.13243974, 0.1321469 ,
       0.06636683, 0.09516089, 0.02223529, 0.02497857, 0.06036494,
       ...
       0.12222412, 0.07324677, 0.05206859, 0.05937164, 0.04205497,
       0.0867528 , 0.10976084, 0.0236608 , 0.11987636])
```
Here is a sample of the X_train array:
```
array([[0.35915778, 0.42256282, 0.53159902, 0.6084246 , 0.63902841],
       [0.42256282, 0.53159902, 0.6084246 , 0.63902841, 0.70858066],
       [0.53159902, 0.6084246 , 0.63902841, 0.70858066, 0.75574219],
       ...
       [0.07324677, 0.05206859, 0.05937164, 0.04205497, 0.0867528 ],
       [0.05206859, 0.05937164, 0.04205497, 0.0867528 , 0.10976084],
       [0.05937164, 0.04205497, 0.0867528 , 0.10976084, 0.0236608 ]])
```
100 epochs are run:
```python
# reshape input to be [samples, time steps, features]
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
X_val = np.reshape(X_val, (X_val.shape[0], 1, X_val.shape[1]))

# Generate LSTM network
model = tf.keras.Sequential()
model.add(LSTM(4, input_shape=(1, previous)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(X_train, Y_train, validation_split=0.2,
                    epochs=100, batch_size=1, verbose=2)
```
Here are some sample results:
```
Train on 59 samples, validate on 15 samples
Epoch 1/100
59/59 - 1s - loss: 0.0689 - val_loss: 0.0027
Epoch 2/100
59/59 - 0s - loss: 0.0431 - val_loss: 0.0118
...
Epoch 99/100
59/59 - 0s - loss: 0.0070 - val_loss: 0.0031
Epoch 100/100
59/59 - 0s - loss: 0.0071 - val_loss: 0.0034
dict_keys(['loss', 'val_loss'])
```
This is a visual representation of the training and validation loss:
Training and Validation Predictions
Now, let’s generate some predictions.
```python
# Generate predictions
trainpred = model.predict(X_train)
valpred = model.predict(X_val)
```
Here is a sample of training and test predictions:
```
>>> trainpred
array([[0.6923234 ],
       [0.73979336],
       [0.75128263],
       ...
       [0.09547461],
       [0.11602292],
       [0.050261  ]], dtype=float32)
```
```
>>> valpred
array([[0.06604623],
       [0.0982968 ],
       [0.10709635],
       ...
       [0.3344252 ],
       [0.2922875 ]], dtype=float32)
```
The predictions are converted back to the original ADR scale using scaler.inverse_transform, and the training and validation scores are calculated.
```python
import math
from sklearn.metrics import mean_squared_error

# convert predictions and targets back to the original ADR scale
trainpred = scaler.inverse_transform(trainpred)
Y_train = scaler.inverse_transform([Y_train])
valpred = scaler.inverse_transform(valpred)
Y_val = scaler.inverse_transform([Y_val])

# calculate RMSE
trainScore = math.sqrt(mean_squared_error(Y_train[0], trainpred[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
valScore = math.sqrt(mean_squared_error(Y_val[0], valpred[:,0]))
print('Validation Score: %.2f RMSE' % (valScore))
```
Training and Validation Scores
```
Train Score: 12.71 RMSE
Validation Score: 8.83 RMSE
```
Here is a plot of the predictions:
The test and prediction arrays are reshaped accordingly, and the function for mean directional accuracy is defined:
```python
import numpy as np

def mda(actual: np.ndarray, predicted: np.ndarray):
    """ Mean Directional Accuracy """
    return np.mean((np.sign(actual[1:] - actual[:-1])
                    == np.sign(predicted[1:] - predicted[:-1])).astype(int))
```
The mean directional accuracy is now calculated:
```
>>> mda(Y_val, predictions)
0.8571428571428571
```
An MDA of 86% is obtained, meaning that the model correctly predicts the direction of the actual weekly ADR trends 86% of the time.
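As a toy check of how the metric behaves (synthetic values, not taken from the dataset): the prediction below moves in the same direction as the actuals at three of the four steps, so the MDA is 0.75.

```python
import numpy as np

def mda(actual: np.ndarray, predicted: np.ndarray):
    """ Mean Directional Accuracy (as defined above) """
    return np.mean((np.sign(actual[1:] - actual[:-1])
                    == np.sign(predicted[1:] - predicted[:-1])).astype(int))

actual = np.array([1.0, 2.0, 1.5, 2.5, 3.0])  # directions: up, down, up, up
pred   = np.array([1.1, 1.9, 2.0, 2.4, 2.9])  # directions: up, up, up, up
print(mda(actual, pred))  # 0.75
```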
As seen above, a validation score of 8.83 RMSE was also obtained. RMSE measures the deviation of the predictions from the actual weekly ADR values, and is expressed in the same units as weekly ADR itself. For context, the mean weekly ADR across the validation data was 69.99.
The mean forecast error on the validation data came in at -1.419:
```
>>> forecast_error = (predictions - Y_val)
>>> mean_forecast_error = np.mean(forecast_error)
>>> mean_forecast_error
-1.419167548625413
```
Testing on unseen (test) data
Now that the model has been trained, the next step is to test its predictions on unseen (test) data.
As previously explained, the value at time t is being predicted by LSTM using the values t-1, t-2, t-3, t-4, and t-5.
The last 15 weekly ADR values in the series are predicted in this case.
```python
actual = tseries.iloc[100:115]
actual = np.array(actual)
actual
```
The previously built model is now used to predict each value using the previous five values in the time series:
```
# Test (unseen) predictions
# (t) and (t-5)
>>> Xnew
array([[ 82.1267268 ,  90.48381679,  85.81940503,  84.46819121,  83.25621451],
       [ 90.48381679,  85.81940503,  84.46819121,  83.25621451,  84.12304147],
       ...
       [189.16831978, 198.22268542, 208.71251185, 211.52835052, 211.16204036],
       [198.22268542, 208.71251185, 211.52835052, 211.16204036, 210.28488251]])
```
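A hypothetical sketch of how such windows can be built from the series (a stand-in series of consecutive integers is used here, not the actual ADR data): each row holds the five values preceding one of the last 15 points.

```python
import numpy as np

previous = 5
tseries = np.arange(115, dtype=float)  # stand-in for the weekly ADR series

# one row per target week, holding the five values that precede it
Xnew = np.array([tseries[t - previous:t] for t in range(100, 115)])
print(Xnew.shape)  # (15, 5)
```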
The variables are scaled appropriately, and model.predict is invoked:

```python
Xnew = scaler.transform(Xnew)
Xnewformat = np.reshape(Xnew, (Xnew.shape[0], 1, Xnew.shape[1]))
ynew = model.predict(Xnewformat)
```
Here is an array of the generated predictions:
```
array([0.02153895, 0.0157201 , 0.12966183, 0.22085814, 0.26296526,
       0.33762595, 0.35830092, 0.54184073, 0.73585206, 0.8718423 ,
       0.92918825, 0.9334069 , 0.8861607 , 0.81483454, 0.76510745],
      dtype=float32)
```
The array is converted back to the original ADR scale (maxt and mint denote the maximum and minimum of the original series):

```
>>> ynew = ynew * np.abs(maxt - mint) + np.min(tseries)
>>> ynewpd = pd.Series(ynew)
>>> ynewpd
0      45.410988
1      44.423096
2      63.767456
3      79.250229
4      86.398926
5      99.074379
6     102.584457
7     133.744766
8     166.682877
9     189.770493
10    199.506348
11    200.222565
12    192.201385
13    180.092041
14    171.649673
dtype: float32
```
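As a sanity check of this manual inversion (synthetic values): with feature_range=(0, 1), multiplying the scaled values by the series range and adding the series minimum reproduces scaler.inverse_transform exactly.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

tseries = np.array([[50.0], [75.0], [100.0], [150.0]])  # synthetic series
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(tseries)

# manual inversion: multiply by the range, add the minimum
mint, maxt = tseries.min(), tseries.max()
manual = scaled * np.abs(maxt - mint) + mint

# matches the scaler's own inverse transform
assert np.allclose(manual, scaler.inverse_transform(scaled))
```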
Here are the calculated MDA, RMSE, and MFE (mean forecast error) values:
MDA = 0.86
```
>>> mda(actualpd, ynewpd)
0.8666666666666667
```
RMSE = 33.77
```
>>> mse = mean_squared_error(actualpd, ynewpd)
>>> rmse = math.sqrt(mse)
>>> print('RMSE: %f' % rmse)
RMSE: 33.775573
```
MFE = -30.17
```
>>> forecast_error = (ynewpd - actualpd)
>>> mean_forecast_error = np.mean(forecast_error)
>>> mean_forecast_error
-30.173496939933216
```
With the mean weekly ADR for the test set coming in at 160.49, the RMSE and MFE performance do look reasonably strong (the lower the error, the better).
The same procedure was carried out on the H2 dataset (ADR data for a separate hotel in Portugal). Here are the results when comparing the predictions to the test set:
MDA = 0.86
```
>>> mda(actualpd, ynewpd)
0.8666666666666667
```
RMSE = 38.15
```
>>> mse = mean_squared_error(actualpd, ynewpd)
>>> rmse = math.sqrt(mse)
>>> print('RMSE: %f' % rmse)
RMSE: 38.155347
```
MFE = -34.43
```
>>> forecast_error = (ynewpd - actualpd)
>>> mean_forecast_error = np.mean(forecast_error)
>>> mean_forecast_error
-34.437111023457376
```
For the H2 dataset, the mean weekly ADR on the test set came in at 131.42, with RMSE and MFE errors low by comparison.
In this example, you have seen how ADR can be forecasted using an LSTM model. Specifically, the above examples have illustrated:
- How to construct an LSTM model
- Methods to gauge error and accuracy for LSTM model predictions
- How to measure model performance on unseen (test) data
The datasets and notebooks for this example are available at the MGCodesandStats GitHub repository, along with further research on this topic.