
Part 4: Displaying AUC Results with Django Web App

This is Part 4 of a four-part project on predicting hotel cancellations with machine learning.

- Part 1: Predicting Hotel Cancellations with Support Vector Machines and ARIMA

- Part 2: Predicting Hotel Cancellations with a Keras Neural Network

- Part 3: Predicting Weekly Hotel Cancellations with an LSTM Network

Back in Part 1 of this project, we saw how SVM can be used to predict hotel cancellations. An accuracy of 0.74 was achieved.

Conducting an analysis and generating results is one thing, but it is even more useful to deploy or productionize the model in some way. For this purpose, the ROC curve generated for the SVM (from which the AUC is read) is wrapped into a Django app and displayed as a web page.

Here are the steps for creating the Django web application. Note that the terminal commands below were run on Linux.

Create and run server

Open the terminal and input the following:

django-admin startproject mysite

This creates the mysite folder, which contains manage.py. Change into that folder (cd mysite) and start the development server:

python3 manage.py runserver

This runs the server. Opening http://127.0.0.1:8000/ in your browser should display the following page:

[Image: Django's default "The install worked successfully!" page]

Create pyfiles

Next, create an app called pyfiles. This adds a pyfiles subfolder to the project:

python3 manage.py startapp pyfiles

Open mysite -> urls.py and make sure it contains the following:

from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('pyfiles/', include('pyfiles.urls')),
    path('admin/', admin.site.urls),
]

Also create mysite -> pyfiles -> urls.py (startapp does not generate this file) and make sure it contains the following:

from django.urls import path

from . import views

urlpatterns = [
    path('', views.index, name='index'),
]

In pyfiles -> views.py, include the Python code that trains the SVM and generates the ROC curve used for the AUC reading:

# Create your views here.
from django.http import HttpResponse
from django.shortcuts import render
from PIL import Image

# Import libraries
import os
import csv
import random
import statsmodels.api as sm
import statsmodels.formula.api as smf
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.preprocessing import MinMaxScaler
path="hotel-django"
os.chdir(path)
os.getcwd()

train_df = pd.read_csv('H1.csv')
a=train_df.head()
b=train_df
b
b.sort_values(['ArrivalDateYear','ArrivalDateWeekNumber'], ascending=True)

IsCanceled = train_df['IsCanceled']
y = IsCanceled

leadtime = train_df['LeadTime'] #1
staysweekendnights = train_df['StaysInWeekendNights'] #2
staysweeknights = train_df['StaysInWeekNights'] #3
adults = train_df['Adults'] #4
children = train_df['Children'] #5
babies = train_df['Babies'] #6
isrepeatedguest = train_df['IsRepeatedGuest'] #11
previouscancellations = train_df['PreviousCancellations'] #12
previousbookingsnotcanceled = train_df['PreviousBookingsNotCanceled'] #13
bookingchanges = train_df['BookingChanges'] #16
agent = train_df['Agent'] #18
company = train_df['Company'] #19
dayswaitinglist = train_df['DaysInWaitingList'] #20
adr = train_df['ADR'] #22
rcps = train_df['RequiredCarParkingSpaces'] #23
totalsqr = train_df['TotalOfSpecialRequests'] #24

# Categorical variables
mealcat=train_df.Meal.astype("category").cat.codes
mealcat=pd.Series(mealcat)
countrycat=train_df.Country.astype("category").cat.codes
countrycat=pd.Series(countrycat)
marketsegmentcat=train_df.MarketSegment.astype("category").cat.codes
marketsegmentcat=pd.Series(marketsegmentcat)
distributionchannelcat=train_df.DistributionChannel.astype("category").cat.codes
distributionchannelcat=pd.Series(distributionchannelcat)
reservedroomtypecat=train_df.ReservedRoomType.astype("category").cat.codes
reservedroomtypecat=pd.Series(reservedroomtypecat)
assignedroomtypecat=train_df.AssignedRoomType.astype("category").cat.codes
assignedroomtypecat=pd.Series(assignedroomtypecat)
deposittypecat=train_df.DepositType.astype("category").cat.codes
deposittypecat=pd.Series(deposittypecat)
customertypecat=train_df.CustomerType.astype("category").cat.codes
customertypecat=pd.Series(customertypecat)
reservationstatuscat=train_df.ReservationStatus.astype("category").cat.codes
reservationstatuscat=pd.Series(reservationstatuscat)

y1 = y
x1 = np.column_stack((leadtime,countrycat,deposittypecat))
x1 = sm.add_constant(x1, prepend=True)

x1_train, x1_test, y1_train, y1_test = train_test_split(x1, y1, random_state=0)

from sklearn.metrics import classification_report,confusion_matrix
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve
from sklearn import svm

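clf = svm.SVC(gamma='scale')
# Note: the model is fitted on all of H1, so the x1_test predictions below are in-sample.
# The out-of-sample evaluation is carried out on H2 further down.
clf.fit(x1, y1)
prclf = clf.predict(x1_test)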

h2data = pd.read_csv('H2.csv')
a=h2data.head()
a

type(h2data)

t_leadtime = h2data['LeadTime'] #1
t_staysweekendnights = h2data['StaysInWeekendNights'] #2
t_staysweeknights = h2data['StaysInWeekNights'] #3
t_adults = h2data['Adults'] #4
t_children = h2data['Children'] #5
t_babies = h2data['Babies'] #6
t_isrepeatedguest = h2data['IsRepeatedGuest'] #11
t_previouscancellations = h2data['PreviousCancellations'] #12
t_previousbookingsnotcanceled = h2data['PreviousBookingsNotCanceled'] #13
t_bookingchanges = h2data['BookingChanges'] #16
t_agent = h2data['Agent'] #18
t_company = h2data['Company'] #19
t_dayswaitinglist = h2data['DaysInWaitingList'] #20
t_adr = h2data['ADR'] #22
t_rcps = h2data['RequiredCarParkingSpaces'] #23
t_totalsqr = h2data['TotalOfSpecialRequests'] #24

# Categorical variables
t_mealcat=h2data.Meal.astype("category").cat.codes
t_mealcat=pd.Series(t_mealcat)
t_countrycat=h2data.Country.astype("category").cat.codes
t_countrycat=pd.Series(t_countrycat)
t_marketsegmentcat=h2data.MarketSegment.astype("category").cat.codes
t_marketsegmentcat=pd.Series(t_marketsegmentcat)
t_distributionchannelcat=h2data.DistributionChannel.astype("category").cat.codes
t_distributionchannelcat=pd.Series(t_distributionchannelcat)
t_reservedroomtypecat=h2data.ReservedRoomType.astype("category").cat.codes
t_reservedroomtypecat=pd.Series(t_reservedroomtypecat)
t_assignedroomtypecat=h2data.AssignedRoomType.astype("category").cat.codes
t_assignedroomtypecat=pd.Series(t_assignedroomtypecat)
t_deposittypecat=h2data.DepositType.astype("category").cat.codes
t_deposittypecat=pd.Series(t_deposittypecat)
t_customertypecat=h2data.CustomerType.astype("category").cat.codes
t_customertypecat=pd.Series(t_customertypecat)
t_reservationstatuscat=h2data.ReservationStatus.astype("category").cat.codes
t_reservationstatuscat=pd.Series(t_reservationstatuscat)

a = np.column_stack((t_leadtime,t_countrycat,t_deposittypecat))
a = sm.add_constant(a, prepend=True)
IsCanceled = h2data['IsCanceled']
b = IsCanceled
b=b.values

prh2 = clf.predict(a)
prh2

from sklearn.metrics import classification_report,confusion_matrix
print(confusion_matrix(b,prh2))
print(classification_report(b,prh2))

import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.metrics import roc_curve
falsepos,truepos,thresholds=roc_curve(b,clf.decision_function(a))
plt.plot(falsepos,truepos,label="ROC")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")

cutoff=np.argmin(np.abs(thresholds))
plt.plot(falsepos[cutoff],truepos[cutoff],'o',markersize=10,label="cutoff",fillstyle="none")
plt.title("AUC reading on test set")

# Print the AUC so the value is visible when the module runs outside a notebook
print(metrics.auc(falsepos, truepos))
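
The listing above converts the categorical columns of H1.csv and H2.csv to integer codes separately. One thing that may be worth checking is whether the same category receives the same code in both files, since .cat.codes numbers the categories present in each series on its own. The sketch below uses made-up country values (not taken from the dataset) and shows a shared pd.CategoricalDtype as one possible way to align the codes; it is an illustration under those assumptions, not the method used in this project.

```python
import pandas as pd

# Toy stand-ins for the Country column of H1.csv and H2.csv (made-up values)
h1_country = pd.Series(["PRT", "GBR", "PRT"])
h2_country = pd.Series(["FRA", "PRT", "GBR"])

# Encoding each file on its own, as in the listing above:
# codes follow the sorted categories found in that file only.
h1_codes = h1_country.astype("category").cat.codes
h2_codes = h2_country.astype("category").cat.codes
print(h1_codes.tolist())  # [1, 0, 1]  -> PRT is 1 here
print(h2_codes.tolist())  # [0, 2, 1]  -> PRT is 2 here

# One possible alternative: a single category list shared by both files
shared = pd.CategoricalDtype(sorted(set(h1_country) | set(h2_country)))
h1_shared = h1_country.astype(shared).cat.codes
h2_shared = h2_country.astype(shared).cat.codes
print(h1_shared.tolist())  # [2, 1, 2]
print(h2_shared.tolist())  # [0, 2, 1]  -> PRT is 2 in both
```

With real data, missing values would need handling before sorting the combined categories.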

Displaying results

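# Save the figure to disk, then load it back as a PIL image
path='mysite/graphs'
os.chdir(path)
plt.savefig('aucgraph.png')

# The working directory is now mysite/graphs, so the file is opened by name
img = Image.open('aucgraph.png')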

def index(request):
    response = HttpResponse(content_type = 'image/png')
    img.save(response, "PNG")
    return response

Now, run the server again:

python3 manage.py runserver

The graph can now be viewed by visiting http://localhost:8000/pyfiles/.

[Image: ROC curve titled "AUC reading on test set", with the cutoff point marked]
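
To make the graph easier to read, the following pure-Python sketch shows what the curve, the marked cutoff, and the AUC value represent. It is an illustration using four made-up labels and decision scores, not scikit-learn's implementation and not the hotel data; the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Sweep the threshold from high to low and collect ROC points."""
    positives = sum(labels)
    negatives = len(labels) - positives
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr, thresholds = [0.0], [0.0], [float("inf")]
    tp = fp = 0
    i = 0
    while i < len(order):
        current = scores[order[i]]
        # Treat tied scores as one threshold step
        while i < len(order) and scores[order[i]] == current:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
        thresholds.append(current)
    return fpr, tpr, thresholds


def trapezoid_auc(fpr, tpr):
    """Area under the ROC points using the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


# Made-up labels and SVM-style decision scores (positive score = predicted class 1)
labels = [0, 0, 1, 1]
scores = [-1.2, 0.4, -0.3, 1.5]

fpr, tpr, thresholds = roc_points(labels, scores)
auc_value = trapezoid_auc(fpr, tpr)

# The threshold closest to zero corresponds to the circled "cutoff" on the graph
cutoff = min(range(len(thresholds)), key=lambda k: abs(thresholds[k]))

print(auc_value)                 # 0.75
print(fpr[cutoff], tpr[cutoff])  # 0.5 1.0
```

An AUC of 0.5 corresponds to the diagonal (no better than chance), while values closer to 1 indicate that cancellations tend to receive higher decision scores than non-cancellations.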

Conclusion

In this example, you have seen how Django can be used to display machine learning results in web format. Many thanks for your time, and feel free to reach out with any questions.