Home | Portfolio | GitHub | LinkedIn | Medium | Stack Overflow | Terms | E-mail
Hotel Analytics: Calculating Revenue, RevPAR and GOPPAR Using SQL
The hotel industry relies on several unique KPIs to gauge performance. One of the most important of these is RevPAR - which stands for revenue per available room.
- RevPAR = Occupancy (%) * Average Daily Rate (ADR)
In addition, hoteliers can also use GOPPAR (or Gross Operating Profit per Available Room) to calculate room profitability.
- GOPPAR = Operating Income/Number of Available Rooms
Often, hotel businesses can find it challenging to keep up with these metrics and identify key revenue and profitability trends over time. However, using SQL and data visualization together can be quite an effective way of analysing these metrics.
In this example, the use of SQL to calculate RevPAR and GOPPAR metrics on data from a hypothetical hotel database will be illustrated. The Python visualisation library, Seaborn, is also used to illustrate such information via heatmaps and boxplots.
Data
This example will work with two separate tables.
One contains the ADR, Occupancy, and Number of Rooms for three separate hotel brands (Brand 1, Brand 2, and Brand 3), as well as the relevant year and period.
Table name: Brands
>>> select * from brands;
brand | adr | occupancy | rooms | year | period
---------+--------+-----------+----------+------+--------
Brand 1 | 141.00 | 58.00 | 2062.00 | 2021 | Q1
Brand 2 | 74.00 | 72.00 | 10346.00 | 2021 | Q1
Brand 3 | 119.00 | 59.00 | 5551.00 | 2021 | Q1
Brand 1 | 113.00 | 74.00 | 2064.00 | 2021 | Q2
Brand 2 | 60.00 | 73.00 | 10354.00 | 2021 | Q2
Brand 3 | 124.00 | 71.00 | 5549.00 | 2021 | Q2
Brand 1 | 136.00 | 92.00 | 2064.00 | 2021 | Q3
Brand 2 | 60.00 | 80.00 | 10346.00 | 2021 | Q3
Brand 3 | 109.00 | 88.00 | 5551.00 | 2021 | Q3
Brand 1 | 130.00 | 66.00 | 2062.00 | 2021 | Q4
Brand 2 | 89.00 | 71.00 | 10351.00 | 2021 | Q4
Brand 3 | 127.00 | 70.00 | 5548.00 | 2021 | Q4
Brand 1 | 137.00 | 64.00 | 2061.00 | 2022 | Q1
Brand 2 | 78.00 | 64.00 | 10349.00 | 2022 | Q1
Brand 3 | 123.00 | 64.00 | 5551.00 | 2022 | Q1
Brand 1 | 120.00 | 75.00 | 2063.00 | 2022 | Q2
Brand 2 | 52.00 | 76.00 | 10354.00 | 2022 | Q2
Brand 3 | 126.00 | 78.00 | 5551.00 | 2022 | Q2
Brand 1 | 146.00 | 85.00 | 2064.00 | 2022 | Q3
Brand 2 | 69.00 | 84.00 | 10348.00 | 2022 | Q3
Brand 3 | 126.00 | 86.00 | 5549.00 | 2022 | Q3
Brand 1 | 122.00 | 74.00 | 2062.00 | 2022 | Q4
Brand 2 | 57.00 | 78.00 | 10350.00 | 2022 | Q4
Brand 3 | 147.00 | 65.00 | 5549.00 | 2022 | Q4
(24 rows)
The second table contains the Operating Income recorded by the hotel for each quarter.
Table name: Profit
>>> select * from profit;
operating_income | year | period
------------------+------+--------
320458.00 | 2021 | Q1
345445.00 | 2021 | Q2
370121.00 | 2021 | Q3
355646.00 | 2021 | Q4
328659.00 | 2022 | Q1
354657.00 | 2022 | Q2
398561.00 | 2022 | Q3
350650.00 | 2022 | Q4
(8 rows)
The hotel in question would like to use SQL to 1) identify the percentage contribution to revenue by brand, and 2) calculate RevPAR and GOPPAR on a quarterly basis.
1. Revenue (%) by Brand
Under the table Brands, we can see that data is provided for the three hotel brands in question - Brand 1, Brand 2, and Brand 3. The hotel in question owns these three brands, and would like to calculate the percentage share of each brand to total revenue.
Let’s see how this can be accomplished using SQL.
>>> select brand, sum(cast(revenue/(select sum(rooms*(occupancy/100)*adr) from brands) as decimal(8,5))*100) as percentage from
(select brand, cast((rooms*(occupancy/100)*adr) as int) as revenue from brands order by revenue desc) as subquery
group by brand order by brand;
brand | percentage
---------+------------
Brand 1 | 16.267
Brand 2 | 42.498
Brand 3 | 41.235
(3 rows)
As we can see, Brand 2 shows the most revenue share at 42.49%, followed closely by Brand 3 at 41.23%, with Brand 1 substantially less at 16.26%.
Let’s explain what is going on in this query.
Firstly, we are calculating total revenue across all instances in the Brands table using the subquery as specified:
>>> select brand, cast((rooms*(occupancy/100)*adr) as int) as revenue from brands;
brand | revenue
---------+---------
Brand 1 | 168630
Brand 2 | 551235
Brand 3 | 389736
Brand 1 | 172592
Brand 2 | 453505
Brand 3 | 488534
Brand 1 | 258248
Brand 2 | 496608
Brand 3 | 532452
Brand 1 | 176920
Brand 2 | 654080
Brand 3 | 493217
Brand 1 | 180708
Brand 2 | 516622
Brand 3 | 436975
Brand 1 | 185670
Brand 2 | 409190
Brand 3 | 545552
Brand 1 | 256142
Brand 2 | 599770
Brand 3 | 601290
Brand 1 | 186157
Brand 2 | 460161
Brand 3 | 530207
(24 rows)
Once this has been done, the sum of the revenue across Brands 1, 2, and 3 are then being divided by the total revenue across all three brands to obtain the percentage share of each.
2. RevPAR by Quarter
To calculate RevPAR by quarter (taking the performance of the three brands together for each quarter), the clause below is used to calculate the average RevPAR across the three brands by quarter:
>>> select year, period, avg(revpar) as revpar from (select year, period, (adr*(occupancy/100)) as revpar from brands
group by year, period, adr, occupancy order by year, period) as subquery group by year, period order by year, period;;
year | period | revpar
------+--------+--------
2021 | Q1 | 68.42
2021 | Q2 | 71.82
2021 | Q3 | 89.68
2021 | Q4 | 79.29
2022 | Q1 | 72.10
2022 | Q2 | 75.93
2022 | Q3 | 96.80
2022 | Q4 | 76.76
A subquery is firstly being used to calculate the RevPAR for each entry in the table separately:
>>> select year, period, (adr*(occupancy/100)) as revpar from brands
group by year, period, adr, occupancy order by year, period;
year | period | revpar
------+--------+----------
2021 | Q1 | 70.21
2021 | Q1 | 53.28
2021 | Q1 | 81.78
2021 | Q2 | 83.62
2021 | Q2 | 88.04
2021 | Q2 | 43.80
2021 | Q3 | 95.92
2021 | Q3 | 48.00
2021 | Q3 | 125.12
2021 | Q4 | 88.90
2021 | Q4 | 63.19
2021 | Q4 | 85.80
2022 | Q1 | 49.92
2022 | Q1 | 87.68
2022 | Q1 | 78.72
2022 | Q2 | 39.52
2022 | Q2 | 90.00
2022 | Q2 | 98.28
2022 | Q3 | 108.36
2022 | Q3 | 124.10
2022 | Q3 | 57.96
2022 | Q4 | 95.55
2022 | Q4 | 44.46
2022 | Q4 | 90.28
(24 rows)
Simply put, we are firstly calculating the RevPAR for each brand across each quarter separately. Then, we are averaging the RevPAR across the three different brands for each quarter to get the average performance, i.e. the average RevPAR by room across the three brands.
This can also be visualized by a heatmap using Python’s seaborn as follows:
analysis = pd.pivot_table(df, columns='year', index='period', values="revpar")
sns.heatmap(analysis, annot=True, cmap="coolwarm", vmin=60, vmax=100, fmt='.2f')
plt.title("RevPAR by quarter")
plt.savefig("revpar by quarter.png")
plt.close()
3. GOPPAR by Quarter
Last but not least, let us see how we can calculate GOPPAR by quarter using SQL.
Given that the operating_income variable is in a separate table (Profit), the below clause will join this table with the Brands table in order to conduct the relevant calculations.
Operating income per quarter is already provided in the Profit table.
To calculate GOPPAR by quarter, it is necessary to divide operating income per quarter by the total number of rooms per quarter across the three brands.
>>> select t1.year as year, t1.period as period, t1.operating_income as operating_income, sum(t2.rooms) as rooms,
t1.operating_income/sum(t2.rooms) as goppar from profit as t1 inner join brands as t2 on t1.year=t2.year
and t1.period=t2.period group by t1.year, t1.period, t1.operating_income;
year | period | operating_income | rooms | goppar
------+--------+------------------+----------+---------------------
2021 | Q1 | 320458.00 | 17959.00 | 17.84
2021 | Q2 | 345445.00 | 17967.00 | 19.22
2021 | Q3 | 370121.00 | 17961.00 | 20.60
2021 | Q4 | 355646.00 | 17961.00 | 19.80
2022 | Q1 | 328659.00 | 17961.00 | 18.29
2022 | Q2 | 354657.00 | 17968.00 | 19.73
2022 | Q3 | 398561.00 | 17961.00 | 22.19
2022 | Q4 | 350650.00 | 17961.00 | 19.52
(8 rows)
As we can see, we are joining the two tables on the basis of year and period, while grouping by year, period, and operating income.
Once again, we can use a heatmap to visualise GOPPAR as follows:
analysis = pd.pivot_table(df, columns='year', index='period', values="goppar")
sns.heatmap(analysis, annot=True, cmap="coolwarm", vmin=15, vmax=25, fmt='.2f')
plt.title("GOPPAR by quarter")
plt.savefig("goppar by quarter.png")
plt.close()
Once again, we can see that Q3 showed the best performance on a GOPPAR basis across this hypothetical dataset.
Conclusion
In this example, you have seen how to:
- Use subqueries to calculate revenue share by percentage
- Calculate RevPAR and GOPPAR by quarter using SQL
- Illustrate RevPAR and GOPPAR performance using heatmaps