Skip to the content.

Home | Portfolio | GitHub | LinkedIn | Medium | Stack Overflow | Terms | E-mail

Hotel Analytics: Calculating Revenue, RevPAR and GOPPAR Using SQL

The hotel industry relies on several unique KPIs to gauge performance. One of the most important of these is RevPAR - which stands for revenue per available room.

In addition, hoteliers can also use GOPPAR (or Gross Operating Profit per Available Room) to calculate room profitability.

Often, hotel businesses can find it challenging to keep up with these metrics and identify key revenue and profitability trends over time. However, using SQL and data visualization together can be quite an effective way of analysing these metrics.

In this example, the use of SQL to calculate RevPAR and GOPPAR metrics on data from a hypothetical hotel database will be illustrated. The Python visualisation library, Seaborn, is also used to illustrate such information via heatmaps and boxplots.

Data

This example will work with two separate tables.

One contains the ADR, Occupancy, and Number of Rooms for three separate hotel brands (Brand 1, Brand 2, and Brand 3), as well as the relevant year and period.

Table name: Brands

>>> select * from brands;

  brand  |  adr   | occupancy |  rooms   | year | period 
---------+--------+-----------+----------+------+--------
 Brand 1 | 141.00 |     58.00 |  2062.00 | 2021 | Q1
 Brand 2 |  74.00 |     72.00 | 10346.00 | 2021 | Q1
 Brand 3 | 119.00 |     59.00 |  5551.00 | 2021 | Q1
 Brand 1 | 113.00 |     74.00 |  2064.00 | 2021 | Q2
 Brand 2 |  60.00 |     73.00 | 10354.00 | 2021 | Q2
 Brand 3 | 124.00 |     71.00 |  5549.00 | 2021 | Q2
 Brand 1 | 136.00 |     92.00 |  2064.00 | 2021 | Q3
 Brand 2 |  60.00 |     80.00 | 10346.00 | 2021 | Q3
 Brand 3 | 109.00 |     88.00 |  5551.00 | 2021 | Q3
 Brand 1 | 130.00 |     66.00 |  2062.00 | 2021 | Q4
 Brand 2 |  89.00 |     71.00 | 10351.00 | 2021 | Q4
 Brand 3 | 127.00 |     70.00 |  5548.00 | 2021 | Q4
 Brand 1 | 137.00 |     64.00 |  2061.00 | 2022 | Q1
 Brand 2 |  78.00 |     64.00 | 10349.00 | 2022 | Q1
 Brand 3 | 123.00 |     64.00 |  5551.00 | 2022 | Q1
 Brand 1 | 120.00 |     75.00 |  2063.00 | 2022 | Q2
 Brand 2 |  52.00 |     76.00 | 10354.00 | 2022 | Q2
 Brand 3 | 126.00 |     78.00 |  5551.00 | 2022 | Q2
 Brand 1 | 146.00 |     85.00 |  2064.00 | 2022 | Q3
 Brand 2 |  69.00 |     84.00 | 10348.00 | 2022 | Q3
 Brand 3 | 126.00 |     86.00 |  5549.00 | 2022 | Q3
 Brand 1 | 122.00 |     74.00 |  2062.00 | 2022 | Q4
 Brand 2 |  57.00 |     78.00 | 10350.00 | 2022 | Q4
 Brand 3 | 147.00 |     65.00 |  5549.00 | 2022 | Q4
(24 rows)

The second table contains the Operating Income recorded by the hotel for each quarter.

Table name: Profit

>>> select * from profit;

 operating_income | year | period 
------------------+------+--------
        320458.00 | 2021 | Q1
        345445.00 | 2021 | Q2
        370121.00 | 2021 | Q3
        355646.00 | 2021 | Q4
        328659.00 | 2022 | Q1
        354657.00 | 2022 | Q2
        398561.00 | 2022 | Q3
        350650.00 | 2022 | Q4
(8 rows)

The hotel in question would like to use SQL to 1) identify the percentage contribution to revenue by brand, and 2) calculate RevPAR and GOPPAR on a quarterly basis.

1. Revenue (%) by Brand

Under the table Brands, we can see that data is provided for the three hotel brands in question - Brand 1, Brand 2, and Brand 3. The hotel in question owns these three brands, and would like to calculate the percentage share of each brand to total revenue.

Let’s see how this can be accomplished using SQL.

>>> select brand, sum(cast(revenue/(select sum(rooms*(occupancy/100)*adr) from brands) as decimal(8,5))*100) as percentage from
(select brand, cast((rooms*(occupancy/100)*adr) as int) as revenue from brands order by revenue desc) as subquery
group by brand order by brand;

  brand  | percentage 
---------+------------
 Brand 1 |   16.267
 Brand 2 |   42.498
 Brand 3 |   41.235
(3 rows)

As we can see, Brand 2 shows the most revenue share at 42.49%, followed closely by Brand 3 at 41.23%, with Brand 1 substantially less at 16.26%.

Let’s explain what is going on in this query.

Firstly, we are calculating total revenue across all instances in the Brands table using the subquery as specified:

>>> select brand, cast((rooms*(occupancy/100)*adr) as int) as revenue from brands;

  brand  | revenue 
---------+---------
 Brand 1 |  168630
 Brand 2 |  551235
 Brand 3 |  389736
 Brand 1 |  172592
 Brand 2 |  453505
 Brand 3 |  488534
 Brand 1 |  258248
 Brand 2 |  496608
 Brand 3 |  532452
 Brand 1 |  176920
 Brand 2 |  654080
 Brand 3 |  493217
 Brand 1 |  180708
 Brand 2 |  516622
 Brand 3 |  436975
 Brand 1 |  185670
 Brand 2 |  409190
 Brand 3 |  545552
 Brand 1 |  256142
 Brand 2 |  599770
 Brand 3 |  601290
 Brand 1 |  186157
 Brand 2 |  460161
 Brand 3 |  530207
(24 rows)

Once this has been done, the sum of the revenue across Brands 1, 2, and 3 are then being divided by the total revenue across all three brands to obtain the percentage share of each.

2. RevPAR by Quarter

To calculate RevPAR by quarter (taking the performance of the three brands together for each quarter), the clause below is used to calculate the average RevPAR across the three brands by quarter:

>>> select year, period, avg(revpar) as revpar from (select year, period, (adr*(occupancy/100)) as revpar from brands
group by year, period, adr, occupancy order by year, period) as subquery group by year, period order by year, period;;

 year | period | revpar           
------+--------+--------
 2021 | Q1     | 68.42
 2021 | Q2     | 71.82
 2021 | Q3     | 89.68
 2021 | Q4     | 79.29
 2022 | Q1     | 72.10
 2022 | Q2     | 75.93
 2022 | Q3     | 96.80
 2022 | Q4     | 76.76

A subquery is firstly being used to calculate the RevPAR for each entry in the table separately:

>>> select year, period, (adr*(occupancy/100)) as revpar from brands
group by year, period, adr, occupancy order by year, period;

 year | period | revpar           
------+--------+----------
 2021 | Q1     |  70.21
 2021 | Q1     |  53.28
 2021 | Q1     |  81.78
 2021 | Q2     |  83.62
 2021 | Q2     |  88.04
 2021 | Q2     |  43.80
 2021 | Q3     |  95.92
 2021 | Q3     |  48.00
 2021 | Q3     |  125.12
 2021 | Q4     |  88.90
 2021 | Q4     |  63.19
 2021 | Q4     |  85.80
 2022 | Q1     |  49.92
 2022 | Q1     |  87.68
 2022 | Q1     |  78.72
 2022 | Q2     |  39.52
 2022 | Q2     |  90.00
 2022 | Q2     |  98.28
 2022 | Q3     |  108.36
 2022 | Q3     |  124.10
 2022 | Q3     |  57.96
 2022 | Q4     |  95.55
 2022 | Q4     |  44.46
 2022 | Q4     |  90.28
(24 rows)

Simply put, we are firstly calculating the RevPAR for each brand across each quarter separately. Then, we are averaging the RevPAR across the three different brands for each quarter to get the average performance, i.e. the average RevPAR by room across the three brands.

This can also be visualized by a heatmap using Python’s seaborn as follows:

analysis = pd.pivot_table(df, columns='year', index='period', values="revpar")
sns.heatmap(analysis, annot=True, cmap="coolwarm", vmin=60, vmax=100, fmt='.2f')
plt.title("RevPAR by quarter")
plt.savefig("revpar by quarter.png")
plt.close()

revpar_by_quarter_2

3. GOPPAR by Quarter

Last but not least, let us see how we can calculate GOPPAR by quarter using SQL.

Given that the operating_income variable is in a separate table (Profit), the below clause will join this table with the Brands table in order to conduct the relevant calculations.

Operating income per quarter is already provided in the Profit table.

To calculate GOPPAR by quarter, it is necessary to divide operating income per quarter by the total number of rooms per quarter across the three brands.

>>> select t1.year as year, t1.period as period, t1.operating_income as operating_income, sum(t2.rooms) as rooms,
t1.operating_income/sum(t2.rooms) as goppar from profit as t1 inner join brands as t2 on t1.year=t2.year
and t1.period=t2.period group by t1.year, t1.period, t1.operating_income;

 year | period | operating_income | rooms    | goppar        
------+--------+------------------+----------+---------------------
 2021 | Q1     |        320458.00 | 17959.00 | 17.84
 2021 | Q2     |        345445.00 | 17967.00 | 19.22
 2021 | Q3     |        370121.00 | 17961.00 | 20.60
 2021 | Q4     |        355646.00 | 17961.00 | 19.80
 2022 | Q1     |        328659.00 | 17961.00 | 18.29
 2022 | Q2     |        354657.00 | 17968.00 | 19.73
 2022 | Q3     |        398561.00 | 17961.00 | 22.19
 2022 | Q4     |        350650.00 | 17961.00 | 19.52
(8 rows)

As we can see, we are joining the two tables on the basis of year and period, while grouping by year, period, and operating income.

Once again, we can use a heatmap to visualise GOPPAR as follows:

analysis = pd.pivot_table(df, columns='year', index='period', values="goppar")
sns.heatmap(analysis, annot=True, cmap="coolwarm", vmin=15, vmax=25, fmt='.2f')
plt.title("GOPPAR by quarter")
plt.savefig("goppar by quarter.png")
plt.close()

goppar_by_quarter_2

Once again, we can see that Q3 showed the best performance on a GOPPAR basis across this hypothetical dataset.

Conclusion

In this example, you have seen how to:

References