Data and Market Research Specialist
“I always strive to be better at market research than any statistician and better at statistics than any market researcher.”
My work includes:
- Building regression and time series models in Python and R to generate business intelligence solutions for clients across a range of industries.
- Covering the latest business developments for companies across the hotel and consumer goods sectors. I use PostgreSQL to analyse key business metrics and Seaborn to generate data visualisations that give a clearer understanding of a company’s performance in its industry.
Samples
Analysing Sales Data: SQL and Data Visualization
The majority of companies keep their important data stored in a database. That said, how often do these companies take the time to analyse such data for meaningful insights? Interrogating a database efficiently with SQL and generating data visualizations from the results can greatly enhance understanding of the data.
Forecasting Hotel Revenue: Predicting ADR Fluctuations with ARIMA
The average daily rate (ADR) is the average rate per day paid by a staying customer at a hotel. This is an important metric for a hotel, as it indicates how much revenue each staying customer brings in. This example shows how average daily rates can be forecast using an ARIMA model.
Handling Imbalanced Classification Data: Predicting Hotel Cancellations
When building a classification algorithm, one must often contend with an imbalanced dataset. An imbalanced dataset is one where the sample sizes differ between classes, which can bias the classifier's predictions towards the majority class. This example shows how a support vector machine can be used to classify hotel booking customers by cancellation risk.
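One common way to offset the bias described above is to weight each class inversely to its frequency. The sketch below computes such weights in plain Python, using the heuristic that scikit-learn documents for `class_weight="balanced"`. The label counts are hypothetical, and this is an illustration of the idea rather than the exact approach taken in the linked sample.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 0 = booking kept, 1 = booking cancelled.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

# The minority class (cancellations) receives the larger weight,
# so misclassifying it costs more during training.
for cls, weight in sorted(weights.items()):
    print(f"class {cls}: weight {weight:.3f}")
```

A dictionary of this shape can be passed as the `class_weight` argument of scikit-learn's `SVC`, or the string `"balanced"` can be passed to have the library compute the same weights.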
Hotel Analytics: Calculating Revenue, RevPAR and GOPPAR Using SQL
The hotel industry relies on several unique KPIs to gauge performance. One of the most important of these is RevPAR, which stands for revenue per available room.
In addition, hoteliers can use GOPPAR (Gross Operating Profit per Available Room) to gauge room profitability. Hotel businesses often find it challenging to keep up with these metrics and to identify key revenue and profitability trends over time. However, using SQL and data visualization together can be quite an effective way of analysing these metrics.
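As a rough illustration of the two metrics, here is a minimal Python sketch using their standard definitions: room revenue divided by available rooms for RevPAR, and gross operating profit divided by available rooms for GOPPAR. All figures and variable names are hypothetical, and the linked sample performs the calculation in SQL rather than Python.

```python
def revpar(room_revenue, rooms_available):
    """Revenue per available room."""
    return room_revenue / rooms_available


def goppar(gross_operating_profit, rooms_available):
    """Gross operating profit per available room."""
    return gross_operating_profit / rooms_available


# Hypothetical figures for a single night at one hotel.
room_revenue = 12000.0
gross_operating_profit = 4500.0
rooms_available = 150
rooms_sold = 120

adr = room_revenue / rooms_sold           # average daily rate
occupancy = rooms_sold / rooms_available  # share of available rooms sold

print("RevPAR:", revpar(room_revenue, rooms_available))
print("GOPPAR:", goppar(gross_operating_profit, rooms_available))
print("ADR x occupancy:", adr * occupancy)
```

RevPAR can equivalently be written as ADR multiplied by the occupancy rate, which is why the first and last printed values agree.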
Visualising Customer Lifetime Value (LTV) Trends with Python Seaborn
When a business sells to customers, the reality is that a certain percentage of customers will cease buying from that business over a given period. This is what is known as churn.
The total revenue that a business can expect from a customer before they churn is known as customer lifetime value (LTV). With revenue per period held constant, the longer a customer keeps buying from a company, the higher their LTV will be.
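The relationship between churn and lifetime value can be sketched with a common simplification: if a fixed fraction of customers churns each period, the expected customer lifetime is one divided by the churn rate, so LTV is revenue per period divided by the churn rate. The constant churn rate, the function name and the figures below are assumptions made for illustration, not the model used in the linked sample.

```python
def lifetime_value(revenue_per_period, churn_rate):
    """Expected revenue from a customer before they churn.

    Assumes a constant per-period churn rate, which gives an
    expected lifetime of 1 / churn_rate periods.
    """
    if not 0 < churn_rate <= 1:
        raise ValueError("churn_rate must be in the interval (0, 1]")
    return revenue_per_period / churn_rate


# Hypothetical: a customer spends 50 per month.
monthly_revenue = 50.0
for churn in (0.10, 0.05):
    ltv = lifetime_value(monthly_revenue, churn)
    print(f"monthly churn {churn:.0%}: LTV of about {ltv:.0f}")
```

Halving the churn rate doubles the expected lifetime, and with revenue held constant it doubles LTV as well.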
Technical Skills
Cloud: AWS, Azure
Languages: Python, R, SQL
Machine learning libraries: InterpretML, PyMC3, scikit-learn, statsmodels, TensorFlow
Platforms and relevant tools: PyCharm, Jupyter Notebook, pgAdmin4, RStudio, Git, Docker, Linux
Visualization libraries: ggplot2, matplotlib, seaborn
Training Courses
Time Series Forecasting with Bayesian Modeling. LiveProject series produced for Manning Publications.
- Devised a liveProject series to illustrate modelling of time series shocks with Bayesian dynamic linear modelling, modelling of posterior distributions with PyMC3, MCMC sampling with Prophet, and structural time series modelling with TensorFlow Probability.
TensorFlow 2.0 Essentials: What’s New. Video seminar produced for O’Reilly Media.
- Illustrated use of eager execution and AutoGraph, as well as use of tf.keras for neural network modelling across classification, regression, and time series datasets.
Business Analytics with R — Statistics and Machine Learning. Video series produced for O’Reilly Media.
- Illustrated use of data manipulation techniques, regression analysis and hypothesis testing, along with classification and regression-based machine learning techniques.