Skip to the content.

Home | Portfolio | GitHub | LinkedIn | Medium | Stack Overflow | Terms | E-mail

Dates and Subqueries in SQL

It is often the case that when working with a SQL database, one typically has to work with tables that contain a date column showing the date for each relevant record.

However, the ability of SQL to work with dates and yield valuable insights from such data types is often not well understood.

Weather Data Example

Let us consider the following example. Suppose there exists a weather database with recorded dates and relevant weather information in a table. Here is a snippet of the data:

subqueries-dates-1

Source: Table (and data) created by author using PostgreSQL. Table displayed in pgAdmin4.

Also, let us suppose that a month variable has been defined in the table and the relevant values have been extracted from the table as follows:

update weatherdata set month=extract(month from date);

Now, for the purposes of ensuring that we have sufficient temperature records for each month and do not have too long a lag between records — let us assume that we would like to calculate the average duration between each consecutive record in the table, and group them by month.

This task will be accomplished by:

Calculate duration between dates

By using the LAG function, we can calculate the duration between each consecutive date.

However, we would also like to display the date and month columns in the new table as well — we will need to use the month column when it comes to subsequently grouping the average duration by month.

To accomplish this, we must:

This is done as follows:

select t1.date, t1.date - lag(t1.date) over (order by t1.date) as date_difference, t1.month from weatherdata as t1 inner join weatherdata as t2 on t1.date=t2.date;

Here is the generated table from the above query:

subqueries-dates-2-edited

Source: Table (and data) created by author using PostgreSQL. Table displayed in pgAdmin4.

We can see that for the weather data recorded last month — there is less than a day of duration for most entries — meaning that weather patterns are being recorded with regularity and we are likely obtaining a representative sample for that month!

Using subquery with a GROUP BY function

Having calculated the above table, we now wish to calculate the average duration between recorded dates by month.

In order to do this using the data we have just generated above — we must now use a subquery. That is to say, we will be incorporating the above query into a broader aggregate query that can use the GROUP BY function.

To group the duration by month, the following query is run:

select month, avg(date_difference) from (select t1.date as date, t1.date - lag(t1.date) over (order by t1.date) as date_difference, t1.month as month from weatherdata as t1 inner join weatherdata as t2 on t1.date=t2.date) as subquery group by month order by month;

Here is the generated data:

subqueries-3

Source: Table (and data) created by author using PostgreSQL. Table displayed in pgAdmin4.

From the above, we can see that across months 1–12 (January to December) — we have calculated the average duration between each recorded date for each month.

Conclusion

In this article, you have seen:

Many thanks for reading, and you can find further examples of useful SQL practices here.