Sunday, October 26, 2008

A 3 level Nested SQL Query on a single table.

I had a lot of fun today writing the most complicated SQL query I've ever written.

The table given to me was something like this:
  CUSTOMER          PRODUCT           SALE_DATE     STATE     QTY
-----------------------------------------------------------------
Bloom Pepsi 1999-02-04 NY 2233
Bloom Egg 1996-05-06 NJ 1233
Abhay Pepsi 1987-06-07 NJ 2222
Bloom Pepsi 1983-03-04 NJ 122
Question:
For each combination of customer and product, output the maximum
and minimum sales quantities along with the corresponding dates
(i.e., dates of those maximum and minimum sales quantities).
--
My first solution:
Ah, I said, .. and wrote the following expression after scratching my head for a minute
SELECT MIN(QTY), MAX(QTY) FROM CUSTOMER_TABLE_NAME GROUP BY CUSTOMER, PRODUCT;
But then.. reading the question carefully, I noticed that I had to give my result 2 dates, dates of those maximum and minimum sales quantities.

So I scratched and scratched my head, and finally when I couldn't scratch any more, I wrote the following query, which is the most complicated I've ever written.. but it works very well and has rewarded me with a complete SQL refresher. The solution is below, I am using 3 levels of nesting with implicit inner joins, at various levels.
Final Solution:
SELECT min_table_2.customer, min_table_2.product, max_table_2.maxq2, min_table_2.minq2,
max_table_2.sale_max, min_table_2.sale_min

FROM (SELECT customer as c2, product as p2, sale_date as sale_min, min_table_1.minq as minq2
FROM (SELECT customer as c1, product as p1, MIN(qty) as minq
FROM sales GROUP BY customer, product)min_table_1,
sales
WHERE min_table_1.c1 = customer AND min_table_1.p1 = product AND min_table_1.minq = qty)min_table_2,

(SELECT customer as c2, product as p2, sale_date as sale_max, max_table_1.maxq as maxq2
FROM (SELECT customer as c1, product as p1, MAX(qty) as maxq
FROM sales GROUP BY customer, product)max_table_1,
sales
WHERE max_table_1.c1 = customer AND max_table_1.p1 = product AND max_table_1.maxq = qty)max_table_2

WHERE min_table_2.c2 = max_table_2.c2 AND min_table_2.p2 = max_table_2.p2
Reasoning:
The outer most FROM joins the inner 2 tables (that we generate).
First we generate one result set from a subquery, the columns of this result set are: customer, product, MinQty.
But, we need the date of the record with the minimum quantity, so we use write another subquery using the results of the inner most query, and generate a new result set with the following columns: customer, product, MinQty, min_date
We similarly generate another table for maximum quantity with columns: customer, product, maxQty, max_date.
Once this much is done, we join these 2 generated result sets with the outer most query, the criteria for this implicit inner join is the customer and product columns of the minimum and maximum qty result sets should be equal. This is specified in our outer WHERE clause
WHERE min_table_2.c2 = max_table_2.c2 AND min_table_2.p2 = max_table_2.p2
Once the outer level join is done, its just a matter of selecting the columns we need for the final result set.

SQL queries can be real fun :)
cheers!