First off - we are not going to be using any aggregates but it looks like we may from the results...
One of the problems I often find and often can never think of a solution is something like this:
Lately I have been dealing with Weather data. I am storing a number of attributes related to the current weather each minute such as the DateTime, the Temperature, Humidity etc.
Now say I want to return a single result set that includes data for the current month with the following fields:
- Date
- Min Temperature
- Date & Time of the Min Temperature
- Max Temperature
- Date & Time of the Max Temperature
I spent a bit of time and I could not find a nice solution to the problem so I posted a question to the microsoft.public.sqlserver.newsusers newsgroup. I got a very good response from Tom Cooper. His solution uses SQL 2005 Common Table Expressions (CTE) which is exactly what I was wanting and so to help me and others I have decided to document how it works.
Lets look at a snapshot of the weather data ( yes there is a column with the date, and another with the date time but you could truncate the datetime to a date field if required, but for me it was quicker to just add a date column as well )
id Date DateTime Temp
19649 2008-07-01 00:00:00.000 2008-07-01 00:01:00.000 -0.44
19650 2008-07-01 00:00:00.000 2008-07-01 00:02:00.000 -0.44
19651 2008-07-01 00:00:00.000 2008-07-01 00:03:00.000 -0.39
19652 2008-07-01 00:00:00.000 2008-07-01 00:04:00.000 -0.39
19653 2008-07-01 00:00:00.000 2008-07-01 00:05:00.000 -0.39
19654 2008-07-01 00:00:00.000 2008-07-01 00:06:00.000 -0.33
19655 2008-07-01 00:00:00.000 2008-07-01 00:07:00.000 -0.33
Now as you can see we will have lots of data: 1 row per minute = 60 per hour, which means we get 1,440 rows per day or around 43,000 rows per month.
If we were using our sample 7 rows above the output we want to see is the following:
DateTime MinTemp MinTempDateTime MaxTemp MaxTempDateTime
2008-07-01 -0.44 2008-07-01 00:01:00.000 -0.33 2008-07-01 00:06:00.000
Here is the code that would do that:
declare @StartDate datetime
declare @EndDate datetime
set @StartDate = '2008-7-1'
set @EndDate = dateadd(minute, 7, @startDate)
;
With MinTemp(WeatherDate, WeatherDateTime, Temperature, rn) As
(
select w.Date, w.DateTime, w.Temp, Row_Number() Over (Partition By w.Date Order By w.Temp)
From Weather w
where w.DateTime between @StartDate and @EndDate
),
MaxTemp(WeatherDate, WeatherDateTime, Temperature, rn) As
(
Select w.Date, w.DateTime, w.Temp, Row_Number() Over (Partition By w.Date Order By w.Temp Desc)
From Weather w
where w.DateTime between @StartDate and @EndDate
)
Select mn.WeatherDate As 'DateTime',
mn.Temperature As MinTemp,
mn.WeatherDateTime As MinTempDateTime,
mx.Temperature As MaxTemp,
mx.WeatherDateTime As MaxTempDateTime
From MinTemp mn
Inner Join MaxTemp mx On mn.WeatherDate = mx.WeatherDate
where mn.rn = 1 And mx.rn = 1
order by mn.WeatherDate
Now I will break it down and show you the results and how they are produced:
DateTime MinTemp MinTempDateTime MaxTemp MaxTempDateTime
2008-07-01 -0.44 2008-07-01 00:01:00.000 -0.33 2008-07-01 00:06:00.000
There are two CTE (Common Table Expressions - new in SQL 2005) used in this query.
If we break it down we have this query first:
declare @StartDate datetime
declare @EndDate datetime
set @StartDate = '2008-7-1'
set @EndDate = dateadd(minute, 7, @startDate)
;
select w.Date, w.DateTime, w.Temp, Row_Number() Over (Partition By w.Date Order By w.Temp)
From Weather w
where w.DateTime between @StartDate and @EndDate
This produces the following results
Date DateTime Temp RowNumber
2008-07-01 00:00:00.000 2008-07-01 00:01:00.000 -0.44 1
2008-07-01 00:00:00.000 2008-07-01 00:02:00.000 -0.44 2
2008-07-01 00:00:00.000 2008-07-01 00:03:00.000 -0.39 3
2008-07-01 00:00:00.000 2008-07-01 00:04:00.000 -0.39 4
2008-07-01 00:00:00.000 2008-07-01 00:05:00.000 -0.39 5
2008-07-01 00:00:00.000 2008-07-01 00:06:00.000 -0.33 6
2008-07-01 00:00:00.000 2008-07-01 00:07:00.000 -0.33 7
So we basically are returning a result set of the temperatures sorted by the temperature and we have included a new RowNumber field. This Row number field is reset to 1 for every different Date ( as defined by w.Date) in the result set. Since we are only dealing with 7 minutes in the same day our row number just returns the values 1-7. But if we had multiple dates we would have other rows where the RowNumber would reset to 1 and start incrementing again.
This is handled by the Row_Number() Over (Partition By w.Date Order By w.Temp) T-SQL code
Based on this we could then query the above result set to only return the records where the RowNumber is equal to 1
Date DateTime Temp
2008-07-01 00:00:00.000 2008-07-01 00:01:00.000 -0.44
This now contains the actual date without the time, the date of the minimum temperature, and the minimum temperature.
When working with the max temp it is exactly the same except we sort the results in descending order.
declare @StartDate datetime
declare @EndDate datetime
set @StartDate = '2008-7-1'
set @EndDate = dateadd(minute, 7, @startDate)
;
select w.Date, w.DateTime, w.Temp, Row_Number() Over (Partition By w.Date Order By w.Temp desc)
From Weather w
where w.DateTime between @StartDate and @EndDate
This produces the following results
Date DateTime Temp RowNumber
2008-07-01 00:00:00.000 2008-07-01 00:06:00.000 -0.33 1
2008-07-01 00:00:00.000 2008-07-01 00:07:00.000 -0.33 2
2008-07-01 00:00:00.000 2008-07-01 00:03:00.000 -0.39 3
2008-07-01 00:00:00.000 2008-07-01 00:04:00.000 -0.39 4
2008-07-01 00:00:00.000 2008-07-01 00:05:00.000 -0.39 5
2008-07-01 00:00:00.000 2008-07-01 00:01:00.000 -0.44 6
2008-07-01 00:00:00.000 2008-07-01 00:02:00.000 -0.44 7
So basically the only difference has been the order by clause to sort in descending order Row_Number() Over (Partition By w.Date Order By w.Temp desc)
So if you look at both CTEs we have one record set sorted by the minimum temperature by the date and another record set sorted by the maximum temperature by the date.
We then do a normal inner-join on the date field and limit the results to only include the rows where the RowNumber = 1
Select mn.WeatherDate As 'DateTime',
mn.Temperature As MinTemp,
mn.WeatherDateTime As MinTempDateTime,
mx.Temperature As MaxTemp,
mx.WeatherDateTime As MaxTempDateTime
From MinTemp mn
Inner Join MaxTemp mx On mn.WeatherDate = mx.WeatherDate
where mn.rn = 1 And mx.rn = 1
order by mn.WeatherDate
Hopefully I have explained it without being too complex, but essentially the CTEs are different result sets - think of them as a table. So we have a table of minimum temperatures and a table of maximum temperatures.
We can now run this for basically any date period and gets results by per day. Here is some results for July 2008. ps: The weather data is from Christchurch, New Zealand.
Date Min Min DateTime Max Max DateTime
2008-07-01 00:00:00.000 -3.00 2008-07-01 07:45:00.000 8.33 2008-07-01 14:36:00.000
2008-07-02 00:00:00.000 6.61 2008-07-02 23:59:00.000 17.28 2008-07-02 14:48:00.000
2008-07-03 00:00:00.000 5.11 2008-07-03 02:15:00.000 15.61 2008-07-03 14:37:00.000
2008-07-04 00:00:00.000 1.39 2008-07-04 05:24:00.000 10.56 2008-07-04 12:50:00.000
2008-07-05 00:00:00.000 -0.22 2008-07-05 03:39:00.000 6.78 2008-07-05 11:02:00.000
2008-07-06 00:00:00.000 0.56 2008-07-06 22:35:00.000 10.00 2008-07-06 13:58:00.000
2008-07-07 00:00:00.000 -0.11 2008-07-07 04:10:00.000 8.78 2008-07-07 13:30:00.000
...
2008-07-19 00:00:00.000 6.83 2008-07-19 23:09:00.000 8.39 2008-07-19 17:18:00.000
2008-07-20 00:00:00.000 4.39 2008-07-20 08:05:00.000 15.06 2008-07-20 13:24:00.000
2008-07-21 00:00:00.000 5.83 2008-07-21 07:21:00.000 15.44 2008-07-21 14:23:00.000
2008-07-22 00:00:00.000 4.94 2008-07-22 23:50:00.000 12.67 2008-07-22 13:16:00.000
2008-07-23 00:00:00.000 5.06 2008-07-23 00:03:00.000 8.83 2008-07-23 16:59:00.000
Click here for a google search result for Common Table Expressions
http://www.google.com/search?hl=en&q=CTE+Common+Table+Expressions&btnG=Search&meta=