Chris Crowe's Blog
Ramblings of an IIS MVP ( MVP Since 1997 )

Powered By IIS 7

Search my blog

Some of my readers



My Microsoft Certifications


Dec 15, 1998

Dec 20, 2000

Jan 31, 2001

Jul 22, 2002

Nov 1, 2004

My Microsoft MVP Awards




1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
mvp.support.microsoft.com

First off - we are not going to be using any aggregates but it looks like we may from the results...

One of the problems I often find and often can never think of a solution is something like this:

Lately I have been dealing with Weather data. I am storing a number of attributes related to the current weather each minute such as the DateTime, the Temperature, Humidity etc.

Now say I want to return a single result set that includes data for the current month with the following fields:

  • Date
  • Min Temperature
  • Date & Time of the Min Temperature
  • Max Temperature
  • Date & Time of the Max Temperature

I spent a bit of time and I could not find a nice solution to the problem so I posted a question to the microsoft.public.sqlserver.newsusers newsgroup. I got a very good response from Tom Cooper. His solution uses SQL 2005 Common Table Expressions (CTE) which is exactly what I was wanting and so to help me and others I have decided to document how it works.

Lets look at a snapshot of the weather data ( yes there is a column with the date, and another with the date time but you could truncate the datetime to a date field if required, but for me it was quicker to just add a date column as well )

id           Date                                        DateTime                              Temp
19649    2008-07-01 00:00:00.000    2008-07-01 00:01:00.000    -0.44
19650    2008-07-01 00:00:00.000    2008-07-01 00:02:00.000    -0.44
19651    2008-07-01 00:00:00.000    2008-07-01 00:03:00.000    -0.39
19652    2008-07-01 00:00:00.000    2008-07-01 00:04:00.000    -0.39
19653    2008-07-01 00:00:00.000    2008-07-01 00:05:00.000    -0.39
19654    2008-07-01 00:00:00.000    2008-07-01 00:06:00.000    -0.33
19655    2008-07-01 00:00:00.000    2008-07-01 00:07:00.000    -0.33

Now as you can see we will have lots of data: 1 row per minute = 60 per hour, which means we get 1,440 rows per day or around 43,000 rows per month.

If we were using our sample 7 rows above the output we want to see is the following:

DateTime       MinTemp            MinTempDateTime               MaxTemp            MaxTempDateTime
2008-07-01   -0.44             2008-07-01 00:01:00.000    -0.33             2008-07-01 00:06:00.000

Here is the code that would do that:

declare @StartDate    datetime
declare @EndDate    datetime

set @StartDate = '2008-7-1'
set @EndDate = dateadd(minute, 7, @startDate)
;

With MinTemp(WeatherDate, WeatherDateTime, Temperature, rn) As
(
    select w.Date,  w.DateTime, w.Temp, Row_Number() Over (Partition By w.Date Order By w.Temp)
    From Weather w 
    where w.DateTime between @StartDate and @EndDate
),
 MaxTemp(WeatherDate, WeatherDateTime, Temperature,  rn) As
(
    Select w.Date,  w.DateTime,  w.Temp,  Row_Number() Over (Partition By w.Date Order By w.Temp Desc)
    From Weather w
    where w.DateTime between @StartDate and @EndDate
)
Select mn.WeatherDate As 'DateTime',
  mn.Temperature As MinTemp,
  mn.WeatherDateTime As MinTempDateTime,
  mx.Temperature As MaxTemp,
  mx.WeatherDateTime As MaxTempDateTime  
From MinTemp mn
Inner Join MaxTemp mx On mn.WeatherDate = mx.WeatherDate 
where mn.rn = 1 And mx.rn = 1
order by mn.WeatherDate

Now I will break it down and show you the results and how they are produced:

DateTime       MinTemp            MinTempDateTime               MaxTemp            MaxTempDateTime
2008-07-01   -0.44             2008-07-01 00:01:00.000    -0.33             2008-07-01 00:06:00.000

There are two CTE (Common Table Expressions - new in SQL 2005) used in this query.

If we break it down we have this query first:

declare @StartDate    datetime
declare @EndDate    datetime

set @StartDate = '2008-7-1'
set @EndDate = dateadd(minute, 7, @startDate)
;

select w.Date,  w.DateTime, w.Temp, Row_Number() Over (Partition By w.Date Order By w.Temp)
From Weather w 
where w.DateTime between @StartDate and @EndDate
This produces the following results

Date                                  DateTime                          Temp   RowNumber
2008-07-01 00:00:00.000    2008-07-01 00:01:00.000    -0.44     1
2008-07-01 00:00:00.000    2008-07-01 00:02:00.000    -0.44     2
2008-07-01 00:00:00.000    2008-07-01 00:03:00.000    -0.39     3
2008-07-01 00:00:00.000    2008-07-01 00:04:00.000    -0.39     4
2008-07-01 00:00:00.000    2008-07-01 00:05:00.000    -0.39     5
2008-07-01 00:00:00.000    2008-07-01 00:06:00.000    -0.33     6
2008-07-01 00:00:00.000    2008-07-01 00:07:00.000    -0.33     7

So we basically are returning a result set of the temperatures sorted by the temperature and we have included a new RowNumber field. This Row number field is reset to 1 for every different Date ( as defined by w.Date) in the result set. Since we are only dealing with 7 minutes in the same day our row number just returns the values 1-7. But if we had multiple dates we would have other rows where the RowNumber would reset to 1 and start incrementing again.

This is handled by the Row_Number() Over (Partition By w.Date Order By w.Temp)  T-SQL code

Based on this we could then query the above result set to only return the records where the RowNumber is equal to 1

Date                                    DateTime                            Temp  
2008-07-01 00:00:00.000    2008-07-01 00:01:00.000    -0.44    

This now contains the actual date without the time, the date of the minimum temperature, and the minimum temperature.

When working with the max temp it is exactly the same except we sort the results in descending order.

 

declare @StartDate    datetime
declare @EndDate    datetime

set @StartDate = '2008-7-1'
set @EndDate = dateadd(minute, 7, @startDate)
;

select w.Date,  w.DateTime, w.Temp, Row_Number() Over (Partition By w.Date Order By w.Temp desc)
From Weather w 
where w.DateTime between @StartDate and @EndDate
This produces the following results

Date                                    DateTime                           Temp   RowNumber
2008-07-01 00:00:00.000    2008-07-01 00:06:00.000    -0.33     1
2008-07-01 00:00:00.000    2008-07-01 00:07:00.000    -0.33     2
2008-07-01 00:00:00.000    2008-07-01 00:03:00.000    -0.39     3
2008-07-01 00:00:00.000    2008-07-01 00:04:00.000    -0.39     4
2008-07-01 00:00:00.000    2008-07-01 00:05:00.000    -0.39     5
2008-07-01 00:00:00.000    2008-07-01 00:01:00.000    -0.44     6
2008-07-01 00:00:00.000    2008-07-01 00:02:00.000    -0.44     7

So basically the only difference has been the order by clause to sort in descending order Row_Number() Over (Partition By w.Date Order By w.Temp desc) 

So if you look at both CTEs we have one record set sorted by the minimum temperature by the date and another record set sorted by the maximum temperature by the date.

We then do a normal inner-join on the date field and limit the results to only include the rows where the RowNumber = 1

Select mn.WeatherDate As 'DateTime',
  mn.Temperature As MinTemp,
  mn.WeatherDateTime As MinTempDateTime,
  mx.Temperature As MaxTemp,
  mx.WeatherDateTime As MaxTempDateTime  
From MinTemp mn
Inner Join MaxTemp mx On mn.WeatherDate = mx.WeatherDate 
where mn.rn = 1 And mx.rn = 1
order by mn.WeatherDate

Hopefully I have explained it without being too complex, but essentially the CTEs are different result sets - think of them as a table. So we have a table of minimum temperatures and a table of maximum temperatures.

We can now run this for basically any date period and gets results by per day. Here is some results for July 2008. ps: The weather data is from Christchurch, New Zealand.

Date                                      Min     Min DateTime                    Max    Max DateTime
2008-07-01 00:00:00.000    -3.00    2008-07-01 07:45:00.000     8.33    2008-07-01 14:36:00.000
2008-07-02 00:00:00.000     6.61    2008-07-02 23:59:00.000    17.28    2008-07-02 14:48:00.000
2008-07-03 00:00:00.000     5.11    2008-07-03 02:15:00.000    15.61    2008-07-03 14:37:00.000
2008-07-04 00:00:00.000     1.39    2008-07-04 05:24:00.000    10.56    2008-07-04 12:50:00.000
2008-07-05 00:00:00.000    -0.22    2008-07-05 03:39:00.000     6.78    2008-07-05 11:02:00.000
2008-07-06 00:00:00.000     0.56    2008-07-06 22:35:00.000    10.00    2008-07-06 13:58:00.000
2008-07-07 00:00:00.000   -0.11    2008-07-07 04:10:00.000       8.78    2008-07-07 13:30:00.000
...

2008-07-19 00:00:00.000    6.83    2008-07-19 23:09:00.000      8.39    2008-07-19 17:18:00.000
2008-07-20 00:00:00.000    4.39    2008-07-20 08:05:00.000    15.06    2008-07-20 13:24:00.000
2008-07-21 00:00:00.000    5.83    2008-07-21 07:21:00.000    15.44    2008-07-21 14:23:00.000
2008-07-22 00:00:00.000    4.94    2008-07-22 23:50:00.000    12.67    2008-07-22 13:16:00.000
2008-07-23 00:00:00.000    5.06    2008-07-23 00:03:00.000      8.83    2008-07-23 16:59:00.000

Click here for a google search result for Common Table Expressions

http://www.google.com/search?hl=en&q=CTE+Common+Table+Expressions&btnG=Search&meta=

posted on Wednesday, July 23, 2008 9:35 AM |

Comments


# Producing a single result set with Aggregates and non Aggregate columns.

Gravatar
This is exactly what I was asking you about the other week. (more or less)

And now you know how to-do it...
Posted by Simeon on 7/23/2008 11:55 AM

# re: SQL : Producing a single result set with Aggregates and non Aggregate columns.

Gravatar
How to get table list(table names) for a database according to batch size.

e.g. if i there r 50 tables in database then if i want table names from 10 to 30 range.

reqired in sql 2000 and oracle
Posted by vijay on 7/24/2008 9:20 PM

Post Comment


Title *
Name *
Email
Url
Comment *  
Please add 8 and 2 and type the answer here: