TIMESERIES Example

Moderator: NorbertKrupa

Post Reply
User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

TIMESERIES Example

Post by JimKnicely » Thu Aug 09, 2012 2:38 pm

There is a really cool feature in Vertica that can be used in conjunction with a SELECT statement to provide gap-filling and interpolation (GFI) computations. What does that mean? Let’s take a look at an example.

Pretend we live in a crazy world where there can be extreme fluctuations in the price of gasoline over very short amounts of time. Imagine we’re tracking the gas prices as they change in a table named TS_TEST in a Vertica database.

Here is a sample set of data from our database:

Code: Select all

dbadmin=> SELECT item, price_time, price
dbadmin->   FROM ts_test
dbadmin->  WHERE price_time BETWEEN timestamp '2012-07-26 08:00' AND timestamp '2012-07-26 08:25'
dbadmin->  ORDER
dbadmin->     BY item, price_time, price;
  item   |     price_time      | price
---------+---------------------+-------
Reg Gas | 2012-07-26 08:00:00 |  3.75
Reg Gas | 2012-07-26 08:00:15 |  3.81
Reg Gas | 2012-07-26 08:00:20 |  3.50
Reg Gas | 2012-07-26 08:01:04 |  3.57
Reg Gas | 2012-07-26 08:01:47 |  3.59
Reg Gas | 2012-07-26 08:02:12 |  4.10
Reg Gas | 2012-07-26 08:02:39 |  3.60
Reg Gas | 2012-07-26 08:02:52 |  4.00
Reg Gas | 2012-07-26 08:03:29 |  4.01
Reg Gas | 2012-07-26 08:03:46 |  3.89
Reg Gas | 2012-07-26 08:04:45 |  3.95
Reg Gas | 2012-07-26 08:05:01 |  3.95
Reg Gas | 2012-07-26 08:05:28 |  4.00
Reg Gas | 2012-07-26 08:06:35 |  4.02
Reg Gas | 2012-07-26 08:06:54 |  3.91
Reg Gas | 2012-07-26 08:07:19 |  3.80
Reg Gas | 2012-07-26 08:07:19 |  3.80
Reg Gas | 2012-07-26 08:07:29 |  3.89
Reg Gas | 2012-07-26 08:07:35 |  3.89
Reg Gas | 2012-07-26 08:09:43 |  3.75
Reg Gas | 2012-07-26 08:09:43 |  3.75
Reg Gas | 2012-07-26 08:10:31 |  3.75
Reg Gas | 2012-07-26 08:10:31 |  3.75
Reg Gas | 2012-07-26 08:10:35 |  3.76
Reg Gas | 2012-07-26 08:10:35 |  3.76
Reg Gas | 2012-07-26 08:11:20 |  3.86
Reg Gas | 2012-07-26 08:11:20 |  3.86
Reg Gas | 2012-07-26 08:11:35 |  3.87
Reg Gas | 2012-07-26 08:11:59 |  3.90
Reg Gas | 2012-07-26 08:12:01 |  3.91
Reg Gas | 2012-07-26 08:12:20 |  3.95
Reg Gas | 2012-07-26 08:12:23 |  3.96
Reg Gas | 2012-07-26 08:13:00 |  4.50
Reg Gas | 2012-07-26 08:13:22 |  4.55
Reg Gas | 2012-07-26 08:13:53 |  4.67
Reg Gas | 2012-07-26 08:14:05 |  4.87
Reg Gas | 2012-07-26 08:14:13 |  4.87
Reg Gas | 2012-07-26 08:14:45 |  4.90
Reg Gas | 2012-07-26 08:14:50 |  4.96
Reg Gas | 2012-07-26 08:15:00 |  5.00
Reg Gas | 2012-07-26 08:15:20 |  5.01
Reg Gas | 2012-07-26 08:15:23 |  5.05
Reg Gas | 2012-07-26 08:17:13 |  5.15
Reg Gas | 2012-07-26 08:20:12 |  5.16
Reg Gas | 2012-07-26 08:24:43 |  5.10
Reg Gas | 2012-07-26 08:25:00 |  5.11
(46 rows)
Notice that the times in the price_time column are relatively random. For reporting purposes we may only want to see the gas price at finite intervals, i.e. every 10 second, every 30 seconds, every minute, etc. This is where the TIMESERIES clause comes into play.

I’d like to see the price of gasoline every 1 minute starting at 8:00 am and ending at 8:25 am. The following query using the TIMESERIES clause of the SELECT statement will give me that information:

Code: Select all

dbadmin=> SELECT item, slice_time, ts_first_value(price, 'const') price
dbadmin->   FROM ts_test
dbadmin->  WHERE price_time BETWEEN timestamp '2012-07-26 08:00' AND timestamp '2012-07-26 08:25'
dbadmin->  TIMESERIES slice_time AS '1 minute' OVER (PARTITION BY item ORDER BY price_time)
dbadmin->  ORDER
dbadmin->     BY item, slice_time, price;
  item   |     slice_time      | price
---------+---------------------+-------
Reg Gas | 2012-07-26 08:00:00 |  3.75
Reg Gas | 2012-07-26 08:01:00 |  3.50
Reg Gas | 2012-07-26 08:02:00 |  3.59
Reg Gas | 2012-07-26 08:03:00 |  4.00
Reg Gas | 2012-07-26 08:04:00 |  3.89
Reg Gas | 2012-07-26 08:05:00 |  3.95
Reg Gas | 2012-07-26 08:06:00 |  4.00
Reg Gas | 2012-07-26 08:07:00 |  3.91
Reg Gas | 2012-07-26 08:08:00 |  3.89
Reg Gas | 2012-07-26 08:09:00 |  3.89
Reg Gas | 2012-07-26 08:10:00 |  3.75
Reg Gas | 2012-07-26 08:11:00 |  3.76
Reg Gas | 2012-07-26 08:12:00 |  3.90
Reg Gas | 2012-07-26 08:13:00 |  4.50
Reg Gas | 2012-07-26 08:14:00 |  4.67
Reg Gas | 2012-07-26 08:15:00 |  5.00
Reg Gas | 2012-07-26 08:16:00 |  5.05
Reg Gas | 2012-07-26 08:17:00 |  5.05
Reg Gas | 2012-07-26 08:18:00 |  5.15
Reg Gas | 2012-07-26 08:19:00 |  5.15
Reg Gas | 2012-07-26 08:20:00 |  5.15
Reg Gas | 2012-07-26 08:21:00 |  5.16
Reg Gas | 2012-07-26 08:22:00 |  5.16
Reg Gas | 2012-07-26 08:23:00 |  5.16
Reg Gas | 2012-07-26 08:24:00 |  5.16
Reg Gas | 2012-07-26 08:25:00 |  5.11
(26 rows)
Notice that although we have no data in our table for the times 08:16, 08:18, 08:19, 08:21, 08:22 and 08:23, we still see a gas price in the results due to the gap-filling capability of the TIMESERIES clause.

There is a lot more fun to be had with the TIMESERIES clause. Please read about it in the Vertica “SQL Reference Manual” beginning on page 861!
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

Post Reply

Return to “Vertica Tips, Lessons and Examples”