time_slice

Forum to discuss any topic related specifically to SQL in Vertica

time_slice

Postby wlam » Fri Apr 06, 2012 3:39 am

When the slice_length argument for the time_slice() function is not a factor of the unit, when does the time slice begin?
Does the very first time slice begin at 0001-1-1 00:00:00 ?

For the following time_slice

time_slice(timestamp '0001-1-1 00:00:00', 59, 'second', 'start')

I get this result

0001-12-31 23:59:34 BC

I am confused.
wlam
Newbie
Newbie
 
Posts: 6
Joined: Thu Apr 05, 2012 10:52 pm

Message from our Sponsor

Sponsor

Sponsor
 

Re: time_slice

Postby knicely87 » Thu Apr 12, 2012 2:24 pm

Thanks for driving me crazy the last few days while I thought about an answer to your question! :)

I was also very confused by your example at first, but I think this is how it all works out:

    1) Time slices appear to be relative to the smallest (oldest) time stamp in Vertica which I believe is the time stamp '290279-12-23 00:00:00 BC'. I show here that the TIMESTAMP function returns an "out of range" error for a time stamp which is just once second earlier than '290279-12-23 00:00:00 BC':
    Code: Select all
    dbadmin=> select timestamp '290279-12-23 00:00:00 BC' a_really_long_time_ago;
      a_really_long_time_ago
    --------------------------
     290279-12-23 00:00:00 BC
    (1 row)

    dbadmin=> select timestamp '290279-12-22 23:59:59 BC' way_too_long_ago;
    ERROR:  date/time field value out of range: "290279-12-22 23:59:59 BC"
    2) We can find the start of the 59 second time slice for the time stamp '290279-12-23 00:00:00 BC' using the TIME_SLICE function:
    Code: Select all
    dbadmin=> select time_slice(timestamp '290279-12-23 00:00:00 BC', 59, 'second', 'start');
            time_slice
    --------------------------
     290279-12-22 23:59:43 BC
    (1 row)
    It's interesting that we can compute this value even though it's a time stamp earlier than the time stamp we already determined to be the smallest allowable in Vertica! This must be a special case.

    Just to clarify, here we see that we can store the time stamp '290279-12-23 00:00:00 BC' but not the time stamp '290279-12-22 23:59:43 BC' in a time stamp field in a table:
    Code: Select all
    dbadmin=> create table test2 (a_timestamp timestamp);
    CREATE TABLE
    dbadmin=> insert into test2 values (timestamp '290279-12-23 00:00:00 BC');
     OUTPUT
    --------
          1
    (1 row)

    dbadmin=> insert into test2 values (timestamp '290279-12-22 23:59:43 BC');
    ERROR:  date/time field value out of range: "290279-12-22 23:59:43 BC"
    3) Now we need to determine how many seconds there are between the time stamp '290279-12-22 23:59:43 BC' (the start of the earliest 59 second time slice) and the time stamp '0001-1-1 00:00:00' (from your example):
    Code: Select all
    dbadmin=> select timestampdiff('second', timestamp '290279-12-22 23:59:43 BC', timestamp '0001-1-1 00:00:00') seconds_between_time_stamps;
    ERROR:  date/time field value out of range: "290279-12-22 23:59:43 BC"
    Oops, we can't use the time stamp '290279-12-22 23:59:43 BC' so we'll have to use the ending time stamp for the time slice:
    Code: Select all
    dbadmin=> select time_slice(timestamp '290279-12-23 00:00:00 BC', 59, 'second', 'end');
            time_slice
    --------------------------
     290279-12-23 00:00:42 BC
    (1 row)

    dbadmin=> select timestampdiff('second', timestamp '290279-12-23 00:00:42 BC', timestamp '0001-1-1 00:00:00') seconds_between_time_stamps;
     seconds_between_time_stamps
    -----------------------------
                   9160289740758
    (1 row)
    4) We need to add the 59 seconds for the first time slice that we could not compute:
    Code: Select all
    dbadmin=> select 9160289740758 + 59 secs_with_first_ts;
     secs_with_first_ts
    --------------------
          9160289740817
    (1 row)
    5) We can then divide the total seconds by 59 to get the total number of time slices:
    Code: Select all
    dbadmin=> select 9160289740817 / 59.0 total_time_slices;
          total_time_slices
    ------------------------------
     155259148149.440677966101695
    (1 row)
    6) The fractional number of seconds computed in step 5 can be used to find the number of seconds that we need to subtract from your example time stamp '0001-1-1 00:00:00' to get to the start of its 59 second time slice:
    Code: Select all
    dbadmin=> select .440677966101695 * 59 seconds_to_ts_start;
     seconds_to_ts_start
    ---------------------
      26.000000000000005
    (1 row)

    dbadmin=> select  timestamp '0001-1-1 00:00:00' - '26 seconds'::INTERVAL my_timeslice_start;
       my_timeslice_start
    ------------------------
     0001-12-31 23:59:34 BC
    (1 row)
And that's how I believe that Vertica arrived at the time stamp '0001-12-31 23:59:34 BC' as the start of the 59 second time slice for the time stamp '0001-1-1 00:00:00' in your example!

We can condense steps 3-6 from above into one SELECT statement:
Code: Select all
dbadmin=> select timestampadd('second', (-1)*cast(round(mod((timestampdiff('second', time_slice(timestamp '290279-12-23 00:00:00 BC', 59, 'second', 'end'), timestamp '0001-01-01 00:00:00')+59)/59.0,1)*59,1.0) as int), timestamp '0001-01-01 00:00:00');
      timestampadd
------------------------
 0001-12-31 23:59:34 BC
(1 row)

Note that according to the Vertica documentation, this is expected behavior, as the following properties are true for all time slices:
    1. Equal in length
    2. Consecutive (no gaps between them)
    3. Non-overlapping
I hope this helps!
Jim Knicely
HP ATP - Vertica Big Data Solutions V1 Certified
User avatar
knicely87
Site Admin
Site Admin
 
Posts: 729
Joined: Sat Jan 21, 2012 4:58 am

Re: time_slice

Postby Tom » Thu Apr 12, 2012 4:02 pm

:o I'll have to read that again...
Tom
Newbie
Newbie
 
Posts: 6
Joined: Wed Jan 25, 2012 8:11 pm

Re: time_slice

Postby wlam » Thu Apr 12, 2012 6:23 pm

knicely87,

Thank you very much for your answer. I was worried that no one would answer.
wlam
Newbie
Newbie
 
Posts: 6
Joined: Thu Apr 05, 2012 10:52 pm

Re: time_slice

Postby knicely87 » Thu Apr 12, 2012 6:59 pm

You are welcome! I hope the answer made sense!

If you don't mind me asking, how are you using the TIME_SLICE function? Someone else asked me what the applicable use for this function is in the real world.
Jim Knicely
HP ATP - Vertica Big Data Solutions V1 Certified
User avatar
knicely87
Site Admin
Site Admin
 
Posts: 729
Joined: Sat Jan 21, 2012 4:58 am

Re: time_slice

Postby wlam » Thu Apr 12, 2012 7:10 pm

knicely87,

Looking at your example, it is still not clear why 290279-12-22 23:59:38 BC is the first time slice for 59 seconds.

time_slice(timestamp '290279-12-23 00:00:00 BC', 58, 'second', 'start') = 290279-12-22 23:59:38 BC
time_slice(timestamp '290279-12-23 00:00:00 BC', 57, 'second', 'start') = 290279-12-22 23:59:39 BC
time_slice(timestamp '290279-12-23 00:00:00 BC', 56, 'second', 'start') = 290279-12-22 23:59:36 BC
time_slice(timestamp '290279-12-23 00:00:00 BC', 55, 'second', 'start') = 290279-12-22 23:59:35 BC
time_slice(timestamp '290279-12-23 00:00:00 BC', 54, 'second', 'start') = 290279-12-23 00:00:00 BC
time_slice(timestamp '290279-12-23 00:00:00 BC', 53, 'second', 'start') = 290279-12-22 23:59:12 BC

How does Vertica determine the first time_slice?
wlam
Newbie
Newbie
 
Posts: 6
Joined: Thu Apr 05, 2012 10:52 pm

Re: time_slice

Postby wlam » Thu Apr 12, 2012 8:19 pm

Here is an example usage that my customer gave to me:

I have used that function to aggregate data for every 5-minute intervals. For instance, there are trades happening for a stock symbol throughout the day. My analysis might require looking at things for 5-minute intervals. I can use the TIME_SLICE function to round a timestamp column to 5-minute interval. The result is that I am now looking at only 288 data points out of a 24 hour window, instead of millions of data points if I don’t aggregate using TIME_SLICE.
wlam
Newbie
Newbie
 
Posts: 6
Joined: Thu Apr 05, 2012 10:52 pm

Re: time_slice

Postby knicely87 » Fri Apr 13, 2012 2:25 pm

wlam,

I'm not sure exactly sure how Vertica determines the very first start time stamp for a given time slice. However, I think it is important to note that there isn't one single starting point for all time slices... There appears to be an absolute start time stamp for each distinct slice_length and time_unit of a time slice. For instance, you found that the absolute start time stamp for all "58 second" time slices is "290279-12-23 00:00:36 BC" - there are no earlier time slices that are 58 seconds in length (at least ones that we can operate on). All future "58 second" time slices are relative to the initial "290279-12-23 00:00:36 BC" time stamp.

I mentioned in a previous post that I think that time stamps earlier than "290279-12-23 00:00:00 BC" are all special cases that only Vertica can handle. Keep in mind, I'm just speculating... Maybe someone from Vertica can provide us with a better explanation ;)

Now that I think about it, there must also be an upper limit for each time slice. That is, I wonder what the maximum end time stamps are for each distinct slice_length and time_unit?!?!

PS. Thanks for the great example of how you're using the TIME_SLICE function in the real world...
Jim Knicely
HP ATP - Vertica Big Data Solutions V1 Certified
User avatar
knicely87
Site Admin
Site Admin
 
Posts: 729
Joined: Sat Jan 21, 2012 4:58 am


Return to Vertica SQL

Who is online

Users browsing this forum: No registered users and 1 guest