|
|
Forum to discuss any topic related specifically to SQL in Vertica
by wlam » Fri Apr 06, 2012 3:39 am
When the slice_length argument for the time_slice() function is not a factor of the unit, when does the time slice begin? Does the very first time slice begin at 0001-1-1 00:00:00 ?
For the following time_slice
time_slice(timestamp '0001-1-1 00:00:00', 59, 'second', 'start')
I get this result
0001-12-31 23:59:34 BC
I am confused.
-
wlam
- Newbie

-
- Posts: 6
- Joined: Thu Apr 05, 2012 10:52 pm
Message from our Sponsor
Sponsor
-
Sponsor
-
by knicely87 » Thu Apr 12, 2012 2:24 pm
Thanks for driving me crazy the last few days while I thought about an answer to your question!  I was also very confused by your example at first, but I think this is how it all works out: 1) Time slices appear to be relative to the smallest (oldest) time stamp in Vertica which I believe is the time stamp '290279-12-23 00:00:00 BC'. I show here that the TIMESTAMP function returns an "out of range" error for a time stamp which is just once second earlier than '290279-12-23 00:00:00 BC': - Code: Select all
dbadmin=> select timestamp '290279-12-23 00:00:00 BC' a_really_long_time_ago; a_really_long_time_ago -------------------------- 290279-12-23 00:00:00 BC (1 row)
dbadmin=> select timestamp '290279-12-22 23:59:59 BC' way_too_long_ago; ERROR: date/time field value out of range: "290279-12-22 23:59:59 BC"
2) We can find the start of the 59 second time slice for the time stamp '290279-12-23 00:00:00 BC' using the TIME_SLICE function: - Code: Select all
dbadmin=> select time_slice(timestamp '290279-12-23 00:00:00 BC', 59, 'second', 'start'); time_slice -------------------------- 290279-12-22 23:59:43 BC (1 row)
It's interesting that we can compute this value even though it's a time stamp earlier than the time stamp we already determined to be the smallest allowable in Vertica! This must be a special case.
Just to clarify, here we see that we can store the time stamp '290279-12-23 00:00:00 BC' but not the time stamp '290279-12-22 23:59:43 BC' in a time stamp field in a table: - Code: Select all
dbadmin=> create table test2 (a_timestamp timestamp); CREATE TABLE dbadmin=> insert into test2 values (timestamp '290279-12-23 00:00:00 BC'); OUTPUT -------- 1 (1 row)
dbadmin=> insert into test2 values (timestamp '290279-12-22 23:59:43 BC'); ERROR: date/time field value out of range: "290279-12-22 23:59:43 BC"
3) Now we need to determine how many seconds there are between the time stamp '290279-12-22 23:59:43 BC' (the start of the earliest 59 second time slice) and the time stamp '0001-1-1 00:00:00' (from your example): - Code: Select all
dbadmin=> select timestampdiff('second', timestamp '290279-12-22 23:59:43 BC', timestamp '0001-1-1 00:00:00') seconds_between_time_stamps; ERROR: date/time field value out of range: "290279-12-22 23:59:43 BC"
Oops, we can't use the time stamp '290279-12-22 23:59:43 BC' so we'll have to use the ending time stamp for the time slice: - Code: Select all
dbadmin=> select time_slice(timestamp '290279-12-23 00:00:00 BC', 59, 'second', 'end'); time_slice -------------------------- 290279-12-23 00:00:42 BC (1 row)
dbadmin=> select timestampdiff('second', timestamp '290279-12-23 00:00:42 BC', timestamp '0001-1-1 00:00:00') seconds_between_time_stamps; seconds_between_time_stamps ----------------------------- 9160289740758 (1 row)
4) We need to add the 59 seconds for the first time slice that we could not compute: - Code: Select all
dbadmin=> select 9160289740758 + 59 secs_with_first_ts; secs_with_first_ts -------------------- 9160289740817 (1 row)
5) We can then divide the total seconds by 59 to get the total number of time slices: - Code: Select all
dbadmin=> select 9160289740817 / 59.0 total_time_slices; total_time_slices ------------------------------ 155259148149.440677966101695 (1 row)
6) The fractional number of seconds computed in step 5 can be used to find the number of seconds that we need to subtract from your example time stamp '0001-1-1 00:00:00' to get to the start of its 59 second time slice: - Code: Select all
dbadmin=> select .440677966101695 * 59 seconds_to_ts_start; seconds_to_ts_start --------------------- 26.000000000000005 (1 row)
dbadmin=> select timestamp '0001-1-1 00:00:00' - '26 seconds'::INTERVAL my_timeslice_start; my_timeslice_start ------------------------ 0001-12-31 23:59:34 BC (1 row)
And that's how I believe that Vertica arrived at the time stamp '0001-12-31 23:59:34 BC' as the start of the 59 second time slice for the time stamp '0001-1-1 00:00:00' in your example! We can condense steps 3-6 from above into one SELECT statement: - Code: Select all
dbadmin=> select timestampadd('second', (-1)*cast(round(mod((timestampdiff('second', time_slice(timestamp '290279-12-23 00:00:00 BC', 59, 'second', 'end'), timestamp '0001-01-01 00:00:00')+59)/59.0,1)*59,1.0) as int), timestamp '0001-01-01 00:00:00'); timestampadd ------------------------ 0001-12-31 23:59:34 BC (1 row)
Note that according to the Vertica documentation, this is expected behavior, as the following properties are true for all time slices: 1. Equal in length 2. Consecutive (no gaps between them) 3. Non-overlapping I hope this helps!
Jim Knicely HP ATP - Vertica Big Data Solutions V1 Certified
-

knicely87
- Site Admin

-
- Posts: 729
- Joined: Sat Jan 21, 2012 4:58 am
-
by Tom » Thu Apr 12, 2012 4:02 pm
 I'll have to read that again...
-
Tom
- Newbie

-
- Posts: 6
- Joined: Wed Jan 25, 2012 8:11 pm
by wlam » Thu Apr 12, 2012 6:23 pm
knicely87,
Thank you very much for your answer. I was worried that no one would answer.
-
wlam
- Newbie

-
- Posts: 6
- Joined: Thu Apr 05, 2012 10:52 pm
by knicely87 » Thu Apr 12, 2012 6:59 pm
You are welcome! I hope the answer made sense!
If you don't mind me asking, how are you using the TIME_SLICE function? Someone else asked me what the applicable use for this function is in the real world.
Jim Knicely HP ATP - Vertica Big Data Solutions V1 Certified
-

knicely87
- Site Admin

-
- Posts: 729
- Joined: Sat Jan 21, 2012 4:58 am
-
by wlam » Thu Apr 12, 2012 7:10 pm
knicely87,
Looking at your example, it is still not clear why 290279-12-22 23:59:38 BC is the first time slice for 59 seconds.
time_slice(timestamp '290279-12-23 00:00:00 BC', 58, 'second', 'start') = 290279-12-22 23:59:38 BC time_slice(timestamp '290279-12-23 00:00:00 BC', 57, 'second', 'start') = 290279-12-22 23:59:39 BC time_slice(timestamp '290279-12-23 00:00:00 BC', 56, 'second', 'start') = 290279-12-22 23:59:36 BC time_slice(timestamp '290279-12-23 00:00:00 BC', 55, 'second', 'start') = 290279-12-22 23:59:35 BC time_slice(timestamp '290279-12-23 00:00:00 BC', 54, 'second', 'start') = 290279-12-23 00:00:00 BC time_slice(timestamp '290279-12-23 00:00:00 BC', 53, 'second', 'start') = 290279-12-22 23:59:12 BC
How does Vertica determine the first time_slice?
-
wlam
- Newbie

-
- Posts: 6
- Joined: Thu Apr 05, 2012 10:52 pm
by wlam » Thu Apr 12, 2012 8:19 pm
Here is an example usage that my customer gave to me:
I have used that function to aggregate data for every 5-minute intervals. For instance, there are trades happening for a stock symbol throughout the day. My analysis might require looking at things for 5-minute intervals. I can use the TIME_SLICE function to round a timestamp column to 5-minute interval. The result is that I am now looking at only 288 data points out of a 24 hour window, instead of millions of data points if I don’t aggregate using TIME_SLICE.
-
wlam
- Newbie

-
- Posts: 6
- Joined: Thu Apr 05, 2012 10:52 pm
by knicely87 » Fri Apr 13, 2012 2:25 pm
wlam, I'm not sure exactly sure how Vertica determines the very first start time stamp for a given time slice. However, I think it is important to note that there isn't one single starting point for all time slices... There appears to be an absolute start time stamp for each distinct slice_length and time_unit of a time slice. For instance, you found that the absolute start time stamp for all "58 second" time slices is "290279-12-23 00:00:36 BC" - there are no earlier time slices that are 58 seconds in length (at least ones that we can operate on). All future "58 second" time slices are relative to the initial "290279-12-23 00:00:36 BC" time stamp. I mentioned in a previous post that I think that time stamps earlier than "290279-12-23 00:00:00 BC" are all special cases that only Vertica can handle. Keep in mind, I'm just speculating... Maybe someone from Vertica can provide us with a better explanation Now that I think about it, there must also be an upper limit for each time slice. That is, I wonder what the maximum end time stamps are for each distinct slice_length and time_unit?!?! PS. Thanks for the great example of how you're using the TIME_SLICE function in the real world...
Jim Knicely HP ATP - Vertica Big Data Solutions V1 Certified
-

knicely87
- Site Admin

-
- Posts: 729
- Joined: Sat Jan 21, 2012 4:58 am
-
Return to Vertica SQL
Who is online
Users browsing this forum: No registered users and 1 guest
|