Counting Based on Date between Two Dates

Moderator: NorbertKrupa

Post Reply
Razgriz
Newbie
Newbie
Posts: 2
Joined: Fri Jul 24, 2015 9:33 am

Counting Based on Date between Two Dates

Post by Razgriz » Fri Jul 24, 2015 9:56 am

Hello,

I have a table with columns ACCOUNT_ID, ACTIVE_DATE and INACTIVE_DATE.

Code: Select all

ACCOUNT_ID| ACTIVE_DATE| INACTIVE_DATE
   USER1		05/01/15		05/05/15
   USER3		05/01/15		05/20/15
   USER2		05/07/15		05/20/15
   USER1		05/06/15		05/20/15
I can count the number of account_ids where a date is between the active_date and inactive_date for a specific point in time:

Code: Select all

select 
count(distinct accountid)
from table
where 
'05/05/15'
between ACTIVE_DATE
and INACTIVE_DATE
If I continue to change the date on 5/6/15 I would get 2 users and on 5/7 I would get 3 users.

My question is, how do write this looping test in SQL so I can move down a series of dates and count the number of users that are between the active and inactive dates?

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Counting Based on Date between Two Dates

Post by JimKnicely » Sat Jul 25, 2015 3:18 pm

Hi,

Vertica has a lot of nice built in analytic functions! I think the TIMESERIES clause, which provides gap-filling and interpolation (GFI) computation, will help you out.

Please take a look at the documentation here:

http://my.vertica.com/docs/7.1.x/HTML/i ... Clause.htm

Example:

Code: Select all

dbadmin=> SELECT * FROM test;
 account_id | active_date | inactive_date
------------+-------------+---------------
 USER1      | 2015-05-01  | 2015-05-05
 USER1      | 2015-05-06  | 2015-05-20
 USER2      | 2015-05-07  | 2015-05-20
 USER3      | 2015-05-01  | 2015-05-20
(4 rows)

Code: Select all

dbadmin=> SELECT foo2.ts, COUNT(DISTINCT account_id)
dbadmin->   FROM test
dbadmin->  CROSS JOIN (SELECT ts::DATE
dbadmin(>                FROM (SELECT MIN(active_date)::TIMESTAMP AS tm FROM test
dbadmin(>                       UNION
dbadmin(>                      SELECT MAX(inactive_date)::TIMESTAMP AS tm FROM test) AS foo
dbadmin(>              TIMESERIES ts AS '1 DAY' OVER (ORDER BY tm)) foo2
dbadmin->  WHERE foo2.ts BETWEEN active_date AND inactive_date
dbadmin->  GROUP BY foo2.ts
dbadmin->  ORDER BY foo2.ts;
     ts     | COUNT
------------+-------
 2015-05-01 |     2
 2015-05-02 |     2
 2015-05-03 |     2
 2015-05-04 |     2
 2015-05-05 |     2
 2015-05-06 |     2
 2015-05-07 |     3
 2015-05-08 |     3
 2015-05-09 |     3
 2015-05-10 |     3
 2015-05-11 |     3
 2015-05-12 |     3
 2015-05-13 |     3
 2015-05-14 |     3
 2015-05-15 |     3
 2015-05-16 |     3
 2015-05-17 |     3
 2015-05-18 |     3
 2015-05-19 |     3
 2015-05-20 |     3
(20 rows)
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

id10t
GURU
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: Counting Based on Date between Two Dates

Post by id10t » Sat Jul 25, 2015 5:07 pm

Hi!
knicely87 wrote: I think the TIMESERIES clause, which provides gap-filling and interpolation (GFI) computation, will help you out.
And use it on your own risk, because this functionality is buggy.
https://community.dev.hp.com/t5/Vertica ... true#M2579

Example:
(why it starts from timestamp "2013-02-16 18:30:00"?)

daniel=> \set min '''2013-02-26 00:00:00'''
daniel=> \set max '''2013-09-11 00:00:00'''
daniel=> \set interval '''14 DAYS 12 HOURS 47 MINUTES'''
daniel=> SELECT ts
daniel-> FROM (SELECT :min::TIMESTAMP as tm
daniel(> UNION
daniel(> SELECT :max::TIMESTAMP as tm) as t
daniel-> TIMESERIES ts as :interval OVER (ORDER BY tm);
ts
---------------------
2013-02-16 18:30:00
2013-03-03 07:17:00
2013-03-17 20:04:00
2013-04-01 08:51:00
2013-04-15 21:38:00
2013-04-30 10:25:00
2013-05-14 23:12:00
2013-05-29 11:59:00
2013-06-13 00:46:00
2013-06-27 13:33:00
2013-07-12 02:20:00
2013-07-26 15:07:00
2013-08-10 03:54:00
2013-08-24 16:41:00
2013-09-08 05:28:00
(15 rows)

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Counting Based on Date between Two Dates

Post by JimKnicely » Sun Jul 26, 2015 3:22 am

Hi,

Maybe a better solution would be to use a date dimension table?

Code: Select all

dbadmin=> CREATE TABLE date_dim (the_date date);
CREATE TABLE

Code: Select all

dbadmin=> INSERT INTO date_dim
dbadmin->   SELECT the_date
dbadmin->     FROM (SELECT ('05/01/15'::DATE + (row_number() over () - 1)) the_date
dbadmin(>             FROM tables CROSS JOIN columns) foo
dbadmin->    WHERE the_date <= '05/20/15'::DATE;
 OUTPUT
--------
     20
(1 row)

Code: Select all

dbadmin=> SELECT * FROM date_dim;
  the_date
------------
 2015-05-01
 2015-05-02
 2015-05-03
 2015-05-04
 2015-05-05
 2015-05-06
 2015-05-07
 2015-05-08
 2015-05-09
 2015-05-10
 2015-05-11
 2015-05-12
 2015-05-13
 2015-05-14
 2015-05-15
 2015-05-16
 2015-05-17
 2015-05-18
 2015-05-19
 2015-05-20
(20 rows)

Code: Select all

dbadmin=> SELECT dd.the_date, COUNT(DISTINCT account_id)
dbadmin->   FROM test
dbadmin->  CROSS JOIN date_dim dd
dbadmin->  WHERE dd.the_date BETWEEN active_date AND inactive_date
dbadmin->  GROUP BY dd.the_date
dbadmin->  ORDER BY dd.the_date;
  the_date  | COUNT
------------+-------
 2015-05-01 |     2
 2015-05-02 |     2
 2015-05-03 |     2
 2015-05-04 |     2
 2015-05-05 |     2
 2015-05-06 |     2
 2015-05-07 |     3
 2015-05-08 |     3
 2015-05-09 |     3
 2015-05-10 |     3
 2015-05-11 |     3
 2015-05-12 |     3
 2015-05-13 |     3
 2015-05-14 |     3
 2015-05-15 |     3
 2015-05-16 |     3
 2015-05-17 |     3
 2015-05-18 |     3
 2015-05-19 |     3
 2015-05-20 |     3
(20 rows)
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

id10t
GURU
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: Counting Based on Date between Two Dates

Post by id10t » Sun Jul 26, 2015 8:20 am

Hi!
knicely87 wrote:Maybe a better solution would be to use a date dimension table?
A better solution is - to push on Vertica developers to fix a bug, otherwise it's not a solution, it's a workaround.

Razgriz
Newbie
Newbie
Posts: 2
Joined: Fri Jul 24, 2015 9:33 am

Re: Counting Based on Date between Two Dates

Post by Razgriz » Thu Sep 03, 2015 6:41 pm

Thanks everyone for their help!

Post Reply

Return to “Vertica SQL”