Help with Code

Moderator: NorbertKrupa

Vertica_grm
Newbie
Newbie
Posts: 17
Joined: Tue Apr 21, 2015 8:35 pm

Help with Code

Post by Vertica_grm » Sun Jul 19, 2015 9:25 pm

Hey Jim,

could you please guide me here:

I have a below table.

UserID Timestamp Event
1234 4/17/2014 22:00 A
1234 5/1/2014 21:46 A
1234 5/8/2014 18:33 A
1234 5/8/2014 18:49 A
1234 5/12/2014 21:19 B
1234 5/12/2014 21:19 B
1234 5/8/2014 21:19 A
1234 5/12/2014 21:20 B
1234 5/12/2014 21:21 B

i would like to find: For Every timestamp of event B find the count of A which happened before B's timestamp and after B which might have happened perviously. E.g. for Second occurence of B count should be 0 because there was no A between B and the previous timestamp of B.

Final output should be like this:

UserID Timestamp Event Count
1234 4/17/2014 22:00 A -
1234 5/1/2014 21:46 A -
1234 5/8/2014 18:33 A -
1234 5/8/2014 18:49 A -
1234 5/12/2014 21:19 B 4
1234 5/12/2014 21:19 B 0
1234 5/8/2014 21:19 A -
1234 5/12/2014 21:20 B 1
1234 5/12/2014 21:21 B 0

NorbertKrupa
GURU
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL
Contact:

Re: Help with Code

Post by NorbertKrupa » Mon Jul 20, 2015 3:21 am

Can you show us what you tried?
Checkout vertica.tips for more Vertica resources.

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Help with Code

Post by JimKnicely » Mon Jul 20, 2015 3:46 am

There are probably a bunch of ways to solve this. Here is my first attempt :)

Code: Select all

dbadmin=> SELECT * FROM test ORDER BY pk;
 pk | user_id |     time_stamp      | event
----+---------+---------------------+-------
  1 |    1234 | 2014-04-17 22:00:00 | A
  2 |    1234 | 2014-05-01 21:46:00 | A
  3 |    1234 | 2014-05-08 18:33:00 | A
  4 |    1234 | 2014-05-08 18:49:00 | A
  5 |    1234 | 2014-05-12 21:19:00 | B
  6 |    1234 | 2014-05-12 21:19:00 | B
  7 |    1234 | 2014-05-08 21:19:00 | A
  8 |    1234 | 2014-05-12 21:20:00 | B
  9 |    1234 | 2014-05-12 21:21:00 | B
(9 rows)
Note that I added a pk column to keep your ordering since your data is not sorted.

Code: Select all

dbadmin=> SELECT user_id AS "UserID", time_stamp AS "Timestamp", event AS "Event",
dbadmin->        CASE WHEN event = 'B' THEN LAG(for_lag) OVER (ORDER BY pk)::VARCHAR ELSE '-' END AS "Count"
dbadmin->   FROM(SELECT pk, user_id, time_stamp, event, cce,
dbadmin(>               SUM(ctr) OVER (PARTITION BY user_id, cce ORDER BY pk) for_lag
dbadmin(>          FROM (SELECT pk, user_id, time_stamp, event,
dbadmin(>                       CASE WHEN event = 'A' THEN 1 ELSE 0 END AS ctr,
dbadmin(>                       CONDITIONAL_CHANGE_EVENT(event) OVER (ORDER BY pk) AS cce
dbadmin(>                  FROM test) foo) foo2
dbadmin->  ORDER BY pk;
 UserID |      Timestamp      | Event | Count
--------+---------------------+-------+-------
   1234 | 2014-04-17 22:00:00 | A     | -
   1234 | 2014-05-01 21:46:00 | A     | -
   1234 | 2014-05-08 18:33:00 | A     | -
   1234 | 2014-05-08 18:49:00 | A     | -
   1234 | 2014-05-12 21:19:00 | B     | 4
   1234 | 2014-05-12 21:19:00 | B     | 0
   1234 | 2014-05-08 21:19:00 | A     | -
   1234 | 2014-05-12 21:20:00 | B     | 1
   1234 | 2014-05-12 21:21:00 | B     | 0
(9 rows)
This may get you thinking about alternatives...
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

Vertica_grm
Newbie
Newbie
Posts: 17
Joined: Tue Apr 21, 2015 8:35 pm

Re: Help with Code

Post by Vertica_grm » Mon Jul 20, 2015 7:26 pm

Thanks Jim,Really appreciate it.

That worked.

Vertica_grm
Newbie
Newbie
Posts: 17
Joined: Tue Apr 21, 2015 8:35 pm

Re: Help with Code

Post by Vertica_grm » Wed Jul 22, 2015 9:55 pm

If I have result set from a subquery as
UserID ColumnA Column B Percent
1 1 5 0.2
2 2 5 0.4
3 1 4 0.25
7 0 1 0
8 0 1 0


How do I add another Column (5th one below) which is basically SUM(ColumnA)/SUM(COLUMNB)

UserID ColumnA Column B % Group AVG Optimum
1 2 5 0.4 0.3125 Yes
2 2 5 0.4 0.3125 Yes
3 1 4 0.25 0.3125 Yes
7 0 1 0 0.3125 No
8 0 1 0 0.3125 No

Issue: I need first 4 column in the final result. since they will be in select statement, I will need to groupby by those column also which messes up my result on SUM function.

VSQL isn't terribly helpful in temp tables or variables. What would be the best way out?

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Help with Code

Post by JimKnicely » Thu Jul 23, 2015 12:22 am

Hi,

I think the analytic function SUM() is what you are looking for. You should check out the docs and read about analytic functions. It's one of the areas where Vertica really, really shines!

http://my.vertica.com/docs/7.1.x/HTML/i ... s%7C_____0

Below are examples using your data sets:

Code: Select all

dbadmin=> SELECT userid, column_a, column_b,
dbadmin->        ROUND(column_a / column_b, 4.0) "%",
dbadmin->        ROUND(SUM(column_a) over () / SUM(column_b) over (), 4.0) "Group AVG"
dbadmin->   FROM test1 ORDER BY userid;

 userid | column_a | column_b |  %   | Group AVG
--------+----------+----------+------+-----------
      1 |        1 |        5 |  0.2 |      0.25
      2 |        2 |        5 |  0.4 |      0.25
      3 |        1 |        4 | 0.25 |      0.25
      7 |        0 |        1 |    0 |      0.25
      8 |        0 |        1 |    0 |      0.25
(5 rows)

Code: Select all

dbadmin=>
dbadmin=> SELECT userid, column_a, column_b,
dbadmin->        ROUND(column_a / column_b, 4.0) "%",
dbadmin->        ROUND(SUM(column_a) over () / SUM(column_b) over (), 4.0) "Group AVG"
dbadmin->   FROM test2 ORDER BY userid;
 userid | column_a | column_b |  %   | Group AVG
--------+----------+----------+------+-----------
      1 |        2 |        5 |  0.4 |    0.3125
      2 |        2 |        5 |  0.4 |    0.3125
      3 |        1 |        4 | 0.25 |    0.3125
      7 |        0 |        1 |    0 |    0.3125
      8 |        0 |        1 |    0 |    0.3125
(5 rows)
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

Vertica_grm
Newbie
Newbie
Posts: 17
Joined: Tue Apr 21, 2015 8:35 pm

Re: Help with Code

Post by Vertica_grm » Thu Jul 23, 2015 3:05 pm

Thank you JIm. You're the best! :D :D :D

Post Reply

Return to “New to Vertica SQL”