Hey Jim,
could you please guide me here:
I have a below table.
UserID Timestamp Event
1234 4/17/2014 22:00 A
1234 5/1/2014 21:46 A
1234 5/8/2014 18:33 A
1234 5/8/2014 18:49 A
1234 5/12/2014 21:19 B
1234 5/12/2014 21:19 B
1234 5/8/2014 21:19 A
1234 5/12/2014 21:20 B
1234 5/12/2014 21:21 B
i would like to find: For Every timestamp of event B find the count of A which happened before B's timestamp and after B which might have happened perviously. E.g. for Second occurence of B count should be 0 because there was no A between B and the previous timestamp of B.
Final output should be like this:
UserID Timestamp Event Count
1234 4/17/2014 22:00 A -
1234 5/1/2014 21:46 A -
1234 5/8/2014 18:33 A -
1234 5/8/2014 18:49 A -
1234 5/12/2014 21:19 B 4
1234 5/12/2014 21:19 B 0
1234 5/8/2014 21:19 A -
1234 5/12/2014 21:20 B 1
1234 5/12/2014 21:21 B 0
Help with Code
Moderator: NorbertKrupa
-
- GURU
- Posts: 527
- Joined: Tue Oct 22, 2013 9:36 pm
- Location: Chicago, IL
- Contact:
Re: Help with Code
Can you show us what you tried?
Checkout vertica.tips for more Vertica resources.
- JimKnicely
- Site Admin
- Posts: 1825
- Joined: Sat Jan 21, 2012 4:58 am
- Contact:
Re: Help with Code
There are probably a bunch of ways to solve this. Here is my first attempt
Note that I added a pk column to keep your ordering since your data is not sorted.
This may get you thinking about alternatives...
Code: Select all
dbadmin=> SELECT * FROM test ORDER BY pk;
pk | user_id | time_stamp | event
----+---------+---------------------+-------
1 | 1234 | 2014-04-17 22:00:00 | A
2 | 1234 | 2014-05-01 21:46:00 | A
3 | 1234 | 2014-05-08 18:33:00 | A
4 | 1234 | 2014-05-08 18:49:00 | A
5 | 1234 | 2014-05-12 21:19:00 | B
6 | 1234 | 2014-05-12 21:19:00 | B
7 | 1234 | 2014-05-08 21:19:00 | A
8 | 1234 | 2014-05-12 21:20:00 | B
9 | 1234 | 2014-05-12 21:21:00 | B
(9 rows)
Code: Select all
dbadmin=> SELECT user_id AS "UserID", time_stamp AS "Timestamp", event AS "Event",
dbadmin-> CASE WHEN event = 'B' THEN LAG(for_lag) OVER (ORDER BY pk)::VARCHAR ELSE '-' END AS "Count"
dbadmin-> FROM(SELECT pk, user_id, time_stamp, event, cce,
dbadmin(> SUM(ctr) OVER (PARTITION BY user_id, cce ORDER BY pk) for_lag
dbadmin(> FROM (SELECT pk, user_id, time_stamp, event,
dbadmin(> CASE WHEN event = 'A' THEN 1 ELSE 0 END AS ctr,
dbadmin(> CONDITIONAL_CHANGE_EVENT(event) OVER (ORDER BY pk) AS cce
dbadmin(> FROM test) foo) foo2
dbadmin-> ORDER BY pk;
UserID | Timestamp | Event | Count
--------+---------------------+-------+-------
1234 | 2014-04-17 22:00:00 | A | -
1234 | 2014-05-01 21:46:00 | A | -
1234 | 2014-05-08 18:33:00 | A | -
1234 | 2014-05-08 18:49:00 | A | -
1234 | 2014-05-12 21:19:00 | B | 4
1234 | 2014-05-12 21:19:00 | B | 0
1234 | 2014-05-08 21:19:00 | A | -
1234 | 2014-05-12 21:20:00 | B | 1
1234 | 2014-05-12 21:21:00 | B | 0
(9 rows)
Jim Knicely
Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.
Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.
-
- Newbie
- Posts: 17
- Joined: Tue Apr 21, 2015 8:35 pm
Re: Help with Code
Thanks Jim,Really appreciate it.
That worked.
That worked.
-
- Newbie
- Posts: 17
- Joined: Tue Apr 21, 2015 8:35 pm
Re: Help with Code
If I have result set from a subquery as
UserID ColumnA Column B Percent
1 1 5 0.2
2 2 5 0.4
3 1 4 0.25
7 0 1 0
8 0 1 0
How do I add another Column (5th one below) which is basically SUM(ColumnA)/SUM(COLUMNB)
UserID ColumnA Column B % Group AVG Optimum
1 2 5 0.4 0.3125 Yes
2 2 5 0.4 0.3125 Yes
3 1 4 0.25 0.3125 Yes
7 0 1 0 0.3125 No
8 0 1 0 0.3125 No
Issue: I need first 4 column in the final result. since they will be in select statement, I will need to groupby by those column also which messes up my result on SUM function.
VSQL isn't terribly helpful in temp tables or variables. What would be the best way out?
UserID ColumnA Column B Percent
1 1 5 0.2
2 2 5 0.4
3 1 4 0.25
7 0 1 0
8 0 1 0
How do I add another Column (5th one below) which is basically SUM(ColumnA)/SUM(COLUMNB)
UserID ColumnA Column B % Group AVG Optimum
1 2 5 0.4 0.3125 Yes
2 2 5 0.4 0.3125 Yes
3 1 4 0.25 0.3125 Yes
7 0 1 0 0.3125 No
8 0 1 0 0.3125 No
Issue: I need first 4 column in the final result. since they will be in select statement, I will need to groupby by those column also which messes up my result on SUM function.
VSQL isn't terribly helpful in temp tables or variables. What would be the best way out?
- JimKnicely
- Site Admin
- Posts: 1825
- Joined: Sat Jan 21, 2012 4:58 am
- Contact:
Re: Help with Code
Hi,
I think the analytic function SUM() is what you are looking for. You should check out the docs and read about analytic functions. It's one of the areas where Vertica really, really shines!
http://my.vertica.com/docs/7.1.x/HTML/i ... s%7C_____0
Below are examples using your data sets:
I think the analytic function SUM() is what you are looking for. You should check out the docs and read about analytic functions. It's one of the areas where Vertica really, really shines!
http://my.vertica.com/docs/7.1.x/HTML/i ... s%7C_____0
Below are examples using your data sets:
Code: Select all
dbadmin=> SELECT userid, column_a, column_b,
dbadmin-> ROUND(column_a / column_b, 4.0) "%",
dbadmin-> ROUND(SUM(column_a) over () / SUM(column_b) over (), 4.0) "Group AVG"
dbadmin-> FROM test1 ORDER BY userid;
userid | column_a | column_b | % | Group AVG
--------+----------+----------+------+-----------
1 | 1 | 5 | 0.2 | 0.25
2 | 2 | 5 | 0.4 | 0.25
3 | 1 | 4 | 0.25 | 0.25
7 | 0 | 1 | 0 | 0.25
8 | 0 | 1 | 0 | 0.25
(5 rows)
Code: Select all
dbadmin=>
dbadmin=> SELECT userid, column_a, column_b,
dbadmin-> ROUND(column_a / column_b, 4.0) "%",
dbadmin-> ROUND(SUM(column_a) over () / SUM(column_b) over (), 4.0) "Group AVG"
dbadmin-> FROM test2 ORDER BY userid;
userid | column_a | column_b | % | Group AVG
--------+----------+----------+------+-----------
1 | 2 | 5 | 0.4 | 0.3125
2 | 2 | 5 | 0.4 | 0.3125
3 | 1 | 4 | 0.25 | 0.3125
7 | 0 | 1 | 0 | 0.3125
8 | 0 | 1 | 0 | 0.3125
(5 rows)
Jim Knicely
Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.
Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.
-
- Newbie
- Posts: 17
- Joined: Tue Apr 21, 2015 8:35 pm
Re: Help with Code
Thank you JIm. You're the best!