Return top N rows for each group

Moderator: NorbertKrupa

Post Reply
chrismv48
Newbie
Newbie
Posts: 2
Joined: Mon Dec 09, 2013 8:48 pm

Return top N rows for each group

Post by chrismv48 » Mon Dec 09, 2013 9:06 pm

I'd like to return the top 5 geo_country rows based on sum(imps) for each tag_id. This is the query I started:

SELECT tag_id,
geo_country,
SUM(imps) AS imps,
RANK() OVER (PARTITION BY tag_id ORDER BY SUM(imps) DESC) AS rank
FROM table1
WHERE tag_id IN (2013150,1981153)
AND ymd > CURRENT_DATE - 3
GROUP BY 1,
2 LIMIT 10;

This actually returns only rows from the first tag in the WHERE clause (2013150). I know that the other tag has sum(imps) values high enough which should include it in the results.

Also, how do I implement the Top N part? I tried adding a LIMIT clause within the OVER function, but it doesn't look like it is an accepted parameter.

id10t
GURU
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: Return top N rows for each group

Post by id10t » Mon Dec 09, 2013 11:09 pm

Hi!

[DELETED]
Last edited by id10t on Sat May 09, 2015 3:06 pm, edited 1 time in total.

chrismv48
Newbie
Newbie
Posts: 2
Joined: Mon Dec 09, 2013 8:48 pm

Re: Return top N rows for each group

Post by chrismv48 » Tue Dec 10, 2013 1:08 am

Awesome, thank you! Full query below for future reference:

SELECT *
FROM (SELECT tag_id, geo_country, sum(imps),
RANK() OVER (PARTITION BY tag_id ORDER BY SUM(imps) DESC) AS rank
FROM table1
WHERE tag_id IN (2013150,1981153)
AND ymd > CURRENT_DATE - 3
GROUP BY 1,2) as t2
WHERE t2.rank <=5;

Post Reply

Return to “Vertica SQL”