How to know if my partition is getting hit in vertica

Moderator: NorbertKrupa

Post Reply
Amit
Newbie
Newbie
Posts: 4
Joined: Tue Jul 07, 2015 1:04 pm

How to know if my partition is getting hit in vertica

Post by Amit » Tue Jul 07, 2015 2:42 pm

Hi ALL,

I have created a table with partition date field.Total 21 number of partition will be created.
However i am not able to figure out if the partition is actually getting hit.
I cud see any information in explain/profile.

Please let me know how to identify it

Amit
Newbie
Newbie
Posts: 4
Joined: Tue Jul 07, 2015 1:04 pm

Re: How to know if my partition is getting hit in vertica

Post by Amit » Tue Jul 07, 2015 2:43 pm

small correction :- i cud not see any information regarding it in explain/profile

NorbertKrupa
GURU
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL
Contact:

Re: How to know if my partition is getting hit in vertica

Post by NorbertKrupa » Tue Jul 07, 2015 3:16 pm

Can you show us how you partitioned your table? You might want to check the system table v_monitor.query_events. It will contain a PARTITIONS_ELIMINATED event for the query. The event_details column contains information on how many partitions were used for the query:

Code: Select all

-[ RECORD 1 ]-----+-------------------------------------------
...
statement_id      | 1
event_category    | EXECUTION
event_type        | PARTITIONS_ELIMINATED
event_description | Some storage containers will not be processed
                    because they contain no relevant data.
...
event_details     | Using only 1 stores out of 12 for projection 
                    test.table_p_node0001
suggested_action  |
Checkout vertica.tips for more Vertica resources.

Amit
Newbie
Newbie
Posts: 4
Joined: Tue Jul 07, 2015 1:04 pm

Re: How to know if my partition is getting hit in vertica

Post by Amit » Wed Jul 08, 2015 8:54 am

Hi,

below is the table structure.

CREATE TABLE test.AMIT
(
Store_Cd char(5),
Tm_Frame_Cd char(5),
Hour_Nr int,
Rptg_Dt date NOT NULL,
Company_Cd char(5),
Store_Hour_Ind char(1),
Product_Desc varchar(100),
Product_Sub_Desc varchar(100),
CY_Bookings_Dollar numeric(18,2),
CY_Bookings_Local numeric(18,2),
Order_Cnt int,
Total_Order_Cnt int,
Units int
)
partition by Rptg_Dt;

there will be total 21 partition any time.

Query which i am running is
SELECT
COALESCE(SUM(CY_BOOKINGS_LOCAL),0) CY_BOOKINGS_LOCAL
,COALESCE(SUM(CY_BOOKINGS_DOLLAR),0) CY_BOOKINGS_DOLLAR
,COALESCE(SUM(ORDER_CNT),0) ORDER_CNT
,COALESCE(SUM(BLOCKED_ORDER_CNT),0) BLOCKED_ORDER_CNT
,COALESCE(SUM(UNITS),0) UNITS
,D.PRODUCT_SUB_DESC
,hour_ind
FROM test.AMIT P
RIGHT OUTER JOIN
(SELECT
hours,
hour_ind,
PRODUCT_DESC PRODUCT_DESC_VALUE,
PRODUCT_SUB_DESC
FROM (SELECT CAST(CONSTANT_VALUE AS INTEGER) AS hours ,CONSTANT_SUB_VALUE AS hour_ind FROM test.MD WHERE CONSTANT_NAME ='HOUR' ) XX
,(SELECT CONSTANT_VALUE AS PRODUCT_DESC,CONSTANT_SUB_VALUE AS PRODUCT_SUB_DESC FROM test.MD WHERE CONSTANT_NAME ='PROD_HIER' AND CONSTANT_SUB_VALUE <> 'ALL') P
WHERE PRODUCT_DESC ='iPhone'
) D
ON (P.PRODUCT_SUB_DESC = D.PRODUCT_SUB_DESC AND HOUR_NR = hours
AND PRODUCT_DESC IN ('iPhone')
AND RPTG_DT = '2015-01-26'
AND STORE_CD IN ('R039')
AND STORE_HOUR_IND IN ('Y')
AND P.PRODUCT_SUB_DESC <>'ALL'
)
GROUP BY
D.PRODUCT_SUB_DESC,
hour_ind
ORDER BY
D.PRODUCT_SUB_DESC,
hour_ind
;

Explain of above is

Access Path:
+-GROUPBY HASH (SORT OUTPUT) (LOCAL RESEGMENT GROUPS) [Cost: 463, Rows: 39] (PATH ID: 2)
| Aggregates: sum(<SVAR>), sum(<SVAR>), sum(<SVAR>), sum(<SVAR>), sum(<SVAR>)
| Group By: D.PRODUCT_SUB_DESC, D.hour_ind
| Execute on: dddt_node0003
| +---> JOIN HASH [LeftOuter] [Cost: 461, Rows: 39 (7 RLE)] (PATH ID: 3)
| | Join Cond: (P.Product_Sub_Desc = D.PRODUCT_SUB_DESC) AND (P.Hour_Nr = D.hours)
| | Execute on: dddt_node0003
| | +-- Outer -> SELECT [Cost: 63, Rows: 39 (3 RLE)] (PATH ID: 4)
| | | Execute on: dddt_node0003
| | | +---> JOIN (CROSS JOIN) [Cost: 63, Rows: 39 (3 RLE)] (PATH ID: 5)
| | | | Materialize at Output: test.md.constant_value, test.md.constant_sub_value
| | | | Execute on: dddt_node0003
| | | | +-- Outer -> STORAGE ACCESS for MD [Cost: 18, Rows: 39 (3 RLE)] (PATH ID: 6)
| | | | | Projection: test.md_node0004
| | | | | Materialize: test.md.constant_name
| | | | | Filter: (test.md.constant_name = 'HOUR')
| | | | | Execute on: dddt_node0003
| | | | +-- Inner -> STORAGE ACCESS for MD [Cost: 33, Rows: 1] (PATH ID: 7)
| | | | | Projection: test.md_node0004
| | | | | Materialize: test.md.constant_sub_value
| | | | | Filter: (test.md.constant_name = 'PROD_HIER')
| | | | | Filter: (test.md.constant_sub_value <> 'ALL')
| | | | | Filter: (test.md.constant_value = 'iPhone')
| | | | | Execute on: dddt_node0003
| | +-- Inner -> STORAGE ACCESS for P [Cost: 397, Rows: 7] (PUSHED GROUPING) (PATH ID: 8)
| | | Projection: test.AMIT_b0
| | | Materialize: P.Product_Sub_Desc, P.Hour_Nr, P.CY_Bookings_Dollar, P.CY_Bookings_Local, P.Order_Cnt, P.Units, P.Blocked_Order_Cnt
| | | Filter: (P.Product_Desc = 'iPhone')
| | | Filter: (P.Rptg_Dt = '2015-01-26'::date)
| | | Filter: (P.Store_Hour_Ind = 'Y')
| | | Filter: (P.Store_Cd = 'R039')
| | | Filter: (P.Product_Sub_Desc <> 'ALL')
| | | Execute on: dddnode0003


DBD have created a projection,however the query is always using the super projection test.AMIT_b0.

schema_name | table_name | projections_used | query_duration_us | query_start_epoch | query_start | error_code | user_name
-------------+------------+------------------------+-------------------+-------------------+-------------------------------+------------+------------------
| | query projection_usage | 183543.000000 | 286447 | 2015-07-08 07:26:24.066323+00 | | test


CPU time is also very high.

Please suggest what can be done to avoid GROUPBY HASH .And what sud projection DDL ( order by and segment by to increase the performance.


We have added order by clause in subquery D.PRODUCT_SUB_DESC AND HOUR_NR to make data sorting on joining condition,but by using this CPU time is increasing.

NorbertKrupa
GURU
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL
Contact:

Re: How to know if my partition is getting hit in vertica

Post by NorbertKrupa » Mon Jul 13, 2015 2:53 am

I would try to move Rptg_Dt to a predicate outside of the join.
Checkout vertica.tips for more Vertica resources.

Post Reply

Return to “Vertica Performance Tuning”