Timing query vs profile data

Moderator: NorbertKrupa

Post Reply
allan
Newbie
Newbie
Posts: 9
Joined: Wed Sep 05, 2012 4:13 pm

Timing query vs profile data

Post by allan » Wed Sep 11, 2013 8:37 pm

Hi,

I'm using Vertica V5.0.12 and trying to optimize some queries. I am seeing wild variation between successive profile runs of exactly the same query and I am having trouble resolving the timing result printed from vsql against the timing numbers that the execution_engine_profiles table reports:

Code: Select all

VSQL reported timing (us) 	4059524 	2393756 	3026559 	3584777 	4227938 	4175864 	3220636 	2423808 	3146294 	2262554
Sum execution times (us) 	11573332	10980750	11304506	11176371	11192633	11963247	13216589	11734707	11484416	12004472
Sum clock times (us) 	150244070	72815214	86611426	106801309	86491732	90276609	85790151	78952310	101863891	69199734
Sum input queue wait (us) 	3363844	1786777	2116612	3104509	2356909	2804455	2422717	2033945	2228894	1800763
Sum output queue wait (us) 	0	0	0	0	0	0	0	0	0	0
In the above table the row 'VSQL reported timing' is the time that PROFILE ... reports for first fetch.

The 'Sum xxxx' rows are taken from the execution_engine_profiles table and are the sum of the 'xxxx' counter name.

Strangely the 5th column reports the longest execution time (4227938 us) and yet the first column's clock times sum is largest (150244070 us). The sum of clock times is completely incorrect since there's no way that the query took 150 seconds ...

How are the clock times to be interpreted?

Also, looking across the execution times they are all much the same but again they exceed the overall time for the query, what is happening here?

id10t
GURU
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: Timing query vs profile data

Post by id10t » Wed Sep 11, 2013 9:05 pm

Hi!

https://my.vertica.com/docs/6.1.x/HTML/ ... #12260.htm

clock time (us) - Real-time clock time spent processing the query(time used by the process), in microseconds.

allan
Newbie
Newbie
Posts: 9
Joined: Wed Sep 05, 2012 4:13 pm

Re: Timing query vs profile data

Post by allan » Wed Sep 11, 2013 9:42 pm

Maybe I didn't explain my confusion very well:

Code: Select all

db=> profile select 1+1;
NOTICE:  Statement is being profiled.
HINT:  select * from v_monitor.execution_engine_profiles where transaction_id=45035996340493973 and statement_id=3;
NOTICE:  Initiator memory for query: [on pool sysquery: 4232 KB, minimum: 4232 KB]
NOTICE:  Total memory required by query: [4232 KB]
 ?column?
----------
        2
(1 row)

Time: First fetch (1 row): 33.760 ms. All rows formatted: 33.779 ms
So the query takes 33760 us, but the profiler reports:

Code: Select all

db=> select operator_name,operator_id,baseplan_id,path_id,localplan_id,counter_name,counter_value from v_monitor.execution_engine_profiles where transaction_id=45035996340493973 and statement_id=3 and counter_name = 'clock time (us)';
^[^[
 operator_name | operator_id | baseplan_id | path_id | localplan_id |  counter_name   | counter_value
---------------+-------------+-------------+---------+--------------+-----------------+---------------
 Root          |           0 |           0 |      -1 |            0 | clock time (us) |            26
 NewEENode     |           1 |           1 |      -1 |            1 | clock time (us) |           206
 ExprEval      |           2 |           2 |       0 |            2 | clock time (us) |            21
 StorageUnion  |           3 |           3 |       1 |            3 | clock time (us) |           338
 Scan          |           4 |           4 |       1 |            4 | clock time (us) |            56
(5 rows)
and that execution time adds up to 647 us, which is a huge difference from the 33760 us reported earlier.

Where is the extra time being lost?

scutter
Master
Master
Posts: 302
Joined: Tue Aug 07, 2012 2:15 am

Re: Timing query vs profile data

Post by scutter » Thu Sep 12, 2013 1:43 am

Hi Allan,

The sums of the execution times are misleading because some of the operators run on multiple threads. For example, if you look at the Scan operator, you'll see multiple occurrences of it for each path id on each node. Each of those represents a thread running on a different core. Other operators may run on a subset of cores or on a single core.

And on a multi-node cluster, the query is executing on all nodes at once in parallel. Again, adding up execution times will be misleading.

When interpreting execution_engine_profiles data you need to keep the parallelism in mind, and also the fact that some of the operators will run at the same time. So adding up the execution isn't representative of the actual execution time. I will often look at the max execution time for an operator per path id, and also across nodes. It can be helpful to observe that execution is skewed with more time on a subset of nodes, though there's not necessarily anything you can do about it.

--Sharon
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC

allan
Newbie
Newbie
Posts: 9
Joined: Wed Sep 05, 2012 4:13 pm

Re: Timing query vs profile data

Post by allan » Thu Sep 12, 2013 4:01 pm

Hi,

I would agree that there may be threading happening but the explanation plan says:

Code: Select all

 explain select 1+1;

 Access Path:
 +-STORAGE ACCESS for dual [Cost: 663, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
 |  Projection: v_catalog.dual_p
 |  Materialize: dual.dummy
 |  Execute on: Query Initiator

I'm not sure how much parallelism you can expect to have for a simple select 1+1 query. Indeed the execution_engine_profile table indicates they are all running on the same node and probably the same thread, if path_id is an indication.

It is annoying that there is no easily discernible relationship between the execution time as printed out by the \timing flag in vsql and the execution_engine_profile counters. The execution profiling is unhelpful for determining overall system timing as far as I can see.

I still cannot reconcile the orders of magnitude difference reported by the profiling tools and the vsql tool. I think I recall seeing on the myvertica forums that for tiny queries the planning and optimising could take up a large proportion of the query time and that the execution time was eclipsed by this. Does antyone have any experience of that?

scutter
Master
Master
Posts: 302
Joined: Tue Aug 07, 2012 2:15 am

Re: Timing query vs profile data

Post by scutter » Thu Sep 12, 2013 8:51 pm

You can get a feeling for how long the planning is taking by using \timing on the EXPLAIN of the query.

In general I wouldn't get caught up in trying to account for every last bit of time. You can't control the planning time. Use the execution_engine_profiles data as a way to see where the is spent during execution and move on. I find tremendous value in the profiling data.
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC

Post Reply

Return to “Vertica Performance Tuning”