performance of simple query on from-scratch database varies

Moderator: NorbertKrupa

matthewcornell
Beginner
Posts: 30
Joined: Mon Dec 09, 2013 2:30 pm

performance of simple query on from-scratch database varies

Post by matthewcornell » Fri Mar 21, 2014 4:37 pm

Hi Folks,

I've started an experiment on our small 4-node cluster to simulate possible performance gains in adding another node. (A good excuse to learn Vertica along the way :-) To do this I'm running our join-heavy benchmark query on 3 nodes and then again after adding the fourth node. I don't yet understand why running DBD makes the query twice as slow, but in working to understand this, I discovered another oddity: when I create my test database from scratch and then run my query (no DBD, just default partitions), I get one of two (seemingly random?) execution times: 6 minutes (2/10 runs) or 11.5 minutes (8/10 runs). Weird! I haven't been able to recreate the 'fast' result in the last few trials, so I don't have EXPLAIN or PROFILE results to share. But I'm hoping you can share a few possible explanations. Note that I'm using the same three nodes each time, and that there are no non-OS processes running on them.

Thanks in advance!

scutter
Master
Posts: 301
Joined: Tue Aug 07, 2012 2:15 am

Re: performance of simple query on from-scratch database var

Post by scutter » Fri Mar 21, 2014 9:01 pm

Matthew,

I just wrote

http://www.vertica-forums.com/viewtopic.php?f=63&t=1764

with your question in mind. It covers how to review profiling data for queries that you didn't explicitly profile, which might help you review the variable performance of these joins.

Another thing to check is how much memory was allocated for the query when it ran each time, and whether it may have "retried" after failing to run on one or more attempts.

--Sharon
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC

matthewcornell
Beginner
Posts: 30
Joined: Mon Dec 09, 2013 2:30 pm

Re: performance of simple query on from-scratch database var

Post by matthewcornell » Mon Mar 24, 2014 3:46 pm

I made a beginner mistake: I thought I had run multiple times to prime the caches, but I did not. Results of runs 2+ are consistently faster. Sorry about that. -- matt
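For anyone repeating this kind of comparison, a minimal sketch of a timing harness that keeps the first run separate from the later ones. It is only an illustration: `run_query` is a hypothetical callable that executes the benchmark query however you normally do (it is not part of any Vertica client), and the helper names are made up for this example.

```python
import time
from statistics import mean


def time_runs(run_query, repeats=3, clock=time.perf_counter):
    """Call run_query() `repeats` times; return each wall-clock duration in seconds.

    run_query is a hypothetical zero-argument callable supplied by the caller.
    """
    durations = []
    for _ in range(repeats):
        start = clock()
        run_query()
        durations.append(clock() - start)
    return durations


def summarize(durations):
    """Report the first run separately from the mean of the later runs."""
    first, rest = durations[0], durations[1:]
    warm = mean(rest) if rest else None
    return {
        "first_run_s": first,
        "later_mean_s": warm,
        # Ratio > 1 means the first run was slower than the later ones.
        "first_vs_later": (first / warm) if warm else None,
    }
```

Reporting the first run on its own, instead of averaging all runs together, makes a one-off slow first run visible rather than hiding it in the mean.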

NorbertKrupa
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL

Re: performance of simple query on from-scratch database var

Post by NorbertKrupa » Mon Mar 24, 2014 5:49 pm

I'm very curious about what you mean by prime the caches. While there is some metadata collected, the queries should be freshly compiled each run.
Check out vertica.tips for more Vertica resources.

matthewcornell
Beginner
Posts: 30
Joined: Mon Dec 09, 2013 2:30 pm

Re: performance of simple query on from-scratch database var

Post by matthewcornell » Wed Mar 26, 2014 8:48 pm

norbertk wrote: I'm very curious about what you mean by prime the caches. While there is some metadata collected, the queries should be freshly compiled each run.

I did some more experimenting and got strange results. I ran the same query on 3-node and 4-node database configurations with and without DBD projections. I ran each three times to ensure consistent results. What I found is that in the <3-node, no DBD> case, run 1/3 was slower than 2/3 and 3/3:

<3-node, no DBD>:
#1: 11m27.745s
#2: 5m9.826s
#3: 5m6.028s

Strangely, all three runs were the same in all other cases:

<3-node, DBD>:
#1: 9m39.472s
#2: 9m39.580s
#3: 9m39.537s

<4-node, no DBD>:
#1: 8m49.549s
#2: 8m49.138s
#3: 8m47.697s

<4-node, DBD>:
#1: 14m30.955s
#2: 14m31.220s
#3: 14m34.515s

What do you think? I've repeated that first case (<3-node, no DBD>) at least four times with the same results.
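To put a number on the difference, here is a small sketch that converts the timings listed above to seconds and compares the first run of each configuration with the mean of the later runs. The timings are copied from this post; the function names are just for illustration.

```python
import re

# Timings copied from the four configurations listed in this post.
TIMINGS = {
    "3-node, no DBD": ["11m27.745s", "5m9.826s", "5m6.028s"],
    "3-node, DBD": ["9m39.472s", "9m39.580s", "9m39.537s"],
    "4-node, no DBD": ["8m49.549s", "8m49.138s", "8m47.697s"],
    "4-node, DBD": ["14m30.955s", "14m31.220s", "14m34.515s"],
}


def to_seconds(text):
    """Convert a string such as '11m27.745s' to seconds as a float."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized timing: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def first_run_ratio(runs):
    """Return first-run time divided by the mean of the remaining runs."""
    seconds = [to_seconds(r) for r in runs]
    later = seconds[1:]
    return seconds[0] / (sum(later) / len(later))


if __name__ == "__main__":
    for label, runs in TIMINGS.items():
        ratio = first_run_ratio(runs)
        print(f"{label}: first run is {ratio:.2f}x the mean of later runs")
```

With these figures, only the <3-node, no DBD> case has a first run that takes a bit more than twice as long as the later runs; in the other three cases the ratio stays within about 1% of 1.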

Thanks,

matt

id10t
GURU
Posts: 732
Joined: Mon Apr 16, 2012 2:44 pm

Re: performance of simple query on from-scratch database var

Post by id10t » Wed Mar 26, 2014 9:25 pm

Hi!

[DELETED]
Last edited by id10t on Wed May 06, 2015 5:52 pm, edited 1 time in total.

matthewcornell
Beginner
Posts: 30
Joined: Mon Dec 09, 2013 2:30 pm

Re: performance of simple query on from-scratch database var

Post by matthewcornell » Wed Mar 26, 2014 9:36 pm

Thanks for giving me something to think about, sKwa. Sorry I didn't include details - I was hoping to get some general information without bogging it down.
