[Vertica][VJDBC](3587) ERROR: Insufficient resources to ...

Moderator: NorbertKrupa

scutter
Master
Master
Posts: 301
Joined: Tue Aug 07, 2012 2:15 am

Re: [Vertica][VJDBC](3587) ERROR: Insufficient resources to

Post by scutter » Wed Oct 16, 2013 4:08 pm

Look in your vertica.log files for messages like "RESOURCE RESERVATION FAILURE". Those will include a dump of the memory usage by the each transaction/statement_id at the time of the resource acquisition failure. The lines with > are holding resources, the ones with Q are queued. It also shows the "overflow queries" that are borrowing memory from the general pool (perhaps all of your statements use the general pool anyway). It's a little tricky to interpret this the first time you work through it.

--Sharon
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC

kleysonr
Newbie
Newbie
Posts: 17
Joined: Thu Oct 10, 2013 2:09 pm

Re: [Vertica][VJDBC](3587) ERROR: Insufficient resources to

Post by kleysonr » Thu Oct 17, 2013 12:54 pm

Hi Sharon,

I've found the information, but it makes no sense to me.

May you help me to understand it ?

Code: Select all

2013-10-10 13:25:16.883 Init Session:0x7f805000f5b0-a000000009a3c7 [ResourceManager] <INFO> RESOURCE RESERVATION FAILURE:Timedout waiting for resource request of [a000000009a3c7,71] on general for Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,
*general(45035996273718906) Priority 0 QueueTimeout 300
     Size Queries:10000,Threads:8314,File Handles:39993,Memory(KB):10113024,
     Reserved Queries:5,Threads:20,File Handles:39,Memory(KB):9607372,
>  [a000000009a3ac,36] - Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,
>  [a000000009a3b1,36] - Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,
>  [a000000009a3d2,36] - Queries:1,Threads:4,File Handles:9,Memory(KB):1919836,
>  [a000000009a3df,36] - Queries:1,Threads:4,File Handles:9,Memory(KB):1919836,
>  [a000000009a404,15] - Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,
 sysquery(45035996273718908)
     Size Queries:10000,Threads:53,File Handles:256,Memory(KB):65536,
 sysdata(45035996273718910)
     Size Queries:0,Threads:0,File Handles:0,Memory(KB):102400,
     Reserved Queries:0,Threads:0,File Handles:0,Memory(KB):12288,
>  [ffffffffffffffff,45035996273718910] - Queries:0,Threads:0,File Handles:0,Memory(KB):12288,
 wosdata(45035996273718912)
     Size Queries:0,Threads:0,File Handles:0,Memory(KB):0,
     Overflow Queries:0,Threads:0,File Handles:0,Memory(KB):8192,
>  [ffffffffffffffff,45035996273718912] - Queries:0,Threads:0,File Handles:0,Memory(KB):8192,
 tm(45035996273718914)
     Size Queries:3,Threads:166,File Handles:801,Memory(KB):204800,
 refresh(45035996273718916)
     Size Queries:10000,Threads:0,File Handles:0,Memory(KB):0,
 recovery(45035996273718918)
     Size Queries:7,Threads:0,File Handles:0,Memory(KB):0,
 dbd(45035996273718920)
     Size Queries:10000,Threads:0,File Handles:0,Memory(KB):0,
Queue priority 0 (pools: general sysdata wosdata dbd)
Q  [a000000009a3c7,71] - general - Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,
Q  [a000000009a41b,22] - general - Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,
Q  [a000000009a3ee,22] - general - Queries:1,Threads:4,File Handles:9,Memory(KB):1919836,
Q  [a000000009a3bb,87] - general - Queries:1,Threads:4,File Handles:0,Memory(KB):4544,
Q  [a000000009a435,4] - general - Queries:1,Threads:4,File Handles:8,Memory(KB):9152,
2013-10-10 13:25:16.884 Init Session:0x7f805000f5b0-a000000009a3c7 <ERROR> @v_biprod_node0001: 53000/3587: Insufficient resources to execute plan on pool general [Timedout waiting for resource request: Request exceeds limits: Memory(KB) Exceeded: Requested = 1919836, Free = 1 (Limit = 9607373, Used = 9607372)]
        LOCATION:  Exec_compilePlan, /scratch_a/release/vbuild/vertica/Dist/Dist.cpp:1028

scutter
Master
Master
Posts: 301
Joined: Tue Aug 07, 2012 2:15 am

Re: [Vertica][VJDBC](3587) ERROR: Insufficient resources to

Post by scutter » Thu Oct 17, 2013 1:56 pm

The output is what we'd expect based on what you described: 5 statements running in the general pool:

> [a000000009a3ac,36] - Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,
> [a000000009a3b1,36] - Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,
> [a000000009a3d2,36] - Queries:1,Threads:4,File Handles:9,Memory(KB):1919836,
> [a000000009a3df,36] - Queries:1,Threads:4,File Handles:9,Memory(KB):1919836,
> [a000000009a404,15] - Queries:1,Threads:4,File Handles:7,Memory(KB):1919836,

The hex numbers are the transaction id, the numbers next to them are the statement ids. Check the QUERY_REQUESTS to confirm again that these five statements were taking the ~30 seconds that you mentioned before. You'll need to convert the hex to an int or vice versa to match them up with the integer transaction ids in QUERY_REQUESTS.

How many total loads are you running? I think it was around 10 or 12? With each of them taking 30 seconds, 5 running at a time, you'd still be well under the 300 seconds for the last one in the batch to get through the queue, so something still doesn't seem quite right if the general pool's queuetimeout is 300 seconds.

Have you monitored RESOURCE_QUEUES while the loads are running? Some sample output at intervals might help.

--Sharon
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC

kleysonr
Newbie
Newbie
Posts: 17
Joined: Thu Oct 10, 2013 2:09 pm

Re: [Vertica][VJDBC](3587) ERROR: Insufficient resources to

Post by kleysonr » Fri Oct 18, 2013 4:38 pm

Hi Sharon,

I truncated all tables and logs to start again. Bellow all informations about the system when the process stopped to work.

Code: Select all

select * from resource_pool_status where pool_name = 'general';

-[ RECORD 1 ]--------------+------------------
node_name                  | v_biprod_node0001
pool_oid                   | 45035996273718906
pool_name                  | general
is_internal                | t
memory_size_kb             | 10113024
memory_size_actual_kb      | 10113024
memory_inuse_kb            | 9607372
general_memory_borrowed_kb | 0
queueing_threshold_kb      | 9607373
max_memory_size_kb         | 10113024
running_query_count        | 10
planned_concurrency        | 10
max_concurrency            | 10
is_standalone              | t
queue_timeout_in_seconds   | 300
execution_parallelism      | 1
priority                   | 0
runtime_priority           | MEDIUM
runtime_priority_threshold | 2
single_initiator           | false
query_budget_kb            | 959918
In vertica.log there are 4 messages about "RESOURCE RESERVATION FAILURE" and all of them are the same in RESOURCE_QUEUES;

Code: Select all

# grep "RESOURCE RESERVATION FAILURE" vertica.log

2013-10-18 11:17:25.033 Init Session:0x7fbb80013510-a00000001c4554 [ResourceManager] <INFO> RESOURCE RESERVATION FAILURE:Timedout waiting for resource request of [a00000001c4554,92] on general for Queries:1,Threads:4,File Handles:11,Memory(KB):959918,
2013-10-18 11:17:25.044 Init Session:0x7fbb80015680-a00000001c4597 [ResourceManager] <INFO> RESOURCE RESERVATION FAILURE:Timedout waiting for resource request of [a00000001c4597,73] on general for Queries:1,Threads:4,File Handles:0,Memory(KB):4544,
2013-10-18 11:17:25.044 Init Session:0x7fbb80010800-a00000001c453f [ResourceManager] <INFO> RESOURCE RESERVATION FAILURE:Timedout waiting for resource request of [a00000001c453f,115] on general for Queries:1,Threads:4,File Handles:0,Memory(KB):4544,
2013-10-18 11:17:25.055 Init Session:0x7fbb80010190-a00000001c44ef [ResourceManager] <INFO> RESOURCE RESERVATION FAILURE:Timedout waiting for resource request of [a00000001c44ef,176] on general for Queries:1,Threads:4,File Handles:7,Memory(KB):959918,

Code: Select all

select * from RESOURCE_QUEUES;

-[ RECORD 1 ]---------+------------------------------
node_name             | v_biprod_node0001
transaction_id        | 45035996275557716 (a00000001c4554)
statement_id          | 92
pool_name             | general
memory_requested_kb   | 959918
priority              | 0
position_in_queue     | 1
queue_entry_timestamp | 2013-10-18 11:12:25.033059-03
-[ RECORD 2 ]---------+------------------------------
node_name             | v_biprod_node0001
transaction_id        | 45035996275557783 (a00000001c4597)
statement_id          | 73
pool_name             | general
memory_requested_kb   | 4544
priority              | 0
position_in_queue     | 2
queue_entry_timestamp | 2013-10-18 11:12:25.044137-03
-[ RECORD 3 ]---------+------------------------------
node_name             | v_biprod_node0001
transaction_id        | 45035996275557695 (a00000001c453f)
statement_id          | 115
pool_name             | general
memory_requested_kb   | 4544
priority              | 0
position_in_queue     | 3
queue_entry_timestamp | 2013-10-18 11:12:25.044179-03
-[ RECORD 4 ]---------+------------------------------
node_name             | v_biprod_node0001
transaction_id        | 45035996275557615 (a00000001c44Ef)
statement_id          | 176
pool_name             | general
memory_requested_kb   | 959918
priority              | 0
position_in_queue     | 4
queue_entry_timestamp | 2013-10-18 11:12:25.05559-03
Following, the details about request [a00000001c4554,92]

Code: Select all

2013-10-18 11:17:25.033 Init Session:0x7fbb80013510-a00000001c4554 [ResourceManager] <INFO> RESOURCE RESERVATION FAILURE:Timedout waiting for resource request of [a00000001c4554,92] on general for Queries:1,Threads:4,File Handles:11,Memory(KB):959918,
*general(45035996273718906) Priority 0 QueueTimeout 300
     Size Queries:10,Threads:8314,File Handles:39993,Memory(KB):10113024,
     Reserved Queries:10,Threads:40,File Handles:82,Memory(KB):9607372,
>  [a00000001c44db,36] - Queries:1,Threads:4,File Handles:7,Memory(KB):959918,
>  [a00000001c44e6,92] - Queries:1,Threads:4,File Handles:7,Memory(KB):959918,
>  [a00000001c44f8,71] - Queries:1,Threads:4,File Handles:9,Memory(KB):959918,
>  [a00000001c4508,64] - Queries:1,Threads:4,File Handles:9,Memory(KB):959918,
>  [a00000001c4517,22] - Queries:1,Threads:4,File Handles:9,Memory(KB):959918,
>  [a00000001c452c,15] - Queries:1,Threads:4,File Handles:7,Memory(KB):959918,
>  [a00000001c456b,36] - Queries:1,Threads:4,File Handles:9,Memory(KB):959918,
>  [a00000001c457f,15] - Queries:1,Threads:4,File Handles:9,Memory(KB):959918,
>  [a00000001c45b0,57] - Queries:1,Threads:4,File Handles:7,Memory(KB):959918,
>  [a00000001c45c5,15] - Queries:1,Threads:4,File Handles:9,Memory(KB):959918,
 sysquery(45035996273718908)
     Size Queries:10000,Threads:53,File Handles:256,Memory(KB):65536,
 sysdata(45035996273718910)
     Size Queries:0,Threads:0,File Handles:0,Memory(KB):102400,
     Reserved Queries:0,Threads:0,File Handles:0,Memory(KB):65536,
>  [ffffffffffffffff,45035996273718910] - Queries:0,Threads:0,File Handles:0,Memory(KB):65536,
 wosdata(45035996273718912)
     Size Queries:0,Threads:0,File Handles:0,Memory(KB):0,
     Overflow Queries:0,Threads:0,File Handles:0,Memory(KB):8192,
>  [ffffffffffffffff,45035996273718912] - Queries:0,Threads:0,File Handles:0,Memory(KB):8192,
 tm(45035996273718914)
     Size Queries:3,Threads:166,File Handles:801,Memory(KB):204800,
 refresh(45035996273718916)
     Size Queries:10000,Threads:0,File Handles:0,Memory(KB):0,
 recovery(45035996273718918)
     Size Queries:7,Threads:0,File Handles:0,Memory(KB):0,
 dbd(45035996273718920)
     Size Queries:10000,Threads:0,File Handles:0,Memory(KB):0,
Queue priority 105 (pools: tm)
Q  [a00000001c4664,1] - tm - Queries:1,Threads:4,File Handles:16,Memory(KB):213568,
Queue priority 0 (pools: general sysdata wosdata dbd)
Q  [a00000001c4554,92] - general - Queries:1,Threads:4,File Handles:11,Memory(KB):959918,
Q  [a00000001c4597,73] - general - Queries:1,Threads:4,File Handles:0,Memory(KB):4544,
Q  [a00000001c453f,115] - general - Queries:1,Threads:4,File Handles:0,Memory(KB):4544,
Q  [a00000001c44ef,176] - general - Queries:1,Threads:4,File Handles:7,Memory(KB):959918,
Checking QUERY_REQUESTS table, all queries dumped on vertica.log took a little more than 300s. But I believe that those happen because the process it was stopped/locked and the queries were finished automatically on timeout.

I don't believe that the problem is queries taking more than 300s, because when the system is customized to PLANNEDCONCURRENCY > 12 (peak of queries) the longest query is about 30s.

scutter
Master
Master
Posts: 301
Joined: Tue Aug 07, 2012 2:15 am

Re: [Vertica][VJDBC](3587) ERROR: Insufficient resources to

Post by scutter » Sat Oct 19, 2013 12:49 am

In both scenarios all loads are being initiated against node0001?

statement_id 92 definitely waited here for five minutes and timed out:

queue_entry_timestamp | 2013-10-18 11:12:25.033059-03

2013-10-18 11:17:25.033 Init Session:0x7fbb80013510-a00000001c4554 [ResourceManager] <INFO> RESOURCE RESERVATION FAILURE:Timedout waiting for resource request of [a00000001c4554,92] on general for Queries:1,Threads:4,File Handles:11,Memory(KB):959918,

If with plannedconcurrency set greater than 12, the loads are taking only 30s, then I'd compare profiling data for both sets of loads to see what's different. Start with "clock time (us)" and "execution time (us)" for each of the loads. Look at some operators like Load or DataTarget. Enable profiling so that you'll have the profiling data in execution_engine_profiles for a run with plannedconcurrency set to 10 and set higher.

You could also check the RESOURCE_ACQUISITIONS table to compare when the resources are requested vs granted.

--Sharon
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC

scutter
Master
Master
Posts: 301
Joined: Tue Aug 07, 2012 2:15 am

Re: [Vertica][VJDBC](3587) ERROR: Insufficient resources to

Post by scutter » Sat Oct 19, 2013 11:48 am

Is the load process by chance written with the assumption that all loads will be able to make progress? Is it getting stalled when plannedconcurrency is less than 12, and some of the loads queue and won't be able to actually read data until other loads finish?

--Sharon
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC

kleysonr
Newbie
Newbie
Posts: 17
Joined: Thu Oct 10, 2013 2:09 pm

Re: [Vertica][VJDBC](3587) ERROR: Insufficient resources to

Post by kleysonr » Mon Oct 21, 2013 4:23 pm

Hi Sharon,
In both scenarios all loads are being initiated against node0001?
I have just one node.

I tried Data Profiling, but nothing special for me that can help me to understand the problem.

I did a new configuration: High PLANNEDCONCURRENCY ( = 15 ) and Low MAXCONCURRENCY ( = 5 ). Now my process stopped when running_query_count achieved MAXCONCURRENCY.

So, the problem is not with PLANNEDCONCURRENCY, but with some constraint that avoid running_query_count to increase.

I am using Pentado Data Integration (aka Kettle) to load data. The steps used to store the data have a configuration for 'commit size', works as a batch insert. If I set up 'commit size' for 1, my process works fine. For some reason, I think that batch insert needs to split the main query in many others subqueries, and because running_query_count is at maximum the error happens.

I tried to set up DirectBatchInsert=true for JDBC, but I get the error:

[Vertica][JDBC](11380) Null pointer exception.

I'm sad about it. And I'm affraid because these problem I can't use vertica database. I don't know anymore what to do ...

Thanks for try help me !!!

Post Reply

Return to “Vertica Error Codes”