Spread Monitoring

Moderator: NorbertKrupa

Timbo
Intermediate
Posts: 53
Joined: Thu Jun 21, 2012 11:05 am
Location: London, UK


Post by Timbo » Wed Jan 28, 2015 3:34 pm

Hi,
Does anyone have the definitions of the columns in the "dc_spread_monitor" table, or some SQL that would make sense of the data?

There is no information on this table in either the V6 or V7 online documentation.

I see in the V7 MC that there is an alert for "Spread Retransmit Rate Over Threshold 10%", but I need to look for something similar in a V6 cluster, and I am assuming the data to analyse is in the dc_spread_monitor table.

Regards
Tim

NorbertKrupa
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL

Re: Spread Monitoring

Post by NorbertKrupa » Wed Jan 28, 2015 4:30 pm

Timbo wrote: I see in the V7 MC that there is an alert for "Spread Retransmit Rate Over Threshold 10%", but I need to look for something similar in a V6 cluster, and I am assuming the data to analyse is in the dc_spread_monitor table.
If you're just looking for a query, this might help:

Code:

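-- Per-minute, per-node Spread retransmit rate (%); keeps only intervals above 10%.
-- NULLIF guards against division by zero when no messages were delivered in an interval.
SELECT DATE_TRUNC('minute', time) AS time,
       node_name,
       ROUND(( MAX(retrans) - MIN(retrans) ) / NULLIF(MAX(message_delivered) - MIN(message_delivered), 0) * 100, 2) AS retransmit_rate
FROM   dc_spread_monitor
GROUP  BY 1,
          2
HAVING ROUND(( MAX(retrans) - MIN(retrans) ) / NULLIF(MAX(message_delivered) - MIN(message_delivered), 0) * 100, 2) > 10
ORDER  BY 1 DESC,
          2;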
The original intent of this alert in 7.x was to warn that a node might be about to drop from the cluster. However, a high retransmit rate does not always indicate an unhealthy cluster. If the cluster has little or no activity, the retransmit rate will appear high because so few messages are delivered that even a handful of retransmissions makes up a large percentage.
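One way to reduce false positives from idle periods is to ignore intervals in which too few messages were delivered. The variant below is only a sketch: it reuses the column names from the query above, and the floor of 1000 delivered messages per minute is an assumed, illustrative value rather than anything Vertica defines, so tune it for your own cluster.

```sql
-- Sketch: same per-minute, per-node rate as above, but intervals with little
-- traffic are skipped. The 1000-message floor is a hypothetical threshold.
SELECT DATE_TRUNC('minute', time) AS time,
       node_name,
       MAX(message_delivered) - MIN(message_delivered) AS delivered,
       MAX(retrans) - MIN(retrans) AS retransmitted,
       ROUND(( MAX(retrans) - MIN(retrans) ) / NULLIF(MAX(message_delivered) - MIN(message_delivered), 0) * 100, 2) AS retransmit_rate
FROM   dc_spread_monitor
GROUP  BY 1,
          2
HAVING MAX(message_delivered) - MIN(message_delivered) >= 1000
   AND ( MAX(retrans) - MIN(retrans) ) / NULLIF(MAX(message_delivered) - MIN(message_delivered), 0) * 100 > 10
ORDER  BY 1 DESC,
          2;
```

Returning the raw delivered and retransmitted deltas alongside the percentage makes it easier to judge whether a flagged interval reflects real traffic or just a few messages.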
Check out vertica.tips for more Vertica resources.
