Potential use of Vertica for systems monitoring metrics/data
Posted: Tue Nov 20, 2012 4:40 pm
I'm new to Vertica and am looking into using it to store systems monitoring data for a very large cluster. There are hundreds of systems, each reporting many metrics: time-series data (CPU utilization, memory utilization, ...), health information (OK, warning, critical, ...), and more complex data, such as a JSON payload associated with an event. Normally, the time-series data would be stored in RRD or Whisper files, with a separate file for each metric, which makes it convenient to use tools such as Graphite/Carbon.
Has anyone done something like this with Vertica? Any suggestions, or thoughts on the pros and cons of Vertica compared to storing time-series data in RRD or Whisper files, would be welcome. In a Whisper file, the data would be added as a tuple (metric-name, value, timestamp).
It would be nice to have all the metric, health, and event information in a single database, so that more complex analysis could be run on it and metrics could be more easily compared and correlated with health information and events. I've read a few articles on using Vertica for time-series data at these links:
http://www.vertica.com/2010/06/08/readi ... -vertica-4
http://www.vertica.com/2010/09/27/gap-f ... ation-gfi/
http://www.vertica.com/2010/09/30/more- ... functions/
http://www.vertica.com/2011/06/20/repor ... aggerated/
I'm also wondering about suggestions on structuring the database. For example, there could be a single table in which all metrics are stored in a tuple (hostname-metric-name, value, timestamp) or (hostname, metric-name, value, timestamp) or each hostname-metric could be stored in a separate table.
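To make the (hostname, metric-name, value, timestamp) option concrete, here is a minimal sketch of the single-table layout. It uses Python's built-in sqlite3 module purely as a runnable stand-in, not Vertica; the table name, column names, and sample rows are all hypothetical, and a real Vertica schema would involve design choices (types, projections) that this does not attempt to show.

```python
import sqlite3

# SQLite is only a stand-in so the sketch runs anywhere; this is not Vertica DDL.
# Table name, column names, and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE metrics (
        hostname    TEXT    NOT NULL,
        metric_name TEXT    NOT NULL,
        value       REAL    NOT NULL,
        ts          INTEGER NOT NULL  -- Unix epoch seconds
    )
    """
)

rows = [
    ("web01", "cpu.util", 12.5, 1353429600),
    ("web01", "cpu.util", 14.0, 1353429660),
    ("web01", "mem.util", 61.0, 1353429600),
    ("db01", "cpu.util", 80.0, 1353429600),
]
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?)", rows)


def series(conn, hostname, metric_name):
    """Return [(ts, value), ...] for one host/metric pair, oldest first."""
    cur = conn.execute(
        "SELECT ts, value FROM metrics "
        "WHERE hostname = ? AND metric_name = ? ORDER BY ts",
        (hostname, metric_name),
    )
    return cur.fetchall()


def fleet_average(conn, metric_name):
    """Average one metric across all hosts and all timestamps."""
    cur = conn.execute(
        "SELECT AVG(value) FROM metrics WHERE metric_name = ?",
        (metric_name,),
    )
    return cur.fetchone()[0]
```

Keeping hostname and metric-name as separate columns means a per-host series and a cross-host aggregate are both plain filters on the same table, which is harder with a combined hostname-metric-name key or with one table per metric.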
We would like to visualize the data, but also to compute more complex statistics, such as comparing the current values of a metric (in a small time window) against the expected values (from a longer time window) for notification/alarming purposes.
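As a rough illustration of the small-window versus long-window comparison, the sketch below flags a metric when the mean of the most recent samples is far from the mean of the preceding baseline, measured in baseline standard deviations. The function name, the window sizes, and the threshold of 3.0 are assumptions chosen for the example; this is one simple way to do it, not a recommended alarming rule.

```python
from statistics import mean, stdev


def is_anomalous(samples, short_n, long_n, threshold=3.0):
    """Compare the last short_n samples against the long_n samples before them.

    samples is a list of numeric values, oldest first. Returns True when the
    recent mean differs from the baseline mean by more than `threshold`
    baseline standard deviations. Returns False if there is not enough data.
    """
    if short_n < 1 or long_n < 2 or len(samples) < short_n + long_n:
        return False

    recent = samples[-short_n:]
    baseline = samples[-(short_n + long_n):-short_n]

    base_mean = mean(baseline)
    base_sd = stdev(baseline)
    recent_mean = mean(recent)

    if base_sd == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return recent_mean != base_mean

    return abs(recent_mean - base_mean) / base_sd > threshold
```

For example, with ten baseline samples hovering around 10, calling `is_anomalous(baseline + [50, 52], short_n=2, long_n=10)` flags the jump, while two further samples near 10 do not.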
Thanks --Roland