Opinions re: Schema Design for storing time series data

Moderator: NorbertKrupa

Post Reply
dcosgrov
Newbie
Newbie
Posts: 2
Joined: Thu Sep 25, 2014 2:20 pm

Opinions re: Schema Design for storing time series data

Post by dcosgrov » Thu Sep 25, 2014 2:34 pm

I am looking to get some advice on what type(s) of schema designs have been effective for storing time series telemetry/sensor data in Vertica. I am working on a project where there is a need to store multiple metric values for a given entity at a single point in time, then, continually add new "polled" data on a fixed interval. I see at least a couple of different options for schema design:

option 1:
entity | time | metric name | metric value
A now cpu util 70
A now memory util 90
...
A now+5min cpu util 75
A now+5min memory util 85


option 2:
entity | time | cpu util | memory util | ...
A now 70 90
A now+5min 75 85

I would like to go with more of a key-value design (option 1) but am concerned about significant record growth and combining values from different metrics that might not compress well. Anyway, I'd be interested in hearing about other options beyond what I have above and schema designs that have been particularly effective for storing time series data in Vertica.

Thanks in advance,

Dave

NorbertKrupa
GURU
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL
Contact:

Re: Opinions re: Schema Design for storing time series data

Post by NorbertKrupa » Thu Sep 25, 2014 4:18 pm

Your first option should be fine with proper partitioning and a proper timestamp. Instead of having a metric name, you could also use a metric lookup table.

The second option is not really normalized.
Checkout vertica.tips for more Vertica resources.

dcosgrov
Newbie
Newbie
Posts: 2
Joined: Thu Sep 25, 2014 2:20 pm

Re: Opinions re: Schema Design for storing time series data

Post by dcosgrov » Fri Sep 26, 2014 7:21 pm

Thanks a lot for the feedback.

Post Reply

Return to “Vertica Database Development”