Storage Policies in Vertica 6.1 and Beyond

Moderator: NorbertKrupa

Post Reply
User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Storage Policies in Vertica 6.1 and Beyond

Post by JimKnicely » Fri Apr 12, 2013 12:15 pm

Hi,

The storage location capabilities in Vertica have been greatly enhanced as of release 6.1.0! (BTW, a storage location simply refers to a location on disk)

Starting in Vertica 6.1.0 we can now do the following exciting tasks:
  • 1. Label storage locations using the ALTER_LOCATION_LABEL function to change the label.
    2. Create database object level storage policies using the SET_OBJECT_STORAGE_POLICY function where the database object types include database, schemas, tables, and partition ranges. Nice!
    3. Monitor storage policies via the V_MONITOR.STORAGE_POLICIES system table
These are great features because now we can tell Vertica to put data that we deem the most important on our fastest drives while moving less important data (i.e. archive data) on to a slower drive!

Prior to release 6.1.0, Vertica stores columns included in the projection sort order on the fastest storage locations. Columns not included in the projection sort order are stored on slower disks. Columns for each projection are ranked as follows:
  • 1. Columns in the sort order are given the highest priority (numbers > 1000).
    2. The last column in the sort order is given the rank number 1001.
    3. The next-to-last column in the sort order is given the rank number 1002, and so on until the first column in the sort order is given 1000 + # of sort columns.
    4. The remaining columns are given numbers from 1000–1, starting with 1000 and decrementing by one per column.
Vertica then stores columns on disk from the highest ranking to the lowest ranking, with the highest ranking columns placed on the fastest disks, and the lowest ranking columns placed on the slowest disks.

Note that we use the MEASURE_LOCATION_PERFORMANCE and SET_LOCATION_PERFORMANCE functions to tell Vertica which locations are the fastest.

For instance, to measure and set the performance of the location “/mnt/vertica/data” on node 1 we could perform the following two commands:

Code: Select all

dbadmin=> SELECT MEASURE_LOCATION_PERFORMANCE('/mnt/vertica/data' , 'v_testdata1_node0001');
WARNING 3914:  measure_location_performance can take a long time to execute. Please check logs for progress
          MEASURE_LOCATION_PERFORMANCE
-------------------------------------------------
Throughput : 177 MB/sec. Latency : 44 seeks/sec
(1 row)

dbadmin=> SELECT SET_LOCATION_PERFORMANCE('/mnt/vertica/data','v_testdata1_node0001','177','44');
             SET_LOCATION_PERFORMANCE
--------------------------------------------------
/mnt/vertica/data performance statistics stored.
(1 row)
Now, as of Vertica 6.1.0 we can change the labels for the storage locations. For instance, on Vertica test we could do the following:

Code: Select all

dbadmin=> SELECT ALTER_LOCATION_LABEL('/mnt/vertica/data', '', 'SAN');
       ALTER_LOCATION_LABEL
----------------------------------
/mnt/vertica/data label changed.
(1 row)

dbadmin=> SELECT ALTER_LOCATION_LABEL('/usr/local/data/testdata1/v_testdata1_node0001_data', 'v_testdata1_node0001', 'LOCAL');
                        ALTER_LOCATION_LABEL
--------------------------------------------------------------------
/usr/local/data/testdata1/v_testdata1_node0001_data label changed.
(1 row)

dbadmin=> SELECT ALTER_LOCATION_LABEL('/usr/local/data/testdata1/v_testdata1_node0002_data', 'v_testdata1_node0002', 'LOCAL');
                        ALTER_LOCATION_LABEL
--------------------------------------------------------------------
/usr/local/data/testdata1/v_testdata1_node0002_data label changed.
(1 row)

dbadmin=> SELECT ALTER_LOCATION_LABEL('/usr/local/data/testdata1/v_testdata1_node0003_data', 'v_testdata1_node0003', 'LOCAL');
                        ALTER_LOCATION_LABEL
--------------------------------------------------------------------
/usr/local/data/testdata1/v_testdata1_node0003_data label changed.
(1 row)
So, here is an updated list our storage locations in Vertica test (the best performing locations are highlighted in yellow):

Code: Select all

dbadmin=> SELECT node_name, location_path, location_usage, location_label, throughput, latency
dbadmin->   FROM storage_locations;
      node_name       |                    location_path                    | location_usage | location_label | throughput | latency
----------------------+-----------------------------------------------------+----------------+----------------+------------+---------
v_testdata1_node0001 | /usr/local/data/testdata1/v_testdata1_node0001_data | DATA,TEMP      | LOCAL          |        147 |     380
v_testdata1_node0001 | /mnt/vertica/data                                   | DATA           | SAN            |        177 |      44
v_testdata1_node0002 | /usr/local/data/testdata1/v_testdata1_node0002_data | DATA,TEMP      | LOCAL          |        170 |      86
v_testdata1_node0002 | /mnt/vertica/data                                   | DATA           | SAN            |        121 |      40
v_testdata1_node0003 | /usr/local/data/testdata1/v_testdata1_node0003_data | DATA,TEMP      | LOCAL          |        160 |     100
v_testdata1_node0003 | /mnt/vertica/data                                   | DATA           | SAN            |        170 |      96
(6 rows)
Let’s try some storage policy examples…

For the first example, there are no storage policies in place:

Code: Select all

dbadmin=> SELECT * FROM storage_policies;
schema_name | object_name | policy_details | location_label
-------------+-------------+----------------+----------------
(0 rows)
Now, let’s create a simple table:

Code: Select all

dbadmin=> CREATE TABLE test_no_sp (col1 INT) KSAFE 1;
CREATE TABLE
dbadmin=> INSERT INTO test_no_sp VALUES (1);
OUTPUT
--------
      1
(1 row)
What storage location was used?

Code: Select all

dbadmin=> SELECT p.projection_name,
dbadmin->        sc.node_name,
dbadmin->        sc.storage_type,
dbadmin->        sc.location_label
dbadmin->   FROM projections p
dbadmin->   LEFT JOIN storage_containers sc
dbadmin->     ON sc.projection_name = p.projection_name
dbadmin->  WHERE p.anchor_table_name = 'test_no_sp';
projection_name |      node_name       | storage_type | location_label
-----------------+----------------------+--------------+----------------
test_no_sp_b0   | v_testdata1_node0002 | WOS          |
test_no_sp_b1   | v_testdata1_node0003 | WOS          |
(2 rows)
Note that the location_label is NULL in the result set because the data is not on disk yet (the storage_type is WOS, which is memory).

We can force the data to move to disk with the DO_TM_TASK function:

Code: Select all

dbadmin=> SELECT DO_TM_TASK('MOVEOUT', 'test_no_sp');
                                                                             DO_TM_TASK                                                                
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Task: moveout
(Table: testdata1_wh.test_no_sp) (Projection: testdata1_wh.test_no_sp_b0)
(Table: testdata1_wh.test_no_sp) (Projection: testdata1_wh.test_no_sp_b1)

(1 row)
Now let’s check the storage used:

Code: Select all

dbadmin=> SELECT p.projection_name,
dbadmin->        sc.node_name,
dbadmin->        sc.storage_type,
dbadmin->        sc.location_label
dbadmin->   FROM projections p
dbadmin->   LEFT JOIN storage_containers sc
dbadmin->     ON sc.projection_name = p.projection_name
dbadmin->  WHERE p.anchor_table_name = 'test_no_sp';
projection_name |      node_name       | storage_type | location_label
-----------------+----------------------+--------------+----------------
test_no_sp_b0   | v_testdata1_node0002 | ROS          | LOCAL
test_no_sp_b1   | v_testdata1_node0003 | ROS          | SAN
(2 rows)
Vertica chose to use LOCAL for node 2 and SAN for node 3 because they are the best performing locations on each node (refer back to the STORAGE_LOCATIONS query above).

In the next example, we’ll create a storage policy for a table and see which location is used!

Code: Select all

dbadmin=> CREATE TABLE test_sp (col1 INT) KSAFE 1;
CREATE TABLE

dbadmin=> SELECT SET_OBJECT_STORAGE_POLICY('test_sp', 'SAN');
SET_OBJECT_STORAGE_POLICY
----------------------------
Object storage policy set.
(1 row)

dbadmin=> SELECT * FROM storage_policies;
schema_name  | object_name | policy_details | location_label
--------------+-------------+----------------+----------------
testdata1_wh | test_sp     | Table          | SAN
(1 row)

dbadmin=> INSERT INTO test_sp VALUES (1);
OUTPUT
--------
      1
(1 row)

dbadmin=> SELECT DO_TM_TASK('MOVEOUT', 'test_sp');
                                                                       DO_TM_TASK                                                                      
--------------------------------------------------------------------------------------------------------------------------------------------------------
Task: moveout
(Table: testdata1_wh.test_sp) (Projection: testdata1_wh.test_sp_b0)
(Table: testdata1_wh.test_sp) (Projection: testdata1_wh.test_sp_b1)

(1 row)

dbadmin=> SELECT p.projection_name,
dbadmin->        sc.node_name,
dbadmin->        sc.storage_type,
dbadmin->        sc.location_label
dbadmin->   FROM projections p
dbadmin->   LEFT JOIN storage_containers sc
dbadmin->     ON sc.projection_name = p.projection_name
dbadmin->  WHERE p.anchor_table_name = 'test_sp';
projection_name |      node_name       | storage_type | location_label
-----------------+----------------------+--------------+----------------
test_sp_b1      | v_testdata1_node0003 | ROS          | SAN
test_sp_b0      | v_testdata1_node0002 | ROS          | SAN
(2 rows)
So in this case, Vertica used the SAN location on both nodes!

Have fun!
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

zvika
Beginner
Beginner
Posts: 25
Joined: Thu Apr 19, 2012 7:55 am

Re: Storage Policies in Vertica 6.1 and Beyond

Post by zvika » Sat Apr 13, 2013 1:40 pm

Excellent one !

Post Reply

Return to “Vertica Tips, Lessons and Examples”