Partition Swapping

Moderator: NorbertKrupa

Post Reply
doug_harmon
Beginner
Beginner
Posts: 36
Joined: Fri Feb 17, 2012 6:09 pm
Contact:

Partition Swapping

Post by doug_harmon » Wed Jun 12, 2013 10:41 pm

Partition swapping is a common technique for loading data into tables. Typically you will load data into a stage table, then swap the stage table into a partition in the destination table. For large tables this is much faster than a merge, with downtime measured in seconds.

Here's what I learned:
1. A fast way to create a stage table identical to the destination table is:

Code: Select all

  CREATE TABLE <StageTable> LIKE <DestinationTable> INCLUDING PROJECTIONS;
2. If you move a partition into an existing table that already has data in that partition, the data will be added to what is already in the table. Vertica will not check to see if Primary Key constraints are violated, you will have to do that manually.
3. If you want to partition a table differently Vertica will take care of the repartitioning in the background. You can continue to query the table while it is being rebuilt. :D

Code: Select all

         ALTER TABLE <TableName> PARTITION BY <PartitionFunctionGoesHere> REORGANIZE;
Example: ALTER TABLE MyExampleDestination PARTITION BY ( FLOOR(RowID/20)::INT ) REORGANIZE;

SQL File: HowToDoPartitionSwappingInVertica.sql

Code: Select all


----------------------------------------------------------------------------------------------
-- Create a table with a Primary Key Constraint
----------------------------------------------------------------------------------------------
CREATE TABLE MyExampleDestination
( RowID INT NOT NULL PRIMARY KEY
 ,VariableOne INT NOT NULL
 ,VariableTwo INT
)
ORDER BY RowID
PARTITION BY ( FLOOR(RowID/10)::INT );

----------------------------------------------------------------------------------------------
-- Populate table with 100 records of data
----------------------------------------------------------------------------------------------
INSERT INTO MyExampleDestination 
(RowID, VariableOne, VariableTwo)
SELECT ROW_NUMBER() OVER() - 1 AS RowID
      ,(RANDOM()*100)::INT AS VariableOne
      ,(RANDOM()*100)::INT AS VariableTwo
FROM all_tables
limit 100;

----------------------------------------------------------------------------------------------
-- Should return 0 records if Primary Key Constraint is valid
----------------------------------------------------------------------------------------------
SELECT ANALYZE_CONSTRAINTS('MyExampleDestination');

----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity 
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;

----------------------------------------------------------------------------------------------
-- Create a Source table with the same structure as the Destination Table
----------------------------------------------------------------------------------------------
CREATE TABLE MyExampleSource
LIKE MyExampleDestination INCLUDING PROJECTIONS;

----------------------------------------------------------------------------------------------
-- Populate the Source table with data for Projection Keys < 5
----------------------------------------------------------------------------------------------
INSERT /*+ direct */ 
INTO   MyExampleSource
SELECT * 
  FROM  MyExampleDestination
 WHERE FLOOR(RowID/10)::INT < 5
   AND VariableTwo%2=0;

----------------------------------------------------------------------------------------------
-- Show the distribution of records in the Source table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT, COUNT(*) AS Quantity 
FROM MyExampleSource
GROUP BY 1
ORDER BY 1;

----------------------------------------------------------------------------------------------
-- Drop Partitions 0-3.  Leave 4 alone.
----------------------------------------------------------------------------------------------
SELECT DROP_PARTITION ('MyExampleDestination' , 0); 
SELECT DROP_PARTITION ('MyExampleDestination' , 1); 
SELECT DROP_PARTITION ('MyExampleDestination' , 2); 
SELECT DROP_PARTITION ('MyExampleDestination' , 3);

----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity 
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;


----------------------------------------------------------------------------------------------
-- Swap partitions 0-4 from Source table into Destination Table
----------------------------------------------------------------------------------------------
SELECT MOVE_PARTITIONS_TO_TABLE ( 'MyExampleSource',0, 4, 'MyExampleDestination' ) ;

SELECT FLOOR(RowID/10)::INT, COUNT(*) AS Quantity 
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;

----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
-- Note that Partition 4 has both the existing and the new records.
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity 
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;

----------------------------------------------------------------------------------------------
-- Should return records showing where Primary Key Constraint is NOT valid in Partition 4
----------------------------------------------------------------------------------------------
SELECT ANALYZE_CONSTRAINTS('MyExampleDestination');


----------------------------------------------------------------------------------------------
-- Delete bad Partition
----------------------------------------------------------------------------------------------
SELECT DROP_PARTITION ('MyExampleDestination' , 4);

----------------------------------------------------------------------------------------------
-- Change the way the table is partitioned. This runs in the background, you can continue 
-- to query the table while it is being reorganized to use the new partitioning structure
----------------------------------------------------------------------------------------------
ALTER TABLE MyExampleDestination
PARTITION BY ( FLOOR(RowID/20)::INT ) REORGANIZE;

----------------------------------------------------------------------------------------------
-- Clean Up
----------------------------------------------------------------------------------------------
DROP TABLE MyExampleDestination, MyExampleSource;

Result

Code: Select all

dbadmin=> \i HowToDoPartitionSwappingInVertica.sql
----------------------------------------------------------------------------------------------
-- Create a table with a Primary Key Constraint
----------------------------------------------------------------------------------------------
CREATE TABLE MyExampleDestination
( RowID INT NOT NULL PRIMARY KEY
 ,VariableOne INT NOT NULL
 ,VariableTwo INT
)
ORDER BY RowID
PARTITION BY ( FLOOR(RowID/10)::INT );
CREATE TABLE

----------------------------------------------------------------------------------------------
-- Populate table with 100 records of data
----------------------------------------------------------------------------------------------
INSERT INTO MyExampleDestination
(RowID, VariableOne, VariableTwo)
SELECT ROW_NUMBER() OVER() - 1 AS RowID
      ,(RANDOM()*100)::INT AS VariableOne
      ,(RANDOM()*100)::INT AS VariableTwo
FROM all_tables
limit 100;
 OUTPUT
--------
    100
(1 row)


----------------------------------------------------------------------------------------------
-- Should return 0 records if Primary Key Constraint is valid
----------------------------------------------------------------------------------------------
SELECT ANALYZE_CONSTRAINTS('MyExampleDestination');
 Schema Name | Table Name | Column Names | Constraint Name | Constraint Type | Column Values
-------------+------------+--------------+-----------------+-----------------+---------------
(0 rows)


----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
 Partition | Quantity
-----------+----------
         0 |       10
         1 |       10
         2 |       10
         3 |       10
         4 |       10
         5 |       10
         6 |       10
         7 |       10
         8 |       10
         9 |       10
(10 rows)


----------------------------------------------------------------------------------------------
-- Create a Source table with the same structure as the Destination Table
----------------------------------------------------------------------------------------------
CREATE TABLE MyExampleSource
LIKE MyExampleDestination INCLUDING PROJECTIONS;
CREATE TABLE

----------------------------------------------------------------------------------------------
-- Populate the Source table with data for Projection Keys < 5
----------------------------------------------------------------------------------------------
INSERT /*+ direct */
INTO   MyExampleSource
SELECT *
  FROM  MyExampleDestination
 WHERE FLOOR(RowID/10)::INT < 5
   AND VariableTwo%2=0;
 OUTPUT
--------
     24
(1 row)


----------------------------------------------------------------------------------------------
-- Show the distribution of records in the Source table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT, COUNT(*) AS Quantity
FROM MyExampleSource
GROUP BY 1
ORDER BY 1;
 FLOOR | Quantity
-------+----------
     0 |        6
     1 |        3
     2 |        6
     3 |        4
     4 |        5
(5 rows)


----------------------------------------------------------------------------------------------
-- Drop Partitions 0-3.  Leave 4 alone.
----------------------------------------------------------------------------------------------
SELECT DROP_PARTITION ('MyExampleDestination' , 0);
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)

SELECT DROP_PARTITION ('MyExampleDestination' , 1);
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)

SELECT DROP_PARTITION ('MyExampleDestination' , 2);
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)

SELECT DROP_PARTITION ('MyExampleDestination' , 3);
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)


----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
 Partition | Quantity
-----------+----------
         4 |       10
         5 |       10
         6 |       10
         7 |       10
         8 |       10
         9 |       10
(6 rows)



----------------------------------------------------------------------------------------------
-- Swap partitions 0-4 from Source table into Destination Table
----------------------------------------------------------------------------------------------
SELECT MOVE_PARTITIONS_TO_TABLE ( 'MyExampleSource',0, 4, 'MyExampleDestination' ) ;
                             MOVE_PARTITIONS_TO_TABLE
-----------------------------------------------------------------------------------
 5 distinct partition values moved at epoch 231938. Effective move epoch: 231937.

(1 row)


SELECT FLOOR(RowID/10)::INT, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
 FLOOR | Quantity
-------+----------
     0 |        6
     1 |        3
     2 |        6
     3 |        4
     4 |       15
     5 |       10
     6 |       10
     7 |       10
     8 |       10
     9 |       10
(10 rows)


----------------------------------------------------------------------------------------------
-- Show the distribution of records by the Partition Key in the Destination Table
-- Note that Partition 4 has both the existing and the new records.
----------------------------------------------------------------------------------------------
SELECT FLOOR(RowID/10)::INT AS Partition, COUNT(*) AS Quantity
FROM MyExampleDestination
GROUP BY 1
ORDER BY 1;
 Partition | Quantity
-----------+----------
         0 |        6
         1 |        3
         2 |        6
         3 |        4
         4 |       15
         5 |       10
         6 |       10
         7 |       10
         8 |       10
         9 |       10
(10 rows)


----------------------------------------------------------------------------------------------
-- Should return records showing where Primary Key Constraint is NOT valid in Partition 4
----------------------------------------------------------------------------------------------
SELECT ANALYZE_CONSTRAINTS('MyExampleDestination');
 Schema Name |      Table Name      | Column Names | Constraint Name | Constraint Type | Column Values
-------------+----------------------+--------------+-----------------+-----------------+---------------
 public      | MyExampleDestination | RowID        | C_PRIMARY       | PRIMARY         | ('49')
 public      | MyExampleDestination | RowID        | C_PRIMARY       | PRIMARY         | ('42')
 public      | MyExampleDestination | RowID        | C_PRIMARY       | PRIMARY         | ('43')
 public      | MyExampleDestination | RowID        | C_PRIMARY       | PRIMARY         | ('40')
 public      | MyExampleDestination | RowID        | C_PRIMARY       | PRIMARY         | ('44')
(5 rows)



----------------------------------------------------------------------------------------------
-- Delete bad Partition
----------------------------------------------------------------------------------------------
SELECT DROP_PARTITION ('MyExampleDestination' , 4);
  DROP_PARTITION
-------------------
 Partition dropped
(1 row)


----------------------------------------------------------------------------------------------
-- Change the way the table is partitioned. This runs in the background, you can continue
-- to query the table while it is being reorganized to use the new partitioning structure
----------------------------------------------------------------------------------------------
ALTER TABLE MyExampleDestination
PARTITION BY ( FLOOR(RowID/20)::INT ) REORGANIZE;
vsql:HowToDoPartitionSwapVertica.sql:113: NOTICE 4954:  The new partitioning scheme will produce 5 partitions
vsql:HowToDoPartitionSwapVertica.sql:113: NOTICE 4785:  Started background repartition table task
ALTER TABLE

----------------------------------------------------------------------------------------------
-- Clean Up
----------------------------------------------------------------------------------------------
DROP TABLE MyExampleDestination, MyExampleSource;
DROP TABLE

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Partition Swapping

Post by JimKnicely » Thu Jun 13, 2013 12:33 pm

#doug_harmon,

Thanks for sharing your experience!
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

billykopecki
Beginner
Beginner
Posts: 42
Joined: Thu Apr 19, 2012 9:03 pm

Re: Partition Swapping

Post by billykopecki » Thu Jun 13, 2013 1:34 pm

I second that! Great post!

Post Reply

Return to “Vertica Tips, Lessons and Examples”