Ive been experimenting with the identity settings and had some questions.
I have a table with 1.5 billion records in IQ that I need to move to Vertica. It makes use of an identity column.
In my testing so-far, when I specify "identity(1,1,1)" I get the correct identity values migrated from IQ to Vertica, however, it takes approx 24 hours to load 6M records. That would translate into an approx 8-month load time for the full 1.5B records, which is obviously not going to work.
When I increase the cache value, I get much better load times, but I would need to go to "identity(1,1,1500000000) in order to get the correct values to load.
Is there any downside to using such a large cache value for this column? Does it consume un-necessary resources or cause other issues that I may not be aware of?
Alternativly, is there a way to change the "cache" value after I load the table and go back to a "identity(1,1,1)" setting after the back-fill is completed?
identity columns
Moderator: NorbertKrupa
- JimKnicely
- Site Admin
- Posts: 1825
- Joined: Sat Jan 21, 2012 4:58 am
- Contact:
Re: identity columns
Check out the link "How Vertica Allots Cache for Sequencing":
https://my.vertica.com/docs/8.1.x/HTML/ ... encing.htm
Having a large CACHE does consume more memory on each node, and if a node goes down, you could potentially lose a lot of sequential values.
You can change the CACHE of the sequence used by the IDENTITY:
dbadmin=> create table test (c identity(1,1,1500000000), c1 varchar(10));
CREATE TABLE
dbadmin=> select sequence_name, session_cache_count from sequences where identity_table_name = 'test';
sequence_name | session_cache_count
---------------+---------------------
test_c_seq | 1500000000
(1 row)
dbadmin=> alter sequence test_c_seq cache 5000;
ALTER SEQUENCE
dbadmin=> select sequence_name, session_cache_count from sequences where identity_table_name = 'test';
sequence_name | session_cache_count
---------------+---------------------
test_c_seq | 5000
(1 row)
https://my.vertica.com/docs/8.1.x/HTML/ ... encing.htm
Having a large CACHE does consume more memory on each node, and if a node goes down, you could potentially lose a lot of sequential values.
You can change the CACHE of the sequence used by the IDENTITY:
dbadmin=> create table test (c identity(1,1,1500000000), c1 varchar(10));
CREATE TABLE
dbadmin=> select sequence_name, session_cache_count from sequences where identity_table_name = 'test';
sequence_name | session_cache_count
---------------+---------------------
test_c_seq | 1500000000
(1 row)
dbadmin=> alter sequence test_c_seq cache 5000;
ALTER SEQUENCE
dbadmin=> select sequence_name, session_cache_count from sequences where identity_table_name = 'test';
sequence_name | session_cache_count
---------------+---------------------
test_c_seq | 5000
(1 row)
Jim Knicely
Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.
Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.