I was trying to understand why manual delete/insert take lesser time than update.
My understanding is this-
Delete/Insert:
Deletes all records from all projections satisfying the predicate condition ( projections in ROS containers are intact /not dropped)
Inserts records into new positions into all projections (containers intact/not dropped)
Update
Deletes all records from all projections satisfying the predicate condition ( projections in ROS containers are intact /not dropped)
Inserts records into same position as the deleted records (since this cant happen , these ROS containers are
dropped , and (existing unchanged records + new records) is reloaded into the new container
Is my understanding correct ?
Why manual delete/insert better than update?
Moderator: NorbertKrupa
- JimKnicely
- Site Admin
- Posts: 1825
- Joined: Sat Jan 21, 2012 4:58 am
- Contact:
Re: Why manual delete/insert better than update?
Hi,
Can you show us an example?
Note that in both scenarios, manual DELETE/INSERT vs UPDATE, delete vectors are created and only new files are created. No files are updated or deleted.
If you are having performance issues, make sure you have optimized the projections involved for UPDATE and DELETE! Check out the following online doc:
Best Practices for DELETE and UPDATE
https://my.vertica.com/docs/8.0.x/HTML/ ... UPDATE.htm
As far as performance, the following example shows that the performance is relatively even for both methods.
Example...
SETUP:
DELETE followed by INSERT:
Total time: 2.99 + 2.35 = 5.34 seconds
UPDATE only:
Total time: 6.11 seconds
That's < 1 second difference...
Can you show us an example?
Note that in both scenarios, manual DELETE/INSERT vs UPDATE, delete vectors are created and only new files are created. No files are updated or deleted.
If you are having performance issues, make sure you have optimized the projections involved for UPDATE and DELETE! Check out the following online doc:
Best Practices for DELETE and UPDATE
https://my.vertica.com/docs/8.0.x/HTML/ ... UPDATE.htm
As far as performance, the following example shows that the performance is relatively even for both methods.
Example...
SETUP:
Code: Select all
dbadmin=> \d test;
List of Fields by Tables
Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key
--------+-------+---------------------+-------------+------+---------+----------+-------------+-------------
public | test | pk | int | 8 | | t | t |
public | test | sale_date_key | int | 8 | | f | f |
public | test | ship_date_key | int | 8 | | f | f |
public | test | sales_dollar_amount | float | 8 | | f | f |
public | test | transaction_type | varchar(16) | 16 | | f | f |
(5 rows)
dbadmin=> select count(*) from test;
count
-----------
100000000
(1 row)
Time: First fetch (1 row): 52.497 ms. All rows formatted: 52.543 ms
dbadmin=> select transaction_type, count(*) from test group by transaction_type;
transaction_type | count
------------------+----------
purchase | 95006294
return | 4993706
(2 rows)
dbadmin=> create table test_new_values (pk int, sale_date_key int, ship_date_key int, sales_dollar_amount float, transaction_type varchar(16)) order by pk;
CREATE TABLE
dbadmin=> insert /*+ direct */ into test_new_values select pk, sale_date_key, ship_date_key, sales_dollar_amount * 100, transaction_type from test where transaction_type = 'return';
OUTPUT
---------
4993706
(1 row)
Code: Select all
dbadmin=> delete /*+ direct */ from test where transaction_type = 'return';
OUTPUT
---------
4993706
(1 row)
Time: First fetch (1 row): 2985.270 ms. All rows formatted: 2985.313 ms
dbadmin=> insert /*+ direct */ into test select * from test_new_values;
OUTPUT
---------
4993706
(1 row)
Time: First fetch (1 row): 2349.500 ms. All rows formatted: 2349.557 ms
UPDATE only:
Code: Select all
dbadmin=> update /*+ direct */ test set sales_dollar_amount = sales_dollar_amount * 100 where transaction_type = 'return';
OUTPUT
---------
4993706
(1 row)
Time: First fetch (1 row): 6113.424 ms. All rows formatted: 6113.484 ms
That's < 1 second difference...
Jim Knicely
Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.
Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.