Page 1 of 1

Implicit vs. Explicit Conversion

Posted: Mon Sep 09, 2013 9:46 pm
by Jbaskin
Hey guys,

Does anyone know if it's better in Vertica to explicitly convert all data types to the same when doing comparisons, or is okay to let Vertica do it implicitly? I know this was always a big deal in Oracle...

Say I have two tables A and B:

Code: Select all

dbadmin=> create table a (a1 int);
CREATE TABLE
dbadmin=> create table b (a1 varchar);
CREATE TABLE
dbadmin=> insert into a values (1);
 OUTPUT
--------
      1
(1 row)

dbadmin=> insert into b values ('1');
 OUTPUT
--------
      1
(1 row)
Which of the following query is a better practice?

Code: Select all

dbadmin=> select * from a join b on a.a1 = b.a1;
 a1 | a1
----+----
  1 | 1
(1 row)

dbadmin=> select * from a join b on a.a1::varchar = b.a1;
 a1 | a1
----+----
  1 | 1
(1 row)

dbadmin=> select * from a join b on a.a1 = b.a1::int;
 a1 | a1
----+----
  1 | 1
(1 row)

Re: Implicit vs. Explicit Conversion

Posted: Tue Sep 10, 2013 9:11 am
by nnani
Hello Jbaskin,

I would go with Explicit Casting option.

Advantages

1. Easy code portability, No need to change your code for different environments
2. Easy readability and understanding of code.

I think you should Let Vertica do explicit casting while doing comparison.
More over You can give it check.
Use the timing option and test both of your scenarios with big result sets, you will surely find the difference.


Hope this helps.. :)

Re: Implicit vs. Explicit Conversion

Posted: Tue Sep 10, 2013 3:26 pm
by JimKnicely
I agree with nanni!

I believe that it's always best practice to explicitly convert our data because who knows our data better?

The following example shows we should get better performance by explicitly converting our data, and in this case, converting to INT:

Code: Select all

dbadmin=> create table a (a1 int);
CREATE TABLE

dbadmin=> create table b (a1 varchar(1));
CREATE TABLE

dbadmin=> insert into a select randomint(10) from tables cross join columns cross join tables t1 cross join tables t2;
  OUTPUT
-----------
 217512000
(1 row)

dbadmin=> insert into b select randomint(10)::varchar from tables cross join columns cross join tables t1 cross join tables t2;
  OUTPUT
-----------
 217512000
(1 row)

dbadmin=> \timing
Timing is on.
Okay performance:

Code: Select all

dbadmin=> select count(*) from (select * from a join b on a.a1 = b.a1) foo;
      count
------------------
 4731146840901419
(1 row)

Time: First fetch (1 row): 35984.389 ms. All rows formatted: 35984.433 ms
Worst performance:

Code: Select all

dbadmin=> select count(*) from (select * from a join b on a.a1::varchar = b.a1) foo;
      count
------------------
 4731146840901419
(1 row)

Time: First fetch (1 row): 66358.395 ms. All rows formatted: 66358.427 ms
Best performance:

Code: Select all

dbadmin=> select count(*) from (select * from a join b on a.a1 = b.a1::int) foo;
      count
------------------
 4731146840901419
(1 row)

Time: First fetch (1 row): 27953.255 ms. All rows formatted: 27953.287 ms

Re: Implicit vs. Explicit Conversion

Posted: Thu Sep 12, 2013 3:53 pm
by Jbaskin
Excellent example and explanation, Jim! Thank you!!! This will finally end an argument I've been having with a co-worker :twisted: