copy load data that may contain the delimiter
Posted: Mon Jun 29, 2015 2:19 pm
I'm currently using copy command to load data into vertica.
I volumes of 1TB gziped data daily.
copy table (col1,col2,col3) FROM 1.dat GZIP DELIMITER '|'
My problem is that my data that goes into the columns may contain the delimiter '|' or any other asscii character.
From the documentation there is an option to use ESCAPE AS '~' which mean i will need to escape any data in column that may contain '|' and change it to '~|'
my only question is what happend if my data also has '~|' before i escape it? should i change it to '~~|' or any other alternative?
Would using ENCLOSED BY will help me with it? and what do i need to do if my data contains the ENCLOSED BY chatacter?
I prefer not to use the ENCLOSED BY as it will increase the data transfer of my files, but if that's the best solution i will reconsider.
example for raw data in my system:
(the number represents the column that data belongs to)
1|2|2|3
1|2|3
1|2~|2|3
1|2"2|3
||
|2|3
|2|
|2|3
1||
1||3
1|2|
1|2|3
In all of the above cases the original data contains 3 columns of raw data before escaping it.
Thanks for the help.
Koby
I volumes of 1TB gziped data daily.
copy table (col1,col2,col3) FROM 1.dat GZIP DELIMITER '|'
My problem is that my data that goes into the columns may contain the delimiter '|' or any other asscii character.
From the documentation there is an option to use ESCAPE AS '~' which mean i will need to escape any data in column that may contain '|' and change it to '~|'
my only question is what happend if my data also has '~|' before i escape it? should i change it to '~~|' or any other alternative?
Would using ENCLOSED BY will help me with it? and what do i need to do if my data contains the ENCLOSED BY chatacter?
I prefer not to use the ENCLOSED BY as it will increase the data transfer of my files, but if that's the best solution i will reconsider.
example for raw data in my system:
(the number represents the column that data belongs to)
1|2|2|3
1|2|3
1|2~|2|3
1|2"2|3
||
|2|3
|2|
|2|3
1||
1||3
1|2|
1|2|3
In all of the above cases the original data contains 3 columns of raw data before escaping it.
Thanks for the help.
Koby