How to export data from SQL Server to Vertica

Moderator: NorbertKrupa

darinbob
Newbie
Newbie
Posts: 10
Joined: Thu Mar 27, 2014 2:43 pm

Re: How to export data from SQL Server to Vertica

Post by darinbob » Thu Jul 24, 2014 8:19 pm

I ended up writing a VB program to look for 'NULL|', '|NULL' or '|NULL|' and replace with | or || depending.

sqlcmd -S MYSERVER -d mydb -u -s"|" -I -h-1 -k2 -W -i "C:\path\input.sql" |null-program.exe | vsql -h verticadev-d dbase-U dbadmin-w password-c "COPY DEV.myTable FROM LOCAL STDIN DELIMITER '|' REJECTED DATA 'C:\path\Reject.txt' EXCEPTIONS 'C:\path\Except.txt';"

Loaded 1 million rows, 1.5 GB of raw data in 71.70 minutes. The replace piece probably adds quite a bit of processing time, but this solution has taught me quite a lot.

I appreciate all the help.

doug_harmon
Beginner
Beginner
Posts: 36
Joined: Fri Feb 17, 2012 6:09 pm
Contact:

Re: How to export data from SQL Server to Vertica

Post by doug_harmon » Fri Jul 25, 2014 1:24 am

There's also the NULL as 'NULL' option for the copy command.

Code: Select all

[dbadmin@vertica tmp]$ cat test_file.txt
1|NULL|6
2|abcd|4
72|a|NULL
NULL|hi|8
[dbadmin@vertica tmp]$ vsql
Welcome to vsql, the Vertica Analytic Database interactive terminal.

Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit

dbadmin=> create table test_table (v1 int, v2 varchar(10), v3 int);
CREATE TABLE
dbadmin=> copy test_table from '/tmp/test_file.txt' NULL AS 'NULL' ;
 Rows Loaded 
-------------
           4
(1 row)

dbadmin=> select * from test_table;
 v1 |  v2  | v3 
----+------+----
  1 |      |  6
  2 | abcd |  4
 72 | a    |   
    | hi   |  8
(4 rows)

dbadmin=> 


doug_harmon
Beginner
Beginner
Posts: 36
Joined: Fri Feb 17, 2012 6:09 pm
Contact:

Re: How to export data from SQL Server to Vertica

Post by doug_harmon » Fri Jul 25, 2014 1:33 am

There are a number of levers you can pull to speed up the load as well. Assuming that the network is the gating factor, you could stream the data through a gzip utility to convert it to gzip. The copy statement can read gzip data as long as you change the input from STDIN to STDIN GZIP.

However, that only speeds things up if you're running this on the Windows Box that houses your SQL Server. If you're doing all of this on a ETL box that sits between the SQL Server and the Vertica Server -- that defeats the purpose because the data will be coming from the SQL Server to the ETL box in uncompressed format.

Post Reply

Return to “Vertica Migration”