First of all, please excuse my poor English, and let me know if there is anything you can't understand.
I hope you can help me with my problem. Let me introduce the background:
I have to load huge amounts of data from an Oracle database into a Vertica database on a daily basis.
For performance reasons, I can't use Java or temporary files such as .csv or .txt (around 50 tables are involved, some with several million rows each, every night).
I am restricted to Windows.
So the basic solution is the following:
Code:
echo @MyPath\Myscript.sql | sqlplus user/pwd@mydb | vsql -h host -d mydb -U user -w pwd -c "COPY Table(DATA) FROM LOCAL STDIN DELIMITER '|' DIRECT STREAM NAME 'Loading data from myscript' EXCEPTIONS 'exception.txt' REJECTED DATA 'rejected.txt';"
Of course, I also want to check my constraints. There are PKs, FKs, and UKs to check. I don't want to resort to complex and fancy SELECT/INSERT/DELETE queries and/or temporary tables when I can use SELECT ANALYZE_CONSTRAINTS() and the NO COMMIT option.
Here comes the core problem (not the core question; please read to the end).
If I do something like this:
Code:
vsql -h host -d mydb -w pwd -c "COMMIT;" | echo @MyPath\Myscript.sql | sqlplus user/pwd@mydb | vsql -h host -d mydb -U user -w pwd -c "COPY Table(DATA) FROM LOCAL STDIN DELIMITER '|' DIRECT STREAM NAME 'Loading data from myscript' EXCEPTIONS 'exception.txt' REJECTED DATA 'rejected.txt' NO COMMIT;"
Next:
Code:
# Set ON_ERROR_STOP to avoid committing if a constraint is violated
vsql -h host -d mydb -U user -w pwd -c "\set ON_ERROR_STOP on"
echo @MyPath/Myscript.sql | sqlplus user/pwd@mydb | vsql -h host -d mydb -U user -w pwd -c "COPY Table(DATA) FROM LOCAL STDIN DELIMITER '|' DIRECT STREAM NAME 'Loading data from myscript' EXCEPTIONS 'exception.txt' REJECTED DATA 'rejected.txt' NO COMMIT;SELECT 1/(1-SIGN( COUNT(*)))FROM (SELECT ANALYZE_CONSTRAINTS('Table')) as CHECK_CONSTRAINT; COMMIT;"
With 50 tables involved, you can't tell when an error was raised, nor which keys were violated.
So this is a bad solution.
Next solution: launching a SQL script with vsql.
Code:
echo @MyPath\Myscript.sql | sqlplus user/pwd@mydb | vsql -h host -d mydb -U user -w pwd -f myVertScript.sql
Code:
COPY Table(DATA) FROM LOCAL STDIN DELIMITER '|' DIRECT STREAM NAME 'Loading data from myscript' EXCEPTIONS 'exception.txt' REJECTED DATA 'rejected.txt' NO COMMIT;
SELECT ANALYZE_CONSTRAINTS('Table');
COMMIT;
I used a very simple Perl script to get the data into a variable. (I tried DBI at first, but you can't use it with STDIN. There is a PostgreSQL Perl module that lets you use pg_putcopydata() to load your data into the STDIN pipe, but there is nothing like that for Vertica. Once again, everything is built around .txt or .csv files.)
Code:
# Run the sqlplus query and store its output in $test1
# (note: both @ signs must be escaped, or Perl interpolates them as arrays)
my $test1 = `sqlplus -s user/pwd\@mydb \@MyPath\\Myscript.sql`;
# Run the vsql command, passing the data in as the variable "var"
my $test2 = `vsql -h host -d mydb -U user -w pwd -v var="$test1" -f myVertScript.sql`;
print $test2;
Code:
COPY Table(DATA) FROM LOCAL STDIN DELIMITER '|' DIRECT STREAM NAME 'Loading data from myscript' EXCEPTIONS 'exception.txt' REJECTED DATA 'rejected.txt' NO COMMIT;
\\echo :var
\.
SELECT ANALYZE_CONSTRAINTS('table') as CHECK_CONSTRAINT;
COMMIT;
I also tried:
:var
\echo :var
I also tried to put the COPY statement into :var itself (concatenated with the data to load, separated by \r\n, all built in the Perl script), but vsql considers them (the COPY and the data) to be on the same line despite the \r\n (an echo in myVertScript.sql showed me that the \r\n is printed correctly).
So here is the core question:
Is there any way to put the value of :var onto STDIN from within a (v)sql script?
This is very frustrating. I have my data right there in my vsql session, but COPY just doesn't seem to have a way to read it directly, short of going through heavy, slow-to-build-and-read .txt files (some of the tables have several million rows to transfer every night).
COPY does accept STDIN, of course, but I can't use that from inside a script, for the reason above.
Hope you can help me!
Best regards,
EDIT: I just saw the dedicated Data Load forum... maybe this topic should be moved.
EDIT2:
I am aware of EnableNewPrimaryKeysByDefault and EnableNewUniqueKeysByDefault, but we would still need ANALYZE_CONSTRAINTS() for the FKs, so they are nice as a double check on the process but not enough. Moreover, they trigger an ERROR even when the NO COMMIT parameter is used, meaning you can't combine them with ANALYZE_CONSTRAINTS().