COPY command for a file with 15-20 million rows

Moderator: NorbertKrupa

Dimitrius
Newbie
Posts: 2
Joined: Wed Oct 03, 2012 11:49 am


Post by Dimitrius » Wed Oct 03, 2012 12:50 pm

Hi,

I'm trying to load a .csv file with 15 million rows.
I do it like this: cat file.csv | ../vsql -h 10.0.0.10 -U Admin -w 123123 -c "COPY ProjectionTest.FE (list about 30 columns) FROM LOCAL STDIN DELIMITER AS ',' ENCLOSED BY '\"' DIRECT;"

The first 4-5 times I loaded the same file it took about 7 minutes, but now that the table has 100+ million rows it takes 15+ minutes.

Is this expected behavior for the COPY command?
Does the load time depend on the size (number of rows) of the table we are inserting into?
Is there a different approach that would reduce the load time?

Thank you for any advice.

Julie
Master
Posts: 221
Joined: Thu Apr 19, 2012 9:29 pm

Re: COPY command for a file with 15-20 million rows

Post by Julie » Wed Oct 03, 2012 6:13 pm

Hi there!

Are you copying across a network? I noticed you are using the LOCAL option of the COPY command. I have found that it is sometimes faster to compress the file on the client machine, copy it to the Vertica server, and then load it from the server.
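
A rough sketch of that compress, copy, and load sequence, in case it helps. The staging path under /tmp, the use of scp, and the function names are my assumptions, not something from your setup. The column list and password from your command are left out for brevity, and the file must be readable by the database on the server.

```shell
#!/bin/sh
# Sketch only: host, paths, and the scp setup are hypothetical.

# Build the server-side COPY statement for a gzip-compressed file.
# $1 = target table, $2 = path of the .gz file on the Vertica node
build_copy_sql() {
    printf "COPY %s FROM '%s' GZIP DELIMITER AS ',' ENCLOSED BY '\"' DIRECT;" "$1" "$2"
}

# Compress on the client, ship the file, then load it on the server.
# $1 = local csv file, $2 = Vertica host, $3 = target table
compress_ship_load() {
    csv=$1; host=$2; table=$3
    remote="/tmp/$(basename "$csv").gz"   # hypothetical staging path
    gzip -c "$csv" > "$csv.gz" || return 1
    scp "$csv.gz" "$host:$remote" || return 1
    # No LOCAL keyword here: the Vertica node reads the file itself.
    vsql -h "$host" -U Admin -c "$(build_copy_sql "$table" "$remote")"
}
```

You would call it as something like compress_ship_load file.csv 10.0.0.10 ProjectionTest.FE. Whether it beats FROM LOCAL STDIN depends on how slow the network link is compared to the time spent compressing.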

Or is your issue that you see the COPY command performance degrade if there are already records in the table (i.e. 100+ million)?
Thanks,
Juliette

jpcavanaugh
Intermediate
Posts: 149
Joined: Mon Apr 30, 2012 10:04 pm
Location: New York

Re: COPY command for a file with 15-20 million rows

Post by jpcavanaugh » Wed Oct 03, 2012 9:02 pm

How much memory do you have on the box and what is the planned concurrency?

Dimitrius
Newbie
Posts: 2
Joined: Wed Oct 03, 2012 11:49 am

Re: COPY command for a file with 15-20 million rows

Post by Dimitrius » Thu Oct 04, 2012 10:29 am

Hi, thank you for your answers,

The problem is that I currently have only 1 node, and after some investigation I found that the load time depends on how many queries are running on Vertica at the same time.
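
For anyone who wants to check the same thing before a load, here is a minimal sketch that lists the statements currently running. It assumes the v_monitor.sessions system table with its current_statement and statement_start columns; the function name is made up, and I am not certain whether idle sessions show an empty string or NULL, so the filter covers both.

```shell
#!/bin/sh
# Sketch only: builds a query against v_monitor.sessions (assumed available).

# Print a query that lists sessions with a statement in progress.
active_statements_sql() {
    printf '%s' "SELECT session_id, user_name, statement_start, current_statement FROM v_monitor.sessions WHERE current_statement IS NOT NULL AND current_statement <> '';"
}

# Hypothetical usage, run just before starting the COPY:
#   vsql -h 10.0.0.10 -U Admin -c "$(active_statements_sql)"
```

If the list is long when the load is slow and short when it is fast, that would support the idea that concurrent queries, rather than table size, are what slows the COPY down.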
