Hi,
I'm trying to load a .csv file with 15 million rows.
This is done like this: cat file.csv | ../vsql -h 10.0.0.10 -U Admin -w 123123 -c "COPY ProjectionTest.FE (list about 30 columns) FROM LOCAL STDIN DELIMITER AS ',' ENCLOSED BY '\"' DIRECT;"
When I loaded the same file the first 4-5 times, each load took about 7 minutes, but now that the table has 100+ million rows it takes 15+ minutes.
Is this expected behavior for the COPY command?
Does the load time depend on the size (number of rows) of the table we are inserting into?
Is there perhaps a different approach that would reduce the load time?
Thank you for any advice.
COPY command for a file with 15-20 million rows
Moderator: NorbertKrupa
Re: COPY command for a file with 15-20 million rows
Hi there!
Are you copying across a network? I noticed you are using the LOCAL option of the COPY command. I found that it's sometimes faster to compress a file on a client machine, copy to the Vertica server and then load it from the server.
Or is your issue that you see the COPY command performance degrade if there are already records in the table (i.e. 100+ million)?
Thanks,
Juliette
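A minimal sketch of the compress, copy, then load-on-the-server approach suggested above. It is not a tested recipe: the SSH user (dbadmin) and the remote directory (/tmp) are hypothetical, the column list is left out because it was elided in the original post, and the script only prints the commands unless DRY_RUN=0 is set. It assumes a gzip that supports -k, and that the copied file is readable by the database process on the server.

```shell
#!/bin/sh
# Sketch: compress on the client, copy to the Vertica host, load server-side.
# By default (DRY_RUN=1) the commands are only printed, not executed.
set -eu

CSV=${CSV:-file.csv}              # client-side CSV, as in the original post
DB_HOST=${DB_HOST:-10.0.0.10}     # host taken from the original command
SSH_USER=${SSH_USER:-dbadmin}     # hypothetical OS account on the server
REMOTE_DIR=${REMOTE_DIR:-/tmp}    # hypothetical staging directory
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

REMOTE_FILE="$REMOTE_DIR/$(basename "$CSV").gz"

# Without LOCAL, COPY reads the file on the server itself; GZIP marks the
# input as gzip-compressed. The column list from the post is omitted here.
SQL="COPY ProjectionTest.FE FROM '$REMOTE_FILE' GZIP DELIMITER AS ',' ENCLOSED BY '\"' DIRECT;"

run gzip -k "$CSV"                                   # keeps file.csv, writes file.csv.gz
run scp "$CSV.gz" "$SSH_USER@$DB_HOST:$REMOTE_FILE"  # ship the compressed file
run vsql -h "$DB_HOST" -U Admin -c "$SQL"            # prompts for a password unless -w is added
```

Timing each of the three steps separately (for example with the shell's time keyword) would show whether the network transfer or the load itself dominates.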
Re: COPY command for a file with 15-20 million rows
How much memory do you have on the box and what is the planned concurrency?
Re: COPY command for a file with 15-20 million rows
Hi, thank you for your answers.
The problem is that I currently have only one node, and after some investigation I found that the load time depends on how many queries are running on Vertica at the same time.