Fast query result export

Post by oleg_myrk » Tue Nov 27, 2012 2:23 am


What is the fastest way to export a query result from Vertica to an external format? Greenplum, for comparison, offers several parallel data unloading options.

Running vsql, I measured a maximum of 50,000 rows per second per CPU, which is quite slow for our purposes. The (uncompressed) row size is about 100 bytes.
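To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. The rate and row size are the ones measured above; the one-billion-row result set and the function names are purely hypothetical, chosen only to illustrate the arithmetic.

```python
def export_throughput_mb_per_s(rows_per_second, row_bytes):
    """Raw export bandwidth in MB/s (1 MB = 1,000,000 bytes)."""
    return rows_per_second * row_bytes / 1_000_000


def export_hours(total_rows, rows_per_second, workers=1):
    """Hours needed to export total_rows, assuming the rate scales linearly with workers."""
    return total_rows / (rows_per_second * workers) / 3600


# Measured figures from the post: 50,000 rows/s per CPU, about 100 bytes per row.
print(export_throughput_mb_per_s(50_000, 100))  # 5.0 (MB/s per CPU)

# Hypothetical result set of one billion rows, exported by a single process.
print(round(export_hours(1_000_000_000, 50_000), 2))
```

At roughly 5 MB/s per CPU, a single export stream sits far below what disks or the network can usually sustain, which is why some form of parallel unloading matters here.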

Interestingly, PostgreSQL exports data at about the same speed, so the bottleneck might be a problem with our setup or data.

Would using the Hadoop connector allow us to export query results faster? ... ew-tricks/

There is a web page from 2010 ... tegration/
claiming that:

In addition, inspired by a large banking customer, Vertica is announcing some cool Hadoop integration futures:
* Vertica-formatted data will be stored on HDFS (Hadoop Distributed File System).
* It will get there via parallel backup — i.e., you will be able to back up Vertica to HDFS.
* Libraries will be exposed to let HDFS read and write the Vertica-formatted data, for purposes like ETL, long-running analytics, etc.

What is the status of this development?

Thank you!

Re: Fast query result export

Post by jpcavanaugh » Tue Nov 27, 2012 3:07 am

Where do you want to write it to? Thinking outside the box, you could use the Hadoop connector, or you could write a custom UDx.
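If plain files are an acceptable target, another option (not one of the two suggested above, and not measured in this thread) is to split the query into disjoint slices and run one vsql process per slice. A minimal sketch in Python, assuming the result has a column suitable for hashing; the host, database, user, table, column, and output names are all hypothetical, and the helper functions are illustrative only:

```python
import shlex


def build_slice_query(query, key_column, n_slices, slice_index):
    """Wrap a query so it returns only the rows whose key hashes into one slice."""
    return (
        f"SELECT * FROM ({query}) AS q "
        f"WHERE MOD(HASH(q.{key_column}), {n_slices}) = {slice_index}"
    )


def build_vsql_commands(query, key_column, n_slices, host, database, user, out_prefix):
    """Build one vsql command line per slice; each writes to its own output file."""
    commands = []
    for i in range(n_slices):
        commands.append([
            "vsql", "-h", host, "-d", database, "-U", user,
            "-A", "-t",        # unaligned output, rows only (no headers or footers)
            "-F", "|",         # field separator
            "-c", build_slice_query(query, key_column, n_slices, i),
            "-o", f"{out_prefix}.{i}.txt",
        ])
    return commands


# Hypothetical connection details and table, for illustration only.
for cmd in build_vsql_commands(
    "SELECT id, payload FROM events", "id", 4,
    host="vertica-node1", database="mydb", user="dbadmin", out_prefix="export",
):
    print(shlex.join(cmd))
```

Each command can then be started with subprocess.Popen and waited on. Note that every slice re-executes the underlying query, so this only pays off when the client-side formatting, rather than the query itself, is the bottleneck.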
