Page 1 of 1

Projections for Catalog tables in Vertica

Posted: Thu Sep 26, 2013 9:20 am
by pk132007
We have a requirement where we calling V_CATALOG.VIEW_COLUMNS catalog table iteratively inside a script to get the metadata for aroung 300 columns.

We are quering V_CATALOG.VIEW_COLUMNS table one by one for each column.
The query is taking a lot of time to fetch the data.

Is there any way we can reduce the query time?
Can we ask our DB admin to create projections on catalog table? Is that possible ? Will it help us to reduce the query time ?

Please help.

Re: Projections for Catalog tables in Vertica

Posted: Thu Sep 26, 2013 10:30 am
by id10t
Hi!

Can you expand your "fetching algorithm/strategy"? What does it mean "one by one for each column":

Code: Select all

_____________________
1| get names of views
2| for each view:
3|     get names of columns
4|     for each column:
5|          do something()
_____________________
* Which language used for script? Bash + vsql?
* If bash: Can you use in python?
* If yes: Can you use in pyodbc?

Other strategy: fetch all meta-data, store it separately to Vertica in some repository and after it query a repository and not Vertica.
This strategy can be easily implemented in python: python has built in database - SQLite(for performance SQLite can be in-memory database).
- fetch all meta-data from Vertica
- save it in SQLite (define to be in-memory for performance, it will not take a lot space)
- query SQLite for info about Vertica's meta

PS:
It looks complicated, but trust me it is not so much complicated strategy for implementation. I can help.

Re: Projections for Catalog tables in Vertica

Posted: Fri Sep 27, 2013 1:57 pm
by JimKnicely
Hi,

I've found the Vertica system tables to be excruciatingly slow at times. At the HP conference in Boston a few months ago, an attendee brought this up at one of the user sessions. The tech from Vertica responded and said that their engineers are aware of this issue and are actively working on a solution for the next release :)

In the mean time, if your views aren't actively changing and are relatively stable, maybe you can create a table yourself called VIEW_COLUMNS_COPY (or something similar) and copy the contents of the V_CATALOG.VIEW_COLUMNS table into your table with an interval relative to how the often your views change?

Then you can query your table for the view meta data which should perform a lot better!

Just a thought...