Flattened Tables - Vertica Blog Post

Moderator: NorbertKrupa

Post Reply
User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Flattened Tables - Vertica Blog Post

Post by JimKnicely » Thu Dec 07, 2017 3:06 pm

Before release 8.1., Vertica users could denormalize their data by combining all fact and dimension table columns in a single ‘fat’ table. These tables facilitated faster query execution. However, this approach required users to maintain redundant sets of normalized and denormalized data, which incurred its own overhead.

Vertica 8.1 addressed this issue by introducing flattened tables. Flattened tables include columns that get their values by querying other tables. Operations on the source tables and flattened table are decoupled: changes in one are not automatically propagated to the other. Instead, columns are populated with queried data only when the column is created, or when you explicitly refresh it. This approach minimizes the overhead that is otherwise typical of conventional denormalized tables.

Read more about "Flattened Tables" here:
https://my.vertica.com/blog/flattened-tables/
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

thiago_rodrigo
Newbie
Newbie
Posts: 2
Joined: Thu Oct 11, 2018 9:50 pm

Re: Flattened Tables - Vertica Blog Post

Post by thiago_rodrigo » Thu Oct 11, 2018 9:52 pm

was asking in a other discussion but since there is no discussion focused on Prejoin Projection and the replacement for this since v8.1 Flattened Tables, I created this one.

Basically, what is the best practice to improve the performance of a query with joins in the V9.X?

Before V8.1, we would create a projection (for example: CREATE PROJECTION prj_fact_dimensions as select fact.f1, fact.f2, ..., fact.f50, dimension1.d1, dimension1.d2, ...., dimension1.d50, dimension2.di1, dimension2.di2, ..., dimension2.di50 from fact, dimension1, dimension2 where fact.id_di1 = dimension1.id_di1 and fact.id_di2 = dimension2.id_di2 order by f1), now this would result in a error: "Prejoin projection is deprecated".

With this example what is the best solution on V9.1?

thiago_rodrigo
Newbie
Newbie
Posts: 2
Joined: Thu Oct 11, 2018 9:50 pm

Re: Flattened Tables - Vertica Blog Post

Post by thiago_rodrigo » Thu Oct 11, 2018 9:52 pm

Bellow is the post I made on other discussion, with my problem with more details:

Hello, Ben_Vandiver! I'm having a similar problem here. I have a fact table with millions of registers with some dimensions. And I need to improve a performance of a query, initially I was trying to create a projection and end up with the error "Prejoin projection is deprecated", then I looked up more information about flattened tables that you suggested as a solution.
However, I have some doubts about this:

1- I was planning on creating a flattened table that used the fact table and the dimensions, but do not alter them, and them be able to create other flattened tables to aggregate data from fact and dimension tables.

2- Is there a way to create a flattened table totally based on the query I was using to create a projection? I was expecting something like create table fact_wide set using [query], is there a way to do it?

3- I only found 2 suggestions of use of flattened tables, creating a table based on other (like: CREATE TABLE fact_wide as select a, b,c from fact; and later add dimension columns one by one with a query related), or altering the fact table adding the dimension columns. Then I come to questions, if I create a flattened table like fact_wide, how the fact data can be refreshed on fact_wide?

4- Also if I alter the fact table (which I'm not sure is a good practice), do I really have to add a query column by column even if its the same dimension table (in my case there are a lot of columns in each dimension)

Post Reply

Return to “Vertica Links”