Why Adding long varchar column in a Fact table is not advisable?

Moderator: NorbertKrupa

Post Reply
abdel
Newbie
Newbie
Posts: 2
Joined: Sat Jul 11, 2015 7:37 am

Why Adding long varchar column in a Fact table is not advisable?

Post by abdel » Sat Jul 11, 2015 7:43 am

Hi All,

I wanted to understand the disadvantages of adding long varchar column in fact table.
What effect it will have on performance and storage?
why it is always advisable to store long varchar column in dimension tables and not in fact tables.

NorbertKrupa
GURU
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL
Contact:

Re: Why Adding long varchar column in a Fact table is not advisable?

Post by NorbertKrupa » Sat Jul 11, 2015 7:32 pm

It's not recommended to join, sort, segment, or add constraints on LONG data types. For storage and performance, Vertica stores LONG data types as storage only containers.

See the documentation for more notes.
Checkout vertica.tips for more Vertica resources.

abdel
Newbie
Newbie
Posts: 2
Joined: Sat Jul 11, 2015 7:37 am

Re: Why Adding long varchar column in a Fact table is not advisable?

Post by abdel » Sun Jul 12, 2015 7:17 am

Thanks alot for the reply.

Also what about the side of data storage when storing large varchar column in fact table? My understanding is that it will add up to lot of storage?

NorbertKrupa
GURU
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL
Contact:

Re: Why Adding long varchar column in a Fact table is not advisable?

Post by NorbertKrupa » Sun Jul 12, 2015 7:03 pm

In a quick test, a 2.5 KB raw piece of text was compressed to 1.6 KB when stored as a long varchar.
Checkout vertica.tips for more Vertica resources.

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: Why Adding long varchar column in a Fact table is not advisable?

Post by JimKnicely » Sat Jul 18, 2015 12:18 pm

Large VARCHAR fields also present an issue when trying to load them ... refer to this blog post:

https://community.dev.hp.com/t5/Vertica ... a-p/230903

Refer to this “Approaches for Data Loading” PDF doc:
http://my.vertica.com/docs/SolutionsArc ... oading.pdf

In particular...
Loading Wide Tables

Use Case: Wide tables with large VARCHAR columns are bottlenecks of the workflow in Phase II of the COPY command.

Recommendation: Hewlett-Packard recommends the following options for loading wide tables:
  • Change the LoadMergeChunkSizeK parameter as an exception for specific loads.
  • Use flex tables for wide tables and for multiple small tables. Loading wide tables into flex tables requires loading one field instead of many fields. Thus, it reduces the size of the catalog and improves overall database performance. The initial load is very fast, with data available to the users quickly. However, query performance is lower in comparison to columnar storage.
  • Using GROUPED correlated columns to load wide tables. GROUPED clause groups two or more columns into a single disk file. Two columns are correlated if the value of one column is related to the value of the other column.
You cannot resolve this issue by adding more resources, splitting the files, or parallelizing the work between the nodes. You should contact HP Vertica Support and adjust the configuration parameters under their guidance.
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

Post Reply

Return to “Vertica SQL”