Page 1 of 1

Estimating Size of a table

Posted: Fri Sep 20, 2013 8:09 am
by nnani
Hello All,

Just thought of sharing this with all.
This can be handy SQL snippet in projects

Suppose you want to calculate the individual table size of any table in your database.
You can use the COLUMN_STORAGE system table for calculating the size.

Lets see and understand this step by step

COLUMN_STORAGE table shows the space in bytes used for every column in database.
so Let so go ahead querying a table

Code: Select all

nnani=> select anchor_table_schema, anchor_table_name,node_name, column_name, used_bytes as column_used_bytes
nnani-> from column_storage where anchor_table_schema='DV0' and anchor_table_name='feed_prvdr' and
nnani-> node_name ='vnode0002';
 anchor_table_schema | anchor_table_name |    node_name    |   column_name   | column_used_bytes
---------------------+-------------------+-----------------+-----------------+-------------------
 DV0         | feed_prvdr  | vnode0002 | prvdr_id   |                 6
 DV0         | feed_prvdr  | vnode0002 | prvdr_nm   |                65
 DV0         | feed_prvdr  | vnode0002 | prvdr_desc |               120
 DV0         | feed_prvdr  | vnode0002 | pdt_by     |                62
 DV0         | feed_prvdr  | vnode0002 | dt_tm      |                20
 DV0         | feed_prvdr  | vnode0002 | dt_tm      |                20
 DV0         | feed_prvdr  | vnode0002 |            |                96
 DV0         | feed_prvdr  | vnode0002 | prvdr_id   |                 3
 DV0         | feed_prvdr  | vnode0002 | prvdr_nm   |                32
 DV0         | feed_prvdr  | vnode0002 | prvdr_desc |                49
 DV0         | feed_prvdr  | vnode0002 | pdt_by     |                31
 DV0         | feed_prvdr  | vnode0002 | dt_tm      |                10
 DV0         | feed_prvdr  | vnode0002 | dt_tm      |                10
 DV0         | feed_prvdr  | vnode0002 |            |                48
(14 rows)

The above query shows column for the specific table present on node 2 and their storage bytes, Now the thing to observe here is the why are they repeated twice on the same node.
This might be due to super projection and the buddy projection residing on the same node so all the column are appearing twice.

Now lets find the size of this table

Code: Select all

nnani=> select sum(column_used_bytes) as table_size_in_bytes from
nnani-> (select anchor_table_schema, anchor_table_name,node_name, column_name, sum(used_bytes) as column_used_bytes
nnani(> from column_storage where anchor_table_schema='DV0' and anchor_table_name='feed_prvdr' and
nnani(> node_name IN (select node_name from nodes)
nnani(> group by 1,2,3,4) sub;
 table_size_in_bytes
---------------------
                2952
(1 row)
So the total table size of this table is 2952 bytes in your database.

Hope this helps.. :)

Re: Estimating Size of a table

Posted: Fri Sep 20, 2013 10:39 am
by id10t
Hi!

Why audit is't an option?

Code: Select all

daniel=> select audit('public.lj', 0, 1);
 audit 
-------
 6
(1 row)

BTW: Your query calculates wrong value, because you also explicitly calculates an epoch column.

Code: Select all

daniel=> select sum(column_used_bytes) as table_size_in_bytes from^J(select anchor_table_schema, anchor_table_name,node_name, column_name, sum(used_bytes) as column_used_bytes^Jfrom column_storage where anchor_table_schema='public' and anchor_table_name='lj' and^Jnode_name IN (select node_name from nodes)^Jgroup by 1,2,3,4) sub;
 table_size_in_bytes 
---------------------
                  53
(1 row)
6 bytes vs 53 bytes ?

Code: Select all

daniel=> select anchor_table_schema, anchor_table_name,node_name, column_name, sum(used_bytes) as column_used_bytes^Jfrom column_storage where anchor_table_name='lj' group by 1,2,3,4;
 anchor_table_schema | anchor_table_name |    node_name    | column_name | column_used_bytes 
---------------------+-------------------+-----------------+-------------+-------------------
 public              | lj                | v_test_node0001 | id          |                 5
 public              | lj                | v_test_node0001 | epoch       |                48

daniel=> \d lj
                                List of Fields by Tables
 Schema | Table | Column | Type | Size | Default | Not Null | Primary Key | Foreign Key 
--------+-------+--------+------+------+---------+----------+-------------+-------------
 public | lj    | id     | int  |    8 |         | f        | f           | 
(1 row)

Re: Estimating Size of a table

Posted: Fri Sep 20, 2013 12:53 pm
by scutter
audit() gives raw table size, column_storage gives compressed size. Two useful values, but different.

The projection_storage table also adds up the used bytes for you by node - just sum up across nodes.

I think it's appropriate to include the epoch column in the math when considering total space used by a projection, but if you're doing a compression calculation you might want to exclude it to be thorough, but leaving it in really does give you the actual compression ratio.

Re: Estimating Size of a table

Posted: Fri Sep 20, 2013 1:08 pm
by id10t
:oops:

Re: Estimating Size of a table

Posted: Fri Sep 20, 2013 4:17 pm
by nnani
Thanks for the suggestion guys.

I learned the fact about the getting the table size ( RAW / COMPRESSED )
This is something new which I got to know.

As scutter said,
audit() gives raw table size, column_storage gives compressed size. Two useful values, but different.
This can be really useful information

Thanks for correcting me. :)

Re: Estimating Size of a table

Posted: Mon Jan 27, 2014 3:22 am
by NorbertKrupa
Since this is a fairly common question, I thought it would be beneficial to expand with blog posts. I hope it's alright to share here.
  1. Vertica table size

Re: Estimating Size of a table

Posted: Mon Jan 27, 2014 8:08 pm
by JimKnicely
I hope it's alright to share here.
It is most definitely okay to share here! Thanks for doing so :D