How is table data stored on disk?

Moderator: NorbertKrupa

Post Reply
User avatar
Jbaskin
Intermediate
Intermediate
Posts: 61
Joined: Sat Jan 28, 2012 1:21 pm

How is table data stored on disk?

Post by Jbaskin » Wed Mar 07, 2012 2:02 pm

Hello, Does anyone know how Vertica stores the data in tables on disk? The directories in the file system are just a bunch of confusing numbers. Thanks.

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: How is table data stored on disk?

Post by JimKnicely » Thu Mar 22, 2012 1:30 pm

Hi jsbaskin,

Projection data in Verica is stored in ROS (Read Optimized Storage) containers on disk. The root location on disk where your data is located can be found by querying the DISK_STORAGE system table:

Code: Select all

dbadmin=> select storage_path from v_monitor.disk_storage where storage_usage = 'DATA';
                  storage_path
-------------------------------------------------
 /data1/data/intersect/v_intersect_node0001_data
 /data2/data/intersect/v_intersect_node0001_data
(2 rows)
In the storage path directory on disk you'll see a bunch of sub-directories having numbered names ranging from 001 up to 999. Each directory is a ROS container. In each of those directories are additional sub-directories named by using a unique identifier called the projection storage uid. In these sub-directories is where you'll find your data files.

For example, I have a table named holiday_dim. Say I want to find out where on disk its data files are located.

First I would need to find out which projections have been created for the table. To do that I would query the PROJECTION_STORAGE system table:

Code: Select all

dbadmin=> select projection_name, row_count, ros_count from v_monitor.projection_storage where anchor_table_name = 'holiday_dim';
   projection_name    | row_count | ros_count
----------------------+-----------+-----------
 holiday_dim_node0001 |        60 |         1
 holiday_dim_node0002 |        60 |         1
 holiday_dim_node0003 |        60 |         1
(3 rows)
Now that I know the projection name (holiday_dim_node0001) on node 1, I can find out the ROS container's unique id by querying the STORAGE_CONTAINERS system table:

Code: Select all

dbadmin=> select storage_type, storage_oid from v_monitor.storage_containers where projection_name = 'holiday_dim_node0001';
 storage_type |    storage_oid
--------------+-------------------
 ROS          | 45035996274585219
(1 row)
The storage_oid is the unique numeric ID assigned by the Vertica catalog that identifies the storage location. It's basically the directory name on disk.

Since I know that my data is stored in the location /data1/data/intersect/v_intersect_node0001_data on disk, I just need to find the ROS container directory under there... For that I can use the Linux find command:

Code: Select all

bash-3.2$ pwd
/data1/data/intersect/v_intersect_node0001_data
bash-3.2$ find ./ -name 45035996274585219
./219/45035996274585219
bash-3.2$ cd 219/45035996274585219
bash-3.2$ ls -lrt
total 8
-rw------- 1 dbadmin dbadmin  48 Mar 21 22:31 45035996274585219_0.pidx
-rw------- 1 dbadmin dbadmin 134 Mar 21 22:31 45035996274585219_0.fdb
There are always two files located in the directory for a projection. One file has a .fdb extension and the other has a .pidx extension. The .fdb file is the data itself (encoded and compressed) while the .pidx file contains index/statistics info from the data file.

Notice that the last three digits of the projection directory represents the ROS container's directory name:
  • /data1/data/intersect/v_intersect_node0001_data/219/45035996274585219
Have fun!
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

man
Newbie
Newbie
Posts: 1
Joined: Wed Apr 11, 2012 10:34 am

Re: How is table data stored on disk?

Post by man » Wed Apr 11, 2012 10:41 am

Hi!
I have an another question...how works the projections data files creation? There is one data file only that stores data? Or is it possible to store data of projection in different files?
And...
I read something about automatic managemet of storage location...but obviusly if there is one file per projection's data it is always stored on one storage location...is it correct?

Tnx!

User avatar
JimKnicely
Site Admin
Site Admin
Posts: 1825
Joined: Sat Jan 21, 2012 4:58 am
Contact:

Re: How is table data stored on disk?

Post by JimKnicely » Fri Apr 13, 2012 1:07 pm

Hi man,

Great question. I believe that projection data is stored in one file per node unless the projection's anchor table was created using the PARTITION clause... in that case, there will be one file for each partition. When you create a table with the PARTITION clause then all projections based on that table are also partitioned.

To demonstrate, let's create a partitioned table:

Code: Select all

dbadmin=> create table test1 (col1 int, col2 varchar2(100), primary key (col1)) partition by col1;
CREATE TABLE
dbadmin=> insert into test1 values (1, 'Pittsburgh');
 OUTPUT
--------
      1
(1 row)

dbadmin=> insert into test1 values (2, 'Toronto');
 OUTPUT
--------
      1
(1 row)

dbadmin=> insert into test1 values (3, 'Dallas');
 OUTPUT
--------
      1
(1 row)
We can check the v_monitor.projection_storage system table to find the table's projections:

Code: Select all

dbadmin=> select projection_name, row_count, ros_count from v_monitor.projection_storage where anchor_table_name = 'test1';
 projection_name | row_count | ros_count
-----------------+-----------+-----------
 test1_node0001  |         3 |         3
 test1_node0002  |         3 |         3
 test1_node0003  |         3 |         3
(3 rows)
Then we can find the storage containers used by the "test1_node0001" projection via the v_monitor.storage_containers system table:

Code: Select all

dbadmin=> select storage_type, storage_oid from v_monitor.storage_containers where projection_name = 'test1_node0001';
 storage_type |    storage_oid
--------------+-------------------
 ROS          | 45035996276572379
 ROS          | 45035996276572389
 ROS          | 45035996276572399
(3 rows)
We see that there are three ROS containers for the single projection, one ROS for each partition.

If we go out the OS we'll also see that there are three separate .fdb data files, one for each ROS and therefore one for each partition:

Code: Select all

verticatst01:/usr/local/data/intersect/v_intersect_node0001_data $ ls -lrt ./379/45035996276572379
total 8
-rw------- 1 dbadmin dbadmin 48 Apr 13 07:17 45035996276572379_0.pidx
-rw------- 1 dbadmin dbadmin  3 Apr 13 07:17 45035996276572379_0.fdb
verticatst01:/usr/local/data/intersect/v_intersect_node0001_data $ ls -lrt ./389/45035996276572389
total 8
-rw------- 1 dbadmin dbadmin 48 Apr 13 07:17 45035996276572389_0.pidx
-rw------- 1 dbadmin dbadmin  3 Apr 13 07:17 45035996276572389_0.fdb
verticatst01:/usr/local/data/intersect/v_intersect_node0001_data $ ls -lrt ./399/45035996276572399
total 8
-rw------- 1 dbadmin dbadmin 48 Apr 13 07:17 45035996276572399_0.pidx
-rw------- 1 dbadmin dbadmin  3 Apr 13 07:17 45035996276572399_0.fdb
I hope this helps!
Jim Knicely

Image

Note: I work for Vertica. My views, opinions, and thoughts expressed here do not represent those of my employer.

vicky
Newbie
Newbie
Posts: 15
Joined: Fri Mar 22, 2013 10:54 am

Re: How is table data stored on disk?

Post by vicky » Tue Apr 23, 2013 8:04 am

When query is running, What happens if node fails? I mean regarding performance

User avatar
becky
Intermediate
Intermediate
Posts: 118
Joined: Sat Apr 28, 2012 11:37 am

Re: How is table data stored on disk?

Post by becky » Wed Dec 16, 2015 8:04 pm

Hi guys,

I upgraded to 7.2. I not

I thought that the only files asscoiated with data are the .fb and .pidx files. But I now see .gt files:

Code: Select all

[dbadmin@BK 277]$ ls -lrt
total 4
-rw------- 1 dbadmin verticadba 438 Dec 16 13:50 0267ea83948962386adc79978bf46adc00a0000000024745_0.gt
Anyone know what these .gt files are?
THANKS - BECKSTER

NorbertKrupa
GURU
GURU
Posts: 527
Joined: Tue Oct 22, 2013 9:36 pm
Location: Chicago, IL
Contact:

Re: How is table data stored on disk?

Post by NorbertKrupa » Wed Dec 16, 2015 8:35 pm

Coming in 7.3 there will be integration with Grand Theft Auto.
Checkout vertica.tips for more Vertica resources.

Post Reply

Return to “New to Vertica Database Administration”