Page 1 of 1

Parse Json from Vertica VARCHAR column

Posted: Wed Mar 29, 2017 1:33 pm
by _pj_

We have the case that Json is pushed from outer system to our Vertica database into tables with VARCHAR(65000) columns. We do not have any possibility to change this process currently.

So to parse the Json-s we load those to our ETL tool, do the parsing there and then write back to the same database. This is quite time consuming.

We thought that maybe we could utilize the possibility to let Vertica do the parsing. But as we understand this can only be done with Flex tables, we do not have the possibility to change our regular columnar tables to flex tables.

So, now we tried to do this:
a. Create temporary flex table.
b. CONNECT TO VERTICA from and to the same instance.
c. COPY from the VARCHAR(65000) table to the temp flex table.
d. parse the Json and load the results to where needed.
But with this we hit the problem, that the CONNECT TO VERTICA seems to be possible to be done only with the dbadmin user of the connect-to database.

So we have come quite far from the initial problem with additional problems :)

Any good ideas how to get the Json in VARCHAR(65000) parsed as optimal as possible. Maybe using the ETL tool is still the best option in the end?


Re: Parse Json from Vertica VARCHAR column

Posted: Fri Mar 31, 2017 3:08 am
by scutter
Hi PJ,

You can parse json stored in a column without using a flex table. Here are two examples:

First storing the json in a varbinary column as a vmap. This would be ideal if queries are executed repeatedly on the same json so that you only generate the vmap once.

create table t(vmap long varbinary(10000));
insert into t values (mapjsonextractor('{ "name" : "sharon"}'));
insert into t values (mapjsonextractor('{ "name" : "david"}'));
select * from t;
select maplookup(vmap, 'name') from t;

You can also parse json stored as a varchar and generate the vmap on the fly.

create table t2(json varchar(10000));
insert into t2 values ('{ "name" : "sharon"}');
insert into t2 values ('{ "name" : "david"}');
select * from t2;
select maplookup(mapjsonextractor(json), 'name') from t2;

This is buried in the MAPJSONEXTRACTOR documentation - the coljson example. ... RACTOR.htm


Re: Parse Json from Vertica VARCHAR column

Posted: Mon Apr 10, 2017 1:52 pm
by _pj_
Hi, Sharon

Great reply, thank you.

Seems to work from the functional point of view, from performance point of view it is still better to use ETL tool for the parsing as it seems to do it a lot faster. Not sure, but assume that it is related to the fact that we have the Json in Varchar(65000) column and Vertica just does not handle good such long defined strings (even though the Json text in it rarely exceeds 1000 characters). The ETL tool just sees those as strings and manages to out-perform Vertica in this parsing case.


Re: Parse Json from Vertica VARCHAR column

Posted: Thu Jun 29, 2017 7:44 pm
by Harshitha
Hi Sharon,

Thank you so much . By using this I am able to load json column into another table.Now I am facing one issue with bad json file.
Is there a way to have mapjsonextractor keep going after it hits an error in the input file?
Please let me know how can I skip the input file errors.

I appreciate any help here.