Parse Json from Vertica VARCHAR column

Moderator: NorbertKrupa

Post Reply
_pj_
Newbie
Newbie
Posts: 6
Joined: Wed Mar 29, 2017 1:14 pm

Parse Json from Vertica VARCHAR column

Post by _pj_ » Wed Mar 29, 2017 1:33 pm

Hi,

We have the case that Json is pushed from outer system to our Vertica database into tables with VARCHAR(65000) columns. We do not have any possibility to change this process currently.

So to parse the Json-s we load those to our ETL tool, do the parsing there and then write back to the same database. This is quite time consuming.

We thought that maybe we could utilize the possibility to let Vertica do the parsing. But as we understand this can only be done with Flex tables, we do not have the possibility to change our regular columnar tables to flex tables.

So, now we tried to do this:
a. Create temporary flex table.
b. CONNECT TO VERTICA from and to the same instance.
c. COPY from the VARCHAR(65000) table to the temp flex table.
d. parse the Json and load the results to where needed.
But with this we hit the problem, that the CONNECT TO VERTICA seems to be possible to be done only with the dbadmin user of the connect-to database.

So we have come quite far from the initial problem with additional problems :)

Any good ideas how to get the Json in VARCHAR(65000) parsed as optimal as possible. Maybe using the ETL tool is still the best option in the end?

Br,
pj

scutter
Master
Master
Posts: 302
Joined: Tue Aug 07, 2012 2:15 am

Re: Parse Json from Vertica VARCHAR column

Post by scutter » Fri Mar 31, 2017 3:08 am

Hi PJ,

You can parse json stored in a column without using a flex table. Here are two examples:

First storing the json in a varbinary column as a vmap. This would be ideal if queries are executed repeatedly on the same json so that you only generate the vmap once.

create table t(vmap long varbinary(10000));
insert into t values (mapjsonextractor('{ "name" : "sharon"}'));
insert into t values (mapjsonextractor('{ "name" : "david"}'));
commit;
select * from t;
select maplookup(vmap, 'name') from t;


You can also parse json stored as a varchar and generate the vmap on the fly.

create table t2(json varchar(10000));
insert into t2 values ('{ "name" : "sharon"}');
insert into t2 values ('{ "name" : "david"}');
commit;
select * from t2;
select maplookup(mapjsonextractor(json), 'name') from t2;


This is buried in the MAPJSONEXTRACTOR documentation - the coljson example.

https://my.vertica.com/docs/8.0.x/HTML/ ... RACTOR.htm


—Sharon
Sharon Cutter
Vertica Consultant, Zazz Technologies LLC

_pj_
Newbie
Newbie
Posts: 6
Joined: Wed Mar 29, 2017 1:14 pm

Re: Parse Json from Vertica VARCHAR column

Post by _pj_ » Mon Apr 10, 2017 1:52 pm

Hi, Sharon

Great reply, thank you.

Seems to work from the functional point of view, from performance point of view it is still better to use ETL tool for the parsing as it seems to do it a lot faster. Not sure, but assume that it is related to the fact that we have the Json in Varchar(65000) column and Vertica just does not handle good such long defined strings (even though the Json text in it rarely exceeds 1000 characters). The ETL tool just sees those as strings and manages to out-perform Vertica in this parsing case.

Br,
_pj_

Harshitha
Newbie
Newbie
Posts: 1
Joined: Thu Jun 29, 2017 7:31 pm

Re: Parse Json from Vertica VARCHAR column

Post by Harshitha » Thu Jun 29, 2017 7:44 pm

Hi Sharon,

Thank you so much . By using this I am able to load json column into another table.Now I am facing one issue with bad json file.
Is there a way to have mapjsonextractor keep going after it hits an error in the input file?
Please let me know how can I skip the input file errors.

I appreciate any help here.

thanks
HK

Post Reply

Return to “Vertica Data Load”