Using jsonb on PostgreSQL

jsonb datatype, a specialised representation of the JSON data, allowing PostgreSQL to be competitive in managing the "lingua franca" of the moment for the exchange of data via web services. It is useful to perform a number of tests to verify its actual performance. In this post I have do some test for using jsonb datatype in PostgreSQL, hardware information: iMac 14.2, 3.4 GHz Intel Core i5 CPU, 8 GB 1600 MHz DDR3 RAM, 7200 RPM HDD.

I have create 3 tables: user_ini base info data (2 000 000 rows), tbl_user_json with json type column (2 000 000 rows), tbl_user_jsonb with jsonb type column (2 000 000 rows).

Connect database and create benchmark data

$ psql -d postgres
psql (9.5.0)
Type "help" for help.

postgres=# create table user_ini(id int4 ,user_id int8, user_name character varying(64),create_time timestamp(6) with time zone default clock_timestamp());
CREATE TABLE

postgres=# insert into user_ini(id,user_id,user_name) select r,round(random()*2000000), r || '_test' from generate_series(1,2000000) as r;
INSERT 0 2000000

Generate json data

postgres=# create table tbl_user_json(id serial, user_info json);
CREATE TABLE

postgres=# insert into tbl_user_json(user_info) select row_to_json(user_ini) from user_ini;
INSERT 0 2000000

Generate jsonb data

postgres=# create table tbl_user_jsonb(id serial, user_info jsonb);
CREATE TABLE

postgres=# insert into tbl_user_jsonb(user_info) select row_to_json(user_ini)::jsonb from user_ini;    
INSERT 0 2000000

Get table size

postgres=# \dt+ tbl_user_json
                       List of relations
 Schema |     Name      | Type  | Owner |  Size  | Description 
--------+---------------+-------+-------+--------+-------------
 public | tbl_user_json | table | xuri  | 274 MB | 
(1 row)

postgres=# \dt+ tbl_user_jsonb
                       List of relations
 Schema |      Name      | Type  | Owner |  Size  | Description 
--------+----------------+-------+-------+--------+-------------
 public | tbl_user_jsonb | table | xuri  | 332 MB | 
(1 row)

Query rows

postgres=# select * from tbl_user_jsonb limit 3;
 id |                                                user_info                                                
----+---------------------------------------------------------------------------------------------------------
  1 | {"id": 1, "user_id": 1961705, "user_name": "1_test", "create_time": "2016-02-04T11:46:47.434434+08:00"}
  2 | {"id": 2, "user_id": 1977164, "user_name": "2_test", "create_time": "2016-02-04T11:46:47.434508+08:00"}
  3 | {"id": 3, "user_id": 1731350, "user_name": "3_test", "create_time": "2016-02-04T11:46:47.43451+08:00"}
(3 rows)

jsonb key and value query test

Query user_info field from tbl_user_jsonb table:

postgres=# select * from tbl_user_jsonb where user_info->>'user_name'= '1_test';
 id |                                                user_info                                                
----+---------------------------------------------------------------------------------------------------------
  1 | {"id": 1, "user_id": 1961705, "user_name": "1_test", "create_time": "2016-02-04T11:46:47.434434+08:00"}
(1 row)

Query explain analyze without index:

postgres=# explain analyze select * from tbl_user_jsonb where user_info->>'user_name'= '1_test';
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Seq Scan on tbl_user_jsonb  (cost=0.00..72545.09 rows=10000 width=141) (actual time=0.012..359.949 rows=1 loops=1)
   Filter: ((user_info ->> 'user_name'::text) = '1_test'::text)
   Rows Removed by Filter: 1999999
 Planning time: 0.033 ms
 Execution time: 359.966 ms
(5 rows)

Create index for tbl_user_jsonb field in user_info table:

postgres=# create index idx_gin_user_infob_user_name on tbl_user_jsonb using btree ((user_info ->> 'user_name'));
CREATE INDEX

Query plan, just use 0.057 ms:

postgres=# explain analyze select * from tbl_user_jsonb where user_info->>'user_name'= '1_test';
                                                                QUERY PLAN                                                                 
-------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tbl_user_jsonb  (cost=233.93..23867.07 rows=10000 width=141) (actual time=0.037..0.037 rows=1 loops=1)
   Recheck Cond: ((user_info ->> 'user_name'::text) = '1_test'::text)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on idx_gin_user_infob_user_name  (cost=0.00..231.43 rows=10000 width=0) (actual time=0.034..0.034 rows=1 loops=1)
         Index Cond: ((user_info ->> 'user_name'::text) = '1_test'::text)
 Planning time: 0.127 ms
 Execution time: 0.057 ms
(7 rows)

Query by user_id in user_info table without index:

postgres=# explain analyze select * from tbl_user_jsonb where user_info->>'user_id'= '1';
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on tbl_user_jsonb  (cost=0.00..72545.00 rows=10000 width=141) (actual time=459.846..486.119 rows=1 loops=1)
   Filter: ((user_info ->> 'user_id'::text) = '1'::text)
   Rows Removed by Filter: 1999999
 Planning time: 0.037 ms
 Execution time: 486.131 ms
(5 rows)

GIN index

Before use GIN index, drop index idx_gin_user_infob_user_name:

postgres=# drop index idx_gin_user_infob_user_name;
DROP INDEX

Create GIN index cost 202147.862 ms:

postgres=# create index idx_tbl_user_jsonb_user_Info on tbl_user_jsonb using gin (user_Info);
CREATE INDEX

postgres=# \di+ idx_tbl_user_jsonb_user_Info
                                       List of relations
 Schema |             Name             | Type  | Owner |     Table      |  Size  | Description 
--------+------------------------------+-------+-------+----------------+--------+-------------
 public | idx_tbl_user_jsonb_user_info | index | xuri  | tbl_user_jsonb | 445 MB | 
(1 row)

Query by key and value with index:

postgres=# explain analyze select * from tbl_user_jsonb where user_info @> '{"user_id": 1017031}';
                                                                QUERY PLAN                                                                 
-------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tbl_user_jsonb  (cost=59.50..6647.26 rows=2000 width=141) (actual time=56.129..56.145 rows=3 loops=1)
   Recheck Cond: (user_info @> '{"user_id": 1017031}'::jsonb)
   Rows Removed by Index Recheck: 1
   Heap Blocks: exact=4
   ->  Bitmap Index Scan on idx_tbl_user_jsonb_user_info  (cost=0.00..59.00 rows=2000 width=0) (actual time=56.098..56.098 rows=4 loops=1)
         Index Cond: (user_info @> '{"user_id": 1017031}'::jsonb)
 Planning time: 28.148 ms
 Execution time: 56.183 ms
(8 rows)

Query by key and value without index:

postgres=# explain analyze select * from tbl_user_jsonb where user_info->>'user_name' ='6_test';
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
 Seq Scan on tbl_user_jsonb  (cost=0.00..72545.00 rows=10000 width=141) (actual time=0.028..423.282 rows=1 loops=1)
   Filter: ((user_info ->> 'user_name'::text) = '6_test'::text)
   Rows Removed by Filter: 1999999
 Planning time: 0.036 ms
 Execution time: 423.295 ms
(5 rows)

Drop GIN index idx_tbl_user_jsonb_user_Info:

postgres=# drop index idx_tbl_user_jsonb_user_Info;
DROP INDEX

Create new index idx_gin_user_info_user_name and query explain analyze, so fast:

postgres=# create index idx_gin_user_info_user_name on tbl_user_jsonb using gin((user_info -> 'user_name'));
CREATE INDEX

postgres=# explain analyze select * from tbl_user_jsonb where user_info->'user_name' ?'6_test';
                                                               QUERY PLAN                                                               
----------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tbl_user_jsonb  (cost=35.50..6628.26 rows=2000 width=141) (actual time=0.019..0.019 rows=1 loops=1)
   Recheck Cond: ((user_info -> 'user_name'::text) ? '6_test'::text)
   Heap Blocks: exact=1
   ->  Bitmap Index Scan on idx_gin_user_info_user_name  (cost=0.00..35.00 rows=2000 width=0) (actual time=0.009..0.009 rows=1 loops=1)
         Index Cond: ((user_info -> 'user_name'::text) ? '6_test'::text)
 Planning time: 0.121 ms
 Execution time: 0.035 ms
(7 rows)

json and jsonb query benchmark

Before compare json and jsonb, drop GIN index idx_gin_user_info_user_name:

postgres=# drop index idx_gin_user_info_user_name;
DROP INDEX

Create function index for id field on tbl_user_json and tbl_user_jsonb table:

postgres=# create index idx_gin_user_info_id on tbl_user_json using btree (((user_info ->> 'id')::integer));
CREATE INDEX

postgres=# create index idx_gin_user_infob_id on tbl_user_jsonb using btree (((user_info ->> 'id')::integer));
CREATE INDEX

Because ->> return text type value, and this value will be used for compare with id field, so we need create function index.

json query

postgres=# explain analyze select id,user_info->'id',user_info->'user_name' from tbl_user_json where (user_info->>'id')::int4 > '1' and (user_info->>'id')::int4 < '10000';
                                                               QUERY PLAN                                                               
----------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tbl_user_json  (cost=214.93..22443.74 rows=10000 width=111) (actual time=16.235..32.507 rows=9998 loops=1)
   Recheck Cond: ((((user_info ->> 'id'::text))::integer > 1) AND (((user_info ->> 'id'::text))::integer < 10000))
   Heap Blocks: exact=172
   ->  Bitmap Index Scan on idx_gin_user_info_id  (cost=0.00..212.43 rows=10000 width=0) (actual time=16.179..16.179 rows=9998 loops=1)
         Index Cond: ((((user_info ->> 'id'::text))::integer > 1) AND (((user_info ->> 'id'::text))::integer < 10000))
 Planning time: 6.588 ms
 Execution time: 32.929 ms
(7 rows)

jsonb query

postgres=# explain analyze select id,user_info->'id',user_info->'user_name' from tbl_user_jsonb where (user_info->>'id')::int4 > '1' and (user_info->>'id')::int4 < '10000';
                                                               QUERY PLAN                                                                
-----------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tbl_user_jsonb  (cost=214.93..24048.07 rows=10000 width=141) (actual time=16.194..20.925 rows=9998 loops=1)
   Recheck Cond: ((((user_info ->> 'id'::text))::integer > 1) AND (((user_info ->> 'id'::text))::integer < 10000))
   Heap Blocks: exact=205
   ->  Bitmap Index Scan on idx_gin_user_infob_id  (cost=0.00..212.43 rows=10000 width=0) (actual time=16.164..16.164 rows=9998 loops=1)
         Index Cond: ((((user_info ->> 'id'::text))::integer > 1) AND (((user_info ->> 'id'::text))::integer < 10000))
 Planning time: 0.136 ms
 Executio
5.00 avg. rating (98% score) - 1 vote