jsonb datatype, a specialised representation of the JSON data, allowing PostgreSQL to be competitive in managing the "lingua franca" of the moment for the exchange of data via web services. It is useful to perform a number of tests to verify its actual performance. In this post I have do some test for using jsonb datatype in PostgreSQL, hardware information: iMac 14.2, 3.4 GHz Intel Core i5 CPU, 8 GB 1600 MHz DDR3 RAM, 7200 RPM HDD.
I have create 3 tables: user_ini
base info data (2 000 000 rows), tbl_user_json
with json
type column (2 000 000 rows), tbl_user_jsonb
with jsonb
type column (2 000 000 rows).
Connect database and create benchmark data
$ psql -d postgres psql (9.5.0) Type "help" for help. postgres=# create table user_ini(id int4 ,user_id int8, user_name character varying(64),create_time timestamp(6) with time zone default clock_timestamp()); CREATE TABLE postgres=# insert into user_ini(id,user_id,user_name) select r,round(random()*2000000), r || '_test' from generate_series(1,2000000) as r; INSERT 0 2000000
Generate json data
postgres=# create table tbl_user_json(id serial, user_info json); CREATE TABLE postgres=# insert into tbl_user_json(user_info) select row_to_json(user_ini) from user_ini; INSERT 0 2000000
Generate jsonb data
postgres=# create table tbl_user_jsonb(id serial, user_info jsonb); CREATE TABLE postgres=# insert into tbl_user_jsonb(user_info) select row_to_json(user_ini)::jsonb from user_ini; INSERT 0 2000000
Get table size
postgres=# \dt+ tbl_user_json List of relations Schema | Name | Type | Owner | Size | Description --------+---------------+-------+-------+--------+------------- public | tbl_user_json | table | xuri | 274 MB | (1 row) postgres=# \dt+ tbl_user_jsonb List of relations Schema | Name | Type | Owner | Size | Description --------+----------------+-------+-------+--------+------------- public | tbl_user_jsonb | table | xuri | 332 MB | (1 row)
Query rows
postgres=# select * from tbl_user_jsonb limit 3; id | user_info ----+--------------------------------------------------------------------------------------------------------- 1 | {"id": 1, "user_id": 1961705, "user_name": "1_test", "create_time": "2016-02-04T11:46:47.434434+08:00"} 2 | {"id": 2, "user_id": 1977164, "user_name": "2_test", "create_time": "2016-02-04T11:46:47.434508+08:00"} 3 | {"id": 3, "user_id": 1731350, "user_name": "3_test", "create_time": "2016-02-04T11:46:47.43451+08:00"} (3 rows)
jsonb key and value query test
Query user_info
field from tbl_user_jsonb
table:
postgres=# select * from tbl_user_jsonb where user_info->>'user_name'= '1_test'; id | user_info ----+--------------------------------------------------------------------------------------------------------- 1 | {"id": 1, "user_id": 1961705, "user_name": "1_test", "create_time": "2016-02-04T11:46:47.434434+08:00"} (1 row)
Query explain analyze without index:
postgres=# explain analyze select * from tbl_user_jsonb where user_info->>'user_name'= '1_test'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Seq Scan on tbl_user_jsonb (cost=0.00..72545.09 rows=10000 width=141) (actual time=0.012..359.949 rows=1 loops=1) Filter: ((user_info ->> 'user_name'::text) = '1_test'::text) Rows Removed by Filter: 1999999 Planning time: 0.033 ms Execution time: 359.966 ms (5 rows)
Create index for tbl_user_jsonb
field in user_info
table:
postgres=# create index idx_gin_user_infob_user_name on tbl_user_jsonb using btree ((user_info ->> 'user_name')); CREATE INDEX
Query plan, just use 0.057 ms:
postgres=# explain analyze select * from tbl_user_jsonb where user_info->>'user_name'= '1_test'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on tbl_user_jsonb (cost=233.93..23867.07 rows=10000 width=141) (actual time=0.037..0.037 rows=1 loops=1) Recheck Cond: ((user_info ->> 'user_name'::text) = '1_test'::text) Heap Blocks: exact=1 -> Bitmap Index Scan on idx_gin_user_infob_user_name (cost=0.00..231.43 rows=10000 width=0) (actual time=0.034..0.034 rows=1 loops=1) Index Cond: ((user_info ->> 'user_name'::text) = '1_test'::text) Planning time: 0.127 ms Execution time: 0.057 ms (7 rows)
Query by user_id
in user_info
table without index:
postgres=# explain analyze select * from tbl_user_jsonb where user_info->>'user_id'= '1'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Seq Scan on tbl_user_jsonb (cost=0.00..72545.00 rows=10000 width=141) (actual time=459.846..486.119 rows=1 loops=1) Filter: ((user_info ->> 'user_id'::text) = '1'::text) Rows Removed by Filter: 1999999 Planning time: 0.037 ms Execution time: 486.131 ms (5 rows)
GIN index
Before use GIN index, drop index idx_gin_user_infob_user_name
:
postgres=# drop index idx_gin_user_infob_user_name; DROP INDEX
Create GIN index cost 202147.862 ms:
postgres=# create index idx_tbl_user_jsonb_user_Info on tbl_user_jsonb using gin (user_Info); CREATE INDEX postgres=# \di+ idx_tbl_user_jsonb_user_Info List of relations Schema | Name | Type | Owner | Table | Size | Description --------+------------------------------+-------+-------+----------------+--------+------------- public | idx_tbl_user_jsonb_user_info | index | xuri | tbl_user_jsonb | 445 MB | (1 row)
Query by key and value with index:
postgres=# explain analyze select * from tbl_user_jsonb where user_info @> '{"user_id": 1017031}'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on tbl_user_jsonb (cost=59.50..6647.26 rows=2000 width=141) (actual time=56.129..56.145 rows=3 loops=1) Recheck Cond: (user_info @> '{"user_id": 1017031}'::jsonb) Rows Removed by Index Recheck: 1 Heap Blocks: exact=4 -> Bitmap Index Scan on idx_tbl_user_jsonb_user_info (cost=0.00..59.00 rows=2000 width=0) (actual time=56.098..56.098 rows=4 loops=1) Index Cond: (user_info @> '{"user_id": 1017031}'::jsonb) Planning time: 28.148 ms Execution time: 56.183 ms (8 rows)
Query by key and value without index:
postgres=# explain analyze select * from tbl_user_jsonb where user_info->>'user_name' ='6_test'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Seq Scan on tbl_user_jsonb (cost=0.00..72545.00 rows=10000 width=141) (actual time=0.028..423.282 rows=1 loops=1) Filter: ((user_info ->> 'user_name'::text) = '6_test'::text) Rows Removed by Filter: 1999999 Planning time: 0.036 ms Execution time: 423.295 ms (5 rows)
Drop GIN index idx_tbl_user_jsonb_user_Info
:
postgres=# drop index idx_tbl_user_jsonb_user_Info; DROP INDEX
Create new index idx_gin_user_info_user_name
and query explain analyze, so fast:
postgres=# create index idx_gin_user_info_user_name on tbl_user_jsonb using gin((user_info -> 'user_name')); CREATE INDEX postgres=# explain analyze select * from tbl_user_jsonb where user_info->'user_name' ?'6_test'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on tbl_user_jsonb (cost=35.50..6628.26 rows=2000 width=141) (actual time=0.019..0.019 rows=1 loops=1) Recheck Cond: ((user_info -> 'user_name'::text) ? '6_test'::text) Heap Blocks: exact=1 -> Bitmap Index Scan on idx_gin_user_info_user_name (cost=0.00..35.00 rows=2000 width=0) (actual time=0.009..0.009 rows=1 loops=1) Index Cond: ((user_info -> 'user_name'::text) ? '6_test'::text) Planning time: 0.121 ms Execution time: 0.035 ms (7 rows)
json and jsonb query benchmark
Before compare json and jsonb, drop GIN index idx_gin_user_info_user_name
:
postgres=# drop index idx_gin_user_info_user_name; DROP INDEX
Create function index for id
field on tbl_user_json
and tbl_user_jsonb
table:
postgres=# create index idx_gin_user_info_id on tbl_user_json using btree (((user_info ->> 'id')::integer)); CREATE INDEX postgres=# create index idx_gin_user_infob_id on tbl_user_jsonb using btree (((user_info ->> 'id')::integer)); CREATE INDEX
Because ->>
return text
type value, and this value will be used for compare with id
field, so we need create function index.
json query
postgres=# explain analyze select id,user_info->'id',user_info->'user_name' from tbl_user_json where (user_info->>'id')::int4 > '1' and (user_info->>'id')::int4 < '10000'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on tbl_user_json (cost=214.93..22443.74 rows=10000 width=111) (actual time=16.235..32.507 rows=9998 loops=1) Recheck Cond: ((((user_info ->> 'id'::text))::integer > 1) AND (((user_info ->> 'id'::text))::integer < 10000)) Heap Blocks: exact=172 -> Bitmap Index Scan on idx_gin_user_info_id (cost=0.00..212.43 rows=10000 width=0) (actual time=16.179..16.179 rows=9998 loops=1) Index Cond: ((((user_info ->> 'id'::text))::integer > 1) AND (((user_info ->> 'id'::text))::integer < 10000)) Planning time: 6.588 ms Execution time: 32.929 ms (7 rows)
jsonb query
postgres=# explain analyze select id,user_info->'id',user_info->'user_name' from tbl_user_jsonb where (user_info->>'id')::int4 > '1' and (user_info->>'id')::int4 < '10000'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on tbl_user_jsonb (cost=214.93..24048.07 rows=10000 width=141) (actual time=16.194..20.925 rows=9998 loops=1) Recheck Cond: ((((user_info ->> 'id'::text))::integer > 1) AND (((user_info ->> 'id'::text))::integer < 10000)) Heap Blocks: exact=205 -> Bitmap Index Scan on idx_gin_user_infob_id (cost=0.00..212.43 rows=10000 width=0) (actual time=16.164..16.164 rows=9998 loops=1) Index Cond: ((((user_info ->> 'id'::text))::integer > 1) AND (((user_info ->> 'id'::text))::integer < 10000)) Planning time: 0.136 ms Executio