SlideShare a Scribd company logo
JSONB in PostgreSQL
Working with JSON in PostgreSQL vs. MongoDB
Dharshan Rangegowda
Founder, ScaleGrid.io | @dharshanrg
What is JSON?
● JSON stands for Javascript object
notation.
● Open standard format RFC 7159.
● Most popular format to store and
exchange documents.
Working with JSON in PostgreSQL vs. MongoDB
Why does PostgreSQL need to care about JSON?
• Schema flexibility
• Dealing with transient or changing columns.
• Nested objects
• Might not need to deserialize to query.
• Handling objects from other systems
• E.g. Stripe transaction
Working with JSON in PostgreSQL vs. MongoDB
PostgreSQL + JSON Timeline
Working with JSON in PostgreSQL vs. MongoDB
PostgreSQL JSON Support
• Wave 1: PostgreSQL 9.2 (2012) added support for the JSON datatype
• Text field with JSON validation
• No index support
• Wave 2: PostgreSQL 9.4 (2014) added support for JSONB datatype
• Binary data structure to store JSON
• Index support
Working with JSON in PostgreSQL vs. MongoDB
PostgreSQL JSON Support
• Wave 3: PostgreSQL 12 (2019) added support for SQL/JSON standard
• JSONPath support
• Powerful query and projection engine for JSON data
• Further improvements to JSONPath in PostgreSQL 13
• JSON roadmap
Working with JSON in PostgreSQL vs. MongoDB
JSON vs. JSONB
• JSONB is what you should be using (in most cases)
• However, there are some scenarios where JSON is useful:
• JSON preserves the original formatting (a.k.a whitespace)
• JSON preserves ordering of the keys
• JSON preserves duplicate keys
• JSON is faster to ingest vs. JSONB
Working with JSON in PostgreSQL vs. MongoDB
JSONB Anti Patterns
● What is the best way to use JSONB?
○ Do we even need columns any more?
○ Why not just use <int id, jsonb data>?
● JSONB has some high-level limitations you need to
be aware of:
○ Statistics collection
○ Storage bloat
● Commonly occurring fields should be stored as
columns.
○ Use JSONB for variable or intermittent columns.
Working with JSON in PostgreSQL vs. MongoDB
JSONB Anti Patterns
● PostgreSQL collect stats on column data distribution
○ Most common values (MCV)
○ Fraction of null values
○ Histogram of distribution
● No column statistics collected for JSONB
○ Query planner doesn’t have stats to make smart decisions
○ Could make wrong choice – cross join vs hash join etc
● More details in blog post - When To Avoid JSONB
In A PostgreSQL Schema
Working with JSON in PostgreSQL vs. MongoDB
JSONB Anti Patterns
● Storage bloat
○ Keys are stored in the data (Similar to MongoDB mmapv1)
○ Use smaller key names to reduce footprint
○ Relies on TOAST compression
○ Sample table with 1M rows (11GB of data)
○ PostgreSQL - 8.7 GB
○ MongoDB Snappy – 8GB, Zlib – 5.3 GB
Working with JSON in PostgreSQL vs. MongoDB
JSONB & TOAST
● If the size of your column exceeds the
TOAST_TUPLE_THRESHOLD (2KB default) data
could be moved to out of line storage - TOAST
● TOAST also provides compression (pglz)
○ Decent Compression
○ MongoDB WiredTiger snappy/zlib is potentially better
● To access the data it needs to be De’TOASTed
○ Could result in performance overhead
Working with JSON in PostgreSQL vs. MongoDB
JSONB Data Structures
Working with JSON in PostgreSQL vs. MongoDB
Images courtesy: https://erthalion.info/2017/12/21/advanced-json-benchmarks/
BSON Data Structures
Working with JSON in PostgreSQL vs. MongoDB
JSONB Operators
Working with JSON in PostgreSQL vs. MongoDB
Operator Description
->, ->> Get JSON object field by key
@>, <@ Does the left JSONB value contain the right JSONB path/value entries at
the top level?
?, ?!, ?& Does the string exist as a top-level key within the JSON value?
@@, @@> JSONPath operators
Full list of operators can be found in the docs – JSONB op table
JSONB Functions
• PostgreSQL provides a wide variety of functions to create and process
JSON data
• Creation functions
• Processing functions
Working with JSON in PostgreSQL vs. MongoDB
MongoDB Query language
• Query language based on JSON syntax
• db.books.find( {} ) , db.books.find( { publisher: "D" } )
• Array operators
• db.books.find( { tags: ["red", "blank"] } )
• AND and OR operators
• db.books.find( { $or: [ { publisher: "A" }, { criticrating: { $lt: 30 } } ] } )
Working with JSON in PostgreSQL vs. MongoDB
MongoDB Query language
• Query nested documents
• db.books.find( { "size.uom": "in" } )
• Query an Array of objects
• db.books.find( { 'instock.qty': { $lte: 20 } } ))
• Project fields to return from query
• db.books.find( {prints: 1}, { $or: [ { publisher: "A" }, { criticrating: { $lt: 30 } } ] } )
Working with JSON in PostgreSQL vs. MongoDB
JSONB Indexes
• JSONB provides a wide array of options to index your JSON data.
• We are going to dig into three types of indexes:
• GIN
• BTREE
• HASH
Working with JSON in PostgreSQL vs. MongoDB
JSONB Indexes : GIN
• GIN stands for “Generalized
Inverted Indexes”
• GIN supports two operator classes
• jsonb_ops
• ?, ?|, ?&, @>, @@, @?
• [Index each key and value]
• jsonb_pathops
• @>, @@, @?
• [Index only the values]
Copyright © ScaleGrid.io
JSON sample data
Fictional book database (apologies to any librarians in the audience):
Working with JSON in PostgreSQL vs. MongoDB
demo=# d+ books
Table "public.books"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
--------+------------------------+-----------+----------+---------+----------+--------------+-------------
id | integer | | not null | | plain | |
author | character varying(255) | | | | extended | |
isbn | character varying(25) | | | | extended | |
rating | integer | | | | plain | |
data | jsonb | | | | extended | |
Indexes:
"books_pkey" PRIMARY KEY, btree (id)
…..
JSON sample data
Working with JSON in PostgreSQL vs. MongoDB
demo=# select jsonb_pretty(data) from books where id = 1000021;
jsonb_pretty
------------------------------------
{ +
"tags": { +
"nk906585": { +
"ik844766": "iv364087"+
} +
}, +
"prints": [ +
{ +
"price": 100, +
"style": "hc" +
}, +
{ +
"price": 50, +
"style": "pb" +
} +
], +
"braille": false, +
"keywords": [ +
"abc", +
"kef", +
"keh" +
], +
"hardcover": true, +
"publisher": "nVxJVA8Bwx", +
"criticrating": 2 +
}
JSONB Indexes: GIN - ?
Find all books that are available in braille? Let’s create the GIN index on the ‘data’ JSONB column:
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX datagin ON books USING gin (data);
demo=# select * from books where data ? 'braille';
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
---------------------------------
------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
.....
demo=# explain analyze select * from books where data ? 'braille';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=12.75..1005.25 rows=1000 width=158) (actual time=0.033..0.039 rows=15 loops=1)
Recheck Cond: (data ? 'braille'::text)
Heap Blocks: exact=2
-> Bitmap Index Scan on datagin (cost=0.00..12.50 rows=1000 width=0) (actual time=0.022..0.022 rows=15 loops=1)
Index Cond: (data ? 'braille'::text)
Planning Time: 0.102 ms
Execution Time: 0.067 ms
(7 rows)
JSONB Indexes: GIN - ?
What if we wanted to find books that were in braille or in hardcover?
Working with JSON in PostgreSQL vs. MongoDB
demo=# explain analyze select * from books where data ?| array['braille','hardcover'];
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.029..0.035 rows=15 loops=1)
Recheck Cond: (data ?| '{braille,hardcover}'::text[])
Heap Blocks: exact=2
-> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.023..0.023 rows=15 loops=1)
Index Cond: (data ?| '{braille,hardcover}'::text[])
Planning Time: 0.138 ms
Execution Time: 0.057 ms
(7 rows)
JSONB Indexes: GIN
GIN index supports the “existence” operators only on “top level” keys. If the key is not at the top level, then
the index will not be used.
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data->'tags' ? 'nk455671';
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
---------------------------------
------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
685122 | GWfuvKfQ1PCe1IL | jnyhYYcF66 | 3 | {"tags": {"nk455671": {"ik615925": "iv253423"}}, "publisher": "b2NwVg7VY3", "criticrating": 0}
(2 rows)
demo=# explain analyze select * from books where data->'tags' ? 'nk455671';
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..38807.29 rows=1000 width=158) (actual time=0.018..270.641 rows=2 loops=1)
Filter: ((data -> 'tags'::text) ? 'nk455671'::text)
Rows Removed by Filter: 1000017
Planning Time: 0.078 ms
Execution Time: 270.728 ms
(5 rows)
JSONB Indexes: GIN
The way to check for existence in nested docs is to use “Expression indexes”. Let’s create an index on
data->tags:
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX datatagsgin ON books USING gin (data->'tags');
demo=# select * from books where data->'tags' ? 'nk455671';
id | author | isbn | rating | data
---------+-----------------+------------+--------+-----------------------------------------------------------------------------------------------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
685122 | GWfuvKfQ1PCe1IL | jnyhYYcF66 | 3 | {"tags": {"nk455671": {"ik615925": "iv253423"}}, "publisher": "b2NwVg7VY3", "criticrating": 0}
(2 rows)
demo=# explain analyze select * from books where data->'tags' ? 'nk455671';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=12.75..1007.75 rows=1000 width=158) (actual time=0.031..0.035 rows=2 loops=1)
Recheck Cond: ((data ->'tags'::text) ? 'nk455671'::text)
Heap Blocks: exact=2
-> Bitmap Index Scan on datatagsgin (cost=0.00..12.50 rows=1000 width=0) (actual time=0.021..0.021 rows=2 loops=1)
Index Cond: ((data ->'tags'::text) ? 'nk455671'::text)
Planning Time: 0.098 ms
Execution Time: 0.061 ms
(7 rows)
JSONB Indexes: GIN - @>
The “path” operator can be used for multi-level queries of your JSON data. Let’s use it similar to the ?
operator.
Working with JSON in PostgreSQL vs. MongoDB
select * from books where data @> '{"braille":true}'::jsonb;
demo=# explain analyze select * from books where data @> '{"braille":true}'::jsonb;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.040..0.048 rows=6 loops=1)
Recheck Cond: (data @> '{"braille": true}'::jsonb)
Rows Removed by Index Recheck: 9
Heap Blocks: exact=2
-> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.030..0.030 rows=15 loops=1)
Index Cond: (data @> '{"braille": true}'::jsonb)
Planning Time: 0.100 ms
Execution Time: 0.076 ms
(8 rows)
JSONB Indexes: GIN - @>
The "path" operator can be used for multi level queries of your JSON data.
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data @> '{"publisher":"XlekfkLOtL"}'::jsonb;
id | author | isbn | rating | data
-----+-----------------+------------+--------+-------------------------------------------------------------------------------------
346 | uD3QOvHfJdxq2ez | KiAaIRu8QE | 1 | {"tags": {"nk88": {"ik37": "iv161"}}, "publisher": "XlekfkLOtL", "criticrating": 3}
(1 row)
demo=# explain analyze select * from books where data @> '{"publisher":"XlekfkLOtL"}'::jsonb;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.491..0.492 rows=1 loops=1)
Recheck Cond: (data @> '{"publisher": "XlekfkLOtL"}'::jsonb)
Heap Blocks: exact=1
-> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.092..0.092 rows=1 loops=1)
Index Cond: (data @> '{"publisher": "XlekfkLOtL"}'::jsonb)
Planning Time: 0.090 ms
Execution Time: 0.523 ms
JSONB Indexes: GIN - @>
The JSON queries can be nested to many levels. You can also use the ># operation but GIN does not
support it.
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb;
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
---------------------------------
------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
(1 row)
JSONB Indexes: GIN - jsonb_pathops
GIN also supports a “pathops” option to reduce the size of the GIN index.
From the docs:
“The technical difference between a jsonb_ops and a jsonb_path_ops GIN index is that the former creates
independent index items for each key and value in the data, while the latter creates index items only for
each value in the data.”
On my small dataset of 1M books, you can see that the pathops GIN index is smaller – you should test
with your dataset to understand the savings.
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX dataginpathops ON books USING gin (data jsonb_path_ops);
public | dataginpathops | index | sgpostgres | books | 67 MB |
public | datatagsgin | index | sgpostgres | books | 84 MB |
JSONB Indexes: GIN - jsonb_pathops
Let’s rerun our query from before with the pathops index:
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb;
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
---------------------------------
------------------
1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false,
"publisher": "zSfZIAjGGs", "
criticrating": 4}
(1 row)
demo=# explain select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb;
QUERY PLAN
-----------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=12.75..1005.25 rows=1000 width=158)
Recheck Cond: (data @> '{"tags": {"nk455671": {"ik937456": "iv506075"}}}'::jsonb)
-> Bitmap Index Scan on dataginpathops (cost=0.00..12.50 rows=1000 width=0)
Index Cond: (data @> '{"tags": {"nk455671": {"ik937456": "iv506075"}}}'::jsonb)
(4 rows)
JSONB Indexes: GIN - jsonb_pathops
The “jsonb_pathops” option supports only the @> operator.
Smaller index but more limited scenarios.
The following queries below can no longer leverage the GIN index:
Working with JSON in PostgreSQL vs. MongoDB
select * from books where data ? 'tags'; => Sequential scan
select * from books where data @> '{"tags" :{}}'; => Sequential scan
select * from books where data @> '{"tags" :{"k7888":{}}}' => Sequential scan
JSONB Indexes: B-tree
• B-tree indexes are the most common index type in relational databases.
• If you index an entire JSONB column with a B-tree index, the only useful
operators are the comparison operators:
• =, <, <=, >, >=
• Can be used only for whole object comparisons.
• Very limited use case.
Working with JSON in PostgreSQL vs. MongoDB
JSONB Indexes: B-tree
• Use B-tree “Expression indexes”
• B-tree expression indexes can support the common comparison operators '=', '<', '>', '>=', '<=‘ (which
GIN doesn't support).
• Retrieve all books with a data->criticrating > 4.
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data->'criticrating' > 4;
ERROR: operator does not exist: jsonb >= integer
LINE 1: select * from books where data->'criticrating’ > 4;
^
HINT: No operator matches the given name and argument types. You might need to add explicit type casts.
#Lets cast JSONB to integer
demo=# select * from books where (data->'criticrating')::int4 > 4;
#If you are using a version prior to pg11 you need to query as text and then cast
demo=# select * from books where (data->>'criticrating')::int4 > 4;
JSONB Indexes: B-tree
For expression indexes, the index needs to be an exact match with the query expression:
Working with JSON in PostgreSQL vs. MongoDB
demo=# CREATE INDEX criticrating ON books USING BTREE (((data->'criticrating')::int4));
CREATE INDEX
demo=# explain analyze select * from books where (data->'criticrating')::int4 = 3;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Index Scan using criticrating on books (cost=0.42..4626.93 rows=5000 width=158) (actual time=0.069..70.221 rows=199883 loops=1)
Index Cond: (((data -> 'criticrating'::text))::integer = 3)
Planning Time: 0.103 ms
Execution Time: 79.019 ms
(4 rows)
demo=# explain analyze select * from books where (data->'criticrating')::int4 = 3;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Index Scan using criticrating on books (cost=0.42..4626.93 rows=5000 width=158) (actual time=0.069..70.221 rows=199883 loops=1)
Index Cond: (((data -> 'criticrating'::text))::integer = 3)
Planning Time: 0.103 ms
Execution Time: 79.019 ms
(4 rows)
1
From above we can see that the BTREE index is being used as expected.
JSONB Indexes: HASH
• If you are only interested in the "=" operator, then Hash indexes become interesting.
• Hash indexes tend to be smaller than B-tree indexes.
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX publisherhash ON books USING HASH ((data->'publisher'));
demo=# select * from books where data->'publisher' = 'XlekfkLOtL'
demo-# ;
id | author | isbn | rating | data
-----+-----------------+------------+--------+-------------------------------------------------------------------------------------
346 | uD3QOvHfJdxq2ez | KiAaIRu8QE | 1 | {"tags": {"nk88": {"ik37": "iv161"}}, "publisher": "XlekfkLOtL", "criticrating": 3}
(1 row)
demo=# explain analyze select * from books where data->'publisher' = 'XlekfkLOtL';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Index Scan using publisherhash on books (cost=0.00..2.02 rows=1 width=158) (actual time=0.016..0.017 rows=1 loops=1)
Index Cond: ((data -> 'publisher'::text) = 'XlekfkLOtL'::text)
Planning Time: 0.080 ms
Execution Time: 0.035 ms
(4 rows)
JSONB Indexes: GIN - Trigram
• PostgreSQL supports string matching using Trigram indexes.
• Trigrams are basically words broken up into sequences of 3 letters.
• We can search for any arbitrary regex (not just left anchored).
Working with JSON in PostgreSQL vs. MongoDB
CREATE EXTENSION pg_trgm;
CREATE INDEX publisher ON books USING GIN ((data->'publisher') gin_trgm_ops);
demo=# select * from books where data->'publisher' LIKE '%I0UB%';
id | author | isbn | rating | data
----+-----------------+------------+--------+---------------------------------------------------------------------------------
4 | KiEk3xjqvTpmZeS | EYqXO9Nwmm | 0 | {"tags": {"nk3": {"ik1": "iv1"}}, "publisher": "MI0UBqZJDt", "criticrating": 1}
(1 row)
demo=# explain analyze select * from books where data->'publisher' LIKE '%I0UB%';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=9.78..111.28 rows=100 width=158) (actual time=0.033..0.033 rows=1 loops=1)
Recheck Cond: ((data -&gt; 'publisher'::text) ~~ '%I0UB%'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on publisher (cost=0.00..9.75 rows=100 width=0) (actual time=0.025..0.025 rows=1 loops=1)
Index Cond: ((data -&gt; 'publisher'::text) ~~ '%I0UB%'::text)
Planning Time: 0.213 ms
Execution Time: 0.058 ms
(7 rows)
JSONB Indexes: GIN - Arrays
• GIN indexes are great for indexing arrays.
• Indexing and searching the keyword array.
Working with JSON in PostgreSQL vs. MongoDB
CREATE INDEX keywords ON books USING GIN ((data->'keywords') jsonb_path_ops);
demo=# select * from books where data->'keywords' @> '["abc", "keh"]'::jsonb;
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
--------------
1000003 | zEG406sLKQ2IU8O | viPdlu3DZm | 4 | {"tags": {"nk263020": {"ik203820": "iv817928"}}, "keywords": ["abc", "kef", "keh"], "publisher": "7NClevxuTM",
"criticrating": 2}
1000004 | GCe9NypHYKDH4rD | so6TQDYzZ3 | 4 | {"tags": {"nk780341": {"ik397357": "iv632731"}}, "keywords": ["abc", "kef", "keh"], "publisher": "fqaJuAdjP5",
"criticrating": 2}
(2 rows)
demo=# explain analyze select * from books where data->'keywords' @> '["abc", "keh"]'::jsonb;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=54.75..1049.75 rows=1000 width=158) (actual time=0.026..0.028 rows=2 loops=1)
Recheck Cond: ((data -> 'keywords'::text) @> '["abc", "keh"]'::jsonb)
Heap Blocks: exact=1
-> Bitmap Index Scan on keywords (cost=0.00..54.50 rows=1000 width=0) (actual time=0.014..0.014 rows=2 loops=1)
Index Cond: ((data -> 'keywords'::text) @&amp;amp;amp;amp;amp;gt; '["abc", "keh"]'::jsonb)
Planning Time: 0.131 ms
Execution Time: 0.063 ms
(7 rows)
SQL/JSON
• SQL standard added support for JSON – SQL Standard-2016 (SQL/JSON).
• SQL/JSON Data model
• JSONPath
• SQL/JSON functions
• With PG12 release, PostgreSQL has one of the best implementations of
SQL/JSON.
Working with JSON in PostgreSQL vs. MongoDB
SQL/JSON 2016
● A sequence of SQL/JSON items, each item can be (recursively) any of:
○ SQL/JSON scalar — non-null value of SQL types: Unicode character string, numeric, Boolean
or datetime.
○ SQL/JSON null, value that is distinct from any value of any SQL type (not the same as NULL).
○ SQL/JSON arrays, ordered list of zero or more SQL/JSON items — SQL/JSON element
○ SQL/JSON objects — unordered collections of zero or more SQL/JSON members.
■ (key, SQL/JSON item)
Working with JSON in PostgreSQL vs. MongoDB
JSONPath
Working with JSON in PostgreSQL vs. MongoDB
.key Returns an object member with the specified key
[*] Wildcard array element accessor that returns all array elements
.* Wildcard member accessor that returns the values of all members located at the top level of
the current object
.** Recursive wildcard member accessor that processes all levels of the JSON hierarchy of the
current object and returns all the member values, regardless of their nesting level
JSONPath allows you to specify an expression (using a syntax similar to the
property access notation in Javascript) to query or project your JSON data.
SQL/JSON Functions
● PG 12 provides several functions to use JSONPATH to query your JSON
data
○ jsonb_path_exists - Checks whether JSON path returns any item for the
specified JSON value
○ jsonb_path_match - Returns the result of JSON path predicate check for
the specified JSON value.
○ jsonb_path_query - Gets all JSON items returned by JSON path for the
specified JSON value.
Working with JSON in PostgreSQL vs. MongoDB
JSONPath
Finding books by publisher?
Working with JSON in PostgreSQL vs. MongoDB
demo=# select * from books where data @@ '$.publisher == "ktjKEZ1tvq"';
id | author | isbn | rating | data
---------+-----------------+------------+--------+---------------------------------------------------------------------------------------------------------------------
-------------
1000001 | 4RNsovI2haTgU7l | GwSoX67gLS | 2 | {"tags": {"nk542369": {"ik55240": "iv305393"}}, "keywords": ["abc", "def", "geh"], "publisher": "ktjKEZ1tvq",
"criticrating": 0}
(1 row)
demo=# explain analyze select * from books where data @@ '$.publisher == "ktjKEZ1tvq"';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on books (cost=21.75..1014.25 rows=1000 width=158) (actual time=0.123..0.124 rows=1 loops=1)
Recheck Cond: (data @@ '($."publisher" == "ktjKEZ1tvq")'::jsonpath)
Heap Blocks: exact=1
-&amp;amp;amp;amp;gt; Bitmap Index Scan on datagin (cost=0.00..21.50 rows=1000 width=0) (actual time=0.110..0.110 rows=1 loops=1)
Index Cond: (data @@ '($."publisher" == "ktjKEZ1tvq")'::jsonpath)
Planning Time: 0.137 ms
Execution Time: 0.194 ms
(7 rows)
JSONPath
Add a JSONPath filter:
Working with JSON in PostgreSQL vs. MongoDB
select * from books where jsonb_path_exists(data,'$.publisher ?(@ == "ktjKEZ1tvq")');
Build complicated filter expressions:
select * from books where jsonb_path_exists(data, '$.prints[*] ?(@.style=="hc" && @.price == 100)');
Index support for JSONPath is very limited.
demo=# explain analyze select * from books where jsonb_path_exists(data,'$.publisher ?(@ == "ktjKEZ1tvq")');
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Seq Scan on books (cost=0.00..36307.24 rows=333340 width=158) (actual time=0.019..480.268 rows=1 loops=1)
Filter: jsonb_path_exists(data, '$."publisher"?(@ == "ktjKEZ1tvq")'::jsonpath, '{}'::jsonb, false)
Rows Removed by Filter: 1000028
Planning Time: 0.095 ms
Execution Time: 480.348 ms
(5 rows)
JSONPath: Projection JSON
Select the last element of the array
Working with JSON in PostgreSQL vs. MongoDB
demo=# select jsonb_path_query(data, '$.prints[$.size()]') from books where id = 1000029;
jsonb_path_query
------------------------------
{"price": 50, "style": "pb"}
(1 row)
Select only the hardcover prints from the array
demo=# select jsonb_path_query(data, '$.prints[*] ?(@.style=="hc")') from books where id = 1000029;
jsonb_path_query
-------------------------------
{"price": 100, "style": "hc"}
(1 row)
We can also chain the filters
demo=# select jsonb_path_query(data, '$.prints[*] ?(@.style=="hc") ?(@.price ==100)') from books where id = 1000029;
jsonb_path_query
-------------------------------
{"price": 100, "style": "hc"}
(1 row)
Roadmap
● Improvements to the JSONPath implementation in PG13
● Future Roadmap
Working with JSON in PostgreSQL vs. MongoDB
Questions?
Working with JSON in PostgreSQL vs. MongoDB

More Related Content

What's hot (20)

Kernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uringKernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uring
Anne Nicolas
 
Mongo DB 성능최적화 전략
Mongo DB 성능최적화 전략Mongo DB 성능최적화 전략
Mongo DB 성능최적화 전략
Jin wook
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
Flink Forward
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL Administration
EDB
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxData
 
[Pgday.Seoul 2018] 이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
[Pgday.Seoul 2018]  이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG[Pgday.Seoul 2018]  이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
[Pgday.Seoul 2018] 이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
PgDay.Seoul
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
Puneet Behl
 
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep Internal
EXEM
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Altinity Ltd
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
Alexey Lesovsky
 
PostgreSQL Database Slides
PostgreSQL Database SlidesPostgreSQL Database Slides
PostgreSQL Database Slides
metsarin
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
PGConf APAC
 
Patroni: Kubernetes-native PostgreSQL companion
Patroni: Kubernetes-native PostgreSQL companionPatroni: Kubernetes-native PostgreSQL companion
Patroni: Kubernetes-native PostgreSQL companion
Alexander Kukushkin
 
PostgreSQL- An Introduction
PostgreSQL- An IntroductionPostgreSQL- An Introduction
PostgreSQL- An Introduction
Smita Prasad
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Altinity Ltd
 
Patroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyPatroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easy
Alexander Kukushkin
 
Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas Patil
Databricks
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Databricks
 
4.3 MySQL + PHP
4.3 MySQL + PHP4.3 MySQL + PHP
4.3 MySQL + PHP
Jalpesh Vasa
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
ScaleGrid.io
 
Kernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uringKernel Recipes 2019 - Faster IO through io_uring
Kernel Recipes 2019 - Faster IO through io_uring
Anne Nicolas
 
Mongo DB 성능최적화 전략
Mongo DB 성능최적화 전략Mongo DB 성능최적화 전략
Mongo DB 성능최적화 전략
Jin wook
 
Tuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptx
Flink Forward
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL Administration
EDB
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxData
 
[Pgday.Seoul 2018] 이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
[Pgday.Seoul 2018]  이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG[Pgday.Seoul 2018]  이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
[Pgday.Seoul 2018] 이기종 DB에서 PostgreSQL로의 Migration을 위한 DB2PG
PgDay.Seoul
 
MongoDB Performance Tuning
MongoDB Performance TuningMongoDB Performance Tuning
MongoDB Performance Tuning
Puneet Behl
 
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep Internal
EXEM
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Altinity Ltd
 
Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
Alexey Lesovsky
 
PostgreSQL Database Slides
PostgreSQL Database SlidesPostgreSQL Database Slides
PostgreSQL Database Slides
metsarin
 
PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs PostgreSQL WAL for DBAs
PostgreSQL WAL for DBAs
PGConf APAC
 
Patroni: Kubernetes-native PostgreSQL companion
Patroni: Kubernetes-native PostgreSQL companionPatroni: Kubernetes-native PostgreSQL companion
Patroni: Kubernetes-native PostgreSQL companion
Alexander Kukushkin
 
PostgreSQL- An Introduction
PostgreSQL- An IntroductionPostgreSQL- An Introduction
PostgreSQL- An Introduction
Smita Prasad
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Altinity Ltd
 
Patroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easyPatroni - HA PostgreSQL made easy
Patroni - HA PostgreSQL made easy
Alexander Kukushkin
 
Hive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas PatilHive Bucketing in Apache Spark with Tejas Patil
Hive Bucketing in Apache Spark with Tejas Patil
Databricks
 
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark JobsFine Tuning and Enhancing Performance of Apache Spark Jobs
Fine Tuning and Enhancing Performance of Apache Spark Jobs
Databricks
 
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
What’s the Best PostgreSQL High Availability Framework? PAF vs. repmgr vs. Pa...
ScaleGrid.io
 

Similar to Working with JSON Data in PostgreSQL vs. MongoDB (20)

Oh, that ubiquitous JSON !
Oh, that ubiquitous JSON !Oh, that ubiquitous JSON !
Oh, that ubiquitous JSON !
Alexander Korotkov
 
You, me, and jsonb
You, me, and jsonbYou, me, and jsonb
You, me, and jsonb
Arielle Simmons
 
Jsquery - the jsonb query language with GIN indexing support
Jsquery - the jsonb query language with GIN indexing supportJsquery - the jsonb query language with GIN indexing support
Jsquery - the jsonb query language with GIN indexing support
Alexander Korotkov
 
Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Postgres vs Mongo / Олег Бартунов (Postgres Professional)Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Ontico
 
The NoSQL Way in Postgres
The NoSQL Way in PostgresThe NoSQL Way in Postgres
The NoSQL Way in Postgres
EDB
 
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
Ontico
 
Power JSON with PostgreSQL
Power JSON with PostgreSQLPower JSON with PostgreSQL
Power JSON with PostgreSQL
EDB
 
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data typePostgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Jumping Bean
 
There is Javascript in my SQL
There is Javascript in my SQLThere is Javascript in my SQL
There is Javascript in my SQL
PGConf APAC
 
No sql way_in_pg
No sql way_in_pgNo sql way_in_pg
No sql way_in_pg
Vibhor Kumar
 
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
pgdayrussia
 
How to Use JSONB in PostgreSQL for Product Attributes Storage
How to Use JSONB in PostgreSQL for Product Attributes StorageHow to Use JSONB in PostgreSQL for Product Attributes Storage
How to Use JSONB in PostgreSQL for Product Attributes Storage
techprane
 
PostgreSQL 9.4 JSON Types and Operators
PostgreSQL 9.4 JSON Types and OperatorsPostgreSQL 9.4 JSON Types and Operators
PostgreSQL 9.4 JSON Types and Operators
Nicholas Kiraly
 
Do More with Postgres- NoSQL Applications for the Enterprise
Do More with Postgres- NoSQL Applications for the EnterpriseDo More with Postgres- NoSQL Applications for the Enterprise
Do More with Postgres- NoSQL Applications for the Enterprise
EDB
 
NoSQL on ACID - Meet Unstructured Postgres
NoSQL on ACID - Meet Unstructured PostgresNoSQL on ACID - Meet Unstructured Postgres
NoSQL on ACID - Meet Unstructured Postgres
EDB
 
Postgre(No)SQL - A JSON journey
Postgre(No)SQL - A JSON journeyPostgre(No)SQL - A JSON journey
Postgre(No)SQL - A JSON journey
Nicola Moretto
 
Mathias test
Mathias testMathias test
Mathias test
Mathias Stjernström
 
NoSQL для PostgreSQL: Jsquery — язык запросов
NoSQL для PostgreSQL: Jsquery — язык запросовNoSQL для PostgreSQL: Jsquery — язык запросов
NoSQL для PostgreSQL: Jsquery — язык запросов
CodeFest
 
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
Ryan B Harvey, CSDP, CSM
 
NoSQL on ACID: Meet Unstructured Postgres
NoSQL on ACID: Meet Unstructured PostgresNoSQL on ACID: Meet Unstructured Postgres
NoSQL on ACID: Meet Unstructured Postgres
EDB
 
Jsquery - the jsonb query language with GIN indexing support
Jsquery - the jsonb query language with GIN indexing supportJsquery - the jsonb query language with GIN indexing support
Jsquery - the jsonb query language with GIN indexing support
Alexander Korotkov
 
Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Postgres vs Mongo / Олег Бартунов (Postgres Professional)Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Postgres vs Mongo / Олег Бартунов (Postgres Professional)
Ontico
 
The NoSQL Way in Postgres
The NoSQL Way in PostgresThe NoSQL Way in Postgres
The NoSQL Way in Postgres
EDB
 
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
NoSQL Best Practices for PostgreSQL / Дмитрий Долгов (Mindojo)
Ontico
 
Power JSON with PostgreSQL
Power JSON with PostgreSQLPower JSON with PostgreSQL
Power JSON with PostgreSQL
EDB
 
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data typePostgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Postgrtesql as a NoSQL Document Store - The JSON/JSONB data type
Jumping Bean
 
There is Javascript in my SQL
There is Javascript in my SQLThere is Javascript in my SQL
There is Javascript in my SQL
PGConf APAC
 
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
PG Day'14 Russia, Работа со слабо-структурированными данными в PostgreSQL, Ол...
pgdayrussia
 
How to Use JSONB in PostgreSQL for Product Attributes Storage
How to Use JSONB in PostgreSQL for Product Attributes StorageHow to Use JSONB in PostgreSQL for Product Attributes Storage
How to Use JSONB in PostgreSQL for Product Attributes Storage
techprane
 
PostgreSQL 9.4 JSON Types and Operators
PostgreSQL 9.4 JSON Types and OperatorsPostgreSQL 9.4 JSON Types and Operators
PostgreSQL 9.4 JSON Types and Operators
Nicholas Kiraly
 
Do More with Postgres- NoSQL Applications for the Enterprise
Do More with Postgres- NoSQL Applications for the EnterpriseDo More with Postgres- NoSQL Applications for the Enterprise
Do More with Postgres- NoSQL Applications for the Enterprise
EDB
 
NoSQL on ACID - Meet Unstructured Postgres
NoSQL on ACID - Meet Unstructured PostgresNoSQL on ACID - Meet Unstructured Postgres
NoSQL on ACID - Meet Unstructured Postgres
EDB
 
Postgre(No)SQL - A JSON journey
Postgre(No)SQL - A JSON journeyPostgre(No)SQL - A JSON journey
Postgre(No)SQL - A JSON journey
Nicola Moretto
 
NoSQL для PostgreSQL: Jsquery — язык запросов
NoSQL для PostgreSQL: Jsquery — язык запросовNoSQL для PostgreSQL: Jsquery — язык запросов
NoSQL для PostgreSQL: Jsquery — язык запросов
CodeFest
 
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
JSON Processing in the Database using PostgreSQL 9.4 :: Data Wranglers DC :: ...
Ryan B Harvey, CSDP, CSM
 
NoSQL on ACID: Meet Unstructured Postgres
NoSQL on ACID: Meet Unstructured PostgresNoSQL on ACID: Meet Unstructured Postgres
NoSQL on ACID: Meet Unstructured Postgres
EDB
 
Ad

More from ScaleGrid.io (6)

Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory EngineRedis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
ScaleGrid.io
 
Introduction to Redis Data Structures: Sets
Introduction to Redis Data Structures: Sets Introduction to Redis Data Structures: Sets
Introduction to Redis Data Structures: Sets
ScaleGrid.io
 
Introduction to Redis Data Structures: Sorted Sets
Introduction to Redis Data Structures: Sorted SetsIntroduction to Redis Data Structures: Sorted Sets
Introduction to Redis Data Structures: Sorted Sets
ScaleGrid.io
 
Cassandra vs. MongoDB
Cassandra vs. MongoDBCassandra vs. MongoDB
Cassandra vs. MongoDB
ScaleGrid.io
 
Introduction to Redis Data Structures: Hashes
Introduction to Redis Data Structures: HashesIntroduction to Redis Data Structures: Hashes
Introduction to Redis Data Structures: Hashes
ScaleGrid.io
 
Introduction to Redis Data Structures
Introduction to Redis Data Structures Introduction to Redis Data Structures
Introduction to Redis Data Structures
ScaleGrid.io
 
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory EngineRedis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
Redis vs. MongoDB: Comparing In-Memory Databases with Percona Memory Engine
ScaleGrid.io
 
Introduction to Redis Data Structures: Sets
Introduction to Redis Data Structures: Sets Introduction to Redis Data Structures: Sets
Introduction to Redis Data Structures: Sets
ScaleGrid.io
 
Introduction to Redis Data Structures: Sorted Sets
Introduction to Redis Data Structures: Sorted SetsIntroduction to Redis Data Structures: Sorted Sets
Introduction to Redis Data Structures: Sorted Sets
ScaleGrid.io
 
Cassandra vs. MongoDB
Cassandra vs. MongoDBCassandra vs. MongoDB
Cassandra vs. MongoDB
ScaleGrid.io
 
Introduction to Redis Data Structures: Hashes
Introduction to Redis Data Structures: HashesIntroduction to Redis Data Structures: Hashes
Introduction to Redis Data Structures: Hashes
ScaleGrid.io
 
Introduction to Redis Data Structures
Introduction to Redis Data Structures Introduction to Redis Data Structures
Introduction to Redis Data Structures
ScaleGrid.io
 
Ad

Recently uploaded (20)

Dev Dives: System-to-system integration with UiPath API Workflows
Dev Dives: System-to-system integration with UiPath API WorkflowsDev Dives: System-to-system integration with UiPath API Workflows
Dev Dives: System-to-system integration with UiPath API Workflows
UiPathCommunity
 
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
Jasper Oosterveld
 
Cybersecurity Fundamentals: Apprentice - Palo Alto Certificate
Cybersecurity Fundamentals: Apprentice - Palo Alto CertificateCybersecurity Fundamentals: Apprentice - Palo Alto Certificate
Cybersecurity Fundamentals: Apprentice - Palo Alto Certificate
VICTOR MAESTRE RAMIREZ
 
End-to-end Assurance for SD-WAN & SASE with ThousandEyes
End-to-end Assurance for SD-WAN & SASE with ThousandEyesEnd-to-end Assurance for SD-WAN & SASE with ThousandEyes
End-to-end Assurance for SD-WAN & SASE with ThousandEyes
ThousandEyes
 
Contributing to WordPress With & Without Code.pptx
Contributing to WordPress With & Without Code.pptxContributing to WordPress With & Without Code.pptx
Contributing to WordPress With & Without Code.pptx
Patrick Lumumba
 
UiPath Community Berlin: Studio Tips & Tricks and UiPath Insights
UiPath Community Berlin: Studio Tips & Tricks and UiPath InsightsUiPath Community Berlin: Studio Tips & Tricks and UiPath Insights
UiPath Community Berlin: Studio Tips & Tricks and UiPath Insights
UiPathCommunity
 
Securiport - A Border Security Company
Securiport  -  A Border Security CompanySecuriport  -  A Border Security Company
Securiport - A Border Security Company
Securiport
 
Jira Administration Training – Day 1 : Introduction
Jira Administration Training – Day 1 : IntroductionJira Administration Training – Day 1 : Introduction
Jira Administration Training – Day 1 : Introduction
Ravi Teja
 
Cognitive Chasms - A Typology of GenAI Failure Failure Modes
Cognitive Chasms - A Typology of GenAI Failure Failure ModesCognitive Chasms - A Typology of GenAI Failure Failure Modes
Cognitive Chasms - A Typology of GenAI Failure Failure Modes
Dr. Tathagat Varma
 
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Lorenzo Miniero
 
Kubernetes Cloud Native Indonesia Meetup - May 2025
Kubernetes Cloud Native Indonesia Meetup - May 2025Kubernetes Cloud Native Indonesia Meetup - May 2025
Kubernetes Cloud Native Indonesia Meetup - May 2025
Prasta Maha
 
SDG 9000 Series: Unleashing multigigabit everywhere
SDG 9000 Series: Unleashing multigigabit everywhereSDG 9000 Series: Unleashing multigigabit everywhere
SDG 9000 Series: Unleashing multigigabit everywhere
Adtran
 
Cyber Security Legal Framework in Nepal.pptx
Cyber Security Legal Framework in Nepal.pptxCyber Security Legal Framework in Nepal.pptx
Cyber Security Legal Framework in Nepal.pptx
Ghimire B.R.
 
Introducing the OSA 3200 SP and OSA 3250 ePRC
Introducing the OSA 3200 SP and OSA 3250 ePRCIntroducing the OSA 3200 SP and OSA 3250 ePRC
Introducing the OSA 3200 SP and OSA 3250 ePRC
Adtran
 
AI Emotional Actors: “When Machines Learn to Feel and Perform"
AI Emotional Actors:  “When Machines Learn to Feel and Perform"AI Emotional Actors:  “When Machines Learn to Feel and Perform"
AI Emotional Actors: “When Machines Learn to Feel and Perform"
AkashKumar809858
 
Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025
Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025
Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025
Nikki Chapple
 
Grannie’s Journey to Using Healthcare AI Experiences
Grannie’s Journey to Using Healthcare AI ExperiencesGrannie’s Journey to Using Healthcare AI Experiences
Grannie’s Journey to Using Healthcare AI Experiences
Lauren Parr
 
Microsoft Build 2025 takeaways in one presentation
Microsoft Build 2025 takeaways in one presentationMicrosoft Build 2025 takeaways in one presentation
Microsoft Build 2025 takeaways in one presentation
Digitalmara
 
Co-Constructing Explanations for AI Systems using Provenance
Co-Constructing Explanations for AI Systems using ProvenanceCo-Constructing Explanations for AI Systems using Provenance
Co-Constructing Explanations for AI Systems using Provenance
Paul Groth
 
Data Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any ApplicationData Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any Application
Safe Software
 
Dev Dives: System-to-system integration with UiPath API Workflows
Dev Dives: System-to-system integration with UiPath API WorkflowsDev Dives: System-to-system integration with UiPath API Workflows
Dev Dives: System-to-system integration with UiPath API Workflows
UiPathCommunity
 
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
ELNL2025 - Unlocking the Power of Sensitivity Labels - A Comprehensive Guide....
Jasper Oosterveld
 
Cybersecurity Fundamentals: Apprentice - Palo Alto Certificate
Cybersecurity Fundamentals: Apprentice - Palo Alto CertificateCybersecurity Fundamentals: Apprentice - Palo Alto Certificate
Cybersecurity Fundamentals: Apprentice - Palo Alto Certificate
VICTOR MAESTRE RAMIREZ
 
End-to-end Assurance for SD-WAN & SASE with ThousandEyes
End-to-end Assurance for SD-WAN & SASE with ThousandEyesEnd-to-end Assurance for SD-WAN & SASE with ThousandEyes
End-to-end Assurance for SD-WAN & SASE with ThousandEyes
ThousandEyes
 
Contributing to WordPress With & Without Code.pptx
Contributing to WordPress With & Without Code.pptxContributing to WordPress With & Without Code.pptx
Contributing to WordPress With & Without Code.pptx
Patrick Lumumba
 
UiPath Community Berlin: Studio Tips & Tricks and UiPath Insights
UiPath Community Berlin: Studio Tips & Tricks and UiPath InsightsUiPath Community Berlin: Studio Tips & Tricks and UiPath Insights
UiPath Community Berlin: Studio Tips & Tricks and UiPath Insights
UiPathCommunity
 
Securiport - A Border Security Company
Securiport  -  A Border Security CompanySecuriport  -  A Border Security Company
Securiport - A Border Security Company
Securiport
 
Jira Administration Training – Day 1 : Introduction
Jira Administration Training – Day 1 : IntroductionJira Administration Training – Day 1 : Introduction
Jira Administration Training – Day 1 : Introduction
Ravi Teja
 
Cognitive Chasms - A Typology of GenAI Failure Failure Modes
Cognitive Chasms - A Typology of GenAI Failure Failure ModesCognitive Chasms - A Typology of GenAI Failure Failure Modes
Cognitive Chasms - A Typology of GenAI Failure Failure Modes
Dr. Tathagat Varma
 
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Multistream in SIP and NoSIP @ OpenSIPS Summit 2025
Lorenzo Miniero
 
Kubernetes Cloud Native Indonesia Meetup - May 2025
Kubernetes Cloud Native Indonesia Meetup - May 2025Kubernetes Cloud Native Indonesia Meetup - May 2025
Kubernetes Cloud Native Indonesia Meetup - May 2025
Prasta Maha
 
SDG 9000 Series: Unleashing multigigabit everywhere
SDG 9000 Series: Unleashing multigigabit everywhereSDG 9000 Series: Unleashing multigigabit everywhere
SDG 9000 Series: Unleashing multigigabit everywhere
Adtran
 
Cyber Security Legal Framework in Nepal.pptx
Cyber Security Legal Framework in Nepal.pptxCyber Security Legal Framework in Nepal.pptx
Cyber Security Legal Framework in Nepal.pptx
Ghimire B.R.
 
Introducing the OSA 3200 SP and OSA 3250 ePRC
Introducing the OSA 3200 SP and OSA 3250 ePRCIntroducing the OSA 3200 SP and OSA 3250 ePRC
Introducing the OSA 3200 SP and OSA 3250 ePRC
Adtran
 
AI Emotional Actors: “When Machines Learn to Feel and Perform"
AI Emotional Actors:  “When Machines Learn to Feel and Perform"AI Emotional Actors:  “When Machines Learn to Feel and Perform"
AI Emotional Actors: “When Machines Learn to Feel and Perform"
AkashKumar809858
 
Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025
Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025
Protecting Your Sensitive Data with Microsoft Purview - IRMS 2025
Nikki Chapple
 
Grannie’s Journey to Using Healthcare AI Experiences
Grannie’s Journey to Using Healthcare AI ExperiencesGrannie’s Journey to Using Healthcare AI Experiences
Grannie’s Journey to Using Healthcare AI Experiences
Lauren Parr
 
Microsoft Build 2025 takeaways in one presentation
Microsoft Build 2025 takeaways in one presentationMicrosoft Build 2025 takeaways in one presentation
Microsoft Build 2025 takeaways in one presentation
Digitalmara
 
Co-Constructing Explanations for AI Systems using Provenance
Co-Constructing Explanations for AI Systems using ProvenanceCo-Constructing Explanations for AI Systems using Provenance
Co-Constructing Explanations for AI Systems using Provenance
Paul Groth
 
Data Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any ApplicationData Virtualization: Bringing the Power of FME to Any Application
Data Virtualization: Bringing the Power of FME to Any Application
Safe Software
 

Working with JSON Data in PostgreSQL vs. MongoDB

  • 1. JSONB in PostgreSQL Working with JSON in PostgreSQL vs. MongoDB Dharshan Rangegowda Founder, ScaleGrid.io | @dharshanrg
  • 2. What is JSON? ● JSON stands for Javascript object notation. ● Open standard format RFC 7159. ● Most popular format to store and exchange documents. Working with JSON in PostgreSQL vs. MongoDB
  • 3. Why does PostgreSQL need to care about JSON? • Schema flexibility • Dealing with transient or changing columns. • Nested objects • Might not need to deserialize to query. • Handling objects from other systems • E.g. Stripe transaction Working with JSON in PostgreSQL vs. MongoDB
  • 4. PostgreSQL + JSON Timeline Working with JSON in PostgreSQL vs. MongoDB
  • 5. PostgreSQL JSON Support • Wave 1: PostgreSQL 9.2 (2012) added support for the JSON datatype • Text field with JSON validation • No index support • Wave 2: PostgreSQL 9.4 (2014) added support for JSONB datatype • Binary data structure to store JSON • Index support Working with JSON in PostgreSQL vs. MongoDB
  • 6. PostgreSQL JSON Support • Wave 3: PostgreSQL 12 (2019) added support for SQL/JSON standard • JSONPath support • Powerful query and projection engine for JSON data • Further improvements to JSONPath in PostgreSQL 13 • JSON roadmap Working with JSON in PostgreSQL vs. MongoDB
  • 7. JSON vs. JSONB • JSONB is what you should be using (in most cases) • However, there are some scenarios where JSON is useful: • JSON preserves the original formatting (a.k.a whitespace) • JSON preserves ordering of the keys • JSON preserves duplicate keys • JSON is faster to ingest vs. JSONB Working with JSON in PostgreSQL vs. MongoDB
  • 8. JSONB Anti Patterns ● What is the best way to use JSONB? ○ Do we even need columns any more? ○ Why not just use <int id, jsonb data>? ● JSONB has some high-level limitations you need to be aware of: ○ Statistics collection ○ Storage bloat ● Commonly occurring fields should be stored as columns. ○ Use JSONB for variable or intermittent columns. Working with JSON in PostgreSQL vs. MongoDB
  • 9. JSONB Anti Patterns ● PostgreSQL collect stats on column data distribution ○ Most common values (MCV) ○ Fraction of null values ○ Histogram of distribution ● No column statistics collected for JSONB ○ Query planner doesn’t have stats to make smart decisions ○ Could make wrong choice – cross join vs hash join etc ● More details in blog post - When To Avoid JSONB In A PostgreSQL Schema Working with JSON in PostgreSQL vs. MongoDB
  • 10. JSONB Anti Patterns ● Storage bloat ○ Keys are stored in the data (Similar to MongoDB mmapv1) ○ Use smaller key names to reduce footprint ○ Relies on TOAST compression ○ Sample table with 1M rows (11GB of data) ○ PostgreSQL - 8.7 GB ○ MongoDB Snappy – 8GB, Zlib – 5.3 GB Working with JSON in PostgreSQL vs. MongoDB
  • 11. JSONB & TOAST ● If the size of your column exceeds the TOAST_TUPLE_THRESHOLD (2KB default) data could be moved to out of line storage - TOAST ● TOAST also provides compression (pglz) ○ Decent Compression ○ MongoDB WiredTiger snappy/zlib is potentially better ● To access the data it needs to be De’TOASTed ○ Could result in performance overhead Working with JSON in PostgreSQL vs. MongoDB
  • 12. JSONB Data Structures Working with JSON in PostgreSQL vs. MongoDB Images courtesy: https://erthalion.info/2017/12/21/advanced-json-benchmarks/
  • 13. BSON Data Structures Working with JSON in PostgreSQL vs. MongoDB
  • 14. JSONB Operators Working with JSON in PostgreSQL vs. MongoDB Operator Description ->, ->> Get JSON object field by key @>, <@ Does the left JSONB value contain the right JSONB path/value entries at the top level? ?, ?!, ?& Does the string exist as a top-level key within the JSON value? @@, @@> JSONPath operators Full list of operators can be found in the docs – JSONB op table
  • 15. JSONB Functions • PostgreSQL provides a wide variety of functions to create and process JSON data • Creation functions • Processing functions Working with JSON in PostgreSQL vs. MongoDB
  • 16. MongoDB Query language • Query language based on JSON syntax • db.books.find( {} ) , db.books.find( { publisher: "D" } ) • Array operators • db.books.find( { tags: ["red", "blank"] } ) • AND and OR operators • db.books.find( { $or: [ { publisher: "A" }, { criticrating: { $lt: 30 } } ] } ) Working with JSON in PostgreSQL vs. MongoDB
  • 17. MongoDB Query language • Query nested documents • db.books.find( { "size.uom": "in" } ) • Query an Array of objects • db.books.find( { 'instock.qty': { $lte: 20 } } )) • Project fields to return from query • db.books.find( {prints: 1}, { $or: [ { publisher: "A" }, { criticrating: { $lt: 30 } } ] } ) Working with JSON in PostgreSQL vs. MongoDB
  • 18. JSONB Indexes • JSONB provides a wide array of options to index your JSON data. • We are going to dig into three types of indexes: • GIN • BTREE • HASH Working with JSON in PostgreSQL vs. MongoDB
  • 19. JSONB Indexes : GIN • GIN stands for “Generalized Inverted Indexes” • GIN supports two operator classes • jsonb_ops • ?, ?|, ?&, @>, @@, @? • [Index each key and value] • jsonb_pathops • @>, @@, @? • [Index only the values] Copyright © ScaleGrid.io
  • 20. JSON sample data Fictional book database (apologies to any librarians in the audience): Working with JSON in PostgreSQL vs. MongoDB demo=# d+ books Table "public.books" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+------------------------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | author | character varying(255) | | | | extended | | isbn | character varying(25) | | | | extended | | rating | integer | | | | plain | | data | jsonb | | | | extended | | Indexes: "books_pkey" PRIMARY KEY, btree (id) …..
  • 21. JSON sample data Working with JSON in PostgreSQL vs. MongoDB demo=# select jsonb_pretty(data) from books where id = 1000021; jsonb_pretty ------------------------------------ { + "tags": { + "nk906585": { + "ik844766": "iv364087"+ } + }, + "prints": [ + { + "price": 100, + "style": "hc" + }, + { + "price": 50, + "style": "pb" + } + ], + "braille": false, + "keywords": [ + "abc", + "kef", + "keh" + ], + "hardcover": true, + "publisher": "nVxJVA8Bwx", + "criticrating": 2 + }
  • 22. JSONB Indexes: GIN - ? Find all books that are available in braille? Let’s create the GIN index on the ‘data’ JSONB column: Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX datagin ON books USING gin (data); demo=# select * from books where data ? 'braille'; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- --------------------------------- ------------------ 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} ..... demo=# explain analyze select * from books where data ? 'braille'; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=12.75..1005.25 rows=1000 width=158) (actual time=0.033..0.039 rows=15 loops=1) Recheck Cond: (data ? 'braille'::text) Heap Blocks: exact=2 -> Bitmap Index Scan on datagin (cost=0.00..12.50 rows=1000 width=0) (actual time=0.022..0.022 rows=15 loops=1) Index Cond: (data ? 'braille'::text) Planning Time: 0.102 ms Execution Time: 0.067 ms (7 rows)
  • 23. JSONB Indexes: GIN - ? What if we wanted to find books that were in braille or in hardcover? Working with JSON in PostgreSQL vs. MongoDB demo=# explain analyze select * from books where data ?| array['braille','hardcover']; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.029..0.035 rows=15 loops=1) Recheck Cond: (data ?| '{braille,hardcover}'::text[]) Heap Blocks: exact=2 -> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.023..0.023 rows=15 loops=1) Index Cond: (data ?| '{braille,hardcover}'::text[]) Planning Time: 0.138 ms Execution Time: 0.057 ms (7 rows)
  • 24. JSONB Indexes: GIN GIN index supports the “existence” operators only on “top level” keys. If the key is not at the top level, then the index will not be used. Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data->'tags' ? 'nk455671'; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- --------------------------------- ------------------ 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} 685122 | GWfuvKfQ1PCe1IL | jnyhYYcF66 | 3 | {"tags": {"nk455671": {"ik615925": "iv253423"}}, "publisher": "b2NwVg7VY3", "criticrating": 0} (2 rows) demo=# explain analyze select * from books where data->'tags' ? 'nk455671'; QUERY PLAN ---------------------------------------------------------------------------------------------------------- Seq Scan on books (cost=0.00..38807.29 rows=1000 width=158) (actual time=0.018..270.641 rows=2 loops=1) Filter: ((data -> 'tags'::text) ? 'nk455671'::text) Rows Removed by Filter: 1000017 Planning Time: 0.078 ms Execution Time: 270.728 ms (5 rows)
  • 25. JSONB Indexes: GIN The way to check for existence in nested docs is to use “Expression indexes”. Let’s create an index on data->tags: Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX datatagsgin ON books USING gin (data->'tags'); demo=# select * from books where data->'tags' ? 'nk455671'; id | author | isbn | rating | data ---------+-----------------+------------+--------+----------------------------------------------------------------------------------------------------------- 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} 685122 | GWfuvKfQ1PCe1IL | jnyhYYcF66 | 3 | {"tags": {"nk455671": {"ik615925": "iv253423"}}, "publisher": "b2NwVg7VY3", "criticrating": 0} (2 rows) demo=# explain analyze select * from books where data->'tags' ? 'nk455671'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Bitmap Heap Scan on books (cost=12.75..1007.75 rows=1000 width=158) (actual time=0.031..0.035 rows=2 loops=1) Recheck Cond: ((data ->'tags'::text) ? 'nk455671'::text) Heap Blocks: exact=2 -> Bitmap Index Scan on datatagsgin (cost=0.00..12.50 rows=1000 width=0) (actual time=0.021..0.021 rows=2 loops=1) Index Cond: ((data ->'tags'::text) ? 'nk455671'::text) Planning Time: 0.098 ms Execution Time: 0.061 ms (7 rows)
  • 26. JSONB Indexes: GIN - @> The “path” operator can be used for multi-level queries of your JSON data. Let’s use it similar to the ? operator. Working with JSON in PostgreSQL vs. MongoDB select * from books where data @> '{"braille":true}'::jsonb; demo=# explain analyze select * from books where data @> '{"braille":true}'::jsonb; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.040..0.048 rows=6 loops=1) Recheck Cond: (data @> '{"braille": true}'::jsonb) Rows Removed by Index Recheck: 9 Heap Blocks: exact=2 -> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.030..0.030 rows=15 loops=1) Index Cond: (data @> '{"braille": true}'::jsonb) Planning Time: 0.100 ms Execution Time: 0.076 ms (8 rows)
  • 27. JSONB Indexes: GIN - @> The "path" operator can be used for multi level queries of your JSON data. Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data @> '{"publisher":"XlekfkLOtL"}'::jsonb; id | author | isbn | rating | data -----+-----------------+------------+--------+------------------------------------------------------------------------------------- 346 | uD3QOvHfJdxq2ez | KiAaIRu8QE | 1 | {"tags": {"nk88": {"ik37": "iv161"}}, "publisher": "XlekfkLOtL", "criticrating": 3} (1 row) demo=# explain analyze select * from books where data @> '{"publisher":"XlekfkLOtL"}'::jsonb; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=16.75..1009.25 rows=1000 width=158) (actual time=0.491..0.492 rows=1 loops=1) Recheck Cond: (data @> '{"publisher": "XlekfkLOtL"}'::jsonb) Heap Blocks: exact=1 -> Bitmap Index Scan on datagin (cost=0.00..16.50 rows=1000 width=0) (actual time=0.092..0.092 rows=1 loops=1) Index Cond: (data @> '{"publisher": "XlekfkLOtL"}'::jsonb) Planning Time: 0.090 ms Execution Time: 0.523 ms
  • 28. JSONB Indexes: GIN - @> The JSON queries can be nested to many levels. You can also use the ># operation but GIN does not support it. Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- --------------------------------- ------------------ 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} (1 row)
  • 29. JSONB Indexes: GIN - jsonb_pathops GIN also supports a “pathops” option to reduce the size of the GIN index. From the docs: “The technical difference between a jsonb_ops and a jsonb_path_ops GIN index is that the former creates independent index items for each key and value in the data, while the latter creates index items only for each value in the data.” On my small dataset of 1M books, you can see that the pathops GIN index is smaller – you should test with your dataset to understand the savings. Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX dataginpathops ON books USING gin (data jsonb_path_ops); public | dataginpathops | index | sgpostgres | books | 67 MB | public | datatagsgin | index | sgpostgres | books | 84 MB |
  • 30. JSONB Indexes: GIN - jsonb_pathops Let’s rerun our query from before with the pathops index: Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- --------------------------------- ------------------ 1000005 | XEI7xShT8bPu6H7 | 2kD5XJDZUF | 0 | {"tags": {"nk455671": {"ik937456": "iv506075"}}, "braille": true, "keywords": ["abc", "kef", "keh"], "hardcover": false, "publisher": "zSfZIAjGGs", " criticrating": 4} (1 row) demo=# explain select * from books where data @> '{"tags":{"nk455671":{"ik937456":"iv506075"}}}'::jsonb; QUERY PLAN ----------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=12.75..1005.25 rows=1000 width=158) Recheck Cond: (data @> '{"tags": {"nk455671": {"ik937456": "iv506075"}}}'::jsonb) -> Bitmap Index Scan on dataginpathops (cost=0.00..12.50 rows=1000 width=0) Index Cond: (data @> '{"tags": {"nk455671": {"ik937456": "iv506075"}}}'::jsonb) (4 rows)
  • 31. JSONB Indexes: GIN - jsonb_pathops The “jsonb_pathops” option supports only the @> operator. Smaller index but more limited scenarios. The following queries below can no longer leverage the GIN index: Working with JSON in PostgreSQL vs. MongoDB select * from books where data ? 'tags'; => Sequential scan select * from books where data @> '{"tags" :{}}'; => Sequential scan select * from books where data @> '{"tags" :{"k7888":{}}}' => Sequential scan
  • 32. JSONB Indexes: B-tree • B-tree indexes are the most common index type in relational databases. • If you index an entire JSONB column with a B-tree index, the only useful operators are the comparison operators: • =, <, <=, >, >= • Can be used only for whole object comparisons. • Very limited use case. Working with JSON in PostgreSQL vs. MongoDB
  • 33. JSONB Indexes: B-tree • Use B-tree “Expression indexes” • B-tree expression indexes can support the common comparison operators '=', '<', '>', '>=', '<=‘ (which GIN doesn't support). • Retrieve all books with a data->criticrating > 4. Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data->'criticrating' > 4; ERROR: operator does not exist: jsonb >= integer LINE 1: select * from books where data->'criticrating’ > 4; ^ HINT: No operator matches the given name and argument types. You might need to add explicit type casts. #Lets cast JSONB to integer demo=# select * from books where (data->'criticrating')::int4 > 4; #If you are using a version prior to pg11 you need to query as text and then cast demo=# select * from books where (data->>'criticrating')::int4 > 4;
  • 34. JSONB Indexes: B-tree For expression indexes, the index needs to be an exact match with the query expression: Working with JSON in PostgreSQL vs. MongoDB demo=# CREATE INDEX criticrating ON books USING BTREE (((data->'criticrating')::int4)); CREATE INDEX demo=# explain analyze select * from books where (data->'criticrating')::int4 = 3; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Index Scan using criticrating on books (cost=0.42..4626.93 rows=5000 width=158) (actual time=0.069..70.221 rows=199883 loops=1) Index Cond: (((data -> 'criticrating'::text))::integer = 3) Planning Time: 0.103 ms Execution Time: 79.019 ms (4 rows) demo=# explain analyze select * from books where (data->'criticrating')::int4 = 3; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- Index Scan using criticrating on books (cost=0.42..4626.93 rows=5000 width=158) (actual time=0.069..70.221 rows=199883 loops=1) Index Cond: (((data -> 'criticrating'::text))::integer = 3) Planning Time: 0.103 ms Execution Time: 79.019 ms (4 rows) 1 From above we can see that the BTREE index is being used as expected.
  • 35. JSONB Indexes: HASH • If you are only interested in the "=" operator, then Hash indexes become interesting. • Hash indexes tend to be smaller than B-tree indexes. Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX publisherhash ON books USING HASH ((data->'publisher')); demo=# select * from books where data->'publisher' = 'XlekfkLOtL' demo-# ; id | author | isbn | rating | data -----+-----------------+------------+--------+------------------------------------------------------------------------------------- 346 | uD3QOvHfJdxq2ez | KiAaIRu8QE | 1 | {"tags": {"nk88": {"ik37": "iv161"}}, "publisher": "XlekfkLOtL", "criticrating": 3} (1 row) demo=# explain analyze select * from books where data->'publisher' = 'XlekfkLOtL'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Index Scan using publisherhash on books (cost=0.00..2.02 rows=1 width=158) (actual time=0.016..0.017 rows=1 loops=1) Index Cond: ((data -> 'publisher'::text) = 'XlekfkLOtL'::text) Planning Time: 0.080 ms Execution Time: 0.035 ms (4 rows)
  • 36. JSONB Indexes: GIN - Trigram • PostgreSQL supports string matching using Trigram indexes. • Trigrams are basically words broken up into sequences of 3 letters. • We can search for any arbitrary regex (not just left anchored). Working with JSON in PostgreSQL vs. MongoDB CREATE EXTENSION pg_trgm; CREATE INDEX publisher ON books USING GIN ((data->'publisher') gin_trgm_ops); demo=# select * from books where data->'publisher' LIKE '%I0UB%'; id | author | isbn | rating | data ----+-----------------+------------+--------+--------------------------------------------------------------------------------- 4 | KiEk3xjqvTpmZeS | EYqXO9Nwmm | 0 | {"tags": {"nk3": {"ik1": "iv1"}}, "publisher": "MI0UBqZJDt", "criticrating": 1} (1 row) demo=# explain analyze select * from books where data->'publisher' LIKE '%I0UB%'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=9.78..111.28 rows=100 width=158) (actual time=0.033..0.033 rows=1 loops=1) Recheck Cond: ((data -&gt; 'publisher'::text) ~~ '%I0UB%'::text) Heap Blocks: exact=1 -> Bitmap Index Scan on publisher (cost=0.00..9.75 rows=100 width=0) (actual time=0.025..0.025 rows=1 loops=1) Index Cond: ((data -&gt; 'publisher'::text) ~~ '%I0UB%'::text) Planning Time: 0.213 ms Execution Time: 0.058 ms (7 rows)
  • 37. JSONB Indexes: GIN - Arrays • GIN indexes are great for indexing arrays. • Indexing and searching the keyword array. Working with JSON in PostgreSQL vs. MongoDB CREATE INDEX keywords ON books USING GIN ((data->'keywords') jsonb_path_ops); demo=# select * from books where data->'keywords' @> '["abc", "keh"]'::jsonb; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- -------------- 1000003 | zEG406sLKQ2IU8O | viPdlu3DZm | 4 | {"tags": {"nk263020": {"ik203820": "iv817928"}}, "keywords": ["abc", "kef", "keh"], "publisher": "7NClevxuTM", "criticrating": 2} 1000004 | GCe9NypHYKDH4rD | so6TQDYzZ3 | 4 | {"tags": {"nk780341": {"ik397357": "iv632731"}}, "keywords": ["abc", "kef", "keh"], "publisher": "fqaJuAdjP5", "criticrating": 2} (2 rows) demo=# explain analyze select * from books where data->'keywords' @> '["abc", "keh"]'::jsonb; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=54.75..1049.75 rows=1000 width=158) (actual time=0.026..0.028 rows=2 loops=1) Recheck Cond: ((data -> 'keywords'::text) @> '["abc", "keh"]'::jsonb) Heap Blocks: exact=1 -> Bitmap Index Scan on keywords (cost=0.00..54.50 rows=1000 width=0) (actual time=0.014..0.014 rows=2 loops=1) Index Cond: ((data -> 'keywords'::text) @&amp;amp;amp;amp;amp;gt; '["abc", "keh"]'::jsonb) Planning Time: 0.131 ms Execution Time: 0.063 ms (7 rows)
  • 38. SQL/JSON • SQL standard added support for JSON – SQL Standard-2016 (SQL/JSON). • SQL/JSON Data model • JSONPath • SQL/JSON functions • With PG12 release, PostgreSQL has one of the best implementations of SQL/JSON. Working with JSON in PostgreSQL vs. MongoDB
  • 39. SQL/JSON 2016 ● A sequence of SQL/JSON items, each item can be (recursively) any of: ○ SQL/JSON scalar — non-null value of SQL types: Unicode character string, numeric, Boolean or datetime. ○ SQL/JSON null, value that is distinct from any value of any SQL type (not the same as NULL). ○ SQL/JSON arrays, ordered list of zero or more SQL/JSON items — SQL/JSON element ○ SQL/JSON objects — unordered collections of zero or more SQL/JSON members. ■ (key, SQL/JSON item) Working with JSON in PostgreSQL vs. MongoDB
  • 40. JSONPath Working with JSON in PostgreSQL vs. MongoDB .key Returns an object member with the specified key [*] Wildcard array element accessor that returns all array elements .* Wildcard member accessor that returns the values of all members located at the top level of the current object .** Recursive wildcard member accessor that processes all levels of the JSON hierarchy of the current object and returns all the member values, regardless of their nesting level JSONPath allows you to specify an expression (using a syntax similar to the property access notation in Javascript) to query or project your JSON data.
  • 41. SQL/JSON Functions ● PG 12 provides several functions to use JSONPATH to query your JSON data ○ jsonb_path_exists - Checks whether JSON path returns any item for the specified JSON value ○ jsonb_path_match - Returns the result of JSON path predicate check for the specified JSON value. ○ jsonb_path_query - Gets all JSON items returned by JSON path for the specified JSON value. Working with JSON in PostgreSQL vs. MongoDB
  • 42. JSONPath Finding books by publisher? Working with JSON in PostgreSQL vs. MongoDB demo=# select * from books where data @@ '$.publisher == "ktjKEZ1tvq"'; id | author | isbn | rating | data ---------+-----------------+------------+--------+--------------------------------------------------------------------------------------------------------------------- ------------- 1000001 | 4RNsovI2haTgU7l | GwSoX67gLS | 2 | {"tags": {"nk542369": {"ik55240": "iv305393"}}, "keywords": ["abc", "def", "geh"], "publisher": "ktjKEZ1tvq", "criticrating": 0} (1 row) demo=# explain analyze select * from books where data @@ '$.publisher == "ktjKEZ1tvq"'; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on books (cost=21.75..1014.25 rows=1000 width=158) (actual time=0.123..0.124 rows=1 loops=1) Recheck Cond: (data @@ '($."publisher" == "ktjKEZ1tvq")'::jsonpath) Heap Blocks: exact=1 -&amp;amp;amp;amp;gt; Bitmap Index Scan on datagin (cost=0.00..21.50 rows=1000 width=0) (actual time=0.110..0.110 rows=1 loops=1) Index Cond: (data @@ '($."publisher" == "ktjKEZ1tvq")'::jsonpath) Planning Time: 0.137 ms Execution Time: 0.194 ms (7 rows)
  • 43. JSONPath Add a JSONPath filter: Working with JSON in PostgreSQL vs. MongoDB select * from books where jsonb_path_exists(data,'$.publisher ?(@ == "ktjKEZ1tvq")'); Build complicated filter expressions: select * from books where jsonb_path_exists(data, '$.prints[*] ?(@.style=="hc" && @.price == 100)'); Index support for JSONPath is very limited. demo=# explain analyze select * from books where jsonb_path_exists(data,'$.publisher ?(@ == "ktjKEZ1tvq")'); QUERY PLAN ------------------------------------------------------------------------------------------------------------ Seq Scan on books (cost=0.00..36307.24 rows=333340 width=158) (actual time=0.019..480.268 rows=1 loops=1) Filter: jsonb_path_exists(data, '$."publisher"?(@ == "ktjKEZ1tvq")'::jsonpath, '{}'::jsonb, false) Rows Removed by Filter: 1000028 Planning Time: 0.095 ms Execution Time: 480.348 ms (5 rows)
  • 44. JSONPath: Projection JSON Select the last element of the array Working with JSON in PostgreSQL vs. MongoDB demo=# select jsonb_path_query(data, '$.prints[$.size()]') from books where id = 1000029; jsonb_path_query ------------------------------ {"price": 50, "style": "pb"} (1 row) Select only the hardcover prints from the array demo=# select jsonb_path_query(data, '$.prints[*] ?(@.style=="hc")') from books where id = 1000029; jsonb_path_query ------------------------------- {"price": 100, "style": "hc"} (1 row) We can also chain the filters demo=# select jsonb_path_query(data, '$.prints[*] ?(@.style=="hc") ?(@.price ==100)') from books where id = 1000029; jsonb_path_query ------------------------------- {"price": 100, "style": "hc"} (1 row)
  • 45. Roadmap ● Improvements to the JSONPath implementation in PG13 ● Future Roadmap Working with JSON in PostgreSQL vs. MongoDB
  • 46. Questions? Working with JSON in PostgreSQL vs. MongoDB