
How to Query JSONB in PostgreSQL

Written by Warda Bibi

Published Feb 17, 2025

Like most relational databases, PostgreSQL uses a strict schema approach, where data is organized and stored in a fixed, predefined structure. A database schema serves as a blueprint for how data is organized, defining the following elements:

  • Tables and their columns

  • Data types

  • Relationships

  • Constraints

The term strict schema means that the structure of a table (its columns, data types, and constraints) must be defined in advance, before any data can be stored. As a result, only compatible data is allowed: an age column defined as INT will reject a value like "thirty," ensuring consistency across the table.

This consistency allows efficient querying since the database knows the exact structure and type of data. If the data structure needs to change, such as adding, removing, or modifying columns, the schema must be explicitly updated.
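As a quick illustration, a strictly typed column rejects incompatible input at insert time. (The users table here is a hypothetical example, not part of this article's schema.)

CREATE TABLE users (
    name text,
    age  int
);

-- Fails: "thirty" cannot be parsed as an integer
INSERT INTO users (name, age) VALUES ('Alice', 'thirty');
-- ERROR:  invalid input syntax for type integer: "thirty"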

But what happens if you’re faced with one of the following situations?

  • What if the data doesn’t fit neatly into fixed, predefined structures? 

  • What if we don’t know the column names, their types, or how many columns there will be in advance? 

  • What if the sizes and data formats are uncertain, or the data is simply unknown? 

Instead of trying to parse the data and identify column names, types, and sizes upfront, can we store the entire data as-is in a single column? The answer is yes.

In earlier days, storing arbitrary data structures as plain text was always possible, but processing and retrieving the data were major issues. Since the database had no understanding of the data's internal structure, every time a query was run, it had to load and parse the entire text blob to understand its content.

Furthermore, querying deeply nested data required complicated and slow regular expressions. So, relational databases like PostgreSQL and MySQL were not optimal for handling this kind of dynamic data. 

As a result, if you needed to store such schema-less data, you would have to use external document stores like MongoDB, which are built specifically to handle such data. Document stores allow you to "store data now, figure out the schema later," meaning you can store arbitrary data structures without worrying about defining every possible column and data type upfront.

JSON (JavaScript Object Notation) in PostgreSQL

JSON support was first introduced in PostgreSQL 9.2 (2012). Over the years, PostgreSQL has evolved with powerful JSON features that allow it to handle arbitrary data structures natively. When dealing with dynamic, nested, or arbitrary data, the JSON (JavaScript Object Notation) format is an ideal solution. 

JSON is designed to handle hierarchical, schema-less data, which allows you to store an arbitrary number of properties with varying types, lengths, and structures. For example, a JSON record might include an arbitrary set of properties, such as:

{   "sensor_id": "sensor_123",   "timestamp": "2024-12-18T12:00:00Z",   "Temperature": 22.5,   "humidity": 60,   "location": {"latitude": 40.7128, "longitude": -74.0060},   "alerts": ["high_temperature", "low_battery"],   "metadata": {"model": "sensor_v2", "firmware_version": "1.2.3"} } This is a typical JSON object; it stores data as key-value pairs. It has various properties (e.g., temperature, location, alerts), and the properties themselves can contain different types of data:

  • Integers

  • Strings

  • Arrays

  • Booleans

Or even other JSON objects.

The schema is not predefined and can evolve from one record to another. Such data can also be stored as plain text, but the JSON data types have the advantage of enforcing that each stored value is valid according to the JSON rules. 
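For instance, casting malformed text to a JSON type fails immediately, so invalid documents never make it into the table (a minimal sketch):

SELECT '{"sensor_id": "sensor_123", "humidity": }'::jsonb;
-- ERROR:  invalid input syntax for type json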

JSON-specific functions and operators are also available for data stored in these data types. PostgreSQL provides two data types for working with JSON data: JSON and JSONB.

In this blog, we will explore the power of JSONB in PostgreSQL and show you how to store, query, and index JSON data efficiently. 

To learn how to query JSON metadata in PostgreSQL, check out this article.

Difference Between JSON and JSONB Data Types

The JSON and JSONB data types in PostgreSQL accept almost identical input. The main difference between them is efficiency.

  • The JSON data type stores the raw JSON text exactly as it is, including any spaces, line breaks, duplicates, and the order of keys in JSON objects. 

  • On the other hand, JSONB (Here, B stands for binary) stores JSON data in a binary format. It converts the input into a binary representation before storing it.

  • This JSONB format eliminates the need to reparse the data each time it is accessed, leading to significantly better performance for querying and processing. However, the trade-off is that writing data to a JSONB column incurs some overhead due to converting text to binary. This conversion makes JSONB slightly slower to insert than JSON but much faster for querying and processing.

  • JSONB also normalizes data by removing unnecessary whitespace and formatting, and it does not preserve the order of keys in objects. Additionally, it does not allow duplicate keys in JSON objects; only the last occurrence of a key is stored, and earlier duplicates are discarded.

  • One of JSONB's most significant advantages is its support for indexing, which can drastically improve query performance, especially on large datasets.

The following example demonstrates the difference between JSON and JSONB:

SELECT '{"device_id": "abc123", "active": true, "active": false}'::json AS metadata_json,
       '{"device_id": "abc123", "active": true, "active": false}'::jsonb AS metadata_jsonb;

In this query, we define both JSON and JSONB values from the same data containing duplicate keys (‘active’ appears twice):

{"device_id": "abc123", "active": true, "active": false}

  • JSON stores the data exactly as provided, retaining duplicate keys.

  • JSONB automatically normalizes the data, resolving duplicate keys by keeping only the last occurrence.

The result of running this query is shown below:

                       metadata_json                       |              metadata_jsonb
-----------------------------------------------------------+------------------------------------------
 {"device_id": "abc123", "active": true, "active": false}  | {"active": false, "device_id": "abc123"}

JSONB hasn’t preserved the order of keys; it discarded the duplicate and kept only the last occurrence of "active." JSON, by contrast, retains the original order and both keys.

When it comes to use cases, choose JSON if you don’t plan to query the data frequently or you need to preserve the exact format of the input, such as for logging purposes. On the other hand, if you need to query or index the data, or if performance is a priority, then JSONB is the better choice.
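If you are curious how the two representations differ in storage, the built-in pg_column_size function reports the size of a value in bytes; the exact numbers vary by PostgreSQL version, so treat this as a sketch:

SELECT pg_column_size('{"device_id": "abc123", "active": true}'::json)  AS json_size,
       pg_column_size('{"device_id": "abc123", "active": true}'::jsonb) AS jsonb_size;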

Creating a JSONB Column

Let's create a table for storing sensor device data, where each row represents a single device. The table will include an ID column (as a primary key) and a JSONB column to store the sensor data.

CREATE TABLE sensor_devices (
    id SERIAL PRIMARY KEY,
    data jsonb
);

Let’s insert a record:

INSERT INTO sensor_devices (data) VALUES (
  '{"device_name": "Sensor A", "active": true, "temperature_readings": [22.5, 23.0, 21.8]}'::jsonb
);

Note: To convert the string to JSONB, we used ::jsonb.

SELECT * FROM sensor_devices;

After running the above query, the output will look like this:

 id |                                           data
----+------------------------------------------------------------------------------------------
  1 | {"active": true, "device_name": "Sensor A", "temperature_readings": [22.5, 23.0, 21.8]}

Updating a JSONB Column

If you want to replace the entire JSON document for a specific device, you can directly update the data column for that device. Let's insert another record to experiment with update queries.

INSERT INTO sensor_devices (data) VALUES (
  '{"device_name": "Sensor B", "active": true, "temperature_readings": [23.5, 20.0, 22.8]}'::jsonb
);

Now replace this entire data column using the UPDATE query:

UPDATE sensor_devices SET data = '{"active": false}'::jsonb WHERE id = 2;

This will replace the data column for the device with id = 2 with the new JSON object, as shown below:

 id |                                           data
----+------------------------------------------------------------------------------------------
  1 | {"active": true, "device_name": "Sensor A", "temperature_readings": [22.5, 23.0, 21.8]}
  2 | {"active": false}

Concatenate ( || )

If you want to add a new key-value pair to the existing JSON document for a device, use the || (concatenate) operator. For example, adding a location key to the existing data:

UPDATE sensor_devices SET data = data || '{"location": "Room 2"}'::jsonb WHERE id = 1;

This will add the key "location": "Room 2" to the existing JSON document for the device with id 1, as shown below:

 id |                                                        data
----+--------------------------------------------------------------------------------------------------------------
  2 | {"active": false}
  1 | {"active": true, "location": "Room 2", "device_name": "Sensor A", "temperature_readings": [22.5, 23.0, 21.8]}

Sensor A with id = 1 has been updated with the key ‘location.’

Note: If the "location" key had already been there, it would have been updated with the new value.
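The || operator only merges at the top level. To update a value nested deeper inside a document, you can use the built-in jsonb_set function, which takes a path to the target key. Here is a minimal sketch with a hypothetical nested document:

-- Replace the value at path {config,interval}; the final argument
-- controls whether a missing key should be created.
SELECT jsonb_set('{"config": {"interval": 30}}'::jsonb,
                 '{config,interval}',
                 '60'::jsonb,
                 true) AS updated;

           updated
------------------------------
 {"config": {"interval": 60}}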

Delete ( - )

If you want to remove a key from the JSON document, use the - operator. For instance, we can remove the location key using the query mentioned below:

UPDATE sensor_devices SET data = data - 'location' WHERE id = 1;

This will remove the location key from the data column for the device with id = 1, as shown below.

 id |                                           data
----+------------------------------------------------------------------------------------------
  2 | {"active": false}
  1 | {"active": true, "device_name": "Sensor A", "temperature_readings": [22.5, 23.0, 21.8]}

Querying a JSONB Column

PostgreSQL includes two native operators for querying JSONB documents: the arrow operator (->) and the arrow-text operator (->>).

The arrow operator -> returns a JSONB object field by key or array index and is suitable for navigating nested structures. 

The arrow-text operator ->> returns the object field as plain text. Once extracted as text, you can no longer query the structure deeper because it’s no longer treated as a JSONB object. 

Let's try to understand these operators better with the help of a simple example:

SELECT  '{"id": "1234", "reading": {"temperature": 22.5, "unit": "C"} }'::jsonb -> 'id' as ID;

This query extracts the value associated with the key 'id' as a JSONB object.

   id
--------
 "1234"

Here, id is a string, meaning the following:

  • -> 'id' should return a JSON string "1234"

  • ->> 'id' should return it as plain text 1234 (without quotations, since the value is a string).

SELECT  '{"id": "1234", "reading": {"temperature": 22.5, "unit": "C"} }'::jsonb ->> 'id' as ID;

  id
------
 1234

Similarly, if you want to query fields within the reading object, you can use both -> and ->> operators. The results will appear the same, but internally, -> returns the entire object while ->> returns the object as plain text. 

So, the following query will return the object: 

SELECT '{"sensor_id": "1234", "reading": {"temperature": 22.5, "unit": "C"}}'::jsonb -> 'reading' AS sensor_reading;

           sensor_reading
-------------------------------------
 {"unit": "C", "temperature": 22.5}

And running the following query will return plain text:

SELECT '{"sensor_id": "1234", "reading": {"temperature": 22.5, "unit": "C"}}'::jsonb ->> 'reading' AS sensor_reading;           sensor_reading           ------------------------------------ {"unit": "C", "temperature": 22.5}

As a result, if you want to query the reading object deeper, -> will allow you to do so because it keeps the data as a JSON object. On the other hand, since ->> returns plain text, you can't query further, and attempting to do so will result in an error. If you run the following query:

SELECT '{"sensor_id": "1234", "reading": {"temperature": 22.5, "unit": "C"}}'::jsonb -> 'reading' -> 'unit' AS sensor_reading;

 sensor_reading
----------------
 "C"

It returns "C" because the -> operator extracts the value as JSON. However,  if you run:

SELECT '{"sensor_id": "1234", "reading": {"temperature": 22.5, "unit": "C"}}'::jsonb -> 'reading' ->> 'unit' AS sensor_reading;     sensor_reading -------------------- C It returns C as plain text, as the ->> operator extracts the value as text. However, let’s look at this query:

SELECT '{"sensor_id": "1234", "reading": {"temperature": 22.5, "unit": "C"}}'::jsonb ->> 'reading' -> 'unit' AS sensor_reading;

ERROR:  operator does not exist: text -> unknown
LINE 1: ...ature": 22.5, "unit": "C"}}'::jsonb ->> 'reading' -> 'unit'...
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.

This query throws an error because the ->> operator converts ‘reading’ into plain text, which can no longer be used as a JSON object to extract ‘unit.’

With jsonb ->> 'reading' -> 'unit', you can't access 'unit' because after using ->>, reading is no longer an object; it's plain text. This can also be confirmed by comparison, as shown below.

SELECT  '{"sensor_id": "1234", "reading": {"temperature": 22.5, "unit": "C"}}'::jsonb -> 'reading' ->> 'unit' = 'C' AS sensor_reading;

 sensor_reading
----------------
 t

Here, we are directly comparing the result of 'reading' ->> 'unit' with a string. The result t (true) indicates that the unit is "C."

SELECT '{"sensor_id": "1234", "reading": {"temperature": 22.5, "unit": "C"}}'::jsonb -> 'reading' -> 'unit' = 'C' AS sensor_reading;

ERROR:  invalid input syntax for type json
LINE 2: ...5, "unit": "C"}}'::jsonb -> 'reading' -> 'unit' = 'C' AS sen...
                                                             ^
DETAIL:  Token "C" is invalid.
CONTEXT:  JSON data, line 1: C

Here, the comparison -> 'unit' = 'C'  fails because -> extracts the value of unit as a JSON object, not as a plain text string. Comparing a JSON object to a string like 'C' is not valid, so the comparison fails.

Querying arrays inside a JSONB column

Let's explore how querying works with arrays inside a JSONB column. We can query the sensor_devices table that we created earlier. In the JSONB column, we have temperature_readings as an array. We can query the array as we query any other key:

SELECT id, data -> 'active' AS active, data -> 'temperature_readings' AS temperature_readings FROM sensor_devices;

 id | active | temperature_readings
----+--------+----------------------
  2 | false  |
  1 | true   | [22.5, 23.0, 21.8]

To query a specific position within an array, you can use the arrow operator (->) followed by the array's index. Array indices are zero-based, so index 2 refers to the third element. If you prefer the result as a text value (e.g., to retrieve the temperature as a string), you can use the ->> operator instead. For instance, to find the third temperature reading for the sensor with ID 1:

SELECT id, data -> 'temperature_readings' -> 2 AS third_temperature_reading FROM sensor_devices where id=1;

 id | third_temperature_reading
----+---------------------------
  1 | 21.8

Here, we can also use ->> to extract this value as text.

SELECT id, data -> 'temperature_readings' ->> 2 AS third_temperature_reading FROM sensor_devices where id=1;

 id | third_temperature_reading
----+---------------------------
  1 | 21.8
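In recent PostgreSQL versions, negative indices count from the end of the array, which is handy when you only want the latest reading (a small sketch):

SELECT id, data -> 'temperature_readings' -> -1 AS last_reading
FROM sensor_devices WHERE id = 1;

 id | last_reading
----+--------------
  1 | 21.8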

There are several other operators available to work with JSONB columns. Let's dive into each of these operators and explore how to use them.

@> Operator

The containment operator @> tests whether one document contains another. This operator can compare partial JSON documents against a JSONB column.

For example, the query below will return all rows where the data column contains the key-value pair "active": false:

SELECT id, data FROM sensor_devices WHERE data @> '{"active": false}';

 id |       data
----+-------------------
  2 | {"active": false}

This is equivalent to:

SELECT id, data FROM sensor_devices WHERE data ->> 'active' = 'false';

 id |       data
----+-------------------
  2 | {"active": false}
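As noted earlier, one of JSONB's biggest advantages is index support: containment queries like the one above can be accelerated with a GIN index, whose default operator class supports @>, ?, ?|, and ?&. A minimal sketch (the index name is arbitrary):

CREATE INDEX idx_sensor_devices_data ON sensor_devices USING GIN (data);

-- On a large table, this query can now use the index instead of
-- scanning and parsing every row:
SELECT id, data FROM sensor_devices WHERE data @> '{"active": false}';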

? Operator

The ? operator is used to check if a specific key exists within a JSONB object. This operator returns true if the key exists at the top level of the object and false otherwise. Let's say we concatenate a new "device_tags" array field into the JSONB column:

UPDATE sensor_devices SET data = data || '{"device_tags": ["indoor", "battery-powered", "wifi"]}'::jsonb WHERE id = 1;

We also concatenate a nested "manufacturer" object into the same JSONB column:

UPDATE sensor_devices SET data = data || '{"manufacturer": {"company_name": "TechDevices"}}'::jsonb WHERE id = 1;

Now, if we execute this query, we can observe how our data is structured:

SELECT data -> 'manufacturer' AS manufacturer, data -> 'device_tags' AS device_tags FROM sensor_devices WHERE id = 1;

          manufacturer           |              device_tags
---------------------------------+----------------------------------------
 {"company_name": "TechDevices"} | ["indoor", "battery-powered", "wifi"]

To check if the key "manufacturer" exists at the top level of the JSONB object, we can run the following query:

SELECT data ? 'manufacturer' AS manufacturer FROM sensor_devices WHERE id = 1;

 manufacturer
--------------
 t

Since the ‘manufacturer’ key exists, the query returns ‘true.’ If we attempt to query a non-existent key, such as ‘battery_level,’ it will return ‘false.’

SELECT data ? 'battery_level' As battery_level FROM sensor_devices WHERE id = 1;

 battery_level
---------------
 f

Nested keys limitation

It’s important to note that the ? operator only works for top-level keys. If you try to check for a key that is nested inside another object, it will return false. 

For example, if we try to check for the key "company_name" inside the nested "manufacturer" object:

SELECT data ? 'company_name' AS company FROM sensor_devices WHERE id = 1; 

 company
---------
 f

This query returns false because "company_name" is nested inside the "manufacturer" object, which is a second-level key in the data column. 

To check nested keys, you would need to use other approaches (like using -> or ->> operators). To check for the key "company_name" inside the nested "manufacturer" object using the -> operator, the query would look like this:

SELECT data -> 'manufacturer' ? 'company_name' AS has_company_name FROM sensor_devices WHERE id = 1;

 has_company_name
------------------
 t

Object values are not considered for existence

The ? operator in PostgreSQL is specifically designed to check for the existence of keys in a JSONB object, not values. So, if you try a query like this:

SELECT data ? 'TechDevices' AS value FROM sensor_devices WHERE id = 1;

 value
-------
 f

The query will return false because "TechDevices" is a value, not a JSONB object key. The ? operator is looking for "TechDevices" as a key. 

To check for the presence of a specific value, we need to extract the value first and then perform the comparison, as shown below.

SELECT data -> 'manufacturer' ->> 'company_name' = 'TechDevices' AS is_valid FROM sensor_devices WHERE id = 1;

 is_valid
----------
 t

String exists as an array element

To check if a specific string exists as an element in an array within the JSONB column, you can use the ? operator. 

For example, to see if the string 'wifi' exists in the device_tags array for a specific device, you can use the following query:

SELECT data -> 'device_tags' ? 'wifi' AS array_element FROM sensor_devices WHERE id = 1;

 array_element
---------------
 t

This query will return true if 'wifi' is present in the device_tags array for the device with id = 1, and false if it is not.

?| Operator

The ?| operator is used to check if any specified keys or array elements exist in a JSONB object or array. To check device_tags for sensor A, you can use the following query:

SELECT data->'device_tags' AS tags FROM sensor_devices WHERE id = 1;

                 tags
----------------------------------------
 ["indoor", "battery-powered", "wifi"]

To check if any of the elements 'wifi' or 'bluetooth' exist in the device_tags array for a specific record in the sensor_devices table, we can run the following query:

SELECT data -> 'device_tags' ?| array['wifi', 'bluetooth'] As exist FROM sensor_devices WHERE id = 1;

If either 'wifi' or 'bluetooth' exists in the device_tags array for id = 1, the query will return true. If neither element is found, it will return false. Below is the result of running this query.

 exist
-------
 t

?& Operator

The ?& operator checks if all of the specified keys or array elements exist in a JSONB object or array. You can use this operator to ensure that all specified keys or elements are present in the JSONB column. Here's how you can use the ?& operator with your sensor_devices table:

SELECT data -> 'device_tags' ?& array['wifi', 'battery-powered'] AS existing_tags FROM sensor_devices WHERE id = 1;

  • data -> 'device_tags': This extracts the device_tags array from the data column.

  • ?& array['wifi', 'battery-powered']: This checks if both 'wifi' and 'battery-powered' exist in the device_tags array for the specified record.

The query will return true if both 'wifi' and 'battery-powered' exist in the device_tags array. If either element is missing from the array, it will return false, as shown below:

 existing_tags
---------------
 t

#> Operator

The #> operator allows you to extract a JSON object at a specific path within a JSONB column. The path is defined as an array of keys (or indices) specifying the sequence of nested levels within the JSON object.

SELECT '{"a": {"b": {"c": "foo"}}}'::jsonb #> '{a,b}' AS result;

    result
--------------
 {"c": "foo"}

Here, the path is {a,b}, which means: go to a, then inside a go to b, and return the value of b.

  • {"a": {"b": {"c": "foo"}}} → This is the JSON object from which we are extracting data. 

  • #> '{a,b}' →  This specifies the path {a, b}. It starts at the key a, then moves to the nested key b, and returns the value at that location ({"c": "foo"}).

In terms of sensor_device, we can query something like this:

SELECT '{"device": {"manufacturer": {"company_name": "TechDevices"}}}'::jsonb #> '{device, manufacturer,company_name}' AS result;

    result
---------------
 "TechDevices"

Here, the top-level key is device. Within that, there's another key, manufacturer, containing the key company_name with the value "TechDevices." The path '{device,manufacturer,company_name}' means:

  • Start at the top-level key  ‘device.’

  • Move into the nested ‘manufacturer’ object.

  • Finally, extract the value of the key ‘company_name.’

This will give you the value "TechDevices" at the company_name level.

The difference between #> and #>> is exactly like the difference between -> and ->>. While #> returns a JSON object or array at a specified path, #>> returns the value as plain text. Use #>> when you need a string result instead of a JSON structure.
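For example, extracting the same path with #>> returns the string without JSON quoting:

SELECT '{"device": {"manufacturer": {"company_name": "TechDevices"}}}'::jsonb #>> '{device,manufacturer,company_name}' AS result;

   result
-------------
 TechDevices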

JSONB Functions

Several interesting functions operate on JSONB. It's important to note that most of these functions also have a json_ variant that works with JSON columns.

jsonb_each

The jsonb_each function expands the outermost JSONB object into a set of key/value pairs, decomposing a JSONB object into individual rows where each row represents one key/value pair.

Suppose the JSONB column data contains the following JSON:

{"active": false, "location": "Room 2", "device_name": "Sensor A", "device_tags": ["indoor", "battery-powered", "wifi"], "manufacturer": {"company_name": "TechDevices"}, "temperature_readings": [21.5, 22.0, 20.8]} Using jsonb_each(data), the data column can be transformed into key/value pairs. Here’s an example query:

SELECT jsonb_each(data) from sensor_devices where id=1; When this query is executed, it returns the following result:

(active, false) (location,"""Room 2""") (device_name,"""Sensor A""") (device_tags,"[""indoor"", ""battery-powered"", ""wifi""]") (manufacturer,"{""company_name"": ""TechDevices""}") (temperature_readings,"[21.5, 22.0, 20.8]")

Each pair represents a key and its corresponding value from the JSONB object. To access these pairs more conveniently, you can select the function's output columns key and value directly, as shown below.

SELECT key, value FROM sensor_devices, jsonb_each(data) WHERE id = 1;

Here is the result of executing this query:

         key          |                 value
----------------------+----------------------------------------
 active               | true
 device_name          | "Sensor A"
 device_tags          | ["indoor", "battery-powered", "wifi"]
 manufacturer         | {"company_name": "TechDevices"}
 temperature_readings | [22.5, 23.0, 21.8]

You can use jsonb_each() in both the SELECT list and the FROM clause to query or manipulate the data. It simplifies working with complex JSONB structures and makes it easier to analyze nested data.
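A closely related function, jsonb_each_text, works the same way but returns the values as plain text instead of JSONB, which is convenient for string comparisons (note that string values come back without surrounding quotes):

SELECT key, value FROM sensor_devices, jsonb_each_text(data) WHERE id = 1;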

jsonb_object_keys

The jsonb_object_keys function is used to retrieve the keys of the top-level JSONB object. It lists each key as a separate row in the result, making it useful for understanding the structure of a JSONB object or iterating through its keys. To extract the keys of the JSONB data, you can query like this:

SELECT jsonb_object_keys(data) as keys FROM sensor_devices WHERE id=1;

Below is the result of executing this query:

         keys
-----------------------
 active
 device_name
 device_tags
 manufacturer
 temperature_readings

Note: It extracts only top-level keys, not nested ones. For example,

{"manufacturer": {"company_name": "TechCo"}} company_name will not be included in the output since it is nested inside the manufacturer.

jsonb_extract_path

The jsonb_extract_path function retrieves a JSON object or value by traversing a specified path within a JSONB structure. The path is defined as a series of keys (and/or array indices) that specify the location of the desired value. Each level of the JSONB structure is passed as a separate argument to the function. For example:

jsonb_extract_path(data, 'manufacturer', 'company_name') 

It means inside the data column for sensor A:

  • Start at the top-level key 'manufacturer.'

  • Inside the object 'manufacturer,’ access the key ‘company_name.’

Here’s the query to extract company_name using jsonb_extract_path:

SELECT jsonb_extract_path(data, 'manufacturer', 'company_name') AS manufacturer FROM sensor_devices WHERE id = 1;

Below is the result of executing this query:

 manufacturer
---------------
 "TechDevices"
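Like the operators, this function has a text-returning sibling, jsonb_extract_path_text, which behaves like #>> and returns the value as plain text without quotes:

SELECT jsonb_extract_path_text(data, 'manufacturer', 'company_name') AS manufacturer FROM sensor_devices WHERE id = 1;

 manufacturer
--------------
 TechDevices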

jsonb_pretty

If you want the JSONB data to be formatted in a more readable and human-friendly way, you can use the jsonb_pretty function. This function is particularly useful for debugging or displaying JSON documents. It doesn’t modify the actual JSONB structure or its content but converts JSONB into an indented and visually appealing format.

Here’s the query using jsonb_pretty:

SELECT jsonb_pretty(data) FROM sensor_devices WHERE id = 1;

Below is the result of executing this query:

             jsonb_pretty
---------------------------------------
 {                                    +
     "active": true,                  +
     "device_name": "Sensor A",       +
     "device_tags": [                 +
         "indoor",                    +
         "battery-powered",           +
         "wifi"                       +
     ],                               +
     "manufacturer": {                +
         "company_name": "TechDevices"+
     },                               +
     "temperature_readings": [        +
         22.5,                        +
         23.0,                        +
         21.8                         +
     ]                                +
 }

jsonb_typeof(jsonb)

The jsonb_typeof function determines the type of the outermost JSON value and returns it as a text string. The possible return values are the following:

  • object

  • array

  • string

  • number

  • boolean

  • null

If we run this query: 

SELECT
    id,
    jsonb_typeof(data -> 'temperature_readings') AS temperature_data_type,
    jsonb_typeof(data -> 'active') AS active_data_type
FROM sensor_devices
WHERE id = 1;

where:

  • jsonb_typeof(data->'temperature_readings') AS temperature_data_type: retrieves the type of the temperature_readings value (an array) and labels it as temperature_data_type.

  • jsonb_typeof(data->'active') AS active_data_type: retrieves the type of the active value (a boolean) and labels it as active_data_type.

  • WHERE id = 1: filters the query to retrieve only the row where the id is 1.

Here is the query result:

 id | temperature_data_type | active_data_type
----+-----------------------+------------------
  1 | array                 | boolean

Conclusion

In summary, JSONB is a powerful tool for handling JSON data within your PostgreSQL database. It allows the user to easily store, search, and manipulate complex data structures, making it a valuable asset for many applications. With its flexibility and powerful features, JSONB can significantly enhance the database's capabilities.

When working with time series or real-time analytics data in JSON format, combining JSONB with TimescaleDB's hypertables can significantly enhance your database's performance, especially for time-stamped JSON sensor data and live metrics. Try TimescaleDB today to see how it can optimize your time-series JSON workloads: You can host it on your machine or try our modern cloud PostgreSQL data platform for free. To learn more about Timescale's JSONB support for semi-structured data, visit our docs.
