💡

Why JSONB? JSON vs JSONB

What makes JSONB special — and when to use it

JSON vs JSONB — Core Difference

PostgreSQL has two JSON types. Understanding the difference is the foundation of everything in this topic.

Featurejsonjsonb
Storage formatPlain text, stored as-isBinary (decomposed & parsed)
Write speed⚡ Faster (just stores text)Slightly slower (parses on insert)
Read/Query speed🐢 Slow (parses on every read)⚡ Very fast (already parsed)
Indexing❌ No GIN support✅ Full GIN indexing
Key order preserved✅ Yes❌ No (sorted internally)
Duplicate keys✅ Kept (last one wins on parse)❌ Last value kept
Use in productionRarely✅ Always prefer this
⚠️
Rule: Always use jsonb unless you specifically need to preserve key order or exact formatting. In production, jsonb is the standard.

When Should You Use JSONB?

JSONB is powerful — but not a replacement for relational design. Use it when the data is truly semi-structured or variable.

✅ Good Fit
Product attributes that vary per category (size, color, wattage, material…)
✅ Good Fit
User preferences / settings object unique per user
✅ Good Fit
External API payloads / webhook logs you want to store raw
✅ Good Fit
Audit log metadata, event properties (analytics events)
❌ Bad Fit
Data you JOIN on, filter heavily, or report on across many rows
❌ Bad Fit
Relationship data that should be a foreign key (user_id, order_id)

Creating a Table with JSONB

-- E-commerce product with variable attributes CREATE TABLE products ( id SERIAL PRIMARY KEY, name TEXT NOT NULL, price NUMERIC(10,2), category TEXT, attributes JSONB -- variable per category ); -- Insert a laptop INSERT INTO products (name, price, category, attributes) VALUES ( 'ThinkPad X1', 1299.99, 'laptop', '{"ram": "16GB", "cpu": "i7", "storage": "512GB SSD", "ports": ["USB-C", "HDMI", "USB-A"]}' ); -- Insert a T-shirt (totally different attributes) INSERT INTO products (name, price, category, attributes) VALUES ( 'Classic Tee', 29.99, 'clothing', '{"color": "navy", "sizes": ["S","M","L","XL"], "material": "100% Cotton"}' );
See the power: Two products in one table, each with completely different attributes — no schema change needed.
1

JSONB Operators

-> ->> #> #>> and more

The Four Core Extraction Operators

These operators let you dig into a JSONB value to retrieve nested fields.

->
Get JSON value (keeps JSON type)
Extracts a field by key. Returns JSONB — use when you want to pass result to another operator.
attributes -> 'ram'
→ returns: "16GB" (as jsonb)
->>
Get TEXT value (unwraps quotes)
Same as -> but returns TEXT. Use when comparing, filtering, or displaying values.
attributes ->> 'ram'
→ returns: 16GB (plain text, no quotes)
#>
Get JSON at path (nested)
Extracts nested value at a path array. Returns JSONB.
attributes #> '{ports,0}'
→ returns: "USB-C" (as jsonb)
#>>
Get TEXT at path (nested)
Same as #> but returns TEXT. Most useful for deeply nested leaf values.
attributes #>> '{ports,0}'
→ returns: USB-C (plain text)
💡
Mental model: Single arrow (->) = one key. Double hash-arrow (#>) = path of many keys. Single tail (->) = JSONB result. Double tail (->>) = TEXT result.

Using Operators in Real Queries

-- Get the RAM of all laptops (as TEXT for display) SELECT name, attributes ->> 'ram' AS ram FROM products WHERE category = 'laptop'; -- result: ThinkPad X1 | 16GB -- Filter: products where RAM = '16GB' SELECT name FROM products WHERE attributes ->> 'ram' = '16GB'; -- Get first port (index 0 from array) SELECT attributes #>> '{ports,0}' AS first_port FROM products WHERE name = 'ThinkPad X1'; -- result: USB-C -- Numeric comparison (cast to numeric) SELECT name FROM products WHERE (attributes ->> 'price_discount')::numeric > 10;

Containment and Existence Operators

These operators let you check whether a JSONB value contains another, or whether a key exists. Critical for filtering and GIN index usage.

OperatorMeaningExample
@> Left contains right (containment) attributes @> '{"ram":"16GB"}'
<@ Left is contained by right '{"ram":"16GB"}' <@ attributes
? Key exists at top level attributes ? 'ram'
?| Any of the keys exist attributes ?| array['ram','cpu']
?& All of the keys exist attributes ?& array['ram','cpu']
-- Find products with RAM = 16GB (containment — uses GIN index!) SELECT name FROM products WHERE attributes @> '{"ram": "16GB"}'; -- Find products that HAVE a 'color' key SELECT name FROM products WHERE attributes ? 'color'; -- Find products that have BOTH 'ram' and 'cpu' keys SELECT name FROM products WHERE attributes ?& array['ram', 'cpu']; -- Array containment: find products that have 'HDMI' in ports array SELECT name FROM products WHERE attributes @> '{"ports": ["HDMI"]}';
🚀
GIN Index Tip: The @>, ?, ?|, ?& operators are all supported by GIN indexes. This makes them extremely fast on large tables. We'll set this up in Section 2.

Other Useful Operators

Operator / FunctionPurposeExample
||Concatenate / merge two JSONB objects'{"a":1}' || '{"b":2}'{"a":1,"b":2}
- (minus key)Delete a key from JSONBattributes - 'ram'
#-Delete at pathattributes #- '{ports,0}'
jsonb_set()Set/update a value at pathjsonb_set(attributes, '{ram}', '"32GB"')
jsonb_pretty()Format JSONB as readable stringjsonb_pretty(attributes)
jsonb_each()Expand object to (key, value) rowsSELECT * FROM jsonb_each(attributes)
jsonb_array_elements()Expand JSON array to rowsSELECT * FROM jsonb_array_elements(attributes->'ports')
-- Merge extra data into existing attributes UPDATE products SET attributes = attributes || '{"weight_kg": 1.4}' WHERE name = 'ThinkPad X1'; -- Update just one nested key UPDATE products SET attributes = jsonb_set(attributes, '{ram}', '"32GB"') WHERE name = 'ThinkPad X1'; -- Delete a key UPDATE products SET attributes = attributes - 'storage' WHERE id = 1; -- Expand ports array to individual rows SELECT name, jsonb_array_elements_text(attributes -> 'ports') AS port FROM products WHERE category = 'laptop'; -- result: -- ThinkPad X1 | USB-C -- ThinkPad X1 | HDMI -- ThinkPad X1 | USB-A

🧪 Interactive: Operator Explorer

Try JSONB Operators Simulated

Select an operator to see what it does with the sample product data below.

{
  "ram": "16GB",
  "cpu": "i7",
  "storage": "512GB SSD",
  "ports": ["USB-C", "HDMI", "USB-A"],
  "in_stock": true,
  "rating": 4.5
}
Select an operator above and click Run.
2

JSONB Indexing

GIN indexes — the secret to fast JSONB queries at scale

Why JSONB Needs Indexes

Without an index, filtering on a JSONB column does a full sequential scan — reads every row and inspects the JSON. On a table with 10M rows, this is catastrophically slow.

🐢
No index: WHERE attributes ->> 'ram' = '16GB' on 1M rows = full table scan, ~2-5 seconds. With GIN index = <10ms.

GIN Index — The Default Choice

GIN (Generalized Inverted Index) indexes every key-value pair inside the JSONB document. Think of it like a book index — it builds a map of "where does key X with value Y appear?"

-- Create a GIN index on the whole JSONB column -- This indexes ALL keys and values inside the document CREATE INDEX idx_products_attributes ON products USING GIN (attributes); -- Now these queries use the index: SELECT * FROM products WHERE attributes @> '{"ram": "16GB"}'; SELECT * FROM products WHERE attributes ? 'color'; SELECT * FROM products WHERE attributes ?& array['ram', 'cpu'];

GIN Operator Classes

PostgreSQL has two GIN operator classes for JSONB. Choose based on your use case:

Operator ClassSupportsIndex SizeBest For
jsonb_ops (default) @> ? ?| ?& Larger General JSONB queries — containment & key existence
jsonb_path_ops @> only Smaller ✅ When you only need @> (containment). Faster & smaller.
-- Default (jsonb_ops) — all operators supported CREATE INDEX idx_attrs_full ON products USING GIN (attributes); -- jsonb_path_ops — smaller index, only @> supported -- Use when you ONLY ever use @> containment queries CREATE INDEX idx_attrs_path ON products USING GIN (attributes jsonb_path_ops);

Expression Indexes on JSONB (B-Tree)

If you always query a specific key inside JSONB, a regular B-Tree expression index is often faster and smaller than GIN.

-- B-Tree index on a specific JSONB key (extracted as text) CREATE INDEX idx_products_ram ON products ((attributes ->> 'ram')); -- Now this query uses the B-Tree index (much faster for equality) SELECT * FROM products WHERE attributes ->> 'ram' = '16GB'; -- Partial index: only index in-stock laptops with known RAM CREATE INDEX idx_instock_ram ON products ((attributes ->> 'ram')) WHERE category = 'laptop' AND (attributes ->> 'in_stock')::boolean = TRUE;

Choosing the Right Index Strategy

GIN (full)
When you query many different keys, use ?, or don't know which key up front
GIN (path_ops)
When you only ever use @> containment. Smaller & faster for that case
B-Tree expression
When you always query the same known key. Best equality performance
Partial B-Tree
Combine a WHERE clause + expression index for hyper-targeted queries
⚠️
Note: GIN indexes do NOT help with attributes ->> 'key' = 'value' (arrow operator). Use @> (containment) for GIN, or create a B-Tree expression index for specific key comparisons.

Verify Index is Being Used

EXPLAIN ANALYZE SELECT * FROM products WHERE attributes @> '{"ram": "16GB"}'; -- You should see: Bitmap Index Scan on idx_products_attributes -- NOT: Seq Scan on products
3

Hybrid Relational + JSON Design

The most powerful pattern — very common in SaaS products

What Is the Hybrid Pattern?

The core idea: use relational columns for structured, known, frequently-queried data, and JSONB for flexible, variable, or extensible data. Never go full-JSON when you have structured fields.

❌ Anti-Pattern — All JSON
-- Don't put everything in JSON
CREATE TABLE users (
  id   SERIAL PRIMARY KEY,
  data JSONB  -- name, email, role all in here!
);

-- Problems:
-- Can't index email properly
-- Can't enforce NOT NULL on name
-- Can't use foreign keys
-- Poor query performance
✅ Hybrid Pattern
-- Fixed fields → columns
-- Variable fields → JSONB
CREATE TABLE users (
  id          SERIAL PRIMARY KEY,
  email       TEXT UNIQUE NOT NULL,
  name        TEXT NOT NULL,
  role        TEXT NOT NULL,
  preferences JSONB  -- theme, notifications, etc.
);

-- Best of both worlds!

Real-World Example: SaaS User with Preferences

-- HYBRID: relational columns + jsonb for extensible settings CREATE TABLE users ( id BIGSERIAL PRIMARY KEY, email TEXT UNIQUE NOT NULL, name TEXT NOT NULL, role TEXT NOT NULL DEFAULT 'member', created_at TIMESTAMPTZ DEFAULT NOW(), preferences JSONB DEFAULT '{}' ); -- Insert user with preferences INSERT INTO users (email, name, role, preferences) VALUES ( 'alice@example.com', 'Alice', 'admin', '{ "theme": "dark", "language": "en", "notifications": { "email": true, "sms": false, "push": true }, "dashboard_widgets": ["revenue", "users", "events"] }' ); -- Query: find all users with dark theme SELECT name, email FROM users WHERE preferences @> '{"theme": "dark"}'; -- Query: find users with email notifications enabled SELECT name FROM users WHERE preferences #>> '{notifications,email}' = 'true'; -- Update: add a new widget to the dashboard without touching other prefs UPDATE users SET preferences = jsonb_set( preferences, '{dashboard_widgets}', (preferences -> 'dashboard_widgets') || '"alerts"' ) WHERE email = 'alice@example.com';

Real-World Example: E-Commerce Product Catalog

The classic use case — every category has different attributes but shares a common structure.

CREATE TABLE products ( id BIGSERIAL PRIMARY KEY, sku TEXT UNIQUE NOT NULL, name TEXT NOT NULL, price NUMERIC(10,2) NOT NULL, category_id INT REFERENCES categories(id), is_active BOOLEAN DEFAULT TRUE, attributes JSONB DEFAULT '{}' -- variable per category ); -- Laptop INSERT INTO products (sku, name, price, category_id, attributes) VALUES ('LP001', 'ThinkPad X1', 1299.99, 1, '{"ram":"16GB","cpu":"i7","screen_inch":14,"weight_kg":1.4}'); -- T-Shirt INSERT INTO products (sku, name, price, category_id, attributes) VALUES ('TS001', 'Classic Tee', 29.99, 2, '{"color":"navy","sizes":["S","M","L","XL"],"material":"Cotton"}'); -- Index for fast attribute search CREATE INDEX idx_products_attrs ON products USING GIN(attributes); -- Find all products available in size 'L' SELECT name, price FROM products WHERE attributes @> '{"sizes": ["L"]}'; -- Find laptops with 16GB RAM under $1500 SELECT name, price FROM products WHERE category_id = 1 AND attributes @> '{"ram": "16GB"}' AND price < 1500;

Real-World Example: Audit Log / Event Tracking

Store events with a fixed schema + flexible payload — perfect for analytics, audit trails, and webhook logs.

CREATE TABLE events ( id BIGSERIAL PRIMARY KEY, user_id BIGINT REFERENCES users(id), event_type TEXT NOT NULL, created_at TIMESTAMPTZ DEFAULT NOW(), metadata JSONB DEFAULT '{}' ); -- Different events with different payloads INSERT INTO events (user_id, event_type, metadata) VALUES (1, 'login', '{"ip":"1.2.3.4","device":"chrome"}'), (1, 'purchase', '{"order_id":42,"amount":99.99,"items":3}'), (1, 'profile_update','{"changed_fields":["name","avatar"]}'); -- Query: all purchase events over $50 SELECT user_id, created_at, (metadata ->> 'amount')::numeric AS amount FROM events WHERE event_type = 'purchase' AND (metadata ->> 'amount')::numeric > 50;

Real-World Example: AI / RAG System with JSONB Metadata

A common pattern in modern AI platforms — documents with metadata stored as JSONB.

CREATE TABLE documents ( id BIGSERIAL PRIMARY KEY, org_id INT NOT NULL, title TEXT NOT NULL, content TEXT, embedding vector(1536), -- pgvector metadata JSONB DEFAULT '{}', -- source, author, tags, etc. created_at TIMESTAMPTZ DEFAULT NOW() ); -- Insert a document with rich metadata INSERT INTO documents (org_id, title, metadata) VALUES (1, 'Q4 Earnings Report', '{ "source": "google_drive", "author": "finance_team", "tags": ["earnings","finance","Q4"], "confidential": true, "department": "finance" }'); -- Find finance department docs tagged with 'earnings' SELECT title FROM documents WHERE org_id = 1 AND metadata @> '{"department":"finance","tags":["earnings"]}'; -- GIN index to make this fast CREATE INDEX idx_doc_metadata ON documents USING GIN(metadata);

Decision Framework — Column vs JSONB

Ask these questions for each field:
▶ Is this field always present? → Use a column
▶ Do you JOIN on this field? → Use a column
▶ Do you ORDER BY or GROUP BY on it? → Use a column
▶ Is this a foreign key? → Use a column
▶ Is this field optional / varies per row? → Use JSONB
▶ Is this a bag of user settings / preferences? → Use JSONB
▶ Is this an external API payload you want to keep raw? → Use JSONB
▶ Does the structure change per category / type? → Use JSONB

🧪 Interactive: Hybrid Design Advisor

Should this field be a column or JSONB? Simulated
Fill in the details and click Get Recommendation.
4

Advanced JSONB Patterns

Aggregation, validation, generated columns & more

Building JSON from Relational Data

PostgreSQL can build JSON on the fly from relational tables — perfect for APIs.

-- Build JSON object per row SELECT json_build_object( 'id', id, 'name', name, 'price', price, 'attributes', attributes ) AS product_json FROM products; -- Aggregate all products into a JSON array SELECT json_agg( json_build_object('id', id, 'name', name, 'price', price) ) AS products_list FROM products WHERE is_active = TRUE; -- Nest related data — orders with their items SELECT o.id, o.total, jsonb_agg(jsonb_build_object( 'product', p.name, 'qty', oi.quantity, 'price', oi.unit_price )) AS items FROM orders o JOIN order_items oi ON o.id = oi.order_id JOIN products p ON oi.product_id = p.id GROUP BY o.id, o.total;

Generated Columns from JSONB

Extract a JSONB key into a real column automatically — best of both worlds: JSONB flexibility + column performance.

-- Generated column: auto-populated from JSONB field -- Now you can index and query it as a plain column! ALTER TABLE products ADD COLUMN ram_gen TEXT GENERATED ALWAYS AS (attributes ->> 'ram') STORED; -- Index the generated column like any normal column CREATE INDEX idx_products_ram_gen ON products(ram_gen); -- Now this is a fast B-Tree lookup — no JSON overhead SELECT * FROM products WHERE ram_gen = '16GB';
Pro technique: Use generated columns when a JSONB key is queried very frequently. It turns a JSON lookup into a direct column access with a B-Tree index — far more efficient.

JSONB with CHECK Constraints (Basic Validation)

-- Ensure 'theme' is always 'light' or 'dark' if present ALTER TABLE users ADD CONSTRAINT chk_theme CHECK ( preferences ->> 'theme' IS NULL OR preferences ->> 'theme' IN ('light', 'dark') ); -- Ensure attributes is always an object (not an array or string) ALTER TABLE products ADD CONSTRAINT chk_attrs_object CHECK (jsonb_typeof(attributes) = 'object');

Querying Across Nested Arrays (Lateral Join)

When JSONB contains an array and you need to filter or join on elements inside it, use LATERAL with jsonb_array_elements.

-- Find all products and list each tag as a separate row SELECT p.name, tag.value AS tag FROM documents p, LATERAL jsonb_array_elements_text(p.metadata -> 'tags') AS tag(value) WHERE tag.value = 'finance'; -- Count products per tag SELECT tag.value AS tag, COUNT(*) AS count FROM documents d, LATERAL jsonb_array_elements_text(d.metadata -> 'tags') AS tag(value) GROUP BY tag.value ORDER BY count DESC;

JSONB Path Queries (jsonpath — PostgreSQL 12+)

PostgreSQL 12 added SQL/JSON path language — a powerful way to query deep into JSONB with conditions.

-- jsonb_path_exists: does any port start with 'USB'? SELECT name FROM products WHERE jsonb_path_exists( attributes, '$.ports[*] ? (@ starts with "USB")' ); -- jsonb_path_query: extract all ports containing 'USB' SELECT name, jsonb_path_query_array(attributes, '$.ports[*] ? (@ starts with "USB")') AS usb_ports FROM products WHERE category = 'laptop'; -- Filter: products where rating > 4 SELECT name FROM products WHERE jsonb_path_exists(attributes, '$.rating ? (@ > 4)');
⚠️

Common JSONB Mistakes

Pitfalls that kill performance in production

Mistake 1 — Using ->> Instead of @> for GIN Queries

❌ Won't Use GIN Index
-- ->> returns text, GIN can't help here
-- Results in a sequential scan!
SELECT * FROM products
WHERE attributes->>'ram' = '16GB';
✅ Uses GIN Index
-- @> containment uses GIN index
SELECT * FROM products
WHERE attributes @> '{"ram":"16GB"}';

Mistake 2 — Putting Relational Data in JSONB

❌ Wrong
-- user_id in JSONB = can't use FK, slow JOINs
INSERT INTO orders (data) VALUES (
  '{"user_id": 42, "total": 99.99}'
);
✅ Correct
-- user_id as a proper FK column
INSERT INTO orders (user_id, total, metadata)
VALUES (42, 99.99,
  '{"promo_code":"SAVE10"}');

Mistake 3 — Forgetting to Cast JSONB Numbers

❌ Wrong — String Comparison
-- "9" > "10" as text! Wrong result.
WHERE attributes->>'count' > '10'
✅ Correct — Numeric Cast
-- Cast to numeric first
WHERE (attributes->>'count')::numeric > 10

Mistake 4 — No Index on Frequently Queried JSONB

🚨
Critical: A table with 1M rows and no GIN index on JSONB will do a full sequential scan on every query. Always add at minimum CREATE INDEX ... USING GIN (attributes) if you query JSONB columns.

Mistake 5 — Over-using JSONB (JSON for Everything)

⚠️
Remember: JSONB is a tool, not a replacement for relational design. If you catch yourself putting IDs, foreign keys, timestamps, or any field you regularly JOIN or sort on into JSONB — stop and use a proper column. JSONB is for the variable part of your data.

🎯 Quick Check — Test Your Understanding

JSONB Quiz 5 Questions
📋

Summary & Cheatsheet

Everything you need to remember from Topic 13

Operator Cheatsheet

OperatorReturnsUse When
-> 'key'JSONBPass to another operator / chain access
->> 'key'TEXTDisplay, compare, or cast value
#> '{a,b}'JSONBNested path access (keep as JSONB)
#>> '{a,b}'TEXTNested path access (get as text)
@> '{...}'BOOLEANContainment check — uses GIN index ✅
? 'key'BOOLEANKey existence — uses GIN index ✅
?| array[...]BOOLEANAny key exists — uses GIN index ✅
?& array[...]BOOLEANAll keys exist — uses GIN index ✅
||JSONBMerge two JSONB objects
- 'key'JSONBDelete a key

Key Rules to Remember

Always use JSONB
Never use json type in production. JSONB is faster to read and supports GIN indexing.
Use @> for GIN
Only containment (@>) and existence (?) operators use GIN. Arrow operators need B-Tree expression index.
Cast numbers
Always cast: (col->>'price')::numeric. Raw ->> returns text — numeric comparison will break.
Hybrid design
Columns for fixed, queryable, FK data. JSONB for variable, optional, or extensible fields.
Index always
Always add GIN index on any JSONB column used in WHERE clauses, even during development.
Generated columns
For hot JSONB keys queried every request, extract to a generated stored column with B-Tree index.

What's Next?

🚀
Topic 14 — Replication: How PostgreSQL replicates data across servers for high availability and read scaling. You'll learn WAL, streaming replication, and logical replication.