PostgreSQL — Topic 2: OLTP Database Design

2.1

OLTP Principles

What makes a database "OLTP" and why it matters

▾

💡

OLTP vs OLAP in one sentence: OLTP is many users doing small, fast operations (place an order, update a cart). OLAP is a few analysts doing massive aggregations (total revenue by region this year). Most apps you build are OLTP.

The 5 Core OLTP Characteristics

High Transaction Volume

Thousands to millions of queries per second. Every row matters, every millisecond counts.

Small Queries

Queries touch a handful of rows — not millions. WHERE user_id = 42, not GROUP BY country.

Fast Inserts

New orders, messages, events — must write in under 5ms. Avoid heavy triggers on write paths.

Fast Updates

Update inventory, change order status, increment counters — single-row updates, always indexed.

ACID Transactions

Atomicity, Consistency, Isolation, Durability. Money never disappears; orders never half-create.

OLTP vs OLAP — Side by Side

Property	OLTP	OLAP
Query type	Point lookups, small updates	Aggregations, full scans
Data volume per query	Few rows	Millions of rows
Concurrent users	Thousands	Few analysts
Schema style	Normalized (3NF)	Denormalized (star schema)
Primary concern	Latency, throughput	Query speed on big data
Examples	PostgreSQL, MySQL	BigQuery, Redshift, Snowflake

2.2

Designing OLTP Systems

The 7-step process used by senior engineers

▾

Follow these 7 steps every time you design a new OLTP system. We'll use an e-commerce app as our running example throughout.

Step 1
Entities

Step 2
Relationships

Step 3
Normalize

Step 4
Transactions

Step 5
Indexes

Step 6
Audit Cols

Step 7
Soft Deletes

📋

Step 1: Identify Business Entities — List every noun in the system. For e-commerce, that's users, products, orders, payments, etc.

User

→

Address

→

Product

→

Category

→

Inventory

Cart

→

Order

→

OrderItem

→

Payment

-- E-Commerce Entity List
-- Ask: "What are the things (nouns) in this domain?"

-- User management
users        → people who buy
addresses    → where to ship (user has many addresses)

-- Catalog
categories   → Electronics, Clothing, Books…
products     → items for sale
inventory    → stock levels per product (maybe per warehouse)

-- Shopping
carts        → temporary basket before checkout
cart_items   → products in the cart

-- Fulfillment
orders       → confirmed purchase intent
order_items  → individual line items in an order
payments     → money transaction records
            

🔗

Step 2: Define Relationships — Map how entities connect. Draw the chain from user to money.

-- Relationship tree (read top to bottom):

User (1)
 ├── has many → Addresses (N)
 ├── has many → Orders    (N)
 │                │
 │                └── has many → OrderItems (N)
 │                                    │
 │                                    └── belongs to → Product (1)
 │                                                          │
 │                                                          └── has one → Inventory
 └── has many → Payments  (N)
                     │
                     └── belongs to → Order (1)

-- Key relationships:
-- users        1:N  orders
-- orders       1:N  order_items
-- products     1:N  order_items  (product appears in many orders)
-- orders       1:1  payments     (each order has one payment record)
-- products     M:N  categories   (via product_categories junction)
            

✂️

Step 3: Normalize to 3NF — Split data so each fact lives in exactly one place. Remove all redundancy.

-- ❌ BAD: storing customer info directly on the order
orders_bad
┌──────────┬───────────────┬───────────────────┬──────────────┐
│ order_id │ customer_name │ customer_email    │ customer_city│
└──────────┴───────────────┴───────────────────┴──────────────┘
-- Problem: if email changes, every order row must be updated!
-- Problem: city depends on user, not the order (transitive dep)

-- ✅ GOOD: normalize — each fact lives in one table
CREATE TABLE users (
    user_id    BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email      TEXT   UNIQUE NOT NULL,
    first_name TEXT   NOT NULL,
    last_name  TEXT   NOT NULL
);

CREATE TABLE orders (
    order_id   BIGINT       GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id    BIGINT       NOT NULL REFERENCES users(user_id),
    status     TEXT         NOT NULL DEFAULT 'pending',
    total      NUMERIC(12,2) NOT NULL,
    created_at TIMESTAMPTZ  DEFAULT NOW()
);
-- Customer info is fetched via JOIN — never duplicated
            

🔒

Step 4: Define Transaction Boundaries — Group related writes into a single atomic transaction so they all succeed or all fail together.

Example: Placing an order involves 4 operations. All must succeed or none should commit:

Create Order — insert into orders table with status = 'pending'

Create Order Items — insert each product line into order_items

Reduce Inventory — decrement stock in inventory for each product

Create Payment Record — insert into payments with status = 'pending'

-- All 4 operations in a SINGLE transaction

BEGIN;

-- Step 1: Create the order
INSERT INTO orders (user_id, status, total)
VALUES (42, 'pending', 350.00)
RETURNING order_id;   -- capture the new order_id

-- Step 2: Add order items
INSERT INTO order_items (order_id, product_id, quantity, price)
VALUES
    (1001, 5, 2, 100.00),
    (1001, 9, 1, 150.00);

-- Step 3: Reduce inventory (with a check!)
UPDATE inventory
SET    stock = stock - 2
WHERE  product_id = 5
AND    stock >= 2;  -- prevents negative stock

-- Step 4: Create payment record
INSERT INTO payments (order_id, amount, status, method)
VALUES (1001, 350.00, 'pending', 'card');

COMMIT;   -- all or nothing!

-- If anything fails above, run ROLLBACK instead:
-- ROLLBACK;  → undoes ALL 4 operations atomically
            

🚨

Never split related writes across multiple requests. If your server crashes between step 2 and step 3, you'd have an order with items but no inventory deduction. Always wrap in a transaction.

⚡

Step 5: Index Strategy — Add indexes on columns you filter or join on. But don't over-index — every index slows down writes.

-- ✅ Index columns you search by, filter on, or JOIN with

-- users: look up by email (login)
CREATE UNIQUE INDEX idx_users_email
    ON users (email);

-- orders: look up all orders for a user (very common)
CREATE INDEX idx_orders_user_id
    ON orders (user_id);

-- orders: look up recent orders (dashboard, listing)
CREATE INDEX idx_orders_created_at
    ON orders (created_at DESC);

-- order_items: look up items for a given order (always needed)
CREATE INDEX idx_order_items_order_id
    ON order_items (order_id);

-- payments: look up payment for an order
CREATE INDEX idx_payments_order_id
    ON payments (order_id);

-- ❌ DON'T index columns you never filter on:
--   first_name, last_name, description, notes
-- ❌ DON'T index low-cardinality columns:
--   status (only 'pending','paid','cancelled') → not selective enough

-- ✅ EXCEPTION: partial index for low-cardinality but important subset
CREATE INDEX idx_orders_pending
    ON orders (created_at)
    WHERE status = 'pending';  -- only index pending orders
            

🕒

Step 6: Audit Columns — Every OLTP table should track who created/changed a row and when. This is essential for debugging, compliance, and data lineage.

-- Standard audit columns — add to EVERY table

CREATE TABLE orders (
    order_id   BIGINT      GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id    BIGINT      NOT NULL REFERENCES users(user_id),
    total      NUMERIC(12,2),
    status     TEXT        DEFAULT 'pending',

    -- ✅ Audit columns
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    created_by BIGINT      REFERENCES users(user_id),  -- who placed it
    updated_by BIGINT      REFERENCES users(user_id)   -- who last touched it
);

-- Auto-update updated_at using a trigger:
CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_orders_updated_at
    BEFORE UPDATE ON orders
    FOR EACH ROW EXECUTE FUNCTION set_updated_at();

-- Now every UPDATE automatically sets updated_at = NOW()
            

✓

Pro tip: Create a reusable trigger function once and attach it to every table. Some teams use a base migration that adds these columns automatically to all new tables.

🗑️

Step 7: Soft Deletes — Don't physically delete rows in OLTP systems. Mark them as deleted instead. This preserves history, enables undo, and prevents FK violation errors.

-- Add soft delete columns to every table

ALTER TABLE users
    ADD COLUMN is_deleted BOOLEAN     NOT NULL DEFAULT FALSE,
    ADD COLUMN deleted_at  TIMESTAMPTZ,
    ADD COLUMN deleted_by  BIGINT      REFERENCES users(user_id);

-- ✅ Soft delete: mark as deleted, don't remove
UPDATE users
SET
    is_deleted = TRUE,
    deleted_at = NOW(),
    deleted_by = 1   -- the admin who deleted
WHERE user_id = 42;

-- ❌ NEVER do this in OLTP:
-- DELETE FROM users WHERE user_id = 42;
-- This would cascade-delete all their orders, payments, etc.

-- Filter out deleted records in every query:
SELECT * FROM users WHERE is_deleted = FALSE;

-- ✅ Better: create a VIEW that hides deleted rows
CREATE VIEW active_users AS
    SELECT * FROM users WHERE is_deleted = FALSE;

-- App queries active_users instead of users directly
SELECT * FROM active_users WHERE email = 'alice@example.com';

-- ✅ Partial index for performance (only index non-deleted rows)
CREATE INDEX idx_users_email_active
    ON users (email)
    WHERE is_deleted = FALSE;
            

2.3

Common OLTP Patterns

Real schemas used at scale: Auth, E-Commerce, Ride Sharing, AI SaaS

▾

Click any pattern to see its full production schema with SQL:

🔐 User Management

Auth, roles, permissions

users roles permissions user_roles

🛒 E-Commerce

Orders, products, payments

products orders order_items payments

🚗 Ride Sharing

Drivers, trips, locations

drivers riders trips locations

🤖 AI SaaS

Conversations, embeddings, RAG

conversations messages documents embeddings

🔐

User Management Pattern — RBAC (Role-Based Access Control) used by virtually every production app. Users have roles, roles have permissions.

users

user_idPK BIGINT

emailUQ TEXT

password_hashTEXT

is_activeBOOL

created_atTIMESTAMPTZ

roles

role_idPK BIGINT

nameUQ TEXT

descriptionTEXT

permissions

permission_idPK BIGINT

nameUQ TEXT

resourceTEXT

actionTEXT

user_roles

user_idFK BIGINT

role_idFK BIGINT

granted_atTIMESTAMPTZ

granted_byFK BIGINT

role_permissions

role_idFK BIGINT

permission_idFK BIGINT

-- Full RBAC schema

CREATE TABLE users (
    user_id       BIGINT      GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email         TEXT        UNIQUE NOT NULL,
    password_hash TEXT        NOT NULL,
    is_active     BOOLEAN     NOT NULL DEFAULT TRUE,
    is_deleted    BOOLEAN     NOT NULL DEFAULT FALSE,
    created_at    TIMESTAMPTZ DEFAULT NOW(),
    updated_at    TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE roles (
    role_id     BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name        TEXT   UNIQUE NOT NULL,  -- 'admin', 'editor', 'viewer'
    description TEXT
);

CREATE TABLE permissions (
    permission_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name          TEXT   UNIQUE NOT NULL, -- 'orders:read', 'users:delete'
    resource      TEXT   NOT NULL,        -- 'orders', 'users', 'products'
    action        TEXT   NOT NULL         -- 'read', 'write', 'delete'
);

-- Junction: users ↔ roles (M:N)
CREATE TABLE user_roles (
    user_id    BIGINT      REFERENCES users(user_id) ON DELETE CASCADE,
    role_id    BIGINT      REFERENCES roles(role_id) ON DELETE CASCADE,
    granted_at TIMESTAMPTZ DEFAULT NOW(),
    granted_by BIGINT      REFERENCES users(user_id),
    PRIMARY KEY (user_id, role_id)
);

-- Junction: roles ↔ permissions (M:N)
CREATE TABLE role_permissions (
    role_id       BIGINT REFERENCES roles(role_id) ON DELETE CASCADE,
    permission_id BIGINT REFERENCES permissions(permission_id) ON DELETE CASCADE,
    PRIMARY KEY (role_id, permission_id)
);

-- Check if a user has a specific permission:
SELECT COUNT(*) > 0 AS has_permission
FROM   user_roles ur
JOIN   role_permissions rp ON ur.role_id = rp.role_id
JOIN   permissions p       ON rp.permission_id = p.permission_id
WHERE  ur.user_id = 42
AND    p.name = 'orders:read';
          

🛒

E-Commerce Pattern — The canonical OLTP schema. Used by Flipkart, Amazon-style apps. Notice how inventory is separate from products to allow per-warehouse tracking.

products

product_idPK

nameTEXT

priceNUMERIC

skuUQ

inventory

product_idFK

stockINT

reservedINT

orders

order_idPK

user_idFK

statusTEXT

totalNUMERIC

order_items

order_idFK

product_idFK

quantityINT

priceNUMERIC

payments

payment_idPK

order_idFK

amountNUMERIC

statusTEXT

gateway_refTEXT

-- Complete E-Commerce OLTP Schema

CREATE TABLE categories (
    category_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name        TEXT   NOT NULL,
    parent_id   BIGINT REFERENCES categories(category_id)  -- self-ref for hierarchy
);

CREATE TABLE products (
    product_id  BIGINT       GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name        TEXT         NOT NULL,
    description TEXT,
    sku         TEXT         UNIQUE NOT NULL,   -- Stock Keeping Unit
    price       NUMERIC(10,2) NOT NULL,
    is_active   BOOLEAN      NOT NULL DEFAULT TRUE,
    created_at  TIMESTAMPTZ  DEFAULT NOW()
);

CREATE TABLE inventory (
    product_id BIGINT PRIMARY KEY REFERENCES products(product_id),
    stock      INT    NOT NULL DEFAULT 0 CHECK (stock >= 0),
    reserved   INT    NOT NULL DEFAULT 0,  -- in-progress orders
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE orders (
    order_id          BIGINT       GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id           BIGINT       NOT NULL REFERENCES users(user_id),
    shipping_address  JSONB,                -- snapshot of address at order time
    status            TEXT         NOT NULL DEFAULT 'pending',
    total             NUMERIC(12,2) NOT NULL,
    created_at        TIMESTAMPTZ  DEFAULT NOW(),
    updated_at        TIMESTAMPTZ  DEFAULT NOW()
);

CREATE TABLE order_items (
    order_id   BIGINT       NOT NULL REFERENCES orders(order_id),
    product_id BIGINT       NOT NULL REFERENCES products(product_id),
    quantity   INT          NOT NULL CHECK (quantity > 0),
    price      NUMERIC(10,2) NOT NULL,  -- price at time of order (snapshot)
    PRIMARY KEY (order_id, product_id)
);

CREATE TABLE payments (
    payment_id  BIGINT       GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    order_id    BIGINT       UNIQUE NOT NULL REFERENCES orders(order_id),
    amount      NUMERIC(12,2) NOT NULL,
    status      TEXT         NOT NULL DEFAULT 'pending',
    method      TEXT,        -- 'upi', 'card', 'netbanking', 'cod'
    gateway_ref TEXT,        -- Razorpay/Stripe transaction ID
    paid_at     TIMESTAMPTZ,
    created_at  TIMESTAMPTZ  DEFAULT NOW()
);
          

🚗

Ride Sharing Pattern — Used by Uber/Ola. Key challenge: real-time location tracking and matching drivers to riders. Location history is append-only (never updated).

drivers

driver_idPK

user_idFK

vehicle_typeTEXT

is_availableBOOL

ratingNUMERIC

trips

trip_idPK

driver_idFK

rider_idFK

statusTEXT

fareNUMERIC

locations

location_idPK

driver_idFK

lat / lngNUMERIC

recorded_atTIMESTAMPTZ

payments

payment_idPK

trip_idFK

amountNUMERIC

methodTEXT

-- Ride Sharing OLTP Schema

CREATE TABLE drivers (
    driver_id    BIGINT       GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id      BIGINT       UNIQUE NOT NULL REFERENCES users(user_id),
    vehicle_type TEXT         NOT NULL,  -- 'auto', 'bike', 'mini', 'sedan'
    license_num  TEXT         UNIQUE NOT NULL,
    is_available BOOLEAN      NOT NULL DEFAULT FALSE,
    rating       NUMERIC(3,2) DEFAULT 5.0,   -- running average
    created_at   TIMESTAMPTZ  DEFAULT NOW()
);

CREATE TABLE trips (
    trip_id      BIGINT       GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    driver_id    BIGINT       NOT NULL REFERENCES drivers(driver_id),
    rider_id     BIGINT       NOT NULL REFERENCES users(user_id),
    status       TEXT         NOT NULL DEFAULT 'requested',
    -- status: requested → accepted → in_progress → completed / cancelled
    pickup_lat   NUMERIC(9,6),
    pickup_lng   NUMERIC(9,6),
    dropoff_lat  NUMERIC(9,6),
    dropoff_lng  NUMERIC(9,6),
    distance_km  NUMERIC(8,2),
    fare         NUMERIC(8,2),
    started_at   TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    created_at   TIMESTAMPTZ  DEFAULT NOW()
);

-- Location pings from driver app (append-only, NEVER update)
CREATE TABLE driver_locations (
    location_id BIGINT      GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    driver_id   BIGINT      NOT NULL REFERENCES drivers(driver_id),
    lat         NUMERIC(9,6) NOT NULL,
    lng         NUMERIC(9,6) NOT NULL,
    recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- This table grows FAST. Partition by day/week (covered in Topic 11)

-- Get latest driver location:
SELECT DISTINCT ON (driver_id)
    driver_id, lat, lng, recorded_at
FROM  driver_locations
WHERE driver_id = 7
ORDER BY driver_id, recorded_at DESC;
          

🤖

AI SaaS Pattern — Used for apps like ChatGPT, Notion AI, or your own FastAPI + AI agent backend. Multi-tenant with conversation history and vector embeddings for RAG.

organizations

org_idPK

nameTEXT

planTEXT

conversations

conv_idPK

user_idFK

titleTEXT

modelTEXT

messages

msg_idPK

conv_idFK

roleTEXT

contentTEXT

tokensINT

documents

doc_idPK

org_idFK

filenameTEXT

contentTEXT

embeddings

embed_idPK

doc_idFK

chunk_textTEXT

vectorVECTOR

-- AI SaaS Multi-Tenant Schema
-- Requires: pgvector extension for embeddings
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE organizations (
    org_id     BIGINT      GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name       TEXT        NOT NULL,
    plan       TEXT        NOT NULL DEFAULT 'free',  -- 'free','pro','enterprise'
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE users (
    user_id    BIGINT      GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    org_id     BIGINT      NOT NULL REFERENCES organizations(org_id),
    email      TEXT        UNIQUE NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE conversations (
    conv_id    UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id    BIGINT      NOT NULL REFERENCES users(user_id),
    org_id     BIGINT      NOT NULL REFERENCES organizations(org_id),
    title      TEXT,
    model      TEXT        NOT NULL DEFAULT 'gpt-4',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE messages (
    msg_id     BIGINT      GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    conv_id    UUID        NOT NULL REFERENCES conversations(conv_id),
    role       TEXT        NOT NULL,  -- 'user', 'assistant', 'system'
    content    TEXT        NOT NULL,
    tokens     INT,                   -- track usage for billing
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE documents (
    doc_id     UUID        PRIMARY KEY DEFAULT gen_random_uuid(),
    org_id     BIGINT      NOT NULL REFERENCES organizations(org_id),
    filename   TEXT        NOT NULL,
    mime_type  TEXT,
    content    TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- RAG: store chunked text + vector embedding
CREATE TABLE embeddings (
    embed_id    BIGINT          GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    doc_id      UUID            NOT NULL REFERENCES documents(doc_id),
    chunk_index INT             NOT NULL,
    chunk_text  TEXT            NOT NULL,
    vector      VECTOR(1536)    NOT NULL,  -- 1536 dims for OpenAI ada-002
    created_at  TIMESTAMPTZ     DEFAULT NOW()
);

-- Semantic search: find most similar chunks to a query vector
SELECT chunk_text, 1 - (vector <=> '[0.1, 0.2, ...]'::VECTOR) AS similarity
FROM   embeddings
WHERE  doc_id = '...'
ORDER BY vector <=> '[0.1, 0.2, ...]'::VECTOR
LIMIT  5;
          

⚠️

Multi-tenancy tip: Always store org_id on every table and add it to every query. Use Row Level Security (RLS, covered in Topic 17) to enforce tenant isolation automatically at the database level.

Topic 2: OLTP Database Design — all sections covered.

Ready for Topic 3: PostgreSQL Data Types?