👋 Hey {{first_name|there}},

Why this matters / where it hurts

You’ve probably seen this in production. A list endpoint feels fine. Page 1 is fast. Page 5 is okay. Then someone exports data, scrolls forever, or a batch job crawls the whole dataset. Suddenly p95 drifts up, CPU climbs, and the database starts doing work that feels pointless.

I used to treat pagination like a UI detail. Offset and limit, done. It looked clean in the ORM. Then a customer with a big dataset hit page 2000 and the query started walking past tens of thousands of rows just to throw them away. The endpoint did not get a little slower. It fell off a cliff. Incidents at odd hours followed.

If you read last week’s Queries and Indexes With Predictable Latency, this is the next step. An index can be perfect and your endpoint can still melt if pagination forces the database to scan and discard rows on deep pages.

We’ll make pagination a design choice, not an afterthought. One pattern, one sheet per endpoint, and a few defaults that keep p95 stable as data grows.

🧭 Mindset shift

From: “Offset pagination is fine, we can optimize later.”
To: “Pagination is an access path. Design it for growth from day one.”

Offset pagination looks innocent, but the database still has to produce every row before your page, walk past all of them, and only then return the page you asked for. At page 2000 with page size 50, that is 99,950 rows produced and thrown away. Even if it uses an index, it is doing work you do not keep.
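
Here is the shape of the problem in SQL. A minimal sketch, assuming a hypothetical events table; the column names are illustrative:

  -- Page 2000 with page size 50. The database must produce and
  -- discard 99,950 rows before returning the 50 the client keeps.
  SELECT id, created_at, payload
  FROM events
  ORDER BY created_at DESC, id DESC
  LIMIT 50 OFFSET 99950;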

Keyset pagination is the opposite. You keep a cursor, usually the last seen key, and ask for the next slice. The database seeks to a position and continues. It scales with page size, not with page number. That is the difference.
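
And the keyset version of the same page, against the same hypothetical table. The two placeholders are the (created_at, id) pair from the last row of the previous page:

  -- Seek straight past the last row the client saw, then read 50.
  -- Work done is proportional to page size, no matter how deep you are.
  SELECT id, created_at, payload
  FROM events
  WHERE (created_at, id) < (?, ?)
  ORDER BY created_at DESC, id DESC
  LIMIT 50;

The row-value comparison is Postgres syntax. On engines that lack it, expand the condition to created_at < ? OR (created_at = ? AND id < ?).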

Two rules that keep this simple

  • Use stable ordering with a unique tie breaker, usually created_at plus id. Without one, rows that share a timestamp can be skipped or duplicated at page boundaries (the sketch after this list shows how).

  • Move from “page number” to “cursor token,” even if the UI still shows “next.”
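
The first rule is easiest to see at a page boundary. Here is the failure mode as a sketch, same hypothetical table; the fix is the (created_at, id) pair in the query above:

  -- Broken cursor: no tie breaker. If several rows share the boundary
  -- created_at, this skips every one of them that was not yet returned,
  -- and switching to <= would repeat them instead.
  SELECT id, created_at
  FROM events
  WHERE created_at < ?
  ORDER BY created_at DESC
  LIMIT 50;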

🧰 Tool of the week: Pagination Design Sheet

Keep this one page next to each list endpoint. Fill it once. Update it when ordering or filters change.

  1. Scope and SLO
    Endpoint or query, expected QPS, p95 target, max page size, and whether deep paging is allowed.

  2. Sort order contract
    Write the exact ORDER BY clause, including a unique tie breaker, for example ORDER BY created_at DESC, id DESC.

  3. Filters and cardinality
    List filters users can apply. Note any highly skewed ones. This affects indexing and cursor design.

  4. Pagination mode
    Choose one

    • Keyset cursor for user lists and deep browsing

    • Offset only for small datasets or bounded pages, with a hard cap

    • Hybrid if you must. Offset for first few pages, cursor after

  5. Cursor shape
    Define what the token represents. Usually the last seen (created_at, id) pair plus filter context. Never accept a raw offset from clients for deep pages.

  6. SQL pattern
    Write the exact query pattern

    • First page query

    • Next page query using cursor conditions

    • Include the tie breaker logic.

  7. Index plan
    Define the index that supports the filters plus the order, for example a composite index on (tenant_id, created_at DESC, id DESC), with INCLUDE columns for list fields if needed. A DDL sketch follows this sheet.

  8. Consistency expectations
    State whether the list is a snapshot, approximate, or “best effort.” Cursor pagination is usually best effort unless you add snapshot semantics.

  9. Verification
    Run EXPLAIN on production-like data. Confirm index usage. Confirm no large sorts. Confirm rows read stays near rows returned.

  10. Escape hatch and safeguards
    Max page size, max crawl time, export endpoint separate from browse, rate limit per tenant. Document the fallback.
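
To make item 7 concrete, here is what an index plan might look like as Postgres DDL. A sketch built on the running created_at plus id example; the index name and INCLUDE columns are illustrative, not a prescription:

  -- Composite index matching the required filter plus the sort contract.
  -- The keyset predicate and the ORDER BY walk it in order, so there is
  -- no sort step and no pile of discarded rows. The INCLUDE clause is
  -- optional (Postgres 11+) and lets the page be served as an
  -- index-only scan.
  CREATE INDEX idx_events_tenant_recent
    ON events (tenant_id, created_at DESC, id DESC)
    INCLUDE (event_type, actor_id);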

🔍 Example: Audit log list for a large tenant

Scope
GET /audit-events filtered by tenant. Users browse in UI, and some export for compliance. Target p95 300 ms at 100 RPS. Page size 50. Deep paging happens.

Context
Table has 200M rows. Tenants vary wildly. Some have years of history. Offset paging becomes dangerous fast.

Using the sheet

  • Sort order contract: ORDER BY occurred_at DESC, id DESC

  • Filters: tenant_id is required, optional event_type, optional actor_id

  • Pagination mode: keyset cursor for UI browsing, separate export job for full dataset

  • Cursor shape: last (occurred_at, id) plus tenant_id and filter hash

  • SQL pattern (full statements are sketched after this list)

    • First page
      WHERE tenant_id = ? AND filters ORDER BY occurred_at DESC, id DESC LIMIT 50

    • Next page
      WHERE tenant_id = ? AND filters AND (occurred_at, id) < (?, ?) ORDER BY occurred_at DESC, id DESC LIMIT 50

  • Index plan: (tenant_id, occurred_at DESC, id DESC) plus optional partial or additional index for hot filter combos

  • Consistency: best effort. Events that arrive while a user is paging sort above the cursor and will not appear in that walk. That is acceptable for audit browsing

  • Verification: EXPLAIN shows index range scan, no external sort. Rows read close to 50 to 200, not tens of thousands

  • Escape hatch: page size capped at 100. Export runs async with backpressure and progress tracking
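
Spelled out, the two pattern bullets above become the following statements. A sketch assuming Postgres; columns beyond the sheet, like summary, are illustrative:

  -- First page for a tenant, optional filters applied as needed.
  SELECT id, occurred_at, event_type, actor_id, summary
  FROM audit_events
  WHERE tenant_id = ?
    AND event_type = ?               -- optional filter
  ORDER BY occurred_at DESC, id DESC
  LIMIT 50;

  -- Next page: seek past the last (occurred_at, id) the client saw.
  -- The cursor token must carry the same filters, or pages will not
  -- line up with each other.
  SELECT id, occurred_at, event_type, actor_id, summary
  FROM audit_events
  WHERE tenant_id = ?
    AND event_type = ?               -- same optional filter
    AND (occurred_at, id) < (?, ?)
  ORDER BY occurred_at DESC, id DESC
  LIMIT 50;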

What success looks like
Page 1 and page 2000 have similar p95. CPU does not climb with deep browsing. Export traffic does not share the same path.

Small confession
If product insists on showing “page numbers,” I keep cursor under the hood and render page numbers as a UI illusion for the first few pages. It is slightly fake. It keeps the system healthy.

Do this / avoid this

Do

  • Use keyset pagination for any endpoint that can be browsed deeply

  • Always sort by a stable key with a unique tie breaker

  • Carry cursor values, not page numbers, across requests

  • Build composite indexes that match filters and order

  • Separate export from browse, and rate limit exports per tenant

Avoid

  • Deep offset pagination on large tables

  • Sorting by non-unique columns without a tie breaker

  • Allowing unbounded page sizes “because some customers need it”

  • Using random ordering or unstable sorts

  • Treating export as “just page through everything” on the same endpoint

🧪 Mini challenge

Goal: make one list endpoint safe for deep paging today.

  • Pick one list endpoint with real traffic and a table that can grow large

  • Identify its current order by and whether it has a stable tie breaker

  • Implement keyset pagination using a cursor based on (sort_key, id)

  • Add the supporting composite index for the order and primary filter

  • Run EXPLAIN and confirm rows read does not scale with page number (a sketch follows this list)

  • Cap page size and add a separate export path if needed
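
For the EXPLAIN step, compare the two shapes directly on production-like data. A sketch assuming Postgres; the literals stand in for real cursor values:

  -- Deep offset page: the plan reads and discards roughly 100k rows.
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT id FROM events
  ORDER BY created_at DESC, id DESC
  LIMIT 50 OFFSET 99950;

  -- Keyset page at the same depth: rows read should stay near 50,
  -- regardless of how deep the cursor points.
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT id FROM events
  WHERE (created_at, id) < (timestamp '2024-05-01 12:00:00', 123456)
  ORDER BY created_at DESC, id DESC
  LIMIT 50;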

Hit reply and tell me which endpoint you fixed and what surprised you.

🎯 Action step for this week

  • Inventory top 5 list endpoints by traffic and user impact

  • For each, fill a Pagination Design Sheet and choose keyset or bounded offset

  • Standardize one cursor token format across services

  • Add guardrails. Page size caps, rate limits, and separate export endpoints

  • Add a dashboard tile per endpoint showing rows read vs rows returned, plus p95 latency bucketed by page depth or a proxy for it

By end of this week, aim to have two endpoints running on keyset pagination with verified index plans and visible metrics.

👋 Wrapping up

  • Offset pagination makes the database walk past work it cannot reuse.

  • Keyset pagination scales with page size, not page number.

  • Stable order plus tie breaker is non-negotiable.

  • Separate browsing from exports and add guardrails early.

⭐ Most read issues (good place to start)

If you’re new here, these are the five issues readers keep coming back to:

I would love your input. What database topic would help you most right now?
Hit reply and tell me.

Thanks for reading.

See you next week,
Bogdan Colța
Tech Architect Insights
