PostgreSQL - freeCodeCamp.org

How to Schedule Jobs in PostgreSQL with pg_cron

iyiola — Wed, 17 Jun 2026 19:19:31 +0000

Every backend system eventually needs something to run on a schedule. Old sessions need deleting, summary tables need rebuilding, materialized views need refreshing, and maintenance tasks need to happen while everyone is asleep.

The usual answer is to reach outside the database: a system crontab, a Kubernetes CronJob, a Celery beat worker, or a scheduler service. All of these work, but they add moving parts. Now you have credentials to manage, a separate process to monitor, and one more thing that can silently stop running.

pg_cron takes a different approach. It's a PostgreSQL extension that runs a cron-style scheduler inside the database itself. You schedule jobs with plain SQL, the database executes them, and the run history lands in a table you can query like anything else.

In this tutorial, you'll learn how pg_cron works, how to install and configure it, and how to use it for real maintenance tasks. You'll also learn how to monitor jobs, manage permissions, and decide when pg_cron is the right tool — and when it isn't.

Prerequisites
What Is pg_cron?
How pg_cron Works
How to Install and Set Up pg_cron
A Quick Refresher on Cron Syntax
How to Schedule Your First Job
Practical pg_cron Examples
How to View and Monitor Your Jobs
How to Update and Remove Jobs
How to Run Jobs in Other Databases
How to Let Other Users Schedule Jobs
When to Use pg_cron (and When to Avoid It)
Best Practices for Working with pg_cron
Conclusion

Prerequisites

To follow along with the examples, you'll need:

Basic knowledge of SQL (SELECT, INSERT, UPDATE, DELETE)
A running PostgreSQL instance (version 13 or later is ideal, though pg_cron supports version 10 and up)
Superuser or admin access to that instance, since installing the extension requires it
A SQL client like psql, pgAdmin, or DBeaver

If you don't run your own server, that's fine too. Most managed PostgreSQL services — including Amazon RDS, Azure Database for PostgreSQL, Google Cloud SQL, Supabase, and Neon — support pg_cron. I'll cover how to enable it on those later in the tutorial.

What Is pg_cron?

pg_cron is an open source PostgreSQL extension, originally built by the team at Citus Data, that lets you schedule SQL commands using the familiar cron syntax.

Instead of writing a crontab entry on a server, you write a SQL statement:

SELECT cron.schedule(
  'nightly-cleanup',
  '0 3 * * *',
  $$DELETE FROM sessions WHERE expires_at < now()$$
);

That single statement tells PostgreSQL to delete expired sessions every day at 3 AM. No external process, no shell script, no extra credentials. The job definition lives in the database, and if you put the cron.schedule() call in a migration, it's version-controlled alongside the rest of your schema.

Because the scheduler is just another extension, your jobs travel with the database. Anyone with access to the cron schema can query what's scheduled, when it last ran, and whether it succeeded (non-superusers see only their own jobs, as covered in the permissions section).

How pg_cron Works

When PostgreSQL starts with pg_cron enabled, the extension launches a background worker. This worker has one job: watch the cron.job table, which holds every scheduled job along with its schedule, command, target database, and the user it runs as.

When a job's scheduled time arrives, the worker executes the command. By default it does this by opening a new local connection to the database, just as your application would. You can also configure it to use PostgreSQL background workers instead of connections, which I'll show you in the setup section.

Two behaviors are worth knowing up front:

First, pg_cron can run many different jobs in parallel, but it never runs two instances of the same job at once. If a job is still running when its next scheduled time arrives, the new run waits in a queue and starts as soon as the current one finishes. This protects you from a slow cleanup job piling up on top of itself.

Second, pg_cron doesn't run jobs while a server is in hot standby mode. If you use streaming replication, jobs only execute on the primary. When a replica gets promoted, the scheduler starts up automatically — so failover doesn't leave you without your scheduled jobs.

How to Install and Set Up pg_cron

Setting up pg_cron on a self-managed server takes three steps: install the package, update the configuration, and create the extension.

Step 1: Install the Package

On Debian or Ubuntu using the official PostgreSQL apt repository, install the package that matches your PostgreSQL major version. For PostgreSQL 17, that's:

sudo apt-get install postgresql-17-cron

On Red Hat-based systems using the PGDG yum repository:

sudo yum install pg_cron_17

If you're on PostgreSQL 16 or 18, swap the version number accordingly. You can also build the extension from source if your platform doesn't have a package.

Step 2: Update postgresql.conf

pg_cron needs to start its background worker when PostgreSQL boots, so it must be preloaded. Add it to shared_preload_libraries in your postgresql.conf:

shared_preload_libraries = 'pg_cron'

If that setting already lists other libraries, add pg_cron to the comma-separated list rather than replacing them.

By default, the scheduler stores its metadata in the database named postgres. If your application lives in a different database and you want the jobs there, set:

cron.database_name = 'app_db'

One more setting worth knowing: pg_cron interprets all schedules in GMT by default. If you want your "3 AM cleanup" to actually run at 3 AM local time, set the timezone explicitly:

cron.timezone = 'Africa/Lagos'

These settings require a server restart to take effect:

sudo systemctl restart postgresql

Step 3: Create the Extension

Connect to the database you configured in cron.database_name and create the extension as a superuser:

CREATE EXTENSION pg_cron;

This creates the cron schema, the metadata tables, and the scheduling functions. You're ready to schedule jobs.

Note that pg_cron can only be installed in one database per PostgreSQL instance. That sounds limiting, but it isn't. You can still run jobs in any database on the instance using cron.schedule_in_database(), which we'll cover later.

A Note on How Jobs Connect

Since pg_cron opens local connections by default, your pg_hba.conf needs to allow them. The common approaches are enabling trust authentication for localhost connections for the job's user, or putting the password in a .pgpass file.

If you'd rather avoid connection authentication entirely, tell pg_cron to use background workers instead:

cron.use_background_workers = on
max_worker_processes = 20

With background workers, the number of jobs that can run concurrently is capped by max_worker_processes, so raise it if you schedule a lot of overlapping jobs.

Using pg_cron on Managed Database Services

If you're on a managed service, you usually can't edit postgresql.conf directly, but the providers expose the same settings through their own mechanisms:

Amazon RDS and Aurora PostgreSQL: add pg_cron to the shared_preload_libraries parameter in your DB parameter group, reboot the instance, then run CREATE EXTENSION pg_cron; as a user with rds_superuser. The scheduler runs in the postgres database.
Azure Database for PostgreSQL: enable pg_cron under server parameters (shared_preload_libraries and azure.extensions), restart, then create the extension.
Google Cloud SQL: set the cloudsql.enable_pg_cron flag, restart, then create the extension.
Supabase: enable the pg_cron extension with a single toggle in the dashboard under Database → Extensions.
Neon: pg_cron is available as a supported extension you can enable per project.

The SQL you write afterward is identical everywhere, which is part of the appeal.

A Quick Refresher on Cron Syntax

pg_cron uses the same five-field schedule format as classic Unix cron:

┌──────────── minute (0–59)
│ ┌────────── hour (0–23)
│ │ ┌──────── day of month (1–31, or $ for the last day)
│ │ │ ┌────── month (1–12)
│ │ │ │ ┌──── day of week (0–6, Sunday = 0)
│ │ │ │ │
* * * * *

An asterisk means "every value". You can combine values with commas, ranges with hyphens, and steps with slashes. Some schedules you'll use constantly:

*/5 * * * *    every 5 minutes
0 * * * *      every hour, on the hour
0 3 * * *      every day at 3:00 AM
0 3 * * 1-5    3:00 AM on weekdays
30 1 * * 0     1:30 AM every Sunday
0 0 1 * *      midnight on the 1st of each month
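To make the field rules concrete, here is a minimal TypeScript sketch of how a five-field expression can be checked against a timestamp. It is not pg_cron's parser. The function names (expandField, cronMatches) are made up for illustration, the date is read in UTC to mirror the GMT default, month and weekday names are ignored, and all five fields must match (classic cron's special handling when both day fields are restricted is not modelled).

```typescript
// Expand one cron field ("*", "5", "1-5", "*/5", "0,30", "10-50/10") into the
// list of values it allows within [min, max].
function expandField(field: string, min: number, max: number): number[] {
  const allowed: number[] = [];
  for (const part of field.split(",")) {
    const [rangePart, stepPart] = part.split("/");
    const step = stepPart === undefined ? 1 : parseInt(stepPart, 10);
    let lo = min;
    let hi = max;
    if (rangePart !== "*") {
      const bounds = rangePart.split("-");
      lo = parseInt(bounds[0], 10);
      hi = bounds.length > 1 ? parseInt(bounds[1], 10) : lo;
    }
    if (isNaN(step) || step < 1 || isNaN(lo) || isNaN(hi) || lo < min || hi > max || lo > hi) {
      throw new Error(`invalid cron field: ${field}`);
    }
    for (let v = lo; v <= hi; v += step) {
      if (allowed.indexOf(v) === -1) allowed.push(v);
    }
  }
  return allowed;
}

// True when `date`, read in UTC, satisfies every field of the expression.
function cronMatches(expr: string, date: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error("expected five fields");
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  return (
    expandField(minute, 0, 59).indexOf(date.getUTCMinutes()) !== -1 &&
    expandField(hour, 0, 23).indexOf(date.getUTCHours()) !== -1 &&
    expandField(dayOfMonth, 1, 31).indexOf(date.getUTCDate()) !== -1 &&
    expandField(month, 1, 12).indexOf(date.getUTCMonth() + 1) !== -1 &&
    expandField(dayOfWeek, 0, 6).indexOf(date.getUTCDay()) !== -1
  );
}

// 17 June 2026 is a Wednesday, 21 June 2026 is a Sunday.
console.log(cronMatches("*/5 * * * *", new Date(Date.UTC(2026, 5, 17, 19, 15)))); // true
console.log(cronMatches("0 3 * * 1-5", new Date(Date.UTC(2026, 5, 17, 3, 0))));   // true
console.log(cronMatches("0 3 * * 1-5", new Date(Date.UTC(2026, 5, 21, 3, 0))));   // false
console.log(cronMatches("30 1 * * 0", new Date(Date.UTC(2026, 5, 21, 1, 30))));   // true
```

Expanding each field into an explicit list keeps the matching step trivial, at the cost of a little allocation on every check.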

pg_cron also supports two additions that classic cron doesn't have.

You can use $ in the day-of-month field to mean the last day of the month, which is genuinely painful to express in standard cron:

0 23 $ * *     11:00 PM on the last day of every month

And for jobs that need to run more often than once a minute, you can use a plain interval between 1 and 59 seconds:

'30 seconds'   every 30 seconds

The seconds syntax stands alone — you can't mix it with the five-field format.

If you ever doubt what a schedule means, crontab.guru translates cron expressions into plain English. Just remember that pg_cron evaluates schedules in the timezone set by cron.timezone, which defaults to GMT.
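The effect of the GMT default is easy to see with a small calculation. The sketch below is plain TypeScript using the built-in Intl API. The helper name wallClock is made up for illustration, and Africa/Lagos is used only because it's the timezone from the configuration example earlier.

```typescript
// Format an instant as HH:MM wall-clock time in a given IANA timezone.
function wallClock(instant: Date, timeZone: string): string {
  return new Intl.DateTimeFormat("en-GB", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).format(instant);
}

// A '0 3 * * *' schedule evaluated in GMT fires at 03:00 UTC.
const fireTime = new Date(Date.UTC(2026, 5, 17, 3, 0));
console.log(wallClock(fireTime, "UTC"));          // 03:00
console.log(wallClock(fireTime, "Africa/Lagos")); // 04:00
```

Lagos is one hour ahead of GMT, so a "3 AM" job left on the default timezone actually runs at 4 AM local time there.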

How to Schedule Your First Job

The core function is cron.schedule(). It comes in two forms: one with a name and one without.

The named form is the one you should use, because names make jobs easy to find, update, and remove:

SELECT cron.schedule(
  'delete-expired-sessions',                          -- job name
  '0 3 * * *',                                        -- schedule
  $$DELETE FROM sessions WHERE expires_at < now()$$   -- command
);

The function returns the job's ID:

 schedule
----------
        1
(1 row)

A few details worth noticing:

The command is wrapped in $$ ... $$, PostgreSQL's dollar quoting. This saves you from escaping the single quotes inside the SQL. For commands without quotes, regular string literals work fine.

The job runs in the database where you called cron.schedule(), as the user you called it with, using that user's normal permissions. There's no privilege escalation hiding in the scheduler — if your user can't delete from sessions, neither can the job.

And if you call cron.schedule() again with the same job name, pg_cron updates the existing job instead of creating a duplicate. That makes schedules idempotent, which is handy if you define jobs inside database migrations.

Practical pg_cron Examples

Let's walk through the patterns that cover most real-world use. Each example is something you can adapt directly.

Example 1: Clean Up Old Rows Every Night

Tables that collect transient data — sessions, tokens, audit events, notification logs — grow forever unless something prunes them. A nightly delete is the classic first pg_cron job:

SELECT cron.schedule(
  'purge-old-events',
  '0 2 * * *',
  $$DELETE FROM events WHERE created_at < now() - interval '90 days'$$
);

Every night at 2:00 AM, rows older than 90 days disappear. If the table is large, consider batching the delete inside a function so each run stays short, then schedule the function instead.
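The control flow of a batched delete is simple enough to sketch. In a pg_cron setup this logic would live in a database function or procedure, so treat the TypeScript below as an illustration of the loop only. The names purgeInBatches and deleteBatch are hypothetical, and deleteBatch stands in for "delete up to N matching rows and report how many were removed".

```typescript
// Delete rows in fixed-size batches. Stops when a batch comes back short
// (nothing left to delete) or when the per-run budget of batches is used up,
// so a single scheduled run never grows without bound.
function purgeInBatches(
  deleteBatch: (limit: number) => number,
  batchSize: number,
  maxBatches: number
): number {
  let total = 0;
  for (let i = 0; i < maxBatches; i++) {
    const deleted = deleteBatch(batchSize);
    total += deleted;
    if (deleted < batchSize) break; // short batch: the backlog is cleared
  }
  return total;
}

// In-memory stand-in for the table: 2500 expired rows waiting to be removed.
let remaining = 2500;
const fakeDelete = (limit: number): number => {
  const n = Math.min(limit, remaining);
  remaining -= n;
  return n;
};

console.log(purgeInBatches(fakeDelete, 1000, 10)); // 2500
```

Capping the number of batches means a large backlog is worked off over several scheduled runs instead of in one long one.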

Example 2: Refresh a Materialized View Every Hour

Materialized views are a great way to cache expensive aggregations, but PostgreSQL never refreshes them on its own. pg_cron fixes that:

SELECT cron.schedule(
  'refresh-daily-sales',
  '5 * * * *',
  'REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales_summary'
);

This refreshes the view at five minutes past every hour. The CONCURRENTLY option lets reads continue during the refresh, as long as the view has a unique index.

Example 3: Build a Daily Summary Table

Rollup tables are another common pattern: instead of aggregating millions of raw rows on every dashboard load, you precompute the numbers once a day.

SELECT cron.schedule(
  'rollup-daily-orders',
  '15 0 * * *',
  $$
  INSERT INTO daily_order_stats (day, order_count, total_amount)
  SELECT created_at::date, count(*), sum(amount)
  FROM orders
  WHERE created_at >= current_date - 1
    AND created_at < current_date
  GROUP BY created_at::date
  ON CONFLICT (day) DO UPDATE
    SET order_count = EXCLUDED.order_count,
        total_amount = EXCLUDED.total_amount
  $$
);

At fifteen minutes past midnight, yesterday's orders get summarized into one row. The ON CONFLICT clause makes the job safe to re-run — if it executes twice, it overwrites rather than duplicates.

Example 4: Run a Job Every 30 Seconds

Some work needs to happen more often than cron's one-minute floor allows: flushing a buffer table, picking up rows from an outbox, advancing a lightweight queue. The seconds syntax handles this:

SELECT cron.schedule(
  'process-outbox',
  '30 seconds',
  'CALL process_outbox_batch()'
);

Remember the guarantee from earlier: pg_cron won't start a second instance of this job while the first is still running. If a batch occasionally takes 45 seconds, the next run simply waits its turn instead of stampeding.

Example 5: Run Maintenance on the Last Day of the Month

Month-end jobs are awkward in standard cron because months have different lengths. pg_cron's $ makes it trivial:

SELECT cron.schedule(
  'month-end-vacuum',
  '0 23 $ * *',
  'VACUUM ANALYZE orders'
);

This runs VACUUM ANALYZE on the orders table at 11:00 PM on the 28th, 29th, 30th, or 31st — whichever happens to be the last day of that month.
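For comparison, this is the kind of check that $ spares you from writing yourself. It's a small TypeScript sketch, not pg_cron's implementation, and the function name isLastDayOfMonth is made up for illustration. Dates are read in UTC.

```typescript
// True when `date` falls on the last day of its month: stepping one day
// forward lands in a different month. Date.UTC normalizes the overflow
// (for example, day 31 of a 30-day month becomes the 1st of the next).
function isLastDayOfMonth(date: Date): boolean {
  const nextDay = new Date(
    Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate() + 1)
  );
  return nextDay.getUTCMonth() !== date.getUTCMonth();
}

console.log(isLastDayOfMonth(new Date(Date.UTC(2026, 1, 28)))); // true  (2026 is not a leap year)
console.log(isLastDayOfMonth(new Date(Date.UTC(2024, 1, 28)))); // false (2024 has a 29 February)
console.log(isLastDayOfMonth(new Date(Date.UTC(2026, 5, 30)))); // true
```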

How to View and Monitor Your Jobs

Everything pg_cron knows lives in two tables in the cron schema, and you query them like any other tables.

To see what's scheduled, look at cron.job:

SELECT jobid, jobname, schedule, command, active
FROM cron.job;

 jobid |         jobname         |  schedule  |            command             | active
-------+-------------------------+------------+--------------------------------+--------
     1 | delete-expired-sessions | 0 3 * * *  | DELETE FROM sessions WHERE ... | t
     2 | refresh-daily-sales     | 5 * * * *  | REFRESH MATERIALIZED VIEW ...  | t
(2 rows)

To see how jobs have actually been running, query cron.job_run_details:

SELECT jobid, status, return_message, start_time, end_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 10;

Each row records one execution: whether it succeeded or failed, the message it returned, and exactly when it started and ended. A failed job shows status = 'failed' along with the error message, so debugging usually starts and ends with this table.

One important catch: pg_cron never cleans this table up by itself. A job running every 30 seconds writes almost three thousand rows a day. The standard fix is delightfully recursive — schedule a pg_cron job to prune pg_cron's own history:

SELECT cron.schedule(
  'purge-cron-history',
  '0 12 * * *',
  $$DELETE FROM cron.job_run_details
    WHERE end_time < now() - interval '14 days'$$
);

If you don't want run history recorded at all, set cron.log_run = off in your configuration.
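The "almost three thousand rows a day" figure is plain arithmetic, and the same arithmetic tells you roughly how large the history table settles once the 14-day prune is in place. The sketch below assumes every run happens on schedule and writes one row. The function names are made up for illustration.

```typescript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Rows written per day by one job that runs every `intervalSeconds`.
function runsPerDay(intervalSeconds: number): number {
  return Math.floor(SECONDS_PER_DAY / intervalSeconds);
}

// Approximate rows kept for that job when history older than
// `retentionDays` is pruned once a day.
function retainedRows(intervalSeconds: number, retentionDays: number): number {
  return runsPerDay(intervalSeconds) * retentionDays;
}

console.log(runsPerDay(30));       // 2880
console.log(retainedRows(30, 14)); // 40320
```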

How to Update and Remove Jobs

To change an existing job, use cron.alter_job() with the job's ID. Only the parameters you pass get changed — everything else stays as it was:

-- Move job 1 from 3 AM to 4 AM
SELECT cron.alter_job(1, schedule := '0 4 * * *');

-- Pause a job without deleting it
SELECT cron.alter_job(1, active := false);

-- Resume it later
SELECT cron.alter_job(1, active := true);

Pausing with active := false is underrated. During an incident or a big migration, you can switch off a noisy job and switch it back on afterward, without losing its definition.

To remove a job permanently, use cron.unschedule() with either the name or the ID:

SELECT cron.unschedule('delete-expired-sessions');
-- or
SELECT cron.unschedule(1);

Both return true when the job was found and removed.

How to Run Jobs in Other Databases

Remember that pg_cron is installed in exactly one database per instance, usually postgres. If your instance hosts several databases, you don't install pg_cron in each one — you schedule cross-database jobs from the one place it lives, using cron.schedule_in_database():

SELECT cron.schedule_in_database(
  'analytics-nightly-vacuum',
  '0 4 * * *',
  'VACUUM ANALYZE page_views',
  'analytics_db'
);

The job is stored centrally but executes inside analytics_db. The function also accepts an optional username if the job should run as a different user, and an active flag if you want to create it paused.

This pattern keeps all scheduling in one schema on one database, which makes auditing simple: run as a superuser, a single SELECT * FROM cron.job shows every scheduled job across the whole instance.

How to Let Other Users Schedule Jobs

Out of the box, only superusers can call the scheduling functions. To let an application role manage its own jobs, grant it usage on the cron schema:

GRANT USAGE ON SCHEMA cron TO app_user;

The permission model after that is sensible and safe:

Jobs run with the permissions of the user who scheduled them, nothing more.
A row-level security policy on cron.job means users only see and modify their own jobs. Superusers see everything.
Each user can also delete their own rows from cron.job_run_details, so the cleanup job from earlier works without superuser rights.

In practice, I recommend creating a dedicated role for scheduled work rather than scheduling jobs as a personal account. When the engineer who scheduled everything leaves and their role gets dropped, you don't want the nightly rollups going with them.

When to Use pg_cron (and When to Avoid It)

pg_cron shines when the work is database work. Use it for:

Data retention: pruning old rows from sessions, logs, events, and token tables.
Aggregations: refreshing materialized views and building rollup tables.
Maintenance: targeted VACUUM ANALYZE, rebuilding statistics, managing partitions (it pairs beautifully with pg_partman).
Lightweight pipelines: moving rows between tables, processing outbox patterns, expiring soft-deleted records.

The common thread: the entire job is expressible as SQL or a stored procedure, and it touches nothing outside the database.

You should reach for something else when:

The job needs to call external systems. pg_cron runs SQL. It can't send an HTTP request, push to a queue, or send an email on its own. Jobs like that belong in your application or a workflow engine.
You need retries, backoff, and alerting built in. pg_cron records failures but won't retry them or page you. For workflows that must complete, tools like Temporal or a proper job queue earn their complexity.
The work is heavy and long-running. A four-hour batch job running inside your primary OLTP database competes with your application for CPU, memory, and locks. Schedule heavy compute elsewhere.
Jobs need complex dependencies. "Run B only after A succeeds, then fan out to C and D" is orchestration. That's Airflow territory, not cron territory.

A reasonable rule of thumb: pg_cron replaces the crontab entry that used to run psql -c "..." on some forgotten server. It doesn't replace your job queue or your workflow orchestrator.

Best Practices for Working with pg_cron

A handful of habits will keep your scheduled jobs boring, in the best sense of the word:

Name every job: Anonymous jobs identified only by an ID are painful to manage six months later. Names also make cron.schedule() idempotent, which lets you define jobs safely in migrations.

Set the timezone deliberately: The default is GMT, and "why does the 3 AM job run at 4 AM?" is a rite of passage you can skip by setting cron.timezone on day one.

Keep individual runs short: Wrap big deletes in batched stored procedures. A job that finishes in seconds holds locks briefly and queues less behind itself.

Make jobs idempotent: Servers restart, and a job can fail halfway. Use ON CONFLICT, time-window predicates, and other patterns that make a re-run harmless.

Prune cron.job_run_details: Schedule the cleanup job from the monitoring section before the table grows large enough that you notice it the hard way.

Monitor for silence, not just failure: A failed run appears in job_run_details, but a job that stopped being scheduled at all leaves no trace. A periodic check that each critical job has a recent successful run catches both cases:

SELECT j.jobname, max(d.end_time) AS last_success
FROM cron.job j
LEFT JOIN cron.job_run_details d
  ON d.jobid = j.jobid AND d.status = 'succeeded'
GROUP BY j.jobname
HAVING max(d.end_time) < now() - interval '1 day'
   OR max(d.end_time) IS NULL;

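Any job this query returns has either never succeeded or hasn't succeeded in over a day, and deserves a look. Widen the interval for jobs that run less often than daily, or they'll show up here between runs. A job that was removed from cron.job altogether won't appear at all, so compare the result against a list of expected job names if that case matters to you.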

Conclusion

pg_cron turns PostgreSQL into its own scheduler. You define jobs in SQL, the database runs them, and the entire system — definitions, history, failures — is visible through ordinary queries.

In this tutorial, you learned how the extension works under the hood, how to install it on your own servers and on managed services, how to write schedules (including pg_cron's seconds and last-day-of-month extensions), and how to apply it to the maintenance work every real database accumulates: pruning, rollups, refreshes, and vacuums. You also saw how to monitor jobs, manage permissions, and recognize the point where a real job queue or orchestrator becomes the better tool.

If your infrastructure currently has a lonely server whose only purpose is running psql from a crontab, you now know how to retire it.

Thanks for reading! I write about PostgreSQL and backend engineering. You can connect with me on LinkedIn and X.

The Saga Pattern in Node.js: How to Roll Back Distributed Transactions Across Microservices

Md Tarikul Islam — Sat, 13 Jun 2026 06:45:43 +0000

Building reliable workflows across multiple microservices is challenging. In a monolith, a database transaction can ensure that multiple operations either succeed or fail together. But once data is spread across different services and databases, that guarantee disappears.

This is where the Saga Pattern comes in. Instead of using distributed transactions, a saga coordinates a sequence of local transactions and runs compensation actions when something goes wrong.

In this article, we'll build an orchestrated Saga Pattern using NestJS, gRPC, PostgreSQL, and Sequelize. You'll learn how to coordinate work across services, implement compensation-based rollbacks, handle idempotency, and track workflow progress in a production-style microservice architecture.

Prerequisites
1. Introduction
2. The Problem in One Picture
3. Why You Need a Saga
4. Choreography vs Orchestration
- Choreography
- Orchestration
5. The Example Project
6. Architecture
7. The Saga Flow, Step by Step
8. The State Machine
9. Implementing the Orchestrator
10. Implementing the Participant
11. Rollback (Compensation)
12. Tracking, Idempotency and Observability
13. Testing a Saga
14. When NOT to Use a Saga
15. Trade-offs and Lessons Learned
16. Conclusion

Prerequisites

This article assumes you're already familiar with some backend development concepts. You don't need prior experience with the Saga Pattern, but you should be comfortable with:

JavaScript, TypeScript, Node.js
NestJS fundamentals (controllers, services, dependency injection)
Basic PostgreSQL concepts
Database transactions
Docker (recommended for local development)
Microservice architecture basics
gRPC fundamentals (helpful but not required)

If you've already built a few backend services with NestJS and PostgreSQL, you'll have everything you need to follow this guide.

1. Introduction

A saga is a sequence of local transactions across multiple services. Each step commits its own database transaction. If a later step fails, the saga runs compensating transactions to semantically undo the work already committed.

The pattern was first described by Hector Garcia-Molina and Kenneth Salem in 1987 for long-lived database transactions. It was rediscovered a decade ago when companies started splitting monoliths into microservices and realised that the database transaction — the single most powerful tool in a backend developer's belt — stops working at the service boundary.

This article walks through an orchestrated saga in Node.js (NestJS + gRPC) for onboarding an agency, where two services must agree on a single business outcome:

agency-service — owns the agency record.
auth-service — owns the organization, user and role.

If either side fails, the system must end up as if nothing ever happened. No half-created users, orphan organizations, or 3am Slack threads.

2. The Problem in One Picture

Here's the bug a saga is built to prevent:

Step 1: auth-service     ✅ creates Organization #42
Step 2: auth-service     ✅ creates User #99
Step 3: agency-service   ❌ fails (DB down, validation, network blip…)

Result without a saga:
   Organization #42 and User #99 still exist.
   There is no Agency row.
   The user can log in but has nothing to manage.
   Support gets a ticket. Engineer writes a one-off SQL cleanup.
   Repeat every week.

The saga's job is to detect that step 3 failed and explicitly delete Organization #42 and User #99, so the system is consistent again — even though those rows live in a different service's database.

3. Why You Need a Saga

In a monolith, you wrap everything in one DB transaction and let the database handle atomicity:

await sequelize.transaction(async (tx) => {
  await Organization.create({...}, { transaction: tx });
  await User.create({...}, { transaction: tx });
  await Agency.create({...}, { transaction: tx });
});

In microservices, each service has its own database. You can't wrap two services in one ACID transaction. The classic alternatives all have problems:

Option	Problem
Two-Phase Commit (2PC)	Locks rows across services, makes the coordinator a single point of failure, and scales poorly. It also requires every participant to support a prepare/commit protocol, which services exposed over HTTP or gRPC typically don't.
"Just hope it works"	Leaves orphan users / billing rows when half the flow fails. Real data corruption — and the longer the system runs, the more orphans accumulate.
Manual cleanup scripts	Works for a week. Bugs hide for months. New engineers don't know they exist.
Eventual consistency without compensation	Fine for some domains (analytics) but completely wrong for billing, identity, or anything with money.
Saga pattern	Each service commits locally. The orchestrator owns the workflow and runs explicit compensation on failure. It's auditable, restartable, and reasonable.

The saga gives you eventual consistency with a clear, auditable rollback path — without distributed locks.
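The core mechanic fits in a few lines. The following is a generic sketch, not the article's reference implementation: SagaStep, SagaResult, and runSaga are hypothetical names, each step is assumed to expose a run function and a compensate function, and compensation itself is assumed not to fail (a real orchestrator needs retries or manual intervention for that case).

```typescript
interface SagaStep {
  name: string;
  run: () => Promise<void>;        // a local transaction in one service
  compensate: () => Promise<void>; // the semantic undo of that transaction
}

interface SagaResult {
  ok: boolean;
  failedStep?: string;
  compensated: string[]; // names of undone steps, in the order they were undone
}

// Run steps in order. On the first failure, compensate every step that
// already committed, newest first, and report what happened.
async function runSaga(steps: SagaStep[]): Promise<SagaResult> {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.run();
      completed.push(step);
    } catch {
      const compensated: string[] = [];
      for (let i = completed.length - 1; i >= 0; i--) {
        await completed[i].compensate();
        compensated.push(completed[i].name);
      }
      return { ok: false, failedStep: step.name, compensated };
    }
  }
  return { ok: true, compensated: [] };
}

// Usage with in-memory fakes: the third step fails, so the first two are undone.
const demoSteps: SagaStep[] = [
  { name: "createOrganization", run: async () => {}, compensate: async () => {} },
  { name: "createUser", run: async () => {}, compensate: async () => {} },
  { name: "createAgency", run: async () => { throw new Error("DB down"); }, compensate: async () => {} },
];
runSaga(demoSteps).then((result) => console.log(result.failedStep, result.compensated));
```

Compensating newest-first mirrors how the steps were applied, which matters when a later step depends on rows created by an earlier one.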

4. Choreography vs Orchestration

There are two ways to implement a saga:

Choreography

With choreography, services emit events, and other services subscribe and react.

auth-service → emits "UserCreated"
agency-service → listens, creates agency, emits "AgencyCreated"
billing-service → listens, creates subscription…

It's simple at first, but brittle later. The workflow is scattered across N codebases. Nobody owns it. Debugging means tracing events across logs. Adding a step means changing several services.
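A toy version makes the "nobody owns it" problem visible. The sketch below uses a tiny in-memory event bus in place of a real message broker. EventBus and the handler bodies are made up for illustration, and only the event names come from the example above.

```typescript
type Handler = (payload: Record<string, unknown>) => void;

// Minimal synchronous publish/subscribe, standing in for a message broker.
class EventBus {
  private handlers: Record<string, Handler[]> = {};

  on(event: string, handler: Handler): void {
    (this.handlers[event] = this.handlers[event] || []).push(handler);
  }

  emit(event: string, payload: Record<string, unknown>): void {
    for (const handler of this.handlers[event] || []) handler(payload);
  }
}

const bus = new EventBus();
const trace: string[] = [];

// agency-service: reacts to auth's event, then emits its own.
bus.on("UserCreated", (payload) => {
  trace.push("agency-service: create agency");
  bus.emit("AgencyCreated", payload);
});

// billing-service: reacts to agency's event.
bus.on("AgencyCreated", () => {
  trace.push("billing-service: create subscription");
});

// auth-service: starts the chain.
trace.push("auth-service: create user");
bus.emit("UserCreated", { userId: 99 });

console.log(trace);
```

The full workflow only exists as the sum of the subscriptions. In a real system each handler lives in a different codebase, so no single file shows the order of steps.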

Orchestration

With orchestration, one service is the conductor. It calls the others in order.

orchestrator:
   1. authClient.provisionAccount(...)
   2. agencyRepo.create(...)
   3. authClient.sendWelcomeEmail(...)

There's slightly more coupling here (the orchestrator imports clients), but the entire workflow lives in one file. Onboarding new engineers becomes a one-hour task. Adding a step is a single PR.

Pick orchestration unless you have a strong reason not to. This article — and the reference implementation — uses orchestration.

5. The Example Project

Our goal here is to create an Agency in the system. This is the moment a new B2B customer signs up.

It requires two services to agree on a single outcome:

auth-service must create:

an Organization row (the tenant)
a User row (the agency admin who will log in)
a UserRole row linking the user to the AGENCY_ADMIN role

agency-service must create:

an Agency row containing business details (size, registration number, website, branches…), linked to the user/organization above

These rows have foreign-key relationships within a service, but not across services — Postgres can't enforce that the user in auth's DB matches the authUserId in agency's DB. The application has to do it.

auth-service DB                    agency-service DB
─────────────────                  ─────────────────
organizations  ◄────────┐
   │                    │
   │ (1:1)              │   foreign reference (no FK)
   ▼                    │           agencies
users  ──────► user_roles                     ─ authUserId
                                              └ authOrganizationId

If the agency-service step fails after the auth-service step has succeeded, we end up with a user who can authenticate but has no agency: the exact bug from section 2. That's what the saga prevents.

6. Architecture

                     ┌───────────────────────────────┐
                     │        API Gateway            │
                     └──────────────┬────────────────┘
                                    │ HTTP
                                    ▼
   ┌──────────────────────────────────────────────────┐
   │              agency-service                      │
   │   ┌─────────────────────────────────────────┐    │
   │   │   AgencyOnboardingOrchestrator (SAGA)   │    │
   │   └───────────────┬─────────────────────────┘    │
   │                   │ writes state                 │
   │                   ▼                              │
   │      agency_onboarding_sagas  (Postgres)         │
   └───────────────┬─────────────────┬────────────────┘
                   │ gRPC            │ gRPC
       provisionAgencyAccount   compensateAgencyAccount
                   │                 │
                   ▼                 ▼
   ┌──────────────────────────────────────────────────┐
   │              auth-service                        │
   │   AgencyProvisioningService  (Participant)       │
   │                                                  │
   │   organizations · users · user_roles             │
   │   agency_provision_records  ← idempotency log    │
   └──────────────────────────────────────────────────┘

Three components do all the work:

AgencyOnboardingOrchestrator in agency-service — drives the workflow.
agency_onboarding_sagas table in agency-service — the durable log of the saga's progress.
AgencyProvisioningService in auth-service — exposes a do operation (provisionAgencyAccount) and an undo operation (compensateAgencyAccount). It's backed by its own agency_provision_records idempotency table.

The orchestrator never reaches into the auth database directly. The boundary is enforced by gRPC.
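The role of the idempotency table can be sketched without any database. The class below is an in-memory stand-in, not the article's auth-service code: a plain object plays the part of agency_provision_records, the method signatures are simplified assumptions (the real operations are gRPC calls that take more than a sagaId), and the generated IDs are arbitrary.

```typescript
interface ProvisionResult {
  organizationId: number;
  userId: number;
  roleId: number;
}

class InMemoryProvisioningService {
  // Stand-in for agency_provision_records: one entry per sagaId.
  private records: Record<string, ProvisionResult> = {};
  private nextId = 1;
  provisionCount = 0; // how many times rows were actually created

  // "Do": a retry with the same sagaId returns the original result
  // instead of creating a second organization and user.
  provisionAgencyAccount(sagaId: string): ProvisionResult {
    const existing = this.records[sagaId];
    if (existing !== undefined) return existing;
    const result: ProvisionResult = {
      organizationId: this.nextId++,
      userId: this.nextId++,
      roleId: this.nextId++,
    };
    this.records[sagaId] = result;
    this.provisionCount++;
    return result;
  }

  // "Undo": returns true if something was removed, false if there was
  // nothing left to undo, so calling it twice is harmless.
  compensateAgencyAccount(sagaId: string): boolean {
    if (this.records[sagaId] === undefined) return false;
    delete this.records[sagaId];
    return true;
  }
}

const service = new InMemoryProvisioningService();
const first = service.provisionAgencyAccount("saga-1");
const retry = service.provisionAgencyAccount("saga-1");
console.log(first === retry, service.provisionCount);   // true 1
console.log(service.compensateAgencyAccount("saga-1")); // true
console.log(service.compensateAgencyAccount("saga-1")); // false
```

Keying both operations on the sagaId is what lets the orchestrator retry after a timeout without knowing whether the first call went through.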

7. The Saga Flow, Step by Step

The sequence diagram below shows the complete lifecycle of the onboarding saga. The workflow begins when a client sends a request to create a new agency. The orchestrator first creates a saga record in its database and marks it as STARTED, giving it a durable record of the workflow before any business action takes place.

It then asks auth-service to provision the organization, user, and role. Once that succeeds, the orchestrator creates the agency record in its own database.

If every step succeeds, the saga reaches the COMPLETED state. If the agency creation fails after the auth resources have already been created, the orchestrator triggers a compensation step that instructs auth-service to remove everything it previously provisioned.

The key idea is that each service commits its own local transaction, while the saga coordinates the overall business workflow and ensures the system can return to a consistent state when failures occur.

sequenceDiagram
    autonumber
    participant C as Client
    participant AS as agency-service<br/>Orchestrator
    participant DB1 as saga store
    participant AU as auth-service
    participant DB2 as auth DB

    C->>AS: POST /agencies
    AS->>DB1: INSERT saga (STARTED, payload)
    AS->>AU: provisionAgencyAccount(sagaId, …)
    AU->>DB2: BEGIN TX
    AU->>DB2: create org + user + role + provision_record
    AU->>DB2: COMMIT
    AU-->>AS: { userId, organizationId, roleId }
    AS->>DB1: UPDATE saga (AUTH_PROVISIONED)
    AS->>AS: create Agency row
    alt Agency row OK
        AS->>DB1: UPDATE saga (AGENCY_CREATED → COMPLETED)
        AS->>AU: sendAgencyWelcomeEmail (non-critical)
        AS-->>C: 200 OK + sagaId
    else Agency row fails
        AS->>DB1: UPDATE saga (COMPENSATING)
        AS->>AU: compensateAgencyAccount(sagaId)
        AU->>DB2: BEGIN TX
        AU->>DB2: delete role + token + user + org + record
        AU->>DB2: COMMIT
        AS->>DB1: UPDATE saga (COMPENSATED → FAILED)
        AS-->>C: 5xx + error code
    end

Read this once top to bottom and you'll understand the entire onboarding workflow. That's the value of orchestration — the sequence diagram is the architecture.

8. The State Machine

Every transition is written to agency_onboarding_sagas before the next step runs. That is what makes the saga observable and recoverable.

export enum AgencyOnboardingSagaStatus {
  STARTED            = 'STARTED',            // Row exists, no side effects yet
  AUTH_PROVISIONED   = 'AUTH_PROVISIONED',   // Auth side committed
  AGENCY_CREATED     = 'AGENCY_CREATED',     // Agency row committed
  COMPLETED          = 'COMPLETED',          // Happy-path terminal state
  COMPENSATING       = 'COMPENSATING',       // Rollback in progress
  COMPENSATED        = 'COMPENSATED',        // Rollback finished
  FAILED             = 'FAILED',             // Terminal failure (with or without compensation)
}

Why so many states? Because "what went wrong here?" is a question someone will ask at 2am. A saga that only stores success | failure is useless for forensics.

                ┌── auth fails ──────────► FAILED  (nothing to compensate)
                │
STARTED ──► AUTH_PROVISIONED ──► AGENCY_CREATED ──► COMPLETED  (happy path)
                                       │
                       agency fails ───┘
                                       ▼
                                COMPENSATING
                                       │
                                       ▼
                                COMPENSATED ──► FAILED  (consistent again)

The “point of no return” is AUTH_PROVISIONED. Before it, we can fail fast — there's nothing to undo. After it, every failure path must go through compensation.
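One way to keep the state machine honest is to encode the allowed transitions as data and check them before every write. The sketch below is illustrative only: `ALLOWED_TRANSITIONS`, `canTransition`, and `assertTransition` are hypothetical names that don't appear in the reference implementation, and the table reflects one reading of the diagram above (including the assumption that a failure after AGENCY_CREATED also goes through compensation). The enum is repeated so the snippet runs on its own.

```typescript
// Repeated from the article so the snippet is self-contained.
enum AgencyOnboardingSagaStatus {
  STARTED = 'STARTED',
  AUTH_PROVISIONED = 'AUTH_PROVISIONED',
  AGENCY_CREATED = 'AGENCY_CREATED',
  COMPLETED = 'COMPLETED',
  COMPENSATING = 'COMPENSATING',
  COMPENSATED = 'COMPENSATED',
  FAILED = 'FAILED',
}

const S = AgencyOnboardingSagaStatus;

// Hypothetical transition table (an assumption, derived from the diagram).
const ALLOWED_TRANSITIONS: Record<AgencyOnboardingSagaStatus, AgencyOnboardingSagaStatus[]> = {
  [S.STARTED]: [S.AUTH_PROVISIONED, S.FAILED],          // before the point of no return
  [S.AUTH_PROVISIONED]: [S.AGENCY_CREATED, S.COMPENSATING],
  [S.AGENCY_CREATED]: [S.COMPLETED, S.COMPENSATING],
  [S.COMPLETED]: [],                                    // terminal
  [S.COMPENSATING]: [S.COMPENSATED],
  [S.COMPENSATED]: [S.FAILED],
  [S.FAILED]: [],                                       // terminal
};

// True when the diagram permits moving from `from` to `to`.
function canTransition(
  from: AgencyOnboardingSagaStatus,
  to: AgencyOnboardingSagaStatus,
): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Throws instead of silently writing an impossible state to the saga table.
function assertTransition(
  from: AgencyOnboardingSagaStatus,
  to: AgencyOnboardingSagaStatus,
): void {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal saga transition: ${from} -> ${to}`);
  }
}
```

A repository method such as updateSaga could call `assertTransition(saga.status, next.status)` before persisting, so a bug that tries to jump from STARTED straight to COMPENSATING fails loudly instead of corrupting the log.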

9. Implementing the Orchestrator

The orchestrator is the only place that knows the workflow. Each step is a private method, and each step persists its result before returning.

Creating the Saga Record

// agency-onboarding.saga.repository.ts
async createSaga(payload: CreateAgencyOrchestrationInput) {
  return this.sagaModel.create({
    sagaId: randomUUID(),                          // correlation id for everything
    status: AgencyOnboardingSagaStatus.STARTED,
    currentStep: 'STARTED',
    payload,                                       // full input snapshot for replay
  });
}

The sagaId is a UUID generated once and propagated to every downstream call. It's the single identifier that ties the saga log on the orchestrator side to the provision record on the participant side.

The Main Loop

// agency-onboarding.orchestrator.ts (trimmed for the article)
async execute(input: CreateAgencyOrchestrationInput) {
  const saga = await this.sagaRepository.createSaga(input); // STARTED

  try {
    // Step 1 — auth-service work
    const authStep = await this.provisionAuth(saga, input);
    if (!authStep.ok) {
      await this.markFailed(saga, authStep.failure); // nothing to compensate
      return authStep.failure;
    }

    // Step 2 — agency-service work
    let activeSaga = authStep.saga; // status: AUTH_PROVISIONED
    try {
      activeSaga = await this.createAgencyRow(activeSaga, input, authStep.authIds);
    } catch (err) {
      // The expensive case: undo what auth-service did
      await this.compensateAuth(activeSaga, 'SAGA_FAILED');
      const failure = mapSagaFailure(err.message, 'SAGA_FAILED', 'CREATE_AGENCY');
      await this.markFailed(activeSaga, failure);
      return failure;
    }

    // Step 3 — mark done and run non-critical side effects
    activeSaga = await this.sagaRepository.updateSaga(activeSaga, {
      status: AgencyOnboardingSagaStatus.COMPLETED,
    });
    await this.sendWelcomeEmail(input, activeSaga); // best-effort

    return mapSagaSuccess(activeSaga, await this.agencyModel.findByPk(activeSaga.agencyId!));
  } catch (error) {
    // Defensive catch-all (lost DB connection, unexpected throw)
    await this.compensateAuth(saga, 'SAGA_FAILED');
    const failure = mapSagaFailure(error.message, 'SAGA_FAILED', 'SAGA');
    await this.markFailed(saga, failure);
    return failure;
  }
}

A Single Step in Detail

private async provisionAuth(saga: AgencyOnboardingSaga, input: ...) {
  this.logger.log(`[${saga.sagaId}] PROVISION_AUTH`);

  const auth = await firstValueFrom(
    this.authClient.provisionAgencyAccount({
      sagaId: saga.sagaId,                  // <-- correlation
      organizationName: input.agencyName.trim(),
      email: input.email.trim().toLowerCase(),
      // …
    }),
  );

  if (!auth.status || !auth.data) {
    return { ok: false, failure: mapAuthProvisionFailure(auth) };
  }

  // Persist the IDs we will need if we have to compensate later
  const updated = await this.sagaRepository.updateSaga(saga, {
    authOrganizationId: Number(auth.data.organizationId),
    authUserId: Number(auth.data.userId),
    authUserRoleId: Number(auth.data.userRoleId),
    status: AgencyOnboardingSagaStatus.AUTH_PROVISIONED,
  });

  return { ok: true, saga: updated, authIds: auth.data };
}

The line that does most of the work is the updateSaga call. It stores the foreign IDs returned by auth-service on the saga row, so even if the orchestrator process crashes and restarts, a recovery job can read that row and still know what to compensate.
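To make the recovery idea concrete, here is a minimal sketch of the selection logic such a job might use. The article doesn't show a recovery job, so everything here is an assumption: the `SagaRow` shape is a trimmed-down stand-in for the real model, and `findSagasToCompensate` and the staleness threshold are hypothetical. It works on an in-memory array so it can run without a database.

```typescript
// Trimmed-down, hypothetical view of a row in agency_onboarding_sagas.
interface SagaRow {
  sagaId: string;
  status: string;
  authUserId: number | null;
  authOrganizationId: number | null;
  updatedAt: Date;
}

// Non-terminal states where auth-side rows may exist without a finished saga.
const NEEDS_COMPENSATION = ['AUTH_PROVISIONED', 'COMPENSATING'];

// Returns sagas that have sat in a compensable state for at least `staleAfterMs`
// and still carry at least one auth ID to hand to the compensation call.
function findSagasToCompensate(rows: SagaRow[], now: Date, staleAfterMs: number): SagaRow[] {
  return rows.filter((row) => {
    const isStale = now.getTime() - row.updatedAt.getTime() >= staleAfterMs;
    const hasAuthIds = row.authUserId !== null || row.authOrganizationId !== null;
    return NEEDS_COMPENSATION.indexOf(row.status) !== -1 && isStale && hasAuthIds;
  });
}
```

In a real service the filter would be a SQL query against the saga table, and each returned row would be passed to the same compensation path the orchestrator already uses. That only works because the IDs were persisted before the crash.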

Habits Worth Copying

Persist after every successful step, including the IDs you'll need to undo it.
Distinguish critical vs non-critical steps. Welcome emails, audit logs and analytics events are not worth rolling a saga back for. They're best-effort.
One log line per transition, prefixed with [${sagaId}]. Grep is your debugger.

10. Implementing the Participant

The participant (auth-service) wraps all of its own work in a local DB transaction. Inside that boundary it's still ACID — the saga only handles the cross-service problem.

// agency-provisioning.service.ts (trimmed)
async provisionAgencyAccount(req: ProvisionAgencyAccountInput) {

  // 1. Idempotency — return the previous result if this sagaId already provisioned.
  const existing = await this.provisionRecordModel.findOne({
    where: { sagaId: req.sagaId },
  });
  if (existing) {
    return serviceSuccess('Agency admin already onboarded', {
      userId: Number(existing.userId),
      organizationId: Number(existing.organizationId),
      userRoleId: Number(existing.roleId),
    });
  }

  // 2. Domain validation BEFORE the transaction (fail fast).
  if (await this.emailExists(req.email)) {
    return serviceFailure('Email already exists', { code: 'EMAIL_EXISTS' });
  }
  if (await this.organizationExists(req.organizationName)) {
    return serviceFailure('Organization already exists', { code: 'ORGANIZATION_EXISTS' });
  }

  // 3. The actual work — atomic at the auth-service boundary.
  return withSequelizeTransaction(this.sequelize, async (tx) => {
    const org = await this.organizationModel.create({ ... }, { transaction: tx });
    const user = await this.userModel.create({ ..., organizationId: org.id }, { transaction: tx });
    await this.userRoleModel.create({ userId: user.id, roleId: agencyAdminRole.id }, { transaction: tx });

    // The audit record that makes compensation possible later.
    await this.provisionRecordModel.create(
      { sagaId: req.sagaId, organizationId: org.id, userId: user.id, roleId: agencyAdminRole.id },
      { transaction: tx },
    );

    return serviceSuccess('Provisioned', {
      userId: user.id, organizationId: org.id, userRoleId: agencyAdminRole.id,
    });
  });
}

Three things make this method "saga-safe":

Idempotency check first: If the orchestrator retries (network blip, gRPC timeout), the second call is a no-op that returns the same IDs. No duplicate users.
Validation outside the transaction: Cheap reads first, expensive writes second.
One transaction wraps every write: If any insert fails, the whole thing rolls back automatically. The orchestrator sees a clean failure response and knows nothing was persisted.

The agency_provision_records table is the single most important piece of the participant. It's both the idempotency key and the compensation lookup — keyed by the same sagaId the orchestrator uses.

11. Rollback (Compensation)

Compensation is just another gRPC call. The orchestrator sends the sagaId and the IDs it remembers. The participant deletes everything it created, in reverse dependency order, inside its own DB transaction.

On the Orchestrator Side

private async compensateAuth(saga: AgencyOnboardingSaga, errorCode?: string) {
  if (!saga.authUserId && !saga.authOrganizationId) {
    // Nothing was provisioned — nothing to compensate.
    return;
  }

  // Mark the saga as compensating BEFORE the call, so the row is consistent
  // even if the compensating RPC times out.
  await this.sagaRepository.updateSaga(saga, {
    status: AgencyOnboardingSagaStatus.COMPENSATING,
    currentStep: 'COMPENSATING',
    errorCode,
  });

  try {
    const rollback = await firstValueFrom(this.authClient.compensateAgencyAccount({
      sagaId: saga.sagaId,
      organizationId: saga.authOrganizationId,
      userId: saga.authUserId,
    }));
    if (!rollback.status) {
      this.logger.error(`[${saga.sagaId}] Auth compensation returned failure: ${rollback.message}`);
    }
  } catch (err) {
    this.logger.error(`[${saga.sagaId}] Auth compensation RPC failed: ${err.message}`);
  }

  await this.sagaRepository.updateSaga(saga, {
    status: AgencyOnboardingSagaStatus.COMPENSATED,
    currentStep: 'COMPENSATED',
  });
}

On the Participant Side

private async rollbackProvisionedAuth(req, sagaId: string, tx: Transaction) {
  // Use the saga log as the source of truth — even if the caller forgot IDs.
  const record = await this.provisionRecordModel.findOne({
    where: { sagaId }, transaction: tx,
  });
  const userId         = req.userId         ?? record?.userId;
  const organizationId = req.organizationId ?? record?.organizationId;

  if (userId) {
    const user = await this.userModel.findByPk(userId, { transaction: tx, attributes: ['email'] });
    await this.userRoleModel.destroy({ where: { userId }, transaction: tx });
    if (user?.email) {
      await this.passwordResetTokenModel.destroy({ where: { email: user.email }, transaction: tx });
    }
    await this.userModel.destroy({ where: { id: userId }, transaction: tx });
  }
  if (organizationId) {
    await this.organizationModel.destroy({ where: { id: organizationId }, transaction: tx });
  }
  if (record) {
    await record.destroy({ transaction: tx });
  }
}

Rules of a Good Compensation

Reverse the order of creation: Children first (user_roles, tokens), then parents (users, organizations). The same rule you follow for DROP TABLE statements.
Be idempotent: Receiving the same sagaId twice must be safe — every destroy is a no-op if the row is already gone.
Use the saga log, not just the request: If the caller forgets an ID or sends a partial payload, look it up by sagaId. Defence in depth.
Wrap it in a local transaction: The rollback must itself be atomic — half-undone is worse than not-undone.
Always close the loop on the orchestrator side: Mark COMPENSATED even if the RPC failed. The failure should also be surfaced (log, metric, alert). A stuck COMPENSATING row is an operational landmine.

What Happens if the Compensation Itself Fails?

This is the worst case in any saga design. There are three reasonable strategies:

First, you can retry with exponential backoff. This works for transient failures (network, deadlocks).

Second, you can dead-letter the saga — write it to a "needs human attention" queue and alert.

Third, you can expose a manual rollback endpoint. This reference implementation does that via RollbackAgencyOnboarding gRPC, so an operator can replay compensation with the same sagaId.

A production system should combine all three. The pattern doesn't decide for you. You decide based on your business risk.
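The first two strategies can be combined in a small helper. This is a sketch under stated assumptions, not code from the reference implementation: `backoffDelayMs`, `compensateWithRetry`, the `deadLetter` callback, and the default limits are all hypothetical. It also relies on the compensation call being idempotent, as described above, since it may be invoked several times for the same sagaId.

```typescript
// Delay before the next try after attempt number `attempt` (1-based):
// baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt - 1));
}

type CompensationOutcome = 'COMPENSATED' | 'DEAD_LETTERED';

// Tries `compensate` up to `maxAttempts` times, sleeping between failures.
// If every attempt fails, hands the last error to `deadLetter` for a human to review.
async function compensateWithRetry(
  compensate: () => Promise<void>,
  deadLetter: (lastError: unknown) => Promise<void>,
  maxAttempts: number = 5,
  baseMs: number = 200,
  maxMs: number = 5000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<CompensationOutcome> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await compensate();
      return 'COMPENSATED';
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await sleep(backoffDelayMs(attempt, baseMs, maxMs));
      }
    }
  }
  await deadLetter(lastError);
  return 'DEAD_LETTERED';
}
```

The `sleep` parameter is injectable so tests can skip real waiting. The third strategy, the manual rollback endpoint, is what an operator would use to work through whatever lands in the dead-letter queue.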

12. Tracking, Idempotency and Observability

Two tables, both keyed by the same UUID sagaId, give you full traceability across services.

Orchestrator Side — `agency_onboarding_sagas`

column	purpose
`sagaId` (UUID, unique)	Propagated to every RPC. The join key across services.
`status`	Current state in the state machine.
`currentStep`	Human-readable label for dashboards (`PROVISION_AUTH`, `CREATE_AGENCY`…).
`payload` (JSONB)	Snapshot of the input — used for replay, debug, support.
`authOrganizationId`, `authUserId`, `authUserRoleId`	Foreign IDs needed for compensation.
`agencyId`	Set once the agency row exists.
`errorCode`, `errorMessage`	Filled on failure.
`createdAt`, `updatedAt`	Timeline for the saga.

A real row in COMPLETED state looks roughly like this:

{
  "sagaId": "0a4f3e2c-7b11-4f8d-9a2c-90b6f5f5b8a1",
  "status": "COMPLETED",
  "currentStep": "COMPLETED",
  "agencyId": 17,
  "authOrganizationId": 42,
  "authUserId": 99,
  "authUserRoleId": 3,
  "errorCode": null,
  "errorMessage": null,
  "payload": { "agencyName": "Acme Education", "email": "admin@acme.com", "...": "..." },
  "createdAt": "2026-05-22T10:14:32.118Z",
  "updatedAt": "2026-05-22T10:14:33.412Z"
}

Participant Side — `agency_provision_records`

column	purpose
`sagaId` (unique)	Idempotency key. The same `sagaId` from the orchestrator.
`userId`, `organizationId`, `roleId`	What to delete on compensation.
`createdAt`, `updatedAt`	Audit timestamps.

Observability for Free

Because every log line is prefixed with [${sagaId}], a single grep across both services gives the full timeline:

[0a4f3e2c…] PROVISION_AUTH                  agency-service
[0a4f3e2c…] provisionAgencyAccount: ok      auth-service
[0a4f3e2c…] CREATE_AGENCY                   agency-service
[0a4f3e2c…] Agency step failed: ...         agency-service
[0a4f3e2c…] Auth compensation completed     auth-service

In a structured-logging setup (Loki, Elasticsearch, Datadog) this becomes a one-click filter. The sagaId is your distributed trace.

13. Testing a Saga

A saga is just a state machine, so the test matrix is finite and small. Cover at least these cases:

#	Scenario	Expected end state
1	Happy path	`COMPLETED`, agency exists, user exists
2	Auth step fails (e.g. email exists)	`FAILED`, no rows on either side
3	Agency step fails	`COMPENSATED` → `FAILED`, auth rows gone, no agency
4	Compensation RPC times out	`COMPENSATING` → operator-driven recovery
5	Caller retries with the same `sagaId`	Second call returns the first call's result; no duplicate rows
6	Welcome email fails	`COMPLETED` still — non-critical step did not cascade

Two practical tips for testing:

First, mock the gRPC client at the orchestrator level, not the network. You want to assert that compensateAgencyAccount was called with the right sagaId, not that bytes hit a socket.

Second, spin up a real Postgres in integration tests (Testcontainers, or a Docker Compose postgres service). The saga state machine is too easy to "test" against a mock and too easy to break against a real DB.
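As an illustration of the first tip, scenario 3 can be tested with a hand-written fake and no test framework. All names here are assumptions made for the sketch: `AuthClient` is a trimmed-down interface, `FakeAuthClient` records calls instead of touching the network, and `runOnboarding` is a deliberately tiny stand-in for the real orchestrator (no saga table, no state writes).

```typescript
// Hypothetical, trimmed-down client shape; the real gRPC client has more fields.
interface AuthClient {
  provisionAgencyAccount(req: { sagaId: string }): Promise<{ userId: number; organizationId: number }>;
  compensateAgencyAccount(req: { sagaId: string }): Promise<void>;
}

// Fake that records every call so the test can assert on them afterwards.
class FakeAuthClient implements AuthClient {
  calls: Array<{ method: string; sagaId: string }> = [];

  async provisionAgencyAccount(req: { sagaId: string }) {
    this.calls.push({ method: 'provision', sagaId: req.sagaId });
    return { userId: 99, organizationId: 42 };
  }

  async compensateAgencyAccount(req: { sagaId: string }) {
    this.calls.push({ method: 'compensate', sagaId: req.sagaId });
  }
}

// Stand-in for the orchestrator: provision, create the agency, compensate on failure.
async function runOnboarding(
  sagaId: string,
  auth: AuthClient,
  createAgencyRow: () => Promise<void>,
): Promise<'COMPLETED' | 'FAILED'> {
  await auth.provisionAgencyAccount({ sagaId });
  try {
    await createAgencyRow();
    return 'COMPLETED';
  } catch {
    await auth.compensateAgencyAccount({ sagaId });
    return 'FAILED';
  }
}

// Scenario 3: the agency step fails, so compensation must run with the same sagaId.
async function testAgencyStepFails(): Promise<void> {
  const auth = new FakeAuthClient();
  const result = await runOnboarding('saga-1', auth, async () => {
    throw new Error('agency insert failed');
  });
  if (result !== 'FAILED') throw new Error(`expected FAILED, got ${result}`);
  const last = auth.calls[auth.calls.length - 1];
  if (last.method !== 'compensate' || last.sagaId !== 'saga-1') {
    throw new Error('compensateAgencyAccount was not called with the saga id');
  }
}
```

The assertion is about behavior (which call was made, with which sagaId), not about transport. In a NestJS project the same fake could be supplied through the testing module's provider overrides, and the plain `throw` checks replaced by the assertions of whatever test runner the project uses.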

14. When NOT to Use a Saga

Sagas are not free. Skip them when:

One service does all the writes. Use a regular DB transaction. Don't reinvent the wheel.
The workflow is read-only or analytical. No rollback semantics exist for a SELECT.
The "rollback" is impossible. You sent a real email. You charged a credit card and the gateway doesn't support refunds. In those cases, design forward: send an apology email, queue a manual refund. Sagas can't unsend physical actions.
You don't actually have multiple services yet. A saga in a monolith is over-engineering. Wait until the service boundary is real.

A saga adds a state table, a compensation method per step, and an operational habit of grepping by sagaId. That cost is worth paying when the alternative is orphaned data — and not before.

15. Trade-offs and Lessons Learned

Things that worked well in this design:

Synchronous orchestration is easier to debug than choreography. A new engineer reads one file and understands the whole flow.
Idempotency at the participant is non-negotiable. Retries from the orchestrator must be safe. Build it in from day one — retro-fitting is painful.
The saga table replaces tribal knowledge. Ops can answer "what happened to this signup?" with a single SQL query. The payload JSONB is gold during incidents.
sagaId as the trace key plays nicely with OpenTelemetry / Datadog / Loki — no extra infra to set up.

Things to know before copying this pattern:

A failing compensation is the worst case. If compensateAgencyAccount itself errors, you have inconsistent state. Plan for retries + dead-letter + a manual rollback endpoint from the start.
Non-critical steps must be marked explicitly. Here, the welcome email is allowed to fail without rolling back the agency. Don't accidentally compensate over a flaky SMTP provider.
Sagas aren't a replacement for local transactions. Inside each service, still use a real DB transaction. The saga only handles the cross-service seam.
Synchronous gRPC is simple but couples availability. If auth-service is down, agency creation fails. Swap the gRPC calls for a durable message bus (RabbitMQ / Kafka) and treat each step as a command + reply when you need higher resilience.
The orchestrator becomes a critical service. Treat its uptime accordingly — monitor saga durations, alert on stuck COMPENSATING rows, and run more than one replica.

16. Conclusion

The saga pattern isn't magic. It's a disciplined version of what experienced engineers already do by hand: commit locally, record what you did, and know how to undo it.

In Node.js with NestJS, you only need three ingredients:

A state table to track the saga.
An orchestrator that drives the workflow and writes that state.
A participant that exposes a do and an undo operation, both idempotent and keyed by sagaId.

Get those three right and your microservices can offer the same "all-or-nothing" feel as a monolithic transaction — without the operational pain of distributed locks.

Start simple, use orchestration, make every step idempotent, persist before you call, and always know how to undo. That's the whole pattern.

How to Build a PostgreSQL-Backed Job Queue in Go

timothy ogbemudia — Tue, 09 Jun 2026 23:21:55 +0000

When you build a web application, not every task should happen inside a user's request.

Some work is slow. Some work can fail. Some work should happen later. Sending emails, resizing images, processing webhooks, generating reports, and retrying third-party APIs are all good examples.

These tasks are usually handled by a background job system.

In this article, you'll use an open source Go project called Swig as a practical example of how a PostgreSQL-backed job queue works in practice.

By the end, you'll understand how to build a background job queue with Go and PostgreSQL, and why PostgreSQL is more capable than most developers realize.

Prerequisites
What You Will Learn
What Is a Job Queue?
Why Use PostgreSQL for a Queue?
Swig's Architecture
How to Represent Jobs in PostgreSQL
How to Define a Worker in Go
How to Register Workers Without Sharing State
How to Add a Job
How to Handle Multiple Workers Safely
How to Use Goroutines for Concurrent Workers
How to Wake Workers with LISTEN/NOTIFY
How to Elect a Leader with Advisory Locks
How to Handle Failed Jobs
How to Abstract the Database Driver
Conclusion

Prerequisites

To follow along, you should have:

Basic familiarity with Go (structs, interfaces, goroutines)
A working understanding of PostgreSQL and SQL
Go installed (1.21 or later)
A PostgreSQL instance available locally or remotely

What You Will Learn

How to represent and store jobs in PostgreSQL
How to claim jobs safely across concurrent workers using FOR UPDATE SKIP LOCKED
How to wake workers efficiently using LISTEN/NOTIFY
How to elect a leader across instances using advisory locks
How Go interfaces, goroutines, contexts, and transactions fit together in a real system

What Is a Job Queue?

A job queue is a system that stores work to be done later.

Your application adds a job to the queue. A worker takes a job from the queue and runs it.

For example, when a user signs up, your application might create the user immediately and then add a job like this:

{
  "kind": "send_welcome_email",
  "payload": {
    "to": "user@example.com",
    "subject": "Welcome!"
  }
}

A background worker later picks up that job and sends the email. This keeps the user request fast. The signup route doesn't need to wait for the email provider before returning a response.

A job queue usually needs to answer a few important questions:

Where are jobs stored?
How do workers find jobs?
How do you stop two workers from processing the same job?
How do you retry failed jobs?
How do you shut workers down safely?
How do you keep job creation consistent with application data?

Swig answers those questions with Go and PostgreSQL.

Why Use PostgreSQL for a Queue?

Many job queues use Redis, RabbitMQ, SQS, or Kafka. Those are all useful tools. But many applications already depend on PostgreSQL. If your app already has Postgres, you may not want to operate another service just to run background jobs.

PostgreSQL gives you several features that are surprisingly useful for queues:

Tables for durable job storage
Transactions for atomic writes
Row locks for safe concurrent processing
SKIP LOCKED for letting workers claim different jobs
LISTEN/NOTIFY for waking workers when new jobs arrive
Advisory locks for leader election
JSONB for flexible job payloads

The tradeoff is important. A PostgreSQL-backed queue isn't trying to replace Kafka for event streaming or RabbitMQ for complex routing. It makes common application background jobs simple, reliable, and easy to operate without adding infrastructure.

Swig's Architecture

At a high level, Swig has five parts:

A swig_jobs table in PostgreSQL
Go workers that process jobs
A worker registry that maps job names to worker types
A driver layer that supports both pgx and database/sql
A leader loop for shared maintenance work

The basic flow looks like this:

Your app calls AddJob
Swig serializes the job payload to JSON
Swig inserts a row into swig_jobs
PostgreSQL sends a notification that a job was created
A Go worker wakes up and tries to claim one pending job
PostgreSQL row locks ensure only one worker claims that row
The worker runs the job
Swig marks the job as completed or failed

The hard parts are concurrency, failure, connection lifecycle, and shutdown. That's where Go and PostgreSQL work together.

How to Represent Jobs in PostgreSQL

A simplified version of Swig's job table looks like this:

CREATE TABLE swig_jobs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  kind TEXT NOT NULL,
  queue TEXT NOT NULL,
  payload JSONB NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending',
  priority INTEGER NOT NULL DEFAULT 0,
  attempts INTEGER NOT NULL DEFAULT 0,
  max_attempts INTEGER NOT NULL DEFAULT 3,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  scheduled_for TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  instance_id UUID,
  worker_id UUID,
  locked_at TIMESTAMPTZ,
  last_error TEXT,
  last_error_at TIMESTAMPTZ
);

Each row is one job. The important columns are:

kind: the type of job, such as send_email
payload: the JSON data needed to run the job
status: whether the job is pending, processing, completed, or failed
attempts: how many times the job has been tried
scheduled_for: when the job is allowed to run
locked_at: when the job was claimed

The table is the source of truth. PostgreSQL notifications can wake workers, but notifications aren't the durable queue. The rows in swig_jobs are.

How to Define a Worker in Go

In Swig, a worker is a Go type that knows how to process one kind of job.

Here's a simple email worker:

type EmailWorker struct {
    To      string `json:"to"`
    Subject string `json:"subject"`
    Body    string `json:"body"`
}

func (w *EmailWorker) JobName() string {
    return "send_email"
}

func (w *EmailWorker) Process(ctx context.Context) error {
    fmt.Printf("Sending email to %s with subject %s\n", w.To, w.Subject)
    return nil
}

There are two important methods:

JobName tells Swig what kind of job this worker handles
Process contains the actual work

The struct fields are also the job arguments. When you enqueue an EmailWorker, Swig serializes the struct into JSON and stores it in PostgreSQL. Later, a worker claims the row, unmarshals the JSON back into a fresh EmailWorker, and calls Process.

Go Interfaces

Go interfaces describe behavior. Swig doesn't need to know the exact concrete type of every worker. It only needs to know that a worker can provide a job name and process a job:

type Worker interface {
    JobName() string
    Process(context.Context) error
}

If a type has those methods, it satisfies the interface with no explicit declaration required. This is one of the reasons interfaces are so useful in Go. They let you design around behavior instead of inheritance.

Swig has a worker registry that maps a job name to a worker type:

registry := workers.NewWorkerRegistry()
registry.RegisterWorker(&EmailWorker{})

Later, when a job row says kind = 'send_email', Swig looks up the registered worker and runs it.

There's a subtle concurrency issue here. If the registry stored the exact &EmailWorker{} pointer and reused it for every job, multiple goroutines could unmarshal payloads into the same Go value at the same time.

Swig avoids this with a factory approach internally. Registration captures the worker type, and each claimed job gets a fresh worker instance before JSON is unmarshaled. The API stays simple, but internally Swig creates a new EmailWorker for each job. This is a useful Go pattern: keep the public API simple while making the internal lifecycle safer.

How to Add a Job

Here's what adding a job looks like from the user side:

err := swigClient.AddJob(ctx, &EmailWorker{
    To:      "user@example.com",
    Subject: "Welcome!",
    Body:    "Thanks for signing up.",
})

Inside Swig, the process is roughly:

argsJSON, err := json.Marshal(workerWithArgs)
if err != nil {
    return err
}

_, err = db.ExecContext(ctx, `
    INSERT INTO swig_jobs (kind, queue, payload, priority, scheduled_for, status)
    VALUES ($1, $2, $3, $4, $5, 'pending')
`, jobName, queue, argsJSON, priority, runAt)

How to Enqueue Jobs Inside Transactions

One of the best reasons to use PostgreSQL for jobs is transactional enqueueing.

Imagine a user signs up. You want to insert the user and queue a welcome email. If those happen separately, you can get inconsistent states. With a transaction, both succeed or both fail:

tx, err := pool.Begin(ctx)
if err != nil {
    return err
}
defer tx.Rollback(ctx)

_, err = tx.Exec(ctx, `INSERT INTO users (email) VALUES ($1)`, email)
if err != nil {
    return err
}

err = swigClient.AddJobWithTx(ctx, tx, &EmailWorker{
    To:      email,
    Subject: "Welcome!",
    Body:    "Thanks for joining.",
})
if err != nil {
    return err
}

return tx.Commit(ctx)

If the transaction rolls back, the user isn't created and the job isn't queued. This is much harder to guarantee when your database and queue are separate systems.

How to Handle Multiple Workers Safely

A queue gets interesting when many workers run at the same time. Imagine three workers all asking PostgreSQL for the next pending job. You don't want all three to process the same job.

A naïve approach has a race condition. Two workers can select the same job before either one updates it.

PostgreSQL FOR UPDATE SKIP LOCKED

PostgreSQL can lock rows selected inside a transaction. FOR UPDATE means "lock this row because I plan to update it." SKIP LOCKED means "if another worker already locked a row, skip it and find another one."

This is perfect for a queue:

Worker A locks job 1
Worker B skips job 1 and locks job 2
Worker C skips jobs 1 and 2 and locks job 3

No central coordinator is needed. Swig uses an atomic update pattern:

UPDATE swig_jobs
SET status = 'processing',
    instance_id = $1,
    worker_id = $2,
    locked_at = NOW(),
    attempts = attempts + 1
WHERE id = (
  SELECT id
  FROM swig_jobs
  WHERE status = 'pending'
    AND scheduled_for <= NOW()
  ORDER BY priority DESC, created_at
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
RETURNING id, kind, payload;

This query finds a pending job, skips already-locked jobs, marks it as processing, records which worker claimed it, and returns the job data. All of this happens atomically. Workers never do a separate SELECT and hope the later UPDATE is still safe.

How to Use Goroutines for Concurrent Workers

Swig starts worker loops as goroutines:

for i := 0; i < maxWorkers; i++ {
    go s.startWorker(ctx, queueType)
}

Each worker runs independently. PostgreSQL coordinates which job each worker gets. Go handles concurrency with goroutines, while PostgreSQL handles safe job claiming with locks.

How to Handle Graceful Shutdown

When a service shuts down, it should wait for workers to finish cleanly. Go's sync.WaitGroup helps:

var wg sync.WaitGroup

wg.Add(1)
go func() {
    defer wg.Done()
    processJobs()
}()

wg.Wait()

Swig also uses sync.Once to make shutdown idempotent. Calling Stop more than once shouldn't panic because of a double channel close. Shutdown paths are often where production systems behave differently from happy-path demos.

How to Wake Workers with LISTEN/NOTIFY

If workers constantly poll the database for jobs, they waste resources when the queue is empty. PostgreSQL has LISTEN/NOTIFY to solve this.

A connection can listen on a channel:

LISTEN swig_jobs;

Another session can send a notification:

NOTIFY swig_jobs, '{"id":"job-id"}';

Swig creates a trigger so PostgreSQL sends a notification after a job is inserted. Workers sleep when there's no work and wake when a new job arrives.

There's an important PostgreSQL detail here: LISTEN is session-scoped. A worker must wait for notifications on the same database session that executed LISTEN. Swig handles this by creating a dedicated listener for each worker that owns one database session throughout its lifecycle.

This is a common backend engineering lesson: abstractions like connection pools are useful, but some database features depend on the lifecycle of a specific connection.

How to Elect a Leader with Advisory Locks

Some queue maintenance tasks should only run on one instance at a time, including retrying failed jobs, recovering stale jobs, and cleaning old history.

Swig uses PostgreSQL advisory locks for this:

SELECT pg_try_advisory_lock($1);

If the result is true, that Swig instance becomes the leader. Advisory locks are also session-scoped, so Swig uses a dedicated advisory-lock connection for leadership. If that session ends, PostgreSQL releases the lock and another instance can take over. Simple failover without ZooKeeper or etcd.

How to Handle Failed Jobs

When a worker returns an error, Swig records the error and either retries the job or marks it as failed:

UPDATE swig_jobs
SET status = CASE
    WHEN attempts >= max_attempts THEN 'failed'
    ELSE 'pending'
  END,
  last_error = $2,
  last_error_at = NOW()
WHERE id = $1;

A Note on Delivery Semantics

It's tempting to say a job queue processes jobs exactly once. In distributed systems, that's a dangerous claim.

Consider this scenario:

A worker sends an email
The worker crashes before marking the job completed
The job is retried
The email might be sent again

The accurate description is that Swig provides atomic claiming and at-least-once processing. Because jobs can be retried, workers should be idempotent. Running the same operation more than once should produce the same result as running it once.

How to Abstract the Database Driver

Swig supports both pgx and database/sql through a driver interface:

type Driver interface {
    Exec(ctx context.Context, sql string, args ...interface{}) error
    Query(ctx context.Context, sql string, args ...interface{}) (Rows, error)
    QueryRow(ctx context.Context, sql string, args ...interface{}) Row
    WithTx(ctx context.Context, fn func(tx Transaction) error) error
    NewListener(ctx context.Context, channel string) (Listener, error)
    TryAdvisoryLock(ctx context.Context, lockID int64) (AdvisoryLock, bool, error)
}

The core queue code only depends on behavior, not a specific library. This is a common Go design: define the behavior your core package needs, write small adapters for concrete dependencies, and keep the core logic independent.

Conclusion

A PostgreSQL-backed queue isn't the right answer for every system. If you need massive event streaming, Kafka may be a better fit. If you need complex routing, RabbitMQ may be better.

But for many Go applications, PostgreSQL is already there. Swig shows how far you can get with a small Go API and a few PostgreSQL features:

Store jobs in a table
Claim jobs atomically with FOR UPDATE SKIP LOCKED
Wake workers with dedicated LISTEN/NOTIFY sessions
Coordinate leadership with advisory locks
Keep app data and jobs consistent with transactions
Manage worker lifecycles with goroutines and contexts

That combination makes a solid foundation for background processing and a great project for learning how Go and PostgreSQL work together in production systems. You can explore the full source code at github.com/glamboyosa/swig.

How to Use PostgreSQL as a Cache, Queue, and Search Engine

Aaron Yong — Tue, 21 Apr 2026 16:58:55 +0000

"Just use Postgres" has been circulating as advice for years, but most articles arguing for it are opinion pieces. I wanted hard numbers.

So I built a benchmark suite that pits vanilla PostgreSQL against a feature-optimized PostgreSQL instance — measuring caching, message queues, full-text search, and pub/sub under controlled conditions.

In this article, you'll learn how to use PostgreSQL's built-in features for caching, job queues, full-text search, and pub/sub. You'll see actual benchmark results (latency percentiles, throughput, and error rates) comparing naive PostgreSQL patterns against optimized ones, and understand where PostgreSQL's limits are so you can decide whether you really need that extra service in your stack.

Prerequisites
The Setup
Benchmark 1: Caching with UNLOGGED Tables
Benchmark 2: Job Queues with SKIP LOCKED
Benchmark 3: Full-Text Search with tsvector
Benchmark 4: Pub/Sub with LISTEN/NOTIFY
The Combined Workload: The Honest Test
What I Learned

Prerequisites

To follow along or reproduce the benchmarks, you'll need:

Docker and Docker Compose
Node.js 20+ (for the Express TypeScript API layer)
k6 for load testing
Basic familiarity with SQL and PostgreSQL

The full benchmark project is open source on GitHub — you can clone it and run every test yourself.

The Setup

The benchmark uses two identical PostgreSQL 17 instances running in Docker containers, each with fixed resource constraints (2 CPUs, 2 GB RAM). Both share the same Express TypeScript API layer — the only difference is which PostgreSQL features are enabled.

┌─────────┐     ┌──────────────────┐     ┌─────────────────┐
│   k6    │────>│  Express API     │────>│  PG Baseline    │
│  (load  │     │  (TypeScript)    │     │  (vanilla PG17) │
│  test)  │────>│  Port 3001/3002  │────>│  PG Modded      │
└─────────┘     └──────────────────┘     │  (features on)  │
                                         └─────────────────┘

The baseline instance uses naïve approaches (regular tables, ILIKE search, polling). The modded instance uses PostgreSQL's built-in features (UNLOGGED tables, tsvector with GIN indexes, LISTEN/NOTIFY, partial indexes). Same hardware, same API code, same data. Only the database features differ.

Both instances share this tuned postgresql.conf:

# Memory allocation
shared_buffers = 512MB           # 25% of available RAM
effective_cache_size = 1536MB    # 75% of RAM — helps the query planner
work_mem = 16MB                  # per-sort/hash operation memory

# SSD-optimized planner settings
random_page_cost = 1.1           # default 4.0 assumes spinning disks
effective_io_concurrency = 200   # allow parallel I/O on SSDs

These settings matter. The defaults assume spinning disks from the early 2000s. Setting random_page_cost = 1.1 tells the query planner that random reads are nearly as fast as sequential reads on SSDs, which encourages index usage over sequential scans.

Benchmark 1: Caching with UNLOGGED Tables

The idea: Use an UNLOGGED table as an in-database cache. UNLOGGED tables skip PostgreSQL's Write-Ahead Log (WAL) — the mechanism that guarantees durability. Since cache data is ephemeral by nature, losing it on a crash is acceptable, and skipping WAL removes the biggest write bottleneck.

-- Modded: UNLOGGED table for cache entries
CREATE UNLOGGED TABLE cache_entries (
    key TEXT PRIMARY KEY,
    value JSONB NOT NULL,
    expires_at TIMESTAMPTZ
);

-- Baseline: same schema, but a regular (logged) table
CREATE TABLE cache_entries (
    key TEXT PRIMARY KEY,
    value JSONB NOT NULL,
    expires_at TIMESTAMPTZ
);

Results (200 Virtual Users)

Mode	p50	p95	avg	req/s
Baseline (regular table)	1.87ms	6.00ms	2.50ms	1,754/s
Modded (UNLOGGED table)	1.71ms	5.24ms	2.17ms	1,760/s

That works out to roughly a 9% improvement at p50 and about 13% at p95 and on average, with throughput essentially unchanged. Not dramatic, but free: you change one keyword in your CREATE TABLE statement.

Under Stress (1,000 Virtual Users, No Sleep)

Mode	p50	p95	req/s	Total Requests
Baseline	83.38ms	143.23ms	7,663/s	728,021
Modded	77.69ms	126.39ms	8,062/s	765,934

The p95 improvement holds at about 12% under stress (p50 improves by about 7%), so the gain is roughly stable across load levels. The UNLOGGED advantage is a per-write optimization: each write skips the same WAL work whether you are doing 100 or 10,000 writes per second. The modded instance served nearly 38,000 more requests in the same time window.
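
If you want to check the percentages yourself, the arithmetic is short. This sketch only uses the latency and request numbers from the two tables above. The improvement helper is a name chosen for this example.

```python
def improvement(baseline, modded):
    """Percent reduction in latency relative to the baseline."""
    return round((baseline - modded) / baseline * 100, 1)


# (baseline, modded) latencies in milliseconds, copied from the tables above
normal = {"p50": (1.87, 1.71), "p95": (6.00, 5.24), "avg": (2.50, 2.17)}
stress = {"p50": (83.38, 77.69), "p95": (143.23, 126.39)}

normal_gain = {name: improvement(*pair) for name, pair in normal.items()}
stress_gain = {name: improvement(*pair) for name, pair in stress.items()}
extra_requests = 765_934 - 728_021  # total requests, modded minus baseline
```

The p95 figures land between 11% and 13% in both runs, while the p50 gains are smaller, so the tail latency is where UNLOGGED helps most in this benchmark.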

The Verdict

UNLOGGED tables won't match Redis for sub-millisecond hot-path caching (real-time bidding, gaming leaderboards). But for web applications where the difference between 2ms and 5ms is invisible to users, they eliminate an entire infrastructure dependency for zero additional complexity.

You do give up Redis data structures (sorted sets, HyperLogLog, streams). If you need those, a dedicated cache is still the right call.

Benchmark 2: Job Queues with SKIP LOCKED

The idea: Use PostgreSQL as a job queue with SELECT ... FOR UPDATE SKIP LOCKED. Multiple workers poll the same table, and SKIP LOCKED ensures each worker gets a different row — no duplicates, no contention.

-- Queue table with a partial index on pending jobs only
CREATE TABLE job_queue (
    id SERIAL PRIMARY KEY,
    payload JSONB NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Partial index: only indexes pending jobs
-- As jobs complete, they leave the index — it stays small forever
CREATE INDEX idx_pending_jobs ON job_queue (created_at)
    WHERE status = 'pending';

The dequeue pattern:

-- Atomic dequeue: select + update in one statement
UPDATE job_queue SET status = 'processing'
WHERE id = (
    SELECT id FROM job_queue
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED  -- skip rows locked by other workers
) RETURNING *;

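How SKIP LOCKED works: Worker A locks row 1. Worker B tries row 1, sees the lock, skips it, and takes row 2 instead. No blocking, no duplicates. If a worker crashes while its transaction is still open, the transaction rolls back and the row becomes available again. If the claim has already been committed, the row stays in 'processing' and needs a separate timeout or recovery step before it can be retried.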
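
The locking behavior is easier to follow when you can step through it. Here is a minimal Python model of it, assuming each worker keeps its transaction open while it processes the job. The jobs list, the row_locks dictionary, and the worker names are all invented for the example, and no database is involved.

```python
jobs = [
    {"id": 1, "status": "pending"},
    {"id": 2, "status": "pending"},
    {"id": 3, "status": "pending"},
]
row_locks = {}  # job id -> worker whose open transaction holds the row lock


def claim(worker):
    """Take the oldest pending job that no other worker has locked."""
    for job in jobs:  # the list is already in created_at order
        if job["status"] != "pending":
            continue
        if job["id"] in row_locks:  # SKIP LOCKED: don't wait, move to the next row
            continue
        row_locks[job["id"]] = worker
        return job["id"]
    return None


def rollback(worker):
    """A crashed worker's open transaction rolls back and frees its row locks."""
    for job_id in [j for j, w in row_locks.items() if w == worker]:
        del row_locks[job_id]


a = claim("worker-a")  # locks job 1
b = claim("worker-b")  # skips job 1, locks job 2
rollback("worker-a")   # worker-a dies before committing
c = claim("worker-c")  # job 1 can be claimed again
```

Worker B never waits on worker A, and job 1 only becomes visible to worker C after A's lock is gone.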

Results (100 Producers + 50 Consumers)

Mode	p50	p95	avg	req/s
Baseline (full index)	1.90ms	5.01ms	2.30ms	1,053/s
Modded (partial index)	1.81ms	5.28ms	2.29ms	1,052/s

They're virtually identical. The partial index doesn't show its value in a 60-second benchmark because the table doesn't accumulate enough completed rows for the index size difference to matter. In a production system with millions of completed jobs, the partial index stays proportional to the pending backlog, while a full index keeps growing with the table.

The Verdict

SKIP LOCKED is production-ready for job queues. Libraries like pg-boss (Node.js) and river (Go) build on this exact pattern.

You do give up exchange/routing patterns (fan-out, topic-based routing) and consumer groups with message replay. If you need those, a dedicated message broker is still the right tool. For simple "process this job once" workloads, PostgreSQL handles it.

Benchmark 3: Full-Text Search with tsvector

The idea: Use PostgreSQL's built-in full-text search instead of a separate search service. A tsvector column stores pre-processed search tokens, and a GIN (Generalized Inverted Index) index enables fast lookups using the same inverted-index concept that powers Elasticsearch.

-- Search-optimized article table
CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT NOT NULL,
    search_vector tsvector  -- pre-computed search tokens
);

-- GIN index for full-text search
CREATE INDEX idx_search ON articles USING GIN (search_vector);

-- Auto-update search_vector on insert/update
CREATE OR REPLACE FUNCTION update_search_vector() RETURNS trigger AS $$
BEGIN
    NEW.search_vector := to_tsvector('english',
        COALESCE(NEW.title, '') || ' ' || COALESCE(NEW.body, ''));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_search
    BEFORE INSERT OR UPDATE ON articles
    FOR EACH ROW EXECUTE FUNCTION update_search_vector();

The baseline uses ILIKE with a leading wildcard — the approach most developers reach for first:

-- Baseline: sequential scan on every query
SELECT * FROM articles
WHERE title ILIKE '%postgresql%' OR body ILIKE '%postgresql%';

-- Modded: GIN index lookup with relevance ranking
SELECT id, title,
    ts_rank(search_vector, plainto_tsquery('english', 'postgresql')) AS rank
FROM articles
WHERE search_vector @@ plainto_tsquery('english', 'postgresql')
ORDER BY rank DESC LIMIT 20;
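
To see why an inverted index changes the cost of a search, here is a rough Python sketch of the two approaches. It is a simplification rather than how PostgreSQL implements GIN: the tokenize function is a crude stand-in for to_tsvector (no stemming, no stop words), and the three sample titles are made up.

```python
from collections import defaultdict

articles = {
    1: "Scaling PostgreSQL for reads",
    2: "A tour of Redis data structures",
    3: "Full-text search in PostgreSQL",
}


def tokenize(text):
    # Crude stand-in for to_tsvector: lowercase words, no stemming or stop words.
    return {word.strip(".,").lower() for word in text.split()}


# Built once, ahead of time: token -> ids of the articles that contain it.
inverted = defaultdict(set)
for article_id, title in articles.items():
    for token in tokenize(title):
        inverted[token].add(article_id)


def search_index(term):
    # One dictionary lookup, no matter how many articles exist.
    return sorted(inverted.get(term.lower(), set()))


def search_scan(term):
    # ILIKE '%term%' equivalent: every row is examined on every query.
    return sorted(i for i, title in articles.items() if term.lower() in title.lower())
```

Both functions return articles 1 and 3 for "postgresql". They differ on partial words: the scan matches "postgres" as a substring, while the token lookup does not, which mirrors the fact that full-text search matches whole tokens rather than arbitrary substrings.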

Results (500 Virtual Users)

Mode	p50	p95	avg	req/s
Baseline (ILIKE)	1.96ms	101.83ms	25.22ms	561/s
Modded (tsvector + GIN)	2.76ms	10.39ms	3.76ms	675/s

This is the standout result. The baseline's p95 of 101ms versus the modded's 10ms is a 10x improvement.

Why the baseline's p50 (1.96ms) is slightly better than the modded's (2.76ms): simple ILIKE queries on small result sets can be fast when the data fits in shared_buffers. But as load increases and the buffer cache is contested, sequential scans degrade dramatically. The GIN index stays stable.

Under Stress (500 Virtual Users, No Sleep)

Mode	p50	p95	req/s	Total Requests
Baseline (ILIKE)	599ms	1,000ms	558/s	50,212
Modded (tsvector)	209ms	396ms	1,441/s	129,679

ILIKE collapses to 1-second p95 latencies. Each query forces a sequential scan of all 10,000 articles, churning through shared buffers and starving concurrent queries. The tsvector approach serves about 2.6x as many requests in the same time window because each GIN lookup reads only the index entries for the search term instead of every row.

The Verdict

This is the strongest argument in the entire benchmark. The fix requires zero extensions — to_tsvector(), plainto_tsquery(), and CREATE INDEX USING GIN are all built into core PostgreSQL. If you're doing WHERE column ILIKE '%term%' on any table with more than a few thousand rows, you're leaving massive performance on the table.

You do give up distributed search across shards, complex analyzers for CJK languages, and aggregation/faceted search pipelines. For a product search bar, blog search, or internal tool — PostgreSQL is enough.

Benchmark 4: Pub/Sub with LISTEN/NOTIFY

The idea: Use PostgreSQL's native LISTEN/NOTIFY for pub/sub messaging, triggered automatically on INSERT via a database trigger.

-- Trigger that fires pg_notify on every new message
CREATE OR REPLACE FUNCTION notify_message() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify(NEW.channel, NEW.payload::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_notify
    AFTER INSERT ON messages
    FOR EACH ROW EXECUTE FUNCTION notify_message();

Results (200 Virtual Users)

Mode	p50	p95	avg	req/s
Baseline (poll-based)	1.99ms	6.04ms	2.84ms	1,116/s
Modded (LISTEN/NOTIFY)	1.65ms	4.80ms	2.13ms	1,131/s

Here we have a 20% improvement at p95. The trigger-based approach does more work per INSERT (INSERT + NOTIFY), but the reduced round trips and better connection reuse patterns offset the overhead.

The Verdict

LISTEN/NOTIFY works for real-time features where you would otherwise reach for Redis pub/sub. The main limitations are payload size (8,000 bytes maximum) and the requirement for dedicated connections (incompatible with PgBouncer in transaction mode).
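
A common way to live with the payload limit is to send the full message only when it fits, and otherwise send just the row ID so the listener can fetch the row itself. The sketch below shows that guard on the application side. The build_notification helper and the "truncated" flag are hypothetical, not part of the benchmark project.

```python
import json

# In PostgreSQL's default configuration the payload must be shorter than 8000 bytes.
MAX_NOTIFY_BYTES = 7999


def build_notification(row_id, payload):
    """Return the string to hand to pg_notify, falling back to a row reference."""
    body = json.dumps(payload)
    if len(body.encode("utf-8")) <= MAX_NOTIFY_BYTES:
        return body
    # Too large: send only the id and let the listener SELECT the row.
    return json.dumps({"id": row_id, "truncated": True})


small = build_notification(1, {"event": "signup"})
large = build_notification(2, {"blob": "x" * 10_000})
```

Measuring the encoded bytes rather than the character count matters here, because non-ASCII text takes more than one byte per character in UTF-8.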

The Combined Workload: The Honest Test

Individual benchmarks are flattering. The real question: can one PostgreSQL instance handle caching, queues, search, and pub/sub simultaneously without degrading?

Results (All Four Workloads Running Together)

Mode	p50	p95	avg	req/s
Baseline	1.65ms	5.24ms	2.17ms	1,424/s
Modded	1.86ms	6.05ms	2.47ms	1,417/s

Under combined load, the baseline marginally outperforms the modded setup. The modded PostgreSQL does more work per operation — maintaining GIN indexes, firing triggers, running pg_cron in the background. When all these features are active simultaneously, the overhead is measurable: about 15% higher p95 latency.

But both setups stay comfortably under 10ms at p95. For most web applications, that's more than good enough.

What I Learned

After running all these benchmarks, here's what I would tell a team evaluating whether to "just use Postgres":

Do it for full-text search: Switching from ILIKE to tsvector with a GIN index is a roughly 10x improvement at p95 that requires zero extensions. It was the highest-ROI change in this benchmark, and many developers don't know it exists.
Do it for job queues: SKIP LOCKED is production-ready and eliminates RabbitMQ for simple "process this job" workloads. Use a library like pg-boss or river rather than rolling your own.
Consider it for caching: UNLOGGED tables give a steady 12-13% improvement at p95 over regular tables. If sub-millisecond latency is not a hard requirement (and for most web apps, it is not), you can drop Redis entirely.
Be honest about the overhead: With all four roles running simultaneously, the feature-heavy instance showed about 15% higher p95 latency than the baseline. Whether that matters depends on your latency budget.
Know where to stop: PostgreSQL won't match Redis for sub-millisecond caching, Kafka for millions of messages per second, or Elasticsearch for distributed multi-node search with complex analyzers. The line is at extreme throughput or extreme specialization.

The honest conclusion is not "PostgreSQL does everything." It is: for most applications, a single well-configured PostgreSQL instance can cover much of what you would otherwise deploy three to five additional services for. That is less infrastructure to deploy, monitor, and maintain, and fewer things to break at 3 AM.

Enterprise-scale applications processing millions of messages per second, serving sub-millisecond cache hits to millions of concurrent users, or running distributed search across terabytes of documents will still need specialized tools. Those tools exist for a reason, and at that scale the operational cost of running them is justified by the performance you get back.

But most of us aren't building at that scale — and may never need to. Starting with PostgreSQL for these roles means you ship faster with fewer moving parts. If and when you outgrow what PostgreSQL can handle, your benchmarks will tell you exactly which role needs to be extracted into a dedicated service. That is a much better position than starting with five services on day one because you assumed you would need them.

The benchmark project is open source if you want to reproduce these results or adapt the tests for your own workload.

You can find more of my writing at site.aaronhsyong.com.

How Database Indexes Work – A Practical Guide with PostgreSQL Examples

iyiola — Thu, 16 Apr 2026 17:27:44 +0000

Every developer eventually runs into a slow query. The table has grown from a few hundred rows to a few million, and what used to take milliseconds now takes seconds — or worse.

The fix, more often than not, is an index.

A database index is a data structure that helps the database find rows faster without scanning the entire table. It works a lot like the index at the back of a textbook: instead of reading every page to find a topic, you look it up in the index, get the page number, and go straight there.

In this tutorial, you'll learn how indexes work under the hood, how to create and use them effectively in PostgreSQL, and how to avoid the common mistakes that make indexes useless or even harmful.

Prerequisites
Why Do You Need Indexes?
How Indexes Work Under the Hood
How to Create Your First Index
How to Use EXPLAIN ANALYZE to Measure Performance
Types of Indexes in PostgreSQL
How to Create a Composite Index
How to Create a Partial Index
How to Create an Expression Index
How to Create a Unique Index
How to Manage Indexes
When Indexes Hurt Instead of Help
Common Mistakes That Prevent Index Usage
Best Practices for Indexing
Conclusion

Prerequisites

To follow along with the examples, you'll need:

Basic knowledge of SQL (SELECT, INSERT, UPDATE, DELETE, WHERE, JOIN)
A running PostgreSQL instance (version 12 or later)
A SQL client like psql, pgAdmin, or DBeaver

If you don't have PostgreSQL installed locally, you can use a free cloud-hosted instance from services like Neon or Supabase.

Why Do You Need Indexes?

When you run a query like SELECT * FROM users WHERE email = 'jane@example.com', the database needs to find the matching row. Without an index, PostgreSQL performs a sequential scan — it reads every single row in the table and checks whether the email column matches.

For a table with 100 rows, this is fine. For a table with 10 million rows, it's painfully slow.

An index solves this by creating a separate, sorted data structure that maps column values to their row locations. Instead of scanning 10 million rows, PostgreSQL can look up the value in the index and jump directly to the matching row. This can reduce query time from seconds to milliseconds.

But indexes aren't free. They come with trade-offs you need to understand before adding them everywhere. You'll learn about those trade-offs throughout this tutorial.

How Indexes Work Under the Hood

PostgreSQL's default index type is the B-tree (balanced tree). Understanding how a B-tree works will help you make smarter decisions about when and how to index.

A B-tree organizes data into a sorted, hierarchical structure made up of three kinds of nodes:

Root node — the top of the tree. It holds a few values that divide the data into broad ranges.
Internal nodes — each one further narrows down the range.
Leaf nodes — the bottom level. These hold the actual indexed values along with pointers to the corresponding rows in the table.

When PostgreSQL uses a B-tree index to find a value, it starts at the root and follows the path that matches the target value, moving through internal nodes until it reaches the correct leaf node. This path is called a tree traversal, and it typically requires only 3–4 steps even for tables with millions of rows.

Think of it like a phone book. You don't start at page one and read every name. You open to roughly the right section (root), narrow it down to the right page (internal nodes), and scan the entries on that page (leaf node).

This sorted structure is also why B-tree indexes work well for range queries like WHERE price > 50 AND price < 100. The database finds the starting point in the tree and then scans forward through the leaf nodes, which are already in order.
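
You can get a feel for both lookups with a sorted Python list and the bisect module. This is a sketch of the idea only: a binary search over one flat list, not a real B-tree with pages and nodes, and the price values are invented.

```python
from bisect import bisect_left, bisect_right

prices = sorted([5, 12, 40, 55, 60, 75, 99, 120, 300])  # the "leaf level", in order


def find(value):
    """Equality lookup by binary search instead of checking every entry."""
    i = bisect_left(prices, value)
    return i < len(prices) and prices[i] == value


def between(low, high):
    """Range scan for low < price < high: jump to the start, then read forward."""
    start = bisect_right(prices, low)
    end = bisect_left(prices, high)
    return prices[start:end]


in_range = between(50, 100)
```

Because the values are already sorted, the range query is one search to find the starting position followed by a contiguous read, which is the same reason the leaf-node scan described above is cheap.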

How to Create Your First Index

Let's build a practical example. You'll create a table, load it with data, and see the difference an index makes.

Step 1 – Create the Table and Insert Sample Data

CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) NOT NULL,
    city VARCHAR(50),
    created_at TIMESTAMP DEFAULT NOW()
);

Now insert a large number of rows so the performance difference is visible. This generates 500,000 rows of sample data:

INSERT INTO customers (first_name, last_name, email, city)
SELECT
    'User' || gs,
    'Last' || gs,
    'user' || gs || '@example.com',
    (ARRAY['Lagos', 'London', 'New York', 'Berlin', 'Tokyo'])[1 + (gs % 5)]
FROM generate_series(1, 500000) AS gs;

Step 2 – Query Without an Index

EXPLAIN ANALYZE
SELECT * FROM customers WHERE email = 'user250000@example.com';

You'll see output similar to this:

Seq Scan on customers  (cost=0.00..11374.00 rows=1 width=52) (actual time=45.123..91.456 rows=1 loops=1)
  Filter: ((email)::text = 'user250000@example.com'::text)
  Rows Removed by Filter: 499999
Planning Time: 0.085 ms
Execution Time: 91.502 ms

The key detail here is Seq Scan — PostgreSQL scanned all 500,000 rows to find a single match. It filtered out 499,999 rows. That's a lot of wasted work.

Step 3 – Create an Index

CREATE INDEX idx_customers_email ON customers (email);

This creates a B-tree index on the email column. The name idx_customers_email follows a common naming convention: idx_ prefix, then the table name, then the column name.

Step 4 – Query With the Index

Run the same query again:

EXPLAIN ANALYZE
SELECT * FROM customers WHERE email = 'user250000@example.com';

Now you'll see something like this:

Index Scan using idx_customers_email on customers  (cost=0.42..8.44 rows=1 width=52) (actual time=0.034..0.036 rows=1 loops=1)
  Index Cond: ((email)::text = 'user250000@example.com'::text)
Planning Time: 0.112 ms
Execution Time: 0.058 ms

The scan type changed from Seq Scan to Index Scan. The execution time dropped from ~91ms to ~0.06ms. That's roughly a 1,500x improvement — from one line of SQL.

How to Use EXPLAIN ANALYZE to Measure Performance

EXPLAIN ANALYZE is your most important tool for understanding how PostgreSQL executes a query. You already saw it in the previous section, but let's break down what the output means.

EXPLAIN ANALYZE SELECT * FROM customers WHERE city = 'Lagos';

The output will tell you several things:

Scan type — whether PostgreSQL used a sequential scan, index scan, bitmap index scan, or another access method
Cost — the estimated cost in arbitrary units. The first number is the startup cost, the second is the total cost
Rows — how many rows PostgreSQL estimated it would find versus how many it actually found
Actual time — the real time in milliseconds to execute the query
Rows Removed by Filter — how many rows were scanned but didn't match the condition

If you see Seq Scan on a large table with a selective WHERE clause, that's usually a sign you need an index. If you see Index Scan or Index Only Scan, your index is working.
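
If you find yourself reading these lines often, it can help to pull the numbers out programmatically. The sketch below parses the first line of the two plans shown earlier with a regular expression. Treat it as an illustration of what each field means rather than a robust parser: the text format varies between PostgreSQL versions, and for real tooling EXPLAIN (FORMAT JSON) is usually the safer choice. The parse_plan_line helper is a name made up for this example.

```python
import re

PLAN_LINE = re.compile(
    r"^(?P<node>.+?)\s+"
    r"\(cost=(?P<startup>\d+\.\d+)\.\.(?P<total>\d+\.\d+) "
    r"rows=(?P<est_rows>\d+) width=\d+\)\s+"
    r"\(actual time=(?P<first>\d+\.\d+)\.\.(?P<last>\d+\.\d+) "
    r"rows=(?P<rows>\d+) loops=\d+\)$"
)


def parse_plan_line(line):
    """Extract the scan type, estimated cost, and measured numbers from one plan line."""
    m = PLAN_LINE.match(line.strip())
    if m is None:
        return None
    return {
        "node": m["node"],
        "total_cost": float(m["total"]),      # planner estimate, arbitrary units
        "estimated_rows": int(m["est_rows"]),  # what the planner expected
        "actual_rows": int(m["rows"]),         # what the query really returned
        "actual_ms": float(m["last"]),         # measured time for this node
        "is_seq_scan": m["node"].startswith("Seq Scan"),
    }


seq_line = ("Seq Scan on customers  (cost=0.00..11374.00 rows=1 width=52) "
            "(actual time=45.123..91.456 rows=1 loops=1)")
idx_line = ("Index Scan using idx_customers_email on customers  "
            "(cost=0.42..8.44 rows=1 width=52) (actual time=0.034..0.036 rows=1 loops=1)")

seq = parse_plan_line(seq_line)
idx = parse_plan_line(idx_line)
```

Comparing estimated_rows with actual_rows is the quickest sanity check: a large gap between the two usually means the planner's statistics are stale.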

One thing to keep in mind: EXPLAIN without ANALYZE shows the plan without actually running the query. EXPLAIN ANALYZE runs the query and shows real timing data. Always use EXPLAIN ANALYZE when you're investigating performance, but be careful with it on destructive queries — EXPLAIN ANALYZE DELETE FROM ... will actually delete the rows. Wrap those in a transaction and roll back:

BEGIN;
EXPLAIN ANALYZE DELETE FROM customers WHERE city = 'Berlin';
ROLLBACK;

Types of Indexes in PostgreSQL

PostgreSQL supports several index types, each optimized for different query patterns.

B-tree (Default)

B-tree is the default index type and covers the vast majority of use cases. It supports equality checks (=), range queries (<, >, <=, >=, BETWEEN), sorting (ORDER BY), and IS NULL / IS NOT NULL checks.

-- These two statements are equivalent: B-tree is the default.
-- Run only one of them, since both create an index named idx_name.
CREATE INDEX idx_name ON customers (last_name);
CREATE INDEX idx_name ON customers USING btree (last_name);

Use B-tree when you don't have a specific reason to use something else.

Hash

Hash indexes are optimized purely for equality comparisons (=). They don't support range queries or sorting. In practice, B-tree handles equality checks almost as fast, so hash indexes are rarely necessary.

CREATE INDEX idx_email_hash ON customers USING hash (email);

Consider a hash index only if you have a very large table with frequent equality-only lookups and want to save a small amount of index space.

GIN (Generalized Inverted Index)

GIN indexes are designed for values that contain multiple elements — like arrays, JSONB documents, or full-text search vectors. Instead of indexing a single value per row, GIN indexes every element within the value.

-- Add a JSONB column
ALTER TABLE customers ADD COLUMN preferences JSONB DEFAULT '{}';

-- Index the JSONB column
CREATE INDEX idx_preferences ON customers USING gin (preferences);

-- Now this query uses the GIN index
SELECT * FROM customers WHERE preferences @> '{"newsletter": true}';

Use GIN when you're querying inside JSONB data, searching arrays with @> or &&, or doing full-text search with tsvector.

GiST (Generalized Search Tree)

GiST indexes support geometric data, ranges, and full-text search. They're commonly used with PostGIS for geospatial queries.

-- Range type example
CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    duration TSRANGE
);

CREATE INDEX idx_event_duration ON events USING gist (duration);

-- Find overlapping events
SELECT * FROM events WHERE duration && '[2025-01-01, 2025-01-31]'::tsrange;

Use GiST when you're working with spatial data, range types, or need overlap/containment operators.

BRIN (Block Range Index)

BRIN indexes are extremely small and work well on large tables where the physical row order correlates with the indexed column's value. A common example is a timestamp column on an append-only table where new rows always have later timestamps.

CREATE INDEX idx_created_at_brin ON customers USING brin (created_at);

BRIN stores summary information (min/max values) for each block of rows rather than indexing every row individually. This makes the index much smaller than a B-tree, but it only works well when the data is naturally ordered.

Use BRIN for very large, append-only tables with naturally ordered data — like logs, events, or time-series data.
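
The dependence on physical order is easy to demonstrate with a toy model. The sketch below keeps one (min, max) pair per block of four rows and asks which blocks could contain a value in a given range. The block size, the sample values, and the helper names are all made up, and real BRIN indexes summarize ranges of table pages rather than individual rows.

```python
def summarize(values, block_size=4):
    """One (min, max) pair per block of rows, instead of one entry per row."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return [(min(b), max(b)) for b in blocks]


def blocks_to_read(summaries, low, high):
    """Block numbers whose min/max range could contain a value in [low, high]."""
    return [n for n, (lo, hi) in enumerate(summaries) if lo <= high and hi >= low]


ordered = list(range(100, 132))                   # append-only: values only increase
scattered = sorted(ordered, key=lambda t: t % 8)  # same values, shuffled on disk

ordered_hits = blocks_to_read(summarize(ordered), 109, 113)
scattered_hits = blocks_to_read(summarize(scattered), 109, 113)
```

With ordered data the query only has to look at two of the eight blocks. With the same values scattered, every block's min/max range overlaps the query, so the summaries rule nothing out and all eight blocks must be read.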

How to Create a Composite Index

A composite index (also called a multi-column index) covers more than one column. It's useful when your queries frequently filter or sort by multiple columns together.

CREATE INDEX idx_city_lastname ON customers (city, last_name);

The order of columns in a composite index matters. PostgreSQL can use this index for queries that filter on city alone, or on both city and last_name. But it can't efficiently use this index for queries that filter only on last_name.

Think of it like a phone book sorted by city first, then by last name within each city. You can easily look up everyone in Lagos. You can also look up everyone named "Adeyemi" in Lagos. But finding all people named "Adeyemi" across all cities requires scanning the whole book.

This principle is called the leftmost prefix rule: PostgreSQL can use a composite index for queries that include the leftmost column(s) of the index, but not for queries that skip them.

-- ✅ Uses the index (matches leftmost column)
SELECT * FROM customers WHERE city = 'Lagos';

-- ✅ Uses the index (matches both columns, left to right)
SELECT * FROM customers WHERE city = 'Lagos' AND last_name = 'Adeyemi';

-- ❌ Cannot use this index efficiently (skips the leftmost column)
SELECT * FROM customers WHERE last_name = 'Adeyemi';
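
The phone book analogy translates directly into a sorted list of tuples. The following Python sketch is a simplified model of a (city, last_name) index, with invented sample entries, that shows why the leading column matters.

```python
from bisect import bisect_left

# A composite index on (city, last_name) is kept sorted by city, then last_name.
index = sorted([
    ("Berlin", "Schmidt"), ("Lagos", "Adeyemi"), ("Lagos", "Okafor"),
    ("London", "Adeyemi"), ("London", "Smith"), ("Tokyo", "Sato"),
])


def by_city(city):
    """Leading column known: binary-search to the first match, then read forward."""
    start = bisect_left(index, (city,))
    matches = []
    for entry in index[start:]:
        if entry[0] != city:
            break
        matches.append(entry)
    return matches


def by_last_name(last_name):
    """Leading column unknown: matches are scattered, so every entry is checked."""
    return [entry for entry in index if entry[1] == last_name]


lagos = by_city("Lagos")
adeyemis = by_last_name("Adeyemi")
```

The Lagos entries sit next to each other, so the search can stop as soon as the city changes. The two Adeyemi entries are separated by other rows, and nothing in the sort order says where the next one will be.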

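When deciding column order, lead with the column your queries filter on most consistently, ideally with an equality condition. If more than one column qualifies, put the most selective one first: the one that narrows down the results the most.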

How to Create a Partial Index

A partial index covers only a subset of rows in a table. You define the subset with a WHERE clause in the index definition.

This is useful when you only query a specific portion of the data. For example, if you have an orders table and you frequently query for pending orders but rarely look at completed ones:

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    customer_id INT NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    total NUMERIC(10, 2),
    created_at TIMESTAMP DEFAULT NOW()
);

-- Only index rows where status is 'pending'
CREATE INDEX idx_orders_pending ON orders (customer_id)
WHERE status = 'pending';

This index is smaller than a full index because it skips all rows that don't match the WHERE condition. Smaller indexes use less disk space, consume less memory, and are faster to maintain during writes.

For the index to be used, your query's WHERE clause must match the index's condition:

-- ✅ Uses the partial index
SELECT * FROM orders WHERE status = 'pending' AND customer_id = 42;

-- ❌ Cannot use the partial index (different status)
SELECT * FROM orders WHERE status = 'shipped' AND customer_id = 42;
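
Here is the same idea as a small Python model, assuming a handful of made-up orders. The dictionary plays the role of the partial index: rows that aren't pending are never added to it, so it can only answer questions about pending orders.

```python
orders = [
    {"id": 1, "customer_id": 42, "status": "pending"},
    {"id": 2, "customer_id": 42, "status": "shipped"},
    {"id": 3, "customer_id": 7, "status": "pending"},
    {"id": 4, "customer_id": 42, "status": "shipped"},
]

# Partial index on customer_id WHERE status = 'pending': other rows are skipped.
pending_by_customer = {}
for order in orders:
    if order["status"] == "pending":
        pending_by_customer.setdefault(order["customer_id"], []).append(order["id"])


def find_orders(customer_id, status):
    if status == "pending":
        # The query's condition matches the index's condition, so the index can answer it.
        return pending_by_customer.get(customer_id, []), "partial index"
    # Shipped rows are not in the index at all, so fall back to scanning the table.
    ids = [o["id"] for o in orders
           if o["customer_id"] == customer_id and o["status"] == status]
    return ids, "table scan"


indexed_entries = sum(len(ids) for ids in pending_by_customer.values())
```

Only two of the four orders have an entry in the index, which is where the space and write savings come from.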

How to Create an Expression Index

Sometimes you need to index the result of a function or expression rather than a raw column value. Expression indexes (also called functional indexes) handle this.

A common scenario is case-insensitive email lookups. If your queries use LOWER(email), a regular index on email won't help — PostgreSQL sees the function call as a different expression.

-- Regular index on email – won't help with LOWER() queries
CREATE INDEX idx_email ON customers (email);

-- This query does NOT use the index above
SELECT * FROM customers WHERE LOWER(email) = 'user100@example.com';

To fix this, create an index on the expression itself:

CREATE INDEX idx_email_lower ON customers (LOWER(email));

Now queries that use LOWER(email) in their WHERE clause will use this index:

-- ✅ Uses the expression index
SELECT * FROM customers WHERE LOWER(email) = 'user100@example.com';

The rule is straightforward: the expression in your query must match the expression in the index exactly. If the index is on LOWER(email), your query must also use LOWER(email).
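
A quick way to picture this is two Python dictionaries built from the same data, one keyed by the raw value and one keyed by the lowercased value. The sample addresses are invented, and the dictionaries are only an analogy for the two indexes.

```python
emails = ["User100@Example.com", "jane@example.com", "BOB@EXAMPLE.COM"]

plain_index = {e: i for i, e in enumerate(emails)}          # index on email
lower_index = {e.lower(): i for i, e in enumerate(emails)}  # index on LOWER(email)


def lookup_lower(address):
    """WHERE LOWER(email) = address: only the expression index has matching keys."""
    return lower_index.get(address)


hit = lookup_lower("user100@example.com")
miss = plain_index.get("user100@example.com")  # stored key is 'User100@Example.com'
```

The lookup succeeds only against the dictionary whose keys were computed with the same expression the query uses.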

How to Create a Unique Index

A unique index guarantees that no two rows have the same value (or combination of values) in the indexed columns. It serves a dual purpose: it enforces data integrity and provides fast lookups.

CREATE UNIQUE INDEX idx_customers_email_unique ON customers (email);

If you try to insert a duplicate value, PostgreSQL will reject the operation:

INSERT INTO customers (first_name, last_name, email, city)
VALUES ('Test', 'User', 'user1@example.com', 'Lagos');
-- ERROR: duplicate key value violates unique constraint "idx_customers_email_unique"

You might wonder how this differs from a UNIQUE constraint. Under the hood, PostgreSQL implements UNIQUE constraints by creating a unique index, so on plain columns the two enforce uniqueness in the same way.

The difference is in intent and flexibility: a UNIQUE constraint declares a data integrity rule as part of the table definition, while a unique index can also be partial or built on an expression, which a constraint can't express.

How to Manage Indexes

As your database grows, you'll need to inspect, monitor, and maintain your indexes.

How to List All Indexes on a Table

SELECT
    indexname,
    indexdef
FROM pg_indexes
WHERE tablename = 'customers';

This shows the name and full definition of every index on the table.

How to Check Index Size

SELECT
    pg_size_pretty(pg_relation_size('idx_customers_email')) AS index_size;

For a broader view of all indexes and their sizes:

SELECT
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'customers'
ORDER BY pg_relation_size(indexrelid) DESC;

How to Find Unused Indexes

Indexes that are never used waste disk space and slow down writes. You can find them by checking pg_stat_user_indexes:

SELECT
    indexrelname AS index_name,
    idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'customers'
AND idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

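If an index has idx_scan = 0 after a reasonable period of normal usage, it's a candidate for removal. Just make sure to check across a full business cycle, since some indexes are only used during monthly reports or seasonal operations. Keep any index that backs a PRIMARY KEY or UNIQUE constraint, though: it enforces uniqueness even if no query ever scans it.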

How to Drop an Index

DROP INDEX IF EXISTS idx_customers_email;

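A plain DROP INDEX takes an exclusive lock on the table, which blocks both reads and writes until it finishes. On a production table, use CONCURRENTLY to avoid that (note that it can't run inside a transaction block):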

DROP INDEX CONCURRENTLY IF EXISTS idx_customers_email;

How to Rebuild an Index

Over time, indexes can become bloated as rows are inserted, updated, and deleted. You can rebuild an index to reclaim space:

REINDEX INDEX idx_customers_email;

Or rebuild all indexes on a table:

REINDEX TABLE customers;

On production systems, use REINDEX CONCURRENTLY (PostgreSQL 12+) to avoid locking the table:

REINDEX INDEX CONCURRENTLY idx_customers_email;

When Indexes Hurt Instead of Help

Indexes aren't free. Every index you add comes with costs:

Write overhead: every INSERT must add an entry to every index on the table, and so must any UPDATE that PostgreSQL can't perform as a heap-only (HOT) update, for example one that changes an indexed column. If a table has 10 indexes and you insert a row, PostgreSQL performs 11 write operations (one for the table and one for each index). On write-heavy tables, excessive indexes can significantly slow down data modification.
Storage cost: indexes consume disk space. On large tables, indexes can take up as much space as the table itself, sometimes more. You can check this with pg_relation_size.
Memory consumption: PostgreSQL caches frequently used index pages in memory alongside table data. More indexes means more memory pressure, which can push useful data out of the cache and slow down other queries.
Maintenance burden: indexes need periodic maintenance (vacuuming, reindexing) and add complexity to schema migrations.

The question to ask is not "should I add an index?" but rather "does the read performance gain justify the write performance cost for this table's workload?"

Common Mistakes That Prevent Index Usage

You can have the perfect index and PostgreSQL might still ignore it. Here are the most common reasons.

Wrapping the Indexed Column in a Function

-- Index on email
CREATE INDEX idx_email ON customers (email);

-- ❌ PostgreSQL cannot use the index because of LOWER()
SELECT * FROM customers WHERE LOWER(email) = 'user1@example.com';

-- ✅ Fix: create an expression index on LOWER(email)
CREATE INDEX idx_email_lower ON customers (LOWER(email));

Any function applied to the indexed column in a WHERE clause prevents the standard index from being used. You need an expression index that matches the function.

Implicit Type Casting

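-- id is an INTEGER column with an index
-- ❌ Comparing against a NUMERIC value casts the column, so the index on id can't be used
SELECT * FROM customers WHERE id = 42::numeric;

-- ✅ Use a value of the column's own type
SELECT * FROM customers WHERE id = 42;

When the value's type doesn't match the column type, PostgreSQL may cast the column to match, which prevents index usage. A quoted literal such as '42' is fine on its own, because PostgreSQL treats it as untyped and resolves it to INTEGER. The problem appears with explicitly typed values, such as a NUMERIC parameter sent by an application driver.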

Using OR Conditions Across Different Columns

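-- ❌ OR across different columns can prevent efficient index usage
-- (check EXPLAIN: PostgreSQL may combine both indexes with a BitmapOr, or fall back to a sequential scan)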
SELECT * FROM customers WHERE email = 'user1@example.com' OR city = 'Lagos';

-- ✅ Rewrite as UNION for better index utilization
SELECT * FROM customers WHERE email = 'user1@example.com'
UNION
SELECT * FROM customers WHERE city = 'Lagos';

Leading Wildcards in LIKE Queries

-- ❌ Leading wildcard cannot use a B-tree index
SELECT * FROM customers WHERE email LIKE '%@example.com';

-- ✅ Trailing wildcard CAN use a B-tree index
SELECT * FROM customers WHERE email LIKE 'user1%';

A B-tree index is sorted from left to right. A leading wildcard (%something) means the database can't use the sorted structure and falls back to a sequential scan. If you need to search by suffix or substring, consider a GIN index with the pg_trgm extension.

Low Selectivity

If a column has very few distinct values relative to the number of rows (low selectivity), PostgreSQL may decide a sequential scan is faster than using the index.

For example, if a status column has only three possible values ('pending', 'shipped', 'delivered') and each value covers roughly a third of the table, an index on status alone provides little benefit. PostgreSQL would still need to read a large portion of the table, and the extra index lookup adds overhead.

A partial index is often the better solution in these cases.

Best Practices for Indexing

Here's a summary of the key principles to follow:

Index columns that appear in WHERE, JOIN, and ORDER BY clauses. These are the columns the database needs to search, match, or sort by. Start with the queries that run most frequently or take the longest.
Measure before and after with EXPLAIN ANALYZE. Never add an index based on guesswork. Run your query with EXPLAIN ANALYZE, add the index, and run it again. If the execution time doesn't improve meaningfully, the index isn't helping.
Don't index every column. Each index slows down writes and consumes storage. Be deliberate about which columns you index based on actual query patterns.
Use composite indexes for multi-column filters. If your queries commonly filter on city and last_name together, a composite index on (city, last_name) is more efficient than two separate single-column indexes.
Put the most selective column first in composite indexes. The column that narrows the results the most should come first.
Use partial indexes when you only query a subset of data. If 90% of your queries target rows where status = 'active', a partial index on that subset is smaller and faster than a full index.
Monitor index usage regularly. Query pg_stat_user_indexes to find unused indexes and remove them.
Rebuild bloated indexes periodically. On tables with heavy update/delete activity, indexes can become bloated. Use REINDEX CONCURRENTLY on production systems.

Conclusion

In this tutorial, you learned what database indexes are and why they matter for query performance. You explored how B-tree indexes work under the hood, created several types of indexes (single-column, composite, partial, expression, and unique), and used EXPLAIN ANALYZE to measure the impact.

You also learned about the trade-offs indexes introduce — write overhead, storage cost, and memory pressure — and the common mistakes that silently prevent PostgreSQL from using your indexes.

The core principle is simple: index deliberately based on your actual query patterns, measure the results, and remove anything that isn't pulling its weight.

If you found this tutorial helpful, you can find more of my writing on freeCodeCamp and connect with me on LinkedIn and X.

What Are Database Triggers? A Practical Introduction with PostgreSQL Examples

iyiola — Fri, 27 Mar 2026 18:49:25 +0000

If you've ever needed your database to automatically respond to changes – like logging every update to a sensitive table, enforcing a business rule before an insert, or syncing derived data after a delete – then triggers are the tool you're looking for.

A database trigger is a function that the database executes automatically when a specific event occurs on a table. You don't call it manually. Instead, you define the conditions, and the database handles the rest.

In this tutorial, you'll learn what triggers are, how they work, when to use them, and when to avoid them. You'll work through practical examples using PostgreSQL, but the core concepts apply to most relational databases.

Prerequisites
How Triggers Work
How to Create Your First Trigger
BEFORE vs AFTER Triggers
How to Build an Audit Log with an AFTER Trigger
How to Use a BEFORE Trigger for Validation
Row-Level vs Statement-Level Triggers
The NEW and OLD Variables Reference
How to Manage Triggers
When to Use Triggers
When to Avoid Triggers
Conclusion

Prerequisites

To follow along with the examples, you'll need:

Basic knowledge of SQL (SELECT, INSERT, UPDATE, DELETE)
A running PostgreSQL instance (version 12 or later)
A SQL client like psql, pgAdmin, or DBeaver

If you don't have PostgreSQL installed, you can use a free cloud-hosted instance from services like Neon or Supabase to follow along.

How Triggers Work

At a high level, a trigger has three parts:

The event: what action activates the trigger (INSERT, UPDATE, DELETE, or TRUNCATE)
The timing: when the trigger fires relative to the event (BEFORE or AFTER)
The function: what logic runs when the trigger fires

Here's the general flow: a user or application performs an operation on a table, the database checks if any triggers are associated with that operation, and if a match is found, the database executes the trigger function automatically.

You can think of triggers as event listeners for your database. Just like a JavaScript addEventListener watches for a click or keypress, a database trigger watches for row-level changes on a table.

How to Create Your First Trigger

In PostgreSQL, creating a trigger is a two-step process. You first create a trigger function, then you attach that function to a table with a CREATE TRIGGER statement.

Let's build a concrete example. Say you have a products table and you want to automatically set the updated_at timestamp every time a row is modified.

Step 1 – Create the Table

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    price NUMERIC(10, 2) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

Step 2 – Create the Trigger Function

A trigger function in PostgreSQL is a special function that returns the TRIGGER type. Inside the function body, you have access to two important variables: NEW (the row after the operation) and OLD (the row before the operation).

CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

This function sets the updated_at column to the current timestamp every time it runs. It then returns NEW, which tells PostgreSQL to proceed with the modified row.

Step 3 – Attach the Trigger to the Table

CREATE TRIGGER trigger_set_updated_at
BEFORE UPDATE ON products
FOR EACH ROW
EXECUTE FUNCTION set_updated_at();

Let's break down each part of this statement:

BEFORE UPDATE – the trigger fires before the update is applied to the table
ON products – the trigger is associated with the products table
FOR EACH ROW – the function runs once for every row affected by the update
EXECUTE FUNCTION set_updated_at() – the function to call

Step 4 – Test It

INSERT INTO products (name, price) VALUES ('Wireless Keyboard', 49.99);

-- Wait a moment, then update the row
UPDATE products SET price = 44.99 WHERE name = 'Wireless Keyboard';

SELECT name, price, created_at, updated_at FROM products;

You'll see that updated_at has been automatically updated to the time of the UPDATE operation, even though you didn't explicitly set it in your query. That's the trigger doing its job.

BEFORE vs AFTER Triggers

The timing of a trigger determines when the function executes relative to the actual data change.

BEFORE triggers run before the row is inserted, updated, or deleted. They are useful when you want to modify or validate the incoming data. Since the change hasn't been applied yet, you can alter the NEW row or even cancel the operation entirely by returning NULL.

AFTER triggers run after the row change has been applied, but still inside the same transaction as the statement that caused it. They are useful for side effects like logging, sending notifications, or updating related tables. At this point you can't modify the row, but you can read both OLD and NEW to see what changed. Because the transaction hasn't committed yet, an error raised in an AFTER trigger rolls back the original change too.

Here's a rule of thumb: use BEFORE triggers when you need to change or reject data, and use AFTER triggers when you need to react to a completed change.

How to Build an Audit Log with an AFTER Trigger

One of the most common uses for triggers is audit logging – keeping a record of every change made to an important table. Let's build one.

Step 1 – Create an Audit Table

CREATE TABLE product_audit (
    audit_id SERIAL PRIMARY KEY,
    product_id INT NOT NULL,
    action VARCHAR(10) NOT NULL,
    old_price NUMERIC(10, 2),
    new_price NUMERIC(10, 2),
    changed_by TEXT DEFAULT current_user,
    changed_at TIMESTAMP DEFAULT NOW()
);

Step 2 – Create the Audit Trigger Function

CREATE OR REPLACE FUNCTION log_product_changes()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'UPDATE' THEN
        INSERT INTO product_audit (product_id, action, old_price, new_price)
        VALUES (OLD.id, 'UPDATE', OLD.price, NEW.price);
    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO product_audit (product_id, action, old_price)
        VALUES (OLD.id, 'DELETE', OLD.price);
    ELSIF TG_OP = 'INSERT' THEN
        INSERT INTO product_audit (product_id, action, new_price)
        VALUES (NEW.id, 'INSERT', NEW.price);
    END IF;

    RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

There are a few important things happening here. The TG_OP variable is a special string that PostgreSQL provides inside trigger functions. It tells you which operation activated the trigger: 'INSERT', 'UPDATE', or 'DELETE'. This lets you handle different operations with a single function.

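The RETURN COALESCE(NEW, OLD) at the end ensures the function returns the correct row. For INSERT and UPDATE operations, NEW exists and is returned. For DELETE operations, NEW is null, so OLD is returned instead. PostgreSQL ignores the return value of an AFTER row-level trigger, so this matters mainly if you later reuse the function in a BEFORE trigger.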

Step 3 – Attach the Trigger

CREATE TRIGGER trigger_product_audit
AFTER INSERT OR UPDATE OR DELETE ON products
FOR EACH ROW
EXECUTE FUNCTION log_product_changes();

Notice the AFTER INSERT OR UPDATE OR DELETE syntax. You can bind a single trigger to multiple events, which keeps your setup clean.

Step 4 – Test It

-- Insert a new product
INSERT INTO products (name, price) VALUES ('USB-C Hub', 29.99);

-- Update the price
UPDATE products SET price = 24.99 WHERE name = 'USB-C Hub';

-- Delete the product
DELETE FROM products WHERE name = 'USB-C Hub';

-- Check the audit log
SELECT * FROM product_audit ORDER BY changed_at;

You'll see three rows in product_audit (one for each operation) with the old and new prices recorded automatically. No application code needed.

How to Use a BEFORE Trigger for Validation

Triggers can also enforce business rules at the database level. Let's say you want to prevent any product from having a negative price.

CREATE OR REPLACE FUNCTION prevent_negative_price()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.price < 0 THEN
        RAISE EXCEPTION 'Product price cannot be negative. Got: %', NEW.price;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trigger_check_price
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW
EXECUTE FUNCTION prevent_negative_price();

Now test it:

INSERT INTO products (name, price) VALUES ('Faulty Item', -10.00);
-- ERROR: Product price cannot be negative. Got: -10.00

The insert is rejected entirely. The row never makes it into the table. This is powerful because the rule is enforced at the database level regardless of which application or script sends the query.

Row-Level vs Statement-Level Triggers

All the triggers you've seen so far use FOR EACH ROW, which means the function runs once per affected row. If you update 100 rows in a single query, the trigger function runs 100 times.

PostgreSQL also supports FOR EACH STATEMENT triggers, which run once per SQL statement regardless of how many rows are affected.

CREATE OR REPLACE FUNCTION log_bulk_update()
RETURNS TRIGGER AS $$
BEGIN
    RAISE NOTICE 'A bulk operation was performed on the products table';
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trigger_bulk_update_notice
AFTER UPDATE ON products
FOR EACH STATEMENT
EXECUTE FUNCTION log_bulk_update();

Statement-level triggers are less common, but they're useful for operations like refreshing a materialized view or sending a single notification after a batch update instead of one notification per row.

Important: in statement-level triggers, the NEW and OLD variables are not available because the trigger isn't tied to any specific row.

The NEW and OLD Variables Reference

Here's a quick reference for when NEW and OLD are available in row-level triggers:

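| Operation | OLD                                | NEW                               |
|-----------|------------------------------------|-----------------------------------|
| INSERT    | Not available                      | Contains the new row              |
| UPDATE    | Contains the row before the change | Contains the row after the change |
| DELETE    | Contains the deleted row           | Not available                     |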

Understanding when each variable is available will save you from runtime errors in your trigger functions.

How to Manage Triggers

As you add more triggers to your database, you'll need to know how to inspect, disable, and remove them.

How to List All Triggers on a Table

SELECT trigger_name, event_manipulation, action_timing
FROM information_schema.triggers
WHERE event_object_table = 'products';

How to Disable a Trigger Temporarily

-- Disable a specific trigger
ALTER TABLE products DISABLE TRIGGER trigger_product_audit;

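-- Disable all triggers on a table, including internal foreign-key triggers
-- (requires superuser; use DISABLE TRIGGER USER to disable only user-defined triggers)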
ALTER TABLE products DISABLE TRIGGER ALL;

This is useful during bulk data migrations where you want to skip trigger execution for performance reasons.

How to Re-Enable a Trigger

ALTER TABLE products ENABLE TRIGGER trigger_product_audit;

How to Drop a Trigger

DROP TRIGGER IF EXISTS trigger_product_audit ON products;

Note that dropping a trigger does not drop the associated function. You'll need to drop the function separately if you no longer need it:

DROP FUNCTION IF EXISTS log_product_changes();

When to Use Triggers

Triggers work well for specific use cases. Here are the scenarios where they're a strong choice:

Audit logging: automatically recording who changed what and when, as you saw earlier in this tutorial.
Derived data maintenance: keeping computed columns, counters, or summary tables in sync with the source data.
Data validation: enforcing business rules that go beyond what CHECK constraints can express, like cross-table validations.
Automatic timestamping: setting created_at and updated_at fields without relying on the application layer.

When to Avoid Triggers

Triggers are powerful, but they come with trade-offs. Here are cases where you should think twice before using them:

Complex business logic: if the logic involves calling external APIs, sending emails, or orchestrating multi-step workflows, it belongs in your application layer. Triggers should stay lightweight.
Performance-sensitive bulk operations: row-level triggers on tables that frequently receive bulk inserts or updates can create significant overhead. If you're inserting millions of rows, those triggers fire millions of times.
Cascading triggers: when one trigger's action fires another trigger, which fires another, debugging becomes extremely difficult. If you find yourself building a chain of triggers, reconsider the design.
Logic that developers need to discover easily: triggers are sometimes called "hidden logic" because they execute automatically without appearing in application code. If your team frequently asks "why did this column change?" and the answer is always "there's a trigger," that's a sign the logic might be more discoverable if placed in your application layer or a stored procedure that's called explicitly.

A good rule of thumb: if the logic is tightly coupled to the data and should always execute regardless of which client or service touches the table, a trigger is appropriate. If the logic depends on application context (like the current user's session, feature flags, or external state), it belongs in the application.

Conclusion

In this tutorial, you learned what database triggers are and how they work in PostgreSQL. You built three practical triggers: an automatic timestamp updater, a full audit logging system, and a data validation guard. You also learned the difference between BEFORE and AFTER triggers, row-level and statement-level triggers, and when NEW and OLD variables are available.

Triggers are a powerful tool for keeping your data consistent and your business rules enforced at the database level. Use them for focused, data-centric operations, and keep the logic simple.

If you found this tutorial helpful, you can connect with me on LinkedIn and X.

How to Build a Bank Ledger in Golang with PostgreSQL using the Double-Entry Accounting Principle.

Paul Babatuyi — Wed, 25 Mar 2026 17:11:25 +0000

The Hidden Bugs in How Most Developers Store Money

Imagine you're building the backend for a million-dollar fintech app. You store each user's balance as a single number in the database. It feels simple: just update the number when money moves.

But with one line of code like UPDATE accounts SET balance = balance - 100, you've created a system that can silently lose millions. A server crash, a race condition, or a clever attack, and suddenly money vanishes or appears out of thin air.

There's no audit trail, no way to know what happened, and no way to prove it didn't happen on purpose.

This isn't just a theoretical risk. It's a trap that's caught even experienced developers. The world's most trusted financial systems avoid it by using double-entry accounting. Every transaction creates two records: a debit on one account, a credit on another. This lets you reconstruct every cent from history, catch inconsistencies, and audit every transaction.

There are no deletes, and no silent updates. Just an append-only trail that makes fraud and bugs much harder to hide.

In this guide, you'll build a robust backend in Go and PostgreSQL, using patterns inspired by real fintech companies. You'll learn how to design a double-entry ledger, generate type-safe SQL with sqlc, and write transactions that are safe even under heavy load.

By the end, you'll understand why these patterns matter – and how to use them to build software you can trust with real money.

Prerequisites and Project Overview
The Double-Entry Foundation
Type-Safe SQL with sqlc
The Store Layer: Transactions and Retries
The Service Layer: Business Logic
The API Layer
Running It Locally
Testing: Prove the System Works
Deployment
Conclusion

Project Resources:

Here's the project repository: https://github.com/PaulBabatuyi/double-entry-bank-Go

And here's the front-end repository: https://github.com/PaulBabatuyi/double-entry-bank

You can find the live frontend here: https://golangbank.app

You can find the live Swagger back-end API here: https://golangbank.app/swagger

Prerequisites and Project Overview

Before you dive in, make sure you have the following installed:

Go 1.23 or newer
Docker and Docker Compose
golang-migrate CLI: go install github.com/golang-migrate/migrate/v4/cmd/migrate@latest
sqlc CLI: go install github.com/sqlc-dev/sqlc/cmd/sqlc@latest

You'll also need a basic understanding of PostgreSQL and REST APIs to follow along.

If you've built a CRUD app before, you're ready for this. The project uses sqlc for type-safe queries, JWT for authentication, and a layered architecture that keeps business logic, persistence, and HTTP handling cleanly separated.

Here's how the project is organized:

.
├── cmd/                # Server entrypoint
│   └── main.go
├── internal/
│   ├── api/            # HTTP handlers & middleware
│   ├── db/             # Store layer (transactions, sqlc)
│   └── service/        # Business logic (ledger operations)
├── postgres/
│   ├── migrations/     # SQL migration files
│   └── queries/        # sqlc query files
├── docs/               # Swagger docs
├── Dockerfile, docker-compose.yml, Makefile
└── README.md

The architecture follows a clear three-layer pattern:

API Layer: Handles HTTP requests, authentication, and routing.
Service Layer: Contains the business logic. This is where double-entry rules are enforced.
Store Layer: Manages database transactions and persistence.

Every request flows from the handler, through the service, to the store, and finally to PostgreSQL. This separation makes the code easier to test, debug, and extend.

Backend Request Flow

graph TD
    A[HTTP Request] --> B[Handler - API Layer]
    B --> C[LedgerService - Business Logic]
    C --> D[Store - Persistence Layer]
    D --> E[(PostgreSQL)]
    E --> D
    D --> C
    C --> B
    B --> F[HTTP Response]

The Double-Entry Foundation: How Every Penny is Accounted For

Let's get to the heart of what makes this system bulletproof: double-entry accounting. Every operation – a deposit, withdrawal, or transfer – creates two entries that always balance. This is the secret sauce that keeps banks, payment apps, and even crypto exchanges from losing track of money.

Picture a simple deposit of $1,000:

| Account              | Debit   | Credit  |
|----------------------|---------|---------|
| User Account         |         | 1,000   |
| Settlement Account   | 1,000   |         |

Total debits always equal total credits. This is the fundamental rule. Every single operation in this system produces exactly this structure, with no exceptions.

Now picture a $200 transfer from User A to User B. Again there are exactly two entries: a debit on the sender and a matching credit on the receiver:

| Account       | Debit   | Credit  | Description           |
|---------------|---------|---------|-----------------------|
| User A        | 200     |         | Transfer to User B    |
| User B        |         | 200     | Transfer from User A  |

Both entries share the same transaction_id, so you can always retrieve the complete picture of what happened with a single query. There's no guessing and no reconstructing, as the ledger tells the full story.

Why the Settlement Account Goes Negative

This trips up newcomers, so it's worth explaining explicitly. When a user deposits $1,000, the settlement account is debited $1,000. After several user deposits, the settlement balance will be negative. That's correct and expected: it represents the total amount of real-world money currently held inside the system on behalf of users. The invariant is:

SUM(all user account balances) + settlement balance = 0

If that ever doesn't hold, something is broken.
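To make the invariant concrete, here is a minimal in-memory sketch. The Account struct, the deposit helper, and the ledgerImbalance function are hypothetical names invented for illustration, and amounts are held as int64 ten-thousandths of a currency unit instead of the decimal type the project uses.

```go
package main

import "fmt"

// Account is a simplified stand-in for a row in the accounts table.
// Balance is an int64 count of ten-thousandths of a currency unit,
// mirroring NUMERIC(19,4); this is an assumption made for the sketch.
type Account struct {
	Name     string
	IsSystem bool
	Balance  int64
}

// deposit applies both sides of a deposit: it credits the user account
// and debits the settlement account by the same amount.
func deposit(settlement, user *Account, amount int64) {
	user.Balance += amount       // credit raises the user's balance
	settlement.Balance -= amount // debit lowers the settlement balance
}

// ledgerImbalance returns SUM(user balances) + settlement balance.
// A healthy ledger always returns 0.
func ledgerImbalance(accounts []*Account) int64 {
	var total int64
	for _, a := range accounts {
		total += a.Balance
	}
	return total
}

func main() {
	settlement := &Account{Name: "Settlement Account", IsSystem: true}
	userA := &Account{Name: "User A"}
	userB := &Account{Name: "User B"}

	deposit(settlement, userA, 10_000_000) // 1,000.0000
	deposit(settlement, userB, 5_000_000)  // 500.0000

	fmt.Println("settlement balance:", settlement.Balance) // -15000000
	fmt.Println("imbalance:", ledgerImbalance([]*Account{settlement, userA, userB}))
}
```

A check like this could run as a periodic health probe: any non-zero result means some operation wrote one side of an entry pair without the other.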

Enforcing the Rules in the Database

The database itself enforces these rules, not just the application code. Here's the core of the entries table migration:

CREATE TABLE IF NOT EXISTS entries (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id UUID NOT NULL REFERENCES accounts(id) ON DELETE RESTRICT,
    debit NUMERIC(19,4) NOT NULL DEFAULT 0.0000 CHECK (debit >= 0),
    credit NUMERIC(19,4) NOT NULL DEFAULT 0.0000 CHECK (credit >= 0),
    transaction_id UUID NOT NULL,
    operation_type operation_type NOT NULL,
    description TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,

    CONSTRAINT check_single_side CHECK (
        (debit > 0 AND credit = 0) OR (debit = 0 AND credit > 0)
    )
);

Let's break down why each piece matters:

Every entry is single-sided. The check_single_side constraint means every entry must be either a debit or a credit, never both and never neither. If you try to insert an invalid row, the database rejects it, and application code can't bypass the check.
Every transaction is linked. Both the debit and credit entries share the same transaction_id (a UUID). This lets you fetch both sides of any operation instantly, making audits and debugging straightforward.
Operation types are explicit. The operation_type column is an enum at the database level, so only valid types like deposit, withdrawal, or transfer are allowed. There are no typos and no surprises.
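If you want to fail fast before a row ever reaches the database, the same rule can be mirrored in application code. The sketch below is an illustration only: the Entry struct, validateEntry, and the error value are hypothetical names, and amounts are plain int64 values. The CHECK constraint remains the real safety net.

```go
package main

import (
	"errors"
	"fmt"
)

// Entry holds the two amount columns of a ledger entry.
// int64 amounts are an assumption made to keep the sketch dependency-free.
type Entry struct {
	Debit  int64
	Credit int64
}

// ErrNotSingleSided is returned when an entry breaks the single-side rule.
var ErrNotSingleSided = errors.New("entry must be exactly one of: positive debit or positive credit")

// validateEntry mirrors the check_single_side constraint:
// (debit > 0 AND credit = 0) OR (debit = 0 AND credit > 0).
// Negative amounts fail both branches, so they are rejected as well.
func validateEntry(e Entry) error {
	debitOnly := e.Debit > 0 && e.Credit == 0
	creditOnly := e.Debit == 0 && e.Credit > 0
	if !debitOnly && !creditOnly {
		return ErrNotSingleSided
	}
	return nil
}

func main() {
	fmt.Println(validateEntry(Entry{Debit: 200}))              // <nil>
	fmt.Println(validateEntry(Entry{Debit: 200, Credit: 200})) // rejected: both sides set
	fmt.Println(validateEntry(Entry{}))                        // rejected: neither side set
}
```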

The Settlement Account: The System's Anchor

Every real-world ledger needs a way to represent money entering or leaving the system. That's what the settlement account does. Here's how it's seeded in the database:

INSERT INTO accounts (id, name, balance, currency, is_system)
SELECT gen_random_uuid(), 'Settlement Account', 0.0000, 'USD', TRUE
WHERE NOT EXISTS (
    SELECT 1 FROM accounts WHERE is_system = TRUE AND name = 'Settlement Account'
);

The settlement account represents the "outside world." When a user deposits money, it comes from the settlement account. When they withdraw, it goes back. Using WHERE NOT EXISTS makes this migration idempotent – that is, safe to run multiple times without creating duplicates.

Type-Safe SQL with sqlc: No More Surprises

In financial systems, you can't afford surprises from your database layer. That's why this project uses sqlc, a tool that turns your SQL queries into type-safe Go code at compile time.

With sqlc, you see exactly what SQL runs, catch mistakes before they hit production, and avoid the "magic" (and hidden bugs) of most ORMs. Every query is explicit, every type is checked, and you get the best of both worlds: raw SQL power with Go's safety.

Why NUMERIC Becomes String (and Not float64)

Here's a subtle but critical detail from sqlc.yaml:

overrides:
    - db_type: "pg_catalog.numeric"
      go_type: "string"
    - column: "entries.debit"
      go_type: "string"
    - column: "entries.credit"
      go_type: "string"
    - column: "accounts.balance"
      go_type: "string"
    - db_type: "operation_type"
      go_type: "string"

Why string, not float64? Floating point arithmetic is imprecise. 0.1 + 0.2 in most programming languages does not equal exactly 0.3.

For money, you need exact decimal arithmetic. This project uses shopspring/decimal for all calculations and stores amounts as strings, converting at the service layer boundary. The database column itself is NUMERIC(19,4), which stores exact decimals – no float rounding ever touches your money.
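You can see the difference with a few lines of Go. This sketch uses math/big from the standard library as a stand-in for shopspring/decimal so that it runs without dependencies. The function names floatSumEqual and exactSumEqual are made up for the example.

```go
package main

import (
	"fmt"
	"math/big"
)

// floatSumEqual reports whether a+b == want using float64 arithmetic.
// Taking parameters forces the addition to happen at run time in float64.
func floatSumEqual(a, b, want float64) bool {
	return a+b == want
}

// exactSumEqual parses decimal strings into exact rationals and compares
// the sum with the expected value without any rounding.
func exactSumEqual(a, b, want string) (bool, error) {
	x, ok := new(big.Rat).SetString(a)
	if !ok {
		return false, fmt.Errorf("invalid amount %q", a)
	}
	y, ok := new(big.Rat).SetString(b)
	if !ok {
		return false, fmt.Errorf("invalid amount %q", b)
	}
	w, ok := new(big.Rat).SetString(want)
	if !ok {
		return false, fmt.Errorf("invalid amount %q", want)
	}
	sum := new(big.Rat).Add(x, y)
	return sum.Cmp(w) == 0, nil
}

func main() {
	fmt.Println("float64:", floatSumEqual(0.1, 0.2, 0.3)) // false

	ok, err := exactSumEqual("0.1", "0.2", "0.3")
	fmt.Println("exact:", ok, err) // true <nil>
}
```

Passing amounts as strings across the database boundary, as the sqlc overrides do, means the value never passes through a float64 on its way to exact decimal arithmetic.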

Preventing Race Conditions: Locking with FOR UPDATE

One of the most important queries in the system is GetAccountForUpdate:

SELECT * FROM accounts
WHERE id = $1
LIMIT 1
FOR UPDATE; -- locks row for update, prevents TOCTOU races

This query uses FOR UPDATE to lock the account row during a transaction. Why? Imagine two requests both see a $500 balance and both try to withdraw $400. Without locking, both would succeed, and you'd end up with a negative balance. With FOR UPDATE, the second transaction waits until the first finishes, eliminating this classic race condition.
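The same check-then-act problem exists in any concurrent program. As a rough in-memory analogy (not the project's code), the sketch below uses a sync.Mutex where the database uses a row lock. The account type and the helper names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// account is an in-memory stand-in for a row in the accounts table.
type account struct {
	mu      sync.Mutex
	balance int64 // whole currency units, to keep the sketch simple
}

var errInsufficientFunds = errors.New("insufficient funds")

// withdraw checks and updates the balance while holding the lock,
// so no other withdrawal can read a stale balance in between.
func (a *account) withdraw(amount int64) error {
	a.mu.Lock()         // plays the role of SELECT ... FOR UPDATE
	defer a.mu.Unlock() // plays the role of COMMIT releasing the row lock
	if a.balance < amount {
		return errInsufficientFunds
	}
	a.balance -= amount
	return nil
}

// runConcurrentWithdrawals starts n withdrawals at once and returns
// how many of them succeeded.
func runConcurrentWithdrawals(a *account, n int, amount int64) int {
	var wg sync.WaitGroup
	var countMu sync.Mutex
	succeeded := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := a.withdraw(amount); err == nil {
				countMu.Lock()
				succeeded++
				countMu.Unlock()
			}
		}()
	}
	wg.Wait()
	return succeeded
}

func main() {
	acct := &account{balance: 500}
	ok := runConcurrentWithdrawals(acct, 2, 400)
	fmt.Println("succeeded:", ok, "final balance:", acct.balance)
}
```

With the lock in place, exactly one of the two $400 withdrawals succeeds and the balance ends at $100. If the check ran outside the lock, both goroutines could see $500 and both could proceed.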

Calculating the True Balance: Always Trust the Entries

The real source of truth for any account is the sum of its entries, not the denormalized balance column. Here's the reconciliation query:

SELECT CAST(
    (COALESCE(SUM(credit), 0::NUMERIC) - COALESCE(SUM(debit), 0::NUMERIC))
    AS NUMERIC(19,4)
) AS calculated_balance
FROM entries
WHERE account_id = $1;

This computes the true balance from the ledger itself. It's how you catch bugs, audit the system, and prove that every penny is accounted for. The balance column on accounts is a denormalized cache for fast reads – and this query is the ground truth that validates it.
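As a sketch of how that validation might look in application code, the example below recomputes a balance from entries and compares it with the cached value. Entry, calculatedBalance, and balanceDrift are hypothetical names, and amounts are int64 ten-thousandths of a currency unit instead of the project's decimal strings.

```go
package main

import "fmt"

// Entry holds the two amount columns of a ledger entry.
type Entry struct {
	Debit  int64
	Credit int64
}

// calculatedBalance mirrors the reconciliation query:
// SUM(credit) - SUM(debit), with an empty ledger yielding 0.
func calculatedBalance(entries []Entry) int64 {
	var credits, debits int64
	for _, e := range entries {
		credits += e.Credit
		debits += e.Debit
	}
	return credits - debits
}

// balanceDrift returns how far the cached balance column has moved
// away from the ledger. Zero means the cache is consistent.
func balanceDrift(cachedBalance int64, entries []Entry) int64 {
	return cachedBalance - calculatedBalance(entries)
}

func main() {
	entries := []Entry{
		{Credit: 10_000_000}, // deposit of 1,000.0000
		{Debit: 2_000_000},   // transfer out of 200.0000
	}
	fmt.Println("calculated:", calculatedBalance(entries))  // 8000000
	fmt.Println("drift:", balanceDrift(8_000_000, entries)) // 0
}
```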

The Store Layer: Transactions and Automatic Retries

Every financial operation in this system runs inside a transaction – no exceptions. This is enforced by the ExecTx pattern in the store layer:

func (store *Store) ExecTx(ctx context.Context, fn func(q *sqlc.Queries) error) error {
    const maxAttempts = 10
    var lastErr error
    for attempt := 0; attempt < maxAttempts; attempt++ {
        lastErr = store.execTxOnce(ctx, fn)
        if lastErr == nil {
            return nil
        }
        if !isSerializationError(lastErr) {
            return lastErr
        }
        if attempt < maxAttempts-1 {
            if waitErr := sleepWithContext(ctx, retryWait(attempt)); waitErr != nil {
                return waitErr
            }
        }
    }
    return fmt.Errorf("transaction failed after %d attempts due to serialization conflicts: %w", maxAttempts, lastErr)
}

Why Serializable Isolation?

The transaction uses PostgreSQL's strictest isolation level: sql.LevelSerializable. This is like running transactions one at a time, eliminating entire classes of concurrency bugs. If two operations would conflict, PostgreSQL aborts one and returns a serialization error (SQLSTATE 40001).

Automatic Retries: Handling Real-World Concurrency

When a serialization error occurs, the code automatically retries with exponential backoff:

func retryWait(attempt int) time.Duration {
    base := 50 * time.Millisecond
    for i := 0; i < attempt; i++ {
        base *= 2
        if base >= time.Second {
            return time.Second
        }
    }
    return base
}

func sleepWithContext(ctx context.Context, d time.Duration) error {
    select {
    case <-ctx.Done():
        return ctx.Err()
    case <-time.After(d):
        return nil
    }
}

The backoff starts at 50ms and doubles each attempt, capping at 1 second. Up to 10 attempts are made. If the client disconnects mid-retry, sleepWithContext detects the cancelled context and returns immediately, so the server doesn't keep sleeping and retrying on behalf of a request nobody is waiting for.

The Service Layer: Where Business Logic Meets Double-Entry

The service layer is the heart of the system. Its job is to translate business operations – deposits, withdrawals, transfers – into double-entry journal entries that always balance.

Deposit: Crediting the User, Debiting the Settlement

Every deposit creates two entries: a credit to the user's account and a matching debit to the settlement account. Both entries share the same transaction ID.

func (s *LedgerService) Deposit(ctx context.Context, accountID uuid.UUID, amountStr string) error {
    amount, err := validatePositiveAmount(amountStr)
    if err != nil {
        return err
    }
    return s.store.ExecTx(ctx, func(q *sqlc.Queries) error {
        settlement, err := q.GetSettlementAccountForUpdate(ctx)
        if err != nil {
            return fmt.Errorf("settlement account not found: %w", err)
        }
        account, err := q.GetAccountForUpdate(ctx, accountID)
        if err != nil {
            return fmt.Errorf("account not found: %w", err)
        }
        if account.Currency != settlement.Currency {
            return ErrCurrencyMismatch
        }
        txID := uuid.New()
        // 1. Credit user account
        _, err = q.CreateEntry(ctx, sqlc.CreateEntryParams{
            AccountID:     accountID,
            Debit:         decimal.Zero.StringFixed(4),
            Credit:        amount.StringFixed(4),
            TransactionID: txID,
            OperationType: "deposit",
            Description:   sql.NullString{String: "External deposit", Valid: true},
        })
        if err != nil { return err }
        // 2. Debit settlement (opposing entry)
        _, err = q.CreateEntry(ctx, sqlc.CreateEntryParams{
            AccountID:     settlement.ID,
            Debit:         amount.StringFixed(4),
            Credit:        decimal.Zero.StringFixed(4),
            TransactionID: txID,
            OperationType: "deposit",
            Description:   sql.NullString{String: fmt.Sprintf("Deposit to account %s", accountID), Valid: true},
        })
        if err != nil { return err }
        // 3. Update both balances atomically
        if err = q.UpdateAccountBalance(ctx, sqlc.UpdateAccountBalanceParams{
            Balance: amount.StringFixed(4), ID: accountID,
        }); err != nil { return err }
        return q.UpdateAccountBalance(ctx, sqlc.UpdateAccountBalanceParams{
            Balance: amount.Neg().StringFixed(4), ID: settlement.ID,
        })
    })
}

Two things are worth highlighting. First, both accounts are locked with GetAccountForUpdate and GetSettlementAccountForUpdate before any entries are written. This prevents any other concurrent transaction from reading a stale balance and acting on it.

Second, amount.Neg() is used to debit the settlement. Its balance goes down, representing real money now held inside the system.
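The invariant behind all of this is that the entries sharing a transaction ID must balance: total debits equal total credits. Here's a minimal in-memory sketch of that check. The Entry type, the integer amounts, and both helper functions are assumptions for illustration and don't appear in the project.

```go
package main

import "fmt"

// Entry is a hypothetical in-memory journal entry.
type Entry struct {
	TransactionID string
	AccountID     string
	Debit         int64
	Credit        int64
}

// depositEntries builds the two rows a deposit writes: a credit to the user
// and a matching debit to the settlement account, under one transaction ID.
func depositEntries(txID, userID, settlementID string, amount int64) []Entry {
	return []Entry{
		{TransactionID: txID, AccountID: userID, Credit: amount},
		{TransactionID: txID, AccountID: settlementID, Debit: amount},
	}
}

// isBalanced reports whether total debits equal total credits.
func isBalanced(entries []Entry) bool {
	var debits, credits int64
	for _, e := range entries {
		debits += e.Debit
		credits += e.Credit
	}
	return debits == credits
}

func main() {
	entries := depositEntries("tx-1", "user-1", "settlement", 1_000_000)
	fmt.Println(isBalanced(entries))     // true: both legs present
	fmt.Println(isBalanced(entries[:1])) // false: the settlement leg is missing
}
```

A check like this is cheap to run in tests against every operation the service layer exposes.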

Withdraw: Debiting the User, Crediting the Settlement

Withdrawals are the mirror image of deposits. The key difference is the insufficient funds check, which must happen inside the transaction after the lock is acquired:

balanceDec, err := decimal.NewFromString(account.Balance)
if err != nil {
    return errors.New("invalid balance")
}
if balanceDec.LessThan(amount) {
    return ErrInsufficientFunds
}

Checking balance inside the transaction after FOR UPDATE is critical. Checking it before, outside the transaction, would create a classic time-of-check-to-time-of-use (TOCTOU) race. Two concurrent withdrawals could both pass the check, then both execute, overdrawing the account.

The entries for a $500 withdrawal look like this:

| Account              | Debit   | Credit  |
|----------------------|---------|---------|
| User Account         | 500     |         |
| Settlement Account   |         | 500     |

The settlement is credited because real money is leaving the system and being "returned" to the outside world.

Transfer: User-to-User, No Settlement Involved

Transfers move money directly between two user accounts. The settlement account isn't involved. Both accounts are locked, currency is validated, and an insufficient funds check runs before any entries are created:

func (s *LedgerService) Transfer(ctx context.Context, fromID, toID uuid.UUID, amountStr string) error {
    amount, err := validatePositiveAmount(amountStr)
    if err != nil { return err }
    if fromID == toID {
        return ErrSameAccountTransfer
    }
    return s.store.ExecTx(ctx, func(q *sqlc.Queries) error {
        fromAcc, err := q.GetAccountForUpdate(ctx, fromID)
        if err != nil { return err }
        toAcc, err := q.GetAccountForUpdate(ctx, toID)
        if err != nil { return err }
        if fromAcc.Currency != toAcc.Currency {
            return ErrCurrencyMismatch
        }
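        fromBalance, err := decimal.NewFromString(fromAcc.Balance)
        if err != nil {
            // Don't treat an unparseable balance as zero.
            return fmt.Errorf("invalid balance: %w", err)
        }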
        if fromBalance.LessThan(amount) {
            return ErrInsufficientFunds
        }
        txID := uuid.New()
        // Debit sender, credit receiver — same transaction ID
        // ... CreateEntry calls + UpdateAccountBalance calls
    })
}

A $200 transfer creates exactly two entries under the same transaction_id:

| Account  | Debit   | Credit  |
|----------|---------|---------|
| Sender   | 200     |         |
| Receiver |         | 200     |

ReconcileAccount: Trust, But Verify

Reconciliation is how you prove the system is correct. The ReconcileAccount function compares the stored balance column against the sum of all credits minus debits in the entries table:

func (s *LedgerService) ReconcileAccount(ctx context.Context, accountID uuid.UUID) (bool, error) {
    account, err := s.store.GetAccount(ctx, accountID)
    if err != nil { return false, fmt.Errorf("account not found: %w", err) }

    calculatedStr, err := s.store.GetAccountBalance(ctx, accountID)
    if err != nil { return false, fmt.Errorf("failed to calculate balance: %w", err) }

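    // Fail loudly on parse errors: two unparseable values would both
    // default to zero and make the comparison below pass by accident.
    calculated, err := decimal.NewFromString(calculatedStr)
    if err != nil { return false, fmt.Errorf("invalid calculated balance: %w", err) }
    stored, err := decimal.NewFromString(account.Balance)
    if err != nil { return false, fmt.Errorf("invalid stored balance: %w", err) }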

    if !stored.Equal(calculated) {
        log.Error().
            Str("stored_balance", account.Balance).
            Str("calculated", calculated.StringFixed(4)).
            Msg("Balance mismatch detected")
        return false, fmt.Errorf("balance mismatch: stored %s, calculated %s",
            account.Balance, calculated.StringFixed(4))
    }
    return true, nil
}

If they don't match, something has gone wrong: a bug, a direct database modification, or a race condition that slipped through. In production, this check can run as a background job to catch issues before they become incidents.

The API Layer: Secure, Predictable, and Boring (By Design)

The API layer is where your business logic meets the outside world. Its job is to be secure, predictable, and, if you've done things right, a little bit boring.

JWT Authentication: Secrets Matter

Authentication is handled with JWTs. The secret used to sign tokens must be at least 32 characters long, because shorter HMAC secrets are easier to brute-force. This is enforced at startup:

// internal/api/middleware.go
func InitTokenAuth(secret string) error {
    if secret == "" {
        return errors.New("JWT_SECRET environment variable is required")
    }
    if len(secret) < 32 {
        return errors.New("JWT_SECRET must be at least 32 characters")
    }
    TokenAuth = jwtauth.New("HS256", []byte(secret), nil)
    return nil
}

The server will refuse to start if the secret is missing or too short. There's no fallback and no default: the system fails loudly rather than running insecurely.
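If you'd rather generate a suitable secret from Go than from the command line, a sketch like the following works. The newSecret helper is a hypothetical name; it uses only the standard library's crypto/rand and encoding/base64 packages.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newSecret returns n cryptographically random bytes encoded as base64.
// Thirty-two random bytes encode to a 44-character string, which clears
// the 32-character minimum that InitTokenAuth enforces.
func newSecret(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

func main() {
	secret, err := newSecret(32)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(secret)) // prints 44
}
```

Print the secret once, store it in your secret manager or .env file, and never commit it.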

The Handler Pattern: Parse, Authorize, Validate, Call, Respond

Every handler follows the same recipe: extract JWT claims, parse the account ID, fetch the account and verify ownership, decode the request body, call the service, and respond. Authorization always happens before calling the service layer. The service knows nothing about users, keeping business logic clean and testable.

// internal/api/handler.go
func (h *Handler) Register(w http.ResponseWriter, r *http.Request) {
    var input struct {
        Email    string `json:"email"`
        Password string `json:"password"`
    }
    if err := json.NewDecoder(r.Body).Decode(&input); err != nil {
        respondError(w, http.StatusBadRequest, "invalid input")
        return
    }
    // ... hash password, create user, generate JWT ...
}

Amount Normalization: Defensive by Default

API clients send amounts in different formats – sometimes as strings, sometimes as numbers. The normalization logic ensures all amounts are handled safely:

// internal/api/amount.go
func normalizeAmountInput(value interface{}) (string, error) {
    switch v := value.(type) {
    case string:
        return strings.TrimSpace(v), nil
    case json.Number:
        return strings.TrimSpace(v.String()), nil
    case float64:
        return strconv.FormatFloat(v, 'f', -1, 64), nil
    default:
        return "", errors.New("amount must be a number or string")
    }
}

The decoder uses dec.UseNumber() so JSON numbers arrive as json.Number rather than float64, preserving full precision. The float64 case exists as a safety fallback only.
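Here's a small sketch of why that matters. It decodes the same JSON body twice, once with UseNumber and once without, and returns the amount as text. The decodeAmount helper and the request shape are assumptions for illustration, not the project's handler code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// decodeAmount decodes {"amount": ...} and returns the amount as text.
// useNumber toggles json.Decoder.UseNumber for comparison.
func decodeAmount(body string, useNumber bool) (string, error) {
	dec := json.NewDecoder(strings.NewReader(body))
	if useNumber {
		dec.UseNumber()
	}
	var input struct {
		Amount interface{} `json:"amount"`
	}
	if err := dec.Decode(&input); err != nil {
		return "", err
	}
	switch v := input.Amount.(type) {
	case json.Number:
		return v.String(), nil // the digits exactly as the client sent them
	case float64:
		return strconv.FormatFloat(v, 'f', -1, 64), nil // already rounded
	case string:
		return strings.TrimSpace(v), nil
	default:
		return "", fmt.Errorf("unsupported amount type %T", v)
	}
}

func main() {
	body := `{"amount": 12345678901234567.89}`
	exact, _ := decodeAmount(body, true)
	lossy, _ := decodeAmount(body, false)
	fmt.Println("with UseNumber:   ", exact)
	fmt.Println("without UseNumber:", lossy)
}
```

With UseNumber the amount comes back digit for digit. Without it, the value passes through float64 first, which can't represent that many significant digits, so the cents are lost before your validation code ever sees them.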

Frontend Deployment Boundary

The backend no longer serves static frontend files. The frontend is deployed separately at https://golangbank.app from its own repository: https://github.com/PaulBabatuyi/double-entry-bank.

Running It Locally: Your First End-to-End Test

git clone https://github.com/PaulBabatuyi/double-entry-bank-Go.git
cd double-entry-bank-Go
cp .env.example .env
# Edit .env — set JWT_SECRET with: openssl rand -base64 32
make postgres
make migrate-up
make server

Once the server is running:

Frontend: https://golangbank.app
Swagger UI: http://localhost:8080/swagger/index.html (local dev) or https://golangbank.app/swagger (production)
Health check: http://localhost:8080/health

The Swagger UI lets you explore every endpoint, authorize with your JWT token, and test operations directly in the browser.

Testing: Prove the System Works

Testing financial systems is non-negotiable, and claims about correctness need to be backed by code. This project tests all three layers, each targeting a different kind of failure.

Service Layer: Core Financial Logic

The most important tests live in internal/service/ledger_test.go. They run against a real PostgreSQL database – not mocks – because mock-based tests can give a false sense of security. Real database tests catch issues that only appear in production-like environments.

func TestDeposit_Success(t *testing.T) {
    ledger := setupTestLedger(t)
    accountID := createTestAccount(t, ledger, "0.00")

    err := ledger.Deposit(context.Background(), accountID, "100.00")
    require.NoError(t, err)

    balance := getAccountBalance(t, ledger, accountID)
    assert.Equal(t, "100.0000", balance)
}

func TestWithdraw_InsufficientFunds(t *testing.T) {
    ledger := setupTestLedger(t)
    accountID := createTestAccount(t, ledger, "50.00")

    err := ledger.Withdraw(context.Background(), accountID, "100.00")
    assert.ErrorIs(t, err, ErrInsufficientFunds)
}

The createTestAccount helper uses the settlement account's currency automatically, which is important: all accounts must share a currency for transfers to work, and tests that silently use a different currency will fail in confusing ways.

Concurrency Test: Proving Serializable Isolation Works

This is the most important test in the suite:

func TestConcurrentDeposits(t *testing.T) {
    ledger := setupTestLedger(t)
    accountID := createTestAccount(t, ledger, "0.00")

    var wg sync.WaitGroup
    wg.Add(2)
    go func() {
        defer wg.Done()
        _ = ledger.Deposit(context.Background(), accountID, "100.00")
    }()
    go func() {
        defer wg.Done()
        _ = ledger.Deposit(context.Background(), accountID, "100.00")
    }()
    wg.Wait()

    balance := getAccountBalance(t, ledger, accountID)
    assert.Equal(t, "200.0000", balance)
}

Two goroutines deposit simultaneously. The serializable isolation level and retry logic ensure both operations succeed and neither overwrites the other. Without the FOR UPDATE locks and transaction retry logic, this test would fail non-deterministically – which is exactly the kind of bug that's impossible to reproduce in development but devastating in production.

Store Layer: Transaction Mechanics

Tests in internal/db/store_test.go verify the retry infrastructure itself, without needing a database connection:

func TestIsSerializationError(t *testing.T) {
    pqErr := &pq.Error{Code: "40001"}
    assert.True(t, isSerializationError(pqErr))
    assert.False(t, isSerializationError(errors.New("some other error")))
}

func TestRetryWait(t *testing.T) {
    assert.Equal(t, 50*time.Millisecond, retryWait(0))
    assert.Equal(t, 100*time.Millisecond, retryWait(1))
    assert.Equal(t, 200*time.Millisecond, retryWait(2))
    assert.Equal(t, time.Second, retryWait(5)) // capped
}

func TestSleepWithContext_Cancel(t *testing.T) {
    ctx, cancel := context.WithCancel(context.Background())
    cancel() // cancel immediately
    err := sleepWithContext(ctx, 50*time.Millisecond)
    assert.Error(t, err) // should return immediately, not wait
}

API Layer: Authentication and Input Handling

Handler tests in internal/api/handler_test.go verify that the HTTP layer behaves correctly at its boundaries:

func TestRegisterHandler_BadRequest(t *testing.T) {
    h := setupTestHandler(t)
    req := httptest.NewRequest(http.MethodPost, "/register", nil)
    rw := httptest.NewRecorder()
    h.Register(rw, req)
    assert.Equal(t, http.StatusBadRequest, rw.Code)
}

func TestRegisterHandler_Success(t *testing.T) {
    h := setupTestHandler(t)
    _ = InitTokenAuth("fV7sliKV3qn657I60wEFtw/Auk/0bNU9zdp30wFzfDg=")

    email := "testuser_" + uuid.New().String() + "@example.com"
    body, _ := json.Marshal(map[string]string{"email": email, "password": "testpassword123"})

    req := httptest.NewRequest(http.MethodPost, "/register", bytes.NewReader(body))
    rw := httptest.NewRecorder()
    h.Register(rw, req)
    assert.Equal(t, http.StatusCreated, rw.Code)
}

Using uuid.New().String() in the email ensures each test run creates a unique user, preventing conflicts on repeated runs against the same database.

Middleware tests verify the security boundary itself:

func TestInitTokenAuthFromEnv_MissingSecret(t *testing.T) {
    os.Unsetenv("JWT_SECRET")
    err := InitTokenAuthFromEnv()
    assert.Error(t, err) // must fail without a secret
}

Running the Tests

# Start the database
make postgres

# Run all tests with race detection
make test

# Run with coverage report
make coverage

# Run tests the same way CI does (includes migrations)
make ci-test

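The -race flag is non-negotiable for financial code. It instruments the binary to detect data races at runtime – something static analysis alone often misses. Keep in mind that the race detector only reports races on code paths that actually execute during the test run, so it complements concurrency tests like the one above rather than replacing them.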

Deployment: Engineering Decisions That Matter in Production

The deployment setup for this project reflects several engineering decisions worth understanding, regardless of what platform you deploy to.

Migrations on Container Start

The Docker entrypoint runs golang-migrate up before starting the Go binary:

# docker-entrypoint
migrate -path /app/postgres/migrations -database "$migrate_db_url" up
exec /usr/local/bin/ledger

Running migrations at startup rather than as a separate CI step has trade-offs. The upside is simplicity: the container is always self-consistent when it starts. The downside is that each deployment takes slightly longer. For a solo project or small team, this is the right call. At scale you'd separate migrations from deployment.

Startup Retry Logic

The entrypoint retries migrations up to 12 times with a 5-second sleep between attempts:

max_attempts=12
attempt=1
while [ "$attempt" -le "$max_attempts" ]; do
    migration_output=$(migrate ... up 2>&1)
    # If "connection refused" or "timeout", keep retrying
    # If any other error, fail immediately
    attempt=$((attempt + 1))
done

The critical distinction is which errors trigger a retry. Network-transient errors (connection refused, timeout) are retried. Everything else – bad migration SQL, a missing table – fails immediately. This avoids waiting the full 60 seconds when a deployment has a real problem.
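The decision logic is easier to test when it's written as a pure function. The project does this in shell; the sketch below expresses the same rule in Go for illustration only. The isTransient and shouldRetry names are hypothetical, and the matched substrings are just the two cases named above.

```go
package main

import (
	"fmt"
	"strings"
)

// isTransient reports whether migration output looks like a temporary
// network problem. It matches only the two cases named in the text;
// a real entrypoint may need to recognize more.
func isTransient(output string) bool {
	lower := strings.ToLower(output)
	return strings.Contains(lower, "connection refused") ||
		strings.Contains(lower, "timeout")
}

// shouldRetry decides whether the given 1-based attempt may be followed
// by another one: only for transient errors, and only while attempts remain.
func shouldRetry(output string, attempt, maxAttempts int) bool {
	return attempt < maxAttempts && isTransient(output)
}

func main() {
	fmt.Println(shouldRetry("dial tcp: connection refused", 1, 12))    // true
	fmt.Println(shouldRetry(`relation "users" does not exist`, 1, 12)) // false
	fmt.Println(shouldRetry("dial tcp: connection refused", 12, 12))   // false
}
```

Matching on error text is fragile, so treat the substring list as something to revisit whenever the migration tool or database driver changes its messages.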

DB URL Fallback Chain

In cloud environments, the internal database URL is often a different variable than what you configure locally. The resolveDBURL function handles this transparently:

func resolveDBURL() string {
    connStr := strings.TrimSpace(os.Getenv("DB_URL"))
    fallbackVars := []string{"INTERNAL_DATABASE_URL", "RENDER_DATABASE_URL", "DATABASE_URL"}
    // Falls back through the chain if DB_URL is empty or resolves to localhost
    ...
}

This pattern means local developers set DB_URL in .env and don't need to think about it, while the deployed container automatically uses the internal database connection without any manual wiring.

HTTP Server Timeouts

The server is configured with explicit timeouts:

srv := &http.Server{
    Addr:              ":" + port,
    Handler:           r,
    ReadTimeout:       15 * time.Second,
    WriteTimeout:      15 * time.Second,
    IdleTimeout:       60 * time.Second,
    ReadHeaderTimeout: 5 * time.Second,
}

Without timeouts, a slow or malicious client can hold connections open indefinitely, eventually exhausting the server's resources. ReadHeaderTimeout is particularly important: it limits how long the server waits for the HTTP headers before closing the connection, protecting against Slowloris-style attacks.

Conclusion: Building for the Real World

You've just walked through the core patterns that power real fintech systems:

Double-entry ledger with database-enforced constraints
Settlement account for tracking external cash flows
Serializable transactions with exponential backoff retry
Reconciliation endpoint for verifying correctness
Type-safe queries with sqlc
Row-level locking to prevent race conditions
Tests that prove correctness under concurrency

These aren't just Go patterns. They're the same principles used at companies like Monzo, Stripe, and Nubank. The implementation details differ, but the underlying ideas are the same: every dollar is accounted for, every operation is atomic, and the system can always explain where every penny went.

What's next? Three concrete next steps:

Add idempotency keys to prevent duplicate transactions on retries. If a client retries a deposit because of a network timeout, you need to detect and reject the duplicate.
Add Prometheus metrics for transaction latency and failure rates. You want to know when your p99 latency spikes before your users do.
Add a scheduled reconciliation job that runs ReconcileAccount for every account on a schedule and alerts on mismatches. Catch bugs automatically, before they become customer complaints.

The developer who stores balance as a single number and updates it directly will eventually have an incident. The developer who builds a ledger has an audit trail, a reconciliation tool, and a system that can explain every penny.

That's the real reason fintech engineers build this way: not because it's more complex, but because it's more honest about what money actually is.

How to Implement the Outbox Pattern in Go and PostgreSQL

Alex Pliutau — Thu, 19 Mar 2026 17:26:11 +0000

In event-driven systems, two things need to happen when you process a request: you need to save data to your database, and you need to publish an event to a message broker so other services know something changed.

These two operations look simple, but they hide a dangerous reliability problem. What if the database write succeeds but the message broker is temporarily unreachable? Or your service crashes between the two steps? You end up in an inconsistent state: your database has the new data, but the rest of the system never heard about it.

The Outbox Pattern is a well-established solution to this problem. In this tutorial, you'll learn what the pattern is, why it works, and how to implement it in Go with PostgreSQL and Google Cloud Pub/Sub.

Prerequisites

Before reading this tutorial, you should be familiar with:

The basics of the Go programming language
SQL and PostgreSQL
The concept of database transactions
Basic familiarity with event-driven or distributed systems (helpful but not required)

The Problem: Two Operations, No Atomicity
How the Outbox Pattern Works
The Outbox Table Schema
The Message Relay
Go and PostgreSQL Implementation
- The Orders Service
- The Relay Service
Why Messages Can Be Delivered More Than Once
Alternative: PostgreSQL Logical Replication
Conclusion

The Problem: Two Operations, No Atomicity

To understand why the Outbox Pattern exists, you need to understand a core challenge in distributed systems: atomicity across different systems.

In a relational database, a transaction lets you group multiple operations so they either all succeed or all fail together. If you insert a row and update another row in the same transaction, you're guaranteed that both happen – or neither does.

The problem arises when you try to extend this guarantee across two different systems: for example, your database and your message broker (like Kafka, RabbitMQ, or Pub/Sub). These systems don't share a transaction boundary.

Here's a typical event-driven flow that breaks without the Outbox Pattern:

A user places an order.
Your service saves the order to the database ✅
Your service publishes an order.created event to the message broker ❌ (broker is down)
The order exists in the database, but downstream services never learned about it.

Or the reverse failure:

Your service publishes the event first ✅
Your service tries to save the order to the database ❌ (database times out)
Downstream services received a notification for an order that doesn't exist.

Either scenario leaves your system in an inconsistent state. This is the core problem the Outbox Pattern solves.

Here's what the process looks like when not using the Outbox Pattern:

How the Outbox Pattern Works

The Outbox Pattern solves the atomicity problem by keeping both writes inside the database. The flow has four steps:

Your service saves its business data (for example, a new order) to your database.
In the same database transaction, it writes the event message to a special table called the outbox table.
A separate background process called the Message Relay polls the outbox table and publishes pending messages to the broker.
Once the broker confirms receipt, the relay marks the message as processed.

Because steps 1 and 2 happen in the same database transaction, they are atomic. Either both succeed or neither does. You can never end up with saved data but no corresponding event queued – or an event queued for data that was never saved.

The message is never published directly to the broker in your main application code. Instead, the database acts as a reliable staging area.

The Outbox Table Schema

The outbox table stores pending messages until the relay picks them up. Here's a typical PostgreSQL schema:

CREATE TABLE outbox (
    id          uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    topic       varchar(255)  NOT NULL,
    message     jsonb         NOT NULL,
    state       varchar(50)   NOT NULL DEFAULT 'pending',
    created_at  timestamptz   NOT NULL DEFAULT now(),
    processed_at timestamptz
);

Let's walk through each column:

id: A unique identifier for each message. Using UUIDs makes it easy to reference specific messages.
topic: The destination topic or queue name in your message broker (for example, orders.created).
message: The event payload, stored as JSON. This is the data your consumers will receive.
state: Tracks whether the message has been sent. The two main values are pending (waiting to be published) and processed (successfully published).
created_at: When the message was inserted. The relay uses this to process messages in order.
processed_at: When the relay successfully published the message.

You may want to add additional columns depending on your needs: for example, a retry_count column to track how many times the relay has attempted to send a message, or an error column to log failure reasons.

The Message Relay

The Message Relay is a background process (often a goroutine, a sidecar, or a separate service) that bridges the outbox table and the message broker.

Its responsibilities are:

Periodically query the outbox table for messages with state = 'pending'.
Publish each message to the appropriate topic in the broker.
Once the broker confirms delivery, update the row's state to 'processed'.
Handle failures gracefully: if publishing fails, leave the message as 'pending' so it will be retried.

This design gives you at-least-once delivery: a message will eventually be sent, even if the relay crashes and restarts. The trade-off is that a message might occasionally be sent more than once (more on this below), so your consumers should handle duplicates.

Go and PostgreSQL Implementation

Let's build a concrete example. Imagine you have an orders service. When a new order is created, you want to:

Save the order to a PostgreSQL orders table.
Publish an order.created event to Google Cloud Pub/Sub.

You'll use pgx for the PostgreSQL driver.

The Orders Service

The key insight is that the order insert and the outbox insert happen inside the same transaction. If anything goes wrong, both are rolled back.

// orders/main.go

package main

import (
	"context"
	"encoding/json"
	"log"
	"os"

	"github.com/google/uuid"
	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

// Order represents a customer order in our system.
type Order struct {
	ID       uuid.UUID `json:"id"`
	Product  string    `json:"product"`
	Quantity int       `json:"quantity"`
}

// OrderCreatedEvent is the payload published to the message broker.
// It contains only the fields that downstream services need to know about.
type OrderCreatedEvent struct {
	OrderID uuid.UUID `json:"order_id"`
	Product string    `json:"product"`
}

// createOrderInTx saves a new order and its outbox event atomically.
// Both operations share the same transaction (tx), so either both succeed
// or both are rolled back — ensuring consistency.
func createOrderInTx(ctx context.Context, tx pgx.Tx, order Order) error {
	// Step 1: Insert the business data (the actual order).
	_, err := tx.Exec(ctx,
		"INSERT INTO orders (id, product, quantity) VALUES ($1, $2, $3)",
		order.ID, order.Product, order.Quantity,
	)
	if err != nil {
		return err
	}
	log.Printf("Inserted order %s into database", order.ID)

	// Step 2: Serialize the event payload that consumers will receive.
	event := OrderCreatedEvent{
		OrderID: order.ID,
		Product: order.Product,
	}
	msg, err := json.Marshal(event)
	if err != nil {
		return err
	}

	// Step 3: Write the event to the outbox table.
	// This does NOT publish to Pub/Sub — it just queues it for the relay.
	_, err = tx.Exec(ctx,
		"INSERT INTO outbox (topic, message) VALUES ($1, $2)",
		"orders.created", msg,
	)
	if err != nil {
		return err
	}
	log.Printf("Inserted outbox event for order %s", order.ID)

	return nil
}

func main() {
	ctx := context.Background()

	pool, err := pgxpool.New(ctx, os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatalf("Unable to connect to database: %v", err)
	}
	defer pool.Close()

	// Begin a transaction that will cover both the order insert
	// and the outbox insert.
	tx, err := pool.Begin(ctx)
	if err != nil {
		log.Fatalf("Unable to begin transaction: %v", err)
	}
	// If anything fails before Commit, the deferred Rollback undoes both writes.
	// After a successful Commit, it is a no-op.
	defer tx.Rollback(ctx)

	newOrder := Order{
		ID:       uuid.New(),
		Product:  "Super Widget",
		Quantity: 10,
	}

	if err := createOrderInTx(ctx, tx, newOrder); err != nil {
		log.Fatalf("Failed to create order: %v", err)
	}

	// Committing the transaction makes both writes permanent simultaneously.
	if err := tx.Commit(ctx); err != nil {
		log.Fatalf("Failed to commit transaction: %v", err)
	}

	log.Println("Successfully created order and queued outbox event.")
}

Notice that createOrderInTx receives a pgx.Tx (a transaction) rather than a pool connection. This is intentional: it enforces that the caller is responsible for managing the transaction boundary, making the atomicity guarantee explicit.

The Relay Service

The relay runs as a separate background process. It polls the outbox table, publishes messages, and marks them as processed.

A critical detail here is the use of FOR UPDATE SKIP LOCKED in the SQL query. This PostgreSQL feature lets you run multiple relay instances concurrently without them stepping on each other. When one instance locks a row to process it, other instances skip that row and move on to the next one.

// relay/main.go

package main

import (
	"context"
	"log"
	"time"

	"cloud.google.com/go/pubsub"
	"github.com/google/uuid"
	"github.com/jackc/pgx/v5/pgxpool"
)

// OutboxMessage mirrors the columns we need from the outbox table.
type OutboxMessage struct {
	ID      uuid.UUID
	Topic   string
	Message []byte
}

// processOutboxMessages picks up one pending message, publishes it to Pub/Sub,
// and marks it as processed — all within a single database transaction.
func processOutboxMessages(ctx context.Context, pool *pgxpool.Pool, pubsubClient *pubsub.Client) error {
	tx, err := pool.Begin(ctx)
	if err != nil {
		return err
	}
	defer tx.Rollback(ctx)

	// Query for the next pending message.
	// FOR UPDATE SKIP LOCKED ensures that if multiple relay instances are
	// running, they won't try to process the same message simultaneously.
	rows, err := tx.Query(ctx, `
		SELECT id, topic, message
		FROM outbox
		WHERE state = 'pending'
		ORDER BY created_at
		LIMIT 1
		FOR UPDATE SKIP LOCKED
	`)
	if err != nil {
		return err
	}
	defer rows.Close()

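	var msg OutboxMessage
	if rows.Next() {
		if err := rows.Scan(&msg.ID, &msg.Topic, &msg.Message); err != nil {
			return err
		}
	} else {
		// No pending messages: nothing to do unless the query itself failed.
		return rows.Err()
	}
	// Close the result set before issuing the UPDATE below. pgx can't run
	// another statement on this connection while rows are still open.
	rows.Close()
	if err := rows.Err(); err != nil {
		return err
	}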

	log.Printf("Publishing message %s to topic %s", msg.ID, msg.Topic)

	// Publish the message to the Pub/Sub topic and wait for confirmation.
	result := pubsubClient.Topic(msg.Topic).Publish(ctx, &pubsub.Message{
		Data: msg.Message,
	})
	if _, err = result.Get(ctx); err != nil {
		// Publishing failed. We return the error here without committing,
		// so the transaction rolls back and the message stays 'pending'.
		// The relay will retry it on the next polling interval.
		return err
	}

	// Mark the message as processed now that the broker has confirmed receipt.
	_, err = tx.Exec(ctx,
		"UPDATE outbox SET state = 'processed', processed_at = now() WHERE id = $1",
		msg.ID,
	)
	if err != nil {
		return err
	}
	log.Printf("Marked message %s as processed", msg.ID)

	// Commit the transaction: the state update becomes permanent.
	return tx.Commit(ctx)
}

func main() {
	// In production, initialize real connections using environment variables
	// or a config file. They're left as nil placeholders here for clarity,
	// so this loop won't work until you replace them with real clients.
	var (
		pool         *pgxpool.Pool
		pubsubClient *pubsub.Client
	)

	// Poll the outbox table every second.
	// Adjust the interval based on your latency requirements.
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()

	for range ticker.C {
		if err := processOutboxMessages(context.Background(), pool, pubsubClient); err != nil {
			log.Printf("Error processing outbox: %v", err)
		}
	}
}

The polling interval (1 second in this example) controls the maximum latency between an event being written to the outbox and it being published to the broker. For most use cases, 1–5 seconds is perfectly acceptable. If you need lower latency, you can reduce the interval, or consider using PostgreSQL's LISTEN/NOTIFY feature to wake up the relay immediately when a new row is inserted.
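To make the LISTEN/NOTIFY idea concrete, here is a minimal sketch of a relay loop that reacts to both a timer and a wake-up signal. The names runRelay, demoWakeups, and the wake channel are hypothetical. In a real relay, a separate goroutine holding a LISTEN connection would send on wake whenever a notification arrives, and process would wrap processOutboxMessages.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runRelay calls process once per tick and once per wake-up signal,
// and returns when ctx is cancelled.
func runRelay(ctx context.Context, tick <-chan time.Time, wake <-chan struct{}, process func()) {
	for {
		select {
		case <-ctx.Done():
			return
		case <-tick:
			process()
		case <-wake:
			process()
		}
	}
}

// demoWakeups simulates n notifications and returns how many times
// the relay ran its processing step.
func demoWakeups(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	wake := make(chan struct{})
	done := make(chan struct{})
	runs := 0

	go func() {
		defer close(done)
		// A nil tick channel never fires, so only wake-ups trigger work here.
		runRelay(ctx, nil, wake, func() { runs++ })
	}()

	for i := 0; i < n; i++ {
		wake <- struct{}{} // stands in for a NOTIFY arriving
	}
	cancel()
	<-done
	return runs
}

func main() {
	fmt.Println("process ran", demoWakeups(2), "times")
}
```

Keeping the ticker alongside the wake-up channel is a deliberate choice: notifications sent while the relay is disconnected are lost, so the periodic poll acts as a safety net.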

Why Messages Can Be Delivered More Than Once

You might wonder: isn't the Outbox Pattern supposed to guarantee exactly-once delivery?

It does not. It guarantees at-least-once delivery. Here's the edge case:

The relay publishes the message to Pub/Sub successfully.
Before it can update the outbox row to 'processed', the relay process crashes.
On restart, the relay sees the message is still 'pending' and publishes it again.

This is a rare but possible scenario. The standard way to handle it is to design your message consumers to be idempotent. This means that they can safely receive and process the same message multiple times without causing incorrect behavior.

Common strategies for idempotency include:

Using the message's id as a deduplication key, and checking if you've already processed it before acting.
Making your operations naturally idempotent. For example, using INSERT ... ON CONFLICT DO NOTHING instead of a plain INSERT.
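The first strategy can be sketched in a few lines. This is an illustration only: the Deduper type and countApplied helper are hypothetical, and the in-memory map stands in for whatever durable store (for example, a table with a unique constraint on the message ID) a real consumer would use.

```go
package main

import (
	"fmt"
	"sync"
)

// Deduper remembers which message IDs have already been handled.
// Assumption: state lives in memory; a real consumer would persist it.
type Deduper struct {
	mu   sync.Mutex
	seen map[string]bool
}

func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]bool)}
}

// HandleOnce runs handle only the first time id is seen and reports
// whether the handler ran.
func (d *Deduper) HandleOnce(id string, handle func() error) (bool, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.seen[id] {
		return false, nil // duplicate delivery: skip it
	}
	if err := handle(); err != nil {
		return false, err // not recorded, so a redelivery will retry
	}
	d.seen[id] = true
	return true, nil
}

// countApplied delivers each ID in order and returns how many times
// the handler actually ran.
func countApplied(ids []string) int {
	d := NewDeduper()
	applied := 0
	for _, id := range ids {
		d.HandleOnce(id, func() error { applied++; return nil })
	}
	return applied
}

func main() {
	// Message "a" is delivered twice but applied once.
	fmt.Println(countApplied([]string{"a", "b", "a"}))
}
```

Note that the ID is recorded only after the handler succeeds, so a failed attempt stays eligible for retry.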

Alternative: PostgreSQL Logical Replication

The polling approach described above is simple and works well, but it has two drawbacks: it introduces some latency (up to one polling interval), and it issues database queries even when there's nothing to process.

For high-throughput systems where these trade-offs matter, PostgreSQL offers a more advanced alternative: logical replication via the Write-Ahead Log (WAL).

Every change made to a PostgreSQL database is first written to the WAL – an append-only log used for crash recovery and replication. With logical replication, you can subscribe to changes in specific tables and receive them as a stream in near real-time.

Instead of your relay asking "Are there any new messages?" on a schedule, PostgreSQL streams each outbox insert to your relay shortly after the inserting transaction commits.

This approach is lower latency and more resource-efficient for high-volume workloads. The trade-off is added implementation complexity: you need to manage a replication slot in PostgreSQL and handle the WAL stream correctly.

In Go, you can use the pglogrepl library to interact with PostgreSQL's logical replication protocol.

For more details on how WAL and change data capture work in PostgreSQL, see the official Write-Ahead Logging documentation.

Conclusion

The Outbox Pattern solves a fundamental problem in distributed systems: how do you reliably perform a database write and publish a message to a broker in a consistent way?

The key idea is to use your database as the source of truth for both the business data and the pending messages. By writing to the outbox table in the same transaction as your business data, you get atomic guarantees from the database itself: no distributed transaction protocol required.

Here's a quick summary of the key concepts:

The outbox table stores pending events as part of your regular database schema.
The transaction wraps both the business write and the outbox write, making them atomic.
The Message Relay is a background process that reads from the outbox and publishes to the broker.
At-least-once delivery means your consumers must be idempotent.
FOR UPDATE SKIP LOCKED allows multiple relay instances to run safely in parallel.
Logical replication is an advanced alternative that avoids polling for high-throughput systems.

The pattern is simple in concept, but there are several ways to implement it depending on your scale and infrastructure. The polling approach shown in this tutorial is a solid starting point for most applications.

Resources

How to Elevate Your Database Game: Supercharging Query Performance with Postgres FDW

Hamdaan Ali — Wed, 18 Feb 2026 22:36:48 +0000

Foreign data wrappers (FDWs) make remote Postgres tables feel local. That convenience is exactly why FDW performance surprises are so common.

A query that looks like a normal join can execute like a distributed system: rows move across the network, remote statements get executed repeatedly, and the local planner quietly becomes a coordinator. In that world, “fast SQL” is not mainly about CPU or indexes. It’s about data movement and round-trips.

This handbook covers the mechanism that determines whether a federated query behaves like a clean remote query or a chatty distributed workflow: pushdown.

Pushdown is not “moving compute”. Pushdown determines whether filtering, joining, ordering, and aggregation occur at the data source or after the data has already crossed the wire. When pushdown works, the local server receives a reduced result set. When it doesn’t, Postgres often has to fetch broad intermediate sets and finish the work locally.

The chapters ahead will help you build a practical mental model of what is “shippable” in postgres_fdw, why some expressions are blocked, and how to read EXPLAIN (ANALYZE, BUFFERS, VERBOSE) without getting tricked by familiar plan shapes.

After the core method, the handbook covers tuning knobs that matter in production, schema and indexing considerations, benchmarking methodology, monitoring and logging, and a case study that shows what a real pushdown win looks like end-to-end.

The later sections go deeper into advanced shippability edge cases, cost model calibration, and regression-proofing FDW workloads.

Prerequisites
Executive Summary
Motivation
FDW Basics Without the Setup Tax
Pushdown Mechanics
Shippable Operations: a Deep Dive
Pushdown Blockers and Why They Exist
Reading EXPLAIN Like a Pro
How to Tune postgres_fdw
Schema and Index Recommendations
Benchmarking Methodology
Monitoring and Logging
Case Study: Refactoring a Keycloak Coverage Query
Checklist and Troubleshooting Guide
Case Study Takeaways
Advanced Operations: A Deeper Dive into Shippability
Common Anti‑Patterns and How to Avoid Them
Extending Tuning: Calibrating Cost Models
Further Case Studies and Practical Examples
Monitoring, Diagnostics, and Regression Testing
Extended Guidelines for Advanced DBAs
Bringing it All Together
References

Prerequisites

This handbook assumes basic comfort with Postgres query plans. It builds on EXPLAIN (ANALYZE, BUFFERS) rather than reintroducing SQL fundamentals, indexing, or join algorithms.

The focus here is federated execution: how foreign queries behave, and how to reason about them with the same clarity as local plans.

Here’s what you should already be comfortable with:

Reading EXPLAIN (ANALYZE, BUFFERS) output and spotting obvious plan smells (row explosions, bad join order, missed indexes).
Basic join mechanics (nested loop, hash join, merge join) and why cardinality estimates matter.
Postgres statistics at a practical level (ANALYZE, correlation, and what “estimated rows vs actual rows” implies).

And here’s what you need to follow along with the examples:

A Postgres “local” instance that will run postgres_fdw and act as the coordinator.
A Postgres “remote” instance that holds the foreign tables.
Permission on the local side to:
- CREATE EXTENSION postgres_fdw;
- create a SERVER and USER MAPPING
- create FOREIGN TABLE objects (or permission to use existing ones)
A way to run queries and capture plans:
- psql is enough, and so is any GUI, as long as you can run EXPLAIN (ANALYZE, BUFFERS, VERBOSE).

We won’t go through a long environment setup walkthrough. The examples assume the FDW objects exist and focus on plans and behavior.

We also won’t go into general distributed systems theory. Only the pieces that show up in an FDW plan are used.

Executive Summary

The single most important lesson of this handbook is that FDW pushdown reduces data movement. It’s tempting to think of pushdown as merely changing where a calculation happens (“move the work to the remote”). But what really matters is whether the remote server is asked for only the rows you need.

When pushdown is working, the remote server performs the selective join and filtering, and the local Postgres receives a small, already reduced result set. When pushdown fails, the local server becomes a distributed query coordinator: it pulls large intermediate sets over the network and then finishes the heavy lifting locally.

Why does this matter? Because a refactor that makes more of your query shippable to the remote server can slash end‑to‑end latency without changing a single row of output. In the case study we'll explore later, rewriting a query so that the FDW can ship a joined remote query instead of performing multiple foreign scans and local joins reduces runtime from approximately 166 ms to 25 ms. The business logic did not change – the shape of the work changed.

Below is a simple bar chart illustrating that dramatic drop. The chart uses actual timings from the case study. If you run the experiment yourself, the numbers may differ depending on your hardware and network, but the relative difference should be clear.

Motivation

Foreign data wrappers let you query remote data using the same SQL syntax you use locally. That convenience is exactly why they can be so deceptive.

A federated query may look like a normal join, but under the hood, it behaves like a distributed system: some part of the plan runs on the remote server, some on the local server, and every boundary between them is a network hop. The slow path is rarely “bad SQL” – it’s usually a combination of two things:

Too many rows are pulled over the network. Without pushdown, the FDW retrieves a large slice of the remote table and applies your filters and joins locally. This may lead to tens of thousands or millions of rows being shipped across the network when you only needed hundreds or fewer.
Too many round-trips. If the plan performs a nested loop that drives a foreign scan, it can end up executing the same remote query hundreds or thousands of times. Each call might be fast on its own, but latency adds up.

This isn't speculation. PostgreSQL's documentation makes clear that a foreign table has no local storage and that Postgres “asks the FDW to fetch data from the external source” [1]. There is no local buffer cache or heap storage to hide mistakes. Every row you retrieve must traverse the network at least once. If your plan fetches more rows than it needs, or repeatedly does so, performance can degrade quickly.

That’s why you should treat the Remote SQL shown in EXPLAIN (VERBOSE) as part of your query plan. It tells you exactly what the remote server is being asked to do. If it’s missing your filters or joins, you know the local server will have to finish the job. The rest of this handbook will teach you how to read that plan, how to force pushdown when possible, and how to recognize the signs that something has gone wrong.

FDW Basics Without the Setup Tax

You might be tempted to skip this section if you've already created foreign tables in your own databases. Don't. Understanding the architecture of foreign data wrappers is essential to understanding why pushdown matters.

SQL/MED in a nutshell

PostgreSQL implements the SQL/MED (Management of External Data) standard through its FDW framework. To access a remote Postgres server via postgres_fdw, you perform four steps:

Install the extension: CREATE EXTENSION postgres_fdw tells Postgres to load the FDW code.
Create a foreign server: CREATE SERVER foreign_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '...', port '...', dbname '...') defines where the remote server resides and how to connect.
Create a user mapping: CREATE USER MAPPING FOR your_user SERVER foreign_server OPTIONS (user 'remote_user', password '...') tells Postgres how to authenticate on the remote side.
Create a foreign table: CREATE FOREIGN TABLE remote_table (...) SERVER foreign_server OPTIONS (schema_name '...', table_name '...'); defines the columns and references the remote table.

Once you've done that, you can run SELECT statements against the foreign table as if it were local. But the definition hides an important detail: there is no storage associated with that foreign table [1]. Every time you SELECT, INSERT, UPDATE, or DELETE, the FDW must build a remote query, send it over a connection to the remote server (opened on first use and, by default, reused for the rest of the session), and read the results. This overhead is small for simple queries but becomes critical as queries get more complex.

What postgres_fdw does and does not do

postgres_fdw does two things for you:

It builds remote SQL from your query, including pushing down safe filters, joins, sorts, and aggregates when it can.
It fetches rows from the remote server and hands them to the local executor. If some part of your query cannot be executed remotely, the local executor performs that part.

The FDW tries hard to minimize data transfer by sending as much of your WHERE clause as possible to the remote server and by not retrieving unused columns [2]. It also has a number of tuning knobs that we'll explore later (such as fetch_size, use_remote_estimate, fdw_startup_cost, and fdw_tuple_cost [3]). But the real win often comes from structuring your query so that the FDW can push work down.

There's one last architectural point to keep in mind: the remote server runs with a restricted session environment. In remote sessions opened by postgres_fdw, the search_path is set to pg_catalog only, and TimeZone, DateStyle, and IntervalStyle are set to specific values [4]. This means that any functions you expect to run remotely must be schema‑qualified or packaged in a way that the FDW can find them. It also underscores why you should not override session settings for FDW connections unless you know exactly what you are doing [4].

Pushdown Mechanics

At a high level, “pushdown” means pushing as much of your SQL query as possible to the remote server. But the FDW cannot simply send arbitrary SQL. It must be safe and portable for remote evaluation. Postgres uses the term shippable to describe expressions and operations that can be evaluated on the foreign server.

What “shippable” means in practice

An expression is considered shippable if it meets several conditions:

It uses built‑in functions, operators, or data types, or functions/operators from extensions that have been explicitly allow‑listed via the extensions option on the foreign server [2]. If you use a custom function or an extension that has not been declared, the FDW assumes it cannot run remotely.
It’s marked IMMUTABLE. Postgres distinguishes between IMMUTABLE, STABLE, and VOLATILE functions. Only immutable functions – those that always return the same output for the same inputs and don’t depend on session state – are candidates for pushdown [5]. This rule prevents time‑dependent or non‑deterministic functions, such as now() or random(), from being evaluated remotely, because the result might differ between the local and remote servers.
It doesn’t depend on local collations or type conversions. PostgreSQL’s docs warn that type or collation mismatches can lead to semantic anomalies [1]. If the FDW cannot guarantee that a comparison behaves identically on both servers, it will refuse to push it down. For example, comparing a citext column to a text constant could be unsafe if the remote server doesn’t have the citext extension installed.

From these rules, you can derive a mental checklist: avoid non‑immutable functions in your WHERE clause, keep your join conditions simple and typed correctly, and list any third‑party extensions you want to use in the foreign server’s extensions option so that they are considered shippable [2].

WHERE pushdown

If a WHERE clause consists entirely of shippable expressions, it will be included in the remote query. Otherwise, it will be evaluated locally. This matters because pushing a filter down reduces the number of rows returned to the local server.

Consider a predicate like this:

WHERE created_at >= now() - interval '30 days'

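Because now() is STABLE rather than IMMUTABLE (its value is fixed within a transaction, but it depends on when that transaction started), Postgres cannot assume the remote server will evaluate now() to the same value. The FDW therefore fetches every row that matches the remaining shippable conditions (the entire table, if there are none) and applies the filter locally.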

A better approach is to pass a parameter into the query or compute the cutoff timestamp once in the application and embed it into the SQL.
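Here is a minimal sketch of that approach in Go. The table name remote_events and the helper names are hypothetical, and the code only builds the query text and its argument; you would pass both to whatever database driver you use. Whether the parameter ends up in the Remote SQL still depends on your plan, so confirm it with EXPLAIN (VERBOSE).

```go
package main

import (
	"fmt"
	"time"
)

// recentRowsQuery computes the cutoff once, in the application, and
// returns a query that compares created_at against a plain parameter
// instead of calling now() inside the SQL.
// Assumption: remote_events is a foreign table with a created_at column.
func recentRowsQuery(now time.Time, days int) (string, []any) {
	cutoff := now.UTC().AddDate(0, 0, -days)
	const query = "SELECT id, created_at FROM remote_events WHERE created_at >= $1"
	return query, []any{cutoff}
}

// cutoffRFC3339 is a small helper for inspecting the computed cutoff.
func cutoffRFC3339(nowUnix int64, days int) string {
	_, args := recentRowsQuery(time.Unix(nowUnix, 0), days)
	return args[0].(time.Time).Format(time.RFC3339)
}

func main() {
	query, args := recentRowsQuery(time.Now(), 30)
	fmt.Println(query)
	fmt.Println("cutoff:", args[0].(time.Time).Format(time.RFC3339))
}
```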

Join pushdown conditions

Joins are the next big lever. When postgres_fdw encounters a join between foreign tables on the same foreign server, it will send the entire join to the remote server unless it believes it will be more efficient to fetch the tables individually or unless the tables use different user mappings [6].

It applies the same precautions described for WHERE clauses: the join condition must be shippable, and both tables must be on the same server. Cross‑server joins are never pushed down – the FDW will perform them locally.

Shippability decision tree

It can be helpful to visualize the shippability rules as a flowchart. Below is a simple decision tree that you can use when inspecting an expression or join clause.

It starts with the question of whether an expression is in a WHERE or JOIN clause. Further decisions are made based on factors like using volatile functions, built-in functions, type mismatches, or cross-server joins. The flowchart concludes with outcomes like "Not shippable, evaluated locally" or "Shippable, included in Remote SQL."

If you reach the left side of the tree, the expression will be evaluated locally. If you reach the right side, the FDW can ship it.
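The same decision tree can be written as a short function. This is a simplified model of the rules described in this section, not postgres_fdw's actual implementation: the Expr struct and its fields are hypothetical, and you would fill them in by hand while reviewing a query.

```go
package main

import "fmt"

// Expr is a hypothetical summary of one WHERE or JOIN expression.
type Expr struct {
	BuiltIn         bool // uses only built-in functions, operators, and types
	ExtensionListed bool // or: its extension is in the server's extensions option
	Immutable       bool // every function involved is IMMUTABLE
	TypesMatch      bool // types and collations agree on both sides
	IsJoin          bool // the expression is a join condition
	SameServer      bool // joins only: both tables are on the same foreign server
	SameUserMapping bool // joins only: both tables use the same user mapping
}

// Shippable walks the checks in the order used in this section and
// returns the outcome plus the reason for it.
func Shippable(e Expr) (bool, string) {
	switch {
	case !e.BuiltIn && !e.ExtensionListed:
		return false, "function or operator is neither built-in nor allow-listed"
	case !e.Immutable:
		return false, "uses a non-immutable function"
	case !e.TypesMatch:
		return false, "type or collation mismatch"
	case e.IsJoin && !(e.SameServer && e.SameUserMapping):
		return false, "join spans different servers or user mappings"
	}
	return true, "shippable, included in Remote SQL"
}

func main() {
	// A filter such as created_at >= now() - interval '30 days'.
	filter := Expr{BuiltIn: true, Immutable: false, TypesMatch: true}
	ok, reason := Shippable(filter)
	fmt.Println(ok, reason)
}
```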

Shippable Operations: a Deep Dive

Postgres has been expanding what postgres_fdw can push down over several versions. This section walks through each operation class and the conditions required for pushdown.

Filters (WHERE clauses)

As explained above, simple filters that use built‑in operators and immutable functions are generally pushed down. If you see a Filter: line on a Foreign Scan node in your plan, it means some part of your predicate didn’t qualify. Common reasons include using now() or other non‑immutable functions, referencing a non‑allow‑listed extension, or comparing values whose collations differ.

When this happens, the entire table (or at least all rows matching other shippable conditions) is fetched, and the filter is applied locally.

Plan smell: Look for a Foreign Scan node that carries its own Filter: line. That means filtering happened locally. Also look for broad Remote SQL such as:

SELECT * FROM remote_table WHERE (name = 'Hamdaan')

where the other constraints from your query (a group or tenant condition, for example) are missing. That's a sign that part of the filter was not pushed down.

Joins

Simple inner joins between foreign tables on the same foreign server are usually pushable. The join condition must satisfy the same shippability rules as filters. If the join involves more than one foreign server, if the join condition uses an unshippable function, or if the foreign tables use different user mappings, the FDW will fetch each table separately and join them locally [6]. This can lead to large intermediate sets being transferred.

Plan smell: A Hash Join or Merge Join where both inputs are Foreign Scan nodes indicates that the join was performed locally. Conversely, a single Foreign Scan representing a join and containing the JOIN ... ON clause in Remote SQL indicates that the join was pushed down.

Aggregates (GROUP BY, COUNT, SUM, and so on)

Starting in PostgreSQL 10, aggregates can be pushed to the remote server when possible. The release notes state explicitly: “push aggregate functions to the remote server,” and explain that this reduces the amount of data that must be transferred from the remote server and offloads aggregate computation [7].

To qualify, both the grouping expressions and the aggregate functions themselves must be shippable. If the FDW cannot push an aggregate, it will fetch the raw rows and perform the aggregation locally.

Plan smell: Look for a GroupAggregate or HashAggregate node above a Foreign Scan that returns many rows. When the aggregate is pushed down, there will be no local aggregate node. Instead, the Remote SQL will include a GROUP BY clause.

ORDER BY and LIMIT

Prior to PostgreSQL 12, sorting and limiting were rarely pushed down. In version 12, a patch by Etsuro Fujita allowed ORDER BY sorts and LIMIT clauses to be pushed to postgres_fdw foreign servers in more cases [8]. For the sort or limit to be pushed, the underlying scan must be pushable, and the ordering expression must be shippable. Partitioned queries or complicated join trees may still cause the sort or limit to be applied locally.

Plan smell: A local Sort or Limit node above a Foreign Scan indicates the operation was not pushed down. Conversely, a Remote SQL statement containing ORDER BY and LIMIT indicates that pushdown succeeded.

DISTINCT

Distinct operations can be pushed down when the distinct expression list is shippable. But if the distinct is combined with unshippable expressions, or if the distinct is applied after a join that cannot be pushed down, the FDW will retrieve all rows and perform the distinct locally.

Window functions

In practice, window functions are rarely pushed down through postgres_fdw. They often require ordering or partitioning semantics that are difficult to represent portably. If you see a WindowAgg node in your plan, it’s almost always local. That doesn’t mean you can't use window functions with foreign tables, but you should expect them to incur network and CPU costs.

Version differences

Postgres developers continue to improve the FDW layer. Here are some notable changes by version:

PostgreSQL 9.6 introduced remote join pushdown and allowed UPDATE/DELETE pushdown. Before 9.6, all joins were local.
PostgreSQL 10 introduced aggregate pushdown, enabling remote GROUP BY and aggregate functions [7].
PostgreSQL 12 expanded ORDER BY and LIMIT pushdown [8].
PostgreSQL 15 added pushdown for certain CASE expressions and other improvements.

If you learned FDW behavior on an older version, revisit your assumptions.

Pushdown Blockers and Why They Exist

When pushdown fails, it’s not due to bad luck. There’s always a reason grounded in safety or correctness. Here are the most common blockers and how to diagnose them.

Non‑immutable functions

Functions marked VOLATILE or STABLE cannot be pushed down because their results may differ between the local and remote server. Examples include now(), random(), current_user, and user‑defined functions that look at session variables or query the database. Even functions you might think are harmless, like age() or clock_timestamp(), can cause pushdown to fail.

Fix: Compute volatile values in your application or in a CTE before referencing the foreign table. For example, compute timestamp 'now' - interval '30 days' as a constant and compare your created_at column against that constant. Alternatively, move the logic into a stored generated column on the remote table.

Type and collation mismatches

The documentation warns that when types or collations don’t match between the local and remote tables, the remote server may interpret conditions differently [1]. This is particularly insidious when text comparisons, case‑insensitive collations, or non‑default locale settings are used. If Postgres can't guarantee the same semantics, it will pull rows locally and evaluate the expression.

Fix: Make sure that your foreign table definition uses the same data types and collations as the remote table. When in doubt, explicitly cast values to a common type.

Cross‑server joins

Joins across different foreign servers cannot be pushed down. The FDW can only ship a join when both tables reside on the same remote server and use the same user mapping [6]. Otherwise, it will perform two separate scans and join the results locally.

Fix: If you frequently join tables across servers, consider consolidating the tables on a single server, materializing a view on one side, or pulling the smaller table into a temporary local table before joining.

Mixed local and foreign joins

A join between a local table and a foreign table will not be pushed down. Even though the foreign side might be pushdown‑eligible, the FDW cannot join it with local data on the remote server. A nested loop with a parameterized foreign scan is the typical pattern here, resulting in many remote calls.

Fix: Filter or aggregate as much as possible on the foreign side first (via a CTE or by materializing a subset) before joining to local tables.

Remote session settings and search paths

Because postgres_fdw sets a restricted search_path, TimeZone, DateStyle, and IntervalStyle in remote sessions [4], any functions you call must be schema‑qualified or otherwise compatible. If a function relies on the current search path or session settings, it may break or produce different results on the remote side.

Fix: Schema‑qualify remote functions and ensure that any environment‑dependent logic is safe to execute under the default FDW session settings. If necessary, attach SET search_path or other settings to your remote functions.

Troubleshooting matrix

The table below maps symptoms in your EXPLAIN plan to likely causes and fixes. Use it as a quick diagnostic tool when something looks off.

Symptom in plan	Likely cause	Suggested fix
Foreign Scan has loops much greater than 1	Parameterized remote lookup caused by nested loop, join conditions not shippable	Rewrite join so the FDW can ship a single joined query, or batch remote requests via an IN list or temporary table
Broad Remote SQL that lacks scope predicates	WHERE clause contains non‑immutable functions or unsupported operators	Replace volatile functions with constants or allow‑list extension functions, ensure types and collations match
Local Hash Join or Merge Join between two foreign tables	Join could not be pushed down (different servers, user mappings, or unshippable join expression)	Consolidate tables on one server, align user mappings, or rewrite the join condition
Local Sort, Limit, or Unique on top of a Foreign Scan	ORDER BY, LIMIT, or DISTINCT could not be pushed down	Simplify sort expressions, push filters deeper, check PG version for improvements
Plan runs but gives wrong results when pushdown is enabled	Semantic mismatch due to type/collation differences or remote session settings [1] [4]	Align types/collations, schema‑qualify functions, use stable session settings

Reading EXPLAIN Like a Pro

Many developers skim EXPLAIN plans for local queries, looking at the top nodes and overall cost. For FDW queries, you must invert that habit: read the foreign parts first. The Remote SQL string tells you what the remote server is being asked to do, and the loops field tells you how many times that remote call is executed.

Inspect the Foreign Scan nodes

Start by finding the Foreign Scan node(s). In EXPLAIN (VERBOSE), each foreign scan includes a line like:

Remote SQL: SELECT ...

This line is not a trivial detail – it’s the actual SQL that will run on the remote server. Read it carefully. Does it include your WHERE predicates? Does it include your join conditions? If not, you know the local server will pick up the slack.

Look at the loops value. If it exceeds 1, the same remote query is executed multiple times. For example:

Foreign Scan on public.user_entity  (rows=1 loops=416)
  Remote SQL: SELECT id, tenant_id FROM public.user_entity WHERE enabled AND service_account_client_link IS NULL AND id = $1

This is the “N+1” problem in disguise. The plan executes the foreign scan once per outer row. Multiply the per‑loop cost by the number of loops to understand why the query is slow. The fix is to rewrite the query so that the join and filters are applied in a single remote call.
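If you review many plans, you can automate this check. The sketch below scans the text output of EXPLAIN ANALYZE and reports every Foreign Scan whose loops value is greater than 1. The function and type names are hypothetical, and the regular expression assumes the default text format shown above, not JSON or YAML output.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// foreignScanRe matches a Foreign Scan line in text-format EXPLAIN
// output and captures the relation name and the loops count.
var foreignScanRe = regexp.MustCompile(`Foreign Scan on (\S+).*loops=(\d+)`)

// RepeatedScan describes a foreign scan that ran more than once.
type RepeatedScan struct {
	Relation string
	Loops    int
}

// repeatedForeignScans returns every Foreign Scan in the plan text
// whose loops value exceeds 1.
func repeatedForeignScans(plan string) []RepeatedScan {
	var found []RepeatedScan
	for _, line := range strings.Split(plan, "\n") {
		m := foreignScanRe.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		loops, err := strconv.Atoi(m[2])
		if err != nil || loops <= 1 {
			continue
		}
		found = append(found, RepeatedScan{Relation: m[1], Loops: loops})
	}
	return found
}

func main() {
	plan := `Nested Loop  (rows=414 loops=1)
  -> Foreign Scan on public.user_entity  (rows=1 loops=416)
  -> Foreign Scan on public.user_attribute (rows=671 loops=1)`

	for _, scan := range repeatedForeignScans(plan) {
		fmt.Printf("%s ran %d times\n", scan.Relation, scan.Loops)
	}
}
```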

Recognize InitPlan vs SubPlan

An InitPlan runs once and caches its result. A SubPlan can run per outer row. In FDW queries, subplans often drive parameterized remote scans. If you see a SubPlan attached to a nested loop that feeds a foreign scan, suspect a parameterized remote lookup and look for ways to turn it into an InitPlan or merge it into a single remote query.

Understand CTE materialization

Common table expressions (CTEs) behave differently depending on whether they are marked MATERIALIZED or NOT MATERIALIZED. A materialized CTE is computed once and stored in a temporary structure, then read by the rest of the query. A non‑materialized CTE is inlined into the parent query, allowing optimizations to span across the boundary.

In PostgreSQL 12 and later, CTEs are inlined by default unless they’re referenced multiple times or explicitly marked MATERIALIZED. Materializing a CTE that contains a foreign scan can freeze a broad remote fetch and prevent later clauses from being pushed down. On the other hand, materialization can prevent repeated remote scans if the CTE is referenced multiple times. Use this lever deliberately to control where remote work happens.

Annotated example

Let's annotate a simplified excerpt that places a Foreign Scan from the old plan next to one from the refactored plan. The goal is to show how to quickly read the relevant parts.

Nested Loop  (rows=414 loops=1)
  -> Hash Join  (rows=416 loops=1)
       -> Foreign Scan on public.user_entity (rows=1 loops=416)
            Remote SQL: SELECT id, tenant_id FROM public.user_entity WHERE enabled AND service_account_client_link IS NULL AND id = $1
  -> Foreign Scan on public.user_attribute (rows=671 loops=1)
       Remote SQL: SELECT ua.user_id, ua.value FROM user_attribute ua JOIN user_entity u ON ua.user_id = u.id JOIN tenant r ON u.tenant_id = r.id WHERE ua.name = 'attribute A' AND r.name = 'demo' AND u.enabled AND u.service_account_client_link IS NULL AND (g.name = 'keycloak-group-a' OR g.parent_group = $1)

In the old plan, the first Foreign Scan executed 416 times, each time retrieving a single row. The Remote SQL only applies the filter on enabled and service_account_client_link – it doesn’t include the tenant or group scoping. That scoping is applied by the nested loop outside the foreign scan.

In the refactored plan, the second Foreign Scan results from combining user_attribute, user_entity, user_group_membership, keycloak_group, and tenant into a single remote query. It retrieves 671 rows in a single query and includes all relevant filters. There is no repeated remote call. The timing difference is driven by the different loop values and the selectivity of the Remote SQL.

How to Tune postgres_fdw

Once you've structured your query for maximum pushdown, tuning knobs let you squeeze out further performance improvements and adjust planner decisions.

fetch_size

fetch_size controls how many rows postgres_fdw retrieves per network fetch. The default is 100 rows [9]. A small fetch size means more round-trips and lower memory usage. A larger fetch size reduces network overhead at the cost of buffering more rows in memory.

In practice, increasing fetch_size to a few thousand can reduce latency for large result sets. It’s specified either at the foreign server or foreign table level:

ALTER SERVER foreign_server OPTIONS (ADD fetch_size '1000');
ALTER FOREIGN TABLE remote_table OPTIONS (ADD fetch_size '1000');
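To get a feel for the trade-off, you can estimate how many fetches a result set needs at a given fetch_size. This is a back-of-the-envelope sketch with a hypothetical helper: it ignores cursor setup and any final empty fetch, so treat the numbers as approximate.

```go
package main

import "fmt"

// fetchRoundTrips estimates how many fetches are needed to transfer
// rows at the given fetch size (rows divided by fetchSize, rounded up).
func fetchRoundTrips(rows, fetchSize int) int {
	if rows <= 0 || fetchSize <= 0 {
		return 0
	}
	return (rows + fetchSize - 1) / fetchSize
}

func main() {
	const rows = 50000
	for _, size := range []int{100, 1000, 5000} {
		fmt.Printf("fetch_size=%d: about %d fetches\n", size, fetchRoundTrips(rows, size))
	}
}
```

Multiply the fetch count by your network round-trip time to see roughly how much latency a larger fetch_size could remove.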

use_remote_estimate

By default, the planner estimates the cost of foreign scans using local statistics. This can be wildly inaccurate if the foreign table has a different data distribution. Setting use_remote_estimate to true tells postgres_fdw to run EXPLAIN on the remote server to get row count and cost estimates. This can dramatically improve join order selection at the cost of an additional remote query during planning [3]. You can set this per table or per server. Use ADD the first time you set an option and SET to change a value that already exists:

ALTER SERVER foreign_server OPTIONS (ADD use_remote_estimate 'true');

fdw_startup_cost and fdw_tuple_cost

These cost parameters model the overhead of starting a foreign scan and the cost per row fetched. Adjusting them can influence the planner’s choice of join strategy. A higher fdw_startup_cost discourages the planner from choosing plans with many small foreign scans (which might generate many remote calls). A higher fdw_tuple_cost discourages plans that fetch large numbers of rows [3]. Use these only after you have solid evidence from EXPLAIN and experiments.

ANALYZE and analyze_sampling

Running ANALYZE on a foreign table collects local statistics by sampling the remote table [3]. Accurate stats are essential for good estimates when use_remote_estimate is false.

But if the remote table changes frequently, these stats become stale quickly. The analyze_sampling option (available in PostgreSQL 16 and later) controls whether sampling happens on the remote side or locally. When analyze_sampling is set to random, system, bernoulli, or auto, ANALYZE samples rows on the remote server instead of pulling all rows into the local server [3].
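
As a sketch, assuming PostgreSQL 16 or later and the remote_table name used earlier, choosing a sampling method and refreshing the statistics might look like this. The bernoulli choice is only an illustration.

```sql
-- Ask postgres_fdw to sample on the remote side using the bernoulli method
-- (use SET instead of ADD if analyze_sampling is already defined on this table)
ALTER FOREIGN TABLE remote_table OPTIONS (ADD analyze_sampling 'bernoulli');

-- Collect local statistics for the foreign table
ANALYZE remote_table;

-- Inspect the statistics the local planner will now use
SELECT attname, null_frac, n_distinct
FROM pg_stats
WHERE tablename = 'remote_table';
```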

extensions

The extensions option lists extensions whose functions and operators can be shipped to the remote server [2]. If you rely on functions from citext, pg_trgm, or other extensions, add them to the server definition:

-- Use ADD the first time; use SET if the option is already defined on the server
ALTER SERVER foreign_server OPTIONS (ADD extensions 'citext,pg_trgm');

A quick knob impact table

Knob	Primary effect	When to change it	Possible downside
fetch_size	Number of rows per fetch	Result sets are large and latency dominates	Too large consumes memory
use_remote_estimate	Better row count/cost estimates	Planner misestimates foreign scans	Extra remote queries during planning
fdw_startup_cost	Penalty per foreign scan	Planner chooses many small foreign scans	Wrong values bias the planner
fdw_tuple_cost	Cost per row fetched	Planner pulls too many rows	Mis‑tuned values mislead planner
extensions	Which extension functions are shippable	Using extension functions in predicates	Extensions must exist and match on both servers

Schema and Index Recommendations

Pushdown doesn’t eliminate the need for good indexes. In fact, effective pushdown depends on the remote server having indexes that support the filter and join predicates you’re shipping.

Below are some patterns to watch for in FDW queries and the indexes that support them. You can adapt these to your own schema.

Table	Access pattern	Recommended index	Why
tenant (remote)	Filter by tenant.name	UNIQUE (name) or BTREE (name)	Resolves tenant ID quickly
keycloak_group (remote)	Filter by name, join by tenant_id, filter on parent_group	Composite (tenant_id, name) and (parent_group)	Supports resolving root group and walking one‑level hierarchy
user_group_membership (remote)	Join by user_id, filter by group_id	BTREE (group_id, user_id)	Efficiently finds users in a set of groups
user_attribute (remote)	Filter by name, join by user_id	Composite (name, user_id) (optionally include value)	Matches “attribute name → users → values” flow
user_entity (remote)	Filter by tenant_id, enabled, service_account_client_link IS NULL, join by id	Partial index on (tenant_id, id) with predicate on enabled and service_account_client_link IS NULL	Helps remote planner start from user table when tenant and user filters are applied
filtercategory (local)	Filter by category && uuid[], join on (entitytype, entityid)	GIN index on category, BTREE (entitytype, entityid)	Speeds array overlap checks and join predicate

In general, indexes should reflect the join order you expect the remote planner to use. If your Remote SQL starts with:

FROM user_attribute ua JOIN user_entity u ON ua.user_id = u.id JOIN user_group_membership ugm ON ...

ensure that indexes exist on user_attribute(user_id) and user_group_membership(user_id).
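
The recommendations in the table could be written as the following statements. This is a sketch only: the index names are hypothetical, and a real Keycloak schema may already ship with equivalent indexes, so check the existing ones before creating anything.

```sql
-- Run on the remote (Keycloak) database. Index names are hypothetical.
CREATE INDEX IF NOT EXISTS idx_user_attribute_name_user
  ON user_attribute (name, user_id);

CREATE INDEX IF NOT EXISTS idx_ugm_user
  ON user_group_membership (user_id);

CREATE INDEX IF NOT EXISTS idx_ugm_group_user
  ON user_group_membership (group_id, user_id);

-- Partial index matching the enabled / non-service-account filter
CREATE INDEX IF NOT EXISTS idx_user_entity_active
  ON user_entity (tenant_id, id)
  WHERE enabled AND service_account_client_link IS NULL;

-- Run on the local database: GIN index for the array overlap (&&) check
CREATE INDEX IF NOT EXISTS idx_filtercategory_category
  ON filtercategory USING gin (category);
```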

Benchmarking Methodology

It’s easy to claim a performance improvement without proper measurement. Here's a repeatable method you can use to benchmark FDW query changes.

Warm the caches. Run each query once to load data into the remote buffer cache and the local FDW connection. Discard the timings.
Measure latencies. Use EXPLAIN (ANALYZE, BUFFERS, VERBOSE) to capture execution times, buffer usage, and remote row counts. Be aware that EXPLAIN ANALYZE adds overhead, so record the raw execution time if possible by running the query directly.
Record remote metrics. On the remote server, enable pg_stat_statements and track calls, total_exec_time (named total_time before PostgreSQL 13), and rows for each remote query. This gives you a per‑query breakdown and confirms what Remote SQL is executed.
Control for concurrency and network latency. Run benchmarks during a quiet period or isolate the test cluster. If your environment has high network latency, record the round‑trip time separately to attribute delays.
Compare apples to apples. Benchmark the old and new queries under identical conditions. Use the same sample data, same remote server, and same connection settings.
Look at row counts. The primary goal of pushdown is to reduce the number of rows shipped. Compare the rows column of each Foreign Scan node.
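
For the remote-metrics step, a query along these lines can be run on the remote server. It is a sketch that assumes PostgreSQL 13 or later column names and that pg_stat_statements is already listed in shared_preload_libraries. The user_attribute filter is just an example taken from the case study.

```sql
-- Run on the remote server
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Reset counters before a benchmark run so earlier traffic does not skew the numbers
SELECT pg_stat_statements_reset();

-- After the run: which statements touching user_attribute ran, and how often?
SELECT calls,
       rows,
       round(total_exec_time::numeric, 2) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       query
FROM pg_stat_statements
WHERE query ILIKE '%user_attribute%'
ORDER BY total_exec_time DESC
LIMIT 10;
```

Because postgres_fdw reads results through cursors, expect the remote statements to show up as DECLARE ... CURSOR entries, with the row counts recorded against the accompanying FETCH statements.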

Here's a simple matrix you can use to record your experiments:

Scenario	What you're testing	Expected change in Remote SQL	Metrics to record
Baseline (old query)	Starting point: broad remote scans + local joins	Remote SQL lacks scoping predicates	p50/p95 latency, remote row count, local sort/hash time
Refactor (new query)	Join + filter pushdown	Remote SQL includes joins and filters	Same metrics, plus remote row count
Introduce a volatile function	Pushdown blocker test	Clause removed from Remote SQL	Remote row count increases, local filter cost increases
Type or collation mismatch	Semantic risk test	Remote SQL might change behavior or lose pushdown	Compare correctness and row counts
ORDER/LIMIT pushdown	Version‑dependent test	Remote SQL includes ORDER BY, LIMIT	Sort time shifts to remote. Result row count should remain the same
use_remote_estimate on/off	Planning accuracy test	Planner uses remote estimates	Planning time, join order, and runtime difference

Monitoring and Logging

In production, you need to know when a query starts misbehaving. There are two places to look: the local server and the remote server.

Local metrics

pg_stat_statements. This extension tracks planning and execution times, row counts, and buffer hits for each query. Look for high total times relative to rows or calls.
Auto Explain or auto_explain. Set auto_explain.log_min_duration to capture slow queries with their plans, and enable auto_explain.log_verbose so the logged plan includes the Remote SQL. This shows you which Remote SQL was executed and whether the plan changed.
Connection pool metrics. Monitor connection counts and wait events related to FDW operations (for example, PostgresFdwConnect, PostgresFdwGetResult) as described in the documentation [10].
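
A session-level setup for experimenting with these local metrics might look like the sketch below. The 200 ms threshold is an arbitrary example, LOAD normally requires elevated privileges, and the wait event names depend on your PostgreSQL version.

```sql
-- Session-level setup for experiments (requires sufficient privileges)
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '200ms';  -- log plans for statements slower than this
SET auto_explain.log_analyze = on;            -- include actual row counts and loops
SET auto_explain.log_verbose = on;            -- include the Remote SQL line

-- Sessions currently waiting on postgres_fdw activity
SELECT pid, wait_event_type, wait_event, state, left(query, 60) AS query
FROM pg_stat_activity
WHERE wait_event LIKE 'PostgresFdw%';
```

For production use, the same auto_explain settings would usually go into the server configuration rather than a single session.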

Remote metrics

pg_stat_statements on the remote server. This lets you see which Remote SQL queries are being executed, how often, and how long they take. Compare these with the Remote SQL strings in your local EXPLAIN plans.
Server logs. Increase log_statement or log_min_duration_statement on the remote server to capture long-running remote queries.

Correlating local and remote metrics can reveal patterns such as a new code path causing a surge in remote queries, or pushdown failures that lead to heavy remote scans.

Case Study: Refactoring a Keycloak Coverage Query

The theory above may seem abstract until you see it play out in practice. Let's walk through a real example inspired by a Keycloak integration.

The original query calculated coverage: given a list of category IDs, it returned the percentage of users who had attributes mapped to those categories and a JSON array of entity counts. The query used a CTE to build a list of scoped users, then joined it with user attributes, category mappings, and a few other tables.

Symptom

In a test environment with 100K user records, the query averaged 166 ms. This was slower than expected. Running EXPLAIN (ANALYZE, BUFFERS, VERBOSE) showed two foreign scans on the Keycloak database. The first scanned user_entity 416 times (loops = 416). The second pulled all rows from user_attribute where name = 'attributeA' before filtering by tenant and group locally.

Here's a simplified excerpt (numbers are approximate):

Foreign Scan on public.user_entity  (actual time=0.117..0.117 rows=1 loops=416)
  Remote SQL: SELECT id, tenant_id FROM public.user_entity WHERE (enabled AND service_account_client_link IS NULL AND id = $1)
Foreign Scan on public.user_attribute  (actual time=41.267..80.352 rows=80739 loops=1)
  Remote SQL: SELECT value, user_id FROM public.user_attribute WHERE (('attributeA' = name))

The first scan performed a single-row lookup 416 times. The second scan retrieved 80,739 rows because the only condition pushed down was name = 'attributeA'. Tenant and group scoping occurred locally. That meant 80k rows were transferred over the network and then filtered down to about 671 on the local side.

Diagnosis

There were two main issues.

First was the N+1 remote calls on user_entity. The join to user_entity was not pushed down, so the plan executed a remote lookup for each row from user_group_membership. This created 416 remote queries.

Second was the unscoped attribute fetch. Because the WHERE clause included user_entity.tenant_id = tenant.id and keycloak_group.name = 'groupA' in a higher CTE, the FDW could not see those predicates when scanning user_attribute. It therefore fetched all rows with name = 'attributeA' and left the tenant and group filters to the local side.

Refactor

The fix was to inline the tenant and group joins into the user_attribute scan to avoid the nested-loop pattern. The refactored selected_user_attributes CTE looked like this (simplified for readability):

WITH selected_user_attributes AS (
  SELECT DISTINCT ua.user_id, ua.value
  FROM public.user_attribute ua
  JOIN public.user_entity u ON u.id = ua.user_id
  JOIN public.user_group_membership ugm ON ugm.user_id = u.id
  JOIN public.keycloak_group g ON g.id = ugm.group_id
  JOIN public.tenant r ON r.id = u.tenant_id
  WHERE ua.name = 'attributeA'
    AND u.enabled
    AND u.service_account_client_link IS NULL
    AND r.name = 'tenantA'
    AND (g.name = 'groupA' OR g.parent_group = (
         SELECT id FROM public.keycloak_group WHERE name = 'groupA' AND tenant_id = r.id
    ))
)

This single query expresses the same scoping logic that previously lived in separate CTEs. Because all the join conditions are on the same foreign server and use built‑in operators, the FDW can push down the entire join. The new plan looked like this:

Foreign Scan  (actual time=7.840..7.856 rows=671 loops=1)
  Remote SQL: SELECT ua.user_id, ua.value FROM user_attribute ua JOIN user_entity u ON ua.user_id = u.id JOIN user_group_membership ugm ON ugm.user_id = u.id JOIN keycloak_group g ON g.id = ugm.group_id JOIN tenant r ON u.tenant_id= r.id WHERE ua.name = 'attributeA' AND u.enabled AND u.service_account_client_link IS NULL AND r.name = 'tenantA' AND (g.name = 'groupA' OR g.parent_group = $1)

Only one remote query is executed, and it returns 671 rows. Tenant and group scoping occur on the remote server. There is no nested loop or repeated remote scan. The final runtime dropped to about 25 ms.

Why it improved

Fewer rows crossing the network. The old plan fetched 80k attribute rows and filtered them locally. The new plan fetched only the 671 scoped rows.
No repeated remote calls. The old plan executed 416 remote scans of user_entity. The new plan performs one joined remote query.
Less local work. Because the join and filtering happen remotely, the local side no longer hashes or filters large sets.

Key takeaway

If you see a Foreign Scan with a high loops count or a Remote SQL that doesn’t contain your filters and joins, you’re leaving performance on the table. Merging filters and joins into a single remote query (subject to shippability rules) can yield large improvements: here, runtime fell from about 166 ms to about 25 ms, more than a 6x reduction.

Checklist and Troubleshooting Guide

The following steps summarize how to approach FDW performance tuning:

Inspect the Remote SQL. Always run EXPLAIN (VERBOSE) and look at what is being sent to the remote. If your predicates are missing, the FDW isn't pushing them down.
Check loops. If loops is greater than 1 on a Foreign Scan, you are paying for repeated remote calls. Rewrite the query or reorder the joins to make the foreign scan run once.
Make predicates shippable. Replace volatile functions with constants or parameters. Ensure operators and functions are built‑in or explicitly allow‑listed via the extensions option [2].
Align types and collations. Use the same data types and collations on both sides to avoid semantic mismatches [1].
Push joins to the same server. Consolidate tables on one foreign server if possible. Joins across servers cannot be pushed down [6].
Use use_remote_estimate when planning seems off. Enabling remote estimates can improve join order selection [3].
Tune fetch_size and costs if your queries transfer many rows. A bigger fetch_size reduces round trips; adjusting fdw_startup_cost and fdw_tuple_cost influences the planner [3].
Analyze foreign tables if you rely on local cost estimates. Keep in mind that stats can get stale quickly [3].
Monitor both servers. Use pg_stat_statements on local and remote servers to see how often remote queries run and how long they take.
Test version upgrades. Each major release improves FDW pushdown semantics (for example, aggregates in 10 [7], ORDER/LIMIT in 12 [8]). Retest after upgrading.

Case Study Takeaways

Querying remote data with PostgreSQL’s postgres_fdw can be fast and convenient if you respect the underlying mechanics. Pushdown is the difference between streaming a trickle of relevant rows and hauling an ocean of data across the network. It isn't simply a matter of moving CPU cycles – it changes how much data moves, how many network round trips occur, and how much your local server has to do.

The rules may seem restrictive – use only immutable functions, avoid cross‑server joins, align types and collations – but they exist to preserve correctness while enabling optimization.

By reading EXPLAIN from the bottom up, inspecting the Remote SQL, and understanding the shippability rules, you can spot slow patterns quickly. Armed with tuning knobs like fetch_size and use_remote_estimate, and a willingness to rewrite queries to make joins and filters pushable, you can often achieve dramatic performance gains without touching your hardware.

This case study shows that rewriting a query so that it runs as a single joined remote query reduced runtime from around 166 ms to 25 ms. That sort of improvement is not rare. It’s what happens when you treat FDW queries as distributed queries rather than local queries in disguise.

The next time you debug a slow FDW query, remember this handbook. Check the Remote SQL. Count the loops. Ask yourself: “Am I doing the work close to the data, or am I bringing the data to the work?” Adjust accordingly, and you'll write queries that make the most of Postgres's federated capabilities while keeping your latency in check.

This section closes the case study loop and summarizes exactly what changed in the plan and why it produced a large end-to-end win. The following sections of the handbook turn that single win into a repeatable method: how Postgres determines what is shippable, how to quickly read FDW plans, which operations and versions matter, and how to debug common failure modes that prevent pushdown.

Advanced Operations: A Deeper Dive into Shippability

The previous sections introduced the basic rules around what can be pushed to the remote and why. To really make sense of those rules, you need to see how they play out on the operations you use every day.

This section walks through filters, joins, aggregates, ordering and limits, DISTINCT queries, and window functions in more detail. By the end, you should have a mental map of which operations to trust and which to double‑check when reading your plans.

Filters and simple predicates

WHERE clauses matter more than you think

When you specify WHERE attribute = 'value' on a foreign table, the FDW will happily transmit that predicate to the remote server as long as the comparison uses built‑in types and immutable operators. For example:

WHERE id = 42 is fine
WHERE lower(username) = 'hamdaan' is fine because lower() is a built‑in immutable function (collation rules still apply)
WHERE created_at >= now() - interval '7 days' is not shippable because now() is not immutable (PostgreSQL marks it STABLE)

When such a predicate cannot be pushed, the FDW will fetch every row that matches all the shippable predicates and apply the rest locally. That means that a seemingly innocuous call to now() can blow up your network traffic.

The lesson is simple: compute non-immutable values such as now() up front (in your application or in a CTE) and reference them as constants in the query against the foreign table.
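
One way to hand the value over as a parameter from plain SQL is a prepared statement. The sketch below assumes a hypothetical foreign table remote_events with a created_at timestamptz column. The argument to EXECUTE is evaluated locally before the statement is planned, so the foreign scan sees a constant or a parameter rather than a call to now().

```sql
-- remote_events is a hypothetical foreign table used only for illustration
PREPARE recent_events(timestamptz) AS
  SELECT id, created_at
  FROM remote_events
  WHERE created_at >= $1;

-- The cutoff is computed once, locally, and passed in as the parameter
EXPLAIN (VERBOSE)
EXECUTE recent_events(now() - interval '7 days');
```

If the rewrite worked, the Remote SQL line in the plan should contain the created_at comparison. If it does not, the filter is still being applied locally.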

Complex expressions are not automatically unsafe

Suppose you have WHERE (status = 'active' AND (age BETWEEN 18 AND 29 OR age > 65)). This entire expression is shippable because it uses built‑in boolean logic, simple comparisons, and immutable operators. The FDW will deparse it into remote SQL and forward it. You only need to worry when one of the subexpressions introduces a function or operator that the FDW doesn’t recognize or cannot safely assume exists on the remote.

A good heuristic is: if you can express your filter using only simple comparisons, boolean logic, and built‑in functions, pushdown should work. When in doubt, check the Remote SQL.

Array and JSON operators

Modern Postgres makes heavy use of array and JSON functions. Many of these functions, like the array overlap operator && used in the case study, are built‑in and can be shipped. Core JSON functions such as jsonb_path_query (built in since PostgreSQL 12) fall into the same group when they are immutable. Functions and operators that come from extensions are different.

If your filter uses one of these, ensure that the extension is available and allow‑listed on the foreign server. Otherwise, the FDW will fetch rows and perform the JSON logic locally. This is rarely what you want when dealing with large JSON columns.

Joins: the good, the bad, and the ugly

Same‑server joins are your friend

If you join multiple foreign tables that are all defined on the same foreign server and user mapping, and if the join condition uses only shippable expressions, then the FDW can generate a single remote join. This is the ideal case.

For example, joining orders and customers on orders.customer_id = customers.id is pushable, as long as both tables reside on the same foreign server. The remote planner will use its own statistics and indexes to plan the join, and the local server will simply iterate through the result. Postgres 9.6 and later support this pattern [6].

Cross‑server joins break pushdown

If you attempt to join two foreign tables that live on different servers (or even on the same remote server but with different user mappings), postgres_fdw will fetch the tables separately and join them locally. This is almost always slower than pushing the join down, because you can end up transferring large portions of both tables (only the filters that apply to each table individually are pushed).

The FDW design team chose not to support cross‑server joins because there is no portable way to tell two remote servers to cooperate on a join. Your options are: replicate one table on the other server, materialize the smaller table locally before joining, or restructure the query to filter aggressively on each side before joining locally.

Mixed local/foreign joins are tricky

Joining a local table to a foreign table cannot be pushed down, for straightforward reasons: the remote server has no access to your local data. A common pattern that triggers repeated remote calls looks like this:

SELECT u.id, a.value
FROM users u
LEFT JOIN user_attribute a
  ON a.user_id = u.id AND a.name = 'favorite_color';

If users is a local table and user_attribute is foreign, the plan may use a nested loop: for each local u, it executes a remote lookup in user_attribute to retrieve attributes.

The fix is to flip the query: retrieve all relevant rows from user_attribute in one remote scan, then join them locally. Or, if possible, create a small temporary table on the remote side with your u.id values, perform the join entirely remotely, and then fetch the results.

Join conditions matter

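Even when joining two foreign tables on the same server, an unshippable join condition will force the join to be local. For example, a join condition that calls a user-defined function, or uses an operator from an extension that is not allow‑listed, is not pushable because the FDW cannot assume it exists or behaves identically on the remote. Pattern-matching conditions such as JOIN ON textcol ILIKE '%foo%' deserve a check too, since collation rules can keep them local.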

If you need case‑insensitive matching, consider lowercasing both sides: LOWER(textcol) = 'foo' (assuming the remote server has the lower() function available and allowed). Similarly, joining on a cast expression (for example, JOIN ON CAST(a.id AS text) = b.text_id) can block pushdown. Define your columns with matching types instead.

Aggregates and grouping

Aggregates are where the data movement story shines. When you can push down a GROUP BY and aggregate functions like COUNT, SUM, AVG, or MAX, you reduce the result set to just the aggregated rows. This can be a difference of several orders of magnitude.

Postgres 10 introduced aggregate pushdown [7]. But not all aggregates are equal:

Simple aggregates such as COUNT(*), SUM(col), AVG(col), MIN(col), and MAX(col) are shippable when applied to shippable expressions. Even COUNT(DISTINCT col) is often shippable, because the remote can deduplicate before counting. The FDW will wrap the aggregate in a remote query and return just the aggregated row.

If you see a GroupAggregate node on the local side, check whether all involved columns and functions are shippable. If they are, ensure that the join conditions above are also pushable.

Filtered aggregates such as COUNT(*) FILTER (WHERE x > 5) or SUM(col) FILTER (WHERE status = 'active') are often pushable. As long as the filter expression is shippable, the FDW can include the FILTER clause in the remote aggregate.

User‑defined aggregates are rarely pushable. If you have a custom aggregate function, the FDW will not assume that it exists or behaves the same on the remote server. Even if you install the function on both servers, postgres_fdw won't push it unless the function is in an allow‑listed extension.

Grouping sets and rollups are not currently pushable. When you write GROUP BY GROUPING SETS (...) or ROLLUP(...), Postgres will compute the grouping locally even if the underlying scan is remote.

If you need complex rollups, consider performing them in two steps: push down the initial grouping to the remote server to reduce rows, then perform the rollup locally.
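
The two-step approach could be sketched as follows. The foreign table remote_sales and its columns (region, product, amount) are hypothetical. The inner query is a plain GROUP BY, which is a candidate for aggregate pushdown, and the outer ROLLUP runs locally over the already-reduced rows.

```sql
-- remote_sales is a hypothetical foreign table (region, product, amount)
SELECT region, product, sum(total_amount) AS total_amount
FROM (
  -- Step 1: plain GROUP BY, a candidate for aggregate pushdown
  SELECT region, product, sum(amount) AS total_amount
  FROM remote_sales
  GROUP BY region, product
) per_group
-- Step 2: rollup over the reduced rows, computed locally
GROUP BY ROLLUP (region, product);
```

This works for aggregates that can be re-aggregated, such as sum and count. An average would need the sum and count carried separately and divided in the outer query.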

ORDER BY, LIMIT, and DISTINCT

Ordering and limiting rows may seem like purely cosmetic features, but they affect how much data is transferred. If the remote can sort and limit, the local server only receives the top N rows. If it cannot, the local server must sort everything.

Postgres 12 expanded the cases where ORDER BY and LIMIT are pushed down [8]. Here are guidelines:

Single foreign scan with simple sort: If your query selects from one foreign table and sorts by a shippable expression (for example, ORDER BY created_at DESC), the FDW will include ORDER BY in Remote SQL. It will also push down LIMIT and OFFSET. This is ideal because the remote server does the sort and sends only the top rows.
Sort after join: If you sort after joining two foreign tables on the same server, and the join and sort expressions are shippable, the FDW may push both down. But if the sort requires columns from the local side or from a different remote server, the FDW cannot push it down.
Sort after aggregation: Sorting aggregated results is often pushable as long as the aggregate itself is pushable. But when grouping occurs locally, the sort remains local.
DISTINCT is conceptually similar to GROUP BY, but postgres_fdw does not necessarily treat it the same way. The deduplication step may stay on the local server even when the scan or join beneath it is pushed down. The refactored case study plan, for example, shows a Remote SQL without DISTINCT. If deduplication needs to happen remotely, a GROUP BY over shippable expressions is the more dependable form. DISTINCT ON has different semantics from plain DISTINCT, so check the Remote SQL before relying on either.

Window functions

Window functions (for example, ROW_NUMBER() OVER (PARTITION BY ...), RANK(), LAG(), LEAD()) rely on ordering and partitioning across rows.

Postgres has not yet taught postgres_fdw how to push window functions. When you see a WindowAgg node in your plan, it’s almost always local. The FDW will fetch the rows, and the local server will sort, partition, and compute the window. If you need to run window functions on remote data, plan to transfer the data locally.

Version‑specific quirks

The exact pushdown capabilities vary by release. When planning migrations or deciding whether to rely on a pushdown behavior, check the release notes:

9.6: first version to support pushdown of joins and sorts, and remote updates and deletes.
10: introduced aggregate pushdown [7], significantly reducing network use for GROUP BY queries.
11: improved partition pruning and join ordering for foreign tables.
12: expanded ORDER BY and LIMIT pushdown [8].
15: added pushdown for simple CASE expressions and additional built‑in functions.
17: continues to expand shippable constructs. Always test on your target version because subtle improvements can change what the FDW can ship.

Common Anti‑Patterns and How to Avoid Them

Everyone has run into FDW queries that seemed reasonable but turned out to be bottlenecks. Here are a few of the most common mistakes and how to correct them. These examples are deliberately simplified so you can adapt them to your schema.

Using volatile functions in predicates

Anti‑pattern:

SELECT *
FROM audit_logs
WHERE event_ts >= now() - interval '1 day';

now() is not an immutable function (PostgreSQL marks it STABLE), so the FDW refuses to push this predicate. It pulls all rows from audit_logs and filters them locally.

Better:

SELECT *
FROM audit_logs
WHERE event_ts >= $1;

Compute $1 (a timestamp) in your application or upstream query. Or compute it once in a CTE:

WITH cutoff AS (SELECT now() - interval '1 day' AS ts) SELECT * FROM audit_logs, cutoff WHERE event_ts >= cutoff.ts;

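When the cutoff reaches the FDW as a constant or parameter, the predicate can be pushed. On PostgreSQL 12 and later the planner may inline a CTE like this one, which puts now() back into the predicate, so confirm with EXPLAIN (VERBOSE) that the Remote SQL contains the filter.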
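
Another form that is sometimes used is an uncorrelated scalar subquery. The assumption here is that the planner evaluates the subquery once as an InitPlan and hands the result to the foreign scan as a parameter. Treat that as something to verify on your version rather than a guarantee.

```sql
-- Same filter, with the cutoff wrapped in an uncorrelated scalar subquery
EXPLAIN (VERBOSE)
SELECT *
FROM audit_logs
WHERE event_ts >= (SELECT now() - interval '1 day');
```

If this works on your setup, the Remote SQL should show the event_ts comparison against a parameter placeholder. If the comparison is missing, fall back to passing the timestamp from the application.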

Joining local and foreign data first

Anti‑pattern:

SELECT u.email, ua.value
FROM users u
LEFT JOIN user_attribute ua ON u.id = ua.user_id AND ua.name = 'favorite_movie';

This uses a local table (users) to drive a join to a foreign table (user_attribute). If the planner picks a nested loop and users has 10,000 rows, the remote server receives 10,000 individual queries. Each call fetches one or zero rows from user_attribute.

Better:

-- Fetch all favorite movies remotely and join locally
WITH remote_movies AS (
  SELECT ua.user_id, ua.value
  FROM user_attribute ua
  WHERE ua.name = 'favorite_movie'
)
SELECT u.email, rm.value
FROM users u
LEFT JOIN remote_movies rm ON u.id = rm.user_id;

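Now the FDW issues one query to fetch all relevant attributes, and the join is done locally in one pass. On PostgreSQL 12 and later the planner may inline the CTE, so if the nested loop returns, declare it as WITH remote_movies AS MATERIALIZED (...) to keep the remote fetch as a single step.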

Cross‑server joins without materialization

Anti‑pattern:

SELECT *
FROM remote_db1.orders o
JOIN remote_db2.customers c ON o.customer_id = c.id;

This is not pushable because the two tables are on different foreign servers. Postgres will fetch orders and customers separately and join them locally. If orders has 1 million rows and customers has 50,000 rows, you will transfer 1.05 million rows.

Better: Replicate or materialize one side on the other server (or locally) before joining. For example, create a materialized view m_customers on remote_db1 containing just the id and name of the customers you need, then join orders and m_customers on the same server. Alternatively, copy customers into a temporary table on the local server and join there.
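
The local temporary table variant might look like the sketch below. The table name tmp_customers is hypothetical, and the name column is assumed from the description above.

```sql
-- Copy only the needed customer columns into a local temporary table
CREATE TEMP TABLE tmp_customers AS
SELECT id, name
FROM remote_db2.customers;

-- Give the planner statistics for the new table
ANALYZE tmp_customers;

-- Join against the local copy instead of the second foreign server
SELECT o.*, c.name
FROM remote_db1.orders o
JOIN tmp_customers c ON o.customer_id = c.id;
```

Note that orders is still fetched from remote_db1 in this version, so it pays to add shippable filters on orders as well.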

Complex expressions on join keys

Anti‑pattern:

SELECT *
FROM remote_table a
JOIN remote_table b ON CAST(a.key AS text) = b.key_text;

Casting a numeric key to text prevents pushdown. The remote server cannot use indexes and must return both tables. The local server performs the join and cast.

Better: Align your schemas so that the join columns use the same type. If you cannot change the schema, create a computed column on the remote server with the appropriate type and use it in the join.

Ignoring collation and type mismatches

Anti‑pattern:

SELECT *
FROM remote_table
WHERE citext_col = 'abc';

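If citext is not listed in the foreign server's extensions option, the FDW treats the citext comparison as unshippable and filters locally. Listing it is only safe when the remote server has a compatible citext installed, because otherwise the comparison semantics could differ. This appears harmless until you see the plan and realize all rows were fetched.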

Better: Install the same extensions and collations on the remote server, or convert the column to a base type like text on both sides.

Extending Tuning: Calibrating Cost Models

Earlier, we discussed fetch_size, use_remote_estimate, and the cost knobs. This section expands on how to use them strategically.

Balancing fetch size and memory

fetch_size controls how many rows the FDW asks for in each round trip [9]. Think of it as the batch size. The default (100) works well for small result sets. If you expect to retrieve tens of thousands of rows, a higher fetch size reduces the overhead of many network requests. But there are trade‑offs:

Memory consumption: Each foreign scan buffers rows until they are consumed. A huge fetch size (for example, 10,000) may allocate more memory than you expect, especially when multiple scans run concurrently. Monitor memory usage as you increase this setting.
Latency hiding: If network latency is high, overlapping network requests with local processing can hide some latency. But postgres_fdw does not pipeline multiple fetches – it waits for one batch before requesting the next. This means that a larger batch size reduces the number of waits, but cannot overlap them. If you operate across data centers, consider using a connection pooler or caching layer instead of just increasing fetch_size.

Remote estimates vs. local estimates

The planner uses statistics to estimate how many rows each node will produce, which in turn influences join order. When use_remote_estimate is false (the default), the planner guesses based on local stats collected by ANALYZE on the foreign table. This can be wrong if the remote table has a different distribution than the local sample, or if the table has changed since the last ANALYZE.

Setting use_remote_estimate to true instructs the FDW to run EXPLAIN on the remote server during planning to obtain row counts and cost estimates [3]. This can improve join ordering, especially when joining multiple foreign tables or mixing local and foreign tables. The downside is increased planning time because each remote estimate runs an extra query.

In practice:

Enable use_remote_estimate on queries with complex joins where the planner picks obviously wrong join orders. If enabling it improves the plan, consider leaving it on for that server or table.
Use ANALYZE on foreign tables periodically if your remote data is relatively static. This populates local stats and can avoid the overhead of remote estimates.
Don’t enable use_remote_estimate indiscriminately on simple lookups. The cost of the additional remote round trips may outweigh the benefit.

Tuning cost parameters

fdw_startup_cost and fdw_tuple_cost control how much the planner thinks it costs to start a foreign scan and fetch each row [3]. If these are too low, the planner may choose a nested loop that generates many small remote calls. If they are too high, the planner might avoid remote scans even when they are efficient.

You can adjust these parameters based on empirical measurement:

Increase fdw_startup_cost to discourage the planner from using nested loops that call the remote table repeatedly. You might set it to the average cost of a remote round trip.
Increase fdw_tuple_cost if network bandwidth is limited or expensive. This indicates to the planner that each remote row incurs higher fetch costs than a local row. The planner will prefer plans that filter early on the remote side.

Always adjust these settings gradually and observe the effect on the plan. Keep separate settings per foreign server if network conditions differ.
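
A gradual adjustment could be sketched like this. The numbers are illustrative only, the server name foreign_server follows the earlier examples, and the query reuses the users and user_attribute join from the mixed local/foreign section.

```sql
-- Values are illustrative, not recommendations.
-- Use SET instead of ADD if the options are already defined on the server.
ALTER SERVER foreign_server
  OPTIONS (ADD fdw_startup_cost '250', ADD fdw_tuple_cost '0.05');

-- Re-check the plan after each change: did the join strategy or loops count move?
EXPLAIN (VERBOSE)
SELECT u.id, a.value
FROM users u
LEFT JOIN user_attribute a
  ON a.user_id = u.id AND a.name = 'favorite_color';

-- Revert to the built-in defaults if the change did not help
ALTER SERVER foreign_server
  OPTIONS (DROP fdw_startup_cost, DROP fdw_tuple_cost);
```

Changing one option at a time makes it easier to attribute a plan change to a specific setting.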

When to analyze foreign tables

Running ANALYZE on a foreign table collects sample statistics by pulling a subset of rows from the remote server. This helps the planner estimate row counts when use_remote_estimate is off. It does not influence index choice on the remote side, which the remote planner makes on its own. You should analyze foreign tables when:

The remote table is large and static, and you want accurate local estimates without the overhead of remote estimates.
You have just defined a foreign table, and the default stats are empty.
You changed the extensions allow‑list to enable more pushdown and want the planner to see the effect.

Conversely, if the remote data changes constantly, ANALYZE results will quickly become stale. In that case, rely on use_remote_estimate instead.

Further Case Studies and Practical Examples

The Keycloak coverage example is not the only place where pushdown matters. The following scenarios illustrate other patterns you may encounter.

Reporting on a sharded logging system

Imagine you store application logs across multiple shards, each a separate Postgres database. You want to produce a report of the number of error logs per service per day.

A naïve approach might combine the raw rows from all shards and aggregate them in one place:

SELECT service, date_trunc('day', log_time) AS day, COUNT(*)
FROM (
  SELECT service, log_time FROM shard1.logs
  UNION ALL
  SELECT service, log_time FROM shard2.logs
  ...
) all_logs
GROUP BY service, day;

This approach will fetch all log rows to the local server and aggregate them locally. A better solution is to push the grouping to each shard:

SELECT service, day, sum(count) AS total
FROM (
  SELECT 1 AS shard, service, date_trunc('day', log_time) AS day, COUNT(*) AS count
  FROM shard1.logs
  WHERE log_time >= $1 AND log_time < $2
  GROUP BY service, day
  UNION ALL
  SELECT 2 AS shard, service, date_trunc('day', log_time) AS day, COUNT(*) AS count
  FROM shard2.logs
  WHERE log_time >= $1 AND log_time < $2
  GROUP BY service, day
  ...
) x
GROUP BY service, day;

Here, each foreign server returns a small set of aggregated rows instead of raw logs. The outer aggregation sums across shards. This pattern generalizes: push grouping and filtering to the remote side, then combine locally.
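With more than a handful of shards, writing each UNION ALL branch by hand gets error-prone. The sketch below generates the inner branches of the query above from a list of shard numbers. The helper name buildShardUnionSql is hypothetical, and it assumes every shard is visible locally as a schema named shardN containing a logs table, as in the example.

```typescript
// Hypothetical helper: builds the per-shard branches of the report query.
// Assumes each shard is visible locally as schema "shard<N>" with a "logs" table.
function buildShardUnionSql(shardIds: number[]): string {
  if (shardIds.length === 0) {
    throw new Error('At least one shard is required');
  }
  const branches = shardIds.map((id) => {
    // Only positive integers are interpolated into the SQL text.
    if (!Number.isInteger(id) || id < 1) {
      throw new Error(`Invalid shard id: ${id}`);
    }
    // Each branch filters and groups on its own shard, so the remote side
    // has the chance to return aggregated rows instead of raw logs.
    return [
      `  SELECT ${id} AS shard, service, date_trunc('day', log_time) AS day, COUNT(*) AS count`,
      `  FROM shard${id}.logs`,
      `  WHERE log_time >= $1 AND log_time < $2`,
      `  GROUP BY service, day`,
    ].join('\n');
  });
  return branches.join('\n  UNION ALL\n');
}

console.log(buildShardUnionSql([1, 2, 3]));
```

The time range stays as the $1 and $2 placeholders, so the generated text can be wrapped in the outer query and executed with bound parameters. Check the resulting plan with EXPLAIN (VERBOSE) to confirm that each branch is actually aggregated remotely.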

Combining remote and local data for analytics

Suppose you have a local table users and a remote table orders. You want to compute the average order amount per user segment. A naïve query might look like:

SELECT u.segment, AVG(o.amount)
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.segment;

This is a local join driving a remote nested loop. The better approach is to aggregate orders remotely by user_id and join on the small result:

WITH remote_totals AS (
  SELECT user_id, SUM(amount) AS total, COUNT(*) AS n
  FROM orders
  GROUP BY user_id
)
SELECT u.segment, SUM(rt.total) / SUM(rt.n) AS avg_amount
FROM users u
JOIN remote_totals rt ON u.id = rt.user_id
GROUP BY u.segment;

This pushes the heavy aggregation to the remote and transfers only one row per user. The local join then groups by segment. As with other examples, the key is to reduce remote rows before they cross the network.
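One detail is worth checking whenever you pre-aggregate per user: an average of per-user averages is not the same number as the average over all orders, because it weights every user equally regardless of how many orders they placed. Carrying the sum and the count through and dividing at the end reproduces AVG(o.amount). The sketch below uses made-up numbers and hypothetical function names to show the difference.

```typescript
// Made-up per-user aggregates, shaped like the rows of remote_totals.
interface UserTotal {
  userId: number;
  total: number; // SUM(amount) for the user
  n: number;     // COUNT(*) for the user
}

// Average over all orders: combined sum divided by combined count.
function averageOrderAmount(rows: UserTotal[]): number {
  const total = rows.reduce((acc, r) => acc + r.total, 0);
  const n = rows.reduce((acc, r) => acc + r.n, 0);
  return total / n;
}

// Average of per-user averages: each user counts once, however many orders they have.
function averageOfUserAverages(rows: UserTotal[]): number {
  const sum = rows.reduce((acc, r) => acc + r.total / r.n, 0);
  return sum / rows.length;
}

const segment: UserTotal[] = [
  { userId: 1, total: 60, n: 3 },  // three orders: 10, 20, 30
  { userId: 2, total: 100, n: 1 }, // one order: 100
];

console.log(averageOrderAmount(segment));    // 40
console.log(averageOfUserAverages(segment)); // 60
```

Either number can be the one you want, but only the first matches the original per-order average.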

Avoiding pushdown for correctness

There are legitimate cases where you should prevent pushdown because of semantic differences. Postgres allows you to do this by adding OFFSET 0 to a subquery or by wrapping the foreign table in a MATERIALIZED CTE (since PostgreSQL 12, a plain CTE that is referenced once can be inlined, so the keyword matters).

For example, if a built‑in function behaves differently on the remote due to a version mismatch, you can force local evaluation:

WITH local_eval AS MATERIALIZED (SELECT * FROM remote_table)  -- materialized CTE prevents pushdown
SELECT *
FROM local_eval
WHERE some_complex_expression(local_eval.col) > 0;

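Alternatively, a WHERE clause like random() < 0.1 will not push down because random() is volatile, so you don't need to force it. For other expressions, adding OFFSET 0 to a subquery is a simple hack: it acts as an optimization fence, so outer predicates are not pushed into the subquery:

SELECT * FROM (SELECT * FROM remote_table OFFSET 0) AS t
WHERE some_complex_expression(t.col) > 0;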

Knowing how to disable pushdown intentionally helps you debug. If a query returns different results when pushdown occurs, suspect type/collation mismatches or remote session settings [4].

Monitoring, Diagnostics, and Regression Testing

Monitoring doesn't end at counting remote rows. To make pushdown reliable in production, you need to set up mechanisms to detect regressions and gather evidence when performance changes.

Automate EXPLAIN regression tests

In addition to unit tests and integration tests, you can add tests that assert the shape of your plans. For instance, if a mission‑critical report must always push down a WHERE clause, you can write a test that runs EXPLAIN (VERBOSE) and checks that the Remote SQL contains the filter. You might even run EXPLAIN (ANALYZE, VERBOSE), parse loops for the Foreign Scan node, and assert that it is 1. When a developer inadvertently adds a non‑immutable function or changes a join, the test will fail. This is akin to snapshot testing for SQL.
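A minimal sketch of such a check, written as plain functions over the text output of EXPLAIN (VERBOSE). The function names are hypothetical, and the sample plan is hand-written in the general shape of postgres_fdw output rather than captured from a real server; in a real test you would feed in the rows returned by running EXPLAIN through your database client.

```typescript
// Pull the text of every "Remote SQL:" line out of EXPLAIN (VERBOSE) output.
function extractRemoteSql(planText: string): string[] {
  const marker = 'Remote SQL:';
  return planText
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith(marker))
    .map((line) => line.slice(marker.length).trim());
}

// Throws when no remote statement contains the expected fragment.
function assertPushedDown(planText: string, fragment: string): void {
  const remoteSql = extractRemoteSql(planText);
  if (!remoteSql.some((sql) => sql.includes(fragment))) {
    throw new Error(`Remote SQL does not contain "${fragment}"`);
  }
}

// Hand-written sample plan (an assumption, not real server output).
const samplePlan = [
  'Foreign Scan on public.orders  (cost=100.00..150.00 rows=10 width=16)',
  '  Output: id, amount',
  '  Remote SQL: SELECT id, amount FROM public.orders WHERE ((amount > 100))',
].join('\n');

assertPushedDown(samplePlan, 'WHERE ((amount > 100))'); // passes silently
```

Matching on a small fragment rather than the whole statement keeps the test from failing on harmless changes in how the remote SQL is formatted between versions.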

Monitor pg_stat_statements across servers

Enable pg_stat_statements on both the local and remote servers. On the local side, track the total time, planning time, and rows for each FDW query. On the remote side, track which queries are being executed.

Look for outliers: a query whose remote calls spike or whose average remote rows jump from hundreds to thousands. Those are early signs of pushdown failure.
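The outlier check can be automated by comparing two snapshots of the statistics. The sketch below assumes you have already read the calls and rows columns of pg_stat_statements for the same queries in two comparable time windows; the function name findRowSpikes and the default factor of 10 are illustrative choices, not recommendations.

```typescript
// One row per query, shaped after the calls and rows columns of pg_stat_statements.
interface StatementStats {
  queryId: string;
  calls: number;
  rows: number;
}

// Returns the ids of queries whose average rows per call grew by at least `factor`.
function findRowSpikes(
  baseline: StatementStats[],
  current: StatementStats[],
  factor = 10
): string[] {
  const before = new Map<string, number>();
  for (const s of baseline) {
    if (s.calls > 0) before.set(s.queryId, s.rows / s.calls);
  }
  const spikes: string[] = [];
  for (const s of current) {
    const old = before.get(s.queryId);
    // Skip queries with no usable baseline or no calls in the current window.
    if (old === undefined || old === 0 || s.calls === 0) continue;
    if (s.rows / s.calls >= old * factor) spikes.push(s.queryId);
  }
  return spikes;
}

const lastWeek: StatementStats[] = [
  { queryId: 'q1', calls: 100, rows: 20000 }, // 200 rows per call
  { queryId: 'q2', calls: 50, rows: 500 },    // 10 rows per call
];
const thisWeek: StatementStats[] = [
  { queryId: 'q1', calls: 100, rows: 300000 }, // 3000 rows per call
  { queryId: 'q2', calls: 60, rows: 660 },     // 11 rows per call
];

console.log(findRowSpikes(lastWeek, thisWeek)); // [ 'q1' ]
```

Because the counters in pg_stat_statements are cumulative, each snapshot should be a delta between two reads (or follow a reset) for the comparison to be meaningful.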

Log remote SQL with auto_explain

Setting auto_explain.log_min_duration (for example, to 500ms) causes Postgres to automatically log the plans of statements that run longer than that threshold. Combine this with auto_explain.log_verbose = true and auto_explain.log_nested_statements = true to capture remote SQL as well. When a federated query slows down, the log will show you exactly what remote SQL was executed and how often. This is invaluable in production, where you cannot always run EXPLAIN interactively.

Use connection pooling and prepared statements

postgres_fdw caches its remote connections within each local session, keyed on the user mapping, and reuses them between queries in that session. You can also use connection pooling at the network level (for example, pgbouncer or pgcat).

Keeping connections warm reduces the startup cost, as captured by fdw_startup_cost. Meanwhile, preparing statements on the remote server (via PREPARE and EXECUTE) can save parse time when the same remote SQL is executed frequently. postgres_fdw itself uses prepared statements for remote row modifications, while parameterized scans are sent through cursors.

Regression testing after version upgrades

Many major Postgres releases bring improvements to postgres_fdw pushdown. But new releases also change planner heuristics and remote SQL generation. After an upgrade, rerun your key queries with EXPLAIN (VERBOSE), compare the Remote SQL, and benchmark them.

In some cases, a release may push down something previously local, revealing a latent type mismatch or a function difference. In other cases, pushdown may be withheld due to a new rule. Don’t assume that an upgrade automatically improves performance – test it.

Extended Guidelines for Advanced DBAs

To close this handbook, here are consolidated guidelines distilled from the previous sections. They go beyond simple bullet points to capture nuances. Keep them handy for reference or print them out for your team.

Respect the FDW safety model. Immutable functions and built‑in operators are your friends. Anything outside that scope must be explicitly allowed or evaluated locally. Understand which items belong to each category and plan accordingly.
Always read the Remote SQL. Don’t trust your intuition about what is being pushed down. The Remote SQL string is the only source of truth. It indicates whether a predicate, join, sort, or limit operation is occurring remotely. It also shows parameter placeholders (for example, $1) that correspond to values passed from the local plan.
Reduce before you fetch. The network is the highest cost. If the remote can reduce rows through filtering, grouping, or limiting, let it. If it cannot, structure your query to enable it. Avoid queries that require pulling large raw tables and processing them locally.
Beware of join order. The planner sometimes chooses a nested loop with a foreign table as the inner side, resulting in repeated remote calls. Examine loops: if you see a high number, consider rewriting the query or adjusting cost parameters.
Use CTEs strategically. A CTE can isolate remote scans and let you control whether they are materialized once or inlined. Use MATERIALIZED to avoid repeated remote scans when a CTE is referenced multiple times. Use NOT MATERIALIZED to allow optimizations across CTE boundaries.
Instrument, monitor, iterate. Good FDW performance is not a one‑off fix. Monitor queries and plans. Use tests to catch regressions. Adjust tuning knobs and indexes as your data or workload changes. Document your reasoning so others can understand why a particular plan is expected.
Educate your team. Federated queries invite subtle bugs and performance traps. Share the high‑level rules – immutable functions only, cross‑server joins are local, always check remote SQL – so engineers write safer queries by default. A 30‑minute training can save hours of debugging later.

Bringing it All Together

This handbook has covered a lot of ground: from the high‑level principle that pushdown is about data movement, to the nitty‑gritty of join conditions and tuning knobs, to troubleshooting steps and case studies. It is intentionally opinionated and personal: these are the patterns and pitfalls encountered in real systems, not abstract guidelines. By sharing specific examples, I hoped to make the rules memorable and show how they interplay with actual workloads.

The goal is not just to tell you what to do, but to show you how to think and problem solve: review the plan, trace data movement, and determine whether the query is doing the heavy work in the right place.

That thinking process, practiced enough times, becomes second nature. When you write a new query, you'll automatically consider whether your predicates are immutable, whether the join can be shipped, and whether you are about to trigger an N+1 pattern. When you review plans, you'll start from the Foreign Scan nodes and remote SQL, not the top‑level node. When you tune, you'll know which knobs to twist and in which order.

Keep experimenting. Use the examples here as starting points. Try different structures in a test environment and measure the difference. The more you play with pushdown, the more comfortable you'll become with its constraints and superpowers.

If this handbook helps you avoid one performance incident or saves you from shipping a broken query, it has done its job. Enjoy exploring the federated world of Postgres.

References

[1] [2] [3] [4] [5] [6] [9] [10] PostgreSQL: Documentation: 18: F.38. postgres_fdw – access data stored in external PostgreSQL servers (https://www.postgresql.org/docs/current/postgres-fdw.html)

[7] PostgreSQL: Release Notes (https://www.postgresql.org/docs/release/10.0/)

[8] PostgreSQL: Release Notes (https://www.postgresql.org/docs/release/12.0/)

How to Build a Payroll System with Express and Monnify Using Background Jobs

David Aniebo — Wed, 14 Jan 2026 21:41:50 +0000

Processing payroll payments is an important operation for any business. When you need to pay employees simultaneously, you can't afford to have your server hang, get blocking errors, or time out while waiting for each payment to complete.

Building a payroll system is an excellent way to practice real-world backend development skills. Unlike simple CRUD applications, payroll systems require you to think about:

Asynchronous processing: When you need to pay hundreds of employees, processing payments synchronously can cause your server to time out. Background jobs with Bull and Redis allow you to handle long-running operations without blocking your API.
Payment gateway integration: Working with payment APIs like Monnify teaches you how to handle external service integrations, authentication flows, webhook verification, and error handling in production systems.
Data consistency: Payroll systems need to maintain accurate records. You'll learn about transaction reconciliation, idempotency, and how to handle partial failures gracefully.
Production-ready patterns: This tutorial covers patterns you'll use in real applications: job queues, webhook handlers, database migrations, and proper error handling.

Whether you're building a fintech application, an HR system, or just want to understand how payment processing works, the concepts in this tutorial will serve you well. The combination of Express, TypeScript, background jobs, and payment APIs represents a common stack in modern backend development.

In this tutorial, you’ll learn how to build a production-grade payroll engine using Express.js, TypeScript, and Monnify's payment API. You'll implement background job processing with Bull and Redis to handle bulk disbursements efficiently.

By the end, you will have a fully functional payroll system that can:

Manage employee records with bank account details
Create and process payroll batches
Process bulk payments using Monnify's disbursement API
Handle payment status updates via webhooks
Reconcile transactions to ensure data consistency

Prerequisites
Project Architecture Overview
Setting Up the Project
Configuring Docker for PostgreSQL and Redis
Setting Up the Database
Creating Database Models
PayrollItemModel
Building the Monnify Client
Implementing Background Job Processing
Creating the API Controllers
Setting Up Webhook Handlers
Wiring Up Routes
Testing the System
Setting Up Webhooks for Production
Conclusion
- Key Takeaways
- References:

Prerequisites

Before you begin, make sure you have the following:

Node.js (v18 or higher)
Docker and Docker Compose installed
A Monnify merchant account with API credentials
Basic knowledge of TypeScript and Express.js
Familiarity with REST APIs

You'll also need to obtain these credentials from your Monnify dashboard:

API Key
Secret Key
Contract Code
Webhook Secret (for verifying webhook signatures)

Project Architecture Overview

Here's how the payroll system works:

Key components:

Express API: A minimal and flexible Node.js web framework that handles HTTP requests for managing employees and payrolls. Express provides routing, middleware support, and makes it easy to build RESTful APIs.
Bull Queue: A Redis-based queue library for Node.js that processes payroll jobs asynchronously in the background. Bull handles job retries, scheduling, and provides a reliable way to process long-running tasks without blocking your main application thread.
Redis: An in-memory data structure store that serves as the backend for Bull queues. Redis stores job data, manages job states (pending, active, completed, failed), and enables distributed job processing across multiple workers.
PostgreSQL: A relational database that persists employee records, payrolls, and payment items. PostgreSQL's ACID compliance ensures data integrity, and its support for complex queries makes it ideal for financial applications.
Monnify API: A payment gateway service that handles actual money transfers to employee bank accounts. Monnify provides bulk disbursement capabilities, allowing you to process multiple payments in a single API call, which is essential for payroll systems.
Webhooks: HTTP callbacks that receive real-time payment status updates from Monnify. When a payment completes or fails, Monnify sends a webhook to your server, allowing you to update your database immediately without polling.

Setting Up the Project

In this section, we'll initialize a new Node.js project with TypeScript and install all the necessary dependencies. We'll configure TypeScript for type safety and set up the project structure that will support our payroll system.

First, create a new directory and initialize your project:

mkdir monnify-payroll-system
cd monnify-payroll-system
npm init -y

Next, install the required dependencies:

npm install express cors helmet dotenv axios bull ioredis pg swagger-jsdoc swagger-ui-express express-validator

Then install the development dependencies:

npm install -D typescript ts-node-dev @types/node @types/express @types/cors @types/pg @types/bull

Create a tsconfig.json file:

{
  "compilerOptions": {
    "target": "ES2020",
    "module": "commonjs",
    "lib": ["ES2020"],
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "resolveJsonModule": true,
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true
  },
  "include": ["src/**/*", "scripts/**/*"],
  "exclude": ["node_modules", "dist"]
}

And update your package.json scripts:

{
  "scripts": {
    "build": "tsc",
    "start": "node dist/index.js",
    "dev": "ts-node-dev --respawn --transpile-only src/index.ts",
    "migrate": "ts-node scripts/run-migrations.ts"
  }
}

Now, create a .env file for your environment variables. The Monnify values come from your Monnify dashboard:

# Server
PORT=3008
NODE_ENV=development

# Database
DB_HOST=localhost
DB_PORT=5433
DB_NAME=payroll_db
DB_USER=payroll_user
DB_PASSWORD=payroll_password

# Redis
REDIS_HOST=localhost
REDIS_PORT=6379

# Monnify
MONNIFY_API_KEY=your_api_key
MONNIFY_SECRET_KEY=your_secret_key
MONNIFY_BASE_URL=https://sandbox.monnify.com
MONNIFY_CONTRACT_CODE=your_contract_code
MONNIFY_WEBHOOK_SECRET=your_webhook_secret

Configuring Docker for PostgreSQL and Redis

Before we can start building our application, we need to set up the infrastructure services: PostgreSQL for data persistence and Redis for job queue management. Using Docker Compose makes it easy to run these services locally with a single command. This approach ensures consistency across development environments and simplifies deployment.

Create a docker-compose.yml file to set up PostgreSQL and Redis:

services:
  postgres:
    image: postgres:15-alpine
    container_name: monnify-payroll-db
    environment:
      POSTGRES_USER: payroll_user
      POSTGRES_PASSWORD: payroll_password
      POSTGRES_DB: payroll_db
    ports:
      - '5433:5432'
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ['CMD-SHELL', 'pg_isready -U payroll_user']
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    container_name: monnify-payroll-redis
    ports:
      - '6379:6379'
    volumes:
      - redis_data:/data
    healthcheck:
      test: ['CMD', 'redis-cli', 'ping']
      interval: 10s
      timeout: 5s
      retries: 5

volumes:
  postgres_data:
  redis_data:

Start the services:

docker-compose up -d

And verify that both services are running:

docker-compose ps

Setting Up the Database

Now we'll configure the database connection and create the necessary tables. We'll use a connection pool for efficient database access and create migration files to set up our schema. This approach ensures our database structure is version-controlled and can be easily reproduced.

Create the src/config/database.ts file to configure the PostgreSQL connection:

import { Pool, PoolConfig } from 'pg';
import dotenv from 'dotenv';

dotenv.config();

const dbName = (process.env.DB_NAME || 'payroll_db').trim();
if (!dbName) {
  throw new Error('Database name (DB_NAME) must be set and cannot be empty');
}

const config: PoolConfig = {
  host: process.env.DB_HOST || 'localhost',
  port: parseInt(process.env.DB_PORT || '5433'),
  database: dbName,
  user: process.env.DB_USER || 'payroll_user',
  password: process.env.DB_PASSWORD || 'payroll_password',
  max: 20,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
};

export const pool = new Pool(config);

pool.on('error', (err: Error) => {
  console.error('Unexpected error on idle client', err);
  process.exit(-1);
});

// Thin wrapper around pool.query that logs failures before rethrowing them
export const query = async (text: string, params?: any[]) => {
  try {
    const res = await pool.query(text, params);
    return res;
  } catch (error) {
    console.error('Database query error', error);
    throw error;
  }
};

Now create the migration files. First, create a migrations folder:

mkdir migrations

Then create migrations/001_create_employees_table.sql:

-- Create employees table
CREATE TABLE IF NOT EXISTS employees (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  email VARCHAR(255) NOT NULL UNIQUE,
  employee_id VARCHAR(100) NOT NULL UNIQUE,
  salary DECIMAL(15, 2) NOT NULL,
  account_number VARCHAR(50) NOT NULL,
  bank_code VARCHAR(20) NOT NULL,
  bank_name VARCHAR(255),
  is_active BOOLEAN DEFAULT true,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create indexes for faster lookups
CREATE INDEX IF NOT EXISTS idx_employees_employee_id ON employees(employee_id);
CREATE INDEX IF NOT EXISTS idx_employees_is_active ON employees(is_active);

Now, create migrations/002_create_payrolls_table.sql:

-- Create payrolls table
CREATE TABLE IF NOT EXISTS payrolls (
  id SERIAL PRIMARY KEY,
  payroll_period VARCHAR(100) NOT NULL,
  total_amount DECIMAL(15, 2) NOT NULL,
  total_employees INTEGER NOT NULL,
  status VARCHAR(50) NOT NULL DEFAULT 'pending',
  processed_count INTEGER DEFAULT 0,
  failed_count INTEGER DEFAULT 0,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  processed_at TIMESTAMP
);

-- Create indexes for faster queries
CREATE INDEX IF NOT EXISTS idx_payrolls_status ON payrolls(status);
CREATE INDEX IF NOT EXISTS idx_payrolls_period ON payrolls(payroll_period);

And next, create migrations/003_create_payroll_items_table.sql:

-- Create payroll_items table
CREATE TABLE IF NOT EXISTS payroll_items (
  id SERIAL PRIMARY KEY,
  payroll_id INTEGER NOT NULL REFERENCES payrolls(id) ON DELETE CASCADE,
  employee_id INTEGER NOT NULL REFERENCES employees(id) ON DELETE CASCADE,
  amount DECIMAL(15, 2) NOT NULL,
  status VARCHAR(50) NOT NULL DEFAULT 'pending',
  transaction_reference VARCHAR(255),
  error_message TEXT,
  processed_at TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create indexes for faster queries
CREATE INDEX IF NOT EXISTS idx_payroll_items_payroll_id ON payroll_items(payroll_id);
CREATE INDEX IF NOT EXISTS idx_payroll_items_employee_id ON payroll_items(employee_id);
CREATE INDEX IF NOT EXISTS idx_payroll_items_status ON payroll_items(status);
CREATE INDEX IF NOT EXISTS idx_payroll_items_transaction_ref ON payroll_items(transaction_reference);

Then create a migration runner script at scripts/run-migrations.ts:

import fs from 'fs';
import path from 'path';
import { pool } from '../src/config/database';

async function runMigrations() {
  const migrationsDir = path.join(__dirname, '../migrations');
  const files = fs.readdirSync(migrationsDir).sort();

  for (const file of files) {
    if (file.endsWith('.sql')) {
      console.log(`Running migration: ${file}`);
      const sql = fs.readFileSync(path.join(migrationsDir, file), 'utf-8');
      await pool.query(sql);
      console.log(`Completed: ${file}`);
    }
  }

  console.log('All migrations completed');
  await pool.end();
}

runMigrations().catch((err) => {
  console.error('Migration failed:', err);
  process.exit(1);
});

Run the migrations:

npm run migrate

Creating Database Models

In this section, we'll create the data access layer for our payroll system. Models encapsulate all database operations, providing a clean interface for the rest of the application. We'll build two main models: one for managing employees and another for handling payrolls and payroll items.

For each model, I’ll first explain its purpose and key methods, then show you the complete code implementation. This approach helps you understand what each model does before you implement it.

Employee Model

The EmployeeModel serves as the data-access layer for employee records. It handles creating, reading, updating, and soft-deleting employees. The model includes automatic employee ID generation (format: EMP001, EMP002, and so on) and ensures that each employee has the banking details required for payroll disbursement.

Start by creating a new file at src/models/employee.ts where we’ll implement all employee-related database logic.

After creating the file, import a shared database query helper to execute parameterized SQL safely.

import { query } from '../config/database';

This keeps raw SQL isolated from controllers and ensures protection against SQL injection.

Employee Data Structure (`Employee` Interface)

Next, we’ll define the employee interface.

The Employee interface represents a row in the employees database table and captures both operational and audit fields. It includes identifying fields (id, employee_id), personal fields (name, email), payroll fields (salary), banking details (account_number, bank_code, bank_name), operational state (is_active), and timestamps (created_at, updated_at). The is_active flag is used to support soft deletion and employee deactivation without permanently removing historical payroll relationships.

export interface Employee {
  id: number;
  name: string;
  email: string;
  employee_id: string;
  salary: number;
  account_number: string;
  bank_code: string;
  bank_name: string;
  is_active: boolean;
  created_at: Date;
  updated_at: Date;
}

Now, we’ll define the CreateEmployeeInput interface, which represents the expected payload for creating an employee. It includes required fields such as name, email, salary, and bank details.

export interface CreateEmployeeInput {
  name: string;
  email: string;
  employee_id?: string;
  salary: number;
  account_number: string;
  bank_code: string;
  bank_name: string;
}

The employee_id field is optional, allowing the system to auto-generate a unique identifier if one is not provided. This flexibility supports both automated workflows and manual HR data imports.

Employee Model Class

Next, we’ll define the EmployeeModel class.

export class EmployeeModel {
  // Class methods will go here
}

This class encapsulates all database operations related to employee records. It centralizes logic for creating, retrieving, updating, and deleting employees, as well as generating unique sequential employee IDs.

Auto-Generating Employee IDs (`generateEmployeeId`)

We start by creating the generateEmployeeId method inside the EmployeeModel class.

  private static async generateEmployeeId(): Promise<string> {
    // Get the highest existing employee_id number that matches EMP### pattern
    const result = await query(
      `SELECT employee_id FROM employees
       WHERE employee_id LIKE 'EMP%'
       AND LENGTH(employee_id) >= 4
       AND SUBSTRING(employee_id FROM 4) ~ '^[0-9]+$'
       ORDER BY CAST(SUBSTRING(employee_id FROM 4) AS INTEGER) DESC
       LIMIT 1`
    );

    if (result.rows.length === 0) {
      return 'EMP001';
    }

    const lastId = result.rows[0].employee_id;
    const numberPart = lastId.substring(3);
    const lastNumber = parseInt(numberPart, 10);

    if (isNaN(lastNumber)) {
      return 'EMP001';
    }

    const nextNumber = lastNumber + 1;
    // Format as EMP001, EMP002, etc. (3 digits minimum)
    return `EMP${nextNumber.toString().padStart(3, '0')}`;
  }

The private generateEmployeeId method generates a unique employee identifier in a readable sequential format such as EMP001, EMP002, and so on. It queries the database for the highest existing employee ID that matches the expected pattern (EMP prefix followed by numeric digits), orders by the numeric suffix in descending order, and increments the latest number to produce the next ID.

If no matching record exists, it starts from EMP001. The method also protects against malformed data by returning EMP001 if parsing fails.

Finally, it ensures formatting consistency by padding the number portion to at least three digits using padStart(3, '0'), which keeps IDs aligned and easy to sort visually.
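To see the increment-and-pad step in isolation, here is a small sketch of the same logic without the database lookup. The function name nextEmployeeId is hypothetical; it takes the highest existing ID (or null when there is none) and returns the next one.

```typescript
// Pure version of the increment-and-pad step, without the database lookup.
// lastId is the highest existing ID, or null when no EMP### rows exist.
function nextEmployeeId(lastId: string | null): string {
  if (lastId === null) return 'EMP001';
  // Drop the "EMP" prefix and parse the numeric suffix.
  const lastNumber = parseInt(lastId.substring(3), 10);
  if (isNaN(lastNumber)) return 'EMP001';
  // padStart only adds zeros; numbers with more than three digits are kept whole.
  return `EMP${(lastNumber + 1).toString().padStart(3, '0')}`;
}

console.log(nextEmployeeId(null));     // EMP001
console.log(nextEmployeeId('EMP009')); // EMP010
console.log(nextEmployeeId('EMP999')); // EMP1000
```

The last call shows that padding sets a minimum width, not a maximum: once the counter passes 999, the IDs simply grow to four digits.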

Creating an Employee (`create`)

Next, we’ll define the create method, which inserts a new employee record into the database. If the caller does not supply an employee_id, the method generates one automatically using generateEmployeeId.

  static async create(data: CreateEmployeeInput): Promise<Employee> {
    // Auto-generate employee_id if not provided
    let employeeId = data.employee_id;
    if (!employeeId) {
      employeeId = await this.generateEmployeeId();
    }

    // Check if employee_id already exists (if manually provided)
    if (data.employee_id) {
      const existing = await this.findByEmployeeId(data.employee_id);
      if (existing) {
        throw new Error('Employee ID already exists');
      }
    }

    const result = await query(
      `INSERT INTO employees (name, email, employee_id, salary, account_number, bank_code, bank_name)
       VALUES ($1, $2, $3, $4, $5, $6, $7)
       RETURNING *`,
      [
        data.name,
        data.email,
        employeeId,
        data.salary,
        data.account_number,
        data.bank_code,
        data.bank_name,
      ]
    );
    return result.rows[0];
  }

Here’s what’s happening in the code:

If an employee_id is manually provided, it validates uniqueness by checking if that ID already exists among active employees, preventing collisions and ensuring each employee has a distinct identifier. After validations, the employee is inserted into the employees table and the new record is returned. This method ensures every employee created has complete banking details required for payroll disbursement.
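Because the lookup only sees active employees, and because two requests can race between the check and the INSERT, the UNIQUE constraints on employee_id and email remain the final safeguard. A minimal sketch of recognizing that failure, assuming the thrown error carries the PostgreSQL SQLSTATE in a code property (as errors raised by the pg client do). The helper name isUniqueViolation is hypothetical.

```typescript
// SQLSTATE 23505 is PostgreSQL's unique_violation error code.
const UNIQUE_VIOLATION = '23505';

// Hypothetical helper: true when an unknown error looks like a unique-constraint failure.
function isUniqueViolation(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { code?: unknown }).code === UNIQUE_VIOLATION
  );
}

console.log(isUniqueViolation({ code: '23505' }));    // true
console.log(isUniqueViolation(new Error('timeout'))); // false
```

A controller could wrap the call to EmployeeModel.create in try/catch and answer with a conflict response when isUniqueViolation returns true, instead of surfacing a generic server error.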

Retrieving All Active Employees (`findAll`)

The findAll method fetches all active employees from the database.

static async findAll(): Promise<Employee[]> {
  const result = await query(
    'SELECT * FROM employees WHERE is_active = true ORDER BY created_at DESC'
  );
  return result.rows;
}

The findAll method returns all active employees (is_active = true) ordered by most recent creation date. This behavior supports common UI patterns such as HR dashboards and payroll selection screens, where only active employees should be visible by default.

Retrieving an Employee by Database ID (`findById`)

The findById method retrieves a single employee by the internal numeric primary key (id).

static async findById(id: number): Promise<Employee | null> {
  const result = await query('SELECT * FROM employees WHERE id = $1', [id]);
  return result.rows[0] || null;
}

If the employee does not exist, it returns null. This is typically used for internal operations such as payroll processing, updates, or admin detail views.

Retrieving an Employee by Employee Identifier (`findByEmployeeId`)

The findByEmployeeId method retrieves an active employee using the business-friendly employee_id (for example, EMP014).

static async findByEmployeeId(employeeId: string): Promise<Employee | null> {
    const result = await query(
      'SELECT * FROM employees WHERE employee_id = $1 AND is_active = true',
      [employeeId]
    );
    return result.rows[0] || null;
}

The method filters by is_active = true to prevent selecting deactivated employees during operations like payroll runs or HR searches.

Updating an Employee (`update`)

The update method supports partial updates by dynamically building the SQL SET clause based on the fields present in the update payload. It iterates through the provided properties, includes only those with defined values, and passes every value as a query parameter so that values are never concatenated into the SQL text.

static async update(
    id: number,
    data: Partial<CreateEmployeeInput>
  ): Promise<Employee> {
    const fields: string[] = [];
    const values: any[] = [];
    let paramCount = 1;

    // Build dynamic update query based on provided fields
    Object.entries(data).forEach(([key, value]) => {
      if (value !== undefined) {
        fields.push(`${key} = $${paramCount}`);
        values.push(value);
        paramCount++;
      }
    });

    if (fields.length === 0) {
      throw new Error('No fields to update');
    }

    // Always update the updated_at timestamp
    fields.push(`updated_at = $${paramCount}`);
    values.push(new Date());
    values.push(id);

    const result = await query(
      `UPDATE employees SET ${fields.join(', ')} WHERE id = $${
        paramCount + 1
      } RETURNING *`,
      values
    );
    return result.rows[0];
  }

Here’s what’s happening in the code:

If no fields are provided, it throws an error to avoid performing a meaningless update. It also explicitly updates the updated_at timestamp to ensure accurate audit tracking. Finally, it returns the updated database record, making it easy for controllers to respond with the latest employee state.
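One caveat: placeholders protect the values, but the column names in the SET clause are interpolated from the object's keys. If the payload comes straight from a request body, it is safer to check the keys against a known list first. A minimal sketch of that idea follows; buildEmployeeUpdate and UPDATABLE_COLUMNS are hypothetical names, and the list of columns is one possible choice based on the employees table.

```typescript
// Columns callers may change; anything else is rejected before SQL is built.
const UPDATABLE_COLUMNS = [
  'name', 'email', 'salary', 'account_number', 'bank_code', 'bank_name',
] as const;

interface UpdateQuery {
  text: string;
  values: unknown[];
}

function buildEmployeeUpdate(id: number, data: Record<string, unknown>): UpdateQuery {
  const allowed: readonly string[] = UPDATABLE_COLUMNS;
  const fields: string[] = [];
  const values: unknown[] = [];

  for (const [key, value] of Object.entries(data)) {
    if (value === undefined) continue;
    // Reject unknown keys so they can never reach the SQL text.
    if (!allowed.includes(key)) {
      throw new Error(`Column not updatable: ${key}`);
    }
    values.push(value);
    fields.push(`${key} = $${values.length}`);
  }
  if (fields.length === 0) {
    throw new Error('No fields to update');
  }

  // The id is always the last parameter.
  values.push(id);
  return {
    text: `UPDATE employees SET ${fields.join(', ')}, updated_at = NOW() WHERE id = $${values.length} RETURNING *`,
    values,
  };
}

const built = buildEmployeeUpdate(7, { salary: 5000, name: 'Ada' });
console.log(built.text);
console.log(built.values);
```

The returned text and values can be passed straight to the query helper. Using NOW() for updated_at instead of a bound Date is a small simplification in this sketch, not a requirement.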

Soft-Deleting an Employee (`delete`)

Finally, instead of permanently removing the employee record, the delete method performs a soft delete by setting is_active = false and updating the updated_at timestamp.

This approach preserves historical payroll references and audit trails while excluding inactive employees from standard queries like findAll. It’s especially important in payroll systems where historical payment records must remain valid and traceable even after an employee leaves the organization.

static async delete(id: number): Promise<void> {
  await query(
    'UPDATE employees SET is_active = false, updated_at = NOW() WHERE id = $1',
    [id]
  );
}

Key features of the employee model:

Auto-generates sequential employee IDs if not provided
Validates employee ID uniqueness
Supports soft deletion to preserve historical payroll records
Provides methods for finding employees by database ID or employee identifier

Payroll Model

The PayrollModel manages payroll batches and individual payroll items. A payroll represents a single payment cycle (for example, "December 2024"), while payroll items represent individual employee payments within that cycle. This separation allows us to track the status of each payment independently.

Key features:

Creates payroll batches with automatic calculation of totals
Supports filtering employees for selective payroll runs
Tracks status at both batch and item levels
Provides methods for reconciliation and status updates

Let's implement the Payroll Model.

We’ll begin by creating a new file at src/models/payroll.ts, where we’ll implement the payroll models that encapsulate payroll batch creation, employee payment tracking, and payroll status management.

First, import a shared database query helper to execute parameterized SQL safely.

import { query } from '../config/database';

This keeps raw SQL isolated from controllers and ensures protection against SQL injection.

Payroll Status Lifecycle

Next, we’ll define the PayrollStatus enum.

export enum PayrollStatus {
 PENDING = 'pending',
 PROCESSING = 'processing',
 COMPLETED = 'completed',
 FAILED = 'failed',
 PARTIALLY_COMPLETED = 'partially_completed',
}

The PayrollStatus enum defines all possible states for both payroll batches and individual payroll items:

PENDING – Created but not yet processed
PROCESSING – Currently being processed by background workers
COMPLETED – Successfully processed
FAILED – Processing failed
PARTIALLY_COMPLETED – Some items succeeded while others failed

Payroll Entity

With the payroll status lifecycle defined, we can now define the payroll entity.

The Payroll interface represents a single payroll run, such as a monthly salary payout. It stores aggregate and audit information including the payroll period, total salary amount, total number of employees, processing status, counts of successful and failed payments, and timestamps for creation, updates, and completion.

Add the following interface to src/models/payroll.ts:

export interface Payroll {
 id: number;
 payroll_period: string;
 total_amount: number;
 total_employees: number;
 status: PayrollStatus;
 processed_count: number;
 failed_count: number;
 created_at: Date;
 updated_at: Date;
 processed_at?: Date;
}

This entity acts as the parent record for all employee payments within a payroll cycle and is used to track overall payroll progress and outcomes.

Payroll Item Entity

Next, we’ll define the payroll item entity, which represents an individual employee payment within a payroll.

The PayrollItem tracks the employee being paid, the payment amount, its processing status, any transaction reference returned by the payment provider, error messages in case of failure, and relevant timestamps.

Add the following interface just below the Payroll interface:

export interface PayrollItem {
  id: number;
  payroll_id: number;
  employee_id: number;
  amount: number;
  status: PayrollStatus;
  transaction_reference?: string;
  error_message?: string;
  processed_at?: Date;
  created_at: Date;
  updated_at: Date;
}

This structure allows individual employee payments to be retried, audited, or reconciled independently without affecting the rest of the payroll batch.

Creating a Payroll (`PayrollModel.create`)

Now that we’ve defined the Payroll and PayrollItem entities, we can move on to creating a payroll batch.

To keep our business logic organized, we’ll introduce a PayrollModel class. This class will be responsible for creating payroll records, calculating aggregates, and generating individual payroll items for each employee.

Before writing the model itself, let’s define the input required to create a payroll.

Add the following interface below the PayrollItem interface:

export interface CreatePayrollInput {
  payroll_period: string;
  employee_ids?: number[];
}

payroll_period identifies the payroll run (for example, 2025-01)
employee_ids is optional and allows us to run payroll for a subset of employees, enabling selective payouts or retries

Next, create the PayrollModel class. This class will encapsulate all payroll-related database operations.

export class PayrollModel {
// Payroll model class methods will go here
}

We’ll start by implementing the create method, which is responsible for creating a new payroll batch.

The method performs the following steps:

Optionally filters employees if specific employee IDs are provided
Calculates aggregate payroll statistics from the employees table
Creates a payroll record with a PENDING status
Creates a payroll item for each eligible employee

Here’s the implementation:

  static async create(data: CreatePayrollInput): Promise<Payroll> {
    let employeeFilter = '';
    let queryParams: any[] = [];

    // Build filter for selective employee payrolls
    if (data.employee_ids && data.employee_ids.length > 0) {
      employeeFilter = `AND id = ANY($1::int[])`;
      queryParams = [data.employee_ids];
    }

    // Calculate aggregate statistics from employees table
    const employeeStats = await query(
      `SELECT COUNT(*) as count, COALESCE(SUM(salary), 0) as total
       FROM employees
       WHERE is_active = true ${employeeFilter}`,
      queryParams
    );

    const totalEmployees = parseInt(employeeStats.rows[0].count);
    const totalAmount = parseFloat(employeeStats.rows[0].total);

    // Create the payroll record
    const result = await query(
      `INSERT INTO payrolls (payroll_period, total_amount, total_employees, status)
       VALUES ($1, $2, $3, $4)
       RETURNING *`,
      [data.payroll_period, totalAmount, totalEmployees, PayrollStatus.PENDING]
    );

    const payroll = result.rows[0];

    // Create payroll items for each employee
    // Each item starts with PENDING status and will be processed asynchronously
    const employees = await query(
      `SELECT id, salary FROM employees WHERE is_active = true ${employeeFilter}`,
      queryParams
    );

    for (const employee of employees.rows) {
      await query(
        `INSERT INTO payroll_items (payroll_id, employee_id, amount, status)
         VALUES ($1, $2, $3, $4)`,
        [payroll.id, employee.id, employee.salary, PayrollStatus.PENDING]
      );
    }

    return payroll;
  }

The payroll creation process begins by determining which employees should be included. If specific employee IDs are provided, only those employees are selected – otherwise, all active employees are included. This allows the system to support both full payroll runs and selective payouts.

Next, the system calculates aggregate payroll statistics directly from the employees table by counting eligible employees and summing their salaries. These values are stored in a new payroll record created with a PENDING status.

Finally, a payroll item is generated for each eligible employee, with each item also initialized in a PENDING state. This design separates payroll setup from payment execution, allowing employee payments to be processed asynchronously and in parallel in later stages of the system.
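The implementation above inserts payroll items one row at a time. For larger payrolls, a single multi-row INSERT is a common alternative. The following is a minimal sketch of how such a statement could be built, not part of the original model. The helper name buildPayrollItemsInsert and the EmployeeRow shape are hypothetical, and the function only produces the SQL text and parameter list.

```typescript
interface EmployeeRow {
  id: number;
  salary: number;
}

interface BuiltInsert {
  text: string;
  values: unknown[];
}

// Hypothetical helper: one INSERT covering every employee in the batch.
function buildPayrollItemsInsert(
  payrollId: number,
  employees: EmployeeRow[],
  status: string = 'pending'
): BuiltInsert {
  if (employees.length === 0) {
    throw new Error('No employees to insert');
  }

  const values: unknown[] = [];
  const tuples = employees.map((employee) => {
    // Four parameters per row, numbered consecutively
    values.push(payrollId, employee.id, employee.salary, status);
    const n = values.length;
    return `($${n - 3}, $${n - 2}, $${n - 1}, $${n})`;
  });

  return {
    text:
      'INSERT INTO payroll_items (payroll_id, employee_id, amount, status) ' +
      `VALUES ${tuples.join(', ')}`,
    values,
  };
}
```

Running the payroll insert and the item insert inside one database transaction would additionally keep a failed run from leaving a payroll without its items, though that depends on how your query helper exposes transactions.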

Fetching Payroll Records

After creating payrolls, we often need to retrieve them for administrative dashboards, reporting, and audit trails.

The PayrollModel provides two simple methods:

findById – Retrieves a single payroll by its unique identifier
findAll – Retrieves all payroll records, ordered by creation date (newest first)

These methods should be added below the create method in the PayrollModel class:

static async findById(id: number): Promise<Payroll | null> {
  const result = await query('SELECT * FROM payrolls WHERE id = $1', [id]);
  return result.rows[0] || null;
}

static async findAll(): Promise<Payroll[]> {
  const result = await query(
    'SELECT * FROM payrolls ORDER BY created_at DESC'
  );
  return result.rows;
}

The findById method retrieves a single payroll by its identifier, while findAll returns all payroll records ordered by creation date.

Updating Payroll Status (`PayrollModel.updateStatus`)

Once payroll processing begins, we need a way to track the overall status of a payroll batch. The updateStatus method updates the payroll record with:

The current status (PENDING, PROCESSING, COMPLETED, and so on)
Optional counts of processed and failed payments
A processed_at timestamp automatically set for terminal states (COMPLETED or PARTIALLY_COMPLETED)

Add the following method below the fetch methods in your PayrollModel class:


  static async updateStatus(
    id: number,
    status: PayrollStatus,
    processedCount?: number,
    failedCount?: number
  ): Promise<Payroll> {
    const updates: string[] = ['status = $2', 'updated_at = NOW()'];
    const values: any[] = [id, status];

    // Dynamically add processed_count if provided
    if (processedCount !== undefined) {
      updates.push(`processed_count = $${values.length + 1}`);
      values.push(processedCount);
    }

    // Dynamically add failed_count if provided
    if (failedCount !== undefined) {
      updates.push(`failed_count = $${values.length + 1}`);
      values.push(failedCount);
    }

    // Set processed_at timestamp for terminal states
    if (
      status === PayrollStatus.COMPLETED ||
      status === PayrollStatus.PARTIALLY_COMPLETED
    ) {
      updates.push(`processed_at = NOW()`);
    }

    const result = await query(
      `UPDATE payrolls SET ${updates.join(', ')} WHERE id = $1 RETURNING *`,
      values
    );
    return result.rows[0];
  }
}

As payroll processing progresses, this method updates the overall payroll status along with optional counts of processed and failed payments. When a payroll reaches a terminal state such as COMPLETED or PARTIALLY_COMPLETED, the system automatically records a completion timestamp. This ensures accurate tracking of payroll execution and supports reconciliation workflows.

PayrollItemModel

After handling payroll batches with PayrollModel, we need a way to manage individual employee payments. This is where the PayrollItemModel comes in. It encapsulates the database operations for payroll items: fetching records enriched with employee details and updating their status.

Start by adding a new class below PayrollModel:

export class PayrollItemModel {
  // Methods will go here
}

Fetching Payroll Items (`PayrollItemModel.findByPayrollId`)

Often, we want to get all payroll items for a specific payroll batch, for example to display them on a dashboard or process them in a background worker.

The findByPayrollId method does exactly that. It retrieves all payroll items associated with a specific payroll and enriches them with employee details such as name, bank account number, and bank information through a database join.

static async findByPayrollId(payrollId: number): Promise<PayrollItem[]> {
  const result = await query(
    `SELECT
       pi.id, pi.payroll_id, pi.employee_id, pi.amount, pi.status,
       pi.transaction_reference, pi.error_message, pi.processed_at,
       pi.created_at, pi.updated_at,
       e.name as employee_name, e.employee_id as employee_identifier,
       e.account_number, e.bank_code, e.bank_name
     FROM payroll_items pi
     JOIN employees e ON pi.employee_id = e.id
     WHERE pi.payroll_id = $1
     ORDER BY pi.created_at`,
    [payrollId]
  );
  // Normalize numeric fields from PostgreSQL (which returns them as strings)
  return result.rows.map((row) => ({
    ...row,
    employee_id: parseInt(row.employee_id, 10),
    id: parseInt(row.id, 10),
    payroll_id: parseInt(row.payroll_id, 10),
    amount: parseFloat(row.amount),
  }));
}

Here’s what’s happening in the code:

We use a JOIN with the employees table so each payroll item includes the employee’s name, account number, and bank information.
Some numeric fields may come as strings, so we convert them to proper JavaScript numbers (parseInt / parseFloat) for accurate calculations and display.
The results are ordered by creation date, which helps when rendering items in a UI or processing them sequentially.

This method makes it easy to work with all items in a payroll batch while keeping the data enriched and consistent.

Fetching a Single Payroll Item (`PayrollItemModel.findById`)

Sometimes, you need to look at one specific employee’s payment (for example, to retry a failed transaction or investigate an issue). The findById method helps in fetching a single payroll item along with the employee’s details, so you have everything you need in one place.

Here’s how we implement it:

static async findById(id: number): Promise<PayrollItem | null> {
  const result = await query(
    `SELECT
       pi.id, pi.payroll_id, pi.employee_id, pi.amount, pi.status,
       pi.transaction_reference, pi.error_message, pi.processed_at,
       pi.created_at, pi.updated_at,
       e.name as employee_name, e.employee_id as employee_identifier,
       e.account_number, e.bank_code, e.bank_name
     FROM payroll_items pi
     JOIN employees e ON pi.employee_id = e.id
     WHERE pi.id = $1`,
    [id]
  );

  if (result.rows.length === 0) return null;

  const row = result.rows[0];

  // Convert numeric fields to proper JavaScript numbers for easier calculations and display
  return {
    ...row,
    employee_id: parseInt(row.employee_id, 10),
    id: parseInt(row.id, 10),
    payroll_id: parseInt(row.payroll_id, 10),
    amount: parseFloat(row.amount),
  };
}

Here’s what’s happening in the code:

We use a JOIN with the employees table to include employee info such as name, account number, and bank details.
If the ID doesn’t exist, the method returns null so you can handle missing records gracefully.
Numeric fields are converted to JavaScript numbers, making it easy to calculate totals or display amounts in the UI.

This method ensures that whenever you need a single payroll item, you get a complete, ready-to-use record.
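Both fetch methods repeat the same numeric conversion, so it could be pulled into a shared helper. Here is a minimal sketch, assuming a hypothetical function named normalizePayrollItemRow and loosely typed rows. It is an illustration of the idea rather than the tutorial's own code.

```typescript
// Raw row as the driver may return it: numeric columns can arrive as strings.
type RawRow = Record<string, unknown>;

type NormalizedItem = RawRow & {
  id: number;
  payroll_id: number;
  employee_id: number;
  amount: number;
};

// Hypothetical shared helper for findByPayrollId and findById.
function normalizePayrollItemRow(row: RawRow): NormalizedItem {
  return {
    ...row, // keep joined fields such as employee_name untouched
    id: Number.parseInt(String(row.id), 10),
    payroll_id: Number.parseInt(String(row.payroll_id), 10),
    employee_id: Number.parseInt(String(row.employee_id), 10),
    amount: Number.parseFloat(String(row.amount)),
  };
}

// Example row with string-typed numeric columns
const sampleItem = normalizePayrollItemRow({
  id: '12',
  payroll_id: '3',
  employee_id: '45',
  amount: '1500.50',
  employee_name: 'Ada',
});
```

With a helper like this, findByPayrollId could map every row through it and findById could call it on the single row it returns.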

Updating Payroll Item Status (`PayrollItemModel.updateStatus`)

As individual employee payments are processed, this method updates the payroll item’s status, stores transaction references from external payment providers, captures error messages on failure, and timestamps completion or failure events. This fine-grained tracking enables reliable retries, audits, and reconciliation with external payment systems.

Here’s the implementation:

static async updateStatus(
  id: number,
  status: PayrollStatus,
  transactionReference?: string,
  errorMessage?: string
): Promise<PayrollItem> {
  const updates: string[] = ['status = $2', 'updated_at = NOW()'];
  const values: any[] = [id, status];

  // Add transaction reference if provided (from Monnify API response)
  if (transactionReference) {
    updates.push(`transaction_reference = $${values.length + 1}`);
    values.push(transactionReference);
  }

  // Add error message if provided (from failed payment)
  if (errorMessage) {
    updates.push(`error_message = $${values.length + 1}`);
    values.push(errorMessage);
  }

  // Set processed_at timestamp for terminal states
  if (status === PayrollStatus.COMPLETED || status === PayrollStatus.FAILED) {
    updates.push(`processed_at = NOW()`);
  }

  const result = await query(
    `UPDATE payroll_items SET ${updates.join(', ')} WHERE id = $1 RETURNING *`,
    values
  );
  return result.rows[0];
}
}

Here’s what’s happening in the code:

We build a dynamic SET clause to update only the fields provided – status is required, while transaction reference and error message are optional.
Terminal states (COMPLETED or FAILED) trigger an automatic timestamp on processed_at, so we always know when a payment finished.
The method returns the updated payroll item, ready for further processing, logging, or UI display.

This ensures each payroll item is tracked accurately throughout its lifecycle, enabling reliable retries and complete audit trails.

Overall Payroll Flow

In this payroll flow, an administrator creates a payroll batch, which generates individual payroll items for each employee. The payroll is then handed off to background workers that process each payroll item independently via an external payment service.

As each payment succeeds or fails, payroll items are updated accordingly. Once processing concludes, the payroll batch status is updated to reflect the overall outcome, whether fully successful, partially successful, or failed.

This architecture provides scalability, resilience, and strong auditability for real-world payroll systems.
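The final step, choosing the batch status from the item outcomes, can be sketched as a small pure function. This is an illustrative sketch and not code from the tutorial: the name deriveBatchStatus is hypothetical, and it assumes the two counts are the numbers of successful and failed items once processing has finished. The enum is repeated here only so the snippet runs on its own.

```typescript
// Mirrors the PayrollStatus enum from src/models/payroll.ts
enum PayrollStatus {
  PENDING = 'pending',
  PROCESSING = 'processing',
  COMPLETED = 'completed',
  FAILED = 'failed',
  PARTIALLY_COMPLETED = 'partially_completed',
}

// Hypothetical helper: map item outcomes to the overall batch status.
function deriveBatchStatus(
  successCount: number,
  failedCount: number
): PayrollStatus {
  if (successCount < 0 || failedCount < 0) {
    throw new Error('Counts must not be negative');
  }
  if (successCount + failedCount === 0) {
    throw new Error('No items have finished processing');
  }
  if (failedCount === 0) return PayrollStatus.COMPLETED; // every item paid
  if (successCount === 0) return PayrollStatus.FAILED; // nothing was paid
  return PayrollStatus.PARTIALLY_COMPLETED; // mixed outcome
}
```

The result could be passed to PayrollModel.updateStatus together with the two counts.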

Building the Monnify Client

The Monnify client is the bridge between our application and Monnify's payment API. In this section, we'll build a reusable client that handles authentication, bulk payroll disbursements, OTP authorization, transaction tracking, and balance checks. It manages API tokens automatically and hides all Monnify-specific logic behind a single class, so background jobs, payroll processors, and service layers can use it through one clean interface.

We’ll begin by creating a new file at src/config/monnify.ts where we’ll implement the Monnify client.

Configuration and Environment Setup

Start by loading the configuration from environment variables using dotenv, ensuring that sensitive credentials are never hardcoded. These include the Monnify API key, secret key, base URL, and contract code (wallet account number). This setup allows the same client to be safely used across development, staging, and production environments.

import axios, { AxiosInstance } from 'axios';
import dotenv from 'dotenv';

dotenv.config();

export interface MonnifyConfig {
  apiKey: string;
  secretKey: string;
  baseUrl: string;
  contractCode: string;
}

Create the `MonnifyClient` Class

Next, you’ll define the MonnifyClient class. This class encapsulates all communication with the Monnify API. It internally manages API credentials, an Axios HTTP client, an access token, and token expiry tracking.

export class MonnifyClient {
  private readonly apiKey: string;
  private readonly secretKey: string;
  private baseUrl: string;
  private contractCode: string;
  private client: AxiosInstance;

  private accessToken: string | null = null;
  private tokenExpiry: number = 0;

This design ensures authentication is handled transparently and automatically for every request.

Axios Client and Request Interceptor

Inside the constructor, initialize the Monnify client with credentials from environment variables. The Axios instance is created with the Monnify base URL and JSON headers.

  constructor() {
    this.apiKey = process.env.MONNIFY_API_KEY || '';
    this.secretKey = process.env.MONNIFY_SECRET_KEY || '';
    this.baseUrl = process.env.MONNIFY_BASE_URL || 'https://api.monnify.com';
    this.contractCode = process.env.MONNIFY_CONTRACT_CODE || '';

    this.client = axios.create({
      baseURL: this.baseUrl,
      headers: {
        'Content-Type': 'application/json',
      },
    });

We attach the request interceptor to this client to automatically inject a valid Bearer token into every outgoing request (except the authentication endpoint). Before each request, the interceptor ensures the client is authenticated, preventing unauthorized requests and eliminating token-related boilerplate across the codebase.

    this.client.interceptors.request.use(async (config: any) => {
      // Skip auth for the login endpoint itself
      if (config.url?.includes('/auth/login')) {
        return config;
      }

      // Ensure a valid token exists before every request
      await this.ensureAuthenticated();

      if (this.accessToken) {
        config.headers.Authorization = `Bearer ${this.accessToken}`;
      }

      return config;
    });
  }

Authenticate with Monnify

Authentication is handled using Monnify’s Basic Auth mechanism, where the API key and secret key are base64-encoded and sent to the /auth/login endpoint. Upon successful authentication, the client stores the returned access token and sets an internal expiry timestamp slightly below the official token lifetime to avoid edge-case expirations. Any authentication failure is logged and surfaced as a controlled error to prevent silent failures.


  private async authenticate(): Promise<void> {
    try {
      // Encode credentials as Base64 for Basic Auth
      const credentials = Buffer.from(
        `${this.apiKey}:${this.secretKey}`
      ).toString('base64');

      const response = await axios.post(
        `${this.baseUrl}/api/v1/auth/login`,
        {},
        {
          headers: {
            Authorization: `Basic ${credentials}`,
            'Content-Type': 'application/json',
          },
        }
      );

      this.accessToken = response.data.responseBody.accessToken;
      // Set expiry to 23 hours (Monnify tokens typically last 24 hours)
      // This prevents edge cases where token expires mid-request
      this.tokenExpiry = Date.now() + 23 * 60 * 60 * 1000;
    } catch (error: any) {
      console.error(
        'Monnify authentication error:',
        error.response?.data || error.message
      );
      throw new Error('Failed to authenticate with Monnify');
    }
  }

Automatic Token Refresh (`ensureAuthenticated`)

Before any API call, the client checks whether it holds an access token that has not yet expired. If the token is missing or expired, it transparently re-authenticates.

  private async ensureAuthenticated(): Promise<void> {
    if (!this.accessToken || Date.now() >= this.tokenExpiry) {
      await this.authenticate();
    }
  }

This ensures that long-running processes such as payroll queues or background workers can safely make Monnify requests without manual token handling.
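The expiry logic is easy to reason about in isolation. The sketch below models it with two hypothetical pure functions, storeToken and needsRefresh, and an explicit clock value so the behavior can be checked without calling Monnify. The 24-hour lifetime and 1-hour safety margin follow the comment in authenticate and are assumptions rather than guaranteed values.

```typescript
interface TokenState {
  accessToken: string | null;
  expiresAt: number; // epoch milliseconds
}

const HOUR_MS = 60 * 60 * 1000;

// Record a freshly issued token, expiring it earlier than its real lifetime
// so a request never starts with a token that is about to lapse.
function storeToken(
  token: string,
  lifetimeMs: number,
  safetyMarginMs: number,
  now: number = Date.now()
): TokenState {
  return { accessToken: token, expiresAt: now + lifetimeMs - safetyMarginMs };
}

// Same condition as ensureAuthenticated: no token yet, or past the expiry.
function needsRefresh(state: TokenState, now: number = Date.now()): boolean {
  return state.accessToken === null || now >= state.expiresAt;
}

// A token issued at t = 0 with a 24-hour lifetime and a 1-hour margin
const tokenState = storeToken('example-token', 24 * HOUR_MS, HOUR_MS, 0);
```

Passing the current time as a parameter is a design choice that makes the refresh rule testable without waiting for a real token to expire.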

Initiating Bulk Transfers

The initiateBulkTransfer method handles the creation of a bulk disbursement batch, typically used for payroll payments. It validates input transfers to ensure each payment has a valid amount, destination account number, and bank code.

A structured batch request is then constructed, including a unique batch reference, source account (contract code), narration, and a list of transactions. The request is logged for traceability and sent to Monnify’s batch disbursement endpoint. Any API error is normalized and returned with meaningful messaging to aid debugging and retries.

  async initiateBulkTransfer(
    transfers: Array<{
      amount: number;
      recipientAccountNumber: string;
      recipientBankCode: string;
      recipientName: string;
      narration: string;
      reference: string;
    }>
  ): Promise<any> {
    await this.ensureAuthenticated();

We validate inputs early to fail fast:

    if (!transfers || transfers.length === 0) {
      throw new Error('No transfers provided');
    }

    if (!this.contractCode) {
      throw new Error('Monnify contract code is not configured');
    }

Each transfer is validated individually:

    for (const transfer of transfers) {
      if (!transfer.amount || transfer.amount <= 0) {
        throw new Error(`Invalid amount for transfer: ${transfer.reference}`);
      }
      if (!transfer.recipientAccountNumber) {
        throw new Error(`Missing account number for transfer: ${transfer.reference}`);
      }
      if (!transfer.recipientBankCode) {
        throw new Error(`Missing bank code for transfer: ${transfer.reference}`);
      }
    }

We then construct the batch payload:

    const requestBody = {
      title: 'Bulk Payroll Transfers',
      batchReference: `BATCH_${Date.now()}`,
      narration: 'Payroll batch disbursement',
      sourceAccountNumber: this.contractCode,
      onValidationFailure: 'CONTINUE',
      notificationInterval: 50,
      transactionList: transfers.map((t) => ({
        amount: t.amount,
        reference: t.reference,
        narration: t.narration,
        destinationBankCode: t.recipientBankCode,
        destinationAccountNumber: t.recipientAccountNumber,
        currency: 'NGN',
      })),
    };

Finally, we send the request and normalize errors:

    try {
      const response = await this.client.post(
        '/api/v2/disbursements/batch',
        requestBody
      );
      return response.data;
    } catch (error: any) {
      const errorData = error.response?.data;
      const message =
        errorData?.responseMessage ||
        errorData?.message ||
        `Monnify API error (${error.response?.status})`;
      throw new Error(message);
    }
  }
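The validation above stops at the first invalid transfer. For a payroll batch it can be more convenient to report every problem in one pass. The following sketch shows that variation, using a hypothetical helper named collectTransferErrors. It checks the same three conditions but returns a list of messages instead of throwing.

```typescript
interface Transfer {
  amount: number;
  recipientAccountNumber: string;
  recipientBankCode: string;
  recipientName: string;
  narration: string;
  reference: string;
}

// Hypothetical helper: gather all validation problems instead of failing fast.
function collectTransferErrors(transfers: Transfer[]): string[] {
  const errors: string[] = [];

  for (const transfer of transfers) {
    if (!Number.isFinite(transfer.amount) || transfer.amount <= 0) {
      errors.push(`Invalid amount for transfer: ${transfer.reference}`);
    }
    if (!transfer.recipientAccountNumber) {
      errors.push(`Missing account number for transfer: ${transfer.reference}`);
    }
    if (!transfer.recipientBankCode) {
      errors.push(`Missing bank code for transfer: ${transfer.reference}`);
    }
  }

  return errors;
}
```

A caller could throw a single error that joins the messages when the list is not empty, which gives an administrator the full picture before the batch is resubmitted.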

Authorizing Bulk Transfers (OTP Validation)

Some bulk transfers require OTP authorization. The authorizeBulkTransfer method validates the presence of a batch reference and authorization code before submitting them to Monnify’s OTP validation endpoint. This step finalizes the batch disbursement and allows processing to continue. Errors are logged and surfaced clearly for operational visibility.

  async authorizeBulkTransfer(
    reference: string,
    authorizationCode: string
  ): Promise<any> {
    await this.ensureAuthenticated();
    if (!reference) {
      throw new Error('Batch reference is required');
    }

    if (!authorizationCode) {
      throw new Error('Authorization code (OTP) is required');
    }

    const requestBody = {
      reference,
      authorizationCode,
    };

    try {
      const response = await this.client.post(
        '/api/v2/disbursements/batch/validate-otp',
        requestBody
      );

      return response.data;
    } catch (error: any) {
      const errorDetails = error.response?.data || error.message;
      console.error(
        'Monnify authorization error:',
        JSON.stringify(errorDetails, null, 2)
      );

      if (error.response) {
        const errorData = error.response.data;
        const errorMessage =
          errorData?.responseMessage ||
          errorData?.message ||
          `Monnify API error (${error.response.status})`;
        throw new Error(errorMessage);
      }
      throw error;
    }
  }

Transaction Status Lookup

The getTransactionStatus method retrieves the real-time status of an individual transaction using its reference.

  async getTransactionStatus(transactionReference: string): Promise<any> {
    await this.ensureAuthenticated();
    try {
      const response = await this.client.get(
        `/api/v2/disbursements/${transactionReference}/status`
      );
      return response.data;
    } catch (error: any) {
      console.error(
        'Monnify status check error:',
        error.response?.data || error.message
      );
      throw error;
    }
  }

This is useful for reconciliation, webhook fallbacks, or manual verification of disbursement outcomes.

Batch Details Retrieval

The getBatchDetails method fetches detailed information about an entire disbursement batch, including the state of individual transactions.

  async getBatchDetails(batchReference: string): Promise<any> {
    await this.ensureAuthenticated();
    if (!batchReference) {
      throw new Error('Batch reference is required');
    }

    try {
      const response = await this.client.get(
        `/api/v2/disbursements/batch/${batchReference}`
      );
      return response.data;
    } catch (error: any) {
      console.error(
        'Monnify batch details error:',
        error.response?.data || error.message
      );
      throw error;
    }
  }

This is particularly useful when reconciling payroll runs or recovering from partial failures.

Wallet Balance Check

Finally, the getAccountBalance method retrieves the available balance of the configured Monnify wallet (the contract account).

Add it as the last method in src/config/monnify.ts, then export a shared client instance:

  async getAccountBalance(): Promise<any> {
    await this.ensureAuthenticated();

    try {
      const response = await this.client.get(
        `/api/v2/disbursements/wallet-balance?accountNumber=${this.contractCode}`
      );
      return response.data;
    } catch (error: any) {
      console.error(
        'Monnify balance check error:',
        error.response?.data || error.message
      );
      throw error;
    }
  }
}

export const monnifyClient = new MonnifyClient();

Key features of this client:

Automatic token management: The client automatically handles authentication and refreshes tokens before they expire.
Request interceptor: Every API request automatically includes the authentication token.
Bulk transfers: Uses Monnify's batch disbursement API for efficient payroll processing.
Error handling: Comprehensive error handling with meaningful error messages.

Implementing Background Job Processing

To avoid blocking HTTP requests and to ensure reliable retries, payroll execution is handled asynchronously using a background job processor. This worker is responsible for orchestrating bulk payroll disbursements, coordinating with Monnify, updating payroll and payroll item states, and handling retries safely.

Begin by creating a new file at src/jobs/payroll.processor.ts. All background payroll execution logic will live in this file.

Set Up the Payroll Processing Queue

We’ll create a Bull queue named payroll-processing, backed by Redis. Redis connection details are loaded from environment variables, allowing flexibility across environments.

Default job options are configured to retry failed jobs up to three times using an exponential backoff strategy. This ensures resilience against transient failures such as network issues or temporary payment gateway downtime. Completed jobs are automatically removed from the queue to keep Redis storage clean.

import Queue from 'bull';
import { monnifyClient } from '../config/monnify';
import {
  PayrollItemModel,
  PayrollModel,
  PayrollStatus,
} from '../models/payroll';
import { EmployeeModel } from '../models/employee';

export const payrollQueue = new Queue('payroll-processing', {
  redis: {
    host: process.env.REDIS_HOST || 'localhost',
    port: Number(process.env.REDIS_PORT || 6379),
  },
  defaultJobOptions: {
    attempts: 3,
    backoff: {
      type: 'exponential',
      delay: 2000,
    },
    removeOnComplete: true,
  },
});
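To get a feel for what these retry settings mean in practice, the sketch below estimates the wait before each retry. It assumes Bull's built-in exponential strategy computes the delay as (2^n - 1) × delay, where n is the number of attempts already made. That formula is an assumption about the library's internals, so verify it against the Bull version you install. The function name exponentialRetryDelays is hypothetical.

```typescript
// Assumed formula for Bull's built-in 'exponential' backoff:
// wait before the next try = (2^attemptsMade - 1) * baseDelayMs.
function exponentialRetryDelays(
  attempts: number,
  baseDelayMs: number
): number[] {
  const delays: number[] = [];
  // With `attempts` total tries there are at most attempts - 1 retries
  for (let attemptsMade = 1; attemptsMade < attempts; attemptsMade++) {
    delays.push(Math.round((Math.pow(2, attemptsMade) - 1) * baseDelayMs));
  }
  return delays;
}

// Settings used in the queue above: attempts = 3, delay = 2000
const retryDelays = exponentialRetryDelays(3, 2000);
```

Under that assumption, a job with these options would be retried after roughly 2 seconds and then after roughly 6 seconds before being marked as failed.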

Queue Processor Registration

The queue registers a processor function using payrollQueue.process, which receives jobs containing a payrollId. Each job triggers the processBulkPayroll function, making the queue responsible for executing one payroll batch at a time.

payrollQueue.process(async (job) => {
 return processBulkPayroll(job.data.payrollId);
});

This design decouples payroll execution from HTTP requests and allows processing to happen asynchronously in background workers.

Bulk Payroll Processing Flow (`processBulkPayroll`)

When a payroll job is picked up, the system first fetches all payroll items associated with the given payroll ID. It then keeps only the items that are eligible for processing: those still in a PENDING state, or those previously marked as PROCESSING but missing a transaction reference.

async function processBulkPayroll(payrollId: number) {
  const items = await PayrollItemModel.findByPayrollId(payrollId);

Next, the function narrows the list to the items that still require processing. This prevents duplicate payments when jobs are retried.

const payable = items.filter(
  (i) =>
    i.status === PayrollStatus.PENDING ||
    (i.status === PayrollStatus.PROCESSING && !i.transaction_reference)
);

if (payable.length === 0) return;

If no payable items remain, the function exits early, avoiding unnecessary API calls.
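The eligibility rule can also be pulled out into a small pure function, which makes it easy to test without a database. The sketch below is only an illustration: the Status and PayrollItem types are simplified stand-ins for the article's real models, and the transaction references are made-up values.

```typescript
// Stand-in types for illustration; the real models live in ../models/payroll.
type Status = 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';

interface PayrollItem {
  id: number;
  status: Status;
  transaction_reference?: string | null;
}

// Keep items that were never sent, or that were marked PROCESSING
// but never received a transaction reference.
function selectPayable(items: PayrollItem[]): PayrollItem[] {
  return items.filter(
    (i) =>
      i.status === 'PENDING' ||
      (i.status === 'PROCESSING' && !i.transaction_reference)
  );
}

// Made-up sample data
const sample: PayrollItem[] = [
  { id: 1, status: 'PENDING' },
  { id: 2, status: 'PROCESSING', transaction_reference: 'TX-1' },
  { id: 3, status: 'PROCESSING', transaction_reference: null },
  { id: 4, status: 'COMPLETED', transaction_reference: 'TX-2' },
];

console.log(selectPayable(sample).map((i) => i.id)); // items 1 and 3 remain
```

Item 2 is skipped because it already has a transaction reference, which is the case the filter exists to protect.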

Once we confirm there are payable items, we update the overall payroll status:

  await PayrollModel.updateStatus(payrollId, PayrollStatus.PROCESSING);

This provides immediate visibility that disbursement is underway.

Building the Bulk Transfer Payload

Create a variable to store the transfer list that will be sent to Monnify.

  const transfers = [];

For each payable payroll item, the corresponding employee record is fetched to retrieve bank and account details. A unique payment reference is generated using the payroll ID and payroll item ID, ensuring traceability across systems. Each payroll item is immediately marked as PROCESSING before initiating payment to prevent concurrent workers from attempting to process the same item.

A transfer object is then constructed containing the payment amount, recipient bank details, narration, and unique reference. These transfer objects are accumulated into a single batch request.

  for (const item of payable) {
    const employee = await EmployeeModel.findById(item.employee_id);
    if (!employee) throw new Error('Employee not found');

    const reference = `PAYROLL_${payrollId}_${item.id}`;

    await PayrollItemModel.updateStatus(item.id, PayrollStatus.PROCESSING);

    transfers.push({
      amount: Number(item.amount),
      reference,
      recipientAccountNumber: employee.account_number,
      recipientBankCode: employee.bank_code,
      recipientName: employee.name,
      narration: `Payroll payment`,
    });
  }
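Because the reference is built from the payroll ID and the item ID, the same item always maps to the same reference, and the IDs can be recovered from it later. A minimal sketch of that round trip is shown here. The helper names buildReference and parseReference are hypothetical and not part of the article's codebase.

```typescript
interface ParsedReference {
  payrollId: number;
  itemId: number;
}

// Same format the processor uses: PAYROLL_{payrollId}_{itemId}
function buildReference(payrollId: number, itemId: number): string {
  return `PAYROLL_${payrollId}_${itemId}`;
}

// Returns null for anything that is not exactly PAYROLL_<digits>_<digits>.
function parseReference(reference: string): ParsedReference | null {
  const match = /^PAYROLL_(\d+)_(\d+)$/.exec(reference);
  if (!match) return null;
  return { payrollId: Number(match[1]), itemId: Number(match[2]) };
}

const ref = buildReference(12, 345);
console.log(ref); // PAYROLL_12_345
console.log(parseReference(ref)); // { payrollId: 12, itemId: 345 }
console.log(parseReference('ORDER_1_2')); // null
```

Matching the whole string with a regular expression rejects references with extra segments or non-numeric IDs in one step.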

Initiating Bulk Disbursement via Monnify

Once all transfers are prepared, the system initiates a bulk transfer through the Monnify client.

const response = await monnifyClient.initiateBulkTransfer(transfers);

if (!response?.requestSuccessful) {
  throw new Error('Bulk transfer initiation failed');
}

If Monnify doesn’t confirm successful initiation, the job throws an error, allowing Bull’s retry mechanism to take over. This ensures failed initiation attempts are retried safely without manual intervention.

Storing Transaction References

After a successful bulk transfer initiation, Monnify returns a list of transactions containing unique transaction references. The system matches each response entry to its corresponding payroll item using the generated reference and updates the payroll item record with the Monnify transaction reference while keeping its status as PROCESSING.

  const results = response.responseBody?.transactionList || [];

  for (const item of payable) {
    const ref = `PAYROLL_${payrollId}_${item.id}`;
    const match = results.find((r: any) => r.reference === ref);

    if (match?.transactionReference) {
      await PayrollItemModel.updateStatus(
        item.id,
        PayrollStatus.PROCESSING,
        match.transactionReference
      );
    }
  }

  await updatePayrollStats(payrollId);
}

This step is critical for later reconciliation through webhooks or status polling.
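The matching step uses results.find inside a loop, which is fine for small batches. For larger payrolls, one possible variation is to index the response by reference once and look each item up in a Map. The sketch below assumes the response entries carry the reference and transactionReference fields used above. The TransferResult type and the sample values are made up for illustration.

```typescript
// Assumed shape of one entry in the transaction list (illustrative only).
interface TransferResult {
  reference: string;
  transactionReference?: string;
}

// Build the index once so each lookup is a constant-time Map read.
function indexByReference(
  results: TransferResult[]
): Map<string, TransferResult> {
  const byRef = new Map<string, TransferResult>();
  for (const r of results) byRef.set(r.reference, r);
  return byRef;
}

// Made-up response entries for payroll 7
const results: TransferResult[] = [
  { reference: 'PAYROLL_7_1', transactionReference: 'TX-A' },
  { reference: 'PAYROLL_7_2' },
];

const byRef = indexByReference(results);
const matched = [1, 2, 3].map((id) => ({
  id,
  transactionReference:
    byRef.get(`PAYROLL_7_${id}`)?.transactionReference ?? null,
}));

console.log(matched);
```

Items with no matching entry, or with an entry that lacks a transaction reference, end up with null and would be picked up again by the eligibility filter on a retry.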

Payroll Statistics Reconciliation (`updatePayrollStats`)

After initiating payments, the system recalculates payroll-level statistics by refetching all payroll items.

async function updatePayrollStats(payrollId: number) {
  const items = await PayrollItemModel.findByPayrollId(payrollId);

  const completed = items.filter(
    (i) => i.status === PayrollStatus.COMPLETED
  ).length;
  const failed = items.filter((i) => i.status === PayrollStatus.FAILED).length;

The overall payroll status is derived from these counts:

  let status = PayrollStatus.PROCESSING;

  if (completed === items.length) {
    status = PayrollStatus.COMPLETED;
  } else if (failed === items.length) {
    status = PayrollStatus.FAILED;
  } else if (completed > 0) {
    status = PayrollStatus.PARTIALLY_COMPLETED;
  }

  await PayrollModel.updateStatus(payrollId, status, completed, failed);
}

If every item is completed, the payroll is marked as COMPLETED. If every item failed, it’s marked as FAILED. If at least one item has completed but not all of them, it’s marked as PARTIALLY_COMPLETED. Otherwise, it remains in PROCESSING. The payroll record is then updated with the new status and the completed and failed counts, giving an up-to-date snapshot of payroll execution.
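The same decision can be written as a pure function over the item statuses, which makes the edge cases easier to see. This is a sketch, not the article's implementation: PayrollState is a stand-in for the PayrollStatus enum, and the guard for an empty item list is an added assumption (without it, a payroll with zero items would count as COMPLETED).

```typescript
// Stand-in for the PayrollStatus enum (illustrative).
type PayrollState =
  | 'PENDING'
  | 'PROCESSING'
  | 'COMPLETED'
  | 'FAILED'
  | 'PARTIALLY_COMPLETED';

function derivePayrollStatus(itemStatuses: PayrollState[]): PayrollState {
  const total = itemStatuses.length;

  // Assumption: an empty payroll stays PROCESSING rather than COMPLETED.
  if (total === 0) return 'PROCESSING';

  const completed = itemStatuses.filter((s) => s === 'COMPLETED').length;
  const failed = itemStatuses.filter((s) => s === 'FAILED').length;

  if (completed === total) return 'COMPLETED';
  if (failed === total) return 'FAILED';
  if (completed > 0) return 'PARTIALLY_COMPLETED';
  return 'PROCESSING';
}

console.log(derivePayrollStatus(['COMPLETED', 'COMPLETED'])); // COMPLETED
console.log(derivePayrollStatus(['COMPLETED', 'FAILED'])); // PARTIALLY_COMPLETED
console.log(derivePayrollStatus(['FAILED', 'PROCESSING'])); // PROCESSING
```

Note the last case: a payroll with failures but no completed items stays in PROCESSING until every item has failed or at least one has completed.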

Queue Entry Point (`processPayrollItems`)

The processPayrollItems function serves as the public entry point for triggering payroll execution.

export async function processPayrollItems(payrollId: number) {
  await payrollQueue.add({ payrollId, type: 'bulk' });
}

It simply enqueues a payroll job with the relevant payroll ID, allowing controllers or services to initiate payroll processing without coupling themselves to queue logic or payment execution details.

Role in the Overall Payroll Architecture

This queue worker acts as the execution engine of the payroll system. It:

Bridges payroll domain models with the Monnify payment gateway
Ensures safe retries through Bull’s job management and maintains idempotency
Continuously synchronizes payroll and payroll item states

By offloading payment execution to background workers, the system achieves scalability, reliability, and operational resilience required for real-world payroll processing.

Key features of the job processor:

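Exponential backoff: Failed jobs are retried with increasing delays, starting from the 2-second base delay. With attempts set to 3, a job gets at most two retries after its first run.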
Bulk processing: All payroll items are processed as a single batch transfer.
Status tracking: Each item's status is updated throughout the process.
Automatic cleanup: Completed jobs are automatically removed from the queue.
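To get a feel for how the retry settings interact, here is a small sketch of a generic doubling schedule. It's an illustration of the idea rather than Bull's internal code: Bull computes its own delays, and its exact formula may not match this one. The retryDelays helper is hypothetical.

```typescript
// Generic doubling schedule: the delay before retry n is baseDelayMs * 2^(n - 1).
// The first attempt is not a retry, so `attempts` yields attempts - 1 delays.
function retryDelays(attempts: number, baseDelayMs: number): number[] {
  const retries = Math.max(0, attempts - 1);
  return Array.from({ length: retries }, (_, n) => baseDelayMs * 2 ** n);
}

console.log(retryDelays(3, 2000)); // [ 2000, 4000 ]
console.log(retryDelays(5, 2000)); // [ 2000, 4000, 8000, 16000 ]
```

The point to take away is that attempts counts the first run too, so raising it by one adds one more retry at the end of the schedule.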

Creating the API Controllers

Next, we’ll build the HTTP controller layer for managing employees in the payroll system using Express.js. It exposes RESTful API endpoints that handle incoming requests, perform validation, interact with the employee data model, and return appropriate HTTP responses.

The controller acts as the bridge between client-facing APIs and the underlying business logic encapsulated in the EmployeeModel.

Controller Responsibilities

The EmployeeController is responsible for:

Validating incoming request data
Calling the appropriate model methods
Handling errors gracefully
Returning meaningful HTTP status codes and JSON responses

Each method follows a consistent structure using try–catch blocks to ensure reliability and simplify error handling.

Start by creating a new file at src/controllers/employee.controller.ts. This file will contain all the endpoints needed to manage employees in the payroll system.

At the top of the file, import the required Express types and the employee model:

import { Request, Response } from 'express';
import { EmployeeModel, CreateEmployeeInput } from '../models/employee';

export class EmployeeController {
  // Controller methods will go here
}

Each method inside this class will map to a specific API endpoint.

Creating an Employee (`createEmployee`)

We’ll start with an endpoint for creating a new employee.

This endpoint handles the creation of a new employee record. It extracts the request body and validates the presence of required fields such as name, email, salary, bank account number, and bank code.

  static async createEmployee(req: Request, res: Response): Promise<void> {
    try {
      const data: CreateEmployeeInput = req.body;

      if (
        !data.name ||
        !data.email ||
        !data.salary ||
        !data.account_number ||
        !data.bank_code
      ) {
        res.status(400).json({
          error:
            'Missing required fields: name, email, salary, account_number, bank_code',
        });
        return;
      }

      const employee = await EmployeeModel.create(data);
      res.status(201).json({
        message: 'Employee created successfully',
        data: employee,
      });
    } catch (error: any) {
      console.error('Error creating employee:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to create employee' });
    }
  }

If any required field is missing, the request is rejected with a 400 Bad Request response.

Upon successful validation, the controller delegates employee creation to the EmployeeModel.create method and returns a 201 Created response containing the newly created employee. Any unexpected error during the process results in a 500 Internal Server Error.
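If you want the 400 response to name only the fields that are actually missing, the check can be expressed as a small helper. The sketch below is one possible variation, not the controller's code. The findMissingFields name is hypothetical, and unlike the truthiness check above, it accepts a salary of 0.

```typescript
const REQUIRED_FIELDS = [
  'name',
  'email',
  'salary',
  'account_number',
  'bank_code',
] as const;

type RequiredField = (typeof REQUIRED_FIELDS)[number];

// Treats undefined, null, and empty strings as missing.
// A numeric 0 is considered present (a deliberate difference from `!value`).
function findMissingFields(body: Record<string, unknown>): RequiredField[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = body[field];
    return value === undefined || value === null || value === '';
  });
}

const missing = findMissingFields({
  name: 'Ada',
  email: 'ada@example.com',
  salary: 0,
});

console.log(missing); // [ 'account_number', 'bank_code' ]
```

A controller could then return the list in the error body whenever it's non-empty.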

Fetching All Employees (`getAllEmployees`)

Next, we’ll add an endpoint for retrieving all employee records from the system.

This endpoint simply calls EmployeeModel.findAll and returns the result as a JSON response. This API is typically used for administrative dashboards, payroll preparation, or reporting purposes.

static async getAllEmployees(req: Request, res: Response): Promise<void> {
  try {
    const employees = await EmployeeModel.findAll();
    res.json({ data: employees });
  } catch (error: any) {
    console.error('Error fetching employees:', error);
    res
      .status(500)
      .json({ error: error.message || 'Failed to fetch employees' });
  }
}

If the retrieval is successful, the controller responds with the full list of employees. If something goes wrong, such as a database or unexpected runtime error, the error is logged and a 500 Internal Server Error is returned to the client.

Fetching a Single Employee (`getEmployeeById`)

After listing all employees, the next logical step is being able to fetch a single employee by their ID.

This endpoint retrieves a specific employee by ID, which is parsed from the URL parameters. If the employee doesn’t exist, the controller responds with a 404 Not Found. Otherwise, the employee data is returned in a successful response. This endpoint is useful for viewing or editing individual employee details.

  static async getEmployeeById(req: Request, res: Response): Promise<void> {
    try {
      const { id } = req.params;
      const employee = await EmployeeModel.findById(parseInt(id));

      if (!employee) {
        res.status(404).json({ error: 'Employee not found' });
        return;
      }

      res.json({ data: employee });
    } catch (error: any) {
      console.error('Error fetching employee:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to fetch employee' });
    }
  }
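One detail worth knowing about parseInt is that it accepts any string that starts with digits and returns NaN for everything else. What EmployeeModel.findById does with such a value depends on the model, which isn't shown here. If you'd rather reject malformed IDs up front, a stricter parser along these lines is one option. The parseId helper is hypothetical.

```typescript
// Returns a positive integer, or null when the value is not a plain run of digits.
function parseId(raw: string): number | null {
  if (!/^\d+$/.test(raw)) return null;
  const id = Number(raw);
  return Number.isSafeInteger(id) && id > 0 ? id : null;
}

console.log(parseInt('12abc', 10)); // 12
console.log(parseId('12abc')); // null
console.log(parseId('abc')); // null
console.log(parseId('42')); // 42
```

A handler using it could respond with a 400 Bad Request when the result is null, before touching the database.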

Updating an Employee (`updateEmployee`)

Now that we can retrieve individual employees, the next step is allowing their details to be updated.

This endpoint allows partial updates to an existing employee record. It first checks whether the employee exists before attempting an update.

If the employee isn’t found, a 404 Not Found response is returned. If the employee exists, the controller forwards the update payload to EmployeeModel.update and returns the updated employee record. This approach ensures data integrity and prevents silent failures.

  static async updateEmployee(req: Request, res: Response): Promise<void> {
    try {
      const { id } = req.params;
      const data: Partial<CreateEmployeeInput> = req.body;

      const employee = await EmployeeModel.findById(parseInt(id));
      if (!employee) {
        res.status(404).json({ error: 'Employee not found' });
        return;
      }

      const updated = await EmployeeModel.update(parseInt(id), data);
      res.json({
        message: 'Employee updated successfully',
        data: updated,
      });
    } catch (error: any) {
      console.error('Error updating employee:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to update employee' });
    }
  }

Deleting an Employee (`deleteEmployee`)

Finally, the last endpoint in the EmployeeController handles employee deletion.

Before deleting, it verifies that the employee exists to avoid invalid delete operations. If found, the employee record is removed using EmployeeModel.delete, and a success message is returned. If the employee doesn’t exist, the controller responds with a 404 Not Found.

 static async deleteEmployee(req: Request, res: Response): Promise<void> {
    try {
      const { id } = req.params;

      const employee = await EmployeeModel.findById(parseInt(id));
      if (!employee) {
        res.status(404).json({ error: 'Employee not found' });
        return;
      }

      await EmployeeModel.delete(parseInt(id));
      res.json({ message: 'Employee deleted successfully' });
    } catch (error: any) {
      console.error('Error deleting employee:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to delete employee' });
    }
  }

Error Handling Strategy

All controller methods use structured error handling: the full error is logged on the server, and the client receives a JSON error body with an appropriate status code. Note that the handlers pass error.message through when it’s set, so the model layer should avoid putting sensitive implementation details into its error messages.

Role in the Overall Payroll System

The EmployeeController provides the foundational APIs required for managing employee records, which are essential inputs for payroll processing. By cleanly separating HTTP concerns from business logic and persistence layers, this controller supports maintainability, scalability, and clear system boundaries within the payroll architecture.

Payroll Controller

This module defines the PayrollController, which serves as the primary HTTP-facing orchestration layer for payroll operations in the system. It exposes RESTful APIs that allow clients to create payrolls, retrieve payroll data, trigger payroll processing, reconcile payment results, authorize bulk transfers, and monitor transaction and account statuses.

Controller Responsibilities

The PayrollController is responsible for:

Accepting and validating client requests related to payrolls
Managing payroll lifecycle transitions (creation → processing → completion)
Triggering background job execution for bulk payroll disbursement
Reconciling payment results with Monnify
Providing real-time payroll and transaction status visibility
Acting as a safe boundary between external clients and internal services

To get started, create a new file src/controllers/payroll.controller.ts. This is where we’ll define all payroll-related endpoints.

At the top of src/controllers/payroll.controller.ts, we start with the following imports:

import { Request, Response } from 'express';
import {
  PayrollModel,
  PayrollItemModel,
  PayrollStatus,
} from '../models/payroll';
import { processPayrollItems } from '../jobs/payroll.processor';
import { monnifyClient } from '../config/monnify';

Here’s what each of these is responsible for:

Request and Response (from Express): These types give us strongly typed access to incoming HTTP requests and outgoing responses.
PayrollModel: This model handles payroll batch operations such as creating payrolls, fetching them, and updating their overall status.
PayrollItemModel: This model represents the individual payment entries that make up a payroll batch. It lets us fetch and update those items, especially during processing and reconciliation.
PayrollStatus: This is an enum that defines the valid states of a payroll or payroll item (for example: PENDING, PROCESSING, COMPLETED, FAILED). Using an enum helps keep state transitions explicit and consistent across the system.
processPayrollItems: This function is responsible for handing off payroll processing to background workers. Instead of processing payrolls synchronously in the HTTP request, we queue the work and let workers handle it asynchronously.
monnifyClient: This is our gateway to the external payment service. We use it to authorize bulk transfers, check transaction statuses, reconcile payments, and fetch account balances.

Together, these imports give the controller everything it needs to process payroll operations.

With our imports in place, we can now define the controller class itself. This class will serve as the single home for all payroll-related endpoints.

export class PayrollController {
  // Payroll endpoints will live here
}

Creating a Payroll (`createPayroll`)

With the controller in place, we’ll begin by implementing the createPayroll endpoint. It initializes a new payroll batch that covers either all employees or a subset selected by ID.

static async createPayroll(req: Request, res: Response): Promise<void> {
try {
const { payroll_period, employee_ids } = req.body;

      if (!payroll_period) {
        res.status(400).json({ error: 'payroll_period is required' });
        return;
      }

      const processedEmployeeIds = employee_ids
        ? employee_ids
            .map((id: any) => parseInt(id, 10))
            .filter((id: number) => !isNaN(id))
        : undefined;

      const payroll = await PayrollModel.create({
        payroll_period,
        employee_ids: processedEmployeeIds,
      });

      res.status(201).json({
        message: 'Payroll created successfully',
        data: payroll,
      });
    } catch (error: any) {
      console.error('Error creating payroll:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to create payroll' });
    }

}

Here’s what’s happening in the code:

The endpoint requires a payroll_period and optionally accepts a list of employee IDs to support partial payroll runs.
Incoming employee IDs are normalized and validated to ensure they are valid integers before being passed to the payroll model.
The controller delegates the actual creation logic to PayrollModel.create, which computes totals and creates payroll items.
On success, the API responds with a 201 Created status and the newly created payroll record.

Fetching All Payrolls (`getAllPayrolls`)

This endpoint retrieves all payroll batches in the system. It’s typically used for administrative dashboards and payroll history views. The controller simply delegates to PayrollModel.findAll and returns the results in a structured JSON response.

  static async getAllPayrolls(req: Request, res: Response): Promise<void> {
    try {
      const payrolls = await PayrollModel.findAll();
      res.json({ data: payrolls });
    } catch (error: any) {
      console.error('Error fetching payrolls:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to fetch payrolls' });
    }
  }

Fetching a Payroll with Items (`getPayrollById`)

Next, we’ll implement an endpoint to retrieve a single payroll by its ID along with all associated payroll items. This is useful for payroll detail views, where each individual payment needs to be visible.

static async getPayrollById(req: Request, res: Response): Promise<void> {
try {
const { id } = req.params;
const payroll = await PayrollModel.findById(parseInt(id));

      if (!payroll) {
        res.status(404).json({ error: 'Payroll not found' });
        return;
      }

      const items = await PayrollItemModel.findByPayrollId(payroll.id);

      res.json({
        data: {
          ...payroll,
          items,
        },
      });
    } catch (error: any) {
      console.error('Error fetching payroll:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to fetch payroll' });
    }

}

In the code, we read the id parameter from the URL and convert it to an integer.

If the payroll does not exist, a 404 Not Found response is returned. When found, the controller aggregates payroll metadata and its child payroll items into a single response object, making it convenient for detailed payroll inspection and UI rendering.

Processing a Payroll (`processPayroll`)

Next, we implement the processPayroll endpoint. This endpoint initiates payroll execution. Before queuing the payroll for processing, the controller enforces important state checks to prevent duplicate or invalid execution, ensuring payrolls that are already PROCESSING or COMPLETED cannot be reprocessed.

static async processPayroll(req: Request, res: Response): Promise<void> {
try {
const { id } = req.params;

      const payroll = await PayrollModel.findById(Number(id));

      if (!payroll) {
        res.status(404).json({ error: 'Payroll not found' });
        return;
      }

      if (
        payroll.status === PayrollStatus.COMPLETED ||
        payroll.status === PayrollStatus.PROCESSING
      ) {
        res.status(400).json({
          error: `Payroll already ${payroll.status}`,
        });
        return;
      }

      // Queue the payroll for processing
      await processPayrollItems(payroll.id);

      res.json({
        message: 'Payroll queued for bulk processing',
        data: {
          payroll_id: payroll.id,
          processing_mode: 'bulk',
        },
      });
    } catch (error: any) {
      console.error('Error processing payroll:', error);
      res.status(500).json({
        error: error.message || 'Failed to process payroll',
      });
    }
}

Here’s what’s happening in the code:

We get the id parameter from the URL and convert it to a number.
If no payroll is found with the given ID, we return a 404 Not Found response.
Before queuing, we check the payroll’s current status. Payrolls that are already PROCESSING or COMPLETED cannot be reprocessed.
Valid payrolls are handed off to processPayrollItems, which runs the bulk execution in background workers (Bull jobs).
Once queued, we respond with a JSON object confirming that the payroll has been queued for bulk processing.
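In the controller above, the status check and the enqueue are separate steps, and the payroll only moves to PROCESSING once a worker picks the job up. Two requests arriving close together could therefore both pass the check. One possible way to tighten this is to check and flip the status in a single step. The sketch below uses an in-memory Map as a stand-in for the payrolls table, and the tryClaimForProcessing helper is hypothetical. With a real database, the equivalent would typically be a conditional update.

```typescript
// Stand-in for the PayrollStatus enum (illustrative).
type PayrollState =
  | 'PENDING'
  | 'PROCESSING'
  | 'COMPLETED'
  | 'FAILED'
  | 'PARTIALLY_COMPLETED';

// In-memory stand-in for the payrolls table, keyed by payroll ID.
const payrollStates = new Map<number, PayrollState>([[1, 'PENDING']]);

// Checks and transitions in one step. Returns false when the payroll
// is missing, already being processed, or already completed.
function tryClaimForProcessing(id: number): boolean {
  const current = payrollStates.get(id);
  if (current === undefined) return false;
  if (current === 'PROCESSING' || current === 'COMPLETED') return false;
  payrollStates.set(id, 'PROCESSING');
  return true;
}

const first = tryClaimForProcessing(1);
const second = tryClaimForProcessing(1);

console.log(first, second); // true false
```

Only the caller that receives true would go on to enqueue the job.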

Reconciling Payroll Payments (`reconcilePayroll`)

Next, we’ll implement the endpoint that reconciles payroll payments. This ensures that the statuses of payroll items in our system match the actual payment outcomes from Monnify.

static async reconcilePayroll(req: Request, res: Response): Promise<void> {
try {
const { id } = req.params;

      const payroll = await PayrollModel.findById(Number(id));
      if (!payroll) {
        res.status(404).json({ error: 'Payroll not found' });
        return;
      }

      const items = await PayrollItemModel.findByPayrollId(Number(id));

      const itemsToReconcile = items.filter(
        (item) => item.transaction_reference
      );

      if (itemsToReconcile.length === 0) {
        res.json({
          message: 'No items to reconcile (no transaction references found)',
          reconciled: 0,
        });
        return;
      }

      let updated = 0;
      let errors = 0;

      for (const item of itemsToReconcile) {
        try {
          const txStatus = await monnifyClient.getTransactionStatus(
            item.transaction_reference!
          );

          const responseBody = txStatus.responseBody || txStatus;
          const paymentStatus =
            responseBody.paymentStatus || responseBody.status;

          if (
            paymentStatus === 'PAID' &&
            item.status !== PayrollStatus.COMPLETED
          ) {
            await PayrollItemModel.updateStatus(
              item.id,
              PayrollStatus.COMPLETED,
              item.transaction_reference
            );
            updated++;
          } else if (
            paymentStatus === 'FAILED' &&
            item.status !== PayrollStatus.FAILED
          ) {
            const errorMessage =
              responseBody.paymentDescription ||
              responseBody.failureReason ||
              'Transaction failed';
            await PayrollItemModel.updateStatus(
              item.id,
              PayrollStatus.FAILED,
              item.transaction_reference,
              errorMessage
            );
            updated++;
          }
        } catch (error: any) {
          errors++;
          console.error(`Error reconciling item ${item.id}:`, error.message);
        }
      }

      // Update payroll stats
      await PayrollController.updatePayrollStats(Number(id));

      res.json({
        message: 'Payroll reconciled successfully',
        reconciled: updated,
        errors,
        total: itemsToReconcile.length,
      });
    } catch (error: any) {
      console.error('Error reconciling payroll:', error);
      res.status(500).json({
        error: error.message || 'Failed to reconcile payroll',
      });
    }
}

The endpoint retrieves all payroll items with transaction references and queries Monnify for each transaction’s status. Based on the response, payroll items are updated to either COMPLETED or FAILED, with failure reasons captured where applicable.

Errors during reconciliation are tracked and logged without aborting the entire reconciliation process. After reconciliation, payroll-level statistics are recalculated to ensure consistency between item-level and batch-level states.
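The core of the reconciliation loop is a mapping from the gateway's payment status to our own item status. Pulled out as a pure function, it might look like the sketch below. It only covers the two values the handler above checks for, PAID and FAILED. ItemStatus is a stand-in for the PayrollStatus enum, and nextItemStatus is a hypothetical helper name.

```typescript
// Stand-in for the item-level statuses (illustrative).
type ItemStatus = 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';

// Returns the status the item should move to, or null when nothing changes.
function nextItemStatus(
  current: ItemStatus,
  paymentStatus: string | undefined
): ItemStatus | null {
  if (paymentStatus === 'PAID' && current !== 'COMPLETED') return 'COMPLETED';
  if (paymentStatus === 'FAILED' && current !== 'FAILED') return 'FAILED';
  // Any other value (in-flight, unknown, missing) leaves the item untouched.
  return null;
}

console.log(nextItemStatus('PROCESSING', 'PAID')); // COMPLETED
console.log(nextItemStatus('COMPLETED', 'PAID')); // null
console.log(nextItemStatus('PROCESSING', 'FAILED')); // FAILED
console.log(nextItemStatus('PROCESSING', undefined)); // null
```

Returning null for the no-change case lets the caller count only real updates, which is what the reconciled figure in the response reports.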

Payroll Statistics Update (Internal Helper)

The private updatePayrollStats method recalculates payroll status based on the aggregate states of its payroll items.

private static async updatePayrollStats(payrollId: number): Promise<void> {
const items = await PayrollItemModel.findByPayrollId(payrollId);

    const completed = items.filter(
      (i) => i.status === PayrollStatus.COMPLETED
    ).length;
    const failed = items.filter(
      (i) => i.status === PayrollStatus.FAILED
    ).length;
    const total = items.length;

    let status: PayrollStatus;
    if (completed === total) {
      status = PayrollStatus.COMPLETED;
    } else if (failed === total) {
      status = PayrollStatus.FAILED;
    } else if (completed > 0) {
      status = PayrollStatus.PARTIALLY_COMPLETED;
    } else {
      status = PayrollStatus.PROCESSING;
    }

    await PayrollModel.updateStatus(payrollId, status, completed, failed);

}

This helper determines whether a payroll is fully completed, fully failed, partially completed, or still processing, and updates the payroll record accordingly.

This logic guarantees that the payroll’s summary status always reflects the true execution state of its underlying payments.

Fetching Payroll Status Summary (`getPayrollStatus`)

Next, we’ll implement the getPayrollStatus endpoint. This endpoint provides a comprehensive status snapshot of a payroll. In addition to returning payroll metadata and items, it computes a summary breakdown of completed, failed, pending, and processing items. This endpoint is particularly useful for real-time dashboards, monitoring tools, and operational visibility.

static async getPayrollStatus(req: Request, res: Response): Promise<void> {
try {
const { id } = req.params;
const payroll = await PayrollModel.findById(parseInt(id));

      if (!payroll) {
        res.status(404).json({ error: 'Payroll not found' });
        return;
      }

      const items = await PayrollItemModel.findByPayrollId(payroll.id);

      res.json({
        data: {
          ...payroll,
          items,
          summary: {
            total: items.length,
            completed: items.filter((i) => i.status === PayrollStatus.COMPLETED)
              .length,
            failed: items.filter((i) => i.status === PayrollStatus.FAILED)
              .length,
            pending: items.filter((i) => i.status === PayrollStatus.PENDING)
              .length,
            processing: items.filter(
              (i) => i.status === PayrollStatus.PROCESSING
            ).length,
          },
        },
      });
    } catch (error: any) {
      console.error('Error fetching payroll status:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to fetch payroll status' });
    }
}
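The summary above filters the item list once per status. That's perfectly fine for typical payroll sizes, but the counts can also be gathered in a single pass. Here's a minimal sketch of that alternative, using a stand-in ItemStatus type and a hypothetical summarize helper.

```typescript
// Stand-in for the item-level statuses (illustrative).
type ItemStatus = 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';

interface Summary {
  total: number;
  completed: number;
  failed: number;
  pending: number;
  processing: number;
}

// Walks the list once and increments the matching counter.
function summarize(statuses: ItemStatus[]): Summary {
  const summary: Summary = {
    total: statuses.length,
    completed: 0,
    failed: 0,
    pending: 0,
    processing: 0,
  };

  for (const s of statuses) {
    if (s === 'COMPLETED') summary.completed++;
    else if (s === 'FAILED') summary.failed++;
    else if (s === 'PENDING') summary.pending++;
    else summary.processing++;
  }

  return summary;
}

console.log(summarize(['COMPLETED', 'COMPLETED', 'FAILED', 'PROCESSING']));
```

For the sample input, the result has total 4, completed 2, failed 1, pending 0, and processing 1.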

Authorizing Bulk Transfers (`authorizeBulkTransfer`)

Next, we’ll implement the authorizeBulkTransfer endpoint. Some bulk disbursements require OTP authorization from Monnify. This endpoint accepts a batch reference and authorization code, validates their presence, and forwards them to the Monnify client for verification. Successful authorization allows the bulk transfer to proceed, while failures are clearly reported to the client.

static async authorizeBulkTransfer(
req: Request,
res: Response
): Promise<void> {
try {
const { reference, authorizationCode, payrollId } = req.body;

      if (!reference) {
        res.status(400).json({ error: 'Batch reference is required' });
        return;
      }

      if (!authorizationCode) {
        res.status(400).json({ error: 'Authorization code (OTP) is required' });
        return;
      }

      const result = await monnifyClient.authorizeBulkTransfer(
        reference,
        authorizationCode
      );

      res.json({
        message: 'Bulk transfer authorized successfully',
        data: result,
      });
    } catch (error: any) {
      console.error('Error authorizing bulk transfer:', error);
      res.status(500).json({
        error: error.message || 'Failed to authorize bulk transfer',
      });
    }
}

Here is what’s happening in the code:

First, we read the batch reference, the OTP, and an optional payroll ID from the request body. The payroll ID isn’t used by this handler.
We return a 400 Bad Request if the reference or OTP is missing.
Next, we send the reference and OTP to Monnify to approve the bulk transfer.
If authorization succeeds, we return a JSON confirmation with Monnify’s response.

Checking Transaction Status (`checkTransactionStatus`)

This endpoint allows clients or administrators to query the status of an individual transaction using its reference. It delegates the lookup to the Monnify client and returns the raw response, making it useful for debugging, audits, or manual verification workflows.

static async checkTransactionStatus(
req: Request,
res: Response
): Promise<void> {
try {
const { reference } = req.params;

      if (!reference) {
        res.status(400).json({ error: 'Transaction reference is required' });
        return;
      }

      const status = await monnifyClient.getTransactionStatus(reference);
      res.json({ data: status });
    } catch (error: any) {
      console.error('Error checking transaction status:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to check transaction status' });
    }
}

Checking Wallet Balance (`getAccountBalance`)

This endpoint retrieves the current balance of the Monnify wallet associated with the payroll contract code. It’s typically used for pre-disbursement checks, monitoring available funds, or administrative reporting.

  static async getAccountBalance(req: Request, res: Response): Promise<void> {
    try {
      const balance = await monnifyClient.getAccountBalance();
      res.json({ data: balance });
    } catch (error: any) {
      console.error('Error fetching account balance:', error);
      res
        .status(500)
        .json({ error: error.message || 'Failed to fetch account balance' });
    }
  }

Error Handling and Resilience

All controller methods use structured try–catch blocks to ensure unexpected failures are logged and surfaced as controlled HTTP error responses. This approach prevents sensitive internal errors from leaking while maintaining clarity and debuggability for API consumers.

Role in the Overall Payroll Architecture

The PayrollController acts as the central coordinator of the payroll system. It bridges client requests, domain models, background job processing, and external payment services into a cohesive workflow.

By enforcing state transitions, delegating heavy processing to background workers, and providing reconciliation and monitoring capabilities, this controller ensures payroll execution remains reliable, auditable, and scalable in real-world production environments.

Setting Up Webhook Handlers

Webhooks are essential for receiving real-time payment status updates from Monnify. When a payment completes or fails, Monnify sends a notification to your webhook endpoint.

Start by creating a new file src/routes/monnify.webhook.ts. This file will contain everything related to handling Monnify webhook events.


import { Router, Request, Response } from 'express';
import crypto from 'crypto';
import {
  PayrollItemModel,
  PayrollModel,
  PayrollStatus,
} from '../models/payroll';

const router = Router();

function verifySignature(req: Request): boolean {
  const signature = req.headers['monnify-signature'] as string;
  if (!signature) return false;

  const secret = process.env.MONNIFY_WEBHOOK_SECRET!;
  const hash = crypto
    .createHmac('sha512', secret)
    .update(JSON.stringify(req.body))
    .digest('hex');

  return hash === signature;
}

router.post('/monnify/webhook', async (req: Request, res: Response) => {
  try {
    // Reject requests that do not carry a valid Monnify signature
    if (!verifySignature(req)) {
      console.warn('Invalid Monnify webhook signature');
      return res.status(401).send('Invalid signature');
    }

    console.log('Monnify Webhook:', JSON.stringify(req.body, null, 2));

    const { eventType, eventData } = req.body;

    if (!eventData?.reference) {
      console.warn('Missing reference, ignoring webhook');
      return res.status(200).send('Ignored');
    }

    const paymentReference = eventData.reference;
    const transactionReference = eventData.transactionReference;
    const description = eventData.transactionDescription || '';

    // Parse our reference format: PAYROLL_{payrollId}_{itemId}
    const [prefix, payrollIdStr, itemIdStr] = paymentReference.split('_');

    if (prefix !== 'PAYROLL') {
      console.warn('Invalid payment reference format:', paymentReference);
      return res.status(200).send('Ignored');
    }

    const payrollId = Number(payrollIdStr);
    const itemId = Number(itemIdStr);

    if (isNaN(payrollId) || isNaN(itemId)) {
      console.warn('Invalid payroll/item IDs:', paymentReference);
      return res.status(200).send('Ignored');
    }

    const item = await PayrollItemModel.findById(itemId);

    if (!item) {
      console.warn('Payroll item not found:', itemId);
      return res.status(200).send('Ignored');
    }

    // Idempotency check - don't process already finalized items
    if (
      item.status === PayrollStatus.COMPLETED ||
      item.status === PayrollStatus.FAILED
    ) {
      console.log(`Item ${itemId} already finalized (${item.status})`);
      return res.status(200).send('Already processed');
    }

    // Update status based on event type
    if (
      eventType === 'SUCCESSFUL_DISBURSEMENT' ||
      eventData.status === 'SUCCESS'
    ) {
      await PayrollItemModel.updateStatus(
        itemId,
        PayrollStatus.COMPLETED,
        transactionReference
      );
      console.log(`✅ Payroll item ${itemId} COMPLETED`);
    } else if (
      eventType === 'FAILED_DISBURSEMENT' ||
      eventType === 'REVERSED_DISBURSEMENT' ||
      eventData.status === 'FAILED'
    ) {
      await PayrollItemModel.updateStatus(
        itemId,
        PayrollStatus.FAILED,
        transactionReference,
        description
      );
      console.log(`Payroll item ${itemId} FAILED`);
    } else {
      console.log(`Unhandled Monnify eventType: ${eventType}`);
    }

    // Update overall payroll stats
    await updatePayrollStats(payrollId);

    return res.status(200).send('OK');

  } catch (error: any) {
    console.error('Monnify webhook error:', error.message);
    return res.status(200).send('OK'); // Always return 200 to prevent retries
  }
});

export default router;

async function updatePayrollStats(payrollId: number) {
  const items = await PayrollItemModel.findByPayrollId(payrollId);

  const completed = items.filter(
    (i) => i.status === PayrollStatus.COMPLETED
  ).length;

  const failed = items.filter((i) => i.status === PayrollStatus.FAILED).length;

  let status = PayrollStatus.PROCESSING;

  if (completed === items.length) {
    status = PayrollStatus.COMPLETED;
  } else if (failed === items.length) {
    status = PayrollStatus.FAILED;
  } else if (completed > 0) {
    status = PayrollStatus.PARTIALLY_COMPLETED;
  }

  await PayrollModel.updateStatus(payrollId, status, completed, failed);
}
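
The handler parses the payment reference with split('_') and Number(). That works for well-formed references, but a value such as PAYROLL_12_ would yield an item ID of 0, because Number('') is 0 and passes the isNaN check. A stricter parser could look like the sketch below. The function name parsePayrollReference is an assumption for illustration; the PAYROLL_{payrollId}_{itemId} format comes from the handler itself.

```typescript
interface PayrollReference {
  payrollId: number;
  itemId: number;
}

// Accepts only PAYROLL_<digits>_<digits>; anything else returns null.
function parsePayrollReference(reference: string): PayrollReference | null {
  const match = /^PAYROLL_(\d+)_(\d+)$/.exec(reference);
  if (!match) return null;
  return { payrollId: Number(match[1]), itemId: Number(match[2]) };
}
```

The webhook handler could call this helper once and respond with 'Ignored' whenever it returns null, replacing the separate prefix and isNaN checks.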

Key webhook implementation details:

Signature verification: The verifySignature function validates that webhooks actually come from Monnify.
Idempotency: The handler checks if an item is already finalized before processing.
Always return 200: Even on errors, return 200 to prevent Monnify from retrying indefinitely.
Reference parsing: Our reference format PAYROLL_{payrollId}_{itemId} lets us identify which payment item to update.
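
Two details of signature checking are worth a closer look. Hashing JSON.stringify(req.body) only works if re-serializing the parsed body reproduces exactly the bytes that were signed, and comparing with === is not constant-time. The sketch below is one possible hardening, not Monnify's documented procedure. It assumes you have captured the raw request body as a string (for example through the verify option of express.json) and uses the same HMAC-SHA512 scheme as the handler above. The helper names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Computes the hex-encoded HMAC-SHA512 of the raw request body.
function computeSignature(rawBody: string, secret: string): string {
  return createHmac('sha512', secret).update(rawBody).digest('hex');
}

// Compares the received signature with the expected one in constant time.
// timingSafeEqual throws if the buffers differ in length, so check that first.
function isValidSignature(
  rawBody: string,
  secret: string,
  received: string | undefined
): boolean {
  if (!received) return false;
  const expected = Buffer.from(computeSignature(rawBody, secret), 'utf8');
  const actual = Buffer.from(received, 'utf8');
  return expected.length === actual.length && timingSafeEqual(expected, actual);
}
```

In the route, received would be the value of the monnify-signature header and secret the MONNIFY_WEBHOOK_SECRET environment variable.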

Wiring Up Routes

Employee Routes

We’ll start by defining routes for employee management. These routes expose CRUD operations for employees and simply delegate the actual logic to the EmployeeController.

Create the file src/routes/employee.routes.ts:

import { Router } from 'express';
import { EmployeeController } from '../controllers/employee.controller';

const router = Router();

router.post('/', EmployeeController.createEmployee);
router.get('/', EmployeeController.getAllEmployees);
router.get('/:id', EmployeeController.getEmployeeById);
router.put('/:id', EmployeeController.updateEmployee);
router.delete('/:id', EmployeeController.deleteEmployee);

export default router;

What this gives us:

A clean /api/employees entry point for all employee-related operations
Clear separation between routing (URLs) and business logic (controllers)
A predictable REST structure that’s easy to extend later

Payroll Routes

Next, we define routes for payroll operations. Payroll is more complex than employees, so this router exposes endpoints for creation, processing, reconciliation, authorization, and monitoring.

Create the file src/routes/payroll.routes.ts:

import { Router } from 'express';
import { PayrollController } from '../controllers/payroll.controller';

const router = Router();

router.post('/', PayrollController.createPayroll);
router.get('/', PayrollController.getAllPayrolls);
router.get('/:id', PayrollController.getPayrollById);
router.post('/:id/process', PayrollController.processPayroll);
router.post('/batch/authorize', PayrollController.authorizeBulkTransfer);
router.get('/:id/status', PayrollController.getPayrollStatus);
router.get(
  '/transaction/:reference/status',
  PayrollController.checkTransactionStatus
);
router.get('/account/balance', PayrollController.getAccountBalance);
router.post('/:id/reconcile', PayrollController.reconcilePayroll);

export default router;

What’s happening here:

Each route maps directly to a well-defined payroll operation
Long-running or sensitive actions (processing, reconciliation, authorization) are clearly separated
Monitoring and operational endpoints (status, transaction lookup, balance checks) are first-class citizens

Main Application Entry Point

With all routes defined, we now bring everything together in the main application file. This is where we configure middleware, register routes, and start the server.

Create the file src/index.ts:

import express, { Application, Request, Response } from 'express';
import cors from 'cors';
import helmet from 'helmet';
import dotenv from 'dotenv';
import path from 'path';
import { pool } from './config/database';
import employeeRoutes from './routes/employee.routes';
import payrollRoutes from './routes/payroll.routes';
import monnifyWebhookRoutes from './routes/monnify.webhook';

dotenv.config();

const app: Application = express();
const PORT = process.env.PORT || 3008;

// Middleware
app.use(
  helmet({
    contentSecurityPolicy: false,
  })
);
app.use(
  cors({
    origin: '*',
    methods: ['GET', 'POST', 'PUT', 'DELETE', 'OPTIONS'],
    allowedHeaders: ['Content-Type', 'Authorization'],
  })
);
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

// Health check endpoint
app.get('/health', async (req: Request, res: Response) => {
  try {
    await pool.query('SELECT 1');
    res.json({ status: 'healthy', database: 'connected' });
  } catch (error) {
    res.status(500).json({ status: 'unhealthy', database: 'disconnected' });
  }
});

// Routes
app.use('/api/employees', employeeRoutes);
app.use('/api/payrolls', payrollRoutes);
app.use('/api', monnifyWebhookRoutes);

// 404 handler
app.use((req: Request, res: Response) => {
  res.status(404).json({ error: 'Route not found' });
});

// Error handler
app.use((err: any, req: Request, res: Response, next: any) => {
  console.error('Error:', err);
  res.status(err.status || 500).json({
    error: err.message || 'Internal server error',
  });
});

app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
  console.log(`Environment: ${process.env.NODE_ENV || 'development'}`);
});

// Graceful shutdown
process.on('SIGTERM', async () => {
  console.log('SIGTERM signal received: closing HTTP server');
  await pool.end();
  process.exit(0);
});

process.on('SIGINT', async () => {
  console.log('SIGINT signal received: closing HTTP server');
  await pool.end();
  process.exit(0);
});
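
The signal handlers above log that the HTTP server is closing, but they only end the database pool before exiting. If you also want in-flight requests to finish, you can stop the server first and close the pool afterwards. The sketch below shows that ordering. The shutdown function and the two small interfaces are assumptions made for illustration, so the example does not depend on Express or pg directly.

```typescript
// Minimal structural types: an http.Server and a pg Pool both fit these shapes.
interface ClosableServer {
  close(callback: (err?: Error) => void): unknown;
}

interface EndablePool {
  end(): Promise<void>;
}

// Stops accepting new connections first, then closes the database pool.
function shutdown(server: ClosableServer, pool: EndablePool): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    server.close((err) => {
      // Close the pool even if the server reported an error while closing.
      pool.end().then(() => (err ? reject(err) : resolve()), reject);
    });
  });
}
```

To use it, keep the value returned by app.listen(...) in a variable such as server, then call shutdown(server, pool) from the SIGTERM and SIGINT handlers before process.exit.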

Testing the System

Now let's test the complete payroll flow.

Start the application:

docker-compose up -d
npm run dev

Create employees:

curl -X POST http://localhost:3008/api/employees \
  -H "Content-Type: application/json" \
  -d '{
    "name": "John Doe",
    "email": "john.doe@company.com",
    "salary": 50000,
    "account_number": "0123456789",
    "bank_code": "058",
    "bank_name": "GTBank"
  }'

Create a few more employees with different salaries so you can see how the payroll handles varying amounts.

Create a payroll:

curl -X POST http://localhost:3008/api/payrolls \
  -H "Content-Type: application/json" \
  -d '{
    "payroll_period": "2024-12"
  }'

This creates a payroll with all active employees.

Process the payroll:

curl -X POST http://localhost:3008/api/payrolls/1/process

This queues the payroll for background processing. The system will:

Create a bulk transfer request to Monnify
Update each payroll item with a transaction reference
Wait for webhooks to update final status

Authorize the bulk transfer (if OTP is required):

After processing, Monnify sends an OTP to your registered email. Use it to authorize:

curl -X POST http://localhost:3008/api/payrolls/batch/authorize \
  -H "Content-Type: application/json" \
  -d '{
    "reference": "BATCH_1702123456789",
    "authorizationCode": "123456",
    "payrollId": 1
  }'

Check the payroll status:

curl http://localhost:3008/api/payrolls/1/status

This returns detailed status including a summary of completed, failed, and pending items.

Now, reconcile if needed – if webhooks were missed or you need to sync status:

curl -X POST http://localhost:3008/api/payrolls/1/reconcile

Setting Up Webhooks for Production

For Monnify to send webhooks to your local development environment, you'll need to expose your local server. You can use ngrok:

ngrok http 3008

Then configure the webhook URL in your Monnify dashboard:

https://your-ngrok-url.ngrok.io/api/monnify/webhook

For production, use your actual server URL and ensure HTTPS is enabled.

Once transactions are processed, both the successful and the failed ones will show up on the Monnify dashboard.

Conclusion

You've built a complete payroll system that:

Manages employees with their bank account details
Creates payroll batches with automatic amount calculation
Processes bulk payments using Monnify's disbursement API
Uses background jobs to prevent request timeouts
Handles webhooks for real-time status updates
Supports reconciliation to ensure data consistency

Key Takeaways

Background jobs are essential: Processing payments synchronously would time out for large payrolls. Bull and Redis provide reliable async processing.
Idempotency matters: Both the webhook handler and reconciliation process check current status before updating, preventing duplicate processing.
Bulk transfers save time: Monnify's batch API lets you process hundreds of payments with a single OTP authorization.
Status tracking is critical: The system tracks status at both the payroll and individual item level, making it easy to identify and handle failures.
Reconciliation is your safety net: When webhooks fail or get delayed, the reconciliation endpoint ensures your database stays in sync with actual payment status.
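
The idempotency rule can be expressed as a tiny pure function: once an item reaches a final state, later updates are ignored. The sketch below is a simplified illustration. The status names mirror the ones used in this tutorial (PENDING is assumed), and applyUpdate is a hypothetical helper rather than code from the project.

```typescript
type ItemStatus = 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';

// COMPLETED and FAILED are final: the item has already been settled.
function isFinal(status: ItemStatus): boolean {
  return status === 'COMPLETED' || status === 'FAILED';
}

// Returns the status an item should have after an update arrives.
// Finalized items keep their status, so duplicate or late events are no-ops.
function applyUpdate(current: ItemStatus, incoming: ItemStatus): ItemStatus {
  return isFinal(current) ? current : incoming;
}
```

Because applying the same event twice gives the same result as applying it once, a duplicate webhook or a reconciliation run that overlaps with a webhook cannot flip an item that has already been settled.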

References:

Monnify Docs

How to Deploy a Next.js API with PostgreSQL and Sevalla

Manish Shivanandhan — Mon, 18 Aug 2025 13:54:12 +0000

When developers think of Next.js, they often associate it with SEO-friendly static websites or React-based frontends. But what many miss is how Next.js can also be used to build full-featured backend APIs – all within the same project.

I recently wrote an article on building a Next.js API and deploying it to production. In that article, I used a JSON file as a mini-database.

But JSON or any type of file storage isn’t fit for a production application. This is because file-based storage isn’t designed for concurrent access, so multiple users writing data at the same time can cause corruption or loss.

It also lacks indexing and query capabilities, making it slow as data grows. Backups, security, and scalability are also harder to manage compared to a proper database.

In short, while JSON files work for demos or prototypes, production systems need a database that can handle concurrency, large datasets, complex queries, and reliable persistence.

So in this article, we'll walk through how to build a REST API with Next.js, store data in a Sevalla-managed database, and deploy the whole project to production using Sevalla's PaaS infrastructure.

What is Next.js?
Installation and Setup
How to Build a NextJS API
Provisioning a Database in Sevalla
Deploying to Sevalla
Conclusion

What is Next.js?

Next.js is an open-source React framework developed by Vercel. It's known for server-side rendering, static generation, and seamless routing. But beyond its frontend superpowers, it allows developers to build backend logic and APIs through its file-based routing system. This makes Next.js a great choice for building full-stack apps.

Installation and Setup

To get started, make sure Node.js and NPM are installed.

$ node --version
v22.16.0

$ npm --version
10.9.2

Now, create a new Next.js project:

npx create-next-app@latest

The command will ask you a series of questions to set up your app:

What is your project named? my-app
Would you like to use TypeScript? No / Yes
Would you like to use ESLint? No / Yes
Would you like to use Tailwind CSS? No / Yes
Would you like your code inside a `src/` directory? No / Yes
Would you like to use App Router? (recommended) No / Yes
Would you like to use Turbopack for `next dev`?  No / Yes
Would you like to customize the import alias (`@/*` by default)? No / Yes
What import alias would you like configured? @/*

But for this tutorial, we aren’t interested in a full stack app – just an API. So let’s re-create the app using the --api flag.

$ npx create-next-app@latest --api

It will still ask you a few questions. Use the default settings and finish creating the app.

Once the setup is done, you can see the folder with your app name. Let’s go into the folder and run the app.

$ npm run dev

Your API template should be running at port 3000. Go to http://localhost:3000 and you should see the following message:

{
"message": "Hello world!"
}

How to Build a NextJS API

Now that we’ve set up our API template, let's write a basic REST API with two endpoints: one to create data and one to view data.

The API code will reside under /app within the project directory. Next.js uses file-based routing for building URL paths.

For example, if you want a URL path /users, you should have a directory called “users” with a route.ts file to handle all the CRUD operations for /users. For /users/:id, you should have a directory called [id] under “users” directory with a route.ts file. The square brackets are to tell Next.js that you expect dynamic values for the /users/:id route.
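
To make the convention concrete, here is a small illustrative function that mimics how the location of a route.ts file translates into a URL path. It is not part of Next.js, and the name routeFileToUrlPath is made up for this sketch. It only covers the simple cases described above (plain directories and [param] directories).

```typescript
// Illustrative only: mimics how a route.ts location maps to a URL path.
function routeFileToUrlPath(filePath: string): string {
  const segments = filePath
    .replace(/^\/?app\/?/, '') // drop the leading app/ directory
    .replace(/\/?route\.ts$/, '') // drop the trailing route.ts file
    .split('/')
    .filter((segment) => segment.length > 0)
    // [id] becomes :id to show that the segment is dynamic
    .map((segment) => segment.replace(/^\[(.+)\]$/, ':$1'));
  return '/' + segments.join('/');
}

console.log(routeFileToUrlPath('app/route.ts')); // "/"
console.log(routeFileToUrlPath('app/users/route.ts')); // "/users"
console.log(routeFileToUrlPath('app/users/[id]/route.ts')); // "/users/:id"
```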

Here is a screenshot of the setup. Delete the [slug] directory that comes with the project since it won’t be relevant for us.

The route.ts file at the bottom handles CRUD operations for / (this is where the response “hello world” was generated from)
The route.ts file under /users handles CRUD operations for /users

While this setup can seem complicated for a simple project, it provides a clear structure for large-scale web applications. If you want to go deeper into building complex APIs with Next.js, here is a tutorial you can follow.

The code under /app/route.ts is the default file for our API. You can see it serving the GET request and responding with “Hello World!”:

import { NextResponse } from "next/server";

export async function GET() {
  return NextResponse.json({ message: "Hello world!" });
}

Now we need two routes:

GET /users which lists all users
POST /users which creates a new user

For this project, we’ll use a database to store our records. We’re not going to install a database on our local machine. Instead, we’ll provision the database in the cloud and use it with our API. This approach is common in test / prod environments to ensure data consistency.

Provisioning a Database in Sevalla

Sevalla is a modern, usage-based Platform-as-a-service provider and an alternative to sites like Heroku or to your self-managed setup on AWS. It combines powerful features with a smooth developer experience.

Sevalla offers application hosting, database, object storage, and static site hosting for your projects. It comes with a generous free tier, so we’ll use it to connect to a database as well as deploy our app to the cloud.

If you are new to Sevalla, you can sign up using your GitHub account to enable direct deploys from your GitHub. Every time you push code to your project, Sevalla will auto-pull and deploy your app to the cloud.

Once you log in to Sevalla, click on “Databases”.

Now let’s create a PostgreSQL database.

Use the default settings. Once the database is created, external connections are disabled by default for security, so no one outside our server can connect to it. Since we want to test our connection from our local machine, let’s enable an external connection.

The value we need to connect to the database from our local endpoint is “url” under external connection. Create a file called .env in the project and paste the URL in the below format:

PGSQL_URL=postgres://:-@asia-east1-001.proxy.kinsta.app:30503/

The reason we use .env is to store environment variables specific to the environment. In production, we won’t need this file (never push .env files to GitHub). Sevalla will give us the option to add environment variables via the GUI when we deploy the app.
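
Since the app depends on this variable in every environment, it can help to fail fast with a clear message when it is missing. The sketch below shows one way to do that. The helper name requireEnv is an assumption and does not appear in the project. It takes the environment object as a parameter so that it is easy to test.

```typescript
type Env = Record<string, string | undefined>;

// Returns the variable's value, or throws a descriptive error if it is
// missing or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example usage in a route file:
// const connectionString = requireEnv(process.env, 'PGSQL_URL');
```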

Now let’s test our database connection. Install the pg package for Node to interact with PostgreSQL. Let’s also install @types/pg, which provides the TypeScript type definitions for pg.

$ npm i pg
$ npm install --save-dev @types/pg

Change the route.ts that served “hello world” to the below:

// app/route.ts
import { NextResponse } from "next/server";
import { Client } from "pg";

export async function GET() {
  const client = new Client({
    connectionString: process.env.PGSQL_URL,
  });

  try {
    await client.connect();
    await client.end();
    return NextResponse.json({ message: "Connected to database" });
  } catch (error) {
    console.error("Database connection error:", error);
    return NextResponse.json({ message: "Connection failed" }, { status: 500 });
  }
}

Now when you run your app and go to localhost:3000, it should say “Connected to database”.

Great. Now let’s write our two routes, one to create data and the other to view the data we created. Use this code under users/route.ts:

import { NextResponse } from "next/server";
import type { NextRequest } from "next/server";
import { Client } from "pg";

// Define the structure of a User object
interface User {
  id: string;
  name: string;
  email: string;
  age: number;
}

// Create a PostgreSQL client
function getClient() {
  return new Client({
    connectionString: process.env.PGSQL_URL,
  });
}

// Fetch all users from the database
async function readUsers(): Promise<User[]> {
  const client = getClient();
  await client.connect();

  try {
    const result = await client.query("SELECT id, name, email, age FROM users");
    return result.rows;
  } finally {
    await client.end();
  }
}

// Insert or update users in the database
async function writeUsers(users: User[]) {
  const client = getClient();
  await client.connect();

  try {
    const insertQuery = `
      INSERT INTO users (id, name, email, age)
      VALUES ($1, $2, $3, $4)
      ON CONFLICT (id) DO UPDATE SET
        name = EXCLUDED.name,
        email = EXCLUDED.email,
        age = EXCLUDED.age;
    `;

    for (const user of users) {
      await client.query(insertQuery, [user.id, user.name, user.email, user.age]);
    }
  } finally {
    await client.end();
  }
}

// Handle GET request: return list of users
export async function GET() {
  try {
    const users = await readUsers();
    return NextResponse.json(users);
  } catch (err) {
    console.error("Error reading users from DB:", err);
    return NextResponse.json({ error: "Failed to fetch users" }, { status: 500 });
  }
}

export async function POST(req: NextRequest) {
  try {
    const body = await req.json();
    const users: User[] = Array.isArray(body) ? body : [body];

    await writeUsers(users);

    return NextResponse.json({ success: true, count: users.length });
  } catch (err) {
    console.error("Error writing users to DB:", err);
    return NextResponse.json({ error: "Failed to write users" }, { status: 500 });
  }
}
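
The POST handler above passes the request body to the database without checking its shape. A minimal validation step might look like the sketch below. The function parseUser is hypothetical, and the rules it applies (non-empty strings, a non-negative age) are assumptions you should adapt to your own schema.

```typescript
interface User {
  id: string;
  name: string;
  email: string;
  age: number;
}

// Checks that an unknown value has the fields a User needs.
// Throws an Error describing the first problem it finds.
function parseUser(input: unknown): User {
  if (typeof input !== 'object' || input === null) {
    throw new Error('User must be an object');
  }
  const { id, name, email, age } = input as Record<string, unknown>;

  if (typeof id !== 'string' || id === '') {
    throw new Error('id must be a non-empty string');
  }
  if (typeof name !== 'string' || name === '') {
    throw new Error('name must be a non-empty string');
  }
  if (typeof email !== 'string' || email.indexOf('@') === -1) {
    throw new Error('email must contain "@"');
  }
  if (typeof age !== 'number' || !isFinite(age) || age < 0) {
    throw new Error('age must be a non-negative number');
  }
  return { id, name, email, age };
}
```

In the handler, you could map the incoming array through parseUser before calling writeUsers and return a 400 response when it throws.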

Now when you go to localhost:3000/users, it will give you an error because the users table does not exist yet. So let’s create one.

In the database UI, click on “Studio”. You’ll get a visual editor for your database where you can manage your data directly (pretty cool, right?).

Press the “+” icon and choose “create table”. Create the table with the schema below. Click the “add column” link to create new columns.

Click “create table” and you should see the table created as below:

Let’s add a dummy record using the “add record” button so we can test our API. The id field should be in UUID format (and you can generate one here).

Now let’s test our API.

You should see the user you created as the response to the localhost:3000/users query. Now let’s create a new user using our API.

We’ll use Postman for this since it’s easy to create POST requests with it. We’ll send sample data under “body” → “raw” → “JSON”.

The response from Postman should be as below:

Now going to localhost:3000/users, you should see the new record created.

Great job! Now let’s get this app live.

Deploying to Sevalla

Push your code to GitHub or fork my repository. Now let’s go to Sevalla and create a new app.

Choose your repository from the dropdown and check “Automatic deployment on commit”. This will ensure that the deployment is automatic every time you push code. Choose “Hobby” under the resources section.

Click “Create” and not “Create and deploy”. We haven’t added our PostgreSQL URL as an environment variable, so the app will crash if you try to deploy it.

Go to the “Environment variables” section and add the key “PGSQL_URL” and the URL in the value field.

Now go back to the “Overview” section and click “Deploy now”.

Once deployment is complete, click “Visit app” to get the live URL of your API. You can replace localhost:3000 with the new URL in Postman and test your API.

Congratulations – your app is now live. You can do more with your app using the admin interface, like:

Monitor the performance of your app
Watch real-time logs
Add custom domains
Update network settings (open/close ports for security, and so on)
Add more storage

Conclusion

Next.js is no longer just a frontend framework. It’s a powerful full-stack platform that lets you build and deploy production-ready APIs with minimal friction. By pairing it with Sevalla’s developer-friendly infrastructure, you can go from local development to a live, cloud-hosted API in minutes.

In this tutorial, you learned how to set up a Next.js API project, connect it to a cloud-hosted PostgreSQL database on Sevalla, and deploy everything seamlessly. Whether you're building a small side project or a full-scale application, this stack gives you the speed, structure, and scalability to move fast without losing flexibility.

Hope you enjoyed this article. I’ll see you soon with another one. You can connect with me here or visit my blog.

How to Deploy Your FastAPI + PostgreSQL App on Render: A Beginner's Guide

Preston Osoro — Thu, 22 May 2025 15:55:44 +0000

This guide is a comprehensive roadmap for deploying a FastAPI backend connected to a PostgreSQL database using Render, a cloud platform that supports hosting Python web apps and managed PostgreSQL databases.

You can find the complete source code here.

Deployment Context

When deploying a FastAPI app connected to PostgreSQL, you need to select a platform that supports Python web applications and managed databases. This guide uses Render as the example platform because it provides both web hosting and a PostgreSQL database service in one environment, making it straightforward to connect your backend with the database.

You can apply the concepts here to other cloud providers as well, but the steps will differ depending on the platform’s specifics.

Here’s what we’ll cover:

Project Structure for a Real-World FastAPI App
What You'll Need Before You Start
Deployment Steps
Local Development Workflow
Best Practices and Common Troubleshooting Tips
Common Issues and Solutions
Conclusion

Project Structure

If you’re building a real-world API with FastAPI you’ll quickly outgrow a single main.py file. That’s when modular project structure becomes essential for maintainability.

Here’s an example structure we’ll use throughout this guide:

FastAPI/
├── database/
│   ├── base.py
│   ├── database.py
│   └── __init__.py
├── fastapi_app/
│   └── main.py
├── items/
│   ├── models/
│   │   ├── __init__.py
│   │   └── item.py
│   ├── routes/
│   │   ├── __init__.py
│   │   └── item.py
│   └── schemas/
│       ├── __init__.py
│       └── item.py
├── models/
│   └── __init__.py
├── orders/
│   ├── models/
│   │   ├── __init__.py
│   │   └── order.py
│   ├── routes/
│   │   ├── __init__.py
│   │   └── order.py
│   └── schemas/
│       ├── __init__.py
│       └── order.py
└── users/
    ├── models/
    │   ├── __init__.py
    │   └── user.py
    ├── routes/
    │   ├── __init__.py
    │   └── user.py
    └── schemas/
        ├── __init__.py
        └── user.py

What You'll Need Before You Start

Before diving in, make sure you've got:

A free Render account (sign up if you don't have one)
A GitHub or GitLab repository for your FastAPI project
Basic familiarity with Python, FastAPI, and Git
Your project structure set up similarly to the example above

Deployment Steps

Step 1: Set Up Local PostgreSQL Database

For local development, you'll need to set up PostgreSQL on your machine like this:

-- 1. Log in as superuser
psql -U postgres

-- 2. Create a new database
CREATE DATABASE your_db;

-- 3. Create a user with password
CREATE USER your_user WITH PASSWORD 'your_secure_password';

-- 4. Grant all privileges on the database
GRANT ALL PRIVILEGES ON DATABASE your_db TO your_user;

-- 5. (Optional) Allow the user to create databases
ALTER USER your_user CREATEDB;

-- 6. Exit
\q

After setting up your local database, create a .env file in your project root:

DATABASE_URL=postgresql://your_user:your_secure_password@localhost:5432/your_db

Step 2: Set Up Your Database Connection

Create database/database.py to manage your PostgreSQL connection with SQLAlchemy. This file is crucial, as it creates the database engine, defines session management, and provides a dependency function for your routes:

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
import os
from dotenv import load_dotenv

load_dotenv()

DATABASE_URL = os.getenv("DATABASE_URL")

# The engine manages the connection to the database and handles query execution.
engine = create_engine(DATABASE_URL)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

# Database dependency for routes
def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

And add database/base.py for the base class:

from sqlalchemy.orm import declarative_base
Base = declarative_base()

Step 3: Configure Your FastAPI Main Application

Create the main FastAPI application file, fastapi_app/main.py, which imports all your route modules:

import os
from fastapi import FastAPI, APIRouter
from fastapi.openapi.utils import get_openapi
from fastapi.security import OAuth2PasswordBearer
import uvicorn
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Database imports
from database import Base, engine

# Import models to ensure they're registered with SQLAlchemy
import models

# Import router modules
from items.routes import item_router
from orders.routes import order_router
from users.routes import user_router

# Initialize FastAPI app
app = FastAPI(
    title="Store API",
    version="1.0.0",
    description="API documentation for Store API"
)

# Create database tables on startup
Base.metadata.create_all(bind=engine)

# Root endpoint
@app.get("/")
async def root():
    return {"message": "Welcome to FastAPI Store"}

# Setup versioned API router and include module routers
api_router = APIRouter(prefix="/v1")
api_router.include_router(item_router)
api_router.include_router(order_router)
api_router.include_router(user_router)

# Register the master router with the app
app.include_router(api_router)

# Setup OAuth2 scheme for Swagger UI login flow
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/v1/auth/login")

# Custom OpenAPI schema with security configuration
def custom_openapi():
    if app.openapi_schema:
        return app.openapi_schema

    openapi_schema = get_openapi(
        title=app.title,
        version=app.version,
        description=app.description,
        routes=app.routes,
    )

    # Add security scheme
    openapi_schema["components"]["securitySchemes"] = {
        "BearerAuth": {
            "type": "http",
            "scheme": "bearer",
            "bearerFormat": "JWT",
        }
    }

    # Apply global security requirement
    openapi_schema["security"] = [{"BearerAuth": []}]

    app.openapi_schema = openapi_schema
    return app.openapi_schema

app.openapi = custom_openapi

# Run the app using Uvicorn when executed directly
if __name__ == "__main__":
    port = os.environ.get("PORT")
    if not port:
        raise EnvironmentError("PORT environment variable is not set")
    uvicorn.run("fastapi_app.main:app", host="0.0.0.0", port=int(port), reload=False)

Step 4: Create a Requirements File

In your project root, create a requirements.txt file that includes all the necessary dependencies:

fastapi>=0.68.0
uvicorn>=0.15.0
sqlalchemy>=1.4.23
psycopg2-binary>=2.9.1
python-dotenv>=0.19.0
pydantic>=1.8.2

Step 5: Provision a PostgreSQL Database on Render

Log in to your Render dashboard, then click "New +" in the top right and select "PostgreSQL".

Fill in the details:

Name: your-app-db (choose a descriptive name)
Database: your_app (this will be your database name)
User: leave default (auto-generated)
Region: Choose the closest to your target users
Plan: Free tier

Save and note the Internal Database URL shown after creation, which will look something like this:

postgres://user:password@postgres-instance.render.com/your_app

Step 6: Deploy Your FastAPI App on Render

With your database provisioned, it's time to deploy your API. You can do that by following these steps:

In Render dashboard, click "New +" and select "Web Service"
Connect your GitHub/GitLab repository
Name your service
Then configure the build settings:
- Environment: Python 3
- Build Command: pip install -r requirements.txt
- Start Command: python3 -m fastapi_app.main
Add your environment variables:
- Click "Environment" tab
- Add your database URL:
  - Key: DATABASE_URL
  - Value: Paste the Internal Database URL from your PostgreSQL service
- Add any other environment variables your application needs
Finally, click Deploy Web Service.
- Render will start building and deploying your application
- This process takes a few minutes. You can monitor logs during build and deployment in real-time

Step 7: Test Your API Endpoints

Once deployed, access your API’s URL (for example, https://your-app-name.onrender.com).

Navigate to /docs to open the interactive Swagger UI, where you can test your endpoints directly:

Expand an endpoint
Click Try it out
Provide any required input
Click Execute
View the response

Local Development Workflow

While your app is deployed, you'll still need to work on it locally. Here's how to maintain a smooth development workflow:

First, create a local .env file (don't commit this to Git):

DATABASE_URL=postgresql://username:password@localhost:5432/your_local_db

Then install your dependencies in a virtual environment:

python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

Next, run your local server:

python3 -m fastapi_app.main

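This command triggers the __main__ block in fastapi_app/main.py, which starts the FastAPI app using Uvicorn. It reads PORT from your environment and raises an error if it's missing, so make sure it's set before you start the server (for example, by exporting PORT=8000 in your shell, or by adding it to your .env file if your app loads that file before the __main__ block runs).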

Then make changes to your code and test locally before pushing to GitHub/GitLab. You can push your changes to automatically trigger a new deployment on Render.

Best Practices and Tips

Use database migrations: Add Alembic to your project for managing schema changes
```
 pip install alembic
 alembic init migrations
```

Separate development and production configurations:

import os

if os.environ.get("ENVIRONMENT") == "production":
    # Production settings go here
    pass
else:
    # Development settings go here
    pass

Monitor your application:
- Render provides logs and metrics for your application. You can set up alerts for errors or high resource usage.
Optimize database queries:
- Use SQLAlchemy's relationship loading options.
- Consider adding indexes to frequently queried fields.
Scale when needed:
- Render allows you to upgrade your plan as your application grows. Consider upgrading your database plan for production applications.

Common Issues and Solutions

When deploying a Python web app on Render, a few issues can commonly occur. Here's a more detailed look at them and how you can resolve each one.

Database connection errors:

If your app can’t connect to the database, first double-check that your DATABASE_URL environment variable is correctly set in your Render dashboard. Make sure the URL includes the right username, password, host, port, and database name.
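If you'd like to check the URL programmatically, here's a small sketch that uses Python's standard library to report which pieces are missing. The function name missing_url_parts is hypothetical. It doesn't flag a missing port, since PostgreSQL clients fall back to 5432 when the URL omits it.

```python
from typing import List
from urllib.parse import urlsplit


def missing_url_parts(url: str) -> List[str]:
    """Return the names of connection details that are absent from the URL."""
    parts = urlsplit(url)
    checks = {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database name": parts.path.lstrip("/"),
    }
    # The port is deliberately not checked: it is often omitted.
    return [name for name, value in checks.items() if not value]


if __name__ == "__main__":
    # Placeholder URL with no credentials and no database name.
    print(missing_url_parts("postgresql://localhost/"))
```

An empty list means every component is present. It doesn't prove the values are correct, only that nothing was left out.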

Also, confirm that your SQLAlchemy models match the actual schema in your database. A mismatch here can lead to errors during migrations or app startup. If you're using Postgres, ensure that the database user has permission to read/write tables and perform migrations.

Deployment fails entirely:

When deployment fails, Render usually provides helpful logs under the “Events” tab. Check there for any error messages. A few common culprits include:

A missing requirements.txt file or forgotten dependencies.
A bad start command in the Render settings. Double-check that it points to your correct entry point (for this guide, python3 -m fastapi_app.main; a project with main.py at its root might instead use uvicorn main:app --host 0.0.0.0 --port 10000).
Improper Python version. On Render, you can pin this with the PYTHON_VERSION environment variable (for example, 3.11.1) or a .python-version file in your repository root.

API returns 500 Internal Server errors:

Internal server errors can happen for several reasons. To debug:

Open your Render logs and look for Python tracebacks or unhandled exceptions.
Try to reproduce the issue locally using the same request and data.
Add try/except blocks around critical logic to capture and log errors more gracefully.

Even better, set up structured logging or error tracking (for example, with Sentry) to catch these before your users do.
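To make the try/except advice concrete, here's a minimal sketch using only Python's built-in logging module. The helper name run_safely and the logger name are assumptions for illustration. In a real endpoint you'd more likely re-raise or return an error response than silently return a default.

```python
import logging

logger = logging.getLogger("fastapi_app")


def run_safely(operation, *args, default=None):
    """Call operation(*args); log the traceback and return default on failure."""
    try:
        return operation(*args)
    except Exception:
        # logger.exception logs at ERROR level and appends the full traceback.
        logger.exception(
            "Unhandled error in %s", getattr(operation, "__name__", operation)
        )
        return default


if __name__ == "__main__":
    print(run_safely(int, "42"))                      # parses normally
    print(run_safely(int, "not a number", default=0))  # logs, then falls back
```

The important part is logger.exception, which puts the traceback in your Render logs so you can see where the failure started.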

Slow response times:

If your app is slow or intermittently timing out, check:

Whether you're still on the free Render tier, which has limited CPU and memory. Consider upgrading if you’re handling production-level traffic.
Whether you're running heavy or unoptimized database queries. PostgreSQL's EXPLAIN ANALYZE, or creating your SQLAlchemy engine with echo=True to log the generated SQL, can help you find them.
Whether you're frequently fetching the same data. If so, try caching it using a lightweight in-memory cache like functools.lru_cache or a Redis instance.
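Here's a small sketch of the in-memory caching idea using functools.lru_cache. The function get_country_name and its hard-coded data are stand-ins for a slow database lookup, not code from this project.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def get_country_name(code: str) -> str:
    """Stand-in for a slow database lookup; results are cached per argument."""
    countries = {"NG": "Nigeria", "KE": "Kenya"}
    return countries.get(code, "Unknown")


if __name__ == "__main__":
    get_country_name("NG")  # first call runs the function body
    get_country_name("NG")  # second call is served from the cache
    print(get_country_name.cache_info())
```

Keep in mind that this cache lives inside a single process and never expires on its own. If the underlying data changes, call get_country_name.cache_clear() or use a shared cache such as Redis.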

Conclusion

Deploying a FastAPI app connected to PostgreSQL on Render is straightforward with the right structure and setup. While this guide used Render as an example, the concepts apply broadly across cloud platforms.

With this setup, you can develop, test, and deploy robust Python APIs backed by PostgreSQL databases efficiently.

The free tier on Render has some limitations, including PostgreSQL databases that expire after 90 days unless upgraded. For production applications, consider upgrading to a paid plan for better performance and reliability.

Happy coding!

How to Use PostgreSQL in Django

Udemezue John — Tue, 22 Apr 2025 13:43:52 +0000

If you’re building a Django project and wondering which database to use, PostgreSQL is a great choice. It’s reliable, fast, packed with powerful features, and works beautifully with Django.

I’ve used it across multiple projects – from small web apps to large-scale platforms – and it never disappoints.

In this post, I’ll walk you through how to connect PostgreSQL with Django step-by-step.

Let’s get started.

What we’ll cover:

Why Use PostgreSQL with Django?
What You'll Need
How to Use PostgreSQL in Django
Common Issues (and Fixes)
Optional: Use dj-database-url for Better Settings
Frequently Asked Questions
Wrapping Up
Further Resources

Why Use PostgreSQL with Django?

PostgreSQL is a popular, open-source relational database that’s known for its performance, flexibility, and powerful features like:

Advanced data types (JSON, arrays, and so on)
Full-text search
Support for complex queries
Data integrity and reliability

Django officially recommends PostgreSQL as the most feature-complete database backend it supports. If you're planning to build a serious web application, PostgreSQL is usually the best database to pair with Django.

What You’ll Need

Before we begin, make sure you have the following:

Python installed (3.8 or higher, since Django 4.x doesn't support older versions)
Django installed (I’ll be using version 4.x)
PostgreSQL installed and running
psycopg2 or psycopg2-binary (This is the adapter that lets Django talk to PostgreSQL)

How to Use PostgreSQL in Django

Here is how to get started:

Step 1: Install PostgreSQL

If you haven’t installed PostgreSQL yet, you can grab it from the official PostgreSQL website. It works on Windows, macOS, and Linux.

Make sure you remember the username, password, and database name when you set it up – you’ll need those later.

Step 2: Install the PostgreSQL Adapter for Python

Django needs a little help to connect with PostgreSQL. That’s where psycopg2 comes in.

You can install it using pip:

pip install psycopg2-binary

Tip: The -binary version is easier to install and works for most people. If you run into issues later, you can switch to psycopg2 (non-binary).

Step 3: Create a Django Project (If You Haven’t Yet)

If you haven’t created a project yet, start with:

django-admin startproject myproject
cd myproject

This will give you the basic project structure.

Step 4: Create a PostgreSQL Database

Now, open your PostgreSQL client (like psql or pgAdmin), and create a new database:

CREATE DATABASE mydatabase;
CREATE USER myuser WITH PASSWORD 'mypassword';
ALTER ROLE myuser SET client_encoding TO 'utf8';
ALTER ROLE myuser SET default_transaction_isolation TO 'read committed';
ALTER ROLE myuser SET timezone TO 'UTC';
GRANT ALL PRIVILEGES ON DATABASE mydatabase TO myuser;
-- On PostgreSQL 15+, also make myuser the owner so it can create tables in the public schema
ALTER DATABASE mydatabase OWNER TO myuser;

This sets up a database and user with the right permissions. Replace mydatabase, myuser, and mypassword with whatever values you prefer.

Step 5: Update Django Settings to Use PostgreSQL

Now it’s time to tell Django to use your new PostgreSQL database.

Open myproject/settings.py and look for the DATABASES setting. Replace the default sqlite3 section with this:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydatabase',
        'USER': 'myuser',
        'PASSWORD': 'mypassword',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}

This tells Django to:

Use PostgreSQL (django.db.backends.postgresql)
Connect to a local database called mydatabase
Use the user and password you set up earlier

Step 6: Run Migrations

Now that everything’s connected, let’s create the database tables Django needs:

python manage.py migrate

If everything’s working, you’ll see Django creating tables in PostgreSQL. No errors? You’re good to go!

Step 7: Test the Connection

Let’s test it all by creating a superuser (admin account):

python manage.py createsuperuser

Follow the prompts, then run:

python manage.py runserver

Open your browser and go to http://127.0.0.1:8000/admin. Log in with your superuser account. You’ll be in the Django admin dashboard – and yes, all of this is backed by PostgreSQL now!

Common Issues (and Fixes)

Here are a few things that might trip you up:

Error: psycopg2.errors.UndefinedTable: This usually means you forgot to run migrate.
Can’t connect to database: Double-check your database name, user, and password. Make sure PostgreSQL is running.
Role doesn’t exist: You might’ve forgotten to create the user in PostgreSQL, or you used the wrong name in settings.py.

Optional: Use `dj-database-url` for Better Settings

If you’re planning to deploy your app later (especially on services like Heroku), managing your database settings through a URL is cleaner.

Install the helper package:

pip install dj-database-url

Then in your settings.py:

import dj_database_url

DATABASES = {
    'default': dj_database_url.config(default='postgres://myuser:mypassword@localhost:5432/mydatabase')
}

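dj_database_url.config() reads the DATABASE_URL environment variable and falls back to the default URL when it isn't set. This lets you control your database config from an environment variable, which is more secure and flexible, as long as you keep real credentials out of the default.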
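If you're curious what turning a URL into Django settings involves, here's a rough standard-library sketch. It's a hand-rolled illustration, not dj-database-url's actual implementation, and the function name database_settings_from_url is made up for this example.

```python
from urllib.parse import unquote, urlsplit


def database_settings_from_url(url: str) -> dict:
    """Translate a postgres:// or postgresql:// URL into a Django DATABASES entry."""
    parts = urlsplit(url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": unquote(parts.path.lstrip("/")),
        # Credentials may be percent-encoded in a URL, so decode them.
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        # Django accepts an empty string here to mean the default port.
        "PORT": str(parts.port or ""),
    }


if __name__ == "__main__":
    print(database_settings_from_url(
        "postgres://myuser:mypassword@localhost:5432/mydatabase"
    ))
```

In a real project, the helper package is the better choice, since it also handles other database engines and extra options.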

Frequently Asked Questions

Is PostgreSQL better than SQLite for Django?

For learning or small projects, SQLite is fine. But for serious apps with lots of users or advanced queries, PostgreSQL is much better.

Do I need to install PostgreSQL on my production server?

Yes – unless you’re using a hosted PostgreSQL solution like Amazon RDS, Heroku Postgres, or Supabase.

Is psycopg2-binary safe to use in production?

Yes, for most cases. But the psycopg2 documentation advises using the source-built package (psycopg2) in production, since it links against your system's own libpq and SSL libraries instead of bundled copies.

Can I switch from SQLite to PostgreSQL mid-project?

Yes, but you’ll need to migrate your data. Tools like Django’s dumpdata and loaddata can help with that.

Wrapping Up

Using PostgreSQL in Django is a great step forward when you want to build real, production-ready apps.

The setup is pretty straightforward once you know the steps, and the performance gains are worth it.

Come say hey on X.com/_udemezue and check out my blog while you're at it!

Further Resources

If you want to dive deeper, here are a few links I recommend:
