<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ PostgreSQL - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ PostgreSQL - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Sat, 30 May 2026 16:31:09 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/postgresql/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Use PostgreSQL as a Cache, Queue, and Search Engine ]]>
                </title>
                <description>
                    <![CDATA[ "Just use Postgres" has been circulating as advice for years, but most articles arguing for it are opinion pieces. I wanted hard numbers. So I built a benchmark suite that pits vanilla PostgreSQL agai ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-use-postgresql-as-a-cache-queue-and-search-engine/</link>
                <guid isPermaLink="false">69e7accfe43672781470ff97</guid>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ database ]]>
                    </category>
                
                    <category>
                        <![CDATA[ backend ]]>
                    </category>
                
                    <category>
                        <![CDATA[ performance ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Databases ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Aaron Yong ]]>
                </dc:creator>
                <pubDate>Tue, 21 Apr 2026 16:58:55 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/6fcdd3c0-eead-42a7-b2f0-cf4c6a3d06dc.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>"Just use Postgres" has been circulating as advice for years, but most articles arguing for it are opinion pieces. I wanted hard numbers.</p>
<p>So I built a benchmark suite that pits vanilla PostgreSQL against a feature-optimized PostgreSQL instance — measuring caching, message queues, full-text search, and pub/sub under controlled conditions.</p>
<p>In this article, you'll learn how to use PostgreSQL's built-in features for caching, job queues, full-text search, and pub/sub. You'll see actual benchmark results (latency percentiles, throughput, and error rates) comparing naive PostgreSQL patterns against optimized ones, and understand where PostgreSQL's limits are so you can decide whether you really need that extra service in your stack.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-the-setup">The Setup</a></p>
</li>
<li><p><a href="#heading-benchmark-1-caching-with-unlogged-tables">Benchmark 1: Caching with UNLOGGED Tables</a></p>
</li>
<li><p><a href="#heading-benchmark-2-job-queues-with-skip-locked">Benchmark 2: Job Queues with SKIP LOCKED</a></p>
</li>
<li><p><a href="#heading-benchmark-3-full-text-search-with-tsvector">Benchmark 3: Full-Text Search with tsvector</a></p>
</li>
<li><p><a href="#heading-benchmark-4-pubsub-with-listennotify">Benchmark 4: Pub/Sub with LISTEN/NOTIFY</a></p>
</li>
<li><p><a href="#heading-the-combined-workload-the-honest-test">The Combined Workload: The Honest Test</a></p>
</li>
<li><p><a href="#heading-what-i-learned">What I Learned</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>To follow along or reproduce the benchmarks, you'll need:</p>
<ul>
<li><p>Docker and Docker Compose</p>
</li>
<li><p>Node.js 20+ (for the Express TypeScript API layer)</p>
</li>
<li><p><a href="https://k6.io/">k6</a> for load testing</p>
</li>
<li><p>Basic familiarity with SQL and PostgreSQL</p>
</li>
</ul>
<p>The full benchmark project is <a href="https://github.com/aaronhsyong2/pg-stack-benchmark">open source on GitHub</a> — you can clone it and run every test yourself.</p>
<h2 id="heading-the-setup">The Setup</h2>
<p>The benchmark uses two identical PostgreSQL 17 instances running in Docker containers, each with fixed resource constraints (2 CPUs, 2 GB RAM). Both share the same Express TypeScript API layer — the only difference is which PostgreSQL features are enabled.</p>
<pre><code class="language-plaintext">┌─────────┐     ┌──────────────────┐     ┌─────────────────┐
│   k6    │────&gt;│  Express API     │────&gt;│  PG Baseline    │
│  (load  │     │  (TypeScript)    │     │  (vanilla PG17) │
│  test)  │────&gt;│  Port 3001/3002  │────&gt;│  PG Modded      │
└─────────┘     └──────────────────┘     │  (features on)  │
                                         └─────────────────┘
</code></pre>
<p>The baseline instance uses naïve approaches (regular tables, <code>ILIKE</code> search, polling). The modded instance uses PostgreSQL's built-in features (UNLOGGED tables, <code>tsvector</code> with GIN indexes, <code>LISTEN/NOTIFY</code>, partial indexes). Same hardware, same API code, same data. Only the database features differ.</p>
<p>Both instances share this tuned <code>postgresql.conf</code>:</p>
<pre><code class="language-ini"># Memory allocation
shared_buffers = 512MB           # 25% of available RAM
effective_cache_size = 1536MB    # 75% of RAM — helps the query planner
work_mem = 16MB                  # per-sort/hash operation memory

# SSD-optimized planner settings
random_page_cost = 1.1           # default 4.0 assumes spinning disks
effective_io_concurrency = 200   # allow parallel I/O on SSDs
</code></pre>
<p>These settings matter. The defaults assume spinning disks from the early 2000s. Setting <code>random_page_cost = 1.1</code> tells the query planner that random reads are nearly as fast as sequential reads on SSDs, which encourages index usage over sequential scans.</p>
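<p>To confirm that a running instance actually picked up these values, you can query the <code>pg_settings</code> view. This is a minimal check and is not taken from the benchmark project:</p>
<pre><code class="language-sql">-- Show the effective values and units for the settings tuned above
SELECT name, setting, unit
FROM pg_settings
WHERE name IN (
    'shared_buffers',
    'effective_cache_size',
    'work_mem',
    'random_page_cost',
    'effective_io_concurrency'
)
ORDER BY name;
</code></pre>
<p>With the default block size, <code>shared_buffers</code> is reported in 8 kB pages, so 512MB shows up as 65536.</p>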
<h2 id="heading-benchmark-1-caching-with-unlogged-tables">Benchmark 1: Caching with UNLOGGED Tables</h2>
<p><strong>The idea:</strong> Use an UNLOGGED table as an in-database cache. UNLOGGED tables skip PostgreSQL's Write-Ahead Log (WAL) — the mechanism that guarantees durability. Since cache data is ephemeral by nature, losing it on a crash is acceptable, and skipping WAL removes the biggest write bottleneck.</p>
<pre><code class="language-sql">-- Modded: UNLOGGED table for cache entries
CREATE UNLOGGED TABLE cache_entries (
    key TEXT PRIMARY KEY,
    value JSONB NOT NULL,
    expires_at TIMESTAMPTZ
);

-- Baseline: same schema, but a regular (logged) table
CREATE TABLE cache_entries (
    key TEXT PRIMARY KEY,
    value JSONB NOT NULL,
    expires_at TIMESTAMPTZ
);
</code></pre>
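<p>The queries that read and write the cache aren't shown here, so the following is a minimal sketch of how they might look against the <code>cache_entries</code> schema above. The key <code>'user:42'</code>, the JSON value, and the five-minute TTL are illustrative assumptions, not values from the benchmark:</p>
<pre><code class="language-sql">-- Write: insert or overwrite an entry with a 5-minute TTL (assumed TTL)
INSERT INTO cache_entries (key, value, expires_at)
VALUES ('user:42', '{"name": "Jane"}'::jsonb, NOW() + INTERVAL '5 minutes')
ON CONFLICT (key) DO UPDATE
    SET value = EXCLUDED.value,
        expires_at = EXCLUDED.expires_at;

-- Read: treat expired rows as cache misses
SELECT value
FROM cache_entries
WHERE key = 'user:42'
  AND (expires_at IS NULL OR expires_at &gt; NOW());

-- Cleanup: PostgreSQL has no built-in TTL, so expired rows must be
-- deleted periodically (for example from a scheduled job)
DELETE FROM cache_entries
WHERE expires_at &lt;= NOW();
</code></pre>
<p>Filtering on <code>expires_at</code> at read time means a late cleanup job only costs disk space and never returns stale entries.</p>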
<h3 id="heading-results-200-virtual-users">Results (200 Virtual Users)</h3>
<table>
<thead>
<tr>
<th>Mode</th>
<th>p50</th>
<th>p95</th>
<th>avg</th>
<th>req/s</th>
</tr>
</thead>
<tbody><tr>
<td>Baseline (regular table)</td>
<td>1.87ms</td>
<td>6.00ms</td>
<td>2.50ms</td>
<td>1,754/s</td>
</tr>
<tr>
<td>Modded (UNLOGGED table)</td>
<td>1.71ms</td>
<td>5.24ms</td>
<td>2.17ms</td>
<td>1,760/s</td>
</tr>
</tbody></table>
<p>Latency improves by roughly 9% at p50 and 13% at p95 and on average, while throughput is essentially unchanged. Not dramatic, but nearly free: you change one keyword in your <code>CREATE TABLE</code> statement.</p>
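<p>For a table that already exists, the persistence mode can also be changed in place. A short sketch, assuming the <code>cache_entries</code> table above; note that <code>ALTER TABLE ... SET UNLOGGED</code> rewrites the table, so it is best run while the cache is small or idle:</p>
<pre><code class="language-sql">-- Convert an existing regular table to UNLOGGED
ALTER TABLE cache_entries SET UNLOGGED;

-- Verify: relpersistence is 'u' for unlogged, 'p' for permanent
SELECT relname, relpersistence
FROM pg_class
WHERE relname = 'cache_entries';
</code></pre>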
<h3 id="heading-under-stress-1000-virtual-users-no-sleep">Under Stress (1,000 Virtual Users, No Sleep)</h3>
<table>
<thead>
<tr>
<th>Mode</th>
<th>p50</th>
<th>p95</th>
<th>req/s</th>
<th>Total Requests</th>
</tr>
</thead>
<tbody><tr>
<td>Baseline</td>
<td>83.38ms</td>
<td>143.23ms</td>
<td>7,663/s</td>
<td>728,021</td>
</tr>
<tr>
<td>Modded</td>
<td>77.69ms</td>
<td>126.39ms</td>
<td>8,062/s</td>
<td>765,934</td>
</tr>
</tbody></table>
<p>The improvement holds under load: about 7% at p50 and 12% at p95, with roughly 5% higher throughput. The UNLOGGED advantage is a per-write optimization, so it saves the same amount of WAL I/O per write whether you are doing 100 or 10,000 writes per second. The modded instance served about 38,000 more requests in the same time window.</p>
<h3 id="heading-the-verdict">The Verdict</h3>
<p>UNLOGGED tables won't match Redis for sub-millisecond hot-path caching (real-time bidding, gaming leaderboards). But for web applications where the difference between 2ms and 5ms is invisible to users, they eliminate an entire infrastructure dependency for zero additional complexity.</p>
<p>You do give up Redis data structures (sorted sets, HyperLogLog, streams). If you need those, a dedicated cache is still the right call.</p>
<h2 id="heading-benchmark-2-job-queues-with-skip-locked">Benchmark 2: Job Queues with SKIP LOCKED</h2>
<p><strong>The idea:</strong> Use PostgreSQL as a job queue with <code>SELECT ... FOR UPDATE SKIP LOCKED</code>. Multiple workers poll the same table, and <code>SKIP LOCKED</code> ensures each worker gets a different row — no duplicates, no contention.</p>
<pre><code class="language-sql">-- Queue table with a partial index on pending jobs only
CREATE TABLE job_queue (
    id SERIAL PRIMARY KEY,
    payload JSONB NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Partial index: only indexes pending jobs
-- As jobs complete, they leave the index — it stays small forever
CREATE INDEX idx_pending_jobs ON job_queue (created_at)
    WHERE status = 'pending';
</code></pre>
<p>The dequeue pattern:</p>
<pre><code class="language-sql">-- Atomic dequeue: select + update in one statement
UPDATE job_queue SET status = 'processing'
WHERE id = (
    SELECT id FROM job_queue
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED  -- skip rows locked by other workers
) RETURNING *;
</code></pre>
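<p>How <code>SKIP LOCKED</code> works: Worker A locks row 1. Worker B tries row 1, sees the lock, skips it, and takes row 2 instead. No blocking, no duplicates. If a worker crashes before its transaction commits, the transaction rolls back and the row becomes available again. Once the <code>UPDATE</code> above has committed, though, a job left in <code>processing</code> by a crashed worker stays there until something resets it.</p>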
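<p>For a crashed worker's job to become available again automatically, the row lock has to be held for as long as the job is being processed. One way to sketch this, as a variation on the dequeue above rather than the benchmark's own code, is to keep the transaction open while the worker runs. The <code>'done'</code> status and the id <code>123</code> are placeholders; <code>123</code> stands in for whatever id the <code>SELECT</code> returned:</p>
<pre><code class="language-sql">BEGIN;

-- Claim one pending job; the row lock lasts until COMMIT or ROLLBACK
SELECT id, payload
FROM job_queue
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- ... the worker processes the job here, inside the open transaction ...

-- Mark the claimed job as finished (123 is a placeholder id,
-- 'done' is an assumed status value)
UPDATE job_queue SET status = 'done' WHERE id = 123;

COMMIT;
</code></pre>
<p>The trade-off is that each in-flight job holds a connection and an open transaction for its full duration, which suits short jobs better than long-running ones.</p>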
<h3 id="heading-results-100-producers-50-consumers">Results (100 Producers + 50 Consumers)</h3>
<table>
<thead>
<tr>
<th>Mode</th>
<th>p50</th>
<th>p95</th>
<th>avg</th>
<th>req/s</th>
</tr>
</thead>
<tbody><tr>
<td>Baseline (full index)</td>
<td>1.90ms</td>
<td>5.01ms</td>
<td>2.30ms</td>
<td>1,053/s</td>
</tr>
<tr>
<td>Modded (partial index)</td>
<td>1.81ms</td>
<td>5.28ms</td>
<td>2.29ms</td>
<td>1,052/s</td>
</tr>
</tbody></table>
<p>They're virtually identical. The partial index doesn't show its value in a 60-second benchmark because the table doesn't accumulate enough completed rows for the index size difference to matter. In a production system with millions of completed jobs, the partial index covers only the pending rows and stays small, while a full index keeps growing with the table.</p>
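<p>You can measure the size difference on your own queue. A sketch, where <code>idx_all_jobs</code> is a hypothetical full index created only for comparison:</p>
<pre><code class="language-sql">-- Full index over every row, for comparison only (hypothetical name)
CREATE INDEX idx_all_jobs ON job_queue (created_at);

-- Compare the on-disk size of every index on the queue table
SELECT indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'job_queue';

-- Drop the comparison index when done
DROP INDEX idx_all_jobs;
</code></pre>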
<h3 id="heading-the-verdict">The Verdict</h3>
<p><code>SKIP LOCKED</code> is production-ready for job queues. Libraries like <a href="https://github.com/timgit/pg-boss">pg-boss</a> (Node.js) and <a href="https://github.com/riverqueue/river">river</a> (Go) build on this exact pattern.</p>
<p>You do give up exchange/routing patterns (fan-out, topic-based routing) and consumer groups with message replay. If you need those, a dedicated message broker is still the right tool. For simple "process this job once" workloads, PostgreSQL handles it.</p>
<h2 id="heading-benchmark-3-full-text-search-with-tsvector">Benchmark 3: Full-Text Search with tsvector</h2>
<p><strong>The idea:</strong> Use PostgreSQL's built-in full-text search instead of a separate search service. A <code>tsvector</code> column stores pre-processed search tokens, and a GIN (Generalized Inverted Index) enables fast lookups using the same inverted index concept that powers Elasticsearch.</p>
<pre><code class="language-sql">-- Search-optimized article table
CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT NOT NULL,
    search_vector tsvector  -- pre-computed search tokens
);

-- GIN index for full-text search
CREATE INDEX idx_search ON articles USING GIN (search_vector);

-- Auto-update search_vector on insert/update
CREATE OR REPLACE FUNCTION update_search_vector() RETURNS trigger AS $$
BEGIN
    NEW.search_vector := to_tsvector('english',
        COALESCE(NEW.title, '') || ' ' || COALESCE(NEW.body, ''));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_search
    BEFORE INSERT OR UPDATE ON articles
    FOR EACH ROW EXECUTE FUNCTION update_search_vector();
</code></pre>
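<p>On PostgreSQL 12 and later, a stored generated column is a common alternative to the trigger. This is a sketch of that alternative, not what the benchmark uses; the names <code>articles_generated</code> and <code>idx_search_generated</code> are made up for illustration:</p>
<pre><code class="language-sql">CREATE TABLE articles_generated (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT NOT NULL,
    -- Recomputed by PostgreSQL whenever the row is inserted or updated
    search_vector tsvector GENERATED ALWAYS AS (
        to_tsvector('english', COALESCE(title, '') || ' ' || COALESCE(body, ''))
    ) STORED
);

CREATE INDEX idx_search_generated
    ON articles_generated USING GIN (search_vector);
</code></pre>
<p>The generated column keeps the vector in sync without a separate function and trigger, at the cost of being tied to a single fixed expression.</p>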
<p>The baseline uses <code>ILIKE</code> with a leading wildcard — the approach most developers reach for first:</p>
<pre><code class="language-sql">-- Baseline: sequential scan on every query
SELECT * FROM articles
WHERE title ILIKE '%postgresql%' OR body ILIKE '%postgresql%';

-- Modded: GIN index lookup with relevance ranking
SELECT id, title,
    ts_rank(search_vector, plainto_tsquery('english', 'postgresql')) AS rank
FROM articles
WHERE search_vector @@ plainto_tsquery('english', 'postgresql')
ORDER BY rank DESC LIMIT 20;
</code></pre>
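<p>Two quick checks help when adopting this pattern: looking at what the text search functions produce, and confirming that the planner actually uses the GIN index. A minimal sketch, where the sample sentence is arbitrary:</p>
<pre><code class="language-sql">-- See how text is normalized into lexemes (stop words dropped, words stemmed)
SELECT to_tsvector('english', 'PostgreSQL indexes are fast');
SELECT plainto_tsquery('english', 'fast indexes');

-- Check the plan: with enough rows, expect a Bitmap Index Scan on idx_search
EXPLAIN ANALYZE
SELECT id, title
FROM articles
WHERE search_vector @@ plainto_tsquery('english', 'postgresql');
</code></pre>
<p>On a very small table the planner may still pick a sequential scan, because reading a handful of pages directly is cheaper than going through the index.</p>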
<h3 id="heading-results-500-virtual-users">Results (500 Virtual Users)</h3>
<table>
<thead>
<tr>
<th>Mode</th>
<th>p50</th>
<th>p95</th>
<th>avg</th>
<th>req/s</th>
</tr>
</thead>
<tbody><tr>
<td>Baseline (ILIKE)</td>
<td>1.96ms</td>
<td>101.83ms</td>
<td>25.22ms</td>
<td>561/s</td>
</tr>
<tr>
<td>Modded (tsvector + GIN)</td>
<td>2.76ms</td>
<td>10.39ms</td>
<td>3.76ms</td>
<td>675/s</td>
</tr>
</tbody></table>
<p>This is the standout result. The baseline's p95 of 101ms versus the modded's 10ms is a 10x improvement.</p>
<p>Why the baseline's p50 (1.96ms) is slightly better than the modded's (2.76ms): simple <code>ILIKE</code> queries on small result sets can be fast when the data fits in <code>shared_buffers</code>. But as load increases and the buffer cache is contested, sequential scans degrade dramatically. The GIN index stays stable.</p>
<h3 id="heading-under-stress-500-virtual-users-no-sleep">Under Stress (500 Virtual Users, No Sleep)</h3>
<table>
<thead>
<tr>
<th>Mode</th>
<th>p50</th>
<th>p95</th>
<th>req/s</th>
<th>Total Requests</th>
</tr>
</thead>
<tbody><tr>
<td>Baseline (ILIKE)</td>
<td>599ms</td>
<td>1,000ms</td>
<td>558/s</td>
<td>50,212</td>
</tr>
<tr>
<td>Modded (tsvector)</td>
<td>209ms</td>
<td>396ms</td>
<td>1,441/s</td>
<td>129,679</td>
</tr>
</tbody></table>
<p>ILIKE collapses to 1-second p95 latencies. Each query forces a sequential scan of all 10,000 articles, which burns CPU and buffer-cache bandwidth and starves concurrent queries. The tsvector approach serves 2.6x as many requests in the same time window because a GIN lookup touches only the index entries for the search terms instead of every row.</p>
<h3 id="heading-the-verdict">The Verdict</h3>
<p>This is the strongest argument in the entire benchmark. The fix requires zero extensions — <code>to_tsvector()</code>, <code>plainto_tsquery()</code>, and <code>CREATE INDEX USING GIN</code> are all built into core PostgreSQL. If you're doing <code>WHERE column ILIKE '%term%'</code> on any table with more than a few thousand rows, you're leaving massive performance on the table.</p>
<p>You do give up distributed search across shards, complex analyzers for CJK languages, and aggregation/faceted search pipelines. For a product search bar, blog search, or internal tool — PostgreSQL is enough.</p>
<h2 id="heading-benchmark-4-pubsub-with-listennotify">Benchmark 4: Pub/Sub with LISTEN/NOTIFY</h2>
<p><strong>The idea:</strong> Use PostgreSQL's native <code>LISTEN/NOTIFY</code> for pub/sub messaging, triggered automatically on INSERT via a database trigger.</p>
<pre><code class="language-sql">-- Trigger that fires pg_notify on every new message
CREATE OR REPLACE FUNCTION notify_message() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify(NEW.channel, NEW.payload::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_notify
    AFTER INSERT ON messages
    FOR EACH ROW EXECUTE FUNCTION notify_message();
</code></pre>
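<p>The trigger assumes a <code>messages</code> table with <code>channel</code> and <code>payload</code> columns, which isn't shown here, and that table has to exist before the trigger is created. The schema below is a guess at a minimal version, and the channel name <code>orders</code> is illustrative:</p>
<pre><code class="language-sql">-- Assumed schema; the benchmark's actual table may differ
CREATE TABLE messages (
    id SERIAL PRIMARY KEY,
    channel TEXT NOT NULL,
    payload JSONB NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Session A: subscribe to a channel
LISTEN orders;

-- Session B: publishing is just an INSERT; the trigger calls pg_notify
INSERT INTO messages (channel, payload)
VALUES ('orders', '{"order_id": 1}');
</code></pre>
<p>Notifications are delivered only when the inserting transaction commits, so a rolled-back insert never reaches subscribers.</p>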
<h3 id="heading-results-200-virtual-users">Results (200 Virtual Users)</h3>
<table>
<thead>
<tr>
<th>Mode</th>
<th>p50</th>
<th>p95</th>
<th>avg</th>
<th>req/s</th>
</tr>
</thead>
<tbody><tr>
<td>Baseline (poll-based)</td>
<td>1.99ms</td>
<td>6.04ms</td>
<td>2.84ms</td>
<td>1,116/s</td>
</tr>
<tr>
<td>Modded (LISTEN/NOTIFY)</td>
<td>1.65ms</td>
<td>4.80ms</td>
<td>2.13ms</td>
<td>1,131/s</td>
</tr>
</tbody></table>
<p>Here we have a 20% improvement at p95. The trigger-based approach does more work per INSERT (INSERT + NOTIFY), but the reduced round trips and better connection reuse patterns offset the overhead.</p>
<h3 id="heading-the-verdict">The Verdict</h3>
<p><code>LISTEN/NOTIFY</code> works for real-time features where you would otherwise reach for Redis pub/sub. The main limitations are payload size (under 8,000 bytes in the default configuration) and the requirement for dedicated connections (incompatible with PgBouncer in transaction mode).</p>
<h2 id="heading-the-combined-workload-the-honest-test">The Combined Workload: The Honest Test</h2>
<p>Individual benchmarks are flattering. The real question: can one PostgreSQL instance handle caching, queues, search, and pub/sub simultaneously without degrading?</p>
<h3 id="heading-results-all-four-workloads-running-together">Results (All Four Workloads Running Together)</h3>
<table>
<thead>
<tr>
<th>Mode</th>
<th>p50</th>
<th>p95</th>
<th>avg</th>
<th>req/s</th>
</tr>
</thead>
<tbody><tr>
<td>Baseline</td>
<td>1.65ms</td>
<td>5.24ms</td>
<td>2.17ms</td>
<td>1,424/s</td>
</tr>
<tr>
<td>Modded</td>
<td>1.86ms</td>
<td>6.05ms</td>
<td>2.47ms</td>
<td>1,417/s</td>
</tr>
</tbody></table>
<p>Under combined load, the baseline marginally outperforms the modded setup. The modded PostgreSQL does more work per operation — maintaining GIN indexes, firing triggers, running <code>pg_cron</code> in the background. When all these features are active simultaneously, the overhead is measurable: about 15% higher p95 latency.</p>
<p>But both setups stay comfortably under 10ms at p95. For most web applications, that's more than good enough.</p>
<h2 id="heading-what-i-learned">What I Learned</h2>
<p>After running all these benchmarks, here's what I would tell a team evaluating whether to "just use Postgres":</p>
<ol>
<li><p><strong>Do it for full-text search:</strong> Switching from <code>ILIKE</code> to <code>tsvector</code> with a GIN index is a 10x improvement that requires zero extensions. This is the single highest-ROI change in the entire PostgreSQL ecosystem, and most developers don't know it exists.</p>
</li>
<li><p><strong>Do it for job queues:</strong> <code>SKIP LOCKED</code> is production-ready and eliminates RabbitMQ for simple "process this job" workloads. Use a library like pg-boss or river rather than rolling your own.</p>
</li>
<li><p><strong>Consider it for caching:</strong> UNLOGGED tables give a steady 9 to 13% latency improvement over regular tables. If sub-millisecond latency is not a hard requirement (and for most web apps, it is not), you can drop Redis entirely.</p>
</li>
<li><p><strong>Be honest about the overhead:</strong> With all four workloads running together, the modded instance showed about 15% higher p95 latency than the baseline, because every enabled feature adds work per operation. Whether that matters depends on your latency budget.</p>
</li>
<li><p><strong>Know where to stop:</strong> PostgreSQL won't match Redis for sub-millisecond caching, Kafka for millions of messages per second, or Elasticsearch for distributed multi-node search with complex analyzers. The line is at extreme throughput or extreme specialization.</p>
</li>
</ol>
<p>The honest conclusion is not "PostgreSQL does everything." It is: for most applications, a single well-configured PostgreSQL instance handles 80% of what you would otherwise need three to five additional services for. That is less infrastructure to deploy, monitor, and maintain — and fewer things to break at 3 AM.</p>
<p>Enterprise-scale applications processing millions of messages per second, serving sub-millisecond cache hits to millions of concurrent users, or running distributed search across terabytes of documents will still need specialized tools. Those tools exist for a reason, and at that scale the operational cost of running them is justified by the performance you get back.</p>
<p>But most of us aren't building at that scale — and may never need to. Starting with PostgreSQL for these roles means you ship faster with fewer moving parts. If and when you outgrow what PostgreSQL can handle, your benchmarks will tell you exactly which role needs to be extracted into a dedicated service. That is a much better position than starting with five services on day one because you assumed you would need them.</p>
<p>The <a href="https://github.com/aaronhsyong2/pg-stack-benchmark">benchmark project</a> is open source if you want to reproduce these results or adapt the tests for your own workload.</p>
<p>You can find more of my writing at <a href="https://site.aaronhsyong.com">site.aaronhsyong.com</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How Database Indexes Work – A Practical Guide with PostgreSQL Examples ]]>
                </title>
                <description>
                    <![CDATA[ Every developer eventually runs into a slow query. The table has grown from a few hundred rows to a few million, and what used to take milliseconds now takes seconds — or worse. The fix, more often th ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-database-indexes-work-a-practical-guide-with-postgresql-examples/</link>
                <guid isPermaLink="false">69e11c10ffbb787634dea035</guid>
                
                    <category>
                        <![CDATA[ Databases ]]>
                    </category>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ indexing ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ iyiola ]]>
                </dc:creator>
                <pubDate>Thu, 16 Apr 2026 17:27:44 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/cf6919a4-f803-4783-83ff-5c7674141c55.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Every developer eventually runs into a slow query. The table has grown from a few hundred rows to a few million, and what used to take milliseconds now takes seconds — or worse.</p>
<p>The fix, more often than not, is an index.</p>
<p>A database index is a data structure that helps the database find rows faster without scanning the entire table. It works a lot like the index at the back of a textbook: instead of reading every page to find a topic, you look it up in the index, get the page number, and go straight there.</p>
<p>In this tutorial, you'll learn how indexes work under the hood, how to create and use them effectively in PostgreSQL, and how to avoid the common mistakes that make indexes useless or even harmful.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-why-do-you-need-indexes">Why Do You Need Indexes?</a></p>
</li>
<li><p><a href="#heading-how-indexes-work-under-the-hood">How Indexes Work Under the Hood</a></p>
</li>
<li><p><a href="#heading-how-to-create-your-first-index">How to Create Your First Index</a></p>
</li>
<li><p><a href="#heading-how-to-use-explain-analyze-to-measure-performance">How to Use EXPLAIN ANALYZE to Measure Performance</a></p>
</li>
<li><p><a href="#heading-types-of-indexes-in-postgresql">Types of Indexes in PostgreSQL</a></p>
</li>
<li><p><a href="#heading-how-to-create-a-composite-index">How to Create a Composite Index</a></p>
</li>
<li><p><a href="#heading-how-to-create-a-partial-index">How to Create a Partial Index</a></p>
</li>
<li><p><a href="#heading-how-to-create-an-expression-index">How to Create an Expression Index</a></p>
</li>
<li><p><a href="#heading-how-to-create-a-unique-index">How to Create a Unique Index</a></p>
</li>
<li><p><a href="#heading-how-to-manage-indexes">How to Manage Indexes</a></p>
</li>
<li><p><a href="#heading-when-indexes-hurt-instead-of-help">When Indexes Hurt Instead of Help</a></p>
</li>
<li><p><a href="#heading-common-mistakes-that-prevent-index-usage">Common Mistakes That Prevent Index Usage</a></p>
</li>
<li><p><a href="#heading-best-practices-for-indexing">Best Practices for Indexing</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>To follow along with the examples, you'll need:</p>
<ul>
<li><p>Basic knowledge of SQL (SELECT, INSERT, UPDATE, DELETE, WHERE, JOIN)</p>
</li>
<li><p>A running PostgreSQL instance (version 12 or later)</p>
</li>
<li><p>A SQL client like <code>psql</code>, pgAdmin, or DBeaver</p>
</li>
</ul>
<p>If you don't have PostgreSQL installed locally, you can use a free cloud-hosted instance from services like <a href="https://neon.tech">Neon</a> or <a href="https://supabase.com">Supabase</a>.</p>
<h2 id="heading-why-do-you-need-indexes">Why Do You Need Indexes?</h2>
<p>When you run a query like <code>SELECT * FROM users WHERE email = 'jane@example.com'</code>, the database needs to find the matching row. Without an index, PostgreSQL performs a <strong>sequential scan</strong> — it reads every single row in the table and checks whether the <code>email</code> column matches.</p>
<p>For a table with 100 rows, this is fine. For a table with 10 million rows, it's painfully slow.</p>
<p>An index solves this by creating a separate, sorted data structure that maps column values to their row locations. Instead of scanning 10 million rows, PostgreSQL can look up the value in the index and jump directly to the matching row. This can reduce query time from seconds to milliseconds.</p>
<p>But indexes aren't free. They come with trade-offs you need to understand before adding them everywhere. You'll learn about those trade-offs throughout this tutorial.</p>
<h2 id="heading-how-indexes-work-under-the-hood">How Indexes Work Under the Hood</h2>
<p>PostgreSQL's default index type is the <strong>B-tree</strong> (balanced tree). Understanding how a B-tree works will help you make smarter decisions about when and how to index.</p>
<p>A B-tree organizes data into a sorted, hierarchical structure built from three kinds of nodes:</p>
<ol>
<li><p><strong>Root node</strong> — the top of the tree. It holds a few values that divide the data into broad ranges.</p>
</li>
<li><p><strong>Internal nodes</strong> — each one further narrows down the range.</p>
</li>
<li><p><strong>Leaf nodes</strong> — the bottom level. These hold the actual indexed values along with pointers to the corresponding rows in the table.</p>
</li>
</ol>
<p>When PostgreSQL uses a B-tree index to find a value, it starts at the root and follows the path that matches the target value, moving through internal nodes until it reaches the correct leaf node. This path is called a <strong>tree traversal</strong>, and it typically requires only 3–4 steps even for tables with millions of rows.</p>
<p>Think of it like a phone book. You don't start at page one and read every name. You open to roughly the right section (root), narrow it down to the right page (internal nodes), and scan the entries on that page (leaf node).</p>
<p>This sorted structure is also why B-tree indexes work well for range queries like <code>WHERE price &gt; 50 AND price &lt; 100</code>. The database finds the starting point in the tree and then scans forward through the leaf nodes, which are already in order.</p>
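<p>Here is a small, self-contained sketch of such a range query. The <code>products</code> table, its generated prices, and the index name are all made up for illustration:</p>
<pre><code class="language-sql">-- Hypothetical table with 100,000 rows and prices from 0.99 to 999.99
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    price NUMERIC(10, 2) NOT NULL
);

INSERT INTO products (price)
SELECT (gs % 1000) + 0.99
FROM generate_series(1, 100000) AS gs;

CREATE INDEX idx_products_price ON products (price);
ANALYZE products;

-- Inspect how PostgreSQL plans the range lookup
EXPLAIN
SELECT * FROM products
WHERE price &gt; 50 AND price &lt; 100;
</code></pre>
<p>Depending on how many rows fall in the range, the plan may show an <code>Index Scan</code> or a <code>Bitmap Index Scan</code> on <code>idx_products_price</code>; both start from the same B-tree lookup.</p>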
<h2 id="heading-how-to-create-your-first-index">How to Create Your First Index</h2>
<p>Let's build a practical example. You'll create a table, load it with data, and see the difference an index makes.</p>
<h3 id="heading-step-1-create-the-table-and-insert-sample-data">Step 1 – Create the Table and Insert Sample Data</h3>
<pre><code class="language-sql">CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) NOT NULL,
    city VARCHAR(50),
    created_at TIMESTAMP DEFAULT NOW()
);
</code></pre>
<p>Now insert a large number of rows so the performance difference is visible. This generates 500,000 rows of sample data:</p>
<pre><code class="language-sql">INSERT INTO customers (first_name, last_name, email, city)
SELECT
    'User' || gs,
    'Last' || gs,
    'user' || gs || '@example.com',
    (ARRAY['Lagos', 'London', 'New York', 'Berlin', 'Tokyo'])[1 + (gs % 5)]
FROM generate_series(1, 500000) AS gs;
</code></pre>
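<p>After a bulk load like this, it can help to refresh the planner's statistics right away instead of waiting for autovacuum, so that the plans in the next steps reflect the new data:</p>
<pre><code class="language-sql">-- Collect fresh statistics for the query planner
ANALYZE customers;

-- Sanity check: should match the number of rows generated above
SELECT COUNT(*) FROM customers;
</code></pre>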
<h3 id="heading-step-2-query-without-an-index">Step 2 – Query Without an Index</h3>
<pre><code class="language-sql">EXPLAIN ANALYZE
SELECT * FROM customers WHERE email = 'user250000@example.com';
</code></pre>
<p>You'll see output similar to this:</p>
<pre><code class="language-plaintext">Seq Scan on customers  (cost=0.00..11374.00 rows=1 width=52) (actual time=45.123..91.456 rows=1 loops=1)
  Filter: ((email)::text = 'user250000@example.com'::text)
  Rows Removed by Filter: 499999
Planning Time: 0.085 ms
Execution Time: 91.502 ms
</code></pre>
<p>The key detail here is <code>Seq Scan</code> — PostgreSQL scanned all 500,000 rows to find a single match. It filtered out 499,999 rows. That's a lot of wasted work.</p>
<h3 id="heading-step-3-create-an-index">Step 3 – Create an Index</h3>
<pre><code class="language-sql">CREATE INDEX idx_customers_email ON customers (email);
</code></pre>
<p>This creates a B-tree index on the <code>email</code> column. The name <code>idx_customers_email</code> follows a common naming convention: <code>idx_</code> prefix, then the table name, then the column name.</p>
<h3 id="heading-step-4-query-with-the-index">Step 4 – Query With the Index</h3>
<p>Run the same query again:</p>
<pre><code class="language-sql">EXPLAIN ANALYZE
SELECT * FROM customers WHERE email = 'user250000@example.com';
</code></pre>
<p>Now you'll see something like this:</p>
<pre><code class="language-plaintext">Index Scan using idx_customers_email on customers  (cost=0.42..8.44 rows=1 width=52) (actual time=0.034..0.036 rows=1 loops=1)
  Index Cond: ((email)::text = 'user250000@example.com'::text)
Planning Time: 0.112 ms
Execution Time: 0.058 ms
</code></pre>
<p>The scan type changed from <code>Seq Scan</code> to <code>Index Scan</code>. The execution time dropped from ~91ms to ~0.06ms. That's roughly a 1,500x improvement — from one line of SQL.</p>
<h2 id="heading-how-to-use-explain-analyze-to-measure-performance">How to Use <code>EXPLAIN ANALYZE</code> to Measure Performance</h2>
<p><code>EXPLAIN ANALYZE</code> is your most important tool for understanding how PostgreSQL executes a query. You already saw it in the previous section, but let's break down what the output means.</p>
<pre><code class="language-sql">EXPLAIN ANALYZE SELECT * FROM customers WHERE city = 'Lagos';
</code></pre>
<p>The output will tell you several things:</p>
<ul>
<li><p><strong>Scan type</strong> — whether PostgreSQL used a sequential scan, index scan, bitmap index scan, or another access method</p>
</li>
<li><p><strong>Cost</strong> — the estimated cost in arbitrary units. The first number is the startup cost, the second is the total cost</p>
</li>
<li><p><strong>Rows</strong> — how many rows PostgreSQL estimated it would find versus how many it actually found</p>
</li>
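<li><p><strong>Actual time</strong> — the measured time in milliseconds for that plan node: the first number is the time until the first row was returned, the second is the time until the last row was returned</p>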
</li>
<li><p><strong>Rows Removed by Filter</strong> — how many rows were scanned but didn't match the condition</p>
</li>
</ul>
<p>If you see <code>Seq Scan</code> on a large table with a selective WHERE clause, that's usually a sign you need an index. If you see <code>Index Scan</code> or <code>Index Only Scan</code>, your index is working.</p>
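<p>If timing alone doesn't explain a slow query, the parenthesized option syntax can add more detail. Here's a minimal sketch, assuming the same <code>customers</code> table, that also reports buffer usage:</p>
<pre><code class="language-sql">-- BUFFERS adds shared-buffer hit/read counts to each plan node
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM customers WHERE city = 'Lagos';
</code></pre>
<p>A node with many buffer reads and few hits is pulling most of its data from disk (or the operating system cache) rather than from PostgreSQL's shared buffers.</p>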
<p>One thing to keep in mind: <code>EXPLAIN</code> without <code>ANALYZE</code> shows the plan without actually running the query. <code>EXPLAIN ANALYZE</code> runs the query and shows real timing data. Always use <code>EXPLAIN ANALYZE</code> when you're investigating performance, but be careful with it on destructive queries — <code>EXPLAIN ANALYZE DELETE FROM ...</code> will actually delete the rows. Wrap those in a transaction and roll back:</p>
<pre><code class="language-sql">BEGIN;
EXPLAIN ANALYZE DELETE FROM customers WHERE city = 'Berlin';
ROLLBACK;
</code></pre>
<h2 id="heading-types-of-indexes-in-postgresql">Types of Indexes in PostgreSQL</h2>
<p>PostgreSQL supports several index types, each optimized for different query patterns.</p>
<h3 id="heading-b-tree-default">B-tree (Default)</h3>
<p>B-tree is the default index type and covers the vast majority of use cases. It supports equality checks (<code>=</code>), range queries (<code>&lt;</code>, <code>&gt;</code>, <code>&lt;=</code>, <code>&gt;=</code>, <code>BETWEEN</code>), sorting (<code>ORDER BY</code>), and <code>IS NULL</code> / <code>IS NOT NULL</code> checks.</p>
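<pre><code class="language-sql">-- These two statements are equivalent, because B-tree is the default.
-- Run only one of them: the second would fail because idx_name already exists.
CREATE INDEX idx_name ON customers (last_name);
-- CREATE INDEX idx_name ON customers USING btree (last_name);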
</code></pre>
<p>Use B-tree when you don't have a specific reason to use something else.</p>
<h3 id="heading-hash">Hash</h3>
<p>Hash indexes are optimized purely for equality comparisons (<code>=</code>). They don't support range queries or sorting. In practice, B-tree handles equality checks almost as fast, so hash indexes are rarely necessary.</p>
<pre><code class="language-sql">CREATE INDEX idx_email_hash ON customers USING hash (email);
</code></pre>
<p>Consider a hash index only if you have a very large table with frequent equality-only lookups and want to save a small amount of index space.</p>
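<p>Whether a hash index actually saves space depends on your data, so it's worth measuring. A quick check, assuming both <code>idx_customers_email</code> (B-tree) and <code>idx_email_hash</code> from the examples above exist:</p>
<pre><code class="language-sql">-- Compare the on-disk size of the two indexes on the same column
SELECT
    pg_size_pretty(pg_relation_size('idx_customers_email')) AS btree_size,
    pg_size_pretty(pg_relation_size('idx_email_hash'))      AS hash_size;
</code></pre>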
<h3 id="heading-gin-generalized-inverted-index">GIN (Generalized Inverted Index)</h3>
<p>GIN indexes are designed for values that contain multiple elements — like arrays, JSONB documents, or full-text search vectors. Instead of indexing a single value per row, GIN indexes every element within the value.</p>
<pre><code class="language-sql">-- Add a JSONB column
ALTER TABLE customers ADD COLUMN preferences JSONB DEFAULT '{}';

-- Index the JSONB column
CREATE INDEX idx_preferences ON customers USING gin (preferences);

-- Now this query uses the GIN index
SELECT * FROM customers WHERE preferences @&gt; '{"newsletter": true}';
</code></pre>
<p>Use GIN when you're querying inside JSONB data, searching arrays with <code>@&gt;</code> or <code>&amp;&amp;</code>, or doing full-text search with <code>tsvector</code>.</p>
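<p>The same index type covers full-text search. As a rough sketch (the index name is illustrative, and it assumes the <code>first_name</code> and <code>last_name</code> columns used elsewhere in this tutorial), you could index a <code>tsvector</code> expression:</p>
<pre><code class="language-sql">-- GIN index over a tsvector built from the name columns
CREATE INDEX idx_customers_name_fts ON customers
USING gin (to_tsvector('english', first_name || ' ' || last_name));

-- The query must repeat the same expression for the index to apply
SELECT * FROM customers
WHERE to_tsvector('english', first_name || ' ' || last_name)
      @@ to_tsquery('english', 'adeyemi');
</code></pre>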
<h3 id="heading-gist-generalized-search-tree">GiST (Generalized Search Tree)</h3>
<p>GiST indexes support geometric data, ranges, and full-text search. They're commonly used with PostGIS for geospatial queries.</p>
<pre><code class="language-sql">-- Range type example
CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    duration TSRANGE
);

CREATE INDEX idx_event_duration ON events USING gist (duration);

-- Find overlapping events
SELECT * FROM events WHERE duration &amp;&amp; '[2025-01-01, 2025-01-31]'::tsrange;
</code></pre>
<p>Use GiST when you're working with spatial data, range types, or need overlap/containment operators.</p>
<h3 id="heading-brin-block-range-index">BRIN (Block Range Index)</h3>
<p>BRIN indexes are extremely small and work well on large tables where the physical row order correlates with the indexed column's value. A common example is a timestamp column on an append-only table where new rows always have later timestamps.</p>
<pre><code class="language-sql">CREATE INDEX idx_created_at_brin ON customers USING brin (created_at);
</code></pre>
<p>BRIN stores summary information (min/max values) for each block of rows rather than indexing every row individually. This makes the index much smaller than a B-tree, but it only works well when the data is naturally ordered.</p>
<p>Use BRIN for very large, append-only tables with naturally ordered data — like logs, events, or time-series data.</p>
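<p>One way to check whether a column is a good BRIN candidate is the planner's own correlation statistic. This sketch assumes the table has been analyzed recently:</p>
<pre><code class="language-sql">-- correlation near 1 or -1 means physical row order tracks the column's values
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'customers'
  AND attname = 'created_at';
</code></pre>
<p>Values close to 0 suggest the rows are scattered, in which case a BRIN index is unlikely to help.</p>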
<h2 id="heading-how-to-create-a-composite-index">How to Create a Composite Index</h2>
<p>A composite index (also called a multi-column index) covers more than one column. It's useful when your queries frequently filter or sort by multiple columns together.</p>
<pre><code class="language-sql">CREATE INDEX idx_city_lastname ON customers (city, last_name);
</code></pre>
<p>The order of columns in a composite index matters. PostgreSQL can use this index for queries that filter on <code>city</code> alone, or on both <code>city</code> and <code>last_name</code>. But it <strong>can't</strong> efficiently use this index for queries that filter only on <code>last_name</code>.</p>
<p>Think of it like a phone book sorted by city first, then by last name within each city. You can easily look up everyone in Lagos. You can also look up everyone named "Adeyemi" in Lagos. But finding all people named "Adeyemi" across all cities requires scanning the whole book.</p>
<p>This principle is called the <strong>leftmost prefix rule</strong>: PostgreSQL can use a composite index efficiently for queries that constrain the leftmost column(s) of the index, but generally not for queries that skip them.</p>
<pre><code class="language-sql">-- ✅ Uses the index (matches leftmost column)
SELECT * FROM customers WHERE city = 'Lagos';

-- ✅ Uses the index (matches both columns, left to right)
SELECT * FROM customers WHERE city = 'Lagos' AND last_name = 'Adeyemi';

-- ❌ Cannot use this index efficiently (skips the leftmost column)
SELECT * FROM customers WHERE last_name = 'Adeyemi';
</code></pre>
<p>When deciding column order, place the most selective column first — the one that narrows down the results the most.</p>
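<p>A composite index can also serve a filter and a sort at once. A sketch using the <code>idx_city_lastname</code> index from above:</p>
<pre><code class="language-sql">-- Entries for one city are kept in last_name order inside the index,
-- so PostgreSQL can often skip a separate sort step
EXPLAIN ANALYZE
SELECT * FROM customers
WHERE city = 'Lagos'
ORDER BY last_name
LIMIT 20;
</code></pre>
<p>If the plan shows an <code>Index Scan</code> on <code>idx_city_lastname</code> with no <code>Sort</code> node, the index is handling both jobs.</p>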
<h2 id="heading-how-to-create-a-partial-index">How to Create a Partial Index</h2>
<p>A partial index covers only a subset of rows in a table. You define the subset with a WHERE clause in the index definition.</p>
<p>This is useful when you only query a specific portion of the data. For example, if you have an <code>orders</code> table and you frequently query for pending orders but rarely look at completed ones:</p>
<pre><code class="language-sql">CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    customer_id INT NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    total NUMERIC(10, 2),
    created_at TIMESTAMP DEFAULT NOW()
);

-- Only index rows where status is 'pending'
CREATE INDEX idx_orders_pending ON orders (customer_id)
WHERE status = 'pending';
</code></pre>
<p>This index is smaller than a full index because it skips all rows that don't match the WHERE condition. Smaller indexes use less disk space, consume less memory, and are faster to maintain during writes.</p>
<p>For the index to be used, your query's WHERE clause must match the index's condition:</p>
<pre><code class="language-sql">-- ✅ Uses the partial index
SELECT * FROM orders WHERE status = 'pending' AND customer_id = 42;

-- ❌ Cannot use the partial index (different status)
SELECT * FROM orders WHERE status = 'shipped' AND customer_id = 42;
</code></pre>
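<p>You can confirm which queries qualify with <code>EXPLAIN</code>. A quick sketch (the exact plan depends on table size and statistics, so on a near-empty table you may see a sequential scan either way):</p>
<pre><code class="language-sql">-- idx_orders_pending is a candidate here once the table has enough rows
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND customer_id = 42;

-- idx_orders_pending can never appear here: the predicate doesn't match
EXPLAIN SELECT * FROM orders WHERE status = 'shipped' AND customer_id = 42;
</code></pre>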
<h2 id="heading-how-to-create-an-expression-index">How to Create an Expression Index</h2>
<p>Sometimes you need to index the result of a function or expression rather than a raw column value. Expression indexes (also called functional indexes) handle this.</p>
<p>A common scenario is case-insensitive email lookups. If your queries use <code>LOWER(email)</code>, a regular index on <code>email</code> won't help — PostgreSQL sees the function call as a different expression.</p>
<pre><code class="language-sql">-- Regular index on email – won't help with LOWER() queries
CREATE INDEX idx_email ON customers (email);

-- This query does NOT use the index above
SELECT * FROM customers WHERE LOWER(email) = 'user100@example.com';
</code></pre>
<p>To fix this, create an index on the expression itself:</p>
<pre><code class="language-sql">CREATE INDEX idx_email_lower ON customers (LOWER(email));
</code></pre>
<p>Now queries that use <code>LOWER(email)</code> in their WHERE clause will use this index:</p>
<pre><code class="language-sql">-- ✅ Uses the expression index
SELECT * FROM customers WHERE LOWER(email) = 'user100@example.com';
</code></pre>
<p>The rule is straightforward: the expression in your query must match the expression in the index exactly. If the index is on <code>LOWER(email)</code>, your query must also use <code>LOWER(email)</code>.</p>
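<p>After creating an expression index, it can help to refresh statistics, since PostgreSQL only gathers statistics on the indexed expression when the table is analyzed. A minimal follow-up:</p>
<pre><code class="language-sql">-- Collect statistics for the new expression, then verify the plan
ANALYZE customers;

EXPLAIN
SELECT * FROM customers WHERE LOWER(email) = 'user100@example.com';
</code></pre>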
<h2 id="heading-how-to-create-a-unique-index">How to Create a Unique Index</h2>
<p>A unique index guarantees that no two rows have the same value (or combination of values) in the indexed columns. It serves a dual purpose: it enforces data integrity and provides fast lookups.</p>
<pre><code class="language-sql">CREATE UNIQUE INDEX idx_customers_email_unique ON customers (email);
</code></pre>
<p>If you try to insert a duplicate value, PostgreSQL will reject the operation:</p>
<pre><code class="language-sql">INSERT INTO customers (first_name, last_name, email, city)
VALUES ('Test', 'User', 'user1@example.com', 'Lagos');
-- ERROR: duplicate key value violates unique constraint "idx_customers_email_unique"
</code></pre>
<p>You might wonder how this differs from a UNIQUE constraint. Under the hood, PostgreSQL implements UNIQUE constraints by creating a unique index. The two are functionally identical.</p>
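<p>The difference is mostly intent and flexibility: a UNIQUE constraint declares a data integrity rule as part of the table definition, while <code>CREATE UNIQUE INDEX</code> also lets you enforce uniqueness on an expression or on a subset of rows, which a constraint can't do.</p>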
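<p>A unique index also lets you handle duplicates gracefully instead of catching an error. A sketch, assuming the unique index on <code>email</code> from above:</p>
<pre><code class="language-sql">-- Skip the insert silently if the email already exists
INSERT INTO customers (first_name, last_name, email, city)
VALUES ('Test', 'User', 'user1@example.com', 'Lagos')
ON CONFLICT (email) DO NOTHING;
</code></pre>
<p><code>ON CONFLICT (email)</code> only works when a unique index or constraint exists on that column.</p>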
<h2 id="heading-how-to-manage-indexes">How to Manage Indexes</h2>
<p>As your database grows, you'll need to inspect, monitor, and maintain your indexes.</p>
<h3 id="heading-how-to-list-all-indexes-on-a-table">How to List All Indexes on a Table</h3>
<pre><code class="language-sql">SELECT
    indexname,
    indexdef
FROM pg_indexes
WHERE tablename = 'customers';
</code></pre>
<p>This shows the name and full definition of every index on the table.</p>
<h3 id="heading-how-to-check-index-size">How to Check Index Size</h3>
<pre><code class="language-sql">SELECT
    pg_size_pretty(pg_relation_size('idx_customers_email')) AS index_size;
</code></pre>
<p>For a broader view of all indexes and their sizes:</p>
<pre><code class="language-sql">SELECT
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'customers'
ORDER BY pg_relation_size(indexrelid) DESC;
</code></pre>
<h3 id="heading-how-to-find-unused-indexes">How to Find Unused Indexes</h3>
<p>Indexes that are never used waste disk space and slow down writes. You can find them by checking <code>pg_stat_user_indexes</code>:</p>
<pre><code class="language-sql">SELECT
    indexrelname AS index_name,
    idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'customers'
AND idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
</code></pre>
<p>If an index has <code>idx_scan = 0</code> after a reasonable period of normal usage, it's a candidate for removal. Just make sure to check across a full business cycle — some indexes are only used during monthly reports or seasonal operations.</p>
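<p>To widen the search to every table, you can join against the <code>pg_index</code> catalog and leave out unique and primary key indexes, since those enforce constraints even if they are never scanned. A sketch:</p>
<pre><code class="language-sql">-- Unused, non-constraint indexes across all user tables, largest first
SELECT
    s.schemaname,
    s.relname      AS table_name,
    s.indexrelname AS index_name,
    s.idx_scan     AS times_used,
    pg_size_pretty(pg_relation_size(s.indexrelid)) AS size
FROM pg_stat_user_indexes AS s
JOIN pg_index AS i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisunique
  AND NOT i.indisprimary
ORDER BY pg_relation_size(s.indexrelid) DESC;
</code></pre>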
<h3 id="heading-how-to-drop-an-index">How to Drop an Index</h3>
<pre><code class="language-sql">DROP INDEX IF EXISTS idx_customers_email;
</code></pre>
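<p>A plain <code>DROP INDEX</code> takes an exclusive lock on the table, which blocks both reads and writes until it finishes. If you're dropping an index on a production table and want to avoid that, use <code>CONCURRENTLY</code> (note that it can't run inside a transaction block):</p>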
<pre><code class="language-sql">DROP INDEX CONCURRENTLY IF EXISTS idx_customers_email;
</code></pre>
<h3 id="heading-how-to-rebuild-an-index">How to Rebuild an Index</h3>
<p>Over time, indexes can become bloated as rows are inserted, updated, and deleted. You can rebuild an index to reclaim space:</p>
<pre><code class="language-sql">REINDEX INDEX idx_customers_email;
</code></pre>
<p>Or rebuild all indexes on a table:</p>
<pre><code class="language-sql">REINDEX TABLE customers;
</code></pre>
<p>On production systems, use <code>REINDEX CONCURRENTLY</code> (PostgreSQL 12+) to avoid blocking writes to the table while the index is rebuilt:</p>
<pre><code class="language-sql">REINDEX INDEX CONCURRENTLY idx_customers_email;
</code></pre>
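<p>If a <code>CONCURRENTLY</code> operation fails or is cancelled partway through, it can leave an invalid index behind. One way to spot these is to query the <code>pg_index</code> catalog:</p>
<pre><code class="language-sql">-- List indexes that are marked invalid and need to be dropped or rebuilt
SELECT indexrelid::regclass AS index_name,
       indrelid::regclass   AS table_name
FROM pg_index
WHERE NOT indisvalid;
</code></pre>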
<h2 id="heading-when-indexes-hurt-instead-of-help">When Indexes Hurt Instead of Help</h2>
<p>Indexes aren't free. Every index you add comes with costs:</p>
<ol>
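<li><p><strong>Write overhead</strong> — every INSERT adds an entry to every index on the table, and most UPDATEs do too (PostgreSQL can skip the index work only when no indexed column changes and the new row version fits on the same page). DELETEs leave dead index entries behind for vacuum to clean up later. If a table has 10 indexes and you insert a row, PostgreSQL performs 11 write operations (one for the table and one for each index). On write-heavy tables, excessive indexes can significantly slow down data modification.</p>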
</li>
<li><p><strong>Storage cost</strong> — indexes consume disk space. On large tables, indexes can take up as much space as the table itself, sometimes more. You can check this with <code>pg_relation_size</code>.</p>
</li>
<li><p><strong>Memory consumption</strong> — PostgreSQL caches frequently used indexes in memory. More indexes means more memory pressure, which can push useful data out of the cache and slow down other queries.</p>
</li>
<li><p><strong>Maintenance burden</strong> — indexes need periodic maintenance (vacuuming, reindexing) and add complexity to schema migrations.</p>
</li>
</ol>
<p>The question to ask is not "should I add an index?" but rather "does the read performance gain justify the write performance cost for this table's workload?"</p>
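<p>To put numbers on the storage side of that trade-off, you can compare a table's size with the combined size of its indexes. A minimal sketch for the <code>customers</code> table:</p>
<pre><code class="language-sql">SELECT
    pg_size_pretty(pg_table_size('customers'))          AS table_size,
    pg_size_pretty(pg_indexes_size('customers'))        AS all_indexes_size,
    pg_size_pretty(pg_total_relation_size('customers')) AS total_size;
</code></pre>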
<h2 id="heading-common-mistakes-that-prevent-index-usage">Common Mistakes That Prevent Index Usage</h2>
<p>You can have the perfect index and PostgreSQL might still ignore it. Here are the most common reasons.</p>
<h3 id="heading-wrapping-the-indexed-column-in-a-function">Wrapping the Indexed Column in a Function</h3>
<pre><code class="language-sql">-- Index on email
CREATE INDEX idx_email ON customers (email);

-- ❌ PostgreSQL cannot use the index because of LOWER()
SELECT * FROM customers WHERE LOWER(email) = 'user1@example.com';

-- ✅ Fix: create an expression index on LOWER(email)
CREATE INDEX idx_email_lower ON customers (LOWER(email));
</code></pre>
<p>Any function applied to the indexed column in a WHERE clause prevents the standard index from being used. You need an expression index that matches the function.</p>
<h3 id="heading-implicit-type-casting">Implicit Type Casting</h3>
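<pre><code class="language-sql">-- id is an INTEGER column with an index
-- ❌ Comparing against a NUMERIC value makes PostgreSQL cast the column, which prevents index usage
SELECT * FROM customers WHERE id = 42::numeric;

-- ✅ Use the column's own type
SELECT * FROM customers WHERE id = 42;
</code></pre>
<p>When the query's value type doesn't match the column type, PostgreSQL may cast the column to match, which prevents index usage. A plain quoted literal such as <code>'42'</code> is fine, because PostgreSQL resolves it to the column's type. The problem arises with values that already carry a different type, for example a parameter that your driver sends as <code>numeric</code>.</p>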
<h3 id="heading-using-or-conditions-across-different-columns">Using OR Conditions Across Different Columns</h3>
<pre><code class="language-sql">-- ❌ OR across different columns can prevent index usage
SELECT * FROM customers WHERE email = 'user1@example.com' OR city = 'Lagos';

-- ✅ Rewrite as UNION for better index utilization
SELECT * FROM customers WHERE email = 'user1@example.com'
UNION
SELECT * FROM customers WHERE city = 'Lagos';
</code></pre>
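<p>Before rewriting a query this way, it's worth checking what the planner already does. When both columns have their own index, PostgreSQL can sometimes combine them with a <code>BitmapOr</code>. A sketch, assuming a hypothetical single-column index on <code>city</code> alongside <code>idx_customers_email</code>:</p>
<pre><code class="language-sql">-- Hypothetical index name, created only for this check
CREATE INDEX IF NOT EXISTS idx_customers_city ON customers (city);

-- Look for a BitmapOr node combining the two indexes
EXPLAIN
SELECT * FROM customers WHERE email = 'user1@example.com' OR city = 'Lagos';
</code></pre>
<p>If the plan already shows a <code>BitmapOr</code>, the <code>UNION</code> rewrite may not buy you anything, so compare both versions with <code>EXPLAIN ANALYZE</code>.</p>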
<h3 id="heading-leading-wildcards-in-like-queries">Leading Wildcards in LIKE Queries</h3>
<pre><code class="language-sql">-- ❌ Leading wildcard cannot use a B-tree index
SELECT * FROM customers WHERE email LIKE '%@example.com';

-- ✅ Trailing wildcard CAN use a B-tree index
SELECT * FROM customers WHERE email LIKE 'user1%';
</code></pre>
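<p>A B-tree index is sorted from left to right. A leading wildcard (<code>%something</code>) means the database can't use the sorted structure and falls back to a sequential scan. Even the trailing-wildcard form only uses a B-tree index if the database uses the C collation or the index was created with a pattern operator class such as <code>varchar_pattern_ops</code> or <code>text_pattern_ops</code>. If you need to search by suffix or substring, consider a GIN index with the <code>pg_trgm</code> extension.</p>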
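<p>As a rough sketch of the trigram approach (the index name is illustrative, and installing the extension may require elevated privileges):</p>
<pre><code class="language-sql">CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- A trigram GIN index supports LIKE patterns with leading wildcards
CREATE INDEX idx_customers_email_trgm ON customers
USING gin (email gin_trgm_ops);

SELECT * FROM customers WHERE email LIKE '%@example.com';
</code></pre>
<p>Keep in mind that the planner will still choose a sequential scan when the pattern matches most of the table, as it would if every address ends in <code>@example.com</code>.</p>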
<h3 id="heading-low-selectivity">Low Selectivity</h3>
<p>If a column has very few distinct values relative to the number of rows (low selectivity), PostgreSQL may decide a sequential scan is faster than using the index.</p>
<p>For example, if a <code>status</code> column has only three possible values (<code>'pending'</code>, <code>'shipped'</code>, <code>'delivered'</code>) and each value covers roughly a third of the table, an index on <code>status</code> alone provides little benefit. PostgreSQL would still need to read a large portion of the table, and the extra index lookup adds overhead.</p>
<p>A partial index is often the better solution in these cases.</p>
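<p>To see how selective a column really is, you can look at the statistics the planner uses. A sketch, assuming the <code>orders</code> table has data and has been analyzed:</p>
<pre><code class="language-sql">-- n_distinct: estimated number of distinct values (negative means a fraction of the row count)
-- most_common_vals / most_common_freqs: the top values and the share of rows each covers
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders'
  AND attname = 'status';
</code></pre>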
<h2 id="heading-best-practices-for-indexing">Best Practices for Indexing</h2>
<p>Here's a summary of the key principles to follow:</p>
<ol>
<li><p><strong>Index columns that appear in WHERE, JOIN, and ORDER BY clauses.</strong> These are the columns the database needs to search, match, or sort by. Start with the queries that run most frequently or take the longest.</p>
</li>
<li><p><strong>Measure before and after with EXPLAIN ANALYZE.</strong> Never add an index based on guesswork. Run your query with <code>EXPLAIN ANALYZE</code>, add the index, and run it again. If the execution time doesn't improve meaningfully, the index isn't helping.</p>
</li>
<li><p><strong>Don't index every column.</strong> Each index slows down writes and consumes storage. Be deliberate about which columns you index based on actual query patterns.</p>
</li>
<li><p><strong>Use composite indexes for multi-column filters.</strong> If your queries commonly filter on <code>city</code> and <code>last_name</code> together, a composite index on <code>(city, last_name)</code> is more efficient than two separate single-column indexes.</p>
</li>
<li><p><strong>Put the most selective column first in composite indexes.</strong> The column that narrows the results the most should come first.</p>
</li>
<li><p><strong>Use partial indexes when you only query a subset of data.</strong> If 90% of your queries target rows where <code>status = 'active'</code>, a partial index on that subset is smaller and faster than a full index.</p>
</li>
<li><p><strong>Monitor index usage regularly.</strong> Query <code>pg_stat_user_indexes</code> to find unused indexes and remove them.</p>
</li>
<li><p><strong>Rebuild bloated indexes periodically.</strong> On tables with heavy update/delete activity, indexes can become bloated. Use <code>REINDEX CONCURRENTLY</code> on production systems.</p>
</li>
</ol>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this tutorial, you learned what database indexes are and why they matter for query performance. You explored how B-tree indexes work under the hood, created several types of indexes (single-column, composite, partial, expression, and unique), and used <code>EXPLAIN ANALYZE</code> to measure the impact.</p>
<p>You also learned about the trade-offs indexes introduce — write overhead, storage cost, and memory pressure — and the common mistakes that silently prevent PostgreSQL from using your indexes.</p>
<p>The core principle is simple: index deliberately based on your actual query patterns, measure the results, and remove anything that isn't pulling its weight.</p>
<p>If you found this tutorial helpful, you can find more of my writing on <a href="https://freecodecamp.org/news/author/iyiola">freeCodeCamp</a> and connect with me on <a href="https://linkedin.com/in/iyioladev">LinkedIn</a> and <a href="https://x.com/iyiola_dev_">X</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ What Are Database Triggers? A Practical Introduction with PostgreSQL Examples ]]>
                </title>
                <description>
                    <![CDATA[ If you've ever needed your database to automatically respond to changes – like logging every update to a sensitive table, enforcing a business rule before an insert, or syncing derived data after a de ]]>
                </description>
                <link>https://www.freecodecamp.org/news/what-are-database-triggers-practical-intro-with-postgresql-examples/</link>
                <guid isPermaLink="false">69c6d1357cf270651037755c</guid>
                
                    <category>
                        <![CDATA[ Databases ]]>
                    </category>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ iyiola ]]>
                </dc:creator>
                <pubDate>Fri, 27 Mar 2026 18:49:25 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/b5940820-d1aa-4d10-8b40-06005bec7e60.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>If you've ever needed your database to automatically respond to changes – like logging every update to a sensitive table, enforcing a business rule before an insert, or syncing derived data after a delete – then triggers are the tool you're looking for.</p>
<p>A database trigger is a function that the database executes automatically when a specific event occurs on a table. You don't call it manually. Instead, you define the conditions, and the database handles the rest.</p>
<p>In this tutorial, you'll learn what triggers are, how they work, when to use them, and when to avoid them. You'll work through practical examples using PostgreSQL, but the core concepts apply to most relational databases.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-how-triggers-work">How Triggers Work</a></p>
</li>
<li><p><a href="#heading-how-to-create-your-first-trigger">How to Create Your First Trigger</a></p>
</li>
<li><p><a href="#heading-before-vs-after-triggers">BEFORE vs AFTER Triggers</a></p>
</li>
<li><p><a href="#heading-how-to-build-an-audit-log-with-an-after-trigger">How to Build an Audit Log with an AFTER Trigger</a></p>
</li>
<li><p><a href="#heading-how-to-use-a-before-trigger-for-validation">How to Use a BEFORE Trigger for Validation</a></p>
</li>
<li><p><a href="#heading-row-level-vs-statement-level-triggers">Row-Level vs Statement-Level Triggers</a></p>
</li>
<li><p><a href="#heading-the-new-and-old-variables-reference">The NEW and OLD Variables Reference</a></p>
</li>
<li><p><a href="#heading-how-to-manage-triggers">How to Manage Triggers</a></p>
</li>
<li><p><a href="#heading-when-to-use-triggers">When to Use Triggers</a></p>
</li>
<li><p><a href="#heading-when-to-avoid-triggers">When to Avoid Triggers</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>To follow along with the examples, you'll need:</p>
<ul>
<li><p>Basic knowledge of SQL (SELECT, INSERT, UPDATE, DELETE)</p>
</li>
<li><p>A running PostgreSQL instance (version 12 or later)</p>
</li>
<li><p>A SQL client like <code>psql</code>, pgAdmin, or DBeaver</p>
</li>
</ul>
<p>If you don't have PostgreSQL installed, you can use a free cloud-hosted instance from services like <a href="https://neon.tech">Neon</a> or <a href="https://supabase.com">Supabase</a> to follow along.</p>
<h2 id="heading-how-triggers-work">How Triggers Work</h2>
<p>At a high level, a trigger has three parts:</p>
<ol>
<li><p><strong>The event</strong>: what action activates the trigger (INSERT, UPDATE, DELETE, or TRUNCATE)</p>
</li>
<li><p><strong>The timing</strong>: when the trigger fires relative to the event (BEFORE or AFTER)</p>
</li>
<li><p><strong>The function</strong>: what logic runs when the trigger fires</p>
</li>
</ol>
<p>Here's the general flow: a user or application performs an operation on a table, the database checks if any triggers are associated with that operation, and if a match is found, the database executes the trigger function automatically.</p>
<p>You can think of triggers as event listeners for your database. Just like a JavaScript <code>addEventListener</code> watches for a click or keypress, a database trigger watches for row-level changes on a table.</p>
<h2 id="heading-how-to-create-your-first-trigger">How to Create Your First Trigger</h2>
<p>In PostgreSQL, creating a trigger is a two-step process. You first create a trigger function, then you attach that function to a table with a <code>CREATE TRIGGER</code> statement.</p>
<p>Let's build a concrete example. Say you have a <code>products</code> table and you want to automatically set the <code>updated_at</code> timestamp every time a row is modified.</p>
<h3 id="heading-step-1-create-the-table">Step 1 – Create the Table</h3>
<pre><code class="language-sql">CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    price NUMERIC(10, 2) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);
</code></pre>
<h3 id="heading-step-2-create-the-trigger-function">Step 2 – Create the Trigger Function</h3>
<p>A trigger function in PostgreSQL is a special function that returns the <code>TRIGGER</code> type. Inside the function body, you have access to two important variables: <code>NEW</code> (the row after the operation) and <code>OLD</code> (the row before the operation).</p>
<pre><code class="language-sql">CREATE OR REPLACE FUNCTION set_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
</code></pre>
<p>This function sets the <code>updated_at</code> column to the current timestamp every time it runs. It then returns <code>NEW</code>, which tells PostgreSQL to proceed with the modified row.</p>
<h3 id="heading-step-3-attach-the-trigger-to-the-table">Step 3 – Attach the Trigger to the Table</h3>
<pre><code class="language-sql">CREATE TRIGGER trigger_set_updated_at
BEFORE UPDATE ON products
FOR EACH ROW
EXECUTE FUNCTION set_updated_at();
</code></pre>
<p>Let's break down each part of this statement:</p>
<ul>
<li><p><code>BEFORE UPDATE</code> – the trigger fires before the update is applied to the table</p>
</li>
<li><p><code>ON products</code> – the trigger is associated with the <code>products</code> table</p>
</li>
<li><p><code>FOR EACH ROW</code> – the function runs once for every row affected by the update</p>
</li>
<li><p><code>EXECUTE FUNCTION set_updated_at()</code> – the function to call</p>
</li>
</ul>
<h3 id="heading-step-4-test-it">Step 4 – Test It</h3>
<pre><code class="language-sql">INSERT INTO products (name, price) VALUES ('Wireless Keyboard', 49.99);

-- Wait a moment, then update the row
UPDATE products SET price = 44.99 WHERE name = 'Wireless Keyboard';

SELECT name, price, created_at, updated_at FROM products;
</code></pre>
<p>You'll see that <code>updated_at</code> has been automatically updated to the time of the UPDATE operation, even though you didn't explicitly set it in your query. That's the trigger doing its job.</p>
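<p>If you'd rather not bump <code>updated_at</code> when an UPDATE leaves the row unchanged, a trigger can carry a <code>WHEN</code> condition. This is an optional variation, not part of the setup above:</p>
<pre><code class="language-sql">-- Replace the trigger with one that fires only when some column actually changed
DROP TRIGGER IF EXISTS trigger_set_updated_at ON products;

CREATE TRIGGER trigger_set_updated_at
BEFORE UPDATE ON products
FOR EACH ROW
WHEN (OLD.* IS DISTINCT FROM NEW.*)
EXECUTE FUNCTION set_updated_at();
</code></pre>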
<h2 id="heading-before-vs-after-triggers">BEFORE vs AFTER Triggers</h2>
<p>The timing of a trigger determines when the function executes relative to the actual data change.</p>
<p><strong>BEFORE triggers</strong> run before the row is inserted, updated, or deleted. They are useful when you want to modify or validate the incoming data. Since the change hasn't been applied yet, you can alter the <code>NEW</code> row or even cancel the operation entirely by returning <code>NULL</code>.</p>
<p><strong>AFTER triggers</strong> run after the row change has been applied to the table, but still inside the same transaction, so the change isn't committed yet. They are useful for side effects like logging, sending notifications, or updating related tables. At this point you can't modify the row – but you can read both <code>OLD</code> and <code>NEW</code> to see what changed, and raising an exception still rolls the whole statement back.</p>
<p>Here's a rule of thumb: use BEFORE triggers when you need to change or reject data, and use AFTER triggers when you need to react to a completed change.</p>
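<p>To make the "cancel by returning <code>NULL</code>" behaviour concrete, here's a small sketch. The function name, trigger name, and the rule itself are hypothetical, chosen only for illustration, and it assumes the <code>products</code> table created earlier:</p>
<pre><code class="language-sql">-- Hypothetical rule: rows priced at 1000 or more are silently kept on DELETE
CREATE OR REPLACE FUNCTION skip_expensive_deletes()
RETURNS TRIGGER AS $$
BEGIN
    IF OLD.price &gt;= 1000 THEN
        RETURN NULL;  -- skip this row; no error is raised
    END IF;
    RETURN OLD;       -- any non-null value lets the DELETE proceed
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trigger_skip_expensive_deletes
BEFORE DELETE ON products
FOR EACH ROW
EXECUTE FUNCTION skip_expensive_deletes();
</code></pre>
<p>Because skipped rows produce no error, the only sign is the lower row count reported by the statement, which is why raising an exception is often the clearer choice for validation.</p>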
<h2 id="heading-how-to-build-an-audit-log-with-an-after-trigger">How to Build an Audit Log with an AFTER Trigger</h2>
<p>One of the most common uses for triggers is audit logging – keeping a record of every change made to an important table. Let's build one.</p>
<h3 id="heading-step-1-create-an-audit-table">Step 1 – Create an Audit Table</h3>
<pre><code class="language-sql">CREATE TABLE product_audit (
    audit_id SERIAL PRIMARY KEY,
    product_id INT NOT NULL,
    action VARCHAR(10) NOT NULL,
    old_price NUMERIC(10, 2),
    new_price NUMERIC(10, 2),
    changed_by TEXT DEFAULT current_user,
    changed_at TIMESTAMP DEFAULT NOW()
);
</code></pre>
<h3 id="heading-step-2-create-the-audit-trigger-function">Step 2 – Create the Audit Trigger Function</h3>
<pre><code class="language-sql">CREATE OR REPLACE FUNCTION log_product_changes()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'UPDATE' THEN
        INSERT INTO product_audit (product_id, action, old_price, new_price)
        VALUES (OLD.id, 'UPDATE', OLD.price, NEW.price);
    ELSIF TG_OP = 'DELETE' THEN
        INSERT INTO product_audit (product_id, action, old_price)
        VALUES (OLD.id, 'DELETE', OLD.price);
    ELSIF TG_OP = 'INSERT' THEN
        INSERT INTO product_audit (product_id, action, new_price)
        VALUES (NEW.id, 'INSERT', NEW.price);
    END IF;

    RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;
</code></pre>
<p>There are a few important things happening here. The <code>TG_OP</code> variable is a special string that PostgreSQL provides inside trigger functions. It tells you which operation activated the trigger: <code>'INSERT'</code>, <code>'UPDATE'</code>, or <code>'DELETE'</code>. This lets you handle different operations with a single function.</p>
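<p>The <code>RETURN COALESCE(NEW, OLD)</code> at the end returns whichever row exists. For INSERT and UPDATE operations, <code>NEW</code> exists and is returned. For DELETE operations, <code>NEW</code> is null, so <code>OLD</code> is returned instead. PostgreSQL ignores the return value of a row-level AFTER trigger, so this mainly keeps the function safe to reuse in a BEFORE trigger, where returning <code>NULL</code> would cancel the operation.</p>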
<h3 id="heading-step-3-attach-the-trigger">Step 3 – Attach the Trigger</h3>
<pre><code class="language-sql">CREATE TRIGGER trigger_product_audit
AFTER INSERT OR UPDATE OR DELETE ON products
FOR EACH ROW
EXECUTE FUNCTION log_product_changes();
</code></pre>
<p>Notice the <code>AFTER INSERT OR UPDATE OR DELETE</code> syntax. You can bind a single trigger to multiple events, which keeps your setup clean.</p>
<h3 id="heading-step-4-test-it">Step 4 – Test It</h3>
<pre><code class="language-sql">-- Insert a new product
INSERT INTO products (name, price) VALUES ('USB-C Hub', 29.99);

-- Update the price
UPDATE products SET price = 24.99 WHERE name = 'USB-C Hub';

-- Delete the product
DELETE FROM products WHERE name = 'USB-C Hub';

-- Check the audit log
SELECT * FROM product_audit ORDER BY changed_at;
</code></pre>
<p>You'll see three rows in <code>product_audit</code> (one for each operation) with the old and new prices recorded automatically. No application code needed.</p>
<h2 id="heading-how-to-use-a-before-trigger-for-validation">How to Use a BEFORE Trigger for Validation</h2>
<p>Triggers can also enforce business rules at the database level. Let's say you want to prevent any product from having a negative price.</p>
<pre><code class="language-sql">CREATE OR REPLACE FUNCTION prevent_negative_price()
RETURNS TRIGGER AS $$
BEGIN
    IF NEW.price &lt; 0 THEN
        RAISE EXCEPTION 'Product price cannot be negative. Got: %', NEW.price;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trigger_check_price
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW
EXECUTE FUNCTION prevent_negative_price();
</code></pre>
<p>Now test it:</p>
<pre><code class="language-sql">INSERT INTO products (name, price) VALUES ('Faulty Item', -10.00);
-- ERROR: Product price cannot be negative. Got: -10.00
</code></pre>
<p>The insert is rejected entirely. The row never makes it into the table. This is powerful because the rule is enforced at the database level regardless of which application or script sends the query.</p>
<h2 id="heading-row-level-vs-statement-level-triggers">Row-Level vs Statement-Level Triggers</h2>
<p>All the triggers you've seen so far use <code>FOR EACH ROW</code>, which means the function runs once per affected row. If you update 100 rows in a single query, the trigger function runs 100 times.</p>
<p>PostgreSQL also supports <code>FOR EACH STATEMENT</code> triggers, which run once per SQL statement regardless of how many rows are affected.</p>
<pre><code class="language-sql">CREATE OR REPLACE FUNCTION log_bulk_update()
RETURNS TRIGGER AS $$
BEGIN
    RAISE NOTICE 'A bulk operation was performed on the products table';
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trigger_bulk_update_notice
AFTER UPDATE ON products
FOR EACH STATEMENT
EXECUTE FUNCTION log_bulk_update();
</code></pre>
<p>Statement-level triggers are less common, but they're useful for operations like refreshing a materialized view or sending a single notification after a batch update instead of one notification per row.</p>
<p><strong>Important</strong>: in statement-level triggers, the <code>NEW</code> and <code>OLD</code> variables are not available because the trigger isn't tied to any specific row.</p>
<h2 id="heading-the-new-and-old-variables-reference">The NEW and OLD Variables Reference</h2>
<p>Here's a quick reference for when <code>NEW</code> and <code>OLD</code> are available in row-level triggers:</p>
<table>
<thead>
<tr>
<th>Operation</th>
<th>OLD</th>
<th>NEW</th>
</tr>
</thead>
<tbody><tr>
<td>INSERT</td>
<td>Not available</td>
<td>Contains the new row</td>
</tr>
<tr>
<td>UPDATE</td>
<td>Contains the row before the change</td>
<td>Contains the row after the change</td>
</tr>
<tr>
<td>DELETE</td>
<td>Contains the deleted row</td>
<td>Not available</td>
</tr>
</tbody></table>
<p>Understanding when each variable is available will save you from runtime errors in your trigger functions.</p>
<h2 id="heading-how-to-manage-triggers">How to Manage Triggers</h2>
<p>As you add more triggers to your database, you'll need to know how to inspect, disable, and remove them.</p>
<h3 id="heading-how-to-list-all-triggers-on-a-table">How to List All Triggers on a Table</h3>
<pre><code class="language-sql">SELECT trigger_name, event_manipulation, action_timing
FROM information_schema.triggers
WHERE event_object_table = 'products';
</code></pre>
<h3 id="heading-how-to-disable-a-trigger-temporarily">How to Disable a Trigger Temporarily</h3>
<pre><code class="language-sql">-- Disable a specific trigger
ALTER TABLE products DISABLE TRIGGER trigger_product_audit;

-- Disable all triggers on a table
ALTER TABLE products DISABLE TRIGGER ALL;
</code></pre>
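<p>This is useful during bulk data migrations where you want to skip trigger execution for performance reasons. Be aware that <code>DISABLE TRIGGER ALL</code> also covers the internal triggers PostgreSQL uses to enforce foreign keys, which requires superuser privileges and suspends those integrity checks. <code>DISABLE TRIGGER USER</code> disables only the triggers you created.</p>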
<h3 id="heading-how-to-re-enable-a-trigger">How to Re-Enable a Trigger</h3>
<pre><code class="language-sql">ALTER TABLE products ENABLE TRIGGER trigger_product_audit;
</code></pre>
<h3 id="heading-how-to-drop-a-trigger">How to Drop a Trigger</h3>
<pre><code class="language-sql">DROP TRIGGER IF EXISTS trigger_product_audit ON products;
</code></pre>
<p>Note that dropping a trigger does not drop the associated function. You'll need to drop the function separately if you no longer need it:</p>
<pre><code class="language-sql">DROP FUNCTION IF EXISTS log_product_changes();
</code></pre>
<h2 id="heading-when-to-use-triggers">When to Use Triggers</h2>
<p>Triggers work well for specific use cases. Here are the scenarios where they're a strong choice:</p>
<ul>
<li><p><strong>Audit logging</strong>: automatically recording who changed what and when, as you saw earlier in this tutorial.</p>
</li>
<li><p><strong>Derived data maintenance</strong>: keeping computed columns, counters, or summary tables in sync with the source data.</p>
</li>
<li><p><strong>Data validation</strong>: enforcing business rules that go beyond what CHECK constraints can express, like cross-table validations.</p>
</li>
<li><p><strong>Automatic timestamping</strong>: setting <code>created_at</code> and <code>updated_at</code> fields without relying on the application layer.</p>
</li>
</ul>
<h2 id="heading-when-to-avoid-triggers">When to Avoid Triggers</h2>
<p>Triggers are powerful, but they come with trade-offs. Here are cases where you should think twice before using them:</p>
<ul>
<li><p><strong>Complex business logic</strong>: if the logic involves calling external APIs, sending emails, or orchestrating multi-step workflows, it belongs in your application layer. Triggers should stay lightweight.</p>
</li>
<li><p><strong>Performance-sensitive bulk operations</strong>: row-level triggers on tables that frequently receive bulk inserts or updates can create significant overhead. If you're inserting millions of rows, those triggers fire millions of times.</p>
</li>
<li><p><strong>Cascading triggers</strong>: when one trigger's action fires another trigger, which fires another, debugging becomes extremely difficult. If you find yourself building a chain of triggers, reconsider the design.</p>
</li>
<li><p><strong>Logic that developers need to discover easily</strong>: triggers are sometimes called "hidden logic" because they execute automatically without appearing in application code. If your team frequently asks "why did this column change?" and the answer is always "there's a trigger," that's a sign the logic might be more discoverable if placed in your application layer or a stored procedure that's called explicitly.</p>
</li>
</ul>
<p>A good rule of thumb: if the logic is tightly coupled to the data and should always execute regardless of which client or service touches the table, a trigger is appropriate. If the logic depends on application context (like the current user's session, feature flags, or external state), it belongs in the application.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this tutorial, you learned what database triggers are and how they work in PostgreSQL. You built three practical triggers: an automatic timestamp updater, a full audit logging system, and a data validation guard. You also learned the difference between BEFORE and AFTER triggers, row-level and statement-level triggers, and when <code>NEW</code> and <code>OLD</code> variables are available.</p>
<p>Triggers are a powerful tool for keeping your data consistent and your business rules enforced at the database level. Use them for focused, data-centric operations, and keep the logic simple.</p>
<p>If you found this tutorial helpful, you can connect with me on <a href="https://linkedin.com/in/iyioladev">LinkedIn</a> and <a href="https://x.com/iyiola_dev_">X</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Build a Bank Ledger in Golang with PostgreSQL using the Double-Entry Accounting Principle. ]]>
                </title>
                <description>
                    <![CDATA[ The Hidden Bugs in How Most Developers Store Money Imagine you're building the backend for a million-dollar fintech app. You store each user's balance as a single number in the database. It feels simp ]]>
                </description>
                <link>https://www.freecodecamp.org/news/build-a-bank-ledger-in-go-with-postgresql-using-the-double-entry-accounting-principle/</link>
                <guid isPermaLink="false">69c4173d10e664c5dac8cea1</guid>
                
                    <category>
                        <![CDATA[ Go Language ]]>
                    </category>
                
                    <category>
                        <![CDATA[ golang ]]>
                    </category>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ SQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ banking ]]>
                    </category>
                
                    <category>
                        <![CDATA[ accounting ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ double entry ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Paul Babatuyi ]]>
                </dc:creator>
                <pubDate>Wed, 25 Mar 2026 17:11:25 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/faea1d4c-5319-4746-96b0-315f37017e26.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <h2 id="heading-the-hidden-bugs-in-how-most-developers-store-money">The Hidden Bugs in How Most Developers Store Money</h2>
<p>Imagine you're building the backend for a million-dollar fintech app. You store each user's balance as a single number in the database. It feels simple: just update the number when money moves.</p>
<p>But with one line of code like <code>UPDATE accounts SET balance = balance - 100</code>, you've created a system that can silently lose millions. A server crash, a race condition, or a clever attack, and suddenly money vanishes or appears out of thin air.</p>
<p>There's no audit trail, no way to know what happened, and no way to prove it didn't happen on purpose.</p>
<p>This isn't just a theoretical risk. It's a trap that's caught even experienced developers. The world's most trusted financial systems avoid it by using double-entry accounting. Every transaction creates two records: a debit on one account, a credit on another. This lets you reconstruct every cent from history, catch inconsistencies, and audit every transaction.</p>
<p>There are no deletes, and no silent updates. Just an append-only trail that makes fraud and bugs much harder to hide.</p>
<p>In this guide, you'll build a robust backend in Go and PostgreSQL, using patterns inspired by real fintech companies. You'll learn how to design a double-entry ledger, generate type-safe SQL with sqlc, and write transactions that are safe even under heavy load.</p>
<p>By the end, you'll understand why these patterns matter –&nbsp;and how to use them to build software you can trust with real money.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-prerequisites-and-project-overview">Prerequisites and Project Overview</a></p>
</li>
<li><p><a href="#heading-the-double-entry-foundation-how-every-penny-is-accounted-for">The Double-Entry Foundation</a></p>
</li>
<li><p><a href="#heading-type-safe-sql-with-sqlc-no-more-surprises">Type-Safe SQL with sqlc</a></p>
</li>
<li><p><a href="#heading-the-store-layer-transactions-and-automatic-retries">The Store Layer: Transactions and Retries</a></p>
</li>
<li><p><a href="#heading-the-service-layer-where-business-logic-meets-double-entry">The Service Layer: Business Logic</a></p>
</li>
<li><p><a href="#heading-the-api-layer-secure-predictable-and-boring-by-design">The API Layer</a></p>
</li>
<li><p><a href="#heading-running-it-locally-your-first-end-to-end-test">Running It Locally</a></p>
</li>
<li><p><a href="#heading-testing-prove-the-system-works">Testing: Prove the System Works</a></p>
</li>
<li><p><a href="#heading-deployment-engineering-decisions-that-matter-in-production">Deployment</a></p>
</li>
<li><p><a href="#heading-conclusion-building-for-the-real-world">Conclusion</a></p>
</li>
</ul>
<h3 id="heading-project-resources">Project Resources:</h3>
<p>Here's the project repository: <a href="https://github.com/PaulBabatuyi/double-entry-bank-Go">https://github.com/PaulBabatuyi/double-entry-bank-Go</a></p>
<p>And here's the front-end repository: <a href="https://github.com/PaulBabatuyi/double-entry-bank">https://github.com/PaulBabatuyi/double-entry-bank</a></p>
<p>You can find the live frontend here: <a href="https://golangbank.app">https://golangbank.app</a></p>
<img src="https://cdn.hashnode.com/uploads/covers/6968db1b0578d1643036e600/2240e617-5a6d-4742-995f-6ecb8fecb56e.png" alt="Double-entry frontend transaction" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<p>You can find the live Swagger back-end API here: <a href="https://golangbank.app/swagger">https://golangbank.app/swagger</a></p>
<img src="https://cdn.hashnode.com/uploads/covers/6968db1b0578d1643036e600/3a6c1e02-5ceb-43e4-86a3-0530735b79cb.png" alt="Backend API endpoints (Swagger)" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<h2 id="heading-prerequisites-and-project-overview">Prerequisites and Project Overview</h2>
<p>Before you dive in, make sure you have the following installed:</p>
<ul>
<li><p>Go 1.23 or newer</p>
</li>
<li><p>Docker and Docker Compose</p>
</li>
<li><p><code>golang-migrate</code> CLI: <code>go install github.com/golang-migrate/migrate/v4/cmd/migrate@latest</code></p>
</li>
<li><p><code>sqlc</code> CLI: <code>go install github.com/sqlc-dev/sqlc/cmd/sqlc@latest</code></p>
</li>
</ul>
<p>You'll also need a basic understanding of PostgreSQL and REST APIs to follow along.</p>
<p>If you've built a CRUD app before, you're ready for this. The project uses sqlc for type-safe queries, JWT for authentication, and a layered architecture that keeps business logic, persistence, and HTTP handling cleanly separated.</p>
<p>Here's how the project is organized:</p>
<pre><code class="language-plaintext">.
├── cmd/                # Server entrypoint
│   └── main.go
├── internal/
│   ├── api/            # HTTP handlers &amp; middleware
│   ├── db/             # Store layer (transactions, sqlc)
│   └── service/        # Business logic (ledger operations)
├── postgres/
│   ├── migrations/     # SQL migration files
│   └── queries/        # sqlc query files
├── docs/               # Swagger docs
├── Dockerfile, docker-compose.yml, Makefile
└── README.md
</code></pre>
<p>The architecture follows a clear three-layer pattern:</p>
<ul>
<li><p><strong>API Layer</strong>: Handles HTTP requests, authentication, and routing.</p>
</li>
<li><p><strong>Service Layer</strong>: Contains the business logic. This is where double-entry rules are enforced.</p>
</li>
<li><p><strong>Store Layer</strong>: Manages database transactions and persistence.</p>
</li>
</ul>
<p>Every request flows from the handler, through the service, to the store, and finally to PostgreSQL. This separation makes the code easier to test, debug, and extend.</p>
<h3 id="heading-backend-request-flow">Backend Request Flow</h3>
<pre><code class="language-mermaid">graph TD
    A[HTTP Request] --&gt; B[Handler - API Layer]
    B --&gt; C[LedgerService - Business Logic]
    C --&gt; D[Store - Persistence Layer]
    D --&gt; E[(PostgreSQL)]
    E --&gt; D
    D --&gt; C
    C --&gt; B
    B --&gt; F[HTTP Response]
</code></pre>
<h2 id="heading-the-double-entry-foundation-how-every-penny-is-accounted-for">The Double-Entry Foundation: How Every Penny is Accounted For</h2>
<p>Let's get to the heart of what makes this system bulletproof: double-entry accounting. Every operation – a deposit, withdrawal, or transfer&nbsp;– creates two entries that always balance. This is the secret sauce that keeps banks, payment apps, and even crypto exchanges from losing track of money.</p>
<p>Picture a simple deposit of $1,000:</p>
<pre><code class="language-plaintext">| Account              | Debit   | Credit  |
|----------------------|---------|---------|
| User Account         |         | 1,000   |
| Settlement Account   | 1,000   |         |
</code></pre>
<p>Total debits always equal total credits. This is the fundamental rule. Every single operation in this system produces exactly this structure, with no exceptions.</p>
<p>Now picture a $200 transfer from User A to User B. Notice there are only two entries, one per account, and the settlement account doesn't appear at all:</p>
<pre><code class="language-plaintext">| Account       | Debit   | Credit  | Description           |
|---------------|---------|---------|-----------------------|
| User A        | 200     |         | Transfer to User B    |
| User B        |         | 200     | Transfer from User A  |
</code></pre>
<p>Both entries share the same <code>transaction_id</code>, so you can always retrieve the complete picture of what happened with a single query. There's no guessing and no reconstructing, as the ledger tells the full story.</p>
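<p>As a rough illustration of that rule, the following Go sketch groups entries by transaction and reports any transaction whose debits and credits don't match. The <code>Entry</code> struct, the <code>UnbalancedTransactions</code> helper, and the use of <code>int64</code> ten-thousandths for amounts are assumptions made for this example. The project itself works with decimal strings.</p>
<pre><code class="language-go">package main

import "fmt"

// Entry is a simplified, hypothetical ledger row. Amounts are int64
// ten-thousandths of a currency unit (the scale of NUMERIC(19,4)) so the
// arithmetic stays exact.
type Entry struct {
	TransactionID string
	Account       string
	Debit         int64
	Credit        int64
}

// UnbalancedTransactions returns the IDs of transactions whose total debits
// do not equal their total credits, in first-seen order.
func UnbalancedTransactions(entries []Entry) []string {
	net := map[string]int64{} // transaction ID to (debits minus credits)
	var order []string
	for _, e := range entries {
		if _, seen := net[e.TransactionID]; !seen {
			order = append(order, e.TransactionID)
		}
		net[e.TransactionID] += e.Debit - e.Credit
	}
	var bad []string
	for _, id := range order {
		if net[id] != 0 {
			bad = append(bad, id)
		}
	}
	return bad
}

func main() {
	entries := []Entry{
		{"tx-1", "user-a", 200_0000, 0},
		{"tx-1", "user-b", 0, 200_0000},
		{"tx-2", "user-a", 50_0000, 0}, // the matching credit is missing
	}
	fmt.Println(UnbalancedTransactions(entries)) // prints [tx-2]
}
</code></pre>
<p>A check like this is only a diagnostic. The guarantee itself comes from writing both entries inside one database transaction.</p>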
<h3 id="heading-why-the-settlement-account-goes-negative">Why the Settlement Account Goes Negative</h3>
<p>This trips up newcomers, so it's worth explaining explicitly. When a user deposits $1,000, the settlement account is debited $1,000. After several user deposits, the settlement balance will be negative. That's correct and expected: it represents the total amount of real-world money currently held inside the system on behalf of users. The invariant is:</p>
<pre><code class="language-plaintext">SUM(all user account balances) + settlement balance = 0
</code></pre>
<p>If that ever doesn't hold, something is broken.</p>
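<p>Here is a minimal sketch of how that invariant could be checked in Go, assuming you have already loaded every account balance (settlement included) as a decimal string. The <code>LedgerSumsToZero</code> name and the account names are hypothetical, and <code>math/big</code> stands in for the <code>shopspring/decimal</code> package the project uses, so the example depends only on the standard library.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"math/big"
)

// LedgerSumsToZero reports whether all account balances, including the
// settlement account, add up to exactly zero. Balances are decimal strings
// parsed into big.Rat, so no float rounding is involved.
func LedgerSumsToZero(balances map[string]string) (bool, error) {
	total := new(big.Rat)
	for name, s := range balances {
		v, ok := new(big.Rat).SetString(s)
		if !ok {
			return false, fmt.Errorf("account %q has an invalid balance %q", name, s)
		}
		total.Add(total, v)
	}
	return total.Sign() == 0, nil
}

func main() {
	balances := map[string]string{
		"alice":      "1000.0000",
		"bob":        "250.5000",
		"settlement": "-1250.5000",
	}
	ok, err := LedgerSumsToZero(balances)
	if err != nil {
		panic(err)
	}
	fmt.Println(ok) // prints true
}
</code></pre>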
<h3 id="heading-enforcing-the-rules-in-the-database">Enforcing the Rules in the Database</h3>
<p>The database itself enforces these rules, not just the application code. Here's the core of the <code>entries</code> table migration:</p>
<pre><code class="language-sql">CREATE TABLE IF NOT EXISTS entries (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id UUID NOT NULL REFERENCES accounts(id) ON DELETE RESTRICT,
    debit NUMERIC(19,4) NOT NULL DEFAULT 0.0000 CHECK (debit &gt;= 0),
    credit NUMERIC(19,4) NOT NULL DEFAULT 0.0000 CHECK (credit &gt;= 0),
    transaction_id UUID NOT NULL,
    operation_type operation_type NOT NULL,
    description TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,

    CONSTRAINT check_single_side CHECK (
        (debit &gt; 0 AND credit = 0) OR (debit = 0 AND credit &gt; 0)
    )
);
</code></pre>
<p>Let's break down why each piece matters:</p>
<ul>
<li><p><strong>Every entry is single-sided.</strong> The <code>check_single_side</code> constraint means every entry must be either a debit or a credit, never both and never neither. If you try to insert an invalid row, the database rejects it – there's no way around it.</p>
</li>
<li><p><strong>Every transaction is linked.</strong> Both the debit and credit entries share the same <code>transaction_id</code> (a UUID). This lets you fetch both sides of any operation instantly, making audits and debugging straightforward.</p>
</li>
<li><p><strong>Operation types are explicit.</strong> The <code>operation_type</code> column is an enum at the database level, so only valid types like <code>deposit</code>, <code>withdrawal</code>, or <code>transfer</code> are allowed. There are no typos and no surprises.</p>
</li>
</ul>
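<p>If you want to fail fast before a row ever reaches the database, the same rule can be mirrored in Go. This is only a sketch: <code>ValidateSingleSide</code> and <code>ErrNotSingleSided</code> are hypothetical names, and amounts are assumed to be <code>int64</code> ten-thousandths rather than the decimal strings the project uses. The database constraint remains the real safeguard.</p>
<pre><code class="language-go">package main

import (
	"cmp"
	"errors"
	"fmt"
)

// ErrNotSingleSided mirrors a violation of the check_single_side constraint.
var ErrNotSingleSided = errors.New("entry must have exactly one positive side and zero on the other")

// ValidateSingleSide accepts an entry only when one of debit or credit is
// positive and the other is exactly zero.
func ValidateSingleSide(debit, credit int64) error {
	// cmp.Compare against zero yields the sign of each amount: -1, 0, or +1.
	signs := [2]int{cmp.Compare(debit, 0), cmp.Compare(credit, 0)}
	switch signs {
	case [2]int{1, 0}, [2]int{0, 1}:
		return nil
	default:
		return ErrNotSingleSided
	}
}

func main() {
	fmt.Println(ValidateSingleSide(500_0000, 0) == nil) // prints true
	fmt.Println(ValidateSingleSide(0, 0) == nil)        // prints false
	fmt.Println(ValidateSingleSide(100, 100) == nil)    // prints false
}
</code></pre>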
<h3 id="heading-the-settlement-account-the-systems-anchor">The Settlement Account: The System's Anchor</h3>
<p>Every real-world ledger needs a way to represent money entering or leaving the system. That's what the settlement account does. Here's how it's seeded in the database:</p>
<pre><code class="language-sql">INSERT INTO accounts (id, name, balance, currency, is_system)
SELECT gen_random_uuid(), 'Settlement Account', 0.0000, 'USD', TRUE
WHERE NOT EXISTS (
    SELECT 1 FROM accounts WHERE is_system = TRUE AND name = 'Settlement Account'
);
</code></pre>
<p>The settlement account represents the "outside world." When a user deposits money, it comes from the settlement account. When they withdraw, it goes back. Using <code>WHERE NOT EXISTS</code> makes this migration idempotent –&nbsp;that is, safe to run multiple times without creating duplicates.</p>
<h2 id="heading-type-safe-sql-with-sqlc-no-more-surprises">Type-Safe SQL with sqlc: No More Surprises</h2>
<p>In financial systems, you can't afford surprises from your database layer. That's why this project uses sqlc, a tool that generates type-safe Go code from your SQL queries, so mismatched columns and types surface when you build rather than at runtime.</p>
<p>With sqlc, you see exactly what SQL runs, catch mistakes before they hit production, and avoid the "magic" (and hidden bugs) of most ORMs. Every query is explicit, every type is checked, and you get the best of both worlds: raw SQL power with Go's safety.</p>
<h3 id="heading-why-numeric-becomes-string-and-not-float64">Why NUMERIC Becomes String (and Not float64)</h3>
<p>Here's a subtle but critical detail from <code>sqlc.yaml</code>:</p>
<pre><code class="language-yaml">overrides:
    - db_type: "pg_catalog.numeric"
      go_type: "string"
    - column: "entries.debit"
      go_type: "string"
    - column: "entries.credit"
      go_type: "string"
    - column: "accounts.balance"
      go_type: "string"
    - db_type: "operation_type"
      go_type: "string"
</code></pre>
<p><strong>Why string, not float64?</strong> Floating point arithmetic is imprecise. <code>0.1 + 0.2</code> in most programming languages does not equal exactly <code>0.3</code>.</p>
<p>For money, you need exact decimal arithmetic. This project uses <code>shopspring/decimal</code> for all calculations and passes amounts between Go and the database as strings, converting at the service layer boundary. The database column itself is <code>NUMERIC(19,4)</code>, which stores exact decimals – no float rounding ever touches your money.</p>
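<p>To see the difference for yourself, here's a small standard-library sketch. It uses <code>math/big</code> in place of <code>shopspring/decimal</code> purely to avoid an external dependency, and <code>AddExact</code> is a hypothetical helper, not part of the project.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"math/big"
)

// AddExact adds two decimal strings exactly and formats the sum with four
// decimal places, the same scale as NUMERIC(19,4).
func AddExact(a, b string) (string, error) {
	x, ok := new(big.Rat).SetString(a)
	if !ok {
		return "", fmt.Errorf("invalid amount %q", a)
	}
	y, ok := new(big.Rat).SetString(b)
	if !ok {
		return "", fmt.Errorf("invalid amount %q", b)
	}
	return x.Add(x, y).FloatString(4), nil
}

func main() {
	// float64 variables cannot represent 0.1 or 0.2 exactly.
	a, b := 0.1, 0.2
	fmt.Println(a+b == 0.3) // prints false

	sum, err := AddExact("0.1", "0.2")
	if err != nil {
		panic(err)
	}
	fmt.Println(sum) // prints 0.3000
}
</code></pre>
<p>Note that the float comparison uses variables on purpose. Go evaluates untyped constant expressions with arbitrary precision, so writing the literals directly would hide the problem.</p>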
<h3 id="heading-preventing-race-conditions-locking-with-for-update">Preventing Race Conditions: Locking with FOR UPDATE</h3>
<p>One of the most important queries in the system is <code>GetAccountForUpdate</code>:</p>
<pre><code class="language-sql">SELECT * FROM accounts
WHERE id = $1
LIMIT 1
FOR UPDATE; -- locks row for update, prevents TOCTOU races
</code></pre>
<p>This query uses <code>FOR UPDATE</code> to lock the account row during a transaction. Why? Imagine two requests both see a $500 balance and both try to withdraw $400. Without locking, both would succeed, and you'd end up with a negative balance. With <code>FOR UPDATE</code>, the second transaction waits until the first finishes, eliminating this classic race condition.</p>
<h3 id="heading-calculating-the-true-balance-always-trust-the-entries">Calculating the True Balance: Always Trust the Entries</h3>
<p>The real source of truth for any account is the sum of its entries, not the denormalized <code>balance</code> column. Here's the reconciliation query:</p>
<pre><code class="language-sql">SELECT CAST(
    (COALESCE(SUM(credit), 0::NUMERIC) - COALESCE(SUM(debit), 0::NUMERIC))
    AS NUMERIC(19,4)
) AS calculated_balance
FROM entries
WHERE account_id = $1;
</code></pre>
<p>This computes the true balance from the ledger itself. It's how you catch bugs, audit the system, and prove that every penny is accounted for. The <code>balance</code> column on accounts is a denormalized cache for fast reads –&nbsp;and this query is the ground truth that validates it.</p>
<h2 id="heading-the-store-layer-transactions-and-automatic-retries">The Store Layer: Transactions and Automatic Retries</h2>
<p>Every financial operation in this system runs inside a transaction –&nbsp;no exceptions. This is enforced by the <code>ExecTx</code> pattern in the store layer:</p>
<pre><code class="language-go">func (store *Store) ExecTx(ctx context.Context, fn func(q *sqlc.Queries) error) error {
    const maxAttempts = 10
    var lastErr error
    for attempt := 0; attempt &lt; maxAttempts; attempt++ {
        lastErr = store.execTxOnce(ctx, fn)
        if lastErr == nil {
            return nil
        }
        if !isSerializationError(lastErr) {
            return lastErr
        }
        if attempt &lt; maxAttempts-1 {
            if waitErr := sleepWithContext(ctx, retryWait(attempt)); waitErr != nil {
                return waitErr
            }
        }
    }
    return fmt.Errorf("transaction failed after %d attempts due to serialization conflicts: %w", maxAttempts, lastErr)
}
</code></pre>
<h3 id="heading-why-serializable-isolation">Why Serializable Isolation?</h3>
<p>The transaction uses PostgreSQL's strictest isolation level: <code>sql.LevelSerializable</code>. This is like running transactions one at a time, eliminating entire classes of concurrency bugs. If two operations would conflict, PostgreSQL aborts one and returns a serialization error (SQLSTATE 40001).</p>
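<p>The retry logic depends on recognizing that error. The project's own <code>isSerializationError</code> isn't reproduced here, but one plausible shape is sketched below. It assumes the driver's error type exposes a <code>SQLState() string</code> method. The <code>sqlStater</code> interface, the <code>fakePgError</code> type, and the <code>isSerializationFailure</code> name are all hypothetical and exist only to make the example runnable without a database.</p>
<pre><code class="language-go">package main

import (
	"errors"
	"fmt"
)

// sqlStater is a hypothetical interface for driver errors that carry a
// five-character SQLSTATE code.
type sqlStater interface{ SQLState() string }

// fakePgError stands in for a real driver error in this sketch.
type fakePgError struct{ code string }

func (e fakePgError) Error() string    { return "pg error " + e.code }
func (e fakePgError) SQLState() string { return e.code }

// serializationFailure is the SQLSTATE PostgreSQL uses for serialization conflicts.
const serializationFailure = "40001"

// isSerializationFailure walks the error chain looking for an error that
// reports SQLSTATE 40001.
func isSerializationFailure(err error) bool {
	target := new(sqlStater)
	if !errors.As(err, target) {
		return false
	}
	return (*target).SQLState() == serializationFailure
}

func main() {
	wrapped := fmt.Errorf("deposit failed: %w", fakePgError{code: "40001"})
	fmt.Println(isSerializationFailure(wrapped))                         // prints true
	fmt.Println(isSerializationFailure(fakePgError{code: "23505"}))      // prints false
	fmt.Println(isSerializationFailure(errors.New("connection closed"))) // prints false
}
</code></pre>
<p>Using <code>errors.As</code> means the check still works when the service layer wraps the driver error with extra context.</p>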
<h3 id="heading-automatic-retries-handling-real-world-concurrency">Automatic Retries: Handling Real-World Concurrency</h3>
<p>When a serialization error occurs, the code automatically retries with exponential backoff:</p>
<pre><code class="language-go">func retryWait(attempt int) time.Duration {
    base := 50 * time.Millisecond
    for i := 0; i &lt; attempt; i++ {
        base *= 2
        if base &gt;= time.Second {
            return time.Second
        }
    }
    return base
}

func sleepWithContext(ctx context.Context, d time.Duration) error {
    select {
    case &lt;-ctx.Done():
        return ctx.Err()
    case &lt;-time.After(d):
        return nil
    }
}
</code></pre>
<p>The backoff starts at 50ms and doubles each attempt, capping at 1 second. Up to 10 attempts are made. If the client disconnects mid-retry, <code>sleepWithContext</code> detects the cancelled context and returns immediately. This means no wasted resources.</p>
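<p>It helps to see the resulting schedule written out. The sketch below restates the same doubling-with-cap rule in a compact <code>backoff</code> function (a hypothetical rewrite that relies on Go 1.22+ range-over-integer loops, not the project's code) and adds up the worst-case time spent waiting across all ten attempts.</p>
<pre><code class="language-go">package main

import (
	"fmt"
	"time"
)

const maxAttempts = 10

// backoff starts at 50ms, doubles once per previous attempt, and never
// exceeds one second.
func backoff(attempt int) time.Duration {
	d := 50 * time.Millisecond
	for range attempt {
		d = min(d*2, time.Second)
	}
	return d
}

// worstCaseWait sums the waits between attempts. There is no wait after the
// final attempt, so only maxAttempts-1 waits are counted.
func worstCaseWait() time.Duration {
	var total time.Duration
	for attempt := range maxAttempts - 1 {
		total += backoff(attempt)
	}
	return total
}

func main() {
	for attempt := range maxAttempts - 1 {
		fmt.Println("wait after attempt", attempt, "is", backoff(attempt))
	}
	fmt.Println("worst case:", worstCaseWait()) // prints worst case: 5.55s
}
</code></pre>
<p>So a request that loses every conflict spends at most about five and a half seconds sleeping, plus the time of the ten attempts themselves.</p>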
<h2 id="heading-the-service-layer-where-business-logic-meets-double-entry">The Service Layer: Where Business Logic Meets Double-Entry</h2>
<p>The service layer is the heart of the system. Its job is to translate business operations – deposits, withdrawals, transfers – into double-entry journal entries that always balance.</p>
<h3 id="heading-deposit-crediting-the-user-debiting-the-settlement">Deposit: Crediting the User, Debiting the Settlement</h3>
<p>Every deposit creates two entries: a credit to the user's account and a matching debit to the settlement account. Both entries share the same transaction ID.</p>
<pre><code class="language-go">func (s *LedgerService) Deposit(ctx context.Context, accountID uuid.UUID, amountStr string) error {
    amount, err := validatePositiveAmount(amountStr)
    if err != nil {
        return err
    }
    return s.store.ExecTx(ctx, func(q *sqlc.Queries) error {
        settlement, err := q.GetSettlementAccountForUpdate(ctx)
        if err != nil {
            return fmt.Errorf("settlement account not found: %w", err)
        }
        account, err := q.GetAccountForUpdate(ctx, accountID)
        if err != nil {
            return fmt.Errorf("account not found: %w", err)
        }
        if account.Currency != settlement.Currency {
            return ErrCurrencyMismatch
        }
        txID := uuid.New()
        // 1. Credit user account
        _, err = q.CreateEntry(ctx, sqlc.CreateEntryParams{
            AccountID:     accountID,
            Debit:         decimal.Zero.StringFixed(4),
            Credit:        amount.StringFixed(4),
            TransactionID: txID,
            OperationType: "deposit",
            Description:   sql.NullString{String: "External deposit", Valid: true},
        })
        if err != nil { return err }
        // 2. Debit settlement (opposing entry)
        _, err = q.CreateEntry(ctx, sqlc.CreateEntryParams{
            AccountID:     settlement.ID,
            Debit:         amount.StringFixed(4),
            Credit:        decimal.Zero.StringFixed(4),
            TransactionID: txID,
            OperationType: "deposit",
            Description:   sql.NullString{String: fmt.Sprintf("Deposit to account %s", accountID), Valid: true},
        })
        if err != nil { return err }
        // 3. Update both balances atomically
        if err = q.UpdateAccountBalance(ctx, sqlc.UpdateAccountBalanceParams{
            Balance: amount.StringFixed(4), ID: accountID,
        }); err != nil { return err }
        return q.UpdateAccountBalance(ctx, sqlc.UpdateAccountBalanceParams{
            Balance: amount.Neg().StringFixed(4), ID: settlement.ID,
        })
    })
}
</code></pre>
<p>Two things are worth highlighting. First, both accounts are locked with <code>GetAccountForUpdate</code> and <code>GetSettlementAccountForUpdate</code> before any entries are written. This prevents any other concurrent transaction from reading a stale balance and acting on it.</p>
<p>Second, <code>amount.Neg()</code> is used to debit the settlement. Its balance goes down, representing real money now held inside the system.</p>
<h3 id="heading-withdraw-debiting-the-user-crediting-the-settlement">Withdraw: Debiting the User, Crediting the Settlement</h3>
<p>Withdrawals are the mirror image of deposits. The key difference is the insufficient funds check, which must happen inside the transaction after the lock is acquired:</p>
<pre><code class="language-go">balanceDec, err := decimal.NewFromString(account.Balance)
if err != nil {
    return errors.New("invalid balance")
}
if balanceDec.LessThan(amount) {
    return ErrInsufficientFunds
}
</code></pre>
<p>Checking balance inside the transaction after <code>FOR UPDATE</code> is critical. Checking it before, outside the transaction, would create a classic time-of-check-to-time-of-use (TOCTOU) race. Two concurrent withdrawals could both pass the check, then both execute, overdrawing the account.</p>
<p>The entries for a $500 withdrawal look like this:</p>
<pre><code class="language-plaintext">| Account              | Debit   | Credit  |
|----------------------|---------|---------|
| User Account         | 500     |         |
| Settlement Account   |         | 500     |
</code></pre>
<p>The settlement is credited because real money is leaving the system, and it's being "returned" to the outside world.</p>
<h3 id="heading-transfer-user-to-user-no-settlement-involved">Transfer: User-to-User, No Settlement Involved</h3>
<p>Transfers move money directly between two user accounts. The settlement account isn't involved. Both accounts are locked, currency is validated, and an insufficient funds check runs before any entries are created:</p>
<pre><code class="language-go">func (s *LedgerService) Transfer(ctx context.Context, fromID, toID uuid.UUID, amountStr string) error {
    amount, err := validatePositiveAmount(amountStr)
    if err != nil { return err }
    if fromID == toID {
        return ErrSameAccountTransfer
    }
    return s.store.ExecTx(ctx, func(q *sqlc.Queries) error {
        fromAcc, err := q.GetAccountForUpdate(ctx, fromID)
        if err != nil { return err }
        toAcc, err := q.GetAccountForUpdate(ctx, toID)
        if err != nil { return err }
        if fromAcc.Currency != toAcc.Currency {
            return ErrCurrencyMismatch
        }
        fromBalance, err := decimal.NewFromString(fromAcc.Balance)
        if err != nil {
            return errors.New("invalid balance")
        }
        if fromBalance.LessThan(amount) {
            return ErrInsufficientFunds
        }
        txID := uuid.New()
        // Debit sender, credit receiver — same transaction ID
        // ... CreateEntry calls + UpdateAccountBalance calls
    })
}
</code></pre>
<p>A $200 transfer creates exactly two entries under the same <code>transaction_id</code>:</p>
<pre><code class="language-plaintext">| Account  | Debit   | Credit  |
|----------|---------|---------|
| Sender   | 200     |         |
| Receiver |         | 200     |
</code></pre>
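<p>As a rough illustration of the invariant behind that table, the sketch below builds the two entries for a transfer and checks that total debits equal total credits. The <code>Entry</code> type, the <code>transferEntries</code> and <code>isBalanced</code> helpers, and the use of integer minor units are assumptions made for this example, not the project's actual schema or code.</p>

```go
package main

import "errors"

// Entry is a hypothetical, simplified ledger row. Amounts are held as
// integer minor units (ten-thousandths) so the arithmetic stays exact.
type Entry struct {
	TransactionID string
	AccountID     string
	Debit         int64
	Credit        int64
}

// transferEntries returns the debit/credit pair for a user-to-user transfer.
// Both rows share one transaction ID so they can be audited together.
func transferEntries(txID, fromID, toID string, amount int64) ([]Entry, error) {
	if amount <= 0 {
		return nil, errors.New("amount must be positive")
	}
	if fromID == toID {
		return nil, errors.New("cannot transfer to the same account")
	}
	return []Entry{
		{TransactionID: txID, AccountID: fromID, Debit: amount},
		{TransactionID: txID, AccountID: toID, Credit: amount},
	}, nil
}

// isBalanced reports whether total debits equal total credits.
func isBalanced(entries []Entry) bool {
	var debits, credits int64
	for _, e := range entries {
		debits += e.Debit
		credits += e.Credit
	}
	return debits == credits
}
```

<p>With four decimal places, a $200 transfer would be passed as <code>2000000</code>. A check like <code>isBalanced</code> could run over every <code>transaction_id</code> as a cheap sanity test alongside reconciliation.</p>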
<h3 id="heading-reconcileaccount-trust-but-verify">ReconcileAccount: Trust, But Verify</h3>
<p>Reconciliation is how you prove the system is correct. The <code>ReconcileAccount</code> function compares the stored <code>balance</code> column against the sum of all credits minus debits in the entries table:</p>
<pre><code class="language-go">func (s *LedgerService) ReconcileAccount(ctx context.Context, accountID uuid.UUID) (bool, error) {
    account, err := s.store.GetAccount(ctx, accountID)
    if err != nil { return false, fmt.Errorf("account not found: %w", err) }

    calculatedStr, err := s.store.GetAccountBalance(ctx, accountID)
    if err != nil { return false, fmt.Errorf("failed to calculate balance: %w", err) }

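    calculated, err := decimal.NewFromString(calculatedStr)
    if err != nil { return false, fmt.Errorf("invalid calculated balance: %w", err) }
    stored, err := decimal.NewFromString(account.Balance)
    if err != nil { return false, fmt.Errorf("invalid stored balance: %w", err) }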

    if !stored.Equal(calculated) {
        log.Error().
            Str("stored_balance", account.Balance).
            Str("calculated", calculated.StringFixed(4)).
            Msg("Balance mismatch detected")
        return false, fmt.Errorf("balance mismatch: stored %s, calculated %s",
            account.Balance, calculated.StringFixed(4))
    }
    return true, nil
}
</code></pre>
<p>If they don't match, something has gone wrong: a bug, a direct database modification, or a race condition that slipped through. In production, this check can run as a background job to catch issues before they become incidents.</p>
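<p>To make the arithmetic concrete, here is a minimal, database-free sketch of the same comparison. It uses <code>math/big</code> from the standard library instead of the project's decimal package, and the <code>LedgerEntry</code> type and <code>reconcile</code> function are hypothetical names chosen for illustration.</p>

```go
package main

import (
	"fmt"
	"math/big"
)

// LedgerEntry is a hypothetical row from the entries table, with amounts
// kept as decimal strings the way a NUMERIC column is often read.
type LedgerEntry struct {
	Debit  string
	Credit string
}

// parseAmount converts a decimal string into an exact rational number.
// An empty string is treated as zero.
func parseAmount(s string) (*big.Rat, error) {
	if s == "" {
		return new(big.Rat), nil
	}
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return nil, fmt.Errorf("invalid amount %q", s)
	}
	return r, nil
}

// reconcile recomputes credits minus debits and compares the result
// to the stored balance.
func reconcile(stored string, entries []LedgerEntry) (bool, error) {
	want, err := parseAmount(stored)
	if err != nil {
		return false, err
	}
	sum := new(big.Rat)
	for _, e := range entries {
		credit, err := parseAmount(e.Credit)
		if err != nil {
			return false, err
		}
		debit, err := parseAmount(e.Debit)
		if err != nil {
			return false, err
		}
		sum.Add(sum, credit)
		sum.Sub(sum, debit)
	}
	return sum.Cmp(want) == 0, nil
}
```

<p>For example, credits of 100.0000 and 0.2500 with a debit of 30.5000 reconcile against a stored balance of 69.75, and fail against 70.00.</p>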
<h2 id="heading-the-api-layer-secure-predictable-and-boring-by-design">The API Layer: Secure, Predictable, and Boring (By Design)</h2>
<p>The API layer is where your business logic meets the outside world. Its job is to be secure, predictable, and, if you've done things right, a little bit boring.</p>
<h3 id="heading-jwt-authentication-secrets-matter">JWT Authentication: Secrets Matter</h3>
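<p>Authentication is handled with JWTs. The secret used to sign tokens must be at least 32 characters long, because short HMAC secrets are practical to brute-force; 32 bytes also matches the 256-bit minimum key size that the JWA specification (RFC 7518) sets for HS256. This is enforced at startup:</p>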
<pre><code class="language-go">// internal/api/middleware.go
func InitTokenAuth(secret string) error {
    if secret == "" {
        return errors.New("JWT_SECRET environment variable is required")
    }
    if len(secret) &lt; 32 {
        return errors.New("JWT_SECRET must be at least 32 characters")
    }
    TokenAuth = jwtauth.New("HS256", []byte(secret), nil)
    return nil
}
</code></pre>
<p>The server will refuse to start if the secret is missing or too short. There's no fallback and no default: the system fails loudly rather than running insecurely.</p>
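<p>If you'd rather generate a suitable secret from Go than from the command line, a sketch along these lines should work. It uses only the standard library; <code>generateSecret</code> and <code>validateSecret</code> are hypothetical helpers, with the latter simply restating the startup rule so the two can be checked together.</p>

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"errors"
)

// generateSecret returns 32 random bytes encoded as base64, similar in
// spirit to running `openssl rand -base64 32`.
func generateSecret() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

// validateSecret restates the startup rule described above
// (hypothetical helper, not part of the project).
func validateSecret(secret string) error {
	if secret == "" {
		return errors.New("JWT_SECRET environment variable is required")
	}
	if len(secret) < 32 {
		return errors.New("JWT_SECRET must be at least 32 characters")
	}
	return nil
}
```

<p>Base64-encoding 32 bytes yields a 44-character string, comfortably above the 32-character minimum.</p>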
<h3 id="heading-the-handler-pattern-parse-authorize-validate-call-respond">The Handler Pattern: Parse, Authorize, Validate, Call, Respond</h3>
<p>Every handler that touches an account follows the same recipe: extract JWT claims, parse the account ID, fetch the account and verify ownership, decode the request body, call the service, and respond. Authorization always happens before calling the service layer. The service knows nothing about users, which keeps business logic clean and testable. Unauthenticated endpoints such as <code>Register</code> skip the authorization steps and begin by decoding the request body:</p>
<pre><code class="language-go">// internal/api/handler.go
func (h *Handler) Register(w http.ResponseWriter, r *http.Request) {
    var input struct {
        Email    string `json:"email"`
        Password string `json:"password"`
    }
    if err := json.NewDecoder(r.Body).Decode(&amp;input); err != nil {
        respondError(w, http.StatusBadRequest, "invalid input")
        return
    }
    // ... hash password, create user, generate JWT ...
}
</code></pre>
<h3 id="heading-amount-normalization-defensive-by-default">Amount Normalization: Defensive by Default</h3>
<p>API clients send amounts in different formats –&nbsp;sometimes as strings, sometimes as numbers. The normalization logic ensures all amounts are handled safely:</p>
<pre><code class="language-go">// internal/api/amount.go
func normalizeAmountInput(value interface{}) (string, error) {
    switch v := value.(type) {
    case string:
        return strings.TrimSpace(v), nil
    case json.Number:
        return strings.TrimSpace(v.String()), nil
    case float64:
        return strconv.FormatFloat(v, 'f', -1, 64), nil
    default:
        return "", errors.New("amount must be a number or string")
    }
}
</code></pre>
<p>The decoder uses <code>dec.UseNumber()</code> so JSON numbers arrive as <code>json.Number</code> rather than <code>float64</code>, preserving full precision. The <code>float64</code> case exists as a safety fallback only.</p>
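<p>The effect of <code>UseNumber()</code> is easiest to see in isolation. The sketch below is not the project's handler code: the <code>decodeAmount</code> function and the <code>{"amount": ...}</code> request shape are assumptions for illustration.</p>

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
)

// decodeAmount reads {"amount": ...} and returns the amount exactly as the
// client wrote it. The request shape is an assumption for this example.
func decodeAmount(body []byte) (string, error) {
	dec := json.NewDecoder(bytes.NewReader(body))
	dec.UseNumber() // numbers become json.Number instead of float64

	var req struct {
		Amount interface{} `json:"amount"`
	}
	if err := dec.Decode(&req); err != nil {
		return "", err
	}
	switch v := req.Amount.(type) {
	case string:
		return v, nil
	case json.Number:
		// json.Number keeps the original literal, digit for digit.
		return v.String(), nil
	default:
		return "", errors.New("amount must be a number or string")
	}
}
```

<p>An input such as <code>{"amount": 12345678901234567.89}</code> comes back as the string <code>12345678901234567.89</code>. Decoded into a <code>float64</code>, the same literal would be rounded, because it has more significant digits than a 64-bit float can hold.</p>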
<h3 id="heading-frontend-deployment-boundary">Frontend Deployment Boundary</h3>
<p>The backend no longer serves static frontend files. The frontend is deployed separately at <code>https://golangbank.app</code> from its own repository: <code>https://github.com/PaulBabatuyi/double-entry-bank</code>.</p>
<h2 id="heading-running-it-locally-your-first-end-to-end-test">Running It Locally: Your First End-to-End Test</h2>
<pre><code class="language-bash">git clone https://github.com/PaulBabatuyi/double-entry-bank-Go.git
cd double-entry-bank-Go
cp .env.example .env
# Edit .env — set JWT_SECRET with: openssl rand -base64 32
make postgres
make migrate-up
make server
</code></pre>
<p>Once the server is running:</p>
<ul>
<li><p><strong>Frontend</strong>: <a href="https://golangbank.app">https://golangbank.app</a></p>
</li>
<li><p><strong>Swagger UI</strong>: <a href="http://localhost:8080/swagger/index.html">http://localhost:8080/swagger/index.html</a> (local dev) or <a href="https://golangbank.app/swagger">https://golangbank.app/swagger</a> (production)</p>
</li>
<li><p><strong>Health check</strong>: <a href="http://localhost:8080/health">http://localhost:8080/health</a></p>
</li>
</ul>
<p>The Swagger UI lets you explore every endpoint, authorize with your JWT token, and test operations directly in the browser.</p>
<h2 id="heading-testing-prove-the-system-works">Testing: Prove the System Works</h2>
<p>Testing financial systems is non-negotiable, and claims about correctness need to be backed by code. This project tests all three layers, each targeting a different kind of failure.</p>
<h3 id="heading-service-layer-core-financial-logic">Service Layer: Core Financial Logic</h3>
<p>The most important tests live in <code>internal/service/ledger_test.go</code>. They run against a real PostgreSQL database – not mocks –&nbsp;because mock-based tests can give a false sense of security. Real database tests catch issues that only appear in production-like environments.</p>
<pre><code class="language-go">func TestDeposit_Success(t *testing.T) {
    ledger := setupTestLedger(t)
    accountID := createTestAccount(t, ledger, "0.00")

    err := ledger.Deposit(context.Background(), accountID, "100.00")
    require.NoError(t, err)

    balance := getAccountBalance(t, ledger, accountID)
    assert.Equal(t, "100.0000", balance)
}

func TestWithdraw_InsufficientFunds(t *testing.T) {
    ledger := setupTestLedger(t)
    accountID := createTestAccount(t, ledger, "50.00")

    err := ledger.Withdraw(context.Background(), accountID, "100.00")
    assert.ErrorIs(t, err, ErrInsufficientFunds)
}
</code></pre>
<p>The <code>createTestAccount</code> helper uses the settlement account's currency automatically, which is important: all accounts must share a currency for transfers to work, and tests that silently use a different currency will fail in confusing ways.</p>
<h3 id="heading-concurrency-test-proving-serializable-isolation-works">Concurrency Test: Proving Serializable Isolation Works</h3>
<p>This is the most important test in the suite:</p>
<pre><code class="language-go">func TestConcurrentDeposits(t *testing.T) {
    ledger := setupTestLedger(t)
    accountID := createTestAccount(t, ledger, "0.00")

    var wg sync.WaitGroup
    errs := make([]error, 2)
    for i := range errs {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            errs[i] = ledger.Deposit(context.Background(), accountID, "100.00")
        }(i)
    }
    wg.Wait()

    // Both deposits must succeed, not just leave a plausible balance.
    for _, err := range errs {
        require.NoError(t, err)
    }

    balance := getAccountBalance(t, ledger, accountID)
    assert.Equal(t, "200.0000", balance)
}
</code></pre>
<p>Two goroutines deposit simultaneously. The serializable isolation level and retry logic ensure both operations succeed and neither overwrites the other. Without the <code>FOR UPDATE</code> locks and transaction retry logic, this test would fail non-deterministically – which is exactly the kind of bug that's impossible to reproduce in development but devastating in production.</p>
<h3 id="heading-store-layer-transaction-mechanics">Store Layer: Transaction Mechanics</h3>
<p>Tests in <code>internal/db/store_test.go</code> verify the retry infrastructure itself, without needing a database connection:</p>
<pre><code class="language-go">func TestIsSerializationError(t *testing.T) {
    pqErr := &amp;pq.Error{Code: "40001"}
    assert.True(t, isSerializationError(pqErr))
    assert.False(t, isSerializationError(errors.New("some other error")))
}

func TestRetryWait(t *testing.T) {
    assert.Equal(t, 50*time.Millisecond, retryWait(0))
    assert.Equal(t, 100*time.Millisecond, retryWait(1))
    assert.Equal(t, 200*time.Millisecond, retryWait(2))
    assert.Equal(t, time.Second, retryWait(5)) // capped
}

func TestSleepWithContext_Cancel(t *testing.T) {
    ctx, cancel := context.WithCancel(context.Background())
    cancel() // cancel immediately
    err := sleepWithContext(ctx, 50*time.Millisecond)
    assert.Error(t, err) // should return immediately, not wait
}
</code></pre>
<h3 id="heading-api-layer-authentication-and-input-handling">API Layer: Authentication and Input Handling</h3>
<p>Handler tests in <code>internal/api/handler_test.go</code> verify that the HTTP layer behaves correctly at its boundaries:</p>
<pre><code class="language-go">func TestRegisterHandler_BadRequest(t *testing.T) {
    h := setupTestHandler(t)
    req := httptest.NewRequest(http.MethodPost, "/register", nil)
    rw := httptest.NewRecorder()
    h.Register(rw, req)
    assert.Equal(t, http.StatusBadRequest, rw.Code)
}

func TestRegisterHandler_Success(t *testing.T) {
    h := setupTestHandler(t)
    _ = InitTokenAuth("fV7sliKV3qn657I60wEFtw/Auk/0bNU9zdp30wFzfDg=")

    email := "testuser_" + uuid.New().String() + "@example.com"
    body, _ := json.Marshal(map[string]string{"email": email, "password": "testpassword123"})

    req := httptest.NewRequest(http.MethodPost, "/register", bytes.NewReader(body))
    rw := httptest.NewRecorder()
    h.Register(rw, req)
    assert.Equal(t, http.StatusCreated, rw.Code)
}
</code></pre>
<p>Using <code>uuid.New().String()</code> in the email ensures each test run creates a unique user, preventing conflicts on repeated runs against the same database.</p>
<p>Middleware tests verify the security boundary itself:</p>
<pre><code class="language-go">func TestInitTokenAuthFromEnv_MissingSecret(t *testing.T) {
    os.Unsetenv("JWT_SECRET")
    err := InitTokenAuthFromEnv()
    assert.Error(t, err) // must fail without a secret
}
</code></pre>
<h3 id="heading-running-the-tests">Running the Tests</h3>
<pre><code class="language-bash"># Start the database
make postgres

# Run all tests with race detection
make test

# Run with coverage report
make coverage

# Run tests the same way CI does (includes migrations)
make ci-test
</code></pre>
<p>The <code>-race</code> flag is non-negotiable for financial code. It instruments the binary to detect data races at runtime, including many that static analysis misses. The detector only reports races on code paths that actually execute, so it is only as effective as the concurrent scenarios your tests exercise.</p>
<h2 id="heading-deployment-engineering-decisions-that-matter-in-production">Deployment: Engineering Decisions That Matter in Production</h2>
<p>The deployment setup for this project reflects several engineering decisions worth understanding, regardless of what platform you deploy to.</p>
<h3 id="heading-migrations-on-container-start">Migrations on Container Start</h3>
<p>The Docker entrypoint runs <code>golang-migrate up</code> before starting the Go binary:</p>
<pre><code class="language-sh"># docker-entrypoint
migrate -path /app/postgres/migrations -database "$migrate_db_url" up
exec /usr/local/bin/ledger
</code></pre>
<p>Running migrations at startup rather than as a separate CI step has trade-offs. The upside is simplicity: the container is always self-consistent when it starts. The downside is that each deployment takes slightly longer. For a solo project or small team, this is the right call. At scale you'd separate migrations from deployment.</p>
<h3 id="heading-startup-retry-logic">Startup Retry Logic</h3>
<p>The entrypoint retries migrations up to 12 times with a 5-second sleep between attempts:</p>
<pre><code class="language-sh">max_attempts=12
attempt=1
while [ "$attempt" -le "$max_attempts" ]; do
    migration_output=$(migrate ... up 2&gt;&amp;1)
    # If "connection refused" or "timeout", keep retrying
    # If any other error, fail immediately
    attempt=$((attempt + 1))
done
</code></pre>
<p>The critical distinction is which errors trigger a retry. Network-transient errors (connection refused, timeout) are retried. Everything else&nbsp;–&nbsp;bad migration SQL, a missing table&nbsp;–&nbsp;fails immediately. This avoids waiting the full 60 seconds (12 attempts × 5 seconds) when a deployment has a real problem.</p>
<h3 id="heading-db-url-fallback-chain">DB URL Fallback Chain</h3>
<p>In cloud environments, the internal database URL is often a different variable than what you configure locally. The <code>resolveDBURL</code> function handles this transparently:</p>
<pre><code class="language-go">func resolveDBURL() string {
    connStr := strings.TrimSpace(os.Getenv("DB_URL"))
    fallbackVars := []string{"INTERNAL_DATABASE_URL", "RENDER_DATABASE_URL", "DATABASE_URL"}
    // Falls back through the chain if DB_URL is empty or resolves to localhost
    ...
}
</code></pre>
<p>This pattern means local developers set <code>DB_URL</code> in <code>.env</code> and don't need to think about it, while the deployed container automatically uses the internal database connection without any manual wiring.</p>
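<p>The snippet above elides the body of the function, so the following is only a guess at how such a fallback chain might look, not the project's code. The environment lookup is injected so the logic can be tested without touching the process environment, and the <code>pointsToLocalhost</code> check is a deliberately naive assumption.</p>

```go
package main

import "strings"

// resolveDBURLFrom sketches the fallback chain. getenv is injected (pass
// os.Getenv in real code) so the logic can be exercised in isolation.
func resolveDBURLFrom(getenv func(string) string) string {
	connStr := strings.TrimSpace(getenv("DB_URL"))
	if connStr != "" && !pointsToLocalhost(connStr) {
		return connStr
	}
	fallbackVars := []string{"INTERNAL_DATABASE_URL", "RENDER_DATABASE_URL", "DATABASE_URL"}
	for _, name := range fallbackVars {
		if v := strings.TrimSpace(getenv(name)); v != "" {
			return v
		}
	}
	// Nothing better was found: keep whatever DB_URL held (possibly localhost).
	return connStr
}

// pointsToLocalhost is a deliberately simple check used only for this sketch.
func pointsToLocalhost(connStr string) bool {
	return strings.Contains(connStr, "localhost") || strings.Contains(connStr, "127.0.0.1")
}
```

<p>In this version, a local <code>DB_URL</code> is kept when no platform variable is set, and an explicit non-local <code>DB_URL</code> always wins.</p>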
<h3 id="heading-http-server-timeouts">HTTP Server Timeouts</h3>
<p>The server is configured with explicit timeouts:</p>
<pre><code class="language-go">srv := &amp;http.Server{
    Addr:              ":" + port,
    Handler:           r,
    ReadTimeout:       15 * time.Second,
    WriteTimeout:      15 * time.Second,
    IdleTimeout:       60 * time.Second,
    ReadHeaderTimeout: 5 * time.Second,
}
</code></pre>
<p>Without timeouts, a slow or malicious client can hold connections open indefinitely, eventually exhausting the server's resources. <code>ReadHeaderTimeout</code> is particularly important: it limits how long the server waits for the HTTP headers before closing the connection, protecting against Slowloris-style attacks.</p>
<h2 id="heading-conclusion-building-for-the-real-world">Conclusion: Building for the Real World</h2>
<p>You've just walked through the core patterns that power real fintech systems:</p>
<ul>
<li><p>Double-entry ledger with database-enforced constraints</p>
</li>
<li><p>Settlement account for tracking external cash flows</p>
</li>
<li><p>Serializable transactions with exponential backoff retry</p>
</li>
<li><p>Reconciliation endpoint for verifying correctness</p>
</li>
<li><p>Type-safe queries with sqlc</p>
</li>
<li><p>Row-level locking to prevent race conditions</p>
</li>
<li><p>Tests that prove correctness under concurrency</p>
</li>
</ul>
<p>These aren't just Go patterns. They're the same principles used at companies like Monzo, Stripe, and Nubank. The implementation details differ, but the underlying ideas are the same: every dollar is accounted for, every operation is atomic, and the system can always explain where every penny went.</p>
<p>What's next? Three concrete next steps:</p>
<ol>
<li><p><strong>Add idempotency keys</strong> to prevent duplicate transactions on retries. If a client retries a deposit because of a network timeout, you need to detect and reject the duplicate.</p>
</li>
<li><p><strong>Add Prometheus metrics</strong> for transaction latency and failure rates. You want to know when your p99 latency spikes before your users do.</p>
</li>
<li><p><strong>Add a scheduled reconciliation job</strong> that runs <code>ReconcileAccount</code> for every account on a schedule and alerts on mismatches. Catch bugs automatically, before they become customer complaints.</p>
</li>
</ol>
<p>The developer who stores balance as a single number and updates it directly will eventually have an incident. The developer who builds a ledger has an audit trail, a reconciliation tool, and a system that can explain every penny.</p>
<p>That's the real reason fintech engineers build this way: not because it's more complex, but because it's more honest about what money actually is.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Implement the Outbox Pattern in Go and PostgreSQL ]]>
                </title>
                <description>
                    <![CDATA[ In event-driven systems, two things need to happen when you process a request: you need to save data to your database, and you need to publish an event to a message broker so other services know somet ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-implement-the-outbox-pattern-in-go-and-postgresql/</link>
                <guid isPermaLink="false">69bc31b3b238fd45a31f1291</guid>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Databases ]]>
                    </category>
                
                    <category>
                        <![CDATA[ golang ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Alex Pliutau ]]>
                </dc:creator>
                <pubDate>Thu, 19 Mar 2026 17:26:11 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/7a24b5a7-6619-4997-b24c-c4a743f37c33.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>In event-driven systems, two things need to happen when you process a request: you need to save data to your database, and you need to publish an event to a message broker so other services know something changed.</p>
<p>These two operations look simple, but they hide a dangerous reliability problem. What if the database write succeeds but the message broker is temporarily unreachable? Or your service crashes between the two steps? You end up in an inconsistent state: your database has the new data, but the rest of the system never heard about it.</p>
<p>The <strong>Outbox Pattern</strong> is a well-established solution to this problem. In this tutorial, you'll learn what the pattern is, why it works, and how to implement it in Go with PostgreSQL and Google Cloud Pub/Sub.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Before reading this tutorial, you should be familiar with:</p>
<ul>
<li><p>The basics of the Go programming language</p>
</li>
<li><p>SQL and PostgreSQL</p>
</li>
<li><p>The concept of database transactions</p>
</li>
<li><p>Basic familiarity with event-driven or distributed systems (helpful but not required)</p>
</li>
</ul>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><a href="#heading-the-problem-two-operations-no-atomicity">The Problem: Two Operations, No Atomicity</a></p>
</li>
<li><p><a href="#heading-how-the-outbox-pattern-works">How the Outbox Pattern Works</a></p>
</li>
<li><p><a href="#heading-the-outbox-table-schema">The Outbox Table Schema</a></p>
</li>
<li><p><a href="#heading-the-message-relay">The Message Relay</a></p>
</li>
<li><p><a href="#heading-go-and-postgresql-implementation">Go and PostgreSQL Implementation</a></p>
<ul>
<li><p><a href="#heading-the-orders-service">The Orders Service</a></p>
</li>
<li><p><a href="#heading-the-relay-service">The Relay Service</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-why-messages-can-be-delivered-more-than-once">Why Messages Can Be Delivered More Than Once</a></p>
</li>
<li><p><a href="#heading-alternative-postgresql-logical-replication">Alternative: PostgreSQL Logical Replication</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ol>
<h2 id="heading-the-problem-two-operations-no-atomicity">The Problem: Two Operations, No Atomicity</h2>
<p>To understand why the Outbox Pattern exists, you need to understand a core challenge in distributed systems: <strong>atomicity across different systems</strong>.</p>
<p>In a relational database, a <strong>transaction</strong> lets you group multiple operations so they either all succeed or all fail together. If you insert a row and update another row in the same transaction, you're guaranteed that both happen – or neither does.</p>
<p>The problem arises when you try to extend this guarantee <em>across two different systems:</em> for example, your database and your message broker (like Kafka, RabbitMQ, or Pub/Sub). These systems don't share a transaction boundary.</p>
<p>Here's a typical event-driven flow that breaks without the Outbox Pattern:</p>
<ol>
<li><p>A user places an order.</p>
</li>
<li><p>Your service saves the order to the database ✅</p>
</li>
<li><p>Your service publishes an <code>order.created</code> event to the message broker ❌ (broker is down)</p>
</li>
<li><p>The order exists in the database, but downstream services never learned about it.</p>
</li>
</ol>
<p>Or the reverse failure:</p>
<ol>
<li><p>Your service publishes the event first ✅</p>
</li>
<li><p>Your service tries to save the order to the database ❌ (database times out)</p>
</li>
<li><p>Downstream services received a notification for an order that doesn't exist.</p>
</li>
</ol>
<p>Either scenario leaves your system in an inconsistent state. This is the core problem the Outbox Pattern solves.</p>
<p>Here's what the process looks like when not using the Outbox Pattern:</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ea89c91fdc930d846b413ab/9f9abcaa-adc8-48ab-b8cb-c47cb731724e.png" alt="diagram without outbox" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<h2 id="heading-how-the-outbox-pattern-works">How the Outbox Pattern Works</h2>
<p>The Outbox Pattern solves the atomicity problem by keeping both operations <em>inside</em> the database:</p>
<ol>
<li><p>Your service saves its business data (for example, a new order) to the database.</p>
</li>
<li><p>In the same database transaction, it writes the event message to a special table called the outbox table.</p>
</li>
<li><p>A separate background process called the Message Relay polls the outbox table and publishes pending messages to the broker.</p>
</li>
<li><p>Once the broker confirms receipt, the relay marks the message as processed.</p>
</li>
</ol>
<p>Because steps 1 and 2 happen in the same database transaction, they are <strong>atomic</strong>. Either both succeed or neither does. You can never end up with saved data but no corresponding event queued – or an event queued for data that was never saved.</p>
<p>The message is never published directly to the broker in your main application code. Instead, the database acts as a reliable staging area.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ea89c91fdc930d846b413ab/ef5413b0-6c8e-42b8-949f-5eefe3844231.png" alt="diagram with outbox" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<h2 id="heading-the-outbox-table-schema">The Outbox Table Schema</h2>
<p>The outbox table stores pending messages until the relay picks them up. Here's a typical PostgreSQL schema:</p>
<pre><code class="language-sql">CREATE TABLE outbox (
    id          uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    topic       varchar(255)  NOT NULL,
    message     jsonb         NOT NULL,
    state       varchar(50)   NOT NULL DEFAULT 'pending',
    created_at  timestamptz   NOT NULL DEFAULT now(),
    processed_at timestamptz
);
</code></pre>
<p>Let's walk through each column:</p>
<ul>
<li><p><code>id</code>: A unique identifier for each message. Using UUIDs makes it easy to reference specific messages.</p>
</li>
<li><p><code>topic</code>: The destination topic or queue name in your message broker (for example, <code>orders.created</code>).</p>
</li>
<li><p><code>message</code>: The event payload, stored as JSON. This is the data your consumers will receive.</p>
</li>
<li><p><code>state</code>: Tracks whether the message has been sent. The two main values are <code>pending</code> (waiting to be published) and <code>processed</code> (successfully published).</p>
</li>
<li><p><code>created_at</code>: When the message was inserted. The relay uses this to process messages in order.</p>
</li>
<li><p><code>processed_at</code>: When the relay successfully published the message.</p>
</li>
</ul>
<p>You may want to add additional columns depending on your needs: for example, a <code>retry_count</code> column to track how many times the relay has attempted to send a message, or an <code>error</code> column to log failure reasons.</p>
<h2 id="heading-the-message-relay">The Message Relay</h2>
<p>The Message Relay is a background process (often a goroutine, a sidecar, or a separate service) that bridges the outbox table and the message broker.</p>
<p>Its responsibilities are:</p>
<ol>
<li><p>Periodically query the outbox table for messages with <code>state = 'pending'</code>.</p>
</li>
<li><p>Publish each message to the appropriate topic in the broker.</p>
</li>
<li><p>Once the broker confirms delivery, update the row's <code>state</code> to <code>'processed'</code>.</p>
</li>
<li><p>Handle failures gracefully: if publishing fails, leave the message as <code>'pending'</code> so it will be retried.</p>
</li>
</ol>
<p>This design gives you <strong>at-least-once delivery</strong>: every committed message is eventually sent, even if the relay crashes and restarts. The trade-off is that a message might occasionally be sent more than once (more on this below), so your consumers should handle duplicates.</p>
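<p>On the consumer side, handling duplicates usually means remembering which message IDs have already been processed. The sketch below shows the idea with an in-memory set. The <code>Deduplicator</code> type is hypothetical, and a real consumer would more likely persist processed IDs (for example, in a table with a unique constraint) so they survive restarts.</p>

```go
package main

import "sync"

// Deduplicator remembers which message IDs have already been handled.
// It is an in-memory stand-in for a durable store of processed IDs.
type Deduplicator struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

// NewDeduplicator returns an empty Deduplicator ready for use.
func NewDeduplicator() *Deduplicator {
	return &Deduplicator{seen: make(map[string]struct{})}
}

// Handle runs fn only the first time a given message ID is seen and
// reports whether fn was invoked. If fn fails, the ID is not recorded,
// so a redelivery gets another chance.
func (d *Deduplicator) Handle(id string, fn func() error) (bool, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, dup := d.seen[id]; dup {
		return false, nil
	}
	if err := fn(); err != nil {
		return true, err
	}
	d.seen[id] = struct{}{}
	return true, nil
}
```

<p>Holding the lock while <code>fn</code> runs keeps the sketch simple but serializes all handlers, which is a trade-off you'd revisit for a high-throughput consumer.</p>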
<h2 id="heading-go-and-postgresql-implementation">Go and PostgreSQL Implementation</h2>
<p>Let's build a concrete example. Imagine you have an orders service. When a new order is created, you want to:</p>
<ol>
<li><p>Save the order to a PostgreSQL <code>orders</code> table.</p>
</li>
<li><p>Publish an <code>order.created</code> event to Google Cloud Pub/Sub.</p>
</li>
</ol>
<p>You'll use <a href="https://github.com/jackc/pgx">pgx</a> for the PostgreSQL driver.</p>
<h3 id="heading-the-orders-service">The Orders Service</h3>
<p>The key insight is that the order insert and the outbox insert happen <strong>inside the same transaction</strong>. If anything goes wrong, both are rolled back.</p>
<pre><code class="language-go">// orders/main.go

package main

import (
	"context"
	"encoding/json"
	"log"
	"os"

	"github.com/google/uuid"
	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

// Order represents a customer order in our system.
type Order struct {
	ID       uuid.UUID `json:"id"`
	Product  string    `json:"product"`
	Quantity int       `json:"quantity"`
}

// OrderCreatedEvent is the payload published to the message broker.
// It contains only the fields that downstream services need to know about.
type OrderCreatedEvent struct {
	OrderID uuid.UUID `json:"order_id"`
	Product string    `json:"product"`
}

// createOrderInTx saves a new order and its outbox event atomically.
// Both operations share the same transaction (tx), so either both succeed
// or both are rolled back — ensuring consistency.
func createOrderInTx(ctx context.Context, tx pgx.Tx, order Order) error {
	// Step 1: Insert the business data (the actual order).
	_, err := tx.Exec(ctx,
		"INSERT INTO orders (id, product, quantity) VALUES ($1, $2, $3)",
		order.ID, order.Product, order.Quantity,
	)
	if err != nil {
		return err
	}
	log.Printf("Inserted order %s into database", order.ID)

	// Step 2: Serialize the event payload that consumers will receive.
	event := OrderCreatedEvent{
		OrderID: order.ID,
		Product: order.Product,
	}
	msg, err := json.Marshal(event)
	if err != nil {
		return err
	}

	// Step 3: Write the event to the outbox table.
	// This does NOT publish to Pub/Sub — it just queues it for the relay.
	_, err = tx.Exec(ctx,
		"INSERT INTO outbox (topic, message) VALUES ($1, $2)",
		"orders.created", msg,
	)
	if err != nil {
		return err
	}
	log.Printf("Inserted outbox event for order %s", order.ID)

	return nil
}

func main() {
	ctx := context.Background()

	pool, err := pgxpool.New(ctx, os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatalf("Unable to connect to database: %v", err)
	}
	defer pool.Close()

	// Begin a transaction that will cover both the order insert
	// and the outbox insert.
	tx, err := pool.Begin(ctx)
	if err != nil {
		log.Fatalf("Unable to begin transaction: %v", err)
	}
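	// Safety net: Rollback is a no-op once Commit has succeeded.
	// (log.Fatalf exits without running deferred calls; in that case the
	// uncommitted transaction is discarded when the connection closes.)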
	defer tx.Rollback(ctx)

	newOrder := Order{
		ID:       uuid.New(),
		Product:  "Super Widget",
		Quantity: 10,
	}

	if err := createOrderInTx(ctx, tx, newOrder); err != nil {
		log.Fatalf("Failed to create order: %v", err)
	}

	// Committing the transaction makes both writes permanent simultaneously.
	if err := tx.Commit(ctx); err != nil {
		log.Fatalf("Failed to commit transaction: %v", err)
	}

	log.Println("Successfully created order and queued outbox event.")
}
</code></pre>
<p>Notice that <code>createOrderInTx</code> receives a <code>pgx.Tx</code> (a transaction) rather than a pool connection. This is intentional: it enforces that the caller is responsible for managing the transaction boundary, making the atomicity guarantee explicit.</p>
<h3 id="heading-the-relay-service">The Relay Service</h3>
<p>The relay runs as a separate background process. It polls the outbox table, publishes messages, and marks them as processed.</p>
<p>A critical detail here is the use of <code>FOR UPDATE SKIP LOCKED</code> in the SQL query. This PostgreSQL feature lets you run <strong>multiple relay instances</strong> concurrently without them stepping on each other. When one instance locks a row to process it, other instances skip that row and move on to the next one.</p>
<pre><code class="language-go">// relay/main.go

package main

import (
	"context"
	"log"
	"time"

	"cloud.google.com/go/pubsub"
	"github.com/google/uuid"
	"github.com/jackc/pgx/v5/pgxpool"
)

// OutboxMessage mirrors the columns we need from the outbox table.
type OutboxMessage struct {
	ID      uuid.UUID
	Topic   string
	Message []byte
}

// processOutboxMessages picks up one pending message, publishes it to Pub/Sub,
// and marks it as processed — all within a single database transaction.
func processOutboxMessages(ctx context.Context, pool *pgxpool.Pool, pubsubClient *pubsub.Client) error {
	tx, err := pool.Begin(ctx)
	if err != nil {
		return err
	}
	defer tx.Rollback(ctx)

	// Query for the next pending message.
	// FOR UPDATE SKIP LOCKED ensures that if multiple relay instances are
	// running, they won't try to process the same message simultaneously.
	rows, err := tx.Query(ctx, `
		SELECT id, topic, message
		FROM outbox
		WHERE state = 'pending'
		ORDER BY created_at
		LIMIT 1
		FOR UPDATE SKIP LOCKED
	`)
	if err != nil {
		return err
	}
	defer rows.Close()

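	var msg OutboxMessage
	if !rows.Next() {
		// No pending messages, or the query failed mid-stream:
		// rows.Err() is nil in the first case and non-nil in the second.
		return rows.Err()
	}
	if err := rows.Scan(&amp;msg.ID, &amp;msg.Topic, &amp;msg.Message); err != nil {
		return err
	}
	// Close the result set before running another statement on this
	// transaction; pgx reports the connection as busy while rows are open.
	rows.Close()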

	log.Printf("Publishing message %s to topic %s", msg.ID, msg.Topic)

	// Publish the message to the Pub/Sub topic and wait for confirmation.
	result := pubsubClient.Topic(msg.Topic).Publish(ctx, &amp;pubsub.Message{
		Data: msg.Message,
	})
	if _, err = result.Get(ctx); err != nil {
		// Publishing failed. We return the error here without committing,
		// so the transaction rolls back and the message stays 'pending'.
		// The relay will retry it on the next polling interval.
		return err
	}

	// Mark the message as processed now that the broker has confirmed receipt.
	_, err = tx.Exec(ctx,
		"UPDATE outbox SET state = 'processed', processed_at = now() WHERE id = $1",
		msg.ID,
	)
	if err != nil {
		return err
	}
	log.Printf("Marked message %s as processed", msg.ID)

	// Commit the transaction: the state update becomes permanent.
	return tx.Commit(ctx)
}

func main() {
	// In production, initialize real connections using environment variables
	// or a config file. These are left as placeholders for clarity.
	var (
		pool         *pgxpool.Pool
		pubsubClient *pubsub.Client
	)

	// Poll the outbox table every second.
	// Adjust the interval based on your latency requirements.
	ticker := time.NewTicker(1 * time.Second)
	defer ticker.Stop()

	for range ticker.C {
		if err := processOutboxMessages(context.Background(), pool, pubsubClient); err != nil {
			log.Printf("Error processing outbox: %v", err)
		}
	}
}
</code></pre>
<p>The polling interval (1 second in this example) sets the worst-case delay between an event being written to the outbox and the relay picking it up. Because this relay handles one message per tick, a backlog drains at one message per interval per relay instance, so consider selecting a batch of rows per transaction if you need more throughput. For most use cases, 1–5 seconds is perfectly acceptable. If you need lower latency, you can reduce the interval, or consider using PostgreSQL's <code>LISTEN/NOTIFY</code> feature to wake up the relay immediately when a new row is inserted.</p>
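<p>As a rough sketch of the <code>LISTEN/NOTIFY</code> option, a trigger on the outbox table can signal the relay on insert. The function, trigger, and channel names below are hypothetical, and the syntax assumes PostgreSQL 11 or later:</p>
<pre><code class="language-sql">-- Hypothetical names: notify_outbox_insert, outbox_notify, channel outbox_new.
CREATE OR REPLACE FUNCTION notify_outbox_insert() RETURNS trigger AS $$
BEGIN
    -- Send the new row's id as the payload. PostgreSQL delivers the
    -- notification only once the inserting transaction commits.
    PERFORM pg_notify('outbox_new', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER outbox_notify
AFTER INSERT ON outbox
FOR EACH ROW EXECUTE FUNCTION notify_outbox_insert();

-- Run on the relay's dedicated connection to subscribe to the channel.
LISTEN outbox_new;
</code></pre>
<p>Notifications reach only the sessions that are listening at that moment, so it's safer to keep a slower polling loop as a fallback rather than rely on <code>NOTIFY</code> alone.</p>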
<h2 id="heading-why-messages-can-be-delivered-more-than-once">Why Messages Can Be Delivered More Than Once</h2>
<p>You might wonder: isn't the Outbox Pattern supposed to guarantee <em>exactly once</em> delivery?</p>
<p>It does not. It guarantees <strong>at-least-once</strong> delivery. Here's the edge case:</p>
<ol>
<li><p>The relay publishes the message to Pub/Sub successfully.</p>
</li>
<li><p>Before it can update the outbox row to <code>'processed'</code>, the relay process crashes.</p>
</li>
<li><p>On restart, the relay sees the message is still <code>'pending'</code> and publishes it again.</p>
</li>
</ol>
<p>This is a rare but possible scenario. The standard way to handle it is to design your message <strong>consumers to be idempotent</strong>. This means that they can safely receive and process the same message multiple times without causing incorrect behavior.</p>
<p>Common strategies for idempotency include:</p>
<ul>
<li><p>Using the message's <code>id</code> as a deduplication key, and checking if you've already processed it before acting.</p>
</li>
<li><p>Making your operations naturally idempotent. For example, using <code>INSERT ... ON CONFLICT DO NOTHING</code> instead of a plain <code>INSERT</code>.</p>
</li>
</ul>
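<p>Both strategies can be combined in a small deduplication table on the consumer side. This is a minimal sketch: the <code>processed_messages</code> table and the UUID value are hypothetical, and it assumes the outbox <code>id</code> reaches the consumer in the payload or as a message attribute (the relay above publishes only the message body).</p>
<pre><code class="language-sql">-- Hypothetical consumer-side table that records every handled message.
CREATE TABLE IF NOT EXISTS processed_messages (
    message_id   uuid PRIMARY KEY,
    processed_at timestamptz NOT NULL DEFAULT now()
);

BEGIN;

-- Claim the message. A redelivered message hits the primary key
-- and the statement reports 0 rows inserted.
INSERT INTO processed_messages (message_id)
VALUES ('3f2b8c1e-5d4a-4e6b-9c7f-1a2b3c4d5e6f')
ON CONFLICT (message_id) DO NOTHING;

-- If 0 rows were inserted, skip the work and commit.
-- Otherwise, apply the message's side effects here, in this same
-- transaction, so the claim and the effects succeed or fail together.

COMMIT;
</code></pre>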
<h2 id="heading-alternative-postgresql-logical-replication">Alternative: PostgreSQL Logical Replication</h2>
<p>The polling approach described above is simple and works well, but it has two drawbacks: it introduces some latency (up to one polling interval), and it issues database queries even when there's nothing to process.</p>
<p>For high-throughput systems where these trade-offs matter, PostgreSQL offers a more advanced alternative: <strong>logical replication</strong> via the <strong>Write-Ahead Log (WAL)</strong>.</p>
<p>Every change made to a PostgreSQL database is first written to the WAL – an append-only log used for crash recovery and replication. With logical replication, you can subscribe to changes in specific tables and receive them as a stream in near real-time.</p>
<p>Instead of your relay asking "Are there any new messages?" on a schedule, PostgreSQL streams each new outbox row to your relay as soon as the inserting transaction commits.</p>
<p>This approach is lower latency and more resource-efficient for high-volume workloads. The trade-off is added implementation complexity: you need to manage a replication slot in PostgreSQL and handle the WAL stream correctly.</p>
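<p>On the database side, the setup might look roughly like this. The publication and slot names are hypothetical, and the sketch assumes you have the privileges to change server settings and create replication slots:</p>
<pre><code class="language-sql">-- Logical decoding requires wal_level = 'logical'.
-- The change takes effect only after a server restart.
ALTER SYSTEM SET wal_level = 'logical';

-- Publish only INSERTs on the outbox table (hypothetical publication name).
CREATE PUBLICATION outbox_pub FOR TABLE outbox WITH (publish = 'insert');

-- Create a slot that uses the built-in pgoutput plugin (hypothetical slot name).
SELECT pg_create_logical_replication_slot('outbox_slot', 'pgoutput');

-- Check that the slot exists and whether a consumer is attached.
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
</code></pre>
<p>A slot that nobody consumes makes PostgreSQL retain WAL indefinitely, so drop slots you no longer use and monitor the ones you keep.</p>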
<p>In Go, you can use the <a href="https://github.com/jackc/pglogrepl">pglogrepl</a> library to interact with PostgreSQL's logical replication protocol.</p>
<p>For more details on how WAL and change data capture work in PostgreSQL, see the <a href="https://www.postgresql.org/docs/current/wal-intro.html">official Write-Ahead Logging documentation</a>.</p>
<img src="https://cdn.hashnode.com/uploads/covers/5ea89c91fdc930d846b413ab/c706cc8f-6bbd-49d1-90c0-8c4934c2718e.png" alt="diagram with WAL" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<h2 id="heading-conclusion">Conclusion</h2>
<p>The Outbox Pattern solves a fundamental problem in distributed systems: how do you reliably perform a database write and publish a message to a broker in a consistent way?</p>
<p>The key idea is to use your database as the source of truth for <em>both</em> the business data and the pending messages. By writing to the outbox table in the same transaction as your business data, you get atomic guarantees from the database itself: no distributed transaction protocol required.</p>
<p>Here's a quick summary of the key concepts:</p>
<ul>
<li><p><strong>The outbox table</strong> stores pending events as part of your regular database schema.</p>
</li>
<li><p><strong>The transaction</strong> wraps both the business write and the outbox write, making them atomic.</p>
</li>
<li><p><strong>The Message Relay</strong> is a background process that reads from the outbox and publishes to the broker.</p>
</li>
<li><p><strong>At-least-once delivery</strong> means your consumers must be idempotent.</p>
</li>
<li><p><code>FOR UPDATE SKIP LOCKED</code> allows multiple relay instances to run safely in parallel.</p>
</li>
<li><p><strong>Logical replication</strong> is an advanced alternative that avoids polling for high-throughput systems.</p>
</li>
</ul>
<p>The pattern is simple in concept, but there are several ways to implement it depending on your scale and infrastructure. The polling approach shown in this tutorial is a solid starting point for most applications.</p>
<h3 id="heading-resources">Resources</h3>
<ul>
<li><p><a href="https://github.com/plutov/packagemain/tree/master/outbox">Source code on GitHub</a></p>
</li>
<li><p><a href="https://www.postgresql.org/docs/current/wal-intro.html">PostgreSQL Write-Ahead Logging (WAL)</a></p>
</li>
<li><p><a href="https://github.com/jackc/pglogrepl">pglogrepl – Go library for PostgreSQL logical replication</a></p>
</li>
<li><p><a href="https://github.com/jackc/pgx">pgx – PostgreSQL driver and toolkit for Go</a></p>
</li>
<li><p><a href="https://packagemain.tech">Explore more Go tutorials on packagemain.tech</a></p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Elevate Your Database Game: Supercharging Query Performance with Postgres FDW ]]>
                </title>
                <description>
                    <![CDATA[ Foreign data wrappers (FDWs) make remote Postgres tables feel local. That convenience is exactly why FDW performance surprises are so common. A query that looks like a normal join can execute like a distributed system: rows move across the network, r... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/fdw-pushdown/</link>
                <guid isPermaLink="false">69963f00d35b661838993bd0</guid>
                
                    <category>
                        <![CDATA[ performance ]]>
                    </category>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Databases ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Hamdaan Ali ]]>
                </dc:creator>
                <pubDate>Wed, 18 Feb 2026 22:36:48 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1771357398917/8db8c3fd-9f16-4631-aa48-2537e8a4cb45.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Foreign data wrappers (FDWs) make remote Postgres tables feel local. That convenience is exactly why FDW performance surprises are so common.</p>
<p>A query that looks like a normal join can execute like a distributed system: rows move across the network, remote statements get executed repeatedly, and the local planner quietly becomes a coordinator. In that world, “fast SQL” is not mainly about CPU or indexes. It’s about <strong>data movement</strong> and <strong>round-trips</strong>.</p>
<p>This handbook covers the mechanism that determines whether a federated query behaves like a clean remote query or a chatty distributed workflow: <strong>pushdown</strong>.</p>
<p>Pushdown is not “moving compute”. Pushdown determines whether filtering, joining, ordering, and aggregation occur at the data source or after the data has already crossed the wire. When pushdown works, the local server receives a reduced result set. When it doesn’t, Postgres often has to fetch broad intermediate sets and finish the work locally.</p>
<p>The chapters ahead will help you build a practical mental model of what is “shippable” in <code>postgres_fdw</code>, why some expressions are blocked, and how to read <code>EXPLAIN (ANALYZE, BUFFERS, VERBOSE)</code> without getting tricked by familiar plan shapes.</p>
<p>After the core method, the handbook covers tuning knobs that matter in production, schema and indexing considerations, benchmarking methodology, monitoring and logging, and a case study that shows what a real pushdown win looks like end-to-end.</p>
<p>The later sections go deeper into advanced shippability edge cases, cost model calibration, and regression-proofing FDW workloads.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-executive-summary">Executive Summary</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-motivation">Motivation</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-fdw-basics-without-the-setup-tax">FDW Basics Without the Setup Tax</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-pushdown-mechanics">Pushdown Mechanics</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-shippable-operations-a-deep-dive">Shippable Operations: a Deep Dive</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-pushdown-blockers-and-why-they-exist">Pushdown Blockers and Why They Exist</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-reading-explain-like-a-pro">Reading EXPLAIN Like a Pro</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-tune-postgresfdw">How to Tune postgres_fdw</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-schema-and-index-recommendations">Schema and Index Recommendations</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-benchmarking-methodology">Benchmarking Methodology</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-monitoring-and-logging">Monitoring and Logging</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-case-study-refactoring-a-keycloak-coverage-query">Case Study: Refactoring a Keycloak Coverage Query</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-checklist-and-troubleshooting-guide">Checklist and Troubleshooting Guide</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-case-study-takeaways">Case Study Takeaways</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-advanced-operations-a-deeper-dive-into-shippability">Advanced Operations: A Deeper Dive into Shippability</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-common-antipatterns-and-how-to-avoid-them">Common Anti‑Patterns and How to Avoid Them</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-extending-tuning-calibrating-cost-models">Extending Tuning: Calibrating Cost Models</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-further-case-studies-and-practical-examples">Further Case Studies and Practical Examples</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-monitoring-diagnostics-and-regression-testing">Monitoring, Diagnostics, and Regression Testing</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-extended-guidelines-for-advanced-dbas">Extended Guidelines for Advanced DBAs</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-bringing-it-all-together">Bringing it All Together</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-references">References</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>This handbook assumes basic comfort with Postgres query plans. It builds on <code>EXPLAIN (ANALYZE, BUFFERS)</code> rather than reintroducing SQL fundamentals, indexing, or join algorithms.</p>
<p>The focus here is federated execution: how foreign queries behave, and how to reason about them with the same clarity as local plans.</p>
<p>Here’s what you should already be comfortable with:</p>
<ul>
<li><p>Reading <code>EXPLAIN (ANALYZE, BUFFERS)</code> output and spotting obvious plan smells (row explosions, bad join order, missed indexes).</p>
</li>
<li><p>Basic join mechanics (nested loop, hash join, merge join) and why cardinality estimates matter.</p>
</li>
<li><p>Postgres statistics at a practical level (<code>ANALYZE</code>, correlation, and what “estimated rows vs actual rows” implies).</p>
</li>
</ul>
<p>And here’s what you need to follow along with the examples:</p>
<ul>
<li><p>A Postgres “local” instance that will run <code>postgres_fdw</code> and act as the coordinator.</p>
</li>
<li><p>A Postgres “remote” instance that holds the foreign tables.</p>
</li>
<li><p>Permission on the local side to:</p>
<ul>
<li><p><code>CREATE EXTENSION postgres_fdw;</code></p>
</li>
<li><p>create a <code>SERVER</code> and <code>USER MAPPING</code></p>
</li>
<li><p>create <code>FOREIGN TABLE</code> objects (or permission to use existing ones)</p>
</li>
</ul>
</li>
<li><p>A way to run queries and capture plans:</p>
<ul>
<li><code>psql</code> is enough, and so is any GUI, as long as you can run <code>EXPLAIN (ANALYZE, BUFFERS, VERBOSE)</code>.</li>
</ul>
</li>
</ul>
<p>We won’t go through a long environment setup walkthrough. The examples assume the FDW objects exist and focus on plans and behavior.</p>
<p>We also won’t go into general distributed systems theory. Only the pieces that show up in an FDW plan are used.</p>
<h2 id="heading-executive-summary">Executive Summary</h2>
<p>The single most important lesson of this handbook is that <strong>FDW pushdown reduces data movement</strong>. It’s tempting to think of pushdown as merely changing where a calculation happens (“move the work to the remote”). But what really matters is whether the remote server is asked for only the rows you need.</p>
<p>When pushdown is working, the remote server performs the selective join and filtering, and the local Postgres receives a small, already reduced result set. When pushdown fails, the local server becomes a distributed query coordinator: it pulls large intermediate sets over the network and then finishes the heavy lifting locally.</p>
<p>Why does this matter? Because a refactor that makes more of your query shippable to the remote server can slash end‑to‑end latency without changing a single row of output. In the case study we'll explore later, rewriting a query so that the FDW can ship a joined remote query instead of performing multiple foreign scans and local joins reduces runtime from approximately <strong>166 ms to 25 ms</strong>. The business logic did not change – the <em>shape</em> of the work changed.</p>
<p>Below is a simple bar chart illustrating that dramatic drop. The chart uses actual timings from the case study. If you run the experiment yourself, the numbers may differ depending on your hardware and network, but the relative difference should be clear.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771117284661/ecadfc8b-7e45-4122-921d-5b06215d627a.png" alt="Bar chart titled &quot;Query Execution Time: Before vs After Refactor.&quot; The chart shows execution time in milliseconds on the vertical axis. The &quot;Before&quot; bar is much taller, over 160 ms, compared to the &quot;After&quot; bar, which is below 20 ms, indicating a significant improvement in execution time after refactoring." class="image--center mx-auto" width="840" height="630" loading="lazy"></p>
<h2 id="heading-motivation">Motivation</h2>
<p>Foreign data wrappers let you query remote data using the same SQL syntax you use locally. That convenience is exactly why they can be so deceptive.</p>
<p>A federated query may look like a normal join, but under the hood, it behaves like a distributed system: some part of the plan runs on the remote server, some on the local server, and every boundary between them is a network hop. The slow path is rarely “bad SQL” – it’s usually a combination of two things:</p>
<ol>
<li><p><strong>Too many rows are pulled over the network.</strong> Without pushdown, the FDW retrieves a large slice of the remote table and applies your filters and joins locally. This may lead to tens of thousands or millions of rows being shipped across the network when you only needed hundreds or fewer.</p>
</li>
<li><p><strong>Too many round-trips.</strong> If the plan performs a nested loop that drives a foreign scan, it can end up executing the same remote query hundreds or thousands of times. Each call might be fast on its own, but latency adds up.</p>
</li>
</ol>
<p>This isn't speculation. PostgreSQL's documentation makes clear that a foreign table <strong>has no local storage</strong> and that Postgres “asks the FDW to fetch data from the external source” <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=It%20is%20generally%20recommended%20that,differently%20from%20the%20local%20server">[1]</a>. There is no local buffer cache or heap storage to hide mistakes. Every row you retrieve must traverse the network at least once. If your plan fetches more rows than it needs, or repeatedly does so, performance can degrade quickly.</p>
<p>That’s why you should treat the Remote SQL shown in <code>EXPLAIN (VERBOSE)</code> as part of your query plan. It tells you exactly what the remote server is being asked to do. If it’s missing your filters or joins, you know the local server will have to finish the job. The rest of this handbook will teach you how to read that plan, how to force pushdown when possible, and how to recognize the signs that something has gone wrong.</p>
<h2 id="heading-fdw-basics-without-the-setup-tax">FDW Basics Without the Setup Tax</h2>
<p>You might be tempted to skip this section if you've already created foreign tables in your own databases. Don't. Understanding the architecture of foreign data wrappers is essential to understanding why pushdown matters.</p>
<h3 id="heading-sqlmed-in-a-nutshell">SQL/MED in a nutshell</h3>
<p>PostgreSQL implements the <strong>SQL/MED</strong> (Management of External Data) standard through its FDW framework. To access a remote Postgres server via <code>postgres_fdw</code>, you perform four steps:</p>
<ol>
<li><p><strong>Install the extension</strong>: <code>CREATE EXTENSION postgres_fdw</code> tells Postgres to load the FDW code.</p>
</li>
<li><p><strong>Create a foreign server</strong>: <code>CREATE SERVER foreign_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '...', port '...', dbname '...')</code> defines where the remote server resides and how to connect.</p>
</li>
<li><p><strong>Create a user mapping</strong>: <code>CREATE USER MAPPING FOR your_user SERVER foreign_server OPTIONS (user 'remote_user', password '...')</code> tells Postgres how to authenticate on the remote side.</p>
</li>
<li><p><strong>Create a foreign table</strong>: <code>CREATE FOREIGN TABLE remote_table (...) SERVER foreign_server OPTIONS (schema_name '...', table_name '...');</code> defines the columns and references the remote table.</p>
</li>
</ol>
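<p>Put together, the four steps might look like the sketch below. The host, database, credentials, and the <code>remote_events</code> table with its columns are placeholders chosen for illustration, not values from a real environment:</p>
<pre><code class="lang-pgsql">-- 1. Install the extension on the local (coordinator) database.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- 2. Describe where the remote server lives (placeholder host and dbname).
CREATE SERVER foreign_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote.example.internal', port '5432', dbname 'appdb');

-- 3. Map the current local role to a remote role (placeholder credentials).
CREATE USER MAPPING FOR CURRENT_USER
    SERVER foreign_server
    OPTIONS (user 'remote_user', password 'change-me');

-- 4. Declare the columns and point at the remote table (hypothetical table).
CREATE FOREIGN TABLE remote_events (
    id         bigint,
    name       text,
    created_at timestamptz
)
    SERVER foreign_server
    OPTIONS (schema_name 'public', table_name 'events');
</code></pre>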
<p>Once you've done that, you can run <code>SELECT</code> statements against the foreign table as if it were local. But the definition hides an important detail: there is no storage associated with that foreign table <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=It%20is%20generally%20recommended%20that,differently%20from%20the%20local%20server">[1]</a>. Every time you <code>SELECT</code>, <code>INSERT</code>, <code>UPDATE</code>, or <code>DELETE</code>, the FDW must build a remote query, send it over a connection to the remote server (opened on first use and reused for the rest of the session), and read the results. This overhead is small for simple queries but becomes critical as queries get more complex.</p>
<h3 id="heading-what-postgresfdw-does-and-does-not-do">What postgres_fdw does and does not do</h3>
<p><code>postgres_fdw</code> does two things for you:</p>
<ol>
<li><p>It builds remote SQL from your query, including pushing down safe filters, joins, sorts, and aggregates when it can.</p>
</li>
<li><p>It fetches rows from the remote server and hands them to the local executor. If some part of your query cannot be executed remotely, the local executor performs that part.</p>
</li>
</ol>
<p>The FDW tries hard to minimize data transfer by sending as much of your <code>WHERE</code> clause as possible to the remote server and by not retrieving unused columns <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[2]</a>. It also has a number of tuning knobs that we'll explore later (such as <code>fetch_size</code>, <code>use_remote_estimate</code>, <code>fdw_startup_cost</code>, and <code>fdw_tuple_cost</code><a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>). But the real win often comes from structuring your query so that the FDW can push work down.</p>
<p>There's one last architectural point to keep in mind: the remote server runs with a restricted session environment. In remote sessions opened by <code>postgres_fdw</code>, the <code>search_path</code> is set to <code>pg_catalog</code> only, and <code>TimeZone</code>, <code>DateStyle</code>, and <code>IntervalStyle</code> are set to specific values <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=In%20the%20remote%20sessions%20opened,their%20expected%20search%20path%20environment">[4]</a>. This means that any functions you expect to run remotely must be schema‑qualified or packaged in a way that the FDW can find them. It also underscores why you should not override session settings for FDW connections unless you know exactly what you are doing <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=In%20the%20remote%20sessions%20opened,their%20expected%20search%20path%20environment">[4]</a>.</p>
<h2 id="heading-pushdown-mechanics">Pushdown Mechanics</h2>
<p>At a high level, “pushdown” means pushing as much of your SQL query as possible to the remote server. But the FDW cannot simply send arbitrary SQL. It must be <em>safe</em> and <em>portable</em> for remote evaluation. Postgres uses the term <strong>shippable</strong> to describe expressions and operations that can be evaluated on the foreign server.</p>
<h3 id="heading-what-shippable-means-in-practice">What “shippable” means in practice</h3>
<p>An expression is considered shippable if it meets several conditions:</p>
<ol>
<li><p><strong>It uses built‑in functions, operators, or data types</strong>, or functions/operators from extensions that have been explicitly allow‑listed via the extensions option on the foreign server <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[2]</a>. If you use a custom function or an extension that has not been declared, the FDW assumes it cannot run remotely.</p>
</li>
<li><p><strong>It’s marked IMMUTABLE.</strong> Postgres distinguishes between <code>IMMUTABLE</code>, <code>STABLE</code>, and <code>VOLATILE</code> functions. Only immutable functions – those that always return the same output for the same inputs and don’t depend on session state – are candidates for pushdown <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=functions%20in%20such%20clauses%20must,to%20reduce%20the%20risk%20of">[5]</a>. This rule keeps functions such as <code>now()</code> (which is <code>STABLE</code>) and <code>random()</code> (which is <code>VOLATILE</code>) from being evaluated remotely, because the result might differ between the local and remote servers.</p>
</li>
<li><p><strong>It doesn’t depend on local collations or type conversions</strong>. PostgreSQL’s docs warn that type or collation mismatches can lead to semantic anomalies <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=It%20is%20generally%20recommended%20that,differently%20from%20the%20local%20server">[1]</a>. If the FDW cannot guarantee that a comparison behaves identically on both servers, it will refuse to push it down. For example, comparing a <code>citext</code> column to a <code>text</code> constant could be unsafe if the remote server doesn’t have the <code>citext</code> extension installed.</p>
</li>
</ol>
<p>From these rules, you can derive a mental checklist: avoid non‑immutable functions in your <code>WHERE</code> clause, keep your join conditions simple and typed correctly, and list any third‑party extensions you want to use in the foreign server’s extensions option so that they are considered shippable <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[2]</a>.</p>
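<p>For instance, allow-listing an extension might look like the following. This is a sketch that assumes a foreign server named <code>foreign_server</code> and uses <code>citext</code> purely as an example. It's your responsibility to make sure the extension is installed, and behaves identically, on both servers.</p>
<pre><code class="lang-pgsql">-- Declare citext's functions and operators as safe to ship.
-- Use SET instead of ADD if the option is already defined on the server.
ALTER SERVER foreign_server OPTIONS (ADD extensions 'citext');

-- Inspect the options currently stored for each foreign server.
SELECT srvname, srvoptions FROM pg_foreign_server;
</code></pre>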
<h3 id="heading-where-pushdown">WHERE pushdown</h3>
<p>If a <code>WHERE</code> clause consists entirely of shippable expressions, it will be included in the remote query. Otherwise, it will be evaluated locally. This matters because pushing a filter down reduces the number of rows returned to the local server.</p>
<p>Consider a predicate like this:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">WHERE</span> created_at &gt;= now() - <span class="hljs-type">interval</span> <span class="hljs-string">'30 days'</span>
</code></pre>
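<p>Because <code>now()</code> is <code>STABLE</code> rather than <code>IMMUTABLE</code> (its value is fixed within a transaction but changes from one transaction to the next), <code>postgres_fdw</code> doesn’t treat the predicate as shippable. The FDW therefore fetches every row that passes the remaining shippable conditions (the entire table, if there are none) and applies the filter locally.</p>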
<p>A better approach is to pass a parameter into the query or compute the cutoff timestamp once in the application and embed it into the SQL.</p>
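<p>One way to sketch the parameter approach is a prepared statement, where the cutoff is computed locally and reaches the foreign scan as a plain value. The foreign table <code>remote_events</code> and the statement name are hypothetical:</p>
<pre><code class="lang-pgsql">-- The predicate now compares a column with a parameter,
-- so it contains no non-immutable function call.
PREPARE recent_events(timestamptz) AS
SELECT id, name
FROM remote_events
WHERE created_at &gt;= $1;

-- The argument is evaluated locally before the statement runs.
EXPLAIN (VERBOSE)
EXECUTE recent_events(now() - interval '30 days');
</code></pre>
<p>In the resulting plan, the <code>created_at</code> condition should appear inside the Remote SQL line rather than as a local <code>Filter:</code>. If it doesn’t, check the column type and collation against the shippability rules above.</p>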
<h3 id="heading-join-pushdown-conditions">Join pushdown conditions</h3>
<p>Joins are the next big lever. When <code>postgres_fdw</code> encounters a join between foreign tables on the <strong>same foreign server</strong>, it will send the entire join to the remote server unless it believes it will be more efficient to fetch the tables individually or unless the tables use different user mappings <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=When%20,clauses">[6]</a>.</p>
<p>It applies the same precautions described for <code>WHERE</code> clauses: the join condition must be shippable, and both tables must be on the same server. Cross‑server joins are never pushed down – the FDW will perform them locally.</p>
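<p>A quick way to check whether a particular join is shipped is to look at its verbose plan. The sketch below assumes two hypothetical foreign tables, <code>remote_users</code> and <code>remote_orders</code>, that are defined on the same foreign server and share a user mapping:</p>
<pre><code class="lang-pgsql">-- Both tables live on the same foreign server, and the join and filter
-- conditions use only built-in operators on matching types.
EXPLAIN (VERBOSE)
SELECT u.id, o.total
FROM remote_users AS u
JOIN remote_orders AS o ON o.user_id = u.id
WHERE o.total &gt; 100;
</code></pre>
<p>If the join was pushed down, the plan contains a single Foreign Scan whose Remote SQL includes the join. If you instead see a local join node with two separate Foreign Scans beneath it, the join is running on the local server.</p>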
<h3 id="heading-shippability-decision-tree">Shippability decision tree</h3>
<p>It can be helpful to visualize the shippability rules as a flowchart. Below is a simple decision tree that you can use when inspecting an expression or join clause.</p>
<p>It starts with the question of whether an expression is in a WHERE or JOIN clause. Further decisions are made based on factors like using volatile functions, built-in functions, type mismatches, or cross-server joins. The flowchart concludes with outcomes like "Not shippable, evaluated locally" or "Shippable, included in Remote SQL."</p>
<p>If you reach the left side of the tree, the expression will be evaluated locally. If you reach the right side, the FDW can ship it.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771109842865/9dafcd32-c390-487d-8b35-2911d6075b13.png" alt="Flowchart for determining SQL expression shippability. It starts with the question of whether an expression is in a WHERE or JOIN clause. Further decisions are made based on factors like using volatile functions, built-in functions, type mismatches, or cross-server joins. The flowchart concludes with outcomes like &quot;Not shippable, evaluated locally&quot; or &quot;Shippable, included in Remote SQL.&quot;" class="image--center mx-auto" width="8192" height="2404" loading="lazy"></p>
<h2 id="heading-shippable-operations-a-deep-dive">Shippable Operations: a Deep Dive</h2>
<p>Postgres has been expanding what <code>postgres_fdw</code> can push down over several versions. This section walks through each operation class and the conditions required for pushdown.</p>
<h3 id="heading-filters-where-clauses">Filters (WHERE clauses)</h3>
<p>As explained above, simple filters that use built‑in operators and immutable functions are generally pushed down. If you see a <code>Filter:</code> line on a Foreign Scan node in your plan, it means some part of your predicate didn’t qualify. Common reasons include using <code>now()</code>, <code>random()</code>, or other non‑immutable functions, referencing a non‑allow‑listed extension, or comparing values with mismatched collations.</p>
<p>When this happens, the entire table (or at least all rows matching other shippable conditions) is fetched, and the filter is applied locally.</p>
<p><strong>Plan smell:</strong> Look for a Foreign Scan node that carries its own <code>Filter:</code> line. That means filtering happened locally. Also look for broad Remote SQL such as:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> remote_table <span class="hljs-keyword">WHERE</span> (<span class="hljs-type">name</span> = <span class="hljs-string">'Hamdaan'</span>)
</code></pre>
<p>with none of your other constraints (here, a condition on the group). That's a sign that part of the filter was not pushed down.</p>
<h3 id="heading-joins">Joins</h3>
<p>Simple inner joins between foreign tables on the same foreign server are usually pushable. The join condition must satisfy the same shippability rules as filters. If the join involves more than one foreign server, if the join condition uses an unshippable function, or if the foreign tables use different user mappings, the FDW will fetch each table separately and join them locally <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=When%20,clauses">[6]</a>. This can lead to large intermediate sets being transferred.</p>
<p><strong>Plan smell:</strong> A Hash Join or Merge Join where both inputs are Foreign Scan nodes indicates that the join was performed locally. Conversely, a single Foreign Scan representing a join and containing the <code>JOIN ... ON</code> clause in Remote SQL indicates that the join was pushed down.</p>
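<p>To check this on your own schema, here is a minimal sketch. It assumes two hypothetical foreign tables, <code>remote_orders</code> and <code>remote_customers</code>, defined on the same foreign server with the same user mapping. The table and column names are illustrative and do not come from the case study.</p>
<pre><code class="lang-pgsql">-- Hypothetical foreign tables that live on the same foreign server.
EXPLAIN (VERBOSE)
SELECT o.id, c.name
FROM remote_orders o
JOIN remote_customers c ON c.id = o.customer_id
WHERE o.status = 'open';

-- Pushed down:     one Foreign Scan whose Remote SQL contains the JOIN ... ON clause.
-- Not pushed down: a local Hash Join, Merge Join, or Nested Loop over two Foreign Scans.
</code></pre>
<p>Even when a join is shippable, the planner may still pick a local join if its cost estimates favor one, so treat the plan as the source of truth.</p>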
<h3 id="heading-aggregates-group-by-count-sum-and-so-on">Aggregates (GROUP BY, COUNT, SUM, and so on)</h3>
<p>Starting in PostgreSQL 10, aggregates can be pushed to the remote server when possible. The release notes state explicitly: “push aggregate functions to the remote server,” and explain that this <strong>reduces the amount of data that must be transferred from the remote server and offloads aggregate computation</strong> <a target="_blank" href="https://www.postgresql.org/docs/release/10.0/#:~:text=,Jeevan%20Chalke%2C%20Ashutosh%20Bapat">[7]</a>.</p>
<p>To qualify, both the grouping expressions and the aggregate functions themselves must be shippable. If the FDW cannot push an aggregate, it will fetch the raw rows and perform the aggregation locally.</p>
<p><strong>Plan smell:</strong> Look for a <code>GroupAggregate</code> or <code>HashAggregate</code> node above a Foreign Scan that returns many rows. When the aggregate is pushed down, there will be no local aggregate node. Instead, the Remote SQL will include a <code>GROUP BY</code> clause.</p>
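<p>A quick way to test aggregate pushdown is sketched below. The foreign table <code>remote_orders</code> and its <code>status</code> and <code>total</code> columns are assumptions made for illustration.</p>
<pre><code class="lang-pgsql">-- Hypothetical foreign table: remote_orders(id, status, total, created_at).
EXPLAIN (VERBOSE)
SELECT status, count(*) AS orders, sum(total) AS revenue
FROM remote_orders
GROUP BY status;

-- Pushed down:     the Remote SQL of the Foreign Scan contains GROUP BY.
-- Not pushed down: a local aggregate node sits above a Foreign Scan that fetches raw rows.
</code></pre>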
<h3 id="heading-order-by-and-limit">ORDER BY and LIMIT</h3>
<p>Prior to PostgreSQL 12, sorting and limiting were rarely pushed down. In version 12, Etsuro Fujita’s patch allows ORDER BY sorts and LIMIT clauses to be pushed to <code>postgres_fdw</code> foreign servers <strong>in more cases</strong> <a target="_blank" href="https://www.postgresql.org/docs/release/12.0/#:~:text=,Etsuro%20Fujita%29%20%C2%A7%20%C2%A7">[8]</a>. For the sort or limit to be pushed, the underlying scan must be pushable, and the ordering expression must be shippable. Partitioned queries or complicated join trees may still cause the sort or limit to be applied locally.</p>
<p><strong>Plan smell:</strong> A local Sort or Limit node above a Foreign Scan indicates the operation was not pushed down. Conversely, a Remote SQL statement containing ORDER BY and LIMIT indicates that pushdown succeeded.</p>
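<p>The same kind of check works for sorting and limiting. This sketch again uses the hypothetical <code>remote_orders</code> foreign table. Whether the clauses are shipped depends on your PostgreSQL version and on the rest of the query.</p>
<pre><code class="lang-pgsql">-- Hypothetical foreign table: remote_orders(id, status, total, created_at).
EXPLAIN (VERBOSE)
SELECT id, created_at
FROM remote_orders
WHERE status = 'open'
ORDER BY created_at DESC
LIMIT 20;

-- Pushed down:     the Remote SQL ends with ORDER BY ... LIMIT ...
-- Not pushed down: local Sort and/or Limit nodes appear above the Foreign Scan.
</code></pre>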
<h3 id="heading-distinct">DISTINCT</h3>
<p>Distinct operations can be pushed down when the distinct expression list is shippable. But if the distinct is combined with unshippable expressions, or if the distinct is applied after a join that cannot be pushed down, the FDW will retrieve all rows and perform the distinct locally.</p>
<h3 id="heading-window-functions">Window functions</h3>
<p>In practice, window functions are rarely pushed down through <code>postgres_fdw</code>. They often require ordering or partitioning semantics that are difficult to represent portably. If you see a <code>WindowAgg</code> node in your plan, it’s almost always local. That doesn’t mean you can't use window functions with foreign tables, but you should expect them to incur network and CPU costs.</p>
<h3 id="heading-version-differences">Version differences</h3>
<p>Postgres developers continue to improve the FDW layer. Here are some notable changes by version:</p>
<ol>
<li><p><strong>PostgreSQL 9.6</strong> introduced remote join pushdown and allowed UPDATE/DELETE pushdown. Before 9.6, all joins were local.</p>
</li>
<li><p><strong>PostgreSQL 10</strong> introduced aggregate pushdown, enabling remote GROUP BY and aggregate functions <a target="_blank" href="https://www.postgresql.org/docs/release/10.0/#:~:text=,Jeevan%20Chalke%2C%20Ashutosh%20Bapat">[7]</a>.</p>
</li>
<li><p><strong>PostgreSQL 12</strong> expanded ORDER BY and LIMIT pushdown <a target="_blank" href="https://www.postgresql.org/docs/release/12.0/#:~:text=,Etsuro%20Fujita%29%20%C2%A7%20%C2%A7">[8]</a>.</p>
</li>
<li><p><strong>PostgreSQL 15</strong> added pushdown for certain CASE expressions and other improvements.</p>
</li>
</ol>
<p>If you learned FDW behavior on an older version, revisit your assumptions.</p>
<h2 id="heading-pushdown-blockers-and-why-they-exist">Pushdown Blockers and Why They Exist</h2>
<p>When pushdown fails, it’s not due to bad luck. There’s always a reason grounded in safety or correctness. Here are the most common blockers and how to diagnose them.</p>
<h3 id="heading-nonimmutable-functions">Non‑immutable functions</h3>
<p>Functions marked <code>VOLATILE</code> or <code>STABLE</code> cannot be pushed down because their results may differ between the local and remote server. Examples include <code>now()</code>, <code>random()</code>, <code>current_user</code>, and user‑defined functions that look at session variables or query the database. Even functions you might think are harmless, like <code>age()</code> or <code>clock_timestamp()</code>, can cause pushdown to fail.</p>
<p><strong>Fix:</strong> Compute volatile values in your application or in a CTE before referencing the foreign table. For example, compute timestamp <code>'now' - interval '30 days'</code> as a constant and compare your <code>created_at</code> column against that constant. Alternatively, move the logic into a stored generated column on the remote table.</p>
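<p>As a minimal sketch of that fix, assume a hypothetical foreign table <code>remote_orders</code> with a <code>timestamptz</code> column named <code>created_at</code>. The literal below stands in for a cutoff that your application would compute. Confirm the result with <code>EXPLAIN (VERBOSE)</code> on your own version.</p>
<pre><code class="lang-pgsql">-- Before: now() is STABLE, so this predicate is expected to be evaluated locally.
SELECT id
FROM remote_orders
WHERE created_at &gt;= now() - interval '30 days';

-- After: the application computes the cutoff and sends it as a literal.
-- The timestamp value is illustrative only.
SELECT id
FROM remote_orders
WHERE created_at &gt;= TIMESTAMPTZ '2026-04-30 00:00:00+00';
</code></pre>
<p>The literal carries an explicit UTC offset, so the comparison doesn't depend on the session time zone of either server.</p>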
<h3 id="heading-type-and-collation-mismatches">Type and collation mismatches</h3>
<p>The documentation warns that when types or collations don’t match between the local and remote tables, the remote server may interpret conditions differently <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=It%20is%20generally%20recommended%20that,differently%20from%20the%20local%20server">[1]</a>. This is particularly insidious when text comparisons, case‑insensitive collations, or non‑default locale settings are used. If Postgres can't guarantee the same semantics, it will pull rows locally and evaluate the expression.</p>
<p><strong>Fix:</strong> Make sure that your foreign table definition uses the same data types and collations as the remote table. When in doubt, explicitly cast values to a common type.</p>
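<p>One way to keep definitions aligned is to let <code>postgres_fdw</code> generate the foreign tables from the remote catalog instead of writing them by hand. The sketch below assumes a foreign server named <code>foreign_server</code> and a local schema named <code>keycloak_remote</code>. Both names are hypothetical.</p>
<pre><code class="lang-pgsql">-- Hypothetical names: foreign_server (server), keycloak_remote (local schema).
CREATE SCHEMA IF NOT EXISTS keycloak_remote;

IMPORT FOREIGN SCHEMA public
    LIMIT TO (user_entity, user_attribute)
    FROM SERVER foreign_server
    INTO keycloak_remote
    OPTIONS (import_collate 'true');  -- already the default; shown for emphasis
</code></pre>
<p>Imported definitions are a snapshot. If the remote schema changes later, re-import or adjust the foreign tables by hand.</p>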
<h3 id="heading-crossserver-joins">Cross‑server joins</h3>
<p>Joins across different foreign servers cannot be pushed down. The FDW can only ship a join when both tables reside on the same remote server and use the same user mapping <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=When%20,clauses">[6]</a>. Otherwise, it will perform two separate scans and join the results locally.</p>
<p><strong>Fix:</strong> If you frequently join tables across servers, consider consolidating the tables on a single server, materializing a view on one side, or pulling the smaller table into a temporary local table before joining.</p>
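<p>The temporary table approach could look like the sketch below. It assumes two hypothetical foreign tables, <code>orders_a</code> on one server and <code>customers_b</code> on another, and an illustrative <code>region</code> column.</p>
<pre><code class="lang-pgsql">-- Hypothetical: orders_a lives on server A, customers_b on server B.
CREATE TEMP TABLE tmp_customers AS
SELECT id, name
FROM customers_b
WHERE region = 'EU';       -- shippable filter that keeps the local copy small

ANALYZE tmp_customers;     -- give the planner real row estimates

SELECT o.id, t.name
FROM orders_a o
JOIN tmp_customers t ON t.id = o.customer_id;
</code></pre>
<p>The final join still runs locally, but only one remote server is involved and the smaller side is fetched once.</p>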
<h3 id="heading-mixed-local-and-foreign-joins">Mixed local and foreign joins</h3>
<p>A join between a local table and a foreign table will not be pushed down. Even though the foreign side might be pushdown‑eligible, the FDW cannot join it with local data on the remote server. A nested loop with a parameterized foreign scan is the typical pattern here, resulting in many remote calls.</p>
<p><strong>Fix:</strong> Filter or aggregate as much as possible on the foreign side first (via a CTE or by materializing a subset) before joining to local tables.</p>
<h3 id="heading-remote-session-settings-and-search-paths">Remote session settings and search paths</h3>
<p>Because <code>postgres_fdw</code> sets a restricted <code>search_path</code>, <code>TimeZone</code>, <code>DateStyle</code>, and <code>IntervalStyle</code> in remote sessions <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=In%20the%20remote%20sessions%20opened,their%20expected%20search%20path%20environment">[4]</a>, any functions you call must be schema‑qualified or otherwise compatible. If a function relies on the current search path or session settings, it may break or produce different results on the remote side.</p>
<p><strong>Fix:</strong> Schema‑qualify remote functions and ensure that any environment‑dependent logic is safe to execute under the default FDW session settings. If necessary, attach <code>SET search_path</code> or other settings to your remote functions.</p>
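<p>Attaching a search path to a remote function can be sketched as follows. The function <code>app.normalize_email(text)</code> is hypothetical. Run the statement on the remote server.</p>
<pre><code class="lang-pgsql">-- Run on the remote server. app.normalize_email(text) is a hypothetical function.
ALTER FUNCTION app.normalize_email(text)
    SET search_path = app, pg_catalog;
</code></pre>
<p>With this setting the function resolves unqualified names the same way regardless of the restricted <code>search_path</code> used by the FDW session.</p>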
<h3 id="heading-troubleshooting-matrix">Troubleshooting matrix</h3>
<p>The table below maps symptoms in your <code>EXPLAIN</code> plan to likely causes and fixes. Use it as a quick diagnostic tool when something looks off.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Symptom in plan</strong></td><td><strong>Likely cause</strong></td><td><strong>Suggested fix</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Foreign Scan has loops much greater than 1</td><td>Parameterized remote lookup caused by nested loop, join conditions not shippable</td><td>Rewrite join so the FDW can ship a single joined query, or batch remote requests via an <code>IN</code> list or temporary table</td></tr>
<tr>
<td>Broad Remote SQL that lacks scope predicates</td><td><code>WHERE</code> clause contains non‑immutable functions or unsupported operators</td><td>Replace volatile functions with constants or allow‑list extension functions, ensure types and collations match</td></tr>
<tr>
<td>Local Hash Join or Merge Join between two foreign tables</td><td>Join could not be pushed down (different servers, user mappings, or unshippable join expression)</td><td>Consolidate tables on one server, align user mappings, or rewrite the join condition</td></tr>
<tr>
<td>Local Sort, Limit, or Unique on top of a Foreign Scan</td><td><code>ORDER BY</code>, <code>LIMIT</code>, or <code>DISTINCT</code> could not be pushed down</td><td>Simplify sort expressions, push filters deeper, check PG version for improvements</td></tr>
<tr>
<td>Plan runs but gives wrong results when pushdown is enabled</td><td>Semantic mismatch due to type/collation differences or remote session settings <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=It%20is%20generally%20recommended%20that,differently%20from%20the%20local%20server">[1]</a> <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=In%20the%20remote%20sessions%20opened,their%20expected%20search%20path%20environment">[4]</a></td><td>Align types/collations, schema‑qualify functions, use stable session settings</td></tr>
</tbody>
</table>
</div><h2 id="heading-reading-explain-like-a-pro">Reading EXPLAIN Like a Pro</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1771117830315/62ca8fde-2638-4ae1-b968-1100ac5251bb.png" alt="SQL execution plan analysis table with columns: exclusive, inclusive, rows x, rows, loops, and node details. Rows display Nested Loop Join, Hash Join, and Seq Scan operations with costs, times, and buffers. Highlighted cells indicate notable metrics." class="image--center mx-auto" width="1579" height="823" loading="lazy"></p>
<p>Many developers skim <code>EXPLAIN</code> plans for local queries, looking at the top nodes and overall cost. For FDW queries, you must invert that habit: read the foreign parts first. The Remote SQL string tells you what the remote server is being asked to do, and the loops field tells you how many times that remote call is executed.</p>
<h3 id="heading-inspect-the-foreign-scan-nodes">Inspect the Foreign Scan nodes</h3>
<p>Start by finding the Foreign Scan node(s). In <code>EXPLAIN (VERBOSE)</code>, each foreign scan includes a line like:</p>
<pre><code class="lang-pgsql">Remote <span class="hljs-keyword">SQL</span>: <span class="hljs-keyword">SELECT</span> ...
</code></pre>
<p>This line is not trivial: it’s the actual SQL that will run on the remote server. Read it carefully. Does it include your <code>WHERE</code> predicates? Does it include your join conditions? If not, you know the local server will pick up the slack.</p>
<p>Look at the loops value. If it exceeds 1, the same remote query is executed multiple times. For example:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">Foreign</span> Scan <span class="hljs-keyword">on</span> <span class="hljs-built_in">public</span>.user_entity  (<span class="hljs-keyword">rows</span>=<span class="hljs-number">1</span> loops=<span class="hljs-number">416</span>)
  Remote <span class="hljs-keyword">SQL</span>: <span class="hljs-keyword">SELECT</span> id, tenant_id <span class="hljs-keyword">FROM</span> <span class="hljs-built_in">public</span>.user_entity <span class="hljs-keyword">WHERE</span> enabled <span class="hljs-keyword">AND</span> service_account_client_link <span class="hljs-keyword">IS</span> <span class="hljs-keyword">NULL</span> <span class="hljs-keyword">AND</span> id = <span class="hljs-meta">$1</span>
</code></pre>
<p>This is the “N+1” problem in disguise. The plan executes the foreign scan once per outer row. Multiply the per‑loop cost by the number of loops to understand why the query is slow. The fix is to rewrite the query so that the join and filters are applied in a single remote call.</p>
<h3 id="heading-recognize-initplan-vs-subplan">Recognize InitPlan vs SubPlan</h3>
<p>An InitPlan runs once and caches its result. A SubPlan can run per outer row. In FDW queries, subplans often drive parameterized remote scans. If you see a SubPlan attached to a nested loop that feeds a foreign scan, suspect a parameterized remote lookup and look for ways to turn it into an InitPlan or merge it into a single remote query.</p>
<h3 id="heading-understand-cte-materialization">Understand CTE materialization</h3>
<p>Common table expressions (CTEs) behave differently depending on whether they are marked <code>MATERIALIZED</code> or <code>NOT MATERIALIZED</code>. A materialized CTE is computed once and stored in a temporary structure, then read by the rest of the query. A non‑materialized CTE is inlined into the parent query, allowing optimizations to span across the boundary.</p>
<p>In PostgreSQL 12 and later, CTEs are inlined by default unless they’re referenced multiple times or explicitly marked <code>MATERIALIZED</code>. Materializing a CTE that contains a foreign scan can freeze a broad remote fetch and prevent later clauses from being pushed down. On the other hand, materialization can prevent repeated remote scans if the CTE is referenced multiple times. Use this lever deliberately to control where remote work happens.</p>
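<p>The two forms can be compared side by side. This sketch uses a hypothetical foreign table <code>remote_orders</code>. Run each statement under <code>EXPLAIN (VERBOSE)</code> and compare the Remote SQL.</p>
<pre><code class="lang-pgsql">-- Inlined: the outer filter can be merged into the foreign scan.
WITH recent AS NOT MATERIALIZED (
    SELECT id, status FROM remote_orders
)
SELECT id FROM recent WHERE status = 'open';

-- Materialized: the CTE is fetched as written, then filtered locally.
WITH recent AS MATERIALIZED (
    SELECT id, status FROM remote_orders
)
SELECT id FROM recent WHERE status = 'open';
</code></pre>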
<h3 id="heading-annotated-example">Annotated example</h3>
<p>Let's annotate a simplified excerpt from a real plan. The goal is to show how to quickly read the relevant parts.</p>
<pre><code class="lang-pgsql">Nested <span class="hljs-keyword">Loop</span>  (<span class="hljs-keyword">rows</span>=<span class="hljs-number">414</span> loops=<span class="hljs-number">1</span>)
  -&gt; Hash <span class="hljs-keyword">Join</span>  (<span class="hljs-keyword">rows</span>=<span class="hljs-number">416</span> loops=<span class="hljs-number">1</span>)
       -&gt; <span class="hljs-keyword">Foreign</span> Scan <span class="hljs-keyword">on</span> <span class="hljs-built_in">public</span>.user_entity (<span class="hljs-keyword">rows</span>=<span class="hljs-number">1</span> loops=<span class="hljs-number">416</span>)
            Remote <span class="hljs-keyword">SQL</span>: <span class="hljs-keyword">SELECT</span> id, tenant_id <span class="hljs-keyword">FROM</span> <span class="hljs-built_in">public</span>.user_entity <span class="hljs-keyword">WHERE</span> enabled <span class="hljs-keyword">AND</span> service_account_client_link <span class="hljs-keyword">IS</span> <span class="hljs-keyword">NULL</span> <span class="hljs-keyword">AND</span> id = <span class="hljs-meta">$1</span>
  -&gt; <span class="hljs-keyword">Foreign</span> Scan <span class="hljs-keyword">on</span> <span class="hljs-built_in">public</span>.user_attribute (<span class="hljs-keyword">rows</span>=<span class="hljs-number">671</span> loops=<span class="hljs-number">1</span>)
       Remote <span class="hljs-keyword">SQL</span>: <span class="hljs-keyword">SELECT</span> ua.user_id, ua.<span class="hljs-keyword">value</span> <span class="hljs-keyword">FROM</span> user_attribute ua <span class="hljs-keyword">JOIN</span> user_entity u <span class="hljs-keyword">ON</span> ua.user_id = u.id <span class="hljs-keyword">JOIN</span> tenant r <span class="hljs-keyword">ON</span> u.tenant_id = r.id <span class="hljs-keyword">WHERE</span> ua.name = <span class="hljs-string">'attribute A'</span> <span class="hljs-keyword">AND</span> r.name = <span class="hljs-string">'demo'</span> <span class="hljs-keyword">AND</span> u.enabled <span class="hljs-keyword">AND</span> u.service_account_client_link <span class="hljs-keyword">IS</span> <span class="hljs-keyword">NULL</span> <span class="hljs-keyword">AND</span> (g.name = <span class="hljs-string">'keycloak-group-a'</span> <span class="hljs-keyword">OR</span> g.parent_group = <span class="hljs-meta">$1</span>)
</code></pre>
<p>In the old plan, the first Foreign Scan executed 416 times, each time retrieving a single row. Its Remote SQL applies only the filters on <code>enabled</code> and <code>service_account_client_link</code>, plus the parameterized <code>id = $1</code> lookup. It doesn’t include the tenant or group scoping. That scoping is applied by the nested loop outside the foreign scan.</p>
<p>In the refactored plan, the second Foreign Scan results from combining <code>user_attribute</code>, <code>user_entity</code>, <code>user_group_membership</code>, <code>keycloak_group</code>, and <code>tenant</code> into a single remote query. It retrieves 671 rows in a single query and includes all relevant filters. There is no repeated remote call. The timing difference is driven by the different loop values and the selectivity of the Remote SQL.</p>
<h2 id="heading-how-to-tune-postgresfdw">How to Tune postgres_fdw</h2>
<p>Once you've structured your query for maximum pushdown, tuning knobs let you squeeze out further performance improvements and adjust planner decisions.</p>
<h3 id="heading-fetchsize">fetch_size</h3>
<p><code>fetch_size</code> controls how many rows <code>postgres_fdw</code> retrieves per network fetch. The default is <code>100</code> rows <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[9]</a>. A small fetch size means more round-trips and lower memory usage. A larger fetch size reduces network overhead at the cost of buffering more rows in memory.</p>
<p>In practice, increasing <code>fetch_size</code> to a few thousand can reduce latency for large result sets. It’s specified either at the foreign server or foreign table level:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">SERVER</span> foreign_server <span class="hljs-keyword">OPTIONS</span> (<span class="hljs-keyword">ADD</span> fetch_size <span class="hljs-string">'1000'</span>);
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">FOREIGN</span> <span class="hljs-keyword">TABLE</span> remote_table <span class="hljs-keyword">OPTIONS</span> (<span class="hljs-keyword">ADD</span> fetch_size <span class="hljs-string">'1000'</span>);
</code></pre>
<h3 id="heading-useremoteestimate">use_remote_estimate</h3>
<p>By default, the planner estimates the cost of foreign scans using local statistics. This can be wildly inaccurate if the foreign table has a different data distribution. Setting <code>use_remote_estimate</code> to true tells <code>postgres_fdw</code> to run <code>EXPLAIN</code> on the remote server to get row count and cost estimates. This can dramatically improve join order selection at the cost of an additional remote query during planning <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>. You can set this per table or per server:</p>
<pre><code class="lang-pgsql"><span class="hljs-comment">-- ADD defines the option for the first time; use SET if it is already defined.</span>
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">SERVER</span> foreign_server <span class="hljs-keyword">OPTIONS</span> (<span class="hljs-keyword">ADD</span> use_remote_estimate <span class="hljs-string">'true'</span>);
</code></pre>
<h3 id="heading-fdwstartupcost-and-fdwtuplecost">fdw_startup_cost and fdw_tuple_cost</h3>
<p>These cost parameters model the overhead of starting a foreign scan and the cost per row fetched. Adjusting them can influence the planner’s choice of join strategy. A higher <code>fdw_startup_cost</code> discourages the planner from choosing plans with many small foreign scans (which might generate many remote calls). A higher <code>fdw_tuple_cost</code> discourages plans that fetch large numbers of rows <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>. Use these only after you have solid evidence from <code>EXPLAIN</code> and experiments.</p>
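<p>Both options are set on the foreign server. The values below are placeholders to show the syntax, not recommendations. <code>foreign_server</code> is the same assumed server name used in the other examples.</p>
<pre><code class="lang-pgsql">-- Illustrative values only. Use SET instead of ADD if the option is already defined.
ALTER SERVER foreign_server OPTIONS (ADD fdw_startup_cost '200');
ALTER SERVER foreign_server OPTIONS (ADD fdw_tuple_cost '0.05');
</code></pre>
<p>After changing either value, re-run <code>EXPLAIN</code> on the affected queries to see whether the join strategy changed.</p>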
<h3 id="heading-analyze-and-analyzesampling">ANALYZE and analyze_sampling</h3>
<p>Running <code>ANALYZE</code> on a foreign table collects local statistics by sampling the remote table <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>. Accurate stats are essential for good estimates when <code>use_remote_estimate</code> is false.</p>
<p>But if the remote table changes frequently, these stats become stale quickly. The <code>analyze_sampling</code> option (available in PostgreSQL 16 and later) controls whether sampling happens on the remote side or locally. When <code>analyze_sampling</code> is set to <code>random</code>, <code>system</code>, <code>bernoulli</code>, or <code>auto</code>, <code>ANALYZE</code> will sample rows remotely instead of pulling all rows into the local server, while <code>off</code> disables remote sampling <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>.</p>
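<p>Setting the sampling method and refreshing statistics might look like this. The table name <code>remote_table</code> is the placeholder used earlier, and the choice of <code>bernoulli</code> is only an example.</p>
<pre><code class="lang-pgsql">-- Requires a version that supports analyze_sampling.
-- Use SET instead of ADD if the option is already defined.
ALTER FOREIGN TABLE remote_table OPTIONS (ADD analyze_sampling 'bernoulli');

-- Refresh the local statistics for the foreign table.
ANALYZE remote_table;
</code></pre>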
<h3 id="heading-extensions">extensions</h3>
<p>The <code>extensions</code> option lists extensions whose functions and operators can be shipped to the remote server <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[2]</a>. If you rely on functions from <code>citext</code>, <code>pg_trgm</code>, or other extensions, add them to the server definition:</p>
<pre><code class="lang-pgsql"><span class="hljs-comment">-- ADD defines the option for the first time; use SET if it is already defined.</span>
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">SERVER</span> foreign_server <span class="hljs-keyword">OPTIONS</span> (<span class="hljs-keyword">ADD</span> extensions <span class="hljs-string">'citext,pg_trgm'</span>);
</code></pre>
<h3 id="heading-a-quick-knob-impact-table">A quick knob impact table</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Knob</strong></td><td><strong>Primary effect</strong></td><td><strong>When to change it</strong></td><td><strong>Possible downside</strong></td></tr>
</thead>
<tbody>
<tr>
<td>fetch_size</td><td>Number of rows per fetch</td><td>Result sets are large and latency dominates</td><td>Too large consumes memory</td></tr>
<tr>
<td>use_remote_estimate</td><td>Better row count/cost estimates</td><td>Planner misestimates foreign scans</td><td>Extra remote queries during planning</td></tr>
<tr>
<td>fdw_startup_cost</td><td>Penalty per foreign scan</td><td>Planner chooses many small foreign scans</td><td>Wrong values bias the planner</td></tr>
<tr>
<td>fdw_tuple_cost</td><td>Cost per row fetched</td><td>Planner pulls too many rows</td><td>Mis‑tuned values mislead planner</td></tr>
<tr>
<td>extensions</td><td>Which extension functions are shippable</td><td>Using extension functions in predicates</td><td>Extensions must exist and match on both servers</td></tr>
</tbody>
</table>
</div><h2 id="heading-schema-and-index-recommendations">Schema and Index Recommendations</h2>
<p>Pushdown doesn’t eliminate the need for good indexes. In fact, effective pushdown depends on the remote server having indexes that support the filter and join predicates you’re shipping.</p>
<p>Below are some patterns to watch for in FDW queries and the indexes that support them. You can adapt these to your own schema.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Table</strong></td><td><strong>Access pattern</strong></td><td><strong>Recommended index</strong></td><td><strong>Why</strong></td></tr>
</thead>
<tbody>
<tr>
<td>tenant (remote)</td><td>Filter by tenant.name</td><td>UNIQUE (name) or BTREE (name)</td><td>Resolves tenant ID quickly</td></tr>
<tr>
<td>keycloak_group (remote)</td><td>Filter by name, join by tenant_id, filter on parent_group</td><td>Composite (tenant_id, name) and (parent_group)</td><td>Supports resolving root group and walking one‑level hierarchy</td></tr>
<tr>
<td>user_group_membership (remote)</td><td>Join by user_id, filter by group_id</td><td>BTREE (group_id, user_id)</td><td>Efficiently finds users in a set of groups</td></tr>
<tr>
<td>user_attribute (remote)</td><td>Filter by name, join by user_id</td><td>Composite (name, user_id) (optionally include value)</td><td>Matches “attribute name → users → values” flow</td></tr>
<tr>
<td>user_entity (remote)</td><td>Filter by tenant_id, enabled, service_account_client_link IS NULL, join by id</td><td>Partial index on (tenant_id, id) with predicate on enabled and service_account_client_link IS NULL</td><td>Helps remote planner start from user table when tenant and user filters are applied</td></tr>
<tr>
<td>filtercategory (local)</td><td>Filter by category &amp;&amp; uuid[], join on (entitytype, entityid)</td><td>GIN index on category, BTREE (entitytype, entityid)</td><td>Speeds array overlap checks and join predicate</td></tr>
</tbody>
</table>
</div><p>In general, indexes should reflect the join order you expect the remote planner to use. If your Remote SQL starts with:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">FROM</span> user_attribute ua <span class="hljs-keyword">JOIN</span> user_entity u <span class="hljs-keyword">ON</span> ua.user_id = u.id <span class="hljs-keyword">JOIN</span> user_group_membership ugm <span class="hljs-keyword">ON</span> ...
</code></pre>
<p>ensure that indexes exist on <code>user_attribute(user_id)</code> and <code>user_group_membership(user_id)</code>.</p>
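<p>Some of the recommendations in the table above can be sketched as DDL. The index names are hypothetical, and the remote schema may already have equivalent indexes, so check the existing definitions before creating anything. Run these on the remote server.</p>
<pre><code class="lang-pgsql">-- Run on the remote server. Index names are hypothetical.
CREATE INDEX IF NOT EXISTS user_attribute_name_user_id_idx
    ON user_attribute (name, user_id);

CREATE INDEX IF NOT EXISTS user_group_membership_group_user_idx
    ON user_group_membership (group_id, user_id);

-- Partial index that matches the filters shipped in the Remote SQL.
CREATE INDEX IF NOT EXISTS user_entity_tenant_active_idx
    ON user_entity (tenant_id, id)
    WHERE enabled AND service_account_client_link IS NULL;
</code></pre>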
<h2 id="heading-benchmarking-methodology">Benchmarking Methodology</h2>
<p>It’s easy to claim a performance improvement without proper measurement. Here's a repeatable method you can use to benchmark FDW query changes.</p>
<ol>
<li><p><strong>Warm the caches.</strong> Run each query once to load data into the remote buffer cache and the local FDW connection. Discard the timings.</p>
</li>
<li><p><strong>Measure latencies.</strong> Use <code>EXPLAIN (ANALYZE, BUFFERS, VERBOSE)</code> to capture execution times, buffer usage, and remote row counts. Be aware that <code>EXPLAIN ANALYZE</code> adds overhead, so also record the raw execution time by running the query directly where possible.</p>
</li>
<li><p><strong>Record remote metrics.</strong> On the remote server, enable <code>pg_stat_statements</code> and track <code>calls</code>, <code>total_exec_time</code> (named <code>total_time</code> before PostgreSQL 13), and <code>rows</code> for each remote query. This gives you a per‑query breakdown and confirms what Remote SQL is executed.</p>
</li>
<li><p><strong>Control for concurrency and network latency.</strong> Run benchmarks during a quiet period or isolate the test cluster. If your environment has high network latency, record the round‑trip time separately to attribute delays.</p>
</li>
<li><p><strong>Compare apples to apples.</strong> Benchmark the old and new queries under identical conditions. Use the same sample data, same remote server, and same connection settings.</p>
</li>
<li><p><strong>Look at row counts.</strong> The primary goal of pushdown is to reduce the number of rows shipped. Compare the rows column of each Foreign Scan node.</p>
</li>
</ol>
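<p>The remote-metrics step can be sketched as a query against <code>pg_stat_statements</code> on the remote server. The column names assume PostgreSQL 13 or later, and the <code>user_entity</code> filter is only an example. Because <code>postgres_fdw</code> reads through cursors, remote statements typically appear as <code>DECLARE ... CURSOR FOR SELECT ...</code> and <code>FETCH</code> entries.</p>
<pre><code class="lang-pgsql">-- Run on the remote server. Optionally reset counters before a benchmark run:
-- SELECT pg_stat_statements_reset();

SELECT calls,
       rows,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       left(query, 80)                    AS query_start
FROM pg_stat_statements
WHERE query ILIKE '%user_entity%'   -- example filter; adjust to your tables
ORDER BY total_exec_time DESC
LIMIT 10;
</code></pre>
<p>A high <code>calls</code> count for a single-row lookup is the remote-side signature of the parameterized scans described earlier.</p>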
<p>Here's a simple matrix you can use to record your experiments:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Scenario</strong></td><td><strong>What you're testing</strong></td><td><strong>Expected change in Remote SQL</strong></td><td><strong>Metrics to record</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Baseline (old query)</td><td>Starting point: broad remote scans + local joins</td><td>Remote SQL lacks scoping predicates</td><td>p50/p95 latency, remote row count, local sort/hash time</td></tr>
<tr>
<td>Refactor (new query)</td><td>Join + filter pushdown</td><td>Remote SQL includes joins and filters</td><td>Same metrics, plus remote row count</td></tr>
<tr>
<td>Introduce a volatile function</td><td>Pushdown blocker test</td><td>Clause removed from Remote SQL</td><td>Remote row count increases, local filter cost increases</td></tr>
<tr>
<td>Type or collation mismatch</td><td>Semantic risk test</td><td>Remote SQL might change behavior or lose pushdown</td><td>Compare correctness and row counts</td></tr>
<tr>
<td>ORDER/LIMIT pushdown</td><td>Version‑dependent test</td><td>Remote SQL includes ORDER BY, LIMIT</td><td>Sort time shifts to remote. Final result row count should remain the same</td></tr>
<tr>
<td>use_remote_estimate on/off</td><td>Planning accuracy test</td><td>Planner uses remote estimates</td><td>Planning time, join order, and runtime difference</td></tr>
</tbody>
</table>
</div><h2 id="heading-monitoring-and-logging">Monitoring and Logging</h2>
<p>In production, you need to know when a query starts misbehaving. There are two places to look: the local server and the remote server.</p>
<h3 id="heading-local-metrics">Local metrics</h3>
<ol>
<li><p><strong>pg_stat_statements.</strong> This extension tracks planning and execution times, row counts, and buffer hits for each query. Look for high total times relative to rows or calls.</p>
</li>
<li><p><strong>auto_explain.</strong> Set <code>auto_explain.log_min_duration</code> to capture slow queries with their plans, and enable <code>auto_explain.log_verbose</code> so the logged plans include the Remote SQL. This shows you what was sent to the remote server and whether the plan changed.</p>
</li>
<li><p><strong>Connection pool metrics.</strong> Monitor connection counts and wait events related to FDW operations (for example, PostgresFdwConnect, PostgresFdwGetResult) as described in the documentation <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=,Extension">[10]</a>.</p>
</li>
</ol>
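<p>A session-level <code>auto_explain</code> setup might look like the sketch below. The threshold is an arbitrary example. In production you would more likely configure the module through <code>session_preload_libraries</code> or <code>shared_preload_libraries</code>.</p>
<pre><code class="lang-pgsql">-- Session-level sketch. Loading the module this way typically requires superuser.
LOAD 'auto_explain';

SET auto_explain.log_min_duration = '250ms';  -- illustrative threshold
SET auto_explain.log_analyze = on;            -- include actual rows and loops
SET auto_explain.log_verbose = on;            -- include Remote SQL lines
</code></pre>
<p>Note that <code>auto_explain.log_analyze</code> adds timing overhead to every statement in the session, so enable it selectively.</p>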
<h3 id="heading-remote-metrics">Remote metrics</h3>
<ol>
<li><p><strong>pg_stat_statements on the remote server.</strong> This lets you see which Remote SQL queries are being executed, how often, and how long they take. Compare these with the Remote SQL strings in your local EXPLAIN plans.</p>
</li>
<li><p><strong>Server logs.</strong> Enable <code>log_statement</code> or lower <code>log_min_duration_statement</code> on the remote server to capture long-running remote queries.</p>
</li>
</ol>
<p>Correlating local and remote metrics can reveal patterns such as a new code path causing a surge in remote queries, or pushdown failures leading to heavy remote scans.</p>
<h2 id="heading-case-study-refactoring-a-keycloak-coverage-query">Case Study: Refactoring a Keycloak Coverage Query</h2>
<p>The theory above may seem abstract until you see it play out in practice. Let's walk through a real example inspired by a Keycloak integration.</p>
<p>The original query calculated coverage: given a list of category IDs, it returned the percentage of users who had attributes mapped to those categories and a JSON array of entity counts. The query used a CTE to build a list of scoped users, then joined it with user attributes, category mappings, and a few other tables.</p>
<h3 id="heading-symptom">Symptom</h3>
<p>In a test environment with 100K user records, the query averaged 166 ms. This was slower than expected. Running <code>EXPLAIN (ANALYZE, BUFFERS, VERBOSE)</code> showed two foreign scans on the Keycloak database. The first scanned <code>user_entity</code> 416 times (loops = 416). The second pulled all rows from <code>user_attribute</code> where <code>name = 'attributeA'</code> before filtering by tenant and group locally.</p>
<p>Here's a simplified excerpt (numbers are approximate):</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">Foreign</span> Scan <span class="hljs-keyword">on</span> <span class="hljs-built_in">public</span>.user_entity  (actual <span class="hljs-type">time</span>=<span class="hljs-number">0.117</span>.<span class="hljs-number">.0</span><span class="hljs-number">.117</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">1</span> loops=<span class="hljs-number">416</span>)
  Remote <span class="hljs-keyword">SQL</span>: <span class="hljs-keyword">SELECT</span> id, tenant_id <span class="hljs-keyword">FROM</span> <span class="hljs-built_in">public</span>.user_entity <span class="hljs-keyword">WHERE</span> (enabled <span class="hljs-keyword">AND</span> service_account_client_link <span class="hljs-keyword">IS</span> <span class="hljs-keyword">NULL</span> <span class="hljs-keyword">AND</span> id = <span class="hljs-meta">$1</span>)
<span class="hljs-keyword">Foreign</span> Scan <span class="hljs-keyword">on</span> <span class="hljs-built_in">public</span>.user_attribute  (actual <span class="hljs-type">time</span>=<span class="hljs-number">41.267</span>.<span class="hljs-number">.80</span><span class="hljs-number">.352</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">80739</span> loops=<span class="hljs-number">1</span>)
  Remote <span class="hljs-keyword">SQL</span>: <span class="hljs-keyword">SELECT</span> <span class="hljs-keyword">value</span>, user_id <span class="hljs-keyword">FROM</span> <span class="hljs-built_in">public</span>.user_attribute <span class="hljs-keyword">WHERE</span> ((<span class="hljs-string">'attributeA'</span> = <span class="hljs-type">name</span>))
</code></pre>
<p>The first scan performed a single-row lookup 416 times. The second scan retrieved 80,739 rows because the only condition pushed down was <code>name = 'attributeA'</code>. Tenant and group scoping occurred locally. That meant 80k rows were transferred over the network and then filtered down to about 671 on the local side.</p>
<h3 id="heading-diagnosis">Diagnosis</h3>
<p>There were two main issues.</p>
<p>First was the N+1 remote calls on user_entity. The join to <code>user_entity</code> was not pushed down, so the plan executed a remote lookup for each row from <code>user_group_membership</code>. This created 416 remote queries.</p>
<p>Second was the unscoped attribute fetch. Because the <code>WHERE</code> clause included <code>user_entity.tenant_id = tenant.id</code> and <code>keycloak_group.name = 'groupA'</code> in a higher CTE, the FDW could not see those predicates when scanning <code>user_attribute</code>. It therefore fetched all rows with <code>name = 'attributeA'</code> and left the tenant and group filters to the local side.</p>
<h3 id="heading-refactor">Refactor</h3>
<p>The fix was to inline the tenant and group joins into the user_attribute scan to avoid the nested-loop pattern. The refactored <code>selected_user_attributes</code> CTE looked like this (simplified for readability):</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">WITH</span> selected_user_attributes <span class="hljs-keyword">AS</span> (
  <span class="hljs-keyword">SELECT</span> <span class="hljs-keyword">DISTINCT</span> ua.user_id, ua.<span class="hljs-keyword">value</span>
  <span class="hljs-keyword">FROM</span> <span class="hljs-built_in">public</span>.user_attribute ua
  <span class="hljs-keyword">JOIN</span> <span class="hljs-built_in">public</span>.user_entity u <span class="hljs-keyword">ON</span> u.id = ua.user_id
  <span class="hljs-keyword">JOIN</span> <span class="hljs-built_in">public</span>.user_group_membership ugm <span class="hljs-keyword">ON</span> ugm.user_id = u.id
  <span class="hljs-keyword">JOIN</span> <span class="hljs-built_in">public</span>.keycloak_group g <span class="hljs-keyword">ON</span> g.id = ugm.group_id
  <span class="hljs-keyword">JOIN</span> <span class="hljs-built_in">public</span>.tenant r <span class="hljs-keyword">ON</span> r.id = u.tenant_id
  <span class="hljs-keyword">WHERE</span> ua.name = <span class="hljs-string">'attributeA'</span>
    <span class="hljs-keyword">AND</span> u.enabled
    <span class="hljs-keyword">AND</span> u.service_account_client_link <span class="hljs-keyword">IS</span> <span class="hljs-keyword">NULL</span>
    <span class="hljs-keyword">AND</span> r.name = <span class="hljs-string">'tenantA'</span>
    <span class="hljs-keyword">AND</span> (g.name = <span class="hljs-string">'groupA'</span> <span class="hljs-keyword">OR</span> g.parent_group = (
         <span class="hljs-keyword">SELECT</span> id <span class="hljs-keyword">FROM</span> <span class="hljs-built_in">public</span>.keycloak_group <span class="hljs-keyword">WHERE</span> <span class="hljs-type">name</span> = <span class="hljs-string">'groupA'</span> <span class="hljs-keyword">AND</span> tenant_id= r.id
    ))
)
</code></pre>
<p>This single query expresses the same scoping logic that previously lived in separate CTEs. Because all the join conditions are on the same foreign server and use built‑in operators, the FDW can push down the entire join. The new plan looked like this:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">Foreign</span> Scan  (actual <span class="hljs-type">time</span>=<span class="hljs-number">7.840</span>.<span class="hljs-number">.7</span><span class="hljs-number">.856</span> <span class="hljs-keyword">rows</span>=<span class="hljs-number">671</span> loops=<span class="hljs-number">1</span>)
  Remote <span class="hljs-keyword">SQL</span>: <span class="hljs-keyword">SELECT</span> ua.user_id, ua.<span class="hljs-keyword">value</span> <span class="hljs-keyword">FROM</span> user_attribute ua <span class="hljs-keyword">JOIN</span> user_entity u <span class="hljs-keyword">ON</span> ua.user_id = u.id <span class="hljs-keyword">JOIN</span> user_group_membership ugm <span class="hljs-keyword">ON</span> ugm.user_id = u.id <span class="hljs-keyword">JOIN</span> keycloak_group g <span class="hljs-keyword">ON</span> g.id = ugm.group_id <span class="hljs-keyword">JOIN</span> tenant r <span class="hljs-keyword">ON</span> u.tenant_id= r.id <span class="hljs-keyword">WHERE</span> ua.name = <span class="hljs-string">'attributeA'</span> <span class="hljs-keyword">AND</span> u.enabled <span class="hljs-keyword">AND</span> u.service_account_client_link <span class="hljs-keyword">IS</span> <span class="hljs-keyword">NULL</span> <span class="hljs-keyword">AND</span> r.name = <span class="hljs-string">'tenantA'</span> <span class="hljs-keyword">AND</span> (g.name = <span class="hljs-string">'groupA'</span> <span class="hljs-keyword">OR</span> g.parent_group = <span class="hljs-meta">$1</span>)
</code></pre>
<p>Only one remote query is executed, and it returns 671 rows. Tenant and group scoping occur on the remote server. There is no nested loop or repeated remote scan. The final runtime dropped to <strong>about 25 ms</strong>.</p>
<h3 id="heading-why-it-improved">Why it improved</h3>
<ol>
<li><p><strong>Fewer rows crossing the network.</strong> The old plan fetched 80k attribute rows and filtered them locally. The new plan fetched only the 671 scoped rows.</p>
</li>
<li><p><strong>No repeated remote calls.</strong> The old plan executed 416 remote scans of <code>user_entity</code>. The new plan performs one joined remote query.</p>
</li>
<li><p><strong>Less local work.</strong> Because the join and filtering happen remotely, the local side no longer hashes or filters large sets.</p>
</li>
</ol>
<h3 id="heading-key-takeaway">Key takeaway</h3>
<p>If you see a Foreign Scan with a high loops count or a Remote SQL that doesn’t contain your filters and joins, you’re leaving performance on the table. Merging filters and joins into a single remote query (subject to shippability rules) can yield large improvements. In this case, it cut the runtime from about 166 ms to about 25 ms.</p>
<h2 id="heading-checklist-and-troubleshooting-guide">Checklist and Troubleshooting Guide</h2>
<p>The following steps summarize how to approach FDW performance tuning:</p>
<ol>
<li><p><strong>Inspect the Remote SQL.</strong> Always run <code>EXPLAIN (VERBOSE)</code> and look at what is being sent to the remote. If your predicates are missing, the FDW isn't pushing them down.</p>
</li>
<li><p><strong>Check loops.</strong> If <code>loops</code> is greater than 1 on a Foreign Scan, you are paying for repeated remote calls. Rewrite the query or reorder the joins to make the foreign scan run once.</p>
</li>
<li><p><strong>Make predicates shippable.</strong> Replace volatile functions with constants or parameters. Ensure operators and functions are built‑in or explicitly allow‑listed via the extensions option <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[2]</a>.</p>
</li>
<li><p><strong>Align types and collations.</strong> Use the same data types and collations on both sides to avoid semantic mismatches <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=It%20is%20generally%20recommended%20that,differently%20from%20the%20local%20server">[1]</a>.</p>
</li>
<li><p><strong>Push joins to the same server.</strong> Consolidate tables on one foreign server if possible. Joins across servers cannot be pushed down <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=When%20,clauses">[6]</a>.</p>
</li>
<li><p><strong>Use use_remote_estimate when planning seems off.</strong> Enabling remote estimates can improve join order selection <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>.</p>
</li>
<li><p><strong>Tune fetch_size and costs</strong> if your queries transfer many rows. A larger fetch_size reduces the number of round trips; adjusting <code>fdw_startup_cost</code> and <code>fdw_tuple_cost</code> influences the planner <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>.</p>
</li>
<li><p><strong>Analyze foreign tables</strong> if you rely on local cost estimates. Keep in mind that stats can get stale quickly <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>.</p>
</li>
<li><p><strong>Monitor both servers.</strong> Use <code>pg_stat_statements</code> on local and remote servers to see how often remote queries run and how long they take.</p>
</li>
<li><p><strong>Test version upgrades.</strong> Major releases have repeatedly extended FDW pushdown (for example, aggregates in 10 <a target="_blank" href="https://www.postgresql.org/docs/release/10.0/#:~:text=,Jeevan%20Chalke%2C%20Ashutosh%20Bapat">[7]</a>, ORDER BY/LIMIT in 12 <a target="_blank" href="https://www.postgresql.org/docs/release/12.0/#:~:text=,Etsuro%20Fujita%29%20%C2%A7%20%C2%A7">[8]</a>). Retest after upgrading.</p>
</li>
</ol>
<h2 id="heading-case-study-takeaways">Case Study Takeaways</h2>
<p>Querying remote data with PostgreSQL’s <code>postgres_fdw</code> can be fast and convenient if you respect the underlying mechanics. Pushdown is the difference between streaming a trickle of relevant rows and hauling an ocean of data across the network. It isn't simply a matter of moving CPU cycles – it changes how much data moves, how many network round trips occur, and how much your local server has to do.</p>
<p>The rules may seem restrictive – use only immutable functions, avoid cross‑server joins, align types and collations – but they exist to preserve correctness while enabling optimization.</p>
<p>By reading <code>EXPLAIN</code> from the bottom up, inspecting the Remote SQL, and understanding the shippability rules, you can spot slow patterns quickly. Armed with tuning knobs like <code>fetch_size</code> and <code>use_remote_estimate</code>, and a willingness to rewrite queries to make joins and filters pushable, you can often achieve dramatic performance gains without touching your hardware.</p>
<p>The case study shows that rewriting a query so that it runs as a single joined remote query reduced runtime from around <strong>166 ms to 25 ms</strong>. That sort of improvement is not rare. It’s what happens when you treat FDW queries as distributed queries rather than local queries in disguise.</p>
<p>The next time you debug a slow FDW query, remember this handbook. Check the Remote SQL. Count the loops. Ask yourself: “Am I doing the work close to the data, or am I bringing the data to the work?” Adjust accordingly, and you'll write queries that make the most of Postgres's federated capabilities while keeping your latency in check.</p>
<p>This section closes the case study loop and summarizes exactly what changed in the plan and why it produced a large end-to-end win. The following sections of the handbook turn that single win into a repeatable method: how Postgres determines what is shippable, how to quickly read FDW plans, which operations and versions matter, and how to debug common failure modes that prevent pushdown.</p>
<h2 id="heading-advanced-operations-a-deeper-dive-into-shippability">Advanced Operations: A Deeper Dive into Shippability</h2>
<p>The previous sections introduced the basic rules around what can be pushed to the remote and why. To really make sense of those rules, you need to see how they play out on the operations you use every day.</p>
<p>This section walks through filters, joins, aggregates, ordering and limits, DISTINCT queries, and window functions in more detail. By the end, you should have a mental map of which operations to trust and which to double‑check when reading your plans.</p>
<h3 id="heading-filters-and-simple-predicates">Filters and simple predicates</h3>
<h4 id="heading-where-clauses-matter-more-than-you-think">WHERE clauses matter more than you think</h4>
<p>When you specify <code>WHERE attribute = 'value'</code> on a foreign table, the FDW will happily transmit that predicate to the remote server as long as the comparison uses built‑in types and immutable operators. For example:</p>
<ul>
<li><p><code>WHERE id = 42</code> is fine</p>
</li>
<li><p><code>WHERE lower(username) = 'hamdaan'</code> is generally fine because <code>lower()</code> is a built‑in, immutable function</p>
</li>
<li><p><code>WHERE created_at &gt;= now() - interval '7 days'</code> is not shippable because <code>now()</code> is not immutable (PostgreSQL marks it <code>STABLE</code>), and the FDW only ships immutable functions</p>
</li>
</ul>
<p>When such a predicate cannot be pushed, the FDW will fetch every row that matches all the shippable predicates and apply the rest locally. That means that a seemingly innocuous call to <code>now()</code> can blow up your network traffic.</p>
<p>The lesson is simple: compute non‑immutable values up front (in your application, or once per query) and pass them to the query against the foreign table as constants or parameters.</p>
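<p>One way to see the difference is to compare the two plans side by side. The sketch below assumes a hypothetical foreign table named <code>remote_events</code> with a <code>created_at</code> column; the prepared statement name is also made up for illustration. In the first plan, expect the time condition to appear as a local filter rather than in the Remote SQL. In the second, the cutoff is evaluated before the statement runs, so the condition should show up in the Remote SQL.</p>
<pre><code class="lang-pgsql">-- Hypothetical foreign table: remote_events(id, created_at, ...)

-- now() sits inside the predicate, so expect a local filter here.
EXPLAIN (VERBOSE)
SELECT id
FROM remote_events
WHERE created_at &gt;= now() - interval '7 days';

-- The cutoff arrives as a parameter value instead of a function call.
PREPARE recent_events(timestamptz) AS
  SELECT id
  FROM remote_events
  WHERE created_at &gt;= $1;

EXPLAIN (VERBOSE)
EXECUTE recent_events(now() - interval '7 days');
</code></pre>
<p>Treat this as a diagnostic pattern, not a guarantee: the exact plan depends on your PostgreSQL version and settings, so confirm the result in your own Remote SQL.</p>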
<h4 id="heading-complex-expressions-are-not-automatically-unsafe">Complex expressions are not automatically unsafe</h4>
<p>Suppose you have <code>WHERE (status = 'active' AND (age BETWEEN 18 AND 29 OR age &gt; 65))</code>. This entire expression is shippable because it uses built‑in boolean logic, simple comparisons, and immutable operators. The FDW will deparse it into remote SQL and forward it. You only need to worry when one of the subexpressions introduces a function or operator that the FDW doesn’t recognize or cannot safely assume exists on the remote.</p>
<p>A good heuristic is: if you can express your filter using only simple comparisons, boolean logic, and built‑in functions, pushdown should work. When in doubt, check the Remote SQL.</p>
<h4 id="heading-array-and-json-operators">Array and JSON operators</h4>
<p>Modern Postgres makes heavy use of array and JSON functions. Many of these functions, like the array overlap operator <code>&amp;&amp;</code> used in the case study, are built‑in and can be shipped. But some array and JSON helpers come from extensions rather than core PostgreSQL, and those are not shippable by default.</p>
<p>If your filter uses one of these, ensure that the extension is available and allow‑listed on the foreign server. Otherwise, the FDW will fetch rows and perform the JSON logic locally. This is rarely what you want when dealing with large JSON columns.</p>
<h3 id="heading-joins-the-good-the-bad-and-the-ugly">Joins: the good, the bad, and the ugly</h3>
<h4 id="heading-sameserver-joins-are-your-friend">Same‑server joins are your friend</h4>
<p>If you join multiple foreign tables that are all defined on the same foreign server and user mapping, and if the join condition uses only shippable expressions, then the FDW can generate a single remote join. This is the ideal case.</p>
<p>For example, joining orders and customers on <code>orders.customer_id = customers.id</code> is pushable, as long as both tables reside on the same foreign server. The remote planner will use its own statistics and indexes to plan the join, and the local server will simply iterate through the result. Postgres 9.6 and later support this pattern <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=When%20,clauses">[6]</a>.</p>
<h4 id="heading-crossserver-joins-break-pushdown">Cross‑server joins break pushdown</h4>
<p>If you attempt to join two foreign tables that live on different servers (or even on the same remote server but with different user mappings), postgres_fdw will fetch the tables separately and join them locally. This is almost always slower than pushing the join down, because you transfer every row from both tables that survives the pushed‑down filters, rather than only the joined result.</p>
<p>The FDW design team chose not to support cross‑server joins because there is no portable way to tell two remote servers to cooperate on a join. Your options are: replicate one table on the other server, materialize the smaller table locally before joining, or restructure the query to filter aggressively on each side before joining locally.</p>
<h4 id="heading-mixed-localforeign-joins-are-tricky">Mixed local/foreign joins are tricky</h4>
<p>Joining a local table to a foreign table cannot be pushed down, for straightforward reasons: the remote server has no access to your local data. A common pattern that triggers repeated remote calls looks like this:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> u.id, a.<span class="hljs-keyword">value</span>
<span class="hljs-keyword">FROM</span> users u
<span class="hljs-keyword">LEFT JOIN</span> user_attribute a
  <span class="hljs-keyword">ON</span> a.user_id = u.id <span class="hljs-keyword">AND</span> a.name = <span class="hljs-string">'favorite_color'</span>;
</code></pre>
<p>If <code>users</code> is a local table and <code>user_attribute</code> is foreign, the plan may use a nested loop: for each local u, it executes a remote lookup in user_attribute to retrieve attributes.</p>
<p>The fix is to flip the query: retrieve all relevant rows from <code>user_attribute</code> in one remote scan, then join them locally. Or, if possible, create a small temporary table on the remote side with your u.id values, perform the join entirely remotely, and then fetch the results.</p>
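<p>When the set of local IDs is small compared with the remote table, another option worth testing is to hand the IDs to the remote scan as a single array. The sketch below reuses the <code>users</code> and <code>user_attribute</code> tables from the example above. It relies on the assumption that the uncorrelated <code>ARRAY(SELECT ...)</code> is evaluated once and sent to the remote server as a query parameter, which you should verify in the Remote SQL on your version.</p>
<pre><code class="lang-pgsql">-- users is local; user_attribute is a foreign table.
SELECT u.id, a.value
FROM users u
LEFT JOIN (
  SELECT user_id, value
  FROM user_attribute
  WHERE name = 'favorite_color'
    -- The ID list is built once locally and passed to the remote scan.
    AND user_id = ANY (ARRAY(SELECT id FROM users))
) a ON a.user_id = u.id;
</code></pre>
<p>If the array would contain a large share of the remote table's keys, it adds overhead without reducing much data, and a single unparameterized remote scan is likely the better choice.</p>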
<h4 id="heading-join-conditions-matter">Join conditions matter</h4>
<p>Even when joining two foreign tables on the same server, an unshippable join condition will force the join to be local. For example, <code>JOIN ON textcol ILIKE '%foo%'</code> may stay local if the FDW cannot prove that the comparison, including its collation, behaves identically on the remote. Check the Remote SQL rather than assuming either way.</p>
<p>If you need case‑insensitive matching, consider lowercasing both sides: <code>LOWER(textcol) = 'foo'</code> (assuming the remote server has the <code>lower()</code> function available and allowed). Similarly, joining on a cast expression (for example, <code>JOIN ON CAST(a.id AS text) = b.text_id</code>) can block pushdown. Define your columns with matching types instead.</p>
<h3 id="heading-aggregates-and-grouping">Aggregates and grouping</h3>
<p>Aggregates are where the data movement story shines. When you can push down a <code>GROUP BY</code> and aggregate functions like <code>COUNT</code>, <code>SUM</code>, <code>AVG</code>, or <code>MAX</code>, you reduce the result set to just the aggregated rows. This can be a difference of several orders of magnitude.</p>
<p>Postgres 10 introduced aggregate pushdown <a target="_blank" href="https://www.postgresql.org/docs/release/10.0/#:~:text=,Jeevan%20Chalke%2C%20Ashutosh%20Bapat">[7]</a>. But not all aggregates are equal:</p>
<p><strong>Simple aggregates</strong> such as <code>COUNT(*)</code>, <code>SUM(col)</code>, <code>AVG(col)</code>, <code>MIN(col)</code>, and <code>MAX(col)</code> are shippable when applied to shippable expressions. Even <code>COUNT(DISTINCT col)</code> is often shippable, because the remote can deduplicate before counting. The FDW will wrap the aggregate in a remote query and return just the aggregated row.</p>
<p>If you see a GroupAggregate node on the local side, check whether all involved columns and functions are shippable. If they are, ensure that the join conditions above are also pushable.</p>
<p><strong>Filtered aggregates</strong> such as <code>COUNT(*) FILTER (WHERE x &gt; 5)</code> or <code>SUM(col) FILTER (WHERE status = 'active')</code> are often pushable. As long as both the aggregate and the filter condition are shippable, the FDW can include the <code>FILTER</code> clause in the remote aggregate.</p>
<p><strong>User‑defined aggregates</strong> are rarely pushable. If you have a custom aggregate function, the FDW will not assume that it exists or behaves the same on the remote server. Even if you install the function on both servers, postgres_fdw won't push it unless the function is in an allow‑listed extension.</p>
<p><strong>Grouping sets and rollups</strong> are not currently pushable. When you write <code>GROUP BY GROUPING SETS (...)</code> or <code>ROLLUP(...)</code>, Postgres will compute the grouping locally even if the underlying scan is remote.</p>
<p>If you need complex rollups, consider performing them in two steps: push down the initial grouping to the remote server to reduce rows, then perform the rollup locally.</p>
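<p>A minimal sketch of that two‑step approach follows. The foreign table <code>remote_orders</code> and its columns <code>region</code>, <code>status</code>, and <code>amount</code> are hypothetical. The <code>MATERIALIZED</code> keyword requires PostgreSQL 12 or later and is used here so the inner grouping is planned on its own, which gives the FDW a chance to push it down.</p>
<pre><code class="lang-pgsql">-- Step 1: plain GROUP BY, a candidate for aggregate pushdown.
WITH per_group AS MATERIALIZED (
  SELECT region, status,
         count(*)    AS order_count,
         sum(amount) AS total_amount
  FROM remote_orders
  GROUP BY region, status
)
-- Step 2: the rollup runs locally over the small grouped result.
SELECT region, status,
       sum(order_count)  AS order_count,
       sum(total_amount) AS total_amount
FROM per_group
GROUP BY ROLLUP (region, status);
</code></pre>
<p>Re‑aggregating works directly for counts and sums. For an average, carry the sum and the count through step 1 and divide in step 2, because an average of averages is not the overall average.</p>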
<h3 id="heading-order-by-limit-and-distinct">ORDER BY, LIMIT, and DISTINCT</h3>
<p>Ordering and limiting rows may seem like purely cosmetic features, but they affect how much data is transferred. If the remote can sort and limit, the local server only receives the top N rows. If it cannot, the local server must sort everything.</p>
<p>Postgres 12 expanded the cases where <code>ORDER BY</code> and <code>LIMIT</code> are pushed down <a target="_blank" href="https://www.postgresql.org/docs/release/12.0/#:~:text=,Etsuro%20Fujita%29%20%C2%A7%20%C2%A7">[8]</a>. Here are some guidelines:</p>
<ul>
<li><p><strong>Single foreign scan with simple sort:</strong> If your query selects from one foreign table and sorts by a shippable expression (for example, <code>ORDER BY created_at DESC</code>), the FDW will include <code>ORDER BY</code> in Remote SQL. It will also push down <code>LIMIT</code> and <code>OFFSET</code>. This is ideal because the remote server does the sort and sends only the top rows.</p>
</li>
<li><p><strong>Sort after join:</strong> If you sort after joining two foreign tables on the same server, and the join and sort expressions are shippable, the FDW may push both down. But if the sort requires columns from the local side or from a different remote server, the FDW cannot push it down.</p>
</li>
<li><p><strong>Sort after aggregation:</strong> Sorting aggregated results is often pushable as long as the aggregate itself is pushable. But when grouping occurs locally, the sort remains local.</p>
</li>
<li><p><strong>DISTINCT is not guaranteed to follow GROUP BY.</strong> Don't assume that <code>DISTINCT</code> is pushed down just because its expression list is shippable. In the case study above, the refactored query used <code>SELECT DISTINCT</code>, yet the Remote SQL contained no <code>DISTINCT</code>, so deduplication ran locally. If remote deduplication matters, test an equivalent <code>GROUP BY</code>, which aggregate pushdown can cover. <code>DISTINCT ON</code> has different semantics from plain <code>DISTINCT</code>, so check the Remote SQL on your version before relying on it.</p>
</li>
</ul>
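<p>The simplest of these cases is easy to check for yourself. The sketch below assumes a hypothetical foreign table <code>remote_events</code> with a <code>created_at</code> column. If pushdown works, the Remote SQL line should contain both the <code>ORDER BY</code> and the <code>LIMIT</code>; if you see a local Sort or Limit node above the Foreign Scan instead, the remote server is sending more rows than you need.</p>
<pre><code class="lang-pgsql">-- Top-N query against a single foreign table.
EXPLAIN (VERBOSE)
SELECT id, created_at
FROM remote_events
ORDER BY created_at DESC
LIMIT 20;
</code></pre>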
<h3 id="heading-window-functions-1">Window functions</h3>
<p>Window functions (for example, <code>ROW_NUMBER() OVER (PARTITION BY ...), RANK(), LAG(), LEAD()</code>) rely on ordering and partitioning across rows.</p>
<p>Postgres has not yet taught <code>postgres_fdw</code> how to push window functions. When you see a WindowAgg node in your plan, it’s almost always local. The FDW will fetch the rows, and the local server will sort, partition, and compute the window. If you need to run window functions on remote data, plan to transfer the data locally.</p>
<h3 id="heading-versionspecific-quirks">Version‑specific quirks</h3>
<p>The exact pushdown capabilities vary by release. When planning migrations or deciding whether to rely on a pushdown behavior, check the release notes:</p>
<ul>
<li><p><strong>9.6:</strong> first version to support pushdown of joins and sorts, and remote updates and deletes.</p>
</li>
<li><p><strong>10:</strong> introduced aggregate pushdown <a target="_blank" href="https://www.postgresql.org/docs/release/10.0/#:~:text=,Jeevan%20Chalke%2C%20Ashutosh%20Bapat">[7]</a>, significantly reducing network use for <code>GROUP BY</code> queries.</p>
</li>
<li><p><strong>11:</strong> improved partition pruning and join ordering for foreign tables.</p>
</li>
<li><p><strong>12:</strong> expanded <code>ORDER BY</code> and <code>LIMIT</code> pushdown <a target="_blank" href="https://www.postgresql.org/docs/release/12.0/#:~:text=,Etsuro%20Fujita%29%20%C2%A7%20%C2%A7">[8]</a>.</p>
</li>
<li><p><strong>15:</strong> added pushdown for simple <code>CASE</code> expressions and additional built‑in functions.</p>
</li>
<li><p><strong>17</strong> (development at the time of writing) continues to expand shippable constructs. Always test on your target version because subtle improvements can change what the FDW can ship.</p>
</li>
</ul>
<h2 id="heading-common-antipatterns-and-how-to-avoid-them">Common Anti‑Patterns and How to Avoid Them</h2>
<p>Everyone has run into FDW queries that seemed reasonable but turned out to be bottlenecks. Here are a few of the most common mistakes and how to correct them. These examples are deliberately simplified so you can adapt them to your schema.</p>
<h3 id="heading-using-volatile-functions-in-predicates">Using volatile functions in predicates</h3>
<p><strong>Anti‑pattern:</strong></p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> *
<span class="hljs-keyword">FROM</span> audit_logs
<span class="hljs-keyword">WHERE</span> event_ts &gt;= now() - <span class="hljs-type">interval</span> <span class="hljs-string">'1 day'</span>;
</code></pre>
<p><code>now()</code> is not immutable (PostgreSQL marks it <code>STABLE</code>), so the FDW refuses to push this predicate. It pulls all rows from audit_logs and filters them locally.</p>
<p><strong>Better:</strong></p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> *
<span class="hljs-keyword">FROM</span> audit_logs
<span class="hljs-keyword">WHERE</span> event_ts &gt;= <span class="hljs-meta">$1</span>;
</code></pre>
<p>Compute <code>$1</code> (a timestamp) in your application or upstream query. Or compute it once in an uncorrelated scalar subquery, which the planner evaluates a single time before the scan. A plain CTE is less reliable here, because PostgreSQL 12 and later may inline it and put <code>now()</code> back into the predicate.</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> *
<span class="hljs-keyword">FROM</span> audit_logs
<span class="hljs-keyword">WHERE</span> event_ts &gt;= (<span class="hljs-keyword">SELECT</span> now() - <span class="hljs-type">interval</span> <span class="hljs-string">'1 day'</span>);
</code></pre>
<p>The FDW sees a parameter instead of a function call and can push the predicate. Confirm this in the Remote SQL for your version.</p>
<h3 id="heading-joining-local-and-foreign-data-first">Joining local and foreign data first</h3>
<p><strong>Anti‑pattern:</strong></p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> u.email, ua.<span class="hljs-keyword">value</span>
<span class="hljs-keyword">FROM</span> users u
<span class="hljs-keyword">LEFT JOIN</span> user_attribute ua <span class="hljs-keyword">ON</span> u.id = ua.user_id <span class="hljs-keyword">AND</span> ua.name = <span class="hljs-string">'favorite_movie'</span>;
</code></pre>
<p>This uses a local table (users) to drive a join to a foreign table (user_attribute). If users has 10,000 rows and the planner picks a nested loop, the FDW issues up to 10,000 individual remote queries. Each call fetches one or zero rows from user_attribute.</p>
<p><strong>Better:</strong></p>
<pre><code class="lang-pgsql"><span class="hljs-comment">-- Fetch all favorite movies remotely and join locally</span>
<span class="hljs-keyword">WITH</span> remote_movies <span class="hljs-keyword">AS</span> (
  <span class="hljs-keyword">SELECT</span> ua.user_id, ua.<span class="hljs-keyword">value</span>
  <span class="hljs-keyword">FROM</span> user_attribute ua
  <span class="hljs-keyword">WHERE</span> ua.name = <span class="hljs-string">'favorite_movie'</span>
)
<span class="hljs-keyword">SELECT</span> u.email, rm.<span class="hljs-keyword">value</span>
<span class="hljs-keyword">FROM</span> users u
<span class="hljs-keyword">LEFT JOIN</span> remote_movies rm <span class="hljs-keyword">ON</span> u.id = rm.user_id;
</code></pre>
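<p>Now the FDW can issue one query to fetch all relevant attributes, and the join is done locally in one pass. PostgreSQL 12 and later may inline a CTE like this one; if the plan falls back to the nested loop, declare the CTE <code>AS MATERIALIZED</code> to keep it as a separate step.</p>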
<h3 id="heading-crossserver-joins-without-materialization">Cross‑server joins without materialization</h3>
<p><strong>Anti‑pattern:</strong></p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> *
<span class="hljs-keyword">FROM</span> remote_db1.orders o
<span class="hljs-keyword">JOIN</span> remote_db2.customers c <span class="hljs-keyword">ON</span> o.customer_id = c.id;
</code></pre>
<p>This is not pushable because the two tables are on different foreign servers. Postgres will fetch orders and customers separately and join them locally. If orders has 1 million rows and customers has 50,000 rows, you will transfer 1.05 million rows.</p>
<p><strong>Better:</strong> Replicate or materialize one side on the other server (or locally) before joining. For example, create a materialized view m_customers on remote_db1 containing just the id and name of the customers you need, then join orders and m_customers on the same server. Alternatively, copy customers into a temporary table on the local server and join there.</p>
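<p>The local temporary table option might look like the sketch below. It reuses the <code>remote_db1.orders</code> and <code>remote_db2.customers</code> foreign tables from the example; the temporary table name and the date filter on a hypothetical <code>order_date</code> column are illustrative assumptions.</p>
<pre><code class="lang-pgsql">-- Copy only the columns you need from the smaller side.
CREATE TEMP TABLE tmp_customers AS
SELECT id, name
FROM remote_db2.customers;

-- Give the planner real row counts for the local copy.
ANALYZE tmp_customers;

-- orders is still fetched over the network, so filter it as
-- tightly as you can with shippable predicates.
SELECT o.*, c.name
FROM remote_db1.orders o
JOIN tmp_customers c ON c.id = o.customer_id
WHERE o.order_date &gt;= DATE '2024-01-01';
</code></pre>
<p>This does not make the join remote. It pays off mainly when the copied table is reused across several queries or when local statistics help the planner choose a better join strategy.</p>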
<h3 id="heading-complex-expressions-on-join-keys">Complex expressions on join keys</h3>
<p><strong>Anti‑pattern:</strong></p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> *
<span class="hljs-keyword">FROM</span> remote_table a
<span class="hljs-keyword">JOIN</span> remote_table b <span class="hljs-keyword">ON</span> CAST(a.key <span class="hljs-keyword">AS</span> <span class="hljs-type">text</span>) = b.key_text;
</code></pre>
<p>Casting a numeric key to text can prevent pushdown. When it does, the remote server cannot use its indexes for the join and must return the rows from both tables. The local server then performs the cast and the join.</p>
<p><strong>Better:</strong> Align your schemas so that the join columns use the same type. If you cannot change the schema, create a computed column on the remote server with the appropriate type and use it in the join.</p>
<h3 id="heading-ignoring-collation-and-type-mismatches">Ignoring collation and type mismatches</h3>
<p><strong>Anti‑pattern:</strong></p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> *
<span class="hljs-keyword">FROM</span> remote_table
<span class="hljs-keyword">WHERE</span> citext_col = <span class="hljs-string">'abc'</span>;
</code></pre>
<p>The <code>citext</code> type comes from an extension. Unless that extension is listed in the foreign server's <code>extensions</code> option, the FDW cannot assume the comparison behaves the same on the remote and will not ship the filter. This appears harmless until you see the plan and realize all rows were fetched.</p>
<p><strong>Better:</strong> Install the same extensions and collations on both servers and allow‑list the extension on the foreign server, or convert the column to a base type like text on both sides.</p>
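<p>A sketch of the allow‑listing step is shown below. The foreign server name <code>keycloak_srv</code> is a placeholder for whatever name you used in <code>CREATE SERVER</code>. The <code>extensions</code> option is documented for <code>postgres_fdw</code> and expects extensions that are installed in compatible versions on both sides.</p>
<pre><code class="lang-pgsql">-- On the REMOTE server: make sure the extension exists.
CREATE EXTENSION IF NOT EXISTS citext;

-- On the LOCAL server: install it too, then allow-list it.
CREATE EXTENSION IF NOT EXISTS citext;
ALTER SERVER keycloak_srv OPTIONS (ADD extensions 'citext');
</code></pre>
<p>If the server already has an <code>extensions</code> option, use <code>SET</code> instead of <code>ADD</code> and include the full comma‑separated list. Afterwards, rerun <code>EXPLAIN (VERBOSE)</code> and check that the filter now appears in the Remote SQL.</p>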
<h2 id="heading-extending-tuning-calibrating-cost-models">Extending Tuning: Calibrating Cost Models</h2>
<p>Earlier, we discussed <code>fetch_size</code>, <code>use_remote_estimate</code>, and the cost knobs. This section expands on how to use them strategically.</p>
<h3 id="heading-balancing-fetch-size-and-memory">Balancing fetch size and memory</h3>
<p><code>fetch_size</code> controls how many rows the FDW asks for in each round trip <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[9]</a>. Think of it as the batch size. The default (100) works well for small result sets. If you expect to retrieve tens of thousands of rows, a higher fetch size reduces the overhead of many network requests. But there are trade‑offs:</p>
<ul>
<li><p><strong>Memory consumption:</strong> Each foreign scan buffers rows until they are consumed. A huge fetch size (for example, 10,000) may allocate more memory than you expect, especially when multiple scans run concurrently. Monitor memory usage as you increase this setting.</p>
</li>
<li><p><strong>Latency hiding:</strong> If network latency is high, overlapping network requests with local processing can hide some latency. But <code>postgres_fdw</code> does not pipeline multiple fetches – it waits for one batch before requesting the next. This means that a larger batch size reduces the number of waits, but cannot overlap them. If you operate across data centers, consider using a connection pooler or caching layer instead of just increasing <code>fetch_size</code>.</p>
</li>
</ul>
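<p>A short sketch of how the setting can be applied, assuming a hypothetical server <code>remote_srv</code> and a hypothetical large foreign table <code>big_remote_table</code>. The numbers are illustrative, not recommendations:</p>
<pre><code class="lang-pgsql">-- Server-wide default for every foreign table on this server
ALTER SERVER remote_srv OPTIONS (ADD fetch_size '1000');

-- Override for one large table; a table-level value takes precedence
ALTER FOREIGN TABLE big_remote_table OPTIONS (ADD fetch_size '5000');
</code></pre>
<p>Setting the option per table keeps the larger buffers limited to the scans that actually return many rows.</p>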
<h3 id="heading-remote-estimates-vs-local-estimates">Remote estimates vs. local estimates</h3>
<p>The planner uses statistics to estimate how many rows each node will produce, which in turn influences join order. When <code>use_remote_estimate</code> is false (the default), the planner guesses based on local stats collected by <code>ANALYZE</code> on the foreign table. This can be wrong if the remote table has a different distribution than the local sample, or if the table has changed since the last <code>ANALYZE</code>.</p>
<p>Setting <code>use_remote_estimate</code> to true instructs the FDW to run <code>EXPLAIN</code> on the remote server during planning to obtain row counts and cost estimates <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>. This can improve join ordering, especially when joining multiple foreign tables or mixing local and foreign tables. The downside is increased planning time because each remote estimate runs an extra query.</p>
<p>In practice:</p>
<ul>
<li><p>Enable <code>use_remote_estimate</code> on queries with complex joins where the planner picks obviously wrong join orders. If enabling it improves the plan, consider leaving it on for that server or table.</p>
</li>
<li><p>Use <code>ANALYZE</code> on foreign tables periodically if your remote data is relatively static. This populates local stats and can avoid the overhead of remote estimates.</p>
</li>
<li><p>Don’t enable <code>use_remote_estimate</code> indiscriminately on simple lookups. The cost of the extra remote round trips during planning may outweigh the benefit.</p>
</li>
</ul>
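<p>The options above could be tried out roughly as follows. This sketch assumes a foreign table named <code>orders</code> and a local table <code>users</code>, which are placeholder names:</p>
<pre><code class="lang-pgsql">-- Turn on remote estimates for one foreign table only
ALTER FOREIGN TABLE orders OPTIONS (ADD use_remote_estimate 'true');

-- Or, if the remote data is mostly static, refresh the local statistics instead
ANALYZE orders;

-- Compare estimated and actual row counts on the Foreign Scan node
EXPLAIN (ANALYZE, VERBOSE)
SELECT u.segment, count(*)
FROM users u
JOIN orders o ON o.user_id = u.id
GROUP BY u.segment;
</code></pre>
<p>Scoping the option to a single table limits the extra planning round trips to queries that touch it.</p>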
<h3 id="heading-tuning-cost-parameters">Tuning cost parameters</h3>
<p><code>fdw_startup_cost</code> and <code>fdw_tuple_cost</code> control how much the planner thinks it costs to start a foreign scan and fetch each row <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a>. If these are too low, the planner may choose a nested loop that generates many small remote calls. If they are too high, the planner might avoid remote scans even when they are efficient.</p>
<p>You can adjust these parameters based on empirical measurement:</p>
<ul>
<li><p>Increase <code>fdw_startup_cost</code> to discourage the planner from using nested loops that call the remote table repeatedly. You might set it to reflect the average cost of a remote round trip, expressed in the planner's cost units rather than in milliseconds.</p>
</li>
<li><p>Increase <code>fdw_tuple_cost</code> if network bandwidth is limited or expensive. This indicates to the planner that each remote row incurs higher fetch costs than a local row. The planner will prefer plans that filter early on the remote side.</p>
</li>
</ul>
<p>Always adjust these settings gradually and observe the effect on the plan. Keep separate settings per foreign server if network conditions differ.</p>
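<p>As an illustration, both parameters are set as options on the foreign server. The server name <code>remote_srv</code> and the values are assumptions for this sketch. Check the defaults for your PostgreSQL version before changing them:</p>
<pre><code class="lang-pgsql">-- Illustrative values in planner cost units; measure plans before and after
ALTER SERVER remote_srv
  OPTIONS (ADD fdw_startup_cost '500', ADD fdw_tuple_cost '0.5');

-- Later adjustments use SET because the options already exist
ALTER SERVER remote_srv OPTIONS (SET fdw_tuple_cost '0.2');
</code></pre>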
<h3 id="heading-when-to-analyze-foreign-tables">When to analyze foreign tables</h3>
<p>Running <code>ANALYZE</code> on a foreign table collects sample statistics by pulling a subset of rows from the remote server. This helps the planner estimate row counts when <code>use_remote_estimate</code> is off. It also helps decide whether to use an index on the remote side. You should analyze foreign tables when:</p>
<ul>
<li><p>The remote table is large and static, and you want accurate local estimates without the overhead of remote estimates.</p>
</li>
<li><p>You have just defined a foreign table, and the default stats are empty.</p>
</li>
<li><p>You changed the extensions allow‑list to enable more pushdown and want the planner to see the effect.</p>
</li>
</ul>
<p>Conversely, if the remote data changes constantly, <code>ANALYZE</code> results will quickly become stale. In that case, rely on <code>use_remote_estimate</code> instead.</p>
<h2 id="heading-further-case-studies-and-practical-examples">Further Case Studies and Practical Examples</h2>
<p>The Keycloak coverage example is not the only place where pushdown matters. The following scenarios illustrate other patterns you may encounter.</p>
<h3 id="heading-reporting-on-a-sharded-logging-system">Reporting on a sharded logging system</h3>
<p>Imagine you store application logs across multiple shards, each a separate Postgres database. You want to produce a report of the number of error logs per service per day.</p>
<p>A naïve approach might join all shards in one query:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> service, date_trunc(<span class="hljs-string">'day'</span>, log_time) <span class="hljs-keyword">AS</span> day, COUNT(*)
<span class="hljs-keyword">FROM</span> (
  <span class="hljs-keyword">SELECT</span> service, log_time <span class="hljs-keyword">FROM</span> shard1.logs
  <span class="hljs-keyword">UNION</span> <span class="hljs-keyword">ALL</span>
  <span class="hljs-keyword">SELECT</span> service, log_time <span class="hljs-keyword">FROM</span> shard2.logs
  ...
) all_logs
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> service, day;
</code></pre>
<p>This approach will fetch all log rows to the local server and aggregate them locally. A better solution is to push the grouping to each shard:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> service, day, sum(count) <span class="hljs-keyword">AS</span> total
<span class="hljs-keyword">FROM</span> (
  <span class="hljs-keyword">SELECT</span> service, date_trunc(<span class="hljs-string">'day'</span>, log_time) <span class="hljs-keyword">AS</span> day, COUNT(*) <span class="hljs-keyword">AS</span> count
  <span class="hljs-keyword">FROM</span> shard1.logs
  <span class="hljs-keyword">WHERE</span> log_time &gt;= <span class="hljs-meta">$1</span> <span class="hljs-keyword">AND</span> log_time &lt; <span class="hljs-meta">$2</span>
  <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> service, day
  <span class="hljs-keyword">UNION</span> <span class="hljs-keyword">ALL</span>
  <span class="hljs-keyword">SELECT</span> service, date_trunc(<span class="hljs-string">'day'</span>, log_time) <span class="hljs-keyword">AS</span> day, COUNT(*) <span class="hljs-keyword">AS</span> count
  <span class="hljs-keyword">FROM</span> shard2.logs
  <span class="hljs-keyword">WHERE</span> log_time &gt;= <span class="hljs-meta">$1</span> <span class="hljs-keyword">AND</span> log_time &lt; <span class="hljs-meta">$2</span>
  <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> service, day
  ...
) x
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> service, day;
</code></pre>
<p>Here, each foreign server returns a small set of aggregated rows instead of raw logs. The outer aggregation sums across shards. This pattern generalizes: push grouping and filtering to the remote side, then combine locally.</p>
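<p>To check that a shard really does the grouping, you can inspect one branch on its own. This is a sketch with a made-up date range:</p>
<pre><code class="lang-pgsql">-- The date range is a placeholder; substitute your reporting window
EXPLAIN (VERBOSE)
SELECT service, date_trunc('day', log_time) AS day, COUNT(*) AS count
FROM shard1.logs
WHERE log_time &gt;= '2026-01-01' AND log_time &lt; '2026-02-01'
GROUP BY service, day;
</code></pre>
<p>When the aggregate is shipped, the Remote SQL of the Foreign Scan contains the <code>GROUP BY</code>. If a local aggregate node sits above the Foreign Scan instead, the grouping is happening locally. One thing to watch: <code>date_trunc</code> on a <code>timestamp with time zone</code> column depends on the session time zone and is not immutable, so with that column type the grouping is expected to stay local.</p>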
<h3 id="heading-combining-remote-and-local-data-for-analytics">Combining remote and local data for analytics</h3>
<p>Suppose you have a local table <code>users</code> and a remote table <code>orders</code>. You want to compute the average order amount per user segment. A naïve query might look like:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> u.segment, AVG(o.amount)
<span class="hljs-keyword">FROM</span> users u
<span class="hljs-keyword">JOIN</span> orders o <span class="hljs-keyword">ON</span> o.user_id = u.id
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> u.segment;
</code></pre>
<p>This is a local join driving a remote nested loop. The better approach is to aggregate orders remotely by user_id and join on the small result:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">WITH</span> remote_totals <span class="hljs-keyword">AS</span> (
  <span class="hljs-keyword">SELECT</span> user_id, SUM(amount) <span class="hljs-keyword">AS</span> total, COUNT(*) <span class="hljs-keyword">AS</span> n
  <span class="hljs-keyword">FROM</span> orders
  <span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> user_id
)
<span class="hljs-keyword">SELECT</span> u.segment, SUM(rt.total) / SUM(rt.n) <span class="hljs-keyword">AS</span> avg_amount
<span class="hljs-keyword">FROM</span> users u
<span class="hljs-keyword">JOIN</span> remote_totals rt <span class="hljs-keyword">ON</span> u.id = rt.user_id
<span class="hljs-keyword">GROUP</span> <span class="hljs-keyword">BY</span> u.segment;
</code></pre>
<p>This pushes the heavy aggregation to the remote and transfers only one row per user. The local join then groups by segment. As with other examples, the key is to reduce remote rows before they cross the network.</p>
<h3 id="heading-avoiding-pushdown-for-correctness">Avoiding pushdown for correctness</h3>
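<p>There are legitimate cases where you should <em>prevent</em> pushdown because of semantic differences. Postgres allows you to do this by adding <code>OFFSET 0</code> to a subquery or by wrapping the foreign table in a CTE marked <code>MATERIALIZED</code>. Since PostgreSQL 12, a plain CTE that is referenced only once is inlined and no longer acts as a fence.</p>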
<p>For example, if a built‑in function behaves differently on the remote due to a version mismatch, you can force local evaluation:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">WITH</span> local_eval <span class="hljs-keyword">AS</span> MATERIALIZED (<span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> remote_table)  <span class="hljs-comment">-- materialized CTE prevents pushdown</span>
<span class="hljs-keyword">SELECT</span> *
<span class="hljs-keyword">FROM</span> local_eval
<span class="hljs-keyword">WHERE</span> some_complex_expression(local_eval.col) &gt; <span class="hljs-number">0</span>;
</code></pre>
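<p>Alternatively, a <code>WHERE</code> clause like <code>random() &lt; 0.1</code> will not push down because <code>random()</code> is volatile, so you don't need to force it. But adding <code>OFFSET 0</code> to a subquery is a simple hack: it acts as an optimization fence, so predicates in the outer query are not pushed into the subquery or shipped to the remote server:</p>
<pre><code class="lang-pgsql"><span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> (<span class="hljs-keyword">SELECT</span> * <span class="hljs-keyword">FROM</span> remote_table <span class="hljs-keyword">OFFSET</span> <span class="hljs-number">0</span>) <span class="hljs-keyword">AS</span> fenced
<span class="hljs-keyword">WHERE</span> some_complex_expression(fenced.col) &gt; <span class="hljs-number">0</span>;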
</code></pre>
<p>Knowing how to disable pushdown intentionally helps you debug. If a query returns different results when pushdown occurs, suspect type/collation mismatches or remote session settings <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=In%20the%20remote%20sessions%20opened,their%20expected%20search%20path%20environment">[4]</a>.</p>
<h2 id="heading-monitoring-diagnostics-and-regression-testing">Monitoring, Diagnostics, and Regression Testing</h2>
<p>Monitoring doesn't end at counting remote rows. To make pushdown reliable in production, you need to set up mechanisms to detect regressions and gather evidence when performance changes.</p>
<h3 id="heading-automate-explain-regression-tests">Automate EXPLAIN regression tests</h3>
<p>In addition to unit tests and integration tests, you can add tests that assert the shape of your plans. For instance, if a mission‑critical report must always push down a <code>WHERE</code> clause, you can write a test that runs <code>EXPLAIN (VERBOSE)</code> and checks that the Remote SQL contains the filter. You might even parse loops and assert that it is 1. When a developer inadvertently adds a non‑immutable function or changes a join, the test will fail. This is akin to snapshot testing for SQL.</p>
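<p>One way to sketch such a test in plain SQL is a <code>DO</code> block that captures the JSON plan and raises an error when the filter is missing from the Remote SQL. The table <code>remote_table</code>, the column <code>id</code>, and the literal <code>42</code> are placeholders, and the string match is deliberately crude:</p>
<pre><code class="lang-pgsql">DO $$
DECLARE
  plan json;
BEGIN
  -- id = 42 is a hypothetical filter on a hypothetical column
  EXECUTE 'EXPLAIN (VERBOSE, FORMAT JSON)
           SELECT * FROM remote_table WHERE id = 42'
    INTO plan;

  -- The JSON plan stores the shipped query under the "Remote SQL" key
  IF plan::text NOT LIKE '%Remote SQL%WHERE%id = 42%' THEN
    RAISE EXCEPTION 'filter was not pushed down: %', plan;
  END IF;
END
$$;
</code></pre>
<p>A test runner only needs to execute the block and treat any raised exception as a failure. For stricter checks, parse the JSON and inspect the Foreign Scan node rather than matching on text.</p>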
<h3 id="heading-monitor-pgstatstatements-across-servers">Monitor pg_stat_statements across servers</h3>
<p>Enable <code>pg_stat_statements</code> on both the local and remote servers. On the local side, track the total time, planning time, and rows for each FDW query. On the remote side, track which queries are being executed.</p>
<p>Look for outliers: a query whose remote calls spike or whose average remote rows jump from hundreds to thousands. Those are early signs of pushdown failure.</p>
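<p>A starting point for spotting those outliers on the remote server might look like the query below. It assumes PostgreSQL 13 or later, where the timing column is named <code>total_exec_time</code> (older versions call it <code>total_time</code>):</p>
<pre><code class="lang-pgsql">-- Run on the REMOTE server; requires the pg_stat_statements extension
SELECT query,
       calls,
       rows,
       round(rows::numeric / NULLIF(calls, 0), 1) AS avg_rows_per_call,
       round(total_exec_time::numeric, 1) AS total_ms
FROM pg_stat_statements
ORDER BY rows DESC
LIMIT 20;
</code></pre>
<p>Saving this output periodically gives you a baseline, so a jump in <code>avg_rows_per_call</code> for a statement issued by the FDW stands out.</p>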
<h3 id="heading-log-remote-sql-with-autoexplain">Log remote SQL with auto_explain</h3>
<p>Setting <code>auto_explain.log_min_duration</code> (for example, to 500ms) causes Postgres to automatically log slow queries with their plans. Combine this with <code>auto_explain.log_verbose = true</code> and <code>auto_explain.log_nested_statements = true</code> to capture remote SQL as well. When a federated query slows down, the log will show you exactly what remote SQL was executed. This is invaluable in production, where you cannot always run <code>EXPLAIN</code> interactively.</p>
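<p>For experimentation, the module can be loaded in a single session, as in the sketch below. Note that the duration threshold parameter in the auto_explain module is named <code>auto_explain.log_min_duration</code>. Loading the module and changing these settings typically requires superuser privileges. In production you would more likely set them in <code>postgresql.conf</code>:</p>
<pre><code class="lang-pgsql">-- Session-level setup for experimentation
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';
SET auto_explain.log_verbose = on;
SET auto_explain.log_nested_statements = on;
</code></pre>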
<h3 id="heading-use-connection-pooling-and-prepare-statements">Use connection pooling and prepared statements</h3>
<p><code>postgres_fdw</code> keeps one connection per user mapping open in each local session and reuses it for subsequent queries in that session. You can also add connection pooling at the network level (for example, PgBouncer or PgCat).</p>
<p>Keeping connections warm reduces the startup cost, as captured by <code>fdw_startup_cost</code>. Meanwhile, preparing statements on the remote server (via <code>PREPARE</code> and <code>EXECUTE</code>) can save parse time when the same remote SQL is executed frequently. <code>postgres_fdw</code> can use server‑side prepared statements for parameterized scans.</p>
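<p>To see which connections the current session is holding, and to control whether they are kept, PostgreSQL 14 and later offer the following. The server name <code>remote_srv</code> is a placeholder:</p>
<pre><code class="lang-pgsql">-- List the foreign-server connections cached by the current session
SELECT * FROM postgres_fdw_get_connections();

-- Close the connection at the end of each transaction for this server
ALTER SERVER remote_srv OPTIONS (ADD keep_connections 'off');
</code></pre>
<p>Turning <code>keep_connections</code> off trades connection reuse for fewer idle remote sessions, so it mainly makes sense when the remote server is short on connection slots.</p>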
<h3 id="heading-regression-testing-after-version-upgrades">Regression testing after version upgrades</h3>
<p>Major Postgres releases often bring improvements to <code>postgres_fdw</code> pushdown. But new releases also change planner heuristics and remote SQL generation. After an upgrade, rerun your key queries with <code>EXPLAIN (VERBOSE)</code>, compare the Remote SQL, and benchmark them.</p>
<p>In some cases, a release may push down something previously local, revealing a latent type mismatch or a function difference. In other cases, pushdown may be withheld due to a new rule. Don’t assume that an upgrade automatically improves performance – test it.</p>
<h2 id="heading-extended-guidelines-for-advanced-dbas">Extended Guidelines for Advanced DBAs</h2>
<p>To close this handbook, here are consolidated guidelines distilled from the previous sections. They go beyond simple bullet points to capture nuances. Keep them handy for reference or print them out for your team.</p>
<ol>
<li><p><strong>Respect the FDW safety model.</strong> Immutable functions and built‑in operators are your friends. Anything outside that scope must be explicitly allowed or evaluated locally. Understand which items belong to each category and plan accordingly.</p>
</li>
<li><p><strong>Always read the Remote SQL.</strong> Don’t trust your intuition about what is being pushed down. The Remote SQL string is the only source of truth. It indicates whether a predicate, join, sort, or limit operation is occurring remotely. It also shows parameter placeholders (for example, $1) that correspond to values passed from the local plan.</p>
</li>
<li><p><strong>Reduce before you fetch.</strong> The network is usually the dominant cost. If the remote can reduce rows through filtering, grouping, or limiting, let it. If it cannot, structure your query to enable it. Avoid queries that require pulling large raw tables and processing them locally.</p>
</li>
<li><p><strong>Beware of join order.</strong> The planner sometimes chooses a nested loop with a foreign table as the inner side, resulting in repeated remote calls. Examine loops: if you see a high number, consider rewriting the query or adjusting cost parameters.</p>
</li>
<li><p><strong>Use CTEs strategically.</strong> A CTE can isolate remote scans and let you control whether they are materialized once or inlined. Use <code>MATERIALIZED</code> to avoid repeated remote scans when a CTE is referenced multiple times. Use <code>NOT MATERIALIZED</code> to allow optimizations across CTE boundaries.</p>
</li>
<li><p><strong>Instrument, monitor, iterate.</strong> Good FDW performance is not a one‑off fix. Monitor queries and plans. Use tests to catch regressions. Adjust tuning knobs and indexes as your data or workload changes. Document your reasoning so others can understand why a particular plan is expected.</p>
</li>
<li><p><strong>Educate your team.</strong> Federated queries invite subtle bugs and performance traps. Share the high‑level rules – immutable functions only, cross‑server joins are local, always check remote SQL – so engineers write safer queries by default. A 30‑minute training can save hours of debugging later.</p>
</li>
</ol>
<h2 id="heading-bringing-it-all-together">Bringing it All Together</h2>
<p>This handbook has covered a lot of ground: from the high‑level principle that pushdown is about data movement, to the nitty‑gritty of join conditions and tuning knobs, to troubleshooting steps and case studies. It is intentionally opinionated and personal: these are the patterns and pitfalls encountered in real systems, not abstract guidelines. By sharing specific examples, I hoped to make the rules memorable and show how they interplay with actual workloads.</p>
<p>The goal is not just to tell you what to do, but to show you how to think and problem solve: review the plan, trace data movement, and determine whether the query is doing the heavy work in the right place.</p>
<p>That thinking process, practiced enough times, becomes second nature. When you write a new query, you'll automatically consider whether your predicates are immutable, whether the join can be shipped, and whether you are about to trigger an N+1 pattern. When you review plans, you'll start from the Foreign Scan nodes and remote SQL, not the top‑level node. When you tune, you'll know which knobs to twist and in which order.</p>
<p>Keep experimenting. Use the examples here as starting points. Try different structures in a test environment and measure the difference. The more you play with pushdown, the more comfortable you'll become with its constraints and superpowers.</p>
<p>If this handbook helps you avoid one performance incident or saves you from shipping a broken query, it has done its job. Enjoy exploring the federated world of Postgres.</p>
<h2 id="heading-references">References</h2>
<p><a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=It%20is%20generally%20recommended%20that,differently%20from%20the%20local%20server">[1]</a> <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[2]</a> <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=This%20option%2C%20which%20can%20be,false">[3]</a> <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=In%20the%20remote%20sessions%20opened,their%20expected%20search%20path%20environment">[4]</a> <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=functions%20in%20such%20clauses%20must,to%20reduce%20the%20risk%20of">[5]</a> <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=When%20,clauses">[6]</a> <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=">[9]</a> <a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html#:~:text=,Extension">[10]</a> PostgreSQL: Documentation: 18: F.38. postgres_fdw – access data stored in external PostgreSQL servers (<a target="_blank" href="https://www.postgresql.org/docs/current/postgres-fdw.html">https://www.postgresql.org/docs/current/postgres-fdw.html</a>)</p>
<p><a target="_blank" href="https://www.postgresql.org/docs/release/10.0/#:~:text=,Jeevan%20Chalke%2C%20Ashutosh%20Bapat">[7]</a> PostgreSQL: Release Notes (<a target="_blank" href="https://www.postgresql.org/docs/release/10.0/">https://www.postgresql.org/docs/release/10.0/</a>)</p>
<p><a target="_blank" href="https://www.postgresql.org/docs/release/12.0/#:~:text=,Etsuro%20Fujita%29%20%C2%A7%20%C2%A7">[8]</a> PostgreSQL: Release Notes (<a target="_blank" href="https://www.postgresql.org/docs/release/12.0/">https://www.postgresql.org/docs/release/12.0/</a>)</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Build a Payroll System with Express and Monnify Using Background Jobs ]]>
                </title>
                <description>
                    <![CDATA[ Processing payroll payments is an important operation for any business. When you need to pay employees simultaneously, you can't afford to have your server hang, get blocking errors, or timeout while waiting for each payment to complete. Building a p... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/build-a-payroll-system-with-express-and-monnify-using-background-jobs/</link>
                <guid isPermaLink="false">69680d9ead82a9267c20097d</guid>
                
                    <category>
                        <![CDATA[ Node.js ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Express ]]>
                    </category>
                
                    <category>
                        <![CDATA[ TypeScript ]]>
                    </category>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ handbook ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ David Aniebo ]]>
                </dc:creator>
                <pubDate>Wed, 14 Jan 2026 21:41:50 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768414407566/4384def7-fdc2-4274-888d-d5bd5bd5549b.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Processing payroll payments is an important operation for any business. When you need to pay employees simultaneously, you can't afford to have your server hang, get blocking errors, or time out while waiting for each payment to complete.</p>
<p>Building a payroll system is an excellent way to practice real-world backend development skills. Unlike simple CRUD applications, payroll systems require you to think about:</p>
<ul>
<li><p><strong>Asynchronous processing</strong>: When you need to pay hundreds of employees, processing payments synchronously can cause your server to time out. Background jobs with Bull and Redis allow you to handle long-running operations without blocking your API.</p>
</li>
<li><p><strong>Payment gateway integration</strong>: Working with payment APIs like Monnify teaches you how to handle external service integrations, authentication flows, webhook verification, and error handling in production systems.</p>
</li>
<li><p><strong>Data consistency</strong>: Payroll systems need to maintain accurate records. You'll learn about transaction reconciliation, idempotency, and how to handle partial failures gracefully.</p>
</li>
<li><p><strong>Production-ready patterns</strong>: This tutorial covers patterns you'll use in real applications: job queues, webhook handlers, database migrations, and proper error handling.</p>
</li>
</ul>
<p>Whether you're building a fintech application, an HR system, or just want to understand how payment processing works, the concepts in this tutorial will serve you well. The combination of Express, TypeScript, background jobs, and payment APIs represents a common stack in modern backend development.</p>
<p>In this tutorial, you’ll learn how to build a production-grade payroll engine using Express.js, TypeScript, and Monnify's payment API. You'll implement background job processing with <code>Bull</code> and <code>Redis</code> to handle bulk disbursements efficiently.</p>
<p>By the end, you will have a fully functional payroll system that can:</p>
<ul>
<li><p>Manage employee records with bank account details</p>
</li>
<li><p>Create and process payroll batches</p>
</li>
<li><p>Process bulk payments using Monnify's disbursement API</p>
</li>
<li><p>Handle payment status updates via webhooks</p>
</li>
<li><p>Reconcile transactions to ensure data consistency</p>
</li>
</ul>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><a class="post-section-overview" href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-project-architecture-overview">Project Architecture Overview</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-setting-up-the-project">Setting Up the Project</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-configuring-docker-for-postgresql-and-redis">Configuring Docker for PostgreSQL and Redis</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-setting-up-the-database">Setting Up the Database</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-creating-database-models">Creating Database Models</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-employee-model">Employee Model</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-employee-data-structure-employee-interface">Employee Data Structure (Employee Interface)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-employee-model-class">Employee Model Class</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-auto-generating-employee-ids-generateemployeeid">Auto-Generating Employee IDs (generateEmployeeId)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-creating-an-employee-create">Creating an Employee (create)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-retrieving-all-active-employees-findall">Retrieving All Active Employees (findAll)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-retrieving-an-employee-by-database-id-findbyid">Retrieving an Employee by Database ID (findById)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-retrieving-an-employee-by-employee-identifier-findbyemployeeid">Retrieving an Employee by Employee Identifier (findByEmployeeId)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-updating-an-employee-update">Updating an Employee (update)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-soft-deleting-an-employee-delete">Soft-Deleting an Employee (delete)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-payroll-model">Payroll Model</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-payroll-status-lifecycle">Payroll Status Lifecycle</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-payroll-entity">Payroll Entity</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-payroll-item-entity">Payroll Item Entity</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-creating-a-payroll-payrollmodelcreate">Creating a Payroll (PayrollModel.create)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-fetching-payroll-records">Fetching Payroll Records</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-updating-payroll-status-payrollmodelupdatestatus">Updating Payroll Status (PayrollModel.updateStatus)</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-payrollitemmodel">PayrollItemModel</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-fetching-payroll-items-payrollitemmodelfindbypayrollid">Fetching Payroll Items (PayrollItemModel.findByPayrollId)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-fetching-a-single-payroll-item-payrollitemmodelfindbyid">Fetching a Single Payroll Item (PayrollItemModel.findById)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-updating-payroll-item-status-payrollitemmodelupdatestatus">Updating Payroll Item Status (PayrollItemModel.updateStatus)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-overall-payroll-flow">Overall Payroll Flow</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-building-the-monnify-client">Building the Monnify Client</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-configuration-and-environment-setup">Configuration and Environment Setup</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-create-the-monnifyclient-class">Create the MonnifyClient Class</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-axios-client-and-request-interceptor">Axios Client and Request Interceptor</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-authenticate-with-monnify">Authenticate with Monnify</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-automatic-token-refresh-ensureauthenticated">Automatic Token Refresh (ensureAuthenticated)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-initiating-bulk-transfers">Initiating Bulk Transfers</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-authorizing-bulk-transfers-otp-validation">Authorizing Bulk Transfers (OTP Validation)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-transaction-status-lookup">Transaction Status Lookup</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-batch-details-retrieval">Batch Details Retrieval</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-wallet-balance-check">Wallet Balance Check</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-implementing-background-job-processing">Implementing Background Job Processing</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-set-up-the-payroll-processing-queue">Set Up the Payroll Processing Queue</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-queue-processor-registration">Queue Processor Registration</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-bulk-payroll-processing-flow-processbulkpayroll">Bulk Payroll Processing Flow (processBulkPayroll)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-building-the-bulk-transfer-payload">Building the Bulk Transfer Payload</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-initiating-bulk-disbursement-via-monnify">Initiating Bulk Disbursement via Monnify</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-storing-transaction-references">Storing Transaction References</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-payroll-statistics-reconciliation-updatepayrollstats">Payroll Statistics Reconciliation (updatePayrollStats)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-queue-entry-point-processpayrollitems">Queue Entry Point (processPayrollItems)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-role-in-the-overall-payroll-architecture">Role in the Overall Payroll Architecture</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-creating-the-api-controllers">Creating the API Controllers</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-controller-responsibilities">Controller Responsibilities</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-creating-an-employee-createemployee">Creating an Employee (createEmployee)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-fetching-all-employees-getallemployees">Fetching All Employees (getAllEmployees)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-fetching-a-single-employee-getemployeebyid">Fetching a Single Employee (getEmployeeById)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-updating-an-employee-updateemployee">Updating an Employee (updateEmployee)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-deleting-an-employee-deleteemployee">Deleting an Employee (deleteEmployee)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-error-handling-strategy">Error Handling Strategy</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-role-in-the-overall-payroll-system">Role in the Overall Payroll System</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-payroll-controller">Payroll Controller</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-controller-responsibilities-1">Controller Responsibilities</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-creating-a-payroll-createpayroll">Creating a Payroll (createPayroll)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-fetching-all-payrolls-getallpayrolls">Fetching All Payrolls (getAllPayrolls)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-fetching-a-payroll-with-items-getpayrollbyid">Fetching a Payroll with Items (getPayrollById)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-processing-a-payroll-processpayroll">Processing a Payroll (processPayroll)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-reconciling-payroll-payments-reconcilepayroll">Reconciling Payroll Payments (reconcilePayroll)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-payroll-statistics-update-internal-helper">Payroll Statistics Update (Internal Helper)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-fetching-payroll-status-summary-getpayrollstatus">Fetching Payroll Status Summary (getPayrollStatus)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-authorizing-bulk-transfers-authorizebulktransfer">Authorizing Bulk Transfers (authorizeBulkTransfer)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-checking-transaction-status-checktransactionstatus">Checking Transaction Status (checkTransactionStatus)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-checking-wallet-balance-getaccountbalance">Checking Wallet Balance (getAccountBalance)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-error-handling-and-resilience">Error Handling and Resilience</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-role-in-the-overall-payroll-architecture-1">Role in the Overall Payroll Architecture</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-setting-up-webhook-handlers">Setting Up Webhook Handlers</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-wiring-up-routes">Wiring Up Routes</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-employee-routes">Employee Routes</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-payroll-routes">Payroll Routes</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-main-application-entry-point">Main Application Entry Point</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-testing-the-system">Testing the System</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-setting-up-webhooks-for-production">Setting Up Webhooks for Production</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-key-takeaways">Key Takeaways</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-references">References:</a></p>
</li>
</ul>
</li>
</ol>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>Before you begin, make sure you have the following:</p>
<ul>
<li><p>Node.js (v18 or higher)</p>
</li>
<li><p>Docker and Docker Compose installed</p>
</li>
<li><p>A Monnify merchant account with API credentials</p>
</li>
<li><p>Basic knowledge of TypeScript and Express.js</p>
</li>
<li><p>Familiarity with REST APIs</p>
</li>
</ul>
<p>You'll also need to obtain these credentials from your Monnify dashboard:</p>
<ul>
<li><p>API Key</p>
</li>
<li><p>Secret Key</p>
</li>
<li><p>Contract Code</p>
</li>
<li><p>Webhook Secret (for verifying webhook signatures)</p>
</li>
</ul>
<h2 id="heading-project-architecture-overview">Project Architecture Overview</h2>
<p>Here's how the payroll system works:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766393228193/8626c139-776c-491b-b060-2f95a760f32b.png" alt="Payroll system working principle" class="image--center mx-auto" width="2952" height="1494" loading="lazy"></p>
<p><strong>Key components:</strong></p>
<ol>
<li><p><strong>Express API</strong>: A minimal and flexible Node.js web framework that handles HTTP requests for managing employees and payrolls. Express provides routing, middleware support, and makes it easy to build RESTful APIs.</p>
</li>
<li><p><strong>Bull Queue</strong>: A Redis-based queue library for Node.js that processes payroll jobs asynchronously in the background. Bull handles job retries, scheduling, and provides a reliable way to process long-running tasks without blocking your main application thread.</p>
</li>
<li><p><strong>Redis</strong>: An in-memory data structure store that serves as the backend for Bull queues. Redis stores job data, tracks job states (waiting, active, completed, failed), and enables distributed job processing across multiple workers.</p>
</li>
<li><p><strong>PostgreSQL</strong>: A relational database that persists employee records, payrolls, and payment items. PostgreSQL's ACID compliance ensures data integrity, and its support for complex queries makes it ideal for financial applications.</p>
</li>
<li><p><strong>Monnify API</strong>: A payment gateway service that handles actual money transfers to employee bank accounts. Monnify provides bulk disbursement capabilities, allowing you to process multiple payments in a single API call, which is essential for payroll systems.</p>
</li>
<li><p><strong>Webhooks</strong>: HTTP callbacks that receive real-time payment status updates from Monnify. When a payment completes or fails, Monnify sends a webhook to your server, allowing you to update your database immediately without polling.</p>
</li>
</ol>
<h2 id="heading-setting-up-the-project">Setting Up the Project</h2>
<p>In this section, we'll initialize a new Node.js project with TypeScript and install all the necessary dependencies. We'll configure TypeScript for type safety and set up the project structure that will support our payroll system.</p>
<p>First, create a new directory and initialize your project:</p>
<pre><code class="lang-bash">mkdir monnify-payroll-system
<span class="hljs-built_in">cd</span> monnify-payroll-system
npm init -y
</code></pre>
<p>Next, install the required dependencies:</p>
<pre><code class="lang-bash">npm install express cors helmet dotenv axios bull ioredis pg swagger-jsdoc swagger-ui-express express-validator
</code></pre>
<p>Then install the development dependencies:</p>
<pre><code class="lang-bash">npm install -D typescript ts-node ts-node-dev @types/node @types/express @types/cors @types/pg @types/bull
</code></pre>
<p>Create a <code>tsconfig.json</code> file:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"compilerOptions"</span>: {
    <span class="hljs-attr">"target"</span>: <span class="hljs-string">"ES2020"</span>,
    <span class="hljs-attr">"module"</span>: <span class="hljs-string">"commonjs"</span>,
    <span class="hljs-attr">"lib"</span>: [<span class="hljs-string">"ES2020"</span>],
    <span class="hljs-attr">"outDir"</span>: <span class="hljs-string">"./dist"</span>,
    <span class="hljs-attr">"rootDir"</span>: <span class="hljs-string">"./src"</span>,
    <span class="hljs-attr">"strict"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"esModuleInterop"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"skipLibCheck"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"forceConsistentCasingInFileNames"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"resolveJsonModule"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"declaration"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"declarationMap"</span>: <span class="hljs-literal">true</span>,
    <span class="hljs-attr">"sourceMap"</span>: <span class="hljs-literal">true</span>
  },
  <span class="hljs-attr">"include"</span>: [<span class="hljs-string">"src/**/*"</span>],
  <span class="hljs-attr">"exclude"</span>: [<span class="hljs-string">"node_modules"</span>, <span class="hljs-string">"dist"</span>]
}
</code></pre>
<p>And update your <code>package.json</code> scripts:</p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"scripts"</span>: {
    <span class="hljs-attr">"build"</span>: <span class="hljs-string">"tsc"</span>,
    <span class="hljs-attr">"start"</span>: <span class="hljs-string">"node dist/index.js"</span>,
    <span class="hljs-attr">"dev"</span>: <span class="hljs-string">"ts-node-dev --respawn --transpile-only src/index.ts"</span>,
    <span class="hljs-attr">"migrate"</span>: <span class="hljs-string">"ts-node scripts/run-migrations.ts"</span>
  }
}
</code></pre>
<p>Now, create a <code>.env</code> file for your environment variables. You can find all the Monnify credentials on the <a target="_blank" href="https://app.monnify.com/developer">developer page of your Monnify dashboard</a>:</p>
<pre><code class="lang-plaintext"># Server
PORT=3008
NODE_ENV=development

# Database
DB_HOST=localhost
DB_PORT=5433
DB_NAME=payroll_db
DB_USER=payroll_user
DB_PASSWORD=payroll_password

# Redis
REDIS_HOST=localhost
REDIS_PORT=6379

# Monnify
MONNIFY_API_KEY=your_api_key
MONNIFY_SECRET_KEY=your_secret_key
MONNIFY_BASE_URL=https://sandbox.monnify.com
MONNIFY_CONTRACT_CODE=your_contract_code
MONNIFY_WEBHOOK_SECRET=your_webhook_secret
</code></pre>
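<p>Because a missing credential would otherwise only surface when the first Monnify request fails, it can help to validate these variables once at startup. The following is a minimal sketch under that assumption: <code>loadConfig</code>, <code>requireEnv</code>, <code>parsePort</code>, and the <code>AppConfig</code> shape are hypothetical names that are not part of the project, and only the variable names and port values come from the <code>.env</code> file above.</p>
<pre><code class="lang-typescript">// Hypothetical startup check for the variables defined in .env.
interface Env {
  [key: string]: string | undefined;
}

interface AppConfig {
  port: number;
  dbPort: number;
  monnifyBaseUrl: string;
  monnifyApiKey: string;
  monnifySecretKey: string;
  monnifyContractCode: string;
}

// Returns the trimmed value of a required variable, or throws a descriptive error.
function requireEnv(env: Env, name: string): string {
  const value = (env[name] || '').trim();
  if (value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Parses a port, rejecting anything that is not an integer between 1 and 65535.
function parsePort(raw: string, name: string): number {
  const port = Number(raw);
  if (!Number.isInteger(port) || port &lt; 1 || port &gt; 65535) {
    throw new Error(`Invalid port in ${name}: ${raw}`);
  }
  return port;
}

// Assumption: the fallback ports match the sample .env values (3008 and 5433).
function loadConfig(env: Env): AppConfig {
  return {
    port: parsePort(env.PORT || '3008', 'PORT'),
    dbPort: parsePort(env.DB_PORT || '5433', 'DB_PORT'),
    monnifyBaseUrl: requireEnv(env, 'MONNIFY_BASE_URL'),
    monnifyApiKey: requireEnv(env, 'MONNIFY_API_KEY'),
    monnifySecretKey: requireEnv(env, 'MONNIFY_SECRET_KEY'),
    monnifyContractCode: requireEnv(env, 'MONNIFY_CONTRACT_CODE'),
  };
}
</code></pre>
<p>In <code>src/index.ts</code>, this could be called as <code>loadConfig(process.env)</code> right after <code>dotenv.config()</code>, so the process stops with a clear message instead of failing on the first payout.</p>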
<h2 id="heading-configuring-docker-for-postgresql-and-redis">Configuring Docker for PostgreSQL and Redis</h2>
<p>Before we can start building our application, we need to set up the infrastructure services: PostgreSQL for data persistence and Redis for job queue management. Using Docker Compose makes it easy to run these services locally with a single command. This approach ensures consistency across development environments and simplifies deployment.</p>
<p>Create a <code>docker-compose.yml</code> file to set up PostgreSQL and Redis:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">services:</span>
  <span class="hljs-attr">postgres:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">postgres:15-alpine</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">monnify-payroll-db</span>
    <span class="hljs-attr">environment:</span>
      <span class="hljs-attr">POSTGRES_USER:</span> <span class="hljs-string">payroll_user</span>
      <span class="hljs-attr">POSTGRES_PASSWORD:</span> <span class="hljs-string">payroll_password</span>
      <span class="hljs-attr">POSTGRES_DB:</span> <span class="hljs-string">payroll_db</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">'5433:5432'</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">postgres_data:/var/lib/postgresql/data</span>
    <span class="hljs-attr">healthcheck:</span>
      <span class="hljs-attr">test:</span> [<span class="hljs-string">'CMD-SHELL'</span>, <span class="hljs-string">'pg_isready -U payroll_user'</span>]
      <span class="hljs-attr">interval:</span> <span class="hljs-string">10s</span>
      <span class="hljs-attr">timeout:</span> <span class="hljs-string">5s</span>
      <span class="hljs-attr">retries:</span> <span class="hljs-number">5</span>

  <span class="hljs-attr">redis:</span>
    <span class="hljs-attr">image:</span> <span class="hljs-string">redis:7-alpine</span>
    <span class="hljs-attr">container_name:</span> <span class="hljs-string">monnify-payroll-redis</span>
    <span class="hljs-attr">ports:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">'6379:6379'</span>
    <span class="hljs-attr">volumes:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">redis_data:/data</span>
    <span class="hljs-attr">healthcheck:</span>
      <span class="hljs-attr">test:</span> [<span class="hljs-string">'CMD'</span>, <span class="hljs-string">'redis-cli'</span>, <span class="hljs-string">'ping'</span>]
      <span class="hljs-attr">interval:</span> <span class="hljs-string">10s</span>
      <span class="hljs-attr">timeout:</span> <span class="hljs-string">5s</span>
      <span class="hljs-attr">retries:</span> <span class="hljs-number">5</span>

<span class="hljs-attr">volumes:</span>
  <span class="hljs-attr">postgres_data:</span>
  <span class="hljs-attr">redis_data:</span>
</code></pre>
<p>Start the services:</p>
<pre><code class="lang-bash">docker-compose up -d
</code></pre>
<p>And verify that both services are running:</p>
<pre><code class="lang-bash">docker-compose ps
</code></pre>
<h2 id="heading-setting-up-the-database">Setting Up the Database</h2>
<p>Now we'll configure the database connection and create the necessary tables. We'll use a connection pool for efficient database access and create migration files to set up our schema. This approach ensures our database structure is version-controlled and can be easily reproduced.</p>
<p>Create the <code>src/config/database.ts</code> file to configure the PostgreSQL connection:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Pool, PoolConfig } <span class="hljs-keyword">from</span> <span class="hljs-string">'pg'</span>;
<span class="hljs-keyword">import</span> dotenv <span class="hljs-keyword">from</span> <span class="hljs-string">'dotenv'</span>;

dotenv.config();

<span class="hljs-keyword">const</span> dbName = (process.env.DB_NAME || <span class="hljs-string">'payroll_db'</span>).trim();
<span class="hljs-keyword">if</span> (!dbName) {
  <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Database name (DB_NAME) must be set and cannot be empty'</span>);
}

<span class="hljs-keyword">const</span> config: PoolConfig = {
  host: process.env.DB_HOST || <span class="hljs-string">'localhost'</span>,
  port: <span class="hljs-built_in">parseInt</span>(process.env.DB_PORT || <span class="hljs-string">'5433'</span>, <span class="hljs-number">10</span>),
  database: dbName,
  user: process.env.DB_USER || <span class="hljs-string">'payroll_user'</span>,
  password: process.env.DB_PASSWORD || <span class="hljs-string">'payroll_password'</span>,
  max: <span class="hljs-number">20</span>,
  idleTimeoutMillis: <span class="hljs-number">30000</span>,
  connectionTimeoutMillis: <span class="hljs-number">2000</span>,
};

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> pool = <span class="hljs-keyword">new</span> Pool(config);

pool.on(<span class="hljs-string">'error'</span>, <span class="hljs-function">(<span class="hljs-params">err: <span class="hljs-built_in">Error</span></span>) =&gt;</span> {
  <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Unexpected error on idle client'</span>, err);
  process.exit(<span class="hljs-number">-1</span>);
});

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> query = <span class="hljs-keyword">async</span> (text: <span class="hljs-built_in">string</span>, params?: <span class="hljs-built_in">any</span>[]) =&gt; {
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> res = <span class="hljs-keyword">await</span> pool.query(text, params);
    <span class="hljs-keyword">return</span> res;
  } <span class="hljs-keyword">catch</span> (error) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Database query error'</span>, error);
    <span class="hljs-keyword">throw</span> error;
  }
};
</code></pre>
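<p>The pool also matters once several statements have to succeed or fail together, for example when a payroll and its items are written in one go. Here is a minimal sketch of a transaction wrapper, assuming the usual <code>BEGIN</code>/<code>COMMIT</code>/<code>ROLLBACK</code> pattern on a single checked-out client. <code>withTransaction</code>, <code>TxClient</code>, and <code>TxPool</code> are hypothetical names that are not part of the project. The two interfaces only mirror the <code>connect</code>, <code>query</code>, and <code>release</code> methods that <code>pg</code> exposes, so the sketch can run without a database.</p>
<pre><code class="lang-typescript">// Minimal structural types (assumption: modeled on pg's Pool and PoolClient).
interface TxClient {
  query(text: string, params?: unknown[]): Promise&lt;unknown&gt;;
  release(): void;
}

interface TxPool {
  connect(): Promise&lt;TxClient&gt;;
}

// Hypothetical helper: runs `work` between BEGIN and COMMIT, rolling back on any error.
async function withTransaction&lt;T&gt;(
  pool: TxPool,
  work: (client: TxClient) =&gt; Promise&lt;T&gt;
): Promise&lt;T&gt; {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    const result = await work(client);
    await client.query('COMMIT');
    return result;
  } catch (error) {
    await client.query('ROLLBACK');
    throw error;
  } finally {
    // Always hand the client back to the pool, even when the transaction fails.
    client.release();
  }
}
</code></pre>
<p>A caller would pass the exported <code>pool</code> and issue its statements through the client it receives. Statements sent through <code>pool.query</code> may run on different connections, so they would not share the transaction.</p>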
<p>Now create the migration files. First, create a <code>migrations</code> folder:</p>
<pre><code class="lang-bash">mkdir migrations
</code></pre>
<p>Then create <code>migrations/001_create_employees_table.sql</code>:</p>
<pre><code class="lang-sql"><span class="hljs-comment">-- Create employees table</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> employees (
  <span class="hljs-keyword">id</span> <span class="hljs-built_in">SERIAL</span> PRIMARY <span class="hljs-keyword">KEY</span>,
  <span class="hljs-keyword">name</span> <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">255</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
  email <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">255</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span> <span class="hljs-keyword">UNIQUE</span>,
  employee_id <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">100</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span> <span class="hljs-keyword">UNIQUE</span>,
  salary <span class="hljs-built_in">DECIMAL</span>(<span class="hljs-number">15</span>, <span class="hljs-number">2</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
  account_number <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">50</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
  bank_code <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">20</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
  bank_name <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">255</span>),
  is_active <span class="hljs-built_in">BOOLEAN</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-literal">true</span>,
  created_at <span class="hljs-built_in">TIMESTAMP</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-keyword">CURRENT_TIMESTAMP</span>,
  updated_at <span class="hljs-built_in">TIMESTAMP</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-keyword">CURRENT_TIMESTAMP</span>
);

<span class="hljs-comment">-- Create indexes for faster lookups</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> idx_employees_employee_id <span class="hljs-keyword">ON</span> employees(employee_id);
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> idx_employees_is_active <span class="hljs-keyword">ON</span> employees(is_active);
</code></pre>
<p>Now, create <code>migrations/002_create_payrolls_table.sql</code>:</p>
<pre><code class="lang-sql"><span class="hljs-comment">-- Create payrolls table</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> payrolls (
  <span class="hljs-keyword">id</span> <span class="hljs-built_in">SERIAL</span> PRIMARY <span class="hljs-keyword">KEY</span>,
  payroll_period <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">100</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
  total_amount <span class="hljs-built_in">DECIMAL</span>(<span class="hljs-number">15</span>, <span class="hljs-number">2</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
  total_employees <span class="hljs-built_in">INTEGER</span> <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
  <span class="hljs-keyword">status</span> <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">50</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-string">'pending'</span>,
  processed_count <span class="hljs-built_in">INTEGER</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-number">0</span>,
  failed_count <span class="hljs-built_in">INTEGER</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-number">0</span>,
  created_at <span class="hljs-built_in">TIMESTAMP</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-keyword">CURRENT_TIMESTAMP</span>,
  updated_at <span class="hljs-built_in">TIMESTAMP</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-keyword">CURRENT_TIMESTAMP</span>,
  processed_at <span class="hljs-built_in">TIMESTAMP</span>
);

<span class="hljs-comment">-- Create indexes for faster queries</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> idx_payrolls_status <span class="hljs-keyword">ON</span> payrolls(<span class="hljs-keyword">status</span>);
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> idx_payrolls_period <span class="hljs-keyword">ON</span> payrolls(payroll_period);
</code></pre>
<p>And next, create <code>migrations/003_create_payroll_items_table.sql</code>:</p>
<pre><code class="lang-sql"><span class="hljs-comment">-- Create payroll_items table</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">TABLE</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> payroll_items (
  <span class="hljs-keyword">id</span> <span class="hljs-built_in">SERIAL</span> PRIMARY <span class="hljs-keyword">KEY</span>,
  payroll_id <span class="hljs-built_in">INTEGER</span> <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span> <span class="hljs-keyword">REFERENCES</span> payrolls(<span class="hljs-keyword">id</span>) <span class="hljs-keyword">ON</span> <span class="hljs-keyword">DELETE</span> <span class="hljs-keyword">CASCADE</span>,
  employee_id <span class="hljs-built_in">INTEGER</span> <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span> <span class="hljs-keyword">REFERENCES</span> employees(<span class="hljs-keyword">id</span>) <span class="hljs-keyword">ON</span> <span class="hljs-keyword">DELETE</span> <span class="hljs-keyword">CASCADE</span>,
  amount <span class="hljs-built_in">DECIMAL</span>(<span class="hljs-number">15</span>, <span class="hljs-number">2</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span>,
  <span class="hljs-keyword">status</span> <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">50</span>) <span class="hljs-keyword">NOT</span> <span class="hljs-literal">NULL</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-string">'pending'</span>,
  transaction_reference <span class="hljs-built_in">VARCHAR</span>(<span class="hljs-number">255</span>),
  error_message <span class="hljs-built_in">TEXT</span>,
  processed_at <span class="hljs-built_in">TIMESTAMP</span>,
  created_at <span class="hljs-built_in">TIMESTAMP</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-keyword">CURRENT_TIMESTAMP</span>,
  updated_at <span class="hljs-built_in">TIMESTAMP</span> <span class="hljs-keyword">DEFAULT</span> <span class="hljs-keyword">CURRENT_TIMESTAMP</span>
);

<span class="hljs-comment">-- Create indexes for faster queries</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> idx_payroll_items_payroll_id <span class="hljs-keyword">ON</span> payroll_items(payroll_id);
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> idx_payroll_items_employee_id <span class="hljs-keyword">ON</span> payroll_items(employee_id);
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> idx_payroll_items_status <span class="hljs-keyword">ON</span> payroll_items(<span class="hljs-keyword">status</span>);
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">INDEX</span> <span class="hljs-keyword">IF</span> <span class="hljs-keyword">NOT</span> <span class="hljs-keyword">EXISTS</span> idx_payroll_items_transaction_ref <span class="hljs-keyword">ON</span> payroll_items(transaction_reference);
</code></pre>
<p>Then create a migration runner script at <code>scripts/run-migrations.ts</code>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> fs <span class="hljs-keyword">from</span> <span class="hljs-string">'fs'</span>;
<span class="hljs-keyword">import</span> path <span class="hljs-keyword">from</span> <span class="hljs-string">'path'</span>;
<span class="hljs-keyword">import</span> { pool } <span class="hljs-keyword">from</span> <span class="hljs-string">'../src/config/database'</span>;

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">runMigrations</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">const</span> migrationsDir = path.join(__dirname, <span class="hljs-string">'../migrations'</span>);
  <span class="hljs-keyword">const</span> files = fs.readdirSync(migrationsDir).sort();

  <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> file <span class="hljs-keyword">of</span> files) {
    <span class="hljs-keyword">if</span> (file.endsWith(<span class="hljs-string">'.sql'</span>)) {
      <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`Running migration: <span class="hljs-subst">${file}</span>`</span>);
      <span class="hljs-keyword">const</span> sql = fs.readFileSync(path.join(migrationsDir, file), <span class="hljs-string">'utf-8'</span>);
      <span class="hljs-keyword">await</span> pool.query(sql);
      <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`Completed: <span class="hljs-subst">${file}</span>`</span>);
    }
  }

  <span class="hljs-built_in">console</span>.log(<span class="hljs-string">'All migrations completed'</span>);
  <span class="hljs-keyword">await</span> pool.end();
}

runMigrations().catch(<span class="hljs-function">(<span class="hljs-params">err</span>) =&gt;</span> {
  <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Migration failed:'</span>, err);
  process.exit(<span class="hljs-number">1</span>);
});
</code></pre>
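<p>The runner relies on <code>readdirSync(...).sort()</code>, which orders file names as plain strings. That is why the migration files carry zero-padded prefixes such as <code>001_</code>. A small sketch of that selection step follows, with <code>selectMigrations</code> as a hypothetical name that is not part of the project:</p>
<pre><code class="lang-typescript">// Hypothetical helper mirroring the runner's filter-and-sort step.
function selectMigrations(files: string[]): string[] {
  return files.filter((file) =&gt; file.endsWith('.sql')).sort();
}

// Zero-padded prefixes keep string order equal to numeric order.
console.log(selectMigrations(['010_c.sql', '002_b.sql', 'notes.md', '001_a.sql']));
// [ '001_a.sql', '002_b.sql', '010_c.sql' ]

// Without padding, '10_c.sql' sorts before '2_b.sql'.
console.log(selectMigrations(['2_b.sql', '10_c.sql']));
// [ '10_c.sql', '2_b.sql' ]
</code></pre>
<p>Because the runner keeps no record of which files it has already applied, every migration runs on each invocation. The <code>IF NOT EXISTS</code> guards in the SQL files are what make those reruns safe.</p>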
<p>Run the migrations:</p>
<pre><code class="lang-bash">npm run migrate
</code></pre>
<h2 id="heading-creating-database-models">Creating Database Models</h2>
<p>In this section, we'll create the data access layer for our payroll system. Models encapsulate all database operations, providing a clean interface for the rest of the application. We'll build two main models: one for managing employees and another for handling payrolls and payroll items.</p>
<p>For each model, I’ll first explain its purpose and key methods, then show you the complete code implementation. This approach helps you understand what each model does before you implement it.</p>
<h3 id="heading-employee-model">Employee Model</h3>
<p>The <code>EmployeeModel</code> serves as the data-access layer for employee records. It handles creating, reading, updating, and soft-deleting employees. The model includes automatic employee ID generation (format: <code>EMP001</code>, <code>EMP002</code>, and so on) and ensures that each employee has the banking details required for payroll disbursement.</p>
<p>Start by creating a new file at <code>src/models/employee.ts</code> where we’ll implement all employee-related database logic.</p>
<p>After creating the file, import a shared database query helper to execute parameterized SQL safely.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { query } <span class="hljs-keyword">from</span> <span class="hljs-string">'../config/database'</span>;
</code></pre>
<p>This keeps raw SQL out of the controllers. Because every statement is parameterized, user input is never concatenated into the query text, which guards against SQL injection.</p>
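<p>As a concrete illustration of parameterized SQL, here is a small sketch of a lookup by email. The <code>buildFindByEmail</code> function and the <code>ParameterizedQuery</code> type are hypothetical and not part of the model. They only show that user input travels in the values array rather than in the SQL text.</p>
<pre><code class="lang-typescript">interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Hypothetical builder: the SQL text is constant and the input goes into `values`.
function buildFindByEmail(email: string): ParameterizedQuery {
  return {
    text: 'SELECT * FROM employees WHERE email = $1',
    values: [email],
  };
}

// Even a hostile input leaves the statement itself unchanged.
const hostileInput = "x' OR '1'='1";
const lookup = buildFindByEmail(hostileInput);
console.log(lookup.text);
console.log(lookup.values);
</code></pre>
<p>The result would be passed to the shared helper as <code>query(lookup.text, lookup.values)</code>, so PostgreSQL receives the value separately from the statement.</p>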
<h3 id="heading-employee-data-structure-employee-interface">Employee Data Structure (<code>Employee</code> Interface)</h3>
<p>Next, we’ll define the employee interface.</p>
<p>The <code>Employee</code> interface represents a row in the <code>employees</code> database table and captures both operational and audit fields. It includes identifying fields (<code>id</code>, <code>employee_id</code>), personal fields (<code>name</code>, <code>email</code>), payroll fields (<code>salary</code>), banking details (<code>account_number</code>, <code>bank_code</code>, <code>bank_name</code>), operational state (<code>is_active</code>), and timestamps (<code>created_at</code>, <code>updated_at</code>). The <code>is_active</code> flag is used to support soft deletion and employee deactivation without permanently removing historical payroll relationships.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">interface</span> Employee {
  id: <span class="hljs-built_in">number</span>;
  name: <span class="hljs-built_in">string</span>;
  email: <span class="hljs-built_in">string</span>;
  employee_id: <span class="hljs-built_in">string</span>;
  salary: <span class="hljs-built_in">number</span>;
  account_number: <span class="hljs-built_in">string</span>;
  bank_code: <span class="hljs-built_in">string</span>;
  bank_name: <span class="hljs-built_in">string</span>;
  is_active: <span class="hljs-built_in">boolean</span>;
  created_at: <span class="hljs-built_in">Date</span>;
  updated_at: <span class="hljs-built_in">Date</span>;
}
</code></pre>
<p>Now, we’ll define the <code>CreateEmployeeInput</code> interface, which represents the expected payload for creating an employee. It includes required fields such as name, email, salary, and bank details.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">interface</span> CreateEmployeeInput {
  name: <span class="hljs-built_in">string</span>;
  email: <span class="hljs-built_in">string</span>;
  employee_id?: <span class="hljs-built_in">string</span>;
  salary: <span class="hljs-built_in">number</span>;
  account_number: <span class="hljs-built_in">string</span>;
  bank_code: <span class="hljs-built_in">string</span>;
  bank_name: <span class="hljs-built_in">string</span>;
}
</code></pre>
<p>The <code>employee_id</code> field is optional, allowing the system to auto-generate a unique identifier if one is not provided. This flexibility supports both automated workflows and manual HR data imports.</p>
<h3 id="heading-employee-model-class">Employee Model Class</h3>
<p>Next, we’ll define the <code>EmployeeModel</code> class.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> EmployeeModel {
  <span class="hljs-comment">// Class methods will go here</span>
}
</code></pre>
<p>This class encapsulates all database operations related to employee records. It centralizes logic for creating, retrieving, updating, and deleting employees, as well as generating unique sequential employee IDs.</p>
<h3 id="heading-auto-generating-employee-ids-generateemployeeid">Auto-Generating Employee IDs (<code>generateEmployeeId</code>)</h3>
<p>We start by creating the <code>generateEmployeeId</code> method inside the <code>EmployeeModel</code> class.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">private</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> generateEmployeeId(): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">string</span>&gt; {
    <span class="hljs-comment">// Get the highest existing employee_id number that matches EMP### pattern</span>
    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
      <span class="hljs-string">`SELECT employee_id FROM employees
       WHERE employee_id LIKE 'EMP%'
       AND LENGTH(employee_id) &gt;= 4
       AND SUBSTRING(employee_id FROM 4) ~ '^[0-9]+$'
       ORDER BY CAST(SUBSTRING(employee_id FROM 4) AS INTEGER) DESC
       LIMIT 1`</span>
    );

    <span class="hljs-keyword">if</span> (result.rows.length === <span class="hljs-number">0</span>) {
      <span class="hljs-keyword">return</span> <span class="hljs-string">'EMP001'</span>;
    }

    <span class="hljs-keyword">const</span> lastId = result.rows[<span class="hljs-number">0</span>].employee_id;
    <span class="hljs-keyword">const</span> numberPart = lastId.substring(<span class="hljs-number">3</span>);
    <span class="hljs-keyword">const</span> lastNumber = <span class="hljs-built_in">parseInt</span>(numberPart, <span class="hljs-number">10</span>);

    <span class="hljs-keyword">if</span> (<span class="hljs-built_in">isNaN</span>(lastNumber)) {
      <span class="hljs-keyword">return</span> <span class="hljs-string">'EMP001'</span>;
    }

    <span class="hljs-keyword">const</span> nextNumber = lastNumber + <span class="hljs-number">1</span>;
    <span class="hljs-comment">// Format as EMP001, EMP002, etc. (3 digits minimum)</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">`EMP<span class="hljs-subst">${nextNumber.toString().padStart(<span class="hljs-number">3</span>, <span class="hljs-string">'0'</span>)}</span>`</span>;
  }
</code></pre>
<p>The private <code>generateEmployeeId</code> method generates a unique employee identifier in a readable sequential format such as <code>EMP001</code>, <code>EMP002</code>, and so on. It queries the database for the highest existing employee ID that matches the expected pattern (<code>EMP</code> prefix followed by numeric digits), orders by the numeric suffix in descending order, and increments the latest number to produce the next ID.</p>
<p>If no matching record exists, it starts from <code>EMP001</code>. The method also protects against malformed data by returning <code>EMP001</code> if parsing fails.</p>
<p>Finally, it ensures formatting consistency by padding the number portion to at least three digits using <code>padStart(3, '0')</code>, which keeps IDs aligned and easy to sort visually.</p>
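<p>To make the formatting rule concrete, here is a minimal sketch that isolates the increment-and-pad step as a pure function. The name <code>nextEmployeeId</code> is hypothetical and not part of the model. It only mirrors the logic above so the behavior can be checked without a database.</p>
<pre><code class="lang-typescript">// Hypothetical helper: the increment-and-format step of generateEmployeeId,
// extracted as a pure function. lastId is the highest existing ID, or null
// when the table has no matching rows.
function nextEmployeeId(lastId: string | null): string {
  // No previous ID: start the sequence.
  if (lastId === null) {
    return 'EMP001';
  }

  // Drop the 'EMP' prefix and parse the numeric suffix.
  const lastNumber = parseInt(lastId.substring(3), 10);
  if (isNaN(lastNumber)) {
    return 'EMP001';
  }

  // padStart only adds leading zeros; it never truncates longer numbers.
  return `EMP${(lastNumber + 1).toString().padStart(3, '0')}`;
}

console.log(nextEmployeeId(null));     // EMP001
console.log(nextEmployeeId('EMP009')); // EMP010
console.log(nextEmployeeId('EMP999')); // EMP1000
</code></pre>
<p>Note that the lookup and the later insert are separate statements, so two concurrent <code>create</code> calls could compute the same ID. A unique constraint on <code>employee_id</code> (an assumption about the schema, which this section does not show) would turn that race into an insert error rather than a silent duplicate.</p>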
<h3 id="heading-creating-an-employee-create">Creating an Employee (<code>create</code>)</h3>
<p>Next, we’ll define the <code>create</code> method, which inserts a new employee record into the database. If the caller does not supply an <code>employee_id</code>, the method generates one automatically using <code>generateEmployeeId</code>.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> create(data: CreateEmployeeInput): <span class="hljs-built_in">Promise</span>&lt;Employee&gt; {
    <span class="hljs-comment">// Auto-generate employee_id if not provided</span>
    <span class="hljs-keyword">let</span> employeeId = data.employee_id;
    <span class="hljs-keyword">if</span> (!employeeId) {
      employeeId = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.generateEmployeeId();
    }

    <span class="hljs-comment">// Check if employee_id already exists (if manually provided)</span>
    <span class="hljs-keyword">if</span> (data.employee_id) {
      <span class="hljs-keyword">const</span> existing = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.findByEmployeeId(data.employee_id);
      <span class="hljs-keyword">if</span> (existing) {
        <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Employee ID already exists'</span>);
      }
    }

    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
      <span class="hljs-string">`INSERT INTO employees (name, email, employee_id, salary, account_number, bank_code, bank_name)
       VALUES ($1, $2, $3, $4, $5, $6, $7)
       RETURNING *`</span>,
      [
        data.name,
        data.email,
        employeeId,
        data.salary,
        data.account_number,
        data.bank_code,
        data.bank_name,
      ]
    );
    <span class="hljs-keyword">return</span> result.rows[<span class="hljs-number">0</span>];
  }
</code></pre>
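<p>Here’s what’s happening in the code:</p>
<p>The method first resolves the employee ID. If the caller did not supply one, it generates the next sequential ID. If an <code>employee_id</code> is supplied manually, the method looks it up with <code>findByEmployeeId</code> and throws an error when a match is found. Because <code>findByEmployeeId</code> only considers active employees, an ID that belongs to a deactivated employee passes this check, so a unique constraint on the column (if the schema defines one) remains the final safeguard. After validation, the employee is inserted into the <code>employees</code> table and the new record is returned. Since <code>CreateEmployeeInput</code> requires the bank fields, every employee created through this method carries the banking details needed for payroll disbursement.</p>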
<h3 id="heading-retrieving-all-active-employees-findall">Retrieving All Active Employees (<code>findAll</code>)</h3>
<p>The <code>findAll</code> method fetches all active employees from the database.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> findAll(): <span class="hljs-built_in">Promise</span>&lt;Employee[]&gt; {
  <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
    <span class="hljs-string">'SELECT * FROM employees WHERE is_active = true ORDER BY created_at DESC'</span>
  );
  <span class="hljs-keyword">return</span> result.rows;
}
</code></pre>
<p>The <code>findAll</code> method returns all active employees (<code>is_active = true</code>) ordered by most recent creation date. This behavior supports common UI patterns such as HR dashboards and payroll selection screens, where only active employees should be visible by default.</p>
<h3 id="heading-retrieving-an-employee-by-database-id-findbyid">Retrieving an Employee by Database ID (<code>findById</code>)</h3>
<p>The <code>findById</code> method retrieves a single employee by the internal numeric primary key (<code>id</code>).</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> findById(id: <span class="hljs-built_in">number</span>): <span class="hljs-built_in">Promise</span>&lt;Employee | <span class="hljs-literal">null</span>&gt; {
  <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(<span class="hljs-string">'SELECT * FROM employees WHERE id = $1'</span>, [id]);
  <span class="hljs-keyword">return</span> result.rows[<span class="hljs-number">0</span>] || <span class="hljs-literal">null</span>;
}
</code></pre>
<p>If the employee does not exist, it returns <code>null</code>. This is typically used for internal operations such as payroll processing, updates, or admin detail views.</p>
<h3 id="heading-retrieving-an-employee-by-employee-identifier-findbyemployeeid">Retrieving an Employee by Employee Identifier (<code>findByEmployeeId</code>)</h3>
<p>The <code>findByEmployeeId</code> method retrieves an active employee using the business-friendly <code>employee_id</code> (for example, <code>EMP014</code>).</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> findByEmployeeId(employeeId: <span class="hljs-built_in">string</span>): <span class="hljs-built_in">Promise</span>&lt;Employee | <span class="hljs-literal">null</span>&gt; {
    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
      <span class="hljs-string">'SELECT * FROM employees WHERE employee_id = $1 AND is_active = true'</span>,
      [employeeId]
    );
    <span class="hljs-keyword">return</span> result.rows[<span class="hljs-number">0</span>] || <span class="hljs-literal">null</span>;
}
</code></pre>
<p>The method filters by <code>is_active = true</code> to prevent selecting deactivated employees during operations like payroll runs or HR searches.</p>
<h3 id="heading-updating-an-employee-update">Updating an Employee (<code>update</code>)</h3>
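<p>The <code>update</code> method supports partial updates by dynamically building the SQL <code>SET</code> clause based on the fields present in the update payload. It iterates through the provided properties, includes only those with defined values, and passes each value as a query parameter, which protects the values against SQL injection. The column names, however, come straight from the payload keys and are interpolated into the SQL text, so callers should only pass objects whose keys are known, trusted column names.</p>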
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> update(
    id: <span class="hljs-built_in">number</span>,
    data: Partial&lt;CreateEmployeeInput&gt;
  ): <span class="hljs-built_in">Promise</span>&lt;Employee&gt; {
    <span class="hljs-keyword">const</span> fields: <span class="hljs-built_in">string</span>[] = [];
    <span class="hljs-keyword">const</span> values: <span class="hljs-built_in">any</span>[] = [];
    <span class="hljs-keyword">let</span> paramCount = <span class="hljs-number">1</span>;

    <span class="hljs-comment">// Build dynamic update query based on provided fields</span>
    <span class="hljs-built_in">Object</span>.entries(data).forEach(<span class="hljs-function">(<span class="hljs-params">[key, value]</span>) =&gt;</span> {
      <span class="hljs-keyword">if</span> (value !== <span class="hljs-literal">undefined</span>) {
        fields.push(<span class="hljs-string">`<span class="hljs-subst">${key}</span> = $<span class="hljs-subst">${paramCount}</span>`</span>);
        values.push(value);
        paramCount++;
      }
    });

    <span class="hljs-keyword">if</span> (fields.length === <span class="hljs-number">0</span>) {
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'No fields to update'</span>);
    }

    <span class="hljs-comment">// Always update the updated_at timestamp</span>
    fields.push(<span class="hljs-string">`updated_at = $<span class="hljs-subst">${paramCount}</span>`</span>);
    values.push(<span class="hljs-keyword">new</span> <span class="hljs-built_in">Date</span>());
    values.push(id);

    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
      <span class="hljs-string">`UPDATE employees SET <span class="hljs-subst">${fields.join(<span class="hljs-string">', '</span>)}</span> WHERE id = $<span class="hljs-subst">${
        paramCount + <span class="hljs-number">1</span>
      }</span> RETURNING *`</span>,
      values
    );
    <span class="hljs-keyword">return</span> result.rows[<span class="hljs-number">0</span>];
  }
</code></pre>
<p>Here’s what’s happening in the code:</p>
<p>If no fields are provided, it throws an error to avoid performing a meaningless update. It also explicitly updates the <code>updated_at</code> timestamp to ensure accurate audit tracking. Finally, it returns the updated database record, making it easy for controllers to respond with the latest employee state.</p>
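<p>TypeScript types are erased at runtime, so a request body cast to <code>Partial&lt;CreateEmployeeInput&gt;</code> can still carry unexpected keys. One possible hardening is to build the statement from an allow-list of column names. The sketch below is not the model’s implementation. <code>UPDATABLE_COLUMNS</code> and <code>buildEmployeeUpdate</code> are hypothetical names, and the sketch lets the database set <code>updated_at</code> with <code>NOW()</code> instead of passing a <code>Date</code> from the application.</p>
<pre><code class="lang-typescript">// Hypothetical allow-list of columns that callers may update.
const UPDATABLE_COLUMNS = [
  'name',
  'email',
  'employee_id',
  'salary',
  'account_number',
  'bank_code',
  'bank_name',
] as const;

interface UpdateQuery {
  text: string;
  values: unknown[];
}

// Builds the UPDATE statement without executing it.
function buildEmployeeUpdate(
  id: number,
  data: Record&lt;string, unknown&gt;
): UpdateQuery {
  const fields: string[] = [];
  const values: unknown[] = [];

  for (const column of UPDATABLE_COLUMNS) {
    const value = data[column];
    if (value !== undefined) {
      values.push(value);
      // Only names from the allow-list are ever interpolated into the SQL text.
      fields.push(`${column} = $${values.length}`);
    }
  }

  if (fields.length === 0) {
    throw new Error('No fields to update');
  }

  // The database clock sets updated_at, so it needs no placeholder.
  fields.push('updated_at = NOW()');
  values.push(id);

  return {
    text: `UPDATE employees SET ${fields.join(', ')} WHERE id = $${values.length} RETURNING *`,
    values,
  };
}

// The second key is not a known column, so it is ignored.
const q = buildEmployeeUpdate(7, { salary: 5000, 'is_active = false --': 1 });
console.log(q.text);
// UPDATE employees SET salary = $1, updated_at = NOW() WHERE id = $2 RETURNING *
console.log(q.values); // [ 5000, 7 ]
</code></pre>
<p>The resulting <code>text</code> and <code>values</code> could then be handed to the shared <code>query</code> helper in the same way the <code>update</code> method does.</p>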
<h3 id="heading-soft-deleting-an-employee-delete">Soft-Deleting an Employee (<code>delete</code>)</h3>
<p>Finally, instead of permanently removing the employee record, the <code>delete</code> method performs a soft delete by setting <code>is_active = false</code> and updating the <code>updated_at</code> timestamp.</p>
<p>This approach preserves historical payroll references and audit trails while excluding inactive employees from standard queries like <code>findAll</code>. It’s especially important in payroll systems where historical payment records must remain valid and traceable even after an employee leaves the organization.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> <span class="hljs-keyword">delete</span>(id: <span class="hljs-built_in">number</span>): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
  <span class="hljs-keyword">await</span> query(
    <span class="hljs-string">'UPDATE employees SET is_active = false, updated_at = NOW() WHERE id = $1'</span>,
    [id]
  );
}
</code></pre>
<p>Key features of the employee model:</p>
<ul>
<li><p>Auto-generates sequential employee IDs if not provided</p>
</li>
<li><p>Validates employee ID uniqueness</p>
</li>
<li><p>Supports soft deletion to preserve historical payroll records</p>
</li>
<li><p>Provides methods for finding employees by database ID or employee identifier</p>
</li>
</ul>
<h3 id="heading-payroll-model">Payroll Model</h3>
<p>The <code>PayrollModel</code> manages payroll batches and individual payroll items. A payroll represents a single payment cycle (for example, "December 2024"), while payroll items represent individual employee payments within that cycle. This separation allows us to track the status of each payment independently.</p>
<p>Key features:</p>
<ul>
<li><p>Creates payroll batches with automatic calculation of totals</p>
</li>
<li><p>Supports filtering employees for selective payroll runs</p>
</li>
<li><p>Tracks status at both batch and item levels</p>
</li>
<li><p>Provides methods for reconciliation and status updates</p>
</li>
</ul>
<p>Let's implement the Payroll Model.</p>
<p>We’ll begin by creating a new file at <code>src/models/payroll.ts</code>, where we’ll implement the payroll models that encapsulate payroll batch creation, employee payment tracking, and payroll status management.</p>
<p>First, import a shared database query helper to execute parameterized SQL safely.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { query } <span class="hljs-keyword">from</span> <span class="hljs-string">'../config/database'</span>;
</code></pre>
<p>As in the employee model, this keeps raw SQL out of the controllers and lets values travel as query parameters, which helps protect against SQL injection.</p>
<h3 id="heading-payroll-status-lifecycle">Payroll Status Lifecycle</h3>
<p>Next, we’ll define the <code>PayrollStatus</code> enum.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-built_in">enum</span> PayrollStatus {
 PENDING = <span class="hljs-string">'pending'</span>,
 PROCESSING = <span class="hljs-string">'processing'</span>,
 COMPLETED = <span class="hljs-string">'completed'</span>,
 FAILED = <span class="hljs-string">'failed'</span>,
 PARTIALLY_COMPLETED = <span class="hljs-string">'partially_completed'</span>,
}
</code></pre>
<p>The <code>PayrollStatus</code> enum defines all possible states for both payroll batches and individual payroll items:</p>
<ul>
<li><p><strong>PENDING</strong> – Created but not yet processed</p>
</li>
<li><p><strong>PROCESSING</strong> – Currently being processed by background workers</p>
</li>
<li><p><strong>COMPLETED</strong> – Successfully processed</p>
</li>
<li><p><strong>FAILED</strong> – Processing failed</p>
</li>
<li><p><strong>PARTIALLY_COMPLETED</strong> – Some items succeeded while others failed</p>
</li>
</ul>
<h3 id="heading-payroll-entity">Payroll Entity</h3>
<p>With the payroll status lifecycle defined, we can now define the payroll entity.</p>
<p>The <code>Payroll</code> interface represents a single payroll run, such as a monthly salary payout. It stores aggregate and audit information including the payroll period, total salary amount, total number of employees, processing status, counts of successful and failed payments, and timestamps for creation, updates, and completion.</p>
<p>Add the following interface to <code>src/models/payroll.ts</code>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">interface</span> Payroll {
 id: <span class="hljs-built_in">number</span>;
 payroll_period: <span class="hljs-built_in">string</span>;
 total_amount: <span class="hljs-built_in">number</span>;
 total_employees: <span class="hljs-built_in">number</span>;
 status: PayrollStatus;
 processed_count: <span class="hljs-built_in">number</span>;
 failed_count: <span class="hljs-built_in">number</span>;
 created_at: <span class="hljs-built_in">Date</span>;
 updated_at: <span class="hljs-built_in">Date</span>;
 processed_at?: <span class="hljs-built_in">Date</span>;
}
</code></pre>
<p>This entity acts as the parent record for all employee payments within a payroll cycle and is used to track overall payroll progress and outcomes.</p>
<h3 id="heading-payroll-item-entity">Payroll Item Entity</h3>
<p>Next, we’ll define the payroll item entity, which represents an individual employee payment within a payroll.</p>
<p>The <code>PayrollItem</code> tracks the employee being paid, the payment amount, its processing status, any transaction reference returned by the payment provider, error messages in case of failure, and relevant timestamps.</p>
<p>Add the following interface just below the <code>Payroll</code> interface:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">interface</span> PayrollItem {
  id: <span class="hljs-built_in">number</span>;
  payroll_id: <span class="hljs-built_in">number</span>;
  employee_id: <span class="hljs-built_in">number</span>;
  amount: <span class="hljs-built_in">number</span>;
  status: PayrollStatus;
  transaction_reference?: <span class="hljs-built_in">string</span>;
  error_message?: <span class="hljs-built_in">string</span>;
  processed_at?: <span class="hljs-built_in">Date</span>;
  created_at: <span class="hljs-built_in">Date</span>;
  updated_at: <span class="hljs-built_in">Date</span>;
}
</code></pre>
<p>This structure allows individual employee payments to be retried, audited, or reconciled independently without affecting the rest of the payroll batch.</p>
<h3 id="heading-creating-a-payroll-payrollmodelcreate">Creating a Payroll (<code>PayrollModel.create</code>)</h3>
<p>Now that we’ve defined the <code>Payroll</code> and <code>PayrollItem</code> entities, we can move on to creating a payroll batch.</p>
<p>To keep our business logic organized, we’ll introduce a <code>PayrollModel</code> class. This class will be responsible for creating payroll records, calculating aggregates, and generating individual payroll items for each employee.</p>
<p>Before writing the model itself, let’s define the input required to create a payroll.</p>
<p>Add the following interface below the <code>PayrollItem</code> interface:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">interface</span> CreatePayrollInput {
  payroll_period: <span class="hljs-built_in">string</span>;
  employee_ids?: <span class="hljs-built_in">number</span>[];
}
</code></pre>
<ul>
<li><p><code>payroll_period</code> identifies the payroll run (for example, <code>2025-01</code>)</p>
</li>
<li><p><code>employee_ids</code> is optional and allows us to run payroll for a subset of employees, enabling selective payouts or retries</p>
</li>
</ul>
<p>Next, create the <code>PayrollModel</code> class. This class will encapsulate all payroll-related database operations.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> PayrollModel {
<span class="hljs-comment">// Payroll model class methods will go here</span>
}
</code></pre>
<p>We’ll start by implementing the <code>create</code> method, which is responsible for creating a new payroll batch.</p>
<p>The method performs the following steps:</p>
<ol>
<li><p>Optionally filters employees if specific employee IDs are provided</p>
</li>
<li><p>Calculates aggregate payroll statistics from the employees table</p>
</li>
<li><p>Creates a payroll record with a <code>PENDING</code> status</p>
</li>
<li><p>Creates a payroll item for each eligible employee</p>
</li>
</ol>
<p>Here’s the implementation:</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> create(data: CreatePayrollInput): <span class="hljs-built_in">Promise</span>&lt;Payroll&gt; {
    <span class="hljs-keyword">let</span> employeeFilter = <span class="hljs-string">''</span>;
    <span class="hljs-keyword">let</span> queryParams: <span class="hljs-built_in">any</span>[] = [];

    <span class="hljs-comment">// Build filter for selective employee payrolls</span>
    <span class="hljs-keyword">if</span> (data.employee_ids &amp;&amp; data.employee_ids.length &gt; <span class="hljs-number">0</span>) {
      employeeFilter = <span class="hljs-string">`AND id = ANY($1::int[])`</span>;
      queryParams = [data.employee_ids];
    }

    <span class="hljs-comment">// Calculate aggregate statistics from employees table</span>
    <span class="hljs-keyword">const</span> employeeStats = <span class="hljs-keyword">await</span> query(
      <span class="hljs-string">`SELECT COUNT(*) as count, COALESCE(SUM(salary), 0) as total
       FROM employees
       WHERE is_active = true <span class="hljs-subst">${employeeFilter}</span>`</span>,
      queryParams
    );

    <span class="hljs-keyword">const</span> totalEmployees = <span class="hljs-built_in">parseInt</span>(employeeStats.rows[<span class="hljs-number">0</span>].count, <span class="hljs-number">10</span>);
    <span class="hljs-keyword">const</span> totalAmount = <span class="hljs-built_in">parseFloat</span>(employeeStats.rows[<span class="hljs-number">0</span>].total);

    <span class="hljs-comment">// Create the payroll record</span>
    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
      <span class="hljs-string">`INSERT INTO payrolls (payroll_period, total_amount, total_employees, status)
       VALUES ($1, $2, $3, $4)
       RETURNING *`</span>,
      [data.payroll_period, totalAmount, totalEmployees, PayrollStatus.PENDING]
    );

    <span class="hljs-keyword">const</span> payroll = result.rows[<span class="hljs-number">0</span>];

    <span class="hljs-comment">// Create payroll items for each employee</span>
    <span class="hljs-comment">// Each item starts with PENDING status and will be processed asynchronously</span>
    <span class="hljs-keyword">const</span> employees = <span class="hljs-keyword">await</span> query(
      <span class="hljs-string">`SELECT id, salary FROM employees WHERE is_active = true <span class="hljs-subst">${employeeFilter}</span>`</span>,
      queryParams
    );

    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> employee <span class="hljs-keyword">of</span> employees.rows) {
      <span class="hljs-keyword">await</span> query(
        <span class="hljs-string">`INSERT INTO payroll_items (payroll_id, employee_id, amount, status)
         VALUES ($1, $2, $3, $4)`</span>,
        [payroll.id, employee.id, employee.salary, PayrollStatus.PENDING]
      );
    }

    <span class="hljs-keyword">return</span> payroll;
  }
</code></pre>
<p>The payroll creation process begins by determining which employees should be included. If specific employee IDs are provided, only those employees are selected – otherwise, all active employees are included. This allows the system to support both full payroll runs and selective payouts.</p>
<p>Next, the system calculates aggregate payroll statistics directly from the employees table by counting eligible employees and summing their salaries. These values are stored in a new payroll record created with a <code>PENDING</code> status.</p>
<p>Finally, a payroll item is generated for each eligible employee, with each item also initialized in a <code>PENDING</code> state. This design separates payroll setup from payment execution, allowing employee payments to be processed asynchronously and in parallel in later stages of the system.</p>
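<p>The loop above issues one <code>INSERT</code> per employee, which is easy to follow but costs one database round trip per row. As an alternative sketch (not the approach used in this tutorial), the items could be written with a single multi-row <code>INSERT</code>. The helper name <code>buildPayrollItemsInsert</code> and the <code>EmployeeRow</code> type are hypothetical and only illustrate how the placeholders would be numbered.</p>
<pre><code class="lang-typescript">interface EmployeeRow {
  id: number;
  salary: number;
}

// Hypothetical helper: builds one multi-row INSERT for all payroll items
// instead of running one query per employee.
function buildPayrollItemsInsert(
  payrollId: number,
  employees: EmployeeRow[],
  status = 'pending'
): { text: string; values: unknown[] } {
  if (employees.length === 0) {
    throw new Error('No employees to insert');
  }

  const values: unknown[] = [];
  const tuples = employees.map((employee) =&gt; {
    values.push(payrollId, employee.id, employee.salary, status);
    // Index of the first placeholder belonging to this row.
    const base = values.length - 3;
    return `($${base}, $${base + 1}, $${base + 2}, $${base + 3})`;
  });

  return {
    text: `INSERT INTO payroll_items (payroll_id, employee_id, amount, status) VALUES ${tuples.join(', ')}`,
    values,
  };
}

const insert = buildPayrollItemsInsert(1, [
  { id: 10, salary: 1000 },
  { id: 11, salary: 2000 },
]);
console.log(insert.text);
// INSERT INTO payroll_items (payroll_id, employee_id, amount, status) VALUES ($1, $2, $3, $4), ($5, $6, $7, $8)
</code></pre>
<p>Each row consumes four bind parameters, so very large payrolls would need to be split into chunks. The per-row loop stays the simpler choice for small batches.</p>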
<h3 id="heading-fetching-payroll-records">Fetching Payroll Records</h3>
<p>After creating payrolls, we often need to retrieve them for administrative dashboards, reporting, and audit trails.</p>
<p>The <code>PayrollModel</code> provides two simple methods:</p>
<ol>
<li><p><code>findById</code> – Retrieves a single payroll by its unique identifier</p>
</li>
<li><p><code>findAll</code> – Retrieves all payroll records, ordered by creation date (newest first)</p>
</li>
</ol>
<p>These methods should be added <strong>below</strong> the <code>create</code> method in the <code>PayrollModel</code> class:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> findById(id: <span class="hljs-built_in">number</span>): <span class="hljs-built_in">Promise</span>&lt;Payroll | <span class="hljs-literal">null</span>&gt; {
  <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(<span class="hljs-string">'SELECT * FROM payrolls WHERE id = $1'</span>, [id]);
  <span class="hljs-keyword">return</span> result.rows[<span class="hljs-number">0</span>] || <span class="hljs-literal">null</span>;
}

<span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> findAll(): <span class="hljs-built_in">Promise</span>&lt;Payroll[]&gt; {
  <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
    <span class="hljs-string">'SELECT * FROM payrolls ORDER BY created_at DESC'</span>
  );
  <span class="hljs-keyword">return</span> result.rows;
}
</code></pre>
<p>The <code>findById</code> method retrieves a single payroll by its identifier, while <code>findAll</code> returns all payroll records ordered by creation date.</p>
<h3 id="heading-updating-payroll-status-payrollmodelupdatestatus">Updating Payroll Status (<code>PayrollModel.updateStatus</code>)</h3>
<p>Once payroll processing begins, we need a way to track the overall status of a payroll batch. The <code>updateStatus</code> method updates the payroll record with:</p>
<ul>
<li><p>The current status (<code>PENDING</code>, <code>PROCESSING</code>, <code>COMPLETED</code>, and so on)</p>
</li>
<li><p>Optional counts of processed and failed payments</p>
</li>
<li><p>A <code>processed_at</code> timestamp automatically set for terminal states (<code>COMPLETED</code> or <code>PARTIALLY_COMPLETED</code>)</p>
</li>
</ul>
<p>Add the following method below the fetch methods in your <code>PayrollModel</code> class:</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> updateStatus(
    id: <span class="hljs-built_in">number</span>,
    status: PayrollStatus,
    processedCount?: <span class="hljs-built_in">number</span>,
    failedCount?: <span class="hljs-built_in">number</span>
  ): <span class="hljs-built_in">Promise</span>&lt;Payroll&gt; {
    <span class="hljs-keyword">const</span> updates: <span class="hljs-built_in">string</span>[] = [<span class="hljs-string">'status = $2'</span>, <span class="hljs-string">'updated_at = NOW()'</span>];
    <span class="hljs-keyword">const</span> values: <span class="hljs-built_in">any</span>[] = [id, status];

    <span class="hljs-comment">// Dynamically add processed_count if provided</span>
    <span class="hljs-keyword">if</span> (processedCount !== <span class="hljs-literal">undefined</span>) {
      updates.push(<span class="hljs-string">`processed_count = $<span class="hljs-subst">${values.length + <span class="hljs-number">1</span>}</span>`</span>);
      values.push(processedCount);
    }

    <span class="hljs-comment">// Dynamically add failed_count if provided</span>
    <span class="hljs-keyword">if</span> (failedCount !== <span class="hljs-literal">undefined</span>) {
      updates.push(<span class="hljs-string">`failed_count = $<span class="hljs-subst">${values.length + <span class="hljs-number">1</span>}</span>`</span>);
      values.push(failedCount);
    }

    <span class="hljs-comment">// Set processed_at timestamp for terminal states</span>
    <span class="hljs-keyword">if</span> (
      status === PayrollStatus.COMPLETED ||
      status === PayrollStatus.PARTIALLY_COMPLETED
    ) {
      updates.push(<span class="hljs-string">`processed_at = NOW()`</span>);
    }

    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
      <span class="hljs-string">`UPDATE payrolls SET <span class="hljs-subst">${updates.join(<span class="hljs-string">', '</span>)}</span> WHERE id = $1 RETURNING *`</span>,
      values
    );
    <span class="hljs-keyword">return</span> result.rows[<span class="hljs-number">0</span>];
  }
</code></pre>
<p>As payroll processing progresses, this method updates the overall payroll status along with optional counts of processed and failed payments. When a payroll reaches a terminal state such as <code>COMPLETED</code> or <code>PARTIALLY_COMPLETED</code>, the system automatically records a completion timestamp. This ensures accurate tracking of payroll execution and supports reconciliation workflows.</p>
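<p><code>updateStatus</code> stores whatever status it is given, so the caller has to decide which status applies. The sketch below shows one way that decision could be made from the item counts. <code>deriveBatchStatus</code> is a hypothetical helper, and it assumes that <code>processed_count</code> counts successful payments only. The enum is redeclared here purely so that the sketch is self-contained.</p>
<pre><code class="lang-typescript">// Mirrors the PayrollStatus enum defined earlier in this file.
enum PayrollStatus {
  PENDING = 'pending',
  PROCESSING = 'processing',
  COMPLETED = 'completed',
  FAILED = 'failed',
  PARTIALLY_COMPLETED = 'partially_completed',
}

// Hypothetical helper: picks a batch status from the item counts.
// Assumes processedCount counts successful payments only.
function deriveBatchStatus(
  totalEmployees: number,
  processedCount: number,
  failedCount: number
): PayrollStatus {
  const finished = processedCount + failedCount;

  // Nothing has been attempted yet.
  if (finished === 0) return PayrollStatus.PENDING;
  // Some items are still waiting for a result.
  if (finished &lt; totalEmployees) return PayrollStatus.PROCESSING;
  // Every item has a result: classify the outcome.
  if (failedCount === 0) return PayrollStatus.COMPLETED;
  if (processedCount === 0) return PayrollStatus.FAILED;
  return PayrollStatus.PARTIALLY_COMPLETED;
}

console.log(deriveBatchStatus(3, 1, 1)); // processing
console.log(deriveBatchStatus(3, 3, 0)); // completed
console.log(deriveBatchStatus(3, 2, 1)); // partially_completed
</code></pre>
<p>A worker could pass the result straight to <code>PayrollModel.updateStatus</code> together with the two counts once it has tallied the items in a batch.</p>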
<h2 id="heading-payrollitemmodel">PayrollItemModel</h2>
<p>After handling payroll batches with <code>PayrollModel</code>, we need a way to manage individual employee payments. This is where the <code>PayrollItemModel</code> comes in. It encapsulates the database operations for payroll items, including fetching items together with their employee details and updating item records.</p>
<p>Start by adding a new class <strong>below</strong> <code>PayrollModel</code>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> PayrollItemModel {
  <span class="hljs-comment">// Methods will go here</span>
}
</code></pre>
<h3 id="heading-fetching-payroll-items-payrollitemmodelfindbypayrollid">Fetching Payroll Items (<code>PayrollItemModel.findByPayrollId</code>)</h3>
<p>Often, we want to get all payroll items for a specific payroll batch, for example to display them on a dashboard or process them in a background worker.</p>
<p>The <code>findByPayrollId</code> method does exactly that. It retrieves all payroll items associated with a specific payroll and enriches them with employee details such as name, bank account number, and bank information through a database join.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> findByPayrollId(payrollId: <span class="hljs-built_in">number</span>): <span class="hljs-built_in">Promise</span>&lt;PayrollItem[]&gt; {
  <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
    <span class="hljs-string">`SELECT
       pi.id, pi.payroll_id, pi.employee_id, pi.amount, pi.status,
       pi.transaction_reference, pi.error_message, pi.processed_at,
       pi.created_at, pi.updated_at,
       e.name as employee_name, e.employee_id as employee_identifier,
       e.account_number, e.bank_code, e.bank_name
     FROM payroll_items pi
     JOIN employees e ON pi.employee_id = e.id
     WHERE pi.payroll_id = $1
     ORDER BY pi.created_at`</span>,
    [payrollId]
  );
  <span class="hljs-comment">// Normalize numeric fields from PostgreSQL (which returns them as strings)</span>
  <span class="hljs-keyword">return</span> result.rows.map(<span class="hljs-function">(<span class="hljs-params">row</span>) =&gt;</span> ({
    ...row,
    employee_id: <span class="hljs-built_in">parseInt</span>(row.employee_id, <span class="hljs-number">10</span>),
    id: <span class="hljs-built_in">parseInt</span>(row.id, <span class="hljs-number">10</span>),
    payroll_id: <span class="hljs-built_in">parseInt</span>(row.payroll_id, <span class="hljs-number">10</span>),
    amount: <span class="hljs-built_in">parseFloat</span>(row.amount),
  }));
}
</code></pre>
<p>Here’s what’s happening in the code:</p>
<ol>
<li><p>We use a JOIN with the <code>employees</code> table so each payroll item includes the employee’s name, account number, and bank information.</p>
</li>
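<li><p>The database driver can return numeric column types such as <code>NUMERIC</code> and <code>BIGINT</code> as strings (node-postgres does this by default to avoid precision loss), so we convert them to JavaScript numbers with <code>parseInt</code> and <code>parseFloat</code> for calculations and display.</p>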
</li>
<li><p>The results are ordered by creation date, which helps when rendering items in a UI or processing them sequentially.</p>
</li>
</ol>
<p>This method makes it easy to work with all items in a payroll batch while keeping the data enriched and consistent.</p>
<h3 id="heading-fetching-a-single-payroll-item-payrollitemmodelfindbyid">Fetching a Single Payroll Item (<code>PayrollItemModel.findById</code>)</h3>
<p>Sometimes, you need to look at one specific employee’s payment (for example, to retry a failed transaction or investigate an issue). The <code>findById</code> method fetches a single payroll item along with the employee’s details, so you have everything you need in one place.</p>
<p>Here’s how we implement it:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> findById(id: <span class="hljs-built_in">number</span>): <span class="hljs-built_in">Promise</span>&lt;PayrollItem | <span class="hljs-literal">null</span>&gt; {
  <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
    <span class="hljs-string">`SELECT
       pi.id, pi.payroll_id, pi.employee_id, pi.amount, pi.status,
       pi.transaction_reference, pi.error_message, pi.processed_at,
       pi.created_at, pi.updated_at,
       e.name as employee_name, e.employee_id as employee_identifier,
       e.account_number, e.bank_code, e.bank_name
     FROM payroll_items pi
     JOIN employees e ON pi.employee_id = e.id
     WHERE pi.id = $1`</span>,
    [id]
  );

  <span class="hljs-keyword">if</span> (result.rows.length === <span class="hljs-number">0</span>) <span class="hljs-keyword">return</span> <span class="hljs-literal">null</span>;

  <span class="hljs-keyword">const</span> row = result.rows[<span class="hljs-number">0</span>];

  <span class="hljs-comment">// Convert numeric fields to proper JavaScript numbers for easier calculations and display</span>
  <span class="hljs-keyword">return</span> {
    ...row,
    employee_id: <span class="hljs-built_in">parseInt</span>(row.employee_id, <span class="hljs-number">10</span>),
    id: <span class="hljs-built_in">parseInt</span>(row.id, <span class="hljs-number">10</span>),
    payroll_id: <span class="hljs-built_in">parseInt</span>(row.payroll_id, <span class="hljs-number">10</span>),
    amount: <span class="hljs-built_in">parseFloat</span>(row.amount),
  };
}
</code></pre>
<p>Here’s what’s happening in the code:</p>
<ul>
<li><p>We use a JOIN with the <code>employees</code> table to include employee info such as name, account number, and bank details.</p>
</li>
<li><p>If the ID doesn’t exist, the method returns <code>null</code> so you can handle missing records gracefully.</p>
</li>
<li><p>Numeric fields are converted to JavaScript numbers, making it easy to calculate totals or display amounts in the UI.</p>
</li>
</ul>
<p>This method ensures that whenever you need a single payroll item, you get a complete, ready-to-use record.</p>
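<p>Both fetch methods repeat the same numeric conversion. One way to keep them in sync is a small shared helper. The sketch below is illustrative only: <code>RawPayrollItemRow</code> and <code>normalizePayrollItemRow</code> are hypothetical names that do not appear in the original models, and the row shape is an assumption based on the columns selected above.</p>

```typescript
// Hypothetical raw row shape: numeric columns may arrive as strings from the driver.
interface RawPayrollItemRow {
  id: string | number;
  payroll_id: string | number;
  employee_id: string | number;
  amount: string | number;
  [column: string]: unknown;
}

// Hypothetical helper (not part of the original code): converts the numeric
// columns to JavaScript numbers and leaves every other column untouched.
function normalizePayrollItemRow(row: RawPayrollItemRow) {
  return {
    ...row,
    id: parseInt(String(row.id), 10),
    payroll_id: parseInt(String(row.payroll_id), 10),
    employee_id: parseInt(String(row.employee_id), 10),
    amount: parseFloat(String(row.amount)),
  };
}

// Example with a row shaped like the query result
const item = normalizePayrollItemRow({
  id: "7",
  payroll_id: "3",
  employee_id: "12",
  amount: "150000.50",
  status: "pending",
});
console.log(item.id, item.amount);
```

<p>With a helper like this, <code>findByPayrollId</code> could map every row through it and <code>findById</code> could apply it to the single row it returns.</p>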
<h3 id="heading-updating-payroll-item-status-payrollitemmodelupdatestatus">Updating Payroll Item Status (<code>PayrollItemModel.updateStatus</code>)</h3>
<p>As individual employee payments are processed, this method updates the payroll item’s status, stores transaction references from external payment providers, captures error messages on failure, and timestamps completion or failure events. This fine-grained tracking enables reliable retries, audits, and reconciliation with external payment systems.</p>
<p>Here’s the implementation:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> updateStatus(
  id: <span class="hljs-built_in">number</span>,
  status: PayrollStatus,
  transactionReference?: <span class="hljs-built_in">string</span>,
  errorMessage?: <span class="hljs-built_in">string</span>
): <span class="hljs-built_in">Promise</span>&lt;PayrollItem&gt; {
  <span class="hljs-keyword">const</span> updates: <span class="hljs-built_in">string</span>[] = [<span class="hljs-string">'status = $2'</span>, <span class="hljs-string">'updated_at = NOW()'</span>];
  <span class="hljs-keyword">const</span> values: <span class="hljs-built_in">any</span>[] = [id, status];

  <span class="hljs-comment">// Add transaction reference if provided (from Monnify API response)</span>
  <span class="hljs-keyword">if</span> (transactionReference) {
    updates.push(<span class="hljs-string">`transaction_reference = $<span class="hljs-subst">${values.length + <span class="hljs-number">1</span>}</span>`</span>);
    values.push(transactionReference);
  }

  <span class="hljs-comment">// Add error message if provided (from failed payment)</span>
  <span class="hljs-keyword">if</span> (errorMessage) {
    updates.push(<span class="hljs-string">`error_message = $<span class="hljs-subst">${values.length + <span class="hljs-number">1</span>}</span>`</span>);
    values.push(errorMessage);
  }

  <span class="hljs-comment">// Set processed_at timestamp for terminal states</span>
  <span class="hljs-keyword">if</span> (status === PayrollStatus.COMPLETED || status === PayrollStatus.FAILED) {
    updates.push(<span class="hljs-string">`processed_at = NOW()`</span>);
  }

  <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> query(
    <span class="hljs-string">`UPDATE payroll_items SET <span class="hljs-subst">${updates.join(<span class="hljs-string">', '</span>)}</span> WHERE id = $1 RETURNING *`</span>,
    values
  );
  <span class="hljs-keyword">return</span> result.rows[<span class="hljs-number">0</span>];
}
}
</code></pre>
<p>Here’s what’s happening in the code:</p>
<ul>
<li><p>We build a dynamic SET clause to update only the fields provided – status is required, while transaction reference and error message are optional.</p>
</li>
<li><p>Terminal states (<code>COMPLETED</code> or <code>FAILED</code>) trigger an automatic timestamp on <code>processed_at</code>, so we always know when a payment finished.</p>
</li>
<li><p>The method returns the updated payroll item, ready for further processing, logging, or UI display.</p>
</li>
</ul>
<p>This ensures each payroll item is tracked accurately throughout its lifecycle, enabling reliable retries and complete audit trails.</p>
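<p>To see how the placeholder numbering in the dynamic SET clause stays aligned with the values array, here is a standalone sketch of the same technique. <code>buildStatusUpdate</code> and <code>UpdateStatement</code> are hypothetical names introduced for illustration. Column names are interpolated into the SQL text, so they must come from your own code and never from user input.</p>

```typescript
interface UpdateStatement {
  text: string;
  values: unknown[];
}

// Hypothetical helper: builds a parameterized UPDATE where $1 is the row id,
// $2 is the status, and each optional column gets the next placeholder.
function buildStatusUpdate(
  table: "payrolls" | "payroll_items",
  id: number,
  status: string,
  optional: Record<string, string | number | undefined>
): UpdateStatement {
  const sets: string[] = ["status = $2", "updated_at = NOW()"];
  const values: unknown[] = [id, status];

  for (const column of Object.keys(optional)) {
    const value = optional[column];
    if (value === undefined) continue; // skip fields that were not provided
    values.push(value);
    // After the push, values.length is exactly this value's placeholder index
    sets.push(`${column} = $${values.length}`);
  }

  return {
    text: `UPDATE ${table} SET ${sets.join(", ")} WHERE id = $1 RETURNING *`,
    values,
  };
}

const statement = buildStatusUpdate("payroll_items", 5, "failed", {
  transaction_reference: undefined,
  error_message: "Insufficient funds",
});
console.log(statement.text);
console.log(statement.values);
```

<p>Because the placeholder index is derived from the array length at the moment each value is pushed, skipped optional fields never leave a gap in the numbering.</p>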
<h3 id="heading-overall-payroll-flow">Overall Payroll Flow</h3>
<p>In this payroll flow, an administrator creates a payroll batch, which generates individual payroll items for each employee. The payroll is then handed off to background workers that process each payroll item independently via an external payment service.</p>
<p>As each payment succeeds or fails, payroll items are updated accordingly. Once processing concludes, the payroll batch status is updated to reflect the overall outcome, whether fully successful, partially successful, or failed.</p>
<p>This architecture provides scalability, resilience, and strong auditability for real-world payroll systems.</p>
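<p>The final step of the flow, deriving the batch outcome from the item results, can be sketched as a small pure function. This is only an illustration: the <code>PayrollStatus</code> object below is a stand-in for the tutorial's own type, its string values are assumptions, and <code>deriveBatchStatus</code> is a hypothetical name.</p>

```typescript
// Stand-in for the tutorial's PayrollStatus; the string values are assumptions.
const PayrollStatus = {
  COMPLETED: "completed",
  PARTIALLY_COMPLETED: "partially_completed",
  FAILED: "failed",
} as const;
type PayrollStatus = (typeof PayrollStatus)[keyof typeof PayrollStatus];

// Hypothetical helper: maps per-item outcomes to an overall batch status.
function deriveBatchStatus(succeededCount: number, failedCount: number): PayrollStatus {
  if (succeededCount + failedCount === 0) {
    throw new Error("No payroll items were processed");
  }
  if (failedCount === 0) return PayrollStatus.COMPLETED;
  if (succeededCount === 0) return PayrollStatus.FAILED;
  return PayrollStatus.PARTIALLY_COMPLETED;
}

console.log(deriveBatchStatus(10, 0));
console.log(deriveBatchStatus(7, 3));
console.log(deriveBatchStatus(0, 10));
```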
<h2 id="heading-building-the-monnify-client">Building the Monnify Client</h2>
<p>The Monnify client is the bridge between our application and Monnify's payment API. In this section, we'll build a reusable client that handles authentication, bulk payroll disbursements, OTP authorization, transaction tracking, and balance checks. It manages API tokens automatically and keeps all Monnify-specific logic behind a single class, which makes it easy to use from background jobs, payroll processors, or service layers.</p>
<p>We’ll begin by creating a new file at <code>src/config/monnify.ts</code> where we’ll implement the Monnify client.</p>
<h3 id="heading-configuration-and-environment-setup">Configuration and Environment Setup</h3>
<p>Start by loading the configuration from environment variables using <code>dotenv</code>, ensuring that sensitive credentials are never hardcoded. These include the Monnify API key, secret key, base URL, and contract code (wallet account number). This setup allows the same client to be safely used across development, staging, and production environments.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> axios, { AxiosInstance } <span class="hljs-keyword">from</span> <span class="hljs-string">'axios'</span>;
<span class="hljs-keyword">import</span> dotenv <span class="hljs-keyword">from</span> <span class="hljs-string">'dotenv'</span>;

dotenv.config();

<span class="hljs-keyword">export</span> <span class="hljs-keyword">interface</span> MonnifyConfig {
  apiKey: <span class="hljs-built_in">string</span>;
  secretKey: <span class="hljs-built_in">string</span>;
  baseUrl: <span class="hljs-built_in">string</span>;
  contractCode: <span class="hljs-built_in">string</span>;
}
</code></pre>
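<p>Because the credentials come from the environment, a missing variable would otherwise only surface later as an authentication failure. A fail-fast loader is one way to catch that at startup. The sketch below is an optional addition, not part of the original client: <code>loadMonnifyConfig</code> is a hypothetical helper, and it reads the same <code>MONNIFY_*</code> variable names that the constructor in the next section uses.</p>

```typescript
interface MonnifyConfig {
  apiKey: string;
  secretKey: string;
  baseUrl: string;
  contractCode: string;
}

// Hypothetical helper: validates the environment once and returns a typed config.
function loadMonnifyConfig(env: Record<string, string | undefined>): MonnifyConfig {
  const required = ["MONNIFY_API_KEY", "MONNIFY_SECRET_KEY", "MONNIFY_CONTRACT_CODE"];
  const missing = required.filter((name) => !env[name]);

  if (missing.length > 0) {
    throw new Error(`Missing Monnify configuration: ${missing.join(", ")}`);
  }

  return {
    apiKey: env.MONNIFY_API_KEY as string,
    secretKey: env.MONNIFY_SECRET_KEY as string,
    contractCode: env.MONNIFY_CONTRACT_CODE as string,
    // The base URL is optional and falls back to the production host
    baseUrl: env.MONNIFY_BASE_URL || "https://api.monnify.com",
  };
}

// In the application you would call: loadMonnifyConfig(process.env)
const exampleConfig = loadMonnifyConfig({
  MONNIFY_API_KEY: "example-api-key",
  MONNIFY_SECRET_KEY: "example-secret",
  MONNIFY_CONTRACT_CODE: "1234567890",
});
console.log(exampleConfig.baseUrl);
```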
<h3 id="heading-create-the-monnifyclient-class">Create the <code>MonnifyClient</code> Class</h3>
<p>Next, you’ll define the <code>MonnifyClient</code> class. This class encapsulates all communication with the Monnify API. It internally manages API credentials, an Axios HTTP client, an access token, and token expiry tracking.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> MonnifyClient {
  <span class="hljs-keyword">private</span> <span class="hljs-keyword">readonly</span> apiKey: <span class="hljs-built_in">string</span>;
  <span class="hljs-keyword">private</span> <span class="hljs-keyword">readonly</span> secretKey: <span class="hljs-built_in">string</span>;
  <span class="hljs-keyword">private</span> baseUrl: <span class="hljs-built_in">string</span>;
  <span class="hljs-keyword">private</span> contractCode: <span class="hljs-built_in">string</span>;
  <span class="hljs-keyword">private</span> client: AxiosInstance;

  <span class="hljs-keyword">private</span> accessToken: <span class="hljs-built_in">string</span> | <span class="hljs-literal">null</span> = <span class="hljs-literal">null</span>;
  <span class="hljs-keyword">private</span> tokenExpiry: <span class="hljs-built_in">number</span> = <span class="hljs-number">0</span>;
</code></pre>
<p>This design ensures authentication is handled transparently and automatically for every request.</p>
<h3 id="heading-axios-client-and-request-interceptor">Axios Client and Request Interceptor</h3>
<p>Inside the constructor, initialize the Monnify client with credentials from environment variables. The Axios instance is created with the Monnify base URL and JSON headers.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">constructor</span>(<span class="hljs-params"></span>) {
    <span class="hljs-built_in">this</span>.apiKey = process.env.MONNIFY_API_KEY || <span class="hljs-string">''</span>;
    <span class="hljs-built_in">this</span>.secretKey = process.env.MONNIFY_SECRET_KEY || <span class="hljs-string">''</span>;
    <span class="hljs-built_in">this</span>.baseUrl = process.env.MONNIFY_BASE_URL || <span class="hljs-string">'https://api.monnify.com'</span>;
    <span class="hljs-built_in">this</span>.contractCode = process.env.MONNIFY_CONTRACT_CODE || <span class="hljs-string">''</span>;

    <span class="hljs-built_in">this</span>.client = axios.create({
      baseURL: <span class="hljs-built_in">this</span>.baseUrl,
      headers: {
        <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>,
      },
    });
</code></pre>
<p>Still inside the constructor, we attach a request interceptor to this client so that a valid Bearer token is injected into every outgoing request (except the authentication endpoint). Before each request, the interceptor ensures the client is authenticated, preventing unauthorized requests and eliminating token-related boilerplate across the codebase.</p>
<pre><code class="lang-typescript">    <span class="hljs-built_in">this</span>.client.interceptors.request.use(<span class="hljs-keyword">async</span> (config: <span class="hljs-built_in">any</span>) =&gt; {
      <span class="hljs-comment">// Skip auth for the login endpoint itself</span>
      <span class="hljs-keyword">if</span> (config.url?.includes(<span class="hljs-string">'/auth/login'</span>)) {
        <span class="hljs-keyword">return</span> config;
      }

      <span class="hljs-comment">// Ensure a valid token exists before every request</span>
      <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.ensureAuthenticated();

      <span class="hljs-keyword">if</span> (<span class="hljs-built_in">this</span>.accessToken) {
        config.headers.Authorization = <span class="hljs-string">`Bearer <span class="hljs-subst">${<span class="hljs-built_in">this</span>.accessToken}</span>`</span>;
      }

      <span class="hljs-keyword">return</span> config;
    });
  }
</code></pre>
<h3 id="heading-authenticate-with-monnify">Authenticate with Monnify</h3>
<p>Authentication is handled using Monnify’s Basic Auth mechanism, where the API key and secret key are base64-encoded and sent to the <code>/auth/login</code> endpoint. Upon successful authentication, the client stores the returned access token and sets an internal expiry timestamp slightly below the official token lifetime to avoid edge-case expirations. Any authentication failure is logged and surfaced as a controlled error to prevent silent failures.</p>
<pre><code class="lang-typescript">
  <span class="hljs-keyword">private</span> <span class="hljs-keyword">async</span> authenticate(): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-comment">// Encode credentials as Base64 for Basic Auth</span>
      <span class="hljs-keyword">const</span> credentials = Buffer.from(
        <span class="hljs-string">`<span class="hljs-subst">${<span class="hljs-built_in">this</span>.apiKey}</span>:<span class="hljs-subst">${<span class="hljs-built_in">this</span>.secretKey}</span>`</span>
      ).toString(<span class="hljs-string">'base64'</span>);

      <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> axios.post(
        <span class="hljs-string">`<span class="hljs-subst">${<span class="hljs-built_in">this</span>.baseUrl}</span>/api/v1/auth/login`</span>,
        {},
        {
          headers: {
            Authorization: <span class="hljs-string">`Basic <span class="hljs-subst">${credentials}</span>`</span>,
            <span class="hljs-string">'Content-Type'</span>: <span class="hljs-string">'application/json'</span>,
          },
        }
      );

      <span class="hljs-built_in">this</span>.accessToken = response.data.responseBody.accessToken;
      <span class="hljs-comment">// Set expiry to 23 hours (Monnify tokens typically last 24 hours)</span>
      <span class="hljs-comment">// This prevents edge cases where token expires mid-request</span>
      <span class="hljs-built_in">this</span>.tokenExpiry = <span class="hljs-built_in">Date</span>.now() + <span class="hljs-number">23</span> * <span class="hljs-number">60</span> * <span class="hljs-number">60</span> * <span class="hljs-number">1000</span>;
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(
        <span class="hljs-string">'Monnify authentication error:'</span>,
        error.response?.data || error.message
      );
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Failed to authenticate with Monnify'</span>);
    }
  }
</code></pre>
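<p>The hardcoded 23-hour window assumes a fixed token lifetime. If the login response reports its own lifetime, deriving the expiry from that value is safer. The sketch below assumes the response body may carry an <code>expiresIn</code> field measured in seconds; check Monnify's documentation before relying on it. <code>computeTokenExpiry</code> is a hypothetical helper, not part of the original client.</p>

```typescript
// Hypothetical helper: returns the timestamp (ms) after which the token
// should be treated as expired.
// Assumption: `expiresInSeconds` comes from an `expiresIn` field in the login
// response; when it is absent, we fall back to the 23-hour window used above.
function computeTokenExpiry(
  nowMs: number,
  expiresInSeconds: number | undefined,
  safetyMarginSeconds: number = 60,
  fallbackSeconds: number = 23 * 60 * 60
): number {
  if (typeof expiresInSeconds === "number" && expiresInSeconds > 0) {
    // Refresh slightly early so a token never expires mid-request
    const usableSeconds = Math.max(expiresInSeconds - safetyMarginSeconds, 0);
    return nowMs + usableSeconds * 1000;
  }
  return nowMs + fallbackSeconds * 1000;
}

console.log(computeTokenExpiry(Date.now(), 3600));
```

<p>Inside <code>authenticate</code>, the result of this function could be assigned to <code>tokenExpiry</code> in place of the fixed calculation.</p>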
<h3 id="heading-automatic-token-refresh-ensureauthenticated">Automatic Token Refresh (<code>ensureAuthenticated</code>)</h3>
<p>Before any API call, the client checks whether it holds an access token that is still valid. If the token is missing or has expired, it re-authenticates transparently.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">private</span> <span class="hljs-keyword">async</span> ensureAuthenticated(): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">if</span> (!<span class="hljs-built_in">this</span>.accessToken || <span class="hljs-built_in">Date</span>.now() &gt;= <span class="hljs-built_in">this</span>.tokenExpiry) {
      <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.authenticate();
    }
  }
</code></pre>
<p>This ensures that long-running processes such as payroll queues or background workers can safely make Monnify requests without manual token handling.</p>
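<p>One thing to watch with background workers: if several requests start at the same moment with no valid token, each one would trigger its own login. A common remedy is to share a single in-flight authentication promise. The sketch below shows that idea in isolation. <code>TokenManager</code> and its injected <code>login</code> function are hypothetical stand-ins for the client's <code>authenticate</code> logic, not the original implementation.</p>

```typescript
interface LoginResult {
  token: string;
  expiresAt: number; // timestamp in ms
}

// Hypothetical class: concurrent callers share one login instead of racing.
class TokenManager {
  private accessToken: string | null = null;
  private tokenExpiry = 0;
  private inFlight: Promise<void> | null = null;
  private readonly login: () => Promise<LoginResult>;

  constructor(login: () => Promise<LoginResult>) {
    this.login = login;
  }

  private async refresh(): Promise<void> {
    const result = await this.login();
    this.accessToken = result.token;
    this.tokenExpiry = result.expiresAt;
  }

  async ensureAuthenticated(): Promise<string> {
    if (this.accessToken && Date.now() < this.tokenExpiry) {
      return this.accessToken;
    }
    // Start a login only if none is already running
    if (!this.inFlight) {
      this.inFlight = this.refresh();
    }
    const pending = this.inFlight;
    try {
      await pending;
    } finally {
      // Clear the shared promise so a failed login can be retried later
      if (this.inFlight === pending) this.inFlight = null;
    }
    return this.accessToken as string;
  }
}

// Example: three concurrent callers result in a single login call
let loginCalls = 0;
const manager = new TokenManager(async () => {
  loginCalls++;
  return { token: "example-token", expiresAt: Date.now() + 60_000 };
});
const concurrentTokens = Promise.all([
  manager.ensureAuthenticated(),
  manager.ensureAuthenticated(),
  manager.ensureAuthenticated(),
]);
concurrentTokens.then(() => console.log("login calls:", loginCalls));
```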
<h3 id="heading-initiating-bulk-transfers">Initiating Bulk Transfers</h3>
<p>The <code>initiateBulkTransfer</code> method handles the creation of a bulk disbursement batch, typically used for payroll payments. It validates input transfers to ensure each payment has a valid amount, destination account number, and bank code.</p>
<p>A structured batch request is then constructed, including a unique batch reference, source account (contract code), narration, and a list of transactions. The request is sent to Monnify’s batch disbursement endpoint. Any API error is normalized and rethrown with a meaningful message to aid debugging and retries.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">async</span> initiateBulkTransfer(
    transfers: <span class="hljs-built_in">Array</span>&lt;{
      amount: <span class="hljs-built_in">number</span>;
      recipientAccountNumber: <span class="hljs-built_in">string</span>;
      recipientBankCode: <span class="hljs-built_in">string</span>;
      recipientName: <span class="hljs-built_in">string</span>;
      narration: <span class="hljs-built_in">string</span>;
      reference: <span class="hljs-built_in">string</span>;
    }&gt;
  ): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">any</span>&gt; {
    <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.ensureAuthenticated();
</code></pre>
<p>We validate inputs early to fail fast:</p>
<pre><code class="lang-typescript">    <span class="hljs-keyword">if</span> (!transfers || transfers.length === <span class="hljs-number">0</span>) {
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'No transfers provided'</span>);
    }

    <span class="hljs-keyword">if</span> (!<span class="hljs-built_in">this</span>.contractCode) {
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Monnify contract code is not configured'</span>);
    }
</code></pre>
<p>Each transfer is validated individually:</p>
<pre><code class="lang-typescript">    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> transfer <span class="hljs-keyword">of</span> transfers) {
      <span class="hljs-keyword">if</span> (!transfer.amount || transfer.amount &lt;= <span class="hljs-number">0</span>) {
        <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">`Invalid amount for transfer: <span class="hljs-subst">${transfer.reference}</span>`</span>);
      }
      <span class="hljs-keyword">if</span> (!transfer.recipientAccountNumber) {
        <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">`Missing account number for transfer: <span class="hljs-subst">${transfer.reference}</span>`</span>);
      }
      <span class="hljs-keyword">if</span> (!transfer.recipientBankCode) {
        <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">`Missing bank code for transfer: <span class="hljs-subst">${transfer.reference}</span>`</span>);
      }
    }
</code></pre>
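<p>The loop above stops at the first invalid transfer. For large payroll batches it can be more convenient to report every problem in one pass. The following sketch is an alternative, not the original behavior: <code>collectTransferErrors</code> is a hypothetical helper that applies the same three checks and returns all failures together.</p>

```typescript
interface TransferInput {
  amount: number;
  recipientAccountNumber: string;
  recipientBankCode: string;
  reference: string;
}

// Hypothetical helper: same checks as the loop above, but it gathers every
// error instead of throwing on the first one.
function collectTransferErrors(transfers: TransferInput[]): string[] {
  const errors: string[] = [];

  for (const transfer of transfers) {
    if (!isFinite(transfer.amount) || transfer.amount <= 0) {
      errors.push(`Invalid amount for transfer: ${transfer.reference}`);
    }
    if (!transfer.recipientAccountNumber) {
      errors.push(`Missing account number for transfer: ${transfer.reference}`);
    }
    if (!transfer.recipientBankCode) {
      errors.push(`Missing bank code for transfer: ${transfer.reference}`);
    }
  }

  return errors;
}

const problems = collectTransferErrors([
  { amount: 5000, recipientAccountNumber: "0123456789", recipientBankCode: "058", reference: "PAY_1" },
  { amount: 0, recipientAccountNumber: "", recipientBankCode: "058", reference: "PAY_2" },
]);
console.log(problems);
```

<p>A caller could throw a single error that joins these messages, so an administrator can fix all bad records before resubmitting the batch.</p>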
<p>We then construct the batch payload:</p>
<pre><code class="lang-typescript">    <span class="hljs-keyword">const</span> requestBody = {
      title: <span class="hljs-string">'Bulk Payroll Transfers'</span>,
      batchReference: <span class="hljs-string">`BATCH_<span class="hljs-subst">${<span class="hljs-built_in">Date</span>.now()}</span>`</span>,
      narration: <span class="hljs-string">'Payroll batch disbursement'</span>,
      sourceAccountNumber: <span class="hljs-built_in">this</span>.contractCode,
      onValidationFailure: <span class="hljs-string">'CONTINUE'</span>,
      notificationInterval: <span class="hljs-number">50</span>,
      transactionList: transfers.map(<span class="hljs-function">(<span class="hljs-params">t</span>) =&gt;</span> ({
        amount: t.amount,
        reference: t.reference,
        narration: t.narration,
        destinationBankCode: t.recipientBankCode,
        destinationAccountNumber: t.recipientAccountNumber,
        currency: <span class="hljs-string">'NGN'</span>,
      })),
    };
</code></pre>
<p>Finally, we send the request and normalize errors:</p>
<pre><code class="lang-typescript">    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.client.post(
        <span class="hljs-string">'/api/v2/disbursements/batch'</span>,
        requestBody
      );
      <span class="hljs-keyword">return</span> response.data;
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
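      <span class="hljs-comment">// Prefer Monnify's own message, then the HTTP status; fall back to the</span>
      <span class="hljs-comment">// underlying error message when no response arrived (e.g. a network failure)</span>
      <span class="hljs-keyword">const</span> errorData = error.response?.data;
      <span class="hljs-keyword">const</span> message =
        errorData?.responseMessage ||
        errorData?.message ||
        (error.response
          ? <span class="hljs-string">`Monnify API error (<span class="hljs-subst">${error.response.status}</span>)`</span>
          : error.message);
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(message);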
    }
  }
</code></pre>
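<p>The same message-extraction logic is needed by more than one method of the client, so it may be worth pulling into a helper. The sketch below is one possible shape. <code>toMonnifyErrorMessage</code> and <code>HttpErrorLike</code> are hypothetical names, and the error shape mirrors only the fields the code above already reads.</p>

```typescript
// Minimal shape of the errors we expect from the HTTP client
interface HttpErrorLike {
  message?: string;
  response?: {
    status?: number;
    data?: { responseMessage?: string; message?: string };
  };
}

// Hypothetical helper: picks the most specific message available.
function toMonnifyErrorMessage(error: HttpErrorLike): string {
  const data = error.response?.data;
  if (data?.responseMessage) return data.responseMessage;
  if (data?.message) return data.message;
  if (error.response) {
    return `Monnify API error (${error.response.status ?? "unknown status"})`;
  }
  // No response at all, for example a timeout or DNS failure
  return error.message || "Monnify request failed without a response";
}

console.log(toMonnifyErrorMessage({ response: { status: 400, data: { responseMessage: "Invalid account" } } }));
console.log(toMonnifyErrorMessage({ message: "connect ETIMEDOUT" }));
```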
<h3 id="heading-authorizing-bulk-transfers-otp-validation">Authorizing Bulk Transfers (OTP Validation)</h3>
<p>Some bulk transfers require OTP authorization. The <code>authorizeBulkTransfer</code> method validates the presence of a batch reference and authorization code before submitting them to Monnify’s OTP validation endpoint. This step finalizes the batch disbursement and allows processing to continue. Errors are logged and surfaced clearly for operational visibility.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">async</span> authorizeBulkTransfer(
    reference: <span class="hljs-built_in">string</span>,
    authorizationCode: <span class="hljs-built_in">string</span>
  ): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">any</span>&gt; {
    <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.ensureAuthenticated();
    <span class="hljs-keyword">if</span> (!reference) {
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Batch reference is required'</span>);
    }

    <span class="hljs-keyword">if</span> (!authorizationCode) {
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Authorization code (OTP) is required'</span>);
    }

    <span class="hljs-keyword">const</span> requestBody = {
      reference,
      authorizationCode,
    };

    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.client.post(
        <span class="hljs-string">'/api/v2/disbursements/batch/validate-otp'</span>,
        requestBody
      );

      <span class="hljs-keyword">return</span> response.data;
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-keyword">const</span> errorDetails = error.response?.data || error.message;
      <span class="hljs-built_in">console</span>.error(
        <span class="hljs-string">'Monnify authorization error:'</span>,
        <span class="hljs-built_in">JSON</span>.stringify(errorDetails, <span class="hljs-literal">null</span>, <span class="hljs-number">2</span>)
      );

      <span class="hljs-keyword">if</span> (error.response) {
        <span class="hljs-keyword">const</span> errorData = error.response.data;
        <span class="hljs-keyword">const</span> errorMessage =
          errorData?.responseMessage ||
          errorData?.message ||
          <span class="hljs-string">`Monnify API error (<span class="hljs-subst">${error.response.status}</span>)`</span>;
        <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(errorMessage);
      }
      <span class="hljs-keyword">throw</span> error;
    }
  }
</code></pre>
<h3 id="heading-transaction-status-lookup">Transaction Status Lookup</h3>
<p>The <code>getTransactionStatus</code> method retrieves the real-time status of an individual transaction using its reference.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">async</span> getTransactionStatus(transactionReference: <span class="hljs-built_in">string</span>): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">any</span>&gt; {
    <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.ensureAuthenticated();
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.client.get(
        <span class="hljs-string">`/api/v2/disbursements/<span class="hljs-subst">${transactionReference}</span>/status`</span>
      );
      <span class="hljs-keyword">return</span> response.data;
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(
        <span class="hljs-string">'Monnify status check error:'</span>,
        error.response?.data || error.message
      );
      <span class="hljs-keyword">throw</span> error;
    }
  }
</code></pre>
<p>This is useful for reconciliation, webhook fallbacks, or manual verification of disbursement outcomes.</p>
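<p>When the lookup is used as a webhook fallback, the caller usually needs to poll until the transaction reaches a final state. A minimal polling sketch follows. <code>pollUntilTerminal</code> is a hypothetical helper, and the status strings in the example (<code>PENDING</code>, <code>SUCCESS</code>) are placeholders rather than confirmed Monnify values.</p>

```typescript
// Hypothetical helper: calls `fetchStatus` until `isTerminal` accepts the
// result or `maxAttempts` is reached, waiting `delayMs` between calls.
// It returns the last result either way, so the caller must check it.
async function pollUntilTerminal<T>(
  fetchStatus: () => Promise<T>,
  isTerminal: (result: T) => boolean,
  maxAttempts: number = 5,
  delayMs: number = 2000
): Promise<T> {
  if (maxAttempts < 1) {
    throw new Error("maxAttempts must be at least 1");
  }

  let last = await fetchStatus();
  for (let attempt = 1; attempt < maxAttempts && !isTerminal(last); attempt++) {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    last = await fetchStatus();
  }
  return last;
}

// Example with a fake status source standing in for getTransactionStatus
const fakeStatuses = ["PENDING", "PENDING", "SUCCESS"];
let statusCalls = 0;
const polled = pollUntilTerminal(
  async () => fakeStatuses[Math.min(statusCalls++, fakeStatuses.length - 1)],
  (status) => status !== "PENDING",
  5,
  0
);
polled.then((status) => console.log(status, "after", statusCalls, "calls"));
```

<p>In the real worker, <code>fetchStatus</code> would wrap <code>monnifyClient.getTransactionStatus(reference)</code>, and a result that is still non-terminal after the last attempt should be left for a later reconciliation run rather than marked as failed.</p>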
<h3 id="heading-batch-details-retrieval">Batch Details Retrieval</h3>
<p>The <code>getBatchDetails</code> method fetches detailed information about an entire disbursement batch, including the state of individual transactions.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">async</span> getBatchDetails(batchReference: <span class="hljs-built_in">string</span>): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">any</span>&gt; {
    <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.ensureAuthenticated();
    <span class="hljs-keyword">if</span> (!batchReference) {
      <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Batch reference is required'</span>);
    }

    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.client.get(
        <span class="hljs-string">`/api/v2/disbursements/batch/<span class="hljs-subst">${batchReference}</span>`</span>
      );
      <span class="hljs-keyword">return</span> response.data;
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(
        <span class="hljs-string">'Monnify batch details error:'</span>,
        error.response?.data || error.message
      );
      <span class="hljs-keyword">throw</span> error;
    }
  }
</code></pre>
<p>This is particularly useful when reconciling payroll runs or recovering from partial failures.</p>
<h3 id="heading-wallet-balance-check">Wallet Balance Check</h3>
<p>Finally, the <code>getAccountBalance</code> method retrieves the available balance of the configured Monnify wallet (contract account). Add it as the last method in <code>src/config/monnify.ts</code>, followed by the export of a shared client instance:</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">async</span> getAccountBalance(): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">any</span>&gt; {
    <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.ensureAuthenticated();

    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.client.get(
        <span class="hljs-string">`/api/v2/disbursements/wallet-balance?accountNumber=<span class="hljs-subst">${<span class="hljs-built_in">this</span>.contractCode}</span>`</span>
      );
      <span class="hljs-keyword">return</span> response.data;
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(
        <span class="hljs-string">'Monnify balance check error:'</span>,
        error.response?.data || error.message
      );
      <span class="hljs-keyword">throw</span> error;
    }
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> monnifyClient = <span class="hljs-keyword">new</span> MonnifyClient();
</code></pre>
<p>Key features of this client:</p>
<ol>
<li><p><strong>Automatic token management</strong>: The client automatically handles authentication and refreshes tokens before they expire.</p>
</li>
<li><p><strong>Request interceptor</strong>: Every API request automatically includes the authentication token.</p>
</li>
<li><p><strong>Bulk transfers</strong>: Uses Monnify's batch disbursement API for efficient payroll processing.</p>
</li>
<li><p><strong>Error handling</strong>: Comprehensive error handling with meaningful error messages.</p>
</li>
</ol>
<h2 id="heading-implementing-background-job-processing">Implementing Background Job Processing</h2>
<p>To avoid blocking HTTP requests and to ensure reliable retries, payroll execution is handled asynchronously using a background job processor. This worker is responsible for orchestrating bulk payroll disbursements, coordinating with Monnify, updating payroll and payroll item states, and handling retries safely.</p>
<p>Begin by creating a new file at <code>src/jobs/payroll.processor.ts</code>. All background payroll execution logic will live in this file.</p>
<h3 id="heading-set-up-the-payroll-processing-queue">Set Up the Payroll Processing Queue</h3>
<p>We’ll create a Bull queue named <code>payroll-processing</code>, backed by Redis. Redis connection details are loaded from environment variables, allowing flexibility across environments.</p>
<p>Default job options are configured to retry failed jobs up to three times using an exponential backoff strategy. This ensures resilience against transient failures such as network issues or temporary payment gateway downtime. Completed jobs are automatically removed from the queue to keep Redis storage clean.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> Queue <span class="hljs-keyword">from</span> <span class="hljs-string">'bull'</span>;
<span class="hljs-keyword">import</span> { monnifyClient } <span class="hljs-keyword">from</span> <span class="hljs-string">'../config/monnify'</span>;
<span class="hljs-keyword">import</span> {
 PayrollItemModel,
 PayrollModel,
 PayrollStatus,
} <span class="hljs-keyword">from</span> <span class="hljs-string">'../models/payroll'</span>;
<span class="hljs-keyword">import</span> { EmployeeModel } <span class="hljs-keyword">from</span> <span class="hljs-string">'../models/employee'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> payrollQueue = <span class="hljs-keyword">new</span> Queue(<span class="hljs-string">'payroll-processing'</span>, {
  redis: {
    host: process.env.REDIS_HOST || <span class="hljs-string">'localhost'</span>,
    port: <span class="hljs-built_in">Number</span>(process.env.REDIS_PORT || <span class="hljs-number">6379</span>),
  },
  defaultJobOptions: {
    attempts: <span class="hljs-number">3</span>,
    backoff: {
      <span class="hljs-keyword">type</span>: <span class="hljs-string">'exponential'</span>,
      delay: <span class="hljs-number">2000</span>,
    },
    removeOnComplete: <span class="hljs-literal">true</span>,
  },
});
</code></pre>
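<p>To make the retry timing concrete, here is a minimal sketch that computes the delays implied by these options. It assumes Bull's built-in <code>exponential</code> strategy waits <code>(2^attemptsMade - 1) * delay</code> milliseconds, which is worth confirming against the Bull version you install. The helper names <code>exponentialDelay</code> and <code>retrySchedule</code> are illustrative and not part of Bull.</p>

```typescript
// Assumption: Bull's built-in 'exponential' strategy waits
// (2^attemptsMade - 1) * delay milliseconds. Confirm against your Bull version.
function exponentialDelay(attemptsMade: number, baseDelayMs: number): number {
  return Math.round((Math.pow(2, attemptsMade) - 1) * baseDelayMs);
}

// attempts counts every run, including the first one, so attempts: 3
// leaves room for two retries.
function retrySchedule(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let attemptsMade = 1; attempts > attemptsMade; attemptsMade++) {
    delays.push(exponentialDelay(attemptsMade, baseDelayMs));
  }
  return delays;
}

// Same values as defaultJobOptions above
console.log(retrySchedule(3, 2000)); // [ 2000, 6000 ]
```

<p>Under that assumption, a failing job would be retried after roughly 2 seconds and then 6 seconds before Bull marks it as failed.</p>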
<h3 id="heading-queue-processor-registration">Queue Processor Registration</h3>
<p>The queue registers a processor function using <code>payrollQueue.process</code>, which receives jobs containing a <code>payrollId</code>. Each job triggers the <code>processBulkPayroll</code> function, making the queue responsible for executing one payroll batch at a time.</p>
<pre><code class="lang-typescript">payrollQueue.process(<span class="hljs-keyword">async</span> (job) =&gt; {
 <span class="hljs-keyword">return</span> processBulkPayroll(job.data.payrollId);
});
</code></pre>
<p>This design decouples payroll execution from HTTP requests and allows processing to happen asynchronously in background workers.</p>
<h3 id="heading-bulk-payroll-processing-flow-processbulkpayroll">Bulk Payroll Processing Flow (<code>processBulkPayroll</code>)</h3>
<p>When a payroll job is picked up, the system first fetches all payroll items associated with the given payroll ID. It then selects only the items that are eligible for processing: those still in a <code>PENDING</code> state, or previously marked as <code>PROCESSING</code> but missing a transaction reference.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">processBulkPayroll</span>(<span class="hljs-params">payrollId: <span class="hljs-built_in">number</span></span>) </span>{

<span class="hljs-keyword">const</span> items = <span class="hljs-keyword">await</span> PayrollItemModel.findByPayrollId(payrollId);
</code></pre>
<p>The filter keeps only the items that still require processing, which reduces the risk of duplicate payments when jobs are retried.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> payable = items.filter(
  <span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span>
    i.status === PayrollStatus.PENDING ||
    (i.status === PayrollStatus.PROCESSING &amp;&amp; !i.transaction_reference)
);

<span class="hljs-keyword">if</span> (payable.length === <span class="hljs-number">0</span>) <span class="hljs-keyword">return</span>;
</code></pre>
<p>If no payable items remain, the function exits early, avoiding unnecessary API calls.</p>
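<p>The eligibility rule can be checked in isolation. The sketch below restates it as a standalone function over simplified item shapes. The <code>PayrollItem</code> interface, the string enum values, and the sample transaction references are assumptions made for illustration, so the real definitions in <code>../models/payroll</code> may differ.</p>

```typescript
// Hypothetical, simplified shapes for illustration only.
enum PayrollStatus {
  PENDING = 'PENDING',
  PROCESSING = 'PROCESSING',
  COMPLETED = 'COMPLETED',
  FAILED = 'FAILED',
}

interface PayrollItem {
  id: number;
  status: PayrollStatus;
  transaction_reference: string | null;
}

function isPayable(item: PayrollItem): boolean {
  if (item.status === PayrollStatus.PENDING) return true;
  // PROCESSING without a reference: an earlier attempt never stored a
  // Monnify transaction reference, so the item is eligible again.
  if (item.status === PayrollStatus.PROCESSING) return !item.transaction_reference;
  return false;
}

const sampleItems: PayrollItem[] = [
  { id: 1, status: PayrollStatus.PENDING, transaction_reference: null },
  { id: 2, status: PayrollStatus.PROCESSING, transaction_reference: null },
  { id: 3, status: PayrollStatus.PROCESSING, transaction_reference: 'TXN-A' },
  { id: 4, status: PayrollStatus.COMPLETED, transaction_reference: 'TXN-B' },
];

console.log(sampleItems.filter(isPayable).map((i) => i.id)); // [ 1, 2 ]
```

<p>Items 3 and 4 are skipped because a transfer has already been handed to the gateway for them, so a retried job would not submit them again.</p>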
<p>Once we confirm there are payable items, we update the overall payroll status:</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">await</span> PayrollModel.updateStatus(payrollId, PayrollStatus.PROCESSING);
</code></pre>
<p>This provides immediate visibility that disbursement is underway.</p>
<h3 id="heading-building-the-bulk-transfer-payload">Building the Bulk Transfer Payload</h3>
<p>Create a variable to store the transfer list that will be sent to Monnify.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">const</span> transfers = [];
</code></pre>
<p>For each payable payroll item, the corresponding employee record is fetched to retrieve bank and account details. A unique payment reference is generated using the payroll ID and payroll item ID, ensuring traceability across systems. Each payroll item is marked as <code>PROCESSING</code> before payment is initiated, which records that a disbursement attempt is in flight. This is not a lock on its own: an item in <code>PROCESSING</code> without a transaction reference is still selected by the eligibility filter when a job is retried.</p>
<p>A transfer object is then constructed containing the payment amount, recipient bank details, narration, and unique reference. These transfer objects are accumulated into a single batch request.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> item <span class="hljs-keyword">of</span> payable) {
    <span class="hljs-keyword">const</span> employee = <span class="hljs-keyword">await</span> EmployeeModel.findById(item.employee_id);
    <span class="hljs-keyword">if</span> (!employee) <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Employee not found'</span>);

    <span class="hljs-keyword">const</span> reference = <span class="hljs-string">`PAYROLL_<span class="hljs-subst">${payrollId}</span>_<span class="hljs-subst">${item.id}</span>`</span>;

    <span class="hljs-keyword">await</span> PayrollItemModel.updateStatus(item.id, PayrollStatus.PROCESSING);

    transfers.push({
      amount: <span class="hljs-built_in">Number</span>(item.amount),
      reference,
      recipientAccountNumber: employee.account_number,
      recipientBankCode: employee.bank_code,
      recipientName: employee.name,
      narration: <span class="hljs-string">`Payroll payment`</span>,
    });

}
</code></pre>
<h3 id="heading-initiating-bulk-disbursement-via-monnify">Initiating Bulk Disbursement via Monnify</h3>
<p>Once all transfers are prepared, the system initiates a bulk transfer through the Monnify client.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> monnifyClient.initiateBulkTransfer(transfers);

<span class="hljs-keyword">if</span> (!response?.requestSuccessful) {
  <span class="hljs-keyword">throw</span> <span class="hljs-keyword">new</span> <span class="hljs-built_in">Error</span>(<span class="hljs-string">'Bulk transfer initiation failed'</span>);
}
</code></pre>
<p>If Monnify doesn’t confirm successful initiation, the job throws an error, allowing Bull’s retry mechanism to take over. This ensures failed initiation attempts are retried safely without manual intervention.</p>
<h3 id="heading-storing-transaction-references">Storing Transaction References</h3>
<p>After a successful bulk transfer initiation, Monnify returns a list of transactions containing unique transaction references. The system matches each response entry to its corresponding payroll item using the generated reference and updates the payroll item record with the Monnify transaction reference while keeping its status as <code>PROCESSING</code>.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> results = response.responseBody?.transactionList || [];

<span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> item <span class="hljs-keyword">of</span> payable) {
    <span class="hljs-keyword">const</span> ref = <span class="hljs-string">`PAYROLL_<span class="hljs-subst">${payrollId}</span>_<span class="hljs-subst">${item.id}</span>`</span>;
    <span class="hljs-keyword">const</span> match = results.find(<span class="hljs-function">(<span class="hljs-params">r: <span class="hljs-built_in">any</span></span>) =&gt;</span> r.reference === ref);

    <span class="hljs-keyword">if</span> (match?.transactionReference) {
      <span class="hljs-keyword">await</span> PayrollItemModel.updateStatus(
        item.id,
        PayrollStatus.PROCESSING,
        match.transactionReference
      );
    }

}

<span class="hljs-keyword">await</span> updatePayrollStats(payrollId);
}
</code></pre>
<p>This step is critical for later reconciliation through webhooks or status polling.</p>
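<p>The matching step can also be written with a lookup table, which makes it easy to list the items that came back without a transaction reference. This is a sketch under assumptions: the <code>TransferResult</code> shape covers only the two fields the processor reads, and the helper names and sample references are hypothetical.</p>

```typescript
// Hypothetical shape of one transactionList entry, limited to the fields used here.
interface TransferResult {
  reference: string;
  transactionReference?: string;
}

type ResultIndex = { [reference: string]: TransferResult };

// Same reference format the processor generates for each payroll item.
function buildReference(payrollId: number, itemId: number): string {
  return `PAYROLL_${payrollId}_${itemId}`;
}

// Index the gateway response once so each item needs a single lookup.
function indexByReference(results: TransferResult[]): ResultIndex {
  const index: ResultIndex = {};
  for (const r of results) {
    index[r.reference] = r;
  }
  return index;
}

const index = indexByReference([
  { reference: 'PAYROLL_7_1', transactionReference: 'TXN-A' },
  { reference: 'PAYROLL_7_2' },
]);

// Item IDs that have no transaction reference in the response
const unmatched = [1, 2, 3].filter(
  (itemId) => !index[buildReference(7, itemId)]?.transactionReference
);

console.log(unmatched); // [ 2, 3 ]
```

<p>Logging the unmatched IDs is one way to notice items that stay in <code>PROCESSING</code> without a reference and would be selected again on the next run.</p>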
<h3 id="heading-payroll-statistics-reconciliation-updatepayrollstats">Payroll Statistics Reconciliation (<code>updatePayrollStats</code>)</h3>
<p>After initiating payments, the system recalculates payroll-level statistics by refetching all payroll items.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">updatePayrollStats</span>(<span class="hljs-params">payrollId: <span class="hljs-built_in">number</span></span>) </span>{
<span class="hljs-keyword">const</span> items = <span class="hljs-keyword">await</span> PayrollItemModel.findByPayrollId(payrollId);

<span class="hljs-keyword">const</span> completed = items.filter(
  <span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.COMPLETED
).length;
</code></pre>
<p>The overall payroll status is derived from these counts:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">const</span> failed = items.filter(<span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.FAILED).length;

<span class="hljs-keyword">let</span> status = PayrollStatus.PROCESSING;

<span class="hljs-keyword">if</span> (completed === items.length) {
  status = PayrollStatus.COMPLETED;
} <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (failed === items.length) {
  status = PayrollStatus.FAILED;
} <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (completed &gt; <span class="hljs-number">0</span>) {
  status = PayrollStatus.PARTIALLY_COMPLETED;
}

 <span class="hljs-keyword">await</span> PayrollModel.updateStatus(payrollId, status, completed, failed);
}
</code></pre>
<p>If all items are completed, the payroll is marked as <code>COMPLETED</code>. If all failed, it’s marked as <code>FAILED</code>. If at least one item has completed while others have failed or are still in progress, it’s marked as <code>PARTIALLY_COMPLETED</code>. Otherwise, it remains in <code>PROCESSING</code>. The payroll record is then updated with the new status and the completed and failed counts, giving an up-to-date snapshot of payroll execution.</p>
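<p>Because the status rules depend only on the item statuses, they can be sketched as a pure function and tried against a few cases. The enum's string values and the function name <code>derivePayrollStatus</code> are assumptions for illustration. The branch order mirrors <code>updatePayrollStats</code>.</p>

```typescript
// Hypothetical string values; the real enum lives in ../models/payroll.
enum PayrollStatus {
  PENDING = 'PENDING',
  PROCESSING = 'PROCESSING',
  COMPLETED = 'COMPLETED',
  FAILED = 'FAILED',
  PARTIALLY_COMPLETED = 'PARTIALLY_COMPLETED',
}

function derivePayrollStatus(statuses: PayrollStatus[]): PayrollStatus {
  const completed = statuses.filter((s) => s === PayrollStatus.COMPLETED).length;
  const failed = statuses.filter((s) => s === PayrollStatus.FAILED).length;

  if (completed === statuses.length) return PayrollStatus.COMPLETED;
  if (failed === statuses.length) return PayrollStatus.FAILED;
  if (completed > 0) return PayrollStatus.PARTIALLY_COMPLETED;
  return PayrollStatus.PROCESSING;
}

const { COMPLETED, FAILED, PROCESSING } = PayrollStatus;

console.log(derivePayrollStatus([COMPLETED, COMPLETED]));  // COMPLETED
console.log(derivePayrollStatus([FAILED, FAILED]));        // FAILED
console.log(derivePayrollStatus([COMPLETED, PROCESSING])); // PARTIALLY_COMPLETED
console.log(derivePayrollStatus([FAILED, PROCESSING]));    // PROCESSING
console.log(derivePayrollStatus([]));                      // COMPLETED
```

<p>The last case shows an edge worth knowing about: with these rules, a payroll that has no items is reported as <code>COMPLETED</code>, because zero completed items equals the total of zero.</p>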
<h3 id="heading-queue-entry-point-processpayrollitems">Queue Entry Point (<code>processPayrollItems</code>)</h3>
<p>The <code>processPayrollItems</code> function serves as the public entry point for triggering payroll execution.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">processPayrollItems</span>(<span class="hljs-params">payrollId: <span class="hljs-built_in">number</span></span>) </span>{
  <span class="hljs-keyword">await</span> payrollQueue.add({ payrollId, <span class="hljs-keyword">type</span>: <span class="hljs-string">'bulk'</span> });
}
</code></pre>
<p>It simply enqueues a payroll job with the relevant payroll ID, allowing controllers or services to initiate payroll processing without coupling themselves to queue logic or payment execution details.</p>
<h3 id="heading-role-in-the-overall-payroll-architecture">Role in the Overall Payroll Architecture</h3>
<p>This queue worker acts as the execution engine of the payroll system. It:</p>
<ul>
<li><p>Bridges payroll domain models with the Monnify payment gateway</p>
</li>
<li><p>Ensures safe retries through Bull’s job management and maintains idempotency</p>
</li>
<li><p>Continuously synchronizes payroll and payroll item states</p>
</li>
</ul>
<p>By offloading payment execution to background workers, the system gains the scalability, reliability, and operational resilience required for real-world payroll processing.</p>
<p>Key features of the job processor:</p>
<ol>
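<li><p><strong>Exponential backoff</strong>: Failed jobs are retried with exponentially increasing delays derived from the 2-second base delay. With <code>attempts: 3</code>, a job runs at most three times (the initial attempt plus two retries).</p>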
</li>
<li><p><strong>Bulk processing</strong>: All payroll items are processed as a single batch transfer.</p>
</li>
<li><p><strong>Status tracking</strong>: Each item's status is updated throughout the process.</p>
</li>
<li><p><strong>Automatic cleanup</strong>: Completed jobs are automatically removed from the queue.</p>
</li>
</ol>
<h2 id="heading-creating-the-api-controllers">Creating the API Controllers</h2>
<p>Next, we’ll build the HTTP controller layer for managing employees in the payroll system using Express.js. It exposes RESTful API endpoints that handle incoming requests, perform validation, interact with the employee data model, and return appropriate HTTP responses.</p>
<p>The controller acts as the bridge between client-facing APIs and the underlying business logic encapsulated in the <code>EmployeeModel</code>.</p>
<h3 id="heading-controller-responsibilities">Controller Responsibilities</h3>
<p>The <code>EmployeeController</code> is responsible for:</p>
<ul>
<li><p>Validating incoming request data</p>
</li>
<li><p>Calling the appropriate model methods</p>
</li>
<li><p>Handling errors gracefully</p>
</li>
<li><p>Returning meaningful HTTP status codes and JSON responses</p>
</li>
</ul>
<p>Each method follows a consistent structure using <code>try/catch</code> blocks to ensure reliability and simplify error handling.</p>
<p>Start by creating a new file at <code>src/controllers/employee.controller.ts</code>. This file will contain all the endpoints needed to manage employees in the payroll system.</p>
<p>At the top of the file, import the required Express types and the employee model:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Request, Response } <span class="hljs-keyword">from</span> <span class="hljs-string">'express'</span>;
<span class="hljs-keyword">import</span> { EmployeeModel, CreateEmployeeInput } <span class="hljs-keyword">from</span> <span class="hljs-string">'../models/employee'</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> EmployeeController {
  <span class="hljs-comment">// Controller methods will go here</span>
}
</code></pre>
<p>Each method inside this class will map to a specific API endpoint.</p>
<h3 id="heading-creating-an-employee-createemployee">Creating an Employee (<code>createEmployee</code>)</h3>
<p>We’ll start with an endpoint that creates a new employee record. It extracts the request body and validates the presence of required fields such as name, email, salary, bank account number, and bank code.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> createEmployee(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> data: CreateEmployeeInput = req.body;

      <span class="hljs-keyword">if</span> (
        !data.name ||
        !data.email ||
        !data.salary ||
        !data.account_number ||
        !data.bank_code
      ) {
        res.status(<span class="hljs-number">400</span>).json({
          error:
            <span class="hljs-string">'Missing required fields: name, email, salary, account_number, bank_code'</span>,
        });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">const</span> employee = <span class="hljs-keyword">await</span> EmployeeModel.create(data);
      res.status(<span class="hljs-number">201</span>).json({
        message: <span class="hljs-string">'Employee created successfully'</span>,
        data: employee,
      });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error creating employee:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to create employee'</span> });
    }

}
</code></pre>
<p>If any required field is missing, the request is rejected with a <code>400 Bad Request</code> response.</p>
<p>Upon successful validation, the controller delegates employee creation to the <code>EmployeeModel.create</code> method and returns a <code>201 Created</code> response containing the newly created employee. Any unexpected error during the process results in a <code>500 Internal Server Error</code>.</p>
<h3 id="heading-fetching-all-employees-getallemployees">Fetching All Employees (<code>getAllEmployees</code>)</h3>
<p>Next, we’ll add an endpoint for retrieving all employee records from the system.</p>
<p>This endpoint simply calls <code>EmployeeModel.findAll</code> and returns the result as a JSON response. This API is typically used for administrative dashboards, payroll preparation, or reporting purposes.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> getAllEmployees(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> employees = <span class="hljs-keyword">await</span> EmployeeModel.findAll();
    res.json({ data: employees });
  } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error fetching employees:'</span>, error);
    res
      .status(<span class="hljs-number">500</span>)
      .json({ error: error.message || <span class="hljs-string">'Failed to fetch employees'</span> });
  }
}
</code></pre>
<p>If the retrieval is successful, the controller responds with the full list of employees. If something goes wrong, such as a database or unexpected runtime error, the error is logged and a <code>500 Internal Server Error</code> is returned to the client.</p>
<h3 id="heading-fetching-a-single-employee-getemployeebyid">Fetching a Single Employee (<code>getEmployeeById</code>)</h3>
<p>After listing all employees, the next logical step is being able to fetch a single employee by their ID.</p>
<p>This endpoint retrieves a specific employee by ID, which is parsed from the URL parameters. If the employee doesn’t exist, the controller responds with a <code>404 Not Found</code>. Otherwise, the employee data is returned in a successful response. This endpoint is useful for viewing or editing individual employee details.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> getEmployeeById(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { id } = req.params;
      <span class="hljs-keyword">const</span> employee = <span class="hljs-keyword">await</span> EmployeeModel.findById(<span class="hljs-built_in">parseInt</span>(id, <span class="hljs-number">10</span>));

      <span class="hljs-keyword">if</span> (!employee) {
        res.status(<span class="hljs-number">404</span>).json({ error: <span class="hljs-string">'Employee not found'</span> });
        <span class="hljs-keyword">return</span>;
      }

      res.json({ data: employee });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error fetching employee:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to fetch employee'</span> });
    }
  }
</code></pre>
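<p>One detail to keep in mind is that <code>parseInt</code> returns <code>NaN</code> for input such as <code>abc</code> and accepts a numeric prefix such as <code>12abc</code>, and the controller passes the result straight to the model. A stricter parser is sketched below. The <code>parseId</code> helper is hypothetical and not part of the original controller.</p>

```typescript
// Hypothetical helper: returns a number for digit-only strings, otherwise null.
function parseId(raw: string): number | null {
  // Up to 15 digits keeps the value inside JavaScript's safe integer range
  if (!/^\d{1,15}$/.test(raw)) return null;
  return parseInt(raw, 10);
}

console.log(parseId('42'));    // 42
console.log(parseId('12abc')); // null
console.log(parseId('abc'));   // null
console.log(parseId(''));      // null
```

<p>A controller could call <code>parseId(req.params.id)</code> and respond with a <code>400 Bad Request</code> when the result is <code>null</code>, before querying the database.</p>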
<h3 id="heading-updating-an-employee-updateemployee">Updating an Employee (<code>updateEmployee</code>)</h3>
<p>Now that we can retrieve individual employees, the next step is allowing their details to be updated.</p>
<p>This endpoint allows partial updates to an existing employee record. It first checks whether the employee exists before attempting an update.</p>
<p>If the employee isn’t found, a <code>404 Not Found</code> response is returned. If the employee exists, the controller forwards the update payload to <code>EmployeeModel.update</code> and returns the updated employee record. This approach ensures data integrity and prevents silent failures.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> updateEmployee(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { id } = req.params;
      <span class="hljs-keyword">const</span> data: Partial&lt;CreateEmployeeInput&gt; = req.body;

      <span class="hljs-keyword">const</span> employee = <span class="hljs-keyword">await</span> EmployeeModel.findById(<span class="hljs-built_in">parseInt</span>(id, <span class="hljs-number">10</span>));
      <span class="hljs-keyword">if</span> (!employee) {
        res.status(<span class="hljs-number">404</span>).json({ error: <span class="hljs-string">'Employee not found'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">const</span> updated = <span class="hljs-keyword">await</span> EmployeeModel.update(<span class="hljs-built_in">parseInt</span>(id, <span class="hljs-number">10</span>), data);
      res.json({
        message: <span class="hljs-string">'Employee updated successfully'</span>,
        data: updated,
      });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error updating employee:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to update employee'</span> });
    }
  }
</code></pre>
<h3 id="heading-deleting-an-employee-deleteemployee">Deleting an Employee (<code>deleteEmployee</code>)</h3>
<p>Finally, the last endpoint in the <code>EmployeeController</code> handles employee deletion.</p>
<p>Before deleting, it verifies that the employee exists to avoid invalid delete operations. If found, the employee record is removed using <code>EmployeeModel.delete</code>, and a success message is returned. If the employee doesn’t exist, the controller responds with a <code>404 Not Found</code>.</p>
<pre><code class="lang-typescript"> <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> deleteEmployee(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { id } = req.params;

      <span class="hljs-keyword">const</span> employee = <span class="hljs-keyword">await</span> EmployeeModel.findById(<span class="hljs-built_in">parseInt</span>(id, <span class="hljs-number">10</span>));
      <span class="hljs-keyword">if</span> (!employee) {
        res.status(<span class="hljs-number">404</span>).json({ error: <span class="hljs-string">'Employee not found'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">await</span> EmployeeModel.delete(<span class="hljs-built_in">parseInt</span>(id, <span class="hljs-number">10</span>));
      res.json({ message: <span class="hljs-string">'Employee deleted successfully'</span> });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error deleting employee:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to delete employee'</span> });
    }
  }
</code></pre>
<h3 id="heading-error-handling-strategy">Error Handling Strategy</h3>
<p>All controller methods follow the same error handling pattern: the full error is logged on the server, and the client receives a JSON body with an <code>error</code> field. Because the handlers return <code>error.message</code> when it is present, internal details such as database error text can reach API consumers. For production use, consider returning only the generic fallback message for unexpected errors.</p>
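<p>The handlers above send <code>error.message</code> to the client whenever it is set. One possible way to keep unexpected error text out of responses is to mark the errors that are safe to expose and fall back to a generic message for everything else. The <code>PublicError</code> shape and the helper names below are hypothetical and only sketch the idea.</p>

```typescript
// Hypothetical marker for errors whose message is safe to show to API clients.
interface PublicError {
  isPublic: true;
  status: number;
  message: string;
}

function publicError(message: string, status: number = 400): PublicError {
  return { isPublic: true, status, message };
}

function isPublicError(error: unknown): error is PublicError {
  if (typeof error !== 'object' || error === null) return false;
  return (error as { isPublic?: unknown }).isPublic === true;
}

// Map any thrown value to a status code and a JSON body.
function toErrorResponse(error: unknown, fallback: string) {
  if (isPublicError(error)) {
    return { status: error.status, body: { error: error.message } };
  }
  // Unexpected errors keep their details on the server side only
  return { status: 500, body: { error: fallback } };
}

const hidden = toErrorResponse(
  new Error('relation "employees" does not exist'),
  'Failed to fetch employees'
);
console.log(hidden.status, hidden.body.error); // 500 Failed to fetch employees

const shown = toErrorResponse(publicError('Email already in use', 409), 'Failed to create employee');
console.log(shown.status, shown.body.error); // 409 Email already in use
```

<p>Inside a <code>catch</code> block, a handler would still log the original error with <code>console.error</code> and then send the mapped result with <code>res.status(status).json(body)</code>.</p>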
<h3 id="heading-role-in-the-overall-payroll-system">Role in the Overall Payroll System</h3>
<p>The <code>EmployeeController</code> provides the foundational APIs required for managing employee records, which are essential inputs for payroll processing. By cleanly separating HTTP concerns from business logic and persistence layers, this controller supports maintainability, scalability, and clear system boundaries within the payroll architecture.</p>
<h3 id="heading-payroll-controller">Payroll Controller</h3>
<p>This module defines the <code>PayrollController</code>, which serves as the primary HTTP-facing orchestration layer for payroll operations in the system. It exposes RESTful APIs that allow clients to create payrolls, retrieve payroll data, trigger payroll processing, reconcile payment results, authorize bulk transfers, and monitor transaction and account statuses.</p>
<h3 id="heading-controller-responsibilities-1">Controller Responsibilities</h3>
<p>The <code>PayrollController</code> is responsible for:</p>
<ul>
<li><p>Accepting and validating client requests related to payrolls</p>
</li>
<li><p>Managing payroll lifecycle transitions (creation → processing → completion)</p>
</li>
<li><p>Triggering background job execution for bulk payroll disbursement</p>
</li>
<li><p>Reconciling payment results with Monnify</p>
</li>
<li><p>Providing real-time payroll and transaction status visibility</p>
</li>
<li><p>Acting as a safe boundary between external clients and internal services</p>
</li>
</ul>
<p>To get started, create a new file <code>src/controllers/payroll.controller.ts</code>. This is where we’ll define all payroll-related endpoints.</p>
<p>At the top of <code>src/controllers/payroll.controller.ts</code>, we start with the following imports:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Request, Response } <span class="hljs-keyword">from</span> <span class="hljs-string">'express'</span>;
<span class="hljs-keyword">import</span> {
  PayrollModel,
  PayrollItemModel,
  PayrollStatus,
} <span class="hljs-keyword">from</span> <span class="hljs-string">'../models/payroll'</span>;
<span class="hljs-keyword">import</span> { processPayrollItems } <span class="hljs-keyword">from</span> <span class="hljs-string">'../jobs/payroll.processor'</span>;
<span class="hljs-keyword">import</span> { monnifyClient } <span class="hljs-keyword">from</span> <span class="hljs-string">'../config/monnify'</span>;
</code></pre>
<p>Here’s what each of these is responsible for:</p>
<ul>
<li><p><code>Request</code> and <code>Response</code> (from Express): These types give us strongly typed access to incoming HTTP requests and outgoing responses.</p>
</li>
<li><p><code>PayrollModel</code>: This model handles payroll batch operations such as creating payrolls, fetching them, and updating their overall status.</p>
</li>
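<li><p><code>PayrollItemModel</code>: This model represents the individual payroll items (the per-employee payment records) that make up a payroll. It lets us fetch and update those items, especially during processing and reconciliation.</p>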
</li>
<li><p><code>PayrollStatus</code>: This is an enum that defines the valid states of a payroll or payroll item (for example: <code>PENDING</code>, <code>PROCESSING</code>, <code>COMPLETED</code>, <code>FAILED</code>). Using an enum helps keep state transitions explicit and consistent across the system.</p>
</li>
<li><p><code>processPayrollItems</code>: This function is responsible for handing off payroll processing to background workers. Instead of processing payrolls synchronously in the HTTP request, we queue the work and let workers handle it asynchronously.</p>
</li>
<li><p><code>monnifyClient</code>: This is our gateway to the external payment service. We use it to authorize bulk transfers, check transaction statuses, reconcile payments, and fetch account balances.</p>
</li>
</ul>
<p>Together, these imports give the controller everything it needs to process payroll operations.</p>
<p>With our imports in place, we can now define the controller class itself. This class will serve as the single home for all payroll-related endpoints.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">export</span> <span class="hljs-keyword">class</span> PayrollController {
  <span class="hljs-comment">// Payroll endpoints will live here</span>
}
</code></pre>
<h3 id="heading-creating-a-payroll-createpayroll">Creating a Payroll (<code>createPayroll</code>)</h3>
<p>With the controller in place, we’ll begin by implementing the endpoint that creates a payroll. This endpoint initializes a new payroll batch that covers either all employees or a subset selected by their IDs.</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> createPayroll(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { payroll_period, employee_ids } = req.body;

      <span class="hljs-keyword">if</span> (!payroll_period) {
        res.status(<span class="hljs-number">400</span>).json({ error: <span class="hljs-string">'payroll_period is required'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">const</span> processedEmployeeIds = employee_ids
        ? employee_ids
            .map(<span class="hljs-function">(<span class="hljs-params">id: <span class="hljs-built_in">any</span></span>) =&gt;</span> <span class="hljs-built_in">parseInt</span>(id, <span class="hljs-number">10</span>))
            .filter(<span class="hljs-function">(<span class="hljs-params">id: <span class="hljs-built_in">number</span></span>) =&gt;</span> !<span class="hljs-built_in">isNaN</span>(id))
        : <span class="hljs-literal">undefined</span>;

      <span class="hljs-keyword">const</span> payroll = <span class="hljs-keyword">await</span> PayrollModel.create({
        payroll_period,
        employee_ids: processedEmployeeIds,
      });

      res.status(<span class="hljs-number">201</span>).json({
        message: <span class="hljs-string">'Payroll created successfully'</span>,
        data: payroll,
      });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error creating payroll:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to create payroll'</span> });
    }
  }
</code></pre>
<p>Here’s what’s happening in the code:</p>
<ul>
<li><p>The endpoint requires a <code>payroll_period</code> and optionally accepts a list of employee IDs to support partial payroll runs.</p>
</li>
<li><p>Incoming employee IDs are parsed as base-10 integers, and any value that can’t be parsed is dropped before the list is passed to the payroll model.</p>
</li>
<li><p>The controller delegates the actual creation logic to <code>PayrollModel.create</code>, which computes totals and creates payroll items.</p>
</li>
<li><p>On success, the API responds with a <code>201 Created</code> status and the newly created payroll record.</p>
</li>
</ul>
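<p>As a minimal sketch of the normalization step described above, the hypothetical helper below (<code>normalizeEmployeeIds</code> is not part of the project code) applies the same <code>parseInt</code> and <code>isNaN</code> filtering to a raw list. It assumes that non-array input should be treated as “no filter”, which is a choice made for this illustration rather than something the controller does.</p>

```typescript
// Hypothetical helper mirroring the ID normalization in createPayroll.
// Assumption: anything that is not an array means "no employee filter".
function normalizeEmployeeIds(raw: unknown): number[] | undefined {
  if (!Array.isArray(raw)) {
    return undefined;
  }

  return raw
    .map((id) => parseInt(String(id), 10)) // parse each entry as a base-10 integer
    .filter((id) => !isNaN(id)); // drop entries that could not be parsed
}

console.log(normalizeEmployeeIds(['1', '2x', 'abc', 3])); // logs [1, 2, 3]
console.log(normalizeEmployeeIds(undefined)); // logs undefined
```

<p>Note that <code>parseInt</code> accepts leading digits, so a value such as <code>'2x'</code> becomes <code>2</code> instead of being rejected. Stricter validation would need an explicit check.</p>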
<h3 id="heading-fetching-all-payrolls-getallpayrolls">Fetching All Payrolls (<code>getAllPayrolls</code>)</h3>
<p>This endpoint retrieves all payroll batches in the system. It’s typically used for administrative dashboards and payroll history views. The controller simply delegates to <code>PayrollModel.findAll</code> and returns the results in a structured JSON response.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> getAllPayrolls(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> payrolls = <span class="hljs-keyword">await</span> PayrollModel.findAll();
      res.json({ data: payrolls });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error fetching payrolls:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to fetch payrolls'</span> });
    }
  }
</code></pre>
<h3 id="heading-fetching-a-payroll-with-items-getpayrollbyid">Fetching a Payroll with Items (<code>getPayrollById</code>)</h3>
<p>Next, we’ll implement an endpoint to retrieve a single payroll by its ID along with all associated payroll items. This is useful for payroll detail views, where an administrator needs to inspect every payment in a batch.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> getPayrollById(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { id } = req.params;
      <span class="hljs-keyword">const</span> payroll = <span class="hljs-keyword">await</span> PayrollModel.findById(<span class="hljs-built_in">parseInt</span>(id, <span class="hljs-number">10</span>));

      <span class="hljs-keyword">if</span> (!payroll) {
        res.status(<span class="hljs-number">404</span>).json({ error: <span class="hljs-string">'Payroll not found'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">const</span> items = <span class="hljs-keyword">await</span> PayrollItemModel.findByPayrollId(payroll.id);

      res.json({
        data: {
          ...payroll,
          items,
        },
      });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error fetching payroll:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to fetch payroll'</span> });
    }
  }
</code></pre>
<p>In the code, we read the <code>id</code> parameter from the URL and convert it to an integer.</p>
<p>If the payroll does not exist, a <code>404 Not Found</code> response is returned. When found, the controller aggregates payroll metadata and its child payroll items into a single response object, making it convenient for detailed payroll inspection and UI rendering.</p>
<h3 id="heading-processing-a-payroll-processpayroll">Processing a Payroll (<code>processPayroll</code>)</h3>
<p>Next, we implement the <code>processPayroll</code> endpoint. This endpoint initiates payroll execution. Before queuing the payroll for processing, the controller enforces important state checks to prevent duplicate or invalid execution, ensuring payrolls that are already <code>PROCESSING</code> or <code>COMPLETED</code> cannot be reprocessed.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> processPayroll(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { id } = req.params;

      <span class="hljs-keyword">const</span> payroll = <span class="hljs-keyword">await</span> PayrollModel.findById(<span class="hljs-built_in">Number</span>(id));

      <span class="hljs-keyword">if</span> (!payroll) {
        res.status(<span class="hljs-number">404</span>).json({ error: <span class="hljs-string">'Payroll not found'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">if</span> (
        payroll.status === PayrollStatus.COMPLETED ||
        payroll.status === PayrollStatus.PROCESSING
      ) {
        res.status(<span class="hljs-number">400</span>).json({
          error: <span class="hljs-string">`Payroll already <span class="hljs-subst">${payroll.status}</span>`</span>,
        });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-comment">// Queue the payroll for processing</span>
      <span class="hljs-keyword">await</span> processPayrollItems(payroll.id);

      res.json({
        message: <span class="hljs-string">'Payroll queued for bulk processing'</span>,
        data: {
          payroll_id: payroll.id,
          processing_mode: <span class="hljs-string">'bulk'</span>,
        },
      });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error processing payroll:'</span>, error);
      res.status(<span class="hljs-number">500</span>).json({
        error: error.message || <span class="hljs-string">'Failed to process payroll'</span>,
      });
    }
  }
</code></pre>
<p>Here’s what’s happening in the code:</p>
<ul>
<li><p>We get the <code>id</code> parameter from the URL and convert it to a number.</p>
</li>
<li><p>If no payroll is found with the given ID, we return a <code>404 Not Found</code> response.</p>
</li>
<li><p>Before queuing, we check the payroll’s current status. Payrolls that are already <code>PROCESSING</code> or <code>COMPLETED</code> cannot be reprocessed.</p>
</li>
<li><p>Valid payrolls are handed off to <code>processPayrollItems</code>, which runs the bulk execution in background workers (Bull jobs).</p>
</li>
<li><p>Once queued, we respond with a JSON object confirming that the payroll has been queued for bulk processing.</p>
</li>
</ul>
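<p>The state check above can be isolated into a small pure function, which makes the rule easy to test on its own. The sketch below is illustrative only: <code>checkCanProcess</code> is a hypothetical name, and the status values are written as plain strings on the assumption that they match the <code>PayrollStatus</code> enum members.</p>

```typescript
// Status names follow the article; the string values are assumed for this sketch.
type PayrollStatus =
  | 'PENDING'
  | 'PROCESSING'
  | 'COMPLETED'
  | 'FAILED'
  | 'PARTIALLY_COMPLETED';

// Statuses the controller refuses to queue again.
const NON_REPROCESSABLE: PayrollStatus[] = ['PROCESSING', 'COMPLETED'];

// Hypothetical helper: reports whether a payroll in the given state may be queued.
function checkCanProcess(status: PayrollStatus): { ok: boolean; error?: string } {
  if (NON_REPROCESSABLE.indexOf(status) !== -1) {
    return { ok: false, error: `Payroll already ${status}` };
  }
  return { ok: true };
}

console.log(checkCanProcess('PENDING')); // ok: true
console.log(checkCanProcess('COMPLETED')); // ok: false, error: 'Payroll already COMPLETED'
console.log(checkCanProcess('FAILED')); // ok: true
```

<p>Under this rule, payrolls in the <code>FAILED</code> or <code>PARTIALLY_COMPLETED</code> state remain eligible for another run, which matches the check in the controller.</p>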
<h3 id="heading-reconciling-payroll-payments-reconcilepayroll">Reconciling Payroll Payments (<code>reconcilePayroll</code>)</h3>
<p>Next, we’ll implement the endpoint that reconciles payroll payments. This ensures that the statuses of payroll items in our system match the actual payment outcomes from Monnify.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> reconcilePayroll(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { id } = req.params;

      <span class="hljs-keyword">const</span> payroll = <span class="hljs-keyword">await</span> PayrollModel.findById(<span class="hljs-built_in">Number</span>(id));
      <span class="hljs-keyword">if</span> (!payroll) {
        res.status(<span class="hljs-number">404</span>).json({ error: <span class="hljs-string">'Payroll not found'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">const</span> items = <span class="hljs-keyword">await</span> PayrollItemModel.findByPayrollId(<span class="hljs-built_in">Number</span>(id));

      <span class="hljs-keyword">const</span> itemsToReconcile = items.filter(
        <span class="hljs-function">(<span class="hljs-params">item</span>) =&gt;</span> item.transaction_reference
      );

      <span class="hljs-keyword">if</span> (itemsToReconcile.length === <span class="hljs-number">0</span>) {
        res.json({
          message: <span class="hljs-string">'No items to reconcile (no transaction references found)'</span>,
          reconciled: <span class="hljs-number">0</span>,
        });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">let</span> updated = <span class="hljs-number">0</span>;
      <span class="hljs-keyword">let</span> errors = <span class="hljs-number">0</span>;

      <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> item <span class="hljs-keyword">of</span> itemsToReconcile) {
        <span class="hljs-keyword">try</span> {
          <span class="hljs-keyword">const</span> txStatus = <span class="hljs-keyword">await</span> monnifyClient.getTransactionStatus(
            item.transaction_reference!
          );

          <span class="hljs-keyword">const</span> responseBody = txStatus.responseBody || txStatus;
          <span class="hljs-keyword">const</span> paymentStatus =
            responseBody.paymentStatus || responseBody.status;

          <span class="hljs-keyword">if</span> (
            paymentStatus === <span class="hljs-string">'PAID'</span> &amp;&amp;
            item.status !== PayrollStatus.COMPLETED
          ) {
            <span class="hljs-keyword">await</span> PayrollItemModel.updateStatus(
              item.id,
              PayrollStatus.COMPLETED,
              item.transaction_reference
            );
            updated++;
          } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (
            paymentStatus === <span class="hljs-string">'FAILED'</span> &amp;&amp;
            item.status !== PayrollStatus.FAILED
          ) {
            <span class="hljs-keyword">const</span> errorMessage =
              responseBody.paymentDescription ||
              responseBody.failureReason ||
              <span class="hljs-string">'Transaction failed'</span>;
            <span class="hljs-keyword">await</span> PayrollItemModel.updateStatus(
              item.id,
              PayrollStatus.FAILED,
              item.transaction_reference,
              errorMessage
            );
            updated++;
          }
        } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
          errors++;
          <span class="hljs-built_in">console</span>.error(<span class="hljs-string">`Error reconciling item <span class="hljs-subst">${item.id}</span>:`</span>, error.message);
        }
      }

      <span class="hljs-comment">// Update payroll stats</span>
      <span class="hljs-keyword">await</span> PayrollController.updatePayrollStats(<span class="hljs-built_in">Number</span>(id));

      res.json({
        message: <span class="hljs-string">'Payroll reconciled successfully'</span>,
        reconciled: updated,
        errors,
        total: itemsToReconcile.length,
      });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error reconciling payroll:'</span>, error);
      res.status(<span class="hljs-number">500</span>).json({
        error: error.message || <span class="hljs-string">'Failed to reconcile payroll'</span>,
      });
    }
  }
</code></pre>
<p>The endpoint retrieves all payroll items with transaction references and queries Monnify for each transaction’s status. Based on the response, payroll items are updated to either <code>COMPLETED</code> or <code>FAILED</code>, with failure reasons captured where applicable.</p>
<p>Errors during reconciliation are tracked and logged without aborting the entire reconciliation process. After reconciliation, payroll-level statistics are recalculated to ensure consistency between item-level and batch-level states.</p>
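<p>The per-item decision inside the loop can be sketched as a pure function that takes the gateway response and the item’s current status and returns the update to apply, if any. The <code>decideReconciliation</code> name and the <code>TxResponse</code> shape are assumptions for illustration: the shape only lists the fields the controller reads and is not Monnify’s documented schema.</p>

```typescript
// Status names follow the article; the string values are assumed for this sketch.
type ItemStatus = 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';

// Assumed shape: only the fields the controller reads from the response.
interface TxPayload {
  paymentStatus?: string;
  status?: string;
  paymentDescription?: string;
  failureReason?: string;
}

interface TxResponse extends TxPayload {
  responseBody?: TxPayload;
}

interface Decision {
  status: ItemStatus;
  errorMessage?: string;
}

// Returns the update to apply, or null when the item should be left unchanged.
function decideReconciliation(tx: TxResponse, current: ItemStatus): Decision | null {
  // The payload may be wrapped in responseBody or returned at the top level.
  const body = tx.responseBody || tx;
  const paymentStatus = body.paymentStatus || body.status;

  if (paymentStatus === 'PAID') {
    return current === 'COMPLETED' ? null : { status: 'COMPLETED' };
  }

  if (paymentStatus === 'FAILED') {
    if (current === 'FAILED') {
      return null;
    }
    const errorMessage =
      body.paymentDescription || body.failureReason || 'Transaction failed';
    return { status: 'FAILED', errorMessage };
  }

  // Any other payment status leaves the item as it is.
  return null;
}

console.log(decideReconciliation({ responseBody: { paymentStatus: 'PAID' } }, 'PROCESSING'));
console.log(decideReconciliation({ status: 'FAILED', failureReason: 'Insufficient funds' }, 'PROCESSING'));
console.log(decideReconciliation({ paymentStatus: 'PAID' }, 'COMPLETED')); // logs null
```

<p>Keeping the decision separate from the database update means the mapping can be checked with plain inputs, without calling the payment gateway.</p>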
<h3 id="heading-payroll-statistics-update-internal-helper">Payroll Statistics Update (Internal Helper)</h3>
<p>The private <code>updatePayrollStats</code> method recalculates payroll status based on the aggregate states of its payroll items.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">private</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> updatePayrollStats(payrollId: <span class="hljs-built_in">number</span>): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">const</span> items = <span class="hljs-keyword">await</span> PayrollItemModel.findByPayrollId(payrollId);

    <span class="hljs-keyword">const</span> completed = items.filter(
      <span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.COMPLETED
    ).length;
    <span class="hljs-keyword">const</span> failed = items.filter(
      <span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.FAILED
    ).length;
    <span class="hljs-keyword">const</span> total = items.length;

    <span class="hljs-keyword">let</span> status: PayrollStatus;
    <span class="hljs-keyword">if</span> (completed === total) {
      status = PayrollStatus.COMPLETED;
    } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (failed === total) {
      status = PayrollStatus.FAILED;
    } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (completed &gt; <span class="hljs-number">0</span>) {
      status = PayrollStatus.PARTIALLY_COMPLETED;
    } <span class="hljs-keyword">else</span> {
      status = PayrollStatus.PROCESSING;
    }

    <span class="hljs-keyword">await</span> PayrollModel.updateStatus(payrollId, status, completed, failed);
  }
</code></pre>
<p>The helper determines whether a payroll is fully completed, fully failed, partially completed, or still processing, and updates the payroll record accordingly.</p>
<p>Because the status is recomputed from the items themselves, the payroll’s summary status stays in step with the execution state of its underlying payments each time the helper runs.</p>
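<p>To make the aggregation rule concrete, here is the same decision written as a standalone function with a few worked inputs. It is a sketch under assumptions: <code>derivePayrollStatus</code> is a hypothetical name, and the statuses are plain strings standing in for the <code>PayrollStatus</code> enum.</p>

```typescript
// Status names follow the article; the string values are assumed for this sketch.
type PayrollStatus =
  | 'PENDING'
  | 'PROCESSING'
  | 'COMPLETED'
  | 'FAILED'
  | 'PARTIALLY_COMPLETED';

// Hypothetical helper: derives the batch status from the item statuses,
// using the same order of checks as updatePayrollStats.
function derivePayrollStatus(itemStatuses: PayrollStatus[]): PayrollStatus {
  const total = itemStatuses.length;
  const completed = itemStatuses.filter((s) => s === 'COMPLETED').length;
  const failed = itemStatuses.filter((s) => s === 'FAILED').length;

  if (completed === total) return 'COMPLETED';
  if (failed === total) return 'FAILED';
  if (completed > 0) return 'PARTIALLY_COMPLETED';
  return 'PROCESSING';
}

console.log(derivePayrollStatus(['COMPLETED', 'COMPLETED'])); // COMPLETED
console.log(derivePayrollStatus(['FAILED', 'FAILED'])); // FAILED
console.log(derivePayrollStatus(['COMPLETED', 'FAILED'])); // PARTIALLY_COMPLETED
console.log(derivePayrollStatus(['FAILED', 'PENDING'])); // PROCESSING
console.log(derivePayrollStatus([])); // COMPLETED, because 0 === 0
```

<p>The last case shows that a payroll with no items evaluates to <code>COMPLETED</code> under these rules. If empty payrolls can occur in your system, you may want an explicit guard for that case.</p>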
<h3 id="heading-fetching-payroll-status-summary-getpayrollstatus">Fetching Payroll Status Summary (<code>getPayrollStatus</code>)</h3>
<p>Next, we’ll implement the <code>getPayrollStatus</code> endpoint. This endpoint provides a comprehensive status snapshot of a payroll. In addition to returning payroll metadata and items, it computes a summary breakdown of completed, failed, pending, and processing items. This endpoint is particularly useful for real-time dashboards, monitoring tools, and operational visibility.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> getPayrollStatus(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { id } = req.params;
      <span class="hljs-keyword">const</span> payroll = <span class="hljs-keyword">await</span> PayrollModel.findById(<span class="hljs-built_in">parseInt</span>(id, <span class="hljs-number">10</span>));

      <span class="hljs-keyword">if</span> (!payroll) {
        res.status(<span class="hljs-number">404</span>).json({ error: <span class="hljs-string">'Payroll not found'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">const</span> items = <span class="hljs-keyword">await</span> PayrollItemModel.findByPayrollId(payroll.id);

      res.json({
        data: {
          ...payroll,
          items,
          summary: {
            total: items.length,
            completed: items.filter(<span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.COMPLETED)
              .length,
            failed: items.filter(<span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.FAILED)
              .length,
            pending: items.filter(<span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.PENDING)
              .length,
            processing: items.filter(
              <span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.PROCESSING
            ).length,
          },
        },
      });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error fetching payroll status:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to fetch payroll status'</span> });
    }
  }
</code></pre>
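<p>The summary in this response is built with four separate <code>filter</code> passes over the items. As an illustrative alternative, the sketch below tallies every status in a single loop. The <code>summarizeItems</code> name and the string status values are assumptions made for the example. The project itself uses the <code>PayrollStatus</code> enum.</p>

```typescript
// Status names follow the article; the string values are assumed for this sketch.
type ItemStatus = 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';

interface StatusSummary {
  total: number;
  completed: number;
  failed: number;
  pending: number;
  processing: number;
}

// Hypothetical helper: counts item statuses in one pass over the list.
function summarizeItems(items: { status: ItemStatus }[]): StatusSummary {
  const summary: StatusSummary = {
    total: items.length,
    completed: 0,
    failed: 0,
    pending: 0,
    processing: 0,
  };

  for (const item of items) {
    switch (item.status) {
      case 'COMPLETED':
        summary.completed++;
        break;
      case 'FAILED':
        summary.failed++;
        break;
      case 'PENDING':
        summary.pending++;
        break;
      case 'PROCESSING':
        summary.processing++;
        break;
    }
  }

  return summary;
}

const example = summarizeItems([
  { status: 'COMPLETED' },
  { status: 'COMPLETED' },
  { status: 'FAILED' },
  { status: 'PENDING' },
]);
console.log(example);
// total: 4, completed: 2, failed: 1, pending: 1, processing: 0
```

<p>For the batch sizes typical of a payroll run, either approach is fine. The single loop mainly helps if more statuses are added later.</p>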
<h3 id="heading-authorizing-bulk-transfers-authorizebulktransfer">Authorizing Bulk Transfers (<code>authorizeBulkTransfer</code>)</h3>
<p>Next, we’ll implement the <code>authorizeBulkTransfer</code> endpoint. Some bulk disbursements require OTP authorization from Monnify. This endpoint accepts a batch reference and authorization code, validates their presence, and forwards them to the Monnify client for verification. Successful authorization allows the bulk transfer to proceed, while failures are clearly reported to the client.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> authorizeBulkTransfer(
    req: Request,
    res: Response
  ): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { reference, authorizationCode, payrollId } = req.body;

      <span class="hljs-keyword">if</span> (!reference) {
        res.status(<span class="hljs-number">400</span>).json({ error: <span class="hljs-string">'Batch reference is required'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">if</span> (!authorizationCode) {
        res.status(<span class="hljs-number">400</span>).json({ error: <span class="hljs-string">'Authorization code (OTP) is required'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> monnifyClient.authorizeBulkTransfer(
        reference,
        authorizationCode
      );

      res.json({
        message: <span class="hljs-string">'Bulk transfer authorized successfully'</span>,
        data: result,
      });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error authorizing bulk transfer:'</span>, error);
      res.status(<span class="hljs-number">500</span>).json({
        error: error.message || <span class="hljs-string">'Failed to authorize bulk transfer'</span>,
      });
    }
  }
</code></pre>
<p>Here is what’s happening in the code:</p>
<ul>
<li><p>First, we read the batch reference, the OTP, and an optional payroll ID from the request body (the handler shown here does not use the payroll ID).</p>
</li>
<li><p>We return a <code>400 Bad Request</code> if the reference or OTP is missing.</p>
</li>
<li><p>Next, we send the reference and OTP to Monnify to approve the bulk transfer.</p>
</li>
<li><p>If the authorization succeeds, we return a JSON confirmation that includes Monnify’s response.</p>
</li>
</ul>
<h3 id="heading-checking-transaction-status-checktransactionstatus">Checking Transaction Status (<code>checkTransactionStatus</code>)</h3>
<p>This endpoint allows clients or administrators to query the status of an individual transaction using its reference. It delegates the lookup to the Monnify client and returns the raw response, making it useful for debugging, audits, or manual verification workflows.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> checkTransactionStatus(
    req: Request,
    res: Response
  ): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> { reference } = req.params;

      <span class="hljs-keyword">if</span> (!reference) {
        res.status(<span class="hljs-number">400</span>).json({ error: <span class="hljs-string">'Transaction reference is required'</span> });
        <span class="hljs-keyword">return</span>;
      }

      <span class="hljs-keyword">const</span> status = <span class="hljs-keyword">await</span> monnifyClient.getTransactionStatus(reference);
      res.json({ data: status });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error checking transaction status:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to check transaction status'</span> });
    }
  }
</code></pre>
<h3 id="heading-checking-wallet-balance-getaccountbalance">Checking Wallet Balance (<code>getAccountBalance</code>)</h3>
<p>This endpoint retrieves the current balance of the Monnify wallet associated with the payroll contract code. It’s typically used for pre-disbursement checks, monitoring available funds, or administrative reporting.</p>
<pre><code class="lang-typescript">  <span class="hljs-keyword">static</span> <span class="hljs-keyword">async</span> getAccountBalance(req: Request, res: Response): <span class="hljs-built_in">Promise</span>&lt;<span class="hljs-built_in">void</span>&gt; {
    <span class="hljs-keyword">try</span> {
      <span class="hljs-keyword">const</span> balance = <span class="hljs-keyword">await</span> monnifyClient.getAccountBalance();
      res.json({ data: balance });
    } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
      <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error fetching account balance:'</span>, error);
      res
        .status(<span class="hljs-number">500</span>)
        .json({ error: error.message || <span class="hljs-string">'Failed to fetch account balance'</span> });
    }
  }
</code></pre>
<h3 id="heading-error-handling-and-resilience">Error Handling and Resilience</h3>
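<p>All controller methods wrap their logic in <code>try–catch</code> blocks so that unexpected failures are logged and returned as controlled HTTP <code>500</code> responses instead of going unhandled. Note that the handlers include <code>error.message</code> in the response body, which helps with debugging but can expose internal details. In production, consider returning a generic message to the client and keeping the full error in the server logs.</p>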
<h3 id="heading-role-in-the-overall-payroll-architecture-1">Role in the Overall Payroll Architecture</h3>
<p>The <code>PayrollController</code> acts as the central coordinator of the payroll system. It bridges client requests, domain models, background job processing, and external payment services into a cohesive workflow.</p>
<p>By enforcing state transitions, delegating heavy processing to background workers, and providing reconciliation and monitoring capabilities, this controller ensures payroll execution remains reliable, auditable, and scalable in real-world production environments.</p>
<h2 id="heading-setting-up-webhook-handlers">Setting Up Webhook Handlers</h2>
<p>Webhooks are essential for receiving real-time payment status updates from Monnify. When a payment completes or fails, Monnify sends a notification to your webhook endpoint.</p>
<p>Start by creating a new file <code>src/routes/monnify.webhook.ts</code>. This file will contain everything related to handling Monnify webhook events.</p>
<pre><code class="lang-typescript">
<span class="hljs-keyword">import</span> { Router, Request, Response } <span class="hljs-keyword">from</span> <span class="hljs-string">'express'</span>;
<span class="hljs-keyword">import</span> crypto <span class="hljs-keyword">from</span> <span class="hljs-string">'crypto'</span>;
<span class="hljs-keyword">import</span> {
  PayrollItemModel,
  PayrollModel,
  PayrollStatus,
} <span class="hljs-keyword">from</span> <span class="hljs-string">'../models/payroll'</span>;

<span class="hljs-keyword">const</span> router = Router();

<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">verifySignature</span>(<span class="hljs-params">req: Request</span>): <span class="hljs-title">boolean</span> </span>{
  <span class="hljs-keyword">const</span> signature = req.headers[<span class="hljs-string">'monnify-signature'</span>] <span class="hljs-keyword">as</span> <span class="hljs-built_in">string</span>;
  <span class="hljs-keyword">if</span> (!signature) <span class="hljs-keyword">return</span> <span class="hljs-literal">false</span>;

  <span class="hljs-keyword">const</span> secret = process.env.MONNIFY_WEBHOOK_SECRET!;
  <span class="hljs-keyword">const</span> hash = crypto
    .createHmac(<span class="hljs-string">'sha512'</span>, secret)
    .update(<span class="hljs-built_in">JSON</span>.stringify(req.body))
    .digest(<span class="hljs-string">'hex'</span>);

  <span class="hljs-keyword">return</span> hash === signature;
}

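router.post(<span class="hljs-string">'/monnify/webhook'</span>, <span class="hljs-keyword">async</span> (req: Request, res: Response) =&gt; {
  <span class="hljs-keyword">try</span> {
    <span class="hljs-comment">// Reject requests whose signature does not match before touching any data</span>
    <span class="hljs-keyword">if</span> (!verifySignature(req)) {
      <span class="hljs-built_in">console</span>.warn(<span class="hljs-string">'Invalid Monnify signature, rejecting webhook'</span>);
      <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">401</span>).send(<span class="hljs-string">'Invalid signature'</span>);
    }

    <span class="hljs-built_in">console</span>.log(<span class="hljs-string">'Monnify Webhook:'</span>, <span class="hljs-built_in">JSON</span>.stringify(req.body, <span class="hljs-literal">null</span>, <span class="hljs-number">2</span>));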

    <span class="hljs-keyword">const</span> { eventType, eventData } = req.body;

    <span class="hljs-keyword">if</span> (!eventData?.reference) {
      <span class="hljs-built_in">console</span>.warn(<span class="hljs-string">'Missing reference, ignoring webhook'</span>);
      <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">200</span>).send(<span class="hljs-string">'Ignored'</span>);
    }

    <span class="hljs-keyword">const</span> paymentReference = eventData.reference;
    <span class="hljs-keyword">const</span> transactionReference = eventData.transactionReference;
    <span class="hljs-keyword">const</span> description = eventData.transactionDescription || <span class="hljs-string">''</span>;

    <span class="hljs-comment">// Parse our reference format: PAYROLL_{payrollId}_{itemId}</span>
    <span class="hljs-keyword">const</span> [prefix, payrollIdStr, itemIdStr] = paymentReference.split(<span class="hljs-string">'_'</span>);

    <span class="hljs-keyword">if</span> (prefix !== <span class="hljs-string">'PAYROLL'</span>) {
      <span class="hljs-built_in">console</span>.warn(<span class="hljs-string">'Invalid payment reference format:'</span>, paymentReference);
      <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">200</span>).send(<span class="hljs-string">'Ignored'</span>);
    }

    <span class="hljs-keyword">const</span> payrollId = <span class="hljs-built_in">Number</span>(payrollIdStr);
    <span class="hljs-keyword">const</span> itemId = <span class="hljs-built_in">Number</span>(itemIdStr);

    <span class="hljs-keyword">if</span> (<span class="hljs-built_in">isNaN</span>(payrollId) || <span class="hljs-built_in">isNaN</span>(itemId)) {
      <span class="hljs-built_in">console</span>.warn(<span class="hljs-string">'Invalid payroll/item IDs:'</span>, paymentReference);
      <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">200</span>).send(<span class="hljs-string">'Ignored'</span>);
    }

    <span class="hljs-keyword">const</span> item = <span class="hljs-keyword">await</span> PayrollItemModel.findById(itemId);

    <span class="hljs-keyword">if</span> (!item) {
      <span class="hljs-built_in">console</span>.warn(<span class="hljs-string">'Payroll item not found:'</span>, itemId);
      <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">200</span>).send(<span class="hljs-string">'Ignored'</span>);
    }

    <span class="hljs-comment">// Idempotency check - don't process already finalized items</span>
    <span class="hljs-keyword">if</span> (
      item.status === PayrollStatus.COMPLETED ||
      item.status === PayrollStatus.FAILED
    ) {
      <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`Item <span class="hljs-subst">${itemId}</span> already finalized (<span class="hljs-subst">${item.status}</span>)`</span>);
      <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">200</span>).send(<span class="hljs-string">'Already processed'</span>);
    }

    <span class="hljs-comment">// Update status based on event type</span>
    <span class="hljs-keyword">if</span> (
      eventType === <span class="hljs-string">'SUCCESSFUL_DISBURSEMENT'</span> ||
      eventData.status === <span class="hljs-string">'SUCCESS'</span>
    ) {
      <span class="hljs-keyword">await</span> PayrollItemModel.updateStatus(
        itemId,
        PayrollStatus.COMPLETED,
        transactionReference
      );
      <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`✅ Payroll item <span class="hljs-subst">${itemId}</span> COMPLETED`</span>);
    } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (
      eventType === <span class="hljs-string">'FAILED_DISBURSEMENT'</span> ||
      eventType === <span class="hljs-string">'REVERSED_DISBURSEMENT'</span> ||
      eventData.status === <span class="hljs-string">'FAILED'</span>
    ) {
      <span class="hljs-keyword">await</span> PayrollItemModel.updateStatus(
        itemId,
        PayrollStatus.FAILED,
        transactionReference,
        description
      );
      <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`Payroll item <span class="hljs-subst">${itemId}</span> FAILED`</span>);
    } <span class="hljs-keyword">else</span> {
      <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`Unhandled Monnify eventType: <span class="hljs-subst">${eventType}</span>`</span>);
    }

    <span class="hljs-comment">// Update overall payroll stats</span>
    <span class="hljs-keyword">await</span> updatePayrollStats(payrollId);

    <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">200</span>).send(<span class="hljs-string">'OK'</span>);

  } <span class="hljs-keyword">catch</span> (error: <span class="hljs-built_in">any</span>) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Monnify webhook error:'</span>, error.message);
    <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">200</span>).send(<span class="hljs-string">'OK'</span>); <span class="hljs-comment">// Always return 200 to prevent retries</span>
  }
});

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> router;

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">updatePayrollStats</span>(<span class="hljs-params">payrollId: <span class="hljs-built_in">number</span></span>) </span>{
  <span class="hljs-keyword">const</span> items = <span class="hljs-keyword">await</span> PayrollItemModel.findByPayrollId(payrollId);

  <span class="hljs-keyword">const</span> completed = items.filter(
    <span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.COMPLETED
  ).length;

  <span class="hljs-keyword">const</span> failed = items.filter(<span class="hljs-function">(<span class="hljs-params">i</span>) =&gt;</span> i.status === PayrollStatus.FAILED).length;

  <span class="hljs-keyword">let</span> status = PayrollStatus.PROCESSING;

  <span class="hljs-keyword">if</span> (completed === items.length) {
    status = PayrollStatus.COMPLETED;
  } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (failed === items.length) {
    status = PayrollStatus.FAILED;
  } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (completed &gt; <span class="hljs-number">0</span>) {
    status = PayrollStatus.PARTIALLY_COMPLETED;
  }

  <span class="hljs-keyword">await</span> PayrollModel.updateStatus(payrollId, status, completed, failed);
}
</code></pre>
<p>Key webhook implementation details:</p>
<ol>
<li><p><strong>Signature verification</strong>: The <code>verifySignature</code> function validates that webhooks actually come from Monnify.</p>
</li>
<li><p><strong>Idempotency</strong>: The handler checks if an item is already finalized before processing.</p>
</li>
<li><p><strong>Always return 200</strong>: Even on errors, return 200 to prevent Monnify from retrying indefinitely.</p>
</li>
<li><p><strong>Reference parsing</strong>: Our reference format <code>PAYROLL_{payrollId}_{itemId}</code> lets us identify which payment item to update.</p>
</li>
</ol>
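<p>The reference check can be made stricter. In the handler above, <code>Number('')</code> evaluates to <code>0</code>, so a malformed reference such as <code>PAYROLL__7</code> passes the <code>isNaN</code> check, and any segments after the third are silently dropped. Here is a minimal sketch of a stricter parser. The names <code>parsePaymentReference</code> and <code>ParsedReference</code> are hypothetical and not part of the project code:</p>

```typescript
// Hypothetical helper (not part of the article's codebase): strictly parses
// references of the form PAYROLL_{payrollId}_{itemId}.
interface ParsedReference {
  payrollId: number;
  itemId: number;
}

function parsePaymentReference(reference: string): ParsedReference | null {
  // Require the exact prefix and two digits-only IDs, with nothing before or after.
  const match = /^PAYROLL_(\d+)_(\d+)$/.exec(reference);
  if (match === null) {
    return null;
  }
  return { payrollId: Number(match[1]), itemId: Number(match[2]) };
}
```

<p>With a helper like this, the handler could return <code>'Ignored'</code> whenever the result is <code>null</code>, which would cover both the prefix check and the ID checks in one place.</p>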
<h2 id="heading-wiring-up-routes">Wiring Up Routes</h2>
<h3 id="heading-employee-routes">Employee Routes</h3>
<p>We’ll start by defining routes for employee management. These routes expose CRUD operations for employees and simply delegate the actual logic to the <code>EmployeeController</code>.</p>
<p>Create the file <code>src/routes/employee.routes.ts</code>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Router } <span class="hljs-keyword">from</span> <span class="hljs-string">'express'</span>;
<span class="hljs-keyword">import</span> { EmployeeController } <span class="hljs-keyword">from</span> <span class="hljs-string">'../controllers/employee.controller'</span>;

<span class="hljs-keyword">const</span> router = Router();

router.post(<span class="hljs-string">'/'</span>, EmployeeController.createEmployee);
router.get(<span class="hljs-string">'/'</span>, EmployeeController.getAllEmployees);
router.get(<span class="hljs-string">'/:id'</span>, EmployeeController.getEmployeeById);
router.put(<span class="hljs-string">'/:id'</span>, EmployeeController.updateEmployee);
router.delete(<span class="hljs-string">'/:id'</span>, EmployeeController.deleteEmployee);

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> router;
</code></pre>
<p>What this gives us:</p>
<ul>
<li><p>A clean <code>/api/employees</code> entry point for all employee-related operations</p>
</li>
<li><p>Clear separation between routing (URLs) and business logic (controllers)</p>
</li>
<li><p>A predictable REST structure that’s easy to extend later</p>
</li>
</ul>
<h3 id="heading-payroll-routes">Payroll Routes</h3>
<p>Next, we define routes for payroll operations. Payroll is more complex than employees, so this router exposes endpoints for creation, processing, reconciliation, authorization, and monitoring.</p>
<p>Create the file <code>src/routes/payroll.routes.ts</code>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { Router } <span class="hljs-keyword">from</span> <span class="hljs-string">'express'</span>;
<span class="hljs-keyword">import</span> { PayrollController } <span class="hljs-keyword">from</span> <span class="hljs-string">'../controllers/payroll.controller'</span>;

<span class="hljs-keyword">const</span> router = Router();

router.post(<span class="hljs-string">'/'</span>, PayrollController.createPayroll);
router.get(<span class="hljs-string">'/'</span>, PayrollController.getAllPayrolls);
router.get(<span class="hljs-string">'/:id'</span>, PayrollController.getPayrollById);
router.post(<span class="hljs-string">'/:id/process'</span>, PayrollController.processPayroll);
router.post(<span class="hljs-string">'/batch/authorize'</span>, PayrollController.authorizeBulkTransfer);
router.get(<span class="hljs-string">'/:id/status'</span>, PayrollController.getPayrollStatus);
router.get(
  <span class="hljs-string">'/transaction/:reference/status'</span>,
  PayrollController.checkTransactionStatus
);
router.get(<span class="hljs-string">'/account/balance'</span>, PayrollController.getAccountBalance);
router.post(<span class="hljs-string">'/:id/reconcile'</span>, PayrollController.reconcilePayroll);

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> router;
</code></pre>
<p>What’s happening here:</p>
<ul>
<li><p>Each route maps directly to a well-defined payroll operation</p>
</li>
<li><p>Long-running or sensitive actions (processing, reconciliation, authorization) are clearly separated</p>
</li>
<li><p>Monitoring and operational endpoints (status, transaction lookup, balance checks) are first-class citizens</p>
</li>
</ul>
<h3 id="heading-main-application-entry-point">Main Application Entry Point</h3>
<p>With all routes defined, we now bring everything together in the main application file. This is where we configure middleware, register routes, and start the server.</p>
<p>Create the file <code>src/index.ts</code>:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> express, { Application, Request, Response } <span class="hljs-keyword">from</span> <span class="hljs-string">'express'</span>;
<span class="hljs-keyword">import</span> cors <span class="hljs-keyword">from</span> <span class="hljs-string">'cors'</span>;
<span class="hljs-keyword">import</span> helmet <span class="hljs-keyword">from</span> <span class="hljs-string">'helmet'</span>;
<span class="hljs-keyword">import</span> dotenv <span class="hljs-keyword">from</span> <span class="hljs-string">'dotenv'</span>;
<span class="hljs-keyword">import</span> path <span class="hljs-keyword">from</span> <span class="hljs-string">'path'</span>;
<span class="hljs-keyword">import</span> { pool } <span class="hljs-keyword">from</span> <span class="hljs-string">'./config/database'</span>;
<span class="hljs-keyword">import</span> employeeRoutes <span class="hljs-keyword">from</span> <span class="hljs-string">'./routes/employee.routes'</span>;
<span class="hljs-keyword">import</span> payrollRoutes <span class="hljs-keyword">from</span> <span class="hljs-string">'./routes/payroll.routes'</span>;
<span class="hljs-keyword">import</span> monnifyWebhookRoutes <span class="hljs-keyword">from</span> <span class="hljs-string">'./routes/monnify.webhook'</span>;

dotenv.config();

<span class="hljs-keyword">const</span> app: Application = express();
<span class="hljs-keyword">const</span> PORT = process.env.PORT || <span class="hljs-number">3008</span>;

<span class="hljs-comment">// Middleware</span>
app.use(
  helmet({
    contentSecurityPolicy: <span class="hljs-literal">false</span>,
  })
);
app.use(
  cors({
    origin: <span class="hljs-string">'*'</span>,
    methods: [<span class="hljs-string">'GET'</span>, <span class="hljs-string">'POST'</span>, <span class="hljs-string">'PUT'</span>, <span class="hljs-string">'DELETE'</span>, <span class="hljs-string">'OPTIONS'</span>],
    allowedHeaders: [<span class="hljs-string">'Content-Type'</span>, <span class="hljs-string">'Authorization'</span>],
  })
);
app.use(express.json());
app.use(express.urlencoded({ extended: <span class="hljs-literal">true</span> }));

<span class="hljs-comment">// Health check endpoint</span>
app.get(<span class="hljs-string">'/health'</span>, <span class="hljs-keyword">async</span> (req: Request, res: Response) =&gt; {
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">await</span> pool.query(<span class="hljs-string">'SELECT 1'</span>);
    res.json({ status: <span class="hljs-string">'healthy'</span>, database: <span class="hljs-string">'connected'</span> });
  } <span class="hljs-keyword">catch</span> (error) {
    res.status(<span class="hljs-number">500</span>).json({ status: <span class="hljs-string">'unhealthy'</span>, database: <span class="hljs-string">'disconnected'</span> });
  }
});

<span class="hljs-comment">// Routes</span>
app.use(<span class="hljs-string">'/api/employees'</span>, employeeRoutes);
app.use(<span class="hljs-string">'/api/payrolls'</span>, payrollRoutes);
app.use(<span class="hljs-string">'/api'</span>, monnifyWebhookRoutes);

<span class="hljs-comment">// 404 handler</span>
app.use(<span class="hljs-function">(<span class="hljs-params">req: Request, res: Response</span>) =&gt;</span> {
  res.status(<span class="hljs-number">404</span>).json({ error: <span class="hljs-string">'Route not found'</span> });
});

<span class="hljs-comment">// Error handler</span>
app.use(<span class="hljs-function">(<span class="hljs-params">err: <span class="hljs-built_in">any</span>, req: Request, res: Response, next: <span class="hljs-built_in">any</span></span>) =&gt;</span> {
  <span class="hljs-built_in">console</span>.error(<span class="hljs-string">'Error:'</span>, err);
  res.status(err.status || <span class="hljs-number">500</span>).json({
    error: err.message || <span class="hljs-string">'Internal server error'</span>,
  });
});

<span class="hljs-keyword">const</span> server = app.listen(PORT, <span class="hljs-function">() =&gt;</span> {
  <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`Server is running on port <span class="hljs-subst">${PORT}</span>`</span>);
  <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`Environment: <span class="hljs-subst">${process.env.NODE_ENV || <span class="hljs-string">'development'</span>}</span>`</span>);
});

<span class="hljs-comment">// Graceful shutdown: stop accepting new connections, then close the database pool</span>
<span class="hljs-keyword">const</span> shutdown = <span class="hljs-function">(<span class="hljs-params">signal: <span class="hljs-built_in">string</span></span>) =&gt;</span> {
  <span class="hljs-built_in">console</span>.log(<span class="hljs-string">`<span class="hljs-subst">${signal}</span> signal received: closing HTTP server`</span>);
  server.close(<span class="hljs-keyword">async</span> () =&gt; {
    <span class="hljs-keyword">await</span> pool.end();
    process.exit(<span class="hljs-number">0</span>);
  });
};

process.on(<span class="hljs-string">'SIGTERM'</span>, <span class="hljs-function">() =&gt;</span> shutdown(<span class="hljs-string">'SIGTERM'</span>));
process.on(<span class="hljs-string">'SIGINT'</span>, <span class="hljs-function">() =&gt;</span> shutdown(<span class="hljs-string">'SIGINT'</span>));
</code></pre>
<h2 id="heading-testing-the-system">Testing the System</h2>
<p>Now let's test the complete payroll flow.</p>
<p>Start the application:</p>
<pre><code class="lang-bash">docker-compose up -d
npm run dev
</code></pre>
<p>Create employees:</p>
<pre><code class="lang-bash">curl -X POST http://localhost:3008/api/employees \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -d <span class="hljs-string">'{
    "name": "John Doe",
    "email": "john.doe@company.com",
    "salary": 50000,
    "account_number": "0123456789",
    "bank_code": "058",
    "bank_name": "GTBank"
  }'</span>
</code></pre>
<p>Create a few more employees with different salaries so you can see how the payroll handles varying amounts.</p>
<p>Create a payroll:</p>
<pre><code class="lang-bash">curl -X POST http://localhost:3008/api/payrolls \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -d <span class="hljs-string">'{
    "payroll_period": "2024-12"
  }'</span>
</code></pre>
<p>This creates a payroll with all active employees.</p>
<p>Process the payroll:</p>
<pre><code class="lang-bash">curl -X POST http://localhost:3008/api/payrolls/1/process
</code></pre>
<p>This queues the payroll for background processing. The system will:</p>
<ol>
<li><p>Create a bulk transfer request to Monnify</p>
</li>
<li><p>Update each payroll item with a transaction reference</p>
</li>
<li><p>Wait for webhooks to update final status</p>
</li>
</ol>
<p>Authorize the bulk transfer (if OTP is required):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766392280287/4d8ae61f-4ccf-4d63-86a1-a6f72d7286e1.png" alt="Monnify payroll authorization OTP email" class="image--center mx-auto" width="2358" height="1460" loading="lazy"></p>
<p>After processing, Monnify sends an OTP to your registered email. Use it to authorize:</p>
<pre><code class="lang-bash">curl -X POST http://localhost:3008/api/payrolls/batch/authorize \
  -H <span class="hljs-string">"Content-Type: application/json"</span> \
  -d <span class="hljs-string">'{
    "reference": "BATCH_1702123456789",
    "authorizationCode": "123456",
    "payrollId": 1
  }'</span>
</code></pre>
<p>Check the payroll status:</p>
<pre><code class="lang-bash">curl http://localhost:3008/api/payrolls/1/status
</code></pre>
<p>This returns detailed status including a summary of completed, failed, and pending items.</p>
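<p>As a rough illustration of how such a summary could be computed, here is a minimal sketch. The <code>ItemStatus</code> values, the <code>StatusSummary</code> shape, and the <code>summarizeItems</code> helper are assumptions made for this example, so the controller's actual response may look different:</p>

```typescript
// Assumed status values for illustration; the project's PayrollStatus enum may differ.
type ItemStatus = "PENDING" | "PROCESSING" | "COMPLETED" | "FAILED";

// Assumed summary shape, not the controller's actual response type.
interface StatusSummary {
  total: number;
  completed: number;
  failed: number;
  pending: number;
}

function summarizeItems(statuses: ItemStatus[]): StatusSummary {
  const summary: StatusSummary = {
    total: statuses.length,
    completed: 0,
    failed: 0,
    pending: 0,
  };

  statuses.forEach((status) => {
    if (status === "COMPLETED") {
      summary.completed += 1;
    } else if (status === "FAILED") {
      summary.failed += 1;
    } else {
      // Anything that is not finalized yet is counted as pending.
      summary.pending += 1;
    }
  });

  return summary;
}
```

<p>Counting everything that isn't finalized as pending keeps the three numbers adding up to the total, which makes it easy to tell at a glance whether a payroll is still in flight.</p>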
<p>If webhooks were missed or you need to sync statuses with Monnify, reconcile the payroll:</p>
<pre><code class="lang-bash">curl -X POST http://localhost:3008/api/payrolls/1/reconcile
</code></pre>
<h2 id="heading-setting-up-webhooks-for-production">Setting Up Webhooks for Production</h2>
<p>For Monnify to send webhooks to your local development environment, you'll need to expose your local server. You can use ngrok:</p>
<pre><code class="lang-bash">ngrok http 3008
</code></pre>
<p>Then configure the webhook URL in your <a target="_blank" href="https://app.monnify.com/developer#webhook-urls">Monnify dashboard</a>:</p>
<pre><code class="lang-plaintext">https://your-ngrok-url.ngrok.io/api/monnify/webhook
</code></pre>
<p>For production, use your actual server URL and ensure HTTPS is enabled.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766392444369/440bc1a9-7c70-42b0-9157-892f1ef07861.png" alt="Monnify webhook URL configuration" class="image--center mx-auto" width="3024" height="1722" loading="lazy"></p>
<p>Once the transfers are processed, both the successful and the failed transactions will show up on the Monnify dashboard.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766392958199/e8abaa75-5a2f-44fd-b322-b110cf71e92d.png" alt="Monnify dashboard with payroll transaction status" class="image--center mx-auto" width="3024" height="2476" loading="lazy"></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>You've built a complete payroll system that:</p>
<ul>
<li><p>Manages employees with their bank account details</p>
</li>
<li><p>Creates payroll batches with automatic amount calculation</p>
</li>
<li><p>Processes bulk payments using Monnify's disbursement API</p>
</li>
<li><p>Uses background jobs to prevent request timeouts</p>
</li>
<li><p>Handles webhooks for real-time status updates</p>
</li>
<li><p>Supports reconciliation to ensure data consistency</p>
</li>
</ul>
<h3 id="heading-key-takeaways">Key Takeaways</h3>
<ol>
<li><p><strong>Background jobs are essential</strong>: Processing payments synchronously would time out for large payrolls. Bull and Redis provide reliable async processing.</p>
</li>
<li><p><strong>Idempotency matters</strong>: Both the webhook handler and reconciliation process check current status before updating, preventing duplicate processing.</p>
</li>
<li><p><strong>Bulk transfers save time</strong>: Monnify's batch API lets you process hundreds of payments with a single OTP authorization.</p>
</li>
<li><p><strong>Status tracking is critical</strong>: The system tracks status at both the payroll and individual item level, making it easy to identify and handle failures.</p>
</li>
<li><p><strong>Reconciliation is your safety net</strong>: When webhooks fail or get delayed, the reconciliation endpoint ensures your database stays in sync with actual payment status.</p>
</li>
</ol>
<h3 id="heading-references">References:</h3>
<ul>
<li><a target="_blank" href="https://developers.monnify.com/">Monnify Docs</a></li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Deploy a Next.js API with PostgreSQL and Sevalla ]]>
                </title>
                <description>
                    <![CDATA[ When developers think of Next.js, they often associate it with SEO-friendly static websites or React-based frontends. But what many miss is how Next.js can also be used to build full-featured backend APIs – all within the same project. I’ve recently ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-deploy-a-nextjs-api-with-postgresql-and-sevalla/</link>
                <guid isPermaLink="false">68a33084f6c19271552e2ab0</guid>
                
                    <category>
                        <![CDATA[ Next.js ]]>
                    </category>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ APIs ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Manish Shivanandhan ]]>
                </dc:creator>
                <pubDate>Mon, 18 Aug 2025 13:54:12 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755525213723/75759868-d5e9-4ea7-a6be-22bc33dde0d8.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>When developers think of Next.js, they often associate it with SEO-friendly static websites or React-based frontends. But what many miss is how Next.js can also be used to build full-featured backend APIs – all within the same project.</p>
<p>I <a target="_blank" href="https://www.freecodecamp.org/news/how-to-deploy-a-nextjs-api-to-production-using-sevalla/">recently wrote an article</a> on building a Next.js API and deploying it to production. In that article, I used a JSON file as a mini-database.</p>
<p>But JSON or any type of file storage isn’t fit for a production application. This is because file-based storage isn’t designed for concurrent access, so multiple users writing data at the same time can cause corruption or loss.</p>
<p>It also lacks indexing and query capabilities, making it slow as data grows. Backups, security, and scalability are also harder to manage compared to a proper database.</p>
<p>In short, while JSON files work for demos or prototypes, production systems need a database that can handle concurrency, large datasets, complex queries, and reliable persistence.</p>
<p>So in this article, we'll walk through how to build a REST API with Next.js, store data in a Sevalla-managed database, and deploy the whole project to production using Sevalla's <a target="_blank" href="https://www.freecodecamp.org/news/vps-vs-paas-how-to-choose-a-hosting-solution/">PaaS infrastructure</a>.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-is-nextjs">What is Next.js?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-installation-and-setup">Installation and Setup</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-build-a-nextjs-api">How to Build a NextJS API</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-provisioning-a-database-in-sevalla">Provisioning a Database in Sevalla</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-deploying-to-sevalla">Deploying to Sevalla</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-what-is-nextjs"><strong>What is Next.js?</strong></h2>
<p><a target="_blank" href="https://nextjs.org/">Next.js</a> is an open-source React framework developed by Vercel. It's known for server-side rendering, static generation, and seamless routing. But beyond its frontend superpowers, it allows developers to build backend logic and APIs through its file-based routing system. This makes Next.js a great choice for building full-stack apps.</p>
<h2 id="heading-installation-and-setup"><strong>Installation and Setup</strong></h2>
<p>To get started, make sure Node.js and npm are installed.</p>
<pre><code class="lang-bash">$ node --version
v22.16.0

$ npm --version
10.9.2
</code></pre>
<p>Now, create a new Next.js project:</p>
<pre><code class="lang-bash">npx create-next-app@latest
</code></pre>
<p>This command will ask you a series of questions to set up your app:</p>
<pre><code class="lang-plaintext">What is your project named? my-app
Would you like to use TypeScript? No / Yes
Would you like to use ESLint? No / Yes
Would you like to use Tailwind CSS? No / Yes
Would you like your code inside a `src/` directory? No / Yes
Would you like to use App Router? (recommended) No / Yes
Would you like to use Turbopack for `next dev`?  No / Yes
Would you like to customize the import alias (`@/*` by default)? No / Yes
What import alias would you like configured? @/*
</code></pre>
<p>But for this tutorial, we aren’t interested in a full stack app – just an API. So let’s re-create the app using the <code>--api</code> flag.</p>
<pre><code class="lang-plaintext">$ npx create-next-app@latest --api
</code></pre>
<p>It will still ask you a few questions. Use the default settings and finish creating the app.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754476744959/9f1d2763-7df5-491b-8cb3-05161b35fbd9.webp" alt="9f1d2763-7df5-491b-8cb3-05161b35fbd9" class="image--center mx-auto" width="1682" height="1208" loading="lazy"></p>
<p>Once the setup is done, you can see the folder with your app name. Let’s go into the folder and run the app.</p>
<pre><code class="lang-plaintext">$ npm run dev
</code></pre>
<p>Your API template should be running at port 3000. Go to <a target="_blank" href="http://localhost:3000/">http://localhost:3000</a> and you should see the following message:</p>
<pre><code class="lang-plaintext">{
"message": "Hello world!"
}
</code></pre>
<h2 id="heading-how-to-build-a-nextjs-api"><strong>How to Build a NextJS API</strong></h2>
<p>Now that we’ve set up our API template, let's write a basic REST API with two endpoints: one to create data and one to view data.</p>
<p>The API code will reside under /app within the project directory. Next.js uses file-based routing for building URL paths.</p>
<p>For example, if you want a URL path /users, you should have a directory called “users” with a route.ts file to handle all the CRUD operations for /users. For /users/:id, you should have a directory called [id] under “users” directory with a route.ts file. The square brackets are to tell Next.js that you expect dynamic values for the /users/:id route.</p>
<p>Here is a screenshot of the setup. Delete the [slug] directory that comes with the project since it won’t be relevant for us.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754479396056/a80a0fd3-707d-4813-b402-041561354c94.png" alt="Folder setup" class="image--center mx-auto" width="400" height="288" loading="lazy"></p>
<ul>
<li><p>The route.ts file at the bottom handles CRUD operations for / (this is where the response “hello world” was generated from)</p>
</li>
<li><p>The route.ts file under /users handles CRUD operations for /users</p>
</li>
</ul>
<p>While this setup can seem complicated for a simple project, it provides a clear structure for large-scale web applications. If you want to go deeper into building complex APIs with Next.js, <a target="_blank" href="https://nextjs.org/blog/building-apis-with-nextjs">here is a tutorial</a> you can follow.</p>
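<p>To make the folder-to-URL convention concrete, here is a small illustrative helper. It only mimics the mapping described above and is not how Next.js resolves routes internally. The name <code>routeFileToUrlPattern</code> is hypothetical, and the sketch assumes every path starts with <code>app</code> and ends with <code>route.ts</code>:</p>

```typescript
// Illustrative only: converts a route file path into the URL pattern it serves.
// Assumes paths look like "app/.../route.ts".
function routeFileToUrlPattern(filePath: string): string {
  const segments = filePath.split("/").filter((segment) => segment !== "");

  // Drop the leading "app" directory and the trailing "route.ts" file name.
  const urlSegments = segments
    .slice(1, -1)
    // A folder named "[id]" marks a dynamic segment, written here as ":id".
    .map((segment) => segment.replace(/^\[(.+)\]$/, ":$1"));

  return "/" + urlSegments.join("/");
}
```

<p>For example, <code>app/users/route.ts</code> maps to <code>/users</code>, while <code>app/users/[id]/route.ts</code> maps to <code>/users/:id</code>.</p>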
<p>The code under /app/route.ts is the default file for our API. You can see it serving the GET request and responding with “Hello World!”:</p>
<pre><code class="lang-plaintext">import { NextResponse } from "next/server";

export async function GET() {
  return NextResponse.json({ message: "Hello world!" });
}
</code></pre>
<p>Now we need two routes:</p>
<ul>
<li><p>GET /users which lists all users</p>
</li>
<li><p>POST /users which creates a new user</p>
</li>
</ul>
<p>For this project, we’ll use a database to store our records. We’re not going to install a database on our local machine. Instead, we’ll provision the database in the cloud and use it with our API. This approach is common in test / prod environments to ensure data consistency.</p>
<h2 id="heading-provisioning-a-database-in-sevalla">Provisioning a Database in Sevalla</h2>
<p><a target="_blank" href="https://sevalla.com/">Sevalla</a> is a modern, usage-based Platform-as-a-service provider and an alternative to sites like Heroku or to your self-managed setup on AWS. It combines powerful features with a smooth developer experience.</p>
<p>Sevalla offers application hosting, database, object storage, and static site hosting for your projects. It comes with a generous free tier, so we’ll use it to connect to a database as well as deploy our app to the cloud.</p>
<p>If you are new to Sevalla, you can <a target="_blank" href="https://sevalla.com/signup/">sign up</a> using your GitHub account to enable direct deploys from your GitHub. Every time you push code to your project, Sevalla will auto-pull and deploy your app to the cloud.</p>
<p>Once you log in to Sevalla, click on “Databases”.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754477578430/7b7fa655-0f35-4901-90be-07bd5abdf2c0.png" alt="Sevalla Databases" class="image--center mx-auto" width="2930" height="1292" loading="lazy"></p>
<p>Now let’s create a <a target="_blank" href="https://www.freecodecamp.org/news/posgresql-course-for-beginners/">PostgreSQL</a> database.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754477639118/d6ea82ae-45c9-40a7-bcf5-d144885db929.png" alt="Create Postgresql Database" class="image--center mx-auto" width="2366" height="1726" loading="lazy"></p>
<p>Use the default settings. Once the database is created, external connections are disabled by default for security, so nothing outside our server can connect to it. Since we want to test the connection from our local machine, let’s enable external connections.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754479205197/58c01504-59c0-4df3-b9f9-cb14e1431135.png" alt="Database settings" class="image--center mx-auto" width="2420" height="1500" loading="lazy"></p>
<p>The value we need to connect to the database from our local machine is the “url” under external connection. Create a file called .env in the project root and paste the URL in the following format:</p>
<pre><code class="lang-plaintext">PGSQL_URL=postgres://&lt;username&gt;:&lt;password&gt;-@asia-east1-001.proxy.kinsta.app:30503/&lt;db_name&gt;
</code></pre>
<p>The reason we use .env is to store environment variables specific to the environment. In production, we won’t need this file (never push .env files to GitHub). Sevalla will give us the option to add environment variables via the GUI when we deploy the app.</p>
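<p>A missing or empty <code>PGSQL_URL</code> tends to surface only later, when the connection attempt fails, so it can help to check the variable up front. Here is a minimal sketch, assuming a hypothetical <code>requireEnv</code> helper that isn't part of the generated project. The environment object is passed in explicitly so the function is easy to test:</p>

```typescript
// Hypothetical helper: fail fast when a required environment variable is missing.
type Env = { [key: string]: string | undefined };

function requireEnv(name: string, env: Env): string {
  const value = env[name];
  // Treat undefined and blank values the same way.
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

<p>In a route handler, you could then call <code>requireEnv("PGSQL_URL", process.env)</code> and pass the result as the <code>connectionString</code>, so a misconfigured deployment reports a clear error message.</p>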
<p>Now let’s test our database connection. Install the <code>pg</code> package so Node can interact with PostgreSQL. Let’s also install <code>@types/pg</code>, which provides the TypeScript type definitions for <code>pg</code>.</p>
<pre><code class="lang-bash">$ npm install pg
$ npm install --save-dev @types/pg
</code></pre>
<p>Change the route.ts that served “hello world” to the below:</p>
<pre><code class="lang-typescript"><span class="hljs-comment">// app/route.ts</span>
<span class="hljs-keyword">import</span> { NextResponse } <span class="hljs-keyword">from</span> <span class="hljs-string">"next/server"</span>;
<span class="hljs-keyword">import</span> { Client } <span class="hljs-keyword">from</span> <span class="hljs-string">"pg"</span>;

<span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">GET</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">const</span> client = <span class="hljs-keyword">new</span> Client({
    connectionString: process.env.PGSQL_URL,
  });

  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">await</span> client.connect();
    <span class="hljs-keyword">await</span> client.end();
    <span class="hljs-keyword">return</span> NextResponse.json({ message: <span class="hljs-string">"Connected to database"</span> });
  } <span class="hljs-keyword">catch</span> (error) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">"Database connection error:"</span>, error);
    <span class="hljs-keyword">return</span> NextResponse.json({ message: <span class="hljs-string">"Connection failed"</span> }, { status: <span class="hljs-number">500</span> });
  }
}
</code></pre>
<p>Now when you run your app and go to localhost:3000, it should say “Connected to database”.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754485714515/c63f11fc-2310-462a-9b42-c0528e500637.png" alt="Postgresql successful connection" class="image--center mx-auto" width="884" height="224" loading="lazy"></p>
<p>Great. Now let’s write our two routes, one to create data and the other to view the data we created. Use this code under users/route.ts:</p>
<pre><code class="lang-typescript"><span class="hljs-keyword">import</span> { NextResponse } <span class="hljs-keyword">from</span> <span class="hljs-string">"next/server"</span>;
<span class="hljs-keyword">import</span> <span class="hljs-keyword">type</span> { NextRequest } <span class="hljs-keyword">from</span> <span class="hljs-string">"next/server"</span>;
<span class="hljs-keyword">import</span> { Client } <span class="hljs-keyword">from</span> <span class="hljs-string">"pg"</span>;

<span class="hljs-comment">// Define the structure of a User object</span>
<span class="hljs-keyword">interface</span> User {
  id: <span class="hljs-built_in">string</span>;
  name: <span class="hljs-built_in">string</span>;
  email: <span class="hljs-built_in">string</span>;
  age: <span class="hljs-built_in">number</span>;
}

<span class="hljs-comment">// Create a PostgreSQL client</span>
<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">getClient</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">return</span> <span class="hljs-keyword">new</span> Client({
    connectionString: process.env.PGSQL_URL,
  });
}

<span class="hljs-comment">// Fetch all users from the database</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">readUsers</span>(<span class="hljs-params"></span>): <span class="hljs-title">Promise</span>&lt;<span class="hljs-title">User</span>[]&gt; </span>{
  <span class="hljs-keyword">const</span> client = getClient();
  <span class="hljs-keyword">await</span> client.connect();

  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> result = <span class="hljs-keyword">await</span> client.query(<span class="hljs-string">"SELECT id, name, email, age FROM users"</span>);
    <span class="hljs-keyword">return</span> result.rows;
  } <span class="hljs-keyword">finally</span> {
    <span class="hljs-keyword">await</span> client.end();
  }
}

<span class="hljs-comment">// Insert or update users in the database</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">writeUsers</span>(<span class="hljs-params">users: User[]</span>) </span>{
  <span class="hljs-keyword">const</span> client = getClient();
  <span class="hljs-keyword">await</span> client.connect();

  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> insertQuery = <span class="hljs-string">`
      INSERT INTO users (id, name, email, age)
      VALUES ($1, $2, $3, $4)
      ON CONFLICT (id) DO UPDATE SET
        name = EXCLUDED.name,
        email = EXCLUDED.email,
        age = EXCLUDED.age;
    `</span>;

    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">const</span> user <span class="hljs-keyword">of</span> users) {
      <span class="hljs-keyword">await</span> client.query(insertQuery, [user.id, user.name, user.email, user.age]);
    }
  } <span class="hljs-keyword">finally</span> {
    <span class="hljs-keyword">await</span> client.end();
  }
}

<span class="hljs-comment">// Handle GET request: return list of users</span>
<span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">GET</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> users = <span class="hljs-keyword">await</span> readUsers();
    <span class="hljs-keyword">return</span> NextResponse.json(users);
  } <span class="hljs-keyword">catch</span> (err) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">"Error reading users from DB:"</span>, err);
    <span class="hljs-keyword">return</span> NextResponse.json({ error: <span class="hljs-string">"Failed to fetch users"</span> }, { status: <span class="hljs-number">500</span> });
  }
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">POST</span>(<span class="hljs-params">req: NextRequest</span>) </span>{
  <span class="hljs-keyword">try</span> {
    <span class="hljs-keyword">const</span> body = <span class="hljs-keyword">await</span> req.json();
    <span class="hljs-keyword">const</span> users: User[] = <span class="hljs-built_in">Array</span>.isArray(body) ? body : [body];

    <span class="hljs-keyword">await</span> writeUsers(users);

    <span class="hljs-keyword">return</span> NextResponse.json({ success: <span class="hljs-literal">true</span>, count: users.length });
  } <span class="hljs-keyword">catch</span> (err) {
    <span class="hljs-built_in">console</span>.error(<span class="hljs-string">"Error writing users to DB:"</span>, err);
    <span class="hljs-keyword">return</span> NextResponse.json({ error: <span class="hljs-string">"Failed to write users"</span> }, { status: <span class="hljs-number">500</span> });
  }
}
</code></pre>
<p>Now when you go to localhost:3000/users, it will give you an error because the users table doesn’t exist yet. So let’s create one.</p>
<p>In the database UI, click on “Studio”. You’ll get a visual editor for your database where you can manage your data directly (pretty cool, right?).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754486852876/2437c76a-562a-4575-9cc0-a5f563aa6206.png" alt="Database studio" class="image--center mx-auto" width="2934" height="1016" loading="lazy"></p>
<p>Press the “+” icon and choose “create table”. Create the table with the schema below. Click the “add column” link to create new columns.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754487097219/9d01d9b7-e3c6-427b-9b42-c97065826af7.png" alt="Database Schema" class="image--center mx-auto" width="1128" height="526" loading="lazy"></p>
<p>Click “create table” and you should see the table created as below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754487162119/c64e0577-d094-4549-85f4-ab8c8d15f48e.png" alt="Users table" class="image--center mx-auto" width="2942" height="1082" loading="lazy"></p>
<p>Let’s add a dummy record using the “add record” button so we can test our API with it. The id field should be in UUID format (you can <a target="_blank" href="https://www.uuidgenerator.net/">generate one here</a>).</p>
<p>Now let’s test our API.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754487408705/3fd9784e-3a83-415d-870f-f3f5d23dec51.png" alt="3fd9784e-3a83-415d-870f-f3f5d23dec51" class="image--center mx-auto" width="1088" height="386" loading="lazy"></p>
<p>You should see the user you created as the response to the localhost:3000/users query. Now let’s create a new user using our API.</p>
<p>We’ll use <a target="_blank" href="https://www.postman.com/">Postman</a> for this since it makes POST requests easy to create. We’ll send some sample data under “body” → “raw” → “JSON”.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754543941765/a77be1b8-05c3-4c61-a5c3-f7f0fbf48b4d.png" alt="Post Request" class="image--center mx-auto" width="1680" height="804" loading="lazy"></p>
<p>The response from Postman should be as below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754544001954/5a52331c-a445-4b10-8c4b-9337ca873c13.png" alt="Postman results" class="image--center mx-auto" width="1048" height="328" loading="lazy"></p>
<p>Now going to localhost:3000/users, you should see the new record created.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754544086690/8c4533c6-4250-42e1-a850-b52c460775fc.png" alt="Get /users" class="image--center mx-auto" width="1194" height="676" loading="lazy"></p>
<p>Great job! Now let’s get this app live.</p>
<h2 id="heading-deploying-to-sevalla">Deploying to Sevalla</h2>
<p>Push your code to GitHub or <a target="_blank" href="https://github.com/manishmshiva/nextjs-api-pgsql">fork my repository</a>. Now let’s go to Sevalla and create a new app.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754545093624/9747a06d-0dcf-482a-89b9-732b9937b1dc.png" alt="Sevalla create app" class="image--center mx-auto" width="3006" height="1252" loading="lazy"></p>
<p>Choose your repository from the dropdown and check “Automatic deployment on commit”. This will ensure that the deployment is automatic every time you push code. Choose “Hobby” under the resources section.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754545136001/dde5fe4d-4691-401c-a3ef-959b8e53f62a.png" alt="Sevalla Create New App" class="image--center mx-auto" width="2964" height="1442" loading="lazy"></p>
<p>Click “Create” and not “Create and deploy”. We haven’t added our PostgreSQL URL as an environment variable, so the app will crash if you try to deploy it.</p>
<p>Go to the “Environment variables” section and add the key “PGSQL_URL” and the URL in the value field.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754545371348/7525c6cd-63af-40b2-80c5-6b49b6101f19.png" alt="7525c6cd-63af-40b2-80c5-6b49b6101f19" class="image--center mx-auto" width="2954" height="1316" loading="lazy"></p>
<p>Now go back to the “Overview” section and click “Deploy now”.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754545664510/c3f12f86-0732-4518-bf51-4867ac86abdd.png" alt="c3f12f86-0732-4518-bf51-4867ac86abdd" class="image--center mx-auto" width="2490" height="1678" loading="lazy"></p>
<p>Once deployment is complete, click “Visit app” to get the live URL of your API. You can replace localhost:3000 with the new URL in Postman and test your API.</p>
<p>Congratulations – your app is now live. You can do more with your app using the admin interface, like:</p>
<ul>
<li><p>Monitor the performance of your app</p>
</li>
<li><p>Watch real-time logs</p>
</li>
<li><p>Add custom domains</p>
</li>
<li><p>Update network settings (open/close ports for security, and so on)</p>
</li>
<li><p>Add more storage</p>
</li>
</ul>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Next.js is no longer just a frontend framework. It’s a powerful full-stack platform that lets you build and deploy production-ready APIs with minimal friction. By pairing it with Sevalla’s developer-friendly infrastructure, you can go from local development to a live, cloud-hosted API in minutes.</p>
<p>In this tutorial, you learned how to set up a Next.js API project, connect it to a cloud-hosted PostgreSQL database on Sevalla, and deploy everything seamlessly. Whether you're building a small side project or a full-scale application, this stack gives you the speed, structure, and scalability to move fast without losing flexibility.</p>
<p>Hope you enjoyed this article. I’ll see you soon with another one. You can <a target="_blank" href="https://manishshivanandhan.com/">connect with me here</a> or <a target="_blank" href="https://blog.manishshivanandhan.com/">visit my blog</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Deploy Your FastAPI + PostgreSQL App on Render: A Beginner's Guide ]]>
                </title>
                <description>
                    <![CDATA[ This guide is a comprehensive roadmap for deploying a FastAPI backend connected to a PostgreSQL database using Render, a cloud platform that supports hosting Python web apps and managed PostgreSQL databases.   You can find the complete source code he... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/deploy-fastapi-postgresql-app-on-render/</link>
                <guid isPermaLink="false">682f4900bcc94cb9bccbf905</guid>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                    <category>
                        <![CDATA[ render.com ]]>
                    </category>
                
                    <category>
                        <![CDATA[ FastAPI ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Preston Osoro ]]>
                </dc:creator>
                <pubDate>Thu, 22 May 2025 15:55:44 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747923566699/58fc1283-d2f5-4964-acfa-b5dcad0f3d4f.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>This guide is a comprehensive roadmap for deploying a FastAPI backend connected to a PostgreSQL database using <a target="_blank" href="https://render.com/">Render</a>, a cloud platform that supports hosting Python web apps and managed PostgreSQL databases.  </p>
<p>You can find the complete source code <a target="_blank" href="https://github.com/preston-56/FastAPI">here</a>.</p>
<h2 id="heading-deployment-context">Deployment Context</h2>
<p>When deploying a FastAPI app connected to PostgreSQL, you need to select a platform that supports Python web applications and managed databases. This guide uses Render as the example platform because it provides both web hosting and a PostgreSQL database service in one environment, making it straightforward to connect your backend with the database.</p>
<p>You can apply the concepts here to other cloud providers as well, but the steps will differ depending on the platform’s specifics.</p>
<h3 id="heading-heres-what-well-cover">Here’s what we’ll cover:</h3>
<ol>
<li><p><a class="post-section-overview" href="#heading-project-structure">Project Structure for a Real-World FastAPI App</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-youll-need-before-you-start">What You'll Need Before You Start</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-deployment-steps">Deployment Steps</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-step-1-set-up-local-postgresql-database">Step 1: Set Up Local PostgreSQL Database</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-2-set-up-your-database-connection">Step 2: Set Up Your Database Connection</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-3-configure-your-fastapi-main-application">Step 3: Configure Your FastAPI Main Application</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-4-create-a-requirements-file">Step 4: Create a Requirements File</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-5-provision-a-postgresql-database-on-render">Step 5: Provision a PostgreSQL Database on Render</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-6-deploy-your-fastapi-app-on-render">Step 6: Deploy Your FastAPI App on Render</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-7-test-your-api-endpoints">Step 7: Test Your API Endpoints</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-local-development-workflow">Local Development Workflow</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-best-practices-and-tips">Best Practices and Tips</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-common-issues-and-solutions">Common Issues and Solutions</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ol>
<h2 id="heading-project-structure">Project Structure</h2>
<p>If you’re building a real-world API with <a target="_blank" href="https://fastapi.tiangolo.com/">FastAPI</a>, you’ll quickly outgrow a single <code>main.py</code> file. That’s when a modular project structure becomes essential for maintainability.</p>
<p>Here’s an example structure we’ll use throughout this guide:</p>
<pre><code class="lang-python">FastAPI/
├── database/
│   ├── base.py
│   ├── database.py
│   └── __init__.py
├── fastapi_app/
│   └── main.py
├── items/
│   ├── models/
│   │   ├── __init__.py
│   │   └── item.py
│   ├── routes/
│   │   ├── __init__.py
│   │   └── item.py
│   └── schemas/
│       ├── __init__.py
│       └── item.py
├── models/
│   └── __init__.py
├── orders/
│   ├── models/
│   │   ├── __init__.py
│   │   └── order.py
│   ├── routes/
│   │   ├── __init__.py
│   │   └── order.py
│   └── schemas/
│       ├── __init__.py
│       └── order.py
└── users/
    ├── models/
    │   ├── __init__.py
    │   └── user.py
    ├── routes/
    │   ├── __init__.py
    │   └── user.py
    └── schemas/
        ├── __init__.py
        └── user.py
</code></pre>
<h2 id="heading-what-youll-need-before-you-start">What You'll Need Before You Start</h2>
<p>Before diving in, make sure you've got:</p>
<ul>
<li><p>A free <a target="_blank" href="https://render.com/">Render</a> account (sign up if you don't have one)</p>
</li>
<li><p>A GitHub or GitLab repository for your FastAPI project</p>
</li>
<li><p>Basic familiarity with Python, FastAPI, and Git</p>
</li>
<li><p>Your project structure set up similarly to the example above</p>
</li>
</ul>
<h2 id="heading-deployment-steps">Deployment Steps</h2>
<h3 id="heading-step-1-set-up-local-postgresql-database">Step 1: Set Up Local PostgreSQL Database</h3>
<p>For local development, you'll need to set up PostgreSQL on your machine like this:</p>
<pre><code class="lang-sql"><span class="hljs-comment">-- 1. Log in as superuser</span>
psql -U postgres

<span class="hljs-comment">-- 2. Create a new database</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">DATABASE</span> your_db;

<span class="hljs-comment">-- 3. Create a user with password</span>
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">USER</span> your_user <span class="hljs-keyword">WITH</span> <span class="hljs-keyword">PASSWORD</span> <span class="hljs-string">'your_secure_password'</span>;

<span class="hljs-comment">-- 4. Grant all privileges on the database</span>
<span class="hljs-keyword">GRANT</span> <span class="hljs-keyword">ALL</span> <span class="hljs-keyword">PRIVILEGES</span> <span class="hljs-keyword">ON</span> <span class="hljs-keyword">DATABASE</span> your_db <span class="hljs-keyword">TO</span> your_user;

<span class="hljs-comment">-- 5. (Optional) Allow the user to create new databases</span>
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">USER</span> your_user CREATEDB;

<span class="hljs-comment">-- 6. Exit</span>
\q
</code></pre>
<p>After setting up your local database, create a <code>.env</code> file in your project root:</p>
<pre><code class="lang-bash">DATABASE_URL=postgresql://your_user:your_secure_password@localhost:5432/your_db
</code></pre>
<h3 id="heading-step-2-set-up-your-database-connection">Step 2: Set Up Your Database Connection</h3>
<p>Create <code>database/database.py</code> to manage your PostgreSQL connection with SQLAlchemy. This file is crucial: it creates the database engine, defines session management, and provides a dependency function for your routes.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> sqlalchemy <span class="hljs-keyword">import</span> create_engine
<span class="hljs-keyword">from</span> sqlalchemy.orm <span class="hljs-keyword">import</span> sessionmaker
<span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

load_dotenv()

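DATABASE_URL = os.getenv(<span class="hljs-string">"DATABASE_URL"</span>)
<span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> DATABASE_URL:
    <span class="hljs-keyword">raise</span> RuntimeError(<span class="hljs-string">"DATABASE_URL environment variable is not set"</span>)

<span class="hljs-comment"># The engine manages the connection to the database and handles query execution.</span>
engine = create_engine(DATABASE_URL)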
SessionLocal = sessionmaker(autocommit=<span class="hljs-literal">False</span>, autoflush=<span class="hljs-literal">False</span>, bind=engine)

<span class="hljs-comment"># Database dependency for routes</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_db</span>():</span>
    db = SessionLocal()
    <span class="hljs-keyword">try</span>:
        <span class="hljs-keyword">yield</span> db
    <span class="hljs-keyword">finally</span>:
        db.close()
</code></pre>
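<p>To see why <code>get_db</code> wraps the <code>yield</code> in <code>try</code>/<code>finally</code>, here is a minimal sketch that drives the same generator pattern by hand. <code>FakeSession</code> is a hypothetical stand-in for a real SQLAlchemy session, and the manual <code>next()</code> and <code>close()</code> calls only approximate what FastAPI does around a request:</p>
<pre><code class="lang-python">class FakeSession:
    """Hypothetical stand-in for a SQLAlchemy session (illustration only)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def get_db(session_factory=FakeSession):
    # Same shape as the dependency above: hand out a session, always close it.
    db = session_factory()
    try:
        yield db
    finally:
        db.close()


dependency = get_db()
session = next(dependency)       # runs up to the yield; a route would use the session here
still_open = not session.closed  # the session stays open while the route runs
dependency.close()               # finishing the generator triggers the finally block
</code></pre>
<p>After the last line, <code>session.closed</code> is <code>True</code>. The same cleanup runs if the route raises an exception, which is the main reason to prefer this pattern over closing the session manually.</p>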
<p>And add <code>database/base.py</code> for the base class:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Since SQLAlchemy 1.4, declarative_base is imported from sqlalchemy.orm</span>
<span class="hljs-keyword">from</span> sqlalchemy.orm <span class="hljs-keyword">import</span> declarative_base

Base = declarative_base()
</code></pre>
<h3 id="heading-step-3-configure-your-fastapi-main-application">Step 3: Configure Your FastAPI Main Application</h3>
<p>Create the main FastAPI application file, <code>fastapi_app/main.py</code>, which imports all your route modules and registers them with the app:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> os
<span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, APIRouter
<span class="hljs-keyword">from</span> fastapi.openapi.utils <span class="hljs-keyword">import</span> get_openapi
<span class="hljs-keyword">from</span> fastapi.security <span class="hljs-keyword">import</span> OAuth2PasswordBearer
<span class="hljs-keyword">import</span> uvicorn
<span class="hljs-keyword">from</span> dotenv <span class="hljs-keyword">import</span> load_dotenv

<span class="hljs-comment"># Load environment variables</span>
load_dotenv()

<span class="hljs-comment"># Database imports</span>
<span class="hljs-keyword">from</span> database <span class="hljs-keyword">import</span> Base, engine

<span class="hljs-comment"># Import models to ensure they're registered with SQLAlchemy</span>
<span class="hljs-keyword">import</span> models

<span class="hljs-comment"># Import router modules</span>
<span class="hljs-keyword">from</span> items.routes <span class="hljs-keyword">import</span> item_router
<span class="hljs-keyword">from</span> orders.routes <span class="hljs-keyword">import</span> order_router
<span class="hljs-keyword">from</span> users.routes <span class="hljs-keyword">import</span> user_router

<span class="hljs-comment"># Initialize FastAPI app</span>
app = FastAPI(
    title=<span class="hljs-string">"Store API"</span>,
    version=<span class="hljs-string">"1.0.0"</span>,
    description=<span class="hljs-string">"API documentation for Store API"</span>
)

<span class="hljs-comment"># Create database tables on startup</span>
Base.metadata.create_all(bind=engine)

<span class="hljs-comment"># Root endpoint</span>
<span class="hljs-meta">@app.get("/")</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">root</span>():</span>
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"message"</span>: <span class="hljs-string">"Welcome to FastAPI Store"</span>}

<span class="hljs-comment"># Setup versioned API router and include module routers</span>
api_router = APIRouter(prefix=<span class="hljs-string">"/v1"</span>)
api_router.include_router(item_router)
api_router.include_router(order_router)
api_router.include_router(user_router)

<span class="hljs-comment"># Register the master router with the app</span>
app.include_router(api_router)

<span class="hljs-comment"># Setup OAuth2 scheme for Swagger UI login flow</span>
oauth2_scheme = OAuth2PasswordBearer(tokenUrl=<span class="hljs-string">"/v1/auth/login"</span>)

<span class="hljs-comment"># Custom OpenAPI schema with security configuration</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">custom_openapi</span>():</span>
    <span class="hljs-keyword">if</span> app.openapi_schema:
        <span class="hljs-keyword">return</span> app.openapi_schema

    openapi_schema = get_openapi(
        title=app.title,
        version=app.version,
        description=app.description,
        routes=app.routes,
    )

    <span class="hljs-comment"># Add security scheme</span>
    openapi_schema[<span class="hljs-string">"components"</span>][<span class="hljs-string">"securitySchemes"</span>] = {
        <span class="hljs-string">"BearerAuth"</span>: {
            <span class="hljs-string">"type"</span>: <span class="hljs-string">"http"</span>,
            <span class="hljs-string">"scheme"</span>: <span class="hljs-string">"bearer"</span>,
            <span class="hljs-string">"bearerFormat"</span>: <span class="hljs-string">"JWT"</span>,
        }
    }

    <span class="hljs-comment"># Apply global security requirement</span>
    openapi_schema[<span class="hljs-string">"security"</span>] = [{<span class="hljs-string">"BearerAuth"</span>: []}]

    app.openapi_schema = openapi_schema
    <span class="hljs-keyword">return</span> app.openapi_schema

app.openapi = custom_openapi

<span class="hljs-comment"># Run the app using Uvicorn when executed directly</span>
<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    port = os.environ.get(<span class="hljs-string">"PORT"</span>)
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> port:
        <span class="hljs-keyword">raise</span> EnvironmentError(<span class="hljs-string">"PORT environment variable is not set"</span>)
    uvicorn.run(<span class="hljs-string">"fastapi_app.main:app"</span>, host=<span class="hljs-string">"0.0.0.0"</span>, port=int(port), reload=<span class="hljs-literal">False</span>)
</code></pre>
<h3 id="heading-step-4-create-a-requirements-file">Step 4: Create a Requirements File</h3>
<p>In your project root, create a <code>requirements.txt</code> file that includes all the necessary dependencies:</p>
<pre><code class="lang-python">fastapi&gt;=<span class="hljs-number">0.68</span><span class="hljs-number">.0</span>
uvicorn&gt;=<span class="hljs-number">0.15</span><span class="hljs-number">.0</span>
sqlalchemy&gt;=<span class="hljs-number">1.4</span><span class="hljs-number">.23</span>
psycopg2-binary&gt;=<span class="hljs-number">2.9</span><span class="hljs-number">.1</span>
python-dotenv&gt;=<span class="hljs-number">0.19</span><span class="hljs-number">.0</span>
pydantic&gt;=<span class="hljs-number">1.8</span><span class="hljs-number">.2</span>
</code></pre>
<h3 id="heading-step-5-provision-a-postgresql-database-on-render">Step 5: Provision a PostgreSQL Database on Render</h3>
<p>Log in to your Render dashboard at <a target="_blank" href="https://dashboard.render.com/login">dashboard.render.com</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747782796468/e7564ed7-66cd-4466-a1d0-913b93dc9a77.png" alt="Render dashboard" class="image--center mx-auto" width="2564" height="1672" loading="lazy"></p>
<p>Then click "<strong>New +</strong>" in the top right and select "<strong>PostgreSQL</strong>".</p>
<p>Fill in the details:</p>
<ul>
<li><p>Name: <code>your-app-db</code> (choose a descriptive name)</p>
</li>
<li><p>Database: <code>your_app</code> (this will be your database name)</p>
</li>
<li><p>User: leave default (auto-generated)</p>
</li>
<li><p>Region: Choose the closest to your target users</p>
</li>
<li><p>Plan: Free tier</p>
</li>
</ul>
<p>Save and note the Internal Database URL shown after creation, which will look something like this:</p>
<pre><code class="lang-bash">postgres://user:password@postgres-instance.render.com/your_app
</code></pre>
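<p>One thing to watch for: the example URL above starts with <code>postgres://</code>, while SQLAlchemy 1.4 and later only recognizes the <code>postgresql://</code> scheme. If your URL uses the shorter form, a small helper can rewrite it before it reaches <code>create_engine</code>. This is a sketch, and <code>normalize_database_url</code> is a hypothetical name that is not part of the original project:</p>
<pre><code class="lang-python">def normalize_database_url(url):
    """Rewrite a leading postgres:// scheme to postgresql://; leave other URLs unchanged."""
    old_prefix = "postgres://"
    if url.startswith(old_prefix):
        return "postgresql://" + url[len(old_prefix):]
    return url


# Example using the placeholder URL format shown above
example = normalize_database_url(
    "postgres://user:password@postgres-instance.render.com/your_app"
)
print(example)
</code></pre>
<p>If you adopt something like this, one option is to apply it to the value read from <code>DATABASE_URL</code> in <code>database/database.py</code>, so the same code works with either form of the URL.</p>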
<h3 id="heading-step-6-deploy-your-fastapi-app-on-render">Step 6: Deploy Your FastAPI App on Render</h3>
<p>With your database provisioned, it's time to deploy your API. You can do that by following these steps:</p>
<ol>
<li><p>In Render dashboard, click "<strong>New +</strong>" and select "<strong>Web Service</strong>"</p>
</li>
<li><p>Connect your GitHub/GitLab repository</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747813206325/5338209e-eb5c-4ba2-b28a-511296220935.png" alt="Connect to GitHub/GitLab" class="image--center mx-auto" width="1847" height="341" loading="lazy"></p>
</li>
<li><p>Name your service</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747813320278/e21998cc-317b-4ea6-8dec-d52493e2969f.png" alt="Naming your service" class="image--center mx-auto" width="1840" height="964" loading="lazy"></p>
</li>
<li><p><strong>Then configure the build settings</strong>:</p>
<ul>
<li><p>Environment: <code>Python 3</code></p>
</li>
<li><p>Build Command: <code>pip install -r requirements.txt</code></p>
</li>
<li><p>Start Command: <code>python3 -m fastapi_app.main</code></p>
</li>
</ul>
</li>
<li><p><strong>Add your environment variables</strong>:</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747813450598/6b0913b0-3081-44c4-b746-6b28549a2dd0.png" alt="Adding environment variables" class="image--center mx-auto" width="1494" height="509" loading="lazy"></p>
<ul>
<li><p>Click "Environment" tab</p>
</li>
<li><p>Add your database URL:</p>
<ul>
<li><p>Key: <code>DATABASE_URL</code></p>
</li>
<li><p>Value: Paste the <strong>Internal Database URL</strong> from your PostgreSQL service</p>
</li>
</ul>
</li>
<li><p>Add any other environment variables your application needs</p>
</li>
</ul>
</li>
<li><p>Finally, click <strong>Deploy Web Service</strong>.</p>
<ul>
<li><p>Render will start building and deploying your application</p>
</li>
<li><p>This process takes a few minutes. You can monitor logs during build and deployment in real-time</p>
</li>
</ul>
</li>
</ol>
<h3 id="heading-step-7-test-your-api-endpoints">Step 7: Test Your API Endpoints</h3>
<p>Once deployed, access your API’s URL (for example, <a target="_blank" href="https://your-app-name.onrender.com"><code>https://your-app-name.onrender.com</code></a>).</p>
<p>Navigate to <code>/docs</code> to open the interactive Swagger UI, where you can test your endpoints directly:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1747783210993/95ea29a5-d2aa-430f-a107-ef25c8ab4e24.png" alt="Test endpoints in Swagger" width="1511" height="790" loading="lazy"></p>
<ul>
<li><p>Expand an endpoint</p>
</li>
<li><p>Click <strong>Try it out</strong></p>
</li>
<li><p>Provide any required input</p>
</li>
<li><p>Click <strong>Execute</strong></p>
</li>
<li><p>View the response</p>
</li>
</ul>
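<p>If you'd rather check the deployment from a script than from the browser, a small smoke test against the root endpoint defined in Step 3 could look like the sketch below. The function name <code>fetch_root_message</code> and its <code>opener</code> parameter are illustrative assumptions, and the URL in the commented-out call is the placeholder from this guide, not a real service:</p>
<pre><code class="lang-python">import json
from urllib.request import urlopen


def fetch_root_message(base_url, opener=urlopen):
    """Request the API root and return the "message" field of its JSON body."""
    # The opener is injectable so the function can be exercised without a network call.
    with opener(base_url.rstrip("/") + "/") as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["message"]


# Against a live deployment (replace the placeholder with your own URL):
# print(fetch_root_message("https://your-app-name.onrender.com"))
</code></pre>
<p>With the root endpoint from Step 3, the returned value should be <code>Welcome to FastAPI Store</code>. Any other result, or an exception, suggests the service isn't up yet or the URL is wrong.</p>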
<h2 id="heading-local-development-workflow">Local Development Workflow</h2>
<p>While your app is deployed, you'll still need to work on it locally. Here's how to maintain a smooth development workflow:</p>
<p>First, create a local <code>.env</code> file (don't commit this to Git):</p>
<pre><code class="lang-python">DATABASE_URL=postgresql://username:password@localhost:<span class="hljs-number">5432</span>/your_local_db
</code></pre>
<p>Then install your dependencies in a virtual environment:</p>
<pre><code class="lang-bash">python3 -m venv venv
<span class="hljs-built_in">source</span> venv/bin/activate  <span class="hljs-comment"># Windows: venv\Scripts\activate</span>
pip install -r requirements.txt
</code></pre>
<p>Next, run your local server:</p>
<pre><code class="lang-bash">python3 -m fastapi_app.main
</code></pre>
<p>This command triggers the <code>__main__</code> block in <code>fastapi_app/main.py</code>, which starts the FastAPI app using Uvicorn. It reads <code>PORT</code> from your environment and raises an error if it's missing, so make sure it's set (for example, by adding a line such as <code>PORT=8000</code> to your <code>.env</code> file).</p>
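<p>If you'd prefer the app to start locally even when <code>PORT</code> is not set, one alternative is to fall back to a default instead of raising an error. The sketch below is not what the <code>main.py</code> in this guide does. <code>resolve_port</code> and the default of <code>8000</code> are assumptions made for illustration:</p>
<pre><code class="lang-python">import os

DEFAULT_PORT = 8000  # assumed local default, not taken from the original project


def resolve_port(env=None, default=DEFAULT_PORT):
    """Return the port from the PORT variable, or a local default when it is unset."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "").strip()
    if not raw:
        return default
    port = int(raw)  # raises ValueError for non-numeric values
    if port not in range(1, 65536):
        raise ValueError(f"PORT out of range: {port}")
    return port


print(resolve_port({}))                  # falls back to the default
print(resolve_port({"PORT": "10000"}))   # uses the value provided by the platform
</code></pre>
<p>The trade-off is that a missing <code>PORT</code> in production would go unnoticed, so the stricter check in <code>main.py</code> is a reasonable choice for the deployed service.</p>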
<p>Make your changes and test them locally before pushing to GitHub or GitLab. Pushing your changes automatically triggers a new deployment on Render.</p>
<h2 id="heading-best-practices-and-tips">Best Practices and Tips</h2>
<ol>
<li><p><strong>Use database migrations</strong>: Add Alembic to your project to manage schema changes:</p>
<pre><code class="lang-bash"> pip install alembic
 alembic init migrations
</code></pre>
</li>
<li><p><strong>Separate development and production configurations</strong>:</p>
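<pre><code class="lang-python"> <span class="hljs-keyword">import</span> os

 <span class="hljs-comment"># DEBUG is a placeholder setting; substitute your own configuration values</span>
 <span class="hljs-keyword">if</span> os.environ.get(<span class="hljs-string">"ENVIRONMENT"</span>) == <span class="hljs-string">"production"</span>:
     DEBUG = <span class="hljs-literal">False</span>  <span class="hljs-comment"># Production settings</span>
 <span class="hljs-keyword">else</span>:
     DEBUG = <span class="hljs-literal">True</span>  <span class="hljs-comment"># Development settings</span>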
</code></pre>
</li>
<li><p><strong>Monitor your application</strong>:</p>
<ul>
<li>Render provides logs and metrics for your application. You can set up alerts for errors or high resource usage.</li>
</ul>
</li>
<li><p><strong>Optimize database queries</strong>:</p>
<ul>
<li><p>Use SQLAlchemy's relationship loading options.</p>
</li>
<li><p>Consider adding indexes to frequently queried fields.</p>
</li>
</ul>
</li>
<li><p><strong>Scale when needed</strong>:</p>
<ul>
<li>Render allows you to upgrade your plan as your application grows. Consider upgrading your database plan for production applications.</li>
</ul>
</li>
</ol>
<h2 id="heading-common-issues-and-solutions">Common Issues and Solutions</h2>
<p>When deploying a Python web app on Render, a few issues can commonly occur. Here's a more detailed look at them and how you can resolve each one.</p>
<h3 id="heading-database-connection-errors"><strong>Database connection errors</strong></h3>
<p>If your app can’t connect to the database, first double-check that your <code>DATABASE_URL</code> environment variable is correctly set in your Render dashboard. Make sure the URL includes the right username, password, host, port, and database name.</p>
<p>Also, confirm that your SQLAlchemy models match the actual schema in your database. A mismatch here can lead to errors during migrations or app startup. If you're using Postgres, ensure that the database user has permission to read/write tables and perform migrations.</p>
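<p>If you want a quick way to spot an incomplete connection string, here's a minimal sketch that uses only Python's standard library. The helper name <code>missing_url_parts</code> is made up for this example and isn't part of Render or SQLAlchemy. It only reports which pieces of the URL are absent, and it doesn't try to connect to anything.</p>
<pre><code class="lang-python">import os
from urllib.parse import urlsplit


def missing_url_parts(url):
    """Return the names of the connection parts that are absent from url."""
    parts = urlsplit(url)
    found = {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }
    return [name for name, value in found.items() if not value]


if __name__ == "__main__":
    # Falls back to an empty string when DATABASE_URL is not set at all.
    url = os.environ.get("DATABASE_URL", "")
    print("Missing parts:", missing_url_parts(url) or "none")
</code></pre>
<p>A missing port isn't always a mistake, since PostgreSQL clients fall back to 5432 when none is given. Treat the output as a hint about where to look rather than a strict validation.</p>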
<h3 id="heading-deployment-fails-entirely"><strong>Deployment fails entirely</strong></h3>
<p>When deployment fails, Render usually provides helpful logs under the “Events” tab. Check there for any error messages. A few common culprits include:</p>
<ul>
<li><p>A missing <code>requirements.txt</code> file or forgotten dependencies.</p>
</li>
<li><p>A bad <code>start</code> command in the Render settings. Double-check that it points to your correct entry point (for example, <code>gunicorn app:app</code> or <code>uvicorn main:app --host=0.0.0.0 --port=10000</code>).</p>
</li>
<li><p>An unsupported or mismatched Python version. On Render, you can pin the version by setting the <code>PYTHON_VERSION</code> environment variable (for example, <code>3.11.1</code>).</p>
</li>
</ul>
<h3 id="heading-api-returns-500-internal-server-errors"><strong>API returns 500 Internal Server errors</strong></h3>
<p>Internal server errors can happen for several reasons. To debug:</p>
<ul>
<li><p>Open your Render logs and look for Python tracebacks or unhandled exceptions.</p>
</li>
<li><p>Try to reproduce the issue locally using the same request and data.</p>
</li>
<li><p>Add <code>try/except</code> blocks around critical logic to capture and log errors more gracefully.</p>
</li>
</ul>
<p>Even better, set up structured logging or error tracking (for example, with Sentry) to catch these before your users do.</p>
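<p>To make the <code>try/except</code> advice concrete, here's a small framework-agnostic sketch. The names <code>safe_call</code> and <code>divide</code> are hypothetical and exist only for illustration. In a real FastAPI app you'd more likely put the same idea in an exception handler, but the logging pattern is the same.</p>
<pre><code class="lang-python">import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("api")


def safe_call(handler, *args, **kwargs):
    """Run handler and turn an unexpected exception into a logged 500 payload."""
    try:
        return 200, handler(*args, **kwargs)
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("Unhandled error in %s", handler.__name__)
        return 500, {"detail": "Internal Server Error"}


def divide(a, b):
    # Hypothetical piece of critical logic that can fail.
    return a / b


print(safe_call(divide, 6, 3))
print(safe_call(divide, 1, 0))
</code></pre>
<p>The second call still returns a response instead of crashing, and the traceback ends up in your logs, which is where you'd look for it in the Render dashboard.</p>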
<h3 id="heading-slow-response-times"><strong>Slow response times</strong></h3>
<p>If your app is slow or intermittently timing out, check:</p>
<ul>
<li><p>Whether you're still on the free Render tier, which has limited CPU and memory. Consider upgrading if you’re handling production-level traffic.</p>
</li>
<li><p>Whether you're running heavy or unoptimized database queries. PostgreSQL's <code>EXPLAIN ANALYZE</code> shows how a query is executed, and passing <code>echo=True</code> to SQLAlchemy's <code>create_engine()</code> logs every SQL statement your app sends.</p>
</li>
<li><p>Whether you’re frequently fetching the same data. If so, try caching it with a lightweight in-memory cache like <code>functools.lru_cache</code> or with a Redis instance.</p>
</li>
</ul>
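<p>Here's a rough sketch of the in-memory caching idea using <code>functools.lru_cache</code>. The function <code>get_country_name</code> and its lookup table are invented for this example, and the <code>calls</code> counter simply stands in for an expensive database query so you can see how often the real work runs.</p>
<pre><code class="lang-python">from functools import lru_cache

calls = {"count": 0}


@lru_cache(maxsize=128)
def get_country_name(code):
    """Hypothetical slow lookup; the counter stands in for a database query."""
    calls["count"] += 1
    return {"NG": "Nigeria", "KE": "Kenya"}.get(code, "Unknown")


get_country_name("NG")
get_country_name("NG")  # served from the cache, no second lookup
print(calls["count"])
</code></pre>
<p>Keep in mind that <code>lru_cache</code> lives inside a single process and never expires entries on its own. If the underlying data changes, call <code>get_country_name.cache_clear()</code>, or reach for Redis when several workers need to share the same cache.</p>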
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Deploying a FastAPI app connected to PostgreSQL on Render is straightforward with the right structure and setup. While this guide used Render as an example, the concepts apply broadly across cloud platforms.</p>
<p>With this setup, you can develop, test, and deploy robust Python APIs backed by PostgreSQL databases efficiently.</p>
<p>The free tier on Render has some limitations, including PostgreSQL databases that expire after 90 days unless upgraded. For production applications, consider upgrading to a paid plan for better performance and reliability.</p>
<p>Happy coding!</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Use PostgreSQL in Django ]]>
                </title>
                <description>
                    <![CDATA[ If you’re building a Django project and wondering which database to use, PostgreSQL is a great choice. It’s reliable, fast, packed with powerful features, and works beautifully with Django. I’ve used it across multiple projects – from small web apps ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-use-postgresql-in-django/</link>
                <guid isPermaLink="false">68079d18a34a9e3143fe9e56</guid>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Django ]]>
                    </category>
                
                    <category>
                        <![CDATA[ PostgreSQL ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Udemezue John ]]>
                </dc:creator>
                <pubDate>Tue, 22 Apr 2025 13:43:52 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1745329406033/4d3cb010-d612-4ca8-8039-2d922e8b0337.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>If you’re building a Django project and wondering which database to use, PostgreSQL is a great choice. It’s reliable, fast, packed with powerful features, and works beautifully with Django.</p>
<p>I’ve used it across multiple projects – from small web apps to large-scale platforms – and it never disappoints.</p>
<p>In this post, I’ll walk you through how to connect PostgreSQL with Django step-by-step.</p>
<p>Let’s get started.</p>
<h3 id="heading-what-well-cover">What we’ll cover:</h3>
<ol>
<li><p><a class="post-section-overview" href="#heading-why-use-postgresql-with-django">Why Use PostgreSQL with Django?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-youll-need">What You'll Need</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-use-postgresql-in-django">How to Use PostgreSQL in Django</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-step-1-install-postgresql">Step 1: Install PostgreSQL</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-2-install-the-postgresql-adapter-for-python">Step 2: Install the PostgreSQL Adapter for Python</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-3-create-a-django-project-if-you-havent-yet">Step 3: Create a Django Project (If You Haven’t Yet)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-4-create-a-postgresql-database">Step 4: Create a PostgreSQL Database</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-5-update-django-settings-to-use-postgresql">Step 5: Update Django Settings to Use PostgreSQL</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-6-run-migrations">Step 6: Run Migrations</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-7-test-the-connection">Step 7: Test the Connection</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-common-issues-and-fixes">Common Issues (and Fixes)</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-optional-use-dj-database-url-for-better-settings">Optional: Use dj-database-url for Better Settings</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-frequently-asked-questions">Frequently Asked Questions</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-is-postgresql-better-than-sqlite-for-django">Is PostgreSQL better than SQLite for Django?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-do-i-need-to-install-postgresql-on-my-production-server">Do I need to install PostgreSQL on my production server?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-is-psycopg2-binary-safe-to-use-in-production">Is psycopg2-binary safe to use in production?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-can-i-switch-from-sqlite-to-postgresql-mid-project">Can I switch from SQLite to PostgreSQL mid-project?</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-further-resources">Further Resources</a></p>
</li>
</ol>
<h2 id="heading-why-use-postgresql-with-django">Why Use PostgreSQL with Django?</h2>
<p>PostgreSQL is a popular, open-source relational database that’s known for its performance, flexibility, and powerful features like:</p>
<ul>
<li><p>Advanced data types (JSON, arrays, and so on)</p>
</li>
<li><p>Full-text search</p>
</li>
<li><p>Support for complex queries</p>
</li>
<li><p>Data integrity and reliability</p>
</li>
</ul>
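<p>Django also ships a dedicated <code>django.contrib.postgres</code> module with PostgreSQL-only fields and features, such as <code>ArrayField</code> and full-text search, that its other supported backends don't offer. If you're planning to build a serious web application, PostgreSQL is usually the best database to pair with Django.</p>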
<h2 id="heading-what-youll-need">What You’ll Need</h2>
<p>Before we begin, make sure you have the following:</p>
<ul>
<li><p>Python installed (3.8 or higher, which Django 4.x requires)</p>
</li>
<li><p>Django installed (I’ll be using version 4.x)</p>
</li>
<li><p>PostgreSQL installed and running</p>
</li>
<li><p><code>psycopg2</code> or <code>psycopg2-binary</code> (This is the adapter that lets Django talk to PostgreSQL)</p>
</li>
</ul>
<h2 id="heading-how-to-use-postgresql-in-django">How to Use PostgreSQL in Django</h2>
<p>Here is how to get started:</p>
<h3 id="heading-step-1-install-postgresql">Step 1: Install PostgreSQL</h3>
<p>If you haven’t installed PostgreSQL yet, you can grab it from the <a target="_blank" href="https://www.postgresql.org/download/">official PostgreSQL website</a>. It works on Windows, macOS, and Linux.</p>
<p>Make sure you remember the username, password, and database name when you set it up – you’ll need those later.</p>
<h3 id="heading-step-2-install-the-postgresql-adapter-for-python">Step 2: Install the PostgreSQL Adapter for Python</h3>
<p>Django needs a little help to connect with PostgreSQL. That’s where <code>psycopg2</code> comes in.</p>
<p>You can install it using pip:</p>
<pre><code class="lang-bash">pip install psycopg2-binary
</code></pre>
<p>Tip: The <code>-binary</code> version is easier to install and works for most people. If you run into issues later, you can switch to <code>psycopg2</code> (non-binary).</p>
<h3 id="heading-step-3-create-a-django-project-if-you-havent-yet">Step 3: Create a Django Project (If You Haven’t Yet)</h3>
<p>If you haven’t created a project yet, start with:</p>
<pre><code class="lang-bash">django-admin startproject myproject
<span class="hljs-built_in">cd</span> myproject
</code></pre>
<p>This will give you the basic project structure.</p>
<h3 id="heading-step-4-create-a-postgresql-database">Step 4: Create a PostgreSQL Database</h3>
<p>Now, open your PostgreSQL client (like <code>psql</code> or pgAdmin), and create a new database and user:</p>
<pre><code class="lang-sql"><span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">DATABASE</span> mydatabase;
<span class="hljs-keyword">CREATE</span> <span class="hljs-keyword">USER</span> myuser <span class="hljs-keyword">WITH</span> <span class="hljs-keyword">PASSWORD</span> <span class="hljs-string">'mypassword'</span>;
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">ROLE</span> myuser <span class="hljs-keyword">SET</span> client_encoding <span class="hljs-keyword">TO</span> <span class="hljs-string">'utf8'</span>;
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">ROLE</span> myuser <span class="hljs-keyword">SET</span> default_transaction_isolation <span class="hljs-keyword">TO</span> <span class="hljs-string">'read committed'</span>;
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">ROLE</span> myuser <span class="hljs-keyword">SET</span> timezone <span class="hljs-keyword">TO</span> <span class="hljs-string">'UTC'</span>;
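<span class="hljs-keyword">GRANT</span> <span class="hljs-keyword">ALL</span> <span class="hljs-keyword">PRIVILEGES</span> <span class="hljs-keyword">ON</span> <span class="hljs-keyword">DATABASE</span> mydatabase <span class="hljs-keyword">TO</span> myuser;
<span class="hljs-comment">-- On PostgreSQL 15+, also make myuser the owner so migrations can create tables in the public schema</span>
<span class="hljs-keyword">ALTER</span> <span class="hljs-keyword">DATABASE</span> mydatabase OWNER <span class="hljs-keyword">TO</span> myuser;
</code></pre>
<p>This sets up a database and a user with the permissions Django needs to run migrations. Replace <code>mydatabase</code>, <code>myuser</code>, and <code>mypassword</code> with whatever values you prefer.</p>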
<h3 id="heading-step-5-update-django-settings-to-use-postgresql">Step 5: Update Django Settings to Use PostgreSQL</h3>
<p>Now it’s time to tell Django to use your new PostgreSQL database.</p>
<p>Open <code>myproject/settings.py</code> and look for the <code>DATABASES</code> setting. Replace the default <code>sqlite3</code> section with this:</p>
<pre><code class="lang-python">DATABASES = {
    <span class="hljs-string">'default'</span>: {
        <span class="hljs-string">'ENGINE'</span>: <span class="hljs-string">'django.db.backends.postgresql'</span>,
        <span class="hljs-string">'NAME'</span>: <span class="hljs-string">'mydatabase'</span>,
        <span class="hljs-string">'USER'</span>: <span class="hljs-string">'myuser'</span>,
        <span class="hljs-string">'PASSWORD'</span>: <span class="hljs-string">'mypassword'</span>,
        <span class="hljs-string">'HOST'</span>: <span class="hljs-string">'localhost'</span>,
        <span class="hljs-string">'PORT'</span>: <span class="hljs-string">'5432'</span>,
    }
}
</code></pre>
<p>This tells Django to:</p>
<ul>
<li><p>Use PostgreSQL (<code>django.db.backends.postgresql</code>)</p>
</li>
<li><p>Connect to a local database called <code>mydatabase</code></p>
</li>
<li><p>Use the user and password you set up earlier</p>
</li>
</ul>
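<p>Hardcoding the password is fine for a first local test, but you probably don't want it committed to Git. As a minimal sketch, you could read the same values from environment variables instead. The <code>POSTGRES_*</code> variable names below are just an assumption for this example, so use whatever names fit your setup.</p>
<pre><code class="lang-python">import os

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        # POSTGRES_* are hypothetical variable names chosen for this sketch.
        'NAME': os.environ.get('POSTGRES_DB', 'mydatabase'),
        'USER': os.environ.get('POSTGRES_USER', 'myuser'),
        'PASSWORD': os.environ.get('POSTGRES_PASSWORD', ''),
        'HOST': os.environ.get('POSTGRES_HOST', 'localhost'),
        'PORT': os.environ.get('POSTGRES_PORT', '5432'),
    }
}
</code></pre>
<p>Each <code>os.environ.get()</code> call falls back to the second argument when the variable isn't set, so the project still starts on a fresh machine while real credentials stay out of your code.</p>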
<h3 id="heading-step-6-run-migrations">Step 6: Run Migrations</h3>
<p>Now that everything’s connected, let’s create the database tables Django needs:</p>
<pre><code class="lang-bash">python manage.py migrate
</code></pre>
<p>If everything’s working, you’ll see Django creating tables in PostgreSQL. No errors? You’re good to go!</p>
<h3 id="heading-step-7-test-the-connection">Step 7: Test the Connection</h3>
<p>Let’s test it all by creating a superuser (admin account):</p>
<pre><code class="lang-bash">python manage.py createsuperuser
</code></pre>
<p>Follow the prompts, then run:</p>
<pre><code class="lang-bash">python manage.py runserver
</code></pre>
<p>Open your browser and go to <code>http://127.0.0.1:8000/admin</code>. Log in with your superuser account. You’ll be in the Django admin dashboard – and yes, all of this is backed by PostgreSQL now!</p>
<h2 id="heading-common-issues-and-fixes">Common Issues (and Fixes)</h2>
<p>Here are a few things that might trip you up:</p>
<ul>
<li><p><strong>Error:</strong> <code>psycopg2.errors.UndefinedTable</code>: This usually means you forgot to run <code>python manage.py migrate</code>.</p>
</li>
<li><p><strong>Can’t connect to database:</strong> Double-check the database name, user, password, host, and port in <code>settings.py</code>, and make sure the PostgreSQL server is running.</p>
</li>
<li><p><strong>Role doesn’t exist:</strong> You might’ve forgotten to create the user in PostgreSQL, or you used the wrong name in <code>settings.py</code>.</p>
</li>
</ul>
<h2 id="heading-optional-use-dj-database-url-for-better-settings">Optional: Use <code>dj-database-url</code> for Better Settings</h2>
<p>If you’re planning to deploy your app later (especially on services like Heroku), managing your database settings through a URL is cleaner.</p>
<p>Install the helper package:</p>
<pre><code class="lang-bash">pip install dj-database-url
</code></pre>
<p>Then in your <code>settings.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> dj_database_url

DATABASES = {
    <span class="hljs-string">'default'</span>: dj_database_url.config(default=<span class="hljs-string">'postgres://myuser:mypassword@localhost:5432/mydatabase'</span>)
}
</code></pre>
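<p>Here, <code>dj_database_url.config()</code> reads the connection string from the <code>DATABASE_URL</code> environment variable and falls back to the <code>default</code> URL when that variable isn't set. This lets you control your database config from the environment, which is more secure and flexible.</p>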
<h2 id="heading-frequently-asked-questions">Frequently Asked Questions</h2>
<h3 id="heading-is-postgresql-better-than-sqlite-for-django"><strong>Is PostgreSQL better than SQLite for Django?</strong></h3>
<p>For learning or small projects, SQLite is fine. But for serious apps with lots of users or advanced queries, PostgreSQL is much better.</p>
<h3 id="heading-do-i-need-to-install-postgresql-on-my-production-server"><strong>Do I need to install PostgreSQL on my production server?</strong></h3>
<p>Yes – unless you’re using a hosted PostgreSQL solution like <a target="_blank" href="https://aws.amazon.com/rds/postgresql/">Amazon RDS</a>, <a target="_blank" href="https://devcenter.heroku.com/articles/heroku-postgresql">Heroku Postgres</a>, or <a target="_blank" href="https://supabase.com/">Supabase</a>.</p>
<h3 id="heading-is-psycopg2-binary-safe-to-use-in-production"><strong>Is psycopg2-binary safe to use in production?</strong></h3>
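<p>It works in most cases, but the psycopg documentation advises using the package built from source (<code>psycopg2</code>) in production, because it links against your system's own <code>libpq</code> and SSL libraries instead of bundled copies.</p>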
<h3 id="heading-can-i-switch-from-sqlite-to-postgresql-mid-project"><strong>Can I switch from SQLite to PostgreSQL mid-project?</strong></h3>
<p>Yes, but you’ll need to migrate your data. Tools like <a target="_blank" href="https://docs.djangoproject.com/en/stable/ref/django-admin/#dumpdata">Django’s <code>dumpdata</code> and <code>loaddata</code></a> can help with that.</p>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>Using PostgreSQL in Django is a great step forward when you want to build real, production-ready apps.</p>
<p>The setup is pretty straightforward once you know the steps, and the performance gains are worth it.</p>
<p>Come say hey on <a target="_blank" href="http://X.com/_udemezue">X.com/_udemezue</a> and check out my <a target="_blank" href="https://Tchelete.com">blog</a> while you're at it!</p>
<h3 id="heading-further-resources">Further Resources</h3>
<p>If you want to dive deeper, here are a few links I recommend:</p>
<ul>
<li><p><a target="_blank" href="https://docs.djangoproject.com/en/stable/ref/settings/#databases">Django Database Settings Docs</a></p>
</li>
<li><p><a target="_blank" href="https://www.postgresql.org/docs/">PostgreSQL Official Documentation</a></p>
</li>
<li><p><a target="_blank" href="https://realpython.com/django-setup/#databases">Using PostgreSQL with Django (Real Python)</a></p>
</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
