<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ Nneoma Uche - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ Nneoma Uche - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Sat, 23 May 2026 22:19:41 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/author/Nene23/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Use WebSockets: From Python to FastAPI ]]>
                </title>
                <description>
                    <![CDATA[ Real-time data powers much of modern software: live stock prices, chat applications, sports scores, collaborative tools. And to build these systems, you'll need to understand how real-time communicati ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-use-websockets-from-python-to-fastapi/</link>
                <guid isPermaLink="false">69b206806c896b0519d2a308</guid>
                
                    <category>
                        <![CDATA[ websockets ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ FastAPI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Backend Development ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Real Time ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Nneoma Uche ]]>
                </dc:creator>
                <pubDate>Thu, 12 Mar 2026 00:19:12 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/8acb0854-7289-4794-8f97-242ec5d8ca61.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Real-time data powers much of modern software: live stock prices, chat applications, sports scores, collaborative tools. And to build these systems, you'll need to understand how real-time communication actually works—which isn’t always straightforward.</p>
<p>I ran into this firsthand while trying to build a live options dashboard. HTTP requests weren't going to cut it, and everything I was reading seemed overly complex until I went back to the basics. This article is the result of that process.</p>
<p>We'll cover Python's <code>websockets</code> library from scratch, then move into FastAPI, where many Python backends live. It's worth noting that WebSockets aren't the only solution for real-time communication. WebRTC may be a better fit depending on your use case, but understanding WebSockets is the right starting point before exploring further.</p>
<h3 id="heading-table-of-contents">Table of Contents</h3>
<ol>
<li><p><a href="#heading-websocket-connections-and-methods">WebSocket Connections and Methods</a></p>
</li>
<li><p><a href="#heading-how-to-build-your-first-websocket-in-python">How to Build Your First WebSocket in Python</a></p>
</li>
<li><p><a href="#heading-file-transfer-over-websockets">File Transfer Over WebSockets</a></p>
</li>
<li><p><a href="#heading-how-to-connect-to-an-external-websocket">How to Connect to an External WebSocket</a></p>
</li>
<li><p><a href="#heading-websockets-in-fastapi">WebSockets in FastAPI</a></p>
</li>
<li><p><a href="#heading-how-to-handle-websocket-disconnections-in-fastapi">How to Handle WebSocket Disconnections in FastAPI</a></p>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ol>
<h2 id="heading-websocket-connections-and-methods">WebSocket Connections and Methods</h2>
<p>A WebSocket connection enables bi-directional communication between a client and a server. Once a connection is established, both sides can communicate freely without either having to ask first. This is different from a regular HTTP request, where the client always has to ask before the server can respond.</p>
<p>It looks something like this:</p>
<pre><code class="language-plaintext">        CLIENT&nbsp; &lt;===== open connection =====&gt;&nbsp; SERVER
</code></pre>
<p>Note that a WebSocket URL is not a regular web page, so you can't "visit it" like a website. You need a client to talk to it.</p>
<p>Different frameworks provide different methods for handling WebSocket connections. With Python’s <code>websockets</code> library, for instance, a connection is automatically accepted the moment a client connects. With frameworks like FastAPI, you have to explicitly call <code>await websocket.accept()</code>, otherwise the connection gets rejected.</p>
<p>Let’s look at the core methods provided by Python’s <code>websockets</code> library:</p>
<ol>
<li><p><code>websockets.serve(...)</code>: starts a WebSocket server.</p>
</li>
<li><p><code>websockets.connect(...)</code>: connects to a WebSocket server.</p>
</li>
<li><p><code>websocket.send(...)</code>: a method on the connection object that sends a message from either side.</p>
</li>
<li><p><code>websocket.recv()</code>: a method on the connection object that receives a message from the client or server.</p>
</li>
</ol>
<p><code>recv()</code> needs no arguments because it's purely a waiting operation. It pauses the coroutine until the next message arrives, then returns that message:</p>
<pre><code class="language-python">message = await websocket.recv()
</code></pre>
<h2 id="heading-how-to-build-your-first-websocket-in-python">How to Build Your First WebSocket in Python</h2>
<p>Before we dive into frameworks, let’s explore Python’s <code>websockets</code> library. You’ll set up a simple server and client, and exchange messages over a WebSocket connection, giving you a solid foundation for understanding WebSockets under the hood.</p>
<h3 id="heading-environment-setup">Environment Setup</h3>
<p>Run the following in your virtual environment to install the <code>websockets</code> package or check that it's already installed:</p>
<pre><code class="language-python">pip install websockets
# or, to check if it's already installed:
pip show websockets
</code></pre>
<h3 id="heading-create-the-websocket-server">Create the WebSocket Server</h3>
<p>Create <code>server.py</code> in your project folder, and paste this:</p>
<pre><code class="language-python">import asyncio
import websockets

async def handler(connection):
    print("Client connected")

    message = await connection.recv()
    print("Received from client:", message)
    await connection.send("Hello client!")


async def main():
    async with websockets.serve(handler, "localhost", 8000):
        print("Server running at ws://localhost:8000")
        #await asyncio.Future()  # runs forever
        await asyncio.sleep(30)

asyncio.run(main())
</code></pre>
<p>When this line executes:</p>
<pre><code class="language-python">async with websockets.serve(handler, "localhost", 8000):
</code></pre>
<p>The library opens a TCP socket on the specified host and port and waits for incoming clients. When one connects, it creates a connection object and passes it into your handler function.</p>
<p>The handler is required because it defines what the server does with each connection. The <code>host</code> and <code>port</code> arguments are also important. Both default to <code>None</code> – passing neither raises an error because the OS cannot bind a network server without a port.</p>
<p>You could pass <code>port=0</code> to let the OS assign a free port automatically, but then you'd need an extra step to figure out which port was chosen, so the client can connect:</p>
<pre><code class="language-python">server.sockets[0].getsockname()
</code></pre>
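<p>To make that lookup concrete, here is a minimal sketch that uses only the standard library. It assumes a plain <code>asyncio</code> TCP server standing in for a WebSocket server, and <code>handle</code> is a hypothetical placeholder handler. The server object returned by <code>asyncio.start_server()</code> exposes a <code>sockets</code> list that can be queried the same way:</p>
<pre><code class="language-python">import asyncio

async def handle(reader, writer):
    # Hypothetical no-op handler: close each connection immediately.
    writer.close()
    await writer.wait_closed()

async def main():
    # Port 0 asks the OS to pick any free port.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    async with server:
        # getsockname() returns (host, port) for an IPv4 socket.
        host, port = server.sockets[0].getsockname()[:2]
        return port

port = asyncio.run(main())
print("OS-assigned port:", port)
</code></pre>
<p>The printed port changes from run to run, which is exactly why the client would need some other way to learn it.</p>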
<p>It’s simpler to specify both host and port explicitly, so the client knows exactly where the server is running.</p>
<h3 id="heading-set-up-the-client">Set Up the Client</h3>
<p>Create <code>client.py</code> in the same folder and add this:</p>
<pre><code class="language-python">import asyncio
import websockets

async def client():
    async with websockets.connect("ws://localhost:8000") as websocket:
        await websocket.send("Hello server!")
        response = await websocket.recv()
        print("Server replied:", response)

asyncio.run(client())
</code></pre>
<h3 id="heading-test-the-connection">Test the Connection</h3>
<p>First, open a terminal and run <code>server.py</code>. You should see:</p>
<pre><code class="language-plaintext">Server running at ws://localhost:8000
</code></pre>
<p>In a second terminal, run <code>client.py</code>. Messages should appear in both terminals confirming that the connection is active and both sides are communicating.</p>
<p>Note that the server must be running before you start the client – otherwise the client has nothing to connect to, and the connection will fail.</p>
<h4 id="heading-keeping-the-server-alive-a-note-on-asynciofuture">Keeping the server alive: a note on asyncio.Future()</h4>
<p>In <code>server.py</code>, there’s a line currently commented out:</p>
<pre><code class="language-python">await asyncio.Future()
</code></pre>
<p>This keeps the server running indefinitely. For local development and testing, however, <code>await asyncio.sleep(30)</code> is a simpler alternative: it keeps the server alive for 30 seconds and then lets the script exit on its own, so run the client within that window.</p>
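<p>If you're curious why a bare <code>asyncio.Future()</code> runs forever, the following standard-library sketch may help. Nothing ever sets a result on the future, so awaiting it only ends when something cancels it. The names <code>run_forever</code> and <code>demo</code>, and the 0.1 second timeout, are assumptions made for illustration:</p>
<pre><code class="language-python">import asyncio

async def run_forever():
    # A bare Future is never resolved, so this await blocks until cancelled.
    await asyncio.Future()

async def demo():
    try:
        # Give up after 0.1 seconds; wait_for cancels run_forever() for us.
        await asyncio.wait_for(run_forever(), timeout=0.1)
    except asyncio.TimeoutError:
        return "still running when the timeout hit"
    return "finished"

result = asyncio.run(demo())
print(result)
</code></pre>
<p>The timeout fires every time, which shows that the awaited future never completes by itself.</p>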
<h2 id="heading-file-transfer-over-websockets">File Transfer Over WebSockets</h2>
<p>WebSockets aren't limited to text. They support raw bytes too, which means you can send files directly over the connection. Here’s how a client can send a file to a server over a WebSocket connection:</p>
<h3 id="heading-update-serverpy">Update <code>server.py</code></h3>
<pre><code class="language-python">import asyncio
import websockets

async def file_handler(ws):
    print("Client connected, waiting for file...")
    file_bytes = await ws.recv()  # receive bytes
    with open("received_file.png", "wb") as f:
        f.write(file_bytes)
    print("File received and saved!")
    await ws.send("File received successfully!")

async def main():
    async with websockets.serve(file_handler, "localhost", 8000):
        print("Server running on ws://localhost:8000")
        await asyncio.sleep(50)  # keep server alive

asyncio.run(main())
</code></pre>
<p>The handler waits for incoming bytes with <code>await ws.recv()</code>; the <code>websockets</code> library automatically detects whether the incoming message is text or bytes, so no extra configuration is needed. Once received, the file is written to disk in binary mode (<code>"wb"</code>) and the server sends a confirmation message back to the client.</p>
<h3 id="heading-update-clientpy">Update <code>client.py</code></h3>
<pre><code class="language-python">import asyncio
import websockets

async def send_file():
    uri = "ws://localhost:8000"
    async with websockets.connect(uri) as ws:
        with open("portfolio-image.png", "rb") as f:  #open file in binary mode
            file_bytes = f.read()
        await ws.send(file_bytes)  # send bytes
        response = await ws.recv()
        print("Server response:", response)

asyncio.run(send_file())
</code></pre>
<p>The client opens the image in binary mode (<code>"rb"</code>), reads the entire file into memory as bytes, and sends it in a single <code>ws.send()</code> call. It then waits for the server's confirmation before closing the connection.</p>
<h3 id="heading-test-it">Test it</h3>
<p>Add an image to your project folder and make sure the filename in <code>client.py</code> matches. Run <code>server.py</code> first, then <code>client.py</code> in a second terminal.</p>
<p>Once the transfer completes, the server saves the file as <code>received_file.png</code> in the same directory. You should see it appear in your workspace immediately.</p>
<p>This approach loads the entire file into memory before sending. For large files, it’s better to read and send them in chunks. But this is the easiest way to understand WebSocket byte transfer.</p>
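<p>As a rough sketch of the chunked approach, the reading side could be a small generator like the one below. The helper name <code>iter_chunks</code> and the 64 KiB chunk size are assumptions chosen for illustration, not part of the <code>websockets</code> library:</p>
<pre><code class="language-python">CHUNK_SIZE = 64 * 1024  # 64 KiB per message; an arbitrary choice

def iter_chunks(path, chunk_size=CHUNK_SIZE):
    """Yield the file's contents as a sequence of bytes objects."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # An empty read means the end of the file was reached.
                break
            yield chunk
</code></pre>
<p>In the client, you would loop over <code>iter_chunks("portfolio-image.png")</code> and call <code>await ws.send(chunk)</code> for each piece. The server would then need to keep calling <code>recv()</code> and appending to the output file until the client signals that it's done, for example with an agreed-upon text message.</p>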
<h2 id="heading-how-to-connect-to-an-external-websocket">How to Connect to an External WebSocket</h2>
<p>So far you've been connecting to servers you built yourself. But WebSocket clients can also connect to public servers. For example, a client can connect to Postman’s echo server:</p>
<pre><code class="language-python">import asyncio
import websockets

async def connect_external():
    uri = "wss://ws.postman-echo.com/raw"  # public WebSocket server
    async with websockets.connect(uri) as ws:
        print("Connected to external server!")

        # Send a message
        await ws.send("Hello external server!")
        print("Message sent")

        # Receive response
        response = await ws.recv()
        print("Received from server:", response)

asyncio.run(connect_external())
</code></pre>
<p>Notice the client connects to Postman’s echo server using the <code>wss://</code> URI scheme instead of <code>ws://</code>. This indicates the connection is encrypted using TLS, similar to how <code>https://</code> secures regular web requests.</p>
<p>An echo server returns exactly what you send it. So "Hello external server!" comes straight back as the response. It's a useful sandbox for testing your client-side WebSocket code without needing your own server.</p>
<h2 id="heading-websockets-in-fastapi">WebSockets in FastAPI</h2>
<p>FastAPI provides a WebSocket object (via Starlette under the hood) to manage real-time connections. You can define WebSocket endpoints just like HTTP routes, while Uvicorn handles the event loop – no manual asyncio server management needed. This makes FastAPI a natural fit for real-time projects, from chat apps to live dashboards and data feeds.</p>
<p>Before jumping into code, here's a quick reference of the core methods you'll be working with.</p>
<p><strong>Accepting:</strong></p>
<ul>
<li><code>await websocket.accept()</code>: the <code>accept()</code> method must be called first, before anything else. Skip it and the connection gets rejected.</li>
</ul>
<p><strong>Sending:</strong></p>
<ul>
<li><p><code>await websocket.send_text(data)</code>: sends a string.</p>
</li>
<li><p><code>await websocket.send_bytes(data)</code>: sends binary data.</p>
</li>
<li><p><code>await websocket.send_json(data)</code>: serializes and sends JSON.</p>
</li>
</ul>
<p><strong>Receiving:</strong></p>
<ul>
<li><p><code>await websocket.receive_text()</code>: waits for a text message.</p>
</li>
<li><p><code>await websocket.receive_bytes()</code>: waits for binary data.</p>
</li>
<li><p><code>await websocket.receive_json()</code>: receives and deserializes JSON.</p>
</li>
<li><p><code>async for msg in websocket.iter_text()</code>: iterates over incoming messages, exits cleanly on disconnect.</p>
</li>
</ul>
<p><strong>Closing:</strong></p>
<ul>
<li><code>await websocket.close(code=1000)</code>: closes the connection. <code>1000</code> is the standard code for a normal closure, and the method also accepts an optional <code>reason</code> argument.</li>
</ul>
<p>Here's what the WebSocket lifecycle looks like in FastAPI:</p>
<img src="https://cdn.hashnode.com/uploads/covers/6426acfa5bc738c37852b5bd/b61b6b25-e027-47a1-8efc-b921e93b8521.png" alt="WebSocket in FastAPI" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<h3 id="heading-building-a-simple-echo-server-with-fastapi">Building a Simple Echo Server with FastAPI</h3>
<p>As you saw with the Postman example, an echo server sends back the message a client provides. Let's build one with FastAPI.</p>
<h4 id="heading-1-install-fastapi">1. Install FastAPI:</h4>
<pre><code class="language-python">pip install "fastapi[standard]"
</code></pre>
<h4 id="heading-2-update-serverpy">2. Update <code>server.py</code>:</h4>
<pre><code class="language-python">from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    data = await websocket.receive_text()
   
    await websocket.send_text(f"You said: {data}")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)
</code></pre>
<p>A few things to note here compared to the plain <code>websockets</code> library:</p>
<ul>
<li><p>WebSocket endpoints are defined with <code>@app.websocket("/ws")</code> just like an HTTP route.</p>
</li>
<li><p><code>await websocket.accept()</code> is required before anything else. FastAPI won't accept connections without it.</p>
</li>
<li><p>Uvicorn handles the event loop and server startup for you via the <code>if __name__ == "__main__"</code> block. No <code>asyncio.run()</code> or <code>asyncio.Future()</code> needed.</p>
</li>
</ul>
<h4 id="heading-3-update-clientpy">3. Update <code>client.py</code>:</h4>
<pre><code class="language-python">import asyncio
import websockets

async def test_client():
    uri = "ws://127.0.0.1:8000/ws"
    async with websockets.connect(uri) as ws:
        await ws.send("Hello FastAPI server!")
        response = await ws.recv()
        print("Server replied:", response)

asyncio.run(test_client())
</code></pre>
<p>Since the FastAPI server isn't secured with TLS, the client URI uses <code>ws://</code> instead of <code>wss://</code>. Make sure to match the host and port from your server code.</p>
<h4 id="heading-4-interact-with-the-echo-server">4. Interact with the echo server:</h4>
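<p>Start <code>server.py</code>, then run <code>client.py</code> in another terminal. The client terminal should print the echoed message (<code>Server replied: You said: Hello FastAPI server!</code>), while the server terminal shows Uvicorn's connection logs.</p>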
<img src="https://cdn.hashnode.com/uploads/covers/6426acfa5bc738c37852b5bd/88629d79-eb91-4af5-9752-f9596ff5e5a4.png" alt="Output of the FastAPI echo server test" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<h2 id="heading-how-to-handle-websocket-disconnections-in-fastapi">How to Handle WebSocket Disconnections in FastAPI</h2>
<p>Clients will inevitably disconnect in real-time applications, sometimes intentionally, sometimes unexpectedly. If not handled properly, this can crash your server or leave it in a broken state.</p>
<p>FastAPI's <code>WebSocketDisconnect</code> exception is raised when the server tries to receive from a client that has closed the connection, whether deliberately or not. Catching it lets the server handle disconnects gracefully, log the event, and clean up resources without crashing.</p>
<p>Here’s an example:</p>
<pre><code class="language-python">from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

@app.websocket("/ws")
async def websocket_endpoint(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            data = await ws.receive_text()
   
            if "bye" in data or "quit" in data:
                await ws.send_text("Closing connection")
                await ws.close(code=1000, reason="Server requested close")  
                break
            await ws.send_text(f"I got your request: {data}")
    except WebSocketDisconnect:
        print("Client disconnected")  # connection already closed
</code></pre>
<p>The server runs a continuous loop waiting for messages. If the client message contains "bye" or "quit", the server responds, calls <code>await ws.close(code=1000)</code>, and breaks out of the loop cleanly.</p>
<p>But if the client disconnects unexpectedly, <code>WebSocketDisconnect</code> is caught by the except block and the server moves on without crashing. At this point the connection is already closed on the client side, so calling <code>ws.close()</code> inside the except block is unnecessary.</p>
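<p>One caveat with the check in the example: <code>"bye" in data</code> is a substring test, so a message such as "goodbye everyone" or "quite a day" would also close the connection. If that's not what you want, a stricter check could look like this sketch, where <code>CLOSE_COMMANDS</code> and <code>wants_to_close</code> are hypothetical names introduced for illustration:</p>
<pre><code class="language-python">CLOSE_COMMANDS = {"bye", "quit"}

def wants_to_close(message):
    """Return True only when the whole message is a close command."""
    # Ignore surrounding whitespace and letter case before comparing.
    return message.strip().lower() in CLOSE_COMMANDS

print(wants_to_close(" Bye "))
print(wants_to_close("goodbye everyone"))
</code></pre>
<p>The first call prints <code>True</code> and the second prints <code>False</code>. In the endpoint, <code>if wants_to_close(data):</code> would replace the original condition.</p>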
<h2 id="heading-conclusion">Conclusion</h2>
<p>WebSockets make real-time communication possible by keeping a persistent connection open between client and server. Starting with Python’s <code>websockets</code> library helps clarify how the protocol works under the hood, while frameworks like FastAPI provide the structure needed for production applications.</p>
<p>The parts that trip most people up early on are <code>asyncio</code> and FastAPI's explicit <code>websocket.accept()</code>. With <code>asyncio</code>, the question is usually why it's needed and why the server dies instantly without something keeping it alive. And it's easy to ignore <code>websocket.accept()</code> if you're coming from the plain <code>websockets</code> library where that happens automatically. Once those click, everything else follows naturally.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Implement Dependency Injection in FastAPI ]]>
                </title>
                <description>
                    <![CDATA[ Several languages and frameworks depend on dependency injection—no pun intended. Go, Angular, NestJS, and Python's FastAPI all use it as a core pattern. If you've been working with FastAPI, you've likely encountered dependencies in action. Perhaps yo... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-implement-dependency-injection-in-fastapi/</link>
                <guid isPermaLink="false">691740a91f4fa448325a55f9</guid>
                
                    <category>
                        <![CDATA[ dependency injection ]]>
                    </category>
                
                    <category>
                        <![CDATA[ FastAPI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ backend ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Nneoma Uche ]]>
                </dc:creator>
                <pubDate>Fri, 14 Nov 2025 14:46:01 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763131442081/76eff35b-be68-49c1-9743-d78ebc87b292.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Several languages and frameworks depend on dependency injection—no pun intended. Go, Angular, NestJS, and Python's FastAPI all use it as a core pattern.</p>
<p>If you've been working with FastAPI, you've likely encountered dependencies in action. Perhaps you saw <code>Depends()</code> in a tutorial or the docs and were confused for a minute. I certainly was. That confusion sparked weeks of experimenting with this system. The truth is, you can't avoid dependency injection when building backend services with FastAPI. It's baked into the framework's DNA, powering everything from authentication and database connections to request validation.</p>
<p>FastAPI's docs describe its dependency injection system as 'powerful but intuitive.' That’s accurate, once you understand how it works. This article breaks it down, covering function dependencies, class dependencies, dependency scopes, as well as practical examples.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-dependencies-and-dependency-injection-in-fastapi">Dependencies and Dependency Injection in FastAPI</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-getting-started-environment-setup">Getting Started: Environment Setup</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-types-of-dependencies-in-fastapi">Types of Dependencies in FastAPI</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-how-to-use-function-dependencies-in-fastapi">How to Use Function Dependencies in FastAPI</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-use-class-dependencies-in-fastapi">How to Use Class Dependencies in FastAPI</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-dependency-scope">Dependency Scope</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-path-operation-level">Path Operation Level</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-router-level">Router Level</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-application-level">Application Level</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-common-use-cases-for-dependency-injection">Common Use Cases for Dependency Injection</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>To follow along with this article, you should have:</p>
<ul>
<li><p>Working knowledge of Python.</p>
</li>
<li><p>Ability to create and activate virtual environments.</p>
</li>
<li><p>Basic understanding of FastAPI.</p>
</li>
<li><p>Familiarity with Object-Oriented Programming (OOP) concepts.</p>
</li>
</ul>
<h2 id="heading-dependencies-and-dependency-injection-in-fastapi">Dependencies and Dependency Injection in FastAPI</h2>
<p>A dependency is a reusable piece of logic, like authentication, database connection, or validation, that your path operations require. Dependency injection (DI) is how FastAPI delivers these dependencies to specific parts of your application: you declare them using <code>Depends()</code> and FastAPI automatically executes them when the associated route receives a request.</p>
<p>Think of it as requesting the tools your application needs. You declare dependencies once and FastAPI provides them wherever needed, with no repetitive setup across routes.</p>
<p>This makes for modular, scalable applications. Without DI, you would have to repeat the same setup code on every endpoint, making updates tedious and bugs more likely.</p>
<h2 id="heading-getting-started-environment-setup">Getting Started: Environment Setup</h2>
<p>Let's set up your development environment to work through the examples in this guide.</p>
<p>Start by creating a project folder, then create and activate a virtual environment:</p>
<pre><code class="lang-bash">python -m venv deps
<span class="hljs-built_in">source</span> deps/bin/activate          <span class="hljs-comment"># On macOS/Linux</span>
deps\Scripts\activate             <span class="hljs-comment"># On Windows</span>
</code></pre>
<p>Install FastAPI with all dependencies:</p>
<pre><code class="lang-python">pip install <span class="hljs-string">'fastapi[all]'</span>
</code></pre>
<p>Organize your project as follows:</p>
<pre><code class="lang-python">fastapi-deps/
├── deps/                 <span class="hljs-comment"># Virtual environment</span>
├── function_deps.py
├── class_deps.py
├── router_deps.py
├── app.py
└── requirements.txt
</code></pre>
<h2 id="heading-types-of-dependencies-in-fastapi">Types of Dependencies in FastAPI</h2>
<p>In FastAPI, a dependency is a callable object that retrieves or verifies information before a route executes. Dependencies can be implemented as either functions or classes.</p>
<p><strong>Function dependencies</strong> are the most straightforward approach and work well for most use cases, including validation, authentication, and data retrieval. <strong>Class dependencies</strong> can handle the same tasks but are particularly useful when you need stateful logic, multiple instances with different configurations, or prefer object-oriented patterns.</p>
<h3 id="heading-how-to-use-function-dependencies-in-fastapi">How to Use Function Dependencies in FastAPI</h3>
<p>A function dependency is a helper function (such as for authentication or data retrieval) that can be injected into path operations. To demonstrate, we'll create a simple user authentication dependency using an in-memory database—a list of dictionaries.</p>
<p>Recall the folder structure from earlier. This code goes in <code>fastapi-deps/function_deps.py</code>.</p>
<p>Start by importing the required modules:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, Depends, HTTPException
<span class="hljs-keyword">import</span> uvicorn
</code></pre>
<p>You bring in <code>FastAPI</code> to create the app instance, <code>Depends</code> for dependency injection, and <code>HTTPException</code> to handle errors gracefully. <code>uvicorn</code> will be used to run the application later.</p>
<p>Next, instantiate the FastAPI application:</p>
<pre><code class="lang-python">app = FastAPI()
</code></pre>
<p><code>app = FastAPI()</code> creates your application instance: the object that will hold all your endpoints and dependencies.</p>
<p>Next, create an in-memory database. Define a list of dictionaries to act as your temporary database. Each dictionary represents a user entry containing a name and a password.</p>
<pre><code class="lang-python">users = [
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Ore"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"jkzvdgwya12"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Uche"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"lga546"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Seke"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"SK99!"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Afi"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"Afi@144"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Sam"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"goTiger72*"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Ozi"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"xx%hI"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Ella"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"Opecluv18"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Claire"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"cBoss@14G"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Sena"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"SenDaBoss5"</span>},
    {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Ify"</span>, <span class="hljs-string">"password"</span>: <span class="hljs-string">"184Norab"</span>}  
]
</code></pre>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">This type of database isn’t persistent; any data stored therein is lost when the application restarts.</div>
</div>

<p>Then, define a dependency function for user validation. The simple helper function below checks whether a username and password provided by the user match an existing user in the database.</p>
<pre><code class="lang-python"><span class="hljs-comment">#the dependency function</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">user_dep</span>(<span class="hljs-params">name: str, password: str</span>):</span>
    <span class="hljs-keyword">for</span> u <span class="hljs-keyword">in</span> users:
        <span class="hljs-keyword">if</span> u[<span class="hljs-string">"name"</span>] == name <span class="hljs-keyword">and</span> u[<span class="hljs-string">"password"</span>] == password:
            <span class="hljs-keyword">return</span> {<span class="hljs-string">"name"</span>: name, <span class="hljs-string">"valid"</span>: <span class="hljs-literal">True</span>}
</code></pre>
<p>This function expects two string parameters, <code>name</code> and <code>password</code>, which FastAPI reads from the query string of the incoming request. If it finds a match in the <code>users</code> database, it returns a dictionary confirming the user’s validity; otherwise, it implicitly returns <code>None</code>. FastAPI serializes the dictionary to JSON once the path function returns it.</p>
<p>Next, inject the dependency into a path function:</p>
<pre><code class="lang-python"><span class="hljs-comment">#the web endpoint</span>
<span class="hljs-meta">@app.get("/users/{user}")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_user</span>(<span class="hljs-params">user = Depends(<span class="hljs-params">user_dep</span>)</span>) -&gt; dict:</span>
    <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> user:
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Invalid username or password"</span>)
    <span class="hljs-keyword">return</span> user
</code></pre>
<p>The <code>user_dep</code> function is injected into the path operation using <code>Depends()</code>. When an HTTP request is made to this endpoint, FastAPI validates the query parameters, runs the dependency, and passes its return value to the <code>user</code> parameter.</p>
<p>The <code>-&gt; dict:</code> annotation indicates that the function returns a dictionary, which FastAPI auto-converts to JSON. If no matching record is found, an <code>HTTPException</code> with a 401 status code is raised; otherwise, the verified user data is returned.</p>
<p>Now start the FastAPI server. Open your terminal in the project directory and run:</p>
<pre><code class="lang-python">uvicorn function_deps:app --reload
</code></pre>
<ul>
<li><p><code>function_deps</code> is the name of your Python file (without the <strong>.py</strong> extension).</p>
</li>
<li><p><code>--reload</code> automatically restarts the server whenever you save changes.</p>
</li>
</ul>
<p>Once it starts, you’ll see an output similar to the image below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762651145390/479b187c-f455-4617-aa7f-e075bf668ee5.jpeg" alt="uvicorn output in terminal" class="image--center mx-auto" width="600" height="400" loading="lazy"></p>
<p>Now you can test the endpoint. Open your browser or the Postman desktop app to validate the user <strong>“Seke”</strong>. Paste this URL into your browser: <em>http://127.0.0.1:8000/users/{user}?name=Seke&amp;password=SK99!</em></p>
<p>Alternatively, you can test the endpoint using FastAPI’s built-in docs at: http://127.0.0.1:8000/docs</p>
<p>In the Swagger UI:</p>
<ul>
<li><p>Click on the <strong>Get User</strong> endpoint</p>
</li>
<li><p>Click <strong>Try it out</strong></p>
</li>
<li><p>Enter “Seke” in the name field and “SK99!” in the password field</p>
</li>
<li><p>Click <strong>Execute</strong></p>
</li>
</ul>
<p>You should get a 200 status code, with the payload in this image:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762651845087/9495107e-1ab8-4349-a701-04e5de461fb6.jpeg" alt="payload for get_user endpoint" class="image--center mx-auto" width="600" height="400" loading="lazy"></p>
<p>You can also test the endpoint with usernames or passwords that don’t exist in the database. Each time, you should see a <strong>401</strong> error like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762652045213/c8dc8bb1-e2c4-456f-92f5-911dddae73eb.jpeg" alt="unauthorized error output in FastAPI docs" class="image--center mx-auto" width="600" height="400" loading="lazy"></p>
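<p>Some of the sample passwords contain characters such as <code>%</code>, <code>@</code>, and <code>*</code>, which have to be percent-encoded before they go into a query string. As a minimal sketch, the snippet below builds the test URL with Python’s standard <code>urllib.parse</code> module. The <code>build_user_url</code> helper and the <code>BASE_URL</code> constant are hypothetical names introduced for this illustration, and the address assumes the local host and port used in the URLs above.</p>
<pre><code class="lang-python">from urllib.parse import quote, urlencode

# Assumed local address, matching the URLs used in this section
BASE_URL = "http://127.0.0.1:8000"


def build_user_url(name, password):
    """Return the test URL with the credentials percent-encoded."""
    # urlencode escapes reserved characters in each query value
    query = urlencode({"name": name, "password": password})
    # The path segment fills the {user} placeholder; the endpoint above does not read it
    return f"{BASE_URL}/users/{quote(name)}?{query}"


print(build_user_url("Ozi", "xx%hI"))
</code></pre>
<p>With this helper, the <code>%</code> in Ozi’s password is sent as <code>%25</code>, so the server decodes it back to the original string. Tools such as the Swagger UI typically handle this encoding for you.</p>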
<h3 id="heading-how-to-use-class-dependencies-in-fastapi">How to Use Class Dependencies in FastAPI</h3>
<p>While functions are the most common way to define dependencies, FastAPI also supports class-based dependencies. Classes are useful when you need reusable instances with configurable state or prefer object-oriented patterns.</p>
<p>Class dependencies inject the same way: through the <code>Depends</code> function in your path operation.</p>
<p>Let's convert the <code>user_dep</code> function dependency to a class. It will authenticate users, grant access to valid credentials, and raise exceptions for unauthorized attempts. We'll apply it to a user dashboard endpoint to ensure only authenticated users access their resources.</p>
<pre><code class="lang-python"><span class="hljs-comment">#Dependency class for user authentication</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">UserAuth</span>():</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, name: str, password: str</span>):</span>
        self.name = name
        self.password = password
        <span class="hljs-comment">#Depends(UserAuth) only runs __init__, so trigger the check here</span>
        self()

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__call__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-comment">#check if name and password entered correspond to any row in the db</span>
        <span class="hljs-keyword">for</span> user <span class="hljs-keyword">in</span> users:
            <span class="hljs-keyword">if</span> user[<span class="hljs-string">"name"</span>] == self.name <span class="hljs-keyword">and</span> user[<span class="hljs-string">"password"</span>] == self.password:
                <span class="hljs-keyword">return</span>
        <span class="hljs-comment">#If no match found, raise an error</span>
        <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Invalid username or password"</span>)
</code></pre>
<p>The <code>__init__</code> method receives the parameters from the request (<code>name</code> and <code>password</code>) and stores them as instance attributes. These can then be accessed in the <code>__call__</code> method, which contains the dependency logic.</p>
<p>Note that <code>__call__</code> doesn't return a value in this example. It returns early when it finds a match and raises an <code>HTTPException</code> if authentication fails. Because <code>Depends(UserAuth)</code> calls the class itself, FastAPI runs only <code>__init__</code> and never calls the resulting instance, which is why <code>__init__</code> ends with <code>self()</code> to run the check.</p>
<p>Here’s how to inject <code>UserAuth</code> into a path function:</p>
<pre><code class="lang-python"><span class="hljs-comment">#Injecting the class dependency into a path operation</span>
<span class="hljs-meta">@app.get("/user/dashboard")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_dashboard</span>(<span class="hljs-params">user: UserAuth = Depends(<span class="hljs-params">UserAuth</span>)</span>):</span>
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"message"</span>: <span class="hljs-string">f"Access granted to <span class="hljs-subst">{user.name}</span>"</span>}
</code></pre>
<p><strong>What's happening here:</strong></p>
<p>When a client requests the <code>/user/dashboard</code> endpoint, FastAPI executes the dependency first. Recognizing <code>UserAuth</code> as a class, FastAPI automatically creates an instance and populates it with values from the query parameters.</p>
<p>Here’s the execution flow to help you understand:</p>
<ul>
<li><p><code>Depends(UserAuth)</code> tells FastAPI: “Before running this route, create a <code>UserAuth</code> instance.”</p>
</li>
<li><p>FastAPI extracts name and password from the request URL (for example, <em>/user/dashboard?name=Seke&amp;password=SK99!</em>).</p>
</li>
<li><p>It then calls <code>UserAuth(name="Seke", password="SK99!")</code> to create the instance.</p>
</li>
<li><p>The <code>UserAuth</code> instance, with its stored name and password attributes, is passed to the <code>user</code> parameter in <code>get_dashboard</code>.</p>
</li>
<li><p>The route function can access <code>user.name</code> and <code>user.password</code> directly.</p>
</li>
<li><p>If <code>__call__</code> raises an exception, the route never executes.</p>
</li>
</ul>
<p>Test the endpoint with valid credentials from the users list, and you should see output like this: </p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762655384549/ac5ab413-0f75-4711-8166-4c99bcca9d7c.jpeg" alt="class dependency output" class="image--center mx-auto" width="600" height="400" loading="lazy"></p>
<p><a target="_blank" href="https://fastapi.tiangolo.com/tutorial/dependencies/classes-as-dependencies/#use-it">FastAPI’s official documentation</a> describes an alternative approach to using classes as dependencies. However, using the <code>__call__</code> method is, in my opinion, the most straightforward and self-contained approach. It keeps your authentication logic modular without adding extra code to the path operation.</p>
<p>The trade-off is that class dependencies are more verbose than helper functions, but cleaner for complex logic.</p>
<h2 id="heading-dependency-scope">Dependency Scope</h2>
<p>FastAPI offers two ways to inject dependencies into a path operation: as a <strong>function parameter</strong> or via the <strong>path decorator</strong>. When you include a dependency as a function parameter, the dependency's return value is available within the function. But when injected into the decorator, the dependency executes without passing a return value to the path function.</p>
<p>Beyond single endpoints, FastAPI lets you inject dependencies at the router or global level. Let’s examine these scopes in more detail.</p>
<h3 id="heading-path-operation-level">Path Operation Level</h3>
<p>While the first example injected dependencies into path function parameters, you can also inject them directly into the decorator using the <code>dependencies</code> parameter. This approach is useful for side effects (for example, authentication guards, rate limiting, or request logging) where the path operation doesn't need the dependency's return value.</p>
<p>Replace the previous code in <code>fastapi-deps/function_deps.py</code> with this:</p>
<pre><code class="lang-python"><span class="hljs-comment">#dep function to pass in decorator</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">user_dep</span>(<span class="hljs-params">name: str, password: str</span>):</span>
    <span class="hljs-keyword">for</span> u <span class="hljs-keyword">in</span> users:
        <span class="hljs-keyword">if</span> u[<span class="hljs-string">"name"</span>] == name <span class="hljs-keyword">and</span> u[<span class="hljs-string">"password"</span>] == password:
            <span class="hljs-keyword">return</span>
    <span class="hljs-keyword">raise</span> HTTPException(status_code=<span class="hljs-number">401</span>, detail=<span class="hljs-string">"Invalid username or password"</span>)

<span class="hljs-comment">#path function</span>
<span class="hljs-meta">@app.get("/users/{user}", dependencies=[Depends(user_dep)])</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_user</span>() -&gt; dict:</span>
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"message"</span> : <span class="hljs-string">"Access granted!"</span>}
</code></pre>
<p>This decorator-based dependency acts as a pre-check before the endpoint executes. It validates credentials without passing any values to the path function. On authentication failure, the dependency raises an <code>HTTPException</code>, and FastAPI returns a 401 response without running the path operation.</p>
<p>If you test this using a valid name and password from the in-memory database, your output should look like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762656537394/06fc80cf-a8b2-44d2-8955-ec914be699ba.jpeg" alt="path decorator dependency output" class="image--center mx-auto" width="600" height="400" loading="lazy"></p>
<h3 id="heading-router-level">Router Level</h3>
<p>Injecting dependencies at the router level allows multiple endpoints to share common logic without repeating the dependency in each route.</p>
<p>We'll use the same <code>user_dep</code> function but inject it at the router level. Add these imports to <code>fastapi-deps/router_deps.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> APIRouter, Depends

<span class="hljs-comment">#import the dependency function</span>
<span class="hljs-keyword">from</span> function_deps <span class="hljs-keyword">import</span> user_dep
</code></pre>
<p>Then, create an <code>APIRouter</code> instance, passing your dependency to the <code>dependencies</code> parameter. This makes the dependency run automatically for every route you define under this router. </p>
<p>In this example, <code>user_dep</code> executes before <code>get_user()</code> and any other endpoints you add to the router, eliminating the need to declare it on each route.</p>
<pre><code class="lang-python">router = APIRouter(prefix=<span class="hljs-string">"/users"</span>, dependencies=[Depends(user_dep)])

<span class="hljs-comment">#define the routes with or without additional dependencies</span>
<span class="hljs-meta">@router.get("/{user}")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_user</span>() -&gt; dict:</span>
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"message"</span> : <span class="hljs-string">"Access granted!"</span>}
</code></pre>
<p>In your main application file (<code>app.py</code>), import the router and register it with your FastAPI application using <code>include_router()</code>. This makes all routes defined in the router accessible through your application.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI
<span class="hljs-keyword">import</span> uvicorn
<span class="hljs-keyword">from</span> router_deps <span class="hljs-keyword">import</span> router <span class="hljs-keyword">as</span> user_router

app = FastAPI()
app.include_router(user_router)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    uvicorn.run(<span class="hljs-string">"app:app"</span>, reload=<span class="hljs-literal">True</span>)
</code></pre>
<p>Start your server and test the route using a valid name–password pair from the users list, then try a mismatched one. You should get a <strong>200</strong> status for the correct credentials and <strong>401</strong> for invalid ones.</p>
<h3 id="heading-application-level">Application Level</h3>
<p>Application-level dependencies (also called <em>global dependencies</em>) are defined when instantiating the FastAPI app and apply to every route in your application. Unlike router-level dependencies that target specific endpoint groups, app-level dependencies extend across the entire application. Any dependency injected into the FastAPI app object will automatically execute for all path functions.</p>
<p>Let's inject a simple <em>logging</em> dependency alongside the <em>user authentication</em> dependency we've used throughout this article. </p>
<p>Update <code>fastapi-deps/app.py</code> with this code:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI, Depends
<span class="hljs-keyword">import</span> uvicorn
<span class="hljs-keyword">from</span> function_deps <span class="hljs-keyword">import</span> user_dep
<span class="hljs-keyword">from</span> router_deps <span class="hljs-keyword">import</span> router <span class="hljs-keyword">as</span> user_router
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime

<span class="hljs-comment">#Basic logging dependency</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">log_request</span>():</span>
    print(<span class="hljs-string">f"[<span class="hljs-subst">{datetime.now()}</span>] Request received."</span>)

app = FastAPI(dependencies=[Depends(log_request), Depends(user_dep)])
app.include_router(user_router)

<span class="hljs-meta">@app.get("/home")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_main</span>():</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">"Welcome back!!!"</span>


<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    uvicorn.run(<span class="hljs-string">"app:app"</span>, reload=<span class="hljs-literal">True</span>)
</code></pre>
<p>When you send a request to any endpoint within this application, <code>log_request</code> acknowledges it and outputs what time the request was made. Since we aren’t sending the logs to any database in particular, it will just print to the terminal (or console) like so:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762673203094/d1c43e1b-0cc2-46e5-ae54-ee4849d1af66.jpeg" alt="logging dependency output in console" class="image--center mx-auto" width="600" height="400" loading="lazy"></p>
<p>Request the endpoint with valid credentials using your browser, cURL, Postman, or the Swagger UI. You should get this response:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762673465276/28d90221-4feb-4467-8c6c-8557dd54de03.jpeg" alt="Server response for API request to home page" class="image--center mx-auto" width="600" height="400" loading="lazy"></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Although the same authentication and logging logic applies to every registered route, the specific message users see depends on what you program into each route.</div>
</div>

<h2 id="heading-common-use-cases-for-dependency-injection">Common Use Cases for Dependency Injection</h2>
<p>Dependency injection solves several common challenges in API development. Here are the most frequent use cases where you'll apply this pattern.</p>
<ol>
<li><p><strong>Database Connections:</strong> Reusing connection logic across multiple endpoints prevents connection leaks, and ensures each request has an isolated session.</p>
</li>
<li><p><strong>Authentication &amp; Authorization:</strong> Dependencies help validate tokens and verify user roles across protected routes. </p>
</li>
<li><p><strong>Logging &amp; Monitoring:</strong> A logging dependency can automatically record each request to your monitoring system or database. It is beneficial for debugging and tracking API usage.</p>
</li>
<li><p><strong>Rate Limiting:</strong> You can control request frequency and prevent API abuse by injecting rate-limiting logic in path functions.</p>
</li>
<li><p><strong>Configuration &amp; Settings:</strong> FastAPI’s dependency injection system simplifies configuration management by letting you inject settings such as API keys or environment variables wherever needed, keeping your code consistent.</p>
</li>
<li><p><strong>Pagination &amp; Filtering:</strong> Injecting common parameters like <code>page_size</code> and <code>limit</code> standardizes data retrieval patterns across endpoints.</p>
</li>
</ol>
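<p>To make one of these use cases concrete, here is a minimal sketch of a pagination dependency. The <code>/items</code> route, the <code>ITEMS</code> list, and the <code>pagination</code> function are made up for this illustration, and the bounds (at most 50 items per page) are arbitrary assumptions.</p>
<pre><code class="lang-python">from fastapi import Depends, FastAPI

app = FastAPI()

# Hypothetical in-memory data, used only for this illustration
ITEMS = [f"item-{n}" for n in range(1, 51)]


def pagination(page: int = 1, page_size: int = 10):
    """Shared query parameters, clamped to sensible bounds."""
    page = max(page, 1)
    page_size = min(max(page_size, 1), 50)
    return {"offset": (page - 1) * page_size, "limit": page_size}


@app.get("/items")
def list_items(window: dict = Depends(pagination)):
    # Slice the data using the values computed by the dependency
    start = window["offset"]
    return ITEMS[start:start + window["limit"]]
</code></pre>
<p>Any route that declares <code>Depends(pagination)</code> accepts the same <code>page</code> and <code>page_size</code> query parameters. In this sketch, a request to <em>/items?page=2&amp;page_size=5</em> would return items 6 through 10.</p>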
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>FastAPI's dependency injection system helps you manage shared logic and resources efficiently while adhering to <em>DRY</em> principles. However, knowing when to inject a dependency versus when to skip it is a skill that comes with practice.</p>
<p>Dependency injection isn't needed for simple, standalone logic. But for resources requiring lifecycle management, shared logic, or modularity, FastAPI's dependency injection system simplifies checks and app operations—with or without return values.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Transform JSON Data to Match Any Schema ]]>
                </title>
                <description>
                    <![CDATA[ Whether you’re transferring data between APIs or just preparing JSON data for import, mismatched schemas can break your workflow.  Learning how to clean and normalize JSON data ensures a smooth, error-free data transfer. This tutorial demonstrates ho... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/transform-json-data-schema/</link>
                <guid isPermaLink="false">686f40595293ca3e659585b7</guid>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ pandas ]]>
                    </category>
                
                    <category>
                        <![CDATA[ json ]]>
                    </category>
                
                    <category>
                        <![CDATA[ json-schema ]]>
                    </category>
                
                    <category>
                        <![CDATA[ python beginner ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Nneoma Uche ]]>
                </dc:creator>
                <pubDate>Thu, 10 Jul 2025 04:23:53 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1752121420492/513db316-cdc7-47ef-8f20-4911cf5d41f9.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Whether you’re transferring data between APIs or just preparing JSON data for import, mismatched schemas can break your workflow.  Learning how to clean and normalize JSON data ensures a smooth, error-free data transfer.</p>
<p>This tutorial demonstrates how to clean messy JSON and export the results into a new file, based on a predefined schema. The JSON file we’ll be cleaning contains a dataset of 200 synthetic customer records.</p>
<p>In this tutorial, we’ll apply two methods for cleaning the input data:</p>
<ul>
<li><p>With pure Python</p>
</li>
<li><p>With <code>pandas</code></p>
</li>
</ul>
<p>You can apply either of these in your code. But the <code>pandas</code> method is better for large, complex data sets. Let’s jump right into the process.</p>
<h3 id="heading-heres-what-well-cover">Here’s what we’ll cover:</h3>
<ul>
<li><p><a class="post-section-overview" href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-add-and-inspect-the-json-file">Add and Inspect the JSON File</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-define-the-target-schema">Define the Target Schema</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-clean-json-data-with-pure-python">How to Clean JSON Data with Pure Python</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-clean-json-data-with-pandas">How to Clean JSON Data with Pandas</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-how-to-validate-the-cleaned-json">How to Validate the Cleaned JSON</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-pandas-vs-pure-python-for-data-cleaning">Pandas vs Pure Python for Data Cleaning</a></p>
</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>To follow along with this tutorial, you should have a basic understanding of:</p>
<ul>
<li><p>Python dictionaries, lists, and loops</p>
</li>
<li><p>JSON data structure (keys, values, and nesting)</p>
</li>
<li><p>How to read and write JSON files with Python’s <code>json</code> module</p>
</li>
</ul>
<h2 id="heading-add-and-inspect-the-json-file">Add and Inspect the JSON File</h2>
<p>Before you begin writing any code, make sure that the <strong>.json</strong> file you intend to clean is in your project directory. This makes it easy to load in your script using the file name alone.</p>
<p>You can now inspect the data structure by viewing the file locally or loading it in your script, with Python’s built-in <code>json</code> module.</p>
<p>Here’s how (assuming the file name is <strong>“old_customers.json”</strong>):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1752079424973/3cd77410-6fa9-483d-9a73-edbe4c035327.jpeg" alt="Code to view or print contents of the raw JSON file in terminal" class="image--center mx-auto" width="407" height="231" loading="lazy"></p>
<p>This shows you whether the JSON file is structured as a dictionary or a list. It also prints out the entire file in your terminal. Mine is a dictionary that maps to a list of 200 customer entries. You should always open up the raw JSON file in your IDE to get a closer look at its structure and schema.</p>
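<p>If you’d rather not print the whole file, a summary of the top-level structure is often enough. The sketch below is one way to do that. It is not the code from the screenshot above, and <code>describe_json</code> is a hypothetical helper name introduced for this illustration.</p>
<pre><code class="lang-python">import json


def describe_json(path):
    """Load a JSON file and print a summary of its top-level structure."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

    if isinstance(data, dict):
        # Report each top-level key with the type (and size, for lists) of its value
        for key, value in data.items():
            size = f" with {len(value)} items" if isinstance(value, list) else ""
            print(f"{key!r}: {type(value).__name__}{size}")
    elif isinstance(data, list):
        print(f"list with {len(data)} items")
    return data


# Usage, assuming old_customers.json sits in the project directory:
# data = describe_json("old_customers.json")
</code></pre>
<p>The function returns the loaded data, so you can keep exploring it (for example, by printing the first entry) without reading the file again.</p>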
<h2 id="heading-define-the-target-schema">Define the Target Schema</h2>
<p>If someone asks for JSON data to be cleaned, it probably means that the <a target="_blank" href="https://json-schema.org/understanding-json-schema/about">current schema</a> is unsuitable for its intended purpose. At this point, you want to be clear on what the final JSON export should look like.</p>
<p>A JSON schema is essentially a blueprint that describes:</p>
<ul>
<li><p>required fields</p>
</li>
<li><p>field names</p>
</li>
<li><p>data type for each field</p>
</li>
<li><p>standardized formats (for example, lowercase emails, trimmed whitespace, etc.)</p>
</li>
</ul>
<p>Here’s what the old schema versus the target schema looks like:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751956173106/d5957404-57ae-4de9-b61b-90eefa0b9260.jpeg" alt="A screenshot of the old JSON Schema to be transformed" class="image--center mx-auto" width="597" height="222" loading="lazy"></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751956365336/dcf6a024-1ae6-4c95-92ae-5544ba4cbb3e.jpeg" alt="The expected JSON Schema" class="image--center mx-auto" width="460" height="186" loading="lazy"></p>
<p>As you can see, the goal is to delete the <code>"customer_id"</code> and <code>"address"</code> fields in each entry and rename the rest from:</p>
<ul>
<li><p><code>"name"</code> to <code>"full_name"</code></p>
</li>
<li><p><code>"email"</code> to <code>"email_address"</code></p>
</li>
<li><p><code>"phone"</code> to <code>"mobile"</code></p>
</li>
<li><p><code>"membership_level"</code> to <code>"tier"</code></p>
</li>
</ul>
<p>The output should contain 4 response fields instead of 6, all renamed to fit the project requirements.</p>
<h2 id="heading-how-to-clean-json-data-with-pure-python">How to Clean JSON Data with Pure Python</h2>
<p>Let’s explore using Python’s built-in <code>json</code> module to align the raw data with the predefined schema.</p>
<h3 id="heading-step-1-import-json-and-time-modules">Step 1: Import <code>json</code> and <code>time</code> modules</h3>
<p>Importing <code>json</code> is necessary because we’re working with JSON files. We’ll also use the <code>time</code> module to track how long the data cleaning process takes.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> time
</code></pre>
<h3 id="heading-step-2-load-the-file-with-jsonload">Step 2: Load the file with <code>json.load()</code></h3>
<pre><code class="lang-python">start_time = time.time()
<span class="hljs-keyword">with</span> open(<span class="hljs-string">'old_customers.json'</span>) <span class="hljs-keyword">as</span> file:
    crm_data = json.load(file)
</code></pre>
<h3 id="heading-step-3-write-a-function-to-loop-through-and-clean-each-customer-entry-in-the-dictionary">Step 3: Write a function to loop through and clean each customer entry in the dictionary</h3>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">clean_data</span>(<span class="hljs-params">records</span>):</span>
    transformed_records = []
    <span class="hljs-keyword">for</span> customer <span class="hljs-keyword">in</span> records[<span class="hljs-string">"customers"</span>]:
        transformed_records.append({
            <span class="hljs-string">"full_name"</span>: customer[<span class="hljs-string">"name"</span>],
            <span class="hljs-string">"email_address"</span>: customer[<span class="hljs-string">"email"</span>],
            <span class="hljs-string">"mobile"</span>: customer[<span class="hljs-string">"phone"</span>],
            <span class="hljs-string">"tier"</span>: customer[<span class="hljs-string">"membership_level"</span>],
        })
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"customers"</span>: transformed_records}

new_data = clean_data(crm_data)
</code></pre>
<p><code>clean_data()</code> takes the original data (<strong>temporarily</strong> held in the <code>records</code> parameter) and transforms it to match our target schema.</p>
<p>Since the JSON file we loaded is a dictionary containing a <code>"customers"</code> key, which maps to a list of customer entries, we access this key and loop through each entry in the list.</p>
<p>In the for loop, we rename the relevant fields and store the cleaned entries in a new list called <code>transformed_records</code>.</p>
<p>Then, we return a new dictionary with the <code>"customers"</code> key intact.</p>
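<p>To see the renaming on a single entry, here is a small variation that drives the transformation from a mapping of old field names to new ones. It’s only a sketch: <code>FIELD_MAP</code> and <code>clean_record</code> are hypothetical names, and the sample record is made up for illustration.</p>
<pre><code class="lang-python"># Old field name mapped to new field name, taken from the target schema above
FIELD_MAP = {
    "name": "full_name",
    "email": "email_address",
    "phone": "mobile",
    "membership_level": "tier",
}


def clean_record(customer):
    """Keep only the mapped fields and rename them; missing fields become None."""
    return {new: customer.get(old) for old, new in FIELD_MAP.items()}


# Made-up record for illustration only
sample = {
    "customer_id": 1,
    "name": "Ada Obi",
    "email": "ada@example.com",
    "phone": "555-0100",
    "address": "12 Palm Street",
    "membership_level": "gold",
}

print(clean_record(sample))
# {'full_name': 'Ada Obi', 'email_address': 'ada@example.com', 'mobile': '555-0100', 'tier': 'gold'}
</code></pre>
<p>Unlike the direct indexing in <code>clean_data()</code>, which raises a <code>KeyError</code> when a field is absent, <code>.get()</code> fills missing fields with <code>None</code>. Which behavior you want depends on whether incomplete records should stop the run or pass through for later review.</p>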
<h3 id="heading-step-4-save-the-output-in-a-json-file">Step 4: Save the output in a .json file</h3>
<p>Decide on a name for your cleaned JSON data and assign that to an <code>output_file</code> variable, like so:</p>
<pre><code class="lang-python">output_file = <span class="hljs-string">"transformed_data.json"</span>
<span class="hljs-keyword">with</span> open(output_file, <span class="hljs-string">"w"</span>) <span class="hljs-keyword">as</span> f:
    json.dump(new_data, f, indent=<span class="hljs-number">4</span>)
</code></pre>
<p>You can also add a <code>print()</code> statement below this block to confirm that the file has been saved in your project directory.</p>
<h3 id="heading-step-5-time-the-data-cleaning-process">Step 5: Time the data cleaning process</h3>
<p>At the beginning of this process, we imported the time module to measure how long it takes to clean up JSON data using pure Python. To track the runtime, we stored the current time in a <code>start_time</code> variable before the cleaning function, and we’ll now include an <code>end_time</code> variable at the end of the script.</p>
<p>The difference between the <code>end_time</code> and <code>start_time</code> values gives you the total runtime in seconds.</p>
<pre><code class="lang-python">end_time = time.time()
elapsed_time = end_time - start_time

print(<span class="hljs-string">f"Transformed data saved to <span class="hljs-subst">{output_file}</span>"</span>)
print(<span class="hljs-string">f"Processing data took <span class="hljs-subst">{elapsed_time:<span class="hljs-number">.2</span>f}</span> seconds"</span>)
</code></pre>
<p>Here’s how long the data cleaning process took with the pure Python approach:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751957367537/4a33fc16-7158-427e-b715-bec10a586857.jpeg" alt="Script runtime displayed in terminal" class="image--center mx-auto" width="766" height="88" loading="lazy"></p>
<h2 id="heading-how-to-clean-json-data-with-pandas">How to Clean JSON Data with Pandas</h2>
<p>Now we’re going to try achieving the same results as above, using Python and a third-party library called <code>pandas</code>. Pandas is an open-source library used for data manipulation and analysis in Python.</p>
<p>To get started, you need to have the Pandas library installed in your Python environment. In your terminal, run:</p>
<pre><code class="lang-python">pip install pandas
</code></pre>
<p>Then follow these steps:</p>
<h3 id="heading-step-1-import-the-relevant-libraries">Step 1: Import the relevant libraries</h3>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json
<span class="hljs-keyword">import</span> time
<span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
</code></pre>
<h3 id="heading-step-2-load-file-and-extract-customer-entries">Step 2: Load file and extract customer entries</h3>
<p>Unlike the pure Python method, where we simply indexed the key name <code>"customers"</code> to access the list of customer data, working with <code>pandas</code> requires a slightly different approach.</p>
<p>We extract the list before loading it into a DataFrame because <code>pandas</code> expects tabular data, and a list of dictionaries maps cleanly onto rows and columns. Extracting the customer dictionaries upfront means we clean only the relevant records and avoid errors caused by nested or unrelated JSON data.</p>
<pre><code class="lang-python">start_time = time.time()
<span class="hljs-keyword">with</span> open(<span class="hljs-string">'old_customers.json'</span>, <span class="hljs-string">'r'</span>) <span class="hljs-keyword">as</span> f:
    crm_data = json.load(f)

<span class="hljs-comment">#Extract the list of customer entries</span>
clients = crm_data.get(<span class="hljs-string">"customers"</span>, [])
</code></pre>
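<p>To see why the snippet above uses <code>.get("customers", [])</code> rather than plain indexing, here is a small sketch. The two payloads are made up for illustration and are not taken from <code>old_customers.json</code>.</p>

```python
import json

# Hypothetical payloads: one with the expected key, one without it.
with_key = json.loads('{"customers": [{"name": "Ada"}], "meta": {"source": "crm"}}')
without_key = json.loads('{"meta": {"source": "crm"}}')

# .get() returns the list when the key exists...
print(with_key.get("customers", []))

# ...and falls back to an empty list when it does not.
print(without_key.get("customers", []))

# Plain indexing raises KeyError for the same missing key.
try:
    without_key["customers"]
except KeyError as missing:
    print("Missing key:", missing)
```

<p>An empty list still loads into a DataFrame, so the rest of the script can run even if the key is absent.</p>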
<h3 id="heading-step-3-load-customer-entries-into-a-dataframe">Step 3: Load customer entries into a DataFrame</h3>
<p>Once you’ve got a clean list of customer dictionaries, load the list into a DataFrame and assign the DataFrame to a variable, like so:</p>
<pre><code class="lang-python"><span class="hljs-comment">#Load into a dataframe</span>
df = pd.DataFrame(clients)
</code></pre>
<p>This creates a tabular or spreadsheet-like structure, where each row represents a customer. Loading the list into a DataFrame also allows you to access <code>pandas</code>’ powerful data cleaning methods like:</p>
<ul>
<li><p><code>drop_duplicates()</code>: removes duplicate rows or entries from a DataFrame</p>
</li>
<li><p><code>dropna()</code>: drops rows with any missing or null data</p>
</li>
<li><p><code>fillna(value)</code>: replaces all missing or null data with a specified value</p>
</li>
<li><p><code>drop(columns)</code>: drops unused columns explicitly</p>
</li>
</ul>
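<p>The sketch below tries each of these methods on a tiny DataFrame. The records and the <code>notes</code> column are hypothetical and only serve to show what each call does.</p>

```python
import pandas as pd

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"name": "Ada", "email": "ada@example.com", "notes": "vip"},
    {"name": "Ada", "email": "ada@example.com", "notes": "vip"},  # exact duplicate
    {"name": "Ben", "email": None, "notes": "new"},
]
df = pd.DataFrame(records)

# Remove rows that are exact duplicates of an earlier row.
deduped = df.drop_duplicates()

# Drop rows that contain at least one missing value.
complete_rows = deduped.dropna()

# Alternatively, keep every row and fill the gaps with a default.
filled = deduped.fillna("N/A")

# Drop a column that the target schema does not need.
trimmed = filled.drop(columns=["notes"])

print(trimmed.to_dict(orient="records"))
```

<p>Each method returns a new DataFrame, so <code>df</code> itself is left unchanged unless you reassign it.</p>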
<h3 id="heading-step-4-write-a-custom-function-to-rename-relevant-fields">Step 4: Write a custom function to rename relevant fields</h3>
<p>At this point, we need a function that takes in a single customer entry – a row – and returns a cleaned version that fits the target schema (<code>"full_name"</code>, <code>"email_address"</code>, <code>"mobile"</code> and <code>"tier"</code>).</p>
<p>The function should also handle missing data by setting default values like <strong>"Unknown"</strong> or <strong>"N/A"</strong> when a field is absent.</p>
<p><strong>P.S:</strong> At first, I used <code>drop(columns)</code> to explicitly remove the <code>"address"</code> and <code>"customer_id"</code> fields. But it’s not needed in this case, as the <code>transform_fields()</code> function only selects and renames the required fields. Any extra columns are automatically excluded from the cleaned data.</p>
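<p>This step does not show the function body, so here is a minimal sketch of what such a function could look like. It is not the original implementation: the source keys (<code>"name"</code>, <code>"email"</code>, <code>"phone"</code>, <code>"membership"</code>) are assumptions, so swap in the keys from your own file. The helper <code>is_missing()</code> is also hypothetical. It treats <code>NaN</code> as missing because that is how <code>pandas</code> represents absent values in a row.</p>

```python
def is_missing(value):
    """Return True for None or NaN (pandas stores absent values as NaN)."""
    return value is None or (isinstance(value, float) and value != value)


def transform_fields(row):
    """Map one customer record onto the target schema.

    Works with a plain dict or a pandas row, since both provide .get().
    The source keys ("name", "email", "phone", "membership") are
    hypothetical; replace them with the keys in your own file.
    """
    def pick(key, default):
        value = row.get(key, default)
        return default if is_missing(value) else value

    return {
        "full_name": pick("name", "Unknown"),
        "email_address": pick("email", "N/A"),
        "mobile": pick("phone", "N/A"),
        "tier": pick("membership", "N/A"),
    }


# Hypothetical record with a null email and two fields the schema ignores.
sample = {"customer_id": 7, "name": "Ada Obi", "email": None, "address": "Lagos"}
print(transform_fields(sample))
```

<p>Because the function builds a new dictionary from only the four target keys, fields such as <code>"address"</code> and <code>"customer_id"</code> never reach the output.</p>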
<h3 id="heading-step-5-apply-schema-transformation-to-all-rows">Step 5: Apply schema transformation to all rows</h3>
<p>We’ll use <code>pandas</code>' <code>apply()</code> method to apply our custom function to each row in the DataFrame. This creates a Series of dictionaries (for example, 0 → {...}, 1 → {...}, 2 → {...}), which <code>json.dump()</code> cannot serialize directly.</p>
<p>Because <code>json.dump()</code> only accepts built-in types such as lists and dictionaries, not a Pandas Series, we’ll call <code>tolist()</code> to convert the Series to a list of dictionaries.</p>
<pre><code class="lang-python"><span class="hljs-comment">#Apply schema transformation to all rows</span>
transformed_df = df.apply(transform_fields, axis=<span class="hljs-number">1</span>)

<span class="hljs-comment">#Convert series to list of dicts</span>
transformed_data = transformed_df.tolist()
</code></pre>
<p>Another way to approach this is with list comprehension. Instead of using <code>apply()</code> at all, you can write:</p>
<pre><code class="lang-python">transformed_data = [transform_fields(row) <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> df.to_dict(orient=<span class="hljs-string">"records"</span>)]
</code></pre>
<p><code>orient="records"</code> is an argument for <code>df.to_dict()</code> that tells pandas to convert the DataFrame to a list of dictionaries, where each dictionary represents a single customer record (that is, one row).</p>
<p>Then the <strong>for loop</strong> iterates through every customer record on the list, calling the custom function on each row. Finally, the list comprehension (<strong>[...]</strong>) collects the cleaned rows into a new list.</p>
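<p>As a quick check that both routes produce the same list, here is a small sketch. The function <code>to_target()</code> is a hypothetical one-field stand-in for <code>transform_fields()</code>, and the two-row DataFrame is made up.</p>

```python
import pandas as pd


def to_target(row):
    # Hypothetical one-field transform, standing in for transform_fields().
    return {"full_name": row["name"]}


df = pd.DataFrame([{"name": "Ada"}, {"name": "Ben"}])

# apply() with axis=1 passes each row as a Series and returns a Series of dicts.
as_series = df.apply(to_target, axis=1)
via_apply = as_series.tolist()

# to_dict(orient="records") yields plain dicts, one per row.
via_comprehension = [to_target(row) for row in df.to_dict(orient="records")]

print(type(as_series).__name__)
print(via_apply == via_comprehension)
```

<p>The main difference is what the function receives: a <code>pandas</code> row with <code>apply()</code>, and a plain dictionary with the list comprehension.</p>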
<h3 id="heading-step-6-save-the-output-in-a-json-file">Step 6: Save the output in a .json file</h3>
<pre><code class="lang-python"><span class="hljs-comment">#Save the cleaned data</span>
output_data = {<span class="hljs-string">"customers"</span>: transformed_data}
output_file = <span class="hljs-string">"applypandas_customer.json"</span>
<span class="hljs-keyword">with</span> open(output_file, <span class="hljs-string">"w"</span>) <span class="hljs-keyword">as</span> f:
    json.dump(output_data, f, indent=<span class="hljs-number">4</span>)
</code></pre>
<p>I recommend picking a different file name for your <code>pandas</code> output. You can inspect both files side by side to see if this output matches the result you got from cleaning with pure Python.</p>
<h3 id="heading-step-7-track-runtime">Step 7: Track runtime</h3>
<p>Once again, check for the difference between start time and end time to determine the program’s execution time.</p>
<pre><code class="lang-python">end_time = time.time()
elapsed_time = end_time - start_time

print(<span class="hljs-string">f"Transformed data saved to <span class="hljs-subst">{output_file}</span>"</span>)
print(<span class="hljs-string">f"Processing data took <span class="hljs-subst">{elapsed_time:<span class="hljs-number">.2</span>f}</span> seconds"</span>)
</code></pre>
<p>When I used <strong>list comprehension</strong> to apply the custom function, my script’s runtime was <strong>0.03 seconds</strong>, but with <code>pandas</code>’ <code>apply()</code> function, the total runtime dropped to <strong>0.01 seconds</strong>.</p>
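<p>Differences of a few hundredths of a second can vary from run to run, so it may help to repeat the measurement. The sketch below is one way to do that with <code>time.perf_counter()</code>, a standard-library clock intended for timing short intervals. The helper <code>time_call()</code> and the <code>clean()</code> workload are hypothetical and stand in for the cleaning step.</p>

```python
import time


def time_call(func, *args, repeats=5):
    """Return the best wall-clock time, in seconds, over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical workload standing in for the cleaning step.
def clean(records):
    return [{"full_name": r.get("name", "Unknown")} for r in records]


records = [{"name": f"Customer {i}"} for i in range(10_000)]
print(f"Best of 5 runs: {time_call(clean, records):.4f} seconds")
```

<p>Taking the best of several runs reduces the effect of one-off delays such as disk caching or other programs competing for the CPU.</p>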
<h3 id="heading-final-output-preview">Final output preview:</h3>
<p>If you followed this tutorial closely, your JSON output should look like this – whether you used the <code>pandas</code> method or the pure Python approach:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1751961256627/d7b585f7-4585-4354-9fa7-a171adb31f90.jpeg" alt="The expected JSON output after schema transformation" class="image--center mx-auto" width="455" height="310" loading="lazy"></p>
<h2 id="heading-how-to-validate-the-cleaned-json">How to Validate the Cleaned JSON</h2>
<p>Validating your output ensures that the cleaned data follows the expected structure before being used or shared. This step helps to catch formatting errors, missing fields, and wrong data types early.</p>
<p>Below are the steps for validating your cleaned JSON file:</p>
<h3 id="heading-step-1-install-and-import-jsonschema">Step 1: Install and import <code>jsonschema</code></h3>
<p><code>jsonschema</code> is a third-party validation library for Python. It helps you define the expected structure of your JSON data and automatically check if your output matches that structure.</p>
<p>In your terminal, run:</p>
<pre><code class="lang-python">pip install jsonschema
</code></pre>
<p>Import the required libraries:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json
<span class="hljs-keyword">from</span> jsonschema <span class="hljs-keyword">import</span> validate, ValidationError
</code></pre>
<p><code>validate()</code> checks whether your JSON data matches the rules defined in your schema. If the data is valid, nothing happens. But if there’s an error – like a missing field or wrong data type – it raises a <code>ValidationError</code>.</p>
<h3 id="heading-step-2-define-a-schema">Step 2: Define a schema</h3>
<p>A JSON schema is specific to the structure of the data it describes. If your JSON data differs from what we’ve been working with so far, the <a target="_blank" href="https://json-schema.org/learn/getting-started-step-by-step#validate-json-data-against-the-schema">JSON Schema getting-started guide</a> shows how to create your own. Otherwise, the schema below defines the structure we expect for our cleaned JSON:</p>
<pre><code class="lang-python">schema = {
    <span class="hljs-string">"type"</span>: <span class="hljs-string">"object"</span>,
    <span class="hljs-string">"properties"</span>: {
        <span class="hljs-string">"customers"</span>: {
            <span class="hljs-string">"type"</span>: <span class="hljs-string">"array"</span>,
            <span class="hljs-string">"items"</span>: {
                <span class="hljs-string">"type"</span>: <span class="hljs-string">"object"</span>,
                <span class="hljs-string">"properties"</span>: {
                    <span class="hljs-string">"full_name"</span>: {<span class="hljs-string">"type"</span>: <span class="hljs-string">"string"</span>},
                    <span class="hljs-string">"email_address"</span>: {<span class="hljs-string">"type"</span>: <span class="hljs-string">"string"</span>},
                    <span class="hljs-string">"mobile"</span>: {<span class="hljs-string">"type"</span>: <span class="hljs-string">"string"</span>},
                    <span class="hljs-string">"tier"</span>: {<span class="hljs-string">"type"</span>: <span class="hljs-string">"string"</span>}
                },
                <span class="hljs-string">"required"</span>: [<span class="hljs-string">"full_name"</span>, <span class="hljs-string">"email_address"</span>, <span class="hljs-string">"mobile"</span>, <span class="hljs-string">"tier"</span>]
            }
        }
    },
    <span class="hljs-string">"required"</span>: [<span class="hljs-string">"customers"</span>]
}
</code></pre>
<ul>
<li><p>The data is an object that must contain a key: <code>"customers"</code>.</p>
</li>
<li><p><code>"customers"</code> must be an <strong>array</strong> (a list), with each object representing one customer entry.</p>
</li>
<li><p>Each customer entry must have four fields, all strings:</p>
<ul>
<li><p><code>"full_name"</code></p>
</li>
<li><p><code>"email_address"</code></p>
</li>
<li><p><code>"mobile"</code></p>
</li>
<li><p><code>"tier"</code></p>
</li>
</ul>
</li>
<li><p>The <code>"required"</code> lists ensure that the <code>"customers"</code> key is present and that none of the four fields is missing from any customer record.</p>
</li>
</ul>
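<p>To see the <code>"required"</code> rule in action, here is a small sketch that uses a trimmed-down version of the schema above (two fields instead of four). The records are hypothetical, and the second one deliberately leaves out <code>"tier"</code>.</p>

```python
from jsonschema import validate, ValidationError

# Trimmed-down version of the schema above: two fields instead of four.
mini_schema = {
    "type": "object",
    "properties": {
        "customers": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "full_name": {"type": "string"},
                    "tier": {"type": "string"},
                },
                "required": ["full_name", "tier"],
            },
        }
    },
    "required": ["customers"],
}

# Hypothetical records: the second one has no "tier" key.
data = {"customers": [{"full_name": "Ada", "tier": "gold"}, {"full_name": "Ben"}]}

try:
    validate(instance=data, schema=mini_schema)
    result = "valid"
except ValidationError as e:
    result = e.message

print(result)
```

<p>Note that <code>"required"</code> only checks that a key is present. A default such as <code>"N/A"</code> still counts as a valid string.</p>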
<h3 id="heading-step-3-load-the-cleaned-json-file">Step 3: Load the cleaned JSON file</h3>
<pre><code class="lang-python"><span class="hljs-keyword">with</span> open(<span class="hljs-string">"transformed_data.json"</span>) <span class="hljs-keyword">as</span> f:
    data = json.load(f)
</code></pre>
<h3 id="heading-step-4-validate-the-data">Step 4: Validate the data</h3>
<p>For this step, we’ll use a <code>try...except</code> block to end the process safely, and display a helpful message if the code raises a <code>ValidationError</code>.</p>
<pre><code class="lang-python"><span class="hljs-keyword">try</span>:
    validate(instance=data, schema=schema)
    print(<span class="hljs-string">"JSON is valid."</span>)
<span class="hljs-keyword">except</span> ValidationError <span class="hljs-keyword">as</span> e:
    print(<span class="hljs-string">"JSON is invalid:"</span>, e.message)
</code></pre>
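<p><code>validate()</code> raises a single <code>ValidationError</code>, even when the data has several problems. If you would rather list every problem at once, one option is the <code>iter_errors()</code> method of a <code>jsonschema</code> validator class. The sketch below uses <code>Draft7Validator</code> with a compact schema and made-up records. Both are assumptions for illustration, not part of the tutorial’s files.</p>

```python
from jsonschema import Draft7Validator

# Compact schema for illustration: a list of customers with two string fields.
customers_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "full_name": {"type": "string"},
            "tier": {"type": "string"},
        },
        "required": ["full_name", "tier"],
    },
}

# Hypothetical records with two separate problems.
customers = [
    {"full_name": 123, "tier": "gold"},  # wrong type for full_name
    {"full_name": "Ben"},                # missing tier
]

validator = Draft7Validator(customers_schema)
errors = sorted(
    validator.iter_errors(customers),
    key=lambda e: [str(part) for part in e.path],
)

# Print where each problem is, followed by the library's message.
for error in errors:
    location = "/".join(str(part) for part in error.path) or "(root)"
    print(f"{location}: {error.message}")
```

<p>Each error’s <code>path</code> points to the offending record and field, which makes it easier to find bad entries in a long customer list.</p>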
<h2 id="heading-pandas-vs-pure-python-for-data-cleaning">Pandas vs Pure Python for Data Cleaning</h2>
<p>From this tutorial, you can probably tell that using pure Python to clean and restructure JSON is the more straightforward approach. It is fast and ideal for handling small datasets or simple transformations.</p>
<p>But as data grows and becomes more complex, you might need advanced data cleaning methods that Python alone does not provide. In such cases, <code>pandas</code> becomes the better choice. It handles large, complex datasets effectively, providing built-in functions for handling missing data and removing duplicates.</p>
<p>You can study the <a target="_blank" href="https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf">Pandas cheatsheet</a> to learn more data manipulation methods.</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
