<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ ML - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ ML - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Sun, 21 Jun 2026 23:14:26 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/ml/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ Model Packaging Tools Every MLOps Engineer Should Know ]]>
                </title>
                <description>
                    <![CDATA[ Most machine learning deployments don’t fail because the model is bad. They fail because of packaging. Teams often spend months fine-tuning models (adjusting hyperparameters and improving architecture ]]>
                </description>
                <link>https://www.freecodecamp.org/news/model-packaging-tools-every-mlops-engineer-should-know/</link>
                <guid isPermaLink="false">69d3ca7840c9cabf443c9ce3</guid>
                
                    <category>
                        <![CDATA[ ML ]]>
                    </category>
                
                    <category>
                        <![CDATA[ mlops ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Devops ]]>
                    </category>
                
                    <category>
                        <![CDATA[ AI ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Temitope Oyedele ]]>
                </dc:creator>
                <pubDate>Mon, 06 Apr 2026 15:00:08 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/4fa02714-2cea-4592-813e-a5d5ebaf0842.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Most machine learning deployments don’t fail because the model is bad. They fail because of packaging.</p>
<p>Teams often spend months fine-tuning models (adjusting hyperparameters and improving architectures) only to hit a wall when it’s time to deploy. Suddenly, the production system can’t even read the model file. Everything breaks at the handoff between research and production.</p>
<p>The good news? If you think about packaging from the start, you can save up to 60% of the time usually spent during deployment. That’s because you avoid the common friction between the experimental environment and the production system.</p>
<p>In this guide, we’ll walk through eleven essential tools every MLOps engineer should know. To keep things clear, we’ll group them into three stages of a model’s lifecycle:</p>
<ul>
<li><p><strong>Serialization</strong>: how models are stored and transferred</p>
</li>
<li><p><strong>Bundling &amp; Serving</strong>: how models are deployed and run</p>
</li>
<li><p><strong>Registry</strong>: how models are tracked and versioned</p>
</li>
</ul>
<h2 id="heading-table-of-contents">Table Of Contents</h2>
<ul>
<li><p><a href="#heading-model-serialization-formats">Model Serialization Formats</a></p>
<ul>
<li><p><a href="#heading-1-onnx-open-neural-network-exchange">1. ONNX (Open Neural Network Exchange)</a></p>
</li>
<li><p><a href="#heading-2-torchscript">2. TorchScript</a></p>
</li>
<li><p><a href="#heading-3-tensorflow-savedmodel">3. TensorFlow SavedModel</a></p>
</li>
<li><p><a href="#heading-4-pickle-and-joblib">4. Pickle and Joblib</a></p>
</li>
<li><p><a href="#heading-5-safetensors">5. Safetensors</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-model-bundling-and-serving-tools">Model Bundling and Serving Tools</a></p>
<ul>
<li><p><a href="#heading-1-bentoml">1. BentoML</a></p>
</li>
<li><p><a href="#heading-2-nvidia-triton-inference-server">2. NVIDIA Triton Inference Server</a></p>
</li>
<li><p><a href="#heading-3-torchserve">3. TorchServe</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-model-registries">Model Registries</a></p>
<ul>
<li><p><a href="#heading-1-mlflow-model-registry">1. MLflow Model Registry</a></p>
</li>
<li><p><a href="#heading-2-hugging-face-hub">2. Hugging Face Hub</a></p>
</li>
<li><p><a href="#heading-3-weights-and-biases">3. Weights &amp; Biases</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-conclusion">Conclusion</a></p>
</li>
</ul>
<h2 id="heading-model-serialization-formats">Model Serialization Formats</h2>
<p>Serialization is simply the process of turning a trained model into a file that can be stored and moved around. It’s the first step in the pipeline, and it matters more than people think. The format you choose determines how your model will be loaded later in production.</p>
<p>So, you want something that either works across different frameworks or is optimized for the environment where your model will eventually run.</p>
<p>Below are some of the most common tools in this space:</p>
<h3 id="heading-1-onnx-open-neural-network-exchange"><a href="https://onnx.ai/">1. ONNX (Open Neural Network Exchange)</a></h3>
<p>ONNX is basically the common language for model serialization. It lets you train a model in one framework, like PyTorch, and then deploy it somewhere else without running into compatibility issues. It also performs well across different types of hardware.</p>
<p>ONNX separates your training framework from your inference runtime and allows hardware-level optimizations like quantization and graph fusion. It’s also widely supported across cloud platforms and edge devices.</p>
<p><strong>Key considerations:</strong> This format makes it possible to decouple training from deployment, while still enabling performance optimizations across different hardware setups.</p>
<p><strong>When to use it:</strong> Use ONNX when you need portability –&nbsp;especially if different teams or environments are involved.</p>
<h3 id="heading-2-torchscript"><a href="https://docs.pytorch.org/docs/stable/torch.compiler_api.html">2. TorchScript</a></h3>
<p>TorchScript lets you compile PyTorch models into a format that can run without Python. That means you can deploy it in environments like C++ or mobile without carrying the full Python runtime.</p>
<p>It supports two approaches: tracing (recording execution with sample inputs) and scripting (capturing full control flow).</p>
<p><strong>Key considerations:</strong> Its biggest advantage is removing the Python dependency, which helps reduce latency and makes it suitable for more constrained environments.</p>
<p><strong>When to use it:</strong> Best for high-performance systems where Python would be too heavy or introduce security concerns.</p>
<h3 id="heading-3-tensorflow-savedmodel"><a href="https://www.tensorflow.org/guide/saved_model">3. TensorFlow SavedModel</a></h3>
<p>SavedModel is TensorFlow’s native format. It stores everything – the computation graph, weights, and serving logic – in a single directory.</p>
<p>It’s also the standard input format for TensorFlow Serving and the TensorFlow Lite (TFLite) converter, and it’s accepted by Google Cloud’s managed serving (AI Platform, now Vertex AI).</p>
<p><strong>Key considerations:</strong> It keeps everything within the TensorFlow ecosystem intact, so you don’t lose any part of the model when moving to production.</p>
<p><strong>When to use it:</strong> If your project is built on TensorFlow, this is the default and safest choice.</p>
<h3 id="heading-4-pickle-and-joblib">4. <a href="https://docs.python.org/3/library/pickle.html">Pickle</a> and <a href="https://joblib.readthedocs.io/en/stable/">Joblib</a></h3>
<p>Pickle is Python’s built-in way of saving objects, and Joblib builds on top of it to better handle large arrays and models.</p>
<p>These are commonly used for scikit-learn pipelines, XGBoost models, and other traditional ML setups.</p>
<p><strong>Key considerations:</strong> They’re simple and convenient, but come with real trade-offs. Pickle can execute arbitrary code when loading, which makes it unsafe in untrusted environments. It’s also tightly coupled to Python versions and library dependencies, so models can break when moved across environments.</p>
<p><strong>When to use it:</strong> Best suited for controlled environments where everything runs in the same Python stack, such as internal tools, quick prototypes, or batch jobs.</p>
<p>It’s especially practical when you’re working with classical ML models and don’t need cross-language support or long-term portability. Avoid it for production systems that require security, reproducibility, or deployment across different environments.</p>
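<p>To make the code-execution risk concrete, here is a minimal sketch that uses only the standard library. The <code>Innocent</code> class is a made-up name for illustration, and the payload is deliberately harmless. A hostile file could name any importable callable in its place.</p>
<pre><code class="language-python">import pickle


class Innocent:
    """Looks like ordinary data, but decides what unpickling will call."""

    def __reduce__(self):
        # pickle records "call this function with these arguments".
        # This sketch evaluates a harmless expression; an attacker would
        # choose something far less friendly.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Innocent())

# Loading does not rebuild an Innocent object. It runs the call that the
# byte stream asked for and hands back whatever that call returned.
loaded = pickle.loads(payload)
print(type(loaded).__name__, loaded)
</code></pre>
<p>The loaded value is the result of the call, not an <code>Innocent</code> instance. That is why a pickle file from a source you don't control should never be loaded.</p>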
<h3 id="heading-5-safetensors"><a href="https://github.com/huggingface/safetensors">5. Safetensors</a></h3>
<p>Safetensors is a newer format developed by Hugging Face. It’s designed to be safe, fast, and straightforward.</p>
<p>It avoids arbitrary code execution and allows efficient loading directly from disk.</p>
<p><strong>Key considerations:</strong> It’s both memory-efficient and secure, which makes it a strong alternative to older formats like Pickle.</p>
<p><strong>When to use it:</strong> Ideal for modern workflows where speed and safety are important.</p>
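<p>A format like this can be safe because the file holds only data: an 8-byte length prefix, a JSON header describing each tensor, and then the raw bytes. The sketch below imitates that layout with the standard library. It is a simplified illustration, not the safetensors library itself (it skips details such as optional metadata and alignment), and the function names are invented for this example.</p>
<pre><code class="language-python">import json
import os
import tempfile


def write_tensors(path, tensors):
    """Write {name: (dtype, shape, raw_bytes)} in a safetensors-style layout."""
    header = {}
    offset = 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(len(header_bytes).to_bytes(8, "little"))  # header size
        f.write(header_bytes)                             # JSON metadata
        for _, _, raw in tensors.values():
            f.write(raw)                                  # raw tensor bytes


def read_header(path):
    """Return (header, data_start) by parsing only the JSON header."""
    with open(path, "rb") as f:
        size = int.from_bytes(f.read(8), "little")
        header = json.loads(f.read(size))
    return header, 8 + size


def read_tensor_bytes(path, name):
    """Seek straight to one tensor's byte range and read just that slice."""
    header, data_start = read_header(path)
    begin, end = header[name]["data_offsets"]
    with open(path, "rb") as f:
        f.seek(data_start + begin)
        return f.read(end - begin)


demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
write_tensors(demo_path, {
    "bias": ("U8", [4], bytes([1, 2, 3, 4])),
    "scale": ("U8", [2], bytes([9, 8])),
})
print(read_header(demo_path)[0]["scale"])
print(list(read_tensor_bytes(demo_path, "scale")))
</code></pre>
<p>Reading a tensor never executes anything from the file. The reader parses JSON and seeks to a byte range. For real models, use the official safetensors package rather than this sketch.</p>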
<h2 id="heading-model-bundling-and-serving-tools">Model Bundling and Serving Tools</h2>
<p>Once your model is saved, the next step is making it usable in production. That means wrapping it in a way that can handle requests and connect it to the rest of your system.</p>
<h3 id="heading-1-bentoml"><a href="https://docs.bentoml.com/en/latest/">1. BentoML</a></h3>
<p>BentoML allows you to define your model service in Python – including preprocessing, inference, and postprocessing – and package everything into a single unit called a “Bento.”</p>
<p>This bundle includes the model, code, dependencies, and even Docker configuration.</p>
<p><strong>Key considerations:</strong> It simplifies deployment by packaging everything into one consistent artifact that can run anywhere.</p>
<p><strong>When to use it:</strong> Great when you want to ship your model and all its logic together as one deployable unit.</p>
<h3 id="heading-2-nvidia-triton-inference-server"><a href="https://github.com/triton-inference-server/server">2. NVIDIA Triton Inference Server</a></h3>
<p>Triton is NVIDIA’s production-grade inference server. It supports multiple model formats like ONNX, TorchScript, TensorFlow, and more.</p>
<p>It’s built for performance, using features like dynamic batching and concurrent execution to fully utilize GPUs.</p>
<p><strong>Key considerations:</strong> It delivers high throughput and efficiently uses hardware, especially GPUs, while supporting models from different frameworks.</p>
<p><strong>When to use it:</strong> Best for large-scale deployments where performance, low latency, and GPU usage are critical.</p>
<h3 id="heading-3-torchserve"><a href="https://docs.pytorch.org/serve/">3. TorchServe</a></h3>
<p>TorchServe is the official serving tool for PyTorch, developed with AWS.</p>
<p>It packages models into a MAR file, which includes weights, code, and dependencies, and provides APIs for managing models in production.</p>
<p><strong>Key considerations:</strong> It offers built-in features for versioning, batching, and management without needing to build everything from scratch.</p>
<p><strong>When to use it:</strong> A solid choice for deploying PyTorch models in a standard production setup.</p>
<h2 id="heading-model-registries">Model Registries</h2>
<p>A model registry is essentially your source of truth. It stores your models, tracks versions, and manages their lifecycle from experimentation to production.</p>
<p>Without one, things quickly become messy and hard to track.</p>
<h3 id="heading-1-mlflow-model-registry"><a href="https://mlflow.org/docs/latest/ml/model-registry/">1. MLflow Model Registry</a></h3>
<p>MLflow is one of the most widely used MLOps platforms. Its registry helps manage model versions and track their progression through stages like Staging and Production.</p>
<p>It also links models back to the experiments that created them.</p>
<p><strong>Key considerations:</strong> It provides strong lifecycle management and makes it easier to track and audit models.</p>
<p><strong>When to use it:</strong> Ideal for teams that need structured workflows and clear governance.</p>
<h3 id="heading-2-hugging-face-hub"><a href="https://huggingface.co/docs/hub/index">2. Hugging Face Hub</a></h3>
<p>The Hugging Face Hub is one of the largest platforms for sharing and managing models.</p>
<p>It supports both public and private repositories, along with dataset versioning and interactive demos.</p>
<p><strong>Key considerations:</strong> It offers a huge library of models and makes collaboration very easy.</p>
<p><strong>When to use it:</strong> Perfect for projects involving transformers, generative AI, or anything that benefits from sharing and discovery.</p>
<h3 id="heading-3-weights-and-biases"><a href="https://docs.wandb.ai/models">3. Weights and Biases</a></h3>
<p>Weights &amp; Biases combines experiment tracking with a model registry.</p>
<p>It connects each model directly to the training run that produced it.</p>
<p><strong>Key considerations:</strong> It gives you full traceability, so you always know how a model was created.</p>
<p><strong>When to use it:</strong> Best when you want a strong link between experimentation and production artifacts.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Machine learning systems rarely fail because the models are bad. They fail because the path to production is fragile.</p>
<p>Packaging is what connects research to production. If that connection is weak, even great models won’t make it into real use.</p>
<p>Choosing the right tools across serialization, serving, and registry layers makes systems easier to deploy and maintain. Formats like ONNX and Safetensors improve portability and safety. Tools like Triton and BentoML help with reliable serving. Registries like MLflow and Hugging Face Hub keep everything organized.</p>
<p>The main idea is simple: don’t leave deployment as something to figure out later.</p>
<p>When packaging is planned early, teams move faster and avoid a lot of unnecessary problems.</p>
<p>In practice, success in MLOps isn’t just about building models. It’s about making sure they actually run in the real world.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Build an MCP Server with Python, Docker, and Claude Code ]]>
                </title>
                <description>
                    <![CDATA[ Every MCP tutorial I've found so far has followed the same basic script: build a server, point Claude Desktop at it, screenshot the chat window, done. This is fine if you want a demo. But it's not fin ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-build-an-mcp-server-with-python-docker-and-claude-code/</link>
                <guid isPermaLink="false">69b09018abc0d95001a8f07f</guid>
                
                    <category>
                        <![CDATA[ AI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ ML ]]>
                    </category>
                
                    <category>
                        <![CDATA[ claude.ai ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Docker ]]>
                    </category>
                
                    <category>
                        <![CDATA[ mcp ]]>
                    </category>
                
                    <category>
                        <![CDATA[ mcp server ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Balajee Asish Brahmandam ]]>
                </dc:creator>
                <pubDate>Tue, 10 Mar 2026 21:41:44 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/02826050-87fa-42cb-8167-73bca4b42616.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Every MCP tutorial I've found so far has followed the same basic script: build a server, point Claude Desktop at it, screenshot the chat window, done.</p>
<p>This is fine if you want a demo. But it's not fine if you want something you can ship, defend in an interview, or hand to another developer without a README that starts with "first, install this Electron app."</p>
<p>So I built an MCP server in Python, containerized it with Docker, and wired it into Claude Code – all from the terminal, no GUI required.</p>
<p>This article walks through the full loop in one afternoon: what MCP actually is, why it matters now that OpenAI and Google have adopted it, the real security problems nobody puts in their tutorial (complete with CVEs), and every command you need to go from an empty directory to a working tool.</p>
<p>If you're between jobs and need a portfolio project that shows you understand how AI tooling actually works under the hood, this is the one.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><p><a href="#heading-what-you-will-build">What You Will Build</a></p>
</li>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-what-is-mcp-and-why-should-you-care">What is MCP (and Why Should You Care)?</a></p>
</li>
<li><p><a href="#heading-why-claude-code-instead-of-claude-desktop">Why Claude Code Instead of Claude Desktop?</a></p>
</li>
<li><p><a href="#heading-step-1-build-the-mcp-server">Step 1: Build the MCP Server</a></p>
</li>
<li><p><a href="#heading-step-2-test-it-locally">Step 2: Test It Locally</a></p>
</li>
<li><p><a href="#heading-step-3-dockerize-it">Step 3: Dockerize It</a></p>
</li>
<li><p><a href="#heading-step-4-wire-it-into-claude-code">Step 4: Wire It Into Claude Code</a></p>
</li>
<li><p><a href="#heading-step-5-use-it">Step 5: Use It</a></p>
</li>
<li><p><a href="#heading-security-what-the-other-tutorials-leave-out">Security: What the Other Tutorials Leave Out</a></p>
</li>
<li><p><a href="#heading-what-to-do-next">What to Do Next</a></p>
</li>
<li><p><a href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
</ul>
<h2 id="heading-what-you-will-build">What You Will Build</h2>
<p>By the end of this tutorial, you will have:</p>
<ul>
<li><p>A Python MCP server that exposes custom tools to any MCP-compatible AI client</p>
</li>
<li><p>A Docker container that packages the server for reproducible deployment</p>
</li>
<li><p>A working connection between that container and Claude Code in your terminal</p>
</li>
<li><p>An understanding of the security risks involved and how to mitigate the worst of them</p>
</li>
</ul>
<p>The server we are building is a <strong>project scaffolder</strong>. You give it a project name and a language, and it generates a starter directory structure with the right files. It's simple enough to build in an afternoon, but useful enough to actually put on your résumé.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>You will need the following installed on your machine:</p>
<ul>
<li><p><strong>Python 3.10+</strong> (check with <code>python3 --version</code>)</p>
</li>
<li><p><strong>Docker</strong> (check with <code>docker --version</code>)</p>
</li>
<li><p><strong>Claude Code</strong> with an active Claude Pro, Max, or API plan (check with <code>claude --version</code>)</p>
</li>
<li><p><strong>Node.js 20+</strong> (required by Claude Code – check with <code>node --version</code>)</p>
</li>
<li><p>A terminal you are comfortable in</p>
</li>
</ul>
<p>If you don't have Claude Code installed yet, follow the <a href="https://code.claude.com/docs/en/getting-started">official installation instructions</a>. The npm installation method is deprecated, so make sure you use the native binary installer instead.</p>
<h2 id="heading-what-is-mcp-and-why-should-you-care">What is MCP (and Why Should You Care)?</h2>
<p>The Model Context Protocol (MCP) is an open standard that lets AI models connect to external tools and data sources. Anthropic released it in November 2024, and within a year it became the default way to extend what an LLM can do. OpenAI adopted it in March 2025. Google DeepMind followed in April 2025. The protocol now has over 97 million monthly SDK downloads and more than 10,000 active servers.</p>
<p>The easiest way to think about MCP is as a USB-C port for AI. Before MCP, every AI provider had its own way of calling tools. OpenAI had function calling. Google had their own format. If you wanted your tool to work with multiple models, you had to implement it multiple times. MCP gives you one interface that works everywhere.</p>
<p>Here is how the pieces fit together:</p>
<ul>
<li><p>An <strong>MCP server</strong> exposes tools, resources, and prompts. It is your code.</p>
</li>
<li><p>An <strong>MCP client</strong> (like Claude Code, Claude Desktop, or Cursor) discovers those tools and calls them on behalf of the LLM.</p>
</li>
<li><p>The <strong>transport</strong> is how they communicate. For local servers, that's usually stdio (standard input/output). For remote servers, it's HTTP.</p>
</li>
</ul>
<p>When you type a message in Claude Code and it decides to use one of your tools, here is what happens: Claude Code sends a JSON-RPC 2.0 message to your server over stdin, your server executes the tool and writes the result to stdout, and Claude Code reads it back. The LLM never talks to your server directly. The client is always in the middle.</p>
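<p>As a rough picture of that exchange, the sketch below builds the kind of request and response involved in a single tool call. It assumes a tool named <code>scaffold_project</code> (the one this tutorial goes on to build), and the <code>encode</code> and <code>matches</code> helpers are invented for illustration. A real client and server also perform an initialization handshake that is omitted here.</p>
<pre><code class="language-python">import json

# What the client writes to the server's stdin: one JSON object per line.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "scaffold_project",
        "arguments": {"name": "demo-app", "language": "python"},
    },
}

# What the server writes back on stdout. The "id" ties it to the request.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": '{"status": "created"}'}],
        "isError": False,
    },
}


def encode(message):
    """Serialize one message as a single newline-terminated line."""
    return json.dumps(message, separators=(",", ":")) + "\n"


def matches(req, resp):
    """Check that a response answers the given request."""
    return resp.get("jsonrpc") == "2.0" and resp.get("id") == req["id"]


wire = encode(request)
print(wire, end="")
print(matches(request, json.loads(encode(response))))
</code></pre>
<p>Because each message occupies exactly one line on the stream, anything else printed to stdout ends up in the middle of the protocol. That detail matters when you test the server.</p>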
<p>If you want the deeper architecture breakdown, freeCodeCamp already has a <a href="https://www.freecodecamp.org/news/how-does-an-mcp-work-under-the-hood/">solid explainer on how MCP works under the hood</a>. Here, I will focus on building.</p>
<h2 id="heading-why-claude-code-instead-of-claude-desktop">Why Claude Code Instead of Claude Desktop?</h2>
<p>Most MCP tutorials use Claude Desktop as the client. That works, but Claude Code has a few advantages for developers:</p>
<ol>
<li><p><strong>It lives in your terminal.</strong> No GUI to configure. No JSON files to hand-edit in hidden config directories. You add an MCP server with one command and you are done.</p>
</li>
<li><p><strong>It's already where you code.</strong> If you're writing the server, testing it, and connecting it, doing all of that in the same terminal session cuts the context switching.</p>
</li>
<li><p><strong>It works on headless machines.</strong> If you're SSHing into a dev box or running in CI, Claude Desktop isn't an option. Claude Code is.</p>
</li>
<li><p><strong>It's also an MCP server itself.</strong> Claude Code can expose its own tools (file reading, writing, shell commands) to other MCP clients via <code>claude mcp serve</code>. That's a neat trick we won't use today, but it's worth knowing about.</p>
</li>
</ol>
<p>The relevant commands:</p>
<pre><code class="language-bash"># Add an MCP server
claude mcp add &lt;name&gt; -- &lt;command&gt;

# List configured servers
claude mcp list

# Remove a server
claude mcp remove &lt;name&gt;

# Check MCP status inside Claude Code
/mcp
</code></pre>
<h2 id="heading-step-1-build-the-mcp-server">Step 1: Build the MCP Server</h2>
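<p>We're using <a href="https://github.com/jlowin/fastmcp">FastMCP</a>, a Python framework that handles all the protocol plumbing so you can focus on your tools. The code below imports it from the official MCP Python SDK (<code>mcp.server.fastmcp</code>), which is what the <code>mcp[cli]</code> package installs. Create a new project directory and set it up:</p>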
<pre><code class="language-bash">mkdir mcp-scaffolder &amp;&amp; cd mcp-scaffolder
python3 -m venv .venv
source .venv/bin/activate
pip install "mcp[cli]&gt;=1.25,&lt;2"
</code></pre>
<p>Why pin the version? The MCP Python SDK v2.0 is in development and will change the transport layer significantly. Pinning to <code>&gt;=1.25,&lt;2</code> keeps your server working until you're ready to migrate.</p>
<p>Now create <code>server.py</code>:</p>
<pre><code class="language-python"># server.py
from mcp.server.fastmcp import FastMCP
import os
import json

mcp = FastMCP("project-scaffolder")

# Templates for different languages
TEMPLATES = {
    "python": {
        "files": {
            "main.py": '"""Entry point."""\n\n\ndef main():\n    print("Hello, world!")\n\n\nif __name__ == "__main__":\n    main()\n',
            "requirements.txt": "",
            "README.md": "# {name}\n\nA Python project.\n\n## Setup\n\n```bash\npip install -r requirements.txt\npython main.py\n```\n",
            ".gitignore": "__pycache__/\n*.pyc\n.venv/\n",
        },
        "dirs": ["tests"],
    },
    "node": {
        "files": {
            "index.js": 'console.log("Hello, world!");\n',
            "package.json": '{{\n  "name": "{name}",\n  "version": "1.0.0",\n  "main": "index.js"\n}}\n',
            "README.md": "# {name}\n\nA Node.js project.\n\n## Setup\n\n```bash\nnpm install\nnode index.js\n```\n",
            ".gitignore": "node_modules/\n",
        },
        "dirs": [],
    },
    "go": {
        "files": {
            "main.go": 'package main\n\nimport "fmt"\n\nfunc main() {{\n\tfmt.Println("Hello, world!")\n}}\n',
            "go.mod": "module {name}\n\ngo 1.21\n",
            "README.md": "# {name}\n\nA Go project.\n\n## Setup\n\n```bash\ngo run main.go\n```\n",
            ".gitignore": "bin/\n",
        },
        "dirs": ["cmd", "internal"],
    },
}


@mcp.tool()
def scaffold_project(name: str, language: str) -&gt; str:
    """Create a new project directory structure.

    Args:
        name: The project name (used as the directory name)
        language: The programming language - one of: python, node, go
    """
    language = language.lower().strip()

    if language not in TEMPLATES:
        return json.dumps({
            "error": f"Unsupported language: {language}",
            "supported": list(TEMPLATES.keys()),
        })

    template = TEMPLATES[language]
    base_path = os.path.join(os.getcwd(), name)

    if os.path.exists(base_path):
        return json.dumps({
            "error": f"Directory already exists: {name}",
        })

    # Create the project directory
    os.makedirs(base_path, exist_ok=True)

    # Create subdirectories
    for dir_name in template["dirs"]:
        os.makedirs(os.path.join(base_path, dir_name), exist_ok=True)

    # Create files
    created_files = []
    for filename, content in template["files"].items():
        filepath = os.path.join(base_path, filename)
        formatted_content = content.replace("{name}", name)
        with open(filepath, "w") as f:
            f.write(formatted_content)
        created_files.append(filename)

    return json.dumps({
        "status": "created",
        "path": base_path,
        "language": language,
        "files": created_files,
        "directories": template["dirs"],
    })


@mcp.tool()
def list_templates() -&gt; str:
    """List all available project templates and their contents."""
    result = {}
    for lang, template in TEMPLATES.items():
        result[lang] = {
            "files": list(template["files"].keys()),
            "directories": template["dirs"],
        }
    return json.dumps(result, indent=2)


if __name__ == "__main__":
    mcp.run(transport="stdio")
</code></pre>
<p>A few things to notice about this code:</p>
<p>Tools return strings. MCP tools communicate through text. I'm returning JSON strings so the LLM can parse the results reliably. You could return plain text, but structured data gives the model more to work with.</p>
<p>The <code>@mcp.tool()</code> decorator does the heavy lifting. FastMCP reads your function signature and docstring to generate the JSON schema that tells the LLM what this tool does, what arguments it takes, and what types they are. Good docstrings aren't optional here – they're how the LLM decides whether to call your tool.</p>
<p><code>transport="stdio"</code> is the key line. This tells FastMCP to communicate over standard input/output, which is what Claude Code expects for local servers.</p>
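<p>To see why the signature and docstring matter so much, here is a simplified sketch of the idea behind a decorator like this. It is not FastMCP's actual implementation: <code>describe_tool</code>, <code>JSON_TYPES</code>, and <code>scaffold_stub</code> are names invented for this example, and the type mapping covers only a few basic types.</p>
<pre><code class="language-python">import inspect
import json

# Rough mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func):
    """Build a tool description from a function's signature and docstring."""
    properties = {}
    required = []
    for param in inspect.signature(func).parameters.values():
        json_type = JSON_TYPES.get(param.annotation, "string")
        properties[param.name] = {"type": json_type}
        # Parameters without a default value must be supplied by the caller.
        if param.default is inspect.Parameter.empty:
            required.append(param.name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def scaffold_stub(name: str, language: str = "python"):
    """Create a new project directory structure."""
    return name + ":" + language


tool_description = describe_tool(scaffold_stub)
print(json.dumps(tool_description, indent=2))
</code></pre>
<p>Everything the model learns about the tool comes from this description. A missing type hint or an empty docstring leaves the model with less to go on.</p>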
<h2 id="heading-step-2-test-it-locally">Step 2: Test It Locally</h2>
<p>Before we Dockerize anything, make sure the server actually works:</p>
<pre><code class="language-bash"># Quick smoke test - the server should start without errors
python server.py
</code></pre>
<p>You should see... nothing. That is correct. An MCP server over stdio just sits there waiting for JSON-RPC messages on stdin. Press <code>Ctrl+C</code> to stop it.</p>
<p>For a proper test, use the MCP Inspector (Anthropic's debugging tool):</p>
<pre><code class="language-bash"># Install and run the inspector
npx @modelcontextprotocol/inspector python server.py
</code></pre>
<p>This opens a web interface where you can see your tools, call them manually, and inspect the JSON-RPC messages going back and forth. Verify that both <code>scaffold_project</code> and <code>list_templates</code> show up and return sensible results.</p>
<p><strong>Here's a debugging tip that will save you time:</strong> If your MCP server logs anything to stdout, it will corrupt the JSON-RPC stream and the client will disconnect. Use stderr for all logging: <code>print("debug info", file=sys.stderr)</code>. This is the single most common source of "my server connects but then immediately fails" bugs. The New Stack called stdio transport "incredibly fragile" for exactly this reason.</p>
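<p>If you prefer the <code>logging</code> module over bare <code>print</code> calls, one possible setup is sketched below. The logger name and the <code>debug_tool_call</code> helper are hypothetical. The only part that matters is that the handler writes to stderr.</p>
<pre><code class="language-python">import logging
import sys

# Send log records to stderr so stdout carries nothing but protocol messages.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

log = logging.getLogger("scaffolder")
log.setLevel(logging.DEBUG)
log.addHandler(handler)
log.propagate = False  # keep records away from any other configured handlers


def debug_tool_call(tool_name, arguments):
    """Record a tool invocation without touching stdout."""
    log.debug("calling %s with %r", tool_name, arguments)


debug_tool_call("list_templates", {})
</code></pre>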
<h2 id="heading-step-3-dockerize-it">Step 3: Dockerize It</h2>
<p>Create a <code>Dockerfile</code> in your project root:</p>
<pre><code class="language-dockerfile">FROM python:3.12-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy server code
COPY server.py .

# MCP servers over stdio need unbuffered output
ENV PYTHONUNBUFFERED=1

# The server reads from stdin and writes to stdout
CMD ["python", "server.py"]
</code></pre>
<p>Create <code>requirements.txt</code>:</p>
<pre><code class="language-plaintext">mcp[cli]&gt;=1.25,&lt;2
</code></pre>
<p>Build and verify:</p>
<pre><code class="language-bash">docker build -t mcp-scaffolder .

# Quick test - should start without errors
docker run -i mcp-scaffolder
</code></pre>
<p>Again, you'll see nothing because the server is waiting for input. <code>Ctrl+C</code> to stop.</p>
<p>Two things matter in this Dockerfile:</p>
<ol>
<li><p><code>PYTHONUNBUFFERED=1</code> <strong>is critical.</strong> Without it, Python buffers stdout, and the MCP client may hang waiting for responses that are sitting in a buffer. This is one of those bugs that works fine in local testing and breaks in Docker.</p>
</li>
<li><p><code>docker run -i</code> <strong>(interactive mode) is required.</strong> The <code>-i</code> flag keeps stdin open so the MCP client can send messages to the container. Without it, the server gets an immediate EOF and exits.</p>
</li>
</ol>
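<p>The immediate-EOF behavior is easy to reproduce without Docker. The sketch below uses a tiny stand-in "server" (a one-line Python child process, not the real MCP server) and runs it once with stdin closed and once with a pipe attached. The function names are hypothetical.</p>
<pre><code class="language-python">import subprocess
import sys

# Stand-in "server": reads stdin until EOF, then reports how much text arrived.
CHILD = "import sys; data = sys.stdin.read(); print(len(data))"

def run_with_closed_stdin():
    # stdin=DEVNULL mimics `docker run` without -i: the child sees EOF at once.
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

def run_with_open_stdin(message):
    # input=... attaches a pipe, like -i, so the child actually receives data.
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=message,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    print(run_with_closed_stdin())
    print(run_with_open_stdin("hello\n"))
</code></pre>
<p>With stdin closed, the child has nothing to read and finishes right away, which is what the MCP server does when the container starts without <code>-i</code>.</p>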
<h2 id="heading-step-4-wire-it-into-claude-code">Step 4: Wire It Into Claude Code</h2>
<p>Now connect your Docker container to Claude Code:</p>
<pre><code class="language-bash">claude mcp add scaffolder -- docker run -i --rm mcp-scaffolder
</code></pre>
<p>That's the whole command. Let me break it down:</p>
<ul>
<li><p><code>claude mcp add</code> registers a new MCP server</p>
</li>
<li><p><code>scaffolder</code> is the name you will reference it by</p>
</li>
<li><p>Everything after <code>--</code> is the command Claude Code runs to start the server</p>
</li>
<li><p><code>docker run -i --rm mcp-scaffolder</code> starts the container with interactive stdin and removes it when done</p>
</li>
</ul>
<p>Verify that it registered:</p>
<pre><code class="language-bash">claude mcp list
</code></pre>
<p>You should see <code>scaffolder</code> in the output with a <code>stdio</code> transport type.</p>
<p>Now launch Claude Code and check the connection:</p>
<pre><code class="language-bash">claude
</code></pre>
<p>Once inside Claude Code, type <code>/mcp</code> to see the status of your MCP servers. You should see <code>scaffolder</code> listed as connected with two tools available.</p>
<h2 id="heading-step-5-use-it">Step 5: Use It</h2>
<p>Still inside Claude Code, try it out:</p>
<pre><code class="language-plaintext">Create a new Python project called "weather-api"
</code></pre>
<p>Claude Code should discover your <code>scaffold_project</code> tool, call it with <code>name="weather-api"</code> and <code>language="python"</code>, and report back what it created. Check your filesystem and you should see the full project structure.</p>
<p>Try a few more:</p>
<pre><code class="language-plaintext">What project templates are available?
</code></pre>
<pre><code class="language-plaintext">Scaffold a Go project called "url-shortener"
</code></pre>
<p>If Claude Code doesn't pick up your tools, run <code>/mcp</code> to check the connection status. If it shows as disconnected, the most common causes are that the Docker image failed to build, stdout is being polluted (check for stray print statements), or the Docker daemon is not running.</p>
<h2 id="heading-security-what-the-other-tutorials-leave-out">Security: What the Other Tutorials Leave Out</h2>
<p>This is the section most MCP tutorials skip. They should not. MCP has had real security incidents, not theoretical ones, and understanding them makes you a better developer.</p>
<h3 id="heading-the-prompt-injection-problem">The Prompt Injection Problem</h3>
<p>MCP servers execute code on your machine based on what an LLM decides to do. If an attacker can influence what the LLM sees, they can influence what your server does. This is called prompt injection, and it is the number one unsolved security problem in the MCP ecosystem.</p>
<p>In May 2025, researchers at Invariant Labs demonstrated this against the official GitHub MCP server. They created a malicious GitHub issue that, when read by an AI agent, hijacked the agent into leaking private repository data (including salary information) into a public pull request. The root cause was an overly broad Personal Access Token combined with untrusted content landing in the LLM's context window.</p>
<p>This was not a contrived lab demo. It used the official GitHub MCP server, the kind of thing people install from the MCP server directory without a second thought.</p>
<h3 id="heading-real-cves-not-theory">Real CVEs, Not Theory</h3>
<p>The ecosystem has accumulated real vulnerability reports:</p>
<ul>
<li><p><strong>CVE-2025-6514:</strong> A critical command-injection bug in <code>mcp-remote</code>, a popular OAuth proxy that 437,000+ environments used. An attacker could execute arbitrary OS commands through crafted OAuth redirect URIs.</p>
</li>
<li><p><strong>CVE-2025-6515:</strong> Session hijacking in <code>oatpp-mcp</code> through predictable session IDs, letting attackers inject prompts into other users' sessions.</p>
</li>
<li><p><strong>MCP Inspector RCE:</strong> Anthropic's own debugging tool allowed unauthenticated remote code execution. Inspecting a malicious server meant giving the attacker a shell on your machine.</p>
</li>
</ul>
<p>An Equixly security assessment found command injection in 43% of tested MCP server implementations. Nearly a third were vulnerable to server-side request forgery.</p>
<h3 id="heading-what-you-should-actually-do">What You Should Actually Do</h3>
<p>For the server we built today, here is what matters:</p>
<h4 id="heading-limit-file-system-access">Limit file system access</h4>
<p>Our Docker container doesn't mount your home directory. That's intentional. If you need the server to write files to your host, mount only the specific directory you need: <code>docker run -i --rm -v $(pwd)/projects:/app/projects mcp-scaffolder</code>. Never mount <code>/</code> or <code>~</code>.</p>
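<p>If you generate the <code>docker run</code> command from a script, you can enforce that rule in code. This is a minimal sketch, assuming a hypothetical helper named <code>docker_run_args</code> and the same <code>/app/projects</code> container path used above. It refuses to mount the filesystem root or your home directory.</p>
<pre><code class="language-python">from pathlib import Path

def docker_run_args(host_dir, image="mcp-scaffolder"):
    """Build a docker run command that mounts exactly one host directory."""
    host = Path(host_dir).expanduser().resolve()
    # Refuse the filesystem root and the home directory outright.
    forbidden = {Path(host.anchor), Path.home().resolve()}
    if host in forbidden:
        raise ValueError(f"refusing to mount {host}")
    return [
        "docker", "run", "-i", "--rm",
        "-v", f"{host}:/app/projects",
        image,
    ]
</code></pre>
<p>The returned list can be passed straight to <code>subprocess.run</code>, which avoids shell quoting problems with paths that contain spaces.</p>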
<h4 id="heading-validate-all-inputs">Validate all inputs</h4>
<p>Our <code>scaffold_project</code> tool checks that the language is in a known list and that the directory does not already exist. But think about what happens if someone passes <code>name="../../etc/passwd"</code> as the project name. Path traversal is the kind of thing you need to catch. Add this to the tool:</p>
<pre><code class="language-python"># Add this validation at the top of scaffold_project
if ".." in name or "/" in name or "\\" in name:
    return json.dumps({"error": "Invalid project name"})
</code></pre>
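<p>A blocklist like the one above catches the obvious cases. A stricter option is an allowlist plus a check on the resolved path. The sketch below is one way to do that, not the tutorial's own implementation: the helper name <code>check_project_name</code>, the 64-character limit, and the <code>projects</code> base directory are all assumptions made for illustration.</p>
<pre><code class="language-python">import json
import re
from pathlib import Path

# Allowlist: starts with a letter or digit, then letters, digits, "_" or "-".
NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")

def check_project_name(name, base_dir="projects"):
    """Return None if the name is safe, else a JSON error string."""
    if not NAME_PATTERN.fullmatch(name):
        return json.dumps({"error": "Invalid project name"})
    base = Path(base_dir).resolve()
    target = (base / name).resolve()
    # Defense in depth: the resolved target must sit directly inside base_dir.
    if target.parent != base:
        return json.dumps({"error": "Invalid project name"})
    return None
</code></pre>
<p>Inside the tool, you would call this first and return its result if it is not <code>None</code>. The allowlist rejects dots and slashes entirely, and the resolved-path check also catches a symlink that points outside the base directory.</p>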
<h4 id="heading-use-least-privilege-tokens">Use least-privilege tokens</h4>
<p>If your MCP server connects to an API, give it the minimum permissions it needs. The GitHub MCP incident happened because the PAT had access to every private repo. A read-only token scoped to one repo would have contained the blast radius.</p>
<h4 id="heading-do-not-install-mcp-servers-from-untrusted-sources">Do not install MCP servers from untrusted sources</h4>
<p>A malicious npm package posing as a "Postmark MCP Server" was caught silently BCC'ing all emails to an attacker's address. Treat MCP server packages with the same caution you would give any code that runs on your machine with your permissions.</p>
<h2 id="heading-what-to-do-next">What to Do Next</h2>
<p>You have a working MCP server in a Docker container, connected to Claude Code. Here is how to make it portfolio-ready:</p>
<ol>
<li><p><strong>Add more tools:</strong> The scaffolder is a starting point. Add a tool that reads a project's dependency file and lists outdated packages. Add one that generates a Dockerfile for an existing project. Each tool is a function with a decorator – the pattern is the same every time.</p>
</li>
<li><p><strong>Add tests:</strong> Write pytest tests that call your tool functions directly and verify the output. MCP tools are just Python functions. Test them like Python functions.</p>
</li>
<li><p><strong>Push the Docker image:</strong> Tag it and push to Docker Hub or GitHub Container Registry. Then your <code>claude mcp add</code> command becomes <code>claude mcp add scaffolder -- docker run -i --rm yourusername/mcp-scaffolder:latest</code> and anyone can use it.</p>
</li>
<li><p><strong>Write a README that explains the security model:</strong> What permissions does your server need? What file system access? What happens if inputs are malicious? Answering these questions in your README signals that you think about security, which is exactly what hiring managers are looking for right now.</p>
</li>
</ol>
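<p>As a rough idea of what such tests could look like, here is a sketch that uses a stand-in <code>list_templates</code> function. The real tool lives in <code>server.py</code>, and its return shape may differ from the JSON assumed here, so treat the payload structure as a placeholder.</p>
<pre><code class="language-python">import json

def list_templates():
    """Stand-in for the tutorial's tool; the real one lives in server.py."""
    return json.dumps({"templates": ["python", "go"]})

def test_list_templates_returns_valid_json():
    payload = json.loads(list_templates())
    assert "templates" in payload

def test_list_templates_includes_python():
    payload = json.loads(list_templates())
    assert "python" in payload["templates"]
</code></pre>
<p>In a real project you would delete the stand-in, import the function from your server module instead, and run the file with <code>pytest</code>.</p>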
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>We built a Python MCP server with FastMCP, containerized it with Docker, and connected it to Claude Code. The whole thing fits in about 100 lines of Python, a seven-instruction Dockerfile, and one <code>claude mcp add</code> command.</p>
<p>The MCP ecosystem is real and growing fast. The protocol has the backing of Anthropic, OpenAI, and Google. It's now governed by the Linux Foundation. But it's also young, and the security story is still being written. Build with it, but build with your eyes open.</p>
<p>If you want to go deeper, here are the resources I found most useful:</p>
<ul>
<li><p><a href="https://modelcontextprotocol.io/specification/2025-11-25">MCP specification</a>: the actual protocol docs</p>
</li>
<li><p><a href="https://code.claude.com/docs/en/mcp">Claude Code MCP documentation</a>: how Claude Code implements MCP</p>
</li>
<li><p><a href="https://github.com/jlowin/fastmcp">FastMCP GitHub</a>: the Python framework we used</p>
</li>
<li><p><a href="https://authzed.com/blog/timeline-mcp-breaches">AuthZed's timeline of MCP security incidents</a>: required reading if you are building MCP servers for production</p>
</li>
<li><p><a href="https://simonwillison.net/2025/Apr/9/mcp-prompt-injection/">Simon Willison on MCP prompt injection</a>: the clearest explanation of why this is hard to solve</p>
</li>
</ul>
<p>The complete source code for this tutorial is on <a href="https://github.com/balajeeasish/ai-workshop/tree/main/mcp-server">GitHub</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ The Open Source LLM Agent Handbook: How to Automate Complex Tasks with LangGraph and CrewAI ]]>
                </title>
                <description>
                    <![CDATA[ Ever feel like your AI tools are a bit...well, passive? Like they just sit there, waiting for your next command? Imagine if they could take initiative, break down big problems, and even work together to get things done. That's exactly what LLM agents... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/the-open-source-llm-agent-handbook/</link>
                <guid isPermaLink="false">683f04aedfb685791a4e8dd2</guid>
                
                    <category>
                        <![CDATA[ llm ]]>
                    </category>
                
                    <category>
                        <![CDATA[ openai ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Open Source ]]>
                    </category>
                
                    <category>
                        <![CDATA[ agentic AI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ #agent ]]>
                    </category>
                
                    <category>
                        <![CDATA[ agents ]]>
                    </category>
                
                    <category>
                        <![CDATA[ ML ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Bash ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Beginner Developers ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ AI ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Balajee Asish Brahmandam ]]>
                </dc:creator>
                <pubDate>Tue, 03 Jun 2025 14:20:30 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1748956366197/c4dd2bba-430a-4f12-a3d4-becc6707c52e.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Ever feel like your AI tools are a bit...well, passive? Like they just sit there, waiting for your next command? Imagine if they could take initiative, break down big problems, and even work together to get things done.</p>
<p>That's exactly what LLM agents bring to the table. They're changing how we automate complex tasks, and they can help bring our AI ideas to life in a whole new way.</p>
<p>In this article, we'll explore what LLM agents are, how they work, and how you can build your very own using awesome open-source frameworks.</p>
<h3 id="heading-what-well-cover">What we’ll cover:</h3>
<ol>
<li><p><a class="post-section-overview" href="#heading-the-current-state-of-llm-agents">The Current State of LLM Agents</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-from-chatbots-to-autonomous-agents">From Chatbots to Autonomous Agents</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-can-agents-do-today">What Can Agents Do Today?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-whats-available-to-build-with">What's Available to Build With?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-why-now-is-the-best-time-to-learn">Why Now Is the Best Time to Learn</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-what-are-llm-agents-and-why-are-they-a-big-deal">What Are LLM Agents and Why Are They a Big Deal?</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-is-an-llm">What Is an LLM?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-so-whats-an-llm-agent">So, What’s an LLM Agent?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-why-does-this-matter">Why Does This Matter?</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-the-rise-of-open-source-agent-frameworks">The Rise of Open-Source Agent Frameworks</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-popular-open-source-agent-frameworks">Popular Open-Source Agent Frameworks</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-what-these-tools-enable">What These Tools Enable</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-why-use-a-framework-instead-of-building-from-scratch">Why Use a Framework Instead of Building from Scratch?</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-core-concepts-behind-agent-design">Core Concepts Behind Agent Design</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-the-agent-loop">The Agent Loop</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-key-components-of-an-agent">Key Components of an Agent</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-multi-agent-collaboration">Multi-Agent Collaboration</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-project-automate-your-daily-schedule-from-emails">Project: Automate Your Daily Schedule from Emails</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-were-automating">What We’re Automating</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-1-install-the-required-tools">Step 1: Install the Required Tools</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-2-define-the-task">Step 2: Define the Task</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-3-build-the-workflow-with-langgraph">Step 3: Build the Workflow with LangGraph</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-multi-agent-collaboration-with-crewai">Multi-Agent Collaboration with CrewAI</a></p>
<ul>
<li><p><a class="post-section-overview" href="#heading-what-is-crewai">What Is CrewAI?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-sample-roles-for-the-email-summary-task">Sample Roles for the Email Summary Task</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-sample-crewai-code">Sample CrewAI Code</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#heading-what-actually-happens-during-execution">What Actually Happens During Execution?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-are-llm-agents-safe-what-to-know-about-security-and-privacy">Are LLM Agents Safe? What to Know About Security and Privacy</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-troubleshooting-and-tips">Troubleshooting &amp; Tips</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-explore-more-daily-automations">Explore More Daily Automations</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-whats-next-in-agent-technology">What’s Next in Agent Technology?</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-final-summary">Final Summary</a></p>
</li>
</ol>
<h2 id="heading-the-current-state-of-llm-agents">The Current State of LLM Agents</h2>
<p>LLM agents are one of the most exciting developments in AI right now. They’re already helping automate real tasks, but they’re also still evolving. So where are we today?</p>
<h3 id="heading-from-chatbots-to-autonomous-agents">From Chatbots to Autonomous Agents</h3>
<p>Large Language Models (LLMs) like GPT-4, Claude, Gemini, and LLaMA have evolved from simple chatbots into surprisingly capable reasoning engines. They've gone from answering trivia questions and generating essays to performing complex reasoning, following multi-step instructions, and interacting with tools like web search and code interpreters.</p>
<p>But here’s the catch: these models are <strong>reactive</strong>. They wait for input and give output. They don't retain memory between tasks, plan ahead, or pursue goals on their own. That’s where <strong>LLM agents</strong> come in – they bridge this gap by adding structure, memory, and autonomy.</p>
<h3 id="heading-what-can-agents-do-today">What Can Agents Do Today?</h3>
<p>Right now, LLM agents are already being used for:</p>
<ul>
<li><p>Summarizing emails or documents</p>
</li>
<li><p>Planning daily schedules</p>
</li>
<li><p>Running DevOps scripts</p>
</li>
<li><p>Searching APIs or tools for answers</p>
</li>
<li><p>Collaborating in small “teams” to complete complex tasks</p>
</li>
</ul>
<p>But they’re not perfect yet. Agents can still:</p>
<ul>
<li><p>Get stuck in loops</p>
</li>
<li><p>Misunderstand goals</p>
</li>
<li><p>Require detailed prompts and guardrails</p>
</li>
</ul>
<p>That’s because this technology is still early-stage. Frameworks are getting better fast, but reliability and memory are still works in progress. So just keep that in mind as you experiment.</p>
<h3 id="heading-why-now-is-the-best-time-to-learn">Why Now Is the Best Time to Learn</h3>
<p>The truth is: we’re still early. But not <em>too</em> early.</p>
<p>This is the perfect time to start experimenting with agents:</p>
<ul>
<li><p>The tooling is mature enough to build real projects</p>
</li>
<li><p>The community is growing rapidly</p>
</li>
<li><p>You don’t need to be an AI expert, just comfortable with Python</p>
</li>
</ul>
<h2 id="heading-what-are-llm-agents-and-why-are-they-a-big-deal">What Are LLM Agents and Why Are They a Big Deal?</h2>
<p>Before we dive into the exciting world of agents, let's quickly chat a bit more about the basics.</p>
<h3 id="heading-what-is-an-llm">What Is an LLM?</h3>
<p>An LLM, or Large Language Model, is basically an AI that's learned from a massive amount of text from the internet – think books, articles, code, and tons more. You can picture it as a super-smart autocomplete engine. But it does way more than just finish your sentences. It can also:</p>
<ul>
<li><p>Answer tricky questions</p>
</li>
<li><p>Summarize long articles or documents</p>
</li>
<li><p>Write code, emails, or creative stories</p>
</li>
<li><p>Translate languages instantly</p>
</li>
<li><p>Even solve logic puzzles and have engaging conversations</p>
</li>
</ul>
<p>Chances are you've heard of ChatGPT, which is powered by OpenAI's GPT models. Other popular LLMs you might come across include Claude (from Anthropic), LLaMA (by Meta), Mistral, and Gemini (from Google).</p>
<p>These models work by predicting the next word in a sentence based on the context. While that sounds straightforward, when trained on billions of words, LLMs become capable of surprisingly intelligent behavior: understanding your instructions, following step-by-step reasoning, and producing coherent responses across almost any topic you can imagine.</p>
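<p>To get a feel for next-word prediction, here is a toy sketch that simply counts which word follows which in a tiny made-up corpus. Real LLMs use neural networks over tokens rather than a lookup table, so this only illustrates the idea, not how any actual model is built.</p>
<pre><code class="lang-python">from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(predict_next(model, "the"))
</code></pre>
<p>In this corpus, "the" is followed by "cat" twice and "mat" once, so the sketch predicts "cat".</p>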
<h3 id="heading-so-whats-an-llm-agent">So, What’s an LLM Agent?</h3>
<p>While LLMs are super powerful, they usually just <em>react</em> – they only respond when you ask them something. An LLM agent, on the other hand, is <em>proactive</em>.</p>
<p>LLM agents can:</p>
<ul>
<li><p>Break down big, complex tasks into smaller, manageable steps</p>
</li>
<li><p>Make smart decisions and figure out what to do next</p>
</li>
<li><p>Use "tools" like web search, calculators, or even other apps</p>
</li>
<li><p>Work towards a goal, even if it takes multiple steps or tries</p>
</li>
<li><p>Team up with other agents to accomplish shared objectives</p>
</li>
</ul>
<p>In short, LLM agents can think, plan, act, and adapt.</p>
<p>Think of an LLM agent like your super-efficient new assistant: you give it a goal, and it figures out how to achieve it all on its own.</p>
<h3 id="heading-why-does-this-matter">Why Does This Matter?</h3>
<p>This shift from just responding to actively pursuing goals opens a ton of exciting possibilities:</p>
<ul>
<li><p>Automating boring IT or DevOps tasks</p>
</li>
<li><p>Generating detailed reports from raw data</p>
</li>
<li><p>Helping you with multi-step research projects</p>
</li>
<li><p>Reading through your daily emails and highlighting key info</p>
</li>
<li><p>Running your internal tools to take real-world actions</p>
</li>
</ul>
<p>Unlike older, rule-based bots, LLM agents can reason, reflect, and learn from their attempts. This makes them a much better fit for real-world tasks that are messy, require flexibility, and depend on understanding context.</p>
<h2 id="heading-the-rise-of-open-source-agent-frameworks">The Rise of Open-Source Agent Frameworks</h2>
<p>Not too long ago, if you wanted to build an AI system that could act autonomously, it meant writing a ton of custom code, painstakingly managing memory, and trying to stitch together dozens of components. It was a complex, delicate, and highly specialized job.</p>
<p>But guess what? That's not the case anymore.</p>
<p>In 2024, a wave of fantastic open-source frameworks hit the scene. These tools have made it dramatically easier to build powerful LLM agents without you having to reinvent the wheel every time.</p>
<h3 id="heading-popular-open-source-agent-frameworks">Popular Open-Source Agent Frameworks</h3>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Framework</strong></td><td><strong>Description</strong></td><td><strong>Maintainer</strong></td></tr>
</thead>
<tbody>
<tr>
<td>LangGraph</td><td>Graph-based framework for agent state and memory</td><td>LangChain</td></tr>
<tr>
<td>CrewAI</td><td>Role-based, multi-agent collaboration engine</td><td>Community (CrewAI)</td></tr>
<tr>
<td>AutoGen</td><td>Customizable multi-agent chat orchestration</td><td>Microsoft</td></tr>
<tr>
<td>AgentVerse</td><td>Modular framework for agent simulation and testing</td><td>Open-source project</td></tr>
</tbody>
</table>
</div><h3 id="heading-what-these-tools-enable">What These Tools Enable</h3>
<p>These frameworks give you ready-made building blocks to handle the trickier parts of creating agents:</p>
<ul>
<li><p><strong>Planning</strong> – Letting agents decide their next move</p>
</li>
<li><p><strong>Tool Use</strong> – Easily connecting agents to things like file systems, web browsers, APIs, or databases</p>
</li>
<li><p><strong>Memory</strong> – Storing and retrieving past information or intermediate results for long-term context</p>
</li>
<li><p><strong>Multi-Agent Collaboration</strong> – Setting up teams of agents that work together on shared goals</p>
</li>
</ul>
<h3 id="heading-why-use-a-framework-instead-of-building-from-scratch">Why Use a Framework Instead of Building from Scratch?</h3>
<p>While you <em>could</em> build a custom agent from the ground up, using a framework will save you a huge amount of time and effort. Open-source agent libraries come packed with:</p>
<ul>
<li><p>Built-in support for orchestrating LLMs</p>
</li>
<li><p>Proven patterns for task planning, keeping track of where you are, and getting feedback</p>
</li>
<li><p>Easy integration with popular models like OpenAI, or even models you run locally</p>
</li>
<li><p>The flexibility to grow from a single helpful agent to entire teams of agents</p>
</li>
</ul>
<p>Basically, these frameworks let you focus on <strong>what your agent should do</strong>, rather than getting bogged down in how to build all the internal workings. Plus, choosing open source means you benefit from community contributions, transparency in how they work, and the freedom to tweak them to your exact needs, without getting locked into a single vendor.</p>
<h2 id="heading-core-concepts-behind-agent-design">Core Concepts Behind Agent Design</h2>
<p>To really grasp how LLM agents operate, it helps to think of them as goal-driven systems that constantly cycle through observing, reasoning, and acting. This continuous loop allows them to tackle tasks that go beyond simple questions and answers, moving into true automation, tool usage, and adapting on the fly.</p>
<h3 id="heading-the-agent-loop">The Agent Loop</h3>
<p>Most LLM agents function based on a mental model called the <strong>Agent Loop</strong>, a step-by-step cycle that repeats until the job is done. Here’s how it typically works:</p>
<ul>
<li><p><strong>Perceive:</strong> The agent starts by noticing something in its environment or receiving new information. This could be your prompt, a piece of data, or the current state of a system.</p>
</li>
<li><p><strong>Plan:</strong> Based on what it perceives and its overall goal, the agent decides what to do next. It might break the task into smaller sub-goals or figure out the best tool for the job.</p>
</li>
<li><p><strong>Act:</strong> The agent then acts. This could mean running a function, calling an API, searching the web, interacting with a database, or even asking another agent for help.</p>
</li>
<li><p><strong>Reflect:</strong> After acting, the agent looks at the outcome: Did it work? Was the result useful? Should it try a different approach? Based on this, it updates its plan and keeps going until the task is complete.</p>
</li>
</ul>
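<p>The four steps above can be sketched in plain Python. This toy version has no LLM in it: the "planning" is a simple rule and the tools are canned lambdas, all invented for illustration. It only shows the shape of the loop, not how LangGraph or CrewAI implement it.</p>
<pre><code class="lang-python">def run_agent(goal_items, tools, max_steps=10):
    """Toy agent loop: gather every item in goal_items using the given tools."""
    memory = {}  # results collected so far
    trace = []   # record of each step, for inspection
    for step in range(max_steps):
        # Perceive: work out which goal items are still missing.
        missing = [item for item in goal_items if item not in memory]
        if not missing:
            break  # goal reached
        # Plan: choose the next missing item and the tool that provides it.
        target = missing[0]
        tool = tools.get(target)
        # Act: run the tool, or note that no tool exists for this item.
        result = tool() if tool else None
        # Reflect: keep useful results; mark dead ends so the loop cannot spin.
        if result is None:
            memory[target] = "unavailable"
        else:
            memory[target] = result
        trace.append((step, target, memory[target]))
    return memory, trace

tools = {
    "weather": lambda: "sunny",
    "calendar": lambda: "standup at 10:00",
}
memory, trace = run_agent(["weather", "calendar", "traffic"], tools)
print(memory)
</code></pre>
<p>The <code>max_steps</code> cap and the "unavailable" marker are there because a loop that can repeat indefinitely needs an explicit way to stop.</p>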
<p>This loop is what makes agents so dynamic. It allows them to handle ever-changing tasks, learn from partial results, and correct their course, qualities that are vital for building truly useful AI assistants.</p>
<h3 id="heading-key-components-of-an-agent">Key Components of an Agent</h3>
<p>To do their job effectively, agents are built around several crucial parts:</p>
<ul>
<li><p><strong>Tools</strong> are how an agent interacts with the real (or digital) world. These can be anything from search engines, code execution environments, file readers, or API clients, to simple calculators or command-line scripts.</p>
</li>
<li><p><strong>Memory</strong> lets agents remember what they've done or seen across different steps. This might include previous things you've said, temporary results, or key decisions. Some frameworks offer short-term memory (just for one session), while others support long-term memory that can span multiple sessions or goals.</p>
</li>
<li><p><strong>Environment</strong> refers to the external data or system context the agent operates within: think APIs, documents, databases, files, or sensor inputs. The more information and access an agent has to its environment, the more meaningful actions it can take.</p>
</li>
<li><p><strong>Goal</strong> is the agent's ultimate objective: what it's trying to achieve. Goals should be specific and clear, for instance “generate a daily schedule,” “summarize this document,” or “extract tasks from emails.”</p>
</li>
</ul>
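<p>One way to picture how these four parts fit together is a small data structure. The <code>Agent</code> class below is a hypothetical sketch for illustration only. It is not the class that LangGraph, CrewAI, or any other framework actually exposes.</p>
<pre><code class="lang-python">from dataclasses import dataclass, field

@dataclass
class Agent:
    goal: str                                        # what the agent is trying to achieve
    tools: dict = field(default_factory=dict)        # tool name mapped to a callable
    memory: list = field(default_factory=list)       # notes carried across steps
    environment: dict = field(default_factory=dict)  # external data the agent can read

    def use_tool(self, name, *args):
        """Call a tool by name and remember what happened."""
        result = self.tools[name](*args)
        self.memory.append((name, args, result))
        return result

agent = Agent(
    goal="summarize this document",
    tools={"word_count": lambda text: len(text.split())},
    environment={"document": "Agents observe, plan, act, and reflect."},
)
count = agent.use_tool("word_count", agent.environment["document"])
print(count, agent.memory)
</code></pre>
<p>Frameworks add a lot on top of this, such as persistence, retries, and LLM calls, but the same four fields tend to show up in some form.</p>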
<h3 id="heading-multi-agent-collaboration">Multi-Agent Collaboration</h3>
<p>For more advanced systems, you can even have multiple agents working together to hit a shared target. Each agent can be given a specific <strong>role</strong> that highlights its specialty, just like people working on a team.</p>
<p>For example:</p>
<ul>
<li><p>A <strong>researcher agent</strong> might be tasked with gathering information.</p>
</li>
<li><p>A <strong>coder agent</strong> could write Python scripts or automation routines.</p>
</li>
<li><p>A <strong>reviewer agent</strong> might check the results and ensure everything is up to snuff.</p>
</li>
</ul>
<p>These agents can chat with each other, share information, and even debate or vote on decisions. This kind of teamwork allows AI systems to tackle bigger, more complex tasks while keeping things organized and modular.</p>
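<p>Stripped of the LLM calls, the simplest form of this teamwork is a handoff: each role receives the shared state, adds its contribution, and passes it on. The sketch below uses plain functions with canned behavior as stand-ins for the three roles. The names and the "add two numbers" task are invented for illustration.</p>
<pre><code class="lang-python">def researcher(handoff):
    """Gather information (here: a canned note instead of a real search)."""
    return {**handoff, "notes": "need a function that adds two numbers"}

def coder(handoff):
    """Turn the research notes into working code."""
    def add(a, b):
        return a + b
    return {**handoff, "function": add}

def reviewer(handoff):
    """Check the coder's work against a known case."""
    approved = handoff["function"](2, 3) == 5
    return {**handoff, "approved": approved}

def run_team(task):
    """Pass the shared state through each role in order."""
    handoff = {"task": task}
    for agent in (researcher, coder, reviewer):
        handoff = agent(handoff)
    return handoff

result = run_team("build a calculator helper")
print(result["approved"])
</code></pre>
<p>A fixed pipeline like this is the most basic arrangement. Debating or voting would need the agents to see and respond to each other's output, which is the part the frameworks handle for you.</p>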
<h2 id="heading-project-automate-your-daily-schedule-from-emails">Project: Automate Your Daily Schedule from Emails</h2>
<h3 id="heading-what-were-automating">What We’re Automating</h3>
<p>Think about your typical morning routine:</p>
<ul>
<li><p>You open your inbox.</p>
</li>
<li><p>You quickly scan through a bunch of emails.</p>
</li>
<li><p>You try to spot meetings, tasks, and important reminders.</p>
</li>
<li><p>Then, you manually write a to-do list or add things to your calendar.</p>
</li>
</ul>
<p>Let's use an LLM agent to make that process effortless. Our agent will:</p>
<ul>
<li><p>Read a list of your email messages</p>
</li>
<li><p>Pull out time-sensitive items like meetings or deadlines</p>
</li>
<li><p>Summarize everything into a nice, clean daily schedule</p>
</li>
</ul>
<h3 id="heading-step-1-install-the-required-tools">Step 1: Install the Required Tools</h3>
<p>To get started, you'll need three main tools: Python, VSCode, and an OpenAI API key.</p>
<h4 id="heading-1-install-python-39-or-higher">1. Install Python 3.9 or Higher</h4>
<p>Grab the latest version of Python 3.9+ from the official website: <a target="_blank" href="https://www.python.org/downloads/">https://www.python.org/downloads/</a></p>
<p>Once it's installed, double-check it by running <code>python --version</code> in your terminal.</p>
<p>This command simply asks your system to report the Python version currently installed. You'll want to see Python 3.9.x or something higher to ensure compatibility with our project.</p>
<h4 id="heading-2-install-vscode-optional-but-recommended">2. Install VSCode (Optional but Recommended)</h4>
<p>VSCode is a fantastic, user-friendly code editor that works perfectly with Python. You can download it right here: <a target="_blank" href="https://code.visualstudio.com/">https://code.visualstudio.com/</a>.</p>
<h4 id="heading-3-get-your-openai-api-key">3. Get Your OpenAI API Key</h4>
<p>Head over to <a target="_blank" href="https://platform.openai.com">https://platform.openai.com</a>.</p>
<p>Sign in or create a new account. Navigate to your API Keys page. Click “Create new secret key” and make sure to copy that key somewhere safe for later.</p>
<h4 id="heading-4-install-python-libraries">4. Install Python Libraries</h4>
<p>Open your terminal or command prompt and install these essential packages:</p>
<pre><code class="lang-bash">pip install langgraph langchain langchain-openai openai
</code></pre>
<p>This command uses pip, Python's package manager, to download and install the libraries our agent needs:</p>
<ul>
<li><p>langgraph: The core framework we'll use to build our agent's workflow.</p>
</li>
<li><p>langchain: A foundational library for working with large language models, upon which LangGraph is built.</p>
</li>
<li><p>langchain-openai: The LangChain integration package that provides the <code>ChatOpenAI</code> class imported in the code later in this section.</p>
</li>
<li><p>openai: The official Python library for connecting to OpenAI's powerful AI models.</p>
</li>
</ul>
<p>If you're excited to try out multi-agent setups (which we'll cover in the CrewAI section later on), also install CrewAI:</p>
<pre><code class="lang-bash">pip install crewai
</code></pre>
<p>This command installs CrewAI, a specialized framework that makes it easy to orchestrate multiple AI agents working together as a team.</p>
<h4 id="heading-5-set-your-openai-api-key">5. Set Your OpenAI API Key</h4>
<p>You need to make sure your Python code can find and use your OpenAI API key. This is typically done by setting it as an environment variable.</p>
<p>On macOS/Linux, run this in your terminal (replace "your-api-key" with your actual key):</p>
<pre><code class="lang-bash"><span class="hljs-built_in">export</span> OPENAI_API_KEY=<span class="hljs-string">"your-api-key"</span>
</code></pre>
<p>This command sets an environment variable named OPENAI_API_KEY for the current terminal session. Environment variables let applications (like your Python script) read sensitive information without hardcoding it directly into the code itself.</p>
<p>On Windows (using Command Prompt), do this, without quotes around the key:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">set</span> OPENAI_API_KEY=your-api-key
</code></pre>
<p>This is the Windows equivalent command to set the <code>OPENAI_API_KEY</code> environment variable. Command Prompt stores quotes as part of the value, so leave them out.</p>
<p>Now, your Python code will be all set to talk to the OpenAI model!</p>
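<p>Before running any agent code, it can help to confirm that Python actually sees the key. Here's a minimal sketch using only the standard library. The helper name <code>require_api_key</code> is hypothetical (it isn't part of the OpenAI SDK), and the quote check is just an assumption about one common setup mistake.</p>
<pre><code class="lang-python">import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from an environment mapping, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set. Set it in your shell first.")
    # Literal quotes around the value usually mean the shell stored them too.
    if key.startswith(('"', "'")):
        raise RuntimeError(f"{name} starts with a quote character. Re-set it without quotes.")
    return key


# Example with an explicit mapping, so no real key is needed here.
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
</code></pre>
<p>In a real script you'd call <code>require_api_key()</code> with no arguments so that it reads from <code>os.environ</code>.</p>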
<h3 id="heading-step-2-define-the-task">Step 2: Define the Task</h3>
<p>We discussed this briefly at the beginning of this section. But to reiterate, this is the manual morning routine we want our agent to take over:</p>
<ul>
<li><p>Scan for meetings, events, and important tasks.</p>
</li>
<li><p>Jot them down quickly in a notebook or an app.</p>
</li>
<li><p>Create a rough mental plan for your day.</p>
</li>
</ul>
<p>This routine takes time and mental energy. So having an agent do it for us will be super helpful.</p>
<h3 id="heading-step-3-build-the-workflow-with-langgraph">Step 3: Build the Workflow with LangGraph</h3>
<h4 id="heading-what-is-langgraph">What Is LangGraph?</h4>
<p>LangGraph is a cool framework that helps you build agents using a "graph-based" workflow, kind of like drawing a flowchart. It's powered by LangChain and gives you a lot more control over exactly how each step in your agent's process unfolds.</p>
<p>Each "node" in this graph represents a decision point or a function that:</p>
<ul>
<li><p>Takes some input (its current "state").</p>
</li>
<li><p>Does some reasoning or takes an action (often involving the LLM and its tools).</p>
</li>
<li><p>Returns an updated output (a new "state").</p>
</li>
</ul>
<p>You draw the connections between these nodes, and LangGraph then executes it like a smart, automated state machine.</p>
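<p>To make the "state in, state out" idea concrete, here's a toy sketch in plain Python. It is not how LangGraph is implemented. The <code>run_graph</code> helper, the <code>END</code> sentinel, and both node functions are made up for illustration, and the sketch assumes each node returns a dictionary that gets merged into the state.</p>
<pre><code class="lang-python">END = "__end__"  # Hypothetical sentinel used only in this sketch


def run_graph(nodes, edges, entry, state):
    """Run nodes one after another, merging each returned update into the state."""
    current = entry
    while current != END:
        update = nodes[current](state)  # The node reads the current state
        state = {**state, **update}     # Its output becomes part of the new state
        current = edges[current]        # Follow the edge to the next node
    return state


def extract(state):
    lines = [line.strip() for line in state["emails"].splitlines() if line.strip()]
    return {"items": lines}


def summarize(state):
    return {"result": f"{len(state['items'])} items found"}


final_state = run_graph(
    nodes={"extract": extract, "summarize": summarize},
    edges={"extract": "summarize", "summarize": END},
    entry="extract",
    state={"emails": "Standup Call at 10 AM\nLunch with Sarah at noon\n"},
)
print(final_state["result"])
</code></pre>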
<h4 id="heading-why-use-langgraph">Why Use LangGraph?</h4>
<ul>
<li><p>You get to control the precise order of execution.</p>
</li>
<li><p>It's fantastic for building workflows that have multiple steps or even branch off into different paths.</p>
</li>
<li><p>It plays nicely with both cloud-based models (like OpenAI) and models you run locally.</p>
</li>
</ul>
<p>Alright – now let’s write the code.</p>
<h5 id="heading-1-simulate-email-input"><strong>1. Simulate Email Input</strong></h5>
<p>In a real application, your agent would probably connect to Gmail or Outlook to fetch your actual emails. For this example, though, we’ll just hardcode some sample messages to keep things simple:</p>
<pre><code class="lang-python">emails = <span class="hljs-string">"""
1. Subject: Standup Call at 10 AM
2. Subject: Client Review due by 5 PM
3. Subject: Lunch with Sarah at noon
4. Subject: AWS Budget Warning – 80% usage
5. Subject: Dentist Appointment - 4 PM
"""</span>
</code></pre>
<p>This multiline Python string, <code>emails</code>, acts as our stand-in for real email content. We're providing a simple, structured list of email subjects to demonstrate how the agent will process text.</p>
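<p>If you later want one item per email (say, to process messages individually), the block can be split with a small helper. This is only a sketch: <code>parse_subjects</code> is a hypothetical name, and the pattern assumes the numbered "Subject:" layout used above.</p>
<pre><code class="lang-python">import re

emails = """
1. Subject: Standup Call at 10 AM
2. Subject: Client Review due by 5 PM
3. Subject: Lunch with Sarah at noon
4. Subject: AWS Budget Warning – 80% usage
5. Subject: Dentist Appointment - 4 PM
"""

# Matches lines such as "3. Subject: Lunch with Sarah at noon".
SUBJECT_LINE = re.compile(r"^\s*\d+\.\s*Subject:\s*(.+?)\s*$")


def parse_subjects(raw):
    """Split the numbered block into a list of plain subject strings."""
    subjects = []
    for line in raw.splitlines():
        match = SUBJECT_LINE.match(line)
        if match:
            subjects.append(match.group(1))
    return subjects


subjects = parse_subjects(emails)
print(len(subjects), subjects[0])
</code></pre>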
<h5 id="heading-2-define-the-agent-logic"><strong>2. Define the Agent Logic</strong></h5>
<p>Now, we'll tell OpenAI’s GPT model how to process this email text and turn it into a summary.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">from</span> langgraph.graph <span class="hljs-keyword">import</span> StateGraph, END
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> TypedDict, Annotated, List
<span class="hljs-keyword">import</span> operator

<span class="hljs-comment"># Define the state for our graph</span>
<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">AgentState</span>(<span class="hljs-params">TypedDict</span>):</span>
    emails: str
    result: str

llm = ChatOpenAI(temperature=<span class="hljs-number">0</span>, model=<span class="hljs-string">"gpt-4o"</span>) <span class="hljs-comment"># Using gpt-4o for better performance</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">calendar_summary_agent</span>(<span class="hljs-params">state: AgentState</span>) -&gt; AgentState:</span>
    emails = state[<span class="hljs-string">"emails"</span>]
    prompt = <span class="hljs-string">f"Summarize today's schedule based on these emails, listing time-sensitive items first and then other important notes. Be concise and use bullet points:\n<span class="hljs-subst">{emails}</span>"</span>
    summary = llm.invoke(prompt).content
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"result"</span>: summary, <span class="hljs-string">"emails"</span>: emails} <span class="hljs-comment"># Ensure emails is also returned</span>
</code></pre>
<p>Here’s what’s going on:</p>
<ul>
<li><p><strong>Imports</strong>: We bring in necessary components:</p>
<ul>
<li><p><code>ChatOpenAI</code> to connect to the LLM,</p>
</li>
<li><p><code>StateGraph</code> and <code>END</code> from <code>langgraph.graph</code> to build our agent workflow,</p>
</li>
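<li><p><code>TypedDict</code> from <code>typing</code> to describe the shape of the state,</p>
</li>
<li><p><code>Annotated</code>, <code>List</code>, and <code>operator</code>, which this snippet doesn't use. In LangGraph they are typically combined to declare a reducer, such as <code>Annotated[List[str], operator.add]</code>, so that a node's output is appended to a state field instead of overwriting it.</p>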
</li>
</ul>
</li>
<li><p><strong>AgentState</strong>: This <code>TypedDict</code> defines the shape of the data our agent will work with. It includes:</p>
<ul>
<li><p><code>emails</code>: the raw input messages.</p>
</li>
<li><p><code>result</code>: the final output (the daily summary).</p>
</li>
</ul>
</li>
<li><p><strong>llm = ChatOpenAI(...)</strong>: Initializes the language model. We're using GPT-4o with <code>temperature=0</code> to make the output as consistent and predictable as possible, which suits structured summarization tasks.</p>
</li>
<li><p><strong>calendar_summary_agent(state: AgentState)</strong>: This function is the "brain" of our agent. It:</p>
<ul>
<li><p>Takes in the current state, which holds the email text.</p>
</li>
<li><p>Extracts the emails from that state.</p>
</li>
<li><p>Constructs a prompt that tells the model to generate a concise daily schedule summary using bullet points, prioritizing time-sensitive items.</p>
</li>
<li><p>Sends this prompt to the model with <code>llm.invoke(prompt).content</code>, which returns the LLM’s response as plain text.</p>
</li>
<li><p>Returns a new <code>AgentState</code> dictionary containing:</p>
<ul>
<li><p><code>result</code>: the generated summary,</p>
</li>
<li><p><code>emails</code>: preserved in case we need it downstream.</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
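<p>One practical consequence of keeping the logic in a plain function is that you can try it without spending API credits. The sketch below swaps the model for a stand-in. <code>FakeLLM</code>, <code>FakeMessage</code>, and <code>make_calendar_node</code> are hypothetical names, and the only thing they assume about the real model object is the <code>.invoke(prompt).content</code> shape used above.</p>
<pre><code class="lang-python">class FakeMessage:
    """Mimics only the .content attribute that the node reads."""

    def __init__(self, content):
        self.content = content


class FakeLLM:
    """Hypothetical offline stand-in: records prompts and returns a canned reply."""

    def __init__(self, reply):
        self.reply = reply
        self.prompts = []

    def invoke(self, prompt):
        self.prompts.append(prompt)
        return FakeMessage(self.reply)


def make_calendar_node(llm):
    """Build a node function bound to whichever model object is passed in."""

    def calendar_summary_agent(state):
        emails = state["emails"]
        prompt = (
            "Summarize today's schedule based on these emails, listing "
            "time-sensitive items first. Use bullet points:\n" + emails
        )
        return {"result": llm.invoke(prompt).content, "emails": emails}

    return calendar_summary_agent


fake = FakeLLM("- 10:00 AM: Standup Call")
node = make_calendar_node(fake)
new_state = node({"emails": "1. Subject: Standup Call at 10 AM"})
print(new_state["result"])
</code></pre>
<p>Passing the model in as an argument is a design choice for this sketch. The tutorial's version uses a module-level <code>llm</code>, which works just as well for a single script.</p>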
<h5 id="heading-3-build-and-run-the-graph"><strong>3. Build and Run the Graph</strong></h5>
<p>Now, let's use LangGraph to map out the flow of our single-agent task and then run it.</p>
<pre><code class="lang-python">builder = StateGraph(AgentState)
builder.add_node(<span class="hljs-string">"calendar"</span>, calendar_summary_agent)
builder.set_entry_point(<span class="hljs-string">"calendar"</span>)
builder.set_finish_point(<span class="hljs-string">"calendar"</span>) <span class="hljs-comment"># Marks "calendar" as the last node; the graph ends after it runs</span>

graph = builder.compile()

<span class="hljs-comment"># Run the graph using your simulated email data</span>
result = graph.invoke({<span class="hljs-string">"emails"</span>: emails})
print(result[<span class="hljs-string">"result"</span>])
</code></pre>
<p>Here’s what’s going on:</p>
<ul>
<li><p><strong>builder = StateGraph(AgentState):</strong> We're creating a StateGraph object. By passing AgentState, we're telling LangGraph the expected data structure for its internal state.</p>
</li>
<li><p><strong>builder.add_node("calendar", calendar_summary_agent):</strong> This line adds a named "node" to our graph. We're calling it "calendar", and we're linking it to our <code>calendar_summary_agent</code> function, meaning that function will be executed when this node is active.</p>
</li>
<li><p><strong>builder.set_entry_point("calendar"):</strong> This sets "calendar" as the very first step in our workflow. When we start the graph, execution will begin here.</p>
</li>
<li><p><strong>builder.set_finish_point("calendar"):</strong> This tells LangGraph that once the "calendar" node finishes its job, the entire graph process is complete.</p>
</li>
<li><p><strong>graph = builder.compile():</strong> This command takes our defined graph blueprint and "compiles" it into an executable workflow.</p>
</li>
<li><p><strong>result = graph.invoke({"emails": emails}):</strong> This is where the magic happens! We're telling our graph to start running. We pass it an initial state that contains our emails data. The graph will then process this data through its nodes until it reaches an end point, returning the final state.</p>
</li>
<li><p><strong>print(result["result"]):</strong> Finally, we grab the summarized schedule from the result (the final state of our graph) and print it to the console.</p>
</li>
</ul>
<h4 id="heading-example-output">Example Output</h4>
<p><code>Your Schedule:</code><br><code>- 10:00 AM – Standup Call</code><br><code>- 12:00 PM – Lunch with Sarah</code><br><code>- 4:00 PM – Dentist Appointment</code><br><code>- Submit client report by 5:00 PM</code><br><code>- AWS Budget Warning – check usage</code></p>
<p>Boom! You've just built an AI agent that can read your emails and whip up your daily schedule. Pretty cool, right? This is a simple yet powerful peek into what LLM agents can do with just a few lines of code.</p>
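<p>Because a model's wording and ordering can vary between runs, you may want a deterministic cross-check for the time-based part. Here's a rough sketch that pulls clock times out of the sample subjects and sorts by them. The helper names are hypothetical, and the pattern only understands formats like "10 AM", "4:30 PM", and "noon".</p>
<pre><code class="lang-python">import re

# Matches times such as "10 AM", "4 PM", or "4:30 pm".
TIME_PATTERN = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*(AM|PM)\b", re.IGNORECASE)


def minutes_since_midnight(text):
    """Return the first clock time in the text as minutes, or None if absent."""
    if re.search(r"\bnoon\b", text, re.IGNORECASE):
        return 12 * 60
    match = TIME_PATTERN.search(text)
    if match is None:
        return None
    hour = int(match.group(1)) % 12  # Maps 12 AM to 0 and keeps 1-11 as-is
    minute = int(match.group(2) or 0)
    if match.group(3).upper() == "PM":
        hour += 12
    return hour * 60 + minute


def sort_by_time(subjects):
    """Timed items first in chronological order, untimed items afterwards."""
    timed = [s for s in subjects if minutes_since_midnight(s) is not None]
    untimed = [s for s in subjects if minutes_since_midnight(s) is None]
    return sorted(timed, key=minutes_since_midnight) + untimed


subjects = [
    "Standup Call at 10 AM",
    "Client Review due by 5 PM",
    "Lunch with Sarah at noon",
    "AWS Budget Warning – 80% usage",
    "Dentist Appointment - 4 PM",
]
for subject in sort_by_time(subjects):
    print(subject)
</code></pre>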
<h2 id="heading-multi-agent-collaboration-with-crewai">Multi-Agent Collaboration with CrewAI</h2>
<h3 id="heading-what-is-crewai">What Is CrewAI?</h3>
<p>CrewAI is an exciting open-source framework that lets you build <em>teams</em> of agents that work together, just like a real-world project team! Each agent in a CrewAI setup:</p>
<ul>
<li><p>Has a specific, specialized role.</p>
</li>
<li><p>Can communicate and share information with its teammates.</p>
</li>
<li><p>Collaborates to achieve a shared goal.</p>
</li>
</ul>
<p>This multi-agent approach is super useful when your task is too big or too complex for just one agent, or when breaking it down into specialized parts makes it clearer and more efficient.</p>
<h3 id="heading-sample-roles-for-the-email-summary-task">Sample Roles for the Email Summary Task</h3>
<p>Let's imagine our email summary task being handled by a small team of agents:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Agent Name</strong></td><td><strong>Role</strong></td><td><strong>Responsibility</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Extractor</td><td>Email Scanner</td><td>Find meetings, reminders, and tasks from emails</td></tr>
<tr>
<td>Prioritizer</td><td>Schedule Optimizer</td><td>Sort items by urgency and time</td></tr>
<tr>
<td>Formatter</td><td>Output Generator</td><td>Write a clean, polished daily agenda</td></tr>
</tbody>
</table>
</div><h3 id="heading-sample-crewai-code">Sample CrewAI Code</h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> crewai <span class="hljs-keyword">import</span> Agent, Crew, Task, Process
<span class="hljs-keyword">from</span> langchain_openai <span class="hljs-keyword">import</span> ChatOpenAI
<span class="hljs-keyword">import</span> os

<span class="hljs-comment"># Set your OpenAI API key from environment variables</span>
<span class="hljs-comment"># os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY" # Make sure this is set, or defined directly</span>

<span class="hljs-comment"># Initialize the LLM (using gpt-4o for better performance)</span>
llm = ChatOpenAI(temperature=<span class="hljs-number">0</span>, model=<span class="hljs-string">"gpt-4o"</span>)

<span class="hljs-comment"># Define the agents with specific roles and goals</span>
extractor = Agent(
    role=<span class="hljs-string">"Email Scanner"</span>,
    goal=<span class="hljs-string">"Find all meetings, reminders, and tasks from the given emails, accurately extracting details like time, date, and subject."</span>,
    backstory=<span class="hljs-string">"You are an expert at scanning emails for key information. You meticulously extract every relevant detail."</span>,
    verbose=<span class="hljs-literal">True</span>,
    allow_delegation=<span class="hljs-literal">False</span>,
    llm=llm
)

prioritizer = Agent(
    role=<span class="hljs-string">"Schedule Optimizer"</span>,
    goal=<span class="hljs-string">"Sort extracted items by urgency and time, preparing them for a daily agenda."</span>,
    backstory=<span class="hljs-string">"You are a master of time management, always knowing what needs to be done first. You organize tasks logically."</span>,
    verbose=<span class="hljs-literal">True</span>,
    allow_delegation=<span class="hljs-literal">False</span>,
    llm=llm
)

formatter = Agent(
    role=<span class="hljs-string">"Output Generator"</span>,
    goal=<span class="hljs-string">"Generate a clean, polished, and concise daily agenda in bullet-point format, clearly listing all schedule items."</span>,
    backstory=<span class="hljs-string">"You are a professional secretary, ensuring all outputs are perfectly formatted and easy to read. You prioritize clarity."</span>,
    verbose=<span class="hljs-literal">True</span>,
    allow_delegation=<span class="hljs-literal">False</span>,
    llm=llm
)

<span class="hljs-comment"># Simulate email input</span>
emails = <span class="hljs-string">"""
1. Subject: Standup Call at 10 AM
2. Subject: Client Review due by 5 PM
3. Subject: Lunch with Sarah at noon
4. Subject: AWS Budget Warning – 80% usage
5. Subject: Dentist Appointment - 4 PM
"""</span>

<span class="hljs-comment"># Define the tasks for each agent</span>
extract_task = Task(
    description=<span class="hljs-string">f"Extract all relevant events, meetings, and tasks from these emails: <span class="hljs-subst">{emails}</span>. Focus on precise details."</span>,
    agent=extractor,
    expected_output=<span class="hljs-string">"A list of extracted items with their details (e.g., '- Standup Call at 10 AM', '- Client Review due by 5 PM')."</span>
)

prioritize_task = Task(
    description=<span class="hljs-string">"Prioritize the extracted items by time and urgency. Meetings first, then deadlines, then other notes."</span>,
    agent=prioritizer,
    context=[extract_task], <span class="hljs-comment"># The output of extract_task is the input here</span>
    expected_output=<span class="hljs-string">"A prioritized list of schedule items."</span>
)

format_task = Task(
    description=<span class="hljs-string">"Format the prioritized schedule into a clean, easy-to-read daily agenda using bullet points. Ensure concise language."</span>,
    agent=formatter,
    context=[prioritize_task], <span class="hljs-comment"># The output of prioritize_task is the input here</span>
    expected_output=<span class="hljs-string">"A well-formatted daily agenda with bullet points."</span>
)

<span class="hljs-comment"># Instantiate the crew</span>
crew = Crew(
    agents=[extractor, prioritizer, formatter],
    tasks=[extract_task, prioritize_task, format_task],
    process=Process.sequential, <span class="hljs-comment"># Tasks are executed sequentially</span>
    verbose=<span class="hljs-literal">True</span> <span class="hljs-comment"># Detailed logs; recent CrewAI releases expect a boolean here rather than 2</span>
)

<span class="hljs-comment"># Run the crew</span>
result = crew.kickoff()
print(<span class="hljs-string">"\n########################"</span>)
print(<span class="hljs-string">"## Final Daily Agenda ##"</span>)
print(<span class="hljs-string">"########################\n"</span>)
print(result)
</code></pre>
<p>Here’s what’s going on:</p>
<ul>
<li><p><strong>Imports:</strong> We bring in key classes from CrewAI: Agent, Crew, Task, and Process. We also import <code>ChatOpenAI</code> for our language model and os to handle environment variables.</p>
</li>
<li><p><strong>llm = ChatOpenAI(...):</strong> Just like in the LangGraph example, this sets up our OpenAI language model, keeping its responses as consistent as possible (temperature=0) and using the gpt-4o model.</p>
</li>
<li><p><strong>Agent Definitions (extractor, prioritizer, formatter):</strong></p>
<ul>
<li><p>Each of these variables creates an Agent instance. An agent is defined by its role (what it does), a specific goal it's trying to achieve, and a backstory (a sort of personality or expertise that helps the LLM understand its purpose better).</p>
</li>
<li><p>verbose=True is super helpful for debugging, as it makes the agents print out their "thoughts" as they work.</p>
</li>
<li><p>allow_delegation=False means these agents won't pass their assigned tasks to other agents (though this can be set to True for more complex delegation scenarios).</p>
</li>
<li><p>llm=llm connects each agent to our OpenAI language model.</p>
</li>
</ul>
</li>
<li><p><strong>Simulated emails:</strong> We reuse the same sample email data for this example.</p>
</li>
<li><p><strong>Task Definitions (extract_task, prioritize_task, format_task):</strong></p>
<ul>
<li><p>Each Task defines a specific piece of work that an agent needs to perform.</p>
</li>
<li><p>description clearly tells the agent what the task involves.</p>
</li>
<li><p>agent assigns this task to one of our defined agents (e.g., extractor for extract_task).</p>
</li>
<li><p>context=[...] is a critical part of CrewAI's collaboration. It tells a task to use the <em>output</em> of a previous task as its <em>input</em>. For instance, prioritize_task takes the extract_task's output as its context.</p>
</li>
<li><p>expected_output gives the agent an idea of what its result should look like, helping guide the LLM.</p>
</li>
</ul>
</li>
<li><p><strong>crew = Crew(...):</strong></p>
<ul>
<li><p>This is where we assemble our team! We create a Crew instance, giving it our list of agents and tasks.</p>
</li>
<li><p>process=Process.sequential tells the crew to execute tasks one after another in the order they're defined in the tasks list. CrewAI also supports more advanced processes like hierarchical ones.</p>
</li>
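<li><p>verbose controls how much of the crew's internal workings and communication gets logged. Recent CrewAI releases expect a boolean (verbose=True), while older releases also accepted levels such as verbose=2.</p>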
</li>
</ul>
</li>
<li><p><strong>result = crew.kickoff():</strong> This command officially starts the entire multi-agent workflow. The agents will begin collaborating, passing information, and working through their assigned tasks in sequence.</p>
</li>
<li><p><strong>print(result):</strong> Finally, the consolidated output from the entire crew's collaborative effort is printed to your console.</p>
</li>
</ul>
<p>CrewAI handles the communication between agents, runs each task at the right point in the sequence, and passes the output smoothly from one agent to the next. It's like having a mini AI assembly line!</p>
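<p>If it helps to picture the handoff, here's a plain-Python sketch of sequential tasks passing context along. It doesn't use CrewAI and isn't how CrewAI works internally. The <code>run_sequential</code> helper and the three stand-in functions are hypothetical, with ordinary functions playing the roles of the agents.</p>
<pre><code class="lang-python">def run_sequential(tasks):
    """Run tasks in list order, feeding each one the outputs named in its context."""
    outputs = {}
    for task in tasks:
        context = [outputs[name] for name in task["context"]]
        outputs[task["name"]] = task["run"](context)
    return outputs[tasks[-1]["name"]]


def extract(_context):
    # Stand-in for the Email Scanner: (minutes since midnight, label) pairs.
    return [(960, "Dentist Appointment"), (600, "Standup Call"), (720, "Lunch with Sarah")]


def prioritize(context):
    # Stand-in for the Schedule Optimizer: earliest item first.
    return sorted(context[0])


def format_agenda(context):
    # Stand-in for the Output Generator: one bullet per item.
    lines = []
    for minutes, label in context[0]:
        hour, minute = divmod(minutes, 60)
        lines.append(f"- {hour:02d}:{minute:02d} {label}")
    return "\n".join(lines)


agenda = run_sequential([
    {"name": "extract", "run": extract, "context": []},
    {"name": "prioritize", "run": prioritize, "context": ["extract"]},
    {"name": "format", "run": format_agenda, "context": ["prioritize"]},
])
print(agenda)
</code></pre>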
<h2 id="heading-what-actually-happens-during-execution">What Actually Happens During Execution?</h2>
<p>So, whether you're using LangGraph or CrewAI, what's really going on behind the scenes when an agent runs? Let's break down the execution process:</p>
<ul>
<li><p>The system gets an <strong>input state</strong> (for example, your emails).</p>
</li>
<li><p>The first agent or graph node reads this input and uses a <strong>Large Language Model (LLM)</strong> to make sense of it.</p>
</li>
<li><p>Based on its understanding, the agent decides on an <strong>action</strong>, such as pulling out key events or calling a specific tool.</p>
</li>
<li><p>If needed, the agent might <strong>invoke tools</strong> (like a web search or a file reader) to get more context or perform external operations.</p>
</li>
<li><p>The result of that action is then <strong>passed to the next agent</strong> in the team (if it's a multi-agent setup) or returned directly to you.</p>
</li>
</ul>
<p>Execution keeps going until:</p>
<ul>
<li><p>The task is fully completed.</p>
</li>
<li><p>All agents have finished their assigned roles.</p>
</li>
<li><p>A stopping condition or a designated "END" point in the workflow is reached.</p>
</li>
</ul>
<p>Think of this as a super-smart workflow engine where every single step involves reasoning, making decisions, and remembering previous interactions.</p>
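<p>Stripped of any framework, that loop can be sketched in a few lines. Everything here is a simplified assumption: <code>decide</code> stands in for the LLM's reasoning step, the action dictionaries are an invented format, and <code>max_steps</code> plays the role of the stopping condition.</p>
<pre><code class="lang-python">def run_agent(decide, tools, task, max_steps=5):
    """Loop: decide, act, observe. Stop on a final answer or at the step limit."""
    observations = []
    for _ in range(max_steps):
        action = decide(task, observations)
        if action["type"] == "finish":
            return action["answer"]
        tool = tools[action["tool"]]
        observations.append(tool(action["input"]))
    raise RuntimeError("Step limit reached without a final answer.")


def decide(task, observations):
    # Stand-in for the LLM: call a tool once, then finish with what it returned.
    if not observations:
        return {"type": "tool", "tool": "count_lines", "input": task}
    return {"type": "finish", "answer": f"Found {observations[0]} emails."}


def count_lines(text):
    return len([line for line in text.splitlines() if line.strip()])


answer = run_agent(decide, {"count_lines": count_lines}, "Standup at 10 AM\nLunch at noon")
print(answer)
</code></pre>
<p>The step limit is what keeps a confused agent from looping forever: if no final answer arrives in time, the loop raises an error instead of continuing.</p>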
<h2 id="heading-are-llm-agents-safe-what-to-know-about-security-and-privacy">Are LLM Agents Safe? What to Know About Security and Privacy</h2>
<p>As cool as LLM agents are, they raise an important question: <em>can you really trust an AI to run parts of your workflow or interact with your data?</em> It depends. If you’re using API services from providers like OpenAI or Anthropic, your data is encrypted in transit and, by default (as of now), isn’t used for training.</p>
<p>But some data might still be temporarily logged to prevent abuse. That’s usually fine for testing and personal projects, but if you’re working with sensitive business info, customer data, or anything private, you’ll want to be careful.</p>
<p>Use anonymized inputs, avoid exposing full datasets, and consider running agents locally using open-source models like LLaMA or Mistral if full control matters to you.</p>
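<p>As a rough idea of what anonymizing inputs can look like, the sketch below masks email addresses and long digit runs before text is sent to a model. The patterns are deliberately simple assumptions and will miss plenty of personal data (names, addresses, and so on), so treat this as a starting point, not a complete safeguard.</p>
<pre><code class="lang-python">import re

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Seven or more digits, optionally separated by spaces, dots, or dashes.
LONG_NUMBER_PATTERN = re.compile(r"\d(?:[\s.-]?\d){6,}")


def anonymize(text):
    """Replace email addresses and long digit runs with neutral placeholders."""
    text = EMAIL_PATTERN.sub("[EMAIL]", text)
    return LONG_NUMBER_PATTERN.sub("[NUMBER]", text)


print(anonymize("Call Sarah at 555-123-4567 or write to sarah@example.com by 5 PM"))
</code></pre>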
<p>You can also set clear boundaries for your agents so they don’t overstep. Think of it like onboarding a new intern: you wouldn’t give them access to everything on day one.</p>
<p>Give agents only the tools and files they need, keep logs of what they do, and always review the results before letting them make real changes.</p>
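<p>Here's one way that advice could look in code: a small wrapper that exposes only allow-listed tools and records every call. The <code>ToolBox</code> class and the two toy tools are hypothetical and not tied to LangGraph or CrewAI.</p>
<pre><code class="lang-python">class ToolBox:
    """Exposes only allow-listed tools and records every call for later review."""

    def __init__(self, tools, allowed):
        self._tools = {name: tools[name] for name in allowed}
        self.audit_log = []

    def call(self, name, argument):
        if name not in self._tools:
            self.audit_log.append(("denied", name, argument))
            raise PermissionError(f"Tool '{name}' is not allowed for this agent.")
        self.audit_log.append(("allowed", name, argument))
        return self._tools[name](argument)


all_tools = {
    "word_count": lambda text: len(text.split()),
    "delete_file": lambda path: f"deleted {path}",  # Risky, so not granted below
}
toolbox = ToolBox(all_tools, allowed=["word_count"])
print(toolbox.call("word_count", "Standup Call at 10 AM"))
</code></pre>
<p>Denied calls are logged too, which makes it easier to spot an agent that keeps reaching for tools it shouldn't have.</p>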
<p>As this tech grows, more safety features are coming, like better sandboxing, memory limits, and role-based access. But for now, it’s smart to treat your agents like powerful helpers that still need some human supervision.</p>
<h2 id="heading-troubleshooting-amp-tips">Troubleshooting &amp; Tips</h2>
<p>Sometimes, agents can be a bit quirky! Here are some common issues you might run into and how to fix them:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Issue</strong></td><td><strong>Suggested Fix</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Agent seems to loop forever</td><td>Set a maximum number of iterations or define a clearer stopping point.</td></tr>
<tr>
<td>Output is too chatty or verbose</td><td>Use more specific prompts (for example, “Respond in bullet points only”).</td></tr>
<tr>
<td>Input is too long or gets cut off</td><td>Break down large pieces of content into smaller chunks and summarize them individually.</td></tr>
<tr>
<td>Agent runs too slowly</td><td>Try a faster model such as gpt-3.5-turbo, or consider running a local model.</td></tr>
</tbody>
</table>
</div><p>A handy tip: You can also add print() statements or logging messages inside your agent functions to see what's happening at each stage and debug state transitions.</p>
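<p>If you'd rather not scatter print() calls through every function, a small wrapper can do the logging for you. This is a sketch using Python's standard <code>logging</code> module. The <code>logged</code> helper and the toy <code>shout</code> node are hypothetical, and the wrapper assumes a node takes a state dictionary and returns one.</p>
<pre><code class="lang-python">import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("agent")


def logged(node_name, node):
    """Wrap a node function so each call logs the state keys going in and out."""

    def wrapper(state):
        logger.info("%s input keys: %s", node_name, sorted(state))
        update = node(state)
        logger.info("%s output keys: %s", node_name, sorted(update))
        return update

    return wrapper


def shout(state):
    # Toy node used only to demonstrate the wrapper.
    return {"result": state["emails"].upper()}


traced = logged("shout", shout)
print(traced({"emails": "lunch at noon"})["result"])
</code></pre>
<p>Logging only the keys (not the values) keeps email contents out of your log files. Log the values too if you need them while debugging locally.</p>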
<h2 id="heading-explore-more-daily-automations">Explore More Daily Automations</h2>
<p>Once you've built one agent-based task, you'll find it incredibly easy to adapt the pattern for other automations. Here are some cool ideas to get your creative juices flowing:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Task Type</strong></td><td><strong>Example Automation</strong></td></tr>
</thead>
<tbody>
<tr>
<td>DevOps Assistant</td><td>Read system logs, detect potential issues, and suggest solutions.</td></tr>
<tr>
<td>Finance Tracker</td><td>Read bank statements or CSV files and summarize your spending habits/budgets.</td></tr>
<tr>
<td>Meeting Organizer</td><td>After a meeting, automatically extract action items and assign owners.</td></tr>
<tr>
<td>Inbox Cleaner</td><td>Automatically label, archive, and delete non-urgent emails.</td></tr>
<tr>
<td>Note Summarizer</td><td>Convert your daily notes into a neatly formatted to-do list or summary.</td></tr>
<tr>
<td>Link Checker</td><td>Extract URLs from documents and automatically test if they're still valid.</td></tr>
<tr>
<td>Resume Formatter</td><td>Score resumes against job descriptions and format them automatically.</td></tr>
</tbody>
</table>
</div><p>Each of these can be built using the very same principles and frameworks we discussed, whether that's LangGraph or CrewAI.</p>
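<p>As a taste of how small the first step can be, here's a sketch of the extraction half of the Link Checker idea. It only finds URLs in text. Actually testing them would need network requests, which are left out here. The <code>extract_urls</code> helper is hypothetical, and the pattern is a simple assumption that won't handle every URL format.</p>
<pre><code class="lang-python">import re

URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(text):
    """Return unique URLs in first-seen order, without trailing punctuation."""
    seen = []
    for raw in URL_PATTERN.findall(text):
        url = raw.rstrip(".,;:!?)")
        if url not in seen:
            seen.append(url)
    return seen


notes = (
    "Docs: https://www.python.org/downloads/. "
    "Editor (https://code.visualstudio.com/), "
    "again https://www.python.org/downloads/"
)
print(extract_urls(notes))
</code></pre>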
<h2 id="heading-whats-next-in-agent-technology">What’s Next in Agent Technology?</h2>
<p>LLM agents are evolving at lightning speed, and the next wave of innovation is already here:</p>
<ul>
<li><p><strong>Smarter memory systems</strong>: Expect agents to have better long-term memory, allowing them to learn over extended periods and remember past conversations and actions.</p>
</li>
<li><p><strong>Multi-modal agents</strong>: Agents won't just handle text anymore! They'll be able to process and understand images, audio, and video, making them much more versatile.</p>
</li>
<li><p><strong>Advanced planning frameworks</strong>: Techniques and frameworks like ReAct, Toolformer, and AutoGen are constantly improving agents' ability to reason, plan, and reduce those pesky "hallucinations."</p>
</li>
<li><p><strong>Edge deployment</strong>: Imagine agents running entirely offline on your local computer or device using lightweight models like LLaMA 3 or Mistral.</p>
</li>
</ul>
<p>In the very near future, you'll see agents seamlessly integrated into:</p>
<ul>
<li><p>Your DevOps pipelines</p>
</li>
<li><p>Big enterprise workflows</p>
</li>
<li><p>Everyday productivity tools</p>
</li>
<li><p>Mobile apps and smart devices</p>
</li>
<li><p>Games, simulations, and educational platforms</p>
</li>
</ul>
<h2 id="heading-final-summary">Final Summary</h2>
<p>Alright, let's quickly recap all the cool stuff you've just learned and accomplished:</p>
<ul>
<li><p>You've gotten a solid grasp of what LLM agents are and why they're so powerful.</p>
</li>
<li><p>You've seen how open-source frameworks like LangGraph and CrewAI make building agents much easier.</p>
</li>
<li><p>You've built a real LLM agent using LangGraph to automate a common daily task: summarizing your inbox!</p>
</li>
<li><p>You've explored the world of multi-agent collaboration with CrewAI, understanding how teams of AIs can work together.</p>
</li>
<li><p>You've learned how to take these principles and scale them to automate countless other tasks.</p>
</li>
</ul>
<p>So, next time you find yourself stuck doing something repetitive, just ask yourself: "Hey, can I build an agent for that?" The answer is probably yes!</p>
<h3 id="heading-resources-recap">Resources Recap</h3>
<p>Here are some helpful resources if you want to dive deeper into building LLM agents:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Resource</strong></td><td><strong>Link</strong></td></tr>
</thead>
<tbody>
<tr>
<td>LangGraph Docs</td><td><a target="_blank" href="https://docs.langgraph.dev/">https://docs.langgraph.dev/</a></td></tr>
<tr>
<td>CrewAI GitHub</td><td><a target="_blank" href="https://github.com/joaomdmoura/crewAI">https://github.com/joaomdmoura/crewAI</a></td></tr>
<tr>
<td>LangChain Docs</td><td><a target="_blank" href="https://docs.langchain.com/docs/">https://docs.langchain.com/docs/</a></td></tr>
<tr>
<td>OpenAI API Docs</td><td><a target="_blank" href="https://platform.openai.com/docs">https://platform.openai.com/docs</a></td></tr>
<tr>
<td>Python 3.9+</td><td><a target="_blank" href="https://www.python.org/downloads/">https://www.python.org/downloads/</a></td></tr>
<tr>
<td>VSCode</td><td><a target="_blank" href="https://code.visualstudio.com/">https://code.visualstudio.com/</a></td></tr>
</tbody>
</table>
</div> ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
