ML - freeCodeCamp.org

Model Packaging Tools Every MLOps Engineer Should Know

Temitope Oyedele — Mon, 06 Apr 2026 15:00:08 +0000

Most machine learning deployments don’t fail because the model is bad. They fail because of packaging.

Teams often spend months fine-tuning models (adjusting hyperparameters and improving architectures) only to hit a wall when it’s time to deploy. Suddenly, the production system can’t even read the model file. Everything breaks at the handoff between research and production.

The good news? If you think about packaging from the start, you can save up to 60% of the time usually spent during deployment. That’s because you avoid the common friction between the experimental environment and the production system.

In this guide, we’ll walk through eleven essential tools every MLOps engineer should know. To keep things clear, we’ll group them into three stages of a model’s lifecycle:

Serialization: how models are stored and transferred
Bundling & Serving: how models are deployed and run
Registry: how models are tracked and versioned

Model Serialization Formats
Model Bundling and Serving Tools
Model Registries
Conclusion

Model Serialization Formats

Serialization is simply the process of turning a trained model into a file that can be stored and moved around. It’s the first step in the pipeline, and it matters more than people think. The format you choose determines how your model will be loaded later in production.

So, you want something that either works across different frameworks or is optimized for the environment where your model will eventually run.

Below are some of the most common tools in this space:

1. ONNX (Open Neural Network Exchange)

ONNX acts as a common language for model serialization. It lets you train a model in one framework, like PyTorch, and deploy it in a different runtime with far fewer compatibility issues than shipping framework-specific files. It also runs well across different types of hardware.

ONNX separates your training framework from your inference runtime and allows hardware-level optimizations like quantization and graph fusion. It’s also widely supported across cloud platforms and edge devices.

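Key considerations: ONNX decouples training from deployment while still enabling hardware-specific optimizations. The main caveat is operator coverage: a model that relies on custom or very new operations may not export cleanly, so validate the exported model's outputs against the original before shipping it.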

When to use it: Use ONNX when you need portability – especially if different teams or environments are involved.

2. TorchScript

TorchScript lets you compile PyTorch models into a format that can run without Python. That means you can deploy it in environments like C++ or mobile without carrying the full Python runtime.

It supports two approaches: tracing (recording execution with sample inputs) and scripting (capturing full control flow).

Key considerations: Its biggest advantage is removing the Python dependency, which helps reduce latency and makes it suitable for more constrained environments.

When to use it: Best for high-performance systems where Python would be too heavy or introduce security concerns.
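The practical difference between tracing and scripting shows up when a model contains a data-dependent branch. The following is a toy, pure-Python sketch of the tracing idea only. It does not use the TorchScript API, and the `Traced`, `trace`, and `replay` names are hypothetical helpers invented for illustration.

```python
class Traced:
    """Proxy value that records every arithmetic op applied to it."""

    def __init__(self, value, tape):
        self.value = value
        self.tape = tape

    def __mul__(self, k):
        self.tape.append(("mul", k))
        return Traced(self.value * k, self.tape)

    def __add__(self, k):
        self.tape.append(("add", k))
        return Traced(self.value + k, self.tape)


def model(x):
    # Data-dependent control flow: the branch taken depends on the input.
    if x.value > 0:
        return x * 2
    return x + 100


def trace(fn, sample):
    """Run fn once on a sample input and keep only the ops that executed."""
    tape = []
    fn(Traced(sample, tape))
    return tape


def replay(tape, value):
    """Apply a recorded op list to a new input, with no Python branching."""
    for op, k in tape:
        value = value * k if op == "mul" else value + k
    return value


tape = trace(model, 3)                    # the sample input is positive
traced_out = replay(tape, -5)             # replays the positive branch anyway
eager_out = model(Traced(-5, [])).value   # the real model takes the other branch
print(tape, traced_out, eager_out)
```

The recorded tape only contains the branch the sample input happened to take, so the replayed result disagrees with the original model for a negative input. That is the kind of mismatch scripting is meant to avoid, because it captures the control flow itself.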

3. TensorFlow SavedModel

SavedModel is TensorFlow’s native format. It stores everything – the computation graph, weights, and serving logic – in a single directory.

It’s also the standard input format for TensorFlow Serving, TFLite, and Google Cloud AI Platform.

Key considerations: It keeps everything within the TensorFlow ecosystem intact, so you don’t lose any part of the model when moving to production.

When to use it: If your project is built on TensorFlow, this is the default and safest choice.

4. Pickle and Joblib

Pickle is Python’s built-in way of saving objects, and Joblib builds on top of it to better handle large arrays and models.

These are commonly used for scikit-learn pipelines, XGBoost models, and other traditional ML setups.

Key considerations: They’re simple and convenient, but come with real trade-offs. Pickle can execute arbitrary code when loading, which makes it unsafe in untrusted environments. It’s also tightly coupled to Python versions and library dependencies, so models can break when moved across environments.

When to use it: Best suited for controlled environments where everything runs in the same Python stack, such as internal tools, quick prototypes, or batch jobs.

It’s especially practical when you’re working with classical ML models and don’t need cross-language support or long-term portability. Avoid it for production systems that require security, reproducibility, or deployment across different environments.
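To make the code-execution risk concrete, here is a minimal sketch using only the standard library. The `Payload` class is a hypothetical stand-in for a tampered model file, and the payload is deliberately harmless.

```python
import pickle


class Payload:
    """Stand-in for a tampered "model" object (hypothetical name)."""

    def __reduce__(self):
        # pickle calls the returned callable with these args at load time.
        # A harmless eval is used here; an attacker could name any callable.
        return (eval, ("'code ran during unpickling'",))


blob = pickle.dumps(Payload())

# Loading does not give back a Payload: it runs eval(...) and returns its result.
loaded = pickle.loads(blob)
print(type(loaded).__name__, "->", loaded)
```

Nothing in the call to `pickle.loads` hints that code is about to run, which is why pickled files should only be loaded from sources you fully control.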

5. Safetensors

Safetensors is a newer format developed by Hugging Face. It’s designed to be safe, fast, and straightforward.

It avoids arbitrary code execution and allows efficient loading directly from disk.

Key considerations: It’s both memory-efficient and secure, which makes it a strong alternative to older formats like Pickle.

When to use it: Ideal for modern workflows where speed and safety are important.
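The safety property comes from the file layout: a length prefix, a JSON header describing each tensor, and then raw bytes. The sketch below hand-rolls that layout with the standard library to show why loading never needs to execute anything. It is an illustration under the assumption that the layout is as described in the format's public documentation, not a replacement for the safetensors library, and `write_sketch` and `read_sketch` are hypothetical helper names.

```python
import json
import os
import struct
import tempfile


def write_sketch(path, name, floats):
    """Write one float32 tensor using a header-plus-raw-bytes layout."""
    data = struct.pack(f"<{len(floats)}f", *floats)
    header = json.dumps({
        name: {"dtype": "F32", "shape": [len(floats)], "data_offsets": [0, len(data)]}
    }).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header)))  # 8-byte little-endian header length
        f.write(header)                          # JSON metadata only, no code
        f.write(data)                            # raw tensor bytes


def read_sketch(path, name):
    """Read one tensor back by parsing JSON and slicing bytes."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        start, end = header[name]["data_offsets"]
        f.seek(8 + header_len + start)           # jump straight to this tensor
        raw = f.read(end - start)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
write_sketch(path, "weight", [1.0, 2.5, -3.0])
restored = read_sketch(path, "weight")
print(restored)
```

Because the reader only parses JSON and seeks to byte offsets, there is no step where a file can smuggle in executable code, and a single tensor can be read without loading the whole file. In practice you would use the safetensors library itself rather than code like this.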

Model Bundling and Serving Tools

Once your model is saved, the next step is making it usable in production. That means wrapping it in a way that can handle requests and connect it to the rest of your system.

1. BentoML

BentoML allows you to define your model service in Python – including preprocessing, inference, and postprocessing – and package everything into a single unit called a “Bento.”

This bundle includes the model, code, dependencies, and even Docker configuration.

Key considerations: It simplifies deployment by packaging everything into one consistent artifact that can run anywhere.

When to use it: Great when you want to ship your model and all its logic together as one deployable unit.

2. NVIDIA Triton Inference Server

Triton is NVIDIA’s production-grade inference server. It supports multiple model formats like ONNX, TorchScript, TensorFlow, and more.

It’s built for performance, using features like dynamic batching and concurrent execution to fully utilize GPUs.

Key considerations: It delivers high throughput and efficiently uses hardware, especially GPUs, while supporting models from different frameworks.

When to use it: Best for large-scale deployments where performance, low latency, and GPU usage are critical.

3. TorchServe

TorchServe is the official serving tool for PyTorch, developed with AWS.

It packages models into a MAR file, which includes weights, code, and dependencies, and provides APIs for managing models in production.

Key considerations: It offers built-in features for versioning, batching, and management without needing to build everything from scratch.

When to use it: A solid choice for deploying PyTorch models in a standard production setup.

Model Registries

A model registry is essentially your source of truth. It stores your models, tracks versions, and manages their lifecycle from experimentation to production.

Without one, things quickly become messy and hard to track.

1. MLflow Model Registry

MLflow is one of the most widely used MLOps platforms. Its registry helps manage model versions and track their progression through stages like Staging and Production.

It also links models back to the experiments that created them.

Key considerations: It provides strong lifecycle management and makes it easier to track and audit models.

When to use it: Ideal for teams that need structured workflows and clear governance.

2. Hugging Face Hub

The Hugging Face Hub is one of the largest platforms for sharing and managing models.

It supports both public and private repositories, along with dataset versioning and interactive demos.

Key considerations: It offers a huge library of models and makes collaboration very easy.

When to use it: Perfect for projects involving transformers, generative AI, or anything that benefits from sharing and discovery.

3. Weights & Biases

Weights & Biases combines experiment tracking with a model registry.

It connects each model directly to the training run that produced it.

Key considerations: It gives you full traceability, so you always know how a model was created.

When to use it: Best when you want a strong link between experimentation and production artifacts.

Conclusion

Machine learning systems rarely fail because the models are bad. They fail because the path to production is fragile.

Packaging is what connects research to production. If that connection is weak, even great models won’t make it into real use.

Choosing the right tools across serialization, serving, and registry layers makes systems easier to deploy and maintain. Formats like ONNX and Safetensors improve portability and safety. Tools like Triton and BentoML help with reliable serving. Registries like MLflow and Hugging Face Hub keep everything organized.

The main idea is simple: don’t leave deployment as something to figure out later.

When packaging is planned early, teams move faster and avoid a lot of unnecessary problems.

In practice, success in MLOps isn’t just about building models. It’s about making sure they actually run in the real world.

How to Build an MCP Server with Python, Docker, and Claude Code

Balajee Asish Brahmandam — Tue, 10 Mar 2026 21:41:44 +0000

Every MCP tutorial I've found so far has followed the same basic script: build a server, point Claude Desktop at it, screenshot the chat window, done.

This is fine if you want a demo. But it's not fine if you want something you can ship, defend in an interview, or hand to another developer without a README that starts with "first, install this Electron app."

So I built an MCP server in Python, containerized it with Docker, and wired it into Claude Code – all from the terminal, no GUI required.

This article walks through the full loop in one afternoon: what MCP actually is, why it matters now that OpenAI and Google have adopted it, the real security problems nobody puts in their tutorial (complete with CVEs), and every command you need to go from an empty directory to a working tool.

If you're between jobs and need a portfolio project that shows you understand how AI tooling actually works under the hood, this is the one.

What You Will Build

By the end of this tutorial, you will have:

A Python MCP server that exposes custom tools to any MCP-compatible AI client
A Docker container that packages the server for reproducible deployment
A working connection between that container and Claude Code in your terminal
An understanding of the security risks involved and how to mitigate the worst of them

The server we are building is a project scaffolder. You give it a project name and a language, and it generates a starter directory structure with the right files. It's simple enough to build in an afternoon, but useful enough to actually put on your résumé.

Prerequisites

You will need the following installed on your machine:

Python 3.10+ (check with python3 --version)
Docker (check with docker --version)
Claude Code with an active Claude Pro, Max, or API plan (check with claude --version)
Node.js 20+ (required by Claude Code – check with node --version)
A terminal you are comfortable in

If you don't have Claude Code installed yet, follow the official installation instructions. The npm installation method is deprecated, so make sure you use the native binary installer instead.

What is MCP (and Why Should You Care)?

The Model Context Protocol (MCP) is an open standard that lets AI models connect to external tools and data sources. Anthropic released it in November 2024, and within a year it became the default way to extend what an LLM can do. OpenAI adopted it in March 2025. Google DeepMind followed in April. The protocol now has over 97 million monthly SDK downloads and more than 10,000 active servers.

The easiest way to think about MCP is as a USB-C port for AI. Before MCP, every AI provider had its own way of calling tools. OpenAI had function calling. Google had their own format. If you wanted your tool to work with multiple models, you had to implement it multiple times. MCP gives you one interface that works everywhere.

Here is how the pieces fit together:

An MCP server exposes tools, resources, and prompts. It is your code.
An MCP client (like Claude Code, Claude Desktop, or Cursor) discovers those tools and calls them on behalf of the LLM.
The transport is how they communicate. For local servers, that's usually stdio (standard input/output). For remote servers, it's HTTP.

When you type a message in Claude Code and it decides to use one of your tools, here is what happens: Claude Code sends a JSON-RPC 2.0 message to your server over stdin, your server executes the tool and writes the result to stdout, and Claude Code reads it back. The LLM never talks to your server directly. The client is always in the middle.
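As a rough sketch of what travels over that pipe, the snippet below builds a tool call and a matching reply as plain dictionaries. It assumes the `tools/call` method and result shape from the MCP specification, and it borrows the `scaffold_project` tool name from the server built later in this tutorial. The argument values and the reply text are made up for illustration.

```python
import json

# Request the client would write to the server's stdin (one JSON object per line).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "scaffold_project",
        "arguments": {"name": "weather-api", "language": "python"},
    },
}

# Response the server would write to stdout; the id ties it to the request.
response = {
    "jsonrpc": "2.0",
    "id": request["id"],
    "result": {
        "content": [{"type": "text", "text": '{"status": "created"}'}],
        "isError": False,
    },
}

# Each message is serialized on a single line and terminated by a newline.
wire_request = json.dumps(request) + "\n"
wire_response = json.dumps(response) + "\n"
print(wire_request, wire_response, sep="")
```

The matching `id` field is how the client pairs a reply with the call that triggered it, which matters once several calls are in flight.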

If you want the deeper architecture breakdown, freeCodeCamp already has a solid explainer on how MCP works under the hood. Here, I will focus on building.

Why Claude Code Instead of Claude Desktop?

Most MCP tutorials use Claude Desktop as the client. That works, but Claude Code has a few advantages for developers:

It lives in your terminal. No GUI to configure. No JSON files to hand-edit in hidden config directories. You add an MCP server with one command and you are done.
It's already where you code. If you're writing the server, testing it, and connecting it, doing all of that in the same terminal session cuts the context switching.
It works on headless machines. If you're SSHing into a dev box or running in CI, Claude Desktop isn't an option. Claude Code is.
It's also an MCP server itself. Claude Code can expose its own tools (file reading, writing, shell commands) to other MCP clients via claude mcp serve. That's a neat trick we won't use today, but it's worth knowing about.

The relevant commands:

# Add an MCP server
claude mcp add <name> -- <command>

# List configured servers
claude mcp list

# Remove a server
claude mcp remove <name>

# Check MCP status inside Claude Code
/mcp

Step 1: Build the MCP Server

We're using FastMCP, a Python framework that handles all the protocol plumbing so you can focus on your tools. Create a new project directory and set it up:

mkdir mcp-scaffolder && cd mcp-scaffolder
python3 -m venv .venv
source .venv/bin/activate
pip install "mcp[cli]>=1.25,<2"

Why pin the version? The MCP Python SDK v2.0 is in development and will change the transport layer significantly. Pinning to >=1.25,<2 keeps your server working until you're ready to migrate.

Now create server.py:

# server.py
from mcp.server.fastmcp import FastMCP
import os
import json

mcp = FastMCP("project-scaffolder")

# Templates for different languages
TEMPLATES = {
    "python": {
        "files": {
            "main.py": '"""Entry point."""\n\n\ndef main():\n    print("Hello, world!")\n\n\nif __name__ == "__main__":\n    main()\n',
            "requirements.txt": "",
            "README.md": "# {name}\n\nA Python project.\n\n## Setup\n\n```bash\npip install -r requirements.txt\npython main.py\n```\n",
            ".gitignore": "__pycache__/\n*.pyc\n.venv/\n",
        },
        "dirs": ["tests"],
    },
    "node": {
        "files": {
            "index.js": 'console.log("Hello, world!");\n',
            "package.json": '{{\n  "name": "{name}",\n  "version": "1.0.0",\n  "main": "index.js"\n}}\n',
            "README.md": "# {name}\n\nA Node.js project.\n\n## Setup\n\n```bash\nnpm install\nnode index.js\n```\n",
            ".gitignore": "node_modules/\n",
        },
        "dirs": [],
    },
    "go": {
        "files": {
            "main.go": 'package main\n\nimport "fmt"\n\nfunc main() {{\n\tfmt.Println("Hello, world!")\n}}\n',
            "go.mod": "module {name}\n\ngo 1.21\n",
            "README.md": "# {name}\n\nA Go project.\n\n## Setup\n\n```bash\ngo run main.go\n```\n",
            ".gitignore": "bin/\n",
        },
        "dirs": ["cmd", "internal"],
    },
}


@mcp.tool()
def scaffold_project(name: str, language: str) -> str:
    """Create a new project directory structure.

    Args:
        name: The project name (used as the directory name)
        language: The programming language - one of: python, node, go
    """
    language = language.lower().strip()

    if language not in TEMPLATES:
        return json.dumps({
            "error": f"Unsupported language: {language}",
            "supported": list(TEMPLATES.keys()),
        })

    template = TEMPLATES[language]
    base_path = os.path.join(os.getcwd(), name)

    if os.path.exists(base_path):
        return json.dumps({
            "error": f"Directory already exists: {name}",
        })

    # Create the project directory
    os.makedirs(base_path, exist_ok=True)

    # Create subdirectories
    for dir_name in template["dirs"]:
        os.makedirs(os.path.join(base_path, dir_name), exist_ok=True)

    # Create files
    created_files = []
    for filename, content in template["files"].items():
        filepath = os.path.join(base_path, filename)
        # Templates escape literal braces as {{ }}, so render them with str.format
        formatted_content = content.format(name=name)
        with open(filepath, "w") as f:
            f.write(formatted_content)
        created_files.append(filename)

    return json.dumps({
        "status": "created",
        "path": base_path,
        "language": language,
        "files": created_files,
        "directories": template["dirs"],
    })


@mcp.tool()
def list_templates() -> str:
    """List all available project templates and their contents."""
    result = {}
    for lang, template in TEMPLATES.items():
        result[lang] = {
            "files": list(template["files"].keys()),
            "directories": template["dirs"],
        }
    return json.dumps(result, indent=2)


if __name__ == "__main__":
    mcp.run(transport="stdio")

A few things to notice about this code:

Tools return strings. MCP tools communicate through text. I'm returning JSON strings so the LLM can parse the results reliably. You could return plain text, but structured data gives the model more to work with.

The @mcp.tool() decorator does the heavy lifting. FastMCP reads your function signature and docstring to generate the JSON schema that tells the LLM what this tool does, what arguments it takes, and what types they are. Good docstrings aren't optional here – they're how the LLM decides whether to call your tool.

transport="stdio" is the key line. This tells FastMCP to communicate over standard input/output, which is what Claude Code expects for local servers.
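To get a feel for what the decorator does with a signature and docstring, here is a simplified sketch built on the standard library's `inspect` module. It is not FastMCP's actual implementation. The `describe_tool` helper and the `JSON_TYPES` mapping are hypothetical, and the stand-in function gives `language` a default only to show how an optional argument would be treated.

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(fn):
    """Build a tool description from a function's signature and docstring."""
    hints = get_type_hints(fn)
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        # Parameters without a default must be supplied by the caller.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


def scaffold_project(name: str, language: str = "python") -> str:
    """Create a new project directory structure."""
    return ""


schema = describe_tool(scaffold_project)
print(schema)
```

The description field is taken straight from the docstring, which is why a vague docstring translates directly into a tool the model is less likely to pick correctly.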

Step 2: Test It Locally

Before we Dockerize anything, make sure the server actually works:

# Quick smoke test - the server should start without errors
python server.py

You should see... nothing. That is correct. An MCP server over stdio just sits there waiting for JSON-RPC messages on stdin. Press Ctrl+C to stop it.

For a proper test, use the MCP Inspector (Anthropic's debugging tool):

# Install and run the inspector
npx @modelcontextprotocol/inspector python server.py

This opens a web interface where you can see your tools, call them manually, and inspect the JSON-RPC messages going back and forth. Verify that both scaffold_project and list_templates show up and return sensible results.

Here's a debugging tip that will save you time: If your MCP server logs anything to stdout, it will corrupt the JSON-RPC stream and the client will disconnect. Use stderr for all logging: print("debug info", file=sys.stderr). This is the single most common source of "my server connects but then immediately fails" bugs. The New Stack called stdio transport "incredibly fragile" for exactly this reason.
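One way to make that rule hard to break is to route everything through the `logging` module with stderr as the stream. This is a minimal sketch: the `handle_request` function and the logger name are hypothetical, and the stdout capture at the end exists only to demonstrate that nothing leaks.

```python
import contextlib
import io
import logging
import sys

# Send all log records to stderr so stdout carries only protocol messages.
logging.basicConfig(stream=sys.stderr, level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("scaffolder")


def handle_request(name):
    """Hypothetical handler: logs for humans, returns data for the client."""
    log.info("scaffolding %s", name)
    return {"status": "created", "name": name}


# Capture stdout to confirm the handler leaves it untouched.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    result = handle_request("weather-api")

print(repr(captured.getvalue()), result, file=sys.stderr)
```

With this in place, a stray `print()` is the only remaining way to pollute stdout, which makes it easier to spot in review.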

Step 3: Dockerize It

Create a Dockerfile in your project root:

FROM python:3.12-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy server code
COPY server.py .

# MCP servers over stdio need unbuffered output
ENV PYTHONUNBUFFERED=1

# The server reads from stdin and writes to stdout
CMD ["python", "server.py"]

Create requirements.txt:

mcp[cli]>=1.25,<2

Build and verify:

docker build -t mcp-scaffolder .

# Quick test - should start without errors
docker run -i mcp-scaffolder

Again, you'll see nothing because the server is waiting for input. Ctrl+C to stop.

Two things matter in this Dockerfile:

PYTHONUNBUFFERED=1 is critical. Without it, Python buffers stdout, and the MCP client may hang waiting for responses that are sitting in a buffer. This is one of those bugs that works fine in local testing and breaks in Docker.
docker run -i (interactive mode) is required. The -i flag keeps stdin open so the MCP client can send messages to the container. Without it, the server gets an immediate EOF and exits.
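Both points can be reproduced without Docker. The sketch below starts a small child process that behaves like a stdio server, sends it one message, and then closes its stdin. The child script is a hypothetical stand-in, not the scaffolder server, and `ping` is just an illustrative method name.

```python
import json
import subprocess
import sys

# Hypothetical child that mimics a stdio server: one JSON reply per input line.
CHILD = r'''
import json, sys
for line in sys.stdin:            # the loop ends when stdin reaches EOF
    msg = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": msg["id"], "result": "pong"}
    print(json.dumps(reply), flush=True)   # flush so the parent is not left waiting
'''

proc = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

# Send one request and read exactly one reply line.
proc.stdin.write(json.dumps({"jsonrpc": "2.0", "id": 7, "method": "ping"}) + "\n")
proc.stdin.flush()
reply = json.loads(proc.stdout.readline())

# Closing stdin delivers EOF, which is what a container started without -i sees at once.
proc.stdin.close()
exit_code = proc.wait(timeout=10)
print(reply, exit_code)
```

If the child wrote its reply without flushing while its output was block-buffered, the parent's `readline()` would wait for data still sitting in the child's buffer. That is the hang `PYTHONUNBUFFERED=1` is there to prevent.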

Step 4: Wire It Into Claude Code

Now connect your Docker container to Claude Code:

claude mcp add scaffolder -- docker run -i --rm mcp-scaffolder

That's the whole command. Let me break it down:

claude mcp add registers a new MCP server
scaffolder is the name you will reference it by
Everything after -- is the command Claude Code runs to start the server
docker run -i --rm mcp-scaffolder starts the container with interactive stdin and removes it when done

Verify that it registered:

claude mcp list

You should see scaffolder in the output with a stdio transport type.

Now launch Claude Code and check the connection:

claude

Once inside Claude Code, type /mcp to see the status of your MCP servers. You should see scaffolder listed as connected with two tools available.

Step 5: Use It

Still inside Claude Code, try it out:

Create a new Python project called "weather-api"

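Claude Code should discover your scaffold_project tool, call it with name="weather-api" and language="python", and report back what it created. Keep in mind where the files end up: the tool writes under the server's working directory, which is /app inside the container. With the docker run command from Step 4 there is no volume mount, so the project is created in the container's filesystem and discarded when the container exits. To see the project structure on your host, the server needs to write into a directory that is mounted from the host.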

Try a few more:

What project templates are available?

Scaffold a Go project called "url-shortener"

If Claude Code doesn't pick up your tools, run /mcp to check the connection status. If it shows as disconnected, the most common causes are that the Docker image failed to build, stdout is being polluted (check for stray print statements), or the Docker daemon is not running.

Security: What the Other Tutorials Leave Out

This is the section most MCP tutorials skip. They should not. MCP has had real security incidents, not theoretical ones, and understanding them makes you a better developer.

The Prompt Injection Problem

MCP servers execute code on your machine based on what an LLM decides to do. If an attacker can influence what the LLM sees, they can influence what your server does. This is called prompt injection, and it is the number one unsolved security problem in the MCP ecosystem.

In May 2025, researchers at Invariant Labs demonstrated this against the official GitHub MCP server. They created a malicious GitHub issue that, when read by an AI agent, hijacked the agent into leaking private repository data (including salary information) into a public pull request. The root cause was an overly broad Personal Access Token combined with untrusted content landing in the LLM's context window.

This was not a contrived lab demo. It used the official GitHub MCP server, the kind of thing people install from the MCP server directory without a second thought.

Real CVEs, Not Theory

The ecosystem has accumulated real vulnerability reports:

CVE-2025-6514: A critical command-injection bug in mcp-remote, a popular OAuth proxy that 437,000+ environments used. An attacker could execute arbitrary OS commands through crafted OAuth redirect URIs.
CVE-2025-6515: Session hijacking in oatpp-mcp through predictable session IDs, letting attackers inject prompts into other users' sessions.
MCP Inspector RCE: Anthropic's own debugging tool allowed unauthenticated remote code execution. Inspecting a malicious server meant giving the attacker a shell on your machine.

An Equixly security assessment found command injection in 43% of tested MCP server implementations. Nearly a third were vulnerable to server-side request forgery.

What You Should Actually Do

For the server we built today, here is what matters:

Limit file system access

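Our Docker container doesn't mount your home directory. That's intentional. If you need the server to write files to your host, mount only the specific directory you need: docker run -i --rm -v $(pwd)/projects:/app/projects mcp-scaffolder. Note that scaffold_project writes under the server's working directory (/app in this image), so the tool also has to create projects under /app/projects for them to land in the mounted folder. Never mount / or ~.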

Validate all inputs

Our scaffold_project tool checks that the language is in a known list and that the directory does not already exist. But think about what happens if someone passes name="../../etc/passwd" as the project name. Path traversal is the kind of thing you need to catch. Add this to the tool:

# Add this validation at the top of scaffold_project
if ".." in name or "/" in name or "\\" in name:
    return json.dumps({"error": "Invalid project name"})
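A stricter variant is sketched below, in case a substring check feels too easy to bypass. It combines an allowlist pattern with a check on the resolved path. The `safe_project_path` helper and the naming policy in `NAME_PATTERN` are assumptions for illustration, not part of the tutorial's server.

```python
import os
import re

# Hypothetical policy: letters, digits, dot, underscore, hyphen; must start alphanumeric.
NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,63}")


def safe_project_path(base_dir, name):
    """Return the absolute project path, or raise ValueError for unsafe names."""
    if not NAME_PATTERN.fullmatch(name) or ".." in name:
        raise ValueError(f"Invalid project name: {name!r}")
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, name))
    # Defense in depth: the resolved path must still sit inside base_dir.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"Path escapes base directory: {name!r}")
    return target


for candidate in ["weather-api", "../../etc/passwd", "a/b", ""]:
    try:
        print(repr(candidate), "->", safe_project_path(os.getcwd(), candidate))
    except ValueError as exc:
        print(repr(candidate), "-> rejected:", exc)
```

An allowlist states what a valid name looks like instead of enumerating bad characters, so inputs nobody thought of are rejected by default. The resolved-path check is a second barrier in case the pattern is later loosened.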

Use least-privilege tokens

If your MCP server connects to an API, give it the minimum permissions it needs. The GitHub MCP incident happened because the PAT had access to every private repo. A read-only token scoped to one repo would have contained the blast radius.

Do not install MCP servers from untrusted sources

A malicious npm package posing as a "Postmark MCP Server" was caught silently BCC'ing all emails to an attacker's address. Treat MCP server packages with the same caution you would give any code that runs on your machine with your permissions.

What to Do Next

You have a working MCP server in a Docker container, connected to Claude Code. Here is how to make it portfolio-ready:

Add more tools: The scaffolder is a starting point. Add a tool that reads a project's dependency file and lists outdated packages. Add one that generates a Dockerfile for an existing project. Each tool is a function with a decorator – the pattern is the same every time.
Add tests: Write pytest tests that call your tool functions directly and verify the output. MCP tools are just Python functions. Test them like Python functions.
Push the Docker image: Tag it and push to Docker Hub or GitHub Container Registry. Then your claude mcp add command becomes claude mcp add scaffolder -- docker run -i --rm yourusername/mcp-scaffolder:latest and anyone can use it.
Write a README that explains the security model: What permissions does your server need? What file system access? What happens if inputs are malicious? Answering these questions in your README signals that you think about security, which is exactly what hiring managers are looking for right now.
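For the testing item above, a first pass could look like the sketch below. To stay self-contained it tests a stand-in function, `scaffold_stub`, that mirrors only the language check and JSON return shape of scaffold_project. In a real project you would import the tool from server.py instead, assuming the decorated function remains directly callable.

```python
import json

SUPPORTED = ["python", "node", "go"]


def scaffold_stub(name: str, language: str) -> str:
    """Stand-in that mirrors scaffold_project's language check and JSON replies."""
    language = language.lower().strip()
    if language not in SUPPORTED:
        return json.dumps({"error": f"Unsupported language: {language}", "supported": SUPPORTED})
    return json.dumps({"status": "created", "language": language})


def test_rejects_unknown_language():
    result = json.loads(scaffold_stub("demo", "cobol"))
    assert "error" in result
    assert result["supported"] == SUPPORTED


def test_normalizes_language_case():
    result = json.loads(scaffold_stub("demo", "  Python "))
    assert result["language"] == "python"
```

pytest collects any function whose name starts with `test_`, so saving this in a file such as test_server.py and running pytest is enough to execute both checks.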

Wrapping Up

We built a Python MCP server with FastMCP, containerized it with Docker, and connected it to Claude Code. The whole thing fits in about 100 lines of Python, a seven-instruction Dockerfile, and one claude mcp add command.

The MCP ecosystem is real and growing fast. The protocol has the backing of Anthropic, OpenAI, and Google. It's now governed by the Linux Foundation. But it's also young, and the security story is still being written. Build with it, but build with your eyes open.

If you want to go deeper, here are the resources I found most useful:

MCP specification: the actual protocol docs
Claude Code MCP documentation: how Claude Code implements MCP
FastMCP GitHub: the Python framework we used
AuthZed's timeline of MCP security incidents: required reading if you are building MCP servers for production
Simon Willison on MCP prompt injection: the clearest explanation of why this is hard to solve

The complete source code for this tutorial is on GitHub.

The Open Source LLM Agent Handbook: How to Automate Complex Tasks with LangGraph and CrewAI

Balajee Asish Brahmandam — Tue, 03 Jun 2025 14:20:30 +0000

Ever feel like your AI tools are a bit...well, passive? Like they just sit there, waiting for your next command? Imagine if they could take initiative, break down big problems, and even work together to get things done.

That's exactly what LLM agents bring to the table. They're changing how we automate complex tasks, and they can help bring our AI ideas to life in a whole new way.

In this article, we'll explore what LLM agents are, how they work, and how you can build your very own using awesome open-source frameworks.

What we’ll cover:

The Current State of LLM Agents
What Are LLM Agents and Why Are They a Big Deal?
The Rise of Open-Source Agent Frameworks
Core Concepts Behind Agent Design
Project: Automate Your Daily Schedule from Emails
Multi-Agent Collaboration with CrewAI
What Actually Happens During Execution?
Are LLM Agents Safe? What to Know About Security and Privacy
Troubleshooting & Tips
Explore More Daily Automations
What’s Next in Agent Technology?
Final Summary

The Current State of LLM Agents

LLM agents are one of the most exciting developments in AI right now. They’re already helping automate real tasks, but they’re also still evolving. So where are we today?

From Chatbots to Autonomous Agents

Large Language Models (LLMs) like GPT-4, Claude, Gemini, and LLaMA have evolved from simple chatbots into surprisingly capable reasoning engines. They've gone from answering trivia questions and generating essays to performing complex reasoning, following multi-step instructions, and interacting with tools like web search and code interpreters.

But here’s the catch: these models are reactive. They wait for input and give output. They don't retain memory between tasks, plan ahead, or pursue goals on their own. That’s where LLM agents come in – they bridge this gap by adding structure, memory, and autonomy.

What Can Agents Do Today?

Right now, LLM agents are already being used for:

Summarizing emails or documents
Planning daily schedules
Running DevOps scripts
Searching APIs or tools for answers
Collaborating in small “teams” to complete complex tasks

But they’re not perfect yet. Agents can still:

Get stuck in loops
Misunderstand goals
Require detailed prompts and guardrails

That’s because this technology is still early-stage. Frameworks are getting better fast, but reliability and memory are still works in progress. So just keep that in mind as you experiment.

Why Now Is the Best Time to Learn

The truth is: we’re still early. But not too early.

This is the perfect time to start experimenting with agents:

The tooling is mature enough to build real projects
The community is growing rapidly
And you don’t need to be an AI expert, just comfortable with Python

What Are LLM Agents and Why Are They a Big Deal?

Before we dive into the exciting world of agents, let's quickly chat a bit more about the basics.

What Is an LLM?

An LLM, or Large Language Model, is basically an AI that's learned from a massive amount of text from the internet – think books, articles, code, and tons more. You can picture it as a super-smart autocomplete engine. But it does way more than just finish your sentences. It can also:

Answer tricky questions
Summarize long articles or documents
Write code, emails, or creative stories
Translate languages instantly
Even solve logic puzzles and have engaging conversations

Chances are you've heard of ChatGPT, which is powered by OpenAI's GPT models. Other popular LLMs you might come across include Claude (from Anthropic), LLaMA (by Meta), Mistral, and Gemini (from Google).

These models work by predicting the next token (roughly, the next word or word fragment) based on the context. While that sounds straightforward, when trained on billions of words, LLMs become capable of surprisingly intelligent behavior: understanding your instructions, following step-by-step reasoning, and producing coherent responses across almost any topic you can imagine.

So, What’s an LLM Agent?

While LLMs are super powerful, they usually just react – they only respond when you ask them something. An LLM agent, on the other hand, is proactive.

LLM agents can:

Break down big, complex tasks into smaller, manageable steps
Make smart decisions and figure out what to do next
Use "tools" like web search, calculators, or even other apps
Work towards a goal, even if it takes multiple steps or tries
Team up with other agents to accomplish shared objectives

In short, LLM agents can think, plan, act, and adapt.

Think of an LLM agent like your super-efficient new assistant: you give it a goal, and it figures out how to achieve it all on its own.

Why Does This Matter?

This shift from just responding to actively pursuing goals opens a ton of exciting possibilities:

Automating boring IT or DevOps tasks
Generating detailed reports from raw data
Helping you with multi-step research projects
Reading through your daily emails and highlighting key info
Running your internal tools to take real-world actions

Unlike older, rule-based bots, LLM agents can reason, reflect, and learn from their attempts. This makes them a much better fit for real-world tasks that are messy, require flexibility, and depend on understanding context.

The Rise of Open-Source Agent Frameworks

Not too long ago, if you wanted to build an AI system that could act autonomously, it meant writing a ton of custom code, painstakingly managing memory, and trying to stitch together dozens of components. It was a complex, delicate, and highly specialized job.

But guess what? That's not the case anymore.

In 2024, a wave of fantastic open-source frameworks hit the scene. These tools have made it dramatically easier to build powerful LLM agents without you having to reinvent the wheel every time.

Popular Open-Source Agent Frameworks

Framework	Description	Maintainer
LangGraph	Graph-based framework for agent state and memory	LangChain
CrewAI	Role-based, multi-agent collaboration engine	Community (CrewAI)
AutoGen	Customizable multi-agent chat orchestration	Microsoft
AgentVerse	Modular framework for agent simulation and testing	Open-source project

What These Tools Enable

These frameworks give you ready-made building blocks to handle the trickier parts of creating agents:

Planning – Letting agents decide their next move
Tool Use – Easily connecting agents to things like file systems, web browsers, APIs, or databases
Memory – Storing and retrieving past information or intermediate results for long-term context
Multi-Agent Collaboration – Setting up teams of agents that work together on shared goals

Why Use a Framework Instead of Building from Scratch?

While you could build a custom agent from the ground up, using a framework will save you a huge amount of time and effort. Open-source agent libraries come packed with:

Built-in support for orchestrating LLMs
Proven patterns for task planning, keeping track of where you are, and getting feedback
Easy integration with popular models like OpenAI, or even models you run locally
The flexibility to grow from a single helpful agent to entire teams of agents

Basically, these frameworks let you focus on what your agent should do, rather than getting bogged down in how to build all the internal workings. Plus, choosing open source means you benefit from community contributions, transparency in how they work, and the freedom to tweak them to your exact needs, without getting locked into a single vendor.

Core Concepts Behind Agent Design

To really grasp how LLM agents operate, it helps to think of them as goal-driven systems that constantly cycle through observing, reasoning, and acting. This continuous loop allows them to tackle tasks that go beyond simple questions and answers, moving into true automation, tool usage, and adapting on the fly.

The Agent Loop

Most LLM agents function based on a mental model called the Agent Loop, a step-by-step cycle that repeats until the job is done. Here’s how it typically works:

Perceive: The agent starts by noticing something in its environment or receiving new information. This could be your prompt, a piece of data, or the current state of a system.
Plan: Based on what it perceives and its overall goal, the agent decides what to do next. It might break the task into smaller sub-goals or figure out the best tool for the job.
Act: The agent then acts. This could mean running a function, calling an API, searching the web, interacting with a database, or even asking another agent for help.
Reflect: After acting, the agent looks at the outcome: Did it work? Was the result useful? Should it try a different approach? Based on this, it updates its plan and keeps going until the task is complete.

This loop is what makes agents so dynamic. It allows them to handle ever-changing tasks, learn from partial results, and correct their course. These qualities are vital for building truly useful AI assistants.
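The four steps above can be sketched in plain Python. This is a minimal illustration, not how any particular framework implements it: fake_llm_plan and act are hypothetical stand-ins for a real LLM call and a real tool call, and the step names are invented for the example.

```python
def fake_llm_plan(goal: str, done: list) -> str:
    """Stand-in for an LLM call: return the next unfinished step."""
    steps = ["read_emails", "extract_events", "write_schedule"]
    remaining = [step for step in steps if step not in done]
    return remaining[0] if remaining else "finish"

def act(action: str) -> bool:
    """Pretend to run the action; a real agent would call a tool or API."""
    return True

def run_agent(goal: str, max_steps: int = 10) -> list:
    done = []                                   # memory of completed actions
    for _ in range(max_steps):                  # cap so the loop always ends
        observation = {"goal": goal, "done": list(done)}                   # Perceive
        action = fake_llm_plan(observation["goal"], observation["done"])   # Plan
        if action == "finish":
            break
        succeeded = act(action)                                            # Act
        if succeeded:                                                      # Reflect
            done.append(action)
    return done

print(run_agent("summarize my day"))
```

Note the max_steps cap: even in a toy loop, it's a good habit to bound how many times an agent may iterate.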

Key Components of an Agent

To do their job effectively, agents are built around several crucial parts:

Tools are how an agent interacts with the real (or digital) world. These can be anything from search engines, code execution environments, file readers, or API clients, to simple calculators or command-line scripts.
Memory lets agents remember what they've done or seen across different steps. This might include previous things you've said, temporary results, or key decisions. Some frameworks offer short-term memory (just for one session), while others support long-term memory that can span multiple sessions or goals.
Environment refers to the external data or system context the agent operates within: think APIs, documents, databases, files, or sensor inputs. The more information and access an agent has to its environment, the more meaningful actions it can take.
Goal is the agent's ultimate objective: what it's trying to achieve. Goals should be specific and clear, for instance, “generate a daily schedule,” “summarize this document,” or “extract tasks from emails.”
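Here's one way to picture how these four parts fit together, as a small dataclass. The SimpleAgent class and its field names are hypothetical, chosen for this sketch only. Frameworks like LangGraph and CrewAI model these ideas in their own ways.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class SimpleAgent:
    goal: str                                         # what the agent is trying to achieve
    tools: Dict[str, Callable[[str], str]]            # named functions the agent may call
    environment: Dict[str, str]                       # external data the agent can read
    memory: List[str] = field(default_factory=list)   # record of past steps

    def use_tool(self, name: str, key: str) -> str:
        """Run a tool on one item from the environment and remember the result."""
        result = self.tools[name](self.environment[key])
        self.memory.append(f"{name}({key}) -> {result}")
        return result

agent = SimpleAgent(
    goal="extract tasks from emails",
    tools={"word_count": lambda text: str(len(text.split()))},
    environment={"inbox": "Standup Call at 10 AM"},
)
print(agent.use_tool("word_count", "inbox"))
print(agent.memory)
```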

Multi-Agent Collaboration

For more advanced systems, you can even have multiple agents working together to hit a shared target. Each agent can be given a specific role that highlights its specialty, just like people working on a team.

For example:

A researcher agent might be tasked with gathering information.
A coder agent could write Python scripts or automation routines.
A reviewer agent might check the results and ensure everything is up to snuff.

These agents can chat with each other, share information, and even debate or vote on decisions. This kind of teamwork allows AI systems to tackle bigger, more complex tasks while keeping things organized and modular.

Project: Automate Your Daily Schedule from Emails

What We’re Automating

Think about your typical morning routine:

You open your inbox.
You quickly scan through a bunch of emails.
You try to spot meetings, tasks, and important reminders.
Then, you manually write a to-do list or add things to your calendar.

Let's use an LLM agent to make that process effortless. Our agent will:

Read a list of your email messages
Pull out time-sensitive items like meetings or deadlines
Summarize everything into a nice, clean daily schedule

Step 1: Install the Required Tools

To get started, you'll need three main tools: Python, VSCode, and an OpenAI API key.

1. Install Python 3.9 or Higher

Grab the latest version of Python 3.9+ from the official website: https://www.python.org/downloads/

Once it's installed, double-check it by running python --version in your terminal.

This command simply asks your system to report the Python version currently installed. You'll want to see Python 3.9.x or something higher to ensure compatibility with our project.
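If you'd rather check from inside Python (for example, at the top of your script), a small sketch like this works. The python_is_supported helper is a name invented for this example.

```python
import sys

def python_is_supported(version_info=sys.version_info, minimum=(3, 9)) -> bool:
    """Return True when the interpreter is at least the minimum version."""
    return tuple(version_info[:2]) >= minimum

if python_is_supported():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Please upgrade to Python 3.9 or newer.")
```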

2. Install VSCode (Optional but Recommended)

VSCode is a fantastic, user-friendly code editor that works perfectly with Python. You can download it right here: https://code.visualstudio.com/.

3. Get Your OpenAI API Key

Head over to: https://platform.openai.com

Sign in or create a new account. Navigate to your API Keys page. Click “Create new secret key” and make sure to copy that key somewhere safe for later.

4. Install Python Libraries

Open your terminal or command prompt and install these essential packages:

pip install langgraph langchain langchain-openai openai

This command uses pip, Python's package manager, to download and install the libraries our agent needs:

langgraph: The core framework we'll use to build our agent's workflow.
langchain: A foundational library for working with large language models, upon which LangGraph is built.
langchain-openai: The LangChain integration package that provides the ChatOpenAI class we import in the code below.
openai: The official Python library for connecting to OpenAI's powerful AI models.

If you're excited to try out multi-agent setups (which we'll cover in the CrewAI section later on), also install CrewAI:

pip install crewai

This command installs CrewAI, a specialized framework that makes it easy to orchestrate multiple AI agents working together as a team.

5. Set Your OpenAI API Key

You need to make sure your Python code can find and use your OpenAI API key. This is typically done by setting it as an environment variable.

On macOS/Linux, run this in your terminal (replace "your-api-key" with your actual key):

export OPENAI_API_KEY="your-api-key"

This command sets an environment variable named OPENAI_API_KEY. Environment variables are a secure way for applications (like your Python script) to access sensitive information without hardcoding it directly into the code itself.

On Windows (using Command Prompt), do this:

set OPENAI_API_KEY=your-api-key

This is the Windows equivalent command to set the OPENAI_API_KEY environment variable.

Now, your Python code will be all set to talk to the OpenAI model!
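Before running any agent code, it can help to confirm that the variable is actually visible to Python. Here's a minimal sketch using only the standard library. The has_api_key helper is a made-up name for this example.

```python
import os

def has_api_key(env=None, name: str = "OPENAI_API_KEY") -> bool:
    """Return True if the variable is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get(name, "").strip())

if has_api_key():
    print("OPENAI_API_KEY is set.")
else:
    print("OPENAI_API_KEY is missing. Set it before running the agent.")
```

Keep in mind that a variable set with export or set only lasts for the current terminal session, so run your script from the same terminal window.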

Step 2: Define the Task

We discussed this briefly at the beginning of this section. But to reiterate, this is the manual routine we want our agent to take over:

Scan for meetings, events, and important tasks.
Jot them down quickly in a notebook or an app.
Create a rough mental plan for your day.

This routine takes time and mental energy. So having an agent do it for us will be super helpful.

Step 3: Build the Workflow with LangGraph

What Is LangGraph?

LangGraph is a cool framework that helps you build agents using a "graph-based" workflow, kind of like drawing a flowchart. It's powered by LangChain and gives you a lot more control over exactly how each step in your agent's process unfolds.

Each "node" in this graph represents a decision point or a function that:

Takes some input (its current "state").
Does some reasoning or takes an action (often involving the LLM and its tools).
Returns an updated output (a new "state").

You draw the connections between these nodes, and LangGraph then executes it like a smart, automated state machine.
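To get a feel for the "nodes plus state" idea before touching LangGraph itself, here's a tiny state machine in plain Python. It's a rough sketch of the concept only, not LangGraph's implementation: run_graph, extract, and summarize are hypothetical names, and the "reasoning" inside each node is just string manipulation.

```python
from typing import Callable, Dict, Optional

State = Dict[str, str]
Node = Callable[[State], State]

def run_graph(nodes: Dict[str, Node], edges: Dict[str, Optional[str]],
              entry: str, state: State) -> State:
    """Run nodes in edge order, merging each node's output into the state."""
    current: Optional[str] = entry
    while current is not None:
        state = {**state, **nodes[current](state)}
        current = edges.get(current)   # None means we reached the end
    return state

def extract(state: State) -> State:
    return {"events": state["emails"].upper()}

def summarize(state: State) -> State:
    return {"result": f"Schedule: {state['events']}"}

final = run_graph(
    nodes={"extract": extract, "summarize": summarize},
    edges={"extract": "summarize", "summarize": None},
    entry="extract",
    state={"emails": "standup at 10 am"},
)
print(final["result"])
```

Each node only returns the keys it changed, and the runner merges them into the shared state. That's the same mental model you'll use with LangGraph's nodes.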

Why Use LangGraph?

You get to control the precise order of execution.
It's fantastic for building workflows that have multiple steps or even branch off into different paths.
It plays nicely with both cloud-based models (like OpenAI) and models you run locally.

Alright – now let’s write the code.

1. Simulate Email Input

In a real application, your agent would probably connect to Gmail or Outlook to fetch your actual emails. For this example, though, we’ll just hardcode some sample messages to keep things simple:


emails = """
1. Subject: Standup Call at 10 AM
2. Subject: Client Review due by 5 PM
3. Subject: Lunch with Sarah at noon
4. Subject: AWS Budget Warning – 80% usage
5. Subject: Dentist Appointment - 4 PM
"""

This multiline Python string, emails, acts as our stand-in for real email content. We're providing a simple, structured list of email subjects to demonstrate how the agent will process text.
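If you ever want to work with the individual messages rather than one big string, a regular expression can split the sample into subjects. This is an optional sketch: parse_subjects is a hypothetical helper, and it assumes every line follows the "N. Subject: ..." pattern used in our sample data.

```python
import re

emails = """
1. Subject: Standup Call at 10 AM
2. Subject: Client Review due by 5 PM
3. Subject: Lunch with Sarah at noon
4. Subject: AWS Budget Warning – 80% usage
5. Subject: Dentist Appointment - 4 PM
"""

def parse_subjects(raw: str) -> list:
    """Return the subject text of every 'N. Subject: ...' line."""
    pattern = re.compile(r"^\s*\d+\.\s*Subject:\s*(.+?)\s*$", re.MULTILINE)
    return pattern.findall(raw)

subjects = parse_subjects(emails)
print(len(subjects), "emails found")
print(subjects[0])
```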

2. Define the Agent Logic

Now, we'll tell OpenAI’s GPT model how to process this email text and turn it into a summary.

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated, List
import operator

# Define the state for our graph
class AgentState(TypedDict):
    emails: str
    result: str

llm = ChatOpenAI(temperature=0, model="gpt-4o") # Using gpt-4o for better performance

def calendar_summary_agent(state: AgentState) -> AgentState:
    emails = state["emails"]
    prompt = f"Summarize today's schedule based on these emails, listing time-sensitive items first and then other important notes. Be concise and use bullet points:\n{emails}"
    summary = llm.invoke(prompt).content
    return {"result": summary, "emails": emails} # Ensure emails is also returned

Here’s what’s going on:

Imports: We bring in necessary components:
- ChatOpenAI to connect to the LLM,
- StateGraph and END from langgraph.graph to build our agent workflow,
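- TypedDict from typing to describe the structure of our state (Annotated and List are imported too, but aren't used in this snippet),
- operator (also not used in this snippet; in LangGraph it's often paired with Annotated to describe how a state field should be merged across steps).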
AgentState: This TypedDict defines the shape of the data our agent will work with. It includes:
- emails: the raw input messages.
- result: the final output (the daily summary).
llm = ChatOpenAI(...): Initializes the language model. We're using GPT-4o with temperature=0 to keep the output as consistent and predictable as possible, which is perfect for structured summarization tasks.
calendar_summary_agent(state: AgentState): This function is the "brain" of our agent. It:
- Takes in the current state, which includes the emails as a single string.
- Extracts the emails from that state.
- Constructs a prompt that tells the model to generate a concise daily schedule summary using bullet points, prioritizing time-sensitive items.
- Sends this prompt to the model with llm.invoke(prompt).content, which returns the LLM’s response as plain text.
- Returns a new AgentState dictionary containing:
  - result: the generated summary,
  - emails: preserved in case we need it downstream.
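While you're developing, you may want to try out the node's logic without spending API credits. One option is to swap in a fake model that has the same .invoke(...).content shape. This is a sketch under that assumption: FakeLLM and make_summary_node are hypothetical names, not part of LangChain or LangGraph.

```python
from types import SimpleNamespace

class FakeLLM:
    """Hypothetical stand-in that mimics the .invoke(...).content shape."""
    def __init__(self):
        self.prompts = []

    def invoke(self, prompt: str):
        self.prompts.append(prompt)          # remember what was asked
        return SimpleNamespace(content="- 10:00 AM Standup Call")

def make_summary_node(llm):
    """Build a node function around any object with an invoke() method."""
    def node(state: dict) -> dict:
        prompt = f"Summarize today's schedule:\n{state['emails']}"
        return {"emails": state["emails"], "result": llm.invoke(prompt).content}
    return node

fake = FakeLLM()
node = make_summary_node(fake)
out = node({"emails": "1. Subject: Standup Call at 10 AM"})
print(out["result"])
```

Passing the model in as an argument, instead of relying on a global llm, is a design choice that makes this kind of offline testing easy.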

3. Build and Run the Graph

Now, let's use LangGraph to map out the flow of our single-agent task and then run it.

builder = StateGraph(AgentState)
builder.add_node("calendar", calendar_summary_agent)
builder.set_entry_point("calendar")
builder.set_finish_point("calendar") # Marks "calendar" as the last node, so the graph ends after it

graph = builder.compile()

# Run the graph using your simulated email data
result = graph.invoke({"emails": emails})
print(result["result"])

Here’s what’s going on:

builder = StateGraph(AgentState): We're initiating a StateGraph object. By passing AgentState, we're telling LangGraph the expected data structure for its internal state.
builder.add_node("calendar", calendar_summary_agent): This line adds a named "node" to our graph. We're calling it "calendar", and we're linking it to our calendar_summary_agent function, meaning that function will be executed when this node is active.
builder.set_entry_point("calendar"): This sets "calendar" as the very first step in our workflow. When we start the graph, execution will begin here.
builder.set_finish_point("calendar"): This tells LangGraph that once the "calendar" node finishes its job, the entire graph process is complete.
graph = builder.compile(): This command takes our defined graph blueprint and "compiles" it into an executable workflow.
result = graph.invoke({"emails": emails}): This is where the magic happens! We're telling our graph to start running. We pass it an initial state that contains our emails data. The graph will then process this data through its nodes until it reaches an end point, returning the final state.
print(result["result"]): Finally, we grab the summarized schedule from the result (the final state of our graph) and print it to the console.

Example Output

Your Schedule:
- 10:00 AM – Standup Call
- 12:00 PM – Lunch with Sarah
- 4:00 PM – Dentist Appointment
- Submit client report by 5:00 PM
- AWS Budget Warning – check usage

Boom! You've just built an AI agent that can read your emails and whip up your daily schedule. Pretty cool, right? This is a simple yet powerful peek into what LLM agents can do with just a few lines of code.

Multi-Agent Collaboration with CrewAI

What Is CrewAI?

CrewAI is an exciting open-source framework that lets you build teams of agents that work together seamlessly, just like a real-world project team! Each agent in a CrewAI setup:

Has a specific, specialized role.
Can communicate and share information with its teammates.
Collaborates to achieve a shared goal.

This multi-agent approach is super useful when your task is too big or too complex for just one agent, or when breaking it down into specialized parts makes it clearer and more efficient.

Sample Roles for the Email Summary Task

Let's imagine our email summary task being handled by a small team of agents:

Agent Name	Role	Responsibility
Extractor	Email Scanner	Find meetings, reminders, and tasks from emails
Prioritizer	Schedule Optimizer	Sort items by urgency and time
Formatter	Output Generator	Write a clean, polished daily agenda

Sample CrewAI Code

from crewai import Agent, Crew, Task, Process
from langchain_openai import ChatOpenAI
import os

# Set your OpenAI API key from environment variables
# os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY" # Make sure this is set, or defined directly

# Initialize the LLM (using gpt-4o for better performance)
llm = ChatOpenAI(temperature=0, model="gpt-4o")

# Define the agents with specific roles and goals
extractor = Agent(
    role="Email Scanner",
    goal="Find all meetings, reminders, and tasks from the given emails, accurately extracting details like time, date, and subject.",
    backstory="You are an expert at scanning emails for key information. You meticulously extract every relevant detail.",
    verbose=True,
    allow_delegation=False,
    llm=llm
)

prioritizer = Agent(
    role="Schedule Optimizer",
    goal="Sort extracted items by urgency and time, preparing them for a daily agenda.",
    backstory="You are a master of time management, always knowing what needs to be done first. You organize tasks logically.",
    verbose=True,
    allow_delegation=False,
    llm=llm
)

formatter = Agent(
    role="Output Generator",
    goal="Generate a clean, polished, and concise daily agenda in bullet-point format, clearly listing all schedule items.",
    backstory="You are a professional secretary, ensuring all outputs are perfectly formatted and easy to read. You prioritize clarity.",
    verbose=True,
    allow_delegation=False,
    llm=llm
)

# Simulate email input
emails = """
1. Subject: Standup Call at 10 AM
2. Subject: Client Review due by 5 PM
3. Subject: Lunch with Sarah at noon
4. Subject: AWS Budget Warning – 80% usage
5. Subject: Dentist Appointment - 4 PM
"""

# Define the tasks for each agent
extract_task = Task(
    description=f"Extract all relevant events, meetings, and tasks from these emails: {emails}. Focus on precise details.",
    agent=extractor,
    expected_output="A list of extracted items with their details (e.g., '- Standup Call at 10 AM', '- Client Review due by 5 PM')."
)

prioritize_task = Task(
    description="Prioritize the extracted items by time and urgency. Meetings first, then deadlines, then other notes.",
    agent=prioritizer,
    context=[extract_task], # The output of extract_task is the input here
    expected_output="A prioritized list of schedule items."
)

format_task = Task(
    description="Format the prioritized schedule into a clean, easy-to-read daily agenda using bullet points. Ensure concise language.",
    agent=formatter,
    context=[prioritize_task], # The output of prioritize_task is the input here
    expected_output="A well-formatted daily agenda with bullet points."
)

# Instantiate the crew
crew = Crew(
    agents=[extractor, prioritizer, formatter],
    tasks=[extract_task, prioritize_task, format_task],
    process=Process.sequential, # Tasks are executed sequentially
    verbose=2 # Outputs more details during execution
)

# Run the crew
result = crew.kickoff()
print("\n########################")
print("## Final Daily Agenda ##")
print("########################\n")
print(result)

Here’s what’s going on:

Imports: We bring in key classes from CrewAI: Agent, Crew, Task, and Process. We also import ChatOpenAI for our language model and os to handle environment variables.
llm = ChatOpenAI(...): Just like in the LangGraph example, this sets up our OpenAI language model, making sure its responses are direct (temperature=0) and using the gpt-4o model.
Agent Definitions (extractor, prioritizer, formatter):
- Each of these variables creates an Agent instance. An agent is defined by its role (what it does), a specific goal it's trying to achieve, and a backstory (a sort of personality or expertise that helps the LLM understand its purpose better).
- verbose=True is super helpful for debugging, as it makes the agents print out their "thoughts" as they work.
- allow_delegation=False means these agents won't pass their assigned tasks to other agents (though this can be set to True for more complex delegation scenarios).
- llm=llm connects each agent to our OpenAI language model.
Simulated emails: We reuse the same sample email data for this example.
Task Definitions (extract_task, prioritize_task, format_task):
- Each Task defines a specific piece of work that an agent needs to perform.
- description clearly tells the agent what the task involves.
- agent assigns this task to one of our defined agents (e.g., extractor for extract_task).
- context=[...] is a critical part of CrewAI's collaboration. It tells a task to use the output of a previous task as its input. For instance, prioritize_task takes the extract_task's output as its context.
- expected_output gives the agent an idea of what its result should look like, helping guide the LLM.
crew = Crew(...):
- This is where we assemble our team! We create a Crew instance, giving it our list of agents and tasks.
- process=Process.sequential tells the crew to execute tasks one after another in the order they're defined in the tasks list. CrewAI also supports more advanced processes like hierarchical ones.
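- verbose=2 asks for a very detailed log of the crew's internal workings and communication. Newer CrewAI releases expect a boolean here, so use verbose=True if verbose=2 raises a validation error.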
result = crew.kickoff(): This command officially starts the entire multi-agent workflow. The agents will begin collaborating, passing information, and working through their assigned tasks in sequence.
print(result): Finally, the consolidated output from the entire crew's collaborative effort is printed to your console.

CrewAI cleverly handles all the communication between agents, figures out who needs to work on what and when, and passes the output smoothly from one agent to the next. It's like having a mini AI assembly line!
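If the "assembly line" idea feels abstract, here's a bare-bones sketch of a sequential process in plain Python, where each step receives the previous step's output. It isn't CrewAI's implementation: the three functions are simple string rules standing in for LLM-backed agents, and run_sequential is a name invented for this example.

```python
from typing import Callable, List

def extractor(text: str) -> str:
    """Keep only lines that mention a time such as '10 AM'."""
    return "\n".join(l for l in text.splitlines() if "AM" in l or "PM" in l)

def prioritizer(text: str) -> str:
    """Put morning items before afternoon items (a crude ordering rule)."""
    return "\n".join(sorted(text.splitlines(), key=lambda l: "PM" in l))

def formatter(text: str) -> str:
    """Turn each line into a bullet point."""
    return "\n".join(f"- {l.strip()}" for l in text.splitlines())

def run_sequential(steps: List[Callable[[str], str]], first_input: str) -> str:
    output = first_input
    for step in steps:          # each step receives the previous step's output
        output = step(output)
    return output

emails = "Dentist Appointment - 4 PM\nAWS Budget Warning\nStandup Call at 10 AM"
print(run_sequential([extractor, prioritizer, formatter], emails))
```

In the real CrewAI example, the context=[...] argument plays the role of this hand-off between steps.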

What Actually Happens During Execution?

So, whether you're using LangGraph or CrewAI, what's really going on behind the scenes when an agent runs? Let's break down the execution process:

The system gets an input state (for example, your emails).
The first agent or graph node reads this input and uses a Large Language Model (LLM) to make sense of it.
Based on its understanding, the agent decides on an action, like pulling out key events or calling a specific tool.
If needed, the agent might invoke tools (like a web search or a file reader) to get more context or perform external operations.
The result of that action is then passed to the next agent in the team (if it's a multi-agent setup) or returned directly to you.

Execution keeps going until:

The task is fully completed.
All agents have finished their assigned roles.
A stopping condition or a designated "END" point in the workflow is reached.

Think of this as a super-smart workflow engine where every single step involves reasoning, making decisions, and remembering previous interactions.

Are LLM Agents Safe? What to Know About Security and Privacy

As cool as LLM agents are, they raise an important question: can you really trust an AI to run parts of your workflow or interact with your data? It depends. If you’re using API services like those from OpenAI or Anthropic, your data is encrypted in transit and, under their default API policies at the time of writing, isn’t used for training. Policies can change, so check each provider's current terms.

But some data might still be temporarily logged to prevent abuse. That’s usually fine for testing and personal projects, but if you’re working with sensitive business info, customer data, or anything private, you’ll want to be careful.

Use anonymized inputs, avoid exposing full datasets, and consider running agents locally using open-source models like LLaMA or Mistral if full control matters to you.
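As a starting point for anonymizing inputs, here's a small sketch that masks email addresses and phone-like numbers before text is sent to a model. The patterns are deliberately simple assumptions for illustration and will miss many kinds of personal data, so treat this as a first filter rather than a complete solution.

```python
import re

# Simplified patterns, chosen for this example only
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def anonymize(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)

print(anonymize("Call Sarah at +1 555-123-4567 or mail sarah@example.com"))
```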

You can also set clear boundaries for your agents so they don’t overstep. Think of it like onboarding a new intern: you wouldn’t give them access to everything on day one.

Give agents only the tools and files they need, keep logs of what they do, and always review the results before letting them make real changes.
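Here's a minimal sketch of that "least privilege plus logging" idea: a wrapper that exposes only the tools you explicitly allow and logs every call. ToolBox and the two sample tools are hypothetical names for this example, and the tools themselves just return strings instead of touching real files.

```python
import logging
from typing import Callable, Dict

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent.tools")

class ToolBox:
    """Expose only explicitly allowed tools and log every call."""
    def __init__(self, tools: Dict[str, Callable[[str], str]], allowed: set):
        self._tools = {n: f for n, f in tools.items() if n in allowed}

    def call(self, name: str, arg: str) -> str:
        if name not in self._tools:
            log.warning("blocked tool call: %s", name)
            raise PermissionError(f"tool '{name}' is not allowed")
        log.info("tool call: %s(%r)", name, arg)
        return self._tools[name](arg)

all_tools = {
    "read_file": lambda path: f"(contents of {path})",
    "delete_file": lambda path: f"deleted {path}",
}
box = ToolBox(all_tools, allowed={"read_file"})
print(box.call("read_file", "notes.txt"))
```

With this setup, a request for delete_file raises a PermissionError instead of running, and the log keeps a record of the attempt.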

As this tech grows, more safety features are coming, like better sandboxing, memory limits, and role-based access. But for now, it’s smart to treat your agents like powerful helpers that still need some human supervision.

Troubleshooting & Tips

Sometimes, agents can be a bit quirky! Here are some common issues you might run into and how to fix them:

Issue	Suggested Fix
Agent seems to loop forever	Set a maximum number of iterations or define a clearer stopping point.
Output is too chatty or verbose	Use more specific prompts (for example, “Respond in bullet points only”).
Input is too long or gets cut off	Break down large pieces of content into smaller chunks and summarize them individually.
Agent runs too slowly	Try a faster model like gpt-3.5-turbo, or consider running a local model.

A handy tip: You can also add print() statements or logging messages inside your agent functions to see what's happening at each stage and debug state transitions.
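For the "input is too long" case in the table above, one simple approach is to split the text on line boundaries and summarize each piece separately. Here's a rough sketch: chunk_text is a hypothetical helper, and it counts characters rather than tokens, which is only an approximation of a model's real limits.

```python
def chunk_text(text: str, max_chars: int = 200) -> list:
    """Split text on line boundaries into chunks of roughly max_chars.

    A single line longer than max_chars becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks

long_input = "".join(f"{i}. Subject: Email number {i}\n" for i in range(1, 21))
parts = chunk_text(long_input, max_chars=120)
print(len(parts), "chunks")
```

You could then pass each chunk to your agent in turn and ask for one final summary of the partial summaries.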

Explore More Daily Automations

Once you've built one agent-based task, you'll find it incredibly easy to adapt the pattern for other automations. Here are some cool ideas to get your creative juices flowing:

Task Type	Example Automation
DevOps Assistant	Read system logs, detect potential issues, and suggest solutions.
Finance Tracker	Read bank statements or CSV files and summarize your spending habits/budgets.
Meeting Organizer	After a meeting, automatically extract action items and assign owners.
Inbox Cleaner	Automatically label, archive, and delete non-urgent emails.
Note Summarizer	Convert your daily notes into a neatly formatted to-do list or summary.
Link Checker	Extract URLs from documents and automatically test if they're still valid.
Resume Formatter	Score resumes against job descriptions and format them automatically.

Each of these can be built using the very same principles and frameworks we discussed, whether that's LangGraph or CrewAI.

What’s Next in Agent Technology?

LLM agents are evolving at lightning speed, and the next wave of innovation is already here:

Smarter memory systems: Expect agents to have better long-term memory, allowing them to learn over extended periods and remember past conversations and actions.
Multi-modal agents: Agents won't just handle text anymore! They'll be able to process and understand images, audio, and video, making them much more versatile.
Advanced planning frameworks: Techniques like ReAct, Toolformer, and AutoGen are constantly improving agents' ability to reason, plan, and reduce those pesky "hallucinations."
Edge deployment: Imagine agents running entirely offline on your local computer or device using lightweight models like LLaMA 3 or Mistral.

In the very near future, you'll see agents seamlessly integrated into:

Your DevOps pipelines
Big enterprise workflows
Everyday productivity tools
Mobile apps and smart devices
Games, simulations, and educational platforms

Final Summary

Alright, let's quickly recap all the cool stuff you've just learned and accomplished:

You've gotten a solid grasp of what LLM agents are and why they're so powerful.
You've seen how open-source frameworks like LangGraph and CrewAI make building agents much easier.
You've built a real LLM agent using LangGraph to automate a common daily task: summarizing your inbox!
You've explored the world of multi-agent collaboration with CrewAI, understanding how teams of AIs can work together.
You've learned how to take these principles and scale them to automate countless other tasks.

So, next time you find yourself stuck doing something repetitive, just ask yourself: "Hey, can I build an agent for that?" The answer is probably yes!

Resources Recap

Here are some helpful resources if you want to dive deeper into building LLM agents:

Resource	Link
LangGraph Docs	https://docs.langgraph.dev/
CrewAI GitHub	https://github.com/joaomdmoura/crewAI
LangChain Docs	https://docs.langchain.com/docs/
OpenAI API Docs	https://platform.openai.com/docs
Python 3.9+	https://www.python.org/downloads/
VSCode	https://code.visualstudio.com/
