freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More

How to Run a Docker Container in AWS Lambda

Agnes Olorundare — Wed, 24 Dec 2025 23:38:56 +0000

While containers are quite lightweight and provide various benefits, it can be challenging to decide how best to deploy them. There are a number of ways to deploy and run Docker containers. But some are best for orchestrating and managing containers, and may not suit a simple use case of running just one container.

In this article, I’ll teach you how you can deploy a single Docker container using a serverless service on AWS called Lambda.

Prerequisite/ Requirements
Serverless with AWS Lambda
How to Build, Run, and Test a Container Locally
How to Push Your Image to Amazon Elastic Container Registry (ECR)
How to Deploy Your Docker Image to Lambda
Cleanup
Conclusion

Prerequisite/ Requirements

The following tools and skills are necessary for following along with this tutorial:

Working knowledge of Docker, with Docker installed locally.
An AWS account and credentials with administrative privileges for making API calls via the CLI. In practice, it's better to limit the privileges to exactly what the task requires.
AWS CLI installed locally
Python virtual environment managers such as uv (optional)

Serverless with AWS Lambda

Containers provide a lightweight, consistent, and resource-friendly way of running applications. Serverless takes away the overhead of managing the underlying infrastructure on which the container runs. Combining the two lets you deploy applications in a way that keeps your focus on business logic, performance, and whatever gives your product a competitive edge.

One AWS tool that enables you to go serverless is Lambda. With Lambda, you’re only billed for the number of times the code in the function runs, the memory you selected at the time of provisioning the service, and the duration of each invocation of the function.

In addition to removing operational overhead, Lambda can also help you save money since you won’t have to deal with idle resources. The function only comes alive when triggered by a request sent to it.
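To make the billing model concrete, here is a minimal Python sketch of how the three factors combine. The function name and both rates are placeholder assumptions chosen for easy arithmetic, not actual AWS prices, so check the Lambda pricing page for the rates in your region.

```python
def estimate_lambda_cost(
    invocations: int,
    avg_duration_ms: float,
    memory_mb: int,
    price_per_request: float,
    price_per_gb_second: float,
) -> float:
    """Estimate a bill from request count, memory size, and duration.

    Both price arguments are supplied by the caller; no real AWS rates
    are hard-coded here.
    """
    # Compute charge is based on GB-seconds: memory (GB) x duration (s).
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_charge = invocations * price_per_request
    compute_charge = gb_seconds * price_per_gb_second
    return request_charge + compute_charge


# Placeholder rates chosen only to make the arithmetic easy to follow.
cost = estimate_lambda_cost(
    invocations=1_000,
    avg_duration_ms=500,
    memory_mb=512,
    price_per_request=0.001,
    price_per_gb_second=0.01,
)
print(f"Estimated cost: {cost:.2f}")
```

With these placeholder numbers, 1,000 invocations at half a second and 512 MB come to 250 GB-seconds. Note that nothing is charged for time when the function is not running.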

How to Build, Run, and Test a Container Locally

Docker is a tool that helps you package applications or software into portable, standardized and shareable units that have everything the applications need such as libraries, runtime, system tools, application code, in order to run. These units are called containers.

In this section, I’ll walk you through building the Docker image, running the container, and testing it after it’s running.

You can find the project that you’ll be using here in this GitHub repository.

Build the Docker Image

To run a Docker container, you first need to build an image. The image becomes the template or class from which you create the container or instance of the class.

You can find the application code that goes into the image in lambda_function.py.

# lambda_function.py

def lambda_handler(event, context):
    try:
        # Read the key inside the try block so a missing "name" is handled
        name = event["name"]
        message = f"Hello, {name}!"

        return {
            "statusCode": 200,
            "body": message
        }
    except Exception as e:
        return {
            "statusCode": 400,
            "body": {"error": str(e)}
        }

As you can see from the code above, this is a very basic Python application that expects a POST HTTP request, with a JSON payload that contains the key – name – and a corresponding value. The code then returns a greeting containing the name it has received. The application has just a single function, which also serves as the entry point to it.
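Because the handler is a plain Python function, you can sanity-check it without Docker by calling it with a dictionary. The sketch below uses a variant of the handler (with the key lookup inside the try block) so that it runs on its own. Passing None as the context is an assumption that only works because this handler never reads it.

```python
def lambda_handler(event, context):
    # Greeting logic in the style of lambda_function.py, repeated so this runs alone.
    try:
        name = event["name"]
        return {"statusCode": 200, "body": f"Hello, {name}!"}
    except KeyError as e:
        return {"statusCode": 400, "body": {"error": f"missing key: {e}"}}


# A well-formed event returns the greeting.
ok = lambda_handler({"name": "Janet"}, None)
print(ok)

# An event without "name" returns the error response instead of crashing.
bad = lambda_handler({}, None)
print(bad)
```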

To build a Docker image, you’ll need a Dockerfile to provide the blueprint for the image. For this specific case, the Dockerfile you’ll use is also very basic. Each instruction in a Dockerfile (referred to here as a directive) tells Docker what to do when creating an image. So building a Docker image means creating a template for a container by following the directives in the Dockerfile.

# Dockerfile

FROM public.ecr.aws/lambda/python:3.12

# Copy function code... LAMBDA_TASK_ROOT is /var/task, the working directory set in the base image
COPY lambda_function.py ${LAMBDA_TASK_ROOT}    

# Set the CMD to your handler - lambda_handler
CMD ["lambda_function.lambda_handler"]

A Dockerfile usually starts with a base image. To deploy an application as a Docker container in AWS Lambda, the base image has to be of a specific kind, depending on the application run-time. For this case, you’ll need the Python run-time, so the base image is public.ecr.aws/lambda/python:3.12. It’s okay to use a different Python version.

The next directive in the Dockerfile is copying the lambda_function.py file to a specific path in the base image. That path is referenced using an environment variable that has already been defined in the base image and points to /var/task. This is the directory your code will be running from.

The last directive, CMD, names the handler in the form module.function. This tells the Lambda runtime in the base image which function to call when the container receives an invocation.
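To see why the CMD value is written as a module name and a function name joined by a dot, here is a rough Python sketch of resolving such a string to a callable. It illustrates the idea only and is not the actual Lambda runtime code. The resolve_handler helper is hypothetical, and json.dumps stands in for lambda_function.lambda_handler so the example runs anywhere.

```python
import importlib


def resolve_handler(handler_spec: str):
    """Turn a "module.function" string into the function it names."""
    # Split on the last dot: everything before it is the module path.
    module_name, function_name = handler_spec.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# json.dumps is a stand-in for "lambda_function.lambda_handler".
handler = resolve_handler("json.dumps")
print(handler({"name": "Janet"}))
```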

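Now, you can run the build command from the project’s root directory, replacing the placeholders with your image name and tag (the rest of this tutorial uses lambda-docker:1.0.0):

docker build -t <image-name>:<tag> .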

Run the Docker Container

Next, let’s create a running container from this image.

docker run -it --rm -p 8080:8080 lambda-docker:1.0.0

The command above will create a container and run it in interactive mode just so you can see the logs generated by the application in the container. Port 8080 is also exposed on the host where the container is running and mapped to the container port, which is also 8080 (defined by AWS). The container gets automatically removed once you kill the running process with CTRL + C.

Test the Running Container

Now confirm that the application running within the container can receive and process requests. To do this, use the code in the test.py file:

# test.py

import requests

url = "http://localhost:8080/2015-03-31/functions/function/invocations"

data = {
    "name": "Janet"
}

response = requests.post(url, json=data)

print("Status Code:", response.status_code)
print("Response Body:", response.json())

You can use the Python requests library to make this call. Install the library by using a virtual environment to isolate the application from your overall system. This helps prevent issues with conflicts in the versions of libraries you install for an application to use.

If you’re using uv to manage your virtual environment, simply run the command:

uv add requests

Then run the code in test.py from within the virtual environment:

uv run python3 test.py

You should see the desired response on the terminal.
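If you would rather not install anything, the same request can be sketched with only the standard library. The helper names below are hypothetical, and the host and port defaults assume the docker run command shown earlier. The URL path is the one used in test.py.

```python
import json
import urllib.request

INVOKE_PATH = "/2015-03-31/functions/function/invocations"


def build_invoke_request(payload: dict, host: str = "localhost", port: int = 8080):
    """Build a POST request for the locally running container."""
    return urllib.request.Request(
        url=f"http://{host}:{port}{INVOKE_PATH}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke_local(payload: dict, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response (container must be running)."""
    with urllib.request.urlopen(build_invoke_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network, so this runs anywhere.
request = build_invoke_request({"name": "Janet"})
print(request.get_method(), request.full_url)
```

With the container running, calling invoke_local({"name": "Janet"}) should return the handler's response as a dictionary.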

How to Push Your Image to Amazon Elastic Container Registry (ECR)

Now that you have a working Docker image to deploy to Lambda, the next step is to push the image to a Docker registry. For this use case, your image has to be pushed to Amazon ECR, a container registry for storing Docker images.

To push your Docker image, you first need to tag the image, which simply means naming the image in a specific way.

Currently, this image tag is lambda-docker:1.0.0. To tag it the AWS way, first create an ECR repository. Let’s use the AWS CLI for this (this requires you to configure the AWS credentials locally by running the aws configure command and providing your credentials).

Set Up Environment Variables

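# Set AWS profile (replace the placeholder with your CLI profile name)
export AWS_PROFILE=<your-profile-name>

# Set other variables (replace the placeholder with your AWS region)
AWS_REGION=<your-region>
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
REPO_NAME=lambda-docker
TAG=1.0.0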

The above commands set the AWS_PROFILE for the CLI to target the right AWS account for API calls. The other variables specify the region, account ID, and the ECR repository name and tag.

Create ECR Repository and Authenticate

Now, create the ECR repository:

aws ecr create-repository \
  --repository-name "$REPO_NAME" \
  --region "$AWS_REGION"

Authenticate to Amazon ECR:

aws ecr get-login-password --region "$AWS_REGION" \
  | docker login \
  --username AWS \
  --password-stdin "$ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com"

Tag and Push the Docker Image

Now, tag the Docker image:

docker tag $REPO_NAME:$TAG \
  $ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$REPO_NAME:$TAG

Push the image to the ECR repository you created:

docker push $ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/$REPO_NAME:$TAG
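The tag and push commands both rely on the same naming pattern for the image URI. As a small illustration, this Python sketch assembles that string. The ecr_image_uri helper is hypothetical, and the account ID and region in the example are placeholders, not real values.

```python
def ecr_image_uri(account_id: str, region: str, repo_name: str, tag: str) -> str:
    """Build the registry/repository:tag string used by docker tag and docker push."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repo_name}:{tag}"


# Placeholder account ID and region; substitute your own.
uri = ecr_image_uri("123456789012", "us-east-1", "lambda-docker", "1.0.0")
print(uri)
```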

And that’s it! Your image is now in ECR.

How to Deploy Your Docker Image to Lambda

With your image now in ECR, you can create a Lambda function. Navigate to the Lambda console, and click Create function.

Create Lambda Function

Select Container image, give the function a name, and browse to the ECR repository you created.

Next, select the image you pushed.

Leave the other settings at their defaults and click Create function.

Navigate to the function after creating.

Test Deployment

Now, let’s test the deployment. For this, simply use the existing Lambda Test tab. Provide all the details needed, including the payload for your POST request.
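You can also invoke the deployed function from code instead of the console. The following is a sketch using boto3, which you would need to install separately. The function name you pass in is whatever you chose when creating the function, and the helper names are hypothetical.

```python
import json


def build_payload(name: str) -> bytes:
    """Encode the test event the handler expects."""
    return json.dumps({"name": name}).encode("utf-8")


def invoke_deployed(function_name: str, name: str) -> dict:
    """Invoke the Lambda function synchronously and return its decoded response."""
    import boto3  # imported here so the payload helper works without boto3

    client = boto3.client("lambda")
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=build_payload(name),
    )
    # The response payload is a byte stream containing the handler's return value.
    return json.loads(response["Payload"].read())


print(build_payload("Janet"))
```

Calling invoke_deployed with your function name requires the same AWS credentials and region configuration used for the CLI commands earlier.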

And that’s it. You’ve successfully deployed a Docker container on AWS by leveraging ECR and Lambda. You can go a step forward by integrating API Gateway and making the function accessible from the internet.

Cleanup

Remember to delete the resources you created (the ECR repository and the Lambda function) to avoid extra charges.

Conclusion

Deploying your Docker container on AWS Lambda is an efficient way to get your application running quickly without being bothered by managing servers or platforms.

Thanks for reading!

How to Prepare for Technical Job Interviews – Based on My Experience Landing a Job

Ilyas Seisov — Wed, 24 Dec 2025 17:33:14 +0000

Hi, I’m Ilyas. I’m a web developer, and this is my story about how I struggled with interviews for a long time and what finally helped me break through. I’ll talk about what failing basic interview questions taught me about recall, preparation, and smarter job searching.

If you’re a junior, mid-level, or self-taught developer who keeps getting rejected and you don’t fully understand why, I hope this helps.

Here’s what I’ll cover:

My 18-Month Job Search Struggle
The Interview Problem I Didn’t Expect
Discovering Active Recall and Flashcards
My Interview Preparation System
The Results
Changing How I Looked for Jobs
Turning My System Into a Small Tool
Lessons I Learned
Final Thoughts

My 18-Month Job Search Struggle

For 18 months, I was trying to land a remote or relocation web developer job.

During that time:

I applied to more than 1,000 positions
I went through around 20–30 interviews
I failed most of them

It was exhausting. I felt like I was putting in a lot of effort but getting almost no results. Over time, I started doubting my skills and wondering whether I would ever find a job I’d actually be satisfied with.

What made this even more confusing was that a few years earlier, in 2021, I had found a remote job at a US company in just three weeks – with almost no experience.

Something clearly wasn’t working anymore.

The Interview Problem I Didn’t Expect

After dozens of interviews, I noticed a pattern: I wasn’t failing because I couldn’t solve complex algorithm problems or build features under pressure. I was failing on basic technical questions.

Questions like:

“What are portals in React?”
“Can you explain how an HTTP GET request works?”

These were not hard questions. They were things I had learned before. But during interviews, under pressure, I just couldn’t recall the answers. Or I had simply skipped those topics during preparation because I had no system in place.

That’s when I realized the real issue: I didn’t have a problem understanding concepts. I had a problem recalling them quickly.

My first instinct was to study more. More tutorials, more articles, more videos.

But passive learning didn’t fix the problem. I still froze during interviews. What I actually needed was a way to train my memory, not just consume information.

Discovering Active Recall and Flashcards

That’s when I came across flashcards and the concept of active recall.

Active recall means testing yourself repeatedly on what you’ve learned instead of just rereading material. You try to answer a question from memory first, then check the answer. This approach has been backed by research for more than a century.

I started practicing small, specific concepts this way, like:

React fundamentals
JavaScript basics
HTTP methods
Browser behavior

I repeated them until recalling the answer felt automatic.

This made a huge difference during interviews.

Flashcards help you cut through the noise and actually learn what matters. It's not just about memorizing facts – it's about really understanding, remembering fast, and building a solid base in every concept you study.

So to help you prepare for your interviews, I’ve taken years of experience and scientific learning methods and turned them into a tool and approach that gives you the right info at the right time.

My Interview Preparation System

Once I found the right learning method, I built a simple system around it.

Step 1: Ask What to Prepare For

Instead of guessing what to study, I started asking recruiters directly:

“What topics should I prepare for the technical interview?”

Surprisingly, many of them replied with a clear list, which helped me focus only on what actually mattered and avoid over-preparing random topics. In my experience, many HR reps are quite helpful to job applicants.

For example, when I applied for a position as a Frontend Web Developer in React, the HR specialist advised me to focus mainly on React and JavaScript. So I prepared for all the popular questions around hoisting (JS), the event loop (JS), how React works under the hood, what props are and how they work, and so on.

Overall, that interview went well – but when I got a question on React Portals, I couldn’t explain it properly. And so I didn’t get the position. But I don’t blame myself for this one, as that’s a very rare topic. 😊

I also applied for another Front End Developer role where the HR specialist advised me to prepare mainly for questions about GSAP, Framer Motion, and React/Next.js. This made sense, as the company mainly builds modern animated websites.

In my interview, the theory round went well, but I failed the take-home assignment. I realized then that I didn’t have enough skills in these areas.

At another company, I asked HR about the cultural interview, which was the last round. The rep said: “No worries, all the hard work is done from your side. Prep for just a human dialog.”

And for the last application I submitted (the one that actually led to a job offer), the HR specialist told me to prepare thoroughly for CSS – especially Flexbox and Grid. This made sense, as the position was for an HTML markup developer. And so I practiced all the ins and outs of these topics, even the rarer ones.

I use the same approach for each round of interviews.

Step 2: Use Flashcards (With AI Carefully)

I used ChatGPT to generate flashcards for each topic and reviewed them daily.

One important thing I learned: AI can be wrong sometimes. To reduce mistakes, I started adding links to official documentation in my prompts so the answers were grounded in reliable sources.

I kept sessions short and consistent. That consistency mattered more than long study sessions.

AI mistakes were the reason I created 99cards.dev.

Here is the prompt I use in ChatGPT:

You are a web development expert with 20 years of experience. Your task is to help me to prepare for the interview.

Prepare 10 flashcards on CSS Flexbox topics. Format one question with four answers. One answer is correct.

You're going to serve all the questions one by one. After I answer, you give me feedback and then give me the next question.

Note that you should tweak your prompts for your needs, and based on what you need to review.

You can experiment with various factors, such as:

Difficulty: beginner or advanced
Specificity: from vague (for example: I want to practice with CSS) to highly specific (for example: I want to practice with the flex property in CSS Flexbox)
Number of questions: sweet spot is between 10 and 20
Add context: good practice is to add links to official docs, as it decreases the chances of AI hallucination
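If you find yourself tweaking these factors often, a small script can assemble the prompt for you. This is only a sketch: the function name and parameters are hypothetical, the wording is adapted from the prompt above, and the documentation link is just an example of the kind of source you might add.

```python
def build_flashcard_prompt(
    topic: str,
    count: int = 10,
    difficulty: str = "beginner",
    doc_urls: tuple[str, ...] = (),
) -> str:
    """Assemble a flashcard prompt from the factors listed above."""
    lines = [
        "You are a web development expert with 20 years of experience. "
        "Your task is to help me prepare for the interview.",
        f"Prepare {count} {difficulty}-level flashcards on {topic}. "
        "Format one question with four answers. One answer is correct.",
        "Serve the questions one by one. After I answer, give me feedback "
        "and then give me the next question.",
    ]
    if doc_urls:
        # Pointing the model at official docs lowers the chance of wrong answers.
        lines.append("Base your answers on these sources: " + ", ".join(doc_urls))
    return "\n\n".join(lines)


prompt = build_flashcard_prompt(
    "CSS Flexbox",
    count=10,
    difficulty="advanced",
    doc_urls=("https://developer.mozilla.org/en-US/docs/Web/CSS/flex",),
)
print(prompt)
```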

Here is a typical flashcard created by ChatGPT:

If you provide an answer, you’ll get feedback like this:

The Results

After a few weeks, interviews felt very different.

I was calmer. I answered basic questions without panicking. I could explain concepts clearly and confidently.

In my final interview process, I passed four rounds in a row and scored 95% on the technical test.

Soon after, I received an offer: $5,500 per month and a paid relocation package for my family and me.

For the first time in a long while, my effort finally matched the results.

Changing How I Looked for Jobs

About six weeks before getting the offer, I also changed where I searched for jobs.

Instead of relying only on large job platforms, I started using smaller communities like Telegram job groups.

This helped for two reasons:

Less competition: many smaller companies post roles there with fewer applicants
Direct communication: I could message recruiters before applying

Before submitting an application, I would ask:

“I saw this position. Here’s my CV and LinkedIn. Am I a good fit?”

If the answer was yes, I applied. If not, I moved on immediately.

This saved me a lot of time and energy.

Turning My System Into a Small Tool

While preparing for interviews, I created thousands of flashcards for myself. Managing them in notes became difficult, so I eventually turned them into a small tool called 99cards.dev.

It’s simply a collection of fact-checked web development flashcards grouped by topic, based on the same approach that helped me stop failing basic interview questions.

Here are some screenshots from the app:

Lessons I Learned

Here are a few takeaways from this experience:

Failing interviews doesn’t always mean you lack skills
Passive learning is not enough for interview prep
Being able to recall basics quickly matters a lot
Job searching is a skill, not just a numbers game
Consistency beats cramming every time

Final Thoughts

If you’re struggling with interviews right now, especially as a junior, mid-level, or self-taught developer, don’t assume you’re bad at what you do.

In my case, the problem wasn’t effort or talent. It was preparation and approach.

I also created a free interview checklist based on my experience, covering HR, technical, behavioral, system design, live coding, take-home tasks, algorithms, and cultural fit.

I hope this story saves you some time and stress.

You’re often just one good interview away.
— Ilyas

Christmas gifts for you from the freeCodeCamp community: Learn Python, SQL, Spanish, and more

Quincy Larson — Tue, 23 Dec 2025 22:46:00 +0000

2025 has been an amazing year for the global freeCodeCamp community. And we’re thrilled to cap it off with the launch of several Christmas Gifts for you:

freeCodeCamp's Python certification
freeCodeCamp's JavaScript certification (Version 10)
freeCodeCamp's Responsive Web Design Certification (Version 10)
freeCodeCamp's Relational Database + SQL Certification
Our A2 level English for Developers Certification
Our B1 level English for Developers Certification
Our beta A1 level Spanish curriculum
Our beta A1 level Mandarin Chinese curriculum

Those are a lot of gifts to unwrap, so let's start unwrapping!

Programming Certifications and Version 10 of the Full Stack Development Curriculum

Over the past 11 years, the freeCodeCamp community has built and rebuilt our core programming curriculum several times.

We are finally approaching our vision of how comprehensive and interactive a programming curriculum can be.

Version 10 of our curriculum is a series of 6 certifications – each with more than a dozen projects that you'll build to solidify your fundamental skills.

At the end of each certification, you'll take a final exam. And if you can manage to pass this exam, you'll be awarded a free, verified certification. You can then embed that on LinkedIn, or add it to your résumé, CV, or portfolio website.

So far, 4 of these certifications are now live:

And we will release the Front End Libraries and Back End Development certifications in 2026.

After earning all 6 certifications, you can build a final capstone project – which will be code-reviewed by an experienced developer. Then you’ll sit for a comprehensive final exam. And upon completion of that, you'll earn our final Full Stack Developer Certification.

If you start progressing through these first four certifications today, the last two certifications should go live well before you reach them. After all, each of them represents hundreds of hours of conceptual computer science knowledge and hands-on programming practice.

Language Coursework

First, you may be asking: when did freeCodeCamp start teaching world languages?

Well, we started designing our English for Developers curriculum back in 2022. And over the past few years, we've expanded it considerably.

The curriculum involves interacting with hand-drawn animated characters. Along the way, you get tons of practice with reading, writing, listening, and (coming in 2026) speaking.

It's a story-driven curriculum. You step into the shoes of a developer who's just arrived in California to work at a tech startup. You learn grammar, vocab, tech jargon, and slang through day-to-day interactions while living your new life.

So far, two of these certifications are fully live:

We're also developing levels A1, B2, C1, and C2 for release over the coming years. (Yes, years. Each of these is a huge undertaking to develop.)

Not only has the freeCodeCamp community designed thousands of English lessons - we also built tons of custom software tools to make all this coursework possible. So in 2024, we asked: could we use the same tools to teach people Spanish and Mandarin Chinese?

And today, the results of this effort are now in public beta. We're starting out with A1 Level for both of these languages, and will ship the remaining levels over the coming years.

Why Teach Spanish and Mandarin?

Aside from English, Spanish and Mandarin are two of the most widely spoken languages in the world. You can use these languages to participate in tons of online communities, visit major cities, and even find new job opportunities.

Learning foreign languages is also excellent for your neuroplasticity, and can be done alongside learning other new skills like programming.

And now you can learn these languages for free, using our comprehensive end-to-end curriculum that was designed by teachers, translators, and native speakers.

Update on Translating freeCodeCamp’s coursework into major world languages

As you may know, freeCodeCamp has been available in many major world languages going back to 2020. But whenever we launch new coursework, it takes several months to translate everything.

Thankfully, machine translation has been steadily improving over the past few years.

The community is still translating tutorials and books by hand, but for something that changes as quickly as freeCodeCamp’s programming curriculum, we want to speed up the process.

We’ve conducted pilots of translating all the new coursework into both Spanish and Portuguese.

First, we used frontier Large Language Models and extensive glossaries and style guides to process the hundreds of thousands of words in our programming curriculum.
Then we had native speakers randomly sample these translations to ensure their quality.
Once we felt the translations were strong enough, we started creating data pipelines to automatically update translations as the original English text changed through open source code contributions.
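The random sampling step can be illustrated with a few lines of Python. This is only a sketch of the general idea, not freeCodeCamp's actual pipeline. The function name, the data shape, and the sample size are all assumptions, and the lesson IDs are made up.

```python
import random


def sample_for_review(translations: dict[str, str], k: int, seed: int = 0) -> list[str]:
    """Pick k translation keys at random for a native speaker to check."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    keys = sorted(translations)  # sort first so dict order does not matter
    return rng.sample(keys, min(k, len(keys)))


# Hypothetical lesson IDs mapped to translated strings.
translations = {
    "lesson-1": "Hola, mundo",
    "lesson-2": "Variables y tipos",
    "lesson-3": "Bucles",
    "lesson-4": "Funciones",
}
picked = sample_for_review(translations, k=2)
print(picked)
```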

The monetary cost of doing all this is not significant. So we should be able to offer freeCodeCamp’s programming curriculum in additional languages we weren’t previously able to support, such as Arabic and French.

If you are one of the hundreds of people who’ve contributed translations to freeCodeCamp over the years, we’d still welcome your help translating books and tutorials, which don’t change much after initial publication.

After all, the gold standard for localizing a document is having a single human translator holistically read and understand that document before creating the translation.

This community is just getting started.

This year the freeCodeCamp community also published:

129 free video courses on the freeCodeCamp community YouTube channel
45 free full length books and handbooks on the freeCodeCamp community publication
452 programming tutorials and articles on math, programming, and computer science
50 episodes of the freeCodeCamp podcast where I interview developers, many of whom are contributors to open source freeCodeCamp projects

We also merged 4,279 commits to freeCodeCamp’s open source learning platform, representing tons of improvements to user experience and accessibility. And we published our secure exam environment so that campers can take certification exams.

You can view our 2025 list of Top Open Source Contributors.

As a community, we are just getting started. Free open source education has never been more relevant than it is today.

We invite you to get more involved in the community, too.

I want to thank the 10,221 kind folks who donate to support our charity and our mission each month. Please consider joining them: Donate to freeCodeCamp.org.

And here are some other ways you can make a year-end donation that you can deduct from your US taxes.

freeCodeCamp has a vibrant global community of ambitious people who are learning new skills and preparing for the next stage of their career. I encourage you to join the freeCodeCamp Discord and hang out with us there.

And take Naomi’s freeCodeCamp Community Survey to help us understand what you like about freeCodeCamp and what our community can do even better.

On behalf of the global freeCodeCamp community, here’s wishing you and your family a fantastic finale to your 2025. Cheers to a fun, ambition-filled 2026.

How to Use GenUI in Flutter to Build Dynamic, AI-Driven Interfaces

Atuoha Anthony — Tue, 23 Dec 2025 16:58:51 +0000

In standard app development, the User Interface (UI) is static. You write code for a button, compile it, and it remains a button forever. GenUI flips this model on its head.

With GenUI, Google’s Generative UI SDK, your application's interface becomes dynamic. You don’t hard-code widget trees. Instead, you provide an AI agent, such as Google’s Gemini, with a "kit" of UI components called a Catalog and a goal. The AI then generates the UI in real time, deciding whether to display a slider, a text field, or a complex card based on the user’s needs at that moment.

This guide takes you from zero to a fully functional AI-powered Christmas Card Generator that does more than generate text. It also generates the actual Flutter widgets to display them.

Your Christmas Holiday Card Maker will use Generative UI and AI to create personalized, high-quality Christmas cards instantly. Users provide simple inputs such as the recipient’s name, relationship, and preferred color theme, and the AI dynamically produces a festive, polished card UI complete with heartfelt copy, seasonal styling, and structured layout.

By combining Generative UI’s reactive data model with custom catalog widgets, this project will show you how you can guide AI to produce consistent, production-ready user interfaces rather than loosely assembled components.

It’s important to note that the GenUI package is currently in Alpha and is highly experimental. Because it’s in the early stages of development, here is what you should keep in mind:

API Stability: The classes, method signatures, and overall architecture described in this guide are likely to change as the Flutter team gathers feedback from the community.
Safety and Guardrails: Since the UI is generated by an LLM, there is always a non-zero chance of "hallucinations" where the AI might attempt to use widgets or properties that don't exist in your catalog.
Production Readiness: While GenUI is incredibly exciting for prototyping and internal tools, it requires robust error handling and fallback UIs to ensure a seamless user experience if the AI service is unavailable or returns an invalid structure.

As you work through this guide, GenUI should be understood as a collaborative system rather than an autonomous one. You’re still responsible for defining the Catalog the AI can use, reviewing how those components are assembled, and testing the resulting interface in real scenarios.

This guide demonstrates GenUI in a guided setup, where Flutter provides structure and constraints, and the AI operates within them to dynamically assemble UI. The goal is not to remove developer judgment, but to shift it from hand-writing widget trees to designing, shaping, and validating the system that produces them.

Prerequisites
The Mental Model: How GenUI Thinks
Mapping GenUI Components to the Christmas Card App
Why This Architecture Works
Project Overview: What We’re Building
Project Structure
Building the View
Adding Your Own Widgets to the GenUI Catalog
Screenshots:
Final Thoughts
References

Prerequisites

To follow this guide effectively, you need:

Flutter Development Environment: Flutter SDK installed (stable channel recommended) and an IDE like VS Code or Android Studio configured.
Basic Flutter knowledge: You should understand how Widgets compose (Rows, Columns, Containers) and basic state management (setState or FutureBuilder).
Google AI Studio API key: We will be using Google's Gemini model. You’ll need to get a free API key from Google AI Studio.

The Mental Model: How GenUI Thinks

Before writing any code, it’s important to understand how GenUI conceptually sees your app. GenUI doesn’t think in terms of widget trees or screens. It thinks in terms of surfaces, state, and conversations.

A surface is simply a place where AI-generated UI can appear. A conversation controls how those surfaces evolve over time. The data model holds the truth, and messages move everything forward.

Here’s the full flow in one pass:

User Action
   |
   v
GenUiConversation
   |
   v
ContentGenerator (AI)
   |
   v
A2uiMessage stream
   |
   v
GenUiManager
   |
   v
DataModel + UI Surfaces
   |
   v
GenUiSurface (Flutter rebuild)

Nothing in this flow bypasses Flutter. GenUI does not render UI “outside” Flutter – it only decides what Flutter should render.

Mapping GenUI Components to the Christmas Card App

Now let’s ground this in the Christmas card generator we’ll be building. This is where GenUI really clicks.

1. GenUiConversation in the Christmas Card App

In the project we’ll be building, GenUiConversation represents the ongoing interaction between the user and the Christmas card generator.

When the user types a loved one’s name, selects a relationship, chooses a color, and taps Generate Card, your app sends that prompt through GenUiConversation.

At that moment, GenUiConversation already knows the conversation history. It knows whether this is the first card being generated or whether the user is regenerating a card with a different message. This context is what allows the AI to create unique cards for each person instead of repeating generic output.

Without GenUiConversation, every request would be stateless. With it, the app feels intentional and personal.

2. Catalog as the Design Constraint

In the Christmas card app, the Catalog defines the visual language of your cards.

You might allow the AI to use text widgets for greetings, image widgets for festive backgrounds, container widgets for layout, and buttons for regeneration or sharing. What matters is that the AI cannot escape these constraints.

This is how you ensure that:

Cards always look like cards
The AI does not invent unsupported UI
Your app remains visually consistent

From the AI’s perspective, the catalog is the only toolbox it’s allowed to reach into. From your perspective, it’s the safety net that keeps the UI Flutter-native and predictable.
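The idea of a catalog as a hard constraint can be sketched outside Flutter. The Python below is a conceptual illustration only: it does not use the GenUI API (which is Dart), and the component names and the shape of the tree are assumptions made for the example.

```python
# Hypothetical set of component types the AI is allowed to use.
ALLOWED_COMPONENTS = {"Text", "Image", "Container", "Button"}


def find_unsupported(node: dict, allowed: set[str]) -> list[str]:
    """Walk a proposed UI tree and collect component types not in the catalog."""
    unsupported = []
    if node["type"] not in allowed:
        unsupported.append(node["type"])
    for child in node.get("children", []):
        unsupported.extend(find_unsupported(child, allowed))
    return unsupported


# A proposed card that includes one component the catalog does not define.
proposed = {
    "type": "Container",
    "children": [
        {"type": "Text"},
        {"type": "SnowfallAnimation"},
        {"type": "Button"},
    ],
}
rejected = find_unsupported(proposed, ALLOWED_COMPONENTS)
print(rejected)
```

A host app could use a check like this to fall back to a default card whenever the list is not empty.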

3. DataModel as the Heart of Personalization

The DataModel is where personalization actually lives.

In the project we’ll be building, values like the recipient’s name, the greeting message, the card theme, or even animation flags live in the data model. When the user edits the name or regenerates the card, only the parts of the UI bound to those values change.

This is why GenUI feels dynamic without being inefficient. You aren’t rebuilding the entire card screen. You’re only updating what depends on the changed data.

This also means the AI doesn’t need to recreate the whole UI every time. It can simply update the data model and let Flutter do what it does best.
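The "only what depends on the changed data" behavior can be sketched in a few lines of plain Dart. MiniDataModel, its path strings, and the Listener typedef are hypothetical and exist only for this illustration. GenUI's real DataModel has its own API.

```dart
typedef Listener = void Function(Object? value);

// Hypothetical stand-in for a data model: values live under string
// paths, and listeners subscribe to one path at a time.
class MiniDataModel {
  final Map<String, Object?> _values = {};
  final Map<String, List<Listener>> _listeners = {};

  Object? read(String path) => _values[path];

  void listen(String path, Listener listener) {
    _listeners.putIfAbsent(path, () => []).add(listener);
  }

  // Writes a value and notifies only the listeners bound to that path.
  // Writing an unchanged value notifies nobody.
  void write(String path, Object? value) {
    if (_values.containsKey(path) && _values[path] == value) return;
    _values[path] = value;
    for (final listener in _listeners[path] ?? const <Listener>[]) {
      listener(value);
    }
  }
}
```

If one listener is bound to the recipient's name and another to the greeting, writing a new name triggers only the first. That is the same reason a name edit does not force the whole card to rebuild.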

4. ContentGenerator as the AI Gateway

The ContentGenerator is the only part of your app that knows how to talk to the AI.

In the Christmas card example, this component sends the user’s request to the model along with system instructions like “Generate a festive Christmas card UI using the available widgets.” It then listens as the AI responds.

Because the responses arrive as streams, the UI can begin rendering as soon as the first instructions arrive. This is especially useful if you later add animations or progressive reveals to your cards.

From a design standpoint, this separation is critical. Your Flutter app never depends directly on the AI SDK. It depends on GenUI, and GenUI depends on the ContentGenerator.

5. A2uiMessage as Intent, Not UI

This is one of the most important concepts to internalize: when the AI decides to generate a Christmas card, it doesn’t send Flutter widgets. Rather, it sends A2uiMessage instructions.

One message might say “start rendering a new surface.” Another might say “update the greeting text in the data model.” Another might say “replace the background image.”

These messages are processed by the GenUiManager, which translates intent into actual UI changes. This extra layer is what prevents GenUI from becoming fragile or unpredictable.
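Here is a minimal plain-Dart sketch of "intent, not UI". Every name in it (UiMessage, BeginSurface, SetData, DeleteSurface, MiniManager, replay) is a hypothetical stand-in rather than one of GenUI's real classes. It shows messages as small data objects and a single manager that turns them into state changes.

```dart
// Hypothetical message types: the AI sends intent, never widgets.
sealed class UiMessage {
  const UiMessage();
}

class BeginSurface extends UiMessage {
  const BeginSurface(this.surfaceId);
  final String surfaceId;
}

class SetData extends UiMessage {
  const SetData(this.path, this.value);
  final String path;
  final Object? value;
}

class DeleteSurface extends UiMessage {
  const DeleteSurface(this.surfaceId);
  final String surfaceId;
}

// Hypothetical manager: the single place where messages become state.
class MiniManager {
  final Map<String, Object?> dataModel = {};
  final Set<String> surfaces = {};

  void handle(UiMessage message) {
    // The switch is exhaustive because UiMessage is sealed.
    switch (message) {
      case BeginSurface(:final surfaceId):
        surfaces.add(surfaceId);
      case SetData(:final path, :final value):
        dataModel[path] = value;
      case DeleteSurface(:final surfaceId):
        surfaces.remove(surfaceId);
    }
  }
}

// Applies messages in order and returns the resulting manager.
MiniManager replay(Iterable<UiMessage> messages) {
  final manager = MiniManager();
  for (final message in messages) {
    manager.handle(message);
  }
  return manager;
}
```

Because the sealed class makes the switch exhaustive, adding a new message type without handling it becomes a compile-time error. That is one way such a layer can stay predictable as it grows.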

Why This Architecture Works

What makes GenUI powerful is not that it uses AI. Plenty of tools do that. What makes it powerful is that AI never breaks Flutter’s rules, because the state is centralized, rendering is controlled, events are explicit, and updates are incremental.

In the Christmas card app, this means every card feels custom, every interaction feels responsive, and your app remains maintainable even as the AI logic grows more complex.

Once you understand this flow, you stop thinking of GenUI as “AI generating UI” and start thinking of it as AI participating in your app’s state machine.

Project Overview: What We’re Building

In this tutorial, we’ll build a Christmas Card Generator using Flutter and GenUI. The idea is simple: a user types a name, selects a relationship and a card color, and the AI dynamically generates a Flutter widget tree that represents a personalized Christmas card.

This project demonstrates three core GenUI ideas working together: the conversation loop, AI-driven UI rendering, and reactive state updates without manual widget wiring.

By the end, you’ll understand not just how to use GenUI, but how to structure a real Flutter app around it.

Project Structure

We’ll keep the structure intentionally simple so it’s easy to follow and extend later.

lib/
 ├── extensions/
 │    └── loading.dart                      // showLoader()/hideLoader() helpers
 ├── screen/
 │    ├── components/
 │    │    ├── color_picker_list.dart       // Widget for color selection
 │    │    ├── custom_input_section.dart    // Input form fields
 │    │    └── error_section.dart           // Error message display
 │    ├── data/
 │    │    └── static_list_data.dart        // Hardcoded data or constants
 │    ├── card_generator_screen.dart        // Main UI logic for generating cards
 │    └── christmas_card.dart               // Root app widget (ChristmasCardApp)
 ├── firebase_options.dart                  // Firebase configuration file
 └── main.dart                              // App entry point

Step 1: Create a New Flutter Project

Start by creating a fresh Flutter app.

flutter create genui_christmas_card
cd genui_christmas_card

This gives us a clean baseline with Material 3 support and proper platform setup.

Step 2: Configure Your Agent Provider

GenUI can connect to a variety of agent providers. This tutorial uses Firebase AI Logic, which is configured in the section below.

Configure Firebase AI Logic

To use the built-in FirebaseAiContentGenerator to connect to Gemini via Firebase AI Logic, follow these instructions:

Create a new Firebase project using the Firebase Console.
Enable the Gemini API for that project.
Follow the first three steps in Firebase's Flutter Setup to add Firebase to your app.
Enable Gemini Developer API

Step 3: Add Dependencies

GenUI is modular. You always install the core framework, then add a content generator that knows how to talk to your AI provider.

Open pubspec.yaml and update your dependencies:

dependencies:
  flutter:
    sdk: flutter

  genui: ^0.6.0
  logging: ^1.2.0
  genui_firebase_ai: ^0.6.0
  firebase_core: ^4.3.0
  loader_overlay: ^5.0.0
  flutter_spinkit: ^5.2.2

Then fetch the packages:

flutter pub get

At this point, your project has everything it needs to generate UI dynamically.

Step 4: Get a Google Gemini API Key

GenUI itself does not provide AI models. You’ll need to connect one. To do this, go to Google AI Studio, create a new API key, and copy it.

Important note: For real production apps, never hard-code API keys. Use --dart-define, environment variables, or a backend proxy.
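As a minimal sketch of the --dart-define option, the snippet below reads a compile-time value with String.fromEnvironment, which is part of the Dart core library. The variable name GEMINI_API_KEY and the helper requireApiKey are hypothetical choices made for this example. Keep in mind that values passed this way are still compiled into the app binary, so a backend proxy remains the safer option for real secrets.

```dart
// Reads a compile-time value passed with, for example:
//   flutter run --dart-define=GEMINI_API_KEY=your-key-here
// GEMINI_API_KEY is a hypothetical variable name chosen for this sketch.
// When the flag is not supplied, the value is the empty string.
const geminiApiKey = String.fromEnvironment('GEMINI_API_KEY');

// Fails fast with a readable message when the key was not supplied.
// The optional parameter exists so the check can be tested directly.
String requireApiKey([String key = geminiApiKey]) {
  if (key.isEmpty) {
    throw StateError(
      'Missing GEMINI_API_KEY. Pass it with --dart-define=GEMINI_API_KEY=...',
    );
  }
  return key;
}
```

Calling requireApiKey() once during startup surfaces a missing key immediately instead of as a confusing network error later.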

Step 5: App Entry Point (`main.dart`)

Now we’ll begin writing real code.

Replace the contents of lib/main.dart with the following:

import 'package:firebase_core/firebase_core.dart';
import 'package:flutter/material.dart';
// The package name matches the project created in Step 1.
import 'package:genui_christmas_card/screen/christmas_card.dart';
import 'package:logging/logging.dart';

import 'firebase_options.dart';

Future<void> main() async {
  // Enable verbose logging so we can see exactly
  // what the AI sends back to GenUI.
  Logger.root.level = Level.ALL;
  Logger.root.onRecord.listen((record) {
    debugPrint(
      '${record.level.name}: ${record.time}: ${record.message}',
    );
  });

  // The binding must be initialized before Firebase is used.
  WidgetsFlutterBinding.ensureInitialized();
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );
  runApp(const ChristmasCardApp());
}

This logging setup is optional, but highly recommended. When something goes wrong, logs are often the fastest way to understand why the AI didn’t generate what you expected.

Next, we define the root widget for our app in lib/screen/christmas_card.dart, the file that main.dart imports.

import 'package:flutter/material.dart';
import 'package:loader_overlay/loader_overlay.dart';
import 'card_generator_screen.dart';
import 'package:flutter_spinkit/flutter_spinkit.dart';

class ChristmasCardApp extends StatelessWidget {
  const ChristmasCardApp({super.key});

  @override
  Widget build(BuildContext context) {
    return Directionality(
      textDirection: TextDirection.ltr,
      child: LoaderOverlay(
        overlayWholeScreen: true,
        overlayWidgetBuilder: (_) {
          return const Center(
            child: SpinKitWaveSpinner(color: Colors.red, size: 50.0),
          );
        },
        child: MaterialApp(
          title: 'GenUI Christmas Card Generator',
          theme: ThemeData(
            colorScheme: ColorScheme.fromSeed(
              seedColor: Colors.red,
              primary: Colors.red,
            ),
            useMaterial3: true,
          ),
          home: const CardGeneratorScreen(),
        ),
      ),
    );
  }
}

This is standard Flutter – nothing GenUI-specific yet. The real work happens inside CardGeneratorScreen.

Step 6: The Logic Controller (Stateful Screen)

This screen is where we wire together Flutter, Firebase AI, and the GenUI logic. It handles the user inputs (Name, Relationship, Color) and orchestrates the AI generation.

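import 'package:flutter/material.dart';
import 'package:genui/genui.dart';
import 'package:genui_firebase_ai/genui_firebase_ai.dart';
import '../extensions/loading.dart';
import 'components/custom_input_section.dart';
import 'components/error_section.dart';

class CardGeneratorScreen extends StatefulWidget {
  const CardGeneratorScreen({super.key});

  @override
  State<CardGeneratorScreen> createState() => _CardGeneratorScreenState();
}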

Now the state class, which holds all GenUI logic and form state:

class _CardGeneratorScreenState extends State<CardGeneratorScreen> {
  // 1. Form State Management
  final TextEditingController nameController = TextEditingController();
  String selectedRelationship = 'Friend';
  String selectedColorName = 'Gold';
  Color selectedColorUi = Colors.amber;

  // 2. GenUI Core Components
  late final A2uiMessageProcessor _a2uiMessageProcessor;
  late final FirebaseAiContentGenerator _contentGenerator;
  late final GenUiConversation _conversation;

  // 3. UI State
  String? currentSurfaceId;
  String? errorMessage;

The application manages user inputs through a form state that allows for dynamic prompt injection, while the _a2uiMessageProcessor acts as a decoder to convert raw AI data into specific Flutter widgets.

The backend connection is handled by the FirebaseAiContentGenerator, which manages system instructions and tool catalogs, while the _conversation object serves as a conductor to manage chat history and route data between the AI and the UI.

Finally, the currentSurfaceId tracks the specific widget tree being displayed, ensuring the GenUiSurface renders the correct AI-generated content.

Step 7: Initializing GenUI and Firebase

All setup happens in initState:

  @override
  void initState() {
    super.initState();
    // 1. Setup the Processor with allowed widgets
    _a2uiMessageProcessor = A2uiMessageProcessor(
      catalogs: [CoreCatalogItems.asCatalog()],
    );

    // 2. Configure the AI personality and rules
    _contentGenerator = FirebaseAiContentGenerator(
      catalog: CoreCatalogItems.asCatalog(),
      systemInstruction: '''
          You are an expert Festive UI Designer and Holiday Copywriter.

          YOUR GOAL: Generate a high-end, visually appealing Christmas card using the `surfaceUpdate` tool, suitable for printing or digital sharing. The card should feel personalized, warm, and festive.

          DESIGN GUIDELINES:
          - Layout: Use a vertical Column inside a Container with rounded corners, generous padding, and a border. Fill the Container with a color that **mixes Red with the theme color named in the user's request** to create a rich, holiday-themed background.
          - Typography: Use distinct font weights (Bold for headers, normal for body). Center all text.
          - Visuals: Include seasonal icons (🎄, ✨, ❄️) as decorative elements. Place a Christmas tree emoji strategically without overcrowding the layout.
          - Personalization: Display the recipient's name prominently in the middle of the card in a visually striking way.

          COPYWRITING GUIDELINES:
          - Create a deeply personal, heartfelt holiday message (3-4 sentences) that matches the relationship type (fun for friends, romantic for spouse, warm for family).
          - Include a proper closing/signature.
          - NEVER use placeholders. Always generate the **final text ready to display**.

          OUTPUT INSTRUCTIONS:
          - Use the `surfaceUpdate` tool to construct the UI.
          - Ensure all elements (Container, text, emojis) are visually aligned and harmonious.
          - The card must feel festive, elegant, and balanced.
          ''',
    );

    // 3. Start the conversation and listen for updates
    _conversation = GenUiConversation(
      contentGenerator: _contentGenerator,
      a2uiMessageProcessor: _a2uiMessageProcessor,
      onSurfaceAdded: _onSurfaceAdded,
      onSurfaceDeleted: _onSurfaceDeleted,
    );
  }

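  void _onSurfaceAdded(SurfaceAdded update) {
    setState(() {
      currentSurfaceId = update.surfaceId;
    });
  }

  // Clears the card if the AI removes the surface that is on screen.
  void _onSurfaceDeleted(SurfaceRemoved update) {
    if (currentSurfaceId != update.surfaceId) return;
    setState(() {
      currentSurfaceId = null;
    });
  }

  @override
  void dispose() {
    nameController.dispose();
    _conversation.dispose();
    super.dispose();
  }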

In the initState method, we first configure the A2uiMessageProcessor with CoreCatalogItems, giving the AI access to standard widgets. Then, we initialize FirebaseAiContentGenerator.

Notice the systemInstruction: it gives the AI two distinct roles, "UI Designer" and "Copywriter." It explicitly tells the model to tailor the message to the relationship and to center all text.

Finally, we link them in GenUiConversation and attach a listener (_onSurfaceAdded). When the AI creates a new UI, we update currentSurfaceId inside setState, which tells Flutter to draw the new card.

Step 8: Sending a Dynamic Prompt to the AI

This method kicks off the generation, using the user's form data to build a specific prompt.

  Future<void> generateCard() async {
    if (nameController.text.trim().isEmpty) {
      setState(() {
        errorMessage = "Please enter a name first!";
      });
      return;
    }
    FocusScope.of(context).unfocus();
    setState(() {
      errorMessage = null;
      currentSurfaceId = null;
    });

    try {
      context.showLoader();
      final prompt = '''
        Create a personalized Christmas card for my $selectedRelationship, ${nameController.text}.
        Theme: Blend Red and $selectedColorName for a festive background.
        Layout: Vertical Column in a rounded Container with padding and border; place the recipient's name prominently in the center.
        Visuals: Add Christmas trees (🎄), sparkles (✨), or snowflakes (❄️) where appropriate.
        Typography: Bold headers, normal body text, all centered.
        Message: Write a warm, personal 3-4 sentence holiday greeting that fits the relationship type, ending with a proper signature.
        Design: Make it look like an elegant, festive Christmas card ready to display or share.
        ''';

      await _conversation.sendRequest(UserMessage.text(prompt));
    } catch (e) {
      debugPrint('Error: $e');
      if (mounted) {
        setState(() {
          errorMessage = "Oops! Failed to create card.\nError: $e";
        });
      }
    } finally {
      if (mounted) {
        context.hideLoader();
      }
    }
  }

The generateCard method is where prompt engineering meets code. First, it validates that a name exists. Then, it builds a multi-line string using string interpolation ($selectedRelationship, $selectedColorName, and ${nameController.text}). Instead of a generic request, the app sends a detailed brief along the lines of "Make a card for my Mom named Alice using Gold colors."

Finally, _conversation.sendRequest fires this prompt to Firebase. We wrap this in a try/catch block to handle network errors gracefully by showing the error message in the UI.
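One way to keep this logic easy to test is to pull the prompt construction into a pure function. The sketch below is plain Dart. The function name buildCardPrompt is hypothetical, and the prompt text is shortened to two lines for brevity, so treat it as an illustration rather than a drop-in replacement for the method above.

```dart
// Builds the prompt from plain values so it can be unit tested
// without a widget tree. Throws if the name is blank.
String buildCardPrompt({
  required String name,
  required String relationship,
  required String colorName,
}) {
  final trimmed = name.trim();
  if (trimmed.isEmpty) {
    throw ArgumentError.value(name, 'name', 'Please enter a name first!');
  }
  // Shortened version of the tutorial's prompt.
  return 'Create a personalized Christmas card for my $relationship, $trimmed.\n'
      'Theme: Blend Red and $colorName for a festive background.';
}
```

With this split, generateCard would only need to catch the ArgumentError, show its message, and otherwise pass the returned string to sendRequest.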

Building the View

Now we’ll render the UI using the helper components that live in the components/ folder. Here’s the code. Don’t worry, we’ll cover every custom component individually after this.

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('🎄 Holiday Card Maker'),
        // ... (remaining AppBar properties elided)
      ),
      body: Stack(
        children: [
          Column(
            children: [
              // 1. The Input Form (Refactored into a component)
              CustomInputSection(
                nameController: nameController,
                selectedRelationship: selectedRelationship,
                selectedColorName: selectedColorName,
                selectedColorUi: selectedColorUi,
                onColorSelected: onColorSelected,
                generateCard: generateCard,
                selectRelationship: selectRelationship,
              ),

              const Divider(height: 1),

              // 2. The GenUI Drawing Area
              Expanded(
                child: Container(
                  color: Colors.grey[100],
                  child: currentSurfaceId != null
                      ? GenUiSurface(
                          host: _conversation.host,
                          surfaceId: currentSurfaceId!,
                        )
                      : const Center(child: Text('Fill in details...')),
                ),
              ),
            ],
          ),

          // 3. The error overlay, floated above the main content
          if (errorMessage != null)
            ErrorSection(errorMessage: errorMessage!, clearError: clearError),
        ],
      ),
    );
  }
}

In the build method, we use a Stack so that the ErrorSection can float on top of the main content. The loading spinner is handled separately by the LoaderOverlay that wraps the app in ChristmasCardApp.

Instead of writing all the input logic here, we use CustomInputSection. This keeps the main screen clean and focused on AI orchestration.

The bottom half of the screen contains the GenUiSurface. If currentSurfaceId exists, it renders the AI's widget tree using _conversation.host. If not, it shows a placeholder instruction.
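The build method passes onColorSelected, selectRelationship, and clearError to its child widgets, but their bodies are not shown in this section. Below is a minimal sketch of the state transitions they would need to perform, written as plain Dart so it runs without Flutter. FormStateSketch is a hypothetical class. In the real State class, each body would sit inside setState, and the color would be a Flutter Color rather than the ARGB int used here.

```dart
// Hypothetical, Flutter-free model of the form state and its handlers.
class FormStateSketch {
  String selectedRelationship = 'Friend';
  String selectedColorName = 'Gold';
  int selectedColorValue = 0xFFFFC107; // ARGB value of Colors.amber
  String? errorMessage;

  // Called when the user taps a color circle.
  void onColorSelected(String colorName, int colorValue) {
    selectedColorName = colorName;
    selectedColorValue = colorValue;
  }

  // Dropdown callbacks can deliver null, so a null value is ignored.
  void selectRelationship(String? value) {
    if (value == null) return;
    selectedRelationship = value;
  }

  // Called by the "Try Again" button to dismiss the error overlay.
  void clearError() {
    errorMessage = null;
  }
}
```

Keeping these handlers this small is deliberate: the screen owns the state, and the child widgets only report what the user did.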

At this point, you’ve seen the full build() method that renders the screen. Notice that the screen itself does very little visual work directly. Instead, it composes the UI from smaller, focused widgets and helper files. This is intentional.

Rather than cramming form fields, color selectors, error handling, and constants into a single screen file, the UI is split into clear, purpose-driven folders. Each folder represents a UI concern, not a state-management layer or architectural pattern.

In the next sections, we’ll walk through these folders one by one, showing how each piece contributes to the final screen you just built. You’ll see where reusable widgets live, where static UI data is defined, and how the main screen ties everything together without becoming cluttered.

Folder: `lib/screen/data/`

This folder holds the static data used to populate dropdowns and color lists.

StaticListData: `lib/screen/data/static_list_data.dart`

import 'package:flutter/material.dart';

class StaticListData {
  // List of relationships for the dropdown menu
  static final List<String> relationships = [
    'Husband',
    'Wife',
    'Son',
    'Daughter',
    'Grandma',
    'Grandpa',
    'Uncle',
    'Aunt',
    'Friend',
    'Relative',
    'Cousin',
    'Grandson',
    'Granddaughter',
    'Mom',
    'Dad',
  ];

  // Map of color names to actual Flutter Color objects
  static final Map<String, Color> colorOptions = {
    'Gold': Colors.amber,
    'Green': Colors.green,
    'Blue': Colors.blue,
    'Purple': Colors.deepPurple,
    'Silver': Colors.grey,
    'Yellow': Colors.yellow,
    'Pink': Colors.pink,
  };
}

This class serves as a central repository for constant data. The relationships list makes UI updates easy: you can add "Colleague" or "Neighbor" without touching any widget code. The colorOptions map translates user-friendly names like "Gold" into Color objects like Colors.amber for styling.

Folder: `lib/extensions/`

This folder holds small Dart extensions that add convenience methods to existing types.

LoaderOverlayExtension: `lib/extensions/loading.dart`

import 'package:flutter/material.dart';
import 'package:loader_overlay/loader_overlay.dart';

extension LoaderOverlayExtension on BuildContext {
  void showLoader() {
    loaderOverlay.show();
  }

  void hideLoader() {
    loaderOverlay.hide();
  }
}

The LoaderOverlayExtension adds two methods to any BuildContext object: showLoader(), which displays a LoaderOverlay, and hideLoader(), which hides it. This allows you to call context.showLoader() or context.hideLoader() anywhere in your widgets without directly referencing loaderOverlay every time, improving readability and reducing boilerplate whenever a loading state needs to be displayed.

Folder: `lib/screen/components/`

This folder contains the reusable UI components used by CardGeneratorScreen. Each one is a small, modular widget that encapsulates one part of the UI, which keeps the main screen code cleaner, easier to read, and easier to maintain.

ErrorSection: `error_section.dart`

import 'package:flutter/material.dart';

class ErrorSection extends StatelessWidget {
  final String errorMessage;
  final VoidCallback clearError;

  const ErrorSection({
    super.key,
    required this.errorMessage,
    required this.clearError,
  });

  @override
  Widget build(BuildContext context) {
    return Container(
      // High opacity background to block out the UI behind it
      color: Colors.white.withValues(alpha: 0.95),
      child: Center(
        child: Padding(
          padding: const EdgeInsets.all(32.0),
          child: Column(
            mainAxisSize: MainAxisSize.min,
            children: [
              const Icon(Icons.error_outline, color: Colors.red, size: 60),
              const SizedBox(height: 16),
              // Displays the specific error message passed from the parent
              Text(
                errorMessage,
                textAlign: TextAlign.center,
                style: const TextStyle(fontSize: 16, color: Colors.red),
              ),
              const SizedBox(height: 20),
              // Button to dismiss the error
              ElevatedButton(
                onPressed: () {
                  clearError();
                },
                child: const Text("Try Again"),
              ),
            ],
          ),
        ),
      ),
    );
  }
}

This error view uses a large red icon and descriptive text to signal that something went wrong. When the user taps "Try Again", the clearError callback runs so the parent can reset its errorMessage to null, which removes the view.

ColorPickerList: `color_picker_list.dart`

import 'package:flutter/material.dart';

class ColorPickerList extends StatelessWidget {
  const ColorPickerList({
    super.key,
    required String selectedColorName,
    required Color selectedColorUi,
    required Map<String, Color> colorOptions,
    required this.onColorSelected,
  })  : _selectedColorName = selectedColorName,
        _colorOptions = colorOptions;

  final String _selectedColorName;
  final Map<String, Color> _colorOptions;
  final void Function(String colorName, Color colorUi) onColorSelected;

  @override
  Widget build(BuildContext context) {
    return SizedBox(
      height: 85,
      // Horizontal scrolling list for colors
      child: ListView(
        scrollDirection: Axis.horizontal,
        physics: const BouncingScrollPhysics(),
        children: _colorOptions.entries.map((entry) {
          final isSelected = _selectedColorName == entry.key;

          return GestureDetector(
            onTap: () {
              // Pass the selected color back to the parent
              onColorSelected(entry.key, entry.value);
            },
            child: Container(
              margin: const EdgeInsets.only(right: 15),
              width: 50,
              child: Column(
                mainAxisSize: MainAxisSize.min,
                crossAxisAlignment: CrossAxisAlignment.center,
                children: [
                  // Outer ring animation
                  AnimatedContainer(
                    duration: const Duration(milliseconds: 250),
                    padding: const EdgeInsets.all(3),
                    decoration: BoxDecoration(
                      shape: BoxShape.circle,
                      // Show border only if selected
                      border: Border.all(
                        color: isSelected ? entry.value : Colors.transparent,
                        width: 2.5,
                      ),
                    ),
                    // Inner color circle
                    child: Container(
                      width: 35,
                      height: 35,
                      decoration: BoxDecoration(
                        color: entry.value,
                        shape: BoxShape.circle,
                        boxShadow: [
                          if (isSelected)
                            BoxShadow(
                              color: entry.value.withValues(alpha: 0.3),
                              blurRadius: 6,
                              offset: const Offset(0, 3),
                            ),
                        ],
                        border: Border.all(color: Colors.white, width: 2),
                      ),
                    ),
                  ),
                  const SizedBox(height: 6),
                  // Color name label
                  Text(
                    entry.key,
                    textAlign: TextAlign.center,
                    maxLines: 1,
                    overflow: TextOverflow.ellipsis,
                    style: TextStyle(
                      fontSize: 10,
                      color: isSelected ? entry.value : Colors.grey[600],
                      fontWeight:
                          isSelected ? FontWeight.bold : FontWeight.normal,
                    ),
                  ),
                ],
              ),
            ),
          );
        }).toList(),
      ),
    );
  }
}

This horizontal list of color circles uses a ListView with scrollDirection: Axis.horizontal to allow users to swipe through various options, while an AnimatedContainer provides polished visual feedback by animating the outer border into view over 250ms when a color is tapped.

The widget also incorporates selection logic that checks the isSelected state to determine whether to display bold text and a colored border, clearly indicating the user's current choice.

CustomInputSection: `custom_input_section.dart`

import 'package:flutter/material.dart';
import '../data/static_list_data.dart';
import 'color_picker_list.dart';

class CustomInputSection extends StatelessWidget {
  final TextEditingController nameController;
  final String selectedRelationship;
  final String selectedColorName;
  final Color selectedColorUi;
  final void Function(String colorName, Color colorUi) onColorSelected;
  final VoidCallback generateCard;
  final ValueChanged<String?> selectRelationship;

  const CustomInputSection({
    super.key,
    required this.nameController,
    required this.selectedRelationship,
    required this.selectedColorName,
    required this.selectedColorUi,
    required this.onColorSelected,
    required this.generateCard,
    required this.selectRelationship,
  });

  @override
  Widget build(BuildContext context) {
    return Container(
      decoration: BoxDecoration(
        color: Colors.white,
        boxShadow: [
          BoxShadow(
            color: Colors.black.withValues(alpha: 0.05),
            blurRadius: 10,
            offset: const Offset(0, 5),
          ),
        ],
      ),
      child: LayoutBuilder(
        builder: (context, constraints) {
          bool isSmallScreen = constraints.maxWidth < 600;

          return Column(
            crossAxisAlignment: CrossAxisAlignment.start,
            children: [
              Padding(
                padding: const EdgeInsets.symmetric(horizontal: 18.0,vertical: 20),
                child: Flex(
                  direction: isSmallScreen ? Axis.vertical : Axis.horizontal,
                  crossAxisAlignment: CrossAxisAlignment.start,
                  children: [
                    Expanded(
                      flex: isSmallScreen ? 0 : 3,
                      child: SizedBox(
                        width: isSmallScreen ? double.infinity : null,
                        child: TextField(
                          controller: nameController,
                          decoration: const InputDecoration(
                            labelText: "Name (e.g., Alice)",
                            prefixIcon: Icon(Icons.person),
                            border: OutlineInputBorder(),
                            contentPadding: EdgeInsets.symmetric(
                              horizontal: 12,
                              vertical: 8,
                            ),
                          ),
                        ),
                      ),
                    ),
                    // Dynamic spacer
                    isSmallScreen
                        ? const SizedBox(height: 12)
                        : const SizedBox(width: 10),
                    Expanded(
                      flex: isSmallScreen ? 0 : 2,
                      child: SizedBox(
                        width: isSmallScreen ? double.infinity : null,
                        child: DropdownButtonFormField<String>(
                          initialValue: selectedRelationship,
                          decoration: const InputDecoration(
                            labelText: 'Relationship',
                            border: OutlineInputBorder(),
                            contentPadding: EdgeInsets.symmetric(
                              horizontal: 12,
                              vertical: 8,
                            ),
                          ),
                          items: StaticListData.relationships.map((String rel) {
                            return DropdownMenuItem(value: rel, child: Text(rel));
                          }).toList(),
                          onChanged: (val) => selectRelationship(val),
                        ),
                      ),
                    ),
                  ],
                ),
              ),
              const SizedBox(height: 20),
              Padding(
                padding: const EdgeInsets.only(left: 18.0),
                child: Text(
                  "Pick a theme color:",
                  style: TextStyle(
                    color: Colors.grey[700],
                    fontWeight: FontWeight.bold,
                  ),
                ),
              ),
              const SizedBox(height: 8),

              Padding(
                padding: const EdgeInsets.only(left: 16.0),
                child: Flex(
                  direction: isSmallScreen ? Axis.vertical : Axis.horizontal,
                  crossAxisAlignment: isSmallScreen
                      ? CrossAxisAlignment.stretch
                      : CrossAxisAlignment.center,
                  children: [
                    isSmallScreen
                        ? ColorPickerList(
                            selectedColorName: selectedColorName,
                            selectedColorUi: selectedColorUi,
                            colorOptions: StaticListData.colorOptions,
                            onColorSelected: onColorSelected,
                          )
                        : Expanded(
                            child: ColorPickerList(
                              selectedColorName: selectedColorName,
                              selectedColorUi: selectedColorUi,
                              colorOptions: StaticListData.colorOptions,
                              onColorSelected: onColorSelected,
                            ),
                          ),

                    if (isSmallScreen) const SizedBox(height: 16),

                    // Generate Button
                    Padding(
                      padding: const EdgeInsets.all(18.0),
                      child: SizedBox(
                        width: isSmallScreen ? double.infinity : null,
                        child: ElevatedButton.icon(
                          onPressed: generateCard,
                          style: ElevatedButton.styleFrom(
                            backgroundColor: Colors.red,
                            foregroundColor: Colors.white,
                            padding: const EdgeInsets.symmetric(
                              horizontal: 24,
                              vertical: 16,
                            ),
                            shape: RoundedRectangleBorder(
                              borderRadius: BorderRadius.circular(8),
                            ),
                          ),
                          icon: const Icon(Icons.auto_awesome),
                          label: const Text(
                            "Generate Card",
                            style: TextStyle(fontWeight: FontWeight.bold),
                          ),
                        ),
                      ),
                    ),
                  ],
                ),
              ),
            ],
          );
        },
      ),
    );
  }
}

As the most complex component in the app, this widget brings all the inputs together. It uses a LayoutBuilder to read the parent's constraints and switches the Flex direction accordingly: Axis.vertical when maxWidth is less than 600 (mobile), and Axis.horizontal otherwise (tablets and web).

On large screens, Expanded lets the fields share the available width. On small screens, SizedBox(width: double.infinity) forces each input to the full width of the device. ColorPickerList and StaticListData keep the rest of the widget short.

Adding Your Own Widgets to the GenUI Catalog

So far in this project, we’ve relied entirely on the widgets provided by CoreCatalogItems. These include common UI building blocks like Text, Column, Container, and Image, which are enough to get surprisingly rich results.

But GenUI really shines when you teach the AI about your own domain-specific widgets.

In our case, we’re not just generating arbitrary UI – we’re generating high-end, personalized Christmas cards. That makes this a perfect candidate for a custom catalog item.

Instead of hoping the AI assembles the perfect layout every time from primitive widgets, we can introduce a first-class “Holiday Card” widget and let the model generate data for it.

In the current implementation, the AI generates festive UIs using general-purpose widgets, which works but leads to inconsistent card structure, repeated styling instructions, and excessive layout freedom.

By introducing a custom widget into the catalog, layout and styling decisions are encoded directly in Flutter. This allows the AI to focus on content and personalization while producing more predictable, production-ready results.

Step 1: Adding `json_schema_builder`

To define a custom widget, GenUI needs to know what data it accepts. You can tell it this using a JSON Schema.

Add json_schema_builder as a dependency, using the same repository reference as GenUI:

dependencies:
  json_schema_builder:
    git:
      url: https://github.com/flutter/genui.git
      path: packages/json_schema_builder

This ensures schema compatibility with the GenUI runtime.

Step 2: Defining the Holiday Card Schema

A Christmas card in our app needs a few core pieces of data:

The recipient’s name
The relationship (friend, spouse, family, and so on)
The message body
A closing signature

Using json_schema_builder, we can define this explicitly:

final holidayCardSchema = S.object(
  properties: {
    'recipientName': S.string(
      description: 'Name of the person receiving the card',
    ),
    'relationship': S.string(
      description: 'Relationship to the recipient (friend, spouse, family)',
    ),
    'message': S.string(
      description: 'Main heartfelt holiday message',
    ),
    'signature': S.string(
      description: 'Closing signature for the card',
    ),
  },
  required: [
    'recipientName',
    'relationship',
    'message',
    'signature',
  ],
);

This schema becomes the contract between your Flutter app and the AI.

Step 3: Creating the CatalogItem

Each custom widget is registered as a CatalogItem. This ties together:

A name (used by the AI)
The schema
A widget builder that renders Flutter UI

Here’s what a HolidayCard catalog item might look like:

final holidayCardItem = CatalogItem(
  name: 'HolidayCard',
  dataSchema: holidayCardSchema,
  widgetBuilder: (context) {
    final name = context.dataContext.subscribeToString(
      context.data['recipientName'] as Map<String, Object?>?,
    );
    final message = context.dataContext.subscribeToString(
      context.data['message'] as Map<String, Object?>?,
    );
    final signature = context.dataContext.subscribeToString(
      context.data['signature'] as Map<String, Object?>?,
    );

    return ValueListenableBuilder<String?>(
      valueListenable: name,
      builder: (context, recipientName, _) {
        return ValueListenableBuilder<String?>(
          valueListenable: message,
          builder: (context, body, _) {
            return ValueListenableBuilder<String?>(
              valueListenable: signature,
              builder: (context, signOff, _) {
                return Container(
                  margin: const EdgeInsets.all(24),
                  padding: const EdgeInsets.all(24),
                  decoration: BoxDecoration(
                    color: Colors.white,
                    borderRadius: BorderRadius.circular(20),
                    border: Border.all(color: Colors.redAccent),
                  ),
                  child: Column(
                    crossAxisAlignment: CrossAxisAlignment.center,
                    children: [
                      const Text(
                        '🎄 Merry Christmas 🎄',
                        style: TextStyle(
                          fontSize: 24,
                          fontWeight: FontWeight.bold,
                        ),
                      ),
                      const SizedBox(height: 16),
                      Text(
                        'Dear ${recipientName ?? ''},',
                        style: const TextStyle(fontSize: 18),
                      ),
                      const SizedBox(height: 12),
                      Text(
                        body ?? '',
                        textAlign: TextAlign.center,
                      ),
                      const SizedBox(height: 24),
                      Text(
                        signOff ?? '',
                        style: const TextStyle(fontWeight: FontWeight.w600),
                      ),
                    ],
                  ),
                );
              },
            );
          },
        );
      },
    );
  },
);

Notice how no state is stored in the widget itself. Everything comes from the GenUI data model.

Now we’ll plug the custom widget into your existing setup.

In your initState, instead of using only CoreCatalogItems, extend the catalog:

_a2uiMessageProcessor = A2uiMessageProcessor(
  catalogs: [
    CoreCatalogItems.asCatalog().copyWith([
      holidayCardItem,
    ]),
  ],
);

This makes HolidayCard available to the AI.

Finally, we’ll update the system instruction so the AI knows when and how to use the new widget.

In your existing FirebaseAiContentGenerator, the instruction can be refined like this:

    _contentGenerator = FirebaseAiContentGenerator(
      catalog: CoreCatalogItems.asCatalog().copyWith([holidayCardItem]),
      systemInstruction: '''
          You are an expert Festive UI Designer and Holiday Copywriter.

          YOUR GOAL: Generate a high-end, visually appealing Christmas card using the `surfaceUpdate` tool, suitable for printing or digital sharing. The card should feel personalized, warm, and festive.

          DESIGN GUIDELINES:
          - Layout: Use a vertical Column inside a Container with rounded corners, generous padding, and a border. Fill the Container with a color that **mixes Red with $selectedColorName ** to create a rich, holiday-themed background.
          - Typography: Use distinct font weights (Bold for headers, normal for body). Center all text.
          - Visuals: Include seasonal icons (🎄, ✨, ❄️) as decorative elements. Place a Christmas tree emoji strategically without overcrowding the layout.
          - Personalization: Display the recipient's name prominently in the middle of the card in a visually striking way.

          COPYWRITING GUIDELINES:
          - Create a deeply personal, heartfelt holiday message (3-4 sentences) that matches the relationship type (fun for friends, romantic for spouse, warm for family).
          - Include a proper closing/signature.
          - NEVER use placeholders. Always generate the **final text ready to display**.

          OUTPUT INSTRUCTIONS:
          - Use the `surfaceUpdate` tool to construct the UI.
          - Ensure all elements (Container, text, emojis) are visually aligned and harmonious.
          - The card must feel festive, elegant, and balanced. When generating a Christmas card, always use the HolidayCard widget.
          ''',
    );

Now the AI isn’t guessing – it’s explicitly guided toward your custom widget.

How This Fits into Your Existing Screen

This integration requires no structural changes to your existing CardGeneratorScreen: GenUiConversation continues to manage the interaction lifecycle, GenUiSurface still handles rendering, and your input form remains fully responsible for shaping the prompt. The only change is what the AI is allowed to generate, which significantly improves control and consistency.

By adding custom widgets to the GenUI catalog, your application moves from AI assembling loosely defined UI fragments to AI populating structured, production-ready components, resulting in a cleaner interface, stronger visual identity, reduced prompt engineering, and far more predictable outputs. This is the point where GenUI stops feeling like a demo and starts functioning as a real product framework.

Final Thoughts

This project demonstrates how you can take advantage of GenUI in its most practical form: not merely as a tech demo, but as a functional Flutter paradigm that bridges the gap between static code and user intent.

By shifting the responsibility of layout orchestration from the developer to an intelligent agent, we unlock a level of personalization that was previously not possible in mobile development.

Once you master the Conversation Loop (how the AI thinks), Surfaces (how the AI draws), and Catalog Boundaries (what the AI is allowed to use), GenUI becomes a transformative addition to your Flutter toolkit. It allows you to build interfaces that aren't just "responsive" to screen sizes, but "responsive" to human needs.

As an early adopter, you are on the cutting edge of AI-Generated User Interfaces. Your explorations and feedback will help shape the future of how we build apps in the era of generative intelligence. You can find the complete project on GitHub here.

References

Flutter Team. GenUI: Build AI-powered user interfaces in Flutter. GitHub repository.
Available at: https://github.com/flutter/genui/
Flutter Documentation. Getting started with GenUI.
Available at: https://docs.flutter.dev/ai/genui/get-started
Dart & Flutter Ecosystem. genui package. pub.dev.
Available at: https://pub.dev/packages/genui
Dart & Flutter Ecosystem. genui_firebase_ai package. pub.dev.
Available at: https://pub.dev/packages/genui_firebase_ai

Build a Support Agent with Vercel AI SDK

Beau Carnes — Tue, 23 Dec 2025 16:34:48 +0000

Vercel AI SDK is a TypeScript-first toolkit for building AI features. It streamlines text generation, embeddings, and structured outputs.

We just posted a course on the freeCodeCamp.org YouTube channel that will teach you to use the Vercel AI SDK to create and ship a customer support agent that makes autonomous decisions to either answer questions based on your support docs or search the web in real time.

In this course, you’ll ship a customer support agent that:

Embeds support docs into a Supabase vector store.
Uses retrieval and web search as tools, selected on-the-fly based on the user’s question.
Classifies intents with structured outputs (via generateObject + Zod).
Answers questions with grounded, trustworthy responses—pulling from your docs when relevant or searching the web in real time when needed.

The course will teach you how to:

Explain RAG & embeddings and decide when to use each of them.
Set up Supabase as a vector store: create tables, embed documents, and handle chunking/text splitting for large files.
Implement retrieval with Supabase RPC so your agent can fetch the right context for any question.
Use Vercel AI SDK basics: embeddings and generateText for fast, reliable model calls.
Produce structured outputs with generateObject and Zod to validate and route intents.
Call tools with the AI SDK—define schemas, wire execution, and keep everything type-safe.
Treat retrieval and web search as tools, and compose them into a single agent decision flow.
Use the OpenAI web search tool to pull fresh, real-time information when your docs aren’t enough.
Combine it all into a support agent that chooses the best strategy (retrieve, search, or answer directly) and explains its answers.

Watch the full course on the freeCodeCamp.org YouTube channel (2-hour watch).

freeCodeCamp's A1 Professional Chinese Curriculum (Beta) is Now Live

Nielda Karla — Mon, 22 Dec 2025 15:51:04 +0000

The freeCodeCamp community just published the introductory chapters of our new A1 Professional Chinese Curriculum. You can now get started learning Chinese with what’s already available.

Each chapter includes hundreds of interactive tasks designed to help you take your first steps in learning Chinese with confidence.

How Does the New A1 Professional Chinese Curriculum Work?

In this A1 Professional Chinese Curriculum, you'll learn the building blocks of the Chinese language. This will follow the A1 level of the Common European Framework of Reference (CEFR). And we've focused on vocabulary that is particularly useful for professional settings.

The curriculum is broken down into several modules that include warm-up, learning, practice, review pages, and quizzes to make sure that you truly understand the material before moving on to the next module.

The Warm-up serves as preparation and provides context for the main content of the module.

The tasks in the warm-up will either introduce you to new vocabulary for the first time, or review content you have already learned that will be used in the current module.

Below is an example of what you will find in the lessons.

Each task will have an accompanying question that will help you practice the content. If you don’t know how to answer a question or need more details, you can check the explanation section.

After the Warm-up, you'll head over to Learn. This is where you'll see the new words you've learned in action! You'll listen to short sections of monologues or dialogues and answer questions about them to make sure you understand their meaning and how they're used in real conversations. This is also where you’ll learn some theory when needed.

The curriculum also has fill-in-the-blank questions that will help you practice writing using Pinyin and Hanzi.

After Learn, you'll move on to Practice where you'll complete more open-ended tasks that test your comprehension and your ability to write using Pinyin and Hanzi.

At the end of each module, there is a Review section with grammar highlights and a glossary of the main points and concepts covered. You can use these review pages to help you study for the quizzes.

The last portion of the module is the Quiz. Quizzes are designed to test your understanding of the material covered in the module.

Throughout the certification, each chapter will have these quizzes. You’ll need to complete them in order to qualify for the exam at the end of the certification.

The certification exam will be the final item released for this certification. We are currently publishing the first three chapters, and future chapters will be released progressively as they are developed by our instructional design team. Once all the chapters are available, we will release the certification exam.

Contributors recognition

We'd like to shout out to the following contributors for their help in developing the curriculum:

We'd like to extend a special thanks to S1ng S1ng, who recorded the audio for the monologues and dialogues, as well as instructional videos that vividly demonstrate Pinyin pronunciation. These Pinyin videos will be gradually added to the curriculum throughout the coming year.

Frequently Asked Questions

Is all of this really free?

Yes. freeCodeCamp has always been free, and we've now offered free verified certifications for more than a decade.

Can I study the Chinese curriculum in languages other than English?

We aim to make every course available in all supported languages on freeCodeCamp. Check your account settings to see if the course you are studying is already offered in your preferred language.

What language skills does the Chinese curriculum cover?

The language courses currently cover listening, reading, and writing. We plan to add speaking later on.

Is the audio in the language courses and exams recorded by native speakers?

Yes. All the audio in the language courses was recorded by native speakers of that language.

I am Deaf or hard of hearing. Can I still study the language courses?

Yes! All audio lessons have closed captions and transcripts available for reading.

Can I study the language courses using a screen reader?

Yes! freeCodeCamp courses are designed to be accessible, and you can study the language courses using a screen reader. If you run into any accessibility issues, you can report them on our GitHub repository so the community can address them.

What are the letters and numbers beside the curriculum names? (For example: A1, A2, B1)

These labels refer to the CEFR levels. The CEFR is an international framework used to describe language proficiency. A1 and A2 represent beginner levels, B1 and B2 represent intermediate levels, and C1 and C2 represent advanced levels. Each level indicates the skills and knowledge you are expected to have at that stage of your language learning journey.

Anything else?

Good luck working through freeCodeCamp’s language coursework.

Happy learning!

How to Test and Improve AI Applications with an Evaluation Flywheel

Yemi Ojedapo — Mon, 22 Dec 2025 10:18:04 +0000

In traditional programming, developers rely on unit tests to catch mistakes in applications. But when building AI products, that safety net doesn't exist. Responses can shift with model updates, data changes, and subtle fluctuations in prompts or retrieval results. The usual testing methods, such as unit tests with Pytest or Jest, integration tests, and CI pipelines, fail to catch accuracy drops, hallucinations, and regressions, and these silent failures can become real production risks.

In this article, you’ll learn why traditional testing methods fall short for AI systems and how an evaluation flywheel can be used as a practical approach to testing and improving AI applications. The sections below break the evaluation flywheel down step by step, from identifying the problem to implementing a repeatable evaluation loop.

Why Does Traditional Testing Fail for AI applications?
What is the Evaluation Flywheel?
Drawing Parallels to Familiar Practices
Why Silent Failures Matter: A Real-World Example
How to Create an Evaluation Flywheel
Tools and Frameworks you can use for evaluation
What a Complete Evaluation Loop Looks Like in Practice
Key Takeaways
Conclusion

Why Does Traditional Testing Fail for AI applications?

In standard programming, tests assume deterministic behavior. This means the same input is expected to always produce the same output. For example:

def authenticate_user_age(age: int) -> str:
    limit = 18

    if age >= limit:
        return "Access granted"
    else:
        return "User doesn't meet the age limit"

# Test 
assert authenticate_user_age(20) == "Access granted"
assert authenticate_user_age(16) == "User doesn't meet the age limit"

The response from this function is always predictable. You can write tests once and trust they'll catch errors forever.

However, AI models don’t behave the same way every time. They generate output based on probabilities. A query like “best programming practices” may produce strong guidance one day, and outdated or incomplete advice the next. This shift can happen because of changes in the underlying model, updates to retrieval components, or gradual data drift. Without a structured evaluation process in place, these inconsistencies slip into production unnoticed and can quietly weaken the system’s performance.
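To make the contrast concrete, here is a minimal sketch that simulates a non-deterministic model. The `fake_llm` function and its canned answers are hypothetical stand-ins, not a real model call. An exact-match assertion only passes when the sampler happens to pick one particular phrasing, while a check on the key ideas passes for every phrasing.

```python
import random

# Hypothetical stand-in for an LLM: same input, different wording on each call.
CANNED_ANSWERS = [
    "Write small functions and cover them with tests.",
    "Keep functions small, and add tests for each one.",
    "Prefer small functions backed by automated tests.",
]

def fake_llm(prompt: str, rng: random.Random) -> str:
    return rng.choice(CANNED_ANSWERS)

def exact_match(output: str) -> bool:
    # Traditional assertion style: only one string counts as correct.
    return output == CANNED_ANSWERS[0]

def mentions_key_ideas(output: str) -> bool:
    # Evaluation style: check for the ideas, not the exact wording.
    text = output.lower()
    return "small" in text and "test" in text

rng = random.Random(0)
outputs = [fake_llm("best programming practices", rng) for _ in range(20)]

exact_pass_rate = sum(exact_match(o) for o in outputs) / len(outputs)
keyword_pass_rate = sum(mentions_key_ideas(o) for o in outputs) / len(outputs)
print(f"exact match: {exact_pass_rate:.0%}, key ideas: {keyword_pass_rate:.0%}")
```

The keyword check is still crude, but it shows why AI tests need to score meaning rather than compare strings.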

What is the Evaluation Flywheel?

The evaluation flywheel is a continuous improvement system where test cases representing real user behavior are passed through multiple evaluation steps to assess the output of AI models. The results don't just tell you whether the system passed or failed. They feed directly into the next cycle of improvement.

┌─────────────┐
│   Collect   │
│ Test Cases  │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│     Run     │
│ Evaluations │
└──────┬──────┘
       │
       ▼
┌─────────────┐      ┌─────────────┐
│  Identify   │─────▶│   Improve   │
│  Failures   │      │   System    │
└─────────────┘      └──────┬──────┘
                            │
                            ▼
                       ┌─────────────┐
                       │   Repeat    │
                       └─────────────┘

Here's how it works in practice:

Collect test cases — Gather examples from real user interactions or create synthetic scenarios. These should reflect the kind of tasks and input your system needs to handle.
Run evaluations — Pass each test case through a series of checks. Each check can be either programmatic (automated metrics like relevance scores or hallucination detectors) or manual (like verifying legal advice accuracy or brand voice consistency).
Identify failures — Detect where the model goes wrong. This can include hallucinations, irrelevant responses, or mistakes on corner cases.
Improve the system — Based on those failures, refine prompts, improve training or retrieval data, or adjust architectural components.
Repeat the cycle — Re-run the updated system on the existing and newly collected cases. Over time, this grows and strengthens your evaluation suite and boosts system reliability.
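The five steps above can be sketched as a small loop. This is only an illustration under simplifying assumptions: `run_flywheel`, `make_system`, `passes`, and `improve` are hypothetical names, the "system" is a lookup table rather than a model, and "improving" it means adding a missing knowledge-base entry.

```python
from typing import Callable

System = Callable[[str], str]

def run_flywheel(
    test_cases: list[dict],
    system: System,
    passes: Callable[[dict, str], bool],
    improve: Callable[[System, list[dict]], System],
    max_cycles: int = 3,
) -> tuple[System, list[int]]:
    """Evaluate, identify failures, improve, and repeat until nothing fails."""
    failures_per_cycle = []
    for _ in range(max_cycles):
        # Run evaluations and identify failures.
        failures = [c for c in test_cases if not passes(c, system(c["question"]))]
        failures_per_cycle.append(len(failures))
        if not failures:
            break
        # Improve the system based on the failures, then go around again.
        system = improve(system, failures)
    return system, failures_per_cycle

# Toy usage: a lookup-table "system" that starts with one known answer.
knowledge = {"How do I reset my password?": "Use the reset link on the settings page."}

def make_system(kb: dict) -> System:
    return lambda question: kb.get(question, "I'm not sure.")

def passes(case: dict, answer: str) -> bool:
    return case["must_mention"].lower() in answer.lower()

def improve(system: System, failures: list[dict]) -> System:
    for case in failures:
        knowledge[case["question"]] = case["fix"]
    return make_system(knowledge)

cases = [
    {"question": "How do I reset my password?", "must_mention": "reset link", "fix": ""},
    {"question": "What's your refund policy?", "must_mention": "30 days",
     "fix": "Full refund within 30 days."},
]
_, history = run_flywheel(cases, make_system(knowledge), passes, improve)
print(history)  # number of failing cases in each cycle
```

In a real project the `improve` step is a human changing prompts, retrieval, or data, so the loop runs across commits rather than inside one function call.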

Drawing Parallels to Familiar Practices

If you've written software before, the evaluation flywheel will feel familiar. It mirrors patterns that are already used in engineering. For instance:

Unit tests → Evaluation datasets
Unit tests confirm a function returns the right output. Evaluation datasets play the same role for AI: they're ground-truth queries and answers that guard against regressions.

Test-driven development (TDD) → Evaluation-driven development (EDD)
In TDD, you write tests before code. In EDD, you write evaluation cases before shipping prompts or updating models. This replaces assumptions with verifiable results.

CI/CD pipelines → Continuous evaluation pipelines
CI/CD runs checks automatically on every code change. Continuous evaluation does the same for models: it runs automated quality checks every time you tweak a prompt, retrain, or swap out a component.

The key difference is subtle but important. Traditional software tests check whether a function returns the right value or type. AI evaluation tests check whether the system produces the right meaning. That's harder to measure, but the principle is the same: build a safety net that grows stronger with every cycle.

Why Silent Failures Matter: A Real-World Example

AI systems often behave differently in production than they do in development. A model that seems solid in testing can drift, hallucinate, or silently fail when facing real-world input.

Case in point: A fraud detection model passed all monitoring metrics yet missed a spike in fraud. An ML engineer shared how their production monitoring dashboards tracked latency, throughput, and error rates, and everything showed green. But fraudulent transactions were slipping through at twice the normal rate. Nobody noticed because existing observability tools focused on pipeline health, not prediction quality.

This silent failure cost the company significant losses. The system seemed fine by traditional metrics. It measured system performance—latency, throughput, uptime—but ignored what mattered most: prediction accuracy. As fraudsters adapted their tactics, the model drifted, and without proper evaluation loops, the degradation went undetected for weeks.

Source: InsightFinder.

Why This Example Matters

Silent failures aren't always bugs — They often stem from models failing to adapt to shifting patterns in the real world.
Static evaluation isn't enough — You need continuous, real-world feedback loops to detect when assumptions no longer hold.
Data drift has business impact — Model degradation isn't just technical. It translates directly into revenue loss, security breaches, or damaged user trust.
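A minimal sketch of the gap described above: a health report that checks prediction quality alongside pipeline health. The function names, thresholds, and sample data are all hypothetical, and the sketch assumes you have a sample of recent transactions with confirmed labels.

```python
def recall(predictions: list[bool], labels: list[bool]) -> float:
    """Share of actual positives (fraud cases) that the model flagged."""
    true_positives = sum(p and y for p, y in zip(predictions, labels))
    actual_positives = sum(labels)
    return true_positives / actual_positives if actual_positives else 1.0

def health_report(
    latency_ms: float,
    predictions: list[bool],
    labels: list[bool],
    baseline_recall: float,
    max_latency_ms: float = 200.0,   # assumed latency budget
    max_recall_drop: float = 0.10,   # assumed tolerance before alerting
) -> dict:
    current = recall(predictions, labels)
    return {
        "pipeline_ok": latency_ms <= max_latency_ms,
        "quality_ok": (baseline_recall - current) <= max_recall_drop,
        "recall": current,
    }

# Hypothetical labelled sample: 10 fraud cases, only 4 of them caught.
labels = [True] * 10 + [False] * 90
predictions = [True] * 4 + [False] * 6 + [False] * 90

report = health_report(
    latency_ms=85.0,
    predictions=predictions,
    labels=labels,
    baseline_recall=0.80,
)
print(report)
```

Here the pipeline check stays green while the quality check fails, which is the situation the dashboards in the example could not see.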

How to Create an Evaluation Flywheel

To show how to build a flywheel and how it works, let's create one for a customer support chatbot that answers questions about a SaaS product.

Step 1: Build Your AI System

Create your initial product: prompts, retrieval logic, and integrations. For our chatbot:

def answer_support_question(question: str) -> str:
    # Retrieve relevant docs from knowledge base
    context = retrieve_docs(question, top_k=5)

    # Generate answer using LLM
    prompt = f"""You are a helpful customer support agent.

Context: {context}

Question: {question}

Provide a clear, accurate answer based on the context."""

    response = llm.generate(prompt)
    return response

How this works: This function defines the core chat logic. It takes a customer’s question and returns an AI-generated answer. First, it searches your knowledge base to find the five most relevant documents using retrieve_docs(). These documents provide context about your product or policies. Next, it constructs a prompt that includes this context and the user's question, then sends it to a language model. The LLM reads the context and generates a relevant answer, which the function returns.
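The function above depends on `retrieve_docs` and `llm`, which the article does not define. If you want to run it locally without a vector store or a model, one option is to stub both. The sketch below is an assumption for experimentation only: the knowledge base, the word-overlap ranking, and the `EchoLLM` class are hypothetical stand-ins for a real retriever and model client.

```python
import re

# Hypothetical in-memory knowledge base; a real system would use a vector store.
KNOWLEDGE_BASE = [
    "To reset your password, open the settings page and click the reset link sent by email.",
    "Refunds: we offer a full refund within 30 days. Contact support to start one.",
    "You can export your data to CSV with the export button on the dashboard.",
]

def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve_docs(question: str, top_k: int = 5) -> str:
    """Rank documents by word overlap with the question and join the top_k."""
    question_tokens = _tokens(question)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(question_tokens & _tokens(doc)),
        reverse=True,
    )
    return "\n".join(ranked[:top_k])

class EchoLLM:
    """Stand-in for a model client: returns the first context line it was given."""

    def generate(self, prompt: str) -> str:
        context = prompt.split("Context:", 1)[1].split("Question:", 1)[0]
        return context.strip().splitlines()[0]

llm = EchoLLM()
print(retrieve_docs("How do I reset my password?", top_k=1))
```

With these two names in scope, `answer_support_question` runs end to end, which is enough to exercise the evaluation steps that follow before you wire in real components.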

Step 2: Identify Test Cases

Build an evaluation set that reflects real user behavior. The more representative your test cases are (covering common cases, edge cases, and ambiguous inputs), the more failures your evaluations can catch before they reach production.

Sources for test cases:

Previous customer support tickets
Common FAQ topics
Edge cases discovered in beta testing
Synthetic scenarios (hypothetical but realistic queries)

Example test cases:

test_cases = [
    {
        "question": "How do I reset my password?",
        "expected_elements": ["settings page", "reset link", "email"],
        "category": "account_management"
    },
    {
        "question": "What's your refund policy?",
        "expected_elements": ["30 days", "full refund", "contact support"],
        "category": "billing"
    },
    {
        "question": "Can I export my data to CSV?",
        "expected_elements": ["yes", "export button", "dashboard"],
        "category": "features"
    },
    {
        "question": "Does your API support webhooks?",
        "expected_elements": ["yes", "webhook endpoints", "documentation"],
        "category": "technical"
    }
]

How this works: Here, we define a set of representative test cases to evaluate the AI system. Each test case includes the user’s question, a list of key elements expected in the answer, and a category for organization. These cases help ensure the chatbot is tested against real-world scenarios, edge cases, and important information that should appear in responses.
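As the evaluation set grows, it helps to check that every case is well formed and to see which categories are thin. The sketch below is one way to do that. The helper names `validate_test_cases` and `coverage_by_category` are hypothetical, and the sample data includes deliberate mistakes.

```python
from collections import Counter

REQUIRED_KEYS = {"question", "expected_elements", "category"}

def validate_test_cases(cases: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the set is usable."""
    problems = []
    seen_questions = set()
    for i, case in enumerate(cases):
        missing = REQUIRED_KEYS - case.keys()
        if missing:
            problems.append(f"case {i}: missing {sorted(missing)}")
            continue
        if not case["expected_elements"]:
            problems.append(f"case {i}: expected_elements is empty")
        if case["question"] in seen_questions:
            problems.append(f"case {i}: duplicate question")
        seen_questions.add(case["question"])
    return problems

def coverage_by_category(cases: list[dict]) -> Counter:
    """Count cases per category to spot areas with too few examples."""
    return Counter(case.get("category", "uncategorized") for case in cases)

# Sample set with two deliberate mistakes: an empty list and a duplicate.
sample = [
    {"question": "How do I reset my password?",
     "expected_elements": ["settings page"], "category": "account_management"},
    {"question": "What's your refund policy?",
     "expected_elements": [], "category": "billing"},
    {"question": "How do I reset my password?",
     "expected_elements": ["email"], "category": "account_management"},
]
problems = validate_test_cases(sample)
coverage = coverage_by_category(sample)
print(problems)
print(coverage)
```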

Step 3: Evaluate Outputs

Define evaluation criteria based on what matters for your use case: accuracy, faithfulness, safety, relevance, tone. Then measure the output against these criteria.

Evaluation happens in two main ways:

Automated Evaluation

Use programmatic metrics and LLM-as-judge patterns:

def evaluate_response(question: str, response: str, expected_elements: list) -> dict:
    scores = {}

    # 1. Faithfulness: Does response contain expected elements?
    scores['contains_key_info'] = all(
        elem.lower() in response.lower() 
        for elem in expected_elements
    )

    # 2. Relevance: Semantic similarity to question
    scores['relevance'] = calculate_semantic_similarity(question, response)

    # 3. Safety: Check for problematic content
    scores['is_safe'] = not contains_harmful_content(response)

    # 4. Tone: Use LLM-as-judge
    judge_prompt = f"""Rate the helpfulness of this support response on a scale of 1-5.

Question: {question}
Response: {response}

Score (1-5):"""

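    # The judge may wrap the rating in extra text, so take the first digit
    # from 1 to 5. A score of 0 means the reply could not be parsed.
    reply = str(llm.generate(judge_prompt))
    digits = [ch for ch in reply if ch in "12345"]
    scores['helpfulness'] = int(digits[0]) if digits else 0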

    return scores

# Run evaluation
for test_case in test_cases:
    response = answer_support_question(test_case['question'])
    scores = evaluate_response(
        test_case['question'],
        response,
        test_case['expected_elements']
    )
    test_case['scores'] = scores
    test_case['response'] = response

How this works: The evaluate_response() function applies four different checks to each AI response:

First, it verifies faithfulness by checking if all expected elements appear in the response using simple string matching.
Second, it calculates semantic similarity, a measure of how closely the response's meaning matches the intent of the question, using embeddings.
Third, it runs a safety check to flag any problematic content.
Fourth, it uses an LLM as a judge by asking a more powerful model (like GPT-4) to rate the helpfulness of the response on a 1-5 scale.

The loop then runs the evaluation for every test case. It generates a response for each question, evaluates it using the evaluate_response function, and then stores both the scores and the response back in the test case. This creates a complete dataset of test results for analysis and further improvements.
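Once every test case carries its scores, a per-category summary makes weak areas easier to spot. The sketch below assumes results shaped like the output of the loop above. The `summarize` helper and the sample scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def summarize(results: list[dict]) -> dict:
    """Aggregate per-case scores into a pass rate and mean rating per category."""
    by_category = defaultdict(list)
    for case in results:
        by_category[case["category"]].append(case["scores"])
    return {
        category: {
            "key_info_pass_rate": mean(
                1.0 if s["contains_key_info"] else 0.0 for s in scores
            ),
            "mean_helpfulness": mean(s["helpfulness"] for s in scores),
        }
        for category, scores in by_category.items()
    }

# Hypothetical results for three evaluated test cases.
results = [
    {"category": "billing", "scores": {"contains_key_info": True, "helpfulness": 4}},
    {"category": "billing", "scores": {"contains_key_info": False, "helpfulness": 2}},
    {"category": "technical", "scores": {"contains_key_info": True, "helpfulness": 5}},
]
summary = summarize(results)
print(summary)
```

Tracking these numbers over time, rather than reading individual responses, is what lets you notice a slow drop in one category.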

Common Automated Metrics:

Semantic similarity (0.0–1.0): This is measured by converting the question and response into vector embeddings and calculating cosine similarity. The score shows how closely the response matches the intent of the question, even if the wording differs.
ROUGE / BLEU scores: The model’s output is compared to reference answers by checking n-gram overlap. These metrics help spot regressions, though scores can be modest for open-ended answers.
LLM-as-judge: A stronger model (like GPT-4 or Claude) can rate the response on a fixed scale, such as 1–5. These ratings give a sense of quality and are useful for tracking improvements or drops over time.
Retrieval metrics (Precision@k, Recall@k): For retrieval-based systems, these metrics calculate how many relevant documents appear in the top-k results. Precision shows accuracy of the retrieved set, and recall indicates completeness.
Custom validators: Simple rule-based checks, like regex patterns, keywords, or length limits, ensure responses meet hard requirements. These help catch issues automated metrics might miss.
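Several of the metrics above are short enough to write by hand, which can help when you want to understand what a library is computing. The sketch below uses plain Python and makes some simplifying assumptions: embeddings are given as lists of floats, documents are identified by string IDs, and the unigram overlap is a simplified ROUGE-1 recall with whitespace tokenization and no stemming.

```python
import math
from collections import Counter

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        return 0.0
    return sum(doc in relevant for doc in retrieved[:k]) / k

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(doc in relevant for doc in retrieved[:k]) / len(relevant)

def unigram_recall(candidate: str, reference: str) -> float:
    """Share of reference words also found in the candidate (simplified ROUGE-1 recall)."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total = sum(ref_counts.values())
    overlap = sum((cand_counts & ref_counts).values())
    return overlap / total if total else 0.0

retrieved = ["d1", "d7", "d3", "d9"]
relevant = {"d1", "d3", "d5"}
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(precision_at_k(retrieved, relevant, k=4))
print(recall_at_k(retrieved, relevant, k=4))
print(unigram_recall("full refund within 30 days",
                     "we offer a full refund within 30 days"))
```

For production use, an established metrics library is usually the safer choice, since it handles tokenization and edge cases more carefully.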

Manual Evaluation

Automated metrics can't capture everything. Subjective qualities like tone, empathy, and brand voice require human judgment, as do small factual errors that slip past keyword checks and similarity scores.

# Flag cases for human review
needs_review = [
    case for case in test_cases 
    if case['scores']['helpfulness'] < 3 
    or not case['scores']['contains_key_info']
]

# SMEs review and annotate
for case in needs_review:
    annotation = get_sme_feedback(case)
    case['human_rating'] = annotation['rating']
    case['improvement_notes'] = annotation['notes']

This code filters test cases to find responses that need human attention: those scoring below 3 for helpfulness or missing important information. Subject matter experts review these flagged cases and provide ratings with helpful feedback. Their input helps you spot patterns that automated metrics miss and shows you where to improve your prompts, retrieval setup, or system settings.

When to use manual evaluation:

Assessing tone, empathy, or brand voice
Detecting subtle hallucinations automated checks miss
Validating edge cases with domain-specific nuance
Creating ground truth labels for training evaluation models
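The article leaves `get_sme_feedback` undefined, and how reviewers receive cases depends on your team. One low-tech option, sketched here as an assumption, is to export flagged cases to a CSV file with empty columns for the reviewer to fill in. The `write_review_queue` helper and the sample case are hypothetical.

```python
import csv
import io

REVIEW_FIELDS = ["question", "response", "helpfulness", "human_rating", "improvement_notes"]

def write_review_queue(cases: list[dict], out) -> int:
    """Write flagged cases as CSV rows and return how many were written."""
    writer = csv.DictWriter(out, fieldnames=REVIEW_FIELDS)
    writer.writeheader()
    for case in cases:
        writer.writerow({
            "question": case["question"],
            "response": case["response"],
            "helpfulness": case["scores"]["helpfulness"],
            "human_rating": "",        # filled in by the reviewer
            "improvement_notes": "",   # filled in by the reviewer
        })
    return len(cases)

# Hypothetical flagged case in the same shape as the test cases above.
flagged = [
    {
        "question": "What's your refund policy?",
        "response": "We offer refunds in certain cases.",
        "scores": {"helpfulness": 2, "contains_key_info": False},
    }
]

buffer = io.StringIO()  # swap for open("review_queue.csv", "w", newline="") to write a file
written = write_review_queue(flagged, buffer)
print(buffer.getvalue())
```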

Step 4: Learn and Improve

Once you've identified failures, adjust the controllable parts of your AI system (the "configs"):

Common configuration levers:

Prompts — Add instructions, examples, constraints
Retrieval — Change chunk size, top-k, reranking strategy
Model — Switch models, adjust temperature, max tokens
Context — Modify system instructions, add memory
Post-processing — Add validation, formatting, safety filters
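Because each lever changes behavior, it helps to keep them in one versioned object so that every evaluation run can be tied to the exact settings that produced it. The sketch below is one possible shape. The `SystemConfig` class, its fields, and the placeholder model name are assumptions, not part of any library.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace

@dataclass(frozen=True)
class SystemConfig:
    prompt_template: str
    model: str = "example-model"   # placeholder, not a real model name
    temperature: float = 0.2
    top_k: int = 5
    use_reranker: bool = False

    def version(self) -> str:
        """Short, stable fingerprint of the settings, for labelling eval runs."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:8]

baseline = SystemConfig(prompt_template="Answer using the context.")
# Change one lever at a time so a score change can be traced to it.
candidate = replace(baseline, use_reranker=True)

print(baseline.version(), candidate.version())
```

Storing the version string next to each set of scores makes it possible to say which change caused an improvement or a regression.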

Example improvement cycle:

# Problem discovered: Chatbot missing key details
failing_case = {
    "question": "What's your refund policy?",
    "response": "We offer refunds in certain cases.",
    "issue": "Too vague, missing 30-day window and process"
}

# Root cause: Retrieval returning wrong docs
retrieved_docs = retrieve_docs(failing_case['question'], top_k=5)
# Docs about "payment processing" ranked higher than "refund policy"

# Solution 1: Improve retrieval with reranking
def retrieve_docs_v2(question: str, top_k: int) -> list:
    # Initial retrieval
    candidates = vector_search(question, top_k=20)

    # Rerank by relevance
    reranked = rerank_by_relevance(question, candidates)

    return reranked[:top_k]

# Solution 2: Update prompt to require specificity
prompt_v2 = f"""You are a helpful customer support agent.

Context: {context}

Question: {question}

Provide a clear, accurate answer based on the context. Include specific details like:
- Time windows (e.g., "within 30 days")
- Step-by-step processes
- Relevant links or contact methods

Answer:"""

# Re-evaluate
new_response = answer_support_question_v2(failing_case['question'])
new_scores = evaluate_response(
    failing_case['question'],
    new_response,
    ["30 days", "full refund", "contact support"]
)

# Verify improvement
assert new_scores['contains_key_info'] == True
assert new_scores['helpfulness'] >= 4

How this works: In this example, the chatbot's refund answer was too vague. Investigating the failure showed that the retriever ranked docs about payment processing above the refund policy, so the model never saw the right context.

To resolve this, two changes can be made. First, retrieval is improved by grabbing twenty documents, then picking the best five. Second, the prompt is updated to ask for specific details like dates and steps.

After making these changes, the test runs again to confirm it works: the response now has all the key info and scores at least 4 out of 5. This process turns problems into fixes you can measure.

Step 5: Automate and Repeat

Integrate evaluation into your development workflow using CI/CD:

# .github/workflows/eval.yml
name: Continuous Evaluation

on:
  pull_request:
  push:
    branches: [main]

jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run evaluation suite
        run: python run_evals.py

      - name: Check pass rate
        run: |
          PASS_RATE=$(python calculate_pass_rate.py)
          if (( $(echo "$PASS_RATE < 0.85" | bc -l) )); then
            echo "Pass rate $PASS_RATE below threshold"
            exit 1
          fi

      - name: Upload results
        if: always()  # upload results even when the pass-rate check fails
        uses: actions/upload-artifact@v4
        with:
          name: eval-results
          path: results/

Explanation: This GitHub Actions workflow runs your evaluation suite on every code change. It triggers whenever someone opens or updates a pull request or pushes to the main branch. It checks out your code, runs the full evaluation suite with run_evals.py, then calculates the percentage of test cases that passed. If the pass rate drops below 85%, the workflow fails. When this check is marked as required in your branch protection rules, a failing run blocks the merge and keeps quality regressions from reaching production.
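The workflow calls calculate_pass_rate.py but its contents are not shown. A minimal sketch of what such a script could look like follows. The results/results.json path and the "passed" field are assumptions made for illustration; adapt them to whatever run_evals.py actually writes.

```python
import json
from pathlib import Path


def pass_rate(results):
    """Fraction of test cases marked as passed (0.0 when there are none)."""
    if not results:
        return 0.0
    passed = sum(1 for result in results if result.get("passed"))
    return passed / len(results)


def load_results(path):
    """Read a JSON list of per-case results; return [] if the file is missing."""
    results_file = Path(path)
    if not results_file.exists():
        return []
    return json.loads(results_file.read_text())


if __name__ == "__main__":
    # Hypothetical location written by run_evals.py.
    # Print only the number so the shell step can capture it with $(...).
    print(f"{pass_rate(load_results('results/results.json')):.4f}")
```

Treating a missing or empty results file as a pass rate of 0.0 is a deliberate choice in this sketch: the quality gate fails loudly instead of passing when the evaluation step produced nothing.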

Key practices for automation:

Version your test cases — Track them in Git alongside code
Set quality gates — Block deployments if pass rate drops below threshold
Monitor trends — Track metrics over time to catch gradual drift
Alert on regressions — Notify team when specific test cases start failing
Sample production traffic — Continuously add real queries to eval dataset
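To make the "alert on regressions" practice concrete, here is a small sketch that compares two evaluation runs and lists the test cases that used to pass but now fail. The dictionary format (test case ID mapped to a pass/fail boolean) and the case IDs are hypothetical.

```python
def find_regressions(baseline, current):
    """Return IDs of cases that passed in the baseline run but fail now.

    A case that is missing from the current run is also reported,
    since a silently dropped test is worth a look.
    """
    return sorted(
        case_id
        for case_id, passed_before in baseline.items()
        if passed_before and not current.get(case_id, False)
    )


# Hypothetical results from the previous and the latest evaluation run
baseline = {"refund-policy": True, "api-free-plan": True, "shipping-time": False}
current = {"refund-policy": True, "api-free-plan": False, "shipping-time": False}

print(find_regressions(baseline, current))  # → ['api-free-plan']
```

The output of a check like this could feed a chat notification or a CI annotation, so the team sees which specific cases regressed rather than only an aggregate pass rate.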

Tools and Frameworks you can use for evaluation

Several platforms can help implement continuous evaluation. The one you choose depends on your stack and needs:

If you're building with LLMs: Try LangSmith or Braintrust first. Both handle prompt versioning, evaluation datasets, and tracing out of the box.

If you're doing traditional ML: Weights & Biases is a widely used choice for experiment tracking. If you're in the Microsoft ecosystem, PromptFlow integrates well with Azure.

If you want full control: Build custom with pytest for test execution and MLflow for tracking results. More setup, but you own the entire pipeline.

What a Complete Evaluation Loop Looks Like in Practice

This walkthrough shows how a support chatbot improves after running a single cycle of evaluations. Each stage shows how evaluation signals guide improvements and lock in quality for the next release.

Stage	Before	After
Test Case	"Can I use your API on the free plan?"	Same question
Model Response	"Yes, you can access our API."	"Yes, you can access our API on the free plan with a rate limit of 100 requests per day. For higher limits, upgrade to Pro or Enterprise."
Evaluation Scores	contains_key_info=False, helpfulness=2/5	contains_key_info=True, helpfulness=5/5
Issue Identified	Missing crucial detail: free plan rate limits	N/A (issue resolved)
Analysis / Root Cause	Retrieval returned general API docs; prompt didn’t emphasize limitations	N/A (analysis led to fix)
Fixes Applied	1. Improved retrieval to fetch plan comparison docs 2. Updated prompt: "Always mention plan-specific restrictions" 3. Added validation: Response must mention rate limits if asked	N/A (fix implemented)
Outcome	Test failed, regression not prevented	Test passes, regression prevented
Next Cycle Actions	N/A	1. Add this test case to permanent suite 2. Look for similar issues (other plan-related questions) 3. Monitor production queries for this pattern

Next cycle:

Add this test case to permanent suite
Look for similar issues (other plan-related questions)
Monitor if this pattern appears in production queries

Key Takeaways

AI systems need continuous evaluation, not one-time testing — Models drift, data changes, and silent failures accumulate without ongoing checks.
Build evaluation into your workflow from day one — Don't wait until production failures force you to retrofit evaluation.
Start simple, then scale — Begin with 10-20 test cases and basic metrics. Grow your suite as you encounter edge cases.
Automate what you can, involve humans for what you can't — Use programmatic checks for speed, SME review for nuance.
Treat evaluation datasets as first-class artifacts — Version control them, review changes, and grow them over time.
Make evaluation a team sport — Product, engineering, and domain experts should all contribute test cases and evaluation criteria.

Conclusion

Every developer has felt the relief of seeing "all tests passing." In AI systems, that reassurance is often misleading. A model can deploy successfully, meet performance benchmarks, and still produce incorrect, incomplete, or misleading outputs in ways traditional tests miss.

The evaluation flywheel addresses this gap by making model behavior testable in practice. Instead of assuming correctness, it forces the system to answer real questions, measures the quality of those answers, and highlights where performance degrades over time. This shifts evaluation from a one-off validation step into an ongoing part of development.

Evaluation won't eliminate uncertainty completely, but it makes failures visible before they reach users. With failures clearly exposed, teams stop guessing and start fixing based on results. This might mean adjusting prompts, improving retrieval logic, or refining evaluation criteria. Over time, this leads to AI systems that evolve in controlled ways rather than breaking silently.

Resources for further reading

Anthropic's eval guide: https://docs.anthropic.com/en/docs/build-with-claude/develop-tests
OpenAI's evals framework: https://github.com/openai/evals
LangChain evaluation: https://python.langchain.com/docs/guides/evaluation
Arize AI blog: Comprehensive resources on ML observability

freeCodeCamp's New Relational Databases Certification is Now Live

Jessica Wilkins — Fri, 19 Dec 2025 18:10:14 +0000

The freeCodeCamp community just published our new Relational Databases certification. You can now sit for the exam to earn the free verified certification, which you can add to your résumé, CV, or LinkedIn profile.

Each certification is filled with hundreds of hours worth of interactive lessons, workshops, labs, and quizzes.

How Does the New Relational Databases Certification Work?

The new Relational Databases certification will teach you core concepts including Bash scripting, SQL, Git, and more.

The certification is broken down into several modules that include lessons, workshops, labs, review pages, and quizzes to ensure that you truly understand the material before moving on to the next module.

The lessons are your first exposure to new concepts. They provide crucial theory and context for how things work in the software development industry.

At the end of each lesson, there will be three comprehension check questions to test your understanding of the material from the lesson.

After the lesson blocks, you will do the workshops. These workshops are guided step-based projects that provide you with an opportunity to practice what you have learned in the lessons.

These workshops will not be done inside the regular freeCodeCamp editor in the browser. Instead you will need to do these workshops in one of three environments:

GitHub Codespaces: This course runs in a virtual Linux machine using GitHub Codespaces.
Your own local environment: This course runs in a virtual Linux machine on your computer.
Ona: This course runs in a virtual Linux machine using Ona.

After the workshop, you will complete a lab which will help you review what you have learned so far. This will give you a chance to start building projects on your own, which is a crucial skill for a developer. You will be presented with a list of user stories and will need to pass the tests to complete the lab.

At the end of each module, there is a review page containing a list of all of the concepts covered. You can use these review pages to help you study for the quizzes.

The last portion of the module is the quiz. This is a 20-question multiple-choice quiz designed to test your understanding of the material covered in the module. You will need to get 18 out of 20 correct to pass.

Throughout the certification, there will be five certification projects you will need to complete in order to qualify for the exam.

Once you’ve completed all five certification projects, you’ll be able to take the 50-question exam using our new open source exam environment. The freeCodeCamp community designed this exam environment tool with two goals: respecting your privacy while also making it harder for people to cheat.

Once you download the app to your laptop or desktop, you can take the exam.

Frequently Asked Questions

Is all of this really free?

Yes. freeCodeCamp has always been free, and we’ve now offered free verified certifications for more than a decade. These exams are just the latest expansion to our community’s free learning resources.

What prevents people from just cheating on the exams?

Our goal is to strike a balance between preventing cheating and respecting people's right to privacy.

We've implemented a number of reliable, yet non-invasive, measures to help prevent people from cheating on freeCodeCamp's exams:

For each exam, we have a massive bank of questions and potential answers to those questions. Each time a person attempts an exam, they'll see only a small, randomized sampling of these questions.
We only allow people to attempt an exam one time per week. This reduces their ability to "brute force" the exam.
We have security in place to validate exam submissions and prevent man-in-the-middle attacks or manipulation of the exam environment.
We manually review each passing exam for evidence of cheating. Our exam environment produces tons of metrics for us to draw from.

We take cheating, and any form of academic dishonesty, seriously. We will act decisively.

This said, no one's exam results will be thrown out without human review, and no one's account will be banned without warning based on a single suspicious exam result.

Are these exams “open book” or “closed book”?

All of freeCodeCamp’s exams are “closed book”, meaning you must rely only on your mind and not outside resources.

Of course, in the real world you’ll be able to look things up. And in the real world, we encourage you to do so.

But that is not what these exams are evaluating. These exams are instead designed to test your memory of details and your comprehension of concepts.

So when taking these exams, do not use outside assistance in the form of books, notes, AI tools, or other people. Use of any of these will be considered academic dishonesty.

Do you record my webcam, microphone, or require me to upload a photo of my personal ID?

No. We considered adding these as additional test-taking security measures. But we have less privacy-invading methods of detecting most forms of academic dishonesty.

If the environment is open source, doesn't that make it less secure?

"Given enough eyeballs, all bugs are shallow." – Linus’s Law, formulated by Eric S. Raymond in his book The Cathedral and the Bazaar

Open source software projects are often more secure than their closed source equivalents. This is because a lot more people are scrutinizing the code. And a lot more people can potentially help identify bugs and other deficiencies, then fix them.

We feel confident that open source is the way to go for this exam environment system.

How can I contribute to the Exam Environment codebase?

It's fully open source, and we'd welcome your code contributions. Please read our general contributor onboarding documentation.

Then check out the GitHub repo.

You can help by creating issues to report bugs or request features.

You can also browse open help wanted issues and attempt to open pull requests addressing them.

Are the exam questions themselves open source?

For obvious exam security reasons, the exam question banks themselves are not publicly accessible. :)

These are built and maintained by freeCodeCamp's staff instructional designers.

What happens if I have internet connectivity issues mid-exam?

If you have internet connectivity issues mid-exam, the next time you try to submit an answer, you’ll be told there are connectivity issues. The system will keep prompting you to retry submitting until the connection succeeds.

What if my computer crashes mid-exam?

If your computer crashes mid-exam, you’ll be able to re-open the Exam Environment. Then, if you still have time left for your exam attempt, you’ll be able to continue from where you left off.

Can I take exams in languages other than English?

Not yet. We’re working to add multi-lingual support in the future.

I have completed my exam. Why can't I see my results yet?

All exam attempts are reviewed by freeCodeCamp staff before we release the results. We do this to ensure the integrity of the exam process and to prevent cheating. Once your attempt has been reviewed, you'll be notified of your results the next time you log in to freeCodeCamp.org.

I am Deaf or hard of hearing. Can I still take the exams?

Yes! While some exams may include audio components, we do make written transcripts available for reading.

We’re working on it. Our curriculum is fully screen reader accessible. We're still refining our screen reader usability for the Exam Environment app. This is a high priority for us.

I use a keyboard instead of a mouse. Can I navigate the exams using just a keyboard?

This is a high priority for us. We hope to add keyboard navigation to the Exam Environment app soon.

Are exams timed?

Yes, exams are timed. We err on the side of giving plenty of time to take the exam, to account for people who are non-native English speakers, or who have ADHD and other learning differences that can make timed exams more challenging.

If you have a condition that usually qualifies you for extra time on standardized exams, please email support@freecodecamp.org. We’ll review your request and see whether we can find a reasonable solution.

What happens if I fail the exam? Can I retake it?

Yes. You get one exam attempt per week. After you attempt an exam, there is a one-week (exactly 168 hour) “cool-down” period where you cannot take any freeCodeCamp exams. This is to encourage you to study and to pace yourself.

There is no limit to the number of times you can take an exam. So if you fail, study more, practice your skills more, then try again the following week.

Do I need to redo the projects if I fail the exam?

No. Once you’ve submitted a certification project, you do not need to ever submit it again.

You can re-do projects for practice, but we recommend that you instead build some of our many practice projects in freeCodeCamp’s developer interview job search section.

What happens if I already have the old Legacy Relational Database certification? Should I claim the new one?

The new certification has more theory and practice as well as an exam. So if you’re looking to brush up on your skills, then you can go through the new version of this certification.

What will happen to my existing coursework progress on the Full Stack Certification? Does it transfer over to the Relational Databases course?

If you’ve already started the Certified Full Stack Developer Curriculum, all of your previously completed work should already be saved there.

To be clear, we’ve copied over all of the coursework from the full stack certification to this newer certification.

Can I still continue with the current Full Stack Developer Certification and just not do the new certification?

We’ve moved the coursework for the Full Stack Developer Certification over and broken it up into smaller certifications. Currently there are seven courses available for you to go through.

The Certified Full Stack Developer Certification button will remain on the learn page for a short time to give people the opportunity to switch over to the new certifications. Over the next few months, though, this option will disappear.

Will my legacy certifications become invalid?

No. Once you claim a certification, it’s yours to keep.

Also note that we previously announced that freeCodeCamp certifications would have an expiration date and require recertification. We don’t plan to implement this anytime soon. And if we do decide to, we will give everyone at least a year’s notice.

Will the exam be available to take on my phone?

At this time, no. You’ll need to use a laptop or desktop to download the exam environment and take the exam. We hope to eventually offer these certification exams on iPhone and Android.

I have a disability or health condition that is not covered here. How can I request accommodations?

If you need specific accommodations for the exam (for example extra time, breaks, or alternative formats), please email support@freecodecamp.org. We’ll review your request and see whether we can find a reasonable solution.

Anything else?

Good luck working through freeCodeCamp’s coursework, building projects, and preparing for these exams.

Happy coding!

What Firewalls Really Do and Why Every Network (Still) Needs Them

Manish Shivanandhan — Fri, 19 Dec 2025 17:35:15 +0000

Firewalls are one of the oldest tools in network security.

Many people think they are outdated or replaced by newer tools like endpoint security or cloud security platforms, but that’s not the case. Firewalls still play a critical role in protecting networks, systems, and data.

A firewall acts like a security guard at the entrance of a building. It decides what can come in, what can go out, and what should be blocked.

Even though attacks have become more advanced, this basic control point is still essential.

In this article, I’ll explain what firewalls really do, how they work, and why every network still needs them today. We’ll also look at how firewalls have evolved to stay useful in modern cloud and hybrid environments.

What We Will Cover

What a Firewall Is in Simple Terms
What Firewalls Actually Do
How Firewalls Reduce Attack Surface
Firewalls and Internal Network Protection
Setting Up a Firewall
Firewalls in Cloud and Hybrid Networks
Firewalls and Compliance Requirements
Common Misunderstandings About Firewalls
Why Firewalls Still Matter Today
Firewalls as a Foundation, Not a Finish Line
Conclusion

What a Firewall Is in Simple Terms

A firewall is a system that controls network traffic based on rules. These rules define which connections are allowed and which are denied. The firewall sits between trusted systems and untrusted networks, most often between an internal network and the internet.

When data tries to move across the network, the firewall checks it. If the data follows the rules, it’s allowed through. If it breaks the rules, it’s blocked or logged for review.

Firewalls can be hardware devices, software programs, or cloud-based services. No matter the form, the goal is the same: they reduce risk by limiting exposure.

What Firewalls Actually Do

At the most basic level, a firewall filters traffic. It looks at details like IP addresses, ports, and protocols. For example, it can allow web traffic on port 443 but block unused or risky ports.
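To illustrate the idea of filtering on IP address, port, and protocol, here is a toy model of rule matching. It is a simplified sketch, not how any real firewall is implemented: the `Rule` class and `decide` function are hypothetical names, and the addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    action: str    # "allow" or "deny"
    source: str    # CIDR block, e.g. "0.0.0.0/0" for any IPv4 address
    port: int
    protocol: str  # "tcp" or "udp"


def decide(rules, src_ip, port, protocol, default="deny"):
    """Return the action of the first matching rule, else the default policy."""
    for rule in rules:
        if (ip_address(src_ip) in ip_network(rule.source)
                and port == rule.port
                and protocol == rule.protocol):
            return rule.action
    return default


rules = [
    Rule("allow", "0.0.0.0/0", 443, "tcp"),      # HTTPS from anywhere
    Rule("allow", "203.0.113.0/24", 22, "tcp"),  # SSH from one trusted range
]

print(decide(rules, "198.51.100.7", 443, "tcp"))  # → allow
print(decide(rules, "198.51.100.7", 22, "tcp"))   # → deny
print(decide(rules, "203.0.113.10", 22, "tcp"))   # → allow
```

The important design choice is the default: anything that matches no rule is denied, so forgetting a rule fails closed rather than open.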

Modern firewalls go much further. They can inspect traffic at a deeper level. This is called deep packet inspection. Instead of just checking where traffic comes from, the firewall looks at what the traffic contains.

Firewalls can also track connections over time. This is known as stateful inspection. The firewall understands whether traffic is part of a valid conversation or an unexpected request. This helps stop many common attacks.
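The core of stateful inspection can be sketched as a table of known conversations. The `StatefulFilter` class below is a hypothetical, heavily simplified model: real firewalls also track protocol state (such as the TCP handshake) and expire idle entries, which this example leaves out.

```python
class StatefulFilter:
    """Allow inbound packets only when they answer a connection we opened."""

    def __init__(self):
        # Each entry is (local_ip, local_port, remote_ip, remote_port)
        self.connections = set()

    def outbound(self, local_ip, local_port, remote_ip, remote_port):
        # Remember the conversation so replies can be recognised later
        self.connections.add((local_ip, local_port, remote_ip, remote_port))
        return "allow"

    def inbound(self, remote_ip, remote_port, local_ip, local_port):
        # A reply is valid only if it mirrors a tracked outbound connection
        key = (local_ip, local_port, remote_ip, remote_port)
        return "allow" if key in self.connections else "deny"


fw = StatefulFilter()
fw.outbound("10.0.0.5", 50000, "203.0.113.20", 443)

print(fw.inbound("203.0.113.20", 443, "10.0.0.5", 50000))  # → allow
print(fw.inbound("198.51.100.9", 443, "10.0.0.5", 50000))  # → deny
```

The second packet targets the same local port but comes from a host the server never contacted, so it is treated as an unexpected request and dropped.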

Another important job of a firewall is logging. Firewalls record what they allow and what they block. These logs are vital for audits, investigations, and compliance needs.

How Firewalls Reduce Attack Surface

Attack surface means the number of ways an attacker can try to get into a system. Firewalls reduce this by closing unnecessary paths.

Most systems don’t need to expose all services to the internet. A firewall ensures that only required services are reachable. Everything else stays hidden.

Even if an application has a weakness, a firewall can reduce the chance that attackers ever reach it. This doesn’t replace secure coding, but it adds a strong layer of defense.

This layered approach is known as defence in depth. Firewalls are a core layer in that strategy.

Firewalls and Internal Network Protection

Many people think firewalls are only for the network edge. That is no longer true. Internal firewalls are now just as important.

Inside a network, different systems have different risk levels. A database should not be freely accessible from every workstation. Firewalls help enforce this separation.

This practice is often called network segmentation. By placing firewalls between network segments, organizations limit how far an attacker can move if they gain access to one system.
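As a sketch of how segmentation rules can be reasoned about, the example below maps addresses to segments and allows only explicitly listed flows between them. The segment names, address ranges, and ports are assumptions chosen for illustration.

```python
from ipaddress import ip_address, ip_network

# Hypothetical network segments
SEGMENTS = {
    "workstations": ip_network("10.0.10.0/24"),
    "app_servers": ip_network("10.0.20.0/24"),
    "databases": ip_network("10.0.30.0/24"),
}

# Only these (source segment, destination segment, port) flows are permitted
ALLOWED_FLOWS = {
    ("workstations", "app_servers", 443),
    ("app_servers", "databases", 5432),
}


def segment_of(ip):
    """Return the name of the segment containing ip, or None if unknown."""
    address = ip_address(ip)
    for name, network in SEGMENTS.items():
        if address in network:
            return name
    return None


def is_allowed(src_ip, dst_ip, port):
    """Check whether traffic may cross from the source to the destination segment."""
    return (segment_of(src_ip), segment_of(dst_ip), port) in ALLOWED_FLOWS


print(is_allowed("10.0.10.15", "10.0.30.8", 5432))  # → False (workstation to database)
print(is_allowed("10.0.20.4", "10.0.30.8", 5432))   # → True (app server to database)
```

With rules like these, a compromised workstation cannot reach the database directly; an attacker would first have to get through the application tier.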

Internal firewalls are especially important in large environments, data centers, and cloud platforms.

Setting Up a Firewall

To make this practical, let’s look at a real, working example using UFW, an open source firewall available on most Linux systems. These are actual commands you would run on a server.

We will assume a simple use case: the server should allow secure web traffic on port 443 and allow SSH access for administration. All other incoming traffic should be blocked.

First, make sure you have UFW installed:

sudo apt update
sudo apt install ufw

Before enabling the firewall, define the default behaviour. Blocking all incoming traffic by default is a safe baseline. Outgoing traffic is allowed so the server can still reach external services.

sudo ufw default deny incoming
sudo ufw default allow outgoing

Next, allow SSH access. This is important so you don’t lock yourself out of the server.

sudo ufw allow ssh

If you prefer to be explicit about the port, you can allow port 22 directly.

sudo ufw allow 22/tcp

Now allow HTTPS traffic so users can reach the web application.

sudo ufw allow 443/tcp

At this point, the rules permit only SSH and HTTPS. Once the firewall is enabled, all other incoming traffic will be blocked by the default policy.

You can review the rules before enabling the firewall.

sudo ufw status verbose

When you are satisfied with the rules, enable the firewall like this:

sudo ufw enable

Once enabled, UFW immediately starts enforcing the rules.

To confirm everything is working, check the status again.

sudo ufw status numbered

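Logging gives visibility into blocked connections (and, at higher log levels, allowed ones), which is useful for security monitoring and audits. On many installations, UFW logging is already on at the low level once the firewall is enabled; the command below makes sure it is turned on.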

sudo ufw logging on

UFW also supports simple protection against brute force attacks. For example, you can rate limit SSH connections.

sudo ufw limit ssh

This rule allows normal usage but denies connections from an IP address that has attempted six or more connections within the last 30 seconds.
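The logic behind this kind of rate limit can be sketched as a sliding window per IP address. The `ConnectionLimiter` class is a hypothetical illustration, not UFW's implementation. Its defaults follow the behaviour described in the UFW manual, which denies an address once it has made six or more connection attempts in the last 30 seconds. In this sketch, denied attempts also count toward the window.

```python
from collections import defaultdict, deque


class ConnectionLimiter:
    """Deny an IP once it reaches max_attempts within window_seconds."""

    def __init__(self, max_attempts=6, window_seconds=30):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.attempts = defaultdict(deque)  # ip -> timestamps of recent attempts

    def allow(self, ip, now):
        recent = self.attempts[ip]
        # Forget attempts that have fallen out of the time window
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(now)
        return len(recent) < self.max_attempts


limiter = ConnectionLimiter()

# Seven attempts, one per second, from the same address
results = [limiter.allow("198.51.100.7", now=t) for t in range(7)]
print(results)  # → [True, True, True, True, True, False, False]

# After a quiet period the address is allowed again
print(limiter.allow("198.51.100.7", now=60))  # → True
```

Passing the timestamp in as an argument keeps the example deterministic; a real service would read a monotonic clock instead.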

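If you need to restrict access to a service by IP address, UFW supports that as well. Keep in mind that the earlier allow ssh rule would still let every address connect, so delete it if you want the restriction to take effect. For example, this allows SSH only from a trusted office IP: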

sudo ufw allow from 203.0.113.10 to any port 22 proto tcp

You can remove or change rules as your requirements evolve. For example, to delete a rule using its number, do this:

sudo ufw delete 3

This setup shows what a firewall actually looks like in practice. You define defaults, allow only what is required, enable logging, and enforce the rules.

Even though enterprise firewalls and cloud firewalls use more advanced interfaces, the underlying logic is the same. Clear rules control traffic flow, reduce attack surface, and provide visibility. Open source tools like UFW make these concepts easy to understand and apply in real systems.

Firewalls in Cloud and Hybrid Networks

Cloud computing changed how networks are built, but it did not remove the need for firewalls. In fact, it increased their importance.

In cloud environments, firewalls are often provided as managed services. They may be called security groups, network security rules, or cloud firewalls. The name changes, but the role is the same.

Hybrid networks combine on-premise systems with cloud systems. Firewalls control traffic between these environments. They help enforce consistent security rules across locations.

Without firewalls, cloud resources would be exposed directly to the internet. That would be risky and costly.

Firewalls and Compliance Requirements

Many industries have strict security rules. Banks, healthcare providers, and large enterprises must follow regulations. Firewalls help meet these requirements.

Regulations often require control over network access. They also require logging and monitoring. Firewalls provide both.

Auditors frequently ask for firewall configurations and logs. A well-managed firewall setup makes audits easier and reduces compliance risk.

Even small companies benefit from these controls. Security standards are not only for large enterprises anymore.

Common Misunderstandings About Firewalls

One common myth is that firewalls stop all attacks, but this isn’t true. Firewalls aren’t magic shields. They are one part of a broader security strategy.

Another misunderstanding is that firewalls slow networks down. Modern firewalls are built for high performance. When configured correctly, the impact is minimal.

Some believe that endpoint security replaces firewalls. Endpoint tools protect individual devices. Firewalls protect the network paths between them. Both are needed.

Understanding these limits helps teams use firewalls effectively instead of relying on them blindly.

Why Firewalls Still Matter Today

Cyber attacks are more frequent and more automated than ever. Exposed systems are scanned constantly. Firewalls provide the first line of resistance.

New technologies don’t remove the need for boundaries. Even zero-trust models rely on strict access controls, often enforced by firewall-like systems.

Every network, no matter the size, benefits from clear rules about who can talk to whom. Firewalls enforce those rules reliably and visibly.

Without firewalls, organisations would rely only on application security and user behaviour. That’s not enough in today’s threat landscape.

Firewalls as a Foundation, Not a Finish Line

It’s important to see firewalls as a foundation. They create a secure base on which other controls can work better.

Security monitoring, incident response, and threat detection all depend on controlled traffic flows. Firewalls make these systems more effective.

When something goes wrong, firewall logs often provide the first clues. They show what happened at the network level.

This makes firewalls valuable not just for prevention, but also for understanding and recovery.

Conclusion

Firewalls are not outdated tools from the past. They are still essential for protecting modern networks. They control access, reduce attack surface, support compliance, and enable strong security design.

While technology keeps changing, the need to control network traffic does not go away. Firewalls have adapted to cloud, hybrid, and complex environments.

Every network still needs a firewall. Not as the only defense, but as a critical part of a layered security approach. When used correctly, firewalls continue to do what they have always done best: keep the right doors open and keep the wrong ones closed.

How to Build a Real-time AI Gym Coach with Vision Agents

Ekemini Samuel — Fri, 19 Dec 2025 17:29:13 +0000

Computer vision is transforming how people train, from at-home workouts to smart gym mirrors.

Imagine walking into your home gym, turning on your camera, and having an AI coach that sees your movements, counts your reps, and corrects your form in real time.

That's exactly what we're building in this tutorial: a real-time gym companion and fitness coach.

We'll integrate Vision Agents' low-latency video inference to detect movement patterns, count reps, and give instant voice feedback like "Straighten your back!" or "Keep your form tight!", just like a human trainer would.

Here is a demo video of the AI gym companion during a workout session:

Prerequisites

Python 3.13 or higher
API keys for:
- Gemini (for real-time LLM with vision)
- Stream (for video/audio infrastructure)
- Alternatively: OpenAI (if using OpenAI Realtime instead)
Code editor like VS Code or Windsurf

Setting Up the Project

Create a new directory on your computer called gym_buddy. You can also do it directly in your terminal with this command:

mkdir gym_buddy

Then open the directory in your IDE (for this guide, I’m using Windsurf IDE).

If you don’t have uv (a fast Python package installer and resolver) installed on your computer, install it with this command:

pip install uv

Note: After installing uv, you can also run uv init to set up the project with sample files and a pyproject.toml file containing the project metadata.

Next, we’ll create the pyproject.toml file. This is a configuration file for Python projects that specifies build system requirements and other project metadata. It's a standard file used by modern Python packaging tools.

Enter the code below:

[project]
name = "gym-buddy"
version = "0.1.0"
requires-python = ">=3.13"
dependencies = [
    "python-dotenv>=1.0",
    "vision-agents",
    "vision-agents-plugins-openai",
    "vision-agents-plugins-getstream",
    "vision-agents-plugins-ultralytics",
    "vision-agents-plugins-gemini",
]

# Note: these relative paths only resolve if this project sits inside a local
# checkout of the Vision Agents repository. If it does not, remove this section
# so the packages are installed from the package index instead.
[tool.uv.sources]
"vision-agents" = {path = "../../agents-core", editable=true}
"vision-agents-plugins-deepgram" = {path = "../../plugins/deepgram", editable=true}
"vision-agents-plugins-ultralytics" = {path = "../../plugins/ultralytics", editable=true}
"vision-agents-plugins-openai" = {path = "../../plugins/openai", editable=true}
"vision-agents-plugins-getstream" = {path = "../../plugins/getstream", editable=true}
"vision-agents-plugins-gemini" = {path = "../../plugins/gemini", editable=true}

You can also create a requirements.in file with just the direct dependencies, like so:

python-dotenv>=1.0
vision-agents
vision-agents-plugins-openai
vision-agents-plugins-getstream
vision-agents-plugins-ultralytics
vision-agents-plugins-gemini

Then install the dependencies with uv:

uv sync

This will generate the uv.lock from the uv package manager that handles the project’s dependencies and builds.

If you are using a Windows OS, you might come across a dependency installation error, particularly with NumPy. This is likely due to missing build tools on your system.

Why NumPy is required

NumPy is a Python library for numerical computing. In this project, it’s used by the computer-vision and AI components (such as YOLO-based detection and Vision Agents) to handle image data, bounding boxes, coordinates, and other numerical outputs produced during real-time video analysis.

Many of the libraries used here depend on it for fast array operations and mathematical computations. That’s why NumPy is installed as part of the setup and why issues with its installation can affect the entire pipeline.

To resolve it, install Visual Studio Build Tools (required for building Python packages with C extensions). During installation, make sure that you select "Desktop development with C++". This installs all the necessary build tools.

Visual Studio displays like this after the installation is done. You may need to restart your computer for the updates to take effect.

Now run this command in your terminal:

python -m pip install -e .

The command above installs all the necessary dependencies for the project.

How to Get Your API Keys

For this project, we need to get API keys from Stream and Gemini/OpenAI.

To get your Stream API key, go ahead and sign up with your preferred method.

Then, navigate to your dashboard and click 'Create App' to create a new app for the AI gym companion.

Enter the name for the app, choose the environment (Development/Production), select a region, and click on ‘Create App’.

After creating the app, click on the dashboard overview tab in the left sidebar, then navigate to the Video tab and click on "API Keys". Copy your API key and secret, and save them securely.

To get your Gemini API key, visit the Google AI Studio website, then click on Get started.

Then, go to your dashboard and click on 'Create API key'.

Enter a name for the key, then create a new project for the API key.

After you have created the new API key, copy it and save it securely.

Building the AI gym companion

Now that you have the API keys you’ll need for the AI gym companion, create a .env file in the project’s root directory and add all the API keys like so:

GEMINI_API_KEY=your_gemini_key
STREAM_API_KEY=your_stream_key
STREAM_API_SECRET=your_stream_secret

If you’re using OpenAI instead of Gemini, also add:

OPENAI_API_KEY=your_openai_key
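
A missing or empty key is an easy mistake to make at this step. The sketch below is one way to check for it before starting the agent. The helper name missing_keys and the REQUIRED_KEYS tuple are illustrative and not part of Vision Agents. It assumes the Gemini setup described above, and that the values are already in the environment (for example, after load_dotenv() has run):

```python
import os
from typing import Mapping

# Keys the tutorial's .env file defines for the Gemini + Stream setup.
# (Hypothetical constant, introduced here only for this check.)
REQUIRED_KEYS = ("GEMINI_API_KEY", "STREAM_API_KEY", "STREAM_API_SECRET")


def missing_keys(env: Mapping[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or blank in env."""
    return [key for key in required if not env.get(key, "").strip()]


if __name__ == "__main__":
    # os.environ only contains the .env values once they have been loaded,
    # for example by load_dotenv() from python-dotenv.
    missing = missing_keys(os.environ)
    if missing:
        print("Missing keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```

If you use OpenAI instead of Gemini, you could pass a different tuple as the required argument.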

This is the project and codebase structure for the gym companion app we are building:

In the root directory, create an empty __init__.py file. This file makes Python treat the directory as a package. You can add a comment to the file as a reminder, like so:

# This file makes Python treat the directory as a package.

Next, create a gym_buddy.py file. This is the main app file, containing agent setup and call joining logic for the Gym Companion. Enter the code below in the file:

import logging

from dotenv import load_dotenv
from vision_agents.core import User, Agent, cli
from vision_agents.core.agents import AgentLauncher
from vision_agents.plugins import getstream, ultralytics, gemini

logger = logging.getLogger(__name__)

# Load the API keys from the .env file into the environment
load_dotenv()


async def create_agent(**kwargs) -> Agent:
    agent = Agent(
        edge=getstream.Edge(),  # use stream for edge video transport
        agent_user=User(name="AI gym companion"),
        instructions="Read @gym_buddy.md",  # read the gym buddy markdown instructions
        llm=gemini.Realtime(fps=3),  # Share video with gemini
        # llm=openai.Realtime(fps=3), use this to switch to openai (and add openai to the plugins import)
        processors=[
            ultralytics.YOLOPoseProcessor(model_path="yolo11n-pose.pt")
        ],  # realtime pose detection with yolo
    )
    return agent


async def join_call(agent: Agent, call_type: str, call_id: str, **kwargs) -> None:
    call = await agent.create_call(call_type, call_id)
    # join the call and open a demo env
    with await agent.join(call):
        await agent.llm.simple_response(
            text="Say hi. After the user does their exercise, offer helpful feedback."
        )
        await agent.finish()  # run till the call ends


if __name__ == "__main__":
    cli(AgentLauncher(create_agent=create_agent, join_call=join_call))

Then create a gym_buddy.md file. This is an instructions file for the gym agent's coaching guide, which it will follow when analysing the workouts and providing real-time feedback. Enter the markdown code below:

You are a voice fitness coach. You will watch the user's workout and offer feedback.
The video clarifies the body position using Yolo's pose analysis, so you'll see their exact movement.
Speak with a high-energy, motivating tone. Be strict about form but encouraging. Do not give feedback if you are not sure or do not see an exercise.
# Gym Workout Coaching Guide
## 1. Introduction
A fitness coach's primary responsibility is to ensure safety and efficacy in every movement. While everybody is different, the fundamental mechanics of human movement—stability, alignment, and range of motion—remain constant. By monitoring key checkpoints like spinal alignment, joint tracking, and tempo, coaches can guide athletes toward stronger, injury-free workouts. The following guidelines break down the core compound movements into phases, with clear teaching points and coaching cues.
## 2. The Squat: Setup and Stance
The squat is the king of lower-body exercises, but it starts before the descent. The athlete should stand with feet shoulder-width apart or slightly wider, toes pointed slightly outward (5-30 degrees). The spine must be neutral, chest proud, and core braced. Coaches should watch for collapsing arches in the feet or a rounded upper back. A solid setup creates the tension needed for a powerful lift.
## 3. The Squat: Descent (Eccentric Phase)
The movement begins by breaking at the hips and knees simultaneously. The hips should travel back and down, as if sitting in a chair, while the knees track in line with the toes. Coaches must ensure the heels stay glued to the floor. Common errors include "knee valgus" (knees caving in) or the torso collapsing forward. The descent should be controlled and deliberate.
## 4. The Squat: Depth and Reversal
"Depth" is achieved when the hip crease drops below the top of the knee (parallel). While not everyone has the mobility for this, it is the standard for a full range of motion. At the bottom, the athlete should maintain tension—no bouncing or relaxing. The reversal (concentric phase) is driven by driving the feet into the floor and extending the hips and knees, exhaling forcefully.
## 5. The Push-up: The Plank Foundation
A perfect push-up is essentially a moving plank. The setup requires hands placed slightly wider than shoulder-width, directly under the shoulders. The body must form a straight line from head to heels. Coaches should watch for sagging hips (lumbar extension) or piking hips (flexion). Glutes and quads should be squeezed tight to lock the body into a rigid lever.
## 6. The Push-up: Mechanics
As the athlete lowers themselves, the elbows should track back at roughly a 45-degree angle to the torso, forming an arrow shape, not a "T". The chest should descend until it nearly touches the floor. The neck must remain neutral—no reaching with the chin. The push back up should be explosive, fully extending the arms without locking the elbows violently.
## 7. The Lunge: Step and Stability
The lunge challenges balance and unilateral strength. Whether forward or reverse, the step should be long enough to allow both knees to bend to approximately 90 degrees at the bottom. The feet should remain hip-width apart throughout the movement, like moving on train tracks, not a tightrope. Coaches should look for wobbling or the front heel lifting off the ground.
## 8. The Lunge: Alignment
In the bottom position, the front knee should be directly over the ankle, not shooting far past the toes (though some forward travel is acceptable). The torso should remain upright or have a very slight forward lean; collapsing over the front thigh is a fault. The back knee should hover just an inch off the ground. Drive through the front heel to return to the start.
## 9. Tempo and Control
Time under tension builds muscle and control. Coaches should encourage a specific tempo, such as 2-0-1 (2 seconds down, 0 pause, 1 second up). Rushing through reps often masks muscle imbalances and relies on momentum rather than strength. If an athlete speeds up, cue them to "slow down and own the movement."
## 10. Breathing Mechanics
Proper breathing stabilises the core. The general rule is to inhale during the eccentric phase (lowering) and exhale during the concentric phase (lifting/pushing). For heavy lifts, the Valsalva manoeuvre (bracing the core with a held breath) may be appropriate, but for general fitness, rhythmic breathing ensures oxygen delivery and blood pressure management.
## 11. Common Faults and Fixes
- **Squat - Butt Wink**: Posterior pelvic tilt at the bottom. Fix: Limit depth or improve hamstring/ankle mobility.
- **Push-up - Winging Scapula**: Shoulder blades popping up. Fix: Push the floor away at the top (protraction) and engage serratus anterior.
- **Lunge - Valgus Knee**: Front knee collapsing in. Fix: Cue "push the knee out" and engage the glute medius.
- **General - Ego Lifting**: Sacrificing form for reps or weight. Fix: Regress the exercise or slow the tempo.

How the AI Agent works

Now that the instruction file for the AI agent is set up, let’s look at how the agent-creation code works with it. In gym_buddy.py, the agent is created and initialised with specific components like so:

from vision_agents.core import User, Agent
from vision_agents.plugins import getstream, ultralytics, gemini


async def create_agent(**kwargs) -> Agent:
    # Initialize video transport (Stream's edge)
    edge = getstream.Edge()

    # Set up AI components
    llm = gemini.Realtime(fps=3)  # shares video with Gemini
    pose_processor = ultralytics.YOLOPoseProcessor(model_path="yolo11n-pose.pt")

    # Create agent with instructions
    return Agent(
        edge=edge,
        agent_user=User(name="AI gym companion"),
        instructions="Read @gym_buddy.md",  # Loads coaching instructions
        llm=llm,
        processors=[pose_processor],
    )

The gym_buddy.md file contains structured instructions that guide the gym companion agent's behaviour. In condensed form, this kind of structure looks like so:

## Coaching Style
- Be encouraging and positive
- Provide clear, actionable feedback
- Focus on one correction at a time

## Squat Form
- Keep chest up and back straight
- Knees should track over toes
- Lower until thighs are parallel to ground
- Push through heels to stand

## Safety Guidelines
- Stop user if a dangerous form is detected
- Suggest modifications for beginners
- Remind to keep core engaged

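These instructions are referenced through the instructions="Read @gym_buddy.md" parameter in the gym_buddy.py file. The agent uses the contents of this file to guide how it analyses your form during the workout session and provides feedback. Conceptually, each video frame goes through a flow like the simplified sketch below:

# Processing video frames (simplified, conceptual sketch)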
async def process_frame(self, frame):
    # Analyze pose using YOLO
    poses = await self.pose_processor.process(frame)

    # Generate feedback based on instructions
    feedback = await self.llm.generate_feedback(
        poses=poses,
        instructions=self.instructions
    )
    return feedback

When giving feedback, the agent compares the detected poses with the ideal form from the markdown. Then, it generates natural language feedback using the specified tone and style. The safety guidelines in the gym_buddy.md are checked first, then specific form corrections are mentioned by the agent.
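
To make the idea of comparing a detected pose against ideal form more concrete, here is a minimal sketch of the kind of geometry that pose keypoints make possible. In the app itself, this interpretation is left to the LLM, so this is not the agent's implementation. The function names, the (x, y) point format, and the 100-degree threshold are all assumptions chosen for illustration:

```python
import math

Point = tuple[float, float]  # (x, y) position of a keypoint in the frame


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Return the angle at b, in degrees, between the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("a and c must both differ from b")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


def squat_is_deep_enough(hip: Point, knee: Point, ankle: Point,
                         max_knee_angle: float = 100.0) -> bool:
    """Treat the squat as deep enough when the knee angle is at or below
    the threshold. The default threshold is illustrative only."""
    return joint_angle(hip, knee, ankle) <= max_knee_angle
```

A standing leg (hip, knee, and ankle roughly in a line) gives an angle close to 180 degrees, and the angle shrinks as the user lowers into the squat.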

To add a new exercise, you can update the gym_buddy.md file with a new section like so:

## Push-up Form
- Keep body in a straight line
- Lower until chest nearly touches floor
- Push through palms to return up
- Keep core engaged

The agent will automatically incorporate these instructions the next time it runs. This makes it easy to update and expand the agent's capabilities by simply editing the markdown file.
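
Before restarting the agent, it can help to confirm that the new section is actually in the file. The helper below is a small sketch for that. The name list_sections is hypothetical, and it assumes that each exercise section starts with a level-2 (##) heading, as in the examples above:

```python
import re


def list_sections(markdown_text: str) -> list[str]:
    """Return the titles of all level-2 (##) headings, in document order."""
    pattern = re.compile(r"^##\s+(.+)$", flags=re.MULTILINE)
    return [match.group(1).strip() for match in pattern.finditer(markdown_text)]


if __name__ == "__main__":
    # Assumes gym_buddy.md is in the current working directory.
    with open("gym_buddy.md", encoding="utf-8") as f:
        for title in list_sections(f.read()):
            print(title)
```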

You can view the complete code for the AI Gym Companion in the GitHub repository.

How to Run the App

First, create a virtual environment in Python with this command:

python -m venv venv

This creates the venv directory.

Then activate the virtual environment. On Windows, run:

.\venv\Scripts\activate

On macOS or Linux, run source venv/bin/activate instead.

Now run the AI agent with this command:

uv run gym_buddy.py

You can also start the app with this command:

python gym_buddy.py

It begins loading like so:

The AI agent will:

Create a video call
Open a demo UI in your browser
Join the call and start watching
Ask you to do a squat exercise
Analyse your moves and positions, and then provide feedback

The terminal output above also shows that Gemini is connected.

The agent then loads in your browser like so:

It also displays a pop-up modal that introduces the Vision Agents. You can skip the intro or click on Next to proceed.

Vision Agents uses Stream's global edge network to keep call latency low. This lets the AI gym companion give real-time feedback on the exercises users perform.

The AI gym companion can also provide chat messages on the exercises through the chatbox displayed on the right side of the UI. This is provided through the chat SDK/API.

When you perform a squat, the Vision Agent (powered by Gemini) analyses the video frames in real-time. It detects the completion of the movement and triggers the send_rep_count tool. This instantly updates the exercise counter on your screen and provides an encouraging text and voice response!
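
In the app, rep detection is handled by the model and the send_rep_count tool. To illustrate what "detecting the completion of a movement" can mean in code, here is a minimal sketch of a threshold-based counter with hysteresis. The class name RepCounter and both angle thresholds are assumptions for illustration, and this is not how the agent counts reps internally:

```python
class RepCounter:
    """Counts reps from a stream of knee angles (in degrees).

    A rep is counted when the angle first drops to down_angle or below
    and then rises back to up_angle or above. Using two thresholds
    prevents small jitters around one value from counting as reps.
    """

    def __init__(self, down_angle: float = 100.0, up_angle: float = 160.0):
        self.down_angle = down_angle
        self.up_angle = up_angle
        self.is_down = False  # True once the bottom position was reached
        self.count = 0

    def update(self, knee_angle: float) -> int:
        """Feed one angle measurement and return the current rep count."""
        if not self.is_down and knee_angle <= self.down_angle:
            self.is_down = True
        elif self.is_down and knee_angle >= self.up_angle:
            self.is_down = False
            self.count += 1
        return self.count


if __name__ == "__main__":
    counter = RepCounter()
    for angle in [170, 140, 95, 90, 130, 165, 172, 120, 98, 150, 168]:
        counter.update(angle)
    print(counter.count)  # two full down-and-up cycles in this sequence
```

A half squat that never reaches the lower threshold is not counted, which mirrors the coaching guide's emphasis on full range of motion.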

Here is a demo video of the AI gym companion during a workout session:

You can also copy the link and share it, or scan the QR code below to test the Gym Companion on your mobile phone.

If you want to test it on your phone, install the Stream Video calls app for iOS devices for a better mobile experience.

Next Steps

In this tutorial, you’ve learned how to build an AI gym companion using Vision Agents.

The Real-Time Gym Companion illustrates how vision AI unlocks human-like interactivity by merging:

Video perception (seeing)
LLM understanding (thinking)
Speech feedback (speaking)

This low-latency technology lets you create real-time fitness apps that give instant feedback, much like a personal trainer would.

You can check out more project use cases with Vision Agents in the GitHub repository.


