Tarun Singh - freeCodeCamp.org

How to Get Type Safety Without Code Generation Using tRPC and Hono

Tarun Singh — Mon, 12 Jan 2026 17:53:20 +0000

Have you ever updated your backend API property name but neglected to also update the frontend? I'm sure you have. When this occurs, it leads to production crashes and unhappy customers, plus you've wasted your entire week fixing the problem.

To resolve this issue in the past, you typically had to create a multitude of TypeScript interfaces by hand, using GraphQL Code Generator to generate the interface files, or hope that it all worked out. Well, there’s a better way to accomplish this now, without the need for code generation.

tRPC and Hono are two applications that are changing how we develop TypeScript-based applications throughout the entirety of the full-stack.

By the end of this tutorial, you’ll understand:

Why traditional REST APIs fail at type safety
How tRPC provides full end-to-end type inference between backend and frontend
How Hono delivers type-safe APIs while staying REST-friendly
When to choose tRPC vs Hono for your projects
How these tools improve developer experience, team velocity, and reliability

If you’re building full-stack TypeScript applications and want fewer runtime bugs and faster iteration, this guide is for you.

Prerequisites
The Problem with Traditional APIs
What Makes tRPC Different?
Hono: The Lightweight Challenger
Why This Matters These Days
Getting Started
The Future is Type-Safe

Prerequisites

To follow along comfortably, you should have:

Basic knowledge of TypeScript
Familiarity with REST APIs and how frontend-backend communication works
Some experience with Node.js and modern JavaScript frameworks
A general understanding of frontend frameworks like React or Next.js (helpful, but not required)

You don’t need prior experience with tRPC, Hono, or GraphQL.

The Problem with Traditional APIs

You’ve probably written something like this a hundred times:

// Backend (Express)
app.post('/api/users', (req, res) => {
  const { name, email } = req.body;
  // Do stuff with user
  res.json({ id: 1, name, email });
});

// Frontend
const createUser = async (name: string, email: string) => {
  const response = await fetch('/api/users', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name, email })
  });
  return response.json(); 
};

The backend knows the shape of the data. But the frontend...it hopes it gets it right. You end up writing interfaces manually, like:

interface User {
  id: number;
  name: string;
  email: string;
}

If you change the backend tomorrow to return userId instead of id, TypeScript won't catch it. Your types and reality have diverged, and you won't know until runtime.

GraphQL tried to solve this with schemas and codegen, but honestly? Setting up GraphQL feels like assembling IKEA furniture without instructions. You need a schema, resolvers, code generation tools, and suddenly your "simple" API has a 30-minute setup process.

What Makes tRPC Different?

tRPC flips the script entirely. Instead of defining your API in a separate schema language, your TypeScript code is the schema. Here's the same API in tRPC:

// Backend (tRPC router)
import { initTRPC } from '@trpc/server';
import { z } from 'zod';

const t = initTRPC.create();

export const appRouter = t.router({
  createUser: t.procedure
    .input(z.object({
      name: z.string(),
      email: z.string().email(),
    }))
    .mutation(({ input }) => {
      // Do stuff with user
      return { id: 1, name: input.name, email: input.email };
    }),
});

export type AppRouter = typeof appRouter;

This is where it gets cool. On your frontend:

// Frontend - fully type-safe!
import { createTRPCClient } from '@trpc/client';
import type { AppRouter } from './server';

const client = createTRPCClient({
  url: 'http://localhost:3000/trpc',
});

// TypeScript knows EVERYTHING about this call
const user = await client.createUser.mutate({
  name: 'Alice',
  email: 'alice@example.com'
});

// user is automatically typed as { id: number; name: string; email: string; }

No code generation or build step, or GraphQL schema. Just pure TypeScript inference doing its thing. If you rename id to userId in your backend, your frontend will immediately show a TypeScript error. You'll catch it before you even save the file.

This is what we call end-to-end type safety, and it's honestly a great transition.

Hono: The Lightweight Challenger

While tRPC is amazing for full-stack TypeScript apps where you control both ends, Hono takes a slightly different approach. It's a lightweight web framework that gives you type safety while still being a traditional HTTP framework.

Here's the same example in Hono:

import { Hono } from 'hono';
import { z } from 'zod';
import { zValidator } from '@hono/zod-validator';

const app = new Hono();

const userSchema = z.object({
  name: z.string(),
  email: z.string().email(),
});

app.post('/api/users', zValidator('json', userSchema), (c) => {
  const { name, email } = c.req.valid('json');
  return c.json({ id: 1, name, email });
});

export type AppType = typeof app;

On the frontend, you can use Hono’s RPC client:

import { hc } from 'hono/client';
import type { AppType } from './server';

const client = hc('http://localhost:3000');

const response = await client.api.users.$post({
  json: { name: 'Bob', email: 'bob@example.com' }
});

const user = await response.json();
// user is fully typed!

Hono is incredibly fast (it runs on Cloudflare Workers, Deno, Bun, and Node.js), and it gives you that sweet type safety while still being a "regular" HTTP framework. You get RESTful routes, middleware, and all the familiar patterns – just with TypeScript powers.

Why This Matters These Days

You might think to yourself, “Okay, I know what you mean, but why should I care about it?” There’s a reason why these tools are being utilized more now than ever before.

Developer experience is essential

In 2026 and beyond, we’ll no longer accept long feedback loops. The ability to modify your backend code and see what might break on your frontend application without having to run the application will be fantastic for productivity. We’ll spend less time fixing bugs and more time creating new functionalities.

Smaller teams, better apps

With tRPC or Hono, one developer can create an entire full-stack application with type safety at a very fast pace because they don’t have to switch back and forth between REST documentation and TypeScript interfaces – all the data is flowing to and from their backend code directly to their frontend.

The end of “Works on my machine“

With type safety, errors are caught at compile time instead of at the time your end-user clicks on a button. This is especially impactful when working in larger teams, when the backend developers and front-end developers may not be in constant communication with one another.

Getting Started

Want to try this out? Here's the fastest way:

For tRPC:

npm create @trpc/next-app@latest

This scaffolds a Next.js app with tRPC already configured. Check out the official tRPC docs for more.

For Hono:

npm create hono@latest

Pick your runtime (Node.js, Cloudflare Workers, etc.), and you're off to the races. The Hono documentation is excellent and super approachable.

The Future is Type-Safe

Look, REST isn't going anywhere, and GraphQL has its place. But for full-stack TypeScript developers, tRPC and Hono represent something special: type safety without the ceremony. No code generation or no schema duplication, just TypeScript doing what it does best.

In the future, when you start a new project, give one of these a shot. Your future self – the one who's refactoring code at 2 AM – will thank you.

Happy coding!

How to Use Nano Banana for Image Generation - Explained with Code Examples

Tarun Singh — Fri, 19 Sep 2025 13:20:23 +0000

AI is changing the image generation and editing process into a smooth workflow. Now, with just a single prompt, you can tell your computer to generate or edit an existing image. Google just launched its new model for image generation or editing, "Nano Banana" – Gemini 2.5 Flash. It's a powerful, nimble tool that's changing how we think about image generation and manipulation, and it's something you'll definitely want in your developer toolkit.

In this article, you will learn how to use “Nano Banana” for Image Generation using Gemini’s 2.5 Flash Image. So, let’s get started!

What is "Nano Banana"?
- Why "Nano Banana"?
Setting Up Your Project
Beyond the Basics: What Else Can You Do?
Wrapping Up

What is "Nano Banana"?

Nano Banana is the latest image-editing cum generation tool from Google DeepMind. Forget the formal jargon for a second. Imagine you have an incredibly talented, lightning-fast artist at your beck and call. You can describe anything to them – "an astronaut riding a horse on the Moon" – and poof, it appears. Or, you hand them a picture of your dog and say, "Make the dog wear a cap on his head," and they do it instantly, keeping your cat looking like your dog.

That's essentially Nano Banana. It's an advanced AI model from the Gemini family, specifically engineered for rapid, intelligent image generation and nuanced editing. It understands your natural language commands, enabling you to bring complex visual ideas to life or make surgical changes to existing images with surprising ease.

Why "Nano Banana"?

Because it's small (flash!), packed with goodness, and leaves you feeling like you just peeled back a new layer of creative possibility. It's fast, efficient, and incredibly versatile.

The Superpowers You Get:

Prompt-Perfect Editing: Want to change a background, alter a pose, or add a specific object? Just ask. Nano Banana understands and executes.
Character Consistency: This is a big one. If you're creating a story or a series of images, maintaining the look of a specific character or object is crucial. Nano Banana excels at this, ensuring your protagonist looks the same whether they're in a forest or on the moon.
Visual Mashups (Multi-Image Fusion): Got a few different visual elements you want to combine seamlessly? It can blend them into a cohesive new image.

and much more!

Interested? Let's get our hands dirty. But wait! To use “Nano Banana, “ you have two ways to do this:

Using Google AI Studio: The simplest and easiest way to generate or edit images in Google Studio. This is a web-based tool that gives you direct access to the Gemini models without writing a single line of code. It's the absolute best place to test and start, and is useful for developers and non-developers, also. Also, there's no need to install libraries, manage API keys, or write any code
Building with the Gemini API: This is beneficial if you want more custom solutions for your application. For any serious application—whether it's a web app, a mobile app, or a backend service—you'll need to integrate directly with the Gemini API. This is where the real power lies, as it allows you to automate tasks and create interactive experiences.

In this tutorial, you will see how we can use this tool in our own applications, using nothing but Python. So, let’s get started.

How to Set Up Your Project

Step 1: Get an API key from Google Gemini

The very first step for using “Nano Banana” is to get an API key. Head over to Google AI Studio, click on “Create API key“, and generate a new one by specifying a project from your existing Google Cloud projects.

Once you have generated an API key, save it securely somewhere.

Step 2: Install the SDK and Other Dependencies

Open your terminal and run:

pip install google-generativeai pillow python-dotenv

We’ll use Pillow for easy image handling and python-dotenv to safely manage our API key.

Step 3: Set Up Your Environment

It’s crucial to keep your API key out of your code for security. For this, we usually use environment variables. So, create a file named .env in your project root and add your API key:

GEMINI_API_KEY="YOUR_API_KEY_HERE"

Step 4: Image Generation & Editing

Example 1: Text-to-Image Generation

Text-to-Image is like an artist who can draw anything you describe. In this, you simply write the prompt (a sentence or a description), even a very detailed one, and the AI will generate a unique, high-quality image that matches your description. It’s perfect for bringing your most imaginative ideas to life with just a few words.

import os
import google.generativeai as genai
from PIL import Image
from io import BytesIO
from dotenv import load_dotenv

# Configuration
load_dotenv()
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel('gemini-2.5-flash-image-preview')

# Prompt, Image, and Response Setup
prompt = "A golden retriever puppy sitting in a field of daisies, bright and cheerful"
output_filename = "text_to_image_result.png"

# saving image helper function from text prompt response
def save_image_from_response(response, filename):
    """Helper function to save the image from the API response."""
    if response.candidates and response.candidates[0].content.parts:
        for part in response.candidates[0].content.parts:
            if part.inline_data:
                image_data = BytesIO(part.inline_data.data)
                img = Image.open(image_data)
                img.save(filename)
                print(f"Image successfully saved as {filename}")
                return filename
    print("No image data found in the response.")
    return None

def main():
    print(f"Generating image for prompt: '{prompt}'...")
    response = model.generate_content(prompt)
    save_image_from_response(response, output_filename)

if __name__ == "__main__":
    main()

Output:

The code used in the example handles everything needed to communicate with the Gemini API and save the image.

First, we import the required libraries and load the API key from .env using load_dotenv(). This makes the key available so we can connect to Google’s service with genai.configure().
The model we’re using is gemini-2.5-flash-image-preview, which is designed for fast image generation.
We define a prompt (“A golden retriever puppy...”) and a filename for saving the image.
The helper function save_image_from_response(...) looks at the API’s response, extracts the raw image data, and saves it as a PNG file.
In main(), we call the model with the prompt, then pass the response to the helper function to save the result.
The if __name__ == "__main__": block ensures the script runs only when executed directly, not when imported.

Example 2: Image-to-Image Editing

Image-to-Image is like a photo editor. Instead of starting from scratch, you can upload an existing picture and describe how to change it. For instance, you can request background removal, addition of new objects, or even a complete artistic style change.

import os
import google.generativeai as genai
from PIL import Image
from io import BytesIO
from dotenv import load_dotenv

# Configuration
load_dotenv()
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel('gemini-2.5-flash-image-preview')

# Prompt, Image, and Response Setup
input_image_path = "input_dog.png"
prommpt = "Make the dog wear a small wizard hat and spectacles."
output_filename = "edited_image_result.png"

# saving image helper function from text prompt response
def save_image_from_response(response, filename):
    """Helper function to save the image from the API response."""
    if response.candidates and response.candidates[0].content.parts:
        for part in response.candidates[0].content.parts:
            if part.inline_data:
                image_data = BytesIO(part.inline_data.data)
                img = Image.open(image_data)
                img.save(filename)
                print(f"Image successfully saved as {filename}")
                return filename
    print("No image data found in the response.")
    return None

def main():
    print(f"Editing image '{input_image_path}' with prompt: '{prommpt}'...")
    try:
        img_to_edit = Image.open(input_image_path)
        response = model.generate_content([prommpt, img_to_edit])
        save_image_from_response(response, output_filename)
    except FileNotFoundError:
        print(f"Error: The file '{input_image_path}' was not found.")

if __name__ == "__main__":
    main()

Output:

This code is very similar to the first example, but the key difference is in the core logic.

input_image_path: This variable now holds the file path to the image you want to edit.
Image.open(input_image_path): This line uses the Pillow library to open your local image file to be used.
model.generate_content([prommpt, img_to_edit]): This is the most important part. Unlike before, we now pass a list to the generate_content function that contains both the text prompt and the image object. This tells the API to use the provided image as a starting point for its generation.
try...except block: Here, we are handling the errors. It tries to open the image file, and if it fails (because the file isn't there), it will except the FileNotFoundError and print a friendly message to the user instead of crashing.

Example 3: Multi-Image Fusion

Multi-image fusion is like merging two or more images or objects. Upload several images and instruct the AI to blend them into one composite picture seamlessly. This is a tool for creating new scenes, combining people and backgrounds, or creating detailed product mockups.

import os
import google.generativeai as genai
from PIL import Image
from io import BytesIO
from dotenv import load_dotenv

# Configuration
load_dotenv()
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel('gemini-2.5-flash-image-preview')

# Prompt, Images, and Response Setup
image1_path = "dog_image.png"
image2_path = "cap_image.png"
prompt = "Make the dog from the first image wear the cap from the second image. The cap should fit realistically on the dog's head."
output_filename = "dog_with_cap_result.png"

def save_image_from_response(response, filename):
    """Helper function to save the image from the API response."""
    if response.candidates and response.candidates[0].content.parts:
        for part in response.candidates[0].content.parts:
            if part.inline_data:
                image_data = BytesIO(part.inline_data.data)
                img = Image.open(image_data)
                img.save(filename)
                print(f"Image successfully saved as {filename}")
                return filename
    print("No image data found in the response.")
    return None

def main():
    print(f"Fusing images '{image1_path}' and '{image2_path}'...")
    try:
        img1 = Image.open(image1_path)
        img2 = Image.open(image2_path)
        response = model.generate_content([prompt, img1, img2])
        save_image_from_response(response, output_filename)
    except FileNotFoundError:
        print("Error: One or both image files were not found.")

if __name__ == "__main__":
    main()

Output:

The logic of the code above is an extension of the Image-to-Image example.

image1_path and image2_path: These variables hold the paths to the two images you want to fuse or merge.
model.generate_content([prompt, img1, img2]): Here, the list passed to the generate_content function contains three items: the text prompt and both image objects. This tells the AI to use the prompt to combine the elements from both images into a single output.

Example 4: Image Restoration

This feature can restore old, faded, or damaged photos. Upload a picture and request Gemini to restore it. This includes sharpening low-quality images, colorizing old black-and-white photos, and enhancing textures, which can make your memories look new again.

import os
import google.generativeai as genai
from PIL import Image
from io import BytesIO
from dotenv import load_dotenv

# Configuration
load_dotenv()
genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel('gemini-2.5-flash-image-preview')

# Prompt, Image, and Response Setup
input_image_path = "old_photo.png"
prompt = "Restore this old, faded photograph. Sharpen the details, remove any scratches or damage, and enhance the colors to make it look like a new, high-quality photo."
output_filename = "restored_image_result.png"

def save_image_from_response(response, filename):
    """Helper function to save the image from the API response."""
    if response.candidates and response.candidates[0].content.parts:
        for part in response.candidates[0].content.parts:
            if part.inline_data:
                image_data = BytesIO(part.inline_data.data)
                img = Image.open(image_data)
                img.save(filename)
                print(f"Image successfully saved as {filename}")
                return filename
    print("No image data found in the response.")
    return None

def main():
    print(f"Attempting to restore image: '{input_image_path}'...")
    try:
        old_photo = Image.open(input_image_path)
        response = model.generate_content([prompt, old_photo])
        save_image_from_response(response, output_filename)
    except FileNotFoundError:
        print(f"Error: The file '{input_image_path}' was not found.")

if __name__ == "__main__":
    main()

Output:

The structure here is identical to the Image-to-Image Editing example because, from a technical perspective, image restoration is a form of image-to-image editing.

Now the prompt is where the magic happens. The text prompt explicitly tells the model what to do with the image, outlining the restoration steps like "sharpen the details," "remove scratches," and "enhance the colors." The model's intelligence allows it to understand these abstract instructions and apply them to the visual data to give you a better and a realistic update to your old image.

Beyond the Basics: What Else Can You Do?

This is just the tip of the iceberg! Nano Banana is incredibly versatile. Here are some ideas for where you can take your projects:

Batch Processing: Automate the generation of multiple images from a list of prompts.
Creative Assets: Design icons, backgrounds, or character sprites for games or apps directly from your Python script.
Data Processing: Integrate Nano Banana into a data pipeline to programmatically edit or generate images based on data inputs.
AI Art Galleries: Build a backend service that allows users to submit prompts and receive images.

Wrapping Up

"Nano Banana" (Gemini 2.5 Flash Image) isn't just a cool tech tool; it's a practical, powerful tool for developers and creatives alike. With just a few lines of code, you can tap into its capabilities and bring your visual ideas to real life. This streamlined approach makes it easy to get started, experiment, and integrate this visual magic into your projects.

If you found this article helpful and want to discuss AI development, LLMs, or software development, feel free to connect with me on X/Twitter, LinkedIn, or check out my portfolio on my Blog. I regularly share insights about AI, development, technical writing, and much more.

Happy coding, and may your creations be as vibrant as a field of fresh bananas!

Prompt Engineering Cheat Sheet for GPT-5: Learn These Patterns for Solid Code Generation

Tarun Singh — Fri, 12 Sep 2025 10:30:29 +0000

When large language models like ChatGPT first became widely available, a lot of us developers felt like we’d been handed a new superpower. We could use LLMs to help us develop new coding projects, build websites, and much more – just using a few prompts.

LLMs were like a tireless, super knowledgeable pair programmer that could conjure code out of thin air. We’d type a quick, messy request, and out would pop something that...kind of worked. It was amazing, but also a little frustrating. The code might be buggy, inefficient, or completely miss the subtle context of our project.

But with GPT-5, the game has changed quite a bit. This model doesn’t just spit out code – it reasons, adapts, and understands context like never before. Still, here’s the catch: you need to speak its language to be able to generate the best output. But how? That’s where prompt engineering comes in.

In this article, I’ll share 10 proven patterns that will help you transform GPT-5 from a helpful tool into a rock-solid coding partner you can trust for accuracy and speed. Let’s get started!

What is GPT-5? Why You Should Use It as a Developer?
Why Prompt Engineering?
How to Use GPT-5 for Free?
Patterns Every Developer Should Know
Common Pitfalls to Avoid
Final Thoughts

What is GPT-5? Why You Should Use It as a Developer?

OpenAI recently launched one of its best models, GPT-5. It’s capable of performing coding and agentic tasks across various domains. Think of it as a full-stack, super-intelligent intern who’s been given a master key to the internet's knowledge. It's not just better at writing code, it can under why you need the code, how it should fit into a larger system, and how to debug it.

It excels at:

Long-context reasoning: It can handle an entire codebase or a lengthy API documentation, a game-changer for refactoring or fixing bugs across multiple files.
Instruction following: It’s far less likely to get confused by a long list of constraints or a detailed set of steps.
Tool use and agentic tasks: It can intelligently decide to call an external API, execute a shell command, or search a repository to complete a task.

Why Prompt Engineering?

Think of LLMs as junior developers: super smart, but literal. The way you phrase your request drastically changes the output. Prompt engineering is the art and science of crafting effective instructions for an LLM to achieve a specific goal. It’s the method you use to communicate your intent, provide necessary context, and structure your request in a way that the model can most accurately understand and respond to. When you master it, you can:

Make GPT-5 generate working, testable code.
Avoid vague or irrelevant answers.
Save tokens (and money).
Reduce the time spent editing or debugging outputs.

How to Use GPT-5 for Free

While the API for GPT-5 is a paid service, many developers can access its power for free or at a low cost. Now, for example, the default public version of ChatGPT often uses the version of GPT-5 with certain usage caps. Many tools like Cursor, GitHub Copilot, Microsoft Copilot integrate GPT-5 or lighter variants.

See the screenshot below of the Cursor IDE with integration of various models, including gpt-5-fast, gpt-5-low, and so on. If you’re experimenting, this is the easiest way to explore GPT-5 without paying for direct API calls.

For this article, we'll use a standard API call structure, but these same principles apply whether you're using a web interface or an integrated tool. Let’s dive into the patterns.

Patterns Every Developer Should Know

Persona Pattern

You know how, when you're interviewing a candidate, you might ask them to act as if they're a "Engineering Lead or Manager" or a "Frontend engineer"? This pattern is the same idea. By assigning the model a role, you give it an immediate set of assumptions and a knowledge filter.

To effectively craft a persona, be specific. For example, instead of saying "You are a developer," try "You are a senior JavaScript developer specializing in backend APIs and scalability." This provides context on their skill level, their domain, and their preferred programming language, guiding the LLM toward a more tailored and expert-level response.

Example:

# Python Example
from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="""You are a senior JavaScript developer. 
    Refactor this code for readability:
    numbers = [8, 9, 10, 11, 12]; total=0
    for i in numbers: total+=i
    print(total)"""
)

print(response.output_text)

This code ensures answers match the tone and expertise you expect, as specified in the prompt.

Few-Shot Pattern

Sometimes, the best way to get a specific style or format of code is to provide an example. This is called "few-shot" prompting. Instead of just describing what you want, you show the model a few completed examples.

Example:

from openai import OpenAI

client = OpenAI()

prompt = """
Convert functions to arrow syntax:

Example:
function sum(x, y) { return x + y; }
=> const sum = (x, y) => x + y;

Then convert:
function greet(name) { return "Hey, " + name; }
"""

response = client.responses.create(
    model="gpt-5",
    input=prompt
)

print(response.output_text)

This code example provides a concrete, undeniable pattern for the model to follow, which is much more effective than a verbose description.

Chain-of-Thought Pattern

When faced with a complex problem, humans don't just jump to a solution instead, we think through the steps. The Chain-of-Thought pattern asks the LLM to do the same. By telling the model to “think step by step,” you're not just requesting a final answer but you're instructing it to perform internal reasoning and break down the problem into smaller, logical parts. This process is what gives you room to debug.

If the final output is incorrect, you can review its thought process to identify where the logic went wrong. This is particularly effective with GPT-5's enhanced reasoning capabilities. The LLM's reasoning might look like an intermediate, internal monologue you don't always see, but asking it to print its thought process can make it explicit.

Example:

prompt = """
Debug the below step by step:
My Python function loop skips the last element of the list. Check why?
"""

By encouraging reasoning, you reduce errors in the code.

Delimiter Pattern

When you’re giving the LLM instructions, it’s important to give it a clear way to differentiate your instructions from the data you want it to process. To do this, you can use delimiters like ###, """, or <> wrapped around your input text to create a clean boundary. This is a general best practice for all LLMs, as they all can struggle with this distinction without a clear signal.

Example:

prompt = """
Explain this code in simple and easy English:

###
for i in range(10):
    print(i**3)
###
"""

This helps prevent the model from misinterpreting your data as part of the instructions, particularly when the data contains instruction-like strings.

Structured Output Pattern

If you need the model's response to be easily parseable by a program, you must specify the format clearly. This is particularly important when you want to use the output as an input for a different part of your software, such as generating JSON configuration files, XML for web services, or even markdown (MD) files for documentation. By telling the model to adhere to a rigid structure, you ensure the output is consistent and reliable.

Example:

import json
from openai import OpenAI

client = OpenAI()

def generate_product_list(product_info):
    prompt = f"""
    Generate a JSON object for the following product information.
    The JSON should have a 'products' key, which is an array of objects.
    Each object should have keys for 'name', 'category', 'price', and 'in_stock' (a boolean).

    Product Information:
    {product_info}

    Provide only the JSON output, and nothing else.
    """

    response = client.responses.create(
        model="gpt-5",
        input=prompt
    )

    # Try to parse the response as JSON
    try:
        json_output = json.loads(response.output_text)
        return json_output
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return None

# Let's try it out
product_data = """
Laptop Pro, Electronics, 1500, True
Ergo Mouse, Accessories, 50, True
Wireless Keyboard, Accessories, 90, False
"""

product_list = generate_product_list(product_data)
if product_list:
    print(json.dumps(product_list, indent=2))

In this example, the prompt is the instruction you give to the LLM. It's a text string that outlines a clear task and specifies the output format (a JSON object with specific keys). The response from the model is the raw text it generates, which should be the JSON object you requested. The Python code then attempts to parse this raw text response into a structured JSON object using json.loads().

Flipped Interaction Pattern

Sometimes, the best way to get GPT-5 to help you is to have it ask you some questions before it writes any code.

Example:

prompt = """
I want a python script to scrape travel websites for travelling data.
Ask me 5 clarifying questions before writing the code.
"""

This type of prompt helps prevent assumptions and will provide more accurate code.

Negative Constraint Pattern

While it’s important to tell the model what it should do, it’s also sometimes as important to tell it what it should not do or what it shouldn’t include in its response. This helps the model avoid certain words, tones, or topics.

Example:

from openai import OpenAI

client = OpenAI()

def my_func(technical_report):
    prompt = f"""
    Summarize the following technical report for a non-technical audience. 
    Do not use any specialized jargon, acronyms, or complex terms. 
    Use simple, everyday language.

    Technical Report:
    "{technical_report}"
    """
    response = client.responses.create(
        model="gpt-5",
        input=prompt
    )
    return response.output_text

# Let's try it out
report = (
    "The quantum entanglement protocol (QEP) showed significant improvements "
    "in qubit coherence by utilizing a novel multi-photon emission cascade. "
    "The data indicates a 12% reduction in decoherence rates, validating the "
    "hypothesis that non-linear optical feedback could mitigate environmental noise."
)

summary = my_func(report)
print(summary)

This pattern is a great way to fine-tune the output and steer it away from common pitfalls, overly technical language, and so on, ensuring it meets your specific requirements.

Tool Use Pattern

GPT-5 is an incredible reasoning engine, but its real power comes when it can interact with external tools, like a web search, a code interpreter, or a file retrieval system. This pattern involves providing the model with a clear description of the tools it can or should use.

Example:

prompt = """
You have access to a 'code_interpreter' tool.
Its purpose is to execute JavaScript code in a secure sandbox.
The tool takes a single argument: the JavaScript code as a string.

Your task is to use this tool to calculate the area of a rectangle 
with a length and breadth as 15.
After you get the result, respond with only the final answer number.
"""

This is what unlocks GPT-5's potential for true agentic behavior. It can autonomously solve a problem by deciding which tools to use and in what order, moving beyond simple text generation.

Verbosity Pattern

Depending on your needs, you might want more or less concise output from the LLM. With the GPT-5 API, you can adjust the level of detail and length of the output with the use of the new text.verbosity parameter. Just select the level of text.verbosity as low, medium, or high.

Example:

from openai import OpenAI

client = OpenAI()

# Low Verbosity for a concise function
def get_concise_code(description):
    prompt = f"Write a Python function for {description}."
    response = client.responses.create(
        model="gpt-5",
        input=prompt,
        metadata={"verbosity": "low"} 
    )
    return response.output_text

user_input = "a quicksort algorithm"

concise_code = get_concise_code(user_input)

print("Concise Code-\n", concise_code)

This saves you time by preventing the model from "over-explaining" when you just need a quick snippet, and it gives you more context when you're learning something new or working with a complex piece of code.

Code-as-Context Pattern

GPT-5’s massive context window is a game-changer for working with a full file or even a small project. Instead of just giving it a snippet, you can feed it an entire script and ask it to analyze, refactor, or optimize it.

Example:

async def my_optimize_codebase(code_file: str) -> str:
    prompt = f"""
    You are a performance optimization expert. Analyze the following JavaScript 
    code file for potential performance bottlenecks, redundant code, or memory leaks. 
    Provide a detailed report and then a refactored version of the code.

    Code to analyze:
    \"\"\"
    {code_file}
    \"\"\"
    """
    # For this demonstration, we'll just return the prompt
    return prompt


# User input: "your text input here"
my_code = """
// A large, unoptimized JavaScript file
const fetchData = async () => {
  const data = await fetch('https://api.example.com/data');
  const jsonData = await data.json();
  const filteredData = jsonData.filter(item => item.isActive);
  const mappedData = filteredData.map(item => {
    return {
      id: item.id,
      name: item.name.toUpperCase(),
      status: 'active'
    };
  });

  // This is a loop that could be more efficient
  const res= [];
  for (let i = 0; i < mappedData.length; i++) {
    for (let j = 0; j < 10000; j++) {
      res.append(mappedData[i])
    }
  }
  return res;
};
"""

import asyncio

async def main():
    prompt = await my_optimize_codebase(my_code)
    print(prompt)

asyncio.run(main())

This prompt allows GPT-5 to see the full picture. It can understand variable scope, function dependencies, and the overall logic of a file in a way that’s impossible with a single, isolated snippet.

Common Pitfalls to Avoid

Being Vague or Ambiguous: A prompt such as “Write some code” will result in a response that lacks focus and is generic. Make sure to clarify which programming language, the specific function, output format, and any limitations that may be required.
Overloading a Single Prompt: An example “Write a Python script, summarize it in three bullet points, and then translate it into French” has multiple unrelated tasks and will commonly generate disorganized or incomplete reports. Focus on complex requests and break them down into a series of prompts.
Failing to Iterate: Usually, your first prompt is hardly the most accurate or relevant to the topic of discussion. A general approach is to focus on the prompts generated and go over the concerns of the first sentence as a response. Take into consideration to elaborate, incorporate more facts, and refine, hence have a conversation back and forth to achieve the desired result.

Final Thoughts

With GPT-5, prompt engineering is much more complex than locating a “magic” phrase. You need to shift your thinking to software engineering and articulate it for the AI. You are not merely instructing the AI – you are defining the parameters within which it should work to arrive at an efficient solution.

You can put these 10 patterns, along with the new features of reasoning effort and verbosity control, to make GPT-5 a dependable coding assistant: generating boilerplate code, debugging, code refactoring, or app scaffolding. Start improving your prompt engineering technique with lower models like GPT-4o, Gemini, and others. Once you are ready, upgrade to GPT-5 to power real-world dev workflows.

How to Build an AI Study Planner Agent using Gemini in Python

Tarun Singh — Fri, 05 Sep 2025 15:19:14 +0000

The world is shifting from simple AI chatbots answering our queries to full-fledged systems that are capable of so much more. AI Agents can not only answer our queries but can also perform tasks we give them independently, making them much more powerful and useful.

In this tutorial, you’ll build an advanced, web-based agent that serves as your Virtual Study Planner. This AI agent will be able to understand your goals, make decisions, and act to achieve them.

This project goes beyond basic conversation. You’ll learn to build a goal-based agent with two key capabilities:

Memory: The agent will remember your entire conversation history, allowing it to provide follow-up advice and adapt its plans based on your feedback.
Tool Use: The agent will be capable of using a search tool to find relevant online resources, making it a more powerful assistant than one that relies solely on its internal knowledge.

You’ll learn to create a complete system with a simple web UI built with Flask and Tailwind CSS, providing a solid foundation for building even more complex agents in the future. So, let’s get started.

Prerequisites
Tools You'll Be Using to Build this Agent
Understanding AI Agents
- What are AI Agents? How many types are there?
- How AI Agents is unique compared to other AI tools?
How to Set Up Your Environment
How to Build the Real-Time Agent Logic
- Create the Gemini Client (with web search)
- Create the Flask Backend and Frontend
How to Test the AI Agent
Wrapping Up

Prerequisites

Before following this tutorial, you should have:

Basic Python knowledge
Basics of web development
Python 3+ is installed on your machine
Installed VS Code or another IDE of your choice

Tools You'll Be Using to Build this Agent

To build this study planner agent, you'll need a few components:

Google Gemini API: This is the core AI service that provides the generative model. It allows our agent to understand natural language, reason, and generate human-like responses.
Flask: This is a lightweight web framework for Python. We’ll use it to create our web server (that is, the backend). Its primary purpose here is to handle web requests from the user's browser, process them, and send back a response.
Tailwind CSS: This is a CSS framework for building the user interface (that is, the frontend). Instead of writing custom CSS, you use pre-defined classes like bg-blue-300, m-4, and so on, to style the page directly in your HTML.
Python-dotenv: This library helps us manage environment variables.
DuckDuckGo Search: This library provides a simple way to perform real-time web searches. It acts as the "tool" for our AI agent. When a user asks a question that requires external information, our agent can use this tool to find relevant resources on the web and use that information to formulate a response.

Understanding AI Agents

Before jumping into the code, let’s cover the basics so you understand what an AI agent is and what it’s capable of.

What Are AI Agents? How Many Types Are There?

An AI agent is software that can autonomously perform tasks on a user’s behalf. AI agents perceive their surroundings, process information, and act to achieve the user’s goals. Unlike fixed programs, an agent can reason and adapt.

There are a few different types of agents, including:

Simple Reflex (acts on current input, like a thermostat)
Model-Based (uses an internal map, like robot vacuums)
Goal-Based (plans to reach goals, like a study planner)
Utility-Based (chooses best outcomes, like trading bots)
Learning Agents (improve over time, like recommendation systems).

How Are AI Agents Unique Compared to Other AI Tools?

AI agents use technologies like LLMs, but they’re distinct because of their autonomy and ability to act. Let’s understand these different types of AI tools in more detail:

Large Language Models (LLMs): LLMs are the brain of the operation. They’re trained on a very large dataset to understand and process user queries in natural language to generate human-like output. OpenAI’s GPT, Google’s Gemini, and Anthropic’s Claude are all examples of LLMs.
Retrieval-Augmented Generation (RAG): RAG is a process or a technique that allows LLMs to not only get their information from training data but also from external sources, like a database or document library, to answer user queries. While RAG retrieves information, it doesn't independently decide to perform an action or plan a sequence of steps to achieve a goal.
AI Agents: As explained above, agents are the systems that can perform user tasks using LLMs as their core reasoning engine. An agent’s full architecture allows it to perceive its environment, plan, act, and learn (memory, based on past interactions).

In this tutorial, you are going to use an LLM (Gemini) to reason, as well as a web search engine, DuckDuckGo search, for building the agent. So, now let’s move on to the next step.

How to Set Up Your Environment

Before you can build your Virtual Study Planner AI agent, you’ll need to set up your development environment. Here are the steps you’ll need to follow:

1. Create a Project Directory

First, create a new folder with any name and move to that directory:

mkdir study-planner
cd study-planner

2. Create a Virtual Environment

In Python, it’s always recommended to work in a virtual environment. So, create one and activate it like this:

python -m venv venv

Now activate the virtual environment:

# macOS/Linux
source venv/bin/activate

# Windows
venv\Scripts\activate

3. Install Dependencies

We’ll need a couple of packages or dependencies to build the AI study planner agent, and they include:

flask: web server
google-generativeai: Gemini client
python-dotenv: load GEMINI_API_KEY from .env
requests: useful HTTP helper (nice to have)
duckduckgo-search: real web search

You can install them with a single command:

pip install flask google-generativeai python-dotenv requests duckduckgo-search

4. Get Your Gemini API Key

Go to Google AI Studio and create a new account (if you don’t have one already).

Next, get yourself a new API key by clicking the Create API Key from the API Keys section.

NOTE: Once the API Key is generated, SAVE it somewhere else. You may not get the same API key again.

5. Add Your Key to the `.env` File

Create a .env file inside backend/ and add your API key.

GEMINI_API_KEY=your_api_key_here

Now you should have set up your development environment successfully. You’re ready to build the Virtual Study Planner AI agent. Let’s start!

How to Build the Real-Time Agent Logic

The core of this project is a continuous loop that accepts user input, maintains a conversation history, and sends that history to the Gemini API to generate a response. This is how we give the agent memory.

Create the Gemini Client (with web search)

Create a new file at backend/gemini_client.py:

# backend/gemini_client.py
import os
from typing import List, Dict
import google.generativeai as genai
from dotenv import load_dotenv
from duckduckgo_search import DDGS

# Load environment variables
load_dotenv()

# function uses a query string and duckduckgo_search library to perform a web search
def perform_web_search(query: str, max_results: int = 6) -> List[Dict[str, str]]:
    """Perform a DuckDuckGo search and return a list of results.

    Each result contains: title, href, body.
    """
    results: List[Dict[str, str]] = []
    try:
        with DDGS() as ddgs:
            for result in ddgs.text(query, max_results=max_results):
                # result keys typically include: title, href, body
                if not isinstance(result, dict):
                    continue
                title = result.get('title') or ''
                href = result.get('href') or ''
                body = result.get('body') or ''
                if title and href:
                    results.append({
                        'title': title,
                        'href': href,
                        'body': body,
                    })
        return results
    except Exception as e:
        print(f"DuckDuckGo search error: {e}")
        return []

# A class that manages the interaction with the Gemini API and core agent logic 
class GeminiClient:
    def __init__(self):
        try:
            genai.configure(api_key=os.getenv('GEMINI_API_KEY'))
            self.model = genai.GenerativeModel('gemini-1.5-flash')
            self.chat = self.model.start_chat(history=[])
        except Exception as e:
            print(f"Error configuring Gemini API: {e}")
            self.chat = None

    def generate_response(self, user_input: str) -> str:
        """Generate an AI response with optional web search when prefixed.

        To trigger web search, start your message with one of:
        - "search: "
        - "/search "
        Otherwise, the model responds directly using chat history.
        """
        if not self.chat:
            return "AI service is not configured correctly."

        try:
            text = user_input or ""
            lower = text.strip().lower()

            # Search trigger
            search_query = None
            if lower.startswith("search:"):
                search_query = text.split(":", 1)[1].strip()
            elif lower.startswith("/search "):
                search_query = text.split(" ", 1)[1].strip()

            if search_query:
                web_results = perform_web_search(search_query, max_results=6)
                if not web_results:
                    return "I could not retrieve web results right now. Please try again."

                # Build context with numbered references
                refs_lines = []
                for idx, item in enumerate(web_results, start=1):
                    refs_lines.append(f"[{idx}] {item['title']} — {item['href']}\n{item['body']}")
                refs_block = "\n\n".join(refs_lines)

                system_prompt = (
                    "You are an AI research assistant. Use the provided web search results to answer the user query. "
                    "Synthesize concisely, cite sources inline like [1], [2] where relevant, and include a brief summary."
                )
                composed = (
                    f"\n{system_prompt}\n\n"
                    f"\n{search_query}\n\n"
                    f"\n{refs_block}\n"
                )
                response = self.chat.send_message(composed)
                return response.text

            # Default: normal chat
            response = self.chat.send_message(text)
            return response.text
        except Exception as e:
            print(f"Error generating response: {e}")
            return "I'm sorry, I encountered an error processing your request."

Let’s understand what’s going on in the above code:

The perform_web_search() function:
- We keep a chat session open so the model remembers the conversation.
- If a message starts with search: or /search, the DuckDuckGo service is called, gathers a few results, and passes them to Gemini with a short instruction to cite sources.
- Otherwise, we just send the message as normal.
The GeminiClient class:
- The GeminiClient class is designed to connect and talk with Google’s Gemini AI. Inside the __init__ method, it first calls genai.configure() with the API key from the environment variables, which basically unlocks access to Gemini’s services.
- Then, self.model = genai.GenerativeModel('gemini-1.5-flash') loads the specific Gemini model, and self.chat = self.model.start_chat(history=[]) starts a new conversation with no previous history. This way, the class is ready to send and receive AI responses.
- The real action happens in generate_response(). If a user’s message begins with search: or /search, it triggers a DuckDuckGo search using perform_web_search().
- The results are formatted with titles, links, and snippets, and then passed to Gemini to create a clear, cited answer (you can sanitize the incoming data later by using any package in Python to make it more user-friendly in the frontend).
- If no search command is used, it simply chats with Gemini using the given input. Error handling is built in, so instead of breaking, it returns a general safe message.

Create the Flask Backend and Frontend

Next, we'll set up the Flask web server to connect our agent logic to a simple web interface.

The Flask Backend

Create a new backend folder inside the study-planner directory, and add a new file app.py:

# backend/app.py
import os
from flask import Flask, render_template, request, jsonify
from gemini_client import GeminiClient

app = Flask(__name__, template_folder='../templates')
client = GeminiClient()

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/api/chat', methods=['POST'])
def chat():
    payload = request.get_json(silent=True) or {}
    user_message = payload.get('message', '').strip()
    if not user_message:
        return jsonify({'error': 'No message provided'}), 400

    try:
        response_text = client.generate_response(user_message)
        return jsonify({'response': response_text})
    except Exception as e:
        return jsonify({'error': 'Error generating response'}), 500

if __name__ == '__main__':
    app.run(debug=True)

What it does:

@app.route('/'): This is the homepage. When a user navigates to the main URL, like, http://localhost:5000), Flask runs the index() function, which simply renders the index.html file. This serves the entire user interface to the browser useful when you don’t want to use the command line interface.
Next, we have created @app.route('/api/chat', methods=['POST']), the API endpoint. When the user clicks "Send" on the frontend, the JavaScript sends a POST request to this URL. The chat() function then receives the user's message, passes it to the GeminiClient to get a response, and then sends that response back to the frontend as a JSON object.

The Flask Frontend

Create a new folder named templates in your project's root directory. Inside it, create a file index.html.

html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>AI Study Plannertitle>
    <script src="https://cdn.tailwindcss.com">script>
    <style>
      body {
        background-color: #f3f4f6;
      }
      .chat-container {
        max-width: 768px;
        margin: 0 auto;
        display: flex;
        flex-direction: column;
        height: 100vh;
      }
      .typing-indicator {
        display: flex;
        align-items: center;
        padding: 0.5rem;
        color: #6b7280;
      }
      .typing-dot {
        width: 8px;
        height: 8px;
        margin: 0 2px;
        background-color: #6b7280;
        border-radius: 50%;
        animation: typing 1s infinite ease-in-out;
      }
      .message-bubble {
        padding: 1rem;
        border-radius: 1.5rem;
        max-width: 80%;
        margin-bottom: 1rem;
      }
      .user-message {
        background-color: #3b82f6;
        color: white;
        align-self: flex-end;
      }
      .agent-message {
        background-color: #e5e7eb;
        color: #374151;
        align-self: flex-start;
      }
    style>
  head>
  <body class="bg-gray-100">
    <div class="chat-container">
      <header
        class="bg-white shadow-sm p-4 text-center font-bold text-xl text-gray-800"
      >
        AI Study Planner
      header>

      <main id="chat-history" class="flex-1 overflow-y-auto p-4 space-y-4">
        <div class="message-bubble agent-message">
          Hello! I'm your AI Study Planner. What topic would you like to study
          today?
        div>
      main>

      <footer class="bg-white p-4">
        <div class="flex items-center">
          <input
            type="text"
            id="user-input"
            class="flex-1 p-3 border-2 border-gray-300 rounded-full focus:outline-none focus:border-blue-500"
            placeholder="Type your message..."
          />
          <button
            id="send-btn"
            class="ml-4 px-6 py-3 bg-blue-500 text-white rounded-full font-semibold hover:bg-blue-600 transition-colors"
          >
            Send
          button>
        div>
      footer>
    div>

    <script>
      const chatHistory = document.getElementById("chat-history");
      const userInput = document.getElementById("user-input");
      const sendBtn = document.getElementById("send-btn");

      function addMessage(sender, text) {
        const messageElement = document.createElement("div");
        messageElement.classList.add(
          "message-bubble",
          sender === "user" ? "user-message" : "agent-message"
        );
        messageElement.textContent = text;
        chatHistory.appendChild(messageElement);
        chatHistory.scrollTop = chatHistory.scrollHeight;
      }

      async function sendMessage() {
        const message = userInput.value.trim();
        if (message === "") return;

        addMessage("user", message);
        userInput.value = "";

        try {
          const response = await fetch("/api/chat", {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
            },
            body: JSON.stringify({ message: message }),
          });

          const data = await response.json();
          if (data.response) {
            addMessage("agent", data.response);
          } else if (data.error) {
            addMessage("agent", `Error: ${data.error}`);
          } else {
            addMessage("agent", "Unexpected response from server.");
          }
        } catch (error) {
          console.error("Error:", error);
          addMessage("agent", "Sorry, something went wrong. Please try again.");
        }
      }

      sendBtn.addEventListener("click", sendMessage);
      userInput.addEventListener("keypress", (e) => {
        if (e.key === "Enter") {
          sendMessage();
        }
      });
    script>
  body>
html>

That’s the entire UI. It’s just one page with a text box and a send button. It contains a simple JavaScript function to handle the chat interaction. Here’s how it works:

When the user types a message and hits "Send," it:
- Takes the message from the input field.
- Creates a new user-message bubble and displays it.
- Uses the fetch() API to send the message to the backend's /api/chat endpoint.
- Waits for the backend's response.
- Once the response is received, it creates a new agent-message bubble and displays the AI’s reply.

How to Test the AI Agent

At this point, your project structure should look like this:

study-planner/
├── backend/
│   ├── .env
│   ├── app.py
│   └── gemini_client.py
└── templates/
    └── index.html

Now, navigate to the backend directory, and run:

cd backend
python app.py

If everything is set up, you’ll see the Flask app start on http://127.0.0.1:5000 or http://localhost:5000.

Open that URL in your browser. That’s it, you have finally created an AI agent for yourself!

Try out asking normal questions like:

“Make me a 3-week plan to learn Java programming for beginners.”
“Provide me a quiz on AI agents development?”

Or you can also trigger a web search like:

search: resources for java
/search how to prepare frontend coding interviews

When you use the search prefix like above, the agent fetches a handful of links and asks Gemini to synthesize them with short inline citations like [1], [2]. It’s great for quick research summaries.

Wrapping Up

Congratulations! You now have a working study planner agent that remembers your chats and can even look things up online.

From here, you can further enhance this agent by:

Saving user histories in a database.
Adding authentication, handling multiple users.
Connecting calendars or task managers, and much more.

This foundation provides a solid starting point for building even more sophisticated AI agents tailored to your specific needs.

If you found this tutorial helpful and want to discuss AI development or software development, feel free to connect with me on X/Twitter, LinkedIn, or check out my portfolio at Blog. I regularly share insights about AI, development, technical writing, and so on, and would love to see what you build with this foundation.

Happy coding!

Tarun Singh - freeCodeCamp.org

How to Get Type Safety Without Code Generation Using tRPC and Hono

Table of Contents

Prerequisites

The Problem with Traditional APIs

What Makes tRPC Different?

Hono: The Lightweight Challenger

Why This Matters These Days

Developer experience is essential

Smaller teams, better apps

The end of “Works on my machine“

Getting Started

The Future is Type-Safe

How to Use Nano Banana for Image Generation - Explained with Code Examples

Table of Contents

What is "Nano Banana"?

Why "Nano Banana"?

How to Set Up Your Project

Step 1: Get an API key from Google Gemini

Step 2: Install the SDK and Other Dependencies

Step 3: Set Up Your Environment

Step 4: Image Generation & Editing

Beyond the Basics: What Else Can You Do?

Wrapping Up

Prompt Engineering Cheat Sheet for GPT-5: Learn These Patterns for Solid Code Generation

Table of Contents

What is GPT-5? Why You Should Use It as a Developer?

Why Prompt Engineering?

How to Use GPT-5 for Free

Patterns Every Developer Should Know

Persona Pattern

Few-Shot Pattern

Chain-of-Thought Pattern

Delimiter Pattern

Structured Output Pattern

Flipped Interaction Pattern

Negative Constraint Pattern

Tool Use Pattern

Verbosity Pattern

Code-as-Context Pattern

Common Pitfalls to Avoid

Final Thoughts

How to Build an AI Study Planner Agent using Gemini in Python

Table of Contents:

Prerequisites

Tools You'll Be Using to Build this Agent

Understanding AI Agents

What Are AI Agents? How Many Types Are There?

How Are AI Agents Unique Compared to Other AI Tools?

How to Set Up Your Environment

1. Create a Project Directory

2. Create a Virtual Environment

3. Install Dependencies

4. Get Your Gemini API Key

5. Add Your Key to the .env File

How to Build the Real-Time Agent Logic

Create the Gemini Client (with web search)

Create the Flask Backend and Frontend

The Flask Backend

The Flask Frontend

How to Test the AI Agent

Wrapping Up

5. Add Your Key to the `.env` File