Building AI APIs with FastAPI and Gemini

A Complete Beginner’s Guide to Creating AI-Powered Web APIs

From Setup to a fully working REST API: Your First AI Web Service

Author: Aaron Borgi
Email: [email protected]
Company: THINKOFIT

2025-11-07

Introduction

Welcome to the exciting world of AI-powered web APIs! In this comprehensive guide, we’ll build a complete web service that can chat with Google’s powerful Gemini AI model using FastAPI and Python. This tutorial is designed for beginners who want to understand how modern AI applications work behind the scenes.

By the end of this walkthrough, you’ll have:

  • Built a professional web API using FastAPI

  • Integrated Google’s Gemini AI for intelligent responses

  • Learned about software architecture and design patterns

  • Created a modular, maintainable codebase

  • Gained an understanding of how AI services work in production

FastAPI is a modern, fast web framework for building APIs with Python. It’s used by companies like Netflix, Microsoft, and Uber for their production systems. Combined with Google’s Gemini AI, we’ll create a powerful chatbot service that can handle real user requests.

Understanding the Project Structure

Before we start coding, let’s understand how our project will be organized. Good project structure is crucial for maintainable code.

Project Directory Layout

Our project follows this structure:

src/
├── __pycache__/          # Python cache files (auto-generated)
├── venv/                 # Virtual environment (auto-generated)
├── ai/                   # AI-related code
│   ├── __init__.py       # Makes 'ai' a Python package
│   ├── base.py           # Abstract base class for AI platforms
│   └── gemini.py         # Gemini AI implementation
├── prompts/              # System prompts and templates
│   └── system_prompt.md  # Instructions for the AI
└── main.py               # FastAPI application entry point
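
On Linux/macOS, you could scaffold this layout from a terminal before opening it in your editor (the venv/ and __pycache__/ entries appear on their own later):

```shell
# Create the package directories and empty source files
mkdir -p src/ai src/prompts
touch src/ai/__init__.py src/ai/base.py src/ai/gemini.py
touch src/prompts/system_prompt.md
touch src/main.py
```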

Why organize code this way?

  • Separation of concerns: Each file has a specific purpose

  • Modularity: Easy to add new AI providers or features

  • Maintainability: Code is easier to find and modify

  • Scalability: Structure supports growing applications

Setting Up Your Development Environment

Installing PyCharm Community Edition

We’ll use PyCharm as our development environment. If you haven’t installed it yet:

  1. Visit https://www.jetbrains.com/pycharm/download/

  2. Download PyCharm Community Edition

  3. Run the installer and follow the setup wizard

  4. Choose these options during installation:

    • Create desktop shortcut

    • Add "Open Folder as PyCharm Project" to context menu

    • Add launchers dir to PATH (Linux/macOS)

Creating a New Project

  1. Open PyCharm

  2. Click "New Project"

  3. Choose a location for your project (e.g., fastapi-gemini-api)

  4. Under "Python Interpreter," select "New environment using Virtualenv"

  5. Ensure "Create a main.py welcome script" is checked

  6. Click "Create"

Installing Required Dependencies

In PyCharm’s terminal (View → Tool Windows → Terminal), install our required packages:

pip install fastapi uvicorn pydantic google-generativeai

Let’s understand what each package does:

FastAPI:

  • Modern web framework for building APIs

  • Automatically generates API documentation

  • Built-in validation and serialization

  • High performance and easy to learn

Uvicorn:

  • ASGI server that runs our FastAPI application

  • Handles HTTP requests and responses

  • Supports async/await for high performance

  • Development server with auto-reload

Pydantic:

  • Data validation library using Python type hints

  • Automatically validates request/response data

  • Converts data types automatically

  • Generates clear error messages

Google-GenerativeAI:

  • Official Google library for Gemini AI

  • Handles authentication and API calls

  • Supports all Gemini model features

  • Manages rate limiting and errors
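
Alternatively, you can record the dependencies in a requirements.txt file at the project root and install them all at once with pip install -r requirements.txt (pin exact versions once you have a working setup):

```text
fastapi
uvicorn
pydantic
google-generativeai
```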

Building the AI Module

Creating the AI Package

First, create the ai directory and the __init__.py file:

  1. Create a new directory called ai in your src folder

  2. Inside the ai directory, create an empty file called __init__.py

Why do we need __init__.py?

The __init__.py file tells Python that this directory is a "package" - a collection of related modules. Even though we’re keeping it empty, it serves important purposes:

  • Package recognition: Python knows this is a package, not just a folder

  • Import capability: We can import modules from this package

  • Namespace control: Helps organize code into logical groups

  • Future extensibility: We can add package-level initialization code later
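
To see package recognition in action, here is a small self-contained sketch that builds a throwaway package on disk and imports a module from it (the ai_demo and greeting names are made up for this demo):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package in a temp directory.
root = Path(tempfile.mkdtemp())
pkg = root / "ai_demo"                                   # hypothetical package name
pkg.mkdir()
(pkg / "__init__.py").write_text("")                     # empty file, just like ours
(pkg / "greeting.py").write_text("MESSAGE = 'hello'\n")  # a module inside the package

# Make the temp directory importable, then import package.module.
sys.path.insert(0, str(root))
module = importlib.import_module("ai_demo.greeting")
print(module.MESSAGE)  # -> hello
```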

Creating the Abstract Base Class

Now, let’s create base.py - the foundation of our AI system:

from abc import ABC, abstractmethod

class AiPlatform(ABC):
    @abstractmethod
    def chat(self, prompt: str) -> str:
        pass

Understanding Abstract Base Classes:

This might look confusing at first, but it’s a powerful concept in software design:

  • ABC (Abstract Base Class): A template that other classes must follow

  • @abstractmethod: Forces child classes to implement this method

  • Why use this? Ensures all AI platforms have a chat method

  • Flexibility: Easy to add new AI providers (OpenAI, Claude, etc.)

Think of it like a contract: "Any AI platform must be able to chat with a prompt and return a string response."
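
You can watch Python enforce that contract. In this sketch, EchoPlatform is a made-up provider that fulfills the contract, while BrokenPlatform forgets to implement chat() and is refused at instantiation time:

```python
from abc import ABC, abstractmethod

class AiPlatform(ABC):
    @abstractmethod
    def chat(self, prompt: str) -> str:
        pass

# A subclass that honors the contract can be instantiated...
class EchoPlatform(AiPlatform):       # toy provider, for illustration only
    def chat(self, prompt: str) -> str:
        return f"echo: {prompt}"

# ...but a subclass that forgets chat() cannot.
class BrokenPlatform(AiPlatform):
    pass

print(EchoPlatform().chat("hi"))      # -> echo: hi
try:
    BrokenPlatform()
except TypeError as exc:
    print("refused:", exc)            # Python enforces the contract here
```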

Getting Your Gemini API Key

To use the Gemini AI API, you’ll need an API key from Google AI Studio. This key allows your application to authenticate with Google’s servers and access Gemini models securely.

Steps to Get Your API Key

  1. Open your browser and visit: https://aistudio.google.com/app/apikey

  2. Sign in with your Google account if prompted

  3. Click on the "Create API key" button

  4. A new API key will be generated — copy this key

  5. Store it securely. You’ll need it in your code to initialize Gemini

Using the API Key

Once you have your key, you can use it in your Python code when creating the Gemini model instance:

from ai.gemini import Gemini

api_key = "your-api-key-here"
model = Gemini(api_key=api_key)

Important: Do not share your API key publicly or commit it to version control (e.g., GitHub). It’s best practice to load it from an environment variable or configuration file.
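
For example, you could export the key in your shell (export GEMINI_API_KEY="your-api-key-here") and read it with the standard library; GEMINI_API_KEY is just a variable name we picked for this sketch:

```python
import os

# GEMINI_API_KEY is our chosen variable name; set it in your shell first.
api_key = os.environ.get("GEMINI_API_KEY", "")

if not api_key:
    print("GEMINI_API_KEY is not set; refusing to start")
else:
    # Safe to pass on, without the key ever appearing in source control:
    #   model = Gemini(api_key=api_key)
    print("API key loaded, length:", len(api_key))
```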

Implementing the Gemini AI Class

Now, let’s create gemini.py - our Gemini AI implementation:

from .base import AiPlatform
import google.generativeai as genai

class Gemini(AiPlatform):
    def __init__(self, api_key: str, sys_prompt: str | None = None) -> None:
        self.api_key = api_key
        self.sys_prompt = sys_prompt
        genai.configure(api_key=self.api_key)
        # See more models at https://ai.google.dev/gemini-api/docs/models
        self.model = genai.GenerativeModel("gemini-2.0-flash")
    
    def chat(self, prompt: str) -> str:
        if self.sys_prompt:
            prompt = f"{self.sys_prompt}\n\n{prompt}"
        
        response = self.model.generate_content(prompt)
        return response.text

Breaking down the Gemini class:

Class initialization (__init__):

  • api_key: Your Google AI API key for authentication

  • sys_prompt: Optional system prompt to guide AI behavior

  • genai.configure(): Sets up the Google AI library

  • GenerativeModel(): Creates a client for the gemini-2.0-flash model

Chat method:

  • Prompt combination: Adds system prompt to user prompt if provided

  • generate_content(): Sends request to Gemini AI

  • response.text: Extracts the text response from Gemini

Why use "gemini-2.0-flash"?

  • Fast response times (good for web APIs)

  • High quality text generation

  • Cost-effective for most applications

  • Supports multimodal input (text, images, etc.)

Creating the System Prompt

Understanding System Prompts

System prompts are instructions that tell the AI how to behave. They’re like giving personality and guidelines to the AI before users start chatting with it.

Create a prompts directory and add system_prompt.md:

Answer the user in plaintext (no markdown), but use lots of emojis! 
Be simple, clear and concise.

Why use a separate file for system prompts?

  • Easy editing: Non-programmers can modify AI behavior

  • Version control: Track changes to AI instructions

  • A/B testing: Easy to test different prompt styles

  • Collaboration: Team members can review and improve prompts

Building the FastAPI Application

Understanding FastAPI Basics

FastAPI is a modern web framework that makes building APIs incredibly easy. It automatically handles many complex tasks like request validation, response serialization, and API documentation.

Creating the Main Application

Let’s build our main.py file step by step:

# --- App Initialization ---
from fastapi import FastAPI
from pydantic import BaseModel
from .ai.gemini import Gemini

app = FastAPI()

def load_sys_prompt() -> str:
    with open("src/prompts/system_prompt.md", encoding="utf-8") as f:
        return f.read()

sys_prompt = load_sys_prompt()
# Demo only: in production, load this from an environment variable
gemini_api_key = "YOUR_API_KEY_HERE"
model = Gemini(api_key=gemini_api_key, sys_prompt=sys_prompt)

class ChatRequest(BaseModel):
    prompt: str

class ChatResponse(BaseModel):
    response: str

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    response_text = model.chat(request.prompt)
    return ChatResponse(response=response_text)

@app.get("/")
async def root():
    return {
        "message": "App is running!"
    }

Breaking Down the Code:

1. Imports and App Creation:

  • FastAPI: The main web framework class

  • BaseModel: Pydantic class for data validation

  • Gemini: Our custom AI class

  • app = FastAPI(): Creates our web application

2. Configuration Loading:

  • load_sys_prompt(): Reads our system prompt from file

  • gemini_api_key: Your Google AI API key (keep this secure!)

  • model = Gemini(): Creates our AI model instance

3. Data Models:

  • ChatRequest: Defines what data we expect from users

  • ChatResponse: Defines what data we send back

  • Pydantic validation: Automatically checks data types and formats

4. API Endpoints:

  • @app.post("/chat"): Handles POST requests to /chat URL

  • async def chat(): Processes chat requests asynchronously

  • @app.get("/"): Health check endpoint

Understanding Pydantic Models

Pydantic models are special classes that automatically validate and convert data:

class ChatRequest(BaseModel):
    prompt: str  # Must be a string, required field

class ChatResponse(BaseModel):
    response: str  # Must be a string, required field

What Pydantic does for us:

  • Type checking: Ensures prompt is actually a string

  • Required fields: Returns error if prompt is missing

  • JSON conversion: Automatically converts to/from JSON

  • Documentation: Creates API docs automatically
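
You can try this validation outside FastAPI too. A quick sketch (assuming the pydantic package from earlier is installed):

```python
from pydantic import BaseModel, ValidationError

class ChatRequest(BaseModel):
    prompt: str  # required string field

# Valid input is parsed into a typed object.
ok = ChatRequest(prompt="Hello!")
print(ok.prompt)  # -> Hello!

# Missing the required field raises a descriptive error,
# which FastAPI would turn into a 422 response for the client.
try:
    ChatRequest()
except ValidationError as exc:
    print("rejected with", len(exc.errors()), "validation error(s)")
```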

Understanding Async/Await

You might notice the async and await keywords. These enable our API to handle multiple requests simultaneously:

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):  # async function
    response_text = model.chat(request.prompt)  # synchronous call: blocks the event loop
    return ChatResponse(response=response_text)

Why use async?

  • Concurrency: handle multiple users at once

  • Performance: the event loop keeps serving other requests while a handler awaits I/O

  • Scalability: support more concurrent users

  • Efficiency: better resource utilization

One caveat: these benefits only apply while the handler is actually awaiting something. Our model.chat() call is synchronous, so it blocks the event loop until Gemini responds. To keep the server responsive, you can run the blocking call in a worker thread (for example with asyncio.to_thread), or declare the endpoint with a plain def so FastAPI runs it in its own thread pool.
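
Because model.chat() in this tutorial is an ordinary synchronous call, an async def endpoint would still block the event loop while Gemini responds. One standard-library way to keep the loop free is asyncio.to_thread (Python 3.9+); in this sketch, blocking_chat is a stand-in for model.chat():

```python
import asyncio
import time

def blocking_chat(prompt: str) -> str:
    # Stand-in for model.chat(): a synchronous call that takes a while.
    time.sleep(0.1)
    return f"reply to: {prompt}"

async def chat_handler(prompt: str) -> str:
    # Offload the blocking call to a worker thread so the event
    # loop can keep serving other requests in the meantime.
    return await asyncio.to_thread(blocking_chat, prompt)

async def main() -> list[str]:
    # Two handlers now overlap instead of running back to back.
    return await asyncio.gather(chat_handler("a"), chat_handler("b"))

replies = asyncio.run(main())
print(replies)  # -> ['reply to: a', 'reply to: b']
```

Inside the FastAPI endpoint, the same idea would read response_text = await asyncio.to_thread(model.chat, request.prompt). Alternatively, declaring the endpoint with a plain def lets FastAPI run it in its own thread pool automatically.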

Running Your Application

Starting the Development Server

To run your FastAPI application, use Uvicorn in your terminal:

uvicorn src.main:app --reload

Command explanation:

  • uvicorn: The ASGI server that runs FastAPI apps

  • src.main:app: Points to the app object in src/main.py

  • --reload: Automatically restarts when you change code

You should see output like:

INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [12345] using StatReload
INFO:     Started server process [12346]
INFO:     Waiting for application startup.
INFO:     Application startup complete.

Testing Your API

1. Health Check: Open your browser and visit: http://127.0.0.1:8000

You should see:

{
  "message": "App is running!"
}

2. Interactive API Documentation: FastAPI automatically generates beautiful API docs! Visit: http://127.0.0.1:8000/docs

This interactive documentation allows you to:

  • See all available endpoints

  • Test API calls directly in the browser

  • View request/response schemas

  • Download OpenAPI specifications

3. Testing the Chat Endpoint: In the API docs, click on the POST /chat endpoint, then "Try it out". Enter a test message:

{
  "prompt": "Hello, how are you today?"
}

You should receive a response with lots of emojis, just as specified in our system prompt!

Understanding the Complete Flow

Request-Response Cycle

Let’s trace what happens when a user sends a chat message:

1. User sends HTTP POST request to /chat:

POST http://127.0.0.1:8000/chat
Content-Type: application/json

{
  "prompt": "Tell me a joke about programming"
}

2. FastAPI receives and validates the request:

  • Checks that the request body is valid JSON

  • Validates that "prompt" field exists and is a string

  • Creates a ChatRequest object

3. Our chat() function processes the request:

  • Extracts the prompt from the request

  • Calls model.chat(request.prompt)

  • Gemini class combines system prompt with user prompt

  • Sends request to Google’s Gemini API

4. Gemini AI generates a response:

  • Processes the combined prompt

  • Generates a response following system prompt guidelines

  • Returns the generated text

5. FastAPI sends the response back:

{
  "response": "Why do programmers prefer dark mode? Because light attracts bugs!"
}

Testing Your Application

Manual Testing with curl

You can also test your API using curl commands:

# Health check
curl http://127.0.0.1:8000/

# Chat request
curl -X POST "http://127.0.0.1:8000/chat" \
     -H "Content-Type: application/json" \
     -d '{"prompt": "Tell me about FastAPI"}'

Conclusion

Congratulations! You’ve successfully built and deployed your first AI-powered web service using FastAPI and Google’s Gemini model. In this tutorial, you learned:

  • How to set up a modern Python project with FastAPI

  • How to structure code using abstraction and modular design

  • How to implement AI interaction using Google’s Gemini API

  • The basics of system prompts and how they shape AI behavior

  • How to create and test REST API endpoints using FastAPI and Pydantic

This project introduced you to the world of scalable, production-ready AI APIs. From abstract base classes to real-time chat endpoints, you now have a solid foundation for developing more complex and intelligent applications.

Whether you’re looking to integrate other AI models, add user authentication, or deploy to the cloud, the skills you’ve gained here will serve as a strong starting point. In future installments, we’ll extend this application even further!

Keep building, keep experimenting—and welcome to the future of intelligent web development.

Happy coding!


