Building AI APIs with FastAPI and Gemini
A Complete Beginner’s Guide to Creating AI-Powered Web APIs
From Setup to a Fully Working REST API: Your First AI Web Service
Author: Aaron Borgi
Email: [email protected]
Company: THINKOFIT
2025-11-07
Introduction
Welcome to the exciting world of AI-powered web APIs! In this comprehensive guide, we’ll build a complete web service that can chat with Google’s powerful Gemini AI model using FastAPI and Python. This tutorial is designed for beginners who want to understand how modern AI applications work behind the scenes.
By the end of this walkthrough, you’ll have:
Built a professional web API using FastAPI
Integrated Google’s Gemini AI for intelligent responses
Learned about software architecture and design patterns
Created a modular, maintainable codebase
Gained an understanding of how AI services work in production
FastAPI is a modern, fast web framework for building APIs with Python. It’s used by companies like Netflix, Microsoft, and Uber for their production systems. Combined with Google’s Gemini AI, we’ll create a powerful chatbot service that can handle real user requests.
Understanding the Project Structure
Before we start coding, let’s understand how our project will be organized. Good project structure is crucial for maintainable code.
Project Directory Layout
Our project follows this structure:
src/
├── __pycache__/ # Python cache files (auto-generated)
├── venv/ # Virtual environment (auto-generated)
├── ai/ # AI-related code
│ ├── __init__.py # Makes 'ai' a Python package
│ ├── base.py # Abstract base class for AI platforms
│ └── gemini.py # Gemini AI implementation
├── prompts/ # System prompts and templates
│ └── system_prompt.md # Instructions for the AI
└── main.py # FastAPI application entry point
Why organize code this way?
Separation of concerns: Each file has a specific purpose
Modularity: Easy to add new AI providers or features
Maintainability: Code is easier to find and modify
Scalability: Structure supports growing applications
Setting Up Your Development Environment
Installing PyCharm Community Edition
We’ll use PyCharm as our development environment. If you haven’t installed it yet:
Download PyCharm Community Edition
Run the installer and follow the setup wizard
Choose these options during installation:
Create desktop shortcut
Add "Open Folder as PyCharm Project" to context menu
Add launchers dir to PATH (Linux/macOS)
Creating a New Project
Open PyCharm
Click "New Project"
Choose a location for your project (e.g., fastapi-gemini-api)
Under "Python Interpreter," select "New environment using Virtualenv"
Ensure "Create a main.py welcome script" is checked
Click "Create"
Installing Required Dependencies
In PyCharm’s terminal (View → Tool Windows → Terminal), install our required packages:
pip install fastapi uvicorn pydantic google-generativeai
Let’s understand what each package does:
FastAPI:
Modern web framework for building APIs
Automatically generates API documentation
Built-in validation and serialization
High performance and easy to learn
Uvicorn:
ASGI server that runs our FastAPI application
Handles HTTP requests and responses
Supports async/await for high performance
Development server with auto-reload
Pydantic:
Data validation library using Python type hints
Automatically validates request/response data
Converts data types automatically
Generates clear error messages
Google-GenerativeAI:
Official Google library for Gemini AI
Handles authentication and API calls
Supports all Gemini model features
Manages rate limiting and errors
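To keep installs reproducible, you can also list these packages in a requirements.txt file (shown unpinned here; pin the versions you test against):

fastapi
uvicorn
pydantic
google-generativeai

Then install everything at once with:

pip install -r requirements.txt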
Building the AI Module
Creating the AI Package
First, create the ai directory and the __init__.py file:
Create a new directory called ai in your src folder
Inside the ai directory, create an empty file called __init__.py
Why do we need __init__.py?
The __init__.py file tells Python that this directory is a "package" - a collection of related modules.
Even though we’re keeping it empty, it serves important purposes:
Package recognition: Python knows this is a package, not just a folder
Import capability: We can import modules from this package
Namespace control: Helps organize code into logical groups
Future extensibility: We can add package-level initialization code later
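With the __init__.py in place, other modules can import from the package. Once we add base.py and gemini.py below, imports like these will work (assuming you run Python from inside src/):

# Possible because ai/ contains an __init__.py:
from ai.base import AiPlatform
from ai.gemini import Gemini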
Creating the Abstract Base Class
Now, let’s create base.py - the foundation of our AI system:
from abc import ABC, abstractmethod

class AiPlatform(ABC):
    """Contract that every AI provider must implement."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        # Child classes must return the model's text reply for a prompt.
        pass
Understanding Abstract Base Classes:
This might look confusing at first, but it’s a powerful concept in software design:
ABC (Abstract Base Class): A template that other classes must follow
@abstractmethod: Forces child classes to implement this method
Why use this? Ensures all AI platforms have a chat method
Flexibility: Easy to add new AI providers (OpenAI, Claude, etc.)
Think of it like a contract: "Any AI platform must be able to chat with a prompt and return a string response."
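To see the contract in action, here is a sketch of how another provider could plug in. The EchoPlatform class is hypothetical (it just echoes the prompt back), but a real OpenAI or Claude wrapper would follow the exact same shape (the import path assumes you run from inside src/):

from ai.base import AiPlatform

class EchoPlatform(AiPlatform):
    # Hypothetical provider used only to illustrate the contract.
    def chat(self, prompt: str) -> str:
        # A real provider would call its API here; we simply echo.
        return f"You said: {prompt}"

Because EchoPlatform implements chat(), Python lets us instantiate it; leave the method out and instantiation fails with a TypeError. That is the abstract contract being enforced.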
Getting Your Gemini API Key
To use the Gemini AI API, you’ll need an API key from Google AI Studio. This key allows your application to authenticate with Google’s servers and access Gemini models securely.
Steps to Get Your API Key
Open your browser and visit: https://aistudio.google.com/app/apikey
Sign in with your Google account if prompted
Click on the "Create API key" button
A new API key will be generated; copy this key
Store it securely. You’ll need it in your code to initialize Gemini
Using the API Key
Once you have your key, you can use it in your Python code when creating the Gemini model instance:
from ai.gemini import Gemini
api_key = "your-api-key-here"
model = Gemini(api_key=api_key)
Important: Do not share your API key publicly or commit it to version control (e.g., GitHub). It’s best practice to load it from an environment variable or configuration file.
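For example, here is a minimal sketch that reads the key from an environment variable instead (the variable name GEMINI_API_KEY is our choice; export it in your shell before starting the app):

import os
from ai.gemini import Gemini

# Read the key from the environment instead of hard-coding it.
api_key = os.environ["GEMINI_API_KEY"]  # raises KeyError if the variable is unset
model = Gemini(api_key=api_key)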
Implementing the Gemini AI Class
Now, let’s create gemini.py - our Gemini AI implementation:
from .base import AiPlatform
import google.generativeai as genai

class Gemini(AiPlatform):
    def __init__(self, api_key: str, sys_prompt: str | None = None) -> None:
        self.api_key = api_key
        self.sys_prompt = sys_prompt
        genai.configure(api_key=self.api_key)
        # See more models at https://ai.google.dev/gemini-api/docs/models
        self.model = genai.GenerativeModel("gemini-2.0-flash")

    def chat(self, prompt: str) -> str:
        # Prepend the system prompt, if one was provided.
        if self.sys_prompt:
            prompt = f"{self.sys_prompt}\n\n{prompt}"
        response = self.model.generate_content(prompt)
        return response.text
Breaking down the Gemini class:
Class initialization (__init__):
api_key: Your Google AI API key for authentication
sys_prompt: Optional system prompt to guide AI behavior
genai.configure(): Sets up the Google AI library
GenerativeModel(): Creates a connection to gemini-2.0-flash
Chat method:
Prompt combination: Adds system prompt to user prompt if provided
generate_content(): Sends request to Gemini AI
response.text: Extracts the text response from Gemini
Why use "gemini-2.0-flash"?
Fast response times (good for web APIs)
High quality text generation
Cost-effective for most applications
Supports multimodal input (text, images, etc.)
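Before wiring the class into a web API, you can sanity-check it with a tiny standalone script (a sketch; run it from inside src/ and substitute your real key):

from ai.gemini import Gemini

# Placeholder key for illustration only.
model = Gemini(api_key="your-api-key-here", sys_prompt="Reply in one short sentence.")
print(model.chat("What is FastAPI?"))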
Creating the System Prompt
Understanding System Prompts
System prompts are instructions that tell the AI how to behave. They’re like giving personality and guidelines to the AI before users start chatting with it.
Create a prompts directory and add system_prompt.md:
Answer the user in plaintext (no markdown), but use lots of emojis!
Be simple, clear and concise.
Why use a separate file for system prompts?
Easy editing: Non-programmers can modify AI behavior
Version control: Track changes to AI instructions
A/B testing: Easy to test different prompt styles
Collaboration: Team members can review and improve prompts
Building the FastAPI Application
Understanding FastAPI Basics
FastAPI is a modern web framework that makes building APIs incredibly easy. It automatically handles many complex tasks like request validation, response serialization, and API documentation.
Creating the Main Application
Let’s build our main.py file step by step:
# --- App Initialization ---
from fastapi import FastAPI
from pydantic import BaseModel
from .ai.gemini import Gemini

app = FastAPI()

def load_sys_prompt() -> str:
    with open("src/prompts/system_prompt.md") as f:
        return f.read()

sys_prompt = load_sys_prompt()

gemini_api_key = "YOUR_API_KEY_HERE"
model = Gemini(api_key=gemini_api_key, sys_prompt=sys_prompt)

class ChatRequest(BaseModel):
    prompt: str

class ChatResponse(BaseModel):
    response: str

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    response_text = model.chat(request.prompt)
    return ChatResponse(response=response_text)

@app.get("/")
async def root():
    return {"message": "App is running!"}
Breaking Down the Code:
1. Imports and App Creation:
FastAPI: The main web framework class
BaseModel: Pydantic class for data validation
Gemini: Our custom AI class
app = FastAPI(): Creates our web application
2. Configuration Loading:
load_sys_prompt(): Reads our system prompt from file
gemini_api_key: Your Google AI API key (keep this secure!)
model = Gemini(): Creates our AI model instance
3. Data Models:
ChatRequest: Defines what data we expect from users
ChatResponse: Defines what data we send back
Pydantic validation: Automatically checks data types and formats
4. API Endpoints:
@app.post("/chat"): Handles POST requests to the /chat URL
async def chat(): Processes chat requests asynchronously
@app.get("/"): Health check endpoint
Understanding Pydantic Models
Pydantic models are special classes that automatically validate and convert data:
from pydantic import BaseModel

class ChatRequest(BaseModel):
    prompt: str  # Must be a string, required field

class ChatResponse(BaseModel):
    response: str  # Must be a string, required field
What Pydantic does for us:
Type checking: Ensures prompt is actually a string
Required fields: Returns error if prompt is missing
JSON conversion: Automatically converts to/from JSON
Documentation: Creates API docs automatically
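You can watch this validation happen outside the web server too. Here is a quick sketch using Pydantic v2’s model_validate (on Pydantic v1, parse_obj plays the same role):

from pydantic import BaseModel, ValidationError

class ChatRequest(BaseModel):
    prompt: str

# Valid input parses cleanly.
req = ChatRequest.model_validate({"prompt": "Hello!"})
print(req.prompt)  # -> Hello!

# A missing field raises a clear, structured error.
try:
    ChatRequest.model_validate({})
except ValidationError as err:
    print(err)  # reports that 'prompt' is a required field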
Understanding Async/Await
You might notice the async and await keywords. These enable our API to handle multiple requests simultaneously:

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):  # async function
    response_text = model.chat(request.prompt)  # blocking call; could be awaited with an async client
    return ChatResponse(response=response_text)
Why use async?
Concurrency: Handle multiple users at once
Performance: Don’t block while waiting for AI response
Scalability: Support more concurrent users
Efficiency: Better resource utilization
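One caveat: model.chat() itself is a blocking call, so a slow Gemini response briefly holds up the event loop. A sketch of one common workaround is to offload the call to a worker thread with asyncio.to_thread (available in Python 3.9+):

import asyncio

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    # Run the blocking SDK call in a thread so the event loop stays free.
    response_text = await asyncio.to_thread(model.chat, request.prompt)
    return ChatResponse(response=response_text)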
Running Your Application
Starting the Development Server
To run your FastAPI application, use Uvicorn in your terminal:
uvicorn src.main:app --reload
Command explanation:
uvicorn: The ASGI server that runs FastAPI apps
src.main:app: Points to the app object in src/main.py
--reload: Automatically restarts when you change code
You should see output like:
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [12345] using StatReload
INFO: Started server process [12346]
INFO: Waiting for application startup.
INFO: Application startup complete.
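If you prefer launching from Python instead of the command line, Uvicorn can also be started programmatically. A sketch using a hypothetical run.py at the project root (note that reload=True requires passing the app as an import string):

import uvicorn

if __name__ == "__main__":
    # Equivalent to: uvicorn src.main:app --reload
    uvicorn.run("src.main:app", host="127.0.0.1", port=8000, reload=True)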
Testing Your API
1. Health Check: Open your browser and visit: http://127.0.0.1:8000
You should see:
{
"message": "App is running!"
}
2. Interactive API Documentation: FastAPI automatically generates beautiful API docs! Visit: http://127.0.0.1:8000/docs
This interactive documentation allows you to:
See all available endpoints
Test API calls directly in the browser
View request/response schemas
Download OpenAPI specifications
3. Testing the Chat Endpoint: In the API docs, click on the POST /chat endpoint, then "Try it out". Enter a test message:
{
"prompt": "Hello, how are you today?"
}
You should receive a response with lots of emojis, just as specified in our system prompt!
Understanding the Complete Flow
Request-Response Cycle
Let’s trace what happens when a user sends a chat message:
1. User sends HTTP POST request to /chat:
POST http://127.0.0.1:8000/chat
Content-Type: application/json
{
"prompt": "Tell me a joke about programming"
}
2. FastAPI receives and validates the request:
Checks that the request body is valid JSON
Validates that "prompt" field exists and is a string
Creates a ChatRequest object
3. Our chat() function processes the request:
Extracts the prompt from the request
Calls model.chat(request.prompt)
The Gemini class combines the system prompt with the user prompt
Sends the request to Google’s Gemini API
4. Gemini AI generates a response:
Processes the combined prompt
Generates a response following system prompt guidelines
Returns the generated text
5. FastAPI sends the response back:
{
"response": "Why do programmers prefer dark mode? Because light attracts bugs!"
}
Testing Your Application
Manual Testing with curl
You can also test your API using curl commands:
# Health check
curl http://127.0.0.1:8000/
# Chat request
curl -X POST "http://127.0.0.1:8000/chat" \
-H "Content-Type: application/json" \
-d '{"prompt": "Tell me about FastAPI"}'
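The same checks work from Python. Here is a small sketch using the requests library (pip install requests; assumes the server is running on port 8000):

import requests

# Health check
print(requests.get("http://127.0.0.1:8000/").json())

# Chat request
resp = requests.post(
    "http://127.0.0.1:8000/chat",
    json={"prompt": "Tell me about FastAPI"},
)
print(resp.json()["response"])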
Conclusion
Congratulations! You’ve successfully built and deployed your first AI-powered web service using FastAPI and Google’s Gemini model. In this tutorial, you learned:
How to set up a modern Python project with FastAPI
How to structure code using abstraction and modular design
How to implement AI interaction using Google’s Gemini API
The basics of system prompts and how they shape AI behavior
How to create and test REST API endpoints using FastAPI and Pydantic
This project introduced you to the world of scalable, production-ready AI APIs. From abstract base classes to real-time chat endpoints, you now have a solid foundation for developing more complex and intelligent applications.
Whether you’re looking to integrate other AI models, add user authentication, or deploy to the cloud, the skills you’ve gained here will serve as a strong starting point. And in the future, we will be looking to extend the functionality of this application!
Keep building, keep experimenting—and welcome to the future of intelligent web development.
Happy coding!