Building AI APIs with FastAPI and Gemini
A Complete Beginner’s Guide to Creating AI-Powered Web APIs
From Setup to a Fully Working REST API: Your First AI Web Service
Author: Aaron Borgi
Email: [email protected]
Company: THINKOFIT
2025-11-07
Introduction
Welcome to the exciting world of AI-powered web APIs! In this comprehensive guide, we’ll build a complete web service that can chat with Google’s powerful Gemini AI model using FastAPI and Python. This tutorial is designed for beginners who want to understand how modern AI applications work behind the scenes.
By the end of this walkthrough, you’ll have:
Built a professional web API using FastAPI
Integrated Google’s Gemini AI for intelligent responses
Learned about software architecture and design patterns
Created a modular, maintainable codebase
Gained an understanding of how AI services work in production
FastAPI is a modern, fast web framework for building APIs with Python. It’s used by companies like Netflix, Microsoft, and Uber for their production systems. Combined with Google’s Gemini AI, we’ll create a powerful chatbot service that can handle real user requests.
Understanding the Project Structure
Before we start coding, let’s understand how our project will be organized. Good project structure is crucial for maintainable code.
Project Directory Layout
Our project follows this structure:
src/
├── __pycache__/ # Python cache files (auto-generated)
├── venv/ # Virtual environment (auto-generated)
├── ai/ # AI-related code
│ ├── __init__.py # Makes 'ai' a Python package
│ ├── base.py # Abstract base class for AI platforms
│ └── gemini.py # Gemini AI implementation
├── prompts/ # System prompts and templates
│ └── system_prompt.md # Instructions for the AI
└── main.py # FastAPI application entry point
Why organize code this way?
Separation of concerns: Each file has a specific purpose
Modularity: Easy to add new AI providers or features
Maintainability: Code is easier to find and modify
Scalability: Structure supports growing applications
Setting Up Your Development Environment
Installing PyCharm Community Edition
We’ll use PyCharm as our development environment. If you haven’t installed it yet:
Download PyCharm Community Edition
Run the installer and follow the setup wizard
Choose these options during installation:
Create desktop shortcut
Add "Open Folder as PyCharm Project" to context menu
Add launchers dir to PATH (Linux/macOS)
Creating a New Project
Open PyCharm
Click "New Project"
Choose a location for your project (e.g., fastapi-gemini-api)
Under "Python Interpreter," select "New environment using Virtualenv"
Ensure "Create a main.py welcome script" is checked
Click "Create"
Installing Required Dependencies
In PyCharm’s terminal (View → Tool Windows → Terminal), install our required packages:
pip install fastapi uvicorn pydantic google-generativeai
Let’s understand what each package does:
FastAPI:
Modern web framework for building APIs
Automatically generates API documentation
Built-in validation and serialization
High performance and easy to learn
Uvicorn:
ASGI server that runs our FastAPI application
Handles HTTP requests and responses
Supports async/await for high performance
Development server with auto-reload
Pydantic:
Data validation library using Python type hints
Automatically validates request/response data
Converts data types automatically
Generates clear error messages
Google-GenerativeAI:
Official Google library for Gemini AI
Handles authentication and API calls
Supports all Gemini model features
Manages rate limiting and errors
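To keep installs reproducible, you can also list these packages in a requirements.txt file (shown unpinned here; pin the versions you test against):

fastapi
uvicorn
pydantic
google-generativeai

Then install everything at once with:

pip install -r requirements.txt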
Building the AI Module
Creating the AI Package
First, create the ai directory and the __init__.py file:
Create a new directory called ai in your src folder
Inside the ai directory, create an empty file called __init__.py
Why do we need __init__.py?
The __init__.py file tells Python that this directory is a "package" - a collection of related modules.
Even though we’re keeping it empty, it serves important purposes:
Package recognition: Python knows this is a package, not just a folder
Import capability: We can import modules from this package
Namespace control: Helps organize code into logical groups
Future extensibility: We can add package-level initialization code later
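With the __init__.py in place, other modules can import from the package. Once we add base.py and gemini.py below, imports like these will work (assuming you run Python from inside src/):

# Possible because ai/ contains an __init__.py:
from ai.base import AiPlatform
from ai.gemini import Gemini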
Creating the Abstract Base Class
Now, let’s create base.py - the foundation of our AI system:
from abc import ABC, abstractmethod

class AiPlatform(ABC):
    """Contract that every AI provider must implement."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        # Child classes must return the model's text reply for a prompt.
        pass
Understanding Abstract Base Classes:
This might look confusing at first, but it’s a powerful concept in software design:
ABC (Abstract Base Class): A template that other classes must follow
@abstractmethod: Forces child classes to implement this method
Why use this? Ensures all AI platforms have a chat method
Flexibility: Easy to add new AI providers (OpenAI, Claude, etc.)
Think of it like a contract: "Any AI platform must be able to chat with a prompt and return a string response."
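To see the contract in action, here is a sketch of how another provider could plug in. The EchoPlatform class is hypothetical (it just echoes the prompt back), but a real OpenAI or Claude wrapper would follow the exact same shape (the import path assumes you run from inside src/):

from ai.base import AiPlatform

class EchoPlatform(AiPlatform):
    # Hypothetical provider used only to illustrate the contract.
    def chat(self, prompt: str) -> str:
        # A real provider would call its API here; we simply echo.
        return f"You said: {prompt}"

Because EchoPlatform implements chat(), Python lets us instantiate it; leave the method out and instantiation fails with a TypeError. That is the abstract contract being enforced.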
Getting Your Gemini API Key
To use the Gemini AI API, you’ll need an API key from Google AI Studio. This key allows your application to authenticate with Google’s servers and access Gemini models securely.
Steps to Get Your API Key
Open your browser and visit: https://aistudio.google.com/app/apikey
Sign in with your Google account if prompted
Click on the "Create API key" button
A new API key will be generated; copy this key
Store it securely. You’ll need it in your code to initialize Gemini
Using the API Key
Once you have your key, you can use it in your Python code when creating the Gemini model instance:
from ai.gemini import Gemini
api_key = "your-api-key-here"
model = Gemini(api_key=api_key)
Important: Do not share your API key publicly or commit it to version control (e.g., GitHub). It’s best practice to load it from an environment variable or configuration file.
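For example, here is a minimal sketch that reads the key from an environment variable instead (the variable name GEMINI_API_KEY is our choice; export it in your shell before starting the app):

import os
from ai.gemini import Gemini

# Read the key from the environment instead of hard-coding it.
api_key = os.environ["GEMINI_API_KEY"]  # raises KeyError if the variable is unset
model = Gemini(api_key=api_key)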
Implementing the Gemini AI Class
Now, let’s create gemini.py - our Gemini AI implementation:
from .base import AiPlatform
import google.generativeai as genai

class Gemini(AiPlatform):
    def __init__(self, api_key: str, sys_prompt: str | None = None) -> None:
        self.api_key = api_key
        self.sys_prompt = sys_prompt
        genai.configure(api_key=self.api_key)
        # See more models at https://ai.google.dev/gemini-api/docs/models
        self.model = genai.GenerativeModel("gemini-2.0-flash")

    def chat(self, prompt: str) -> str:
        # Prepend the system prompt, if one was provided.
        if self.sys_prompt:
            prompt = f"{self.sys_prompt}\n\n{prompt}"
        response = self.model.generate_content(prompt)
        return response.text
Breaking down the Gemini class:
Class initialization (__init__):
api_key: Your Google AI API key for authentication
sys_prompt: Optional system prompt to guide AI behavior
genai.configure(): Sets up the Google AI library
GenerativeModel(): Creates a connection to gemini-2.0-flash
Chat method:
Prompt combination: Adds system prompt to user prompt if provided
generate_content(): Sends request to Gemini AI
response.text: Extracts the text response from Gemini
Why use "gemini-2.0-flash"?
Fast response times (good for web APIs)
High quality text generation
Cost-effective for most applications
Supports multimodal input (text, images, etc.)
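Before wiring the class into a web API, you can sanity-check it with a tiny standalone script (a sketch; run it from inside src/ and substitute your real key):

from ai.gemini import Gemini

# Placeholder key for illustration only.
model = Gemini(api_key="your-api-key-here", sys_prompt="Reply in one short sentence.")
print(model.chat("What is FastAPI?"))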
Creating the System Prompt
Understanding System Prompts
System prompts are instructions that tell the AI how to behave. They’re like giving personality and guidelines to the AI before users start chatting with it.
Create a prompts directory and add system_prompt.md:
Answer the user in plaintext (no markdown), but use lots of emojis!
Be simple, clear and concise.
Why use a separate file for system prompts?
Easy editing: Non-programmers can modify AI behavior
Version control: Track changes to AI instructions
A/B testing: Easy to test different prompt styles
Collaboration: Team members can review and improve prompts
Building the FastAPI Application
Understanding FastAPI Basics
FastAPI is a modern web framework that makes building APIs incredibly easy. It automatically handles many complex tasks like request validation, response serialization, and API documentation.
Creating the Main Application
Let’s build our main.py file step by step:
# --- App Initialization ---
from fastapi import FastAPI
from pydantic import BaseModel
from .ai.gemini import Gemini

app = FastAPI()

def load_sys_prompt() -> str:
    with open("src/prompts/system_prompt.md") as f:
        return f.read()

sys_prompt = load_sys_prompt()

gemini_api_key = "YOUR_API_KEY_HERE"
model = Gemini(api_key=gemini_api_key, sys_prompt=sys_prompt)

class ChatRequest(BaseModel):
    prompt: str

class ChatResponse(BaseModel):
    response: str

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    response_text = model.chat(request.prompt)
    return ChatResponse(response=response_text)

@app.get("/")
async def root():
    return {"message": "App is running!"}
Breaking Down the Code:
1. Imports and App Creation:
FastAPI: The main web framework class
BaseModel: Pydantic class for data validation
Gemini: Our custom AI class
app = FastAPI(): Creates our web application
2. Configuration Loading:
load_sys_prompt(): Reads our system prompt from file
gemini_api_key: Your Google AI API key (keep this secure!)
model = Gemini(): Creates our AI model instance
3. Data Models:
ChatRequest: Defines what data we expect from users
ChatResponse: Defines what data we send back
Pydantic validation: Automatically checks data types and formats
4. API Endpoints:
@app.post("/chat"): Handles POST requests to the /chat URL
async def chat(): Processes chat requests asynchronously
@app.get("/"): Health check endpoint
Understanding Pydantic Models
Pydantic models are special classes that automatically validate and convert data:
from pydantic import BaseModel

class ChatRequest(BaseModel):
    prompt: str  # Must be a string, required field

class ChatResponse(BaseModel):
    response: str  # Must be a string, required field
What Pydantic does for us:
Type checking: Ensures prompt is actually a string
Required fields: Returns error if prompt is missing
JSON conversion: Automatically converts to/from JSON
Documentation: Creates API docs automatically
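You can watch this validation happen outside the web server too. Here is a quick sketch using Pydantic v2’s model_validate (on Pydantic v1, parse_obj plays the same role):

from pydantic import BaseModel, ValidationError

class ChatRequest(BaseModel):
    prompt: str

# Valid input parses cleanly.
req = ChatRequest.model_validate({"prompt": "Hello!"})
print(req.prompt)  # -> Hello!

# A missing field raises a clear, structured error.
try:
    ChatRequest.model_validate({})
except ValidationError as err:
    print(err)  # reports that 'prompt' is a required field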
Understanding Async/Await
You might notice the async and await keywords. These enable our API to handle multiple requests simultaneously:

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):  # async function
    response_text = model.chat(request.prompt)  # blocking call; could be awaited with an async client
    return ChatResponse(response=response_text)
Why use async?
Concurrency: Handle multiple users at once
Performance: Don’t block while waiting for AI response
Scalability: Support more concurrent users
Efficiency: Better resource utilization
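One caveat: model.chat() itself is a blocking call, so a slow Gemini response briefly holds up the event loop. A sketch of one common workaround is to offload the call to a worker thread with asyncio.to_thread (available in Python 3.9+):

import asyncio

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    # Run the blocking SDK call in a thread so the event loop stays free.
    response_text = await asyncio.to_thread(model.chat, request.prompt)
    return ChatResponse(response=response_text)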
Running Your Application
Starting the Development Server
To run your FastAPI application, use Uvicorn in your terminal:
uvicorn src.main:app --reload
Command explanation:
uvicorn: The ASGI server that runs FastAPI apps
src.main:app: Points to the app object in src/main.py
--reload: Automatically restarts when you change code
You should see output like:
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [12345] using StatReload
INFO: Started server process [12346]
INFO: Waiting for application startup.
INFO: Application startup complete.
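If you prefer launching from Python instead of the command line, Uvicorn can also be started programmatically. A sketch using a hypothetical run.py at the project root (note that reload=True requires passing the app as an import string):

import uvicorn

if __name__ == "__main__":
    # Equivalent to: uvicorn src.main:app --reload
    uvicorn.run("src.main:app", host="127.0.0.1", port=8000, reload=True)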
Testing Your API
1. Health Check: Open your browser and visit: http://127.0.0.1:8000
You should see:
{
"message": "App is running!"
}
2. Interactive API Documentation: FastAPI automatically generates beautiful API docs! Visit: http://127.0.0.1:8000/docs
This interactive documentation allows you to:
See all available endpoints
Test API calls directly in the browser
View request/response schemas
Download OpenAPI specifications
3. Testing the Chat Endpoint: In the API docs, click on the POST /chat endpoint, then "Try it out". Enter a test message:
{
"prompt": "Hello, how are you today?"
}
You should receive a response with lots of emojis, just as specified in our system prompt!
Understanding the Complete Flow
Request-Response Cycle
Let’s trace what happens when a user sends a chat message:
1. User sends HTTP POST request to /chat:
POST http://127.0.0.1:8000/chat
Content-Type: application/json
{
"prompt": "Tell me a joke about programming"
}
2. FastAPI receives and validates the request:
Checks that the request body is valid JSON
Validates that "prompt" field exists and is a string
Creates a ChatRequest object
3. Our chat() function processes the request:
Extracts the prompt from the request
Calls model.chat(request.prompt)
The Gemini class combines the system prompt with the user prompt
Sends the request to Google’s Gemini API
4. Gemini AI generates a response:
Processes the combined prompt
Generates a response following system prompt guidelines
Returns the generated text
5. FastAPI sends the response back:
{
"response": "Why do programmers prefer dark mode? Because light attracts bugs!"
}
Testing Your Application
Manual Testing with curl
You can also test your API using curl commands:
# Health check
curl http://127.0.0.1:8000/
# Chat request
curl -X POST "http://127.0.0.1:8000/chat" \
-H "Content-Type: application/json" \
-d '{"prompt": "Tell me about FastAPI"}'
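The same checks work from Python. Here is a small sketch using the requests library (pip install requests; assumes the server is running on port 8000):

import requests

# Health check
print(requests.get("http://127.0.0.1:8000/").json())

# Chat request
resp = requests.post(
    "http://127.0.0.1:8000/chat",
    json={"prompt": "Tell me about FastAPI"},
)
print(resp.json()["response"])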
Conclusion
Congratulations! You’ve successfully built and deployed your first AI-powered web service using FastAPI and Google’s Gemini model. In this tutorial, you learned:
How to set up a modern Python project with FastAPI
How to structure code using abstraction and modular design
How to implement AI interaction using Google’s Gemini API
The basics of system prompts and how they shape AI behavior
How to create and test REST API endpoints using FastAPI and Pydantic
This project introduced you to the world of scalable, production-ready AI APIs. From abstract base classes to real-time chat endpoints, you now have a solid foundation for developing more complex and intelligent applications.
Whether you’re looking to integrate other AI models, add user authentication, or deploy to the cloud, the skills you’ve gained here will serve as a strong starting point. And in the future, we will be looking to extend the functionality of this application!
Keep building, keep experimenting—and welcome to the future of intelligent web development.
Happy coding!