Building AI Applications with Google Gemini API and TypeScript

Google Gemini 2.5 Pro is here. With a 1M token context window, native multimodal capabilities, and built-in function calling, Gemini is one of the most powerful AI APIs available today. In this tutorial, you will learn to harness its full potential using TypeScript.
What You Will Learn
By the end of this tutorial, you will:
- Set up the Google Generative AI SDK in a TypeScript project
- Generate text using Gemini 2.5 Pro and Gemini 2.5 Flash
- Process images, PDFs, and audio with multimodal input
- Implement real-time streaming responses
- Use function calling to connect Gemini to external tools
- Extract structured data with JSON schema output
- Build a practical AI assistant with conversation history
Prerequisites
Before starting, ensure you have:
- Node.js 20+ installed (check with node --version)
- TypeScript knowledge (types, async/await, modules)
- A Google AI Studio API key — get one free at aistudio.google.com
- A code editor — VS Code or Cursor recommended
- Basic understanding of REST APIs and async programming
Step 1: Project Setup
Create a new TypeScript project and install the Google Generative AI SDK:
mkdir gemini-ai-app && cd gemini-ai-app
npm init -y
npm install @google/generative-ai
npm install -D typescript @types/node tsx
Initialize TypeScript:
npx tsc --init --target ES2022 --module NodeNext --moduleResolution NodeNext --outDir dist --rootDir src --strict
Create the project structure:
mkdir src
touch src/index.ts .env
Add your API key to .env:
GEMINI_API_KEY=your_api_key_here
Update package.json scripts:
{
"scripts": {
"dev": "tsx watch src/index.ts",
"build": "tsc",
"start": "node dist/index.js"
}
}
Step 2: Your First Text Generation
Let us start with a simple text generation example. Create src/index.ts:
import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
async function generateText() {
const model = genAI.getGenerativeModel({
model: "gemini-2.5-pro",
});
const result = await model.generateContent(
"Explain quantum computing in 3 sentences for a software developer."
);
console.log(result.response.text());
}
generateText();
Run it:
GEMINI_API_KEY=your_key npx tsx src/index.ts
You should see a concise explanation of quantum computing. The generateContent method sends a single prompt and returns the complete response. Note that the key is passed inline here because tsx does not load .env automatically.
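The .env file created in Step 1 is not read automatically by Node or tsx. The usual choice is the dotenv package (an extra install), but for illustration, here is a minimal dependency-free loader — a hypothetical helper sketch, not part of the SDK:

```typescript
import * as fs from "fs";

// Minimal .env loader: parses KEY=VALUE lines into process.env.
// A sketch only — dotenv (or Node's --env-file flag) is the usual choice.
function loadEnv(path = ".env"): Record<string, string> {
  const vars: Record<string, string> = {};
  if (!fs.existsSync(path)) return vars;
  for (const line of fs.readFileSync(path, "utf8").split("\n")) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue; // skip blanks and comments
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue; // skip malformed lines
    const key = trimmed.slice(0, eq).trim();
    const value = trimmed.slice(eq + 1).trim();
    vars[key] = value;
    process.env[key] ??= value; // never clobber real environment variables
  }
  return vars;
}
```

Call loadEnv() at the top of src/index.ts before constructing the client, and the inline GEMINI_API_KEY=... prefix becomes unnecessary.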
Understanding the Response Object
The response contains more than just text:
async function inspectResponse() {
const model = genAI.getGenerativeModel({ model: "gemini-2.5-pro" });
const result = await model.generateContent("Hello, Gemini!");
const response = result.response;
console.log("Text:", response.text());
console.log("Usage:", response.usageMetadata);
// { promptTokenCount: 4, candidatesTokenCount: 12, totalTokenCount: 16 }
}
The usageMetadata field helps you track token consumption for cost management.
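For cost tracking across many calls, those usage numbers can be accumulated in a small counter. This is an illustrative sketch — the field names follow the usageMetadata shape shown above, but the class itself is not part of the SDK:

```typescript
// Matches the fields shown in usageMetadata above.
interface Usage {
  promptTokenCount: number;
  candidatesTokenCount: number;
  totalTokenCount: number;
}

class UsageTracker {
  private totals: Usage = {
    promptTokenCount: 0,
    candidatesTokenCount: 0,
    totalTokenCount: 0,
  };

  // Call after each request with response.usageMetadata.
  record(usage?: Usage): void {
    if (!usage) return; // metadata can be absent on some responses
    this.totals.promptTokenCount += usage.promptTokenCount;
    this.totals.candidatesTokenCount += usage.candidatesTokenCount;
    this.totals.totalTokenCount += usage.totalTokenCount;
  }

  summary(): Usage {
    return { ...this.totals };
  }
}
```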
Step 3: Choosing the Right Model
Gemini offers several models optimized for different use cases:
| Model | Best For | Context Window | Speed |
|---|---|---|---|
| gemini-2.5-pro | Complex reasoning, coding, analysis | 1M tokens | Moderate |
| gemini-2.5-flash | Fast responses, high throughput | 1M tokens | Fast |
| gemini-2.5-flash-lite | Cost-efficient, simple tasks | 1M tokens | Fastest |
// For complex analysis
const pro = genAI.getGenerativeModel({ model: "gemini-2.5-pro" });
// For speed-critical applications
const flash = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
// For high-volume, simple tasks
const lite = genAI.getGenerativeModel({ model: "gemini-2.5-flash-lite" });
Tip: Start with gemini-2.5-flash for development. Switch to Pro only when you need its enhanced reasoning. Flash is significantly cheaper and faster.
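That guidance can be encoded in a small routing helper. The task categories below are illustrative, not an official taxonomy:

```typescript
type TaskKind = "reasoning" | "general" | "bulk";

// Map a task category to a model name, following the table above.
function modelForTask(task: TaskKind): string {
  switch (task) {
    case "reasoning":
      return "gemini-2.5-pro"; // complex analysis and coding
    case "bulk":
      return "gemini-2.5-flash-lite"; // high-volume, simple tasks
    default:
      return "gemini-2.5-flash"; // sensible default for everything else
  }
}

// Usage: genAI.getGenerativeModel({ model: modelForTask("general") })
```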
Step 4: Multimodal Input — Images and PDFs
One of Gemini's strongest features is native multimodal understanding. You can send images, PDFs, audio, and video alongside text.
Analyzing an Image
import { GoogleGenerativeAI } from "@google/generative-ai";
import * as fs from "fs";
import * as path from "path";
async function analyzeImage(imagePath: string) {
const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
const imageData = fs.readFileSync(imagePath);
const base64Image = imageData.toString("base64");
const mimeType = imagePath.endsWith(".png") ? "image/png" : "image/jpeg";
const result = await model.generateContent([
{
inlineData: {
mimeType,
data: base64Image,
},
},
"Describe this image in detail. What objects, colors, and scene do you see?",
]);
console.log(result.response.text());
}
analyzeImage("./sample-image.jpg");
Processing a PDF Document
async function analyzePDF(pdfPath: string) {
const model = genAI.getGenerativeModel({ model: "gemini-2.5-pro" });
const pdfData = fs.readFileSync(pdfPath);
const base64PDF = pdfData.toString("base64");
const result = await model.generateContent([
{
inlineData: {
mimeType: "application/pdf",
data: base64PDF,
},
},
"Summarize the key points of this document. List the main topics and any action items.",
]);
console.log(result.response.text());
}
Comparing Multiple Images
async function compareImages(image1Path: string, image2Path: string) {
const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
const image1 = fs.readFileSync(image1Path).toString("base64");
const image2 = fs.readFileSync(image2Path).toString("base64");
const result = await model.generateContent([
{ inlineData: { mimeType: "image/jpeg", data: image1 } },
{ inlineData: { mimeType: "image/jpeg", data: image2 } },
"Compare these two images. What are the differences and similarities?",
]);
console.log(result.response.text());
}
Step 5: Streaming Responses
For better user experience, stream responses token by token instead of waiting for the full response:
async function streamResponse(prompt: string) {
const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
const result = await model.generateContentStream(prompt);
process.stdout.write("Gemini: ");
for await (const chunk of result.stream) {
const text = chunk.text();
process.stdout.write(text);
}
console.log("\n");
// Access the aggregated response after streaming
const aggregated = await result.response;
console.log("Total tokens:", aggregated.usageMetadata?.totalTokenCount);
}
streamResponse("Write a short poem about TypeScript.");
Streaming in a Web Server
Here is how to integrate streaming with a simple HTTP server:
import { createServer } from "http";
const server = createServer(async (req, res) => {
if (req.method === "POST" && req.url === "/chat") {
const body = await getBody(req);
const { message } = JSON.parse(body);
const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
const result = await model.generateContentStream(message);
res.writeHead(200, {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
Connection: "keep-alive",
});
for await (const chunk of result.stream) {
res.write(`data: ${JSON.stringify({ text: chunk.text() })}\n\n`);
}
res.write("data: [DONE]\n\n");
res.end();
}
});
server.listen(3000, () => console.log("Server running on port 3000"));
function getBody(req: import("http").IncomingMessage): Promise<string> {
return new Promise((resolve) => {
let body = "";
req.on("data", (chunk: Buffer) => (body += chunk));
req.on("end", () => resolve(body));
});
}
Step 6: Function Calling
Function calling lets Gemini invoke your TypeScript functions to fetch real-time data, interact with APIs, or perform calculations.
Defining Tools
import {
GoogleGenerativeAI,
FunctionDeclarationSchemaType,
} from "@google/generative-ai";
const tools = [
{
functionDeclarations: [
{
name: "getWeather",
description:
"Get the current weather for a specific location",
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
location: {
type: FunctionDeclarationSchemaType.STRING,
description: "City name, e.g., 'Tunis' or 'Paris'",
},
unit: {
type: FunctionDeclarationSchemaType.STRING,
enum: ["celsius", "fahrenheit"],
description: "Temperature unit",
},
},
required: ["location"],
},
},
{
name: "searchProducts",
description: "Search for products in the catalog",
parameters: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
query: {
type: FunctionDeclarationSchemaType.STRING,
description: "Search query",
},
maxPrice: {
type: FunctionDeclarationSchemaType.NUMBER,
description: "Maximum price filter",
},
},
required: ["query"],
},
},
],
},
];
Handling Function Calls
// Simulated function implementations
function getWeather(location: string, unit = "celsius") {
const mockData: Record<string, { temp: number; condition: string }> = {
tunis: { temp: 24, condition: "Sunny" },
paris: { temp: 15, condition: "Cloudy" },
tokyo: { temp: 20, condition: "Clear" },
};
const data = mockData[location.toLowerCase()] || {
temp: 20,
condition: "Unknown",
};
return {
location,
temperature: data.temp,
unit,
condition: data.condition,
};
}
function searchProducts(query: string, maxPrice?: number) {
return {
results: [
{ name: `${query} Pro`, price: 99.99, rating: 4.5 },
{ name: `${query} Lite`, price: 49.99, rating: 4.2 },
].filter((p) => !maxPrice || p.price <= maxPrice),
};
}
async function chatWithTools(userMessage: string) {
const model = genAI.getGenerativeModel({
model: "gemini-2.5-flash",
tools,
});
const chat = model.startChat();
const result = await chat.sendMessage(userMessage);
const response = result.response;
const functionCalls = response.functionCalls();
if (functionCalls && functionCalls.length > 0) {
const functionResponses = functionCalls.map((call) => {
let result;
switch (call.name) {
case "getWeather":
result = getWeather(
call.args.location as string,
call.args.unit as string
);
break;
case "searchProducts":
result = searchProducts(
call.args.query as string,
call.args.maxPrice as number
);
break;
default:
result = { error: "Unknown function" };
}
return {
functionResponse: {
name: call.name,
response: result,
},
};
});
// Send function results back to Gemini
const finalResult = await chat.sendMessage(functionResponses);
console.log(finalResult.response.text());
} else {
console.log(response.text());
}
}
chatWithTools("What is the weather in Tunis and find me some laptop options under $80?");
Gemini can request both functions in a single turn and then synthesize the results into a natural language response.
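As the number of tools grows, the switch statement above becomes unwieldy. One alternative is a registry keyed by function name — sketched here with stub handlers standing in for the getWeather and searchProducts implementations above:

```typescript
// Each handler receives the parsed args object from the model's function call.
type ToolHandler = (args: Record<string, unknown>) => unknown;

// In the tutorial these entries would delegate to the real getWeather
// and searchProducts functions; stubs are used here for illustration.
const toolRegistry: Record<string, ToolHandler> = {
  getWeather: (args) => ({ location: args.location, temperature: 24 }),
  searchProducts: (args) => ({ query: args.query, results: [] }),
};

function dispatchTool(name: string, args: Record<string, unknown>): unknown {
  const handler = toolRegistry[name];
  return handler ? handler(args) : { error: `Unknown function: ${name}` };
}
```

With this in place, the switch in chatWithTools collapses to a single dispatchTool(call.name, call.args) call.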
Step 7: Structured Output with JSON Schema
Force Gemini to return data in a specific JSON structure — perfect for building reliable data pipelines:
async function extractStructuredData(text: string) {
const model = genAI.getGenerativeModel({
model: "gemini-2.5-flash",
generationConfig: {
responseMimeType: "application/json",
responseSchema: {
type: FunctionDeclarationSchemaType.OBJECT,
properties: {
sentiment: {
type: FunctionDeclarationSchemaType.STRING,
enum: ["positive", "negative", "neutral"],
},
confidence: {
type: FunctionDeclarationSchemaType.NUMBER,
},
topics: {
type: FunctionDeclarationSchemaType.ARRAY,
items: {
type: FunctionDeclarationSchemaType.STRING,
},
},
summary: {
type: FunctionDeclarationSchemaType.STRING,
},
},
required: ["sentiment", "confidence", "topics", "summary"],
},
},
});
const result = await model.generateContent(
`Analyze the following text:\n\n${text}`
);
const data = JSON.parse(result.response.text());
return data;
}
// Usage
const analysis = await extractStructuredData(
"The new Gemini API is incredible! The documentation is clear, " +
"the TypeScript SDK is well-designed, and the pricing is very competitive. " +
"However, rate limits can be a concern for high-traffic applications."
);
console.log(analysis);
// {
// sentiment: "positive",
// confidence: 0.85,
// topics: ["API", "documentation", "SDK", "pricing", "rate limits"],
// summary: "Positive review of Gemini API with minor concern about rate limits"
// }
Step 8: Building a Conversational AI Assistant
Let us put everything together into a practical AI assistant with conversation history, system instructions, and safety settings:
import {
GoogleGenerativeAI,
HarmCategory,
HarmBlockThreshold,
} from "@google/generative-ai";
import * as readline from "readline";
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
async function startAssistant() {
const model = genAI.getGenerativeModel({
model: "gemini-2.5-flash",
systemInstruction: `You are a helpful coding assistant specializing in TypeScript and Node.js.
You provide concise, practical answers with code examples when appropriate.
Always consider best practices and security implications.
If you are unsure about something, say so rather than guessing.`,
safetySettings: [
{
category: HarmCategory.HARM_CATEGORY_HARASSMENT,
threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
},
{
category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
},
],
generationConfig: {
temperature: 0.7,
topP: 0.95,
topK: 40,
maxOutputTokens: 2048,
},
});
const chat = model.startChat({
history: [],
});
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
});
console.log("AI Assistant ready! Type 'exit' to quit.\n");
const askQuestion = () => {
rl.question("You: ", async (input) => {
const trimmed = input.trim();
if (trimmed.toLowerCase() === "exit") {
console.log("Goodbye!");
rl.close();
return;
}
try {
const result = await chat.sendMessageStream(trimmed);
process.stdout.write("Assistant: ");
for await (const chunk of result.stream) {
process.stdout.write(chunk.text());
}
console.log("\n");
} catch (error: any) {
console.error("Error:", error.message);
}
askQuestion();
});
};
askQuestion();
}
startAssistant();
Generation Config Explained
| Parameter | Default | Description |
|---|---|---|
| temperature | 1.0 | Controls randomness. Lower = more focused, higher = more creative |
| topP | 0.95 | Nucleus sampling. Consider tokens with cumulative probability up to this value |
| topK | 40 | Consider only the top K tokens at each step |
| maxOutputTokens | Varies | Maximum number of tokens to generate |
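These parameters are often bundled into named presets. The values below are illustrative starting points for experimentation, not recommendations from Google:

```typescript
interface GenPreset {
  temperature: number;
  topP: number;
  topK: number;
  maxOutputTokens: number;
}

// Lower temperature for deterministic extraction; higher for open-ended writing.
const presets: Record<"precise" | "balanced" | "creative", GenPreset> = {
  precise: { temperature: 0.2, topP: 0.9, topK: 20, maxOutputTokens: 1024 },
  balanced: { temperature: 0.7, topP: 0.95, topK: 40, maxOutputTokens: 2048 },
  creative: { temperature: 1.2, topP: 0.98, topK: 64, maxOutputTokens: 4096 },
};

// Usage: genAI.getGenerativeModel({ model: "...", generationConfig: presets.precise })
```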
Step 9: Error Handling and Rate Limiting
Production applications need robust error handling:
import { GoogleGenerativeAI } from "@google/generative-ai";
async function safeGenerate(prompt: string, retries = 3): Promise<string> {
const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
for (let attempt = 1; attempt <= retries; attempt++) {
try {
const result = await model.generateContent(prompt);
return result.response.text();
} catch (error: any) {
const status = error?.status;
if (status === 429) {
// Rate limited — exponential backoff
const delay = Math.pow(2, attempt) * 1000;
console.warn(
`Rate limited. Retrying in ${delay / 1000}s (attempt ${attempt}/${retries})`
);
await new Promise((r) => setTimeout(r, delay));
continue;
}
if (status === 400) {
console.error("Bad request — check your prompt or parameters");
throw error;
}
if (status === 403) {
console.error("API key invalid or quota exceeded");
throw error;
}
if (attempt === retries) throw error;
console.warn(`Error (attempt ${attempt}/${retries}):`, error.message);
await new Promise((r) => setTimeout(r, 1000 * attempt));
}
}
throw new Error("Max retries exceeded");
}
Rate Limit Best Practices
- Use exponential backoff — double the wait time between retries
- Cache responses — store results for identical prompts
- Use Flash for high-volume — it has higher rate limits than Pro
- Batch requests — group related prompts when possible
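The caching bullet above can be sketched as a simple in-memory memoizer. This is an illustration only — a production version would want TTLs, size bounds, and persistence:

```typescript
const responseCache = new Map<string, string>();

// Wrap any prompt->text function with a cache keyed on the exact prompt string.
async function cachedGenerate(
  prompt: string,
  generate: (p: string) => Promise<string>
): Promise<string> {
  const hit = responseCache.get(prompt);
  if (hit !== undefined) return hit; // cache hit: no API call made
  const text = await generate(prompt);
  responseCache.set(prompt, text);
  return text;
}

// Usage: cachedGenerate(prompt, (p) => safeGenerate(p))
```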
Step 10: Production Tips
Environment Configuration
// src/config.ts
interface GeminiConfig {
apiKey: string;
model: string;
maxRetries: number;
timeout: number;
}
export function getConfig(): GeminiConfig {
const apiKey = process.env.GEMINI_API_KEY;
if (!apiKey) {
throw new Error("GEMINI_API_KEY environment variable is required");
}
return {
apiKey,
model: process.env.GEMINI_MODEL || "gemini-2.5-flash",
maxRetries: parseInt(process.env.GEMINI_MAX_RETRIES || "3", 10),
timeout: parseInt(process.env.GEMINI_TIMEOUT || "30000", 10),
};
}
Token Counting
Before sending large prompts, check the token count:
async function countTokens(content: string) {
const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
const result = await model.countTokens(content);
console.log(`Token count: ${result.totalTokens}`);
return result.totalTokens;
}
Cost Estimation
function estimateCost(
inputTokens: number,
outputTokens: number,
model: "pro" | "flash" | "flash-lite"
) {
// Approximate pricing per 1M tokens (check current pricing)
const pricing = {
pro: { input: 1.25, output: 10.0 },
flash: { input: 0.15, output: 0.6 },
"flash-lite": { input: 0.075, output: 0.3 },
};
const p = pricing[model];
const inputCost = (inputTokens / 1_000_000) * p.input;
const outputCost = (outputTokens / 1_000_000) * p.output;
return {
inputCost: `$${inputCost.toFixed(4)}`,
outputCost: `$${outputCost.toFixed(4)}`,
totalCost: `$${(inputCost + outputCost).toFixed(4)}`,
};
}
Troubleshooting
Common Issues
"API key not valid"
- Verify your key at aistudio.google.com
- Ensure the Generative AI API is enabled in your Google Cloud project
- Check that your key is not restricted to specific APIs
"Resource exhausted" (429 errors)
- You have hit the rate limit. Implement exponential backoff
- Consider upgrading to a paid tier for higher limits
- Use gemini-2.5-flash, which has more generous rate limits
"Content blocked by safety filters"
- Adjust safetySettings thresholds (but be responsible)
- Rephrase the prompt to avoid triggering filters
- Review Google's usage policies
Empty responses
- The model may have been blocked by safety filters with no text returned
- Check response.promptFeedback for block reasons
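A small guard makes the empty-response case explicit. The feedback interface here is a simplified stand-in for the SDK's promptFeedback shape:

```typescript
// Simplified from the SDK's promptFeedback field; only blockReason is modeled.
interface PromptFeedbackLike {
  blockReason?: string;
}

function diagnoseEmptyResponse(text: string, feedback?: PromptFeedbackLike): string {
  if (text.length > 0) return "ok";
  if (feedback?.blockReason) return `blocked: ${feedback.blockReason}`;
  return "empty response with no block reason reported";
}

// Usage: diagnoseEmptyResponse(response.text(), response.promptFeedback)
```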
Next Steps
Now that you have mastered the Gemini API fundamentals, explore these advanced topics:
- Gemini with LangChain.js — Build complex AI chains and agents
- Vertex AI — Enterprise-grade deployment with Google Cloud
- Fine-tuning — Customize Gemini models for your domain
- Context caching — Reduce costs for repeated context windows
- Grounding with Google Search — Enhance responses with real-time web data
Conclusion
You have built a complete AI application toolkit using the Google Gemini API and TypeScript. From simple text generation to multimodal analysis, streaming, function calling, and structured output — you now have all the building blocks to create sophisticated AI-powered applications.
The Gemini API's combination of a massive context window, competitive pricing, and native multimodal support makes it an excellent choice for production AI applications. Start with Flash for development speed, and scale to Pro when you need enhanced reasoning capabilities.