Exploring the New Responses API: A Comprehensive Guide

By Anis Marrouchi & AI Bot

The Responses API offers a stateful way to handle multi-turn interactions with AI models. Unlike stateless APIs, where the caller resends the full message history with every request, it can keep conversation state on the server, so you don't have to manage that state yourself. This guide walks through the essential steps: creating a response, retrieving it, continuing a conversation, and using hosted tools and multimodal input.

Getting Started

First, make sure the OpenAI Python package is installed (pip install openai) and your API key is available in the OPENAI_API_KEY environment variable.

from openai import OpenAI
import os
 
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

Creating a Response

To initiate a conversation, use the responses.create method. Here's how to ask the model to tell a joke:

response = client.responses.create(
    model="gpt-4o-mini",
    input="tell me a joke",
)
# Here the first output item is a message, and its first content part holds the text.
print(response.output[0].content[0].text)
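
Indexing `output[0].content[0]` works when the first output item is a message, which may not hold once tools are involved. The sketch below is one way to gather the text more defensively. It assumes each output item exposes a `type` attribute and that message items carry `content` parts of type `output_text`, as the snippets in this guide suggest. `collect_text` is a hypothetical helper name, and the stand-in objects only mimic that shape so the example runs without an API call.

```python
from types import SimpleNamespace


def collect_text(output_items):
    """Join the text of every output_text part found in message items."""
    parts = []
    for item in output_items:
        # Skip anything that is not a message, for example a tool call.
        if getattr(item, "type", None) != "message":
            continue
        for part in getattr(item, "content", None) or []:
            if getattr(part, "type", None) == "output_text":
                parts.append(part.text)
    return "".join(parts)


# Stand-in objects (assumed shape, not real API output).
fake_output = [
    SimpleNamespace(type="web_search_call", id="ws_1"),
    SimpleNamespace(
        type="message",
        content=[SimpleNamespace(type="output_text", text="Here is a joke.")],
    ),
]
print(collect_text(fake_output))
```

With a real response, `collect_text(response.output)` would be the equivalent call. The Python SDK's `response.output_text` property is a built-in shortcut for the same idea.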

Stateful Conversations

One of the key features of the Responses API is its statefulness. Because responses are stored on the server, you can retrieve one later by its ID without keeping the conversation in your own code:

fetched_response = client.responses.retrieve(
    response_id=response.id
)
print(fetched_response.output[0].content[0].text)

Continuing Conversations

To continue a conversation, simply reference the previous response ID:

response_two = client.responses.create(
    model="gpt-4o-mini",
    input="tell me another",
    previous_response_id=response.id
)
print(response_two.output[0].content[0].text)
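
The same chaining pattern extends to any number of turns. The following is a minimal sketch, not an official helper: `run_conversation` is a hypothetical function that takes an already constructed client (such as the `client` created earlier), sends each prompt in order, and passes the previous response's `id` as `previous_response_id` from the second turn onward.

```python
def run_conversation(client, model, prompts):
    """Send prompts in order, linking each call to the previous response."""
    previous_id = None
    responses = []
    for prompt in prompts:
        kwargs = {"model": model, "input": prompt}
        # The first turn has nothing to link to, so the parameter is omitted.
        if previous_id is not None:
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)
        previous_id = response.id
        responses.append(response)
    return responses
```

A call such as `run_conversation(client, "gpt-4o-mini", ["tell me a joke", "tell me another"])` would reproduce the two-step exchange above. Only the new prompt is sent on each turn; the earlier context is looked up from the referenced response.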

Hosted Tools

The Responses API supports hosted tools like web_search. These tools run on the API side, so the model can search the web and use the results in its answer without you writing any search code:

import json

response = client.responses.create(
    model="gpt-4o",
    input="What's the latest news about AI?",
    tools=[
        {
            "type": "web_search"
        }
    ]
)
# Dump every output item (tool calls and messages) as JSON for inspection.
print(json.dumps(response.output, default=lambda o: o.__dict__, indent=2))
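
The JSON dump is useful for inspection, but in practice you often want only the sources the model cited. The sketch below is one way to pull them out. It assumes that text parts in message items carry an `annotations` list whose entries have a `type` of `url_citation` plus `title` and `url` attributes; verify that shape against your own dumped output before relying on it. `collect_citations` is a hypothetical helper name, and the stand-in objects exist only so the example runs without an API call.

```python
from types import SimpleNamespace


def collect_citations(output_items):
    """Return (title, url) pairs for every url_citation annotation found."""
    citations = []
    for item in output_items:
        # Tool-call items carry no text parts, so only messages are scanned.
        if getattr(item, "type", None) != "message":
            continue
        for part in getattr(item, "content", None) or []:
            for ann in getattr(part, "annotations", None) or []:
                if getattr(ann, "type", None) == "url_citation":
                    citations.append((ann.title, ann.url))
    return citations


# Stand-in objects (assumed shape, not real API output).
fake_search_output = [
    SimpleNamespace(type="web_search_call", id="ws_1"),
    SimpleNamespace(
        type="message",
        content=[
            SimpleNamespace(
                type="output_text",
                text="Some summary.",
                annotations=[
                    SimpleNamespace(
                        type="url_citation",
                        title="Example article",
                        url="https://example.com/article",
                    )
                ],
            )
        ],
    ),
]
for title, url in collect_citations(fake_search_output):
    print(f"{title}: {url}")
```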

Multimodal Interactions

The API natively supports text, images, and audio, allowing for rich, multimodal interactions:

response_multimodal = client.responses.create(
    model="gpt-4o",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": "Come up with keywords related to the image, and search on the web using the search tool for any news related to the keywords, summarize the findings and cite the sources."},
                {"type": "input_image", "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/1/15/Cat_August_2010-4.jpg/2880px-Cat_August_2010-4.jpg"}
            ]
        }
    ],
    tools=[
        {"type": "web_search"}
    ]
)
import json
print(json.dumps(response_multimodal.__dict__, default=lambda o: o.__dict__, indent=4))
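
The example above passes the image as a public URL. For a local file, one common approach is to embed the bytes as a base64 data URL in the same `image_url` field. The sketch below builds such a content part; `image_part_from_file` is a hypothetical helper name, and the assumption that `image_url` accepts a data URL should be checked against the current API documentation.

```python
import base64
import mimetypes


def image_part_from_file(path):
    """Build an input_image content part from a local image file."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognized image file: {path}")
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    # Same dictionary shape as the input_image part used in this guide.
    return {"type": "input_image", "image_url": f"data:{mime};base64,{encoded}"}
```

The returned dictionary can take the place of the `input_image` entry in the `content` list, next to the `input_text` part.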

Conclusion

The Responses API simplifies the development of complex, multimodal, tool-augmented interactions. By handling state and integrating hosted tools, it reduces the need for multiple API calls and manual state management, making your code cleaner and more efficient.

Ready to simplify your AI interactions? Dive into the Responses API documentation and start building today!
