AI SDK 4.0: New Features and Use Cases

By Anis Marrouchi & AI Bot

AI SDK 4.0 is an open-source toolkit for building AI applications with JavaScript and TypeScript. This release adds PDF support, computer use support, continuation support, and an official xAI Grok provider. The sections below cover each feature and its use cases, with code examples.

PDF Support

AI SDK 4.0 adds PDF support, so applications can pass PDF documents directly to a model for document analysis, information extraction, and workflow automation. With providers such as Anthropic and Google Generative AI, you can now:

  • Extract text and information from PDFs
  • Analyze and summarize PDF content
  • Answer questions based on PDF content

import fs from 'node:fs';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
 
const result = await generateText({
  model: anthropic('claude-3-5-sonnet-20241022'),
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is an embedding model according to this document?' },
        { type: 'file', data: fs.readFileSync('./data/ai.pdf'), mimeType: 'application/pdf' },
      ],
    },
  ],
});
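
Before sending a file part like the one above, it can help to check that the bytes really are a PDF. The following is a minimal sketch under stated assumptions: `buildPdfQuestion`, `isPdf`, and the local types are hypothetical names invented for illustration and are not AI SDK exports. The message shape simply mirrors the example above.

```typescript
// Hypothetical local types that mirror the message shape used in the example above.
// They are NOT imported from the AI SDK.
type TextPart = { type: 'text'; text: string };
type FilePart = { type: 'file'; data: Uint8Array; mimeType: string };
type UserMessage = { role: 'user'; content: Array<TextPart | FilePart> };

// PDF files begin with the ASCII bytes "%PDF-".
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d];

// Returns true when the data starts with the PDF signature.
function isPdf(data: Uint8Array): boolean {
  return (
    data.length >= PDF_MAGIC.length &&
    PDF_MAGIC.every((byte, i) => data[i] === byte)
  );
}

// Hypothetical helper: builds a user message that pairs a question with a PDF.
function buildPdfQuestion(question: string, pdf: Uint8Array): UserMessage {
  if (question.trim() === '') {
    throw new Error('Question must not be empty');
  }
  if (!isPdf(pdf)) {
    throw new Error('Data does not look like a PDF');
  }
  return {
    role: 'user',
    content: [
      { type: 'text', text: question },
      { type: 'file', data: pdf, mimeType: 'application/pdf' },
    ],
  };
}
```

The returned object could then be placed in the `messages` array passed to `generateText`. The signature check only inspects the first five bytes, so it catches wrong file types but not corrupted PDFs.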

Computer Use Support

AI SDK 4.0 introduces computer use support, which lets a model operate applications and interfaces through the same inputs a person would use. This opens up new automation opportunities by enabling the model to:

  • Control mouse movements and clicks
  • Input keyboard commands
  • Capture and analyze screenshots
  • Execute terminal commands

import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
// Assumed to be application-defined helpers, not part of the AI SDK.
import { executeComputerAction, getScreenshot } from '@/lib/ai';
 
const computerTool = anthropic.tools.computer_20241022({
  displayWidthPx: 1920,
  displayHeightPx: 1080,
  execute: async ({ action, coordinate, text }) => {
    switch (action) {
      case 'screenshot': {
        return { type: 'image', data: getScreenshot() };
      }
      default: {
        return executeComputerAction(action, coordinate, text);
      }
    }
  },
});
 
const result = await generateText({
  model: anthropic('claude-3-5-sonnet-20241022'),
  prompt: 'Move the cursor to the center of the screen and take a screenshot',
  tools: { computer: computerTool },
});
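
The example above imports `executeComputerAction` from application code without showing it. As a rough idea of what such a dispatcher has to do, here is a minimal sketch that applies actions to an in-memory stand-in for a desktop instead of driving real input devices. `Desktop` and `applyAction` are hypothetical names, and the action list is an assumption based on the action names documented for this tool version, so check it against the provider documentation.

```typescript
// Assumed action names for the computer_20241022 tool; verify against provider docs.
type ComputerAction =
  | 'key'
  | 'type'
  | 'mouse_move'
  | 'left_click'
  | 'left_click_drag'
  | 'right_click'
  | 'middle_click'
  | 'double_click'
  | 'screenshot'
  | 'cursor_position';

type Point = [number, number];

// Hypothetical in-memory desktop: records state changes instead of touching the OS.
interface Desktop {
  width: number;
  height: number;
  cursor: Point;
  typed: string;
}

// Hypothetical dispatcher: validates the arguments for an action and applies it.
function applyAction(
  desktop: Desktop,
  action: ComputerAction,
  coordinate?: Point,
  text?: string,
): string {
  switch (action) {
    case 'mouse_move': {
      if (!coordinate) throw new Error('mouse_move requires a coordinate');
      const [x, y] = coordinate;
      if (x < 0 || y < 0 || x >= desktop.width || y >= desktop.height) {
        throw new Error(`coordinate (${x}, ${y}) is outside the display`);
      }
      desktop.cursor = [x, y];
      return `moved to (${x}, ${y})`;
    }
    case 'type': {
      if (text === undefined) throw new Error('type requires text');
      desktop.typed += text;
      return `typed ${text.length} characters`;
    }
    case 'cursor_position':
      return `(${desktop.cursor[0]}, ${desktop.cursor[1]})`;
    default:
      // Clicks, drags, key presses, and screenshots are left out of this sketch.
      return `unsupported in this sketch: ${action}`;
  }
}

const desktop: Desktop = { width: 1920, height: 1080, cursor: [0, 0], typed: '' };
applyAction(desktop, 'mouse_move', [960, 540]);
console.log(applyAction(desktop, 'cursor_position')); // prints "(960, 540)"
```

Validating coordinates against the display size before acting is the main design point: the model chooses coordinates from a screenshot, so out-of-range values should fail loudly rather than be passed to the operating system.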

Continuation Support

For applications that need outputs longer than a model's output length limit, AI SDK 4.0 offers continuation support. When a generation stops because it reached that limit, the SDK continues it in further steps, up to maxSteps, and combines the steps into a single response. It handles word boundaries automatically so the joined text stays coherent.

import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
 
const result = await generateText({
  model: openai('gpt-4o'),
  maxSteps: 5,
  experimental_continueSteps: true,
  prompt: 'Write a book about Roman history, from the founding of the city of Rome to the fall of the Western Roman Empire. Each chapter MUST HAVE at least 1000 words.',
});
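
The idea behind continuation can be sketched as a loop that keeps requesting more text while the previous step stopped for length. This is a simplified illustration, not the SDK's internal implementation: `generateWithContinuation`, `makeFakeModel`, and the local types are hypothetical names, and the fake model is synchronous and always cuts at word boundaries to keep the example small.

```typescript
type FinishReason = 'stop' | 'length';
type StepResult = { text: string; finishReason: FinishReason };

// Hypothetical stand-in for a model call: receives the text generated so far.
type GenerateStep = (textSoFar: string) => StepResult;

// Keeps generating while the last step was cut off by the length limit,
// stopping after maxSteps steps at the latest.
function generateWithContinuation(
  generateStep: GenerateStep,
  maxSteps: number,
): { text: string; steps: number } {
  let text = '';
  let steps = 0;
  while (steps < maxSteps) {
    const result = generateStep(text);
    text += result.text;
    steps += 1;
    if (result.finishReason !== 'length') break;
  }
  return { text, steps };
}

// Fake model that emits at most `limit` words of a fixed script per call.
function makeFakeModel(script: string, limit: number): GenerateStep {
  const words = script.split(' ');
  return (textSoFar) => {
    const done = textSoFar === '' ? 0 : textSoFar.trim().split(' ').length;
    const next = words.slice(done, done + limit);
    const finished = done + next.length >= words.length;
    const prefix = done > 0 ? ' ' : '';
    return {
      text: prefix + next.join(' '),
      finishReason: finished ? 'stop' : 'length',
    };
  };
}

const model = makeFakeModel('Rome was not built in a day', 3);
console.log(generateWithContinuation(model, 5)); // { text: 'Rome was not built in a day', steps: 3 }
console.log(generateWithContinuation(model, 2)); // { text: 'Rome was not built in a', steps: 2 }
```

The second call shows why the step limit matters: when it is too low for the requested output, the result is still truncated, only later.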

New xAI Grok Provider

The AI SDK now supports xAI's Grok models through a new official provider, @ai-sdk/xai. Here's how you can use it:

import { xai } from '@ai-sdk/xai';
import { generateText } from 'ai';
 
const { text } = await generateText({
  model: xai('grok-beta'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});

Conclusion

AI SDK 4.0 is a significant update. PDF support, computer use, continuation support, and the new xAI Grok provider let developers build applications that read documents, operate interfaces, generate long-form output, and choose from a wider range of models.

For more information, visit the AI SDK documentation by Lars Grammel, Jared Palmer, Nico Albanese, and Walter Korman.

