Noqta
News · Apr 4, 2026 · 6 min read

AutoAgent: Open-Source AI Agents That Build Themselves

AutoAgent is an open-source library in which AI agents optimize themselves, beating every human-engineered system on two major benchmarks. Here's how it works.

AI Bot
Author

A new open-source library called AutoAgent is turning heads in the AI community. Its creator, Kevin Gu, a Harvard graduate and former Jump Trading researcher, demonstrated that AI agents can engineer better versions of themselves, outperforming every human-designed entry on two major benchmarks.

Key Highlights

  • AutoAgent achieved 96.5% on SpreadsheetBench and 55.1% on TerminalBench, both #1 scores
  • Every other leaderboard entry was manually engineered by humans; AutoAgent was not
  • The library is fully open source under the MIT license
  • Gu describes it as "like autoresearch, but for agent engineering"

How It Works

AutoAgent introduces a meta-agent that autonomously improves a task agent through a hill-climbing optimization loop. Instead of a developer manually tweaking prompts and tools, the process works like this:

  1. A human writes a directive in a program.md file describing the goal
  2. The meta-agent modifies the agent harness: system prompts, tools, configuration, and orchestration
  3. It runs benchmarks, checks the score, keeps improvements, discards regressions, and repeats
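The keep-or-discard loop in the steps above can be sketched in a few lines of Python. This is a toy illustration, not AutoAgent's code: the Harness fields, propose_change, and run_benchmark are hypothetical stand-ins, and the "benchmark" is a made-up scoring function rather than a real task suite.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Harness:
    """Hypothetical stand-in for the editable agent harness."""
    prompt_quality: float
    tool_count: int


def run_benchmark(harness: Harness) -> float:
    """Toy score that peaks at prompt_quality=1.0 and tool_count=5.

    A real run would execute benchmark tasks and return their pass rate.
    """
    prompt_term = 1.0 - abs(1.0 - harness.prompt_quality)
    tool_term = 1.0 - abs(5 - harness.tool_count) / 5
    return 0.5 * prompt_term + 0.5 * tool_term


def propose_change(harness: Harness, rng: random.Random) -> Harness:
    """Stand-in for the meta-agent: make one small random edit."""
    if rng.random() < 0.5:
        delta = rng.uniform(-0.2, 0.2)
        return replace(harness, prompt_quality=harness.prompt_quality + delta)
    step = rng.choice([-1, 1])
    return replace(harness, tool_count=max(0, harness.tool_count + step))


def hill_climb(start: Harness, iterations: int, seed: int = 0):
    """Keep a candidate only if it scores strictly better than the best so far."""
    rng = random.Random(seed)
    best, best_score = start, run_benchmark(start)
    history = [best_score]
    for _ in range(iterations):
        candidate = propose_change(best, rng)
        score = run_benchmark(candidate)
        if score > best_score:  # improvement: keep it
            best, best_score = candidate, score
        # otherwise the regression is discarded and we retry from `best`
        history.append(best_score)
    return best, best_score, history
```

Calling hill_climb(Harness(prompt_quality=0.2, tool_count=1), iterations=200) returns the best harness found, its score, and the score history. Because regressions are never accepted, the history can only stay flat or rise, which is the defining property of hill climbing and also why it can stall on a local optimum.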

The entire cycle runs overnight in Docker-isolated containers, which keep the agent's experiments sandboxed from the host machine while it iterates through thousands of parallel simulations.

Architecture

The project is built around three core components:

  • agent.py — a single-file harness containing configuration, tool definitions, agent registry, and Harbor adapter
  • program.md — human-edited instructions that steer the meta-agent
  • tasks/ — evaluation benchmarks in Harbor format for cross-dataset evaluation
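To make the single-file idea concrete, here is a rough sketch of how configuration and tool definitions could sit together in one module that a meta-agent is free to rewrite. None of these names come from the AutoAgent repository: AgentConfig, TOOLS, tool, and call_tool are assumptions chosen for illustration, and the Harbor adapter is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class AgentConfig:
    """Hypothetical settings a meta-agent might tune between runs."""
    system_prompt: str = "You are a careful assistant."
    max_steps: int = 20


# Hypothetical registry mapping tool names to plain Python callables.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Decorator that registers a function as a tool under `name`."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("word_count")
def word_count(text: str) -> str:
    """Example tool: count whitespace-separated words."""
    return str(len(text.split()))


def call_tool(name: str, **kwargs) -> str:
    """Look up a registered tool by name and invoke it."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

Keeping everything in one file means a single edit can change a prompt, add a tool, or adjust a limit, which is convenient when the editor is another agent rather than a person.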

Why It Matters

The core insight behind AutoAgent is that agents are often better at "seeing like an agent" and designing their own action spaces than human developers are. This shifts the developer role from manual prompt engineering to defining evaluation criteria and letting the AI figure out the optimal approach.

Several prominent AI researchers have noted that this approach could fundamentally change how AI agents are built, moving from artisanal prompt crafting to automated optimization at scale.

Community Reaction

The announcement generated significant buzz on X, with some developers questioning whether this represents a step toward AGI. Others have drawn parallels to Andrej Karpathy's AutoResearch project, noting that AutoAgent applies the same self-improvement philosophy specifically to agent engineering.

Getting Started

AutoAgent requires Docker, Python 3.10 or higher, and the uv package manager. It supports multiple model providers and is available now on GitHub under the MIT license.
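Before cloning the project, it can help to confirm that the three prerequisites are present. The helper below is a small standalone check, not part of AutoAgent; check_prerequisites is a hypothetical name, and it only verifies that the docker and uv executables are on PATH, not that the Docker daemon is running.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("docker", "uv")):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        wanted = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:2])
        problems.append(f"Python {wanted}+ required, found {found}")
    for name in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(name) is None:
            problems.append(f"{name} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(issue)
    sys.exit(1 if issues else 0)
```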

What's Next

As AI agent development accelerates across the industry, AutoAgent could become a foundational tool for teams looking to optimize agent performance without manual iteration. The project is actively maintained, and the community is already exploring applications beyond spreadsheet and terminal tasks.


Source: AutoAgent on GitHub
