Deep Agents

Language model based applications have moved from simple chat interfaces, to more 'agentic' frameworks. While the definition of what an 'agent' is varies, it generally refers to a system where an LLM is equipped with some tools, and is allowed to operate continuously. It is often with the intent to complete a more complex goal by chaining together multiple actions together, ultimately relying on the reasoning abilities of the language model to determine what steps to take and when to stop.

This initial approach has worked well, with agentic systems proving more useful, helpful, and capable than their naive chat completions beginnings, ushering in a new wave of interest and investment into arming LLMs with various integrations and tools to be able to perform more complex tasks. Although useful, there still remained a disconnect between the capability of an LLM with digitial tools and a human with the same tools, primarily around 'long-horizon' tasks.

Measuring AI Ability to Complete Long Tasks

When observing a graph that compares success rates of models on human benchmarked time to complete tasks, most modern day models can complete tasks that often take 4 minutes for humans to complete successfully, however as task complexity grows success rate falls. Anecdotely, not many useful tasks tend to fall into the 'takes four minutes to complete' category, i.e:

HCAST: Human-Calibrated Autonomy Software Tasks

With most of what would be considered 'productive' tasks often taking much more time and effort to complete.

As LLMs become smarter and more capable through improved training techniques, their ability for successful long-horizon task completion has improved (reference prior graph), but even more gains have been made through custom frameworks meant to encourage and enable this kind of behavior. Popularized initially by workflows such as Deep Research and coding frameworks like Claude Code provide scaffolding to encourage and assist in executing difficult and lengthy tasks.

Deep Agents

This concept has adopted the name Deep Agents, with a few commonalities being observed in the open source community namely:

A detailed and specific system prompt
A planning or to-do list tool
A file system for context management
Sub Agents!

In this notebook, we'll discuss the theory behind why these techniques have proven successful for creating Deep Agents and encouraging longer task execution, and provide an example of this all coming together by using open source frameworks to create our own Competitive Analysis Deep Agent!

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
competitive_analysis_agent		competitive_analysis_agent
media		media
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
deep_agents_overview.ipynb		deep_agents_overview.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Deep Agents

About

Uh oh!

Releases

Packages

Languages

License

ALucek/deep-agents-walkthrough

Folders and files

Latest commit

History

Repository files navigation

Deep Agents

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages