Google Research Releases ReasoningBank: AI Agents Learn Reasoning Strategies from Success and Failure

Gate News, April 22: Google Research has released ReasoningBank, a memory framework that lets agents driven by large language models keep learning after deployment. The framework distills generalizable reasoning strategies from both successful and failed task attempts and stores them in a memory bank, from which they are retrieved and applied to similar tasks later. The accompanying paper was published at ICLR, and the code has been open-sourced on GitHub.

ReasoningBank improves on two existing approaches: Synapse, which records complete action trajectories but transfers poorly to new tasks because those trajectories are too fine-grained, and Agent Workflow Memory, which learns only from successful cases. ReasoningBank makes two key changes. First, it stores "reasoning patterns" instead of "action sequences," with each memory item containing structured fields for title, description, and content. Second, it incorporates failure trajectories into learning. The framework uses a model to self-evaluate execution trajectories and turns failures into rules for avoiding the same pitfall. For example, the rule "click Load More button when seen" evolves into "verify current page identifier first, avoid infinite scrolling loops, then click load more."
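The memory item described above can be pictured with a small sketch. Only the three fields (title, description, content) and the idea of learning from failures come from the article; the class names, the `from_failure` flag, and the word-overlap retrieval are illustrative assumptions, since the article does not say how ReasoningBank ranks memories.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    # The three structured fields named in the article.
    title: str
    description: str
    content: str
    # Hypothetical flag: marks a rule distilled from a failed trajectory.
    from_failure: bool = False


def _tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a stand-in for a real similarity measure."""
    return set(text.lower().split())


@dataclass
class MemoryBank:
    items: list[MemoryItem] = field(default_factory=list)

    def add(self, item: MemoryItem) -> None:
        self.items.append(item)

    def retrieve(self, task: str, k: int = 1) -> list[MemoryItem]:
        """Return up to k items whose title/description share words with the task.

        Assumption: plain word overlap is used here only to keep the sketch
        dependency-free; it is not the framework's actual retrieval method.
        """
        query = _tokens(task)

        def score(item: MemoryItem) -> int:
            return len(query & _tokens(item.title + " " + item.description))

        ranked = sorted(self.items, key=score, reverse=True)
        return [item for item in ranked[:k] if score(item) > 0]


bank = MemoryBank()
bank.add(MemoryItem(
    title="Guard pagination clicks",
    description="load more results on a listing page",
    content="Verify current page identifier first, avoid infinite "
            "scrolling loops, then click load more.",
    from_failure=True,
))
bank.add(MemoryItem(
    title="Check login state",
    description="confirm the account is signed in before editing a profile",
    content="Look for the account menu before opening the profile form.",
))

hits = bank.retrieve("load more reviews on the product listing page")
print(hits[0].content)
```

The retrieved `content` would then be added to the agent's prompt for the new task, so the strategy, rather than a replay of earlier clicks, is what carries over.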

The paper also introduces Memory-aware Test-time Scaling (MaTTS), which spends additional compute at inference time to explore multiple trajectories and stores what it learns in the memory bank. Parallel expansion runs several distinct trajectories for the same task and compares them against one another to distill more robust strategies; sequential expansion iteratively refines a single trajectory and stores the intermediate reasoning in memory.
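The two expansion modes can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the trajectory shape, the `toy_agent` and `toy_refine` stand-ins, and the set-based comparison are all hypothetical, and a real system would use a language model for both execution and comparison.

```python
from typing import Callable

# Hypothetical trajectory shape: {"steps": [str, ...], "success": bool}
Trajectory = dict


def parallel_expansion(task: str,
                       run_agent: Callable[[str, int], Trajectory],
                       k: int = 5) -> list[Trajectory]:
    """Run k separate attempts at the same task."""
    return [run_agent(task, attempt) for attempt in range(k)]


def self_compare(trajectories: list[Trajectory]) -> dict[str, list[str]]:
    """Contrast attempts: steps shared by every success vs. steps seen only in failures."""
    wins = [set(t["steps"]) for t in trajectories if t["success"]]
    losses = [set(t["steps"]) for t in trajectories if not t["success"]]
    robust = set.intersection(*wins) if wins else set()
    seen_in_wins = set.union(*wins) if wins else set()
    pitfalls = (set.union(*losses) if losses else set()) - seen_in_wins
    return {"robust": sorted(robust), "pitfalls": sorted(pitfalls)}


def sequential_expansion(task: str,
                         first: Trajectory,
                         refine: Callable[[str, Trajectory], tuple[Trajectory, str]],
                         rounds: int = 3) -> tuple[Trajectory, list[str]]:
    """Refine one trajectory repeatedly, keeping a note from every round."""
    trajectory, notes = first, []
    for _ in range(rounds):
        trajectory, note = refine(task, trajectory)
        notes.append(note)
    return trajectory, notes


def toy_agent(task: str, attempt: int) -> Trajectory:
    # Stand-in for an LLM agent: even-numbered attempts succeed.
    if attempt % 2 == 0:
        return {"steps": ["verify page id", "click load more"], "success": True}
    return {"steps": ["click load more", "scroll forever"], "success": False}


def toy_refine(task: str, trajectory: Trajectory) -> tuple[Trajectory, str]:
    # Stand-in for model self-refinement: add the page check, drop the loop.
    steps = trajectory["steps"]
    if "verify page id" not in steps:
        fixed = ["verify page id"] + [s for s in steps if s != "scroll forever"]
        return ({"steps": fixed, "success": True},
                "added page check, removed endless scroll")
    return trajectory, "no change"


runs = parallel_expansion("list all reviews", toy_agent, k=5)
findings = self_compare(runs)
final, notes = sequential_expansion("list all reviews", runs[1], toy_refine, rounds=2)
print(findings)
print(final, notes)
```

In this sketch, `findings` and `notes` are the material that would be written back to the memory bank: the first captures what distinguishes successful attempts from failed ones, the second captures the intermediate reasoning of a single refined attempt.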

On WebArena browser tasks and SWE-Bench-Verified coding tasks, with Gemini 2.5 Flash running as a ReAct agent, ReasoningBank achieved an 8.3% higher success rate on WebArena and a 4.6% higher success rate on SWE-Bench-Verified than a baseline without memory, while reducing the average number of steps per task by approximately 3. Adding MaTTS with parallel expansion (k=5) improved the WebArena success rate by a further 3 percentage points and reduced steps by an additional 0.4.
