Gate News report, April 11 — AI infrastructure company Ramp Labs has released research called “Latent Briefing,” which enables efficient memory sharing among multi-agent systems by compressing large-model KV caches directly, sharply reducing token consumption without sacrificing accuracy.

In mainstream multi-agent architectures, an orchestrator decomposes tasks and repeatedly calls worker model instances; as the inference chain grows longer, token usage expands exponentially. The core idea behind Latent Briefing is to use the attention mechanism to identify the truly critical parts of the context and discard redundant information directly at the representation layer, rather than relying on slow LLM summarization or on RAG retrieval, whose results are less stable.

On the LongBench v2 benchmark, the method performed impressively: the worker model’s token consumption dropped by 65%; median token savings on medium-length documents (32k to 100k) reached 49%; overall accuracy improved by about 3 percentage points over the baseline; and the added latency per compression was only about 1.7 seconds, roughly a 20x speedup over the original algorithm. The experiments used Claude Sonnet 4 as the orchestrator and Qwen3-14B as the worker model, covering document scenarios ranging from academic papers and legal documents to novels and government reports.

The study also found that the optimal compression threshold varies with task difficulty and document length: hard problems are better suited to aggressive compression, which filters out speculative-reasoning noise, while long documents call for lighter compression to preserve key information dispersed throughout the text.
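The general technique the article describes — scoring cached positions by the attention they receive and keeping only the most important fraction — can be sketched roughly as follows. This is a minimal illustrative sketch, not Ramp Labs’ actual algorithm: the function name, array shapes, scoring rule, and `keep_ratio` parameter are all assumptions for demonstration.

```python
import numpy as np

def prune_kv_cache(keys, values, attn_weights, keep_ratio=0.35):
    """Illustrative attention-based KV-cache pruning (not the published method).

    keys, values:  (seq_len, d) cached key/value vectors for one layer.
    attn_weights:  (num_queries, seq_len) attention probabilities
                   accumulated over recent decoding steps.
    keep_ratio:    fraction of cached positions to retain.
    """
    # Score each cached position by the total attention it received.
    scores = attn_weights.sum(axis=0)              # shape: (seq_len,)
    k = max(1, int(len(scores) * keep_ratio))
    # Keep the top-k positions, preserving their original order so the
    # retained context still reads left-to-right.
    keep = np.sort(np.argsort(scores)[-k:])
    return keys[keep], values[keep], keep
```

Under this sketch, positions that attract little attention are dropped at the representation layer, so a worker model resumes from a much smaller cache — consistent with the reported trade-off that aggressive pruning suits hard problems while lighter pruning suits long documents with dispersed key information.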