Ramp Labs proposes a new solution for shared multi-agent memory, cutting token consumption by up to 65%
Ramp Labs' research output, "Latent Briefing," achieves efficient memory sharing across multi-agent systems by compressing the LLM KV cache, reducing token consumption while improving accuracy. In LongBench v2 testing, the approach cut worker-model token consumption by 65% and raised overall accuracy by about 3 percentage points, with compression taking only 1.7 seconds. The technique performed strongly across a range of document scenarios.
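The article does not describe the compression algorithm itself, but the core idea can be illustrated: an orchestrator condenses its long context into a short latent summary that worker agents consume in place of the raw text, which is where the token savings come from. The sketch below is purely conceptual, using simple chunk-wise mean pooling as a stand-in for whatever KV-cache compressor "Latent Briefing" actually uses; all names and numbers here are illustrative assumptions, not details from the research.

```python
def mean_pool(vectors, chunk=8):
    """Compress a sequence of per-token vectors by averaging fixed-size chunks.
    A placeholder for the (unspecified) KV-cache compression in the paper."""
    pooled = []
    for i in range(0, len(vectors), chunk):
        block = vectors[i:i + chunk]
        dim = len(block[0])
        # Average each dimension across the chunk to get one summary vector.
        pooled.append([sum(v[d] for v in block) / len(block) for d in range(dim)])
    return pooled

# Toy orchestrator context: 1000 "tokens", each a 4-dimensional vector.
context = [[float(t + d) for d in range(4)] for t in range(1000)]
briefing = mean_pool(context, chunk=8)

# Workers would attend over the short briefing instead of the full context.
reduction = 1 - len(briefing) / len(context)
print(f"worker context length: {len(briefing)} (reduction {reduction:.0%})")
```

In a real system the compressed representation would live in the model's latent/KV space rather than as pooled toy vectors, but the accounting is the same: the worker's context length, and hence its token consumption, scales with the briefing size rather than the original document size.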
GateNews·4h ago

