EM-LLM: Human-Inspired Episodic Memory for Infinite Context LLMs
EM-LLM is a novel architecture that significantly enhances the ability of large language models (LLMs) to handle extremely long contexts by mimicking human episodic memory and event cognition. Without fine-tuning, EM-LLM organizes input token sequences into coherent episodic events and accesses relevant information through an efficient two-stage memory retrieval mechanism. On the LongBench and ∞-Bench benchmarks, EM-LLM outperforms state-of-the-art retrieval-based methods such as InfLLM and RAG, and even surpasses full-context models on most tasks. It successfully performs retrieval across 10 million tokens, a scale that is computationally infeasible for full-context models. The strong correlation between EM-LLM's event segmentation and human-perceived events also offers a novel computational framework for exploring human memory mechanisms.
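The sketch below illustrates the two ideas named above, event segmentation and two-stage retrieval, in simplified form. It is a hypothetical, minimal example for intuition only, not the EM-LLM implementation: the surprise threshold, the function names `segment_into_events` and `retrieve_events`, and the use of random stand-in embeddings are all assumptions made for illustration.

```python
# Minimal illustrative sketch: surprise-based event segmentation plus
# two-stage (similarity + contiguity) retrieval. All names, thresholds,
# and data structures are hypothetical simplifications, not EM-LLM's code.

import numpy as np

def segment_into_events(surprise, threshold_gamma=1.0):
    """Split a token stream into events at points of high 'surprise'
    (e.g. per-token negative log-likelihood). A boundary is placed
    wherever surprise exceeds mean + gamma * std of the stream."""
    surprise = np.asarray(surprise, dtype=float)
    cutoff = surprise.mean() + threshold_gamma * surprise.std()
    boundaries = [0] + [i for i, s in enumerate(surprise) if s > cutoff] + [len(surprise)]
    # Consecutive boundary pairs delimit events (half-open token ranges).
    return [(a, b) for a, b in zip(boundaries, boundaries[1:]) if b > a]

def retrieve_events(query_vec, event_vecs, k_sim=2, n_contig=1):
    """Two-stage retrieval: (1) select the k_sim events most similar to the
    query, (2) also include their n_contig temporal neighbours, mimicking
    the contiguity effect in human episodic recall."""
    sims = event_vecs @ query_vec / (
        np.linalg.norm(event_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8)
    top = np.argsort(-sims)[:k_sim]
    selected = set()
    for idx in top:
        for j in range(idx - n_contig, idx + n_contig + 1):
            if 0 <= j < len(event_vecs):
                selected.add(j)
    return sorted(selected)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    surprise = rng.gamma(2.0, 1.0, size=200)         # stand-in per-token surprise values
    events = segment_into_events(surprise)
    event_vecs = rng.normal(size=(len(events), 64))  # stand-in event embeddings
    query = rng.normal(size=64)
    print(f"{len(events)} events; retrieved event indices:",
          retrieve_events(query, event_vecs))
```

In this toy version, segmentation costs a single pass over the surprise values, and retrieval touches only a handful of event representations rather than the full context, which is the intuition behind why such an approach can scale to very long inputs.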