
Mixture-of-Memories (MoM): The “Linear Attention” Breakthrough That Doesn’t Forget Long Contexts


MoM was released in May 2025 as a linear-attention sequence model that preserves long-term context without the forgetting that plagues single-state recurrent designs. MoM breaks the single-memory bottleneck by maintaining multiple independent memory states and using a token router to distribute information across them. A trained router assigns each token to a small subset of memories; only those memories update their states while the rest stay untouched, so new inputs interfere far less with what was stored earlier.
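To make the routing-and-update flow concrete, here is a minimal PyTorch-style sketch of a mixture-of-memories layer. It is an illustrative approximation, not the released MoM implementation: the class name MoMLayer, the num_memories and top_k parameters, and the simple softmax router are assumptions, and the memory write uses a plain linear-attention outer-product rule rather than the paper's exact recurrence.

```python
# A minimal sketch of a mixture-of-memories layer, assuming PyTorch.
# Names (MoMLayer, num_memories, top_k) are illustrative, not the authors' API.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoMLayer(nn.Module):
    def __init__(self, d_model: int, num_memories: int = 4, top_k: int = 2):
        super().__init__()
        self.num_memories = num_memories
        self.top_k = top_k
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.router = nn.Linear(d_model, num_memories)  # token -> memory scores

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); processed recurrently, one token at a time.
        b, t, d = x.shape
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # One d x d linear-attention state per memory slot.
        mem = x.new_zeros(b, self.num_memories, d, d)
        outputs = []
        for i in range(t):
            # Route the token: keep only the top-k memories, softmax their scores.
            scores = self.router(x[:, i])                               # (b, M)
            topk_val, topk_idx = scores.topk(self.top_k, dim=-1)
            gate = F.softmax(topk_val, dim=-1)                          # (b, k)
            # Scatter gates into a dense (b, M) matrix: unrouted memories get
            # weight 0, so their states are left untouched at this step.
            route = scores.new_zeros(b, self.num_memories).scatter(-1, topk_idx, gate)
            # Outer-product write k_i v_i^T, applied to each memory in
            # proportion to its routing weight.
            update = torch.einsum('bd,be->bde', k[:, i], v[:, i])       # (b, d, d)
            mem = mem + route.view(b, -1, 1, 1) * update.unsqueeze(1)   # (b, M, d, d)
            # Read every memory with the query, then mix reads by the same gates.
            reads = torch.einsum('bd,bmde->bme', q[:, i], mem)          # (b, M, d)
            outputs.append((route.unsqueeze(-1) * reads).sum(dim=1))    # (b, d)
        return torch.stack(outputs, dim=1)                              # (b, t, d)


# Example: a toy batch of 2 sequences, 16 tokens, model width 64.
layer = MoMLayer(d_model=64)
y = layer(torch.randn(2, 16, 64))
print(y.shape)  # torch.Size([2, 16, 64])
```

Because each token writes only to its routed memories, the states carried by the other memories pass forward unchanged, which is the mechanism the article credits for reduced forgetting over long contexts.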