July 13th, 2024

Memory^3: Language Modeling with Explicit Memory

The paper introduces Memory^3, a large language model that uses explicit memory to cut training and inference costs. It outperforms larger models and retrieval-augmented generation baselines, emphasizing knowledge externalization and techniques such as memory sparsification and a two-stage pretraining scheme.


The paper "Memory^3: Language Modeling with Explicit Memory" introduces an approach that reduces the cost of training and inference for large language models (LLMs) by equipping them with explicit memory, a knowledge format cheaper than model parameters and cheaper than text retrieval-augmented generation (RAG). By externalizing most of its knowledge to explicit memories, the LLM can use a smaller parameter size, a lower training cost, and a lower inference cost, all of which scale with the amount of remaining "abstract knowledge" kept in the parameters. The proposed model, Memory^3, outperforms larger LLMs and RAG models while maintaining a higher decoding speed than RAG. The paper presents a memory circuitry theory to support knowledge externalization and introduces techniques such as memory sparsification, which makes memory storage tractable, and a two-stage pretraining scheme that facilitates memory formation. The work is a step toward more efficient and effective language modeling.
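To make the general idea concrete, here is a minimal sketch (not the paper's implementation) of how explicit memories could be consumed at inference time: precomputed, sparsified key-value pairs retrieved from a memory bank are prepended to the attention context alongside the ordinary context keys and values. The function names, the norm-based sparsification rule, and the shapes are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sparsify_memory(k, v, keep=8):
    """Toy 'memory sparsification': keep only the top-`keep` memory tokens
    by key norm, as a stand-in for whatever selection rule is actually used."""
    idx = np.argsort(-np.linalg.norm(k, axis=-1))[:keep]
    return k[idx], v[idx]

def attend_with_explicit_memory(q, k_ctx, v_ctx, mem_k, mem_v):
    """Single-head attention in which retrieved explicit memories
    (precomputed key-value pairs) are prepended to the context's own
    keys and values before the usual softmax attention."""
    k = np.concatenate([mem_k, k_ctx], axis=0)    # (M + T, d)
    v = np.concatenate([mem_v, v_ctx], axis=0)    # (M + T, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])       # (T, M + T)
    return softmax(scores, axis=-1) @ v           # (T, d)

# Toy usage: 4 query tokens attend over 8 sparsified memory tokens
# plus their own 4 context tokens.
rng = np.random.default_rng(0)
d, T = 16, 4
q = rng.normal(size=(T, d))
k_ctx, v_ctx = rng.normal(size=(T, d)), rng.normal(size=(T, d))
mem_k_full, mem_v_full = rng.normal(size=(64, d)), rng.normal(size=(64, d))
mem_k, mem_v = sparsify_memory(mem_k_full, mem_v_full, keep=8)
out = attend_with_explicit_memory(q, k_ctx, v_ctx, mem_k, mem_v)
print(out.shape)  # (4, 16)
```

The point of the sketch is only that explicit memory sits between parameters and raw-text RAG: knowledge is stored as ready-to-attend key-value pairs rather than re-encoded from retrieved text at every query.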
