MemLong Breaks Barriers for Long-Text Modeling on Limited Hardware

Researchers at Soochow University’s School of Computer Science & Technology have made significant progress in long-text modeling with the release of “MemLong: Memory-Augmented Retrieval for Long Text Modeling”. Their innovation extends the context window of a large language model (LLM) from 2,000 to 80,000 tokens on a two-year-old desktop-grade NVIDIA 3090 GPU.

This breakthrough has far-reaching implications for users with limited hardware who want to run AI applications locally. The approach is also resource-efficient: fine-tuning a 3-billion-parameter version of MemLong on 0.5 billion tokens required only eight 3090 GPUs for eight hours.

The key idea behind MemLong is to use an external retriever to fetch historical information, storing past contexts and knowledge in a non-trainable memory bank. Because the memory bank is never updated by gradients, the stored information remains consistent over time.
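The store-then-retrieve idea above can be sketched in plain Python. This is a minimal, hypothetical illustration of a non-trainable memory bank with similarity-based lookup; the class and method names are our own, not the paper's API, and a real system would store cached key-value states rather than raw text:

```python
import math

class MemoryBank:
    """Illustrative non-trainable store for past context chunks.

    Keys are fixed embedding vectors; since the bank receives no gradient
    updates, stored representations stay consistent over time.
    """

    def __init__(self):
        self.keys = []    # embedding vectors for stored chunks
        self.values = []  # the chunk contents

    def store(self, embedding, chunk):
        self.keys.append(embedding)
        self.values.append(chunk)

    def retrieve(self, query, k=2):
        """Return the k chunks whose embeddings are most similar to the query."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb) if na and nb else 0.0

        scored = sorted(
            zip(self.keys, self.values),
            key=lambda kv: cosine(query, kv[0]),
            reverse=True,
        )
        return [chunk for _, chunk in scored[:k]]

# Toy usage: store two chunks, then retrieve the one closest to a query vector.
bank = MemoryBank()
bank.store([1.0, 0.0], "chunk about GPUs")
bank.store([0.0, 1.0], "chunk about tokenizers")
nearest = bank.retrieve([0.9, 0.1], k=1)
```

In practice the cosine scan would be replaced by an approximate nearest-neighbor index, but the principle is the same: retrieval reads from frozen memory instead of recomputing attention over the full history.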

MemLong differs from previous models by freezing the lower layers of the model and fine-tuning only the upper layers, reducing computational costs and allowing for efficient training while maintaining high performance.
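The freeze-lower, tune-upper split can be illustrated with a small pure-Python stand-in (layer names and counts here are invented for illustration, not MemLong's actual configuration; in a framework like PyTorch this corresponds to setting `requires_grad=False` on the frozen parameters):

```python
class Layer:
    """Toy stand-in for one transformer block."""
    def __init__(self, name):
        self.name = name
        self.trainable = True  # every layer starts trainable

def freeze_lower_layers(layers, num_frozen):
    """Mark the bottom num_frozen layers as frozen so only upper layers train."""
    for layer in layers[:num_frozen]:
        layer.trainable = False
    return [layer.name for layer in layers if layer.trainable]

# Hypothetical 8-block model: freeze the bottom 6, fine-tune the top 2.
model = [Layer(f"block_{i}") for i in range(8)]
tunable = freeze_lower_layers(model, num_frozen=6)
# Only the blocks left trainable receive gradient updates during fine-tuning.
```

Freezing the lower layers shrinks the set of parameters the optimizer must update, which is what keeps the fine-tuning cost low.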

One of the key benefits of MemLong is its ability to maintain consistent information distribution across different contexts, unlike previous models that suffer from distribution shifts. This stability ensures reliable performance across various tasks, including document summarization and dialogue systems.

The research paper also demonstrated MemLong’s superiority in long-context tasks, achieving up to a 10.2 percentage point improvement over state-of-the-art models like OpenLLaMA in retrieval-augmented in-context learning tasks.

Rather than retrieving isolated tokens, MemLong operates on semantically relevant chunks, processing text at the chunk level. This helps preserve semantic coherence across long sequences, which is essential for many AI applications.
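Chunk-level processing can be sketched as grouping a token sequence into fixed-size units that are stored and retrieved whole (the chunk size here is arbitrary; the paper's actual chunking granularity may differ):

```python
def chunk_tokens(tokens, chunk_size):
    """Group a token sequence into fixed-size chunks.

    Retrieval then operates on whole chunks rather than single tokens,
    so each retrieved unit carries its local context with it.
    """
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]

tokens = ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
chunks = chunk_tokens(tokens, chunk_size=4)
# -> [['The', 'quick', 'brown', 'fox'], ['jumps', 'over', 'the', 'lazy'], ['dog']]
```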

Source: https://analyticsindiamag.com/ai-insights-analysis/now-you-can-train-llms-on-a-two-year-old-desktop-grade-nvidia-3090-gpu/