100M Token Context Windows
Magic has developed LTM models that process 100 million tokens, enhancing software applications. Their efficient LTM-2-mini model, new HashHop method, and partnership with Google Cloud support AI advancements.
Magic has made significant advances in ultra-long context models, specifically with new LTM (Long-Term Memory) models capable of processing up to 100 million tokens during inference. This development aims to enhance software-development applications by letting models draw on extensive code, documentation, and libraries. The company has introduced a new evaluation method called HashHop, which measures a model's ability to store and retrieve information without relying on traditional semantic hints.

Magic has trained its first 100M-token context model, LTM-2-mini, which it reports is significantly more efficient than existing models like Llama 3.1, requiring much less computational power. Additionally, Magic has partnered with Google Cloud to build advanced supercomputers, enhancing its model training and deployment capabilities. The company has raised $465 million in funding, including a recent $320 million investment from notable investors. Magic is focused on improving inference-time compute and is actively hiring engineers and researchers to support its growth in AI technology.
- Magic's LTM models can process up to 100 million tokens, enhancing software development applications.
- The new HashHop evaluation method improves information retrieval without traditional semantic hints.
- LTM-2-mini is significantly more efficient than existing models, requiring less computational power.
- Magic has partnered with Google Cloud to build advanced supercomputers for AI model training.
- The company has raised $465 million in funding and is hiring to accelerate its AI development efforts.
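The idea behind HashHop, as described above, is to test long-context recall with chains of random hashes, so the model cannot lean on semantic hints or positional regularities. A toy sketch of how such an eval prompt might be generated (the format, names, and parameters here are illustrative assumptions, not Magic's actual spec):

```python
import random
import string

def make_hashhop_prompt(num_pairs=6, hops=3, seed=0):
    """Build a toy HashHop-style eval: shuffled hash -> hash assignments,
    then a question asking the model to follow a chain of `hops` links.
    Illustrative only; not Magic's actual HashHop format."""
    rng = random.Random(seed)

    def rand_hash():
        return "".join(rng.choices(string.ascii_lowercase + string.digits, k=8))

    # One true chain of hops+1 hashes, padded with distractor pairs.
    chain = [rand_hash() for _ in range(hops + 1)]
    pairs = list(zip(chain, chain[1:]))
    pairs += [(rand_hash(), rand_hash()) for _ in range(num_pairs - hops)]
    rng.shuffle(pairs)  # shuffling removes any positional hints

    context = "\n".join(f"{a} = {b}" for a, b in pairs)
    question = (f"Starting from {chain[0]}, follow the assignments "
                f"{hops} times. Final value?")
    return context + "\n\n" + question, chain[-1]

prompt, answer = make_hashhop_prompt()
```

Because every token in the context is random, the only way to answer correctly is genuine multi-hop retrieval across the context, which is what distinguishes this from needle-in-a-haystack tests that embed semantically salient needles.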
Related
Llama 3.1: Our most capable models to date
Meta has launched Llama 3.1 405B, an advanced open-source AI model supporting diverse languages and extended context length. It introduces new features like Llama Guard 3 and aims to enhance AI applications with improved models and partnerships.
Big tech wants to make AI cost nothing
Meta has open-sourced its Llama 3.1 language model for organizations with fewer than 700 million users, aiming to enhance its public image and increase product demand amid rising AI infrastructure costs.
Llama 3 Secrets Every Engineer Must Know
Llama 3 is an advanced open-source language model trained on 15 trillion multilingual tokens, featuring 405 billion parameters, improved reasoning, and multilingual capabilities, while exploring practical applications and limitations.
Cerebras reaches 1800 tokens/s for 8B Llama3.1
Cerebras Systems is deploying Meta's LLaMA 3.1 model on its wafer-scale chip, achieving faster processing speeds and lower costs, while aiming to simplify developer integration through an API.
An update on Llama adoption
Llama, Meta's language model, has surpassed 350 million downloads, with significant growth in usage and adoption among major companies, driven by its open-source nature and recent enhancements.
100M context window means it can probably store everything you’ve ever told it for years.
Couple this with multimodal capabilities, such as a robot encoding vision and audio into tokens, and you could get autonomous assistants that learn your house, habits, and chores really quickly.
1: https://github.com/hsiehjackson/RULER (RULER: What’s the Real Context Size of Your Long-Context Language Models)
> We’ve raised a total of $465M, including a recent investment of $320 million from new investors Eric Schmidt, Jane Street, Sequoia, Atlassian, among others, and existing investors Nat Friedman & Daniel Gross, Elad Gil, and CapitalG.
Yeah, I guess that'd do it. Who are these people and how'd they convince them to invest that much?