August 29th, 2024

100M Token Context Windows

Magic has developed LTM models that can process 100 million tokens of context, aimed at software development. The post covers its efficient LTM-2-mini model, the new HashHop evaluation, and a partnership with Google Cloud.


Magic has made significant advances in ultra-long-context models with its LTM (Long-Term Memory) models, which can attend to up to 100 million tokens of context at inference time. The work targets software development, where a model could draw on an entire codebase, its documentation, and its libraries at once. To measure this capability, Magic introduced HashHop, an evaluation that tests whether a model can store and retrieve information without relying on semantic hints. The company has trained its first 100M-token context model, LTM-2-mini, which handles such contexts far more cheaply than attention-based models like Llama 3.1. Magic has also partnered with Google Cloud to build supercomputers for training and deploying its models, has raised $465 million to date (including a recent $320 million round from notable investors), and is focused on improving inference-time compute while actively hiring engineers and researchers.
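
To make the HashHop idea concrete, here is a minimal sketch of what such an evaluation prompt could look like, based only on the description above (random hash pairs that the model must chain together, with no semantic cues to help retrieval). The function names and parameters are illustrative, not Magic's actual benchmark code.

```python
# Hypothetical HashHop-style prompt generator (illustrative only).
# The model sees shuffled "hashA = hashB" pairs and must follow the chain
# for a fixed number of hops; the random hashes carry no semantic hints.
import random
import secrets

def make_hashhop_prompt(num_pairs: int = 1000, hops: int = 3):
    # Build one chain h0 -> h1 -> ... of random hex strings.
    hashes = [secrets.token_hex(8) for _ in range(num_pairs + 1)]
    pairs = list(zip(hashes[:-1], hashes[1:]))
    random.shuffle(pairs)  # shuffled so position gives nothing away

    context = "\n".join(f"{a} = {b}" for a, b in pairs)
    question = (
        f"Starting from {hashes[0]}, follow the assignments for {hops} hops "
        f"and output only the final hash."
    )
    expected = hashes[hops]  # ground-truth answer for scoring
    return context + "\n\n" + question, expected

prompt, expected = make_hashhop_prompt(num_pairs=100, hops=2)
```

Scoring a completion can then be as simple as an exact string match against `expected`.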

- Magic's LTM models can process up to 100 million tokens of context, enhancing software development applications (a rough scale check follows this list).

- The new HashHop evaluation measures a model's ability to store and retrieve information without relying on semantic hints.

- LTM-2-mini handles long contexts far more cheaply than existing attention-based models like Llama 3.1, requiring much less compute.

- Magic has partnered with Google Cloud to build advanced supercomputers for AI model training.

- The company has raised $465 million in funding and is hiring to accelerate its AI development efforts.
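
For a sense of scale, a 100M-token window is large enough to hold a sizeable codebase in its entirety. A rough way to check this for a given repository is sketched below; it uses OpenAI's tiktoken tokenizer as a stand-in (an assumption, since Magic's tokenizer is not public), so the count is only an order-of-magnitude estimate.

```python
# Rough check of how much of a 100M-token context budget a repository uses.
# tiktoken's cl100k_base encoding is a stand-in tokenizer (assumption), so
# treat the result as approximate.
import os
import tiktoken

BUDGET = 100_000_000  # 100M tokens
enc = tiktoken.get_encoding("cl100k_base")

def count_repo_tokens(root=".", exts=(".py", ".ts", ".js", ".go", ".md")):
    total = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as f:
                        total += len(enc.encode(f.read()))
                except OSError:
                    continue
    return total

used = count_repo_tokens()
print(f"{used:,} tokens used; {BUDGET - used:,} of the 100M budget remaining")
```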

8 comments
By @shazami - 6 months
FYI wouldn't interview here. Got rejected after a 30 minute behavioral screen after spending 8 hours on an unpaid take-home.
By @dinobones - 6 months
Long context windows are, IMO, “AGI enough.”

100M context window means it can probably store everything you’ve ever told it for years.

Couple this with multimodal capabilities, like a robot encoding vision and audio into tokens, and you can get autonomous assistants that learn your house/habits/chores really quickly.

By @smusamashah - 6 months
It should be benchmarked against something like RULER[1]

1: https://github.com/hsiehjackson/RULER (RULER: What’s the Real Context Size of Your Long-Context Language Models)

By @fsndz - 6 months
Context windows are becoming larger and larger, and I anticipate more research focusing on this trend. Could this signal the eventual demise of RAG? Only time will tell. I recently experimented with RAG and the limitations are often surprising (https://www.lycee.ai/blog/rag-fastapi-postgresql-pgvector). I wonder if we will see some of the same limitations with long-context LLMs. In-context learning is probably a form of arithmetic over semantic/lexical cues.
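
For readers unfamiliar with that kind of setup, the retrieval step in a PostgreSQL + pgvector RAG pipeline typically boils down to a single nearest-neighbour query; the sketch below is a generic illustration (table and column names are made up, not taken from the linked post).

```python
# Generic pgvector retrieval step for a RAG pipeline (illustrative names).
# Assumes a documents(id, content, embedding vector) table already populated
# with embeddings from whatever embedding model you use.
import psycopg2

def retrieve_top_k(query_embedding, k=5, dsn="dbname=rag"):
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, content
            FROM documents
            ORDER BY embedding <=> %s::vector  -- cosine distance in pgvector
            LIMIT %s
            """,
            (str(query_embedding), k),
        )
        rows = cur.fetchall()
    conn.close()
    return rows
```

The long-context alternative the comment alludes to would skip this step entirely and place the documents directly in the prompt.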
By @Sakos - 6 months
I was wondering how they could afford 8,000 H100s, but I guess I accidentally skipped over this part:

> We’ve raised a total of $465M, including a recent investment of $320 million from new investors Eric Schmidt, Jane Street, Sequoia, Atlassian, among others, and existing investors Nat Friedman & Daniel Gross, Elad Gil, and CapitalG.

Yeah, I guess that'd do it. Who are these people and how'd they convince them to invest that much?

By @anonzzzies - 6 months
What is the state of the art on context length for open models? Magic won't be open, I guess, after getting $500M in VC money.
By @samber - 6 months
Based on Mamba?
By @htrp - 6 months
Does anyone have a detailed tech breakdown of these guys? Not quite sure how their LTM architecture works.