AMD Unveils Its First Small Language Model AMD-135M
AMD has launched its first small language model, AMD-135M, trained on 670 billion tokens. It features speculative decoding for improved speed and is open-sourced to foster AI community collaboration.
AMD has introduced its first small language model (SLM), the AMD-135M, part of the Llama family. The model was trained from scratch on AMD Instinct MI250 accelerators using 670 billion tokens over six days. It comes in two variants: AMD-Llama-135M and AMD-Llama-135M-code, the latter fine-tuned with an additional 20 billion tokens of code data. AMD emphasizes an open approach to AI, making the training code, dataset, and model weights available so developers can reproduce and build on the model. A key feature of AMD-135M is its role in speculative decoding, where a small, fast draft model proposes several tokens that a larger target model then verifies in a single forward pass, reducing memory-access overhead and improving inference speed. Testing showed significant performance improvements on AMD hardware, including the Instinct MI250 accelerator and Ryzen AI processors. AMD aims to foster innovation in the AI community by providing an open-source reference implementation of the model.
- AMD has launched its first small language model, AMD-135M, trained on 670 billion tokens.
- The model features speculative decoding to enhance inference speed and efficiency.
- Both the training code and model weights are open-sourced for community use.
- Performance tests indicate significant speed improvements on AMD hardware.
- AMD aims to promote innovation and collaboration within the AI community.
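To make the speculative-decoding idea above concrete, here is a minimal toy sketch of the standard draft-then-verify loop: a small draft model proposes a few tokens cheaply, and a larger target model accepts or rejects them so the output still follows the target distribution. The vocabulary and the `draft_model`/`target_model` distributions are made up for illustration; they are hypothetical stand-ins, not AMD's actual models or code.

```python
import random

random.seed(0)

# Toy vocabulary and "models": each returns a next-token distribution
# for a context. Both are hypothetical stand-ins for a real draft model
# (e.g. a 135M SLM) and a larger target model.
VOCAB = ["a", "b", "c", "d"]

def draft_model(context):
    # Small, fast model: slightly skewed distribution.
    return {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}

def target_model(context):
    # Large, accurate model: the distribution the output must match.
    return {"a": 0.35, "b": 0.35, "c": 0.2, "d": 0.1}

def sample(dist):
    # Sample one token from a {token: probability} distribution.
    r = random.random()
    acc = 0.0
    for tok, p in dist.items():
        acc += p
        if r < acc:
            return tok
    return tok  # guard against floating-point rounding

def speculative_step(context, k=4):
    """Draft k tokens cheaply, then accept/reject them with the
    standard rejection rule so the result matches the target model."""
    # 1) Draft phase: autoregressively sample k tokens from the draft model.
    drafted, ctx = [], list(context)
    for _ in range(k):
        tok = sample(draft_model(ctx))
        drafted.append(tok)
        ctx.append(tok)
    # 2) Verify phase: accept each drafted token with prob min(1, p/q).
    accepted, ctx = [], list(context)
    for tok in drafted:
        p = target_model(ctx)[tok]
        q = draft_model(ctx)[tok]
        if random.random() < min(1.0, p / q):
            accepted.append(tok)
            ctx.append(tok)
        else:
            # Rejected: resample from the normalized residual
            # distribution max(p - q, 0), then stop this round.
            tgt, drf = target_model(ctx), draft_model(ctx)
            residual = {t: max(tgt[t] - drf[t], 0.0) for t in VOCAB}
            z = sum(residual.values())
            residual = {t: v / z for t, v in residual.items()}
            accepted.append(sample(residual))
            break
    return accepted

print(speculative_step(["a"]))
```

The speedup comes from the verify phase: the expensive target model scores all drafted tokens in one forward pass instead of one pass per token, and every accepted draft token is a target-model call saved.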
Related
Llama 3.1 Official Launch
Meta introduces Llama 3.1, an open-source AI model available in 8B, 70B, and 405B versions. The 405B model is highlighted for its versatility in supporting various use cases, including multi-lingual agents and analyzing large documents. Users can leverage coding assistants, real-time or batch inference, and fine-tuning capabilities. Meta emphasizes open-source AI and offers subscribers updates via a newsletter.
Llama 3.1: Our most capable models to date
Meta has launched Llama 3.1 405B, an advanced open-source AI model supporting diverse languages and extended context length. It introduces new features like Llama Guard 3 and aims to enhance AI applications with improved models and partnerships.
Llama 3 Secrets Every Engineer Must Know
Llama 3 is an advanced open-source language model trained on 15 trillion multilingual tokens, featuring 405 billion parameters, improved reasoning, and multilingual capabilities, while exploring practical applications and limitations.
Llama 3.2 released: Multimodal, 1B to 90B sizes
Llama 3.2 has been released as an open-source AI model in various sizes for text and image processing, enhancing application development and gaining significant traction with over 350 million downloads.
LlamaF: An Efficient Llama2 Architecture Accelerator on Embedded FPGAs
The paper presents an FPGA-based accelerator for large language models, achieving 14.3-15.8 times speedup and 6.1 times power efficiency, enhancing deployment in resource-constrained environments.
Wow, an actual open-source language model (maybe even the first of its kind from a larger company?) that includes everything you need to recreate it from scratch. Thanks AMD!
Available under this funky GitHub organization it seems: https://github.com/AMD-AIG-AIMA/AMD-LLM
Anyone know the recommended cloud provider and equivalent rental price?
Actually, AMD has excellent reasons to make this kind of development and I hope they continue.
Does anyone know if the "several orders of magnitude speed improvement" is accurate? I'm doubtful.
Very interesting though! I'll be playing around with this on the weekend!
I thought PyTorch didn't work well on AMD hardware, and I've read that many people use JAX instead?