September 25th, 2024

Llama can now see and run on your device – welcome Llama 3.2

Meta has released Llama 3.2 with multimodal capabilities, smaller models for on-device use, and licensing restrictions for EU users. It supports multiple languages and integrates with Hugging Face Transformers.

Read original articleLink Icon
Llama can now see and run on your device – welcome Llama 3.2

Llama 3.2 has been released by Meta in collaboration with Hugging Face, introducing multimodal capabilities and smaller models that can run on devices. The update includes ten open-weight models, comprising five multimodal and five text-only variants. The Llama 3.2 Vision model is available in two sizes: 11B for efficient consumer GPU deployment and 90B for large-scale applications. It features advanced visual understanding and reasoning capabilities, allowing it to handle tasks like document question answering and image-text retrieval. The model supports multiple languages and can process both text and images. Additionally, Llama 3.2 includes smaller text models (1B and 3B) designed for on-device use, excelling in tasks such as summarization and multilingual knowledge retrieval. A new version of Llama Guard, which can classify inputs and outputs, has also been introduced. However, there are licensing changes that restrict the use of multimodal models for individuals and companies based in the European Union. The models have been trained on a vast dataset and are expected to perform comparably to their predecessors in text capabilities. The release also emphasizes integration with Hugging Face Transformers and various deployment options, making it easier for developers to utilize these models in applications.

- Llama 3.2 introduces multimodal models with advanced visual reasoning capabilities.

- The update includes smaller text models (1B and 3B) for on-device applications.

- Licensing changes restrict EU users from accessing multimodal models.

- The models support multiple languages and can handle both text and image inputs.

- Integration with Hugging Face Transformers facilitates easier deployment and usage.

Link Icon 1 comments
By @ChrisArchitect - about 2 months
Related:

Llama 3.2 released: Multimodal, 1B to 90B sizes

https://news.ycombinator.com/item?id=41649748