October 21st, 2024

IBM Granite 3.0: open enterprise models

IBM launched Granite 3.0, an open-source suite of advanced language models for enterprise applications, emphasizing performance, safety, and cost-efficiency, with features like Mixture of Experts and Granite Guardian for risk management.


IBM has launched Granite 3.0, the latest iteration of its large language models (LLMs), designed for enterprise applications. This version emphasizes a balance of performance, safety, and cost-efficiency. The flagship model, Granite 3.0 8B Instruct, is an instruction-tuned LLM trained on over 12 trillion tokens across multiple languages, and it performs strongly on both academic and enterprise benchmarks. The models are open-source under the Apache 2.0 license, with detailed disclosures on training data and methodologies, reinforcing IBM's commitment to transparency and responsible AI.

The Granite 3.0 suite includes various models tailored for different tasks, including cybersecurity and natural language processing. Notably, the new Mixture of Experts (MoE) models improve inference efficiency, while speculative decoding techniques significantly speed up text generation. Additionally, the Granite Guardian models provide safety features to monitor and manage risks associated with LLM outputs.

Future updates are planned to expand model capabilities, including larger context windows and multimodal functionality. The models are available on the IBM watsonx platform and through various partners. IBM also highlights its focus on sustainability, citing the use of renewable energy for training.

- IBM Granite 3.0 features advanced LLMs optimized for enterprise use.

- The models are open-source, promoting transparency and responsible AI practices.

- New Mixture of Experts models enhance inference efficiency for low-latency applications.

- Speculative decoding techniques improve text generation speed significantly.

- Granite Guardian models offer comprehensive risk and harm detection capabilities.
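
To make the Mixture of Experts point concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not IBM's architecture (the summary does not describe how Granite routes tokens); the dimensions, the number of experts, and every function name below are hypothetical and chosen only for illustration. The idea it shows is that a router scores all experts for each token but only the top few are actually evaluated, which is where the inference savings come from.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Toy weight matrix; stands in for learned parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def moe_forward(token, router, experts, top_k):
    """Route one token through the top_k highest-scoring experts.

    Assumption: each expert is a single linear map, a stand-in for the
    feed-forward block a real MoE layer would use.
    """
    logits = matvec(router, token)  # one score per expert
    ranked = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:top_k]
    gates = softmax([logits[i] for i in chosen])  # renormalize over chosen experts
    output = [0.0] * len(token)
    for gate, index in zip(gates, chosen):
        expert_out = matvec(experts[index], token)  # only chosen experts run
        output = [o + gate * y for o, y in zip(output, expert_out)]
    return output, chosen


# Hypothetical sizes, picked for readability rather than realism.
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2
rng = random.Random(0)
router = random_matrix(NUM_EXPERTS, DIM, rng)
experts = [random_matrix(DIM, DIM, rng) for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, used = moe_forward(token, router, experts, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS

print("experts used:", used)
print("fraction of expert parameters active:", active_fraction)
```

With these toy settings only 2 of 8 experts run per token, so a quarter of the expert parameters are active for any one forward pass, even though all of them must still be held in memory.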

Claude 3.5 Sonnet

Anthropic introduces Claude 3.5 Sonnet, a fast and cost-effective large language model with new features like Artifacts. Human tests show significant improvements, and the model underwent privacy and safety evaluations. The article also explores Claude 3.5 Sonnet's impact on engineering and coding capabilities, along with recursive self-improvement in AI development.

Llama 3.1: Our most capable models to date

Meta has launched Llama 3.1 405B, an advanced open-source AI model supporting diverse languages and extended context length. It introduces new features like Llama Guard 3 and aims to enhance AI applications with improved models and partnerships.

Llama 3 Secrets Every Engineer Must Know

Llama 3 is an advanced open-source language model trained on 15 trillion multilingual tokens, featuring 405 billion parameters, improved reasoning, and multilingual capabilities. The article also explores practical applications and limitations.

Build a local AI co-pilot using IBM Granite Code, Ollama, and Continue

The article explains how to build a local AI co-pilot for enterprises using IBM's Granite Code, Ollama, and Continue, addressing data privacy, licensing, and costs while ensuring compliance with corporate regulations.

Nvidia releases weights for Llama-3.1-Nemotron-70B-Instruct

NVIDIA launched the Llama-3.1-Nemotron-70B-Instruct model, which ranks first in three benchmarks. The model uses RLHF techniques and requires significant hardware, and the release emphasizes ethical AI development and responsible usage.

3 comments

By @ofermend - 4 months

Check out Granite 3.0 on the hallucination leaderboard: https://github.com/vectara/hallucination-leaderboard

By @gregw2 - 4 months

Interesting seeing the training disclosures...
