July 18th, 2024

OpenAI slashes the cost of using its AI with a "mini" model

OpenAI launches GPT-4o mini, a cheaper model that broadens access to its AI. Meta is set to release the largest version of Llama 3 next week. The market is shifting toward a mix of small and large models for cost-effective AI solutions.


OpenAI has introduced a more affordable "mini" model, GPT-4o mini, aiming to broaden access to its AI technology. The new model is priced 60% lower than OpenAI's cheapest existing model while delivering improved performance. The move is part of OpenAI's strategy to enhance AI accessibility amid a competitive landscape flooded with free and small open-source AI models; Meta is set to release the largest version of Llama 3, a free and capable offering, next week. OpenAI's success in the cloud AI market, particularly with its ChatGPT chatbot, has spurred competitors like Google and startups to develop similar large language models. OpenAI emphasizes the importance of making intelligence more affordable and accessible, citing advances in model architecture and training data behind GPT-4o mini. The market trend shows a shift toward combining small and large models to deliver good product experiences at reasonable cost. OpenAI also hints at potentially developing models for customer-run devices, depending on demand.

Related

OpenPipe Mixture of Agents: Outperform GPT-4 at 1/25th the Cost

OpenPipe's cost-effective mixture-of-agents approach outperforms GPT-4, promising advanced language processing at a fraction of the cost. This innovation could disrupt the market with high-performance, affordable language solutions.

Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU

The article discusses the release of the open-source Llama3 70B model, highlighting its performance relative to GPT-4 and Claude 3 Opus. It emphasizes training enhancements, data quality, and the competition between open- and closed-source models.

AI models that cost $1B to train are underway, $100B models coming

AI training costs are rising exponentially, with some models now costing $1 billion to train. Companies are developing more powerful hardware to meet demand, but concerns about societal impact persist.

Meta AI develops compact language model for mobile devices

Meta AI introduces MobileLLM, a compact language model challenging the assumption that capable AI requires large models. Optimized with under 1 billion parameters, it outperforms prior models of comparable size by 2.7% to 4.3% on reasoning tasks. MobileLLM's innovations include prioritizing model depth over width, embedding sharing, grouped-query attention, and weight-sharing techniques. The 350-million-parameter version matches much larger models' accuracy on specific tasks, hinting at compact models' potential for on-device efficiency. While the models themselves are not publicly available, Meta has open-sourced the pre-training code, promoting research toward sustainable AI models for personal devices.

OpenAI is releasing GPT-4o Mini, a cheaper, smarter model

OpenAI launches GPT-4o Mini, a cost-effective model that surpasses GPT-3.5. It supports text and vision inputs, aiming to handle multimodal workloads. Despite its smaller size, it scored 82% on benchmarks, meeting demand for smaller, affordable AI models.

11 comments
By @minimaxir - 3 months
This appears to be part of an embargoed news blitz from a few news organizations (Verge and Bloomberg posted the same news at the same time), which is an interesting PR deviation from OpenAI posting it on their blog and having it go viral. The news isn't on their official blog at all currently.
By @dustedcodes - 3 months
Is OpenAI still making changes to make ChatGPT better, more accurate, and more correct, or are they now focused only on making it cheaper by giving us weaker/dumber responses faster? I recently cancelled my subscription because I didn't think GPT-4 was much better than what I get for free from Claude or Gemini.
By @laweijfmvo - 3 months
If "AI" tools are eventually going to have to be "free" (as in beer) to compete, I shudder to think of what companies like OpenAI will have to extract from users to please investors...
By @OutOfHere - 3 months
I now see gpt-4o-mini listed at https://platform.openai.com/docs/models/gpt-4o-mini

Looking at https://openai.com/index/gpt-4o-mini-advancing-cost-efficien... , gpt-4o-mini is better than gpt-3.5 but worse than gpt-4o, as was expected. gpt-4o-mini is cheaper than both, however. Independent third-party performance benchmarks will help.

By @m3kw9 - 3 months
This must mean there's a big chunk of people using cheaper open-source models, and they want a slice of that action.
By @Yusefmosiah - 3 months
OpenAI’s strategy has been bizarre since at least last November, when they launched custom GPTs, then had the boardroom coup.

Since the launch of Claude 3 Opus, and then Claude 3.5 Sonnet, they have been significantly behind Anthropic in terms of the general intelligence of their models. And instead of deploying something on par or better, they are making demos of video generation (Sora) or audio-to-audio models, not releasing anything.

GPT-4o is quite bad at coding, often getting stuck in a loop, and “fixing” buggy code by rewriting it without any changes.

GPT-4o is speculated to be a distillation of a larger model, and now GPT-4o-mini is an even dumber smaller model. But what’s the point?

Who is actually using small/fast/cheap/dumb models in production apps? Most real apps require higher reliability than even the biggest/slowest/priciest/smartest models can provide today. For the use case of transformers that has taken off, aiding students and knowledge workers in one-off tasks like writing code and prose, most users want smarter, more reliable outputs, even at the expense of speed and cost.

GPT-4o-mini seems like a move to increase margins, not make customers happier. That, like demoing products without launching them, is what big old slow corporations do, not how world-leading startups operate.

By @andrewmcwatters - 3 months
I wonder how this compares to running an Ollama server on a VPS.

Edit: I’m amazed by how offended some people are by such a simple question.

By @ilaksh - 3 months
Anyone see benchmarks comparing this to Gemini Flash, GPT-3.5, or open models that can run on Groq?
By @machiaweliczny - 3 months
I think they just want to maintain their lead in data/task collection, and that's why having the cheapest model wins.
By @deadbabe - 3 months
The only OpenAI product I care to hear about is ChatGPT-5