September 24th, 2024

Two new Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

Google updated its Gemini models, introducing Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, featuring over 50% price reduction, increased rate limits, improved performance, and free access for developers via Google AI Studio.

Curiosity, Skepticism, Disappointment

Google has announced updates to its Gemini models, introducing two new production-ready versions: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002. Key enhancements include a price reduction of over 50% for the 1.5 Pro model, increased rate limits (2,000 RPM for Flash and 1,000 RPM for Pro), and improved performance metrics, such as 2x faster output and 3x lower latency. The models are designed for a variety of tasks, including processing large documents and videos, with significant improvements in math and vision capabilities. The updates also feature a more concise response style based on developer feedback, aiming to enhance usability and reduce costs. Additionally, the default output length has been shortened by 5-20% for tasks like summarization and question answering. Developers can access these models for free via Google AI Studio and the Gemini API, with further enhancements expected in the coming weeks. The company emphasizes its commitment to safety and reliability, offering customizable safety filters for developers. An experimental version, Gemini-1.5-Flash-8B, has also been released, showcasing improved performance across various use cases. Google anticipates that these updates will facilitate innovative applications and enhance the overall developer experience.

- Google has released updated Gemini models with significant performance improvements.

- Pricing for the 1.5 Pro model has been reduced by over 50%.

- Rate limits have been increased to allow for more extensive use.

- The models are designed for a wide range of tasks, including document and video processing.

- Developers can access the models for free via Google AI Studio and the Gemini API.
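
The rate limits in the summary (2,000 RPM for Flash, 1,000 RPM for Pro) can be respected on the client side with a sliding-window counter. The sketch below is one possible approach, not anything Google ships: `SlidingWindowLimiter`, `wait_time`, and `record` are hypothetical names, and the 60-second window is an assumption based on "requests per minute".

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The oldest request in the window decides when a slot frees up.
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Note that a request was just sent."""
        self._sent.append(self.clock())


# Limits quoted in the summary above (assumed to be per-minute, per-project).
flash_limiter = SlidingWindowLimiter(max_requests=2000)
pro_limiter = SlidingWindowLimiter(max_requests=1000)
```

A caller would sleep for `limiter.wait_time()` seconds, send the request, and then call `limiter.record()`.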

Gemini's data-analyzing abilities aren't as good as Google claims

Google's Gemini 1.5 Pro and 1.5 Flash AI models face scrutiny for poor data analysis performance, struggling with large datasets and complex tasks. Research questions Google's marketing claims, highlighting the need for improved model evaluation.

Gemini Pro 1.5 experimental "version 0801" available for early testing

Google DeepMind's Gemini family of AI models, particularly Gemini 1.5 Pro, excels in multimodal understanding and complex tasks, featuring a two million token context window and improved performance in various benchmarks.

Google Gemini 1.5 Pro leaps ahead in AI race, challenging GPT-4o

Google has launched Gemini 1.5 Pro, an advanced AI model excelling in multilingual tasks and coding, now available for testing. It raises concerns about AI safety and ethical use.

Automating away the boring parts of my job with Gemini 1.5 Pro and long context

Paige Bailey discusses Gemini 1.5 Pro's long context capabilities for automating tasks in Developer Relations, including analyzing codebases, scraping user feedback, and generating content for social media and documentation.

Show HN: Comparisons – Gemini-1-5-vs-ChatGPT-4o

Gemini 1.5 Pro and ChatGPT-4o are competing AI models, with Gemini excelling in math and coding, while ChatGPT-4o is superior in language comprehension, despite higher output costs.

AI: What people are saying

The comments on Google's Gemini model updates reveal a mix of opinions and concerns from users.

- Significant price reductions are noted, making Gemini models more competitive compared to other frontier models.

- Users express skepticism about the model's performance and reliability, with some describing it as "broken" and "unusable."

- Concerns about the lack of privacy options and the effectiveness of safety filters are raised.

- Some users appreciate the free access and ease of use for development, while others criticize the documentation and support.

- There is a general sentiment of disappointment regarding Google's ability to capitalize on opportunities and improve their offerings.

30 comments

By @simonw - 7 months

This price drop is significant. For <128,000 tokens they're dropping input from $3.50/million to $1.25/million, and output from $10.50/million to $2.50/million.

For comparison, GPT-4o is currently $5/million input and $15/million output and Claude 3.5 Sonnet is $3/million input and $15/million output.

Gemini 1.5 Pro was already the cheapest of the frontier models and now it's even cheaper.
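
To make the comparison concrete, here is a small sketch that prices a single request using only the per-million-token figures quoted in this comment. The model labels, the `PRICES` table, and `request_cost` are illustrative names, and the numbers are copied from the comment as written rather than checked against any vendor's price list.

```python
# (input, output) prices in USD per million tokens, as quoted in the comment.
# The Gemini figures apply to prompts under 128,000 tokens.
PRICES = {
    "gemini-1.5-pro (old)": (3.50, 10.50),
    "gemini-1.5-pro (new)": (1.25, 2.50),
    "gpt-4o": (5.00, 15.00),
    "claude-3.5-sonnet": (3.00, 15.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the cost in USD of one request for the given token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Example workload: a 100,000-token prompt with a 1,000-token answer.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 100_000, 1_000):.4f}")
```

Long-context workloads are dominated by the input price, so that is the column to look at when the prompt is far larger than the answer.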

By @naiv - 7 months

This sounds interesting:

"We will continue to offer a suite of safety filters that developers may apply to Google’s models. For the models released today, the filters will not be applied by default so that developers can determine the configuration best suited for their use case."

By @FergusArgyll - 7 months

Any opinions on pro-002 vs pro-exp-0827 ?

Unlike others here I really appreciate the gemini API, it's free and it works. I haven't done too many complicated things with it but I made a chatbot for the terminal, a forecasting agent (for metaculus challenge) and a yt-dlp auto namer of songs. The point for me isn't really how it compares to openAI/anthropic, it's a free API key and I wouldn't have made the above if I had to pay just to play around

By @bn-l - 7 months

I’ve used it. The API is incredibly buggy and flakey. A particular pain point is the “recitation error” fiasco. If you’re developing a real world app this basically makes the Gemini api unusable. It strikes me as a kind of “Potemkin” service.

Google is aware of the issue and it has been open on Google's bug tracker since March 2024: https://issuetracker.google.com/issues/331677495

There is also discussion on GitHub: https://github.com/google-gemini/generative-ai-js/issues/138

It stems from something Google added intentionally to prevent copyrighted material being returned verbatim (à la the NYT OpenAI fiasco), so they dialled up the "recitation" control (the act of repeating training data, and maybe data they should not have legally trained on).
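
One workaround people reach for is to detect the recitation stop and retry. The sketch below assumes the response exposes a finish reason whose value is `"RECITATION"`; the `generate` callable and the dict shape it returns are hypothetical stand-ins for whatever client is in use, not the SDK's actual interface. Retrying is not a confirmed fix, and the linked issues suggest it often will not help.

```python
def generate_with_recitation_retry(generate, prompt, max_attempts=3):
    """Call `generate(prompt, attempt)` until it stops for a reason other
    than recitation, or until `max_attempts` calls have been made.

    `generate` is a hypothetical callable returning a dict with at least a
    "finish_reason" key. The attempt number is passed through so the caller
    can vary the request on each try (for example, rephrase the prompt or
    raise the temperature).
    """
    last = None
    for attempt in range(max_attempts):
        last = generate(prompt, attempt)
        if last.get("finish_reason") != "RECITATION":
            return last
    # Every attempt was blocked; return the last result so the caller can
    # surface the failure instead of silently dropping it.
    return last
```

Passing the attempt number to `generate` keeps the retry policy separate from the decision about how to change the request.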

Here are some quotes from the bug tracker page:

> I got this error by just asking "Who is Google?"

> We're encountering recitation errors even with basic tutorials on application development. When bootstrapping a Spring Boot app, we're flagged for the pom.xml being too similar to some blog posts.

> This error is a deal breaker... It occurs hundreds of times a day for our users and massively degrades their UX.

By @summerlight - 7 months

Looks like they are more focused on the economical aspect of those large models? Like 90~95% performance of other frontier models at 50%~70% price.

By @anotherpaulg - 7 months

The new Gemini models perform basically the same as the previous versions on aider's code editing benchmark. The differences seem within the margin of error.

https://aider.chat/docs/leaderboards/

By @kendallchuang - 7 months

Has anyone used Gemini Code Assist? I'm curious how it compares with Github Copilot and Cursor.

By @frankdenbow - 7 months

Interview with the product lead: https://x.com/rowancheung/status/1838611170061918575?

By @serjester - 7 months

Gemini feels like an abusive relationship — every few months, they announce something exciting, and I’m hopeful that this time will be different, that they’ve finally changed for the better, but every time, I’m left regretting having spent any time with them.

Their docs are awful, they have multiple unusable SDKs and the API is flaky.

For example, I started bumping into "Recitation" errors - i.e. they issue a flat-out refusal if your response resembles anything in the training data. There's a GitHub issue with hundreds of upvotes and they still haven't published formal guidance on preventing this. Good luck trying to use the 1M context window.

Everything is built the "Google" way. It's genuinely unusable unless you're a total masochist and want to completely lock yourself into the Google ecosystem.

The only thing they can compete on is price.

By @stan_kirdey - 7 months

Google should just offer llama3 405b, maybe slightly fine tuned. Geminis are unusable.

By @rkwasny - 7 months

Can someone explain to me why there is COMPLETELY different pricing for models on Vertex AI and Google AI Studio, and why OpenRouter has yet another price?

By @charlie0 - 7 months

Anyone want to take bets on how long it takes for this to hit the Google Graveyard?

By @mnicky - 7 months

I think the sentiment here is not fully objective, there are nice improvements in benchmarks (and even more so when accounting for the price): https://imgur.com/a/K3tVPEw

Also, this model shouldn't be compared to the CoT o1, I think. That is something different (also in price and speed).

By @ramshanker - 7 months

One company buying expensive NVIDIA hardware vs another using in-house chips. Google got a huge advantage here. They could really undercut OpenAI.

By @TheAceOfHearts - 7 months

I only use regular Gemini and the main feature I care about is absolutely terrible: summarizing YouTube videos. I'll ask for a breakdown or analysis of the video, and it'll give me a very high level overview. If I ask for timestamps or key points, it begins to hallucinate and make stuff up. It's incredibly disappointing that such a huge company with effectively unlimited access to both money and intellectual resources can't seem to implement a video analysis feature that doesn't suck. Part of me wonders if they're legitimately this incompetent or if they're deliberately not implementing good analysis features because it could eat into their views and advertisement opportunities.

By @ldjkfkdsjnv - 7 months

They have to drop the price because the model is bad. People will pay almost any cost for a model that is much better than the rest. How this company carries on the facade of competence is laughable. All the money on the planet, and they still cannot win on their core "competency".

By @kebsup - 7 months

Is there a good benchmark comparing multilingual and/or translation abilities of most recent LLMs? GPT-4o struggles for some tasks in my language learning app.

By @msp26 - 7 months

Has anyone tried Google's context caching feature? The minimum caching window being 32k tokens seems crazy to me.

By @pacoverdi - 7 months

Why the ** do they have to use an 8.9 MiB PNG as hero image? I guess it was generated by AI?

By @resource_waste - 7 months

It's just not as smart as ChatGPT or Llama; it's mind-boggling that Google fell so far behind.

By @phren0logy - 7 months

As far as I can tell there's still no option for keeping data private?

By @sweca - 7 months

No HumanEval benchmark result?

By @accumulator - 7 months

Cool, now all Google has to do is make it easier to onboard new GCP customers and more people will probably use it... It's comical how hard it is to create a new GCP organization & billing account. Also I think more Workspace customers would probably try Gemini if it was a usage-based trial as opposed to clicking a "Try for 14 days" CTA to activate a new subscription.

By @jzebedee - 7 months

As someone who actually had to build on Gemini, it was so indefensibly broken that I couldn't believe Google really went to production with it. Model performance changes from day to day and production is completely unstable as Google will randomly decide to tweak things like safety filtering with no notice. It's also just plain buggy, as the agent scaffolding on top of Gemini will randomly fail or break their own internal parsing, generating garbage output for API consumers.

Trying to build an actual product on top of it was an exercise in futility. Docs are flatly wrong, supposed features are vaporware (discovery engine querying, anybody?), and support is nonexistent. The only thing Google came back with was throwing more vendors at us and promising that bug fixes were "coming soon".

With all the funded engagements and credits they've handed out, it's at the point where Google is paying us to use Gemini and it's _still_ not worth the money.

By @mixtureoftakes - 7 months

TLDR - 2x cheaper, slightly smarter, and they only compare those new models to their own old ones. Does Google have a moat?

By @thekevan - 7 months

Google does not miss one single opportunity to miss an opportunity.

They announced a price reduction but it "won't be available for a few days". By that time, the initial hype will be over and the consumer-use side of the opportunity to get new users will be lost in other news.
