September 10th, 2024

We're in the brute force phase of AI – once it ends, demand for GPUs will too

Gartner highlights a transitional phase in AI development, emphasizing the limited use of generative AI and recommending a return to traditional methods and composite AI for more effective outcomes.

Analyst firm Gartner has indicated that the current reliance on GPUs for AI workloads signifies a "brute force" phase in AI development, where programming techniques are not yet refined. Erick Brethenoux, Gartner's chief of research for AI, noted that specialized hardware often becomes obsolete once standard machines can perform the same tasks. He emphasized that generative AI, which has dominated discussions, accounts for only a small fraction of actual use cases. Many organizations are returning to established AI methods, such as machine learning and rule-based systems, after exploring generative AI without significant business benefits. Brethenoux highlighted the potential of composite AI, which combines generative AI with traditional techniques, as a more effective approach. Gartner's vice president, Bern Elliot, echoed these sentiments, cautioning against using generative AI for tasks beyond content generation and knowledge discovery due to its unreliability. He recommended implementing guardrails to ensure the accuracy of generative outputs. Overall, the discussion at Gartner's Symposium suggests a shift back to more practical AI applications as organizations reassess their strategies.

- Gartner warns that the reliance on GPUs indicates a transitional phase in AI development.

- Generative AI is overhyped, representing only a small percentage of actual use cases.

- Organizations are returning to established AI techniques after exploring generative AI.

- Composite AI, which integrates generative and traditional AI methods, is recommended for better outcomes.

- Caution is advised when using generative AI due to its unreliability in various applications.
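
A minimal sketch of the "guardrails" idea above, assuming a hypothetical generate() stand-in for an LLM call and a toy rule-based validator; the composite-AI point is simply that deterministic checks gate the generative output:

    import re

    def generate(prompt: str) -> str:
        # Stand-in for a real LLM call (hypothetical); returns a draft answer.
        return "Order #12345 ships in 3 business days."

    def passes_guardrails(text: str, allowed_order_ids: set) -> bool:
        # Rule-based check (the "traditional AI" half of composite AI): only
        # accept output that references order IDs we actually know about.
        cited = set(re.findall(r"#(\d+)", text))
        return cited.issubset(allowed_order_ids)

    draft = generate("When will order #12345 ship?")
    if passes_guardrails(draft, allowed_order_ids={"12345"}):
        print(draft)                           # safe to show the user
    else:
        print("Escalating to a human agent.")  # fall back instead of guessing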

28 comments
By @emehex - 7 months
May I interest you in Jevons paradox:

> In economics, the Jevons paradox occurs when technological progress increases the efficiency with which a resource is used (reducing the amount necessary for any one use), but the falling cost of use induces increases in demand enough that resource use is increased, rather than reduced.

Source: https://en.wikipedia.org/wiki/Jevons_paradox
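
A toy illustration with made-up numbers: if an efficiency gain halves the GPU cost per task, but the lower cost induces three times as many tasks, total GPU consumption still goes up:

    # Purely illustrative numbers for the Jevons effect.
    cost_before = 1.0          # GPU-seconds per task
    cost_after = 0.5           # after a 2x efficiency gain
    tasks_before = 1_000_000
    tasks_after = 3_000_000    # demand response to the lower cost

    total_before = cost_before * tasks_before   # 1,000,000 GPU-seconds
    total_after = cost_after * tasks_after      # 1,500,000 GPU-seconds
    print(total_after > total_before)           # True: resource use increased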

By @habitue - 7 months
I disagree. I think the fluke was the era in which we didn't have enough work to keep CPUs/GPUs at 100% utilization 24/7. Think of it like leaving money on the table: compute should always be useful, so why aren't those cores pegged at max all the time?

I think it was literally lack of imagination. We were like "well, I automated most of the paper pushing we used to do in the office, guess my job is done!" and this occupied 0.001% of a computer's time. We invented all sorts of ways for people to only pay for that tiny slice of active time (serverless, async web frameworks, etc).

Now we're in an era where we can actually use the computers we've built. I don't think we're going back.

By @throwaway4aday - 7 months
I'm going to tap the sign again:

  - [X] Text
  - [X] Images
  - [X] Audio
  - [ ] Videos (in progress)
  - [ ] 3D Meshes and Textures (in progress)
  - [ ] Genetics (in progress)
  - [ ] Physics Simulation (in progress)
  - [ ] Mathematics
  - [ ] Logic and Algorithms aka Planning and Optimization
  - [ ] Reasoning
  - [ ] Emotion
  - [ ] Consciousness

We still have a lot of data to crunch, but it's not nearly enough, so we're also going to have to collect and generate a lot more of it. Some of these items require data we don't even know how to collect yet. Barring some kind of disastrous event, draconian regulation, or politically/culturally motivated demonization of ML, I don't see GPU demand dropping any time soon.

By @CharlieDigital - 7 months
We're really on the cusp of gen AI, and we've barely scratched the surface.

Two Reddit threads really highlight this.

- ~2 years ago: https://www.reddit.com/r/StableDiffusion/comments/y9zxj1/you...

- Today: https://www.reddit.com/r/StableDiffusion/comments/1f0b45f/fl...

The upgrade in throughput from GPT-4 to GPT-4o and GPT-4o Mini actually unlocked use cases for the startup I'm at.

People who think demand for GPU compute capacity is going to decrease are probably wrong, in the same way that people who thought demand for faster processors and more RAM would wane were wrong. We are just barely at the start of finding the use cases that will eat those GPU cycles.

    > The need for specialist hardware, he observed, is a sign of the "brute force" phase of AI, in which programming techniques are yet to be refined and powerful hardware is needed. "If you cannot find the elegant way of programming … it [the AI application] dies," he added.

The thing is that even if there is an elegant and efficient programmatic/algorithmic solution, having more and faster hardware only makes it better and pushes the limits even further.

By @daveguy - 7 months
The premise of this whole article is that once general-purpose computing can do what GPUs can, demand for GPUs will drop. That is a fundamentally flawed assumption. The ability to parallelize operations on a GPU will always be valuable, and GPU development will continue. Hardware advances (process nodes, etc.) that improve CPUs will also improve GPUs. Maybe we will reach peak demand, but not until individual GPUs and CPUs are in the millisecond-per-token inference range, and that won't happen for a long time. The author is erroneously conflating GPU and ASIC development.

To be clear, I agree that LLMs are not anywhere close to AGI and I don't think they ever will be (just a component). But that doesn't mean they aren't useful enough to chew up a lot of compute for the foreseeable future.

By @mensetmanusman - 7 months
You will need 10-100x the number of GPUs to get video working. If GPUs crash in price, video will take off and then GPUs will be scarce again.
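
Rough back-of-the-envelope arithmetic behind that claim (illustrative assumptions, not measurements): an image model generates one frame, while even a short clip is on the order of a hundred frames, before accounting for attention across them:

    # Illustrative assumptions only.
    frames_per_second = 24
    clip_seconds = 5
    frames = frames_per_second * clip_seconds   # 120 image-sized frames per clip
    print(frames)
    # Each frame is roughly one image's worth of pixels, and temporal attention
    # across frames scales worse than linearly, so 10-100x more GPU work per
    # generation than a single image is a plausible ballpark.
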
By @devinprater - 7 months
I think of it like speech synthesizers. First they were their own machines, then cards you plugged into a computer, then, once people figured out how to mash human speech together, they were in some cases a good 1.5 GB. Now the Siri voices used with the VoiceOver screen reader, which are tons better than the concatenative models, are a good 70 MB; Google TTS offline voices, even though it's awful and laggy with TalkBack, are a good 30 MB per language pack; and in iOS 18 we can use our own voices as VoiceOver voices. So I think eventually we'll figure out how to run amazing AI stuff, even better than today's, on our devices. And I think tons more people are working on LLMs than were ever working on TTS systems.
By @bluGill - 7 months
GPUs have been in most computers for decades now. Vector operations have long been known to be useful for a lot of different tasks, and many of those tasks run long enough that shunting them off to a different kind of core has long made sense. Thus GPUs have been in everything, and computer manufacturers have long been trying to figure out how to put those GPUs to work on things other than graphics. For some tasks GPUs are better; for others, CPUs with vector operations are better. There is enough room for both in modern computers, and this doesn't look likely to change.
By @SpicyLemonZest - 7 months
I don't really buy it. The advantage of massively parallel operations seems to me to be fundamental in the architecture of modern AI systems, not something that could eventually be optimized away through an "elegant way of programming". It feels like hypothesizing some clever technique that would let you run graphics through your CPU.
By @janalsncm - 7 months
Even if generative AI went to zero, there would still be tons of GPU-hungry ML methods with plenty of business applications. GPU demand isn't going away.
By @hiddencost - 7 months
That's literally never happened.

Machine learning is a trade off between model size (training cost), model run time (inference cost), and quality.

When some task is solved (e.g., hot word detection or speech to text), it becomes a commodity and some harder task becomes the priority.

By @darby_nine - 7 months
I really wish the title would use "will scale with demand more slowly" rather than saying demand will end, which is trivially false.
By @BiteCode_dev - 7 months
Once it ends, we will buy more GPUs because any small website will want its own.

I know that as soon as I can serve 100 req/s on the cheap with a llama-level model, I will put it EVERYWHERE. And my clients too.

DMCA handling? Content flagging alerts? Fuzzy categorization? Natural UI for complex end-user queries?

All LLM baby.

And much, much more.
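
A minimal sketch of that pattern, assuming a locally hosted model behind an OpenAI-compatible endpoint (servers such as llama.cpp or vLLM expose a /v1/chat/completions route along these lines; the URL and model name here are placeholders): the LLM is used as a cheap fuzzy classifier rather than a chatbot.

    import requests

    def categorize(ticket_text: str) -> str:
        # Hypothetical local endpoint serving an OpenAI-compatible API.
        resp = requests.post(
            "http://localhost:8080/v1/chat/completions",
            json={
                "model": "local-llama",   # whatever the local server is serving
                "messages": [
                    {"role": "system",
                     "content": "Classify the ticket as one of: dmca, abuse, "
                                "billing, other. Reply with the label only."},
                    {"role": "user", "content": ticket_text},
                ],
                "temperature": 0,
            },
            timeout=10,
        )
        return resp.json()["choices"][0]["message"]["content"].strip().lower()

    print(categorize("Please take down this video, it uses my copyrighted song."))
    # expected output (model-dependent): "dmca"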

By @kjellsbells - 7 months
Two thought experiments I like to consider are:

- If a company, say AMD, found a way to produce GPUs at a fraction (say, x = 10%) of the price of NVDA, would that increase aggregate demand, or keep it about the same (substitution for Nvidia)? Would the price difference be enough to incentivize the creation of a CUDA-alternative ecosystem? If not, what does the fraction x need to be?

- Very reductively, GPUs seem to be universally needed because they are better at manipulating matrices than the alternatives (e.g., inverting a matrix, computing dot products, cosine similarity, etc.). Are there alternative approaches that could come to market in the next 2-3 years that could be better, or better-per-$, than the current approach of just building bigger GPUs?
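
For context on the second question, the operations involved are small, dense, and uniform, which is exactly what makes them GPU-friendly; a NumPy sketch of the cosine-similarity case (run on CPU here, but the same expression is what gets batched onto a GPU):

    import numpy as np

    # A toy embedding matrix: 10,000 vectors of dimension 768.
    emb = np.random.rand(10_000, 768).astype(np.float32)
    query = np.random.rand(768).astype(np.float32)

    # Cosine similarity of the query against every row: one matrix-vector
    # product plus norms -- dense, regular math that parallelizes trivially.
    scores = emb @ query / (np.linalg.norm(emb, axis=1) * np.linalg.norm(query))
    print(scores.argmax())   # index of the most similar vector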

By @thecupisblue - 7 months
Ah yes, because LLMs and generative AI are the peak of AI technology, and in the two years they have been popular we should already have seen trillion-gazillion-dollar revenues. While yes, it is "brute force" and we'll probably find a way (and already have) to run it on vanilla machines, there is so much more to AI than LLMs and generating photos.

And even LLMs and photo generation open up a bajillion use cases that would have taken years of research and development before. But nobody focuses on these - instead, they focus on S&P 500 companies and how they haven't earned much with GenAI.

Because those companies are, of course, known as the peak of human creativity and imagination, and are always ready to jump on a new technology without much red tape.

Honestly, it's been two or three years tops. Even talking to tech startup CEOs, I don't get the feeling they remotely understand the technology or its applications, since 90% of what I've heard them say is "oooh, we could make a chatbot!" or "let's replace developers with it - oh, it can't one-shot generate my whole codebase? pft, that sucks".

If these folks don't know how to use it, surely Jim, VP of Engineering #62 at ACME & CO, who hasn't used any tech except ERPs for the last 10 years, will have an idea how to.

By @overcast - 7 months
AI will go the way of ASICs, just like bitcoin.
By @baal80spam - 7 months
Some people just can't stand that this (supposed) bubble won't pop.
By @amelius - 7 months
We're also in the alchemy phase of AI.
By @SoftTalker - 7 months
Companies like Gartner must be panicking. LLMs can easily spit out the buzzwords and MBA-textbook advice that their analysts currently produce.
By @xpuente - 7 months
Proof that this is wrong: our brains require <20W of power. Non-brute-force AI will be very efficient: simple operations (int add/cmp), no floating-point matrix multiplications, and no shuffling the same data back and forth between memory and the ALU all the time. GPUs will be overthrown by simple PIM (processing-in-memory).
By @lores - 7 months
We've always been in the brute force phase of AI. Much of AI is throwing things at the wall and seeing what sticks.
By @boringg - 7 months
The better question is who is going to fill the gap between supply and demand, and how fast those prices will drop.
By @ein0p - 7 months
Doubtful. Everyone is so GPU starved right now that many research directions can’t even be pursued. That’s why almost everyone is basically training the same architecture with minor variations. Once/if that starvation ends, research will dramatically expand.
By @ravenstine - 7 months
Even if there is an AI bubble and it pops, lowering the demand for GPUs, I don't see GPUs, or AI overall, becoming less relevant. The drop in demand would mostly come from folks getting off the hype train. Fundamentally, GPUs are about parallel processing, and that's not going obsolete. But if demand for GPUs goes down, that can be a very good thing, because it might also bring down their prices. I would call the end of this "brute forcing" more of a healthy market correction and less of a "see, I told you AI is bullshit."
By @Havoc - 7 months
We're using gaming GPUs to do this (kinda), and the next-gen ASIC-like stuff is even more specialised.

I really don't see why he thinks brute force is going away. I mean, human brains have billions of neurons too, after all.

By @blueyes - 7 months
This is dumb. It's kind of like saying: "once more processing power is no longer useful, we will no longer need our large human brains."

SOTA will remain at the edge of what compute can produce for a long time to come. SOTA is a moving frontier, and there will be demand for incrementally smarter models, because they save you time by making fewer mistakes.

No matter how much more efficient algorithmic innovations make ML model training, compute will make those algorithms smarter. It's a coefficient.

By @dboreham - 7 months
AI generated article?
By @thnkman - 7 months
It's funny how we keep moving the goalposts for environmentally friendly energy production as technology leaps forward with ever-increasing energy demand. We humans tend to chase dragons of futility.