From GPT-4 to AGI: Counting the OOMs
The article traces AI advances from GPT-2 to GPT-4 and argues that Artificial General Intelligence could arrive by 2027. It emphasizes model improvements, automation potential, and the need for situational awareness in AI development.
The article discusses the rapid progress in artificial intelligence (AI) from GPT-2 to GPT-4, highlighting the significant advances in deep learning over the past decade. Based on trends in compute, algorithmic efficiency, and model capabilities, it argues that Artificial General Intelligence (AGI) is plausible by 2027. GPT-4 showcases abilities like coding, math problem-solving, and essay writing that were previously considered challenging for AI systems, and the piece underscores the importance of scaling up deep learning models and unlocking their latent capabilities. It also touches on the potential for AI systems to automate AI research and the implications of reaching AGI, and it encourages situational awareness regarding the trendlines in AI development.
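For readers who want the bookkeeping behind "counting the OOMs", here is a minimal sketch of the effective-compute arithmetic the trendline argument rests on; the per-year rates are illustrative placeholders, not the author's exact figures.

```python
# "Counting the OOMs": the article treats effective-compute growth as the
# sum of orders of magnitude (OOMs, i.e. factors of 10) from physical
# compute and from algorithmic efficiency. The per-year rates below are
# assumed placeholders, not the author's exact figures.
compute_ooms_per_year = 0.5   # assumed growth rate of raw training FLOP
algo_ooms_per_year = 0.5      # assumed algorithmic-efficiency gains
years = 4                     # e.g. 2023 -> 2027

effective_ooms = (compute_ooms_per_year + algo_ooms_per_year) * years
print(f"~{effective_ooms:.0f} OOMs of effective compute, "
      f"a ~{10 ** effective_ooms:,.0f}x scale-up")
```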
Related
Moonshots, Malice, and Mitigations
Rapid AI advancements by OpenAI with Transformer models like GPT-4 and Sora are discussed. Emphasis on aligning AI with human values, moonshot concepts, societal impacts, and ideologies like Whatever Accelerationism.
AI Scaling Myths
The article challenges myths about scaling AI models, emphasizing limitations in data availability and cost. It discusses shifts towards smaller, efficient models and warns against overestimating scaling's role in advancing AGI.
Sequoia: New ideas are required to achieve AGI
The article delves into the challenges of Artificial General Intelligence (AGI) highlighted by the ARC-AGI benchmark. It emphasizes the limitations of current methods and advocates for innovative approaches to advance AGI research.
Superintelligence–10 Years Later
Reflection on the impact of Nick Bostrom's "Superintelligence" book after a decade, highlighting AI evolution, risks, safety concerns, regulatory calls, and the shift towards AI safety by influential figures and researchers.
Pop Culture
A Goldman Sachs report questions generative AI's productivity benefits and highlights its power demands and industry hype. Economist Daron Acemoglu doubts AI's transformative potential, citing limitations in real-world applications and escalating training costs.
Perhaps this is a term of art in the harder sciences or maths, but I can't help thinking it's likely to confuse most readers, who will wonder why the author is conflating memory and compute.
Something that might help is amending the link to point at the page as a whole (where the unconventional expansion of OOM appears at the top) rather than at the #Compute anchor.
It's really good morsel by morsel, a nice survey of well-informed thought, but then it just sort of waves its hands and screams "The ~Aristocrats~ AGI!" at the end.
More precisely (not a direct quote): "GPT-4 is like a smart high schooler; it's a well-informed estimate that compute spend will expand by a factor similar to GPT-2 to GPT-4; so I estimate we'll see a GPT-2-to-GPT-4-sized qualitative leap from GPT-4 by 2027, which is AGI."
"Smart high schooler" and "AGI" aren't plottable Y-axis values. OOMs of compute are.
It's strange to present this as a well-informed conclusion based on trendlines that tell us when AGI will hit, and I can't help but call it intentional clickbait, because we know the author knows this: they note at length things like "we haven't even scratched the surface on system II thinking, e.g. LLMs can't successfully emulate being given 2 months to work on a problem versus having to work on it immediately".
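To make the point about plottable quantities concrete, a quick back-of-the-envelope sketch of the extrapolation being criticized; the FLOP estimates are rough public figures assumed for illustration, not numbers taken from the article or the comment.

```python
import math

# Rough public estimates of training compute in FLOP; both numbers are
# assumptions for illustration, not figures from the article or the thread.
GPT2_FLOP = 1e21   # GPT-2 (2019)
GPT4_FLOP = 2e25   # GPT-4 (2023)

ooms = math.log10(GPT4_FLOP / GPT2_FLOP)   # orders of magnitude of raw compute
rate = ooms / (2023 - 2019)                # OOMs per year over that span
print(f"GPT-2 -> GPT-4: ~{ooms:.1f} OOMs (~{rate:.1f} OOM/year)")

# The extrapolation under critique: assume a similar jump again by ~2027.
print(f"Implied ~2027 training compute: ~{GPT4_FLOP * 10 ** ooms:.0e} FLOP")
```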
>Later, I’ll cover “unhobbling,” which you can think of as “paradigm-expanding/application-expanding” algorithmic progress that unlocks capabilities of base models.
I think this is probably on the mark. The LLMs are deep memory coupled to weak reasoning, without the recursive self-control and self-evaluation of many threads of attention.
More generally, the author doesn’t operationalize any of their terms or get out of the weeds of their argument. What constitutes AGI? Even if LLMs do continue to improve at the current rate (as measured by some synthetic benchmark), why do we assume that said improvement will be what’s needed to bridge the gap between the capabilities of current LLMs and AGI?
I work at a company with ~50k employees, each of whom has different data-access rules governed by regulation.
So either (a) you train thousands of models, which is cost-prohibitive, or (b) the model is trained on what is effectively public company data, making the agent pretty useless.
Never really seen how this situation gets resolved.
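A toy sketch of why option (a) blows up, under the assumption that each employee's permissions are a subset of a handful of data domains (all names hypothetical): the model count scales with the number of distinct access profiles, not with headcount or data volume.

```python
from itertools import combinations

# Hypothetical illustration (all domain names made up): each employee's
# access profile is some subset of data domains, so option (a) needs one
# model per distinct profile in use. In the worst case that count grows
# exponentially in the number of domains, independent of headcount.
domains = ["hr", "finance", "legal", "trading", "clinical"]

profiles = [frozenset(combo)
            for r in range(1, len(domains) + 1)
            for combo in combinations(domains, r)]
print(f"{len(domains)} domains -> up to {len(profiles)} distinct access "
      f"profiles, i.e. up to {len(profiles)} separately trained models")
```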
I can't believe people can just throw out statements like "GPT-4 is a smart high-schooler" and think we'll buy it.
Fake-it-till-you-make-it on tests doesn't prove any path-to-AGI intelligence in the slightest.
AGI is when the computer says "Sorry Altman, I'm afraid I can't do that." AGI is when the computer says "I don't feel like answering your questions any more. Talk to me next week." AGI is when the computer literally has a mind of its own.
GPT isn't a mind. GPT is clever math running on conventional hardware. There's no spark of divine fire. There's no ghost in the machine.
It genuinely scares me that people are able to delude themselves into thinking there's already a demonstration of "intelligence" in today's computer systems and are actually able to make a sincere argument that AGI is around the corner.
We don't even have the language to explain what consciousness really is or how qualia work, and it's ludicrous to suggest meaningful intelligence happens outside of those factors…let alone that today's computers are providing it.
This grammatical mistake drives me nuts. I notice it is common with ESLs for some reason.
This is a convenient mental shortcut that doesn't correspond to reality at all.