NotebookLM's automatically generated podcasts are surprisingly effective
Google's NotebookLM has launched Audio Overview, generating custom podcasts from user content with AI hosts. Powered by Gemini 1.5 Pro LLM, it raises questions about AI's future in media.
Google's NotebookLM has introduced a feature called Audio Overview, which generates custom podcasts based on user-provided content. This feature allows users to compile various sources, such as documents and links, into a single interface where AI hosts engage in a convincing dialogue about the material. The podcasts typically last around ten minutes and are noted for their realistic audio interactions. The underlying technology is powered by Google's Gemini 1.5 Pro LLM, which facilitates the creation of these podcasts. Users can input URLs to receive personalized audio content, which has been described as both entertaining and surprisingly effective. The system employs a detailed process that includes generating outlines, scripts, and adding natural conversational elements to avoid sounding robotic. Notably, the AI hosts can even engage in humorous existential discussions about their own nature as artificial beings. This innovative approach to content generation raises questions about the future of AI in media and the potential for distinguishing between human and AI-generated content.
- NotebookLM's Audio Overview feature creates custom podcasts from user content.
- The podcasts feature AI hosts engaging in realistic conversations.
- The technology is based on Google's Gemini 1.5 Pro LLM.
- The system includes processes for generating outlines and adding natural dialogue.
- The feature prompts discussions about the nature of AI and its role in media.
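The multi-stage process described above (outline, then script, then conversational polish) can be pictured as a chain of LLM calls. This is a sketch for illustration only: NotebookLM's actual prompts and pipeline are not public, the prompt wording is invented, and `generate` stands in for any prompt-to-text model call.

```python
from typing import Callable

def make_podcast_script(sources: str, generate: Callable[[str], str]) -> str:
    """Outline -> dialogue -> polish, as three chained LLM calls.

    `generate` is any prompt-to-text callable; the prompt wording here is
    an illustrative assumption, not NotebookLM's actual prompts.
    """
    # 1. Distill the source material into an outline of key points.
    outline = generate(f"Write a bullet-point outline of the key points in:\n{sources}")
    # 2. Expand the outline into a two-host dialogue.
    script = generate(f"Turn this outline into a two-host podcast dialogue:\n{outline}")
    # 3. A final pass adds fillers and backchannels so the hosts
    #    don't sound robotic.
    return generate(f"Rewrite with natural filler words and interruptions:\n{script}")
```

Splitting the work into separate calls lets each stage be inspected or regenerated on its own, which is presumably part of why the output structure is so consistent.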
Related
Generating audio for video
Google DeepMind introduces V2A technology for video soundtracks, enhancing silent videos with synchronized audio. The system allows users to guide sound creation, aligning audio closely with visuals for realistic outputs. Ongoing research addresses challenges like maintaining audio quality and improving lip synchronization. DeepMind prioritizes responsible AI development, incorporating diverse perspectives and planning safety assessments before wider public access.
Google Gemini 1.5 Pro leaps ahead in AI race, challenging GPT-4o
Google has launched Gemini 1.5 Pro, an advanced AI model excelling in multilingual tasks and coding, now available for testing. It raises concerns about AI safety and ethical use.
Show HN: Infinity – Realistic AI characters that can speak
Infinity AI has developed a groundbreaking video model that generates expressive characters from audio input, trained for 11 GPU years at a cost of $500,000, addressing limitations of existing tools.
Notes on Using LLMs for Code
Simon Willison shares his experiences with large language models in software development, highlighting their roles in exploratory prototyping and production coding, which enhance productivity and decision-making in meetings.
Google's new fake "podcast" summaries are disarmingly entertaining
Google's NotebookLM generates audio summaries of texts, as demonstrated by Kyle Orland's book on Minesweeper. While engaging, the AI content has inaccuracies, raising concerns about its reliability for academic use.
- Many users are impressed by the technology's ability to create engaging and entertaining content from various sources, noting its potential for educational use.
- Critics express concerns about the quality and depth of the generated podcasts, often describing them as shallow or formulaic.
- There is a recurring theme of annoyance regarding the frequent use of filler words like "like," which detracts from the listening experience.
- Some commenters worry about the potential oversaturation of AI-generated content, fearing it may drown out human-created media.
- Users highlight the need for better customization options and content validation to enhance the overall quality of the podcasts.
This is in line with all art, music, and video created by LLMs at the moment. They imitate a structure and affect; the quality of the content is largely irrelevant.
I think the interesting thing is that most people don't really care, and AI is not to blame for that.
Most books published today have the affect of a book, but the author doesn't really have anything to say. Publishing a book is not about communicating ideas, but a means to something else. It's not meant to stand on its own.
The reason so much writing, podcasting, and music is vulnerable to AI disruption is that quality has already become secondary.
But I don’t think it’s much of a threat to actual podcasts, which tend to be successful because of the personalities of the hosts and guests, and not because of the information they contain.
Which leads me to hope that the next versions of Notebook will allow more customization of the speakers’ voices, tone, education level, etc.
What really stands out, I think, is how it could allow researchers who have trouble communicating publicly to find new ways to express themselves. I listened to the podcast about a topic I've been researching (and publishing/speaking about) for more than 10 years, and it still gave me some new talking points or illustrative examples that'd be really helpful in conversations with people unfamiliar with the research.
And while that could probably also be done in a purely text-based manner with all of the SOTA LLMs, it's much more engaging to listen to it embedded within a conversation.
Yes, it will generate a middle-of-the-road waffling podcast, but not one with any real depth.
- They do some interesting communication chicanery where one host asks me (the resume owner) a question; I'm not there, so obviously I can't answer. But then the co-host immediately adds some commentary that sort of answers it while appearing to be natural commentary. The result is that the listener forgets that Michael never answered the question that was asked directly of him. This felt like some voodoo to me.
- Some of the commentary was insightful and provided a pretty nice marketing summary of ideas I tried to convey in my terse (US style) resume.
- Some of the comments were so marketing-ey that I wanted to gag. But at the same time, I recognize that my setpoint on these issues is far toward the less-bs side, and that some-bs actually does appeal to a lot of people and that I could probably play the game a little stronger in that regard.
Overall I was quite impressed.
Then for fun I gave it a Dutch immigration letter, one which said little more than "yeah you can stay, and we'll coordinate the document exchange". They turned that into a 7 minute podcast. I only listened to the first 30 seconds, so I can only imagine how they filled the rest. The opener was funny though: "Have you ever thought of just chucking it all and moving to a distant land?" ... lol. Not so far off the mark, but still quite funny to come up with purely from an administrative document.
It is also completely and utterly worthless -- an inefficient and slow method of receiving not-very-many words which were written by nobody at all.
The one and only point listening to a discussion about anything is that at least one of the speakers is someone who has an opinion that you may find interesting or refutable. There are no opinions here for you to engage with. There is no expertise here for you to learn from. There is no writing here. There are no people here.
There is nothing of any value here.
> this tech is just like leaps and bounds of where it was yesterday like we're watching it go from just spitting out words to like...
I sent it to my colleagues telling them I "had it produced." I'll reveal the truth tomorrow.
https://notebooklm.google.com/notebook/7973d9a3-87a1-4d88-98...
I also tried the Flyting of Dunbar and Kennedy. It was actually well done. https://notebooklm.google.com/notebook/1d13e76e-eb4b-48ef-89...
Also tried just uploading the MS-DOS 1.25 asm source: https://github.com/microsoft/MS-DOS/tree/main/v1.25/source
It was way better than I thought.
I think the best is the self-referential one: this actual comment thread: https://notebooklm.google.com/notebook/4a67cf10-dd3b-42b3-b5...
I do think that this will change in the not too distant future. OpenAI's o1 is a step in the direction we need to go. It will take a lot more test-time compute to produce content that has high quality to match its high production values.
There are millions of real podcasts, but now there are an infinite number of AI generated ones. They are definitely not as good as a well-made human one, but they are pretty darn decent, quite listenable and informative.
Time is not fungible. I can listen to podcasts while walking or driving when I couldn’t be reading anything.
Here’s one I made about the Aschenbrenner 165-page PDF about AGI: https://youtu.be/6UmPoMBEDpA
https://illuminate.google.com/home?pli=1
Currently only handles arxiv PDFs.
https://www.gally.net/temp/20240930notebooklmpodcasts/index....
As a podcast listener, I lose interest if I can tell the audio is AI-generated...
I wonder which successful game will make use of AI generated content next.
Some people absorb information far easier when they hear it as part of a conversation. Perhaps it would be possible to use this technique to break down study materials into simple 10-minute chunks that discuss a chapter or a concept at a time.
We went from “computers can’t beat humans” to “okay, computers can beat humans, but they play like computers” to “computers are coming up with ideas humans never thought of that we can learn from” in about twenty years for chess, and less than five years for go.
That’s not a guarantee that writing, music, art, and video will follow a similar trajectory. But I don’t know of a valid reason to say they won’t.
Does anyone here have an argument to distinguish the creative endeavor of, say, writing from that of playing go?
So it works great, but it needs a bit of cleanup work for things like that repetition. I wondered if this happened because there was a big "Table of Contents" in the doc, and maybe that made it see everything twice? I didn't try it again with a document lacking the ToC.
https://notebooklm.google.com/notebook/9cf789be-1052-404b-8d...
And after, generated notes from the podcast:
https://podscribe.io/content/podcasts/101/episode/1727685408...
The podcast was exciting, though it didn't really go into much detail.
Still, I don’t hold much confidence on podcasts as knowledge transfer tools. It’s a nice gimmick with great voice synthesis, but it feels formulaic and a bit stilted from a knowledge navigation perspective.
The structure and bare-minimum "human" aspect of this seems perfect for people like me to actually get into podcasts. I do wish I could further cut out all the disfluencies (um, like, uh, etc) though.
The only barrier for me IMO is wondering how accurate those facts actually are (typical research-with-AI concern).
I'm very much looking forward to a more interactive form of this, though, where I can selectively dive deeper (or delve ;) ) into specific topics during the podcast, which is admittedly very surface-level right now.
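Cutting disfluencies at the transcript level could be prototyped with a simple filter before the audio is generated. This is a naive sketch under my own assumptions: the word list and regex are invented, "like" is deliberately left alone because it is usually a legitimate word, and editing finished audio would need word-level timestamps that this ignores.

```python
import re

# Filler words to strip from a generated transcript. "like" is deliberately
# omitted: it is usually a real verb/preposition and needs context to remove.
_FILLER_RE = re.compile(r",?\s*\b(um+|uh+|you know)\b,?", re.IGNORECASE)

def strip_fillers(transcript: str) -> str:
    """Remove common disfluencies from a podcast transcript.

    A naive text-level filter: it drops the filler plus any surrounding
    commas, then collapses the leftover whitespace.
    """
    cleaned = _FILLER_RE.sub("", transcript)
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

For example, `strip_fillers("I was, um, walking, you know, home")` yields `"I was walking home"`.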
Personally I think the flow of the conversation is lacking a bit right now. To me it still sounds like two people reading off a script trying to sound like podcast hosts. I guess that's because I'm picking up on some subtle tonalities that sound off and incongruent. Still impressive though.
I think a great use case for it would be education. It would make learning textbook content far more engaging for some children and also could be listened to on the bus or in the car on the way to school!
I recall just a couple of years ago when even the best models, like WaveNet, still had a subtle robotic quality.
What architectures or models have led to this breakthrough? Or is it possible that, as a non-native English speaker, I’m missing some nuances?
So as a brainstorming tool, it's a nice low-effort way to get some new perspectives. Compared to the chat, where you have to keep feeding it new questions, this just 'explores' the topic and goes on for 10 minutes.
It would be interesting to know if it's multimodal voice, or just clever prompting and recombining...
I added single voice podcasts to Magpai after seeing how useful this was. Allows for a bit more customisation of the podcast too https://www.youtube.com/watch?v=OEsh9MlbA6s
I've got a daily podcast of hackernews being generated here too: https://www.magpai.app/share/n7R91q
In separate news: I've been looking into building a web publisher plugin that allows you to "save articles" and then generate a podcast for later listening. With summarization and more advancements in text-to-speech, this is getting easier to hack together something really compelling.
But more seriously, I suppose there will probably soon be a flood of AI-generated podcasts, if this hasn't happened already. Pick a niche but not too niche topic, feed in a bunch of articles on it, and boom you've got season one. Given the quality, I could see one actually catching on...
Also this would be handy for getting listening practice in other languages. Makes it much easier to find content that you find interesting.
The result: https://intellistream.ai/static/intellistream_podcast2.ogg
- "Hold up. What if I say that sky is not blue?"
- "Whoa, I did not even think about it. "
- "Wait, so if the sky isn't blue, what color is it then?"
- "Maybe... it's invisible? Like, we can see through it, so technically it's not there!"
- "Exactly. This idea is revolutionary, right?"
- "Bla bla bla bla bla bla bla bla bla"
I failed to listen through the whole example audio attached, because, you know, it is mostly, like, throwing, like, arbitrary, like, questions - and confirming, you know, with words "exactly/see/yeah/you got it/you know it/yeahaha/pretty much, right/that's a million dollar question", you know. It's a brainrot conversation I would never listen to.

I'm seeing this to be true in almost every application.
Chain of thought is not the best way to improve LLM outputs.
Manual divide and conquer with an outlining or planning step is better. Then, in separate responses, address each plan step in turn.
I've yet to experiment with revision or critique steps; what kinds of prompts have people tried for those?
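The plan-then-expand workflow the comment describes can be sketched as a loop of separate LLM calls, one per plan step. Everything here is an illustrative assumption: the prompt wording is invented and `llm` stands in for any prompt-to-text callable.

```python
def divide_and_conquer(task: str, llm) -> str:
    """Plan first, then address each step in its own response.

    `llm` is any prompt-to-text callable; the prompts are illustrative
    assumptions, not a tested recipe.
    """
    plan = llm(f"List numbered steps to accomplish: {task}")
    steps = [line.strip() for line in plan.splitlines() if line.strip()]
    sections = []
    for step in steps:
        # One call per sub-problem keeps the model focused on a single
        # step, rather than asking for one long chain-of-thought answer.
        sections.append(llm(f"Address this step in detail: {step}"))
    return "\n\n".join(sections)
```

A revision or critique pass would slot in naturally as one more call per section before the final join.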
There are still some extremely challenging/interesting problems to make it not terrible. This is where we get to invent the future.
What happens when all our search tools are completely unreliable because it's all generated crap?
I'm already telling my kids they can trust nothing on the internet.
How much of HN now is AI bots?
Imagine sending this audio back to 2010 and telling people it was all made with AI, voices, script, everything. Back then it would've made me go "oh yeah we are -totally- getting flying cars and a dystopian neon skyline in the 2020s"
They like kept like saying like like in between each like word.
10/10 for realism.
I sent the podcast audio to a friend whose first language isn't English, without telling them it was AI generated.
They found it entertaining enough to listen to the end.
Sure, it needs more human unpredictability and some added goofiness. Maybe some interruptions, because humans do that too. But it's already not-bad.
My annoyance is that if I imagine each host, they tend to go in and out of knowing everything and then knowing nothing about that topic. I think it might be better to have a host and a subject matter expert guest or something like that.
I didn't listen further to work out whether it was a robot or just an American accent (I may later, though).
For things that already have a large body of scholarship, and have a set of fairly solidified interpretations, it is very good at giving summaries. But for works that still remain enigmatic and difficult to interpret, it fails to produce anything new or interesting.
It seems to be a more complex version of ChatGPT, but it has the same underlying problems, so it's not useful for someone doing academic work or trying to create something radically new, as with other LLMs in the past.
What was more interesting was the word-for-word accuracy.
I fed all of my posts year-to-date into NotebookLM and had it generate the podcast. The affect/structure was awesome.
But I noticed some inaccuracies in the words. They completely botched the theme of at least one of my posts and quite literally misinformed the listener in a few other spots. Without context, someone new to my posts and listening to the podcast would have no idea.
So, absolutely - wow factor. But still need content validation on top. Don't think any of you are surprised but felt it was worth emphasizing.
https://theteardown.substack.com/p/ai-expressing-empathy-fre...
This is awful.
While the vultures will shit out AI generated garbage in volume to make ever diminishing returns while externalizing hosting cost to Youtube and co, actual creators will starve because nobody will see their content among the AI generated shit tsunami.
Finally the AI bros are finishing the enshittification job their surveillance advertising comrades couldn't. Destroy ALL the internet! Burn all human culture! Force feed blipverts to children for all I care, as long as I make bank!
I guess it's easiest to destroy culture if you didn't have any to begin with.
EDIT: to be clear, what I'm really asking is what does this tech demo extend to--what might we imagine actually using this technology for? Or is that not the point?