July 15th, 2024

Google Now Defaults to Not Indexing Your Content

Google has changed its indexing to prioritize unique, authoritative, and recognizable content. This selective approach may exclude smaller players, making visibility harder. Content creators face challenges adapting to Google's exclusive indexing, affecting search results.

Google has shifted its indexing approach, moving towards selective indexing rather than indexing all content. This change means Google now prioritizes extreme content uniqueness, perceived authority, and brand recognition when deciding what to index. The search engine may quickly index new content but later de-index it, especially for smaller players in the online space. This selectivity has made it challenging for content creators to gain visibility, as Google now focuses on including only content it deems necessary. While well-known brands often have most of their content indexed promptly, smaller bloggers face a higher bar for inclusion. This shift has transformed Google into an exclusive catalog, potentially leading to valuable content being overlooked by users. The move towards selective indexing poses a significant challenge for content creators who must find ways to navigate Google's new approach to ensure their content is included in search results.

23 comments
By @JohnFen - 4 months
This is a fascinating discussion.

> Google has transformed from a comprehensive search engine into something more akin to an exclusive catalog.

That alone goes a long way to explain why Google search had become worthless to me. I had thought that it was mostly that their attempts at interpreting "what I really want" are terrible, but perhaps the reason is actually that they don't index what I really want in the first place.

I almost never want brand/big name sites and the like, but that is mostly what I get.

By @crazygringo - 4 months
This post seems to be based entirely on personal anecdotal experience.

There isn't a shred of hard data to support the headline claim that Google now "defaults to not indexing content".

Google never indexed everything, removing duplicates, blogspam, useless pages, etc. Maybe they've changed their thresholds or maybe not. But this post provides zero evidence of anything. It's pure speculation without any facts at all.

By @ChrisArchitect - 4 months
Weird description from OP of domains previously being indexed hours from creation. Domains aren't 'magically' being indexed without any kind of nudge. Maybe it was registrars or associated systems letting the engine know. Maybe it was you searching for your new domain. There's always a trigger even if you can't see it directly, how else would it know about anything? And over the years indicating to the engine/spiders that a new site existed depended more and more on site owners to let them know via submissions and console setup etc. The rest of this seems to be personal anecdotes/hearsay etc as with many SEO posts. I would think everything is still being indexed, but not shown because of low relevancy, and whatever other things google is trying to do to clean up results (which is fine, not saying that isn't happening). Show me the log where no spider ever touched your pages or did and it's not showing up before jumping to these conclusions.
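The "show me the log" check asked for above can be sketched roughly as follows. This is a minimal sketch, assuming the server writes access logs in the common "combined" format; the function names, sample log lines, and paths are hypothetical and not taken from the post. It matches only on the user-agent string, which anyone can spoof, so a hit is a hint rather than proof (Google documents a reverse DNS check for confirming that a request really came from Googlebot).

```python
import re
from collections import Counter

# Assumed "combined" log format:
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Count requests per path whose user-agent claims to be Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits


def never_crawled(lines, published_paths):
    """Return the published paths that no Googlebot-labelled request touched."""
    hits = googlebot_hits(lines)
    return sorted(path for path in published_paths if path not in hits)


# Hypothetical sample data, for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [15/Jul/2024:10:00:00 +0000] "GET /post-a HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '203.0.113.7 - - [15/Jul/2024:10:05:00 +0000] "GET /post-b HTTP/1.1" '
    '200 4096 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    print(never_crawled(SAMPLE_LOG, ["/post-a", "/post-b"]))
```

A path that shows up in `never_crawled` was not fetched by anything calling itself Googlebot during the logged period, which is a different situation from a page that was crawled and then left out of the index.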
By @nostromo - 4 months
Google Search has never felt less relevant to me than it does now. And I first stopped using Yahoo and switched to them some 20+ years ago.

It'll find anything except what I'm trying to find. Quotes are useless. The content itself is often garbage. It works ok on common queries, but that's not when I need it to work - I need it to work the most when the query is hard. The long tail is the only thing that matters when a user is judging a search engine's quality.

The web itself has gotten worse over time, but that's also partly Google's doing. Google extracted all the value out of the open web and kept it for themselves. Meanwhile online publishers of all varieties are dying, despite being the ones producing much of the value. Google should have identified this as a strategic threat decades ago.

Now I just keep a tab open to ChatGPT all day and use it as a search engine without all the trouble of dealing with webpages.

By @djha-skin - 4 months
My recent experience doesn't match the experience described by the OP.

I recently logged into Google and asked that they index my domain (djhaskin.com). They asked me to put a TXT record in there proving I owned it and I did so. Then their "website console" thing showed that my website was indexed[1]. They have a console for this stuff now[2]. They recently showed me a page in there that displayed which URLs were indexed and which weren't. I requested a re-index of one of the non-indexed URLs, but the others were just broken/junk/RSS feed URLs, so it was fine that they weren't indexed. The console gave me a ton of tools for making sure my site was indexed, and told me why if it wasn't.

I had plenty of tools to get my site indexed and felt like I was in control. I don't feel any sense of mystery about what is happening and I receive notifications when indexing fails.

1: https://1drv.ms/i/s!AoAOaR6dYP8RgcEAmUu_3ZrInrzbLw

2: https://search.google.com/search-console

By @SoftTalker - 4 months
We're back to the web of 1996, where search sites such as Yahoo were manually curated.
By @aiauthoritydev - 4 months
The future of the internet is going to be AI-driven content shown in an AI-driven adaptable UI. Very often this content will be shown directly on Google properties such as Gmail. Of course, Google needs to figure out a way to pay people for this, which I am sure they eventually will.
By @barnabee - 4 months
That's ok, I'm waay ahead of them — I've been defaulting to not visiting Google for years now.
By @londons_explore - 4 months
If I were to take a guess, the reason you're seeing URLs not being indexed is that Google only has finite indexing resources, yet AI content generators can generate a near infinite amount of content.

That makes your real content a smaller proportion of the whole web, and therefore less likely to meet the threshold for fitting inside Google's finite indexing budget.

By @summerlight - 4 months
IMO, this doesn't seem like a coordinated strategic move (if it were, they should've done it much better than this) but more a matter of saving computational resources. You'd be surprised by the growth in size of the entire web and the degradation of its signal-to-noise ratio. My gut says more than 99% of incremental web pages are filled with auto-generated crap. The problem is there's no great economical way yet to figure out which is garbage and which is genuine content. They should develop better, more scalable technologies for that (and it's fair to say that they should've focused more on this), but LLMs are still too expensive to run and vulnerable to lots of attack vectors.
By @breck - 4 months
> "to organize the world's information and make it universally accessible."

Their website actually still says this: https://www.google.com/search/howsearchworks/our-approach/.

It seems from their recent actions their mission is to "organize the world's information and drip it out as slowly as possible, covered in ads".

If you have had it with (c) and paywalls and ads, come join the revolution which is the World Wide Scroll: https://wws.scroll.pub/

By @xemdetia - 4 months
I just wonder when the dam is going to break. Part of how Google spun money for so long is because they had high quality search to run ads on, but some of these latest changes have completely lost the plot. I don't need an animated AI response to my basic question: I want an answer. Especially when a bit ago on a different device I asked the same question and got the answer the first time. I haven't been so inclined to entertain "what if I just did it myself" thoughts for search in at least 15 years, but it has become an increasingly loud thought.
By @sfmike - 4 months
They've probably realized that those who will make content that makes them more money with ads are the same ones willing to jump through hoops to use search console to be indexed. Cheaper costs of indexing and auto sorting to those that are more monetizable in nature. Also no toil. For them it's a multi-front win.
By @aiauthoritydev - 4 months
I think the problem is not Google specific; rather, the internet has grown far too large with too much crap floating around. Google, in my opinion, has done the best job of getting relevant information, followed by Reddit.

While OpenAI etc. are pretty good (so is Google Gemini), what OpenAI-like interfaces prevent me from doing is segueing from a focused topic to related areas to discover knowledge on the periphery. That is the most important aspect of learning in my opinion, and chatbots today are not able to do it that well.

HN historically has a lot of G haters. Which is fine, but I feel a lot of the criticism is not really reasonable.

By @xnx - 4 months
Lots of claims without any specific examples or hard evidence. Typical SEO blogpost.
By @tonymet - 4 months
What are the better search engines out there featuring quality content? I don't mean privacy-focused ones like Brave or DuckDuckGo. How about indexes that have useful, thoughtful, creative content?
By @sakisv - 4 months
I know that Google has extremely talented people tackling these kinds of things, and with vastly more experience than I, but I can't help but think that the problem is that we as users have extremely diverse needs and use cases, and it is impossible to satisfy them all.

It's even more impossible to satisfy them in a way that's also useful for Google's own business model.

The most obvious and potentially naive way of doing that would be to allow end users to upvote or downvote search results. I know that Google already does that in an automated way to some extent, but the problem is that that signal is then used to determine the quality across all the users which, as I mentioned in the beginning, is impossible to get right.

Instead that metric should only affect each user's own search results and not everyone else's. This could improve the quality of the results, bring more people back, and eventually increase the revenue. It would also help prevent "gaming" the system.

What am I missing here? I can't believe that they're not aware of that. I also can't believe that they don't want to fix it or that they're so focused on AI that internal politics don't allow them to do anything else. So what gives?
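The per-user voting idea in the comment above can be sketched like this. It is a minimal sketch under stated assumptions, not a description of anything Google does: the class name `PersonalRanker`, the `vote_weight` value, the example URLs, and the base scores are all hypothetical. The point it illustrates is that a vote only shifts results for the user who cast it.

```python
from collections import defaultdict


class PersonalRanker:
    """Re-rank shared search results using only the requesting user's votes."""

    def __init__(self, vote_weight=0.5):
        # Hypothetical weight: how far one vote moves a result's score.
        self.vote_weight = vote_weight
        # user_id -> {url: +1 (upvote) or -1 (downvote)}
        self._votes = defaultdict(dict)

    def vote(self, user_id, url, direction):
        """Record an upvote (1) or downvote (-1) for one user."""
        if direction not in (1, -1):
            raise ValueError("direction must be 1 or -1")
        self._votes[user_id][url] = direction

    def rerank(self, user_id, results):
        """results is a list of (url, base_score); returns URLs, best first."""
        personal = self._votes.get(user_id, {})
        adjusted = [
            (url, base + self.vote_weight * personal.get(url, 0))
            for url, base in results
        ]
        # sorted() is stable, so ties keep the original (global) order.
        adjusted.sort(key=lambda pair: pair[1], reverse=True)
        return [url for url, _ in adjusted]


if __name__ == "__main__":
    ranker = PersonalRanker()
    # Hypothetical global results with made-up base scores.
    results = [("bigbrand.example/a", 0.9), ("smallblog.example/a", 0.6)]
    ranker.vote("alice", "bigbrand.example/a", -1)
    ranker.vote("alice", "smallblog.example/a", 1)
    print(ranker.rerank("alice", results))  # alice's votes reorder her view
    print(ranker.rerank("bob", results))    # bob sees the global order
```

Because votes never leave the voter's own view, there is nothing for a third party to gain by casting fake votes, which is the anti-gaming property the comment points to.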

By @lofaszvanitt - 4 months
I recently launched a site that sports a lot of images. Well it's an image site. You go there to view images on specific topics. The Google search console always complains that there is very little text on the page and it only indexed 1/6th of the image pages (it doesn't even fetch the images itself... who knows whether it's a good or bad sign ;|). Go figure, this is an image site.

Should I start to write text next to each image, like:

"Mech approaches another dark, very evil looking mech on a bright day and swings its laser sword to decapitate the evil one."

Gimme a break. For 5K+ images... :D. The topic, title and description are not enough, it needs more text to believe that this is what the images are about. No, your images will not be indexed, will not be included in the image search results, because you are not part of the exclusive club. No monopoly here.

Bing webmaster tools tells me that there are no highly ranked sites that link to my content, and there should be :D. I just started the site, how would there be any linkage to it?! Are they insinuating that I should create fake sites to promote my content or maybe pay for SEO? No monopoly here either.

Yandex... I can't really figure out whether their indexing works or not, it sometimes complains, then doesn't do anything for weeks. Then it comes up with another made up problem that is nonexistent. It acts like a drunkard.

I haven't tried Baidu, because they need some local phone number and they clearly can't send activation SMS to Europe.

Next I'll try with a news site and write a blog post about my experiences. Truly interesting times.

By @kstrauser - 4 months
> Google has transformed from a comprehensive search engine into something more akin to an exclusive catalog.

That's the only plausible long-term path to keep its search results competitive and relevant.

> For content creators, it presents a significant challenge: how do you gain visibility if Google refuses to index most of your content?

Don't publish junk that it doesn't find interesting under the assumption that it "owes" you a front page search result.

I'm not a Google fanboy. I have a paid Kagi account and I use it nearly exclusively. I want Google to stay competitive though. There's a vast army of SEO spammers who think they know the one magic invocation that will drive traffic into their willing arms, or, at least, are able to convince paying customers of it. If Google could wave a magic wand that could accurately identify all of the junk that exists purely to increase results rankings, and they used that knowledge to permanently remove it from the results they show me, well, I'd probably stop paying Kagi to do that.

By @rglover - 4 months
How Google can fix their search: embedded ranking.

Give me an embeddable iframe that I can optionally add to my site which allows visitors to give feedback (think a floating Reddit upvote button). Restrict that button to users authenticated via Google.

Ranking in the algorithm is weighted such that the organic user votes are the heaviest, and content length, keywords, etc., are the lowest.

Remove all the AI crap (or tweak it so I can chat with a bot to improve my search, but it's not Foie Gras'd down my throat). Make ads a free vs paid experience (want to avoid ad results, pay Google $5/mo for a clean result set).

This would make the only way to "game" SEO authentic, quality content. In essence, it's taking the current hack (appending "Reddit" to the end of a search query) and building it into the core experience.
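The weighting described in this proposal, where organic user votes count most and content length and keywords count least, could look roughly like the following. This is a rough sketch only: the `Page` fields, the `WEIGHTS` values, and the example pages are assumptions made up for illustration, and each non-vote signal is assumed to be pre-normalized to the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    upvotes: int          # votes from authenticated visitors
    downvotes: int
    keyword_match: float  # assumed normalized to 0..1
    length_score: float   # assumed normalized to 0..1


# Hypothetical weights: votes dominate, keywords and length barely matter.
WEIGHTS = {"votes": 0.8, "keywords": 0.1, "length": 0.1}


def vote_signal(page):
    """Fraction of votes that are upvotes; neutral 0.5 when nobody has voted."""
    total = page.upvotes + page.downvotes
    return page.upvotes / total if total else 0.5


def score(page):
    """Weighted sum of the three signals."""
    return (
        WEIGHTS["votes"] * vote_signal(page)
        + WEIGHTS["keywords"] * page.keyword_match
        + WEIGHTS["length"] * page.length_score
    )


def rank(pages):
    """Return pages ordered from highest to lowest score."""
    return sorted(pages, key=score, reverse=True)


if __name__ == "__main__":
    # Made-up pages: one keyword-stuffed but disliked, one plain but liked.
    stuffed = Page("stuffed.example", 10, 90, 1.0, 1.0)
    liked = Page("liked.example", 90, 10, 0.2, 0.2)
    for page in rank([stuffed, liked]):
        print(page.url)
```

With these weights a page that maxes out keywords and length still loses to one that readers actually upvote, which is the behavior the proposal is after.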