What is ‘Crawled - Currently Not Indexed’? (And Why The Definition Must Change)
Changing the definition of ‘crawled - currently not indexed’ for SEO teams
I’ve been using the ‘crawled - currently not indexed’ report in Google Search Console for quite a while to identify opportunities to drive business growth.
However, the definition of ‘crawled - currently not indexed’ needs to change.
In this newsletter, I'll explain the current definition of ‘crawled - currently not indexed’ and propose a new indexing status, ‘crawled - previously indexed’.
I'll explain why we need a new definition using examples from Indexing Insight data.
So, let's dive in.
🕷️ What is ‘crawled - currently not indexed’?
The current definition of ‘crawled - currently not indexed’ is misleading.
If you do a Google Search for ‘crawled - currently not indexed’, many articles define the indexing status as pages that Google has crawled but has not yet chosen to index.
All of these definitions are just repeating Google’s help documentation definition:
“The page was crawled by Google but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling.”
- Page Indexing report, Google Search Documentation.
Based on the data from Indexing Insight, I think this definition needs to change.
❓Why does the definition need to change?
First-party data from Indexing Insight shows that the current definition isn’t accurate.
Based on our data, most pages in the ‘crawled - currently not indexed’ report have been crawled AND historically indexed by Google.
It’s just that the report isn’t clear. Let’s take a look at some examples.
🎨 Examples of historically indexed pages
Here are examples of historically indexed pages in the ‘crawled - currently not indexed’ report in GSC.
Example #1 - theseosprint.com
I’ll start by showing an example from theseosprint.com.
The SEO Sprint 2022 Review piece of content was not designed to rank for any particular keywords in Google. And it was published Dec 01, 2022 (almost 2 years ago).
When checking the URL Inspection tool in GSC, we see it has the ‘crawled - currently not indexed’ index status.
BUT this page URL WAS indexed in Google.
Thanks to Indexing Insight (the Google index monitoring tool), we track whether a page was historically indexed in Google. And we even provide the exact date it switched from ‘submitted and indexed’ to ‘crawled - currently not indexed’ (20th August 2024).
Don’t just take our word for it.
We even link to your GSC account's historic URL Inspection report on the switch date. The screenshot below shows that this page was indexed.
Finally, if we check this URL's Search performance, we can see that it had impressions and clicks in Google Search. Again, this shows that it was indexed and could appear in Google Search results.
Example #2 - Programmatic SEO website
The following example is from a programmatic SEO website with 20K pages.
If you check the page URL in the URL Inspection report, you can see that it is reporting ‘crawled - currently not indexed’.
Again, with Indexing Insight we monitor when a page switches from indexed to not indexed, along with the date it switched. In this case, it was 12th August 2024.
We provide the link to the historic URL Inspection report in the GSC account, which shows that the page was indexed in Google.
Finally, by checking the Search Performance report and filtering for the exact URL, you can see that it drove clicks and impressions, showing that it was indexed and could appear in search results.
Example #3 - Niche website
The final example is from a niche website with almost 10K pages.
Again, I don’t want to show the website's domain name, but I will try to show part of the URL so you can see that it’s the same page being tested.
As you can see below, the URL Inspection status for the /alfreton-cricket-club/ page is ‘crawled - currently not indexed’.
When checking the historic indexing status in Indexing Insight, we can see that this page was previously indexed, and that it was last indexed on July 23rd 2024.
We can confirm the page was indexed by opening up the historic URL Inspection report.
The /alfreton-cricket-club/ page also has impressions when checking Search Analytics.
Again, just like the other URLs tested, a page can only get impressions if it appears in Google Search results. This means that the page needed to have been indexed to be shown to users at some point.
📈 How often does this happen in GSC?
Based on our data from Indexing Insight, this happens more than you think.
For example, when testing pages submitted via XML sitemaps from alpha testers, 70-80% of pages with the ‘crawled - currently not indexed’ status in GSC had been historically indexed.
The problem of “backward” indexing was so frequent that we had to create a new indexing status in the tool to understand how often Google removed pages from being served in search results.
This new page status is called ‘crawled - previously indexed’.
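The idea behind this status can be sketched in a few lines of Python. This is only an illustration, not Indexing Insight's actual implementation: it assumes you keep your own dated log of coverage states per URL (for example, from regular URL Inspection checks), and the function name `find_deindex_date` and the log format are hypothetical.

```python
from datetime import date
from typing import Optional

INDEXED = "Submitted and indexed"
CRAWLED_NOT_INDEXED = "Crawled - currently not indexed"


def find_deindex_date(history: list[tuple[date, str]]) -> Optional[date]:
    """Return the most recent date a URL switched from indexed to
    'Crawled - currently not indexed', or None if it never did.

    `history` is a hypothetical log of (check date, coverage state) pairs.
    """
    ordered = sorted(history, key=lambda entry: entry[0])
    switch_date = None
    # Compare each check with the one before it and look for the switch.
    for (_, prev_state), (day, state) in zip(ordered, ordered[1:]):
        if prev_state == INDEXED and state == CRAWLED_NOT_INDEXED:
            switch_date = day
    return switch_date


# Illustrative log only: the switch date mirrors Example #1,
# the earlier checks are made up.
log = [
    (date(2024, 8, 18), INDEXED),
    (date(2024, 8, 19), INDEXED),
    (date(2024, 8, 20), CRAWLED_NOT_INDEXED),
]
print(find_deindex_date(log))  # 2024-08-20
```

A URL with a non-empty result here would be tagged ‘crawled - previously indexed’; a URL that was never seen as indexed returns `None`.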
When rolling out this new report, our team was surprised at the results and the sheer scale of this new status.
For example, for one alpha tester, there are almost 130,000 pages with the ‘crawled - previously indexed’ status.
This means 13% of the pages we’re monitoring for this site have been actively removed from being served in search results by Google’s indexing system.
This indicates a BIG issue with the website’s content.
🕵️ Two Types of ‘crawled - currently not indexed’
Based on our experience and data, a new definition of ‘crawled - currently not indexed’ is needed.
The definition should be split into two categories:
Crawled - currently not indexed: Pages that have been crawled but never been indexed by Google.
Crawled - previously indexed: Pages that have been crawled AND historically indexed, but Google recently stopped serving the content in its search results.
#1 - Crawled - currently not indexed
The traditional definition of ‘crawled - currently not indexed’ pages.
“Pages with the traditional ‘crawled - currently not indexed’ status have been crawled BUT not indexed by Google. For whatever reason, the system has not decided to index them.
Google can decide to index or not index these pages in the future.”
Here are a few characteristics of pages that fall under this indexing status:
Never been indexed - Page URLs with this status have never been historically indexed by Google and shown in search results.
Zero search performance - Page URLs with this status have never appeared in Google Search and have no impressions or clicks over the last 16 months.
Canonicalized URLs - URLs that have been crawled but that the site owner has canonicalized to a different URL.
Non-HTML URLs - When crawled, Google detected that the content type is not HTML, and its system knows not to index these pages in web search.
Low-quality content - Page URLs submitted to Google were so low-quality or thin that they didn’t make it through Google’s indexing pipeline.
#2 - Crawled - previously indexed
A new definition for pages with the ‘crawled - currently not indexed’ status:
“Pages with the new ‘crawled - previously indexed’ status have been crawled AND historically indexed by Google.
However, over time, Google has decided that these pages should not be served to users and removes them from being served in search results.”
Here are a few characteristics of pages that fall under this indexing status:
Submitted and indexed - Page URLs are typically important SEO traffic-driving pages that site owners want to rank in Google Search.
Historically indexed - Page URLs with this status have been historically crawled AND indexed by Google. And have been shown in search results.
Search performance - Page URLs with this status have appeared in Google Search and have had impressions and/or clicks over the last 16 months.
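The split between the two statuses can be expressed as a simple rule. The sketch below is a hypothetical helper (the function name and its inputs are assumptions, not part of GSC or any API): it takes the status GSC reports, whether you have ever observed the URL as indexed, and its impressions over the last 16 months.

```python
def classify_not_indexed(coverage_state: str,
                         was_ever_indexed: bool,
                         impressions_16m: int) -> str:
    """Split GSC's 'Crawled - currently not indexed' into the two proposed statuses."""
    if coverage_state != "Crawled - currently not indexed":
        return coverage_state  # other statuses are left untouched
    # Either an observed indexed state or any impressions implies past indexing.
    if was_ever_indexed or impressions_16m > 0:
        return "Crawled - previously indexed"
    return "Crawled - currently not indexed"


# Never indexed and zero search performance: the traditional status.
print(classify_not_indexed("Crawled - currently not indexed", False, 0))
# Had impressions at some point: the proposed new status.
print(classify_not_indexed("Crawled - currently not indexed", False, 12))
```

Impressions are used as a fallback signal because a page can only get impressions if it was indexed and shown in search results.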
🤷 What does this new definition mean for SEOs?
Ensure you don’t take ‘crawled - currently not indexed’ literally.
When working on websites with a large number of ‘crawled - currently not indexed’ pages in Search Console, do not be fooled into thinking that Google has never indexed these pages.
There is a high probability that Google has chosen to deindex the pages in this report.
If you have XML Sitemaps submitted to GSC, you can easily check which pages have a high chance of being ‘crawled - previously indexed’. Just filter on “All submitted pages” in the Page Indexing report.
Unfortunately, the index removal could have happened 12 months or 12 days ago.
Search Console does not show the exact date a page was removed from Google’s index. If you use the URL Inspection tool, it will just give you the current status.
But there are clear indications that pages were indexed.
You can use the Search Analytics report or API to determine if the page had clicks and impressions over the last 12-16 months.
Even a tiny amount of performance shows that Google indexed the page at some point.
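Here is a minimal sketch of that check against the Search Console API's `searchanalytics.query` method. It assumes the caller has already built an authenticated client (for example with `googleapiclient.discovery.build("searchconsole", "v1", credentials=...)`); the helper names and the 480-day window (a rough stand-in for 16 months) are my own assumptions.

```python
from datetime import date, timedelta


def build_page_query(page_url: str, end: date, days: int = 480) -> dict:
    """Request body for searchanalytics.query, filtered to one exact URL.

    480 days is only an approximation of the 16 months GSC keeps.
    """
    start = end - timedelta(days=days)
    return {
        "startDate": start.isoformat(),
        "endDate": end.isoformat(),
        "dimensions": ["page"],
        "dimensionFilterGroups": [{
            "filters": [{
                "dimension": "page",
                "operator": "equals",
                "expression": page_url,
            }]
        }],
    }


def was_shown_in_search(response: dict) -> bool:
    """True if any returned row has impressions or clicks."""
    rows = response.get("rows", [])
    return any(r.get("impressions", 0) > 0 or r.get("clicks", 0) > 0
               for r in rows)


def check_page(service, site_url: str, page_url: str, end: date) -> bool:
    """`service` is assumed to be a Search Console API client built by the caller."""
    body = build_page_query(page_url, end)
    response = service.searchanalytics().query(
        siteUrl=site_url, body=body
    ).execute()
    return was_shown_in_search(response)
```

If `check_page` returns `True` for a URL that GSC currently reports as ‘crawled - currently not indexed’, that URL is a strong candidate for ‘crawled - previously indexed’.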
📌 Summary
The current definition of ‘crawled - currently not indexed’ is misleading.
In this newsletter, I’ve provided evidence that just because a page has this status in GSC does not mean that it was never indexed. In fact, quite the opposite.
A new indexing status needs to be created: ‘Crawled - previously indexed’.
At Indexing Insight, we automatically tag any pages that move from ‘submitted and indexed’ to ‘crawled - currently not indexed’ and enter this new status.
But I’ve also shown you can easily do the same in Google Search Console.
Hopefully, this newsletter has inspired you to dig deeper into ‘crawled - currently not indexed’ reports. And identify which pages have been ‘crawled - previously indexed’.
📊 What is Indexing Insight?
Indexing Insight is a tool designed to help you monitor Google indexing at scale. It’s for websites with 100K to 1 million pages.
A lot of the insights in this newsletter are based on building this tool.
Subscribe to learn more about Google indexing and future tool announcements.
Hi Adam
Thanks for sharing your experience!
Based on what you presented as Reverse Indexing, I recently encountered an issue:
My website A was redirected to website B, and after a couple of months, URLs from website A started receiving impressions and clicks. All redirects and canonical tags are set up correctly, and Google’s rendered page appears accurate.
I don't know exactly what's going wrong!