7 steps to de-indexing articles on your website
Updated 13 December 2022
Why should you consider de-indexing anything from your website? Read on to find out the details.
If your website has been around for a while, it very likely has some outdated articles. Old content can come across as “unhelpful” or “irrelevant” and should be dealt with by either updating or de-indexing it. Updating an article can take a lot of time, so de-indexing may be the better option to avoid negative outcomes for your website. Google’s Helpful Content update addresses this issue directly: its signal applies site-wide, so unhelpful content on some pages can drag down the rankings of the entire site.
The following steps will help you identify content that isn’t performing organically (meaning it ranks poorly and gets little to no page views) and should be considered for de-indexing, as well as content on the cusp that could benefit from an update and revival.
Word of caution: de-index articles in small batches so you can easily reverse the process if necessary.
1. Export post URLs
The first step in working out which articles to de-index is deciding on your starting point. A good approach is to work in chronological order, on the assumption that the oldest articles should be audited first. Generally speaking, articles that are more than a few years old are the most likely to be outdated or irrelevant. This process should help you decide which articles to de-index and which to simply update.
Typically, you can start by exporting a list of articles, their IDs, and URLs from the first year of publication (or the first 200 posts) and work your way forward. If you run a WordPress site, you might find WP Import Export Lite a good plugin to use for this purpose – just set the filters for the right date range and you’ll be set.
2. Retrieve stats from your analytics stack
The next step in the de-indexing process is to grab the statistics for the posts you’ve just exported. If you use Google Analytics, the process is relatively simple. The example below uses the Honest Fox website.
To obtain the statistics for these articles in Universal Analytics (UA), we will use the regular expression (regex) search function, which requires the URLs to be joined into a single line. Using a text editor that supports grep-style find and replace, replace all line breaks (\r, \n or \r\n, depending on how the file was saved) with a pipe (|).
Once done, your regex search string should be a single line with every URL separated by a pipe.
Extra step: you might need to confirm whether the URLs in your Google Analytics reports include the domain. Ours do not, so we have only included the path in the search string.
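If you would rather not do the find and replace by hand, the formatting can be scripted. The snippet below is a minimal sketch in Python, not part of our original workflow: the function name build_search_string and the example.com URLs are made up for illustration. It strips the domain so only the path remains, and (as an extra precaution we have added) escapes characters such as dots that carry a special meaning in a regex.

```python
import re
from urllib.parse import urlsplit


def build_search_string(urls, separator="|", escape=True):
    """Join URL paths into a single search string.

    urls      -- exported post URLs, one per item (blank items are skipped)
    separator -- "|" for a regex search string
    escape    -- backslash-escape regex metacharacters in each path
    """
    paths = []
    for raw in urls:
        raw = raw.strip()
        if not raw:
            continue  # ignore blank lines left over from the export
        # Keep only the path, assuming the analytics report omits the domain.
        path = urlsplit(raw).path or "/"
        if escape:
            path = re.sub(r"([.?+*()\[\]{}^$|\\])", r"\\\1", path)
        paths.append(path)
    return separator.join(paths)


# Hypothetical URLs standing in for the CMS export.
exported = [
    "https://example.com/first-post/",
    "https://example.com/second-post/",
]
print(build_search_string(exported))
```

Passing a different separator produces the same list with another delimiter, should your reporting tool expect one.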
In Google Analytics, click on “Behaviour” in the left navigation menu, then “Site Content”, then “Landing Pages”. Make sure you have the Organic Traffic segment selected and set the date range to the last 90 days.
Next, click on Advanced next to the search field, and set the following conditions:
Include Landing Page Matching RegExp.
Paste the formatted search string into the field and hit “Apply”. Then export the results to a Google Sheet.
Google Analytics 4 (GA4)
The process in GA4 is slightly different. Regex search isn’t supported, but you can get away with filtering. Firstly, format your search string by replacing the pipe (|) with a semicolon (;).
In GA4, head to “Reports”, then “Engagement”, and finally, “Pages and screens”.
Make sure you have the right segment selected by clicking on “All Users”, then redefining the audience to include session medium = organic. Next, click on the date selector and choose Last 90 Days from the preset options.
To search for the articles you’ve shortlisted, click on Add filter under “Pages and screens”. Set the filter’s dimension to “Page path and screen class”, paste the search string into the dimension values field, and click on “All values containing…”.
This is what it should look like:
Export the stats to use in the next step.
3. Identify low-performing articles
In Google Sheets (or any spreadsheet application of your choice that supports vlookup), paste the post URLs and IDs from the original list which was exported from the CMS.
Add a new sheet and paste the stats retrieved from Google Analytics. It’s quite normal to have fewer entries than in the original list. It simply means that some of the articles had no page views in the 90-day window, so Google Analytics has no rows for them.
Return to the first sheet and, in the next empty column, use the vlookup function to match each path against the stats from Google Analytics and pull in its page views.
Any entry that returns #N/A had zero page views in the last 90 days.
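For anyone who prefers a script to a spreadsheet, the same lookup can be sketched in Python. This is an illustration under assumptions rather than the method we used: the function names and the sample paths are hypothetical, and the stats are assumed to have been loaded into a dictionary that maps each path to its page views. A path that is missing from the stats plays the role of #N/A.

```python
def match_page_views(posts, stats):
    """Mimic vlookup: attach page views to each (post_id, path) pair.

    posts -- list of (post_id, path) tuples from the CMS export
    stats -- dict mapping path -> page views from the analytics export
    Returns (post_id, path, views) tuples; views is None when the path
    is absent from the stats, which corresponds to #N/A in the sheet.
    """
    return [(post_id, path, stats.get(path)) for post_id, path in posts]


def zero_view_posts(posts, stats):
    """Return the (post_id, path) pairs with no recorded page views."""
    return [
        (post_id, path)
        for post_id, path, views in match_page_views(posts, stats)
        if views is None
    ]


# Hypothetical sample data: IDs and paths are placeholders.
posts = [(15069, "/first-post/"), (14680, "/second-post/")]
stats = {"/first-post/": 42}

print(zero_view_posts(posts, stats))  # [(14680, '/second-post/')]
```

Note that the match is exact, just like vlookup with an exact-match setting, so paths must be formatted identically in both exports (trailing slashes included).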
4. Run a backlink check
Now that you’ve identified which of these posts aren’t performing, you’ll need to run a backlink check before you can decide whether or not to de-index them. This requires either a Semrush or Ahrefs account, and unless you have a Pro account with Ahrefs, this process can be quite time-consuming. If you don’t have either, Semrush offers a free plan that allows up to 10 backlink searches a day.
When you’re done checking backlinks, you’ll find yourself at a pivotal juncture: what do you do with articles that have zero page views but do have backlinks? First, evaluate the backlinks: what domain scores did Semrush or Ahrefs assign to the linking sites? If the scores are low, the article can fall into the “de-index” bucket. But if the backlinks come from Wikipedia or any half-decent site (DR >50), the article should be either updated or redirected in order to preserve the backlink.
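The decision rule above is simple enough to write down as a few lines of Python. Treat this as a sketch: the function name is hypothetical, the scores are assumed to be the domain scores you noted down from Semrush or Ahrefs for each linking site, and the cut-off of 50 is the rule of thumb from this article rather than an official threshold.

```python
def triage_zero_view_article(backlink_scores, strong_threshold=50):
    """Decide what to do with an article that had zero page views.

    backlink_scores  -- domain scores of the sites linking to the article
                        (an empty list means no backlinks were found)
    strong_threshold -- a score above this counts as a backlink worth keeping
    """
    if any(score > strong_threshold for score in backlink_scores):
        return "update or redirect"  # preserve the valuable backlink
    return "de-index"


print(triage_zero_view_article([]))        # de-index
print(triage_zero_view_article([12, 30]))  # de-index
print(triage_zero_view_article([8, 91]))   # update or redirect
```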
5. Flag the articles to be de-indexed in the CMS
If you need a bit of CMS control or a way to quickly identify de-indexed posts, add them to a “de-indexed” category or tag. This is completely optional, but really handy if you ever need to re-index an article.
6. De-index the articles
If there are only a handful of articles to be de-indexed, Yoast SEO is probably your best bet even though you’ll have to de-index each article individually.
Simply list the IDs of the articles you need to de-index, separated by commas (e.g. 15069,14680,15557,19029), and insert the list within the square brackets here:
private static $noIndexIdArray = [];
The plugin doesn’t have a frontend yet, so you’ll need to know your way around WordPress in order to use it.
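If your shortlist lives in a spreadsheet, a short script can produce the comma-separated ID list and split it into small batches, in keeping with the earlier word of caution. The sketch below is only an illustration: the function names and the batch size of two are assumptions, so pick whatever batch size you are comfortable reversing.

```python
def format_id_list(post_ids):
    """Return post IDs as a comma-separated string, dropping duplicates."""
    seen = set()
    unique = []
    for post_id in post_ids:
        post_id = int(post_id)  # accept IDs read from a sheet as text
        if post_id not in seen:
            seen.add(post_id)
            unique.append(post_id)
    return ",".join(str(post_id) for post_id in unique)


def in_batches(post_ids, size):
    """Split the list of IDs into consecutive batches of at most `size`."""
    return [post_ids[i:i + size] for i in range(0, len(post_ids), size)]


ids = [15069, 14680, 15557, 19029]
for batch in in_batches(ids, 2):
    print(format_id_list(batch))
```

Each printed line is one batch, ready to be pasted between the square brackets.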
7. Remove links to de-indexed articles
The final piece of the puzzle is a fairly cumbersome one: removing all internal links to the de-indexed articles. Some might choose not to do this, but if the posts end up getting deleted, then you’ll be left with a bunch of broken links. The good news is, you’ll have some time to check this off.
In order to identify internal links pointing to the de-indexed articles, you’ll need to crawl your site with Screaming Frog, then individually click on every de-indexed post and select the Inlinks tab. These are the articles that link to the de-indexed post – remove those links and you’re done.
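Clicking through every post can get tedious on a larger site. If your crawler can export its list of internal links as a CSV, the filtering can be scripted instead. The following Python sketch assumes such an export with one row per link and columns named Source and Destination; check those column names against your own file, as they are an assumption here, and the function name and example.com URLs are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from urllib.parse import urlsplit


def links_to_remove(inlink_rows, deindexed_paths):
    """Group links to de-indexed posts by the page they appear on.

    inlink_rows     -- dict-like rows with "Source" and "Destination" URLs
    deindexed_paths -- paths of the de-indexed posts, e.g. "/old-post/"
    Returns a dict mapping each source URL to the sorted list of
    de-indexed paths it links to.
    """
    targets = set(deindexed_paths)
    by_source = defaultdict(set)
    for row in inlink_rows:
        destination_path = urlsplit(row["Destination"]).path
        if destination_path in targets:
            by_source[row["Source"]].add(destination_path)
    return {source: sorted(paths) for source, paths in by_source.items()}


# Hypothetical export, inlined here so the example is self-contained.
sample_export = io.StringIO(
    "Source,Destination\n"
    "https://example.com/guide/,https://example.com/old-post/\n"
    "https://example.com/guide/,https://example.com/current-post/\n"
    "https://example.com/news/,https://example.com/old-post/\n"
)
rows = csv.DictReader(sample_export)
for source, paths in links_to_remove(rows, ["/old-post/"]).items():
    print(source, "links to", ", ".join(paths))
```

The comparison is on the exact path, so make sure trailing slashes in your de-indexed list match those in the crawl.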
Need more help with your website?
Many tools, many steps, but often not enough time to do them all. We know how time-consuming it can be to build and maintain a website that puts your business in the spotlight it deserves. But it is necessary if you want to stay ahead of your competition and keep a digital presence that attracts the people you are looking for.
Get in touch if you have any questions or want to maximise the potential of your website.