How to Monitor Your Website for Link Equity Loss

Apr. 04, 2012 | by Modestos Siotos

Google has recently started taking action against blog networks in an attempt to remove low-quality websites from its index. It is estimated that several thousand domains have already been removed from Google’s index and the number is likely to increase further in the forthcoming weeks or months. That means that hundreds of millions of links have been completely devalued, affecting the rankings of many websites, directly or indirectly.

Carrying out a thorough backlinks audit for new clients is extremely important to us because it allows us to:

  1. Get a good understanding of the link profile to their site and the quality of the historical backlinks
  2. Work out the chances of losing some link equity in the foreseeable future
  3. Closely monitor link equity loss on a weekly/monthly basis, react quickly and modify our link strategy if necessary
  4. Forecast more accurately on ranking improvements and traffic growth

Preparing The Data

First and foremost we need to collect as much backlink data as possible. Exporting data from the following sources would make the data-set quite reliable – the more data, the better.

Majestic SEO Data

Majestic SEO’s historic index offers invaluable data about a site’s backlinks and should almost certainly be the primary source of backlink data.

Open Site Explorer Data

Download the CSV file containing all Inbound Links.

Google Webmaster Tools

  1. Under ‘Your site on the web’, go to ‘Links to your site’
  2. Find ‘Who links the most’ on the left-hand side and click ‘More>>’
  3. Download all links by clicking on ‘Download more sample links’

Extract All Unique Linking Root Domains

Drop all the identified backlink URLs into one spreadsheet and keep the unique root domains only. This can be done by applying the following formula to the current set of URLs:

=LEFT(A2,FIND("/",A2,8)) where A2 is the cell with the original ‘Full-URL’ data. Make sure all URLs include ‘http://’; if they do not, use the formula =LEFT(A2,FIND("/",A2,2)) instead. Type the double quotes as straight quotes ("), because curly quotes copied from a web page make Excel return a #NAME? error.

Note: The above formula works for http URLs but not for https ones, because the search for ‘/’ starts at the eighth character, which in an https URL is still part of ‘https://’. A more complete formula written by James Taylor is available here.
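For readers who prefer a script to a spreadsheet formula, the same extraction can be sketched in Python. This is only an illustration and not part of the original workflow: the function name `linking_host` is made up, and treating scheme-less URLs as http is an assumption. Like the Excel formula, it returns the scheme plus host (subdomains included) rather than a true registrable root domain.

```python
from urllib.parse import urlsplit


def linking_host(url):
    """Return scheme + host with a trailing slash, or '' if no host is found."""
    url = url.strip()
    # Assumption: URLs exported without a scheme are treated as http.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    if not parts.hostname:
        return ""
    # hostname is lower-cased by urlsplit and excludes any port number.
    return f"{parts.scheme}://{parts.hostname}/"


for example in ("http://www.example.com/a/b.html",
                "https://blog.example.com",
                "example.org/page"):
    print(linking_host(example))
```

Unlike the formula, this handles https URLs and URLs without a trailing slash in the same way.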

Then, all duplicate domains should be removed. Highlight the ‘Linking Root Domain’ column and click on ‘Data->Remove Duplicates’. Choose ‘Expand the selection’ and then click on ‘Remove Duplicates…’.

Check only the ‘Linking Root Domain’ column and click ‘Remove Duplicates…’

Eventually, only the unique domains will remain in the spreadsheet.
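The de-duplication step can also be sketched outside Excel. The helper below is hypothetical (the name `unique_hosts` is not from any tool mentioned here); it assumes the domains are already available as a list of strings and treats entries that differ only in letter case as duplicates.

```python
def unique_hosts(hosts):
    """Drop duplicate entries case-insensitively, keeping first-seen order."""
    seen = set()
    unique = []
    for host in hosts:
        key = host.strip().lower()
        # Skip empty cells and anything already recorded.
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


merged = ["http://a.example/", "http://A.example/", "http://b.example/", ""]
print(unique_hosts(merged))
```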

Unfortunately, not all identified linking root domains are necessarily linking to the client site because the collected data may not be up-to-date. At iCrossing we use a proprietary tool to filter out all those linking root domains in order to improve the quality of our data set even further. Filtering out all dead links will significantly increase the quality of this exercise.
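The iCrossing tool is proprietary and not described in this post, so the following is only a rough, hypothetical stand-in for the idea of a dead-link filter. It assumes you have already downloaded the HTML of a linking page by some other means, and it simply checks whether any anchor on that page points at the client’s host. The names `links_to` and `client_host` are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to(html, client_host):
    """Return True if the page contains a link to client_host or a subdomain of it."""
    collector = _LinkCollector()
    collector.feed(html)
    client_host = client_host.lower()
    for href in collector.hrefs:
        try:
            host = urlsplit(href).hostname
        except ValueError:
            continue  # malformed URL, ignore it
        if host and (host == client_host or host.endswith("." + client_host)):
            return True
    return False


page = '<p><a href="http://www.client.example/page">anchor text</a></p>'
print(links_to(page, "client.example"))
```

Pages for which `links_to` returns False are candidates for removal from the data set, since the recorded backlink is no longer present.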

Processing the Data

Having identified a large set of unique linking root domains we can now proceed and do the following:

  • PageRank distribution check
  • Linking root domains indexation check
  • Social metrics distribution check

Using NetPeak Checker, a set of 100 URLs can be checked in approximately 1 minute. For optimal performance, make sure that only the following metrics have been checked:

i.    PR Main (PageRank of the main domain)

ii.    PR Page (PageRank of the given URL, e.g. subdomain or deep page)

iii.    Google Index (the number of pages indexed by Google, equivalent to a site: query)

iv.    Server -> Status Code (returned values include n/f (404), 200, 301, 302, 303, etc.)

Load the unique domains URLs previously identified and hit the ‘Start Check’ button.

Once the data have been fetched you can then export them and process them in Excel.

This is what the returned values mean:

Metric        Values           Meaning
PR Page       0, n/f           Low quality subdomain
PR Main       0, n/f           Low quality domain
Google Index  n/f              Deindexed domain
Google Index  >0               Indexed domain
Status Code   200, 301, 302    Active domain
Status Code   n/f              Domain does not exist or is down

Using Excel’s filters it is quite easy to detect which linking root domains may harm rankings by looking for those with the following value characteristics:

  1. Status codes 200, 301 or 302 (live domains)
  2. Google Index value =  n/f (no pages found in Google’s index)
  3. PR Main or PR Page with a value of 0 or n/f (if conditions 1 and 2 are met, this doesn’t make any great difference)

Note: Because Toolbar PageRank is only updated roughly once a quarter, a linking root domain may have been removed from Google’s index even though it still shows a high PageRank value.
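The filter conditions above can be expressed as a short script. This is a sketch under assumptions: the exported rows are represented as dictionaries, and the column names ‘URL’, ‘Status Code’ and ‘Google Index’ are guesses at the export headers, so adjust them to match your file. Condition 3 is left out because, as noted, it adds little once conditions 1 and 2 are met.

```python
LIVE_STATUS_CODES = {"200", "301", "302"}


def is_live(row):
    """Condition 1: the domain responds with 200, 301 or 302."""
    return str(row["Status Code"]).strip() in LIVE_STATUS_CODES


def is_deindexed(row):
    """Condition 2: no pages were found in Google's index (value n/f)."""
    return str(row["Google Index"]).strip().lower() == "n/f"


def suspect_domains(rows):
    """Return the URLs of live domains that are missing from Google's index."""
    return [row["URL"] for row in rows if is_live(row) and is_deindexed(row)]


# Made-up sample rows, for illustration only.
rows = [
    {"URL": "http://good.example/", "Status Code": "200", "Google Index": "1520"},
    {"URL": "http://gone.example/", "Status Code": "n/f", "Google Index": "n/f"},
    {"URL": "http://network.example/", "Status Code": "200", "Google Index": "n/f"},
]
print(suspect_domains(rows))
```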

PageRank Distribution Check

Working out the PageRank distribution of all linking root domains will unveil the proportion of low quality backlinks. In order to do that we need to check Toolbar PageRank and Indexation of all linking root domains.

Open the .xlsx file exported from NetPeak in Excel, then:

  1. Apply a filter in the top row
  2. In the ‘status code’ column filter out all the n/f values (check 301s and 302s but not n/f). This filter will remove all domains that no longer exist.
  3. Create a Pivot Chart and choose:
  • Row Labels -> PR Main
  • Values -> Count of PR Main
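The same count can be produced without a Pivot Chart. The sketch below assumes the rows are dictionaries with ‘Status Code’ and ‘PR Main’ keys (assumed column names) and mirrors the steps above: drop domains whose status code is n/f, then count how many domains fall under each PR Main value.

```python
from collections import Counter


def pagerank_distribution(rows):
    """Count live linking root domains per PR Main value."""
    counts = Counter()
    for row in rows:
        # Step 2 above: ignore domains that no longer exist.
        if str(row["Status Code"]).strip().lower() == "n/f":
            continue
        counts[str(row["PR Main"]).strip()] += 1
    return counts


# Made-up sample rows, for illustration only.
sample = [
    {"Status Code": "200", "PR Main": "0"},
    {"Status Code": "200", "PR Main": "3"},
    {"Status Code": "301", "PR Main": "3"},
    {"Status Code": "n/f", "PR Main": "n/f"},
]
for pr_value, count in sorted(pagerank_distribution(sample).items()):
    print(pr_value, count)
```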

The PageRank distribution of healthy linking root domains should contain relatively few low PageRank backlinks (n/f and 0) and ideally peak towards the middle of the graph, like this one:

However, a PageRank distribution with too many low PR backlinks, like the one below, should be a cause for concern:

If this is the case, the link building strategy needs to be adapted so that the website attracts links from authoritative and trusted linking root domains. Where necessary, we may try to remove as many low quality backlinks as possible if we think they may be hurting the website, although sometimes this is out of our control. The main objective on such occasions is to increase the quality of backlinks so that the PageRank distribution becomes more balanced.

Monitoring Linking Root Domains Deindexation

Periodically monitoring the rate at which linking root domains get removed from Google’s index can be very useful. If too many linking root domains get deindexed, that is a negative signal which is very likely to have a negative impact on rankings.

For instance, in the following example the number of non-indexed linking root domains has significantly increased in three weeks which should trigger some immediate actions.

Checking the deindexation rate periodically (e.g. weekly or monthly) using the same linking root domains data can identify negative trends in a website’s backlink profile. All deindexed linking root domains should then be checked further in order to identify the reasons that may have led to them being removed from Google’s index. Some manual/editorial checks would make sense in this case. This will also help identify the best strategy for the short and long term.
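One way to track this over time is to keep each periodic check as a snapshot and compare snapshots. The sketch below is an illustration only: each snapshot is assumed to be a dictionary mapping a linking root domain to its Google Index value, with ‘n/f’ meaning deindexed, and all function names are made up.

```python
def deindexed(snapshot):
    """Return the set of domains whose Google Index value is n/f."""
    return {domain for domain, index in snapshot.items()
            if str(index).strip().lower() == "n/f"}


def deindexation_rate(snapshot):
    """Share of domains in the snapshot that are deindexed (0.0 to 1.0)."""
    return len(deindexed(snapshot)) / len(snapshot) if snapshot else 0.0


def newly_deindexed(previous, current):
    """Domains deindexed in the current snapshot but not in the previous one."""
    return deindexed(current) - deindexed(previous)


# Made-up snapshots of the same four domains, one check apart.
week_1 = {"a.example": "120", "b.example": "n/f", "c.example": "45", "d.example": "3"}
week_2 = {"a.example": "n/f", "b.example": "n/f", "c.example": "45", "d.example": "3"}

print(deindexation_rate(week_1), deindexation_rate(week_2))
print(sorted(newly_deindexed(week_1, week_2)))
```

The domains returned by `newly_deindexed` are the ones worth a manual/editorial check first.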

Social Metrics Distribution Check

Using SEO Tools for Excel, it is fairly easy to calculate the following three metrics for each linking root domain:

  • Facebook shares
  • Twitter counts
  • Google +1s

Looking at the social shares each linking root domain has received can add more insight to the above exercise, as valuable domains are more likely to be shared socially. Conversely, domains with very few or no social mentions at all may be of low quality.
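As a rough illustration of how these counts could be used, the sketch below adds up the three metrics per domain and lists the domains at or below a threshold. The key names and the default threshold of zero are assumptions, not output from SEO Tools for Excel, and a low total is only a hint that a domain deserves a closer look.

```python
def low_social_domains(rows, threshold=0):
    """Return domains whose combined social count is at or below the threshold."""
    flagged = []
    for row in rows:
        total = (row["facebook_shares"]
                 + row["twitter_count"]
                 + row["google_plus_ones"])
        if total <= threshold:
            flagged.append(row["domain"])
    return flagged


# Made-up sample data, for illustration only.
social = [
    {"domain": "news.example", "facebook_shares": 340, "twitter_count": 120, "google_plus_ones": 15},
    {"domain": "network.example", "facebook_shares": 0, "twitter_count": 0, "google_plus_ones": 0},
]
print(low_social_domains(social))
```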

    Comments (44)

    • Modestos Siotos

      Hi Ovidiu,

      There are several commercial applications you can use to find out which sites actually link to you: Ahrefs, Majestic SEO, Open Site Explorer. However, I would recommend Cognitive SEO because it aggregates data from various sources and crawls all the links, so you have a better and more accurate view of all sites linking to yours, with follow/nofollow status and many more details. About your other question, Google will certainly block your IP when making too many requests, but in this case proxy servers should help.
      Oct 15, 2013 11:23 am

    • Ovidiu Burduja

      Hi, Modestos.

      I'm surprised you say "proprietary software" which pretty much breaks everything you had built before in the article; from that point onward, there's no need to keep reading.

      What if Google blocks your IP after several attempts at getting Google Index metrics for some of the URLs in question? Is there a workaround for it?
      Oct 15, 2013 11:11 am

    • Modestos Siotos

      Unfortunately I'm not using Open Office so I'm not able to help on this.
      Aug 13, 2012 10:40 am

    • James

      I'm trying to follow the last bit - the indexed vs. non-indexed side-by-side done over time. I don't know how to do that, I'm using OpenOffice.

      Any tutorials you can point me to?
      Aug 6, 2012 09:05 pm

    • Modestos Siotos

      Thank you for spotting this Richard, I've added a note in the post.
      Jun 29, 2012 09:12 am

    • Richard

      Hello,

      I went back to looking at the formula =LEFT() and I found out the following.

      =LEFT(A2,FIND("/",A2,2)) will work if your URLs do not include http://

      =LEFT(A2,FIND("/",A2,8)) will work if your URLs do include http://

      Conversely, the latter will not return the desired result if your URLs do not include http, and the former will return an error if your URLs include http.

      Cheers!
      Jun 11, 2012 02:54 pm

    • Modestos Siotos

      Thanks Richard. For this formula to work all URLs need to include http://
      May 31, 2012 09:54 am

    • Richard

      For the formula =LEFT(A2,FIND("/",A2,8)), you need to change the number 8 to the number corresponding to the row. In this example, it should be 2. That worked for me

      Cheers
      May 30, 2012 08:14 pm

    • Modestos Siotos

      There is an alternative formula written a while ago by James Taylor, which actually works for both http and https URLs. The one I mentioned in the post only works for http URLs, hope this time it will work for you!
      Apr 18, 2012 10:15 am

    • Jeff

      "Unfortunately, not all the identified linking root domains are necessarily linking to the client site because the data may not be up-to-date. At iCrossing we use a proprietary tool to filter out all those linking root domains in order to improve the quality of our data set even further."

      As an alternative, I suggest using Scrapebox's free Link Checker tool for this, although it needs to be run on the actual link itself, rather than the root domain.
      Apr 16, 2012 11:45 pm

    • angelwitch

      Unfortunately, no. Too bad I can't resolve it. Thanks though.
      Apr 12, 2012 02:30 pm

    • Modestos Siotos

      Double check the double quotes, they may be the issue :)
      Apr 11, 2012 03:17 pm

    • angelwitch

      Thanks, that's the formula I refer to. The cell is correct, but I keep getting the error. I am so frustrated, I want to follow your tutorial so badly! What an excellent post! Is there another way I can exclude the duplicate domains and keep the root domains?
      Apr 11, 2012 02:42 pm

    • Modestos Siotos

      I guess you refer to =LEFT(A2,FIND("/",A2,8)) . Just make sure that the full URL appears in A2. If not, just update A2 to the cell you need the formula to be applied to.
      Apr 11, 2012 02:33 pm

    • Modestos Siotos

      Thanks Tim!

      There are quite a few interesting readings on the recent Google update but my favourite one is Jim Boykin's interview on SEOBook.
      Apr 11, 2012 02:30 pm

    • angelwitch

      Great post! I tried to use the formula though but I keep getting an #NAME? error! How can i fix that?
      Thank you
      Apr 11, 2012 01:40 pm

    • Tim Aldiss

      Another excellent post Modi. Do you have any further reading recommendations for the recent Google update?
      Apr 5, 2012 10:46 am

     
    Please note: the opinions expressed in this post represent the views of the individual, not necessarily those of iCrossing.