404 Errors and SEO

What is a 404 error (page not found)?

Technically, a 404 error (also called “Page not found” or “HTTP 404”) is a status code that the web server hosting a site returns to a browser or search engine that tries to access a page that does not exist or no longer exists. Here is a diagram illustrating the interaction:

From the visitor’s point of view, a 404 error usually shows up as a generic message such as “Error 404”, “404 File not found”, “File not found” or “This page no longer exists”.

Note that the family of HTTP error codes also includes code 410 (“Gone”), which is similar to code 404. Webmasters can use this rarer code to indicate explicitly that a page has been removed and will not return.
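To make the difference between the two codes concrete, here is a minimal Python sketch built on the standard library's `http.HTTPStatus` enumeration. The function name `describe_status` and the wording of the explanations are illustrative choices, not part of any standard.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short, human-readable explanation of a status code.

    Illustrative helper: the explanations paraphrase the distinction
    between 404 and 410 described above.
    """
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    if status is HTTPStatus.NOT_FOUND:
        meaning = "page not found, it may or may not come back"
    elif status is HTTPStatus.GONE:
        meaning = "page removed on purpose, it will not come back"
    else:
        meaning = "not a missing-page code"
    return f"{status.value} {status.phrase}: {meaning}"


if __name__ == "__main__":
    for code in (404, 410, 200):
        print(describe_status(code))
```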

To learn more about HTTP codes, you can read our article on the HTTP protocol.

Causes of a 404 error

A web server returns a 404 HTTP code when asked for a resource it cannot find. Possible causes include:

  • a URL that used to exist but has been permanently removed without a redirect being set up,
  • a URL that never existed, because the webmaster made a mistake when typing an internal or external link,
  • a misconfiguration of the URLs generated automatically by the content management system (a CMS such as WordPress or Joomla), which can produce invalid URLs.

Impact of 404 errors on SEO

In general, 404 errors do not hurt organic search rankings (SEO) as long as their proportion stays reasonable.

However, three cases are genuinely problematic:

  • One of your site’s important pages erroneously returns a 404 code (for example because of a technical error). Correct this urgently, so that Google does not conclude that the page has disappeared and visitors trying to reach it are not frustrated.
  • A valuable external site has linked to your site using an incorrect URL. In this case we advise you to contact the webmaster concerned to report the error. This is a good opportunity to recover a valuable backlink for SEO.
  • Your site has too many 404 errors. The user experience and crawling by search engines could be hindered, which in the long run could have a negative impact on your site’s rankings.

Generally speaking, Google favours quality sites that provide useful and accessible content to Internet users and search engines. Regularly correcting 404 errors is good practice for meeting this quality requirement.

SEO Contest

During an SEO contest such as the qwanturank contest organized by the search engine Qwant, competitors can be expected to track down 404 errors so as not to jeopardize their ranking.

It is therefore worth detecting and correcting 404 errors in order to:

  • make it easier for search engines to crawl your site, so that they come back more often,
  • improve the user experience,
  • give a better image of your site,
  • build audience loyalty,
  • make sure you don’t lose a valuable backlink because of a simple typo in the link on the external site.

To better understand the impact of 404 errors on SEO, and especially how Google handles them, see the video by Matt Cutts:

When should a 404 error (page not found) be returned?

As we saw previously, 404 errors, when used correctly, will not have a negative impact on SEO. However, care must be taken not to generate too many of them.

To decide whether to return a 404 error, here are some questions to ask yourself:

  • What is the level of traffic generated by the page to delete?
  • Are there quality backlinks pointing to the page to be deleted?
  • Is content similar to the deleted page being offered on another page of the site?

Depending on the answers, you can decide whether or not to return a 404 error:

  • If the page to be deleted receives many visits and/or backlinks, find a page with similar or closely related content and set up a 301 redirect to it instead of returning a 404 error.
  • If the page’s traffic and number of backlinks are almost zero and no other page offers similar content, return a 404 error.
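The decision rule above can be sketched as a small Python function. This is only an illustration: the name `choose_response`, its parameters and the `min_visits` threshold of 100 are hypothetical, and the cases the rule does not cover (for example a popular page with no similar replacement) fall back to a 404 here, which is an assumption rather than a recommendation.

```python
from typing import Optional, Tuple


def choose_response(
    monthly_visits: int,
    quality_backlinks: int,
    similar_url: Optional[str],
    min_visits: int = 100,  # hypothetical threshold for "a large number of visits"
) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_target) for a page about to be deleted."""
    # The page is worth preserving if it has traffic and/or quality backlinks.
    has_value = monthly_visits >= min_visits or quality_backlinks > 0
    if has_value and similar_url:
        # Send visitors and link equity to the closest equivalent page.
        return 301, similar_url
    # Assumption: every other case is answered with a plain 404.
    return 404, None


if __name__ == "__main__":
    print(choose_response(500, 0, "/new-guide"))
    print(choose_response(2, 0, None))
```

Whatever threshold you pick, the point is to make the choice between a 301 and a 404 explicit for each page you remove rather than leaving it to chance.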

Each time you remove a page from your site, you must of course remove or modify all internal links pointing to the corresponding URL.

How to detect and correct 404 errors?

To correct URLs that return 404 errors, you can:

  • Use a site analyzer to detect all internal or external links on your site that lead to a 404 error code. Then edit the pages containing these broken links to correct or remove them.
  • Use Google Search Console:
    • Go to Crawl > Crawl Errors. There you will see a list of all URLs where the Google crawler (Googlebot) encountered a 404 code.
    • Click on an erroneous URL, then on Linked from. There you will see a list of the pages that link to this URL.
    • Correct these 404 errors either by contacting the webmaster of the linking site or by setting up a 301 redirect from the URL reported in Google Search Console to the correct URL.
  • Use the information in your web server’s log files. These files are generally available even if you don’t administer your environments yourself. A simple grep command searching for the code “404” is enough to find the requests for which your web server returned a 404 code. The Referer field indicates the page containing the broken link.
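As an alternative to a raw grep, the log analysis can be sketched in a few lines of Python. The sketch assumes that your server writes the widely used "combined" log format (request line, status, response size, then the Referer in quotes); adjust the regular expression if your format differs. The function name `find_404s` and the sample log lines are made up for the illustration. Unlike a plain search for the string "404", it only looks at the status field, so a response size of 404 bytes is not counted by mistake.

```python
import re
from collections import defaultdict

# Matches the request, status, size and Referer fields of a combined-format line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)"'
)


def find_404s(lines):
    """Map each path that returned 404 to the sorted list of its referers."""
    hits = defaultdict(set)
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            hits[match.group("path")].add(match.group("referer"))
    return {path: sorted(referers) for path, referers in hits.items()}


if __name__ == "__main__":
    sample = [
        '1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET /old-page HTTP/1.1" 404 153 "https://example.com/blog" "Mozilla/5.0"',
        '1.2.3.4 - - [10/Oct/2023:13:55:40 +0000] "GET /logo.png HTTP/1.1" 200 404 "-" "Mozilla/5.0"',
    ]
    for path, referers in find_404s(sample).items():
        print(path, referers)
```

A Referer of "-" means the browser sent no referring page, for example when the visitor typed the URL directly.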

Badly written URLs on other sites deserve particular attention, because you may be losing part of your traffic to them. To recover the traffic that lands on a 404 error page, contact the webmaster of the linking site or set up a 301 redirect from the incorrect URL to the correct one.

Tool to detect 404 errors on a site

Some websites provide a tool that detects the 404 errors found on a single page. Detecting all the 404 errors on a site is only done as part of a complete audit.
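To give an idea of what such a single-page check does, here is a minimal Python sketch using only the standard library. It is not how any particular tool works: the names `LinkCollector`, `http_status` and `broken_links` are hypothetical, only `<a href>` links are examined, and network failures other than HTTP errors (DNS errors, timeouts) are deliberately left unhandled. Some servers also refuse HEAD requests, in which case a GET would be needed.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the absolute URLs of all <a href="..."> links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip in-page anchors and non-HTTP schemes.
        if href and not href.startswith(("#", "mailto:", "javascript:")):
            self.links.append(urljoin(self.base_url, href))


def http_status(url):
    """Return the HTTP status code obtained for url (after redirects)."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return response.status
    except HTTPError as error:
        return error.code  # 4xx and 5xx responses are raised as HTTPError


def broken_links(html, base_url, get_status=http_status):
    """Return the links found in html that answer with a 404 code."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    unique_links = dict.fromkeys(collector.links)  # de-duplicate, keep order
    return [url for url in unique_links if get_status(url) == 404]
```

The `get_status` parameter lets you plug in another way of fetching the status code, which is also convenient for trying the function without network access.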

Online tools

Software

  • Xenu (Windows only)
  • WebSite Auditor (SEO PowerSuite)
  • Screaming Frog
  • Visual SEO Studio

The custom 404 error page

Rather than showing a bare technical message to visitors who run into a 404 error, webmasters can set up a custom 404 error page, that is, a more attractive page that limits the inconvenience for the visitor as much as possible.

Google clearly recommends setting up custom error pages, but only to improve the comfort of Internet users. Custom error pages have no particular impact on a site’s SEO; they are simply good practice for a webmaster.

On Apache web servers, a custom error page for the HTTP 404 code is set up by placing a directive of this type in the .htaccess file:

ErrorDocument 404 /404.html

Here are our recommendations when setting up a custom 404 error page:

  • Make sure that the custom 404 error page actually returns a 404 status code. Do not return a 200 code, otherwise every erroneous URL on the site could be treated as duplicate content.
  • Display a clear message indicating that the requested page cannot be found.
  • Integrate the 404 error page into your site: it should use your site’s colour scheme, graphics and navigation.
  • Encourage visitors to visit other pages on your site by adding links to pages that may be of interest to them.
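The first recommendation is easy to verify automatically: request a URL that almost certainly does not exist and check that the server answers with 404 rather than 200. The Python sketch below is one possible way to do this with the standard library. The names `http_status` and `returns_real_404` are hypothetical, and the random path is only assumed not to exist on your site.

```python
import uuid
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen


def http_status(url):
    """Return the HTTP status code obtained for url (after redirects)."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status
    except HTTPError as error:
        return error.code  # 4xx and 5xx responses are raised as HTTPError


def returns_real_404(site_root, get_status=http_status):
    """Check that a non-existent URL under site_root yields a 404 code."""
    # A random 32-character path is assumed not to match any real page.
    probe_url = urljoin(site_root, "/" + uuid.uuid4().hex)
    return get_status(probe_url) == 404
```

For example, `returns_real_404("https://www.example.com/")` would return False for a site whose custom error page is served with a 200 code, or whose missing URLs are all redirected to the home page.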

Google Search Console reports many 404 errors, what can you do?

In your Google Search Console account (formerly Google Webmaster Tools), in the Crawl > Crawl Errors section, you’ll find all the HTTP errors detected by Googlebot during successive crawls of your website.

Don’t give too much importance to the total number of 404 errors reported. The main information to look for in the 404 error table is the date the error was detected.

This is because the list of 404 errors reported by Google is not always up to date, and often includes very old 404 errors that may have been corrected a long time ago.

To “force” Google to clean up this list, simply select all reported 404 errors in Google Webmaster Tools and mark them as fixed.

After several days, look at this table again and see which URLs have been reported again… it’s this updated list of errors that needs to be fixed.