If your SEO has said you need to change all internal links pointing to a page that has moved or been killed, they’re doing their job. It is an annoying task but a very important one. There are three main reasons why linking to 3XX and 4XX pages is bad for SEO, not to mention bad for the experience of your website visitors.
If you’re an IT or dev professional reading this hoping to get out of changing links: even though a 301, 302 or 307 redirect is quick and barely noticeable to an end user, it is certainly noticeable to a search engine, and that is what this post covers.
Three Reasons Why 3XX and 4XX Links Are Bad For SEO:
- They waste your crawl allowance
- Sending users to resources that do not exist creates a bad UX and makes your pages a less favorable option for search engines
- Multiple redirects can cause a spider or robot to leave your site (see bullet point 1)
Now let’s go a bit deeper into what these mean.
Not Wasting Your Crawl Allowance
Although it would be amazing if search engines went through every page of our sites, constantly finding, storing and indexing our content, that is a fantasy world. The reality is that they come to our websites, crawl through portions and at some point leave, updating their databases with what they could and could not find and what has changed.
They look at the pages they visited and try to determine what they are about, who they are designed for, which queries they can answer, and whether they should start showing them or store a copy for later use. If you send search engines to pages that no longer exist, that return 4XX errors (5XX is a separate problem) or that redirect, you’re wasting the number of pages they will visit while on your website.
They can and do try to follow 3XX redirects, but this isn’t the best experience, especially when you get into chains of multiple redirects. For example: you changed URL structures a couple of years back, then you flipped to HTTPS, and now you have yet another new URL structure. This common scenario creates a chain of three redirects, which is bad for multiple reasons; the main one is the last point of this post. Ideally you’ll go through, take the original version of each link and point it directly at the new page, instead of at something that keeps redirecting. That is clean code and a clean site.
In plain English, you have four URLs:
- A – original pointing to B
- B – decided to turn on SSL and is now pointing to C
- C – switched to a new URL structure again and now pointing to D
- D – the actual page, which returns a 200.
What we need to do above is take A, B and C and point each of them directly to D. This is the ideal solution.
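The flattening step above is easy to script once you have an export of your redirect rules. Here is a minimal Python sketch (the URLs are hypothetical, mirroring the A → B → C → D chain) that follows each chain to its final 200 page and rewrites every old URL to point there directly:

```python
def resolve_final(url, redirect_map):
    """Follow redirects in redirect_map until reaching a URL that
    does not redirect (the 200 page), guarding against loops."""
    seen = set()
    while url in redirect_map:
        if url in seen:
            raise ValueError(f"redirect loop at {url}")
        seen.add(url)
        url = redirect_map[url]
    return url

def flatten_redirects(redirect_map):
    """Rewrite every entry so each old URL points straight at the final page."""
    return {old: resolve_final(old, redirect_map) for old in redirect_map}

# Hypothetical chain: A -> B (SSL move) -> C (new structure) -> D (live page).
chain = {
    "http://example.com/a": "http://example.com/b",
    "http://example.com/b": "https://example.com/c",
    "https://example.com/c": "https://example.com/d",
}
flat = flatten_redirects(chain)
# Every old URL now redirects straight to /d: one hop instead of up to three.
```

The output of `flatten_redirects` is what your new redirect rules should look like: A, B and C each pointing directly at D.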
Avoid Sending Users to a Bad Experience
If you send a user to a 4XX error through a link on your site, you’re providing a bad experience for them. If you send someone to a 4XX page on your own domain, that is even worse. Think about it this way. You thought enough of a post on a website or within your own domain to source and link to it. You’re telling a visitor this is where you can learn more or find solutions. Then when they click through there is nothing there. That creates a bad user experience.
Now think about this from a search engine’s standpoint when it has to choose which page to show. Option A has links that end in 4XX errors. Option B links to pages that exist and provide more information. Option B is the better user experience and is the one the search engine may show when it comes down to a tie. Ties in the results are much more common than you’d think, and you want to be the one that wins so you get the traffic.
This is part of site maintenance. Crawling a site for external and internal links that point to 4XX errors is easy, does not take long (depending on site size), and is something you should consider doing at least once a quarter, if not once a month. With my SEO clients I try to keep an updated 3XX and 4XX sheet available for them each month.
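An audit like that boils down to fetching each link’s status code and bucketing it. A minimal sketch of the bucketing logic, with the HTTP fetcher injected so you can plug in whatever crawler or `HEAD`-request function you use (the URLs and statuses below are made up for illustration):

```python
def audit_links(links, fetch_status):
    """Bucket links by HTTP status class for a monthly 3XX/4XX sheet.
    `fetch_status` is any callable mapping a URL to its status code
    (e.g. a function that issues a real HTTP HEAD request)."""
    report = {"ok": [], "redirect": [], "broken": []}
    for url in links:
        status = fetch_status(url)
        if 300 <= status < 400:
            report["redirect"].append(url)
        elif 400 <= status < 500:
            report["broken"].append(url)
        else:
            report["ok"].append(url)
    return report

# Hypothetical status lookup standing in for live requests.
fake_statuses = {
    "/about": 200,
    "/old-blog": 301,
    "/deleted-page": 404,
}
report = audit_links(fake_statuses, fake_statuses.get)
```

The `redirect` bucket feeds the chain-flattening fix above; the `broken` bucket is your list of links to update or remove.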
Multiple Redirects Can Cause a Spider to Leave Your Site
When you send a search engine spider into a long series of redirects (a redirect chain), the spider or bot may decide it’s had enough and stop crawling your site. You not only lose your crawl allowance for that visit, but it is something you could have prevented with proper site maintenance. Before someone asks: there is a theory that you lose 5–15% of equity or authority on each redirect. I personally do not believe this, but I am superstitious, so I won’t say it doesn’t happen.
If you want to maximize the way a search engine crawls your site and you want to avoid the claimed loss of authority, then clean your internal links. Don’t link to pages that redirect. This includes your blog, forums, sitemaps and anywhere else you have internal links. You can find lots of places that mention this, like this post on DeepCrawl, or you can run your own test.
Set up a new site with 5 pages in a row. Start with the homepage, link it to page 2, then page 3, then page 4, and end on page 5. Now make the only link from page 1 to page 2 pass through two 3XX redirects, and apply the same to each page down the chain. If the search engine makes it through and indexes all 5 pages, it can crawl those redirects without leaving.
Time to test again. Set up the same thing but use three 3XX redirects, and check whether all 5 pages get indexed. Then do it with four, then five. You see where I’m going. Search engines do their best to follow redirects and understand them, but when you lead them into redirect chains, you create more work for them and they may stop crawling your site.
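If you want to prototype that experiment locally before putting it on a live domain, the chain structure is easy to fake. Here is a minimal sketch (the URL scheme `/page/<n>/hop/<k>` is invented for this demo) using Python’s standard-library HTTP server: each page sits behind `CHAIN_HOPS` 301s, and each final 200 page links to the next page in the row, exactly like the 5-page setup described above.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

CHAIN_HOPS = 2  # number of 3XX hops before the real page; raise this per test round

class ChainHandler(BaseHTTPRequestHandler):
    """Serve /page/<n>/hop/<k>: 301 while k < CHAIN_HOPS, else the 200 page."""
    def do_GET(self):
        parts = self.path.strip("/").split("/")  # ["page", n, "hop", k]
        n, k = int(parts[1]), int(parts[3])
        if k < CHAIN_HOPS:
            self.send_response(301)
            self.send_header("Location", f"/page/{n}/hop/{k + 1}")
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            nxt = f'<a href="/page/{n + 1}/hop/0">next</a>' if n < 5 else ""
            self.wfile.write(f"<h1>Page {n}</h1>{nxt}".encode())

    def log_message(self, fmt, *args):  # keep the demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), ChainHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# urllib follows the 301s automatically and lands on the final 200 page.
body = urllib.request.urlopen(f"http://127.0.0.1:{port}/page/1/hop/0").read()
server.shutdown()
```

For the real experiment you would of course use live redirects on a public domain and watch which pages a search engine actually indexes; this sketch only demonstrates the chain layout.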
Redirects are awesome for everything from preventing duplicate content to making sure an end user lands on a great page. The trick is to use them the right way and to clean them up, especially when you end up with a redirect chain. It’s part of regular site maintenance, just like cleaning up 4XX errors, and it can help your SEO.