Orphaned pages are site pages that live outside the site structure because no other pages on the site link back to them. Not only does this mean that users can’t access these pages from anywhere else on the site, but your SEO can also be negatively impacted.
Leaving orphan pages unattended can mean that they get dropped from search engine indexes, take up the lion’s share of your crawl budget, and confuse Google into thinking you’re using black hat SEO techniques.
While there are a few ways to reach orphaned pages, such as through backlinks and redirects, it still isn’t nearly as simple as it should be. When you combine this with the fact that orphan pages can lead to penalties from Google and other search engines, prioritizing fixing them becomes a no-brainer.
Let’s look deeper at what orphaned pages are, where they come from, how they impact your SEO, and how to find and fix them.
Orphan pages are pages on your site that users can only get to if they know the URL. Essentially, there aren’t any links to these pages anywhere on your site.
Search engines have a hard time finding orphan pages because of the lack of internal links from other pages on your website. Crawlers can only discover orphan pages through external backlinks or the sitemap file, which is why these URLs so commonly fall through the cracks.
There are a number of reasons that orphan pages can occur on your site. This type of page can result from navigation changes, site redesigns, site migrations, testing, dev pages, out-of-stock products, or other actions if you don’t have a process in place to avoid orphaned pages.
In the above cases, orphan pages occur accidentally. In some instances, though, a site owner might deliberately create orphan pages. Typically, this happens when they don’t want a particular page to be a part of the user journey, such as with paid advertising landing pages.
Search engines use links to find new content and interpret the significance of the page. Automated programs known as crawlers are constantly searching the web for new or updated pages, mainly through following links from pages that are already cataloged by Google.
For instance, if you publish a new page on your site and don’t create a link to it from an existing page on your site, you might be creating an orphan page.
Unless there are backlinks to the page or you’ve put it in your sitemap, Google won’t be able to discover it. This means that your page won’t get indexed and is essentially non-existent to the crawlers.
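To make the discovery process concrete, here is a minimal sketch of link-following as a graph walk. It is a simplified model, not how Google's crawler is actually implemented, and the site map, page paths, and function name are all hypothetical. Any page that cannot be reached by following links from the starting pages is an orphan.

```python
from collections import deque


def discoverable_pages(link_graph, entry_points):
    """Return every page reachable by following links from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each key links to the pages in its list.
site_links = {
    "/": ["/blog", "/services"],
    "/blog": ["/blog/post-1"],
    "/services": [],
    "/blog/post-1": ["/"],
    "/old-landing-page": ["/"],  # links out, but nothing links to it
}

found = discoverable_pages(site_links, ["/"])
orphans = sorted(set(site_links) - found)
print(orphans)  # ['/old-landing-page']
```

Adding the orphan to the entry points, which is roughly what a sitemap entry or an external backlink does, makes it discoverable again even though no internal page links to it.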
Beyond that, your page won’t be able to receive PageRank, which is an all-important aspect of ranking highly on Google’s search engine results.
If your site has a lot of low-value orphaned pages, they can take up a big chunk of your crawl budget. On any given website, a search engine will only crawl a certain number of pages. Google determines that maximum by weighing crawl demand against the crawl rate limit.
This is a problem because Google might not be crawling your relevant and recently published content because so much of its attention is taken up by orphan pages. This can have a negative impact on the search rankings of your non-orphaned pages and ultimately hurt your overall SEO efforts.
Orphaned pages also create a poor user experience, as a user might not be able to find the information they’re looking for because the page simply can’t be discovered by search engine crawlers and indexed.
There are a number of ways that orphan pages can confuse Google in a way that could be detrimental to your SEO efforts.
Firstly, Google might interpret orphan pages it comes across as doorway pages. This can occur if the content on the orphan page matches content that is found somewhere else on your site and the page is missing a noindex meta tag.
Doorway pages, also known as gateway pages, are considered a black hat SEO technique by Google. This means that you can receive a penalty, either manually or from the Google algorithm, which can cause your rankings to drop or your pages to be removed from the index entirely.
Secondly, having orphaned pages means that Google might not be able to discover pages you want to rank on search engine result pages. Without crawlers being able to find your pages, you won’t be able to receive any organic traffic from search engines.
In order to fix the orphaned pages on your site, you’re going to need to find them first.
Let’s take a look at the steps you’ll need to take in order to locate the pages that aren’t integrated into your site via links.
You can use several different strategies and tools to find a list of the crawlable URLs on your site. While you might already subscribe to a third-party SEO tool like Ahrefs that can help you get a list of these URLs, you can also use Google Search Console.
If the URL belongs to one of your Search Console properties, you can use the URL Inspection tool to manually build a list of crawlable URLs. For URLs that aren’t within one of your Search Console properties, you can use the Rich Results Test tool to manually check which pages are crawlable.
If you have an Ahrefs Webmaster Tools account, you can run a Site Audit to find orphan pages. Another popular choice for finding all of your crawlable URLs is Screaming Frog.
You’ll also be able to find a list of crawled pages using tools like SEMrush, Raven Tools, and Moz Link Explorer.
Once you’ve crawled your site, you’ll want to export the URLs to a spreadsheet.
The two most common causes of orphaned pages are inconsistent handling of protocol and host variations (https/http and www/non-www) and inconsistent use of trailing slashes.
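To determine if you’re having problems with this first issue, go to your browser and type in all four variations of your homepage URL: the http and https versions, each with and without www (for example, http://example.com, https://example.com, http://www.example.com, and https://www.example.com).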
When you do this, you’re checking that all four variations redirect to the same URL. If you find that one of these versions of your homepage URL doesn’t redirect, it can point to additional problems on your site as a whole.
You’ll also want to check other URLs using the same variation on your site. This will help you determine if you have a site-wide problem in this regard.
Inconsistent use of trailing slashes is the other most common cause of orphaned pages. You’ll want to check several pages on your site both with and without the trailing slash to ensure that they consistently redirect to the same URL automatically.
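If you have more than a handful of pages to check, the variations can be generated rather than typed by hand. The following is a minimal sketch using only Python's standard library. The function names and the example.com domain are placeholders for illustration, and `final_url` makes a real network request, so it is defined here but not called.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def homepage_variants(domain):
    """Build the four protocol/host combinations for a domain."""
    bare = domain[4:] if domain.startswith("www.") else domain
    return [
        f"{scheme}://{host}/"
        for scheme in ("http", "https")
        for host in (bare, f"www.{bare}")
    ]


def slash_variants(url):
    """Return the URL without and with a trailing slash on its path."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    without = urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
    with_slash = urlunsplit((parts.scheme, parts.netloc, path + "/", parts.query, ""))
    return [without, with_slash]


def final_url(url, timeout=10):
    """Follow redirects and return the URL the request ends up at."""
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def redirects_agree(final_urls):
    """True when every requested URL resolved to the same final URL."""
    return len(set(final_urls.values())) <= 1


print(homepage_variants("example.com"))
print(slash_variants("https://www.example.com/blog/"))
```

To run the check against a live site, you could build a dictionary such as `{u: final_url(u) for u in homepage_variants("example.com")}` and pass it to `redirects_agree`. A result of `False` means at least one variation ends up somewhere different from the others.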
In the first step, we discussed using a crawler to find your crawlable URLs. However, these tools, by definition, won’t be very successful at finding orphaned pages.
The next step is to try and find a list of all of the URLs on your site. One of the best ways to do this is through your Google Analytics data. If Google Analytics has been installed on the pages in question and they have been visited at some point, you should be able to find a record of them.
To start your search for this list, you’ll want to go to the left sidebar and select Behavior > Site Content > All Pages.
There’s a good chance that any orphaned pages on your site have very low traffic because they are difficult for both users and crawlers to find. Click on the “Pageviews” column heading until the arrow points up, so that the URLs with the fewest pageviews are listed at the top.
You can also adjust the starting date using the “date range” feature in the top right of the screen. Then, you can set the date to a time before you installed Google Analytics and select “Apply.”
Google Analytics is only able to list up to 5,000 URLs at one time. To have it display as many URLs as possible, click the “Show rows” dropdown menu in the bottom right and select the highest available number of rows.
It can take some time for data to be fetched, but once all of the URLs have loaded, you can choose “Export” in the top right-hand side of the screen and export the URLs to a spreadsheet. You can then copy the URLs into your orphan page spreadsheet. Because Google Analytics lists page paths rather than full URLs, insert a new column and use the concat() formula to combine your root URL with each path, so the results are in the same format as your crawlable URLs.
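If you would rather not do this step in a spreadsheet, the same result can be sketched in a few lines of Python. This example assumes the export has been reduced to a plain CSV with “Page” and “Pageviews” columns. A real export may contain extra header rows and columns, so treat the column names, the sample data, and the function name as assumptions.

```python
import csv
import io
from urllib.parse import urljoin


def full_urls_by_pageviews(csv_text, root_url):
    """Sort exported rows by pageviews (lowest first) and build full URLs."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Strip thousands separators such as "1,200" before converting to int.
    rows.sort(key=lambda row: int(row["Pageviews"].replace(",", "")))
    # Prepend the root URL to each page path, like concat() in a spreadsheet.
    return [urljoin(root_url, row["Page"]) for row in rows]


# Hypothetical export data for illustration only.
sample_export = """Page,Pageviews
/,1200
/blog/post-1,85
/old-landing-page,2
"""

urls = full_urls_by_pageviews(sample_export, "https://www.example.com")
for url in urls:
    print(url)
```

The least-visited pages come out first, which puts the most likely orphan candidates at the top of the list.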
Now it’s time to identify your orphan URLs. With your crawlable URLs in one column and your Google Analytics URLs in another, use a match formula to check whether each Analytics URL also appears in the crawl list. To find the URLs that don’t have a match, you can sort your “Match” column alphabetically so all of the “#N/A” results show up in one place.
Once you have found your orphan URLs, you can copy and paste the URLs to a new spreadsheet to isolate the pages that need to be fixed.
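The comparison itself is a set difference: the URLs that appear in your analytics data but not in your crawl. Here is a minimal sketch of that idea in Python. The normalization rules (lowercasing the host and ignoring trailing slashes and fragments) are an assumption about what should count as the same URL, and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lowercase the scheme and host and drop any trailing slash or fragment."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def find_orphans(analytics_urls, crawled_urls):
    """Return URLs seen in analytics that the crawler never reached."""
    crawled = {normalize(url) for url in crawled_urls}
    visited = {normalize(url) for url in analytics_urls}
    return sorted(visited - crawled)


analytics = [
    "https://www.example.com/",
    "https://www.example.com/blog/",
    "https://www.example.com/old-landing-page",
]
crawl = [
    "https://www.example.com",
    "https://WWW.example.com/blog",
]

print(find_orphans(analytics, crawl))
# ['https://www.example.com/old-landing-page']
```

Normalizing before comparing matters because a URL with a trailing slash and the same URL without one would otherwise show up as a false orphan.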
One common mistake when fixing orphaned pages is simply adding internal links to all of them. While this might be an efficient way to deal with orphan pages, it isn’t necessarily the right thing to do.
This is because some of your orphan pages might be intentional, and others you might want to simply remove from your site.
Instead, you’ll want to look at each URL and ask yourself whether it was left unlinked by mistake. If the answer is no, ask yourself if it is a promotional landing page. If it is, you’ll want to noindex the page.
If it’s not, look to see if there is duplicate content or very similar content elsewhere on the site. If this is the case, you can merge it with the other page and remove it from your sitemap.
If you’ve answered no to these questions so far but feel that the page is valuable, you can add internal links elsewhere on your site. If it isn’t useful to visitors, simply delete the page.
For any other pages that you do want to be indexed and findable, you will want to add internal links on your site back to the page.
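The questions above amount to a short decision sequence, which can be written down as a small function if you are triaging a long list. This is only a sketch of the checks described in this section. The class, field names, and action labels are hypothetical, and the answers for each page still need to come from a person reviewing it.

```python
from dataclasses import dataclass


@dataclass
class OrphanPage:
    url: str
    is_promo_landing_page: bool = False
    has_duplicate_elsewhere: bool = False
    is_valuable: bool = False


def triage(page):
    """Walk through the review questions in order and return an action."""
    if page.is_promo_landing_page:
        return "noindex"
    if page.has_duplicate_elsewhere:
        return "merge and remove from sitemap"
    if page.is_valuable:
        return "add internal links"
    return "delete"


# Hypothetical pages for illustration only.
pages = [
    OrphanPage("/spring-sale", is_promo_landing_page=True),
    OrphanPage("/services-old", has_duplicate_elsewhere=True),
    OrphanPage("/guide", is_valuable=True),
    OrphanPage("/test-page"),
]

for page in pages:
    print(page.url, "->", triage(page))
```

The order of the checks matters: a promotional landing page is noindexed even if it is valuable, because it was deliberately kept out of the site structure.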
Though it might have seemed like a lot of work to create your list of orphaned pages, fixing them isn’t too difficult. You can remove, redirect, or link to the page elsewhere to remove its orphan status.
Finding and fixing orphaned pages on your site can be a serious time commitment. Beyond that, if you don’t already have accounts for some of the most useful tools for the purpose, it can also be an expensive endeavor.
If you want to make sure that your site is in tip-top shape, one of the most cost-effective things you can do is hire an SEO agency. You won’t have to spend hours researching how to audit your site or playing around with spreadsheets. Instead, you can rest easy knowing that your website is in good hands.
At Blue Pig Media, we offer a full suite of digital marketing services to ensure that your site receives the traffic, leads, and online visibility that it deserves. If that all sounds good to you, you can get started today.