Bio | Annaleis is currently working as an SEO Consultant at Brainlabs. Her experience lies in digital marketing, with a specific focus on search engine optimization (SEO), in developing and executing SEO strategies for a variety of global accounts, in varying industries. She is building her specialty in luxury, e-commerce clients. Prior to this, Annaleis worked in digital marketing for a conference company that runs search & digital marketing conferences.
We’ve all been there – you’re looking over a client’s hreflang; everything looks alright, nothing alarming is jumping out. You then go across to check Google Search Console and see month on month hreflang errors increasing. Sorry?!
Hreflang is one of those areas of international SEO that looks deceptively straightforward (John Mueller tends to agree), but as soon as something doesn’t add up, it becomes a minefield of rabbit holes.
Luckily, I experienced this recently and was able to identify a few site-wide issues, occurring across multiple international sites, that were the main cause of the increasing “no return tag” hreflang errors.
Firstly, what is hreflang?
Hreflang is a specific markup code that tells Google and other search engines which geographical location and language a specific page is targeting.
When do you need it?
Hreflang is needed for any site that has international equivalents. For example, you have a website that is .co.uk, which is for people in the United Kingdom, who speak English.
However, you also want your website to be accessible and available to people in Italy, who speak Italian. You can’t serve them the English site, as that wouldn’t serve those users well. Instead, you’d want to create a site that is for Italy, with the content in Italian.
You would need to tell Google that whilst the content is the same, and it’s the same site, it is serving a different purpose and audience. This is when hreflang is used.
Why do you need it?
Hreflang tells search engines that a site’s international versions are not duplicate content; rather, each version is targeting a specific country & language.
Search engines use the hreflang information to decide which version of your website to display in their search results, depending on the country and language the user is searching in. For example, a user searching in Spain will want to see the Spanish version of the page, compared to a user searching in Germany, who would want to see a German version of the same page.
What are some best practices?
The hreflang attribute should be placed in only one location. This can be in either the on-page markup, the HTTP header, or the sitemap.
Set up an x-default hreflang attribute value to signal to Google’s algorithms that a page doesn’t target any specific language or locale, and is the default page when no other page is better suited.
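As a rough illustration of the on-page markup option, the sketch below builds the set of hreflang link elements for one page, including an x-default fallback. The function name `build_hreflang_tags` and the example.co.uk / example.it / example.com URLs are hypothetical and only there for illustration. The same full set would normally be placed on every page in the group, with each page also listing itself.

```python
from html import escape


def build_hreflang_tags(alternates, x_default=None):
    """Build <link> elements for a group of equivalent pages.

    alternates: dict mapping an hreflang value (e.g. "en-gb") to an absolute URL.
    x_default:  optional URL of the fallback page for all other users.
    """
    tags = []
    for code, url in sorted(alternates.items()):
        tags.append(
            f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        )
    if x_default:
        tags.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}" />'
        )
    return tags


# Hypothetical UK and Italian versions of the same category page.
alternates = {
    "en-gb": "https://www.example.co.uk/dresses/",
    "it-it": "https://www.example.it/abiti/",
}
tags = build_hreflang_tags(alternates, x_default="https://www.example.com/dresses/")
print("\n".join(tags))
```

Generating the tags from one shared mapping, rather than writing them by hand per page, is one way to keep every version pointing at every other version.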
What are “no return tag” errors?
The errors discussed in this piece come from Google Search Console, where they are labeled “no return tag”. The error appears when the original URL and the alternative URL don’t both have return tags.
A return tag is the hreflang annotation on Page B that points back to Page A, once Page A has hreflang set up and points to Page B as a language/country alternative.
If Page A points to Page B,
but Page B doesn’t point back to Page A, there is no returning hreflang tag and the error is reported.
Return tags are important to prove to search engines that you, as the webmaster/SEO, have control over both page variants and that they are correctly associated with one another.
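To make the Page A / Page B relationship concrete, here is a minimal sketch of a reciprocity check. It assumes you already have the hreflang annotations for each crawled page in a dictionary (for example, built from a crawl export); the function name `find_missing_return_tags` and the example URLs are hypothetical.

```python
def find_missing_return_tags(hreflang_map):
    """Find one-way hreflang links.

    hreflang_map: dict of page URL -> dict of hreflang value -> alternate URL.
    Returns a sorted list of (source, target) pairs where source points to
    target but target does not point back to source.
    """
    missing = []
    for source, alternates in hreflang_map.items():
        for target in alternates.values():
            if target == source:
                continue  # self-referencing annotation, nothing to check
            # A target that is absent from the map (e.g. it could not be
            # crawled) has no annotations, so it also counts as missing.
            back_links = hreflang_map.get(target, {}).values()
            if source not in back_links:
                missing.append((source, target))
    return sorted(missing)


# Hypothetical example: the UK page points to the Italian page,
# but the Italian page only references itself.
pages = {
    "https://www.example.co.uk/": {
        "en-gb": "https://www.example.co.uk/",
        "it-it": "https://www.example.it/",
    },
    "https://www.example.it/": {
        "it-it": "https://www.example.it/",
    },
}
missing = find_missing_return_tags(pages)
print(missing)
```

Note that a page which could not be read at all is treated the same as a page with no return tag, which is relevant to the blocked-crawler scenario later in this piece.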
Get the data
Step 1: Go to Google Search Console
Step 2: Scroll down the left-hand side menu
Step 3: Under “Legacy Tools and reports”, go to “International Targeting”
Source: Google Search Console
The screenshot below shows the dashboard within International Targeting.
We can see the list of errors, sorted from most to least, occurring across the various international sites.
Source: Google Search Console: International Targeting
We will use “en” no return tags to explore. Click through to one of your issues.
Source: Google Search Console: International Targeting
Weird and Wonderful Issues Found:
Issue 1: Are your redirects & canonicals causing no return tags?
Here is a scenario:
An e-commerce retail website has many pages, ranging from product pages to category/head pages. Because of the nature of the product offering, product pages often have many variations due to filters being applied (such as colors, sizes, etc.). Products are also known to go out of stock or become no longer available / sold.
With these two issues, canonicals can be used to help crawlers focus their attention on the important pages we want to rank.
For example, we would want the category page of “women’s bodycon dresses” to rank over a parameterized page that has filters applied, such as “women’s bodycon dresses, in blue, size M”.
Similarly, when products are sold out or are no longer offered / available, companies have various ways of dealing with this. A couple of ways are to either 301 redirect the page to a relevant main category page (if this product is never coming back), or keep the page and make it clear it is out of stock. You can then add a canonical tag to that page pointing to a more relevant page you want to rank.
Some useful resources for handling out of stock pages include:
For pages that have been 301 redirected, you should add the hreflang annotation to the final URL that shows the content, not to the page that is redirecting. If the hreflang annotation is on the redirecting page, this could flag a “no return tag” error in Google Search Console. This is because crawlers are redirected on to the new page, so they never read the hreflang on the redirecting one.
Action: Removing the hreflang annotations from the redirecting page and adding them to the final destination page will reduce the number of “no return tag” errors occurring in Google Search Console (GSC).
For pages that have been canonicalized, you should remove the hreflang annotation from any page that canonicalizes to another page. These pages all canonicalize to the primary / clean version of the product page, so search engine crawlers will abide by that directive and treat the canonical URL (which has the hreflang on it) as the page from which to follow instructions. If the canonicalizing page also has hreflang annotations on it, it will appear as a “no return tag” error, because crawlers are being directed not to read that page’s HTML.
Action: Removing the hreflang annotations from pages that canonicalize to a different URL (rather than to themselves) will reduce the number of “no return tag” errors occurring in GSC.
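The two actions above can be sketched as a simple audit over crawl data. This is only an illustration under assumptions: it expects one record per URL with hypothetical field names (`url`, `status`, `canonical`, `hreflang`), which you would need to map from whatever your crawler exports, and it compares canonical URLs by exact string match.

```python
REDIRECT_CODES = {301, 302, 307, 308}


def flag_conflicting_hreflang(pages):
    """Flag URLs whose hreflang annotations are likely to be ignored.

    pages: list of dicts with the (assumed) keys
        "url"       - the crawled URL
        "status"    - HTTP status code returned for that URL
        "canonical" - canonical URL declared by the page, or None
        "hreflang"  - dict of hreflang value -> alternate URL (may be empty)
    Returns a list of (url, reason) tuples in input order.
    """
    flagged = []
    for page in pages:
        if not page.get("hreflang"):
            continue  # nothing to clean up on this URL
        if page["status"] in REDIRECT_CODES:
            flagged.append((page["url"], "redirects"))
        elif page.get("canonical") and page["canonical"] != page["url"]:
            flagged.append((page["url"], "canonicalises elsewhere"))
    return flagged


# Hypothetical crawl records for a category page, a filtered variant,
# and a discontinued product that now redirects.
it_alt = {"it-it": "https://www.example.it/abiti/"}
crawl = [
    {"url": "https://www.example.co.uk/dresses/", "status": 200,
     "canonical": "https://www.example.co.uk/dresses/", "hreflang": it_alt},
    {"url": "https://www.example.co.uk/dresses/?colour=blue&size=m", "status": 200,
     "canonical": "https://www.example.co.uk/dresses/", "hreflang": it_alt},
    {"url": "https://www.example.co.uk/old-dress/", "status": 301,
     "canonical": None, "hreflang": it_alt},
]
flagged = flag_conflicting_hreflang(crawl)
for url, reason in flagged:
    print(reason, "->", url)
```

Only the self-canonical category page passes; the filtered variant and the redirecting product URL are the ones whose annotations would be removed or moved.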
Issue 2: Googlebot is being blocked
Is your site blocking Googlebot from crawling? We have found instances where large websites will block Googlebot (both desktop and mobile) from crawling.
You may be wondering, why would you ever block Googlebot?
Preserve crawl budget and ensure that Google isn’t crawling lots of low-value pages (which can happen with large, product-based websites)
Prevent pages from appearing in the search results, such as sensitive information
Protect server load to prioritize user navigation (especially common for larger sites)
What can happen when you block Googlebot? It can affect Googlebot’s ability to crawl and index a site’s content, which can lead to a loss of ranking in Google’s search results (as it can’t find/read the content to index) when used inappropriately.
Here is the scenario.
The same e-commerce retail website is experiencing increasing “no return tag” errors in Google Search Console. This was identified by going into Google Search Console and looking at the International Targeting data.
Once a country and language list were downloaded, we ran both the “originating URL” pages and the “alternative URL” pages in two separate Screaming Frog crawls to identify their status codes.
This process was repeated across multiple different countries & language websites.
Numerous pages were returning a 403 status code. However, when these pages were checked manually, they returned 200 status codes, and both the originating and alternate URLs had corresponding hreflang annotations.
Other pages returned 200 status codes in the crawl, with both the originating and alternate URLs also containing corresponding hreflang annotations. However, because these pages were pulled from the Google Search Console error report, there was still an issue occurring with their return tags.
The screenshot below shows that when we look at the page with the Googlebot Smartphone user-agent, it comes up as a 403. To do this, right-click on the page and select Inspect, click the three dots in the top right-hand corner, then select More Tools → Network Conditions.
Source: Client website: Inspect Element
Then, under User Agent, click the drop-down menu and select your crawler of choice, as seen above (we selected Googlebot Smartphone, as our client’s site was on mobile-first indexing).
Source: Client website: Inspect Element
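The same manual user-agent check can be roughed out in a script for a longer list of URLs. This is a sketch under assumptions, not the method used on the client site: the user-agent strings are placeholders to verify against Google’s published list, and the helper names `fetch_status` and `find_bot_blocked` are hypothetical. It also only reproduces blocks based on the user-agent header; a server that verifies Googlebot by IP address may respond differently to the real crawler.

```python
import urllib.error
import urllib.request

# Assumed user-agent strings; check Google's documentation for current values.
GOOGLEBOT_SMARTPHONE_UA = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 "
    "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def fetch_status(url, user_agent, timeout=10):
    """Return the final HTTP status code for url when sent with user_agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses are raised as HTTPError


def find_bot_blocked(urls, fetch=fetch_status):
    """Return URLs that answer 200 to a browser UA but 403 to the Googlebot UA."""
    blocked = []
    for url in urls:
        browser_status = fetch(url, BROWSER_UA)
        bot_status = fetch(url, GOOGLEBOT_SMARTPHONE_UA)
        if browser_status == 200 and bot_status == 403:
            blocked.append(url)
    return blocked


# Example call (makes real network requests, so it is left commented out):
# print(find_bot_blocked(["https://www.example.com/"]))
```

The `fetch` parameter is there so the comparison logic can be exercised with canned status codes, or swapped for results already collected by a crawler, without sending any requests.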
For pages that were returning a 403 status code from the crawl, we identified that these were not being read/found by Googlebot (desktop or mobile), because it was blocked from crawling them. The client confirmed that they were blocking Googlebot due to crawl budget issues. If the crawler can’t read the page, it can’t identify that there is hreflang set up on the page.
For pages that were returning a 200 status code from the crawl and had reciprocating hreflang annotations, we determined these weren’t being properly accessed or crawled by Googlebot either. The client confirmed this too, as they had crawl budget issues. If the pages had been properly crawled, they wouldn’t have been included in the list of pages with a “no return tag” error.
Action: Unblock Googlebot (both desktop and smartphone) from crawling the site, as long as these pages don’t need to be blocked for other reasons (in this case, they didn’t). The block was causing an increase in hreflang errors and had the potential to stop pages from being indexed.
If Googlebot can’t be unblocked for other reasons, increasing the crawling capacity could help it reach more pages and their hreflang, and therefore help decrease the number of errors.
Hreflang can be very useful and important when it comes to international websites. If set up properly, it can run smoothly and be left alone. However, as with most sites, pages are added and removed, site structures change, and the ranking importance of pages shifts, which is when hreflang can cause issues.
While these hreflang scenarios are rather specific, hopefully, they can shine some light on the idiosyncrasies that hreflang can have and help others investigate and solve similar issues occurring with their sites.
Here are some resources I’ve found useful for the wider explanation of hreflang and how it is implemented. These also include the ones I’ve linked to throughout the post. Thanks!