White Hat SEO › Why does SEMrush pick up strange URLs/404s?

  • Why does SEMrush pick up strange URLs/404s?

    Posted by seohelper on April 9, 2020 at 5:00 pm

    I ran a site audit through SEMrush and it yielded a couple of very strange URLs that were 404s. These are pages that aren’t on the site, not indexed, and certainly not in the sitemap. I can probably just hide them, but I’m wondering why they’re turning up in the first place?

  • 7 Replies
  • dsarif70

    Guest
    April 9, 2020 at 5:21 pm

    Check if they’re linked from other websites.

  • OMGaQUAIL

    Guest
    April 9, 2020 at 5:27 pm

    Their bot doesn’t just make pages up. All it does is follow links from one page to another (assuming the crawl source is set to “Website”). So there’s most likely an internal link pointing to that 404 page. IIRC there’s a tab for “Incoming Internal Links” where you can see what pages are linking to the 404 page.

    Or simply call up their support team and they can show you. They’re very friendly and easy to get in touch with.
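As a minimal sketch of what this reply describes, here is how a crawler discovers 404s: it extracts the links on each page, keeps the ones pointing at the same host, and then requests them. The HTML and `example.com` URLs below are made up for illustration; a real audit tool does much more (queuing, politeness, redirects).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def internal_links(html, base_url):
    """Return absolute same-host URLs linked from the given page HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    host = urlparse(base_url).netloc
    urls = []
    for href in parser.links:
        absolute = urljoin(base_url, href)  # resolve relative hrefs
        if urlparse(absolute).netloc == host:  # keep internal links only
            urls.append(absolute)
    return urls

# Hypothetical page: the third link might point at a removed page.
page = '<a href="/about">About</a><a href="https://other.com/x">X</a><a href="/old-page">Old</a>'
print(internal_links(page, "https://example.com/"))
```

A crawler would then fetch each returned URL and record any that respond with 404, along with the page the link came from; that referring page is what you would see under "Incoming Internal Links".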

  • the_cnara

    Guest
    April 9, 2020 at 5:28 pm

It could be your web security software probing pages, or hack scripts looking for vulnerable URLs. What do the URLs look like? Real-looking or nonsense?

  • dsergeevna

    Guest
    April 9, 2020 at 6:49 pm

Hi, Daria from SEMrush here. Could you please send your email and project name via DM so we can take a look at it for you?

  • kurtteej

    Guest
    April 9, 2020 at 6:59 pm

I actually use Screaming Frog for looking into issues like this. Finding and fixing 404s is a good thing to do once or twice a year (depending on the size of the site).

  • sp_jamesdaniel

    Guest
    April 9, 2020 at 7:00 pm

I think you have to hide the URL from Search Console.

  • cedcommerce

    Guest
    April 13, 2020 at 1:56 pm

You need to check the robots.txt file again and disallow those pages there. Once done, crawl your web pages through the sitemap. If those pages still come up as errors, there is an option called "Respect robots.txt": enable that option and then crawl your website through the sitemap again.

Try this technique; it might resolve your issue.
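For reference, a Disallow rule in robots.txt looks like the following. The paths here are invented for illustration; substitute the actual 404 URLs from the audit report:

```
User-agent: *
Disallow: /old-page/
Disallow: /another-missing-page/
```

One caveat: robots.txt only stops compliant crawlers from fetching those URLs; it does not fix the 404 itself or remove the internal link that points to it. If a link on your own site is causing the 404, fixing or removing that link is the cleaner solution.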
