I’m trying to set up a site audit for an e-commerce website with millions of pages. Currently the crawl is set to 20k pages with a depth of 0. No matter how I adjust the settings, the crawl only works backwards alphabetically, starting from pages beginning with the letter Z — so my 20k results consist almost entirely of Z pages.
I’d much prefer the list of pages to be more diverse and randomly picked, including categories, brands, etc. I can definitely aim at a higher number of pages (up to 1 million monthly), but I want to make sure the settings are right first.
With depth 0 it’s only crawling the sources. Maybe you have the sitemap selected and it’s pulling from there? There’s no randomization, so if you want a diverse set of pages you’ll probably need to pick them yourself and use the list you create as the crawl sources.
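If it helps, here's a rough sketch of one way to build that source list: pull the URLs out of your existing sitemap and take a random sample instead of letting the crawler walk it alphabetically. This is tool-agnostic Python — the example sitemap content, URL names, and sample size are all placeholders, not anything from your actual setup:

```python
import random
import xml.etree.ElementTree as ET

def sample_sitemap_urls(sitemap_xml, n, seed=None):
    """Extract every <loc> URL from a sitemap and return a random sample of up to n."""
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.findall(".//sm:loc", ns) if loc.text]
    rng = random.Random(seed)  # seed only so runs are reproducible
    return rng.sample(urls, min(n, len(urls)))

# Tiny hypothetical sitemap just to show the input shape.
EXAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/category/shoes</loc></url>
  <url><loc>https://example.com/brand/acme</loc></url>
  <url><loc>https://example.com/product/z-widget</loc></url>
</urlset>"""

print(sample_sitemap_urls(EXAMPLE, 2, seed=42))
```

You'd run this over your real sitemap file(s), write the sampled URLs to a text file, and paste that file in as the crawl sources. Sampling evenly across sitemap files (one per category/brand section, if your site splits them that way) would give you an even more balanced mix.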