The Optimize crawler uses a dynamic discovery process: it actively explores your website, systematically following links from one page to another to find all pages on your domain.
The scan uses a breadth-first approach: it starts at the initial webpage and systematically explores all of the links on a page before moving on to the next level, or depth, of pages. The crawler can scan up to 10 pages of the same domain simultaneously, and in most cases it respects the depth priority of links, depending on the response and processing latency of each page. When a sitemap is found, all pages in the sitemap are treated as depth level 0 (the top level).
The crawler is capped at a depth of 100 links from the start page.
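The sketch below illustrates how such a breadth-first scan with a 10-page concurrency limit and a 100-level depth cap could work. It is only an illustration of the behavior described above, not the actual Optimize crawler implementation, and the start URL is a placeholder.

```python
# Illustrative breadth-first crawl, not the actual Optimize crawler.
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
import re

MAX_DEPTH = 100    # the crawl is capped at 100 links from the start page
MAX_WORKERS = 10   # up to 10 pages of the same domain are scanned at once

LINK_RE = re.compile(r'href="([^"#]+)"', re.IGNORECASE)

def fetch_links(url):
    """Download a page and return the absolute links found in it."""
    try:
        html = urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
    except Exception:
        return []
    return [urljoin(url, href) for href in LINK_RE.findall(html)]

def crawl(start_url):
    domain = urlparse(start_url).netloc
    seen = {start_url}
    frontier = [start_url]          # pages at the current depth level
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        for depth in range(MAX_DEPTH + 1):
            if not frontier:
                break
            # Scan every page at this depth before moving one level deeper.
            next_frontier = []
            for links in pool.map(fetch_links, frontier):
                for link in links:
                    if urlparse(link).netloc == domain and link not in seen:
                        seen.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return seen

# crawl("https://www.example.com/")   # hypothetical start page
```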
The crawler also inspects robots.txt files. It detects sitemaps declared in the robots.txt file and automatically scans all links listed in the sitemap's XML.
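As a sketch of that behavior, the snippet below reads robots.txt, picks up any Sitemap: declarations, and collects the &lt;loc&gt; URLs from the sitemap XML. The function names and the example domain are illustrative assumptions; the crawler's internals are not documented here.

```python
# Illustrative only: detect sitemaps in robots.txt and list the URLs they contain.
from urllib.request import urlopen
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemaps_from_robots(domain):
    """Return the sitemap URLs declared in the domain's robots.txt."""
    robots = urlopen(f"https://{domain}/robots.txt", timeout=10).read()
    text = robots.decode("utf-8", errors="ignore")
    return [line.split(":", 1)[1].strip()
            for line in text.splitlines()
            if line.lower().startswith("sitemap:")]

def urls_from_sitemap(sitemap_url):
    """Return every <loc> entry in a sitemap XML file."""
    xml = urlopen(sitemap_url, timeout=10).read()
    tree = ET.fromstring(xml)
    return [loc.text.strip() for loc in tree.iter(SITEMAP_NS + "loc")]

# for sitemap in sitemaps_from_robots("www.example.com"):   # hypothetical domain
#     print(urls_from_sitemap(sitemap))
```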
There are some links that the scan automatically ignores.
For more information, see the related user guide article.
As an alternative to a start page, add a sitemap. Sitemaps enhance the effectiveness of the domain scan, which is particularly helpful for large or complex websites, as well as those that contain a great deal of multimedia content.
A sitemap essentially serves as a roadmap for the crawler, showing the organized structure of your website. This makes the site easier to navigate and helps the scan discover URLs across it. On large websites, it can be difficult to ensure that every page is linked from at least one other page; a sitemap can guide the crawler to pages that might otherwise be overlooked.
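If your site does not have a sitemap yet, a minimal one is straightforward to produce. The sketch below writes a basic sitemap.xml in the standard sitemaps.org format; the page URLs are placeholders that you would replace with your own pages.

```python
# Sketch: write a minimal sitemap.xml in the standard sitemaps.org format.
pages = [
    "https://www.example.com/",          # placeholder URLs: replace with your own pages
    "https://www.example.com/about",
    "https://www.example.com/products",
]

entries = "\n".join(f"  <url>\n    <loc>{page}</loc>\n  </url>" for page in pages)
sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    f"{entries}\n"
    "</urlset>\n"
)

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write(sitemap)
```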