Target's sitemap changes between scans
Two or more scans of the same target may show different paths or URLs in the Sitemap section of their scan reports, even when the scans use exactly the same configuration.
Root cause
In Invicti Enterprise, the Sitemap section of a scan report is intended to show, efficiently, the paths where vulnerabilities were found. It is a common mistake to treat the Sitemap feature as a tool for visualizing the nested structure of a target.
That is not its intended use: apart from paths where vulnerabilities were found, the paths listed for a target can differ from one scan to the next.
The primary reason is that, before listing a path and its subpaths on the sitemap of the scan report, the sitemap visualizer in Invicti Enterprise takes into account both the responses the scan received for that path and the order in which the scan sent its requests.
Case 1: Scanner receives an early 404 response
Consider the example where the scanner receives a 404 (not found) response for https://example.com/pages before attempting to visit/crawl subpaths under https://example.com/pages. In this scenario, the path https://example.com/pages and paths beneath it won't be listed on the sitemap even though they may be crawled/scanned during the scan.
Case 2: Scanner receives an early 200 response
In the scenario where the scanner receives a 200 (OK) response on https://example.com/pages/pages1 before visiting its parent path https://example.com/pages, which returns a 404 (not found) response, the subpath still appears on the sitemap.
The key factor is the order of requests sent by the scanner. When the scanner requests a subpath before its parent path, the sitemap on the scan report may look different from that of a scan where the scanner requested parent paths first. The request order for any given path can differ from one scan to the next; the scanner is implemented this way for performance reasons. It continuously adds discovered links (paths) to its cache of links to visit and crawl, and during a scan it may send requests to a subpath before its parent path, or vice versa.
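The effect of request order in Case 1 and Case 2 can be illustrated with a toy model. This is a minimal sketch, not Invicti's implementation: the function name build_sitemap, the blocked-prefix rule, and the set of excluded status codes used here are assumptions made for illustration. The model also ignores the rule that paths with vulnerabilities are always listed.

```python
from urllib.parse import urlsplit

# Assumption: status codes treated as "excluded" in this toy model.
EXCLUDED_STATUSES = {401, 403, 404, 500, 503}


def build_sitemap(responses):
    """Return the URLs a toy sitemap would list.

    responses: iterable of (url, status) pairs in the order the
    responses were received. Hypothetical model, not Invicti's code.
    """
    blocked_prefixes = []  # paths that already returned an excluded status
    listed = []
    for url, status in responses:
        path = urlsplit(url).path.rstrip("/")
        if status in EXCLUDED_STATUSES:
            # Remember the path so later responses beneath it are hidden.
            blocked_prefixes.append(path)
            continue
        hidden = any(path.startswith(p + "/") for p in blocked_prefixes)
        if not hidden:
            listed.append(url)
    return listed


parent = ("https://example.com/pages", 404)
child = ("https://example.com/pages/pages1", 200)

print(build_sitemap([parent, child]))  # Case 1: parent answered first
print(build_sitemap([child, parent]))  # Case 2: subpath answered first
```

In this model, the first call returns an empty list and the second returns the subpath, even though both runs see exactly the same two responses.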
No scan coverage impact
This behaviour does not affect scan coverage. The one exception to the scenarios described above is a path with a vulnerability: whether it is a parent path or a subpath, it is always listed on the sitemap, because showing such paths is the intended use of the sitemap feature.
Differences in the sitemap between scans can create confusion about scan coverage.
Because the sitemap is order-dependent, compare the crawled URL lists of the scans, not their sitemaps, when you want to compare scan coverage.
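A comparison of two crawled URL lists can be sketched as a set difference. The example below assumes each list is already available as a sequence of URL strings, one per entry; how the lists are exported from a scan is not covered here, and the function name compare_crawled_urls and the sample URLs are hypothetical.

```python
def compare_crawled_urls(scan_a, scan_b):
    """Compare two crawled URL lists and report where they differ.

    scan_a, scan_b: iterables of URL strings (hypothetical input format).
    Blank entries and surrounding whitespace are ignored.
    """
    urls_a = {url.strip() for url in scan_a if url.strip()}
    urls_b = {url.strip() for url in scan_b if url.strip()}
    return {
        "only_in_a": sorted(urls_a - urls_b),
        "only_in_b": sorted(urls_b - urls_a),
        "in_both": sorted(urls_a & urls_b),
    }


# Illustrative data only.
first_scan = [
    "https://example.com/pages",
    "https://example.com/pages/pages1",
]
second_scan = [
    "https://example.com/pages/pages1",
    "https://example.com/pages",
    "https://example.com/pages/pages2",
]

result = compare_crawled_urls(first_scan, second_scan)
print(result["only_in_a"])
print(result["only_in_b"])
```

Using sets makes the comparison independent of the order in which URLs were crawled, which is the property the sitemap lacks.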
Invicti Standard behaviour
Invicti Standard does not follow the rules described above. It always includes every path it was able to visit, regardless of the order of the requests it sent and the responses it received.
For this reason, a scan imported from Invicti Enterprise into Invicti Standard may show a different sitemap from the one listed in Invicti Enterprise.
HTTP response codes in the sitemap
The HTTP response codes returned by endpoints determine whether they appear in the sitemap.
Endpoints excluded from sitemap:
Endpoints returning the following HTTP status codes are not included: 401, 403, 404, 500, 503
Endpoints included in sitemap:
Endpoints returning other HTTP status codes appear in the sitemap, such as successful responses (200, 201), redirects (302), and others.
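The two lists above can be expressed as a small classification function. This is a sketch under stated assumptions, not Invicti's logic: the name listed_in_sitemap is hypothetical, and treating the whole 2xx and 3xx ranges as included is a generalization from the examples 200, 201, and 302. Codes that appear in neither list are reported as unknown.

```python
from typing import Optional

# Status codes the article lists as excluded from the sitemap.
EXCLUDED_STATUSES = frozenset({401, 403, 404, 500, 503})


def listed_in_sitemap(status: int) -> Optional[bool]:
    """Classify a status code against the article's two lists.

    Returns False for the listed excluded codes, True for 2xx and 3xx
    codes (assumption, generalized from 200, 201, and 302), and None
    when the article does not say either way.
    """
    if status in EXCLUDED_STATUSES:
        return False
    if 200 <= status < 400:
        return True
    return None


for code in (200, 302, 404, 400):
    print(code, listed_in_sitemap(code))
```

Returning None for unlisted codes keeps the sketch from claiming more than the lists state.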
Need help?
The Invicti Support team is ready to provide you with technical help. Go to the Help Center.