Exclude paths from scanning

There are situations where you may need to configure Invicti Platform to exclude a portion of a web application from crawling and scanning. This might be required if the web application being scanned is too large, or if scanning part of the site might trigger unwanted actions such as submitting data. For more information on crawling options, refer to the crawling options overview document.

This document explains how you can specify paths for exclusion based on regular expressions. Excluded paths are added to individual targets on the Scan configure target page.

Tip

If your target URL protocol is redirected (typically from HTTP to HTTPS), any excluded path directives won't apply. If your target employs protocol redirection, make sure that the target is specified with the final protocol to ensure that any excluded paths you specify are indeed excluded.
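
As a rough pre-check for the situation described in this tip, the sketch below compares the scheme of the configured target with the scheme of the URL you end up on after redirects. The function name `protocol_changed` and the sample URLs are hypothetical. The final URL is assumed to be something you have already observed yourself, for example in a browser address bar. The code makes no network requests and does not use any Invicti API.

```python
from urllib.parse import urlsplit


def protocol_changed(target_url: str, final_url: str) -> bool:
    """Return True if the two URLs use different schemes (for example http vs https)."""
    return urlsplit(target_url).scheme != urlsplit(final_url).scheme


# Hypothetical values: final_url is where the site lands after any redirects.
print(protocol_changed("http://www.example.com", "https://www.example.com/"))   # True
print(protocol_changed("https://www.example.com", "https://www.example.com/"))  # False
```

If the function returns True, the target is probably better specified with the scheme of the final URL so that excluded paths take effect.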

Add an excluded path

The Excluded paths option allows you to specify a list of directories and files to be excluded from crawling and scanning. Multiple paths can be excluded for each target.

  1. Select Inventory > Targets from the left-side menu.
  2. Click the three-dot menu (⋮) > Edit target next to the selected target to open its settings page.
  3. Click Crawling Options from the settings menu and navigate to the Excluded paths section.
  4. In the Excluded Paths field, enter a RegEx for the path you want to exclude from scanning.
  5. Click Save target configuration when you are finished.

[Image: Excluded Paths option in Crawling Options]

Excluded paths formatting requirements

Excluded paths need to be configured using regular expressions (RegEx). This is useful in situations where you want to exclude a URL pattern rather than a single URL. Invicti Platform accepts the widely used Perl Compatible Regular Expressions (PCRE) syntax for defining RegEx.

Write each exclusion as a forward slash (/) followed by the path as it appears after the target URL. Excluding a path also excludes all of its subdirectories: because the excluded directory is never crawled, the scanner cannot discover anything below it.

Example

  • Target URL = www.example.com
  • Directory to exclude = /dir2 which is in directory /dir1 (www.example.com/dir1/dir2)
  • Excluded path = /dir1/dir2, so /dir2 is ignored by the scanner. Note that /dir1 and everything in it except /dir2 are still scanned.
  • RegEx = /dir1/dir2(/.*)?$
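
The example above can be tried locally with a short sketch. It uses Python's `re` module, which is not PCRE but behaves the same way for this pattern. Two things here are assumptions rather than documented Invicti behavior: the helper name `is_excluded`, and the use of an unanchored search against the path that follows the target URL.

```python
import re

# Pattern taken from the example: /dir1/dir2 itself plus anything below it.
EXCLUDED = re.compile(r"/dir1/dir2(/.*)?$")


def is_excluded(path: str) -> bool:
    """Assumption: the pattern is searched for anywhere in the path."""
    return EXCLUDED.search(path) is not None


for path in ["/dir1", "/dir1/page.html", "/dir1/dir2",
             "/dir1/dir2/sub/page.html", "/dir1/dir2x"]:
    print(path, is_excluded(path))
# /dir1 False
# /dir1/page.html False
# /dir1/dir2 True
# /dir1/dir2/sub/page.html True
# /dir1/dir2x False
```

Under the search assumption, the pattern would also match a deeper path such as /other/dir1/dir2. Adding a leading ^ is one way to restrict the match to the top level.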

Tip

Before adding an excluded path, you may wish to test your RegEx in a tool such as Regex101.
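
As a complement to an online tester, a quick local syntax check can catch typos such as an unbalanced parenthesis before you paste a pattern into the Excluded Paths field. This is a minimal sketch using Python's `re` module. The function name `syntax_error` is hypothetical. Python's regex dialect differs from PCRE in some advanced features, so a clean result here is only a hint, not a guarantee that Invicti Platform accepts the pattern.

```python
import re
from typing import Optional


def syntax_error(pattern: str) -> Optional[str]:
    """Return None if Python's re can compile the pattern, else the error message."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


print(syntax_error(r"/dir1/dir2(/.*)?$"))  # None: the pattern compiles
print(syntax_error(r"/dir1/dir2(/.*"))     # an error message about the missing ")"
```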

The following table provides examples of regular expressions you can configure in Invicti Platform to exclude URL patterns.

| Description | Regular expression | Matches (excludes path) | Doesn't match (doesn't exclude path) |
| --- | --- | --- | --- |
| * Wildcard | /dir.*/otherdir | /dir/otherdir, /dir1/otherdir, /dira/otherdir, /dir123/dir4/otherdir | /dir, /dir/dir1, /dir/dira, /dir/dir123 |
| ? Wildcard | /dir.?/otherdir | /dir/otherdir, /dir1/otherdir, /dira/otherdir | /dir, /dir/dir1, /dir/dira, /dir/dir123, /dir123/otherdir |
| Digit Wildcard | /dir[\d]+/otherdir | /dir1/otherdir, /dir01/otherdir, /dir9999/otherdir | /dir/otherdir, /dira/otherdir, /dir1a/otherdir |
| Exclude URLs more than 1 level deep | (/.+){2,} | /dir/dir1, /dir/dir1/dira, /dir/file.html, /dir/file.html?q=value | /dir, /file.html, /file.html?q=value |
| Exclude URLs more than 2 levels deep | (/.+){3,} | /dir/dir1/dira, /dir/dir1/file.html?q=value | /dir, /dir/dir1, /dir/file.html, /dir/file.html?q=value |
| Exclude specific directories | /dir(/.*)?$ | /dir, /dir1/dir | /dir1, /dira/dirb |
| Exclude all URLs (useful when supplying Invicti with a list of URLs to scan) | ^/.*$ | /dir, /dir/file.html, /dir/file.html?q=value | (none) |
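
The rows of the table can be checked mechanically. The sketch below transcribes each pattern with its matching and non-matching paths and reports any disagreement. It assumes that a pattern is applied as an unanchored search over the path, which is inferred from rows such as /dir(/.*)?$ matching /dir1/dir and is not confirmed Invicti behavior. The names `EXAMPLES`, `is_excluded`, and `check_examples` are hypothetical, and Python's `re` module stands in for PCRE, which is equivalent for these particular patterns.

```python
import re

# (pattern, paths the table says are excluded, paths the table says are not excluded)
EXAMPLES = [
    (r"/dir.*/otherdir",
     ["/dir/otherdir", "/dir1/otherdir", "/dira/otherdir", "/dir123/dir4/otherdir"],
     ["/dir", "/dir/dir1", "/dir/dira", "/dir/dir123"]),
    (r"/dir.?/otherdir",
     ["/dir/otherdir", "/dir1/otherdir", "/dira/otherdir"],
     ["/dir", "/dir/dir1", "/dir/dira", "/dir/dir123", "/dir123/otherdir"]),
    (r"/dir[\d]+/otherdir",
     ["/dir1/otherdir", "/dir01/otherdir", "/dir9999/otherdir"],
     ["/dir/otherdir", "/dira/otherdir", "/dir1a/otherdir"]),
    (r"(/.+){2,}",
     ["/dir/dir1", "/dir/dir1/dira", "/dir/file.html", "/dir/file.html?q=value"],
     ["/dir", "/file.html", "/file.html?q=value"]),
    (r"(/.+){3,}",
     ["/dir/dir1/dira", "/dir/dir1/file.html?q=value"],
     ["/dir", "/dir/dir1", "/dir/file.html", "/dir/file.html?q=value"]),
    (r"/dir(/.*)?$", ["/dir", "/dir1/dir"], ["/dir1", "/dira/dirb"]),
    (r"^/.*$", ["/dir", "/dir/file.html", "/dir/file.html?q=value"], []),
]


def is_excluded(pattern: str, path: str) -> bool:
    """Assumption: the pattern is searched for anywhere in the path."""
    return re.search(pattern, path) is not None


def check_examples(examples=EXAMPLES):
    """Return (pattern, path, expected) tuples that disagree with the table."""
    failures = []
    for pattern, should_match, should_not_match in examples:
        for path in should_match:
            if not is_excluded(pattern, path):
                failures.append((pattern, path, True))
        for path in should_not_match:
            if is_excluded(pattern, path):
                failures.append((pattern, path, False))
    return failures


print(check_examples())  # [] means every row behaves as the table describes
```

Adding your own pattern and sample paths to `EXAMPLES` gives a quick regression check before you change a target's configuration.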
