If you've set up a scan in AppCheck then you will be familiar with the Targets box; it's usually the first thing you fill in when setting up a scan.
What you may not be aware of is that there is another place to specify targets: Seeded Targets, which can be found under Web Application Scanner Settings -> Advanced Settings.
This article explains a little about how the types of targets define the behaviour of your web application scan.
- How Targets Define a Scan
- Seeded Targets
- Application Root
- What Should You Do If You Want To Specify Several Paths Within One Application?
- What Should You Do If You Do Not Want To Scan Everything Under / ?
- What Should You Do If You Want To Explicitly Exclude A Specific URL From A Scan?
Your organisation's AppCheck account has an associated list of application URLs and infrastructure addresses, known as your account scope. A target can only be added to a scan if it is within your account scope. For example, take the following account scope:
https://example.com is in scope and can be scanned, but https://www.example.com is out of scope and cannot be scanned.
220.127.116.11 is within scope and can be scanned, but 18.104.22.168 is out of scope and cannot be scanned.
Contact AppCheck Support to add new items to your account scope. The number of items that can be added is limited by your license.
Once added, items cannot be removed from scope until your account review. If you require an exception to this rule, contact your account manager.
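The scope rule above can be sketched as a small check. This is only an illustration, not AppCheck's implementation: the scope values are placeholders (the 192.0.2.0/28 range is a documentation range chosen for the example), and exact hostname matching is an assumption based on the example.com versus www.example.com example.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical account scope; these values are placeholders, not taken
# from a real AppCheck account.
SCOPE_HOSTS = {"example.com"}
SCOPE_NETWORKS = [ipaddress.ip_network("192.0.2.0/28")]


def in_scope(target: str) -> bool:
    """Return True if a URL or IP address matches an item in the scope."""
    host = urlsplit(target).hostname if "://" in target else target
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP address. Assumption: hostnames must match exactly, so
        # www.example.com is not covered by a scope entry for example.com.
        return host.lower() in SCOPE_HOSTS
    return any(address in network for network in SCOPE_NETWORKS)


print(in_scope("https://example.com"))      # in scope
print(in_scope("https://www.example.com"))  # out of scope
```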
How Targets Define a Scan
When a scan is launched, the first stage is a "crawl" of each application specified as a target.
Starting with the application's root (which is usually /), the scanner looks for hyperlinks in the returned HTML content and for URLs/paths mentioned in source code (including in scripts, frames, etc.).
The scanner then follows these links and repeats the process recursively until the crawler has a complete map of the application. It will also make requests to additional paths that commonly exist, even if it doesn't see links to them, such as /admin (or /wp-admin, to detect WordPress sites).
The resulting map of the application (known as the Mapped Attack Surface) is then passed on to the next stage of the process where active scanning takes place.
Note: For more information on the Mapped Attack Surface, see How can I see a list of which paths or URLs AppCheck has scanned (crawled and attacked) for my web application?
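The crawl stage described above can be sketched as a breadth-first walk over a link graph. This is a simplified illustration under stated assumptions, not the scanner's actual algorithm: SITE_LINKS is a made-up in-memory application standing in for real HTTP requests, and COMMON_PATHS is an illustrative list.

```python
from collections import deque

# Hypothetical application: each path maps to the paths its page links to.
SITE_LINKS = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/1"],
    "/products/1": [],
    "/admin": ["/admin/users"],  # exists, but nothing links to it
    "/admin/users": [],
    "/orphan": [],               # exists, unlinked, and not a common path
}

# Paths probed even when no link points at them (illustrative list only).
COMMON_PATHS = ["/admin", "/wp-admin"]


def crawl(root: str = "/") -> list[str]:
    """Breadth-first crawl returning the mapped paths in discovery order."""
    queue = deque([root, *COMMON_PATHS])
    seen = set(queue)
    mapped: list[str] = []
    while queue:
        path = queue.popleft()
        if path not in SITE_LINKS:
            continue  # the request failed, e.g. a 404 for /wp-admin
        mapped.append(path)
        for link in SITE_LINKS[path]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return mapped


print(crawl())
```

In this sketch /admin/users is mapped because /admin is probed as a common path, while /orphan is never discovered. That second case is the kind of page a seeded target is meant to cover.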
Seeded Targets
Seeded targets are URLs that are explicitly added to the map (the scanner's list of potential attack points within an application) before crawling, to ensure that the given paths and query strings (and anything else found by continuing to crawl from them) are included in the scan. Often they would be found anyway during the crawl, but adding them as Seeded Targets just makes sure they're not missed for any reason.
This is generally only needed when a given URL can't be found by crawling from the application root (i.e. there's no link to it from the rest of the application), or when a URL may be incorrectly removed from the map during de-duplication.
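To illustrate why seeding helps, here is a minimal sketch in which the map is seeded before crawling. The link graph and the function name build_map are hypothetical, and the real scanner's behaviour may differ in detail.

```python
# Hypothetical link graph: /reports/export exists but nothing links to it.
SITE_LINKS = {
    "/": ["/login"],
    "/login": [],
    "/reports/export": ["/reports/export/csv"],
    "/reports/export/csv": [],
}


def build_map(root: str, seeded_targets: tuple[str, ...] = ()) -> set[str]:
    """Seed the map first, then crawl outwards from every entry in it."""
    to_visit = [root, *seeded_targets]
    mapped: set[str] = set()
    while to_visit:
        path = to_visit.pop()
        if path in mapped or path not in SITE_LINKS:
            continue
        mapped.add(path)
        to_visit.extend(SITE_LINKS[path])
    return mapped


print(sorted(build_map("/")))
print(sorted(build_map("/", ("/reports/export",))))
```

Without the seed, only / and /login are mapped. With it, the unlinked page and everything reachable from it are included as well.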
Application Root
The crucial point to be aware of is that the root of an application is assumed to be / even if a path is specified in the URL in the Targets box.
For example, if you add the following URL to your Targets box:
https://example.com/login
then the scan target is treated as https://example.com/ while /login is treated as a seeded target (i.e. it is added to the map of the application, and we crawl from there too). This is useful in situations where the scan is configured with the URL of the application's login page as the target, but where the intention is to scan the entire application (not just the login page).
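This means if you add two URLs to the scan's Targets box:
https://example.com/
https://example.com/login
then what you actually have is two identical scan targets:
https://example.com/
https://example.com/
and one seeded target:
https://example.com/login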
meaning you scan the whole application (https://example.com/) twice. This should be avoided as it will result in twice as many requests to your application, increasing load on your servers and on the scanner, and doubling the time taken for the scan. The correct solution would be to add only https://example.com/login to the scan's Targets box.
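The way a Targets box entry is split into a scan target and a seeded target can be sketched as follows. The function names split_target and plan_scan are hypothetical, and the sketch only models the behaviour described in this article.

```python
from typing import Optional
from urllib.parse import urlsplit


def split_target(url: str) -> tuple[str, Optional[str]]:
    """Return (scan target at the root, seeded target or None)."""
    parts = urlsplit(url)
    root = f"{parts.scheme}://{parts.netloc}/"
    has_path = parts.path not in ("", "/") or bool(parts.query)
    return root, (url if has_path else None)


def plan_scan(targets: list[str]) -> tuple[list[str], list[str]]:
    """Expand Targets box entries into scan targets and seeded targets."""
    scan_targets: list[str] = []
    seeded: list[str] = []
    for url in targets:
        root, seed = split_target(url)
        scan_targets.append(root)  # duplicates are kept, as described above
        if seed:
            seeded.append(seed)
    return scan_targets, seeded


# Two entries from the same application yield the same root twice.
print(plan_scan(["https://example.com/", "https://example.com/login"]))
# A single entry with a path yields one root and one seeded target.
print(plan_scan(["https://example.com/login"]))
```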
What Should You Do If You Want To Specify Several Paths Within One Application?
Add the root of the application to the Targets box and add the extra URLs to the Seeded Targets box.
For example, add
https://example.com/
as the target, and add
https://example.com/login
https://example.com/my-application
as seeded targets. This means we will scan all of https://example.com/ and we will be sure to include /login and /my-application.
What Should You Do If You Do Not Want To Scan Everything Under / ?
If you want to specify a root other than / then you can do so using a pipe character (|) after the URL in the Targets box, e.g.:
https://example.com/my-application/|
In this case the scan target is treated as
https://example.com/my-application/
with /my-application/ as the root, and nothing outside that path, such as https://example.com/ or https://example.com/login, will be scanned.
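A rough sketch of how the pipe character changes the application root is shown below. It assumes the pipe is the last character of the entry, and the helper names parse_target and under_root are hypothetical.

```python
from urllib.parse import urlsplit


def parse_target(entry: str) -> str:
    """Return the application root implied by one Targets box entry."""
    if entry.endswith("|"):
        # Assumption: a trailing pipe makes the URL's own path the root.
        return entry[:-1]
    parts = urlsplit(entry)
    return f"{parts.scheme}://{parts.netloc}/"


def under_root(url: str, root: str) -> bool:
    """True if url lies at or below the application root.

    A plain prefix test like this relies on the root ending with a slash.
    """
    return url.startswith(root)


root = parse_target("https://example.com/my-application/|")
print(root)
print(under_root("https://example.com/my-application/page", root))
print(under_root("https://example.com/login", root))
```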
What Should You Do If You Want To Explicitly Exclude A Specific URL From A Scan?
Add the URL you wish to exclude to the scan's Denied Targets, which can be found just below the Targets box near the top of the scan configuration page.
Any URL which begins with a denied URL will be excluded from the scan. For example, if Denied Targets contains https://example.com/do-not-scan, then https://example.com/do-not-scan/secret-child-page will also not be scanned, and so on.
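The "begins with" rule can be sketched as a plain prefix check. The function name is_denied is hypothetical, and this models only the rule as stated in this article.

```python
def is_denied(url: str, denied_targets: list[str]) -> bool:
    """True if url begins with any entry in the Denied Targets list."""
    return any(url.startswith(denied) for denied in denied_targets)


denied = ["https://example.com/do-not-scan"]
print(is_denied("https://example.com/do-not-scan/secret-child-page", denied))
print(is_denied("https://example.com/login", denied))
```

Taken literally, a prefix rule would also exclude a sibling path such as https://example.com/do-not-scan-2. This article does not say whether the scanner behaves that way, so it is worth checking a scan's results if you rely on such paths being scanned.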