Apache HTTPD-based Webservers - Extended Discovery Pattern
Overview
The ApacheExtendedDiscovery pattern creates SoftwareComponent nodes for the websites defined in the Apache configuration file. The pattern therefore triggers on any type of Apache Webserver that has the config_file attribute set.
Configuration options
There are several configuration options available for this Extended Discovery:
- read_includes := true; - enables parsing of configuration files that are included by the main configuration file.
- include_depth := 2; - maximum depth (number of nested levels) of recursively included configuration files to read (up to 5).
Extracting website details
The pattern reads the httpd.conf file, then extracts the Default Website, the Default SSL VirtualHost, and the VirtualHost zones of all websites. From each of these, it extracts the following attributes by using regular expressions:
| Website attribute | Regex | Parsed node |
|---|---|---|
| website name | regex '(?m)^\s*ServerName\s+(\S+)' | website's VirtualHost zone |
| website aliases | regex '(?m)^\s*ServerAlias\s+([a-zA-Z0-9\-\. ]+)' | website's VirtualHost zone |
| website listening TCP sockets | regex '(?m)^\s*<VirtualHost\s+([^>]+?)\s*>' | website's VirtualHost zone |
| website listening IPs (IPv4) | regex '^(\d+(?:\.\d+){3})' | website's listening TCP socket |
| website listening IPs (IPv6) | regex '^\[([^]]+)' | website's listening TCP socket |
| website listening ports | regex ':(\d+)$' | website's listening TCP socket |
| website name | substring 'VirtualHost _default_: | website name for Default SSL VirtualHost |
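To illustrate how the regular expressions in the table fit together, here is a minimal Python sketch that applies them to one VirtualHost zone. It is not the pattern's actual TPL code: the function name parse_zone, the returned dictionary layout, and the sample host names and addresses are all hypothetical, and splitting the socket list on whitespace is an assumption of this sketch.

```python
import re

# Regexes copied from the table above.
SERVER_NAME = re.compile(r'(?m)^\s*ServerName\s+(\S+)')
SERVER_ALIAS = re.compile(r'(?m)^\s*ServerAlias\s+([a-zA-Z0-9\-\. ]+)')
VHOST_SOCKETS = re.compile(r'(?m)^\s*<VirtualHost\s+([^>]+?)\s*>')
IPV4 = re.compile(r'^(\d+(?:\.\d+){3})')
IPV6 = re.compile(r'^\[([^]]+)')
PORT = re.compile(r':(\d+)$')


def parse_zone(zone_text):
    """Extract name, aliases, and listening sockets from one VirtualHost zone.

    Hypothetical helper: the real pattern is written in TPL and may
    combine these regexes differently.
    """
    name_match = SERVER_NAME.search(zone_text)

    # A zone may contain several ServerAlias lines, each listing several names.
    aliases = []
    for match in SERVER_ALIAS.finditer(zone_text):
        aliases.extend(match.group(1).split())

    # The <VirtualHost ...> header lists one or more sockets
    # (assumed here to be separated by whitespace).
    sockets = []
    header = VHOST_SOCKETS.search(zone_text)
    if header:
        for socket_text in header.group(1).split():
            ip_match = IPV4.search(socket_text) or IPV6.search(socket_text)
            port_match = PORT.search(socket_text)
            sockets.append({
                "ip": ip_match.group(1) if ip_match else None,
                "port": port_match.group(1) if port_match else None,
            })

    return {
        "name": name_match.group(1) if name_match else None,
        "aliases": aliases,
        "sockets": sockets,
    }


# Hypothetical sample zone using documentation-only addresses.
SAMPLE_ZONE = """<VirtualHost 192.0.2.10:443 [2001:db8::1]:8443>
    ServerName www.example.com
    ServerAlias example.com alt.example.com
</VirtualHost>
"""

if __name__ == "__main__":
    print(parse_zone(SAMPLE_ZONE))
```

A socket such as * or _default_ matches neither IP regex, so this sketch records its IP as None while still capturing the port.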
As there might be several VirtualHost zones for the same website, the pattern parses all related zones and only afterward models the Software Component of the following view:
The Software Instance has the following model in this case:
Included configuration files
If the read_includes option from the Configuration options section is enabled, the pattern discovers all configuration files that are included in the main Apache configuration file. Apache allows included files to include further files, which can produce several levels of nesting. The include_depth option therefore sets the maximum depth of recursively included configuration files to read (up to 5).
To extract includes, the pattern uses the following regexes:
- (?:\n|^)\s*[Ii]nclude(?:[Oo]ptional)? ["\']([^*]+?[/\\][^\\/ '"]+)["\']
- (?:\n|^)\s*[Ii]nclude(?:[Oo]ptional)? ([^* ]+[/\\][^\\/ '"]+)(?:\s|\r|\n)
The pattern doesn't support includes with wildcards in the directory part of the path, but it does support wildcards in the configuration file name. The IncludeOptional directive is also supported.
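The include handling described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pattern's TPL implementation: the names find_includes and collect_config_files are hypothetical, the files are held in an in-memory dictionary instead of being read from a host, the regexes are applied one line at a time, file name wildcards are expanded with fnmatch, and include_depth is interpreted as the number of include levels that are followed.

```python
import fnmatch
import re

# Regexes copied from the list above.
INCLUDE_QUOTED = re.compile(
    r'''(?:\n|^)\s*[Ii]nclude(?:[Oo]ptional)? ["\']([^*]+?[/\\][^\\/ '"]+)["\']''')
INCLUDE_BARE = re.compile(
    r'''(?:\n|^)\s*[Ii]nclude(?:[Oo]ptional)? ([^* ]+[/\\][^\\/ '"]+)(?:\s|\r|\n)''')

MAX_INCLUDE_DEPTH = 5  # upper bound stated in the text


def find_includes(text):
    """Return the include paths found in one configuration file's text."""
    paths = []
    for line in text.splitlines():
        # The bare form needs trailing whitespace, so a newline is re-added.
        match = INCLUDE_QUOTED.search(line) or INCLUDE_BARE.search(line + "\n")
        if match:
            paths.append(match.group(1))
    return paths


def collect_config_files(main_path, files, read_includes=True, include_depth=2):
    """List the main file plus included files, down to include_depth levels.

    files is a hypothetical {path: content} mapping standing in for the
    discovered host's file system.
    """
    depth_limit = min(include_depth, MAX_INCLUDE_DEPTH)
    seen = [main_path]

    def visit(path, depth):
        if not read_includes or depth >= depth_limit:
            return
        for include_path in find_includes(files[path]):
            # Expand file name wildcards against the known paths.
            for name in sorted(files):
                if fnmatch.fnmatchcase(name, include_path) and name not in seen:
                    seen.append(name)
                    visit(name, depth + 1)

    visit(main_path, 0)
    return seen


# Hypothetical sample layout with three levels of nested includes.
MAIN = "/etc/httpd/conf/httpd.conf"
FILES = {
    MAIN: ('Include /etc/httpd/conf.d/*.conf\n'
           'IncludeOptional "/etc/httpd/extra/ssl.conf"\n'
           '# Include /etc/httpd/ignored.conf\n'),
    "/etc/httpd/conf.d/a.conf": "Include /etc/httpd/sites/site1.conf\n",
    "/etc/httpd/conf.d/b.conf": "ServerName b.example.com\n",
    "/etc/httpd/extra/ssl.conf": "Listen 443\n",
    "/etc/httpd/sites/site1.conf": "Include /etc/httpd/deep/level3.conf\n",
    "/etc/httpd/deep/level3.conf": "ServerName deep.example.com\n",
}

if __name__ == "__main__":
    for path in collect_config_files(MAIN, FILES):
        print(path)
```

With the default include_depth of 2, the sketch lists site1.conf but does not follow its include of level3.conf. A path such as /etc/*/x.conf yields no match at all, because both regexes exclude * from the directory part of the path.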