Apache HTTPD-based Webservers - Extended Discovery Pattern

Overview

The ApacheExtendedDiscovery pattern creates SoftwareComponent nodes for the websites defined in the Apache configuration file. The pattern therefore triggers on any type of Apache Webserver that has the config_file attribute set.


Configuration Options

There are several configuration options available for this Extended Discovery:

  • read_includes := true; - enables parsing of configuration files included by the main configuration file
  • include_depth := 2; - maximum number of recursively included configuration files to read (up to 5)

Extracting website details

The pattern reads the httpd.conf file.
It then extracts the Default Website, the Default SSL VirtualHost, and the VirtualHost zones of all websites, and pulls the following attributes from each using the appropriate regular expressions:

Website attribute               Regex or substring                                  Parsed node
website name                    regex '(?m)^\s*ServerName\s+(\S+)'                  website's VirtualHost zone
website aliases                 regex '(?m)^\s*ServerAlias\s+([a-zA-Z0-9\-\. ]+)'   website's VirtualHost zone
website listening TCP sockets   regex '(?m)^\s*<VirtualHost\s+([^>]+?)\s*>'         website's VirtualHost zone
website listening IPs (IPv4)    regex '^(\d+(?:\.\d+){3})'                          website's listening TCP socket
website listening IPs (IPv6)    regex '^\[([^]]+)'                                  website's listening TCP socket
website listening ports         regex ':(\d+)$'                                     website's listening TCP socket
website name                    substring 'VirtualHost _default_:                   website name for Default SSL VirtualHost
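
The regexes listed above can be tried outside the pattern. The following is a minimal Python sketch, not the pattern's own code: the sample VirtualHost zone, the helper names first_group and parse_zone, and the assumption that a zone's socket list is split on whitespace are all illustrative.

```python
import re

# Regexes copied from the attribute list above.
SERVER_NAME = re.compile(r'(?m)^\s*ServerName\s+(\S+)')
SERVER_ALIAS = re.compile(r'(?m)^\s*ServerAlias\s+([a-zA-Z0-9\-\. ]+)')
VHOST_SOCKETS = re.compile(r'(?m)^\s*<VirtualHost\s+([^>]+?)\s*>')
IPV4 = re.compile(r'^(\d+(?:\.\d+){3})')
IPV6 = re.compile(r'^\[([^]]+)')
PORT = re.compile(r':(\d+)$')

# Hypothetical sample zone, not taken from a real server.
SAMPLE_ZONE = """\
<VirtualHost 192.0.2.10:80 [2001:db8::1]:8080>
    ServerName www.example.com
    ServerAlias example.com web.example.com
    DocumentRoot /var/www/example
</VirtualHost>
"""


def first_group(regex, text):
    """Return the first capture group of the first match, or None."""
    match = regex.search(text)
    return match.group(1) if match else None


def parse_zone(zone):
    """Extract name, aliases and listening sockets from one VirtualHost zone."""
    # Assumption of this sketch: sockets in the tag are whitespace-separated.
    raw_sockets = first_group(VHOST_SOCKETS, zone) or ""
    sockets = [
        {
            "ipv4": first_group(IPV4, sock),
            "ipv6": first_group(IPV6, sock),
            "port": first_group(PORT, sock),
        }
        for sock in raw_sockets.split()
    ]
    aliases = first_group(SERVER_ALIAS, zone)
    return {
        "name": first_group(SERVER_NAME, zone),
        "aliases": aliases.split() if aliases else [],
        "sockets": sockets,
    }
```

Calling parse_zone(SAMPLE_ZONE) yields the ServerName, the two aliases, and one entry per listening socket, with the IPv4 or IPv6 address and the port filled in where the corresponding regex matches.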

To discover Apache Software Components correctly, the Apache configuration should meet the following requirements:

  • Each virtual host should have a unique ServerName.
  • If no ServerName is specified, Discovery models one virtual host only, to handle the default installation.
  • No duplicate virtual hosts are expected.
  • No misconfigured virtual hosts are expected (hosts that are not configured properly and are actually offline).

Because a website may have several VirtualHost zones, the pattern parses all related zones first and only afterward models the Software Component, which has the following view:

The Software Instance has the following model in this case:

Included configuration files

If the read_includes option described under Configuration Options is enabled, the pattern discovers all configuration files included in the main Apache configuration file. Apache configuration allows several levels of nested includes, so the include_depth option sets the maximum number of recursively included configuration files to read (up to 5).
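
The interaction between read_includes and include_depth can be illustrated with a short Python sketch. It is not the pattern's implementation and rests on stated assumptions: include_depth is interpreted as the number of nesting levels followed below the main file, find_includes is a hypothetical stand-in parser rather than the pattern's regexes, and the file contents in FILES are invented.

```python
MAX_INCLUDE_DEPTH = 5  # upper bound stated for include_depth


def find_includes(content):
    """Hypothetical stand-in parser: paths named by Include/IncludeOptional lines."""
    paths = []
    for line in content.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].lower() in ("include", "includeoptional"):
            paths.append(parts[1].strip("\"'"))
    return paths


def collect_config_files(main_path, read_file, read_includes=True, include_depth=2):
    """Return the configuration files that would be read, in visiting order.

    read_file(path) must return the file content, or None if it is unreadable.
    """
    limit = min(include_depth, MAX_INCLUDE_DEPTH)
    visited = []

    def visit(path, depth):
        if path in visited:
            return  # guard against include cycles
        content = read_file(path)
        if content is None:
            return  # missing or unreadable file
        visited.append(path)
        if not read_includes or depth >= limit:
            return  # includes disabled, or nesting limit reached
        for included in find_includes(content):
            visit(included, depth + 1)

    visit(main_path, 0)
    return visited


# Invented in-memory files forming a chain: httpd.conf -> a.conf -> b.conf -> c.conf
FILES = {
    "/etc/httpd/conf/httpd.conf": "ServerRoot /etc/httpd\nInclude /etc/httpd/conf.d/a.conf\n",
    "/etc/httpd/conf.d/a.conf": "Include /etc/httpd/conf.d/b.conf\n",
    "/etc/httpd/conf.d/b.conf": "Include /etc/httpd/conf.d/c.conf\n",
    "/etc/httpd/conf.d/c.conf": "Listen 8080\n",
}
```

Under these assumptions, collect_config_files("/etc/httpd/conf/httpd.conf", FILES.get) with include_depth set to 2 reads httpd.conf, a.conf, and b.conf, and stops before c.conf.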


To extract includes the pattern uses the following regexes:


  • (?:\n|^)\s*[Ii]nclude(?:[Oo]ptional)? ["\']([^*]+?[/\\][^\\/ '"]+)["\']


  • (?:\n|^)\s*[Ii]nclude(?:[Oo]ptional)? ([^* ]+[/\\][^\\/ '"]+)(?:\s|\r|\n)

The pattern does not support includes with wildcards in directory names, but it does support wildcards in configuration file names. The IncludeOptional directive is also supported.
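
The wildcard behavior follows from the two include regexes and can be checked with a small Python sketch. The regexes are copied from this section; the helper name extract_include, the choice to test one line at a time, and the sample directives are assumptions made for illustration.

```python
import re

# Include regexes copied from this section: quoted path, then unquoted path.
QUOTED_INCLUDE = re.compile(
    r'''(?:\n|^)\s*[Ii]nclude(?:[Oo]ptional)? ["\']([^*]+?[/\\][^\\/ '"]+)["\']'''
)
UNQUOTED_INCLUDE = re.compile(
    r'''(?:\n|^)\s*[Ii]nclude(?:[Oo]ptional)? ([^* ]+[/\\][^\\/ '"]+)(?:\s|\r|\n)'''
)


def extract_include(line):
    """Return the included path found on a single config line, or None.

    Assumption of this sketch: each line is tested on its own, with a
    trailing newline so the unquoted regex can see its terminator.
    """
    candidate = line.rstrip("\r\n") + "\n"
    for regex in (QUOTED_INCLUDE, UNQUOTED_INCLUDE):
        match = regex.search(candidate)
        if match:
            return match.group(1)
    return None
```

In this sketch, a line such as IncludeOptional conf.d/*.conf yields conf.d/*.conf, while Include sites-*/default.conf yields nothing because the wildcard sits in a directory name. Both regexes also require a path separator, so a bare file name such as Include extra.conf would not match here.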
