An Extended Discovery pattern that models SoftwareComponent Nodes for websites is available for this product.
Apache HTTP Server is a web server available on a large number of platforms, including most Unix platforms, Windows, Mac OS X and NetWare.
It has been used as the base for a number of other web server products, including IBM HTTP Server, Oracle HTTP Server and HP HP-UX Apache-based Web Server.
The IBM HTTP Server is a simple Web HTTP hosting server. It communicates with LDAP and supports the SSL protocol and multi-threading. It is tuned for better performance than Apache HTTP Server and uses a Java installation process that is standard across all platforms and compatible with the Sun and GNU Java VMs.
Oracle HTTP Server enables additional functionality specific to Oracle based Databases and Application Servers.
HP HP-UX Apache-Based Web Server combines numerous popular modules from other open source projects and provides HP value-added features specific to the HP-UX platform.
Apache HTTP Server is used in multiple Red Hat JBoss Middleware products (from version 7 onwards).
Product Component | Pattern | OS Type | Versioning | Pattern Depth |
---|---|---|---|---|
Apache HTTP Server | ApacheBasedWebserver | Unix | Active, Path and Package | Instance-based |
 | | Windows | | |
IBM HTTP Server | | Unix | | |
 | | Windows | | |
Oracle HTTP Server | | Unix | Path | |
 | | Windows | | |
HP Apache-Based Web Server | | Unix | Active | |
HP-UX Apache-Based Web Server | | Unix | Active | |
Red Hat JBoss Enterprise Web Server | | Unix | Path | |
JBoss Core Services Apache HTTP Server | | Unix | Active | |
The pattern supports identification and versioning of all Apache based webservers on all major platforms - Unix, Linux and Windows.
These products are broad in nature: three different products can use the same binary to provide a service, the same binary can be found in several forms under different names, and the product sometimes forks a separate process that may have the same name as its parent. For these reasons we use a number of different triggers to run the pattern. Once the pattern is triggered, we perform additional checks to understand what the process is doing and which product it represents.
If any of the trigger conditions below is met, execution of the pattern's body commences.
Trigger Node | Attribute | Condition | Argument |
---|---|---|---|
DiscoveredProcess | cmd | matches | unix_cmd 'httpd\d?' |
 | | | or |
 | | | windows_cmd 'httpd\d?' |
 | | | or |
 | | | regex '(?i)\bhttpd\d?[-_\.](?:prefork|worker|event|bin)$' |
 | | | or |
 | | | regex '(?i)\bapache\d?(?:\.exe)?$' |
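The trigger conditions above can be illustrated with a minimal Python sketch. The two `regex` entries are copied from the table; the way `unix_cmd` and `windows_cmd` are expanded into plain regular expressions here is an assumption made for illustration, and the function name `is_trigger_candidate` is hypothetical.

```python
import re

# Trigger conditions from the table above. The expansions of unix_cmd and
# windows_cmd into plain regexes are assumptions made for this sketch.
TRIGGER_REGEXES = [
    re.compile(r"\bhttpd\d?$"),            # assumed form of unix_cmd 'httpd\d?'
    re.compile(r"(?i)\bhttpd\d?\.exe$"),   # assumed form of windows_cmd 'httpd\d?'
    re.compile(r"(?i)\bhttpd\d?[-_\.](?:prefork|worker|event|bin)$"),
    re.compile(r"(?i)\bapache\d?(?:\.exe)?$"),
]


def is_trigger_candidate(cmd: str) -> bool:
    """Return True if the process command matches any trigger condition."""
    return any(rx.search(cmd) for rx in TRIGGER_REGEXES)
```

A match only means the pattern body would start; the additional checks described in the rest of this document still decide which product, if any, is modelled.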
The pattern creates SoftwareInstances with the following types:
SI Type |
---|
Apache Webserver |
IBM HTTP Server |
Oracle HTTP Server |
HP Apache-based Web Server |
HP HP-UX Apache-based Web Server |
Apache HTTPD-based Webserver |
Red Hat JBoss Enterprise Web Server |
JBoss Core Services Apache HTTP Server |
Name | cmd matches | args matches |
---|---|---|
Apache Webserver | regex '\bapache[^ ]*/(sbin|bin)/[^ ]*\bhttpd$' | N/A |
 | regex '\bapps/apache[^ ]*/(sbin|bin)/httpd$' | |
Apache / Apache Variant Webserver | regex '/usr/sbin/httpd$' | |
 | regex '\bapache[^ ]*/(sbin|bin)/[^ ]*\bhttpd[-_]prefork$' | |
 | regex '/usr/sbin/httpd[-_\.](prefork|worker|event|bin)$' | |
 | regex '/usr/sbin/httpd\d$' | |
 | regex '/usr/sbin/httpd\d[-_\.](prefork|worker)$' | |
 | regex '(?i)\bapache\.exe$' | |
 | regex '(?i)\bhttpd\.exe$' | |
 | regex '\bbin/httpd$' | regex '^.*-(d|f) *[^ ]*/apache\b' |
Apache Monitor (Windows) | regex '(?i)\bApacheMonitor\.exe$' | N/A |
Apache Rotatelogs Process | regex '(?i)\brotatelogs\.exe$' | |
 | regex '(?i)\brotatelogs$' | |
Oracle HTTP Server (Apache Variant) | regex '/(orcl|ora[^/]*)/[^ ]*/Apache/Apache/bin/httpd$' | |
 | regex '(?i)(orcl|ora[^\\]*)\\.*\bApache\\Apache\\Apache\.exe$' | |
IBM HTTP Webserver (Apache Variant) | regex '\bIBM(HTTPD|IHS)[^ ]*/bin/httpd$' | |
 | regex '\bIBM.*([H|h][Tt][Tt][Pp]|IHS).*\b[^ ]*/bin/httpd$' | |
 | regex '[Ii][Hh][Ss][^ ]*/bin/httpd$' | |
 | regex '\bibmhttpd\b[^ ]*/bin/httpd$' | |
 | regex '\bHTTPServer\b[^ ]*/bin/httpd$' | |
 | regex '(?i)\bIBM.*(HTT|IHS).*\b(apache|httpd).exe$' | |
 | regex '(?i)[i][h][s].*\b(apache|httpd).exe$' | |
 | regex '(?i)ibmhttpd.*\b(apache|httpd).exe$' | |
 | regex '(?i)\bHTTP[ ]*Server.*\b(apache|httpd).exe$' | |
Version information for this product can be gathered using one of three possible methods.
For all active version commands we first perform a check to ensure that we have a full command path before executing a command or parsing a binary file.
Regex used to check Windows path: |
|
---|---|
Regex used to check Unix path: |
|
We have identified two different ways to version these products: one for all Apache based webservers, the other specifically for the IBM HTTP Server.
Before executing the specific command, we check the path of the command to see whether it is a known IBM deployment. More information on how the check is performed, together with the list of regular expressions used to check the path, can be found in the description of how publisher information is obtained.
If the path identifies the webserver as an instance of IBM HTTP Server, we then run the IBM-specific version command.
On Windows, the command performs a find for the string 'HTTP' in a file called "version.signature", which is located in the directory above the location of the HTTP server binary.
Executed command: | findstr HTTP "%cmd_path%\\..\\version.signature" |
---|
The command that is executed on Unix involves performing a grep for the string 'HTTP' on a file called "version.signature" which can be found in the directory above the location of the HTTP binary.
Executed command: |
|
---|
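The IBM-specific check can be approximated in Python as follows. This is a minimal sketch rather than the commands the pattern runs: it assumes the binary lives in a `bin` directory directly below the install root, and the helper names `signature_path` and `http_lines` are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def signature_path(binary_path: str, windows: bool = False):
    """Locate version.signature one directory above the binary's directory."""
    path_cls = PureWindowsPath if windows else PurePosixPath
    return path_cls(binary_path).parent.parent / "version.signature"


def http_lines(signature_text: str) -> list:
    """Keep only lines containing 'HTTP', as a case-sensitive findstr/grep would."""
    return [line for line in signature_text.splitlines() if "HTTP" in line]
```

The lines returned by `http_lines` would then be passed to the version regular expressions described later in this document.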
On Unix based systems we extract the version number of the HTTP server by parsing the binary that we triggered on using the strings command.
Executed command: |
|
---|
The command returns the first line within the binary file that contains the string "Apache/". As the version information for Apache is stored as "Apache/x.x.x.x", we can be confident that the first instance of this string contains a valid version number.
For instances of Oracle HTTP Server prior to Oracle 10g releases, and for HP Apache-based Web Server, the binary still contains the Apache information, because Oracle HTTP Server and HP Apache-based Web Server are simply Apache repackaged with an additional set of modules.
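The idea of scanning the binary for the first "Apache/" string can be sketched in Python. This is an illustration of the technique only, not the command the pattern executes: the minimum run length of four printable characters mimics the usual default of the strings utility and is an assumption, and `first_apache_string` is a hypothetical name.

```python
import re

# Runs of at least four printable ASCII characters, similar to what the
# strings utility reports by default (assumption for this sketch).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def first_apache_string(data: bytes):
    """Return the first printable run containing 'Apache/', or None."""
    for match in PRINTABLE_RUN.finditer(data):
        text = match.group().decode("ascii")
        if "Apache/" in text:
            return text
    return None
```

In practice the bytes would come from reading the triggered binary, for example with `open(path, "rb").read()`.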
If the path was not identified as an IBM HTTP Server deployment, a more generic command is executed: we run the triggered process with the argument "-version".
Executed command: |
|
---|
Note: The executable is run with quotes to get around the issue of running commands with spaces in the path under the DOS/Windows CLI.
We have found that all of the approaches provide a version number up to four levels of depth, i.e. x.x.x.x.
For Oracle HTTP Server from Oracle 10g onwards we get version information by executing the trigger process with the argument "-version". The versions returned are now based on Oracle versioning and no longer on Apache versions.
Executed command: |
|
---|
On Solaris x64 the previous command may fail if a 64-bit HTTP server is running. In this case the pattern runs the following command:
Executed command: |
|
---|
If no install_root is obtained on Solaris x64, the following command is executed:
Executed command: |
|
---|
On Debian and Ubuntu:
Executed command: |
|
---|
Because multiple versioning commands may be run (Generic Command and Oracle Command), we store all the results in a list, which may contain one or two elements. We then iterate through that list and parse each result through a series of regular expressions, stopping when we find a match.
Regular Expressions employed to obtain Version (in the following order):
IBM[_\s]+HTTP[_\s]+Server[/\s]+(\d+(?:\.\d+)*)
(\d+(?:\.\d+)*)\s+Oracle-HTTP-Server
HP Apache-based Web Server/(\d+(?:\.\d+)*)
HP-UX_Apache-based_Web_Server/(\d+(?:\.\d+)*)
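The iteration described above can be sketched in Python using the four regular expressions as listed. The function name `extract_version` is hypothetical, and the sample command outputs in the usage note are illustrative assumptions rather than captured output.

```python
import re

# Regular expressions in the order given above.
VERSION_REGEXES = [
    re.compile(r"IBM[_\s]+HTTP[_\s]+Server[/\s]+(\d+(?:\.\d+)*)"),
    re.compile(r"(\d+(?:\.\d+)*)\s+Oracle-HTTP-Server"),
    re.compile(r"HP Apache-based Web Server/(\d+(?:\.\d+)*)"),
    re.compile(r"HP-UX_Apache-based_Web_Server/(\d+(?:\.\d+)*)"),
]


def extract_version(results):
    """Try every regex against each command result, stopping at the first match."""
    for output in results:
        for rx in VERSION_REGEXES:
            match = rx.search(output)
            if match:
                return match.group(1)
    return None
```

For example, `extract_version(["Server version: IBM_HTTP_Server/8.5.5.0 (Unix)"])` yields `"8.5.5.0"`, and an empty result list yields `None`, at which point the path-based method would be tried.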
If the Active Version Command does not return any version or publisher information, we attempt to parse the full command path against a regex to see whether we can identify the publisher and/or version of the product.
As the pattern identifies multiple products from a single binary, any regex we use must be specific to the actual product being identified; we therefore use different regular expressions against the path.
Path Regex: |
|
---|---|
Path Regex: |
|
Path Regex: |
|
An additional processing technique is then used on the resulting value to normalise the version so that it is separated using periods rather than - or _.
Depending on the deployment of the product we have found that this approach provides a version number between one and four levels of depth, i.e. x through to x.x.x.x, and can sometimes include a separate build number as well.
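The normalisation step can be illustrated with a short Python sketch. It assumes the only separators to rewrite are the hyphen and underscore mentioned above; `normalise_version` is a hypothetical name.

```python
import re


def normalise_version(raw: str) -> str:
    """Replace '-' and '_' separators with periods."""
    return re.sub(r"[-_]", ".", raw)
```

A path fragment such as `2_4_57` would therefore be reported as `2.4.57`.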
In some cases the path must be expanded in order to obtain the version correctly. In this case the pattern uses the path_normalization function, imported from Common_function.
If neither the Active Command nor the Path regex returns any version information, but the pattern has established that the publisher is either 'Apache' or 'IBM', it then checks the installed packages to see whether it can extract version information from one of them.
Package Regular expressions for 'Apache Foundation' publisher:
Package Regular expressions for 'IBM' publisher:
If a single package is returned then the version number is taken from it and assigned to the Software Instance.
Where multiple packages are returned for the "Apache Foundation" publisher, they are checked against a pre-defined preference list and the most 'trusted' package is used to version the Software Instance. For the "IBM" publisher, the version of the first package found is extracted.
Package preference in descending order:
From the information we have, we believe that the versioning techniques we are using are accurate enough to be considered the best options for the current releases; future releases may need additional techniques to cover unforeseen circumstances or new features.
The current versioning techniques provide broad scope and good depth in the majority of circumstances; as such, we do not know of any further versioning techniques that would be beneficial for this product at this time.
The versioning techniques may need to be updated in the future if more commands become available or new/different packages are used during the installation.
The pattern attempts to obtain publisher information from the trigger process command line and the active version command.
The pattern parses the trigger process command line with the following regular expressions to check the publisher:
If command matches regular expression | Publisher is |
---|---|
| Oracle |
| |
| |
| IBM |
| |
| |
| Apache |
| |
| |
| |
| |
| |
| |
|
Depending on the publisher information, the pattern assigns the product type:
Publisher | Type |
---|---|
Oracle | Oracle HTTP Server |
IBM | IBM HTTP Server |
HP | HP Apache-based Web Server |
HP-UX | HP HP-UX Apache-based Web Server |
Apache Foundation | Apache Webserver |
If no publisher information is present, the pattern sets 'Apache HTTPD-based Webserver' type for the SI.
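The publisher-to-type assignment above amounts to a lookup with a fallback, which can be sketched in Python. The mapping values are taken from the table; the names `PUBLISHER_TO_TYPE` and `si_type_for` are hypothetical.

```python
# Mapping taken from the Publisher / Type table above.
PUBLISHER_TO_TYPE = {
    "Oracle": "Oracle HTTP Server",
    "IBM": "IBM HTTP Server",
    "HP": "HP Apache-based Web Server",
    "HP-UX": "HP HP-UX Apache-based Web Server",
    "Apache Foundation": "Apache Webserver",
}

# Type used when no publisher information is present.
DEFAULT_TYPE = "Apache HTTPD-based Webserver"


def si_type_for(publisher):
    """Return the SI type for a publisher, falling back to the generic type."""
    return PUBLISHER_TO_TYPE.get(publisher or "", DEFAULT_TYPE)
```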
In addition to version information, we can also retrieve the publisher from the active version command. To do this, we parse the executed command's output with a simple regex to identify whether IBM, Apache or Oracle was found in the output.
The same information is presented in the same format on both types of operating system (Unix and Windows), so we can use a single regex for both of them.
Publisher Regex: |
|
---|
Pattern extracts the modules using the following commands:
Product Architecture
The webserver may be installed standalone or included as a component of other business applications.
Single or multiple instances of the webserver may be running on a specific host; this is dictated by platform and configuration.
On Linux/Unix it is more common to find a forking instance of the webserver: it is started with a single httpd process, which then forks a number of children to assist in handling HTTP requests.
On Windows it is more common to have a single process that manages all requests internally.
A single instance of the webserver can host more than one website at a single time using virtual hosts.
There is a configuration option for this product which allows the use of default configuration file locations.
There are three ways to find the path to this file:
The pattern is triggered on all processes that match the specified regular expressions; this will result in the pattern being triggered a number of times equal to the number of Apache based webserver processes running on a single host - not ideal where the running instance has forked children.
To ensure that only a single Software Instance is created for each truly unique running instance of the webserver, a further check is then made to ensure that the pattern continues only if the process is the parent process - if it is a child process, then the pattern stops.
The added advantage of performing this check is that it ensures that the commands are only executed when we are sure that we have a unique instance - making the pattern more efficient and saving time when scanning.
Thanks to this functionality and the multi step commands, we are able to create a Deep Software Instance.
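The parent-process check can be illustrated with a small Python sketch. How the pattern actually decides that a process is a child is not stated here; this sketch assumes a process is a forked child when its parent is also in the candidate list and runs the same command. The `Proc` class and `parent_instances` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    pid: int
    ppid: int
    cmd: str


def parent_instances(procs):
    """Keep only processes whose parent is not another copy of the same command."""
    by_pid = {p.pid: p for p in procs}
    parents = []
    for proc in procs:
        parent = by_pid.get(proc.ppid)
        if parent is not None and parent.cmd == proc.cmd:
            continue  # forked child of a webserver process: skip it
        parents.append(proc)
    return parents
```

With one httpd parent and two forked workers in the list, only the parent is kept, so the versioning commands would run once per instance.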
As mentioned above, a check is performed to ensure that the process is a distinct webserver parent process. The additional information that the pattern then uses to create a unique key is the arguments of the process.
When an Apache based webserver is started it either uses a default configuration or has a specific configuration set using the arguments. Using the '-f' and/or '-d' arguments you can identify the specific configuration of the webserver. The -f argument is used to identify a Config File, whereas the -d argument is used to identify the ServerRoot. When present, each argument is added as an attribute to the SI.
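A minimal Python sketch of reading the two arguments from a process argument string follows. It is a simplified illustration using token splitting, not the regular expressions the pattern applies; the attribute names `config_file` and `server_root` and the function name `config_arguments` are hypothetical.

```python
import shlex


def config_arguments(args: str) -> dict:
    """Collect the values following -f (config file) and -d (ServerRoot)."""
    tokens = shlex.split(args)
    found = {}
    for flag, name in (("-f", "config_file"), ("-d", "server_root")):
        if flag in tokens:
            index = tokens.index(flag)
            if index + 1 < len(tokens):
                found[name] = tokens[index + 1]
    return found
```

When neither argument is present the result is empty, which corresponds to the webserver running with its default configuration.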
The Pattern also creates different Software Instances based on the information it has retrieved from the commands and/or path it queried.
The Software Instance Type is based on the returned publisher: if 'IBM' was returned by a version command or was in the path, the Type is set to "IBM HTTP Server". This information is also used in the key to create distinct Software Instances for each type of webserver running on a specific host.
The pattern creates an Instance-based Software Instance whose key is based on a hashed value of the Config file attribute, the obtained SI type and the host key. If the Config file attribute is not obtained, the pattern creates a grouped Software Instance. The key group is based on the hashed command line; if an arguments line is present, it is also added to the key group.
Build attribute is extracted from full version with help of regex:
_'-(\d+)$'_
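Applied in Python, the build regex above behaves as in the following sketch. The function name `extract_build` is hypothetical.

```python
import re

# Build regex as given above: trailing digits after a final hyphen.
BUILD_REGEX = re.compile(r"-(\d+)$")


def extract_build(full_version: str):
    """Return the trailing build number, or None if the version has none."""
    match = BUILD_REGEX.search(full_version)
    return match.group(1) if match else None
```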
The server root attribute is extracted in several different ways on Windows and Unix.
On Windows, the pattern tries to extract the server root from the trigger's arguments or from its child processes using the following regex:
_'-d\s+(\"\w:.*?\"|\w:\S+)'_
Note
If several "-d" attributes are present in the arguments, the pattern extracts them all (two instances are expected at most). In this case the last one contains the path to the domain-specific configuration file and should be considered the server root.
If this method fails, the pattern tries to extract the server root from the trigger command line using the regular expression:
_'(?i)^(\w:.*)\\bin\\[^\\]*\.exe$'_
On Unix, the pattern tries to parse the trigger's arguments using the regex:
_'-d (/\S+)'_
Should this method fail and the active version command has succeeded, the pattern tries to extract the server root from the command output using the regex:
_'(?i)-D HTTPD_ROOT\s*=\s*[\"\'](.*?)[\"\']'_
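The server root fallbacks for Windows and Unix can be sketched in Python with the regular expressions listed above. Stripping the surrounding quotes from a quoted Windows path is a choice made in this sketch, and the function names are hypothetical.

```python
import re

WINDOWS_ARG_ROOT = re.compile(r'-d\s+(\"\w:.*?\"|\w:\S+)')
WINDOWS_CMD_ROOT = re.compile(r'(?i)^(\w:.*)\\bin\\[^\\]*\.exe$')
UNIX_ARG_ROOT = re.compile(r'-d (/\S+)')
HTTPD_ROOT = re.compile(r'(?i)-D HTTPD_ROOT\s*=\s*[\"\'](.*?)[\"\']')


def windows_server_root(args: str, cmd: str):
    """Use the last -d argument if present, else derive the root from the command."""
    roots = WINDOWS_ARG_ROOT.findall(args)
    if roots:
        return roots[-1].strip('"')  # the last -d is treated as the server root
    match = WINDOWS_CMD_ROOT.search(cmd)
    return match.group(1) if match else None


def unix_server_root(args: str, version_output: str = ""):
    """Use the -d argument if present, else HTTPD_ROOT from the command output."""
    match = UNIX_ARG_ROOT.search(args) or HTTPD_ROOT.search(version_output)
    return match.group(1) if match else None
```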
The Config file attribute is extracted in several different ways on Windows and Unix. If the extracted path is not fully qualified, it can be combined as <SERVER_ROOT><CONFIG_FILE>.
On Windows, the pattern tries to extract the config file from the trigger's arguments or from its child processes using the following regex:
_'-f\s+(\".*?\"|\S+\.conf)'_
If this method fails and any of the active version commands was successful, the pattern tries to extract the config file from the last obtained result:
_'(?i)-D SERVER_CONFIG_FILE\s*=\s*[\"\'](.*?)[\"\']'_
On Unix, the pattern tries to parse the trigger's arguments using the regex:
_'-f (\S+)'_
If this method fails, the pattern tries to extract the config file from the active command output, in the same way as on Windows but parsing the Unix-related command results.
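The Unix config file lookup, including the combination with the server root for paths that are not fully qualified, can be sketched in Python. Joining the two parts with a path separator is an assumption of this sketch, and `unix_config_file` is a hypothetical name.

```python
import posixpath
import re

UNIX_ARG_CONFIG = re.compile(r'-f (\S+)')
SERVER_CONFIG_FILE = re.compile(r'(?i)-D SERVER_CONFIG_FILE\s*=\s*[\"\'](.*?)[\"\']')


def unix_config_file(args: str, version_output: str = "", server_root=None):
    """Take -f from the arguments, else SERVER_CONFIG_FILE from the command output."""
    match = UNIX_ARG_CONFIG.search(args) or SERVER_CONFIG_FILE.search(version_output)
    if not match:
        return None
    path = match.group(1)
    if not path.startswith("/") and server_root:
        # Relative path: combine it with the server root (assumed join).
        path = posixpath.join(server_root, path)
    return path
```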
An additional attribute, Server Name, is extracted from the root config file and describes the root server name. All names within VirtualHosts are modeled by extended discovery and appear as Details of the SI.
Server Name RegEx |
|
---|
In the case that we are dealing with the Oracle HTTP Server, the pattern works on establishing a Relationship with Oracle E-Business Suite.
While the pattern doesn't actually model a Relationship with Oracle E-Business Suite, it creates the grounds on which the Relationship is based, by modeling the ebs_sid attribute.
This attribute is added to the Web Server SI, and represents the Database SID that E-Business Suite is working with.
In order to obtain it, the pattern extracts the contents of the -f Config File, and parses its content through a Regular Expression.
Regular Expression employed to extract Database SID: (?i)DocumentRoot *.+portal[/\\](.+)_
The Oracle E-Business Suite pattern looks for the ebs_sid attribute in order to model a Relationship with the Web Server.
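The Database SID extraction can be illustrated in Python with the regular expression given above. The DocumentRoot value in the usage note is a made-up example, and `extract_ebs_sid` is a hypothetical name.

```python
import re

# Regular expression as given above; the SID is captured before the underscore.
EBS_SID_REGEX = re.compile(r"(?i)DocumentRoot *.+portal[/\\](.+)_")


def extract_ebs_sid(config_text: str):
    """Return the SID found in a DocumentRoot line of the config file, or None."""
    match = EBS_SID_REGEX.search(config_text)
    return match.group(1) if match else None
```

For a config line such as `DocumentRoot "/u01/oracle/visora/portal/VIS_apps01"` the sketch returns `VIS`. Because the capture group is greedy, everything up to the last underscore on the line is captured.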
The current ADDM model makes it possible to see all the websites configured within an Apache Server instance. This functionality is delegated to the Apache extended pattern.
The pattern identifies all the child processes of the trigger process that created the SI and relates them to the SI via an associate relationship.
This pattern has been tested against installations of all possible product types that can be created across multiple platforms. Additional tests were performed against record data for other webservers to ensure that erroneous Software Instances were not created.